idnits 2.17.1 draft-ietf-bess-evpn-bum-procedure-updates-02.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** There are 16 instances of too long lines in the document, the longest one being 3 characters in excess of 72. -- The draft header indicates that this document updates RFC7432, but the abstract doesn't seem to mention this, which it should. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year -- The document date (September 19, 2017) is 2408 days in the past. Is this intentional? 
Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Missing Reference: 'RFC 7524' is mentioned on line 176, but not defined == Unused Reference: 'RFC2119' is defined on line 691, but no explicit reference was found in the text == Unused Reference: 'RFC7432' is defined on line 701, but no explicit reference was found in the text == Unused Reference: 'RFC7524' is defined on line 706, but no explicit reference was found in the text == Unused Reference: 'RFC7988' is defined on line 712, but no explicit reference was found in the text == Unused Reference: 'I-D.ietf-bess-evpn-overlay' is defined on line 719, but no explicit reference was found in the text == Unused Reference: 'I-D.ietf-bier-architecture' is defined on line 725, but no explicit reference was found in the text == Unused Reference: 'I-D.zzhang-bier-evpn' is defined on line 731, but no explicit reference was found in the text == Unused Reference: 'RFC6513' is defined on line 736, but no explicit reference was found in the text == Unused Reference: 'RFC6514' is defined on line 740, but no explicit reference was found in the text == Outdated reference: A later version (-21) exists of draft-ietf-bess-evpn-igmp-mld-proxy-00 == Outdated reference: A later version (-12) exists of draft-ietf-bess-evpn-overlay-08 Summary: 1 error (**), 0 flaws (~~), 13 warnings (==), 2 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 BESS Z. Zhang 3 Internet-Draft W. Lin 4 Updates: 7432 (if approved) Juniper Networks 5 Intended status: Standards Track J. Rabadan 6 Expires: March 23, 2018 Nokia 7 K. Patel 8 Arrcus 9 A. 
Sajassi 10 Cisco Systems 11 September 19, 2017 13 Updates on EVPN BUM Procedures 14 draft-ietf-bess-evpn-bum-procedure-updates-02 16 Abstract 18 This document specifies procedure updates for broadcast, unknown 19 unicast, and multicast (BUM) traffic in Ethernet VPNs (EVPN), 20 including selective multicast, and provider tunnel segmentation. 22 Requirements Language 24 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 25 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 26 document are to be interpreted as described in RFC2119. 28 Status of This Memo 30 This Internet-Draft is submitted in full conformance with the 31 provisions of BCP 78 and BCP 79. 33 Internet-Drafts are working documents of the Internet Engineering 34 Task Force (IETF). Note that other groups may also distribute 35 working documents as Internet-Drafts. The list of current Internet- 36 Drafts is at https://datatracker.ietf.org/drafts/current/. 38 Internet-Drafts are draft documents valid for a maximum of six months 39 and may be updated, replaced, or obsoleted by other documents at any 40 time. It is inappropriate to use Internet-Drafts as reference 41 material or to cite them other than as "work in progress." 43 This Internet-Draft will expire on March 23, 2018. 45 Copyright Notice 47 Copyright (c) 2017 IETF Trust and the persons identified as the 48 document authors. All rights reserved. 50 This document is subject to BCP 78 and the IETF Trust's Legal 51 Provisions Relating to IETF Documents 52 (https://trustee.ietf.org/license-info) in effect on the date of 53 publication of this document. Please review these documents 54 carefully, as they describe your rights and restrictions with respect 55 to this document. Code Components extracted from this document must 56 include Simplified BSD License text as described in Section 4.e of 57 the Trust Legal Provisions and are provided without warranty as 58 described in the Simplified BSD License. 
60 Table of Contents 62 1. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 3 63 2. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 64 2.1. Reasons for Tunnel Segmentation . . . . . . . . . . . . . 4 65 3. Additional Route Types of EVPN NLRI . . . . . . . . . . . . . 5 66 3.1. Per-Region I-PMSI A-D route . . . . . . . . . . . . . . . 5 67 3.2. S-PMSI A-D route . . . . . . . . . . . . . . . . . . . . 6 68 3.3. Leaf-AD route . . . . . . . . . . . . . . . . . . . . . . 6 69 4. Selective Multicast . . . . . . . . . . . . . . . . . . . . . 7 70 5. Inter-AS Segmentation . . . . . . . . . . . . . . . . . . . . 8 71 5.1. Changes to Section 7.2.2 of RFC 7117 . . . . . . . . . . 8 72 5.2. I-PMSI Leaf Tracking . . . . . . . . . . . . . . . . . . 9 73 5.3. Backward Compatibility . . . . . . . . . . . . . . . . . 9 74 6. Inter-Region Segmentation . . . . . . . . . . . . . . . . . . 11 75 6.1. Area vs. Region . . . . . . . . . . . . . . . . . . . . . 11 76 6.2. Per-region Aggregation . . . . . . . . . . . . . . . . . 12 77 6.3. Use of S-NH-EC . . . . . . . . . . . . . . . . . . . . . 13 78 6.4. Ingress PE's I-PMSI Leaf Tracking . . . . . . . . . . . . 14 79 7. Multi-homing Support . . . . . . . . . . . . . . . . . . . . 14 80 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 14 81 9. Security Considerations . . . . . . . . . . . . . . . . . . . 14 82 10. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 14 83 11. Contributors . . . . . . . . . . . . . . . . . . . . . . . . 14 84 12. References . . . . . . . . . . . . . . . . . . . . . . . . . 15 85 12.1. Normative References . . . . . . . . . . . . . . . . . . 15 86 12.2. Informative References . . . . . . . . . . . . . . . . . 16 87 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 16 89 1. Terminology 91 To be added 93 2. 
Introduction 95 RFC 7432 specifies procedures to handle broadcast, unknown unicast, 96 and multicast (BUM) traffic in Sections 11, 12, and 16, using Inclusive 97 Multicast Ethernet Tag Route. Many details are deferred to RFC 98 7117 (VPLS Multicast). In particular, selective multicast is briefly 99 mentioned for Ingress Replication but deferred to RFC 7117. 101 RFC 7117 specifies procedures for using both inclusive tunnels and 102 selective tunnels, similar to MVPN procedures specified in RFC 6513 103 and RFC 6514. A new SAFI "MCAST-VPLS" is introduced, with two types 104 of NLRIs that match MVPN's S-PMSI A-D routes and Leaf A-D routes. 105 The same procedures can be applied to EVPN selective multicast for 106 both Ingress Replication and other tunnel types, but new route types 107 need to be defined under the same EVPN SAFI. 109 MVPN uses the terms I-PMSI and S-PMSI A-D Routes. For consistency and 110 convenience, this document will use the same I/S-PMSI terms for VPLS 111 and EVPN. In particular, EVPN's Inclusive Multicast Ethernet Tag 112 Route and VPLS's VPLS A-D route carrying PTA (PMSI Tunnel Attribute) 113 for BUM traffic will all be referred to as I-PMSI A-D routes. 114 Depending on the context, they may be used interchangeably. 116 MVPN provider tunnels and EVPN/VPLS BUM provider tunnels, which are 117 referred to as MVPN/EVPN/VPLS provider tunnels in this document for 118 simplicity, can be segmented for technical or administrative reasons, 119 which are summarized in Section 2.1 of this document. RFC 6513/6514 120 cover MVPN inter-AS segmentation, RFC 7117 covers VPLS multicast 121 inter-AS segmentation, and RFC 7524 (Seamless MPLS Multicast) covers 122 inter-area segmentation for both MVPN and VPLS. 124 There is a difference between MVPN and VPLS multicast inter-AS 125 segmentation. For simplicity, EVPN will use the same procedures as 126 in MVPN. All ASBRs can re-advertise their choice of the best route. 
127 Each can become the root of its intra-AS segment and inject traffic 128 it receives from its upstream, while each downstream PE/ASBR will 129 only pick one of the upstream ASBRs as its upstream. This is also 130 the behavior even for VPLS in case of inter-area segmentation. 132 For inter-area segmentation, RFC 7524 requires the use of Inter-area 133 P2MP Segmented Next-Hop Extended Community (S-NH-EC), and the setting 134 of "Leaf Information Required" (LIR) flag in PTA in certain 135 situations. Either of these could be optional in case of EVPN. 137 Removing these requirements would make the segmentation procedures 138 transparent to ingress and egress PEs. 140 RFC 7524 assumes that segmentation happens at area borders. However, 141 it could be at "regional" borders, where a region could be a sub- 142 area, or even an entire AS plus its external links (Section 6). That 143 would allow for more flexible deployment scenarios (e.g. for single- 144 area provider networks). 146 This document specifies/clarifies/redefines certain/additional EVPN 147 BUM procedures, with a salient goal that they're better aligned among 148 MVPN, EVPN and VPLS. For brevity, only changes/additions to relevant 149 RFC 7117 and RFC 7524 procedures are specified, instead of repeating 150 the entire procedures. Note that these are to be applied to EVPN 151 only, even though sometimes they may sound to be updates to RFC 152 7117/7524. 154 2.1. Reasons for Tunnel Segmentation 156 Tunnel segmentation may be required and/or desired because of 157 administrative and/or technical reasons. 159 For example, an MVPN/VPLS/EVPN network may span multiple providers 160 and Inter-AS Option-B has to be used, in which the end-to-end 161 provider tunnels have to be segmented at and stitched by the ASBRs. 162 Different providers may use different tunnel technologies (e.g., 163 provider A uses Ingress Replication, provider B uses RSVP-TE P2MP 164 while provider C uses mLDP). 
Even if they use the same tunnel 165 technology like RSVP-TE P2MP, it may be impractical to set up the 166 tunnels across provider boundaries. 168 The same situations may apply between the ASes and/or areas of a 169 single provider. For example, the backbone area may use RSVP-TE P2MP 170 tunnels while non-backbone areas may use mLDP tunnels. 172 Segmentation can also be used to divide an AS/area into smaller 173 regions, so that control plane state and/or forwarding plane state/ 174 burden can be limited to that of individual regions. For example, 175 instead of Ingress Replicating to 100 PEs in the entire AS, with 176 inter-area segmentation [RFC 7524] a PE only needs to replicate to 177 local PEs and ABRs. The ABRs will further replicate to their 178 downstream PEs and ABRs. This not only reduces the forwarding plane 179 burden, but also reduces the leaf tracking burden in the control 180 plane. 182 Smaller regions also have the benefit that, in case of tunnel 183 aggregation, it is easier to find congruence among the segments of 184 different constituent (service) tunnels and the resulting aggregation 185 (base) tunnel in a region. This leads to better bandwidth 186 efficiency, because the more congruent they are, the fewer leaves of 187 the base tunnel need to discard traffic when a service tunnel's 188 segment does not need to receive the traffic (yet it is receiving the 189 traffic due to aggregation). 191 Another advantage of smaller regions is smaller BIER sub-domains. 192 In BIER, a new multicast architecture, packets carry a BitString, 193 in which the bits correspond to edge routers that need to receive 194 traffic. Smaller sub-domains mean smaller BitStrings can be used 195 without having to send multiple copies of the same packet. 197 3. 
Additional Route Types of EVPN NLRI 199 RFC 7432 defines the format of EVPN NLRI as the following: 201 +-----------------------------------+ 202 | Route Type (1 octet) | 203 +-----------------------------------+ 204 | Length (1 octet) | 205 +-----------------------------------+ 206 | Route Type specific (variable) | 207 +-----------------------------------+ 209 So far five types have been defined: 211 + 1 - Ethernet Auto-Discovery (A-D) route 212 + 2 - MAC/IP Advertisement route 213 + 3 - Inclusive Multicast Ethernet Tag route 214 + 4 - Ethernet Segment route 215 + 5 - IP Prefix Route 217 This document defines three additional route types: 219 + 9 - Per-Region I-PMSI A-D route 220 + 10 - S-PMSI A-D route 221 + 11 - Leaf A-D route 223 The "Route Type specific" field of the type 9 and type 10 EVPN NLRIs 224 starts with a type 1 RD, whose Administrative sub-field MUST match 225 that of the RD in all the EVPN routes from the same advertising 226 router for a given EVI, except the Leaf A-D route (Section 3.3). 228 3.1. Per-Region I-PMSI A-D route 230 The Per-region I-PMSI A-D route has the following format. Its usage 231 is discussed in Section 6.2. 233 +-----------------------------------+ 234 | RD (8 octets) | 235 +-----------------------------------+ 236 | Ethernet Tag ID (4 octets) | 237 +-----------------------------------+ 238 | Extended Community (8 octets) | 239 +-----------------------------------+ 241 After Ethernet Tag ID, an Extended Community (EC) is used to identify 242 the region. Various types and sub-types of ECs provide maximum 243 flexibility. Note that this is not an EC Attribute, but an 8-octet 244 field embedded in the NLRI itself, following EC encoding scheme. 246 3.2. 
S-PMSI A-D route 248 The S-PMSI A-D route has the following format: 250 +-----------------------------------+ 251 | RD (8 octets) | 252 +-----------------------------------+ 253 | Ethernet Tag ID (4 octets) | 254 +-----------------------------------+ 255 | Multicast Source Length (1 octet) | 256 +-----------------------------------+ 257 | Multicast Source (Variable) | 258 +-----------------------------------+ 259 | Multicast Group Length (1 octet) | 260 +-----------------------------------+ 261 | Multicast Group (Variable) | 262 +-----------------------------------+ 263 | Originating Router's IP Addr | 264 +-----------------------------------+ 266 Other than the addition of Ethernet Tag ID, it is identical to the 267 S-PMSI A-D route as defined in RFC 7117. The procedures in RFC 7117 268 also apply (including wildcard functionality), except that the 269 granularity level is per Ethernet Tag. 271 3.3. Leaf-AD route 273 The Route Type specific field of a Leaf A-D route consists of the 274 following: 276 +-----------------------------------+ 277 | Route Key (variable) | 278 +-----------------------------------+ 279 | Originating Router's IP Addr | 280 +-----------------------------------+ 282 A Leaf A-D route is originated in response to a PMSI route, which 283 could be an Inclusive Multicast Ethernet Tag route, a per-region I-PMSI A-D 284 route, an S-PMSI A-D route, or some other type of route that may be 285 defined in the future to trigger Leaf A-D routes. The Route Key 286 is the "Route Type Specific" field of the route for which this Leaf 287 A-D route is generated. 289 The general procedures of Leaf A-D route are first specified in RFC 290 6514 for MVPN. The principles apply to VPLS and EVPN as well. RFC 291 7117 has details for VPLS Multicast, and this document points out 292 some specifics for EVPN, e.g. in Section 5. 294 4. 
Selective Multicast 296 [I-D.ietf-bess-evpn-igmp-mld-proxy] specifies procedures for EVPN 297 selective forwarding of IP multicast using SMET routes. It assumes 298 selective forwarding is always used with IR or BIER for all flows. 299 An NVE proxies the IGMP/MLD state that it learns on its ACs to 300 (C-S,C-G) or (C-*,C-G) SMET routes and advertises them to other NVEs, and 301 a receiving NVE converts the SMET routes back to IGMP/MLD messages 302 and sends them out of its ACs. The receiving NVE also uses the SMET 303 routes to identify which NVEs need to receive traffic for a 304 particular (C-S,C-G) or (C-*,C-G) to achieve selective forwarding 305 using IR or BIER. 307 With the above procedures, selective forwarding is done for all flows 308 and the SMET routes are advertised for all flows. It is possible 309 that an operator may not want to track all those (C-S,C-G) or 310 (C-*,C-G) states on the NVEs, and the multicast traffic pattern allows 311 inclusive forwarding for most flows while selective forwarding is 312 needed only for a few high-rate flows. For that, or for tunnel types 313 other than IR/BIER, S-PMSI/Leaf A-D procedures defined for Selective 314 Multicast for VPLS in [RFC7117] are used. Except that different 315 route types and formats are specified with EVPN SAFI for S-PMSI A-D 316 and Leaf A-D routes (Section 3), all procedures in [RFC7117] with 317 respect to Selective Multicast apply to EVPN as well, including 318 wildcard procedures. In a nutshell, a source NVE advertises S-PMSI 319 A-D routes to announce the tunnels used for certain flows, and 320 receiving NVEs either join the announced PIM/mLDP tunnel or respond 321 with Leaf A-D routes if the Leaf Information Required flag is set in 322 the S-PMSI A-D route's PTA (so that the source NVE can include them 323 as tunnel leaves). 325 An optimization to the [RFC7117] procedures may be applied. 
In case 326 of RSVP-TE P2MP tunnels, while a source NVE sets the LIR bit to 327 request Leaf A-D routes, an egress NVE may omit the Leaf A-D route if 328 it already advertises a corresponding SMET route, and the source NVE 329 will use that in lieu of the Leaf A-D route. 331 5. Inter-AS Segmentation 333 5.1. Changes to Section 7.2.2 of RFC 7117 335 The first paragraph of Section 7.2.2.2 of RFC 7117 says: 337 "... The best route procedures ensure that if multiple 338 ASBRs, in an AS, receive the same Inter-AS A-D route from their EBGP 339 neighbors, only one of these ASBRs propagates this route in Internal 340 BGP (IBGP). This ASBR becomes the root of the intra-AS segment of 341 the inter-AS tree and ensures that this is the only ASBR that accepts 342 traffic into this AS from the inter-AS tree." 344 The above VPLS behavior requires complicated VPLS specific procedures 345 for the ASBRs to reach agreement. For EVPN, a different approach is 346 used and the above quoted text is not applicable to EVPN. 348 The Leaf A-D based procedure is used for each ASBR who re-advertises 349 into the AS to discover the leaves on the segment rooted at itself. 350 This is the same as the procedures for S-PMSI in RFC 7117 itself. 352 The following text at the end of the second bullet: 354 "................................................... If, in order 355 to instantiate the segment, the ASBR needs to know the leaves of 356 the tree, then the ASBR obtains this information from the A-D 357 routes received from other PEs/ASBRs in the ASBR's own AS." 359 is changed to the following: 361 "................................................... If, in order 362 to instantiate the segment, the ASBR needs to know the leaves of 363 the tree, then the ASBR MUST set the LIR flag to 1 in the PTA to 364 trigger Leaf A-D routes from egress PEs and downstream ASBRs. 365 It MUST be (auto-)configured with an import RT, which controls 366 acceptance of leaf A-D routes by the ASBR." 
368 Accordingly, the following paragraph in Section 7.2.2.4: 370 "If the received Inter-AS A-D route carries the PMSI Tunnel attribute 371 with the Tunnel Identifier set to RSVP-TE P2MP LSP, then the ASBR 372 that originated the route MUST establish an RSVP-TE P2MP LSP with the 373 local PE/ASBR as a leaf. This LSP MAY have been established before 374 the local PE/ASBR receives the route, or it MAY be established after 375 the local PE receives the route." 377 is changed to the following: 379 "If the received Inter-AS A-D route has the LIR flag set in its PTA, 380 then a receiving PE must originate a corresponding Leaf A-D route, 381 and a receiving ASBR must originate a corresponding Leaf A-D route 382 if and only if it received and imported one or more corresponding Leaf 383 A-D routes from its downstream IBGP or EBGP peers, or it has non-null 384 downstream forwarding state for the PIM/mLDP tunnel that instantiates 385 its downstream intra-AS segment. The ASBR that (re-)advertised the 386 Inter-AS A-D route then establishes a tunnel to the leaves discovered 387 by the Leaf A-D routes." 389 5.2. I-PMSI Leaf Tracking 391 An ingress PE does not set the LIR flag in its I-PMSI's PTA, even 392 with Ingress Replication or RSVP-TE P2MP tunnels. It does not rely 393 on the Leaf A-D routes to discover leaves in its AS, and Section 11.2 394 of RFC 7432 explicitly states that the LIR flag must be set to zero. 396 An implementation of RFC 7432 might have used the Originating 397 Router's IP Address field of the Inclusive Multicast Ethernet Tag 398 routes to determine the leaves, or might have used the Next Hop field 399 instead. Within the same AS, both will lead to the same result. 401 With segmentation, an ingress PE MUST determine the leaves in its AS 402 from the BGP next hops in all its received I-PMSI A-D routes, so it 403 does not have to set the LIR bit to request Leaf A-D routes. 
PEs 404 within the same AS will all have different next hops in their I-PMSI 405 A-D routes (hence will all be considered as leaves), and PEs from 406 other ASes will have the next hop in their I-PMSI A-D routes set to 407 addresses of ASBRs in this local AS, hence only those ASBRs will be 408 considered as leaves (as proxies for those PEs in other ASes). Note 409 that in case of Ingress Replication, when an ASBR re-advertises IBGP 410 I-PMSI A-D routes, it MUST advertise the same label for all those with 411 the same Ethernet Tag ID and the same EVI. When an ingress PE builds 412 its flooding list, multiple routes may have the same (nexthop, label) 413 tuple and they will only be added as a single branch in the flooding 414 list. 416 5.3. Backward Compatibility 418 The above procedures assume that all PEs are upgraded to support the 419 segmentation procedures: 421 o An ingress PE uses the Next Hop instead of Originating Router's IP 422 Address to determine leaves for the I-PMSI tunnel. 424 o An egress PE sends Leaf A-D routes in response to I-PMSI routes, 425 if the PTA has the LIR flag set (by the re-advertising ASBRs). 427 o In case of Ingress Replication, when an ingress PE builds its 428 flooding list, multiple I-PMSI routes may have the same (nexthop, 429 label) tuple and only a single branch for those will be added in 430 the flooding list. 432 If a deployment has legacy PEs that do not support the above, then 433 a legacy ingress PE would include all PEs (including those in remote 434 ASes) as leaves of the inclusive tunnel and try to send traffic to 435 them directly (no segmentation), which is either undesired or not 436 possible; a legacy egress PE would not send Leaf A-D routes so the 437 ASBRs would not know to send external traffic to them. 
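The flooding-list rule for Ingress Replication described in Sections 5.2 and 5.3 can be sketched as follows. This is only an illustration under stated assumptions, not a definitive implementation: the IPmsiRoute record and the build_flooding_list function are hypothetical names, and a route is reduced to the three fields the rule needs. The addresses and labels are made-up documentation values.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class IPmsiRoute:
    """Hypothetical, simplified view of a received I-PMSI A-D route."""
    originator: str  # Originating Router's IP Address (not used for leaf tracking here)
    next_hop: str    # BGP next hop, rewritten by a re-advertising ASBR
    label: int       # label carried in the PTA for Ingress Replication


def build_flooding_list(routes: Iterable[IPmsiRoute]) -> List[Tuple[str, int]]:
    """Return one branch per distinct (nexthop, label) tuple, in arrival order."""
    seen = set()
    branches = []
    for route in routes:
        key = (route.next_hop, route.label)
        if key not in seen:
            # First route with this tuple: add a single branch for it.
            seen.add(key)
            branches.append(key)
    return branches


if __name__ == "__main__":
    received = [
        # Two PEs in the local AS: each keeps its own next hop.
        IPmsiRoute("192.0.2.2", "192.0.2.2", 1002),
        IPmsiRoute("192.0.2.3", "192.0.2.3", 1003),
        # Two PEs in a remote AS, re-advertised by a local ASBR that sets
        # itself as next hop and uses the same label for both routes.
        IPmsiRoute("198.51.100.4", "192.0.2.10", 2000),
        IPmsiRoute("198.51.100.5", "192.0.2.10", 2000),
    ]
    for next_hop, label in build_flooding_list(received):
        print(next_hop, label)
```

With the sample input, the two remote PEs collapse into a single branch towards the ASBR, so the ingress PE replicates three copies rather than four. Keying on the next hop rather than on the originator address is what lets the ASBR act as a proxy leaf for the PEs behind it.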
439 To address this backward compatibility problem, the following 440 procedure can be used (see Section 6.2 for per-PE/AS/region I-PMSI 441 A-D routes): 443 o An upgraded PE indicates in its per-PE I-PMSI A-D route that it 444 supports the new procedures. Details will be provided in a future 445 revision. 447 o All per-PE I-PMSI A-D routes are restricted to the local AS and 448 not propagated to external peers. 450 o The ASBRs in an AS originate per-region I-PMSI A-D routes and 451 advertise them to their external peers to announce the tunnels used to 452 carry traffic from the local AS to other ASes. Depending on the 453 types of tunnels being used, the LIR flag in the PTA may be set, 454 in which case the downstream ASBRs and upgraded PEs will send Leaf 455 A-D routes to pull traffic from their upstream ASBRs. In a 456 particular downstream AS, one of the ASBRs is elected, based on 457 the per-region I-PMSI A-D routes for a particular source AS, to 458 send traffic from that source AS to legacy PEs in the downstream 459 AS. The traffic arrives at the elected ASBR on the tunnel 460 announced in the best per-region I-PMSI A-D route for the source 461 AS, which the ASBR has selected from all those that it received over 462 EBGP or IBGP sessions. Details of the election procedure will be 463 provided in a future revision. 465 o In an ingress AS, if and only if an ASBR has active downstream 466 receivers (PEs and ASBRs), which are learned either explicitly via 467 Leaf A-D routes or implicitly via PIM join or mLDP label mapping, 468 the ASBR originates a per-PE I-PMSI A-D route (i.e., regular 469 Inclusive Multicast Ethernet Tag route) into the local AS, and 470 stitches incoming per-PE I-PMSI tunnels into its per-region I-PMSI 471 tunnel. With this, it gets traffic from local PEs and sends it to 472 other ASes via the tunnel announced in its per-region I-PMSI A-D 473 route. 
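The condition in the last bullet above (an ASBR in the ingress AS originates a per-PE I-PMSI A-D route if and only if it has active downstream receivers) can be sketched as a simple predicate. This is a hedged illustration only: the AsbrDownstreamState record, its field names, and should_originate_per_pe_ipmsi are hypothetical, and receivers are reduced to sets of peer identifiers.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class AsbrDownstreamState:
    """Hypothetical summary of what an ASBR knows about downstream receivers."""
    # Explicit tracking: peers from which Leaf A-D routes were imported.
    leaf_ad_peers: Set[str] = field(default_factory=set)
    # Implicit tracking: peers that sent a PIM join for the downstream tunnel.
    pim_join_peers: Set[str] = field(default_factory=set)
    # Implicit tracking: peers that sent an mLDP label mapping.
    mldp_mapping_peers: Set[str] = field(default_factory=set)

    def has_active_receivers(self) -> bool:
        """True if at least one receiver is known, explicitly or implicitly."""
        return bool(
            self.leaf_ad_peers or self.pim_join_peers or self.mldp_mapping_peers
        )


def should_originate_per_pe_ipmsi(state: AsbrDownstreamState) -> bool:
    """Originate the per-PE I-PMSI A-D route into the local AS if and only if
    the ASBR has active downstream receivers."""
    return state.has_active_receivers()


if __name__ == "__main__":
    idle = AsbrDownstreamState()
    busy = AsbrDownstreamState(leaf_ad_peers={"203.0.113.1"})
    print(should_originate_per_pe_ipmsi(idle))
    print(should_originate_per_pe_ipmsi(busy))
```

Under these assumptions, an ASBR with no known receivers stays out of the local per-PE I-PMSI tunnels, so local ingress PEs do not send it traffic that it would only discard.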
475 Note that, even if there is no backward compatibility issue, the 476 above procedures have the benefit of keeping all per-PE I-PMSI A-D 477 routes in their local ASes, greatly reducing the flooding of the 478 routes and their corresponding Leaf A-D routes (when needed), and the 479 number of inter-AS tunnels. 481 6. Inter-Region Segmentation 483 6.1. Area vs. Region 485 RFC 7524 is for MVPN/VPLS inter-area segmentation and does not 486 explicitly cover EVPN. However, if "area" is replaced by "region" 487 and "ABR" is replaced by "RBR" (Regional Border Router) then 488 everything still works, and can be applied to EVPN as well. 490 A region can be a sub-area, or can be an entire AS including its 491 external links. Instead of automatic region definition based on IGP 492 areas, a region would be defined as a BGP peer group. In fact, even 493 with IGP area based region definition, a BGP peer group listing the 494 PEs and ABRs in an area is still needed. 496 Consider the following example diagram: 498 --------- ------ --------- 499 / \ / \ / \ 500 / \ / \ / \ 501 | PE1 o ASBR1 -- ASBR2 ASBR3 -- ASBR4 o PE2 | 502 \ / \ / \ / 503 \ / \ / \ / 504 --------- ------ --------- 505 AS 100 AS 200 AS 300 506 |-----------|--------|---------|--------|------------| 507 segment1 segment2 segment3 segment4 segment5 509 The inter-AS segmentation procedures specified so far (RFC 6513/6514, 510 7117, and Section 5 of this document) require all ASBRs to be 511 involved, and Ingress Replication is used between two ASBRs in 512 different ASes. 514 In the above diagram, it's possible that ASBR1/4 does not support 515 segmentation, and the provider tunnels in AS 100/300 can actually 516 extend across the external link. In that case, the inter-region 517 segmentation procedures can be used instead: a region is the entire 518 (AS100 + ASBR1-ASBR2 link) or (AS300 + ASBR3-ASBR4 link). 
ASBR2/3 519 would be the RBRs, and ASBR1/4 will just be transit core routers 520 with respect to provider tunnels. 522 As illustrated in the diagram below, ASBR2/3 will establish a 523 multihop EBGP session with either an RR or directly with PEs in the 524 neighboring AS. I/S-PMSI A-D routes from ingress PEs will not be 525 processed by ASBR1/4. When ASBR2 re-advertises the routes into AS 526 200, it changes the next hop to its own address and changes PTA to 527 specify the tunnel type/identification in its own AS. When ASBR3 re- 528 advertises I/S-PMSI A-D routes into the neighboring AS 300, it 529 changes the next hop to its own address and changes PTA to specify 530 the tunnel type/identification in the neighboring region 3. Now the 531 segment is rooted at ASBR3 and extends across the external link to 532 PEs. 534 --------- ------ --------- 535 / RR....\.mh-ebgp / \ mh-ebgp/....RR \ 536 / : \ `. / \ .' / : \ 537 | PE1 o ASBR1 -- ASBR2 ASBR3 -- ASBR4 o PE2 | 538 \ / \ / \ / 539 \ / \ / \ / 540 --------- ------ --------- 541 AS 100 AS 200 AS 300 542 |-------------------|----------|---------------------| 543 segment 1 segment 2 segment 3 545 6.2. Per-region Aggregation 547 Notice that every I/S-PMSI route from each PE will be propagated 548 throughout all the ASes or regions. They may also trigger 549 corresponding Leaf A-D routes depending on the types of tunnels used 550 in each region. This may result in too many routes and corresponding 551 tunnels. To address this concern, the I-PMSI routes from all PEs in 552 an AS/region can be aggregated into a single I-PMSI route originated 553 from the RBRs, and traffic from all those individual I-PMSI tunnels 554 will be switched into the single I-PMSI tunnel. This is like the 555 MVPN Inter-AS I-PMSI route originated by ASBRs. 557 The MVPN Inter-AS I-PMSI A-D route would be better called a per-AS 558 I-PMSI A-D route, to be compared against the (per-PE) Intra-AS I-PMSI 559 A-D routes originated by each PE. 
In this document we will call it 560 a per-region I-PMSI A-D route, in case we want to apply the 561 aggregation at regional level. The per-PE I-PMSI routes will not be 562 propagated to other regions. If multiple RBRs are connected to a 563 region, then each will advertise such a route, with the same route 564 key (Section 3.1). Similar to the per-PE I-PMSI A-D routes, RBRs/PEs 565 in a downstream region will each select a best one from all those re- 566 advertised by the upstream RBRs, hence will only receive traffic 567 injected by one of them. 569 MVPN does not aggregate S-PMSI routes from all PEs in an AS like it 570 does for I-PMSI routes, because the number of PEs that will 571 advertise S-PMSI routes for the same (s,g) or (*,g) is small. This 572 is also the case for EVPN, i.e., there are no per-region S-PMSI 573 routes. 575 Notice that per-region I-PMSI routes can also be used to address 576 the backward compatibility issue, as discussed in Section 5.3. 578 The per-region I-PMSI route uses an embedded EC in NLRI to identify a 579 region. Any EC is permitted as long as it uniquely identifies the region and the RBRs 580 for the same region use the same EC. In the case 581 where an AS number or area ID is needed, the following can be used: 583 o For a two-octet AS number, a Transitive Two-Octet AS-Specific EC 584 of sub-type 0x09 (Source AS), with the Global Administrator sub- 585 field set to the AS number and the Local Administrator sub-field 586 set to 0. 588 o For a four-octet AS number, a Transitive Four-Octet AS-Specific EC 589 of sub-type 0x09 (Source AS), with the Global Administrator sub- 590 field set to the AS number and the Local Administrator sub-field 591 set to 0. 593 o For an area ID, a Transitive IPv4-Address-Specific EC of any sub- 594 type. 596 Uses of other particular ECs may be specified in other documents. 598 6.3. 
Use of S-NH-EC 600 RFC 7524 specifies the use of S-NH-EC because it does not allow ABRs 601 to change the BGP next hop when they re-advertise I/S-PMSI A-D routes 602 to downstream areas. That is only to be consistent with the MVPN 603 Inter-AS I-PMSI A-D routes, whose next hop must not be changed when 604 they're re-advertised by the segmenting ABRs for reasons specific to 605 MVPN. For EVPN, it is perfectly fine to change the next hop when 606 RBRs re-advertise the I/S-PMSI A-D routes, instead of relying on S- 607 NH-EC. As a result, this document specifies that RBRs change the BGP 608 next hop when they re-advertise I/S-PMSI A-D routes and do not use S- 609 NH-EC. If a downstream PE/RBR needs to originate Leaf A-D routes, it 610 simply uses the BGP next hop in the corresponding I/S-PMSI A-D routes 611 to construct Route Targets. 613 The advantage of this is that neither ingress nor egress PEs need to 614 understand/use S-NH-EC, and a consistent procedure (based on BGP next 615 hop) is used for both inter-AS and inter-region segmentation. 617 6.4. Ingress PE's I-PMSI Leaf Tracking 619 RFC 7524 specifies that when an ingress PE/ASBR (re-)advertises a 620 VPLS I-PMSI A-D route, it sets the LIR flag to 1 in the route's PTA. 621 Similar to the inter-AS case, this is not actually needed for 622 EVPN. To be consistent with the inter-AS case, the ingress PE does 623 not set the LIR flag in its originated I-PMSI A-D routes, and 624 determines the leaves based on the BGP next hops in its received 625 I-PMSI A-D routes, as specified in Section 5.2. 627 The same backward compatibility issue exists, and the same solution 628 as in the inter-AS case applies, as specified in Section 5.3. 630 7. Multi-homing Support 632 If multi-homing does not span across different ASes or regions, 633 existing procedures work with segmentation, and a segmentation point 634 will remove the ESI label from the packets. 
   If an ES is multi-homed to PEs in different ASes or regions,
   additional procedures are needed to work with segmentation.  The
   procedures are well understood but omitted here until the
   requirement becomes clear.

8.  IANA Considerations

   This document requests IANA to assign the following new EVPN route
   types:

   o  9 - Per-Region I-PMSI A-D route

   o  10 - S-PMSI A-D route

   o  11 - Leaf A-D route

9.  Security Considerations

   This document does not seem to introduce new security risks,
   though this may be revised after further review and scrutiny.

10.  Acknowledgements

   The authors thank Eric Rosen, John Drake, and Ron Bonica for their
   comments and suggestions.

11.  Contributors

   The following also contributed to this document through their
   earlier work in EVPN selective multicast.

   Junlin Zhang
   Huawei Technologies
   Huawei Bld., No.156 Beiqing Rd.
   Beijing 100095
   China

   Email: jackey.zhang@huawei.com

   Zhenbin Li
   Huawei Technologies
   Huawei Bld., No.156 Beiqing Rd.
   Beijing 100095
   China

   Email: lizhenbin@huawei.com

12.  References

12.1.  Normative References

   [I-D.ietf-bess-evpn-igmp-mld-proxy]
              Sajassi, A., Thoria, S., Patel, K., Yeung, D., Drake,
              J., and W. Lin, "IGMP and MLD Proxy for EVPN",
              draft-ietf-bess-evpn-igmp-mld-proxy-00 (work in
              progress), March 2017.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC7117]  Aggarwal, R., Ed., Kamite, Y., Fang, L., Rekhter, Y.,
              and C. Kodeboniya, "Multicast in Virtual Private LAN
              Service (VPLS)", RFC 7117, DOI 10.17487/RFC7117,
              February 2014.

   [RFC7432]  Sajassi, A., Ed., Aggarwal, R., Bitar, N., Isaac, A.,
              Uttaro, J., Drake, J., and W. Henderickx, "BGP
              MPLS-Based Ethernet VPN", RFC 7432,
              DOI 10.17487/RFC7432, February 2015.

   [RFC7524]  Rekhter, Y., Rosen, E., Aggarwal, R., Morin, T.,
              Grosclaude, I., Leymann, N., and S. Saad, "Inter-Area
              Point-to-Multipoint (P2MP) Segmented Label Switched
              Paths (LSPs)", RFC 7524, DOI 10.17487/RFC7524, May
              2015.

   [RFC7988]  Rosen, E., Ed., Subramanian, K., and Z. Zhang, "Ingress
              Replication Tunnels in Multicast VPN", RFC 7988,
              DOI 10.17487/RFC7988, October 2016.

12.2.  Informative References

   [I-D.ietf-bess-evpn-overlay]
              Sajassi, A., Drake, J., Bitar, N., Shekhar, R., Uttaro,
              J., and W. Henderickx, "A Network Virtualization
              Overlay Solution using EVPN",
              draft-ietf-bess-evpn-overlay-08 (work in progress),
              March 2017.

   [I-D.ietf-bier-architecture]
              Wijnands, I., Rosen, E., Dolganow, A., Przygienda, T.,
              and S. Aldrin, "Multicast using Bit Index Explicit
              Replication", draft-ietf-bier-architecture-08 (work in
              progress), September 2017.

   [I-D.zzhang-bier-evpn]
              Zhang, Z., Przygienda, T., Sajassi, A., and J. Rabadan,
              "EVPN BUM Using BIER", draft-zzhang-bier-evpn-00 (work
              in progress), June 2017.

   [RFC6513]  Rosen, E., Ed. and R. Aggarwal, Ed., "Multicast in
              MPLS/BGP IP VPNs", RFC 6513, DOI 10.17487/RFC6513,
              February 2012.

   [RFC6514]  Aggarwal, R., Rosen, E., Morin, T., and Y. Rekhter,
              "BGP Encodings and Procedures for Multicast in MPLS/BGP
              IP VPNs", RFC 6514, DOI 10.17487/RFC6514, February
              2012.

Authors' Addresses

   Zhaohui Zhang
   Juniper Networks

   EMail: zzhang@juniper.net

   Wen Lin
   Juniper Networks

   EMail: wlin@juniper.net

   Jorge Rabadan
   Nokia

   EMail: jorge.rabadan@nokia.com

   Keyur Patel
   Arrcus

   EMail: keyur@arrcus.com

   Ali Sajassi
   Cisco Systems

   EMail: sajassi@cisco.com