idnits 2.17.1 draft-ietf-bess-evpn-bum-procedure-updates-05.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** There are 16 instances of too long lines in the document, the longest one being 3 characters in excess of 72. -- The draft header indicates that this document updates RFC7432, but the abstract doesn't seem to mention this, which it should. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year -- The document date (December 13, 2018) is 1933 days in the past. Is this intentional? 
Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Missing Reference: 'RFC 7524' is mentioned on line 196, but not defined == Unused Reference: 'RFC2119' is defined on line 763, but no explicit reference was found in the text == Unused Reference: 'RFC7432' is defined on line 773, but no explicit reference was found in the text == Unused Reference: 'RFC7524' is defined on line 778, but no explicit reference was found in the text == Unused Reference: 'RFC7988' is defined on line 784, but no explicit reference was found in the text == Unused Reference: 'I-D.ietf-bier-architecture' is defined on line 791, but no explicit reference was found in the text == Unused Reference: 'I-D.ietf-bier-evpn' is defined on line 797, but no explicit reference was found in the text == Unused Reference: 'RFC6513' is defined on line 802, but no explicit reference was found in the text == Unused Reference: 'RFC6514' is defined on line 806, but no explicit reference was found in the text == Outdated reference: A later version (-09) exists of draft-ietf-bess-evpn-df-election-framework-06 == Outdated reference: A later version (-21) exists of draft-ietf-bess-evpn-igmp-mld-proxy-02 == Outdated reference: A later version (-14) exists of draft-ietf-bier-evpn-01 Summary: 1 error (**), 0 flaws (~~), 13 warnings (==), 2 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 BESS Z. Zhang 3 Internet-Draft W. Lin 4 Updates: 7432 (if approved) Juniper Networks 5 Intended status: Standards Track J. Rabadan 6 Expires: June 16, 2019 Nokia 7 K. Patel 8 Arrcus 9 A. 
Sajassi 10 Cisco Systems 11 December 13, 2018 13 Updates on EVPN BUM Procedures 14 draft-ietf-bess-evpn-bum-procedure-updates-05 16 Abstract 18 This document specifies procedure updates for broadcast, unknown 19 unicast, and multicast (BUM) traffic in Ethernet VPNs (EVPN), 20 including selective multicast, and provider tunnel segmentation. 22 Requirements Language 24 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 25 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 26 document are to be interpreted as described in RFC2119. 28 Status of This Memo 30 This Internet-Draft is submitted in full conformance with the 31 provisions of BCP 78 and BCP 79. 33 Internet-Drafts are working documents of the Internet Engineering 34 Task Force (IETF). Note that other groups may also distribute 35 working documents as Internet-Drafts. The list of current Internet- 36 Drafts is at https://datatracker.ietf.org/drafts/current/. 38 Internet-Drafts are draft documents valid for a maximum of six months 39 and may be updated, replaced, or obsoleted by other documents at any 40 time. It is inappropriate to use Internet-Drafts as reference 41 material or to cite them other than as "work in progress." 43 This Internet-Draft will expire on June 16, 2019. 45 Copyright Notice 47 Copyright (c) 2018 IETF Trust and the persons identified as the 48 document authors. All rights reserved. 50 This document is subject to BCP 78 and the IETF Trust's Legal 51 Provisions Relating to IETF Documents 52 (https://trustee.ietf.org/license-info) in effect on the date of 53 publication of this document. Please review these documents 54 carefully, as they describe your rights and restrictions with respect 55 to this document. Code Components extracted from this document must 56 include Simplified BSD License text as described in Section 4.e of 57 the Trust Legal Provisions and are provided without warranty as 58 described in the Simplified BSD License. 
60 Table of Contents 62 1. Terminology 63 2. Introduction 64 2.1. Reasons for Tunnel Segmentation 65 3. Additional Route Types of EVPN NLRI 66 3.1. Per-Region I-PMSI A-D route 67 3.2. S-PMSI A-D route 68 3.3. Leaf-AD route 69 4. Selective Multicast 70 5. Inter-AS Segmentation 71 5.1. Changes to Section 7.2.2 of RFC 7117 72 5.2. I-PMSI Leaf Tracking 73 5.3. Backward Compatibility 74 5.3.1. Designated ASBR Election 75 6. Inter-Region Segmentation 76 6.1. Area vs. Region 77 6.2. Per-region Aggregation 78 6.3. Use of S-NH-EC 79 6.4. Ingress PE's I-PMSI Leaf Tracking 80 7. Multi-homing Support 81 8. IANA Considerations 82 9. Security Considerations 83 10. Acknowledgements 84 11. Contributors 85 12. References 86 12.1. Normative References 87 12.2. Informative References 88 Authors' Addresses 90 1.
Terminology 92 It is expected that the audience is familiar with EVPN and MVPN concepts 93 and terminology. For convenience, the following terms are briefly 94 explained. 96 o PMSI: P-Multicast Service Interface - a conceptual interface for a 97 PE to send customer multicast traffic to all or some PEs in the 98 same VPN. 100 o I-PMSI: Inclusive PMSI - to all PEs in the same VPN. 102 o S-PMSI: Selective PMSI - to some of the PEs in the same VPN. 104 o Leaf A-D routes: For explicit leaf tracking purposes. Triggered by 105 S-PMSI A-D routes and targeted at the triggering route's originator. 107 o IMET A-D route: Inclusive Multicast Ethernet Tag A-D route. The 108 EVPN equivalent of MVPN Intra-AS I-PMSI A-D route. 110 o SMET A-D route: Selective Multicast Ethernet Tag A-D route. The 111 EVPN equivalent of MVPN Leaf A-D route but unsolicited and 112 untargeted. 114 2. Introduction 116 RFC 7432 specifies procedures to handle broadcast, unknown unicast, 117 and multicast (BUM) traffic in Sections 11, 12 and 16, using the Inclusive 118 Multicast Ethernet Tag route. Many details are deferred to RFC 119 7117 (VPLS Multicast). In particular, selective multicast is briefly 120 mentioned for Ingress Replication but deferred to RFC 7117. 122 RFC 7117 specifies procedures for using both inclusive tunnels and 123 selective tunnels, similar to MVPN procedures specified in RFC 6513 124 and RFC 6514. A new SAFI "MCAST-VPLS" is introduced, with two types 125 of NLRIs that match MVPN's S-PMSI A-D routes and Leaf A-D routes. 126 The same procedures can be applied to EVPN selective multicast for 127 both Ingress Replication and other tunnel types, but new route types 128 need to be defined under the same EVPN SAFI. 130 MVPN uses the terms I-PMSI and S-PMSI A-D Routes. For consistency and 131 convenience, this document will use the same I/S-PMSI terms for VPLS 132 and EVPN.
In particular, EVPN's Inclusive Multicast Ethernet Tag 133 Route and VPLS's VPLS A-D route carrying PTA (PMSI Tunnel Attribute) 134 for BUM traffic purpose will all be referred to as I-PMSI A-D routes. 135 Depending on the context, they may be used interchangeably. 137 MVPN provider tunnels and EVPN/VPLS BUM provider tunnels, which are 138 referred to as MVPN/EVPN/VPLS provider tunnels in this document for 139 simplicity, can be segmented for technical or administrative reasons, 140 which are summarized in Section 2.1 of this document. RFC 6513/6514 141 cover MVPN inter-as segmentation, RFC 7117 covers VPLS multicast 142 inter-as segmentation, and RFC 7524 (Seamless MPLS Multicast) covers 143 inter-area segmentation for both MVPN and VPLS. 145 There is a difference between MVPN and VPLS multicast inter-as 146 segmentation. For simplicity, EVPN will use the same procedures as 147 in MVPN. All ASBRs can re-advertise their choice of the best route. 148 Each can become the root of its intra-AS segment and inject traffic 149 it receives from its upstream, while each downstream PE/ASBR will 150 only pick one of the upstream ASBRs as its upstream. This is also 151 the behavior even for VPLS in case of inter-area segmentation. 153 For inter-area segmentation, RFC 7524 requires the use of Inter-area 154 P2MP Segmented Next-Hop Extended Community (S-NH-EC), and the setting 155 of "Leaf Information Required" (LIR) flag in PTA in certain 156 situations. Either of these could be optional in case of EVPN. 157 Removing these requirements would make the segmentation procedures 158 transparent to ingress and egress PEs. 160 RFC 7524 assumes that segmentation happens at area borders. However, 161 it could be at "regional" borders, where a region could be a sub- 162 area, or even an entire AS plus its external links (Section 6). That 163 would allow for more flexible deployment scenarios (e.g. for single- 164 area provider networks). 
166 This document specifies/clarifies/redefines certain/additional EVPN 167 BUM procedures, with the goal that they are better aligned among 168 MVPN, EVPN and VPLS. For brevity, only changes/additions to relevant 169 RFC 7117 and RFC 7524 procedures are specified, instead of repeating 170 the entire procedures. Note that these are to be applied to EVPN 171 only, even though sometimes they may read as updates to RFC 172 7117/7524. 174 2.1. Reasons for Tunnel Segmentation 176 Tunnel segmentation may be required and/or desired because of 177 administrative and/or technical reasons. 179 For example, an MVPN/VPLS/EVPN network may span multiple providers 180 and Inter-AS Option-B has to be used, in which the end-to-end 181 provider tunnels have to be segmented at and stitched by the ASBRs. 182 Different providers may use different tunnel technologies (e.g., 183 provider A uses Ingress Replication, provider B uses RSVP-TE P2MP 184 while provider C uses mLDP). Even if they use the same tunnel 185 technology like RSVP-TE P2MP, it may be impractical to set up the 186 tunnels across provider boundaries. 188 The same situations may apply between the ASes and/or areas of a 189 single provider. For example, the backbone area may use RSVP-TE P2MP 190 tunnels while non-backbone areas may use mLDP tunnels. 192 Segmentation can also be used to divide an AS/area into smaller 193 regions, so that control plane state and/or forwarding plane state/ 194 burden can be limited to that of individual regions. For example, 195 instead of Ingress Replicating to 100 PEs in the entire AS, with 196 inter-area segmentation [RFC 7524] a PE only needs to replicate to 197 local PEs and ABRs. The ABRs will further replicate to their 198 downstream PEs and ABRs. This not only reduces the forwarding plane 199 burden, but also reduces the leaf tracking burden in the control 200 plane.
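The replication saving described above can be made concrete with a small worked calculation. The sketch below is illustrative only: the function names and the layout (100 PEs spread evenly over 4 areas, with 2 ABRs attached to the ingress PE's area) are assumptions chosen for the example, not values taken from this document or from RFC 7524.

```python
def ingress_copies_flat(total_pes: int) -> int:
    """Copies sent by an ingress PE using unsegmented Ingress Replication.

    Every other PE in the AS is a leaf, so one copy is sent per remote PE.
    """
    return total_pes - 1


def ingress_copies_segmented(local_pes: int, local_abrs: int) -> int:
    """Copies sent by an ingress PE when tunnels are segmented at the ABRs.

    The ingress PE replicates only to the other PEs in its own area and to
    the ABRs of that area; the ABRs replicate further downstream.
    """
    return (local_pes - 1) + local_abrs


# Hypothetical layout: 100 PEs in total, 25 per area, 2 ABRs in the local area.
flat = ingress_copies_flat(100)
segmented = ingress_copies_segmented(local_pes=25, local_abrs=2)
print(flat, segmented)
```

Under these assumed numbers the ingress PE's fan-out, and the number of leaves it has to track, shrinks from one entry per remote PE in the AS to one entry per local PE or ABR.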
202 Smaller regions also have the benefit that, in case of tunnel 203 aggregation, it is easier to find congruence among the segments of 204 different constituent (service) tunnels and the resulting aggregation 205 (base) tunnel in a region. This leads to better bandwidth 206 efficiency, because the more congruent they are, the fewer leaves of 207 the base tunnel need to discard traffic when a service tunnel's 208 segment does not need to receive the traffic (yet it is receiving the 209 traffic due to aggregation). 211 Another advantage of the smaller region is smaller BIER sub-domains. 212 In BIER, a new multicast architecture, packets carry a BitString, 213 in which the bits correspond to edge routers that need to receive 214 traffic. Smaller sub-domains mean smaller BitStrings can be used 215 without having to send multiple copies of the same packet. 217 3. Additional Route Types of EVPN NLRI 219 RFC 7432 defines the format of EVPN NLRI as the following: 221 +-----------------------------------+ 222 | Route Type (1 octet) | 223 +-----------------------------------+ 224 | Length (1 octet) | 225 +-----------------------------------+ 226 | Route Type specific (variable) | 227 +-----------------------------------+ 229 So far eight types have been defined: 231 + 1 - Ethernet Auto-Discovery (A-D) route 232 + 2 - MAC/IP Advertisement route 233 + 3 - Inclusive Multicast Ethernet Tag route 234 + 4 - Ethernet Segment route 235 + 5 - IP Prefix Route 236 + 6 - Selective Multicast Ethernet Tag Route 237 + 7 - Multicast Join Synch Route 238 + 8 - Multicast Leave Synch Route 240 This document defines three additional route types: 242 + 9 - Per-Region I-PMSI A-D route 243 + 10 - S-PMSI A-D route 244 + 11 - Leaf A-D route 246 The "Route Type specific" field of the type 9 and type 10 EVPN NLRIs 247 starts with a type 1 RD, whose Administrative sub-field MUST match 248 that of the RD in all the EVPN routes from the same advertising 249 router for a given EVI, except the Leaf
A-D route (Section 3.3). 251 3.1. Per-Region I-PMSI A-D route 253 The Per-region I-PMSI A-D route has the following format. Its usage 254 is discussed in Section 6.2. 256 +-----------------------------------+ 257 | RD (8 octets) | 258 +-----------------------------------+ 259 | Ethernet Tag ID (4 octets) | 260 +-----------------------------------+ 261 | Extended Community (8 octets) | 262 +-----------------------------------+ 264 After Ethernet Tag ID, an Extended Community (EC) is used to identify 265 the region. Various types and sub-types of ECs provide maximum 266 flexibility. Note that this is not an EC Attribute, but an 8-octet 267 field embedded in the NLRI itself, following EC encoding scheme. 269 3.2. S-PMSI A-D route 271 The S-PMSI A-D route has the following format: 273 +-----------------------------------+ 274 | RD (8 octets) | 275 +-----------------------------------+ 276 | Ethernet Tag ID (4 octets) | 277 +-----------------------------------+ 278 | Multicast Source Length (1 octet) | 279 +-----------------------------------+ 280 | Multicast Source (Variable) | 281 +-----------------------------------+ 282 | Multicast Group Length (1 octet) | 283 +-----------------------------------+ 284 | Multicast Group (Variable) | 285 +-----------------------------------+ 286 |Originator's Addr Length (1 octet) | 287 +-----------------------------------+ 288 |Originator's Addr (4 or 16 octets) | 289 +-----------------------------------+ 291 Other than the addition of Ethernet Tag ID and Originator's Addr 292 Length, it is identical to the S-PMSI A-D route as defined in RFC 293 7117. The procedures in RFC 7117 also apply (including wildcard 294 functionality), except that the granularity level is per Ethernet 295 Tag. 297 3.3. 
Leaf-AD route 299 The Route Type specific field of a Leaf A-D route consists of the 300 following: 302 +-----------------------------------+ 303 | Route Key (variable) | 304 +-----------------------------------+ 305 |Originator's Addr Length (1 octet) | 306 +-----------------------------------+ 307 |Originator's Addr (4 or 16 octets) | 308 +-----------------------------------+ 310 A Leaf A-D route is originated in response to a PMSI route, which 311 could be an Inclusive Multicast Ethernet Tag route, a per-region I-PMSI A-D 312 route, an S-PMSI A-D route, or some other type of route that may be 313 defined in the future and that triggers Leaf A-D routes. The Route Key 314 is the "Route Type Specific" field of the route for which this Leaf 315 A-D route is generated. 317 The general procedures of Leaf A-D routes are first specified in RFC 318 6514 for MVPN. The principles apply to VPLS and EVPN as well. RFC 319 7117 has details for VPLS Multicast, and this document points out 320 some specifics for EVPN, e.g., in Section 5. 322 4. Selective Multicast 324 [I-D.ietf-bess-evpn-igmp-mld-proxy] specifies procedures for EVPN 325 selective forwarding of IP multicast using SMET routes. It assumes 326 selective forwarding is always used with IR or BIER for all flows. 327 An NVE proxies the IGMP/MLD state that it learns on its ACs to 328 (C-S,C-G) or (C-*,C-G) SMET routes and advertises them to other NVEs, and 329 a receiving NVE converts the SMET routes back to IGMP/MLD messages 330 and sends them out of its ACs. The receiving NVE also uses the SMET 331 routes to identify which NVEs need to receive traffic for a 332 particular (C-S,C-G) or (C-*,C-G) to achieve selective forwarding 333 using IR or BIER. 335 With the above procedures, selective forwarding is done for all flows 336 and the SMET routes are advertised for all flows.
It is possible 337 that an operator may not want to track all of that (C-S,C-G) or 338 (C-*,C-G) state on the NVEs, and the multicast traffic pattern allows 339 inclusive forwarding for most flows while selective forwarding is 340 needed only for a few high-rate flows. For that, or for tunnel types 341 other than IR/BIER, S-PMSI/Leaf A-D procedures defined for Selective 342 Multicast for VPLS in [RFC7117] are used. Except that different 343 route types and formats are specified with EVPN SAFI for S-PMSI A-D 344 and Leaf A-D routes (Section 3), all procedures in [RFC7117] with 345 respect to Selective Multicast apply to EVPN as well, including 346 wildcard procedures. In a nutshell, a source NVE advertises S-PMSI 347 A-D routes to announce the tunnels used for certain flows, and 348 receiving NVEs either join the announced PIM/mLDP tunnel or respond 349 with Leaf A-D routes if the Leaf Information Required flag is set in 350 the S-PMSI A-D route's PTA (so that the source NVE can include them 351 as tunnel leaves). 353 An optimization to the [RFC7117] procedures may be applied. Even if 354 a source NVE sets the LIR bit to request Leaf A-D routes, an egress 355 NVE may omit the Leaf A-D route if it already advertises a 356 corresponding SMET route, and the source NVE will use that in lieu of 357 the Leaf A-D route. 359 The optional optimizations specified for MVPN in 360 [I-D.ietf-bess-mvpn-expl-track] are also applicable to EVPN when the 361 S-PMSI/Leaf A-D route procedures are used for EVPN selective 362 multicast forwarding. 364 5. Inter-AS Segmentation 365 5.1. Changes to Section 7.2.2 of RFC 7117 367 The first paragraph of Section 7.2.2.2 of RFC 7117 says: 369 "... The best route procedures ensure that if multiple 370 ASBRs, in an AS, receive the same Inter-AS A-D route from their EBGP 371 neighbors, only one of these ASBRs propagates this route in Internal 372 BGP (IBGP).
This ASBR becomes the root of the intra-AS segment of 373 the inter-AS tree and ensures that this is the only ASBR that accepts 374 traffic into this AS from the inter-AS tree." 376 The above VPLS behavior requires complicated VPLS specific procedures 377 for the ASBRs to reach agreement. For EVPN, a different approach is 378 used and the above quoted text is not applicable to EVPN. 380 With the different approach for EVPN, each ASBR will re-advertise its 381 received Inter-AS A-D route to its IBGP peers and becomes the root of 382 an intra-AS segment of the inter-AS tree. The intra-AS segment 383 rooted at one ASBR is disjoint with another intra-AS segment rooted 384 at another ASBR. This is the same as the procedures for S-PMSI in 385 RFC 7117 itself. 387 The following text at the end of the second bullet: 389 "................................................... If, in order 390 to instantiate the segment, the ASBR needs to know the leaves of 391 the tree, then the ASBR obtains this information from the A-D 392 routes received from other PEs/ASBRs in the ASBR's own AS." 394 is changed to the following when applied to EVPN: 396 "................................................... If, in order 397 to instantiate the segment, the ASBR needs to know the leaves of 398 the tree, then the ASBR MUST set the LIR flag to 1 in the PTA to 399 trigger Leaf A-D routes from egress PEs and downstream ASBRs. 400 It MUST be (auto-)configured with an import RT, which controls 401 acceptance of leaf A-D routes by the ASBR." 403 Accordingly, the following paragraph in Section 7.2.2.4: 405 "If the received Inter-AS A-D route carries the PMSI Tunnel attribute 406 with the Tunnel Identifier set to RSVP-TE P2MP LSP, then the ASBR 407 that originated the route MUST establish an RSVP-TE P2MP LSP with the 408 local PE/ASBR as a leaf. This LSP MAY have been established before 409 the local PE/ASBR receives the route, or it MAY be established after 410 the local PE receives the route." 
412 is changed to the following when applied to EVPN: 414 "If the received Inter-AS A-D route has the LIR flag set in its PTA, 415 then a receiving PE must originate a corresponding Leaf A-D route, 416 and a receiving ASBR must originate a corresponding Leaf A-D route 417 if and only if it received and imported one or more corresponding Leaf 418 A-D routes from its downstream IBGP or EBGP peers, or it has non-null 419 downstream forwarding state for the PIM/mLDP tunnel that instantiates 420 its downstream intra-AS segment. The ASBR that (re-)advertised the 421 Inter-AS A-D route then establishes a tunnel to the leaves discovered 422 by the Leaf A-D routes." 424 5.2. I-PMSI Leaf Tracking 426 An ingress PE does not set the LIR flag in its I-PMSI's PTA, even 427 with Ingress Replication or RSVP-TE P2MP tunnels. It does not rely 428 on the Leaf A-D routes to discover leaves in its AS, and Section 11.2 429 of RFC 7432 explicitly states that the LIR flag must be set to zero. 431 An implementation of RFC 7432 might have used the Originating 432 Router's IP Address field of the Inclusive Multicast Ethernet Tag 433 routes to determine the leaves, or might have used the Next Hop field 434 instead. Within the same AS, both will lead to the same result. 436 With segmentation, an ingress PE MUST determine the leaves in its AS 437 from the BGP next hops in all its received I-PMSI A-D routes, so it 438 does not have to set the LIR bit to request Leaf A-D routes. PEs 439 within the same AS will all have different next hops in their I-PMSI 440 A-D routes (hence will all be considered as leaves), and PEs from 441 other ASes will have the next hop in their I-PMSI A-D routes set to 442 addresses of ASBRs in this local AS, hence only those ASBRs will be 443 considered as leaves (as proxies for those PEs in other ASes).
Note 444 that in case of Ingress Replication, when an ASBR re-advertises IBGP 445 I-PMSI A-D routes, it MUST advertise the same label for all those for 446 the same Ethernet Tag ID and the same EVI. When an ingress PE builds 447 its flooding list, multiple routes may have the same (nexthop, label) 448 tuple and they will only be added as a single branch in the flooding 449 list. 451 5.3. Backward Compatibility 453 The above procedures assume that all PEs are upgraded to support the 454 segmentation procedures: 456 o An ingress PE uses the Next Hop instead of Originating Router's IP 457 Address to determine leaves for the I-PMSI tunnel. 459 o An egress PE sends Leaf A-D routes in response to I-PMSI routes, 460 if the PTA has the LIR flag set (by the re-advertising ASBRs). 462 o In case of Ingress Replication, when an ingress PE builds its 463 flooding list, multiple I-PMSI routes may have the same (nexthop, 464 label) tuple and only a single branch for those will be added in 465 the flooding list. 467 If a deployment has legacy PEs that do not support the above, then 468 a legacy ingress PE would include all PEs (including those in remote 469 ASes) as leaves of the inclusive tunnel and try to send traffic to 470 them directly (no segmentation), which is either undesired or not 471 possible; a legacy egress PE would not send Leaf A-D routes so the 472 ASBRs would not know to send external traffic to them. 474 To address this backward compatibility problem, the following 475 procedure can be used (see Section 6.2 for per-PE/AS/region I-PMSI 476 A-D routes): 478 o An upgraded PE indicates in its per-PE I-PMSI A-D route that it 479 supports the new procedures. This is done by setting a flag bit 480 in the EVPN Multicast Flags Extended Community. 482 o All per-PE I-PMSI A-D routes are restricted to the local AS and 483 not propagated to external peers.
485 o The ASBRs in an AS originate per-region I-PMSI A-D routes and 486 advertise them to their external peers to announce the tunnels used to 487 carry traffic from the local AS to other ASes. Depending on the 488 types of tunnels being used, the LIR flag in the PTA may be set, 489 in which case the downstream ASBRs and upgraded PEs will send Leaf 490 A-D routes to pull traffic from their upstream ASBRs. In a 491 particular downstream AS, one of the ASBRs is elected, based on 492 the per-region I-PMSI A-D routes for a particular source AS, to 493 send traffic from that source AS to legacy PEs in the downstream 494 AS. The traffic arrives at the elected ASBR on the tunnel 495 announced in the best per-region I-PMSI A-D route for the source 496 AS, which the ASBR has selected from all those that it received over 497 EBGP or IBGP sessions. The election procedure is described in 498 Section 5.3.1. 500 o In an ingress/upstream AS, if and only if an ASBR has active 501 downstream receivers (PEs and ASBRs), which are learned either 502 explicitly via Leaf A-D routes or implicitly via PIM join or mLDP 503 label mapping, the ASBR originates a per-PE I-PMSI A-D route 504 (i.e., regular Inclusive Multicast Ethernet Tag route) into the 505 local AS, and stitches incoming per-PE I-PMSI tunnels into its 506 per-region I-PMSI tunnel. With this, it gets traffic from local 507 PEs and sends it to other ASes via the tunnel announced in its per- 508 region I-PMSI A-D route. 510 Note that, even if there is no backward compatibility issue, the use 511 of per-region I-PMSI has the benefit of keeping all per-PE I-PMSI A-D 512 routes in their local ASes, greatly reducing the flooding of the 513 routes and their corresponding Leaf A-D routes (when needed), and the 514 number of inter-AS tunnels. 516 5.3.1.
Designated ASBR Election 518 When an ASBR re-advertises a per-region I-PMSI A-D route into an AS 519 in which a designated ASBR needs to be used to forward traffic to the 520 legacy PEs in the AS, it SHOULD include a DF Election EC. The EC and 521 its use are specified in [I-D.ietf-bess-evpn-df-election-framework]. 522 The AC-DF bit in the DF Election EC SHOULD be cleared. If it is 523 known that no legacy PEs exist in the AS, the ASBR SHOULD NOT include 524 the EC and SHOULD remove the DF Election EC if one is carried in the 525 per-region I-PMSI A-D routes that it receives. Note that this is 526 done for each set of per-region I-PMSI A-D routes with the same NLRI. 528 Based on the procedures in 529 [I-D.ietf-bess-evpn-df-election-framework], an election algorithm is 530 determined according to the DF Election ECs carried in the set of 531 per-region I-PMSI routes of the same NLRI re-advertised into the AS. 532 The algorithm is then applied to a candidate list, which is the set 533 of ASBRs that re-advertised the per-region I-PMSI routes of the same 534 NLRI carrying the DF Election EC. 536 6. Inter-Region Segmentation 538 6.1. Area vs. Region 540 RFC 7524 is for MVPN/VPLS inter-area segmentation and does not 541 explicitly cover EVPN. However, if "area" is replaced by "region" 542 and "ABR" is replaced by "RBR" (Regional Border Router) then 543 everything still works, and can be applied to EVPN as well. 545 A region can be a sub-area, or can be an entire AS including its 546 external links. Instead of automatic region definition based on IGP 547 areas, a region would be defined as a BGP peer group. In fact, even 548 with IGP area based region definition, a BGP peer group listing the 549 PEs and ABRs in an area is still needed.
551 Consider the following example diagram: 553 --------- ------ --------- 554 / \ / \ / \ 555 / \ / \ / \ 556 | PE1 o ASBR1 -- ASBR2 ASBR3 -- ASBR4 o PE2 | 557 \ / \ / \ / 558 \ / \ / \ / 559 --------- ------ --------- 560 AS 100 AS 200 AS 300 561 |-----------|--------|---------|--------|------------| 562 segment1 segment2 segment3 segment4 segment5 564 The inter-AS segmentation procedures specified so far (RFC 6513/6514, 565 7117, and Section 5 of this document) require all ASBRs to be 566 involved, and Ingress Replication is used between two ASBRs in 567 different ASes. 569 In the above diagram, it's possible that ASBR1/4 do not support 570 segmentation, and the provider tunnels in AS 100/300 can actually 571 extend across the external link. In this case, the inter-region 572 segmentation procedures can be used instead - a region is the entire 573 (AS100 + ASBR1-ASBR2 link) or (AS300 + ASBR3-ASBR4 link). ASBR2/3 574 would be the RBRs, and ASBR1/4 will just be transit core routers 575 with respect to provider tunnels. 577 As illustrated in the diagram below, ASBR2/3 will establish a 578 multihop EBGP session with either an RR or directly with PEs in the 579 neighboring AS. I/S-PMSI A-D routes from ingress PEs will not be 580 processed by ASBR1/4. When ASBR2 re-advertises the routes into AS 581 200, it changes the next hop to its own address and changes PTA to 582 specify the tunnel type/identification in its own AS. When ASBR3 re- 583 advertises I/S-PMSI A-D routes into the neighboring AS 300, it 584 changes the next hop to its own address and changes PTA to specify 585 the tunnel type/identification in the neighboring region 3. Now the 586 segment is rooted at ASBR3 and extends across the external link to 587 PEs. 589 --------- ------ --------- 590 / RR....\.mh-ebgp / \ mh-ebgp/....RR \ 591 / : \ `. / \ .' 
/ : \ 592 | PE1 o ASBR1 -- ASBR2 ASBR3 -- ASBR4 o PE2 | 593 \ / \ / \ / 594 \ / \ / \ / 595 --------- ------ --------- 596 AS 100 AS 200 AS 300 597 |-------------------|----------|---------------------| 598 segment 1 segment 2 segment 3 600 6.2. Per-region Aggregation 602 Notice that every I/S-PMSI route from each PE will be propagated 603 throughout all the ASes or regions. They may also trigger 604 corresponding Leaf A-D routes depending on the types of tunnels used 605 in each region. This may result in too many routes and corresponding 606 tunnels. To address this concern, the I-PMSI routes from all PEs in 607 an AS/region can be aggregated into a single I-PMSI route originated 608 from the RBRs, and traffic from all those individual I-PMSI tunnels 609 will be switched into the single I-PMSI tunnel. This is like the 610 MVPN Inter-AS I-PMSI route originated by ASBRs. 612 The MVPN Inter-AS I-PMSI A-D route would be better called a per-AS 613 I-PMSI A-D route, to be compared against the (per-PE) Intra-AS I-PMSI 614 A-D routes originated by each PE. In this document we will call it 615 a per-region I-PMSI A-D route, in case we want to apply the 616 aggregation at regional level. The per-PE I-PMSI routes will not be 617 propagated to other regions. If multiple RBRs are connected to a 618 region, then each will advertise such a route, with the same route 619 key (Section 3.1). Similar to the per-PE I-PMSI A-D routes, RBRs/PEs 620 in a downstream region will each select a best one from all those re- 621 advertised by the upstream RBRs, hence will only receive traffic 622 injected by one of them. 624 MVPN does not aggregate S-PMSI routes from all PEs in an AS like it 625 does for I-PMSI routes, because the number of PEs that will 626 advertise S-PMSI routes for the same (s,g) or (*,g) is small. This 627 is also the case for EVPN, i.e., there are no per-region S-PMSI 628 routes.
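As a rough illustration of the per-region I-PMSI A-D route key laid out in Section 3.1, the sketch below packs a type 9 EVPN NLRI. It assumes the region is identified by a two-octet AS number carried in a Transitive Two-Octet AS-Specific Extended Community (type 0x00) with the Source AS sub-type (0x09). The helper names, addresses, and AS number are hypothetical, and route type 9 is only the temporary assignment listed in this document.

```python
import ipaddress
import struct

ROUTE_TYPE_PER_REGION_IPMSI = 9  # temporary assignment listed in this document


def type1_rd(admin_ip: str, assigned_number: int) -> bytes:
    """Type 1 RD: 2-octet type (1), 4-octet IPv4 address, 2-octet number."""
    packed_ip = ipaddress.IPv4Address(admin_ip).packed
    return struct.pack("!H4sH", 1, packed_ip, assigned_number)


def source_as_ec(two_octet_asn: int) -> bytes:
    """8-octet EC value: type 0x00, sub-type 0x09, 2-octet AS, 4-octet zero."""
    return struct.pack("!BBHI", 0x00, 0x09, two_octet_asn, 0)


def per_region_ipmsi_nlri(rd: bytes, ethernet_tag: int, region_ec: bytes) -> bytes:
    """Route Type (1 octet), Length (1 octet), then RD + Ethernet Tag + EC."""
    body = rd + struct.pack("!I", ethernet_tag) + region_ec
    return struct.pack("!BB", ROUTE_TYPE_PER_REGION_IPMSI, len(body)) + body


# Hypothetical values: RBR loopback 192.0.2.1, Ethernet Tag 0, region AS 100.
nlri = per_region_ipmsi_nlri(type1_rd("192.0.2.1", 10), 0, source_as_ec(100))
print(nlri.hex())
```

The embedded EC is part of the NLRI body rather than a path attribute, which is why it is concatenated directly after the Ethernet Tag ID here.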

630     Notice that per-region I-PMSI routes can also be used to address the
631     backward compatibility issue, as discussed in Section 5.3.

633     The per-region I-PMSI route uses an embedded EC in the NLRI to
634     identify a region.  Any EC is permitted as long as it uniquely
635     identifies the region and all RBRs for the same region use the same
636     EC.  In the case where an AS number or area ID is needed, the
        following can be used:

638     o  For a two-octet AS number, a Transitive Two-Octet AS-Specific EC
639        of sub-type 0x09 (Source AS), with the Global Administrator
640        sub-field set to the AS number and the Local Administrator
641        sub-field set to 0.

643     o  For a four-octet AS number, a Transitive Four-Octet AS-Specific EC
644        of sub-type 0x09 (Source AS), with the Global Administrator
645        sub-field set to the AS number and the Local Administrator
646        sub-field set to 0.

648     o  For an area ID, a Transitive IPv4-Address-Specific EC of any
649        sub-type.

651     Uses of other particular ECs may be specified in other documents.

653  6.3.  Use of S-NH-EC

655     RFC 7524 specifies the use of S-NH-EC because it does not allow ABRs
656     to change the BGP next hop when they re-advertise I/S-PMSI A-D routes
657     to downstream areas.  That restriction exists only for consistency
658     with the MVPN Inter-AS I-PMSI A-D routes, whose next hop must not be
659     changed when they are re-advertised by the segmenting ABRs, for
660     reasons specific to MVPN.  For EVPN, it is perfectly fine to change
661     the next hop when RBRs re-advertise the I/S-PMSI A-D routes, instead
662     of relying on S-NH-EC.  As a result, this document specifies that
663     RBRs change the BGP next hop when they re-advertise I/S-PMSI A-D
664     routes and do not use S-NH-EC.  If a downstream PE/RBR needs to
665     originate Leaf A-D routes, it simply uses the BGP next hop in the
666     corresponding I/S-PMSI A-D routes to construct Route Targets.
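
        As a rough illustration of the next-hop-based procedure, the
        following Python sketch shows an RBR rewriting the next hop on
        re-advertisement and a downstream router deriving a Route Target
        from that next hop.  The PmsiRoute class and both function names
        are hypothetical.  The encoding is assumed to be a Transitive
        IPv4-Address-Specific Route Target (type 0x01, sub-type 0x02, as
        defined in RFC 4360) with the Local Administrator field set to 0;
        this section does not itself spell out the encoding.

```python
import struct
from dataclasses import dataclass, replace
from ipaddress import IPv4Address


@dataclass(frozen=True)
class PmsiRoute:
    """Hypothetical, simplified I/S-PMSI A-D route."""
    nlri: str
    next_hop: str  # BGP next hop (IPv4 only in this sketch)
    tunnel: str    # stands in for the tunnel information carried in the PTA


def readvertise(route: PmsiRoute, rbr_address: str, downstream_tunnel: str) -> PmsiRoute:
    """RBR behaviour: set next hop to self and describe the downstream tunnel."""
    return replace(route, next_hop=rbr_address, tunnel=downstream_tunnel)


def leaf_route_target(route: PmsiRoute) -> bytes:
    """Build an 8-octet Route Target from the route's BGP next hop.

    Assumed layout: type 0x01, sub-type 0x02, 4-octet IPv4 address
    (Global Administrator), 2-octet Local Administrator set to 0.
    """
    return struct.pack("!BB4sH", 0x01, 0x02, IPv4Address(route.next_hop).packed, 0)


original = PmsiRoute(nlri="s-pmsi-example", next_hop="198.51.100.1", tunnel="upstream-tunnel")
seen_downstream = readvertise(original, "192.0.2.3", "downstream-tunnel")
rt = leaf_route_target(seen_downstream)
```

        With this approach the Leaf A-D route is targeted at the RBR that
        re-advertised the route, without any dependence on S-NH-EC.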

668     The advantage of this is that neither ingress nor egress PEs need to
669     understand/use S-NH-EC, and a consistent procedure (based on the BGP
670     next hop) is used for both inter-AS and inter-region segmentation.

672  6.4.  Ingress PE's I-PMSI Leaf Tracking

674     RFC 7524 specifies that when an ingress PE/ASBR (re-)advertises a
675     VPLS I-PMSI A-D route, it sets the LIR flag to 1 in the route's PTA.
676     Similar to the inter-AS case, this is not actually needed for
677     EVPN.  To be consistent with the inter-AS case, the ingress PE does
678     not set the LIR flag in its originated I-PMSI A-D routes, and
679     determines the leaves based on the BGP next hops in its received
680     I-PMSI A-D routes, as specified in Section 5.2.

682     The same backward compatibility issue exists, and the same solution
683     as in the inter-AS case applies, as specified in Section 5.3.

685  7.  Multi-homing Support

687     If multi-homing does not span different ASes or regions,
688     existing procedures work with segmentation, and a segmentation point
689     will remove the ESI label from the packets.  If an ES is multi-homed
690     to PEs in different ASes or regions, additional procedures are needed
691     to work with segmentation.  The procedures are well understood but
692     omitted here until the requirement becomes clear.

694  8.  IANA Considerations

696     IANA has temporarily assigned the following new EVPN route types:

698     o  9 - Per-Region I-PMSI A-D route

700     o  10 - S-PMSI A-D route

702     o  11 - Leaf A-D route

704     This document requests IANA to assign one flag bit from the EVPN
705     Multicast Flags Extended Community:

707     o  Bit-S - The router supports the segmentation procedures defined
708        in this document

710  9.  Security Considerations

712     This document does not seem to introduce new security risks, though
713     this may be revised after further review and scrutiny.

715  10.  Acknowledgements

717     The authors thank Eric Rosen, John Drake, and Ron Bonica for their
718     comments and suggestions.

720  11.  Contributors

722     The following also contributed to this document through their earlier
723     work in EVPN selective multicast.

725     Junlin Zhang
726     Huawei Technologies
727     Huawei Bld., No.156 Beiqing Rd.
728     Beijing 100095
729     China

731     Email: jackey.zhang@huawei.com

733     Zhenbin Li
734     Huawei Technologies
735     Huawei Bld., No.156 Beiqing Rd.
736     Beijing 100095
737     China

739     Email: lizhenbin@huawei.com

741  12.  References

743  12.1.  Normative References

745     [I-D.ietf-bess-evpn-df-election-framework]
746                Rabadan, J., Mohanty, S., Sajassi, A., Drake,
747                J., Nagaraj, K., and S. Sathappan, "Framework for EVPN
748                Designated Forwarder Election Extensibility",
749                draft-ietf-bess-evpn-df-election-framework-06 (work in
750                progress), December 2018.

752     [I-D.ietf-bess-evpn-igmp-mld-proxy]
753                Sajassi, A., Thoria, S., Patel, K., Yeung, D., Drake, J.,
754                and W. Lin, "IGMP and MLD Proxy for EVPN",
755                draft-ietf-bess-evpn-igmp-mld-proxy-02 (work in progress),
                   June 2018.

757     [I-D.ietf-bess-mvpn-expl-track]
758                Dolganow, A., Kotalwar, J., Rosen, E., and Z. Zhang,
759                "Explicit Tracking with Wild Card Routes in Multicast
760                VPN", draft-ietf-bess-mvpn-expl-track-13 (work in
761                progress), November 2018.

763     [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
764                Requirement Levels", BCP 14, RFC 2119,
765                DOI 10.17487/RFC2119, March 1997.

768     [RFC7117]  Aggarwal, R., Ed., Kamite, Y., Fang, L., Rekhter, Y., and
769                C. Kodeboniya, "Multicast in Virtual Private LAN Service
770                (VPLS)", RFC 7117, DOI 10.17487/RFC7117, February 2014.

773     [RFC7432]  Sajassi, A., Ed., Aggarwal, R., Bitar, N., Isaac, A.,
774                Uttaro, J., Drake, J., and W. Henderickx, "BGP MPLS-Based
775                Ethernet VPN", RFC 7432, DOI 10.17487/RFC7432, February
776                2015.

778     [RFC7524]  Rekhter, Y., Rosen, E., Aggarwal, R., Morin, T.,
779                Grosclaude, I., Leymann, N., and S. Saad, "Inter-Area
780                Point-to-Multipoint (P2MP) Segmented Label Switched Paths
781                (LSPs)", RFC 7524, DOI 10.17487/RFC7524, May 2015.

784     [RFC7988]  Rosen, E., Ed., Subramanian, K., and Z. Zhang, "Ingress
785                Replication Tunnels in Multicast VPN", RFC 7988,
786                DOI 10.17487/RFC7988, October 2016.

789  12.2.  Informative References

791     [I-D.ietf-bier-architecture]
792                Wijnands, I., Rosen, E., Dolganow, A., Przygienda, T., and
793                S. Aldrin, "Multicast using Bit Index Explicit
794                Replication", draft-ietf-bier-architecture-08 (work in
795                progress), September 2017.

797     [I-D.ietf-bier-evpn]
798                Zhang, Z., Przygienda, T., Sajassi, A., and J. Rabadan,
799                "EVPN BUM Using BIER", draft-ietf-bier-evpn-01 (work in
800                progress), April 2018.

802     [RFC6513]  Rosen, E., Ed. and R. Aggarwal, Ed., "Multicast in MPLS/
803                BGP IP VPNs", RFC 6513, DOI 10.17487/RFC6513, February
804                2012.

806     [RFC6514]  Aggarwal, R., Rosen, E., Morin, T., and Y. Rekhter, "BGP
807                Encodings and Procedures for Multicast in MPLS/BGP IP
808                VPNs", RFC 6514, DOI 10.17487/RFC6514, February 2012.

811  Authors' Addresses

813     Zhaohui Zhang
814     Juniper Networks

816     EMail: zzhang@juniper.net

818     Wen Lin
819     Juniper Networks

821     EMail: wlin@juniper.net

823     Jorge Rabadan
824     Nokia

826     EMail: jorge.rabadan@nokia.com

828     Keyur Patel
829     Arrcus

831     EMail: keyur@arrcus.com

832     Ali Sajassi
833     Cisco Systems

835     EMail: sajassi@cisco.com