idnits 2.17.1 draft-zzhang-bess-evpn-bum-procedure-updates-02.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack an IANA Considerations section. (See Section 2.2 of https://www.ietf.org/id-info/checklist for how to handle the case when there are no actions for IANA.) ** There are 16 instances of too long lines in the document, the longest one being 3 characters in excess of 72. -- The draft header indicates that this document updates RFC7432, but the abstract doesn't seem to mention this, which it should. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year -- The document date (April 21, 2016) is 2925 days in the past. Is this intentional? 
Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Missing Reference: 'RFC 7524' is mentioned on line 173, but not defined == Unused Reference: 'I-D.ietf-bess-ir' is defined on line 641, but no explicit reference was found in the text == Unused Reference: 'RFC2119' is defined on line 646, but no explicit reference was found in the text == Unused Reference: 'RFC7117' is defined on line 651, but no explicit reference was found in the text == Unused Reference: 'RFC7432' is defined on line 656, but no explicit reference was found in the text == Unused Reference: 'RFC7524' is defined on line 661, but no explicit reference was found in the text == Unused Reference: 'I-D.ietf-bess-dci-evpn-overlay' is defined on line 669, but no explicit reference was found in the text == Unused Reference: 'I-D.ietf-bess-evpn-overlay' is defined on line 675, but no explicit reference was found in the text == Unused Reference: 'I-D.rabadan-bess-evpn-optimized-ir' is defined on line 681, but no explicit reference was found in the text == Unused Reference: 'I-D.wijnands-bier-architecture' is defined on line 687, but no explicit reference was found in the text == Unused Reference: 'RFC6513' is defined on line 693, but no explicit reference was found in the text == Unused Reference: 'RFC6514' is defined on line 697, but no explicit reference was found in the text == Outdated reference: A later version (-05) exists of draft-ietf-bess-ir-00 == Outdated reference: A later version (-10) exists of draft-ietf-bess-dci-evpn-overlay-00 == Outdated reference: A later version (-12) exists of draft-ietf-bess-evpn-overlay-01 == Outdated reference: A later version (-02) exists of draft-rabadan-bess-evpn-optimized-ir-00 Summary: 2 errors (**), 0 flaws (~~), 17 warnings (==), 2 comments (--). 
Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 BESS Z. Zhang 3 Internet-Draft Juniper Networks 4 Updates: 7432 (if approved) W. Lin 5 Intended status: Standards Track Juniper Networks, Inc. 6 Expires: October 23, 2016 J. Rabadan 7 Nokia 8 K. Patel 9 Cisco Systems 10 April 21, 2016 12 Updates on EVPN BUM Procedures 13 draft-zzhang-bess-evpn-bum-procedure-updates-02 15 Abstract 17 This document specifies procedure updates for broadcast, unknown 18 unicast, and multicast (BUM) traffic in Ethernet VPNs (EVPN), 19 including selective multicast, and provider tunnel segmentation. 21 Requirements Language 23 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 24 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 25 document are to be interpreted as described in RFC2119. 27 Status of This Memo 29 This Internet-Draft is submitted in full conformance with the 30 provisions of BCP 78 and BCP 79. 32 Internet-Drafts are working documents of the Internet Engineering 33 Task Force (IETF). Note that other groups may also distribute 34 working documents as Internet-Drafts. The list of current Internet- 35 Drafts is at http://datatracker.ietf.org/drafts/current/. 37 Internet-Drafts are draft documents valid for a maximum of six months 38 and may be updated, replaced, or obsoleted by other documents at any 39 time. It is inappropriate to use Internet-Drafts as reference 40 material or to cite them other than as "work in progress." 42 This Internet-Draft will expire on October 23, 2016. 44 Copyright Notice 46 Copyright (c) 2016 IETF Trust and the persons identified as the 47 document authors. All rights reserved. 49 This document is subject to BCP 78 and the IETF Trust's Legal 50 Provisions Relating to IETF Documents 51 (http://trustee.ietf.org/license-info) in effect on the date of 52 publication of this document. 
Please review these documents 53 carefully, as they describe your rights and restrictions with respect 54 to this document. Code Components extracted from this document must 55 include Simplified BSD License text as described in Section 4.e of 56 the Trust Legal Provisions and are provided without warranty as 57 described in the Simplified BSD License.

59 Table of Contents

61 1. Terminology
62 2. Introduction
63   2.1. Reasons for Tunnel Segmentation
64 3. Additional Route Types of EVPN NLRI
65   3.1. Per-Region I-PMSI A-D route
66   3.2. S-PMSI A-D route
67   3.3. Leaf-AD route
68 4. Selective Multicast
69 5. Inter-AS Segmentation
70   5.1. Changes to Section 7.2.2 of RFC 7117
71   5.2. I-PMSI Leaf Tracking
72   5.3. Backward Compatibility
73 6. Inter-Region Segmentation
74   6.1. Area vs. Region
75   6.2. Per-region Aggregation
76   6.3. Use of S-NH-EC
77   6.4. Ingress PE's I-PMSI Leaf Tracking
78 7. Multi-homing Support
79 8. Security Considerations
80 9. Acknowledgements
81 10. Contributors
82 11. References
83   11.1. Normative References
84   11.2. Informative References
85 Authors' Addresses

87 1. Terminology

89 To be added

91 2. Introduction

93 RFC 7432 specifies procedures to handle broadcast, unknown unicast, 94 and multicast (BUM) traffic in Sections 11, 12, and 16, using the Inclusive 95 Multicast Ethernet Tag route. Many details are deferred to RFC 96 7117 (VPLS Multicast). In particular, selective multicast is briefly 97 mentioned for Ingress Replication but deferred to RFC 7117.

99 RFC 7117 specifies procedures for using both inclusive tunnels and 100 selective tunnels, similar to MVPN procedures specified in RFC 6513 101 and RFC 6514. A new SAFI "MCAST-VPLS" is introduced, with two types 102 of NLRIs that match MVPN's S-PMSI A-D routes and Leaf A-D routes. 103 The same procedures can be applied to EVPN selective multicast for 104 both Ingress Replication and other tunnel types, but new route types 105 need to be defined under the same EVPN SAFI.

107 MVPN uses the terms I-PMSI and S-PMSI A-D routes. For consistency and 108 convenience, this document will use the same I/S-PMSI terms for VPLS 109 and EVPN. In particular, EVPN's Inclusive Multicast Ethernet Tag 110 Route and VPLS's VPLS A-D route carrying PTA (PMSI Tunnel Attribute) 111 for BUM traffic will all be referred to as I-PMSI A-D routes. 112 Depending on the context, they may be used interchangeably.

114 MVPN provider tunnels and EVPN/VPLS BUM provider tunnels, which are 115 referred to as MVPN/EVPN/VPLS provider tunnels in this document for 116 simplicity, can be segmented for technical or administrative reasons, 117 which are summarized in Section 2.1 of this document. RFC 6513/6514 118 cover MVPN inter-AS segmentation, RFC 7117 covers VPLS multicast 119 inter-AS segmentation, and RFC 7524 (Seamless MPLS Multicast) covers 120 inter-area segmentation for both MVPN and VPLS.

122 There is a difference between MVPN and VPLS multicast inter-AS 123 segmentation.
For simplicity, EVPN uses the same procedures as in 124 MVPN. All ASBRs can re-advertise their choice of the best route. 125 Each can become the root of its intra-AS segment and inject traffic 126 it receives from its upstream, while each downstream PE/ASBR will 127 only pick one of the upstream ASBRs as its upstream. This is also 128 the behavior even for VPLS in the case of inter-area segmentation.

130 For inter-area segmentation, RFC 7524 requires the use of the Inter-area 131 P2MP Segmented Next-Hop Extended Community (S-NH-EC), and the setting 132 of the "Leaf Information Required" (LIR) flag in the PTA in certain 133 situations. Either of these could be optional in the case of EVPN. 134 Removing these requirements would make the segmentation procedures 135 transparent to ingress and egress PEs.

137 RFC 7524 assumes that segmentation happens at area borders. However, 138 it could be at "regional" borders, where a region could be a sub- 139 area, or even an entire AS plus its external links (Section 6). That 140 would allow for more flexible deployment scenarios (e.g., for single- 141 area provider networks).

143 This document specifies, clarifies, and redefines certain EVPN 144 BUM procedures and adds new ones, with the goal of aligning them better across 145 MVPN, EVPN, and VPLS. For brevity, only changes/additions to relevant 146 RFC 7117 and RFC 7524 procedures are specified, instead of repeating 147 the entire procedures. Note that these are to be applied to EVPN 148 only, even though they may at times read like updates to RFC 149 7117/7524.

151 2.1. Reasons for Tunnel Segmentation

153 Tunnel segmentation may be required and/or desired because of 154 administrative and/or technical reasons.

156 For example, an MVPN/VPLS/EVPN network may span multiple providers 157 and Inter-AS Option-B has to be used, in which the end-to-end 158 provider tunnels have to be segmented at and stitched by the ASBRs.
159 Different providers may use different tunnel technologies (e.g., 160 provider A uses Ingress Replication, provider B uses RSVP-TE P2MP 161 while provider C uses mLDP). Even if they use the same tunnel 162 technology like RSVP-TE P2MP, it may be impractical to set up the 163 tunnels across provider boundaries.

165 The same situations may apply between the ASes and/or areas of a 166 single provider. For example, the backbone area may use RSVP-TE P2MP 167 tunnels while non-backbone areas may use mLDP tunnels.

169 Segmentation can also be used to divide an AS/area into smaller 170 regions, so that control plane state and/or forwarding plane state/ 171 burden can be limited to that of individual regions. For example, 172 instead of Ingress Replicating to 100 PEs in the entire AS, with 173 inter-area segmentation [RFC 7524] a PE only needs to replicate to 174 local PEs and ABRs. The ABRs will further replicate to their 175 downstream PEs and ABRs. This not only reduces the forwarding plane 176 burden, but also reduces the leaf tracking burden in the control 177 plane.

179 Smaller regions also have the benefit that, in case of tunnel 180 aggregation, it is easier to find congruence among the segments of 181 different constituent (service) tunnels and the resulting aggregation 182 (base) tunnel in a region. This leads to better bandwidth 183 efficiency, because the more congruent they are, the fewer leaves of 184 the base tunnel need to discard traffic when a service tunnel's 185 segment does not need to receive the traffic (yet it is receiving the 186 traffic due to aggregation).

188 Another advantage of smaller regions is smaller BIER sub-domains. 189 In BIER, a new multicast architecture, packets carry a BitString 190 in which the bits correspond to the edge routers that need to receive 191 traffic. Smaller sub-domains mean that smaller BitStrings can be used 192 without having to send multiple copies of the same packet.

194 3.
Additional Route Types of EVPN NLRI

196 RFC 7432 defines the format of EVPN NLRI as follows:

198 +-----------------------------------+
199 | Route Type (1 octet)              |
200 +-----------------------------------+
201 | Length (1 octet)                  |
202 +-----------------------------------+
203 | Route Type specific (variable)    |
204 +-----------------------------------+

206 So far five types have been defined:

208 + 1 - Ethernet Auto-Discovery (A-D) route
209 + 2 - MAC/IP Advertisement route
210 + 3 - Inclusive Multicast Ethernet Tag route
211 + 4 - Ethernet Segment route
212 + 5 - IP Prefix Route

214 This document defines three additional route types:

216 + 6 - Per-Region I-PMSI A-D route
217 + 7 - S-PMSI A-D route
218 + 8 - Leaf A-D route

220 The "Route Type specific" field of the type 6 and type 7 EVPN NLRIs 221 starts with a type 1 RD, whose Administrative sub-field MUST match 222 that of the RD in all the EVPN routes from the same advertising 223 router for a given EVI, except the Leaf A-D route (Section 3.3).

225 3.1. Per-Region I-PMSI A-D route

227 The Per-region I-PMSI A-D route has the following format. Its usage 228 is discussed in Section 6.2.

230 +-----------------------------------+
231 | RD (8 octets)                     |
232 +-----------------------------------+
233 | Ethernet Tag ID (4 octets)        |
234 +-----------------------------------+
235 | Extended Community (8 octets)     |
236 +-----------------------------------+

238 After the Ethernet Tag ID, an Extended Community (EC) is used to identify 239 the region. Various types and sub-types of ECs provide maximum 240 flexibility. Note that this is not an EC Attribute, but an 8-octet 241 field embedded in the NLRI itself, following the EC encoding scheme.

243 3.2.
S-PMSI A-D route

245 The S-PMSI A-D route has the following format:

247 +-----------------------------------+
248 | RD (8 octets)                     |
249 +-----------------------------------+
250 | Ethernet Tag ID (4 octets)        |
251 +-----------------------------------+
252 | Multicast Source Length (1 octet) |
253 +-----------------------------------+
254 | Multicast Source (Variable)       |
255 +-----------------------------------+
256 | Multicast Group Length (1 octet)  |
257 +-----------------------------------+
258 | Multicast Group (Variable)        |
259 +-----------------------------------+
260 | Originating Router's IP Addr      |
261 +-----------------------------------+

263 Other than the addition of the Ethernet Tag ID, it is identical to the 264 S-PMSI A-D route as defined in RFC 7117. The procedures in RFC 7117 265 also apply (including wildcard functionality), except that the 266 granularity level is per Ethernet Tag.

268 3.3. Leaf-AD route

270 The Route Type specific field of a Leaf A-D route consists of the 271 following:

273 +-----------------------------------+
274 | Route Key (variable)              |
275 +-----------------------------------+
276 | Originating Router's IP Addr      |
277 +-----------------------------------+

279 A Leaf A-D route is originated in response to a PMSI route, which 280 could be an Inclusive Multicast Ethernet Tag route, a per-region I-PMSI A-D 281 route, an S-PMSI A-D route, or some other type of route, defined in the 282 future, that triggers Leaf A-D routes. The Route Key 283 is the "Route Type Specific" field of the route for which this Leaf 284 A-D route is generated.

286 The general procedures for Leaf A-D routes were first specified in RFC 287 6514 for MVPN. The principles apply to VPLS and EVPN as well. RFC 288 7117 has details for VPLS Multicast, and this document points out 289 some specifics for EVPN, e.g., in Section 5.

291 4. Selective Multicast

293 RFC 7117 specifies Selective Multicast for VPLS.
Except that 294 different route types and formats are specified under the EVPN SAFI for 295 S-PMSI A-D and Leaf A-D routes (Section 3), all procedures in RFC 296 7117 with respect to Selective Multicast apply to EVPN as well, 297 including wildcard procedures.

299 5. Inter-AS Segmentation

301 5.1. Changes to Section 7.2.2 of RFC 7117

303 The first paragraph of Section 7.2.2.2 of RFC 7117 says:

305 "... The best route procedures ensure that if multiple 306 ASBRs, in an AS, receive the same Inter-AS A-D route from their EBGP 307 neighbors, only one of these ASBRs propagates this route in Internal 308 BGP (IBGP). This ASBR becomes the root of the intra-AS segment of 309 the inter-AS tree and ensures that this is the only ASBR that accepts 310 traffic into this AS from the inter-AS tree."

312 The above VPLS behavior requires complicated VPLS-specific procedures 313 for the ASBRs to reach agreement. For EVPN, a different approach is 314 used and the above quoted text is not applicable to EVPN.

316 Each ASBR that re-advertises the route 317 into the AS uses the Leaf A-D based procedure to discover the leaves on the segment rooted at itself. 318 This is the same as the procedures for S-PMSI in RFC 7117 itself.

320 The following text at the end of the second bullet:

322 "................................................... If, in order 323 to instantiate the segment, the ASBR needs to know the leaves of 324 the tree, then the ASBR obtains this information from the A-D 325 routes received from other PEs/ASBRs in the ASBR's own AS."

327 is changed to the following:

329 "................................................... If, in order 330 to instantiate the segment, the ASBR needs to know the leaves of 331 the tree, then the ASBR MUST set the LIR flag to 1 in the PTA to 332 trigger Leaf A-D routes from egress PEs and downstream ASBRs. 333 It MUST be (auto-)configured with an import RT, which controls 334 acceptance of Leaf A-D routes by the ASBR."
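As a hedged illustration of the changed text above, the following Python sketch shows an ASBR setting the LIR flag in a PTA. It assumes the PMSI Tunnel attribute layout of RFC 6514 (Flags octet, Tunnel Type octet, 3-octet MPLS Label field with the label in the high-order 20 bits, then the Tunnel Identifier), with LIR as the low-order bit of the Flags octet. The function names `encode_pta` and `asbr_pta` and all sample values are hypothetical and are not defined by this document.

```python
import socket

# Assumed from RFC 6514: LIR is the low-order bit of the PTA Flags octet.
LIR_FLAG = 0x01
# Assumed from RFC 6514: tunnel type code for Ingress Replication.
TUNNEL_TYPE_INGRESS_REPLICATION = 6


def encode_pta(flags: int, tunnel_type: int, label: int, tunnel_id: bytes) -> bytes:
    """Encode a PMSI Tunnel attribute value (hypothetical helper)."""
    # The 20-bit label occupies the high-order bits of the 3-octet field.
    label_field = (label << 4).to_bytes(3, "big")
    return bytes([flags, tunnel_type]) + label_field + tunnel_id


def asbr_pta(needs_leaf_discovery: bool, tunnel_type: int,
             label: int, tunnel_id: bytes) -> bytes:
    """Set LIR only when the ASBR must learn the leaves of its segment."""
    flags = LIR_FLAG if needs_leaf_discovery else 0
    return encode_pta(flags, tunnel_type, label, tunnel_id)


# Example: an ASBR (documentation address) that needs explicit leaf tracking.
asbr_address = socket.inet_aton("192.0.2.254")
pta = asbr_pta(True, TUNNEL_TYPE_INGRESS_REPLICATION, 0, asbr_address)
```

Egress PEs and downstream ASBRs that see the flag set would respond with Leaf A-D routes, which the ASBR accepts through its (auto-)configured import RT; that part is not modeled here.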
336 Accordingly, the following paragraph in Section 7.2.2.4:

338 "If the received Inter-AS A-D route carries the PMSI Tunnel attribute 339 with the Tunnel Identifier set to RSVP-TE P2MP LSP, then the ASBR 340 that originated the route MUST establish an RSVP-TE P2MP LSP with the 341 local PE/ASBR as a leaf. This LSP MAY have been established before 342 the local PE/ASBR receives the route, or it MAY be established after 343 the local PE receives the route."

345 is changed to the following:

347 "If the received Inter-AS A-D route has the LIR flag set in its PTA, 348 then a receiving PE must originate a corresponding Leaf A-D route, 349 and a receiving ASBR must originate a corresponding Leaf A-D route 350 if and only if it received and imported one or more corresponding Leaf 351 A-D routes from its downstream IBGP or EBGP peers, or it has non-null 352 downstream forwarding state for the PIM/mLDP tunnel that instantiates 353 its downstream intra-AS segment. The ASBR that (re-)advertised the 354 Inter-AS A-D route then establishes a tunnel to the leaves discovered 355 by the Leaf A-D routes."

357 5.2. I-PMSI Leaf Tracking

359 An ingress PE does not set the LIR flag in its I-PMSI's PTA, even 360 with Ingress Replication or RSVP-TE P2MP tunnels. It does not rely 361 on the Leaf A-D routes to discover leaves in its AS, and Section 11.2 362 of RFC 7432 explicitly states that the LIR flag must be set to zero.

364 An implementation of RFC 7432 might have used the Originating 365 Router's IP Address field of the Inclusive Multicast Ethernet Tag 366 routes to determine the leaves, or might have used the Next Hop field 367 instead. Within the same AS, both will lead to the same result.

369 With segmentation, an ingress PE MUST determine the leaves in its AS 370 from the BGP next hops in all its received I-PMSI A-D routes, so it 371 does not have to set the LIR flag to request Leaf A-D routes.
PEs 372 within the same AS will all have different next hops in their I-PMSI 373 A-D routes (hence will all be considered as leaves), and PEs from 374 other ASes will have the next hop in their I-PMSI A-D routes set to 375 addresses of ASBRs in this local AS, hence only those ASBRs will be 376 considered as leaves (as proxies for those PEs in other ASes). Note 377 that in case of Ingress Replication, when an ASBR re-advertises IBGP 378 I-PMSI A-D routes, it MUST advertise the same label for all routes with 379 the same Ethernet Tag ID and the same EVI. When an ingress PE builds 380 its flooding list, multiple routes may have the same (nexthop, label) 381 tuple and they will only be added as a single branch in the flooding 382 list.

384 5.3. Backward Compatibility

386 The above procedures assume that all PEs are upgraded to support the 387 segmentation procedures:

389 o An ingress PE uses the Next Hop instead of Originating Router's IP 390 Address to determine leaves for the I-PMSI tunnel.

392 o An egress PE sends Leaf A-D routes in response to I-PMSI routes, 393 if the PTA has the LIR flag set (by the re-advertising ASBRs).

395 o In case of Ingress Replication, when an ingress PE builds its 396 flooding list, multiple I-PMSI routes may have the same (nexthop, 397 label) tuple and only a single branch for those will be added in 398 the flooding list.

400 If a deployment has legacy PEs that do not support the above, then 401 a legacy ingress PE would include all PEs (including those in remote 402 ASes) as leaves of the inclusive tunnel and try to send traffic to 403 them directly (no segmentation), which is either undesired or not 404 possible; a legacy egress PE would not send Leaf A-D routes so the 405 ASBRs would not know to send external traffic to them.
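The upgraded ingress PE behavior listed above (leaves taken from the BGP next hop, one branch per (nexthop, label) tuple) can be sketched in Python as follows. This is only an illustration under stated assumptions: it assumes Ingress Replication and that the routes have already been imported for the EVI and Ethernet Tag in question. The `IPmsiRoute` record, the `build_flooding_list` function, and all addresses and labels are hypothetical and are not defined by this document.

```python
from typing import List, NamedTuple, Tuple


class IPmsiRoute(NamedTuple):
    """Minimal view of a received I-PMSI A-D route (hypothetical record)."""
    originator: str  # Originating Router's IP Address (not used for leaves)
    next_hop: str    # BGP next hop; an ASBR address for routes from other ASes
    label: int       # Ingress Replication label taken from the PTA


def build_flooding_list(routes: List[IPmsiRoute]) -> List[Tuple[str, int]]:
    """Return one branch per distinct (nexthop, label) tuple, in arrival order."""
    seen = set()
    branches = []
    for route in routes:
        key = (route.next_hop, route.label)
        if key not in seen:  # several remote PEs may collapse onto one ASBR
            seen.add(key)
            branches.append(key)
    return branches


routes = [
    IPmsiRoute("192.0.2.1", "192.0.2.1", 1001),       # PE in the local AS
    IPmsiRoute("192.0.2.2", "192.0.2.2", 1002),       # PE in the local AS
    IPmsiRoute("198.51.100.1", "192.0.2.254", 2000),  # remote PE via local ASBR
    IPmsiRoute("198.51.100.2", "192.0.2.254", 2000),  # remote PE via same ASBR
]
flooding_list = build_flooding_list(routes)
```

In this sample the two remote PEs share the ASBR's next hop and label, so the flooding list has three branches rather than four. A legacy ingress PE keyed on the Originating Router's IP Address would instead produce four branches, two of them pointing directly at PEs in another AS.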
407 To address this backward compatibility problem, the following 408 procedure can be used (see Section 6.2 for per-PE/AS/region I-PMSI 409 A-D routes):

411 o An upgraded PE indicates in its per-PE I-PMSI A-D route that it 412 supports the new procedures. Details will be provided in a future 413 revision.

415 o All per-PE I-PMSI A-D routes are restricted to the local AS and 416 not propagated to external peers.

418 o The ASBRs in an AS originate per-region I-PMSI A-D routes and 419 advertise them to their external peers to announce the tunnels used to 420 carry traffic from the local AS to other ASes. Depending on the 421 types of tunnels being used, the LIR flag in the PTA may be set, 422 in which case the downstream ASBRs and upgraded PEs will send Leaf 423 A-D routes to pull traffic from their upstream ASBRs. In a 424 particular downstream AS, one of the ASBRs is elected, based on 425 the per-region I-PMSI A-D routes for a particular source AS, to 426 send traffic from that source AS to legacy PEs in the downstream 427 AS. The traffic arrives at the elected ASBR on the tunnel 428 announced in the best per-region I-PMSI A-D route for the source 429 AS, which the ASBR selects from all those that it received over 430 EBGP or IBGP sessions. Details of the election procedure will be 431 provided in a future revision.

433 o In an ingress AS, if and only if an ASBR has active downstream 434 receivers (PEs and ASBRs), which are learned either explicitly via 435 Leaf A-D routes or implicitly via PIM join or mLDP label mapping, 436 the ASBR originates a per-PE I-PMSI A-D route (i.e., regular 437 Inclusive Multicast Ethernet Tag route) into the local AS, and 438 stitches incoming per-PE I-PMSI tunnels into its per-region I-PMSI 439 tunnel. With this, it receives traffic from local PEs and sends it to 440 other ASes via the tunnel announced in its per-region I-PMSI A-D 441 route.
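For illustration, the NLRI of a per-region I-PMSI A-D route such as the ASBRs above originate can be encoded as in the following Python sketch. It follows the layout in Section 3.1 and uses the two-octet Source AS EC described in Section 6.2 to identify the region. The assumptions are: route type value 6 is the value proposed in this document and is not confirmed here as assigned, the RD uses the standard type 1 layout (2-octet type, 4-octet IPv4 address, 2-octet assigned number), and the function names and sample values are hypothetical.

```python
import socket
import struct

# Value proposed in Section 3 of this document (assumption: not yet assigned).
ROUTE_TYPE_PER_REGION_IPMSI = 6


def encode_type1_rd(ipv4_address: str, assigned_number: int) -> bytes:
    """Type 1 RD: 2-octet type (1), 4-octet IPv4 address, 2-octet number."""
    return struct.pack("!H4sH", 1, socket.inet_aton(ipv4_address), assigned_number)


def encode_source_as_ec(two_octet_asn: int) -> bytes:
    """Transitive Two-Octet AS-Specific EC, sub-type 0x09 (Source AS).

    Global Administrator carries the AS number; Local Administrator is 0.
    """
    return struct.pack("!BBHI", 0x00, 0x09, two_octet_asn, 0)


def encode_per_region_ipmsi_nlri(rd: bytes, ethernet_tag_id: int,
                                 region_ec: bytes) -> bytes:
    """Route Type, Length, then RD (8), Ethernet Tag ID (4), embedded EC (8)."""
    body = rd + struct.pack("!I", ethernet_tag_id) + region_ec
    return struct.pack("!BB", ROUTE_TYPE_PER_REGION_IPMSI, len(body)) + body


# Sample values only: documentation address and a documentation AS number.
nlri = encode_per_region_ipmsi_nlri(
    encode_type1_rd("192.0.2.1", 10),
    ethernet_tag_id=100,
    region_ec=encode_source_as_ec(64500),
)
```

Because the EC is embedded in the NLRI rather than carried as an attribute, two ASBRs of the same AS that use the same RD, Ethernet Tag ID, and EC produce the same route key.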
443 Note that, even if there is no backward compatibility issue, the 444 above procedures have the benefit of keeping all per-PE I-PMSI A-D 445 routes in their local ASes, greatly reducing the flooding of the 446 routes and their corresponding Leaf A-D routes (when needed), and the 447 number of inter-AS tunnels.

449 6. Inter-Region Segmentation

451 6.1. Area vs. Region

453 RFC 7524 is for MVPN/VPLS inter-area segmentation and does not 454 explicitly cover EVPN. However, if "area" is replaced by "region" 455 and "ABR" is replaced by "RBR" (Regional Border Router) then 456 everything still works, and can be applied to EVPN as well.

458 A region can be a sub-area, or can be an entire AS including its 459 external links. Instead of automatic region definition based on IGP 460 areas, a region would be defined as a BGP peer group. In fact, even 461 with IGP area based region definition, a BGP peer group listing the 462 PEs and ABRs in an area is still needed.

464 Consider the following example diagram:

466    ---------            ------            ---------
467   /         \          /      \          /         \
468  /           \        /        \        /           \
469 | PE1 o    ASBR1 -- ASBR2    ASBR3 -- ASBR4    o PE2 |
470  \           /        \        /        \           /
471   \         /          \      /          \         /
472    ---------            ------            ---------
473     AS 100              AS 200             AS 300
474 |-----------|--------|---------|--------|------------|
475   segment1   segment2 segment3  segment4   segment5

477 The inter-AS segmentation procedures specified so far (RFC 6513/6514, 478 7117, and Section 5 of this document) require all ASBRs to be 479 involved, and Ingress Replication is used between two ASBRs in 480 different ASes.

482 In the above diagram, it's possible that ASBR1/4 do not support 483 segmentation, and the provider tunnels in AS 100/300 can actually 484 extend across the external link. In that case, the inter-region 485 segmentation procedures can be used instead: a region is then the entire 486 (AS100 + ASBR1-ASBR2 link) or (AS300 + ASBR3-ASBR4 link).
ASBR2/3 487 would be the RBRs, and ASBR1/4 will just be transit core routers 488 with respect to provider tunnels.

490 As illustrated in the diagram below, ASBR2/3 will establish a 491 multihop EBGP session with either an RR or directly with PEs in the 492 neighboring AS. I/S-PMSI A-D routes from ingress PEs will not be 493 processed by ASBR1/4. When ASBR2 re-advertises the routes into AS 494 200, it changes the next hop to its own address and changes the PTA to 495 specify the tunnel type/identification in its own AS. When ASBR3 re- 496 advertises I/S-PMSI A-D routes into the neighboring AS 300, it 497 changes the next hop to its own address and changes the PTA to specify 498 the tunnel type/identification in the neighboring region 3. Now the 499 segment is rooted at ASBR3 and extends across the external link to 500 PEs.

502    ---------            ------            ---------
503   /   RR....\.mh-ebgp  /      \   mh-ebgp/....RR   \
504  /    :      \     `. /        \ .'     /     :     \
505 | PE1 o    ASBR1 -- ASBR2    ASBR3 -- ASBR4    o PE2 |
506  \           /        \        /        \           /
507   \         /          \      /          \         /
508    ---------            ------            ---------
509     AS 100              AS 200             AS 300
510 |-------------------|----------|---------------------|
511       segment 1      segment 2        segment 3

513 6.2. Per-region Aggregation

515 Notice that every I/S-PMSI route from each PE will be propagated 516 throughout all the ASes or regions. They may also trigger 517 corresponding Leaf A-D routes depending on the types of tunnels used 518 in each region. The number of routes and corresponding 519 tunnels may become too large. To address this concern, the I-PMSI routes from all PEs in 520 an AS/region can be aggregated into a single I-PMSI route originated 521 from the RBRs, and traffic from all those individual I-PMSI tunnels 522 will be switched into the single I-PMSI tunnel. This is like the 523 MVPN Inter-AS I-PMSI route originated by ASBRs.

525 The MVPN Inter-AS I-PMSI A-D route would be better called a per-AS 526 I-PMSI A-D route, in contrast to the (per-PE) Intra-AS I-PMSI 527 A-D routes originated by each PE.
This document calls it 528 a per-region I-PMSI A-D route, since the 529 aggregation may be applied at the regional level. The per-PE I-PMSI routes will not be 530 propagated to other regions. If multiple RBRs are connected to a 531 region, then each will advertise such a route, with the same route 532 key (Section 3.1). Similar to the per-PE I-PMSI A-D routes, RBRs/PEs 533 in a downstream region will each select a best one from all those re- 534 advertised by the upstream RBRs, hence will only receive traffic 535 injected by one of them.

537 MVPN does not aggregate S-PMSI routes from all PEs in an AS like it 538 does for I-PMSI routes, because the number of PEs that will 539 advertise S-PMSI routes for the same (s,g) or (*,g) is small. This 540 is also the case for EVPN, i.e., there are no per-region S-PMSI 541 routes.

543 Notice that per-region I-PMSI routes can also be used to address 544 the backward compatibility issue, as discussed in Section 5.3.

546 The per-region I-PMSI route uses an embedded EC in NLRI to identify a 547 region. Any EC is permitted as long as it uniquely identifies the region and the RBRs 548 for the same region use the same EC. In the case 549 where an AS number or area ID is needed, the following can be used:

551 o For a two-octet AS number, a Transitive Two-Octet AS-Specific EC 552 of sub-type 0x09 (Source AS), with the Global Administrator sub- 553 field set to the AS number and the Local Administrator sub-field 554 set to 0.

556 o For a four-octet AS number, a Transitive Four-Octet AS-Specific EC 557 of sub-type 0x09 (Source AS), with the Global Administrator sub- 558 field set to the AS number and the Local Administrator sub-field 559 set to 0.

561 o For an area ID, a Transitive IPv4-Address-Specific EC of any sub- 562 type.

564 Uses of other particular ECs may be specified in other documents.

566 6.3.
Use of S-NH-EC

568 RFC 7524 specifies the use of S-NH-EC because it does not allow ABRs 569 to change the BGP next hop when they re-advertise I/S-PMSI A-D routes 570 to downstream areas. That is only to be consistent with the MVPN 571 Inter-AS I-PMSI A-D routes, whose next hop must not be changed when 572 they're re-advertised by the segmenting ABRs for reasons specific to 573 MVPN. For EVPN, it is perfectly fine to change the next hop when 574 RBRs re-advertise the I/S-PMSI A-D routes, instead of relying on S- 575 NH-EC. As a result, this document specifies that RBRs change the BGP 576 next hop when they re-advertise I/S-PMSI A-D routes and do not use S- 577 NH-EC. If a downstream PE/RBR needs to originate Leaf A-D routes, it 578 simply uses the BGP next hop in the corresponding I/S-PMSI A-D routes 579 to construct Route Targets.

581 The advantage of this is that neither ingress nor egress PEs need to 582 understand/use S-NH-EC, and a consistent procedure (based on BGP next 583 hop) is used for both inter-AS and inter-region segmentation.

585 6.4. Ingress PE's I-PMSI Leaf Tracking

587 RFC 7524 specifies that when an ingress PE/ASBR (re-)advertises a 588 VPLS I-PMSI A-D route, it sets the LIR flag to 1 in the route's PTA. 589 Similar to the inter-AS case, this is not actually needed for 590 EVPN. To be consistent with the inter-AS case, the ingress PE does 591 not set the LIR flag in its originated I-PMSI A-D routes, and 592 determines the leaves based on the BGP next hops in its received 593 I-PMSI A-D routes, as specified in Section 5.2.

595 The same backward compatibility issue exists, and the same solution 596 as in the inter-AS case applies, as specified in Section 5.3.

598 7. Multi-homing Support

600 If multi-homing does not span different ASes or regions, 601 existing procedures work with segmentation. If an ES is multi-homed 602 to PEs in different ASes or regions, additional procedures are needed 603 to work with segmentation.
The procedures are well understood but 604 omitted here until the requirement becomes clear. 606 8. Security Considerations 608 This document does not seem to introduce new security risks, though 609 this may be revised after further review and scrutiny. 611 9. Acknowledgements 613 The authors thank Eric Rosen, John Drake, and Ron Bonica for their 614 comments and suggestions. 616 10. Contributors 618 The following also contributed to this document through their earlier 619 work in EVPN selective multicast. 621 Junlin Zhang 622 Huawei Technologies 623 Huawei Bld., No.156 Beiqing Rd. 624 Beijing 100095 625 China 627 Email: jackey.zhang@huawei.com 629 Zhenbin Li 630 Huawei Technologies 631 Huawei Bld., No.156 Beiqing Rd. 632 Beijing 100095 633 China 635 Email: lizhenbin@huawei.com 637 11. References 639 11.1. Normative References 641 [I-D.ietf-bess-ir] 642 Rosen, E., Subramanian, K., and J. Zhang, "Ingress 643 Replication Tunnels in Multicast VPN", draft-ietf-bess- 644 ir-00 (work in progress), January 2015. 646 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 647 Requirement Levels", BCP 14, RFC 2119, 648 DOI 10.17487/RFC2119, March 1997, 649 . 651 [RFC7117] Aggarwal, R., Ed., Kamite, Y., Fang, L., Rekhter, Y., and 652 C. Kodeboniya, "Multicast in Virtual Private LAN Service 653 (VPLS)", RFC 7117, DOI 10.17487/RFC7117, February 2014, 654 . 656 [RFC7432] Sajassi, A., Ed., Aggarwal, R., Bitar, N., Isaac, A., 657 Uttaro, J., Drake, J., and W. Henderickx, "BGP MPLS-Based 658 Ethernet VPN", RFC 7432, DOI 10.17487/RFC7432, February 659 2015, . 661 [RFC7524] Rekhter, Y., Rosen, E., Aggarwal, R., Morin, T., 662 Grosclaude, I., Leymann, N., and S. Saad, "Inter-Area 663 Point-to-Multipoint (P2MP) Segmented Label Switched Paths 664 (LSPs)", RFC 7524, DOI 10.17487/RFC7524, May 2015, 665 . 667 11.2. Informative References 669 [I-D.ietf-bess-dci-evpn-overlay] 670 Rabadan, J., Sathappan, S., Henderickx, W., Palislamovic, 671 S., Balus, F., Sajassi, A., and D. 
Cai, "Interconnect 672 Solution for EVPN Overlay networks", draft-ietf-bess-dci- 673 evpn-overlay-00 (work in progress), January 2015. 675 [I-D.ietf-bess-evpn-overlay] 676 Sajassi, A., Drake, J., Bitar, N., Isaac, A., Uttaro, J., 677 and W. Henderickx, "A Network Virtualization Overlay 678 Solution using EVPN", draft-ietf-bess-evpn-overlay-01 679 (work in progress), February 2015. 681 [I-D.rabadan-bess-evpn-optimized-ir] 682 Rabadan, J., Sathappan, S., Henderickx, W., Sajassi, A., 683 and A. Isaac, "Optimized Ingress Replication solution for 684 EVPN", draft-rabadan-bess-evpn-optimized-ir-00 (work in 685 progress), October 2014. 687 [I-D.wijnands-bier-architecture] 688 Wijnands, I., Rosen, E., Dolganow, A., Przygienda, T., and 689 S. Aldrin, "Multicast using Bit Index Explicit 690 Replication", draft-wijnands-bier-architecture-05 (work in 691 progress), March 2015. 693 [RFC6513] Rosen, E., Ed. and R. Aggarwal, Ed., "Multicast in MPLS/ 694 BGP IP VPNs", RFC 6513, DOI 10.17487/RFC6513, February 695 2012, . 697 [RFC6514] Aggarwal, R., Rosen, E., Morin, T., and Y. Rekhter, "BGP 698 Encodings and Procedures for Multicast in MPLS/BGP IP 699 VPNs", RFC 6514, DOI 10.17487/RFC6514, February 2012, 700 . 702 Authors' Addresses 704 Zhaohui Zhang 705 Juniper Networks 707 EMail: zzhang@juniper.net 709 Wen Lin 710 Juniper Networks, Inc. 712 EMail: wlin@juniper.net 714 Jorge Rabadan 715 Nokia 717 EMail: jorge.rabadan@nokia.com 719 Keyur Patel 720 Cisco Systems 722 EMail: keyupate@cisco.com