idnits 2.17.1

draft-ietf-bess-evpn-etree-08.txt:

Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info):
----------------------------------------------------------------------------

No issues found here.

Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
----------------------------------------------------------------------------

No issues found here.

Checking nits according to https://www.ietf.org/id-info/checklist :
----------------------------------------------------------------------------

== The 'Updates: ' line in the draft header should list only the _numbers_ of the RFCs which will be updated by this document (if approved); it should not include the word 'RFC' in the list.

Miscellaneous warnings:
----------------------------------------------------------------------------

== The copyright year in the IETF Trust and authors Copyright Line does not match the current year

-- The document date (January 12, 2017) is 2659 days in the past. Is this intentional?

Checking references for intended status: Proposed Standard
----------------------------------------------------------------------------

(See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs)

== Missing Reference: 'RFC5226' is mentioned on line 715, but not defined

** Obsolete undefined reference: RFC 5226 (Obsoleted by RFC 8126)

== Unused Reference: 'RFC7385' is defined on line 730, but no explicit reference was found in the text

== Unused Reference: 'RFC4360' is defined on line 745, but no explicit reference was found in the text

Summary: 1 error (**), 0 flaws (~~), 5 warnings (==), 1 comment (--).

Run idnits with the --verbose option for more detailed information about the items above.

--------------------------------------------------------------------------------

2 BESS Workgroup    A. Sajassi, Ed.
3 INTERNET-DRAFT    S.
Salam
4 Intended Status: Standards Track    Cisco
5 Updates: RFC7385    J. Drake
6 Juniper
7 J. Uttaro
8 ATT
9 S. Boutros
10 VMware
11 J. Rabadan
12 Nokia

14 Expires: June 12, 2017    January 12, 2017

16 E-TREE Support in EVPN & PBB-EVPN
17 draft-ietf-bess-evpn-etree-08

19 Abstract

21 The Metro Ethernet Forum (MEF) has defined a rooted-multipoint
22 Ethernet service known as Ethernet Tree (E-Tree). A solution
23 framework for supporting this service in MPLS networks is proposed in
24 an RFC called "A Framework for E-Tree Service over MPLS Network".
25 This document discusses how those functional requirements can be
26 easily met with (PBB-)EVPN and how (PBB-)EVPN offers a more efficient
27 implementation of these functions. This document makes use of the
28 most significant bit of the scope governed by the IANA registry
29 created by RFC7385, and hence updates that RFC accordingly.

31 Status of this Memo

33 This Internet-Draft is submitted to IETF in full conformance with the
34 provisions of BCP 78 and BCP 79.

36 Internet-Drafts are working documents of the Internet Engineering
37 Task Force (IETF), its areas, and its working groups. Note that
38 other groups may also distribute working documents as
39 Internet-Drafts.

41 Internet-Drafts are draft documents valid for a maximum of six months
42 and may be updated, replaced, or obsoleted by other documents at any
43 time. It is inappropriate to use Internet-Drafts as reference
44 material or to cite them other than as "work in progress."

46 The list of current Internet-Drafts can be accessed at
47 http://www.ietf.org/1id-abstracts.html

49 The list of Internet-Draft Shadow Directories can be accessed at
50 http://www.ietf.org/shadow.html

52 Copyright and License Notice

54 Copyright (c) 2016 IETF Trust and the persons identified as the
55 document authors. All rights reserved.
57 This document is subject to BCP 78 and the IETF Trust's Legal
58 Provisions Relating to IETF Documents
59 (http://trustee.ietf.org/license-info) in effect on the date of
60 publication of this document. Please review these documents
61 carefully, as they describe your rights and restrictions with respect
62 to this document. Code Components extracted from this document must
63 include Simplified BSD License text as described in Section 4.e of
64 the Trust Legal Provisions and are provided without warranty as
65 described in the Simplified BSD License.

67 Table of Contents

69 1 Introduction
70   1.1 Terminology
71 2 E-Tree Scenarios and EVPN / PBB-EVPN Support
72   2.1 Scenario 1: Leaf OR Root site(s) per PE
73   2.2 Scenario 2: Leaf OR Root site(s) per AC
74   2.3 Scenario 3: Leaf OR Root site(s) per MAC
75 3 Operation for EVPN
76   3.1 Known Unicast Traffic
77   3.2 BUM Traffic
78     3.2.1 BUM traffic originated from a single-homed site on a
79           leaf AC
80     3.2.2 BUM traffic originated from a single-homed site on a
81           root AC
82     3.2.3 BUM traffic originated from a multi-homed site on a
83           leaf AC
84     3.2.4 BUM traffic originated from a multi-homed site on a
85           root AC
86   3.3 E-TREE Traffic Flows for EVPN
87     3.3.1 E-Tree with MAC Learning
88     3.3.2 E-Tree without MAC Learning
89 4 Operation for PBB-EVPN
90   4.1 Known Unicast Traffic
91   4.2 BUM Traffic
92   4.3 E-Tree without MAC Learning
93 5 BGP Encoding
94   5.1 E-TREE Extended Community
95   5.2 PMSI Tunnel Attribute
96 6 Acknowledgement
97 7 Security Considerations
98 8 IANA Considerations
99   8.1 Considerations for PMSI Tunnel Types
100 9 References
101   9.1 Normative References
102   9.2 Informative References
103 Contributors
104 Authors' Addresses

106 1 Introduction

108 The Metro Ethernet Forum (MEF) has defined a rooted-multipoint
109 Ethernet service known as Ethernet Tree (E-Tree). In an E-Tree
110 service, endpoints are labeled as either Root or Leaf sites. Root
111 sites can communicate with all other sites. Leaf sites can
112 communicate with Root sites but not with other Leaf sites.

114 [RFC7387] proposes the solution framework for supporting E-Tree
115 service in MPLS networks. The document identifies the functional
116 components of the overall solution to emulate E-Tree services in
117 addition to Ethernet LAN (E-LAN) services on an existing MPLS
118 network.

120 [RFC7432] is a solution for multipoint L2VPN services, with advanced
121 multi-homing capabilities, using BGP for distributing customer/client
122 MAC address reachability information over the MPLS/IP network.
123 [RFC7623] combines the functionality of EVPN with [802.1ah] Provider
124 Backbone Bridging for MAC address scalability.

126 This document discusses how the functional requirements for E-Tree
127 service can be easily met with (PBB-)EVPN and how (PBB-)EVPN offers a
128 more efficient implementation of these functions.

130 1.1 Terminology

132 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
133 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
134 document are to be interpreted as described in RFC 2119 [KEYWORDS].

136 2 E-Tree Scenarios and EVPN / PBB-EVPN Support

138 In this section, we categorize support for E-Tree into three
139 different scenarios, depending on the nature of the site association
140 (Root/Leaf) per PE or per Ethernet Segment:

142 - Leaf OR Root site(s) per PE

144 - Leaf OR Root site(s) per AC

146 - Leaf OR Root site(s) per MAC

148 2.1 Scenario 1: Leaf OR Root site(s) per PE

150 In this scenario, a PE may receive traffic from either Root sites OR
151 Leaf sites for a given MAC-VRF/bridge table, but not both
152 concurrently. In other words, a given EVI on a PE is associated
153 either with root(s) or with leaf(s). The PE may have both Root and
154 Leaf sites, albeit for different EVIs.

156                +---------+            +---------+
157                |   PE1   |            |   PE2   |
158 +---+          |  +---+  |  +------+  |  +---+  |            +---+
159 |CE1+---ES1----+--+   |  |  | MPLS |  |  |   +--+----ES2-----+CE2|
160 +---+  (Root)  |  |MAC|  |  |  /IP |  |  |MAC|  |   (Leaf)   +---+
161                |  |VRF|  |  |      |  |  |VRF|  |
162                |  |   |  |  |      |  |  |   |  |            +---+
163                |  |   |  |  |      |  |  |   +--+----ES3-----+CE3|
164                |  +---+  |  +------+  |  +---+  |   (Leaf)   +---+
165                +---------+            +---------+

167 Figure 1: Scenario 1

169 In such a scenario, tailored BGP Route Target (RT) import/export
170 policies among the PEs belonging to the same EVI can be used to
171 restrict the communications among Leaf PEs.
To restrict the
172 communications among Leaf sites connected to the same PE and
173 belonging to the same EVI, split-horizon filtering is used to block
174 traffic from one Leaf interface to another Leaf interface of a given
175 E-TREE EVI. The purpose of this topology constraint is to avoid
176 having PEs with only Leaf sites importing and processing BGP MAC
177 routes from each other. To support such a topology constraint in EVPN,
178 two BGP Route-Targets (RTs) are used for every EVPN Instance (EVI):
179 one RT is associated with the Root sites and the other is associated
180 with the Leaf sites. On a per EVI basis, every PE exports the single
181 RT associated with its type of site(s). Furthermore, a PE with Root
182 site(s) imports both Root and Leaf RTs, whereas a PE with Leaf
183 site(s) only imports the Root RT.

185 2.2 Scenario 2: Leaf OR Root site(s) per AC

187 In this scenario, a PE receives traffic from either Root OR Leaf
188 sites (but not both) on a given Attachment Circuit (AC) of an EVI. In
189 other words, an AC (ES or ES/VLAN) is either a Root AC or a Leaf AC
190 (but not both).

192                  +---------+            +---------+
193                  |   PE1   |            |   PE2   |
194 +---+            |  +---+  |  +------+  |  +---+  |            +---+
195 |CE1+-----ES1----+--+   |  |  |      |  |  |   +--+---ES2/AC1--+CE2|
196 +---+   (Leaf)   |  |MAC|  |  | MPLS |  |  |MAC|  |   (Leaf)   +---+
197                  |  |VRF|  |  |  /IP |  |  |VRF|  |
198                  |  |   |  |  |      |  |  |   |  |            +---+
199                  |  |   |  |  |      |  |  |   +--+---ES2/AC2--+CE3|
200                  |  +---+  |  +------+  |  +---+  |   (Root)   +---+
201                  +---------+            +---------+

203 Figure 2: Scenario 2

205 In this scenario, just like the previous scenario (in section 2.1),
206 two Route Targets (one for Root and another for Leaf) can be used.
207 However, the difference is that on a PE with both Root and Leaf ACs,
208 all remote MAC routes are imported and thus there needs to be a way
209 to differentiate remote MAC routes associated with Leaf ACs versus
210 the ones associated with Root ACs in order to apply the proper
211 ingress filtering.
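The two-RT arrangement of section 2.1, which this scenario reuses, can be illustrated with a small Python sketch. It is not part of the specification: the class and function names and the RT values "65000:100" and "65000:200" are hypothetical, and the model only covers a PE whose sites in the EVI are all Root or all Leaf.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class RtPolicy:
    """Route Targets a PE exports and imports for one EVI (illustrative model)."""
    exports: FrozenSet[str]
    imports: FrozenSet[str]


def rt_policy(site_type: str, root_rt: str, leaf_rt: str) -> RtPolicy:
    """Build the per-EVI policy for a PE whose sites are all Root or all Leaf."""
    if site_type == "root":
        # A PE with Root sites exports the Root RT and imports both RTs.
        return RtPolicy(frozenset({root_rt}), frozenset({root_rt, leaf_rt}))
    if site_type == "leaf":
        # A PE with Leaf sites exports the Leaf RT and imports only the Root RT.
        return RtPolicy(frozenset({leaf_rt}), frozenset({root_rt}))
    raise ValueError(f"unknown site type: {site_type!r}")


def accepts_routes_from(receiver: RtPolicy, sender: RtPolicy) -> bool:
    """True if the receiver imports at least one RT that the sender exports."""
    return bool(receiver.imports & sender.exports)


# Hypothetical RT values, chosen only for the example.
root_pe = rt_policy("root", "65000:100", "65000:200")
leaf_pe = rt_policy("leaf", "65000:100", "65000:200")
```

With these policies, a Leaf-only PE never imports routes exported by another Leaf-only PE, which is the topology constraint the two RTs are meant to enforce.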
213 In order to support such ingress filtering on the ingress PE with
214 both Leaf and Root ACs, one of the following two approaches can be used:

216 A) To use two MAC-VRFs (two bridge tables per VLAN if a given VLAN
217 exists on the PE for both Leaf and Root ACs of an EVI) - one for Root
218 ACs and another for Leaf ACs.

220 B) To color MAC addresses with Leaf or Root color before distributing
221 them in BGP to other PEs depending on whether they are learned on a
222 Leaf AC or a Root AC.

224 Maintaining two MAC-VRFs (two bridge tables) per VLAN (when both Leaf
225 and Root ACs exist for that VLAN) would require either two lookups
226 per MAC address in each direction in case of a miss, or
227 duplication of many MAC addresses between the two bridge tables
228 belonging to the same VLAN (same E-TREE instance). Unless two lookups
229 are made, duplication of MAC addresses would be needed for both
230 locally learned and remotely learned MAC addresses. Locally learned
231 MAC addresses from Leaf ACs need to be duplicated onto the Root bridge
232 table and locally learned MAC addresses from Root ACs need to be
233 duplicated onto the Leaf bridge table. Remotely learned MAC addresses
234 from Root ACs need to be copied onto both Root and Leaf bridge
235 tables. Because of the potential inefficiencies associated with a data-plane
236 implementation of an additional MAC lookup or duplication of MAC
237 entries, option (A) is believed to incur data-plane performance
238 inefficiencies on some platforms, and thus this
239 draft introduces the coloring option (B) as detailed in section 3.1.
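As a rough illustration of coloring option (B), the sketch below keeps a single bridge table in which each MAC entry carries a Leaf/Root color, and applies the Leaf-to-Leaf check on the ingress PE. The names `MacEntry` and `ingress_lookup`, the string next hops, and the MAC addresses are assumptions made for the example, not definitions from this draft.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class MacEntry:
    next_hop: str   # forwarding adjacency, modeled here as a plain string
    is_leaf: bool   # color of the MAC address: True if learned on a Leaf AC


def ingress_lookup(table: Dict[str, MacEntry], dst_mac: str,
                   src_ac_is_leaf: bool) -> Optional[str]:
    """Return the next hop for a known unicast frame, or None if it is dropped.

    A KeyError means the destination is unknown; such frames are BUM traffic
    and are outside this sketch.
    """
    entry = table[dst_mac]
    if src_ac_is_leaf and entry.is_leaf:
        return None  # Leaf-to-Leaf traffic is not forwarded
    return entry.next_hop


# One bridge table holds both Leaf-colored and Root-colored MAC addresses.
bridge_table = {
    "00:00:5e:00:53:01": MacEntry(next_hop="PE2", is_leaf=True),
    "00:00:5e:00:53:02": MacEntry(next_hop="PE3", is_leaf=False),
}
```

Because the color travels with the entry, one lookup yields both the adjacency and the filtering decision, which is the point of avoiding two MAC-VRFs.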
244 For this scenario, if, for a given EVI, the vast majority of PEs will
245 have both Leaf and Root sites attached, even though they may start as
246 Root-only or Leaf-only PEs, then a single RT per EVI MAY be used in
247 order to alleviate the configuration overhead associated with using
248 two RTs per EVI, at the expense of having unwanted MAC addresses on
249 the Leaf-only PEs.

251 2.3 Scenario 3: Leaf OR Root site(s) per MAC

253 In this scenario, a PE may receive traffic from both Root AND Leaf
254 sites on a single Attachment Circuit (AC) of an EVI. Since an
255 Attachment Circuit (ES or ES/VLAN) carries traffic from both Root and
256 Leaf sites, the granularity at which Root or Leaf sites are
257 identified is per MAC address. This scenario is considered in
258 this draft for EVPN service with only known unicast traffic because
259 the DF filtering per [RFC7432] would not be compatible with the
260 required egress filtering - i.e., BUM traffic is not supported in
261 this scenario and it is dropped by the ingress PE.

263                  +---------+            +---------+
264                  |   PE1   |            |   PE2   |
265 +---+            |  +---+  |  +------+  |  +---+  |            +---+
266 |CE1+-----ES1----+--+   |  |  |      |  |  |   +--+---ES2/AC1--+CE2|
267 +---+   (Root)   |  | E |  |  | MPLS |  |  | E |  | (Leaf/Root)+---+
268                  |  | V |  |  |  /IP |  |  | V |  |
269                  |  | I |  |  |      |  |  | I |  |            +---+
270                  |  |   |  |  |      |  |  |   +--+---ES2/AC2--+CE3|
271                  |  +---+  |  +------+  |  +---+  |   (Leaf)   +---+
272                  +---------+            +---------+

274 Figure 3: Scenario 3

276 3 Operation for EVPN

278 [RFC7432] defines the notion of ESI MPLS label used for split-horizon
279 filtering of BUM traffic at the egress PE. Such egress filtering
280 capabilities can be leveraged in the provision of E-TREE services, as
281 described below. In other words, [RFC7432] has the inherent capability to support
282 E-TREE services without defining any new BGP routes, requiring only
283 a new BGP Extended Community for leaf indication as shown
284 later in this document.
286 3.1 Known Unicast Traffic

288 Since, in EVPN, MAC learning is performed in the control plane via
289 advertisement of BGP routes, the filtering needed by E-TREE service
290 for known unicast traffic can be performed at the ingress PE, thus
291 providing very efficient filtering and avoiding sending known unicast
292 traffic over the MPLS/IP core to be filtered at the egress PE as done in
293 traditional E-TREE solutions (e.g., E-TREE for VPLS).

295 To provide such ingress filtering for known unicast traffic, a PE
296 MUST indicate to other PEs what kind of sites (root or leaf) its MAC
297 addresses are associated with by advertising a leaf indication flag
298 (via an Extended Community) along with each of its MAC/IP
299 Advertisement routes. The lack of such a flag indicates that the MAC
300 address is associated with a root site. This scheme applies to all
301 scenarios described in section 2.

303 Furthermore, for the multi-homing scenario of section 2.2, where an AC is
304 either root or leaf (but not both), the PE MAY advertise leaf
305 indication along with the Ethernet A-D per EVI route. This
306 advertisement is used for sanity checking in the control plane to ensure
307 that there is no discrepancy in configuration among different PEs of
308 the same redundancy group. For example, if a leaf site is multi-homed
309 to PE1 and PE2, and PE1 advertises the Ethernet A-D per EVI
310 corresponding to this leaf site with the leaf-indication flag but PE2
311 does not, then the receiving PE notifies the operator of such a
312 discrepancy and ignores the leaf-indication flag on PE1. In other
313 words, in case of discrepancy, the multi-homing for that pair of PEs
314 is assumed to be in default "root" mode for that or . The leaf indication flag on Ethernet A-D per EVI route
316 tells the receiving PEs that all MAC addresses associated with this
317 or are from a leaf site.
Therefore, if a
318 PE receives a leaf indication for an AC via the Ethernet A-D per EVI
319 route but doesn't receive a leaf indication in the corresponding
320 MAC/IP Advertisement route, then it notifies the operator and ignores
321 the leaf indication on the Ethernet A-D per EVI route.

323 Tagging MAC addresses with a leaf indication enables remote PEs to
324 perform ingress filtering for known unicast traffic - i.e., on the
325 ingress PE, the MAC destination address lookup yields, in addition to
326 the forwarding adjacency, a flag which indicates whether the target
327 MAC is associated with a Leaf site or not. The ingress PE cross-checks
328 this flag with the status of the originating AC, and if both
329 are Leafs, then the packet is not forwarded.

331 In situations where MAC moves are allowed among Leaf and Root sites
332 (e.g., non-static MAC), PEs can receive multiple MAC/IP
333 Advertisement routes for the same MAC address with different
334 Leaf/Root indications (and possibly different ESIs for multi-homing
335 scenarios). In such situations, MAC mobility procedures take
336 precedence to first identify the location of the MAC before
337 associating that MAC with a Root or a Leaf site.

339 To support the above ingress filtering functionality, a new E-TREE
340 Extended Community with a Leaf indication flag is introduced (section
341 5.1). This new Extended Community MUST be advertised with the MAC/IP
342 Advertisement route and MAY be advertised with an Ethernet A-D per
343 EVI route as described above.

345 3.2 BUM Traffic

347 Unlike the known unicast traffic described above, BUM traffic cannot
348 be filtered on the ingress PE because of its multi-destination
349 nature. This specification therefore does not provide support for
350 filtering BUM traffic on the ingress PE.
351 Instead, the solution relies on egress filtering.
In order to apply
352 the proper egress filtering, which varies based on whether a packet
353 is sent from a Leaf AC or a root AC, the MPLS-encapsulated frames
354 MUST be tagged with an indication when they originated from a Leaf
355 AC. In other words, leaf indication for BUM traffic is done at the
356 granularity of AC. This can be achieved in EVPN through the use of an
357 MPLS label, which either identifies the Ethernet
358 segment of origin per [RFC7432] (i.e., ESI label) or
359 indicates that the packet originated from a leaf site (Leaf
360 label).

362 BUM traffic sent over a P2MP LSP or ingress replication may need to
363 carry an upstream assigned or downstream assigned MPLS label
364 (respectively) for the purpose of egress filtering to indicate to the
365 egress PEs whether the packet originated from a leaf AC.

367 The main difference between downstream and upstream assigned MPLS
368 labels is that, in the downstream assigned case, not all egress PE
369 devices need to receive the label, just as with the ingress replication
370 procedures defined in [RFC7432].

372 The PE places all Leaf Ethernet Segments of a given bridge domain in
373 a single split-horizon group in order to prevent intra-PE forwarding
374 among Leaf segments. This split-horizon function applies to BUM
375 traffic as well as known-unicast traffic.

377 There are four scenarios to consider, as follows. In all these
378 scenarios, the ingress PE imposes the right MPLS label associated
379 with the originating Ethernet Segment (ES) depending on whether the
380 Ethernet frame originated from a Root or a Leaf site on that Ethernet
381 Segment (ESI label or Leaf label). The mechanism by which the PE
382 identifies whether a given frame originated from a Root or a Leaf
383 site on the segment is based on the AC identifier for that segment
384 (e.g., Ethernet Tag of the frame for 802.1Q frames).
Other mechanisms
385 for identifying root or leaf (e.g., on a per MAC address basis) are
386 beyond the scope of this document.

388 3.2.1 BUM traffic originated from a single-homed site on a leaf AC

390 In this scenario, the ingress PE adds a special MPLS label indicating
391 a Leaf site. This special Leaf MPLS label, used for single-homing
392 scenarios, is not on a per ES basis but rather on a per PE basis -
393 i.e., a single Leaf MPLS label is used for all single-homed ES's on
394 that PE. This Leaf label is advertised to other PE devices, using a
395 new EVPN Extended Community called E-TREE Extended Community (section
396 5.1) along with an Ethernet A-D per ES route with ESI of zero and a
397 set of Route Targets (RTs) corresponding to all EVIs on the PE with
398 at least one leaf site per EVI. A set of Ethernet A-D per ES routes
399 may be needed if the number of Route Targets (RTs) that need to be
400 sent exceeds the limit on a single route per [RFC7432]. The ESI for
401 the Ethernet A-D per ES route is set to zero to indicate single-homed
402 sites.

404 When a PE receives this special Leaf label in the data path, it
405 blocks the packet if the destination AC is of type Leaf; otherwise,
406 it forwards the packet.

408 3.2.2 BUM traffic originated from a single-homed site on a root AC

410 In this scenario, the ingress PE does not add any ESI label or Leaf
411 label and it operates per [RFC7432] procedures.

413 3.2.3 BUM traffic originated from a multi-homed site on a leaf AC

415 In this scenario, it is assumed that while different ACs (VLANs) on
416 the same ES could have different root/leaf designations (some being
417 roots and some being leafs), the same AC (e.g., VLAN) does have the
418 same root/leaf designation on all PEs on the same ES. Furthermore, it
419 is assumed that there is no forwarding among subnets - i.e., the
420 service is EVPN L2 and not EVPN IRB. The IRB use case is outside the
421 scope of this document.
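The egress check described in section 3.2.1 can be sketched in Python as follows. This is an illustration under stated assumptions rather than a normative procedure: the function name, the representation of ACs as a name-to-role mapping, and the idea of passing in the set of labels that the egress PE recognises as Leaf labels are all hypothetical, and the ESI-label split-horizon processing of [RFC7432] is deliberately left out.

```python
from typing import Dict, List, Optional, Set


def egress_bum_targets(label: Optional[int],
                       leaf_labels: Set[int],
                       local_acs: Dict[str, str]) -> List[str]:
    """Return the local ACs that may receive a BUM frame.

    label       -- the Leaf/ESI label carried by the frame, or None if absent
    leaf_labels -- labels this PE treats as Leaf labels (assumed to be known
                   from the control plane)
    local_acs   -- maps an AC name to its role, "root" or "leaf"
    """
    from_leaf = label is not None and label in leaf_labels
    # A frame tagged with a Leaf label is blocked towards Leaf ACs only.
    return [ac for ac, role in local_acs.items()
            if not (from_leaf and role == "leaf")]


# Hypothetical AC names and label value, chosen only for the example.
acs = {"ac1": "leaf", "ac2": "root"}
```

A frame carrying label 3001, with 3001 in `leaf_labels`, would be delivered to "ac2" only; a frame with no such label would be delivered to both ACs.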
423 In such scenarios, if a multicast or broadcast packet is originated
424 from a leaf AC, then it only needs to carry the Leaf label described in
425 section 3.2.1. This label is sufficient to provide the necessary
426 egress filtering, preventing BUM traffic from being sent to leaf ACs,
427 including the leaf AC on the same Ethernet Segment.

429 3.2.4 BUM traffic originated from a multi-homed site on a root AC
430 In this scenario, both the ingress and egress PE devices follow the
431 procedure defined in [RFC7432] for adding and/or processing an ESI
432 MPLS label.

434 3.3 E-TREE Traffic Flows for EVPN

436 Per [RFC7387], a generic E-Tree service supports all of the following
437 traffic flows:

439 - Ethernet Unicast from Root to Roots & Leafs
440 - Ethernet Unicast from Leaf to Root
441 - Ethernet Broadcast/Multicast from Root to Roots & Leafs
442 - Ethernet Broadcast/Multicast from Leaf to Roots

444 A particular E-Tree service may need to support all of the above
445 types of flows or only a select subset, depending on the target
446 application. In the case where unicast flows need not be supported,
447 the L2VPN PEs can avoid performing any MAC learning function.

449 In the subsections that follow, we describe the operation of
450 EVPN to support E-Tree service with and without MAC learning.

452 3.3.1 E-Tree with MAC Learning

454 The PEs implementing an E-Tree service must perform MAC learning when
455 unicast traffic flows must be supported among Root and Leaf sites. In
456 this case, the PE(s) with Root sites perform MAC learning in the
457 data-path over the Ethernet Segments, and advertise reachability in
458 EVPN MAC Advertisement routes. These routes will be imported by all
459 PEs for that EVI (i.e., PEs that have Leaf sites as well as PEs that
460 have Root sites). Similarly, the PEs with Leaf sites perform MAC
461 learning in the data-path over their Ethernet Segments, and advertise
462 reachability in EVPN MAC Advertisement routes.
For the scenario
463 described in section 2.1 (or possibly section 2.2), these routes are
464 imported only by PEs with at least one Root site in the EVI - i.e., a
465 PE with only Leaf sites will not import these routes. PEs with Root
466 and/or Leaf sites may use the Ethernet A-D routes for aliasing (in
467 the case of multi-homed segments) and for mass MAC withdrawal per
468 [RFC7432].

470 To support multicast/broadcast from Root to Leaf sites, either a P2MP
471 tree rooted at the PE(s) with the Root site(s) or ingress replication
472 can be used. The multicast tunnels are set up through the exchange of
473 the EVPN Inclusive Multicast route, as defined in [RFC7432].

475 To support multicast/broadcast from Leaf to Root sites, ingress
476 replication should be sufficient for most scenarios where there are
477 only a few Roots (typically two). Therefore, in a typical scenario, a
478 root PE needs to support both a P2MP tunnel in the transmit direction
479 from itself to leaf PEs and, at the same time,
480 ingress-replication tunnels in the receive direction from leaf PEs to
481 itself. In order to signal this efficiently from the root PE, a new
482 composite tunnel type is defined per section 5.2. This new composite
483 tunnel type is advertised by the root PE to simultaneously indicate a
484 P2MP tunnel in the transmit direction and an ingress-replication tunnel
485 in the receive direction for the BUM traffic.

487 If the number of Roots is large, P2MP tunnels originated at the PEs
488 with Leaf sites may be used and thus there will be no need to use the
489 modified PMSI tunnel attribute in section 5.2 for composite tunnel
490 type.

492 3.3.2 E-Tree without MAC Learning

494 The PEs implementing an E-Tree service need not perform MAC learning
495 when the traffic flows between Root and Leaf sites are only multicast
496 or broadcast. In this case, the PEs do not exchange EVPN MAC
497 Advertisement routes.
Instead, the Inclusive Multicast Ethernet Tag
498 route is used to support BUM traffic.

500 The fields of this route are populated per the procedures defined in
501 [RFC7432], and the multicast tunnel setup criteria are as described
502 in the previous section.

504 Just as in the previous section, if there are only a few PEs with root
505 sites and thus ingress replication is desired from leaf PEs
506 to these root PEs, then the modified PMSI attribute as defined in
507 section 5.2 should be used.

509 4 Operation for PBB-EVPN

511 In PBB-EVPN, the PE advertises a Root/Leaf indication along with each
512 B-MAC Advertisement route, to indicate whether the associated B-MAC
513 address corresponds to a Root or a Leaf site. Just like the EVPN
514 case, the new E-TREE Extended Community defined in section 5.1 is
515 advertised with each MAC Advertisement route.

517 In the case where a multi-homed Ethernet Segment has both Root and
518 Leaf sites attached, two B-MAC addresses are advertised: one B-MAC
519 address is per ES as specified in [RFC7623] and implicitly denotes
520 Root, and the other B-MAC address is per PE and explicitly denotes
521 Leaf. The former B-MAC address is not advertised with the E-TREE
522 extended community but the latter B-MAC denoting Leaf is advertised
523 with the new E-TREE extended community where the "Leaf-indication" flag
524 is set. In such multi-homing scenarios where an Ethernet Segment has
525 both Root and Leaf ACs, it is assumed that while different ACs
526 (VLANs) on the same ES could have different root/leaf designations
527 (some being roots and some being leafs), the same VLAN does have the
528 same root/leaf designation on all PEs on the same ES. Furthermore, it
529 is assumed that there is no forwarding among subnets - i.e., the
530 service is L2 and not IRB. The IRB use case is outside the scope of this
531 document.
533 The ingress PE uses the right B-MAC source address depending on
534 whether the Ethernet frame originated from the Root or Leaf AC on
535 that Ethernet Segment. The mechanism by which the PE identifies
536 whether a given frame originated from a Root or Leaf site on the
537 segment is based on the Ethernet Tag associated with the frame. Other
538 mechanisms of identification, beyond the Ethernet Tag, are outside
539 the scope of this document.

541 Furthermore, a PE advertises two special global B-MAC addresses: one
542 for Root and another for Leaf, and tags the Leaf one as such in the
543 MAC Advertisement route. These B-MAC addresses are used as source
544 addresses for traffic originating from single-homed segments. The
545 B-MAC address used for indicating Leaf sites can be the same for both
546 single-homed and multi-homed segments.

548 4.1 Known Unicast Traffic

550 For known unicast traffic, the PEs perform ingress filtering: On the
551 ingress PE, the C-MAC destination address lookup yields, in addition
552 to the target B-MAC address and forwarding adjacency, a flag which
553 indicates whether the target B-MAC is associated with a Root or a
554 Leaf site. The ingress PE cross-checks this flag with the status of
555 the originating site, and if both are Leafs, then the packet is not
556 forwarded.

558 4.2 BUM Traffic

560 For BUM traffic, the PEs must perform egress filtering. When a PE
561 receives a MAC Advertisement route (whose B-MAC will be used as a source
562 B-MAC for BUM traffic), it updates its egress filtering (based on the
563 source B-MAC address), as follows:

565 - If the MAC Advertisement route indicates that the advertised B-MAC
566 is a Leaf, and the local Ethernet Segment is a Leaf as well, then the
567 source B-MAC address is added to its B-MAC list used for egress
568 filtering - i.e., to block traffic from that B-MAC address.

570 - Otherwise, the B-MAC filtering list is not updated.
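The two rules above amount to a conditional insert into a per-segment filter list. A minimal Python sketch, assuming the list is modeled as a set of B-MAC strings and using a hypothetical function name and example addresses, could look like this:

```python
from typing import Set


def update_bmac_filter(filter_list: Set[str], advertised_bmac: str,
                       advertised_is_leaf: bool,
                       local_segment_is_leaf: bool) -> None:
    """Apply the egress-filter update rule for one received MAC Advertisement.

    filter_list holds the source B-MACs whose BUM traffic must be blocked
    towards the local Ethernet Segment.
    """
    if advertised_is_leaf and local_segment_is_leaf:
        # Leaf B-MAC and Leaf local segment: block traffic from that B-MAC.
        filter_list.add(advertised_bmac)
    # Otherwise the filtering list is left unchanged.


# Hypothetical B-MAC values, chosen only for the example.
leaf_segment_filter: Set[str] = set()
update_bmac_filter(leaf_segment_filter, "00:00:5e:00:53:aa", True, True)
update_bmac_filter(leaf_segment_filter, "00:00:5e:00:53:bb", False, True)
```

After these two calls only "00:00:5e:00:53:aa" is in the list, so a membership test on the source B-MAC of a received frame is enough to decide whether to block it.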
572 When the egress PE receives the packet, it examines the B-MAC source
573 address to check whether it should filter or forward the frame. Note
574 that this uses the same filtering logic as baseline [RFC7623] and
575 does not require any additional flags in the data-plane.

577 Just as in section 3.2, the PE places all Leaf Ethernet Segments of a
578 given bridge domain in a single split-horizon group in order to
579 prevent intra-PE forwarding among Leaf segments. This split-horizon
580 function applies to BUM traffic as well as known-unicast traffic.

582 4.3 E-Tree without MAC Learning

584 In scenarios where the traffic of interest is only multicast and/or
585 broadcast, the PEs implementing an E-Tree service do not need to do
586 any MAC learning. In such scenarios the filtering must be performed
587 on egress PEs. For PBB-EVPN, the handling of such traffic is per
588 section 4.2, without the C-MAC learning part at both ingress and
589 egress PEs.

591 5 BGP Encoding

593 This document defines one new BGP Extended Community for EVPN and
modifies the PMSI Tunnel attribute.

595 5.1 E-TREE Extended Community

597 This Extended Community is a new transitive Extended Community having
598 a Type field value of 0x06 (EVPN) and the Sub-Type 0x05. It is used
599 for leaf indication of known unicast and BUM traffic. For BUM
600 traffic, the Leaf Label field is set to a valid MPLS label and this
601 EC is advertised along with the Ethernet A-D per ES route with an ESI of
602 zero to enable egress filtering on disposition PEs per sections 3.2.1
603 and 3.2.3. There is no need to send the ESI Label Extended Community when
604 sending an Ethernet A-D per ES route with an ESI of zero. For known
605 unicast traffic, the Leaf flag bit is set to one and this EC is
606 advertised along with the MAC/IP Advertisement route per section 3.1.
608     The E-TREE Extended Community is encoded as an 8-octet value as
609     follows:

611      0                   1                   2                   3
612      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
613     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
614     | Type=0x06     | Sub-Type=0x05 | Flags(1 Octet)|  Reserved=0   |
615     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
616     |  Reserved=0   |           Leaf Label                          |
617     +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

619     The low-order bit of the Flags octet is defined as the
620     "Leaf-Indication" bit. A value of one indicates a Leaf AC/Site.

622     When this EC is advertised along with a MAC/IP Advertisement route
623     (for known unicast traffic), the Leaf-Indication flag MUST be set to
624     one and the Leaf Label is set to zero. The receiving PE should ignore
625     the Leaf Label and process only the Leaf-Indication flag. A value of
626     zero for the Leaf-Indication flag is invalid when sent along with a
627     MAC/IP Advertisement route, and an error should be logged.

629     When this EC is advertised along with an Ethernet A-D per ES route
630     (with an ESI of zero) for BUM traffic, the Leaf Label MUST be set to a
631     valid MPLS label and the Leaf-Indication flag should be set to zero.
632     The receiving PE should ignore the Leaf-Indication flag. An invalid
633     MPLS label, when sent along with the Ethernet A-D per ES route,
634     should be logged as an error.
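
As an illustration of the 8-octet layout in section 5.1, the following sketch packs and unpacks the E-TREE Extended Community. The function names are hypothetical. The placement of the 20-bit MPLS label in the high-order 20 bits of the 3-octet Leaf Label field is an assumption made for this sketch; the text above does not state the bit placement.

```python
import struct

EVPN_TYPE = 0x06        # Type field value (EVPN)
ETREE_SUB_TYPE = 0x05   # Sub-Type field value
LEAF_INDICATION = 0x01  # low-order bit of the Flags octet


def encode_etree_ec(leaf_indication: bool, leaf_label: int = 0) -> bytes:
    """Build the 8-octet E-TREE Extended Community (hypothetical helper)."""
    if leaf_label not in range(2 ** 20):
        raise ValueError("MPLS label must fit in 20 bits")
    flags = LEAF_INDICATION if leaf_indication else 0
    # ASSUMPTION: the label sits in the high-order 20 bits of the
    # 3-octet Leaf Label field, so it is shifted left by 4 bits.
    label_field = (leaf_label * 16).to_bytes(3, "big")
    # Type, Sub-Type, Flags, then the two Reserved octets set to zero.
    header = struct.pack("!BBBH", EVPN_TYPE, ETREE_SUB_TYPE, flags, 0)
    return header + label_field


def decode_etree_ec(data: bytes):
    """Return (leaf_indication, leaf_label) from an 8-octet value."""
    if len(data) != 8:
        raise ValueError("E-TREE Extended Community must be 8 octets")
    ec_type, sub_type, flags, _reserved = struct.unpack("!BBBH", data[:5])
    if (ec_type, sub_type) != (EVPN_TYPE, ETREE_SUB_TYPE):
        raise ValueError("not an E-TREE Extended Community")
    # Same ASSUMPTION as in the encoder: drop the low-order 4 bits.
    leaf_label = int.from_bytes(data[5:], "big") // 16
    return bool(flags & LEAF_INDICATION), leaf_label
```

Under these assumptions, a community sent with a MAC/IP Advertisement route would be built with encode_etree_ec(True), and one sent with an Ethernet A-D per ES route with encode_etree_ec(False, label).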
636  5.2 PMSI Tunnel Attribute

638     [RFC6514] defines the PMSI Tunnel attribute, which is an optional
639     transitive attribute with the following format:

641        +---------------------------------+
642        |  Flags (1 octet)                |
643        +---------------------------------+
644        |  Tunnel Type (1 octet)          |
645        +---------------------------------+
646        |  MPLS Label (3 octets)          |
647        +---------------------------------+
648        |  Tunnel Identifier (variable)   |
649        +---------------------------------+

651     This draft uses all fields per their existing definitions except for
652     the following modifications to the Tunnel Type and Tunnel Identifier:

654     When a receiver ingress-replication label is needed, the high-order
655     bit of the Tunnel Type field (C bit - Composite tunnel bit) is set
656     while the remaining low-order seven bits indicate the tunnel type as
657     before. When this C bit is set, the Tunnel Identifier field begins
658     with a three-octet label, followed by the actual tunnel identifier
659     for the transmit tunnel. PEs that don't understand the new meaning of
660     the high-order bit would treat the tunnel type as an invalid tunnel
661     type. For the PEs that do understand the new meaning of the
662     high-order bit, if ingress replication is desired when sending BUM
663     traffic, the PE will use the label in the Tunnel Identifier field
664     when sending its BUM traffic.

666     Using the Composite flag for Tunnel Types 0x00 'no tunnel information
667     present' and 0x06 'Ingress Replication' is invalid, and should be
668     treated as an invalid tunnel type on reception.

670  6 Acknowledgement

672     We would like to thank Dennis Cai, Antoni Przygienda, and Jeffrey
673     Zhang for their valuable comments. The authors would also like to
674     thank Thomas Morin for shepherding this document and providing
675     valuable comments.

677  7 Security Considerations

679     Since this draft uses the EVPN constructs of [RFC7432] and [RFC7623],
680     the security considerations of those documents are also applicable
681     here.
        Furthermore, this draft provides an additional security check by
682     allowing sites (or ACs) of an EVPN instance to be designated as
683     "Root" or "Leaf" and preventing any traffic exchange among "Leaf"
684     sites of that VPN through ingress filtering for known unicast traffic
685     and egress filtering for BUM traffic.

687  8 IANA Considerations

689     IANA has allocated value 5 in the "EVPN Extended Community Sub-Types"
690     registry defined in [RFC7153] as follows:

692        SUB-TYPE VALUE     NAME                        Reference

694        0x05               E-TREE Extended Community   This document

696  8.1 Considerations for PMSI Tunnel Types
697     The "P-Multicast Service Interface Tunnel (PMSI Tunnel) Tunnel Types"
698     registry in the "Border Gateway Protocol (BGP) Parameters" registry
699     needs to be updated to reflect the use of the most significant bit to
700     advertise the use of "composite tunnels" (section 5.2).

702     For this purpose, this document updates RFC7385.

704     The registry is to be updated by removing the entries for 0xFB-0xFE
705     and 0xFF, and replacing them with:

707     - 0x7B-0x7E Reserved for Experimental Use [this document]
708     - 0x7F Reserved [this document]
709     - 0x80-0xFF Not Allocatable, corresponds to Composite tunnel types
710     [this document]

712     The allocation policy for values 0x00 to 0x7A is IETF Review
713     [RFC5226]. The range for experimental use is now 0x7B-0x7E, and values
714     in this range are not to be assigned. The status of 0x7F may only be
715     changed through Standards Action [RFC5226].

717  9 References

719  9.1 Normative References

721     [KEYWORDS] Bradner, S., "Key words for use in RFCs to Indicate
722                Requirement Levels", BCP 14, RFC 2119, March 1997.

724     [RFC7432]  Sajassi et al., "BGP MPLS Based Ethernet VPN", February
725                2015.

727     [RFC7623]  Sajassi et al., "Provider Backbone Bridging Combined with
728                Ethernet VPN (PBB-EVPN)", September 2015.

730     [RFC7385]  Andersson et al., "IANA Registry for P-Multicast
731                Service Interface (PMSI) Tunnel Type Code Points",
732                October 2014.
734     [RFC7153]  Rosen et al., "IANA Registries for BGP Extended
735                Communities", March 2014.

737     [RFC6514]  Aggarwal et al., "BGP Encodings and Procedures
738                for Multicast in MPLS/BGP IP VPNs", February 2012.

740  9.2 Informative References

742     [RFC7387]  Key et al., "A Framework for E-Tree Service over MPLS
743                Network", October 2014.

745     [RFC4360]  Sangli et al., "BGP Extended Communities Attribute",
746                February 2006.

748  Contributors

750     In addition to the authors listed on the front page, the following
751     co-authors have also contributed to this document:

753     Wim Henderickx
754     Nokia

756     Aldrin Isaac
757     Wen Lin
758     Juniper

760  Authors' Addresses

762     Ali Sajassi
763     Cisco
764     Email: sajassi@cisco.com

766     Samer Salam
767     Cisco
768     Email: ssalam@cisco.com

770     John Drake
771     Juniper
772     Email: jdrake@juniper.net

774     Jim Uttaro
775     AT&T
776     Email: ju1738@att.com

778     Sami Boutros
779     VMware
780     Email: sboutros@vmware.com
781     Jorge Rabadan
782     Nokia
783     Email: jorge.rabadan@nokia.com