idnits 2.17.1

draft-ietf-bess-evpn-etree-10.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (May 9, 2017) is 2537 days in the past. Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Outdated reference: A later version (-15) exists of
     draft-ietf-bess-evpn-inter-subnet-forwarding-03

  -- Obsolete informational reference (is this intentional?): RFC 5226
     (Obsoleted by RFC 8126)

     Summary: 0 errors (**), 0 flaws (~~), 2 warnings (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

BESS Workgroup                                           A. Sajassi, Ed.
INTERNET-DRAFT                                                  S. Salam
Intended Status: Standards Track                                   Cisco
Updates: 7385                                                   J. Drake
         6514                                                    Juniper
                                                               J. Uttaro
                                                                     ATT
                                                              S. Boutros
                                                                  VMware
                                                                      J.
                                                                 Rabadan
                                                                   Nokia

Expires: November 9, 2017                                    May 9, 2017


                   E-TREE Support in EVPN & PBB-EVPN
                     draft-ietf-bess-evpn-etree-10

Abstract

   The Metro Ethernet Forum (MEF) has defined a rooted-multipoint
   Ethernet service known as Ethernet Tree (E-Tree). A solution
   framework for supporting this service in MPLS networks is proposed in
   RFC7387 ("A Framework for Ethernet Tree (E-Tree) Service over a
   Multiprotocol Label Switching (MPLS) Network"). This document
   discusses how those functional requirements can be easily met with
   Ethernet VPN (EVPN) and how EVPN offers a more efficient
   implementation of these functions. This document makes use of the
   most significant bit of the scope governed by the IANA registry
   created by RFC7385, and hence updates RFC7385 accordingly.

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/1id-abstracts.html

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

Copyright and License Notice

   Copyright (c) 2017 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document. Code Components
   extracted from this document must include Simplified BSD License text
   as described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Simplified BSD License.

Table of Contents

   1  Introduction
     1.1  Terminology
   2  E-Tree Scenarios
     2.1  Scenario 1: Leaf OR Root site(s) per PE
     2.2  Scenario 2: Leaf OR Root site(s) per AC
     2.3  Scenario 3: Leaf OR Root site(s) per MAC
   3  Operation for EVPN
     3.1  Known Unicast Traffic
     3.2  BUM Traffic
       3.2.1  BUM traffic originated from a single-homed site on a
              leaf AC
       3.2.2  BUM traffic originated from a single-homed site on a
              root AC
       3.2.3  BUM traffic originated from a multi-homed site on a
              leaf AC
       3.2.4  BUM traffic originated from a multi-homed site on a
              root AC
     3.3  E-TREE Traffic Flows for EVPN
       3.3.1  E-Tree with MAC Learning
       3.3.2  E-Tree without MAC Learning
   4  Operation for PBB-EVPN
     4.1  Known Unicast Traffic
     4.2  BUM Traffic
     4.3  E-Tree without MAC Learning
   5  BGP Encoding
     5.1  E-TREE Extended Community
     5.2  PMSI Tunnel Attribute
   6  Acknowledgement
   7  Security Considerations
   8  IANA Considerations
     8.1  Considerations for PMSI Tunnel Types
   9  References
     9.1  Normative References
     9.2  Informative References
   Appendix-A
   Contributors
   Authors' Addresses

1  Introduction

   The Metro Ethernet Forum (MEF) has defined a rooted-multipoint
   Ethernet service known as Ethernet Tree (E-Tree) [MEF6.1]. In an
   E-Tree service, Attachment Circuits (ACs) are labeled as either Root
   or Leaf ACs. Root ACs can communicate with all other ACs. Leaf ACs
   can communicate with Root ACs but not with other Leaf ACs.

   [RFC7387] proposes the solution framework for supporting E-Tree
   service in MPLS networks. The document identifies the functional
   components of the overall solution to emulate E-Tree services in
   addition to Ethernet LAN (E-LAN) services on an existing MPLS
   network.

   [RFC7432] is a solution for multipoint L2VPN services, with advanced
   multi-homing capabilities, using BGP for distributing customer/client
   MAC address reachability information over the MPLS/IP network.
   [RFC7623] combines the functionality of EVPN with [802.1ah] Provider
   Backbone Bridging (PBB) for MAC address scalability.

   This document discusses how the functional requirements for E-Tree
   service can be met with (PBB-)EVPN and how (PBB-)EVPN offers a more
   efficient implementation of these functions. Section 2 discusses
   E-TREE scenarios. Sections 3 and 4 describe E-TREE solutions for EVPN
   and PBB-EVPN respectively, and section 5 covers BGP encoding for
   E-TREE solutions.

1.1  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [KEYWORDS].

2  E-Tree Scenarios

   This document categorizes E-Tree scenarios into the following three
   scenarios, depending on the nature of the Root/Leaf site association:

   - Leaf OR Root site(s) per PE

   - Leaf OR Root site(s) per Attachment Circuit (AC)

   - Leaf OR Root site(s) per MAC

2.1  Scenario 1: Leaf OR Root site(s) per PE

   In this scenario, a PE may receive traffic from either Root ACs OR
   Leaf ACs for a given MAC-VRF/bridge table, but not both concurrently.
   In other words, a given EVI on a PE is either associated with root(s)
   or leaf(s). The PE may have both Root and Leaf ACs albeit for
   different EVIs.

                 +---------+            +---------+
                 |   PE1   |            |   PE2   |
  +---+          |  +---+  |  +------+  |  +---+  |            +---+
  |CE1+---AC1----+--+   |  |  | MPLS |  |  |   +--+----AC2-----+CE2|
  +---+  (Root)  |  |MAC|  |  |  /IP |  |  |MAC|  |   (Leaf)   +---+
                 |  |VRF|  |  |      |  |  |VRF|  |
                 |  |   |  |  |      |  |  |   |  |            +---+
                 |  |   |  |  |      |  |  |   +--+----AC3-----+CE3|
                 |  +---+  |  +------+  |  +---+  |   (Leaf)   +---+
                 +---------+            +---------+

                          Figure 1: Scenario 1

   In such a scenario, tailored BGP Route Target (RT) import/export
   policies among the PEs belonging to the same EVI can be used to
   restrict the communications among Leaf PEs.
   To restrict the communications among Leaf ACs connected to the same
   PE and belonging to the same EVI, split-horizon filtering is used to
   block traffic from one Leaf AC to another Leaf AC on a MAC-VRF for a
   given E-TREE EVI. The purpose of this topology constraint is to avoid
   having PEs with only Leaf sites importing and processing BGP MAC
   routes from each other. To support such a topology constraint in
   EVPN, two BGP Route-Targets (RTs) are used for every EVPN Instance
   (EVI): one RT is associated with the Root sites (Root ACs) and the
   other is associated with the Leaf sites (Leaf ACs). On a per EVI
   basis, every PE exports the single RT associated with its type of
   site(s). Furthermore, a PE with Root site(s) imports both Root and
   Leaf RTs, whereas a PE with Leaf site(s) only imports the Root RT.

2.2  Scenario 2: Leaf OR Root site(s) per AC

   In this scenario, a PE can receive traffic from both Root ACs and
   Leaf ACs for a given EVI. In other words, a given EVI on a PE can be
   associated with both root(s) and leaf(s).

                 +---------+            +---------+
                 |   PE1   |            |   PE2   |
  +---+          |  +---+  |  +------+  |  +---+  |            +---+
  |CE1+---AC1----+--+   |  |  |      |  |  |   +--+----AC2-----+CE2|
  +---+  (Leaf)  |  |MAC|  |  | MPLS |  |  |MAC|  |   (Leaf)   +---+
                 |  |VRF|  |  |  /IP |  |  |VRF|  |
                 |  |   |  |  |      |  |  |   |  |            +---+
                 |  |   |  |  |      |  |  |   +--+----AC3-----+CE3|
                 |  +---+  |  +------+  |  +---+  |   (Root)   +---+
                 +---------+            +---------+

                          Figure 2: Scenario 2

   In this scenario, just like the previous scenario (in section 2.1),
   two Route Targets (one for Root and another for Leaf) can be used.
   However, the difference is that on a PE with both Root and Leaf ACs,
   all remote MAC routes are imported and thus there needs to be a way
   to differentiate remote MAC routes associated with Leaf ACs versus
   the ones associated with Root ACs in order to apply the proper
   ingress filtering.
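
   The two-RT import/export rule of section 2.1 can be illustrated with
   a minimal Python sketch. This is only an illustration under stated
   assumptions: the names rt_policy and imports_from and the RT strings
   "65000:1" and "65000:2" are hypothetical and do not come from this
   document or from any BGP implementation.

```python
from typing import NamedTuple, Set


class RtPolicy(NamedTuple):
    """Per-EVI Route Target policy of one PE (hypothetical structure)."""
    export_rts: Set[str]
    import_rts: Set[str]


def rt_policy(has_root_sites: bool, root_rt: str, leaf_rt: str) -> RtPolicy:
    """Policy for a PE whose EVI has only Root sites or only Leaf sites.

    A PE exports the single RT matching its site type. A Root PE imports
    both RTs; a Leaf PE imports only the Root RT.
    """
    if has_root_sites:
        return RtPolicy(export_rts={root_rt}, import_rts={root_rt, leaf_rt})
    return RtPolicy(export_rts={leaf_rt}, import_rts={root_rt})


def imports_from(receiver: RtPolicy, sender: RtPolicy) -> bool:
    """True if the receiver imports routes exported by the sender."""
    return bool(receiver.import_rts & sender.export_rts)


# Illustrative RT values only.
root_pe = rt_policy(True, "65000:1", "65000:2")
leaf_pe = rt_policy(False, "65000:1", "65000:2")
```

   With these policies a Leaf-only PE never imports MAC routes from
   another Leaf-only PE, which is the topology constraint that
   section 2.1 describes.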

   In order to recognize the association of a destination MAC address to
   a Leaf or Root AC and thus support ingress filtering on the ingress
   PE with both Leaf and Root ACs, MAC addresses need to be colored with
   Root or Leaf indication before advertisements to other PEs. There are
   two approaches for such coloring:

   A) To always use two RTs (one to designate Leaf RT and another for
      Root RT)

   B) To allow for a single RT to be used per EVI just like [RFC7432]
      and thus color MAC addresses via a "color" flag in a new extended
      community as detailed in section 3.1.

   Approach (A) would require the same data plane enhancements as
   approach (B) if the MAC-VRF and bridge tables used per VLAN are to
   remain consistent with [RFC7432] (section 6). In order to avoid
   data-plane enhancements for approach (A), multiple bridge tables per
   VLAN may be considered; however, this has major drawbacks as
   described in appendix-A and thus is not recommended.

   Given that both approaches (A) and (B) would require the exact same
   data-plane enhancements, approach (B) is chosen here in order to
   allow for RT usage consistent with baseline EVPN [RFC7432] and for
   better generality. It should be noted that if one wants to use RT
   constraint in order to avoid MAC advertisements associated with a
   Leaf AC to PEs with only Leaf ACs, then two RTs (one for Root and
   another for Leaf) can still be used with approach (B); however, in
   such applications Leaf/Root RTs will be used to constrain MAC
   advertisements and they are not used to color the MAC routes for
   ingress filtering - i.e., in approach (B), the coloring is always
   done via the new extended community.

   For this scenario, if for a given EVI, a significant number of PEs
   have both Leaf and Root sites attached, even though they may start as
   Root-only or Leaf-only PEs, then a single RT per EVI should be used.
   The reason for such a recommendation is to alleviate the
   configuration overhead associated with using two RTs per EVI at the
   expense of having some unwanted MAC addresses on the Leaf-only PEs.

2.3  Scenario 3: Leaf OR Root site(s) per MAC

   In this scenario, a PE may receive traffic from both Root AND Leaf
   sites on a single Attachment Circuit (AC) of an EVI. This scenario is
   not covered in either [RFC7387] or [MEF6.1]; however, it is covered
   in this document for the sake of completeness. In this scenario,
   since an AC carries traffic from both Root and Leaf sites, the
   granularity at which Root or Leaf sites are identified is per MAC
   address. This scenario is considered in this document for EVPN
   service with only known unicast traffic because the Designated
   Forwarder (DF) filtering per [RFC7432] would not be compatible with
   the required egress filtering - i.e., Broadcast, Unknown, and
   Multicast (BUM) traffic is not supported in this scenario and it is
   dropped by the ingress PE.

                 +---------+            +---------+
                 |   PE1   |            |   PE2   |
  +---+          |  +---+  |  +------+  |  +---+  |            +---+
  |CE1+---AC1----+--+   |  |  |      |  |  |   +--+----AC2-----+CE2|
  +---+  (Root)  |  | E |  |  | MPLS |  |  | E |  | (Leaf/Root)+---+
                 |  | V |  |  |  /IP |  |  | V |  |
                 |  | I |  |  |      |  |  | I |  |            +---+
                 |  |   |  |  |      |  |  |   +--+----AC3-----+CE3|
                 |  +---+  |  +------+  |  +---+  |   (Leaf)   +---+
                 +---------+            +---------+

                          Figure 3: Scenario 3

3  Operation for EVPN

   [RFC7432] defines the notion of Ethernet Segment Identifier (ESI)
   MPLS label used for split-horizon filtering of BUM traffic at the
   egress PE. Such egress filtering capabilities can be leveraged in
   the provision of E-TREE services as seen shortly.
   In other words, [RFC7432] has inherent capability to support E-TREE
   services without defining any new BGP routes but by just defining a
   new BGP Extended Community for leaf indication as shown later in this
   document (section 5.1).

3.1  Known Unicast Traffic

   Since in EVPN, MAC learning is performed in control plane via
   advertisement of BGP routes, the filtering needed by E-TREE service
   for known unicast traffic can be performed at the ingress PE, thus
   providing very efficient filtering and avoiding sending known unicast
   traffic over MPLS/IP core to be filtered at the egress PE as done in
   traditional E-TREE solutions (e.g., E-TREE for VPLS [RFC7796]).

   To provide such ingress filtering for known unicast traffic, a PE
   MUST indicate to other PEs what kind of sites (root or leaf) its MAC
   addresses are associated with by advertising a leaf indication flag
   (via an Extended Community) along with each of its MAC/IP
   Advertisement routes. The lack of such a flag indicates that the MAC
   address is associated with a root site. This scheme applies to all
   scenarios described in section 2.

   Tagging MAC addresses with a leaf indication enables remote PEs to
   perform ingress filtering for known unicast traffic - i.e., on the
   ingress PE, the MAC destination address lookup yields, in addition to
   the forwarding adjacency, a flag which indicates whether the target
   MAC is associated with a Leaf site or not. The ingress PE
   cross-checks this flag with the status of the originating AC, and if
   both are Leafs, then the packet is not forwarded.

   In situations where MAC moves are allowed among Leaf and Root sites
   (e.g., non-static MAC), PEs can receive multiple MAC/IP Advertisement
   routes for the same MAC address with different Leaf/Root indications
   (and possibly different ESIs for multi-homing scenarios).
   In such situations, MAC mobility procedures (section 15 of [RFC7432])
   take precedence to first identify the location of the MAC before
   associating that MAC with a Root or a Leaf site.

   To support the above ingress filtering functionality, a new E-TREE
   Extended Community with a Leaf indication flag is introduced (section
   5.1). This new Extended Community MUST be advertised with each MAC/IP
   Advertisement route. Besides the MAC/IP Advertisement route, no other
   EVPN routes are required to carry this new extended community.

3.2  BUM Traffic

   This specification does not support filtering BUM (Broadcast,
   Unknown, and Multicast) traffic on the ingress PE. Unlike the known
   unicast traffic described above, BUM traffic is multi-destination, so
   the ingress PE cannot decide per destination whether to forward it.
   As such, the solution relies on egress filtering. In order to apply
   the proper egress filtering, which varies based on whether a packet
   is sent from a Leaf AC or a root AC, the MPLS-encapsulated frames
   MUST be tagged with an indication that they originated from a Leaf AC
   - i.e., to be tagged with a Leaf label as specified in section 5.1.

   The Leaf label can be upstream assigned for P2MP LSP or downstream
   assigned for ingress replication tunnels. The main difference between
   downstream and upstream assigned Leaf labels is that in the
   downstream-assigned case not all egress PE devices need to receive
   the label, just like the ESI label for ingress replication procedures
   defined in [RFC7432].

   On the ingress PE, the PE needs to place all its Leaf ACs for a given
   bridge domain in a single split-horizon group in order to prevent
   intra-PE forwarding among its Leaf ACs. This intra-PE split-horizon
   filtering applies to BUM traffic as well as known-unicast traffic.

   There are four scenarios to consider as follows.
   In all these scenarios, the ingress PE imposes the right MPLS label
   associated with the originated Ethernet Segment (ES) depending on
   whether the Ethernet frame originated from a Root or a Leaf site on
   that Ethernet Segment (ESI label or Leaf label). The mechanism by
   which the PE identifies whether a given frame originated from a Root
   or a Leaf site on the segment is based on the AC identifier for that
   segment (e.g., Ethernet Tag of the frame for 802.1Q frames). Other
   mechanisms for identifying root or leaf (e.g., on a per MAC address
   basis) are beyond the scope of this document.

3.2.1  BUM traffic originated from a single-homed site on a leaf AC

   In this scenario, the ingress PE adds a Leaf label advertised using
   the E-Tree Extended Community (Section 5.1) indicating a Leaf site.
   This Leaf label, used for single-homing scenarios, is not on a per ES
   basis but rather on a per PE basis - i.e., a single Leaf MPLS label
   is used for all single-homed ES's on that PE. This Leaf label is
   advertised to other PE devices, using the E-TREE Extended Community
   (section 5.1) along with an Ethernet A-D per ES route with ESI of
   zero and a set of Route Targets (RTs) corresponding to all EVIs on
   the PE with at least one leaf site per EVI. The set of Ethernet A-D
   per ES routes may be needed if the number of Route Targets (RTs) that
   need to be sent exceeds the limit on a single route per [RFC7432].
   The ESI for the Ethernet A-D per ES route is set to zero to indicate
   single-homed sites.

   When a PE receives this special Leaf label in the data path, it
   blocks the packet if the destination AC is of type Leaf; otherwise,
   it forwards the packet.

3.2.2  BUM traffic originated from a single-homed site on a root AC

   In this scenario, the ingress PE does not add any ESI label or Leaf
   label and it operates per [RFC7432] procedures.
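
   The Leaf-label egress check of section 3.2.1 and the intra-PE
   split-horizon rule of section 3.2 can be sketched in Python as
   follows. This is a simplified illustration, not a forwarding-plane
   implementation: the AttachmentCircuit type and both function names
   are hypothetical, and the Designated Forwarder election and ESI
   label processing of [RFC7432] are deliberately left out.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class AttachmentCircuit:
    """Hypothetical model of a local AC in one bridge domain."""
    name: str
    is_leaf: bool


def egress_bum_targets(local_acs: List[AttachmentCircuit],
                       carries_leaf_label: bool) -> List[AttachmentCircuit]:
    """ACs to which an egress PE floods a BUM frame received from the core.

    A frame carrying the Leaf label is blocked towards Leaf ACs and
    forwarded towards Root ACs. A frame without it goes to all ACs.
    """
    if carries_leaf_label:
        return [ac for ac in local_acs if not ac.is_leaf]
    return list(local_acs)


def local_bum_targets(local_acs: List[AttachmentCircuit],
                      source_ac: AttachmentCircuit) -> List[AttachmentCircuit]:
    """Intra-PE flooding: all Leaf ACs share one split-horizon group."""
    return [
        ac for ac in local_acs
        if ac != source_ac and not (source_ac.is_leaf and ac.is_leaf)
    ]
```

   For a PE with local ACs AC2 (Leaf) and AC3 (Root), a frame that
   arrives with the Leaf label is flooded to AC3 only, while a frame
   without the Leaf label is flooded to both.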

3.2.3  BUM traffic originated from a multi-homed site on a leaf AC

   In this scenario, it is assumed that while different ACs (VLANs) on
   the same ES could have different root/leaf designation (some being
   roots and some being leafs), the same VLAN does have the same
   root/leaf designation on all PEs on the same ES. Furthermore, it is
   assumed that there is no forwarding among subnets - i.e., the service
   is EVPN L2 and not EVPN IRB [EVPN-IRB]. IRB use cases described in
   [EVPN-IRB] are outside the scope of this document.

   In such scenarios, if a multicast or broadcast packet is originated
   from a leaf AC, then it only needs to carry the Leaf label described
   in section 3.2.1. This label is sufficient to provide the necessary
   egress filtering, which prevents BUM traffic from being sent to leaf
   ACs, including the leaf AC on the same Ethernet Segment.

3.2.4  BUM traffic originated from a multi-homed site on a root AC

   In this scenario, both the ingress and egress PE devices follow the
   procedure defined in [RFC7432] for adding and/or processing an ESI
   MPLS label.

3.3  E-TREE Traffic Flows for EVPN

   Per [RFC7387], a generic E-Tree service supports all of the following
   traffic flows:

   - Ethernet known unicast from Root to Roots & Leafs
   - Ethernet known unicast from Leaf to Root
   - Ethernet BUM traffic from Root to Roots & Leafs
   - Ethernet BUM traffic from Leaf to Roots

   A particular E-Tree service may need to support all of the above
   types of flows or only a select subset, depending on the target
   application. In the case where unicast flows need not be supported,
   the L2VPN PEs can avoid performing any MAC learning function.

   The following subsections will describe the operation of EVPN to
   support E-Tree service with and without MAC learning.
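
   The flow list of section 3.3, combined with the known unicast
   ingress check of section 3.1, can be summarized in a short Python
   sketch. The function names and the shape of the MAC table (a
   dictionary from MAC address to a next hop and a leaf flag) are
   assumptions made for this illustration only.

```python
from typing import Dict, Optional, Tuple

# Hypothetical MAC table: destination MAC -> (next hop, leaf flag).
# The leaf flag stands for the Leaf indication learned with the
# MAC/IP Advertisement route; its absence means a Root site.
MacTable = Dict[str, Tuple[str, bool]]


def flow_permitted(source_is_leaf: bool, destination_is_leaf: bool) -> bool:
    """E-Tree rule: every flow is allowed except Leaf to Leaf."""
    return not (source_is_leaf and destination_is_leaf)


def ingress_known_unicast(source_ac_is_leaf: bool, dst_mac: str,
                          mac_table: MacTable) -> Optional[str]:
    """Return the next hop, or None when the ingress PE drops the frame.

    A destination that is missing from the table raises KeyError here;
    such a frame is unknown unicast, i.e. BUM traffic, which this
    sketch does not cover.
    """
    next_hop, dst_is_leaf = mac_table[dst_mac]
    if not flow_permitted(source_ac_is_leaf, dst_is_leaf):
        return None
    return next_hop


# Illustrative table with made-up MAC addresses and PE names.
table: MacTable = {
    "00:00:5e:00:53:01": ("PE2", True),   # advertised with Leaf indication
    "00:00:5e:00:53:02": ("PE2", False),  # no Leaf indication: Root
}
```

   Because the check runs on the ingress PE, a Leaf-to-Leaf known
   unicast frame is dropped before it crosses the MPLS/IP core.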

3.3.1  E-Tree with MAC Learning

   The PEs implementing an E-Tree service must perform MAC learning when
   unicast traffic flows must be supported among Root and Leaf sites. In
   this case, the PE(s) with Root sites perform MAC learning in the
   data-path over the Ethernet Segments, and advertise reachability in
   EVPN MAC/IP Advertisement Routes. These routes will be imported by
   all PEs for that EVI (i.e., PEs that have Leaf sites as well as PEs
   that have Root sites). Similarly, the PEs with Leaf sites perform MAC
   learning in the data-path over their Ethernet Segments, and advertise
   reachability in EVPN MAC/IP Advertisement Routes. For the scenario
   described in section 2.1 (or possibly section 2.2), these routes are
   imported only by PEs with at least one Root site in the EVI - i.e., a
   PE with only Leaf sites will not import these routes. PEs with Root
   and/or Leaf sites may use the Ethernet A-D routes for aliasing (in
   the case of multi-homed segments) and for mass MAC withdrawal per
   [RFC7432].

   To support multicast/broadcast from Root to Leaf sites, either a P2MP
   tree rooted at the PE(s) with the Root site(s) or ingress replication
   can be used (section 16 of [RFC7432]). The multicast tunnels are set
   up through the exchange of the EVPN Inclusive Multicast route, as
   defined in [RFC7432].

   To support multicast/broadcast from Leaf to Root sites, ingress
   replication should be sufficient for most scenarios where there are
   only a few Roots (typically two). Therefore, in a typical scenario, a
   root PE needs to support both a P2MP tunnel in the transmit direction
   from itself to leaf PEs and at the same time it needs to support
   ingress-replication tunnels in the receive direction from leaf PEs to
   itself. In order to signal this efficiently from the root PE, a new
   composite tunnel type is defined per section 5.2.
   This new composite tunnel type is advertised by the root PE to
   simultaneously indicate a P2MP tunnel in the transmit direction and
   an ingress-replication tunnel in the receive direction for the BUM
   traffic.

   If the number of Roots is large, P2MP tunnels originated at the PEs
   with Leaf sites may be used and thus there will be no need to use the
   modified PMSI tunnel attribute in section 5.2 for composite tunnel
   type.

3.3.2  E-Tree without MAC Learning

   The PEs implementing an E-Tree service need not perform MAC learning
   when the traffic flows between Root and Leaf sites are mainly
   multicast or broadcast. In this case, the PEs do not exchange EVPN
   MAC/IP Advertisement Routes. Instead, the Inclusive Multicast
   Ethernet Tag route is used to support BUM traffic.

   The fields of this route are populated per the procedures defined in
   [RFC7432], and the multicast tunnel setup criteria are as described
   in the previous section.

   Just as in the previous section, if only a few PEs have root sites
   and thus ingress replication is desired from leaf PEs to these root
   PEs, then the modified PMSI attribute as defined in section 5.2
   should be used.

4  Operation for PBB-EVPN

   In PBB-EVPN, the PE advertises a Root/Leaf indication along with each
   B-MAC Advertisement route, to indicate whether the associated B-MAC
   address corresponds to a Root or a Leaf site. Just like the EVPN
   case, the new E-TREE Extended Community defined in section 5.1 is
   advertised with each MAC Advertisement route.

   In the case where a multi-homed Ethernet Segment has both Root and
   Leaf sites attached, two B-MAC addresses are advertised: one B-MAC
   address is per ES as specified in [RFC7623] and implicitly denoting
   Root, and the other B-MAC address is per PE and explicitly denoting
   Leaf.
   The former B-MAC address is not advertised with the E-TREE extended
   community but the latter B-MAC denoting Leaf is advertised with the
   new E-TREE extended community where the "Leaf-indication" flag is
   set. In such multi-homing scenarios where an Ethernet Segment has
   both Root and Leaf ACs, it is assumed that while different ACs
   (VLANs) on the same ES could have different root/leaf designation
   (some being roots and some being leafs), the same VLAN does have the
   same root/leaf designation on all PEs on the same ES. Furthermore, it
   is assumed that there is no forwarding among subnets - i.e., the
   service is L2 and not IRB. The IRB use case is outside the scope of
   this document.

   The ingress PE uses the right B-MAC source address depending on
   whether the Ethernet frame originated from the Root or Leaf AC on
   that Ethernet Segment. The mechanism by which the PE identifies
   whether a given frame originated from a Root or Leaf site on the
   segment is based on the Ethernet Tag associated with the frame. Other
   mechanisms of identification, beyond the Ethernet Tag, are outside
   the scope of this document.

   Furthermore, a PE advertises two special global B-MAC addresses: one
   for Root and another for Leaf, and tags the Leaf one as such in the
   MAC Advertisement route. These B-MAC addresses are used as source
   addresses for traffic originating from single-homed segments. The
   B-MAC address used for indicating Leaf sites can be the same for both
   single-homed and multi-homed segments.

4.1  Known Unicast Traffic

   For known unicast traffic, the PEs perform ingress filtering: On the
   ingress PE, the C-MAC destination address lookup yields, in addition
   to the target B-MAC address and forwarding adjacency, a flag which
   indicates whether the target B-MAC is associated with a Root or a
   Leaf site.
   The ingress PE cross-checks this flag with the status of the
   originating site, and if both are Leafs, then the packet is not
   forwarded.

4.2  BUM Traffic

   For BUM traffic, the PEs must perform egress filtering. When a PE
   receives a MAC advertisement route (which will be used as a source
   B-MAC for BUM traffic), it updates its egress filtering (based on the
   source B-MAC address), as follows:

   - If the MAC Advertisement route indicates that the advertised B-MAC
     is a Leaf, and the local Ethernet Segment is a Leaf as well, then
     the source B-MAC address is added to its B-MAC list used for egress
     filtering - i.e., to block traffic from that B-MAC address.

   - Otherwise, the B-MAC filtering list is not updated.

   When the egress PE receives the packet, it examines the B-MAC source
   address to check whether it should filter or forward the frame. Note
   that this uses the same filtering logic as baseline [RFC7623] and
   does not require any additional flags in the data-plane.

   Just as in section 3.2, the PE places all Leaf Ethernet Segments of a
   given bridge domain in a single split-horizon group in order to
   prevent intra-PE forwarding among Leaf segments. This split-horizon
   function applies to BUM traffic as well as known-unicast traffic.

4.3  E-Tree without MAC Learning

   In scenarios where the traffic of interest is only multicast and/or
   broadcast, the PEs implementing an E-Tree service do not need to do
   any MAC learning. In such scenarios the filtering must be performed
   on egress PEs. For PBB-EVPN, the handling of such traffic is per
   section 4.2, without the C-MAC learning part of it at both ingress
   and egress PEs.

5  BGP Encoding

   This document defines a new BGP Extended Community for EVPN.

5.1  E-TREE Extended Community

   This Extended Community is a new transitive Extended Community
   [RFC4360] having a Type field value of 0x06 (EVPN) and the Sub-Type
   0x05.
   It is used for leaf indication of known unicast and BUM traffic.

   The E-TREE Extended Community is encoded as an 8-octet value as
   follows:

       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      | Type=0x06     | Sub-Type=0x05 | Flags(1 Octet)|  Reserved=0   |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
      |  Reserved=0   |           Leaf Label                          |
      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                  Figure 4: E-TREE Extended Community

   The low-order bit of the Flags octet is defined as the
   "Leaf-Indication" bit. A value of one indicates a Leaf AC/Site. The
   rest of the flag bits should be set to zero.

   When this Extended Community (EC) is advertised along with a MAC/IP
   Advertisement route (for known unicast traffic) per section 3.1, the
   Leaf-Indication flag MUST be set to one and the Leaf Label SHOULD be
   set to zero. The label value is encoded in the high-order 20 bits of
   the Leaf Label field. The receiving PE SHOULD ignore the Leaf Label
   and only process the Leaf-Indication flag. A value of zero for the
   Leaf-Indication flag is invalid when sent along with a MAC/IP
   Advertisement route and an error should be logged.

   When this EC is advertised along with an Ethernet A-D per ES route
   (with ESI of zero) for BUM traffic to enable egress filtering on
   disposition PEs per sections 3.2.1 and 3.2.3, the Leaf Label MUST be
   set to a valid MPLS label (i.e., non-reserved assigned MPLS label
   [RFC3032]) and the Leaf-Indication flag SHOULD be set to zero. The
   receiving PE SHOULD ignore the Leaf-Indication flag. An invalid MPLS
   label sent along with the Ethernet A-D per ES route should be ignored
   and logged as an error.

   The reserved bits should be set to zero by the transmitter and should
   be ignored by the receiver.
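
   The 8-octet layout of Figure 4 can be exercised with a small Python
   sketch built on the standard struct module. The function names are
   hypothetical; the field layout follows the figure, with the 20-bit
   label stored in the high-order bits of the 3-octet Leaf Label field.

```python
import struct
from typing import Tuple

ETREE_TYPE = 0x06        # EVPN
ETREE_SUBTYPE = 0x05     # E-TREE Extended Community
LEAF_INDICATION = 0x01   # low-order bit of the Flags octet

# Type, Sub-Type, Flags, Reserved, Reserved, 3-octet Leaf Label field.
_LAYOUT = "!BBBBB3s"


def encode_etree_ec(leaf_indication: bool, leaf_label: int = 0) -> bytes:
    """Encode the E-TREE Extended Community as 8 octets."""
    if not 0 <= leaf_label < (1 << 20):
        raise ValueError("MPLS label must fit in 20 bits")
    flags = LEAF_INDICATION if leaf_indication else 0
    # The label occupies the high-order 20 bits of the 24-bit field.
    label_field = (leaf_label << 4).to_bytes(3, "big")
    return struct.pack(_LAYOUT, ETREE_TYPE, ETREE_SUBTYPE, flags,
                       0, 0, label_field)


def decode_etree_ec(data: bytes) -> Tuple[bool, int]:
    """Return (leaf_indication, leaf_label); reserved bits are ignored."""
    if len(data) != 8:
        raise ValueError("extended community must be 8 octets")
    ec_type, sub_type, flags, _, _, label_field = struct.unpack(_LAYOUT, data)
    if (ec_type, sub_type) != (ETREE_TYPE, ETREE_SUBTYPE):
        raise ValueError("not an E-TREE Extended Community")
    leaf_label = int.from_bytes(label_field, "big") >> 4
    return bool(flags & LEAF_INDICATION), leaf_label
```

   A caller would use encode_etree_ec(True) for a MAC/IP Advertisement
   route and encode_etree_ec(False, label) for the Ethernet A-D per ES
   route, matching the two uses described above. Checks such as
   rejecting reserved label values are left out of this sketch.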

5.2 PMSI Tunnel Attribute

   [RFC6514] defines the PMSI Tunnel attribute, which is an optional
   transitive attribute with the following format:

      +---------------------------------+
      |  Flags (1 octet)                |
      +---------------------------------+
      |  Tunnel Type (1 octets)         |
      +---------------------------------+
      |  MPLS Label (3 octets)          |
      +---------------------------------+
      |  Tunnel Identifier (variable)   |
      +---------------------------------+

                 Figure 5: PMSI Tunnel Attribute

   This document defines a new Composite tunnel type by introducing a
   new 'Composite Tunnel' bit in the Tunnel Type field and adding an
   MPLS label to the Tunnel Identifier field of the PMSI Tunnel
   attribute, as detailed below. This document uses all other remaining
   fields per their existing definition. The Composite tunnel type is
   advertised by the root PE to simultaneously indicate a P2MP tunnel in
   the transmit direction and an ingress-replication tunnel in the
   receive direction for BUM traffic.

   When a receiver ingress-replication label is needed, the high-order
   bit of the Tunnel Type field (Composite Tunnel bit) is set while the
   remaining low-order seven bits indicate the tunnel type as before.
   When this Composite Tunnel bit is set, the Tunnel Identifier field
   begins with a three-octet label, followed by the actual tunnel
   identifier for the transmit tunnel. PEs that don't understand the
   new meaning of the high-order bit would treat the tunnel type as an
   undefined tunnel type and would treat the PMSI Tunnel attribute as a
   malformed attribute [RFC6514]. For the PEs that do understand the new
   meaning of the high-order bit, if ingress replication is desired when
   sending BUM traffic, the PE will use the label in the Tunnel
   Identifier field when sending its BUM traffic.
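
   As a non-normative illustration, the following Python sketch shows
   how a receiver that understands the Composite Tunnel bit might split
   the Tunnel Type octet and the Tunnel Identifier field. The function
   name and return shape are hypothetical. The sketch returns the three
   leading label octets unparsed, since it makes no assumption about how
   the label is laid out within them, and it applies the rule in this
   section that the bit is invalid with Tunnel Types 0x00 and 0x06.

```python
from typing import Optional, Tuple

COMPOSITE_TUNNEL_BIT = 0x80   # high-order bit of the Tunnel Type octet
NO_TUNNEL_INFO = 0x00         # 'no tunnel information present'
INGRESS_REPLICATION = 0x06    # 'Ingress Replication'


def parse_pmsi_tunnel(
    tunnel_type_octet: int, tunnel_identifier: bytes
) -> Tuple[bool, int, Optional[bytes], bytes]:
    """Return (composite, base_type, label_octets, transmit_tunnel_id)."""
    composite = bool(tunnel_type_octet & COMPOSITE_TUNNEL_BIT)
    base_type = tunnel_type_octet & 0x7F   # low-order seven bits
    if not composite:
        # No leading label: the identifier is used as defined today.
        return False, base_type, None, tunnel_identifier
    if base_type in (NO_TUNNEL_INFO, INGRESS_REPLICATION):
        # This section declares the combination invalid (malformed).
        raise ValueError("Composite Tunnel bit invalid for this type")
    if len(tunnel_identifier) < 3:
        raise ValueError("Tunnel Identifier too short for leading label")
    # First three octets carry the receive-side label; the remainder is
    # the identifier of the transmit tunnel.
    return True, base_type, tunnel_identifier[:3], tunnel_identifier[3:]
```

   A caller would treat a ValueError from this sketch as the malformed
   attribute case; how the enclosing Update is then handled is governed
   by [RFC6514] and is outside the scope of the example.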

   Using the Composite Tunnel bit for Tunnel Types 0x00 'no tunnel
   information present' and 0x06 'Ingress Replication' is invalid. A PE
   that receives a PMSI Tunnel attribute with such information considers
   it malformed and SHOULD treat this Update as though all the routes
   contained in this Update had been withdrawn, per section 5 of
   [RFC6514].

6 Acknowledgement

   We would like to thank Dennis Cai, Antoni Przygienda, and Jeffrey
   Zhang for their valuable comments. The authors would also like to
   thank Thomas Morin for shepherding this document and providing
   valuable comments.

7 Security Considerations

   Since this document uses the EVPN constructs of [RFC7432] and
   [RFC7623], the same security considerations in these documents are
   also applicable here. Furthermore, this document provides an
   additional security check by allowing sites (or ACs) of an EVPN
   instance to be designated as "Root" or "Leaf" and preventing any
   traffic exchange among "Leaf" sites of that VPN through ingress
   filtering for known unicast traffic and egress filtering for BUM
   traffic.

8 IANA Considerations

   IANA has allocated value 5 in the "EVPN Extended Community Sub-Types"
   registry defined in [RFC7153] as follows:

      SUB-TYPE VALUE     NAME                        Reference

      0x05               E-TREE Extended Community   This document

8.1 Considerations for PMSI Tunnel Types

   The "P-Multicast Service Interface Tunnel (PMSI Tunnel) Tunnel Types"
   registry in the "Border Gateway Protocol (BGP) Parameters" registry
   needs to be updated to reflect the use of the most significant bit as
   the "Composite Tunnel" bit (section 5.2).

   For this purpose, this document updates [RFC7385].

   The registry is to be updated by removing the entries for 0xFB-0xFE
   and 0xFF, and replacing them with:

      Value       Meaning                          Reference
      0x0B-0x7A   Unassigned
      0x7B-0x7E   Reserved for Experimental Use    this document
      0x7F        Reserved                         this document
      0x80-0xFF   Reserved for Composite Tunnels   this document

   The allocation policy for values 0x00 to 0x7A is IETF Review
   [RFC5226]. The range for experimental use is now 0x7B-0x7E, and
   values in this range are not to be assigned. The status of 0x7F may
   only be changed through Standards Action [RFC5226].

9 References

9.1 Normative References

   [KEYWORDS] Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC7432]  Sajassi et al., "BGP MPLS Based Ethernet VPN", February,
              2015.

   [RFC7623]  Sajassi et al., "Provider Backbone Bridging Combined with
              Ethernet VPN (PBB-EVPN)", September, 2015.

   [RFC7385]  Andersson et al., "IANA Registry for P-Multicast Service
              Interface (PMSI) Tunnel Type Code Points", October, 2014.

   [RFC7153]  Rosen et al., "IANA Registries for BGP Extended
              Communities", March, 2014.

   [RFC6514]  Aggarwal et al., "BGP Encodings and Procedures for
              Multicast in MPLS/BGP IP VPNs", February, 2012.

   [RFC4360]  Sangli et al., "BGP Extended Communities Attribute",
              February, 2006.

9.2 Informative References

   [RFC7387]  Key et al., "A Framework for E-Tree Service over MPLS
              Network", October 2014.

   [MEF6.1]   Metro Ethernet Forum, "Ethernet Services Definitions -
              Phase 2", MEF 6.1, April 2008.

   [RFC3032]  E. Rosen et al, "MPLS Label Stack Encoding", January 2001.

   [RFC7796]  Y. Jiang et al, "Ethernet-Tree (E-Tree) Support in Virtual
              Private LAN Service (VPLS)", March 2016.

   [EVPN-IRB] A.
              Sajassi et al, "Integrated Routing and Bridging in
              EVPN", draft-ietf-bess-evpn-inter-subnet-forwarding-03,
              February 8, 2017.

   [RFC5226]  T. Narten et al, "Guidelines for Writing an IANA
              Considerations Section in RFCs", May, 2008.

Appendix-A

   When two MAC-VRFs (two bridge tables per VLAN) are used for an
   E-TREE service (one for Root ACs and another for Leaf ACs) on a given
   PE, then the following complications in the data-plane path can
   result.

   Maintaining two MAC-VRFs (two bridge tables) per VLAN (when both Leaf
   and Root ACs exist for that VLAN) would either require two lookups
   to be performed per MAC address in each direction in case of a miss,
   or duplicating many MAC addresses between the two bridge tables
   belonging to the same VLAN (same E-TREE instance). Unless two lookups
   are made, duplication of MAC addresses would be needed for both
   locally learned and remotely learned MAC addresses. Locally learned
   MAC addresses from Leaf ACs need to be duplicated onto the Root
   bridge table, and locally learned MAC addresses from Root ACs need to
   be duplicated onto the Leaf bridge table. Remotely learned MAC
   addresses from Root ACs need to be copied onto both Root and Leaf
   bridge tables. Because of potential inefficiencies associated with
   data-plane implementation of an additional MAC lookup or duplication
   of MAC entries, this option is not believed to be implementable
   without data-plane performance inefficiencies in some platforms, and
   thus this document introduces the coloring as described in section
   2.2 and detailed in section 3.1.

Contributors

   In addition to the authors listed on the front page, the following
   co-authors have also contributed to this document:

   Wim Henderickx
   Nokia

   Aldrin Isaac
   Wen Lin
   Juniper

Authors' Addresses

   Ali Sajassi
   Cisco
   Email: sajassi@cisco.com

   Samer Salam
   Cisco
   Email: ssalam@cisco.com

   John Drake
   Juniper
   Email: jdrake@juniper.net

   Jim Uttaro
   AT&T
   Email: ju1738@att.com

   Sami Boutros
   VMware
   Email: sboutros@vmware.com

   Jorge Rabadan
   Nokia
   Email: jorge.rabadan@nokia.com