BESS Workgroup                                           A. Sajassi, Ed.
INTERNET-DRAFT                                                  S. Salam
Intended Status: Standards Track                                   Cisco
Updates: 7385                                                   J. Drake
                                                                 Juniper
                                                               J. Uttaro
                                                                     ATT
                                                              S. Boutros
                                                                  VMware
                                                              J. Rabadan
                                                                   Nokia

Expires: November 12, 2017                                  May 12, 2017


                   E-TREE Support in EVPN & PBB-EVPN
                     draft-ietf-bess-evpn-etree-11

Abstract

   The Metro Ethernet Forum (MEF) has defined a rooted-multipoint
   Ethernet service known as Ethernet Tree (E-Tree). A solution
   framework for supporting this service in MPLS networks is proposed in
   RFC7387 ("A Framework for Ethernet Tree (E-Tree) Service over a
   Multiprotocol Label Switching (MPLS) Network"). This document
   discusses how those functional requirements can be easily met with
   Ethernet VPN (EVPN) and how EVPN offers a more efficient
   implementation of these functions. This document makes use of the
   most significant bit of the PMSI Tunnel Type field governed by the
   IANA registry created by RFC7385, and hence updates RFC7385
   accordingly.

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/1id-abstracts.html

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

Copyright and License Notice

   Copyright (c) 2017 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.
   Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1 Introduction
     1.1 Terminology
   2 E-Tree Scenarios
     2.1 Scenario 1: Leaf OR Root site(s) per PE
     2.2 Scenario 2: Leaf OR Root site(s) per AC
     2.3 Scenario 3: Leaf OR Root site(s) per MAC
   3 Operation for EVPN
     3.1 Known Unicast Traffic
     3.2 BUM Traffic
       3.2.1 BUM traffic originated from a single-homed site on a
             leaf AC
       3.2.2 BUM traffic originated from a single-homed site on a
             root AC
       3.2.3 BUM traffic originated from a multi-homed site on a
             leaf AC
       3.2.4 BUM traffic originated from a multi-homed site on a
             root AC
     3.3 E-TREE Traffic Flows for EVPN
       3.3.1 E-Tree with MAC Learning
       3.3.2 E-Tree without MAC Learning
   4 Operation for PBB-EVPN
     4.1 Known Unicast Traffic
     4.2 BUM Traffic
     4.3 E-Tree without MAC Learning
   5 BGP Encoding
     5.1 E-TREE Extended Community
     5.2 PMSI Tunnel Attribute
   6 Acknowledgement
   7 Security Considerations
   8 IANA Considerations
     8.1 Considerations for PMSI Tunnel Types
   9 References
     9.1 Normative References
     9.2 Informative References
   Appendix-A
   Contributors
   Authors' Addresses

1 Introduction

   The Metro Ethernet Forum (MEF) has defined a rooted-multipoint
   Ethernet service known as Ethernet Tree (E-Tree) [MEF6.1]. In an
   E-Tree service, Attachment Circuits (ACs) are labeled as either Root
   or Leaf ACs. Root ACs can communicate with all other ACs. Leaf ACs
   can communicate with Root ACs but not with other Leaf ACs.

   [RFC7387] proposes the solution framework for supporting E-Tree
   service in MPLS networks. The document identifies the functional
   components of the overall solution to emulate E-Tree services in
   addition to Ethernet LAN (E-LAN) services on an existing MPLS
   network.

   [RFC7432] is a solution for multipoint L2VPN services, with advanced
   multi-homing capabilities, using BGP for distributing customer/client
   MAC address reachability information over the MPLS/IP network.
   [RFC7623] combines the functionality of EVPN with [802.1ah] Provider
   Backbone Bridging (PBB) for MAC address scalability.

   This document discusses how the functional requirements for E-Tree
   service can be met with (PBB-)EVPN and how (PBB-)EVPN offers a more
   efficient implementation of these functions. This document makes use
   of the most significant bit of the PMSI Tunnel Type field governed by
   the IANA registry created by RFC7385, and hence updates RFC7385
   accordingly. Section 2 discusses E-TREE scenarios. Sections 3 and 4
   describe E-TREE solutions for EVPN and PBB-EVPN respectively, and
   section 5 covers BGP encoding for E-TREE solutions.

1.1 Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [KEYWORDS].

2 E-Tree Scenarios

   This document categorizes E-Tree scenarios into the following three
   scenarios, depending on the nature of the Root/Leaf site association:

   - Leaf OR Root site(s) per PE

   - Leaf OR Root site(s) per Attachment Circuit (AC)

   - Leaf OR Root site(s) per MAC

2.1 Scenario 1: Leaf OR Root site(s) per PE

   In this scenario, a PE may receive traffic from either Root ACs OR
   Leaf ACs for a given MAC-VRF/bridge table, but not both concurrently.
   In other words, a given EVI on a PE is either associated with root(s)
   or leaf(s). The PE may have both Root and Leaf ACs, albeit for
   different EVIs.

                   +---------+            +---------+
                   |   PE1   |            |   PE2   |
    +---+          |  +---+  |  +------+  |  +---+  |            +---+
    |CE1+---AC1----+--+   |  |  | MPLS |  |  |   +--+----AC2-----+CE2|
    +---+  (Root)  |  |MAC|  |  |  /IP |  |  |MAC|  |   (Leaf)   +---+
                   |  |VRF|  |  |      |  |  |VRF|  |
                   |  |   |  |  |      |  |  |   |  |            +---+
                   |  |   |  |  |      |  |  |   +--+----AC3-----+CE3|
                   |  +---+  |  +------+  |  +---+  |   (Leaf)   +---+
                   +---------+            +---------+

                         Figure 1: Scenario 1

   In such a scenario, tailored BGP Route Target (RT) import/export
   policies among the PEs belonging to the same EVI can be used to
   restrict the communications among Leaf PEs. To restrict the
   communications among Leaf ACs connected to the same PE and belonging
   to the same EVI, split-horizon filtering is used to block traffic
   from one Leaf AC to another Leaf AC on a MAC-VRF for a given E-TREE
   EVI. The purpose of this topology constraint is to avoid having PEs
   with only Leaf sites importing and processing BGP MAC routes from
   each other. To support such a topology constraint in EVPN, two BGP
   Route-Targets (RTs) are used for every EVPN Instance (EVI): one RT is
   associated with the Root sites (Root ACs) and the other is associated
   with the Leaf sites (Leaf ACs). On a per EVI basis, every PE exports
   the single RT associated with its type of site(s). Furthermore, a PE
   with Root site(s) imports both Root and Leaf RTs, whereas a PE with
   Leaf site(s) only imports the Root RT.

2.2 Scenario 2: Leaf OR Root site(s) per AC

   In this scenario, a PE can receive traffic from both Root ACs and
   Leaf ACs for a given EVI. In other words, a given EVI on a PE can be
   associated with both root(s) and leaf(s).

                     +---------+            +---------+
                     |   PE1   |            |   PE2   |
    +---+            |  +---+  |  +------+  |  +---+  |        +---+
    |CE1+-----AC1----+--+   |  |  |      |  |  |   +--+---AC2--+CE2|
    +---+   (Leaf)   |  |MAC|  |  | MPLS |  |  |MAC|  | (Leaf) +---+
                     |  |VRF|  |  |  /IP |  |  |VRF|  |
                     |  |   |  |  |      |  |  |   |  |        +---+
                     |  |   |  |  |      |  |  |   +--+---AC3--+CE3|
                     |  +---+  |  +------+  |  +---+  | (Root) +---+
                     +---------+            +---------+

                         Figure 2: Scenario 2

   In this scenario, just like the previous scenario (in section 2.1),
   two Route Targets (one for Root and another for Leaf) can be used.
   However, the difference is that on a PE with both Root and Leaf ACs,
   all remote MAC routes are imported and thus there needs to be a way
   to differentiate remote MAC routes associated with Leaf ACs versus
   the ones associated with Root ACs in order to apply the proper
   ingress filtering.

   In order to recognize the association of a destination MAC address to
   a Leaf or Root AC and thus support ingress filtering on the ingress
   PE with both Leaf and Root ACs, MAC addresses need to be colored with
   a Root or Leaf indication before advertisement to other PEs. There
   are two approaches for such coloring:

   A) To always use two RTs (one to designate Leaf RT and another for
   Root RT)

   B) To allow for a single RT to be used per EVI just like [RFC7432]
   and thus color MAC addresses via a "color" flag in a new extended
   community as detailed in section 3.1.

   Approach (A) would require the same data plane enhancements as
   approach (B) if the MAC-VRF and bridge tables used per VLAN are to
   remain consistent with [RFC7432] (section 6). In order to avoid
   data-plane enhancements for approach (A), multiple bridge tables per
   VLAN may be considered; however, this has major drawbacks as
   described in appendix-A and thus is not recommended.

   Given that both approaches (A) and (B) would require the exact same
   data-plane enhancements, approach (B) is chosen here in order to
   allow for RT usage consistent with baseline EVPN [RFC7432] and for
   better generality. It should be noted that if one wants to use RT
   constraint in order to avoid MAC advertisements associated with a
   Leaf AC to PEs with only Leaf ACs, then two RTs (one for Root and
   another for Leaf) can still be used with approach (B); however, in
   such applications Leaf/Root RTs will be used to constrain MAC
   advertisements and they are not used to color the MAC routes for
   ingress filtering - i.e., in approach (B), the coloring is always
   done via the new extended community.

   For this scenario, if for a given EVI a significant number of PEs
   have both Leaf and Root sites attached, even though they may start as
   Root-only or Leaf-only PEs, then a single RT per EVI should be used.
   The reason for such a recommendation is to alleviate the
   configuration overhead associated with using two RTs per EVI at the
   expense of having some unwanted MAC addresses on the Leaf-only PEs.

2.3 Scenario 3: Leaf OR Root site(s) per MAC

   In this scenario, a PE may receive traffic from both Root AND Leaf
   sites on a single Attachment Circuit (AC) of an EVI. This scenario is
   not covered in either [RFC7387] or [MEF6.1]; however, it is covered
   in this document for the sake of completeness. In this scenario,
   since an AC carries traffic from both Root and Leaf sites, the
   granularity at which Root or Leaf sites are identified is on a per
   MAC address basis.
   This scenario is considered in this document for EVPN service with
   only known unicast traffic because the Designated Forwarder (DF)
   filtering per [RFC7432] would not be compatible with the required
   egress filtering - i.e., Broadcast, Unknown unicast, and Multicast
   (BUM) traffic is not supported in this scenario and it is dropped by
   the ingress PE.

                     +---------+            +---------+
                     |   PE1   |            |   PE2   |
    +---+            |  +---+  |  +------+  |  +---+  |            +---+
    |CE1+-----AC1----+--+   |  |  |      |  |  |   +--+-----AC2----+CE2|
    +---+   (Root)   |  | E |  |  | MPLS |  |  | E |  | (Leaf/Root)+---+
                     |  | V |  |  |  /IP |  |  | V |  |
                     |  | I |  |  |      |  |  | I |  |            +---+
                     |  |   |  |  |      |  |  |   +--+-----AC3----+CE3|
                     |  +---+  |  +------+  |  +---+  |   (Leaf)   +---+
                     +---------+            +---------+

                         Figure 3: Scenario 3

3 Operation for EVPN

   [RFC7432] defines the notion of Ethernet Segment Identifier (ESI)
   MPLS label used for split-horizon filtering of BUM traffic at the
   egress PE. Such egress filtering capabilities can be leveraged in the
   provision of E-TREE services as seen shortly. In other words,
   [RFC7432] has inherent capability to support E-TREE services without
   defining any new BGP routes but by just defining a new BGP Extended
   Community for leaf indication as shown later in this document
   (section 5.1).

3.1 Known Unicast Traffic

   In EVPN, MAC learning is performed in the control plane via
   advertisement of BGP routes. The filtering needed by the E-TREE
   service for known unicast traffic can therefore be performed at the
   ingress PE. This provides very efficient filtering and avoids sending
   known unicast traffic over the MPLS/IP core only to have it filtered
   at the egress PE, as is done in traditional E-TREE solutions (e.g.,
   E-TREE for VPLS [RFC7796]).

   To provide such ingress filtering for known unicast traffic, a PE
   MUST indicate to other PEs what kind of sites (root or leaf) its MAC
   addresses are associated with by advertising a leaf indication flag
   (via an Extended Community) along with each of its MAC/IP
   Advertisement routes. The lack of such a flag indicates that the MAC
   address is associated with a root site. This scheme applies to all
   scenarios described in section 2.

   Tagging MAC addresses with a leaf indication enables remote PEs to
   perform ingress filtering for known unicast traffic - i.e., on the
   ingress PE, the MAC destination address lookup yields, in addition to
   the forwarding adjacency, a flag which indicates whether the target
   MAC is associated with a Leaf site or not. The ingress PE
   cross-checks this flag with the status of the originating AC, and if
   both are Leafs, then the packet is not forwarded.

   In situations where MAC moves are allowed among Leaf and Root sites
   (e.g., non-static MAC), PEs can receive multiple MAC/IP Advertisement
   routes for the same MAC address with different Leaf/Root indications
   (and possibly different ESIs for multi-homing scenarios). In such
   situations, MAC mobility procedures (section 15 of [RFC7432]) take
   precedence to first identify the location of the MAC before
   associating that MAC with a Root or a Leaf site.

   To support the above ingress filtering functionality, a new E-TREE
   Extended Community with a Leaf indication flag is introduced (section
   5.1). This new Extended Community MUST be advertised with each MAC/IP
   Advertisement route for a MAC address associated with a Leaf site.
   Besides the MAC/IP Advertisement route, no other EVPN routes are
   required to carry this new extended community for known unicast
   filtering.
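
   As an illustration only, the ingress check described in this section
   can be sketched in Python. The MacEntry type, the ingress_forward
   function, and the sample MAC addresses are hypothetical names chosen
   for this sketch and are not defined by this document. The sketch
   assumes that the Leaf-Indication flag has already been extracted from
   the received MAC/IP Advertisement route and stored with the MAC.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class MacEntry:
    """Forwarding state for one MAC address (hypothetical model)."""
    next_hop: str   # forwarding adjacency, e.g. the remote PE
    is_leaf: bool   # True if the route carried the Leaf-Indication flag


def ingress_forward(mac_table: Dict[str, MacEntry],
                    dst_mac: str,
                    source_ac_is_leaf: bool) -> Optional[str]:
    """Return the next hop for a known unicast frame, or None if dropped."""
    entry = mac_table.get(dst_mac)
    if entry is None:
        # Unknown destination: this is BUM traffic, not handled here.
        return None
    if source_ac_is_leaf and entry.is_leaf:
        # Leaf-to-Leaf traffic is filtered at the ingress PE.
        return None
    return entry.next_hop


# Sample table: one MAC behind a Leaf site, one behind a Root site.
mac_table = {
    "00:00:5e:00:53:01": MacEntry(next_hop="PE2", is_leaf=True),
    "00:00:5e:00:53:02": MacEntry(next_hop="PE2", is_leaf=False),
}
```

   In this sketch a missing flag maps to is_leaf=False, which mirrors
   the rule that the lack of the flag indicates a root site.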

3.2 BUM Traffic

   This specification does not support filtering of BUM (Broadcast,
   Unknown unicast, and Multicast) traffic on the ingress PE: unlike the
   known unicast case described above, ingress filtering is not possible
   because of the multi-destination nature of BUM traffic. As such, the
   solution relies on egress filtering. In order to apply the proper
   egress filtering, which varies based on whether a packet is sent from
   a Leaf AC or a Root AC, the MPLS-encapsulated frames MUST be tagged
   with an indication that they originated from a Leaf AC - i.e., they
   are tagged with a Leaf label as specified in section 5.1.

   The Leaf label can be upstream assigned for P2MP LSPs or downstream
   assigned for ingress replication tunnels. The main difference between
   a downstream and an upstream assigned Leaf label is that, in the
   downstream assigned case, not all egress PE devices need to receive
   the label, just like the ESI label for ingress replication procedures
   defined in [RFC7432].

   On the ingress PE, the PE needs to place all its Leaf ACs for a given
   bridge domain in a single split-horizon group in order to prevent
   intra-PE forwarding among its Leaf ACs. This intra-PE split-horizon
   filtering applies to BUM traffic as well as known-unicast traffic.

   There are four scenarios to consider as follows. In all these
   scenarios, the ingress PE imposes the right MPLS label associated
   with the originating Ethernet Segment (ES) depending on whether the
   Ethernet frame originated from a Root or a Leaf site on that Ethernet
   Segment (ESI label or Leaf label). The mechanism by which the PE
   identifies whether a given frame originated from a Root or a Leaf
   site on the segment is based on the AC identifier for that segment
   (e.g., Ethernet Tag of the frame for 802.1Q frames).
   Other mechanisms for identifying root or leaf (e.g., on a per MAC
   address basis) are beyond the scope of this document.

3.2.1 BUM traffic originated from a single-homed site on a leaf AC

   In this scenario, the ingress PE adds a Leaf label advertised using
   the E-TREE Extended Community (section 5.1) indicating a Leaf site.
   This Leaf label, used for single-homing scenarios, is not on a per ES
   basis but rather on a per PE basis - i.e., a single Leaf MPLS label
   is used for all single-homed ES's on that PE. This Leaf label is
   advertised to other PE devices, using the E-TREE Extended Community
   (section 5.1) along with an Ethernet A-D per ES route with ESI of
   zero and a set of Route Targets (RTs) corresponding to all EVIs on
   the PE with at least one leaf site per EVI. A set of Ethernet A-D per
   ES routes may be needed if the number of Route Targets (RTs) that
   need to be sent exceeds the limit on a single route per [RFC7432].
   The ESI for the Ethernet A-D per ES route is set to zero to indicate
   single-homed sites.

   When a PE receives this special Leaf label in the data path, it
   blocks the packet if the destination AC is of type Leaf; otherwise,
   it forwards the packet.

3.2.2 BUM traffic originated from a single-homed site on a root AC

   In this scenario, the ingress PE does not add any ESI label or Leaf
   label and it operates per [RFC7432] procedures.

3.2.3 BUM traffic originated from a multi-homed site on a leaf AC

   In this scenario, it is assumed that while different ACs (VLANs) on
   the same ES could have different root/leaf designation (some being
   roots and some being leafs), the same VLAN does have the same
   root/leaf designation on all PEs on the same ES. Furthermore, it is
   assumed that there is no forwarding among subnets - i.e., the service
   is EVPN L2 and not EVPN IRB [EVPN-IRB]. IRB use cases described in
   [EVPN-IRB] are outside the scope of this document.

   In such scenarios, if a multicast or broadcast packet is originated
   from a leaf AC, then it only needs to carry the Leaf label described
   in section 3.2.1. This label is sufficient to provide the egress
   filtering needed to keep BUM traffic from being sent to leaf ACs,
   including the leaf AC on the same Ethernet Segment.

3.2.4 BUM traffic originated from a multi-homed site on a root AC

   In this scenario, both the ingress and egress PE devices follow the
   procedure defined in [RFC7432] for adding and/or processing an ESI
   MPLS label.

3.3 E-TREE Traffic Flows for EVPN

   Per [RFC7387], a generic E-Tree service supports all of the following
   traffic flows:

   - Ethernet known unicast from Root to Roots & Leafs
   - Ethernet known unicast from Leaf to Root
   - Ethernet BUM traffic from Root to Roots & Leafs
   - Ethernet BUM traffic from Leaf to Roots

   A particular E-Tree service may need to support all of the above
   types of flows or only a select subset, depending on the target
   application. In the case where unicast flows need not be supported,
   the L2VPN PEs can avoid performing any MAC learning function.

   The following subsections describe the operation of EVPN to support
   E-Tree service with and without MAC learning.

3.3.1 E-Tree with MAC Learning

   The PEs implementing an E-Tree service must perform MAC learning when
   unicast traffic flows must be supported among Root and Leaf sites. In
   this case, the PE(s) with Root sites perform MAC learning in the
   data-path over the Ethernet Segments, and advertise reachability in
   EVPN MAC/IP Advertisement Routes. These routes will be imported by
   all PEs for that EVI (i.e., PEs that have Leaf sites as well as PEs
   that have Root sites). Similarly, the PEs with Leaf sites perform MAC
   learning in the data-path over their Ethernet Segments, and advertise
   reachability in EVPN MAC/IP Advertisement Routes.
   For the scenario described in section 2.1 (or possibly section 2.2),
   these routes are imported only by PEs with at least one Root site in
   the EVI - i.e., a PE with only Leaf sites will not import these
   routes. PEs with Root and/or Leaf sites may use the Ethernet A-D
   routes for aliasing (in the case of multi-homed segments) and for
   mass MAC withdrawal per [RFC7432].

   To support multicast/broadcast from Root to Leaf sites, either a P2MP
   tree rooted at the PE(s) with the Root site(s) or ingress replication
   can be used (section 16 of [RFC7432]). The multicast tunnels are set
   up through the exchange of the EVPN Inclusive Multicast route, as
   defined in [RFC7432].

   To support multicast/broadcast from Leaf to Root sites, ingress
   replication should be sufficient for most scenarios where there are
   only a few Roots (typically two). Therefore, in a typical scenario, a
   root PE needs to support both a P2MP tunnel in transmit direction
   from itself to leaf PEs and at the same time it needs to support
   ingress-replication tunnels in receive direction from leaf PEs to
   itself. In order to signal this efficiently from the root PE, a new
   composite tunnel type is defined per section 5.2. This new composite
   tunnel type is advertised by the root PE to simultaneously indicate a
   P2MP tunnel in transmit direction and an ingress-replication tunnel
   in the receive direction for the BUM traffic.

   If the number of Roots is large, P2MP tunnels originated at the PEs
   with Leaf sites may be used and thus there will be no need to use the
   modified PMSI tunnel attribute in section 5.2 for composite tunnel
   type.

3.3.2 E-Tree without MAC Learning

   The PEs implementing an E-Tree service need not perform MAC learning
   when the traffic flows between Root and Leaf sites are mainly
   multicast or broadcast. In this case, the PEs do not exchange EVPN
   MAC/IP Advertisement Routes.
   Instead, the Inclusive Multicast Ethernet Tag route is used to
   support BUM traffic.

   The fields of this route are populated per the procedures defined in
   [RFC7432], and the multicast tunnel setup criteria are as described
   in the previous section.

   Just as in the previous section, if there are only a few PEs with
   root sites and thus ingress replication is desired from leaf PEs to
   these root PEs, then the modified PMSI attribute as defined in
   section 5.2 should be used.

4 Operation for PBB-EVPN

   In PBB-EVPN, the PE advertises a Root/Leaf indication along with each
   B-MAC Advertisement route, to indicate whether the associated B-MAC
   address corresponds to a Root or a Leaf site. Just like the EVPN
   case, the new E-TREE Extended Community defined in section 5.1 is
   advertised with each MAC Advertisement route.

   In the case where a multi-homed Ethernet Segment has both Root and
   Leaf sites attached, two B-MAC addresses are advertised: one B-MAC
   address is per ES as specified in [RFC7623] and implicitly denotes
   Root, and the other B-MAC address is per PE and explicitly denotes
   Leaf. The former B-MAC address is not advertised with the E-TREE
   extended community, but the latter B-MAC denoting Leaf is advertised
   with the new E-TREE extended community where the "Leaf-indication"
   flag is set. In such multi-homing scenarios where an Ethernet Segment
   has both Root and Leaf ACs, it is assumed that while different ACs
   (VLANs) on the same ES could have different root/leaf designation
   (some being roots and some being leafs), the same VLAN does have the
   same root/leaf designation on all PEs on the same ES. Furthermore, it
   is assumed that there is no forwarding among subnets - i.e., the
   service is L2 and not IRB. The IRB use case is outside the scope of
   this document.

   The ingress PE uses the right B-MAC source address depending on
   whether the Ethernet frame originated from the Root or Leaf AC on
   that Ethernet Segment. The mechanism by which the PE identifies
   whether a given frame originated from a Root or Leaf site on the
   segment is based on the Ethernet Tag associated with the frame. Other
   mechanisms of identification, beyond the Ethernet Tag, are outside
   the scope of this document.

   Furthermore, a PE advertises two special global B-MAC addresses: one
   for Root and another for Leaf, and tags the Leaf one as such in the
   MAC Advertisement route. These B-MAC addresses are used as source
   addresses for traffic originating from single-homed segments. The
   B-MAC address used for indicating Leaf sites can be the same for both
   single-homed and multi-homed segments.

4.1 Known Unicast Traffic

   For known unicast traffic, the PEs perform ingress filtering: on the
   ingress PE, the C-MAC destination address lookup yields, in addition
   to the target B-MAC address and forwarding adjacency, a flag which
   indicates whether the target B-MAC is associated with a Root or a
   Leaf site. The ingress PE cross-checks this flag with the status of
   the originating site, and if both are Leafs, then the packet is not
   forwarded.

4.2 BUM Traffic

   For BUM traffic, the PEs must perform egress filtering. When a PE
   receives a MAC Advertisement route (which will be used as a source
   B-MAC for BUM traffic), it updates its egress filtering (based on the
   source B-MAC address), as follows:

   - If the MAC Advertisement route indicates that the advertised B-MAC
   is a Leaf, and the local Ethernet Segment is a Leaf as well, then the
   source B-MAC address is added to its B-MAC list used for egress
   filtering - i.e., to block traffic from that B-MAC address.

   - Otherwise, the B-MAC filtering list is not updated.
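
   As an illustration only, the two rules above can be sketched in
   Python. The EgressBmacFilter class, its method names, and the sample
   B-MAC addresses are hypothetical and are not defined by this
   document. The sketch keeps one filtering list per local Ethernet
   Segment, which is an assumption made for simplicity.

```python
from typing import Set


class EgressBmacFilter:
    """Egress filtering list for one local Ethernet Segment (sketch)."""

    def __init__(self, local_segment_is_leaf: bool) -> None:
        self.local_segment_is_leaf = local_segment_is_leaf
        self.blocked: Set[str] = set()

    def on_mac_advertisement(self, bmac: str, leaf_indication: bool) -> None:
        """Update the list when a B-MAC Advertisement route is received."""
        if leaf_indication and self.local_segment_is_leaf:
            # Leaf B-MAC and Leaf local segment: block this source B-MAC.
            self.blocked.add(bmac)
        # Otherwise the filtering list is not updated.

    def permits(self, source_bmac: str) -> bool:
        """Return True if a BUM frame from source_bmac may be forwarded."""
        return source_bmac not in self.blocked


# A Leaf segment blocks frames sourced from a Leaf B-MAC only.
leaf_filter = EgressBmacFilter(local_segment_is_leaf=True)
leaf_filter.on_mac_advertisement("00:00:5e:00:53:0a", leaf_indication=True)
leaf_filter.on_mac_advertisement("00:00:5e:00:53:0b", leaf_indication=False)
```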

   When the egress PE receives the packet, it examines the B-MAC source
   address to check whether it should filter or forward the frame. Note
   that this uses the same filtering logic as baseline [RFC7623] and
   does not require any additional flags in the data-plane.

   Just as in section 3.2, the PE places all Leaf Ethernet Segments of a
   given bridge domain in a single split-horizon group in order to
   prevent intra-PE forwarding among Leaf segments. This split-horizon
   function applies to BUM traffic as well as known-unicast traffic.

4.3 E-Tree without MAC Learning

   In scenarios where the traffic of interest is only multicast and/or
   broadcast, the PEs implementing an E-Tree service do not need to do
   any MAC learning. In such scenarios the filtering must be performed
   on egress PEs. For PBB-EVPN, the handling of such traffic is per
   section 4.2, without the C-MAC learning part at both ingress and
   egress PEs.

5 BGP Encoding

   This document defines a new BGP Extended Community for EVPN.

5.1 E-TREE Extended Community

   This Extended Community is a new transitive Extended Community
   [RFC4360] having a Type field value of 0x06 (EVPN) and the Sub-Type
   0x05. It is used for leaf indication of known unicast and BUM
   traffic.

   The E-TREE Extended Community is encoded as an 8-octet value as
   follows:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Type=0x06     | Sub-Type=0x05 | Flags(1 Octet)|  Reserved=0   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Reserved=0   |           Leaf Label                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                  Figure 4: E-TREE Extended Community

   The low-order bit of the Flags octet is defined as the
   "Leaf-Indication" bit. A value of one indicates a Leaf AC/Site. The
   rest of the flag bits should be set to zero.
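
   As an illustration only, the 8-octet encoding can be sketched in
   Python. The function and constant names are hypothetical. The sketch
   assumes two reserved octets followed by a 3-octet Leaf Label field,
   which is consistent with the 8-octet total and with the label value
   occupying the high-order 20 bits of the Leaf Label field as stated in
   this section; the remaining low-order 4 bits are written as zero.

```python
import struct
from typing import Tuple

EVPN_TYPE = 0x06        # Type field value (EVPN)
ETREE_SUB_TYPE = 0x05   # Sub-Type of the E-TREE Extended Community
LEAF_INDICATION = 0x01  # low-order bit of the Flags octet


def encode_etree_ec(leaf_indication: bool, leaf_label: int = 0) -> bytes:
    """Build the 8-octet E-TREE Extended Community (sketch)."""
    if not 0 <= leaf_label < (1 << 20):
        raise ValueError("MPLS label must fit in 20 bits")
    flags = LEAF_INDICATION if leaf_indication else 0
    # Place the label in the high-order 20 bits of the 24-bit field.
    label_field = (leaf_label << 4).to_bytes(3, "big")
    return struct.pack("!BBBBB3s", EVPN_TYPE, ETREE_SUB_TYPE,
                       flags, 0, 0, label_field)


def decode_etree_ec(data: bytes) -> Tuple[bool, int]:
    """Return (leaf_indication, leaf_label) from an 8-octet value."""
    if len(data) != 8:
        raise ValueError("Extended Community must be 8 octets")
    ec_type, sub_type, flags, _rsv1, _rsv2, label_field = struct.unpack(
        "!BBBBB3s", data)
    if (ec_type, sub_type) != (EVPN_TYPE, ETREE_SUB_TYPE):
        raise ValueError("not an E-TREE Extended Community")
    # Reserved octets and undefined flag bits are ignored on receipt.
    leaf_label = int.from_bytes(label_field, "big") >> 4
    return bool(flags & LEAF_INDICATION), leaf_label
```

   For example, encode_etree_ec(True) gives the form sent with a MAC/IP
   Advertisement route (flag set, label zero), and
   encode_etree_ec(False, 100) gives a form that could accompany an
   Ethernet A-D per ES route, where 100 is an arbitrary sample label.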

   When this Extended Community (EC) is advertised along with a MAC/IP
   Advertisement route (for known unicast traffic) per section 3.1, the
   Leaf-Indication flag MUST be set to one and the Leaf Label SHOULD be
   set to zero. The label value is encoded in the high-order 20 bits of
   the Leaf Label field. The receiving PE SHOULD ignore the Leaf Label
   and only process the Leaf-Indication flag. A value of zero for the
   Leaf-Indication flag is invalid when sent along with a MAC/IP
   Advertisement route and an error should be logged.

   When this EC is advertised along with an Ethernet A-D per ES route
   (with ESI of zero) for BUM traffic to enable egress filtering on
   disposition PEs per sections 3.2.1 and 3.2.3, the Leaf Label MUST be
   set to a valid MPLS label (i.e., non-reserved assigned MPLS label
   [RFC3032]) and the Leaf-Indication flag SHOULD be set to zero. The
   receiving PE SHOULD ignore the Leaf-Indication flag. A non-valid MPLS
   label, when sent along with the Ethernet A-D per ES route, should be
   ignored and logged as an error.

   The reserved bits should be set to zero by the transmitter and should
   be ignored by the receiver.

5.2 PMSI Tunnel Attribute

   [RFC6514] defines the PMSI Tunnel attribute, which is an optional
   transitive attribute with the following format:

      +---------------------------------+
      |  Flags (1 octet)                |
      +---------------------------------+
      |  Tunnel Type (1 octet)          |
      +---------------------------------+
      |  MPLS Label (3 octets)          |
      +---------------------------------+
      |  Tunnel Identifier (variable)   |
      +---------------------------------+

                   Figure 5: PMSI Tunnel Attribute

   This document defines a new Composite tunnel type by introducing a
   new 'Composite Tunnel' bit in the Tunnel Type field and adding an
   MPLS label to the Tunnel Identifier field of the PMSI Tunnel
   attribute as detailed below. This document uses all other remaining
   fields per existing definition.
       The Composite tunnel type is advertised by the root
646    PE to simultaneously indicate a P2MP tunnel in the transmit direction
647    and an ingress-replication tunnel in the receive direction for BUM
648    traffic.

650    When a receiver ingress-replication label is needed, the high-order
651    bit of the Tunnel Type field (Composite Tunnel bit) is set while the
652    remaining low-order seven bits indicate the tunnel type as before.
653    When this Composite Tunnel bit is set, the Tunnel Identifier field
654    begins with a three-octet label, followed by the actual tunnel
655    identifier for the transmit tunnel. PEs that don't understand the
656    new meaning of the high-order bit treat the tunnel type as an
657    undefined tunnel type and treat the PMSI Tunnel attribute as a
658    malformed attribute [RFC6514]. For the PEs that do understand the new
659    meaning of the high-order bit, if ingress replication is desired when
660    sending BUM traffic, the PE will use the label in the Tunnel
661    Identifier field when sending its BUM traffic.

663    Using the Composite Tunnel bit for Tunnel Types 0x00 'no tunnel
664    information present' and 0x06 'Ingress Replication' is invalid. A
665    PE that receives a PMSI Tunnel attribute with such information
666    considers it malformed and SHOULD treat this Update as though
667    all the routes contained in this Update had been withdrawn per
668    section 5 of [RFC6514].

670 6 Acknowledgement

672    We would like to thank Dennis Cai, Antoni Przygienda, and Jeffrey
673    Zhang for their valuable comments. The authors would also like to
674    thank Thomas Morin for shepherding this document and providing
675    valuable comments.

677 7 Security Considerations

679    Since this document uses the EVPN constructs of [RFC7432] and
680    [RFC7623], the security considerations in those documents
681    also apply here.
       Furthermore, this document provides an additional
682    security check by allowing sites (or ACs) of an EVPN instance to be
683    designated as "Root" or "Leaf" and preventing any traffic exchange
684    among "Leaf" sites of that VPN through ingress filtering for known
685    unicast traffic and egress filtering for BUM traffic.

687 8 IANA Considerations

689    IANA has allocated value 5 in the "EVPN Extended Community Sub-Types"
690    registry defined in [RFC7153] as follows:

692       SUB-TYPE VALUE     NAME                        Reference

694       0x05               E-TREE Extended Community   This document

696 8.1 Considerations for PMSI Tunnel Types

698    The "P-Multicast Service Interface Tunnel (PMSI Tunnel) Tunnel Types"
699    registry in the "Border Gateway Protocol (BGP) Parameters" registry
700    needs to be updated to reflect the use of the most significant bit as
701    the "Composite Tunnel" bit (section 5.2).

703    For this purpose, this document updates [RFC7385].

705    The registry is to be updated by removing the entries for 0xFB-0xFE
706    and 0xFF and replacing them with:

708       Value       Meaning                          Reference
709       0x0B-0x7A   Unassigned
710       0x7B-0x7E   Reserved for Experimental Use    this document
711       0x7F        Reserved                         this document
712       0x80-0xFF   Reserved for Composite Tunnels   this document

714    The allocation policy for values 0x00 to 0x7A is IETF Review
715    [RFC5226]. The range for experimental use is now 0x7B-0x7E, and values
716    in this range are not to be assigned. The status of 0x7F may only be
717    changed through Standards Action [RFC5226].

719 9 References

721 9.1 Normative References

723    [KEYWORDS] Bradner, S., "Key words for use in RFCs to Indicate
724    Requirement Levels", BCP 14, RFC 2119, March 1997.

726    [RFC5226] T. Narten et al., "Guidelines for Writing an IANA
727    Considerations Section in RFCs", May 2008.

729    [RFC7387] Key et al., "A Framework for E-Tree Service over MPLS
730    Network", October 2014.

732    [MEF6.1] Metro Ethernet Forum, "Ethernet Services Definitions - Phase
733    2", MEF 6.1, April 2008.
735    [RFC7432] Sajassi et al., "BGP MPLS Based Ethernet VPN", February
736    2015.

738    [RFC7623] Sajassi et al., "Provider Backbone Bridging Combined with
739    Ethernet VPN (PBB-EVPN)", September 2015.

741    [RFC7385] Andersson et al., "IANA Registry for P-Multicast Service
742    Interface (PMSI) Tunnel Type Code Points", October 2014.

744    [RFC7153] Rosen et al., "IANA Registries for BGP Extended
745    Communities", March 2014.

747    [RFC6514] Aggarwal et al., "BGP Encodings and Procedures for
748    Multicast in MPLS/BGP IP VPNs", February 2012.

750    [RFC4360] Sangli et al., "BGP Extended Communities Attribute",
751    February 2006.

753 9.2 Informative References

758    [RFC3032] E. Rosen et al., "MPLS Label Stack Encoding", January 2001.

760    [RFC7796] Y. Jiang et al., "Ethernet-Tree (E-Tree) Support in Virtual
761    Private LAN Service (VPLS)", March 2016.

763    [EVPN-IRB] A. Sajassi et al., "Integrated Routing and Bridging in
764    EVPN", draft-ietf-bess-evpn-inter-subnet-forwarding-03, February 8,
765    2017.

767 Appendix-A

769    When two MAC-VRFs (two bridge tables per VLAN) are used for an
770    E-TREE service (one for Root ACs and another for Leaf ACs) on a given
771    PE, the following complications in the data-plane path can result.

773    Maintaining two MAC-VRFs (two bridge tables) per VLAN (when both Leaf
774    and Root ACs exist for that VLAN) would require either that two
775    lookups be performed per MAC address in each direction in case of a
776    miss, or that many MAC addresses be duplicated between the two bridge
777    tables belonging to the same VLAN (same E-TREE instance). Unless two
778    lookups are made, duplication of MAC addresses would be needed for both
779    locally learned and remotely learned MAC addresses. Locally learned
780    MAC addresses from Leaf ACs need to be duplicated onto the Root bridge
781    table, and locally learned MAC addresses from Root ACs need to be
782    duplicated onto the Leaf bridge table.
       Remotely learned MAC addresses
783    from Root ACs need to be copied onto both the Root and Leaf bridge
784    tables. Because either the additional MAC lookup or the duplication
785    of MAC entries is potentially inefficient in the data-plane,
786    this option is not believed to be implementable on some platforms
787    without a performance penalty. This document therefore
788    introduces the coloring described in section 2.2 and
789    detailed in section 3.1.

791 Contributors

793    In addition to the authors listed on the front page, the following
794    co-authors have also contributed to this document:

796    Wim Henderickx
797    Nokia

799    Aldrin Isaac
800    Wen Lin
801    Juniper

803 Authors' Addresses

805    Ali Sajassi
806    Cisco
807    Email: sajassi@cisco.com

809    Samer Salam
810    Cisco
811    Email: ssalam@cisco.com

813    John Drake
814    Juniper
815    Email: jdrake@juniper.net

817    Jim Uttaro
818    AT&T
819    Email: ju1738@att.com

821    Sami Boutros
822    VMware
823    Email: sboutros@vmware.com

825    Jorge Rabadan
826    Nokia
827    Email: jorge.rabadan@nokia.com