BESS Workgroup                                           A. Sajassi, Ed.
INTERNET-DRAFT                                                  S. Salam
Intended Status: Standards Track                                   Cisco
Updates: 7385                                                   J. Drake
                                                                 Juniper
                                                               J. Uttaro
                                                                     ATT
                                                              S. Boutros
                                                                  VMware
                                                              J. Rabadan
                                                                   Nokia

Expires: April 28, 2018                                 October 28, 2017

                   E-TREE Support in EVPN & PBB-EVPN
                     draft-ietf-bess-evpn-etree-14

Abstract

   The Metro Ethernet Forum (MEF) has defined a rooted-multipoint
   Ethernet service known as Ethernet Tree (E-Tree).
   A solution framework for supporting this service in MPLS networks is
   described in RFC7387 ("A Framework for Ethernet-Tree (E-Tree) Service
   over a Multiprotocol Label Switching (MPLS) Network"). This document
   discusses how those functional requirements can be met with a
   solution based on RFC7432, BGP MPLS Based Ethernet VPN (EVPN), with
   some extensions and how such a solution can offer a more efficient
   implementation of these functions than that of RFC7796, E-Tree
   Support in Virtual Private LAN Service (VPLS). This document makes
   use of the most significant bit of the "Tunnel Type" field (in PMSI
   Tunnel Attribute) governed by the IANA registry created by RFC7385,
   and hence updates RFC7385 accordingly.

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/1id-abstracts.html

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

Copyright and License Notice

   Copyright (c) 2017 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document. Code Components
   extracted from this document must include Simplified BSD License text
   as described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Simplified BSD License.

Table of Contents

   1  Introduction
     1.1  Specification of Requirements
     1.2  Terminology
   2  E-Tree Scenarios
     2.1  Scenario 1: Leaf or Root Site(s) per PE
     2.2  Scenario 2: Leaf or Root Site(s) per AC
     2.3  Scenario 3: Leaf or Root Site(s) per MAC Address
   3  Operation for EVPN
     3.1  Known Unicast Traffic
     3.2  Broadcast, Unknown, and Multicast (BUM) Traffic
       3.2.1  BUM Traffic Originated from a Single-homed Site on a
              Leaf AC
       3.2.2  BUM Traffic Originated from a Single-homed Site on a
              Root AC
       3.2.3  BUM Traffic Originated from a Multi-homed Site on a
              Leaf AC
       3.2.4  BUM Traffic Originated from a Multi-homed Site on a
              Root AC
     3.3  E-Tree Traffic Flows for EVPN
       3.3.1  E-Tree with MAC Learning
       3.3.2  E-Tree without MAC Learning
   4  Operation for PBB-EVPN
     4.1  Known Unicast Traffic
     4.2  Broadcast, Unknown, and Multicast (BUM) Traffic
     4.3  E-Tree without MAC Learning
   5  BGP Encoding
     5.1  E-Tree Extended Community
     5.2  PMSI Tunnel Attribute
   6  Acknowledgement
   7  Security Considerations
   8  IANA Considerations
     8.1  Considerations for PMSI Tunnel Types
   9  References
     9.1  Normative References
     9.2  Informative References
   Appendix-A
   Authors' Addresses

1  Introduction

   The Metro Ethernet Forum (MEF) has defined a rooted-multipoint
   Ethernet service known as Ethernet Tree (E-Tree) [MEF6.1]. In an
   E-Tree service, a customer site that is typically represented by an
   Attachment Circuit (AC) (e.g., an 802.1Q VLAN tag, but may also be
   represented by a MAC address) is labeled as either a Root or a Leaf
   site. Root sites can communicate with all other customer sites (both
   Root and Leaf sites). However, Leaf sites can communicate with Root
   sites but not with other Leaf sites. In this document, unless
   explicitly mentioned otherwise, a site is always represented by an
   AC.

   [RFC7387] describes a solution framework for supporting E-Tree
   service in MPLS networks. The document identifies the functional
   components of an overall solution to emulate E-Tree services in MPLS
   networks in addition to multipoint-to-multipoint Ethernet LAN (E-LAN)
   services specified in [RFC7432] and [RFC7623].

   [RFC7432] defines EVPN, a solution for multipoint L2VPN services with
   advanced multi-homing capabilities, using BGP for distributing
   customer/client MAC address reachability information over the
   MPLS/IP network. [RFC7623] combines the functionality of EVPN with
   [802.1ah] Provider Backbone Bridging (PBB) for MAC address
   scalability.

   This document discusses how the functional requirements for E-Tree
   service can be met with a solution based on (PBB-)EVPN (i.e.,
   [RFC7432] and [RFC7623]) with some extensions to their procedures and
   BGP attributes. Such a (PBB-)EVPN-based solution can offer a more
   efficient implementation of these functions than that of RFC7796,
   E-Tree Support in Virtual Private LAN Service (VPLS). This efficiency
   is achieved by filtering unicast traffic at the ingress PE nodes, as
   opposed to egress filtering where the traffic is sent through the
   network and then filtered and discarded at the egress PE nodes. The
   details of this ingress filtering are described in section 3.1.
   Since this document specifies a solution based on [RFC7432], readers
   are expected to be familiar with [RFC7432]. This document makes use
   of the most significant bit of the "Tunnel Type" field (in PMSI
   Tunnel Attribute) governed by the IANA registry created by RFC7385,
   and hence updates RFC7385 accordingly. Section 2 discusses E-Tree
   scenarios. Sections 3 and 4 describe E-Tree solutions for EVPN and
   PBB-EVPN respectively, and section 5 covers BGP encoding for E-Tree
   solutions.

1.1  Specification of Requirements

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [KEYWORDS].

1.2  Terminology

   Broadcast Domain: In a bridged network, the broadcast domain
   corresponds to a Virtual LAN (VLAN), where a VLAN is typically
   represented by a single VLAN ID (VID) but can be represented by
   several VIDs where Shared VLAN Learning (SVL) is used per [802.1Q].

   Bridge Table: An instantiation of a broadcast domain on a MAC-VRF.

   CE: Customer Edge device, e.g., a host, router, or switch.

   EVI: An EVPN instance spanning the Provider Edge (PE) devices
   participating in that EVPN.

   MAC-VRF: A Virtual Routing and Forwarding table for Media Access
   Control (MAC) addresses on a PE.

   Ethernet Segment (ES): When a customer site (device or network) is
   connected to one or more PEs via a set of Ethernet links, then that
   set of links is referred to as an 'Ethernet segment'.

   Ethernet Segment Identifier (ESI): A unique non-zero identifier that
   identifies an Ethernet segment is called an 'Ethernet Segment
   Identifier'.

   Ethernet Tag: An Ethernet tag identifies a particular broadcast
   domain, e.g., a VLAN. An EVPN instance consists of one or more
   broadcast domains.

   P2MP: Point to Multipoint.

   PE: Provider Edge device.

2  E-Tree Scenarios

   This document categorizes E-Tree scenarios into the following three
   scenarios, depending on the nature of the Root/Leaf site association:

   - Either Leaf or Root site(s) per PE

   - Either Leaf or Root site(s) per Attachment Circuit (AC)

   - Either Leaf or Root site(s) per MAC address

2.1  Scenario 1: Leaf or Root Site(s) per PE

   In this scenario, a PE may receive traffic from either Root ACs or
   Leaf ACs for a given MAC-VRF/bridge table, but not both. In other
   words, a given EVPN Instance (EVI) on a Provider Edge (PE) device is
   either associated with Root(s) or Leaf(s). The PE may have both Root
   and Leaf ACs albeit for different EVIs.

                  +---------+            +---------+
                  |   PE1   |            |   PE2   |
   +---+          |  +---+  |  +------+  |  +---+  |            +---+
   |CE1+---AC1----+--+   |  |  | MPLS |  |  |   +--+----AC2-----+CE2|
   +---+  (Root)  |  |MAC|  |  | /IP  |  |  |MAC|  |   (Leaf)   +---+
                  |  |VRF|  |  |      |  |  |VRF|  |
                  |  |   |  |  |      |  |  |   |  |            +---+
                  |  |   |  |  |      |  |  |   +--+----AC3-----+CE3|
                  |  +---+  |  +------+  |  +---+  |   (Leaf)   +---+
                  +---------+            +---------+

                         Figure 1: Scenario 1

   In this scenario, tailored BGP Route Target (RT) import/export
   policies among the PEs belonging to the same EVI can be used to
   prevent communication among Leaf PEs. To prevent communication among
   Leaf ACs connected to the same PE and belonging to the same EVI,
   split-horizon filtering is used to block traffic from one Leaf AC to
   another Leaf AC on a MAC-VRF for a given E-Tree EVI. The purpose of
   this topology constraint is to avoid having PEs with only Leaf sites
   importing and processing BGP MAC routes from each other. To support
   such a topology constraint in EVPN, two BGP Route-Targets (RTs) are
   used for every EVPN Instance (EVI): one RT is associated with the
   Root sites (Root ACs) and the other is associated with the Leaf
   sites (Leaf ACs). On a per EVI basis, every PE exports the single RT
   associated with its type of site(s). Furthermore, a PE with Root
   site(s) imports both Root and Leaf RTs, whereas a PE with Leaf
   site(s) only imports the Root RT.

   For this scenario, if it is desired to use only a single RT per EVI
   (just like E-LAN services in [RFC7432]), then approach B in
   scenario 2 (described below) needs to be used.

2.2  Scenario 2: Leaf or Root Site(s) per AC

   In this scenario, a PE can receive traffic from both Root ACs and
   Leaf ACs for a given EVI. In other words, a given EVI on a PE can be
   associated with both Root(s) and Leaf(s).

                    +---------+            +---------+
                    |   PE1   |            |   PE2   |
   +---+            |  +---+  |  +------+  |  +---+  |        +---+
   |CE1+-----AC1----+--+   |  |  |      |  |  |   +--+---AC2--+CE2|
   +---+   (Leaf)   |  |MAC|  |  | MPLS |  |  |MAC|  | (Leaf) +---+
                    |  |VRF|  |  | /IP  |  |  |VRF|  |
                    |  |   |  |  |      |  |  |   |  |        +---+
                    |  |   |  |  |      |  |  |   +--+---AC3--+CE3|
                    |  +---+  |  +------+  |  +---+  | (Root) +---+
                    +---------+            +---------+

                         Figure 2: Scenario 2

   In this scenario, just like the previous scenario (in section 2.1),
   two Route Targets (one for Root and another for Leaf) can be used.
   However, the difference is that on a PE with both Root and Leaf ACs,
   all remote MAC routes are imported and thus there needs to be a way
   to differentiate remote MAC routes associated with Leaf ACs versus
   the ones associated with Root ACs in order to apply the proper
   ingress filtering.

   In order to recognize the association of a destination MAC address to
   a Leaf or Root AC and thus support ingress filtering on the ingress
   PE with both Leaf and Root ACs, MAC addresses need to be colored with
   a Root or Leaf indication before they are advertised to other PEs.
   There are two approaches for such coloring:

   A) To always use two RTs (one to designate Leaf RT and another for
   Root RT)

   B) To allow a single RT to be used per EVI, just like [RFC7432], and
   thus color MAC addresses via a "color" flag in a new extended
   community as detailed in section 5.1.

   Approach (A) would require the same data plane enhancements as
   approach (B) if the MAC-VRF and bridge tables used per VLAN are to
   remain consistent with [RFC7432] (section 6). In order to avoid
   data-plane enhancements for approach (A), multiple bridge tables per
   VLAN may be considered; however, this has major drawbacks as
   described in appendix-A and thus is not recommended.

   Given that both approaches (A) and (B) would require the same
   data-plane enhancements, approach (B) is chosen here in order to
   allow for RT usage consistent with baseline EVPN [RFC7432] and for
   better generality. It should be noted that if one wants to use RT
   constraints in order to avoid MAC advertisements associated with a
   Leaf AC to PEs with only Leaf ACs, then two RTs (one for Root and
   another for Leaf) can still be used with approach (B); however, in
   such applications Leaf/Root RTs will be used to constrain MAC
   advertisements and they are not used to color the MAC routes for
   ingress filtering - i.e., in approach (B), the coloring is always
   done via the new extended community.

   If, for a given EVI, a significant number of PEs have both Leaf and
   Root sites attached (even though they may start as Root-only or
   Leaf-only PEs), then a single RT per EVI should be used. The reason
   for this recommendation is to alleviate the configuration overhead
   associated with using two RTs per EVI, at the expense of having some
   unwanted MAC addresses on the Leaf-only PEs.

2.3  Scenario 3: Leaf or Root Site(s) per MAC Address

   In this scenario, a customer Root or Leaf site is represented by a
   MAC address and a PE may receive traffic from both Root AND Leaf
   sites on a single Attachment Circuit (AC) of an EVI. This scenario is
   not covered in either [RFC7387] or [MEF6.1]; however, it is covered
   in this document for the sake of completeness. In this scenario,
   since an AC carries traffic from both Root and Leaf sites, Root or
   Leaf sites are identified at the granularity of a MAC address.
   This scenario is considered in this document for EVPN service with
   only known unicast traffic because the Designated Forwarder (DF)
   filtering per [RFC7432] would not be compatible with the required
   egress filtering - i.e., Broadcast, Unknown, and Multicast (BUM)
   traffic is not supported in this scenario and it is dropped by the
   ingress PE.

   For this scenario, approach B in scenario 2 (described above) is
   used in order to allow for single RT usage by service providers.

                    +---------+            +---------+
                    |   PE1   |            |   PE2   |
   +---+            |  +---+  |  +------+  |  +---+  |            +---+
   |CE1+-----AC1----+--+   |  |  |      |  |  |   +--+-----AC2----+CE2|
   +---+   (Root)   |  | E |  |  | MPLS |  |  | E |  | (Leaf/Root)+---+
                    |  | V |  |  | /IP  |  |  | V |  |
                    |  | I |  |  |      |  |  | I |  |            +---+
                    |  |   |  |  |      |  |  |   +--+-----AC3----+CE3|
                    |  +---+  |  +------+  |  +---+  |   (Leaf)   +---+
                    +---------+            +---------+

                         Figure 3: Scenario 3

   In conclusion, approach B in scenario 2 is the recommended approach
   across all three scenarios above, and the corresponding solution is
   detailed in the following sections.

3  Operation for EVPN

   [RFC7432] defines the notion of Ethernet Segment Identifier (ESI)
   MPLS label used for split-horizon filtering of BUM traffic at the
   egress PE. Such egress filtering capabilities can be leveraged in
   the provision of E-Tree services, as will be seen shortly for BUM
   traffic. For known unicast traffic, additional extensions to
   [RFC7432] are needed (i.e., a new BGP Extended Community for Leaf
   indication described in section 5.1) in order to enable ingress
   filtering as described in detail in the following sections.

3.1  Known Unicast Traffic

   Since in EVPN, MAC learning is performed in the control plane via
   advertisement of BGP routes, the filtering needed by E-Tree service
   for known unicast traffic can be performed at the ingress PE, thus
   providing very efficient filtering and avoiding sending known unicast
   traffic over the MPLS/IP core to be filtered at the egress PE as done
   in traditional E-Tree solutions - i.e., E-Tree for VPLS [RFC7796].

   To provide such ingress filtering for known unicast traffic, a PE
   MUST indicate to other PEs what kind of sites (Root or Leaf) its MAC
   addresses are associated with. This is done by advertising a Leaf
   indication flag (via an Extended Community) along with each of its
   MAC/IP Advertisement routes learned from a Leaf site. The lack of
   such a flag indicates that the MAC address is associated with a Root
   site. This scheme applies to all scenarios described in section 2.

   Tagging MAC addresses with a Leaf indication enables remote PEs to
   perform ingress filtering for known unicast traffic - i.e., on the
   ingress PE, the MAC destination address lookup yields, in addition to
   the forwarding adjacency, a flag which indicates whether the target
   MAC is associated with a Leaf site or not. The ingress PE
   cross-checks this flag with the status of the originating AC, and if
   both are Leafs, then the packet is not forwarded.

   In situations where MAC moves are allowed among Leaf and Root sites
   (e.g., non-static MAC), PEs can receive multiple MAC/IP
   Advertisement routes for the same MAC address with different
   Leaf/Root indications (and possibly different ESIs for multi-homing
   scenarios). In such situations, MAC mobility procedures (section 15
   of [RFC7432]) take precedence to first identify the location of the
   MAC before associating that MAC with a Root or a Leaf site.
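
   The ingress check described in this section can be sketched as
   follows. This is a minimal illustration, not an implementation from
   this document: the names MacEntry, install_route, and
   ingress_forward are hypothetical, the MAC-VRF is modeled as a plain
   dictionary, and MAC mobility handling is left out.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class MacEntry:
    """Hypothetical MAC-VRF entry built from a MAC/IP Advertisement route."""
    next_hop: str   # forwarding adjacency, e.g. the remote PE
    is_leaf: bool   # True if the route carried the Leaf indication


def install_route(mac_vrf: Dict[str, MacEntry], mac: str, next_hop: str,
                  leaf_flag_present: bool) -> None:
    # The lack of a Leaf indication means the MAC belongs to a Root site.
    mac_vrf[mac] = MacEntry(next_hop=next_hop, is_leaf=leaf_flag_present)


def ingress_forward(mac_vrf: Dict[str, MacEntry], dst_mac: str,
                    src_ac_is_leaf: bool) -> Optional[str]:
    """Return the next hop for a known unicast frame, or None if filtered."""
    # A KeyError here means unknown unicast, which is BUM traffic (section 3.2).
    entry = mac_vrf[dst_mac]
    if src_ac_is_leaf and entry.is_leaf:
        return None  # Leaf-to-Leaf: dropped at the ingress PE
    return entry.next_hop
```

   Only the Leaf-to-Leaf combination is filtered; every other
   combination of originating AC and destination MAC is forwarded
   toward the adjacency stored with the MAC entry.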

   To support the above ingress filtering functionality, a new E-Tree
   Extended Community with a Leaf indication flag is introduced
   (section 5.1). This new Extended Community MUST be advertised with
   MAC/IP Advertisement routes learned from a Leaf site. For the purpose
   of known unicast traffic, no EVPN routes other than the MAC/IP
   Advertisement route are required to carry this new extended
   community.

3.2  Broadcast, Unknown, and Multicast (BUM) Traffic

   This specification does not provide support for filtering BUM
   (Broadcast, Unknown, and Multicast) traffic on the ingress PE; due to
   the multi-destination nature of BUM traffic, it is not possible to
   perform filtering of the same on the ingress PE. As such, the
   solution relies on egress filtering. In order to apply the proper
   egress filtering, which varies based on whether a packet is sent from
   a Leaf AC or a Root AC, the MPLS-encapsulated frames MUST be tagged
   with an indication when they originated from a Leaf AC - i.e., to be
   tagged with a Leaf label as specified in section 5.1. This Leaf label
   allows the disposition PE (e.g., egress PE) to perform the necessary
   egress filtering function in the data plane, similar to the ESI label
   in [RFC7432]. The allocation of the Leaf label is on a per PE basis
   (e.g., independent of ESI and EVI) as described in the following
   sections.

   The Leaf label can be upstream assigned for P2MP LSP or downstream
   assigned for ingress replication tunnels. The main difference between
   a downstream and an upstream assigned Leaf label is that, in the
   downstream assigned case, not all egress PE devices need to receive
   the label in MPLS encapsulated BUM packets, just like the ESI label
   for ingress replication procedures defined in [RFC7432].

   On the ingress PE, the PE needs to place all its Leaf ACs for a given
   bridge domain in a single split-horizon group in order to prevent
   intra-PE forwarding among its Leaf ACs.
   This intra-PE split-horizon filtering applies to BUM traffic as well
   as known-unicast traffic.

   There are four scenarios to consider as follows. In all these
   scenarios, the ingress PE imposes the right MPLS label associated
   with the originated Ethernet Segment (ES) depending on whether the
   Ethernet frame originated from a Root or a Leaf site on that Ethernet
   Segment (ESI label or Leaf label). The mechanism by which the PE
   identifies whether a given frame originated from a Root or a Leaf
   site on the segment is based on the AC identifier for that segment
   (e.g., Ethernet Tag of the frame for 802.1Q frames). Other mechanisms
   for identifying Root or Leaf sites, such as the use of the source MAC
   address of the received frame, are optional. The scenarios below are
   described in the context of Root/Leaf AC; however, they can be
   extended to Root/Leaf MAC address if needed.

3.2.1  BUM Traffic Originated from a Single-homed Site on a Leaf AC

   In this scenario, the ingress PE adds a Leaf label advertised using
   the E-Tree Extended Community (section 5.1) indicating a Leaf site.
   This Leaf label, used for single-homing scenarios, is not on a per ES
   basis but rather on a per PE basis - i.e., a single Leaf MPLS label
   is used for all single-homed ES's on that PE. This Leaf label is
   advertised to other PE devices, using the E-Tree Extended Community
   (section 5.1) along with an Ethernet Auto-discovery per ES (EAD-ES)
   route with ESI of zero and a set of Route Targets (RTs) corresponding
   to all EVIs on the PE where each EVI has at least one Leaf site.
   Multiple EAD-ES routes will need to be advertised if the number of
   Route Targets (RTs) that need to be carried exceeds the limit on a
   single route per [RFC7432]. The ESI for the EAD-ES route is set to
   zero to indicate single-homed sites.
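
   As a rough illustration of the advertisement described above, the
   sketch below groups the RTs of all EVIs that have at least one Leaf
   site into one or more EAD-ES routes with an ESI of zero. The function
   name, the dictionary-based route representation, and the
   max_rts_per_route parameter are hypothetical; the actual per-route
   limit comes from [RFC7432] and is not restated here. One RT per EVI
   is assumed for simplicity.

```python
from typing import Dict, List

ZERO_ESI = bytes(10)  # an ESI is 10 octets; all zeros denotes single-homed sites


def build_leaf_label_routes(leaf_sites_per_rt: Dict[str, int],
                            leaf_label: int,
                            max_rts_per_route: int) -> List[dict]:
    """Return EAD-ES route descriptions carrying the per-PE Leaf label.

    leaf_sites_per_rt maps the RT of each EVI on this PE to its number
    of Leaf sites (assumption: one RT per EVI).
    """
    if max_rts_per_route < 1:
        raise ValueError("max_rts_per_route must be at least 1")
    # Only EVIs with at least one Leaf site contribute their RT.
    rts = sorted(rt for rt, count in leaf_sites_per_rt.items() if count > 0)
    # Split the RT set across several routes when it exceeds the limit.
    return [
        {"esi": ZERO_ESI,
         "leaf_label": leaf_label,
         "route_targets": rts[i:i + max_rts_per_route]}
        for i in range(0, len(rts), max_rts_per_route)
    ]
```

   Every route produced this way carries the same Leaf label, since the
   label is allocated per PE and not per ES or per EVI.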

   When a PE receives this special Leaf label in the data path, it
   blocks the packet if the destination AC is of type Leaf; otherwise,
   it forwards the packet.

3.2.2  BUM Traffic Originated from a Single-homed Site on a Root AC

   In this scenario, the ingress PE does not add any ESI label or Leaf
   label and it operates per [RFC7432] procedures.

3.2.3  BUM Traffic Originated from a Multi-homed Site on a Leaf AC

   In this scenario, it is assumed that while different ACs (VLANs) on
   the same ES could have different Root/Leaf designations (some being
   Roots and some being Leafs), the same VLAN does have the same
   Root/Leaf designation on all PEs on the same ES. Furthermore, it is
   assumed that there is no forwarding among subnets - i.e., the service
   is EVPN L2 and not EVPN IRB [EVPN-IRB]. IRB use cases described in
   [EVPN-IRB] are outside the scope of this document.

   In this scenario, if a multicast or broadcast packet is originated
   from a Leaf AC, then it only needs to carry the Leaf label described
   in section 3.2.1. This label is sufficient to provide the necessary
   egress filtering, preventing BUM traffic from being sent to Leaf ACs,
   including the Leaf AC on the same Ethernet Segment.

3.2.4  BUM Traffic Originated from a Multi-homed Site on a Root AC

   In this scenario, both the ingress and egress PE devices follow the
   procedure defined in [RFC7432] for adding and/or processing an ESI
   MPLS label - i.e., existing procedures for BUM traffic in [RFC7432]
   are sufficient and there is no need to add a Leaf label.
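
   The four cases of sections 3.2.1 through 3.2.4 can be summarized in a
   short sketch. It is illustrative only: the function names are
   hypothetical, leaf_label stands for whichever Leaf label applies to
   the tunnel in use (upstream or downstream assigned), and the ESI
   split-horizon check of [RFC7432] is not modeled on the egress side.

```python
from typing import Optional, Set


def bum_label_to_impose(from_leaf_ac: bool, multi_homed: bool,
                        leaf_label: int,
                        esi_label: Optional[int]) -> Optional[int]:
    """Pick the extra label the ingress PE adds to a BUM frame, if any."""
    if from_leaf_ac:
        return leaf_label   # 3.2.1 and 3.2.3: Leaf label, single- or multi-homed
    if multi_homed:
        return esi_label    # 3.2.4: ESI label per [RFC7432]
    return None             # 3.2.2: neither an ESI label nor a Leaf label


def egress_allows(received_label: Optional[int], leaf_labels: Set[int],
                  dest_ac_is_leaf: bool) -> bool:
    """Egress E-Tree check: block Leaf-originated BUM toward a Leaf AC."""
    came_from_leaf = received_label is not None and received_label in leaf_labels
    return not (came_from_leaf and dest_ac_is_leaf)
```

   A frame that carries a Leaf label is still delivered to Root ACs on
   the egress PE; only delivery to Leaf ACs is blocked.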

3.3  E-Tree Traffic Flows for EVPN

   Per [RFC7387], a generic E-Tree service supports all of the following
   traffic flows:

   - Known unicast traffic from Root to Roots & Leafs
   - Known unicast traffic from Leaf to Roots
   - BUM traffic from Root to Roots & Leafs
   - BUM traffic from Leaf to Roots

   A particular E-Tree service may need to support all of the above
   types of flows or only a select subset, depending on the target
   application. In the case where only multicast and broadcast flows
   need to be supported, the L2VPN PEs can avoid performing any MAC
   learning function.

   The following subsections describe the operation of EVPN to support
   E-Tree service with and without MAC learning.

3.3.1  E-Tree with MAC Learning

   The PEs implementing an E-Tree service must perform MAC learning when
   unicast traffic flows must be supported among Root and Leaf sites. In
   this case, the PE(s) with Root sites perform MAC learning in the
   data-path over the Ethernet Segments, and advertise reachability in
   EVPN MAC/IP Advertisement Routes. These routes will be imported by
   all PEs for that EVI (i.e., PEs that have Leaf sites as well as PEs
   that have Root sites). Similarly, the PEs with Leaf sites perform MAC
   learning in the data-path over their Ethernet Segments, and advertise
   reachability in EVPN MAC/IP Advertisement Routes. For scenarios where
   two different RTs are used per EVI (one to designate Root site and
   another to designate Leaf site), the MAC/IP Advertisement routes
   advertised by PEs with Leaf sites are imported only by PEs with at
   least one Root site in the EVI - i.e., a PE with only Leaf sites will
   not import these routes. PEs with Root and/or Leaf sites may use the
   Ethernet Auto-discovery per EVI (EAD-EVI) routes for aliasing (in the
   case of multi-homed segments) and EAD-ES routes for mass MAC
   withdrawal per [RFC7432].
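
   For the two-RT case mentioned above (and in section 2.1), the import
   and export behavior can be sketched as below. The function names and
   the string representation of RTs are hypothetical, and real BGP
   import policy is considerably richer than a set intersection.

```python
from typing import Set


def export_rt(site_is_root: bool, root_rt: str, leaf_rt: str) -> str:
    """RT attached to routes learned from a site of the given type."""
    return root_rt if site_is_root else leaf_rt


def import_rts(pe_has_root_site: bool, root_rt: str, leaf_rt: str) -> Set[str]:
    """RTs a PE imports for one EVI under the two-RT scheme."""
    # A PE with a Root site imports both RTs; a Leaf-only PE imports
    # the Root RT only, so it never sees routes from other Leaf sites.
    return {root_rt, leaf_rt} if pe_has_root_site else {root_rt}


def imports_route(pe_has_root_site: bool, route_rts: Set[str],
                  root_rt: str, leaf_rt: str) -> bool:
    """True if a route carrying route_rts passes the PE's import policy."""
    return bool(route_rts & import_rts(pe_has_root_site, root_rt, leaf_rt))
```

   With a single RT per EVI, every PE imports every route and the
   Leaf/Root distinction is carried only by the E-Tree Extended
   Community.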

   To support multicast/broadcast from Root to Leaf sites, either a P2MP
   tree rooted at the PE(s) with the Root site(s) (e.g., Root PEs) or
   ingress replication can be used (section 16 of [RFC7432]). The
   multicast tunnels are set up through the exchange of the EVPN
   Inclusive Multicast route, as defined in [RFC7432].

   To support multicast/broadcast from Leaf to Root sites, either
   ingress replication tunnels from each Leaf PE or a P2MP tree rooted
   at each Leaf PE can be used. The following two paragraphs describe
   when each of these tunneling schemes can be used and how to signal
   them.

   When there are only a few Root PEs with a small amount of
   multicast/broadcast traffic from Leaf PEs toward Root PEs, then
   ingress replication tunnels from Leaf PEs toward Root PEs should be
   sufficient. Therefore, if a Root PE needs to support a P2MP tunnel in
   the transmit direction from itself to Leaf PEs and at the same time
   it wants to support ingress-replication tunnels in the receive
   direction, the Root PE can signal it efficiently by using a new
   composite tunnel type defined in section 5.2. This new composite
   tunnel type is advertised by the Root PE to simultaneously indicate a
   P2MP tunnel in the transmit direction and an ingress-replication
   tunnel in the receive direction for the BUM traffic.

   If the number of Root PEs is large, P2MP tunnels (e.g., mLDP or
   RSVP-TE) originated at the Leaf PEs may be used and thus there will
   be no need to use the modified PMSI tunnel attribute and the
   composite tunnel type values defined in section 5.2.

3.3.2  E-Tree without MAC Learning

   The PEs implementing an E-Tree service need not perform MAC learning
   when the traffic flows between Root and Leaf sites are mainly
   multicast or broadcast. In this case, the PEs do not exchange EVPN
   MAC/IP Advertisement Routes. Instead, the Inclusive Multicast
   Ethernet Tag route is used to support BUM traffic.
   In such scenarios, the small amount of unicast traffic (if any) is
   sent as part of BUM traffic.

   The fields of this route are populated per the procedures defined in
   [RFC7432], and the multicast tunnel setup criteria are as described
   in the previous section.

   Just as in the previous section, if there are only a few Root PEs and
   thus ingress replication is desired from Leaf PEs to these Root PEs,
   then the modified PMSI attribute and the composite tunnel type values
   defined in section 5.2 should be used.

4  Operation for PBB-EVPN

   In PBB-EVPN, the PE advertises a Root/Leaf indication along with each
   B-MAC Advertisement route to indicate whether the associated B-MAC
   address corresponds to a Root or a Leaf site. Just like the EVPN
   case, the new E-Tree Extended Community defined in section 5.1 is
   advertised with each EVPN MAC/IP Advertisement route.

   In the case where a multi-homed Ethernet Segment has both Root and
   Leaf sites attached, two B-MAC addresses are advertised: one B-MAC
   address is per ES as specified in [RFC7623] and implicitly denotes
   Root, and the other B-MAC address is per PE and explicitly denotes
   Leaf. The former B-MAC address is not advertised with the E-Tree
   extended community, but the latter B-MAC denoting Leaf is advertised
   with the new E-Tree extended community where the "Leaf-indication"
   flag is set. In multi-homing scenarios where an Ethernet Segment has
   both Root and Leaf ACs, it is assumed that while different ACs
   (VLANs) on the same ES could have different Root/Leaf designations
   (some being Roots and some being Leafs), the same VLAN does have the
   same Root/Leaf designation on all PEs on the same ES. Furthermore, it
   is assumed that there is no forwarding among subnets - i.e., the
   service is L2 and not IRB. The IRB use case is outside the scope of
   this document.

   The ingress PE uses the right B-MAC source address depending on
   whether the Ethernet frame originated from the Root or Leaf AC on
   that Ethernet Segment. The mechanism by which the PE identifies
   whether a given frame originated from a Root or Leaf site on the
   segment is based on the Ethernet Tag associated with the frame. Other
   mechanisms of identification, beyond the Ethernet Tag, are outside
   the scope of this document.

   Furthermore, a PE advertises two special global B-MAC addresses: one
   for Root and another for Leaf, and tags the Leaf one as such in the
   MAC Advertisement route. These B-MAC addresses are used as source
   addresses for traffic originating from single-homed segments. The
   B-MAC address used for indicating Leaf sites can be the same for both
   single-homed and multi-homed segments.

4.1  Known Unicast Traffic

   For known unicast traffic, the PEs perform ingress filtering: On the
   ingress PE, the C-MAC [RFC7623] destination address lookup yields, in
   addition to the target B-MAC address and forwarding adjacency, a flag
   which indicates whether the target B-MAC is associated with a Root or
   a Leaf site. The ingress PE also checks the status of the originating
   site, and if both are Leafs, then the packet is not forwarded.

4.2  Broadcast, Unknown, and Multicast (BUM) Traffic

   For BUM traffic, the PEs must perform egress filtering. When a PE
   receives an EVPN MAC/IP Advertisement route (which will be used as a
   source B-MAC for BUM traffic), it updates its egress filtering (based
   on the source B-MAC address), as follows:

   - If the EVPN MAC/IP Advertisement route indicates that the
   advertised B-MAC is a Leaf, and the local Ethernet Segment is a Leaf
   as well, then the source B-MAC address is added to its B-MAC list
   used for egress filtering - i.e., to block traffic from that B-MAC
   address.

   - Otherwise, the B-MAC filtering list is not updated.
   - If the EVPN MAC/IP Advertisement route indicates that the
   advertised B-MAC has changed its designation from a Leaf to a Root
   and the local Ethernet Segment is a Leaf, then the source B-MAC
   address is removed from the B-MAC list corresponding to the local
   Ethernet Segment used for egress filtering, i.e., to unblock traffic
   from that B-MAC address.

   When the egress PE receives the packet, it examines the B-MAC source
   address to check whether it should filter or forward the frame. Note
   that this uses the same filtering logic as baseline [RFC7623] for an
   ESI and does not require any additional flags in the data-plane.

   Just as in section 3.2, the PE places all Leaf Ethernet Segments of a
   given bridge domain in a single split-horizon group in order to
   prevent intra-PE forwarding among Leaf segments. This split-horizon
   function applies to BUM traffic as well as known-unicast traffic.

4.3 E-Tree without MAC Learning

   In scenarios where the traffic of interest is only multicast and/or
   broadcast, the PEs implementing an E-Tree service do not need to do
   any MAC learning. In such scenarios the filtering must be performed
   on egress PEs. For PBB-EVPN, the handling of such traffic is per
   section 4.2 without the need for C-MAC learning (in data-plane) in
   the I-component (C-bridge table) of PBB-EVPN PEs (at both ingress and
   egress PEs).

5 BGP Encoding

   This document defines a new BGP Extended Community for EVPN.

5.1 E-Tree Extended Community

   This Extended Community is a new transitive Extended Community
   [RFC4360] having a Type field value of 0x06 (EVPN) and the Sub-Type
   0x05. It is used for Leaf indication of known unicast and BUM
   traffic. It indicates that the frame is originated from a Leaf site.

   The E-Tree Extended Community is encoded as an 8-octet value as
   follows:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Type=0x06     | Sub-Type=0x05 | Flags(1 Octet)|  Reserved=0   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Reserved=0   |           Leaf Label                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                Figure 4: E-Tree Extended Community

   The Flags field has the following format:

    0 1 2 3 4 5 6 7
   +-+-+-+-+-+-+-+-+
   |     MBZ     |L|   (MBZ = Must Be Zero)
   +-+-+-+-+-+-+-+-+

   This document defines the following flags:

      + Leaf-Indication (L)

   A value of one indicates a Leaf AC/Site. The rest of the flag bits
   are reserved and should be set to zero.

   When this Extended Community (EC) is advertised along with a MAC/IP
   Advertisement route (for known unicast traffic) per section 3.1, the
   Leaf-Indication flag MUST be set to one and the Leaf Label SHOULD be
   set to zero. The receiving PE MUST ignore the Leaf Label and only
   process the Leaf-Indication flag. A value of zero for the
   Leaf-Indication flag is invalid when sent along with a MAC/IP
   Advertisement route, and an error should be logged.

   When this EC is advertised along with an EAD-ES route (with ESI of
   zero) for BUM traffic to enable egress filtering on disposition PEs
   per sections 3.2.1 and 3.2.3, the Leaf Label MUST be set to a valid
   MPLS label (i.e., a non-reserved assigned MPLS label [RFC3032]) and
   the Leaf-Indication flag SHOULD be set to zero. The value of the
   20-bit MPLS label is encoded in the high-order 20 bits of the Leaf
   Label field. The receiving PE MUST ignore the Leaf-Indication flag.
   An invalid MPLS label, when sent along with the EAD-ES route, should
   be ignored and logged as an error.

   The reserved bits SHOULD be set to zero by the transmitter and MUST
   be ignored by the receiver.

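   The encoding rules in section 5.1 can be illustrated with a minimal,
   non-normative sketch. It assumes the octet layout shown in Figure 4
   (Type, Sub-Type, Flags, two reserved octets, then a three-octet Leaf
   Label) and that the L flag is the least significant bit of the Flags
   octet (bit 7 in the figure's numbering). The function names
   encode_etree_ec and decode_etree_ec are hypothetical and are not
   defined by this document or by any BGP implementation. The sketch
   does not check whether a label is a non-reserved, assigned label.

```python
from typing import Tuple

EVPN_TYPE = 0x06        # Type field value (EVPN), per section 5.1
ETREE_SUB_TYPE = 0x05   # Sub-Type value, per section 5.1
LEAF_FLAG = 0x01        # L bit: bit 7 of the Flags octet (least significant)


def encode_etree_ec(leaf: bool, label: int = 0) -> bytes:
    """Build the 8-octet E-Tree Extended Community (illustrative only)."""
    if not 0 <= label < (1 << 20):
        raise ValueError("MPLS label must fit in 20 bits")
    flags = LEAF_FLAG if leaf else 0
    # The 20-bit label occupies the high-order 20 bits of the
    # 3-octet Leaf Label field; the low-order 4 bits are left at zero.
    leaf_label = (label << 4).to_bytes(3, "big")
    # Type, Sub-Type, Flags, Reserved, Reserved, Leaf Label (3 octets)
    return bytes([EVPN_TYPE, ETREE_SUB_TYPE, flags, 0, 0]) + leaf_label


def decode_etree_ec(data: bytes) -> Tuple[bool, int]:
    """Return (leaf_indication, label) from an 8-octet E-Tree EC."""
    if len(data) != 8:
        raise ValueError("Extended Community must be 8 octets")
    if data[0] != EVPN_TYPE or data[1] != ETREE_SUB_TYPE:
        raise ValueError("not an E-Tree Extended Community")
    # Reserved bits and reserved octets are ignored on receipt.
    leaf = bool(data[2] & LEAF_FLAG)
    label = int.from_bytes(data[5:8], "big") >> 4
    return leaf, label
```

   Under these assumptions, encode_etree_ec(True).hex() yields
   "0605010000000000", the form sent with a MAC/IP Advertisement route
   (L set, Leaf Label zero). encode_etree_ec(False, 100).hex() yields
   "0605000000000640", the form sent with an EAD-ES route carrying the
   example label value 100. A receiver would then apply the per-route
   rules above, using only the flag in the first case and only the
   label in the second.
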
5.2 PMSI Tunnel Attribute

   [RFC6514] defines the PMSI Tunnel attribute, which is an optional
   transitive attribute with the following format:

      +---------------------------------+
      |  Flags (1 octet)                |
      +---------------------------------+
      |  Tunnel Type (1 octet)          |
      +---------------------------------+
      |  Ingress Replication MPLS Label |
      |  (3 octets)                     |
      +---------------------------------+
      |  Tunnel Identifier (variable)   |
      +---------------------------------+

                Figure 5: PMSI Tunnel Attribute

   This document defines a new Composite tunnel type by introducing a
   new 'Composite Tunnel' bit in the Tunnel Type field and adding an
   MPLS label to the Tunnel Identifier field of the PMSI Tunnel
   attribute as detailed below. All other fields remain as defined in
   [RFC6514]. The Composite tunnel type is advertised by the Root PE to
   simultaneously indicate a non-(ingress replication) tunnel (e.g.,
   P2MP tunnel) in the transmit direction and an ingress-replication
   tunnel in the receive direction for the BUM traffic.

   When receiver ingress-replication labels are needed, the high-order
   bit of the tunnel type field (Composite Tunnel bit) is set while the
   remaining low-order seven bits indicate the tunnel type as before
   (for the existing tunnel types). When this Composite Tunnel bit is
   set, the "tunnel identifier" field begins with a three-octet label,
   followed by the actual tunnel identifier for the transmit tunnel.
   PEs that don't understand the new meaning of the high-order bit treat
   the tunnel type as an undefined tunnel type and treat the PMSI tunnel
   attribute as a malformed attribute [RFC6514]. That is why the
   composite tunnel bit is allocated in the Tunnel Type field rather
   than the Flags field.
   For the PEs that do understand the new meaning of the high-order
   bit, if ingress replication is desired when sending BUM traffic, the
   PE will use the label in the Tunnel Identifier field when sending its
   BUM traffic.

   Using the Composite Tunnel bit for Tunnel Types 0x00 'no tunnel
   information present' and 0x06 'Ingress Replication' is invalid. A PE
   that receives a PMSI Tunnel attribute with such information considers
   it malformed and SHOULD treat this Update as though all the routes
   contained in this Update had been withdrawn per section 5 of
   [RFC6514].

6 Acknowledgement

   We would like to thank Eric Rosen, Jeffrey Zhang, Wen Lin, Aldrin
   Issac, Wim Henderickx, Dennis Cai, and Antoni Przygienda for their
   valuable comments and contributions. The authors would also like to
   thank Thomas Morin for shepherding this document and providing
   valuable comments.

7 Security Considerations

   Since this document uses the EVPN constructs of [RFC7432] and
   [RFC7623], the security considerations in these documents are also
   applicable here. Furthermore, this document provides an additional
   security check by allowing sites (or ACs) of an EVPN instance to be
   designated as "Root" or "Leaf" by the network operator/service
   provider, thus preventing any traffic exchange among "Leaf" sites of
   that VPN through ingress filtering for known unicast traffic and
   egress filtering for BUM traffic. By default, and for the purpose of
   backward compatibility, an AC that doesn't have a Leaf designation is
   considered a Root AC. Therefore, in order to avoid any traffic
   exchange among Leaf ACs, the operator SHOULD configure the AC with a
   proper role (Leaf or Root) before activating the AC.

8 IANA Considerations

   IANA has allocated value 5 in the "EVPN Extended Community Sub-Types"
   registry defined in [RFC7153] as follows:

      SUB-TYPE VALUE     NAME                        Reference

      0x05               E-Tree Extended Community   This document

   This document creates a one-octet registry called "E-Tree Flags".
   New registrations will be made through the "RFC Required" procedure
   defined in [RFC8126]. Initial registrations are as follows:

      bit                Name                        Reference

      0-6                Unassigned
      7                  Leaf-Indication             This document

8.1 Considerations for PMSI Tunnel Types

   The "P-Multicast Service Interface Tunnel (PMSI Tunnel) Tunnel Types"
   registry in the "Border Gateway Protocol (BGP) Parameters" registry
   needs to be updated to reflect the use of the most significant bit as
   the "Composite Tunnel" bit (section 5.2).

   For this purpose, this document updates [RFC7385] by changing the
   previously unassigned values (i.e., 0x08 - 0xFA) as follows:

      Value          Meaning                         Reference
      0x08-0x7A      Unassigned
      0x7B-0x7E      Experimental                    this document
      0x7F           Reserved                        this document
      0x80-0xFA      Reserved for Composite tunnel   this document
      0xFB-0xFE      Experimental                    [RFC7385]
      0xFF           Reserved                        [RFC7385]

   The allocation policy for values 0x08-0x7A is per IETF Review
   [RFC8126]. The range for experimental has been expanded to include
   the previously assigned range of 0xFB-0xFE and the new range of
   0x7B-0x7E. The values in these ranges are not to be assigned. The
   value 0x7F, which is the mirror image of 0xFF, is reserved in this
   document.

9 References

9.1 Normative References

   [KEYWORDS] Bradner, S., "Key words for use in RFCs to Indicate
   Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC8126] Cotton et al., "Guidelines for Writing an IANA
   Considerations Section in RFCs", June 2017.

   [RFC7387] Key et al., "A Framework for E-Tree Service over MPLS
   Network", October 2014.

   [MEF6.1] Metro Ethernet Forum, "Ethernet Services Definitions - Phase
   2", MEF 6.1, April 2008,
   https://mef.net/PDF_Documents/technical-specifications/MEF6-1.pdf

   [RFC7432] Sajassi et al., "BGP MPLS Based Ethernet VPN", February
   2015.

   [RFC7623] Sajassi et al., "Provider Backbone Bridging Combined with
   Ethernet VPN (PBB-EVPN)", September 2015.

   [RFC7385] Andersson et al., "IANA Registry for P-Multicast Service
   Interface (PMSI) Tunnel Type Code Points", October 2014.

   [RFC7153] Rosen et al., "IANA Registries for BGP Extended
   Communities", March 2014.

   [RFC6514] Aggarwal et al., "BGP Encodings and Procedures for
   Multicast in MPLS/BGP IP VPNs", February 2012.

   [RFC4360] Sangli et al., "BGP Extended Communities Attribute",
   February 2006.

9.2 Informative References

   [RFC3032] Rosen et al., "MPLS Label Stack Encoding", January 2001.

   [RFC7796] Jiang et al., "Ethernet-Tree (E-Tree) Support in Virtual
   Private LAN Service (VPLS)", March 2016.

   [EVPN-IRB] Sajassi et al., "Integrated Routing and Bridging in
   EVPN", draft-ietf-bess-evpn-inter-subnet-forwarding-03, February 8,
   2017.

   [802.1ah] IEEE, "IEEE Standard for Local and metropolitan area
   networks - Media Access Control (MAC) Bridges and Virtual Bridged
   Local Area Networks", Clauses 25 and 26, IEEE Std 802.1Q, DOI
   10.1109/IEEESTD.2011.6009146.

Appendix-A

   When two MAC-VRFs (two bridge tables per VLAN) are used for an
   E-Tree service (one for Root ACs and another for Leaf ACs) on a given
   PE, then the following complications in the data-plane path can
   result.

   Maintaining two MAC-VRFs (two bridge tables) per VLAN (when both Leaf
   and Root ACs exist for that VLAN) would require either performing two
   lookups per MAC address in each direction in case of a miss, or
   duplicating many MAC addresses between the two bridge tables
   belonging to the same VLAN (same E-Tree instance). Unless two lookups
   are made, duplication of MAC addresses would be needed for both
   locally learned and remotely learned MAC addresses. Locally learned
   MAC addresses from Leaf ACs need to be duplicated onto the Root
   bridge table, and locally learned MAC addresses from Root ACs need to
   be duplicated onto the Leaf bridge table. Remotely learned MAC
   addresses from Root ACs need to be copied onto both Root and Leaf
   bridge tables. Because of the potential inefficiencies associated
   with a data-plane implementation of additional MAC lookups or
   duplication of MAC entries, this option is not believed to be
   implementable without data-plane performance inefficiencies on some
   platforms, and thus this document introduces the coloring described
   in section 2.2 and detailed in section 3.1.

Authors' Addresses

   Ali Sajassi
   Cisco
   Email: sajassi@cisco.com

   Samer Salam
   Cisco
   Email: ssalam@cisco.com

   John Drake
   Juniper
   Email: jdrake@juniper.net

   Jim Uttaro
   AT&T
   Email: ju1738@att.com

   Sami Boutros
   VMware
   Email: sboutros@vmware.com

   Jorge Rabadan
   Nokia
   Email: jorge.rabadan@nokia.com