BESS Workgroup                                           A. Sajassi, Ed.
INTERNET-DRAFT                                                  S. Salam
Intended Status: Standards Track                                   Cisco
Updates: 7385                                                   J. Drake
                                                                 Juniper
                                                               J. Uttaro
                                                                     ATT
                                                              S. Boutros
                                                                  VMware
                                                              J. Rabadan
                                                                   Nokia

Expires: February 28, 2018                               August 28, 2017


                   E-TREE Support in EVPN & PBB-EVPN
                     draft-ietf-bess-evpn-etree-13

Abstract

   The Metro Ethernet Forum (MEF) has defined a rooted-multipoint
   Ethernet service known as Ethernet Tree (E-Tree).
   A solution
   framework for supporting this service in MPLS networks is described
   in RFC7387 ("A Framework for Ethernet-Tree (E-Tree) Service over a
   Multiprotocol Label Switching (MPLS) Network"). This document
   discusses how those functional requirements can be met with a
   solution based on RFC7432, BGP MPLS Based Ethernet VPN (EVPN), with
   some extensions and how such a solution can offer a more efficient
   implementation of these functions than that of RFC7796, E-Tree
   Support in Virtual Private LAN Service (VPLS). This document makes
   use of the most significant bit of the "Tunnel Type" field (in PMSI
   Tunnel Attribute) governed by the IANA registry created by RFC7385,
   and hence updates RFC7385 accordingly.

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/1id-abstracts.html

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

Copyright and License Notice

   Copyright (c) 2017 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.
   Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1  Introduction
     1.1  Specification of Requirements
     1.2  Terminology
   2  E-Tree Scenarios
     2.1  Scenario 1: Leaf or Root Site(s) per PE
     2.2  Scenario 2: Leaf or Root Site(s) per AC
     2.3  Scenario 3: Leaf or Root Site(s) per MAC Address
   3  Operation for EVPN
     3.1  Known Unicast Traffic
     3.2  Broadcast, Unknown, and Multicast (BUM) Traffic
       3.2.1  BUM Traffic Originated from a Single-homed Site on a
              Leaf AC
       3.2.2  BUM Traffic Originated from a Single-homed Site on a
              Root AC
       3.2.3  BUM Traffic Originated from a Multi-homed Site on a
              Leaf AC
       3.2.4  BUM Traffic Originated from a Multi-homed Site on a
              Root AC
     3.3  E-Tree Traffic Flows for EVPN
       3.3.1  E-Tree with MAC Learning
       3.3.2  E-Tree without MAC Learning
   4  Operation for PBB-EVPN
     4.1  Known Unicast Traffic
     4.2  Broadcast, Unknown, and Multicast (BUM) Traffic
     4.3  E-Tree without MAC Learning
   5  BGP Encoding
     5.1  E-Tree Extended Community
     5.2  PMSI Tunnel Attribute
   6  Acknowledgement
   7  Security Considerations
   8  IANA Considerations
     8.1  Considerations for PMSI Tunnel Types
   9  References
     9.1  Normative References
     9.2  Informative References
   Appendix-A
   Contributors
   Authors' Addresses

1  Introduction

   The Metro Ethernet Forum (MEF) has defined a rooted-multipoint
   Ethernet service known as Ethernet Tree (E-Tree) [MEF6.1]. In an
   E-Tree service, a customer site that is typically represented by an
   Attachment Circuit (AC) (e.g., an Ethernet tag but may also be
   represented by a MAC address) is labeled as either a Root or a Leaf
   site. Root sites can communicate with all other customer sites (both
   Root and Leaf sites). However, Leaf sites can communicate with Root
   sites but not with other Leaf sites. In this document, unless
   explicitly mentioned otherwise, a site is always represented by an
   AC.

   [RFC7387] describes a solution framework for supporting E-Tree
   service in MPLS networks.
   That document identifies the functional
   components of an overall solution to emulate E-Tree services in MPLS
   networks in addition to multipoint-to-multipoint Ethernet LAN (E-LAN)
   services specified in [RFC7432] and [RFC7623].

   [RFC7432] defines EVPN, a solution for multipoint L2VPN services with
   advanced multi-homing capabilities, using BGP for distributing
   customer/client MAC address reachability information over the
   MPLS/IP network. [RFC7623] combines the functionality of EVPN with
   [802.1ah] Provider Backbone Bridging (PBB) for MAC address
   scalability.

   This document discusses how the functional requirements for E-Tree
   service can be met with a solution based on (PBB-)EVPN (i.e.,
   [RFC7432] and [RFC7623]) with some extensions and how such a solution
   can offer a more efficient implementation of these functions than
   that of RFC7796, E-Tree Support in Virtual Private LAN Service
   (VPLS). Since this document specifies a solution based on [RFC7432],
   readers are expected to be familiar with [RFC7432] as a
   prerequisite. This document makes use of the most significant bit of
   the "Tunnel Type" field (in PMSI Tunnel Attribute) governed by the
   IANA registry created by RFC7385, and hence updates RFC7385
   accordingly. Section 2 discusses E-Tree scenarios. Sections 3 and 4
   describe E-Tree solutions for EVPN and PBB-EVPN respectively, and
   section 5 covers BGP encoding for E-Tree solutions.

1.1  Specification of Requirements

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [KEYWORDS].

1.2  Terminology

   Broadcast Domain: In a bridged network, the broadcast domain
   corresponds to a Virtual LAN (VLAN), where a VLAN is typically
   represented by a single VLAN ID (VID) but can be represented by
   several VIDs where Shared VLAN Learning (SVL) is used per [802.1Q].

   Bridge Table: An instantiation of a broadcast domain on a MAC-VRF.

   CE: Customer Edge device, e.g., a host, router, or switch.

   EVI: An EVPN instance spanning the Provider Edge (PE) devices
   participating in that EVPN.

   MAC-VRF: A Virtual Routing and Forwarding table for Media Access
   Control (MAC) addresses on a PE.

   Ethernet Segment (ES): When a customer site (device or network) is
   connected to one or more PEs via a set of Ethernet links, then that
   set of links is referred to as an 'Ethernet segment'.

   Ethernet Segment Identifier (ESI): A unique non-zero identifier that
   identifies an Ethernet segment is called an 'Ethernet Segment
   Identifier'.

   Ethernet Tag: An Ethernet tag identifies a particular broadcast
   domain, e.g., a VLAN. An EVPN instance consists of one or more
   broadcast domains.

   P2MP: Point to Multipoint.

   PE: Provider Edge device.

2  E-Tree Scenarios

   This document categorizes E-Tree scenarios into the following three
   scenarios, depending on the nature of the Root/Leaf site association:

   - Either Leaf or Root site(s) per PE

   - Either Leaf or Root site(s) per Attachment Circuit (AC)

   - Either Leaf or Root site(s) per MAC address

2.1  Scenario 1: Leaf or Root Site(s) per PE

   In this scenario, a PE may receive traffic from either Root ACs or
   Leaf ACs for a given MAC-VRF/bridge table, but not both. In other
   words, a given EVPN Instance (EVI) on a Provider Edge (PE) device is
   either associated with root(s) or leaf(s). The PE may have both Root
   and Leaf ACs albeit for different EVIs.

                  +---------+            +---------+
                  |   PE1   |            |   PE2   |
   +---+          |  +---+  |  +------+  |  +---+  |            +---+
   |CE1+---AC1----+--+   |  |  | MPLS |  |  |   +--+----AC2-----+CE2|
   +---+  (Root)  |  |MAC|  |  |  /IP |  |  |MAC|  |   (Leaf)   +---+
                  |  |VRF|  |  |      |  |  |VRF|  |
                  |  |   |  |  |      |  |  |   |  |            +---+
                  |  |   |  |  |      |  |  |   +--+----AC3-----+CE3|
                  |  +---+  |  +------+  |  +---+  |   (Leaf)   +---+
                  +---------+            +---------+

                         Figure 1: Scenario 1

   In this scenario, tailored BGP Route Target (RT) import/export
   policies among the PEs belonging to the same EVI can be used to
   prevent communication among Leaf PEs. To prevent communication
   among Leaf ACs connected to the same PE and belonging
   to the same EVI, split-horizon filtering is used to block traffic
   from one Leaf AC to another Leaf AC on a MAC-VRF for a given E-Tree
   EVI. The purpose of this topology constraint is to avoid having PEs
   with only Leaf sites importing and processing BGP MAC routes from
   each other. To support such a topology constraint in EVPN, two BGP
   Route-Targets (RTs) are used for every EVPN Instance (EVI): one RT is
   associated with the Root sites (Root ACs) and the other is associated
   with the Leaf sites (Leaf ACs). On a per EVI basis, every PE exports
   the single RT associated with its type of site(s). Furthermore, a PE
   with Root site(s) imports both Root and Leaf RTs, whereas a PE with
   Leaf site(s) only imports the Root RT.

   For this scenario, if it is desired to use only a single RT per EVI
   (just like E-LAN services in [RFC7432]), then approach B in
   scenario 2 (described below) needs to be used.

2.2  Scenario 2: Leaf or Root Site(s) per AC

   In this scenario, a PE can receive traffic from both Root ACs and
   Leaf ACs for a given EVI. In other words, a given EVI on a PE can be
   associated with both root(s) and leaf(s).

                    +---------+            +---------+
                    |   PE1   |            |   PE2   |
   +---+            |  +---+  |  +------+  |  +---+  |        +---+
   |CE1+-----AC1----+--+   |  |  |      |  |  |   +--+---AC2--+CE2|
   +---+   (Leaf)   |  |MAC|  |  | MPLS |  |  |MAC|  | (Leaf) +---+
                    |  |VRF|  |  |  /IP |  |  |VRF|  |
                    |  |   |  |  |      |  |  |   |  |        +---+
                    |  |   |  |  |      |  |  |   +--+---AC3--+CE3|
                    |  +---+  |  +------+  |  +---+  | (Root) +---+
                    +---------+            +---------+

                         Figure 2: Scenario 2

   In this scenario, just like the previous scenario (in section 2.1),
   two Route Targets (one for Root and another for Leaf) can be used.
   However, the difference is that on a PE with both Root and Leaf ACs,
   all remote MAC routes are imported and thus there needs to be a way
   to differentiate remote MAC routes associated with Leaf ACs versus
   the ones associated with Root ACs in order to apply the proper
   ingress filtering.

   In order to recognize the association of a destination MAC address to
   a Leaf or Root AC and thus support ingress filtering on the ingress
   PE with both Leaf and Root ACs, MAC addresses need to be colored with
   a Root or Leaf indication before advertisement to other PEs. There
   are two approaches for such coloring:

   A) To always use two RTs (one to designate Leaf RT and another for
   Root RT)

   B) To allow a single RT to be used per EVI just like [RFC7432] and
   thus color MAC addresses via a "color" flag in a new extended
   community as detailed in section 5.1.

   Approach (A) would require the same data plane enhancements as
   approach (B) if MAC-VRF/bridge tables used per broadcast domain
   (e.g., VLAN) are to remain consistent with [RFC7432] (section 6). In
   order to avoid data-plane enhancements for approach (A), multiple
   bridge tables per VLAN may be considered; however, this has major
   drawbacks as described in appendix-A and thus is not recommended.

   Given that both approaches (A) and (B) would require exactly the
   same data-plane enhancements, approach (B) is chosen here in order
   to allow for RT usage consistent with baseline EVPN [RFC7432] and
   for better generality. It should be noted that if one wants to use
   RT constraint in order to avoid MAC advertisements associated with a
   Leaf AC to PEs with only Leaf ACs, then two RTs (one for Root and
   another for Leaf) can still be used with approach (B); however, in
   such applications Leaf/Root RTs will be used to constrain MAC
   advertisements and they are not used to color the MAC routes for
   ingress filtering - i.e., in approach (B), the coloring is always
   done via the new extended community.

   For this scenario, if for a given EVI, a significant number of PEs
   have both Leaf and Root sites attached, even though they may start as
   Root-only or Leaf-only PEs, then a single RT per EVI should be used.
   The reason for this recommendation is to alleviate the configuration
   overhead associated with using two RTs per EVI at the expense of
   having some unwanted MAC addresses on the Leaf-only PEs.

2.3  Scenario 3: Leaf or Root Site(s) per MAC Address

   In this scenario, a customer Root or Leaf site is represented by a
   MAC address and a PE may receive traffic from both Root AND Leaf
   sites on a single Attachment Circuit (AC) of an EVI. This scenario is
   not covered in either [RFC7387] or [MEF6.1]; however, it is covered
   in this document for the sake of completeness. In this scenario,
   since an AC carries traffic from both Root and Leaf sites, the
   granularity at which Root or Leaf sites are identified is per
   MAC address.
   This scenario is considered in this document for EVPN
   service with only known unicast traffic because the Designated
   Forwarder (DF) filtering per [RFC7432] would not be compatible with
   the required egress filtering - i.e., Broadcast, Unknown, and
   Multicast (BUM) traffic is not supported in this scenario and it is
   dropped by the ingress PE.

   For this scenario, approach B in scenario 2 (described above) is
   used in order to allow for single RT usage by service providers.

                    +---------+            +---------+
                    |   PE1   |            |   PE2   |
   +---+            |  +---+  |  +------+  |  +---+  |            +---+
   |CE1+-----AC1----+--+   |  |  |      |  |  |   +--+-----AC2----+CE2|
   +---+   (Root)   |  | E |  |  | MPLS |  |  | E |  | (Leaf/Root)+---+
                    |  | V |  |  |  /IP |  |  | V |  |
                    |  | I |  |  |      |  |  | I |  |            +---+
                    |  |   |  |  |      |  |  |   +--+-----AC3----+CE3|
                    |  +---+  |  +------+  |  +---+  |   (Leaf)   +---+
                    +---------+            +---------+

                         Figure 3: Scenario 3

   In conclusion, approach B in scenario 2 is the recommended
   approach across all three scenarios above, and the corresponding
   solution is detailed in the following sections.

3  Operation for EVPN

   [RFC7432] defines the notion of an Ethernet Segment Identifier (ESI)
   MPLS label used for split-horizon filtering of BUM traffic at the
   egress PE. Such egress filtering capabilities can be leveraged in
   the provision of E-Tree services, as will be seen shortly for BUM
   traffic. For known unicast traffic, additional extensions to
   [RFC7432] are needed (i.e., a new BGP Extended Community for leaf
   indication described in section 5.1) in order to enable ingress
   filtering as described in detail in the following sections.

3.1  Known Unicast Traffic

   Since in EVPN, MAC learning is performed in the control plane via
   advertisement of BGP routes, the filtering needed by E-Tree service
   for known unicast traffic can be performed at the ingress PE, thus
   providing very efficient filtering and avoiding sending known unicast
   traffic over the MPLS/IP core to be filtered at the egress PE as done
   in traditional E-Tree solutions - i.e., E-Tree for VPLS [RFC7796].

   To provide such ingress filtering for known unicast traffic, a PE
   MUST indicate to other PEs what kind of sites (root or leaf) its MAC
   addresses are associated with. This is done by advertising a Leaf
   indication flag (via an Extended Community) along with each of its
   MAC/IP Advertisement routes learned from a Leaf site. The lack of
   such a flag indicates that the MAC address is associated with a root
   site. This scheme applies to all scenarios described in section 2.

   Tagging MAC addresses with a Leaf indication enables remote PEs to
   perform ingress filtering for known unicast traffic - i.e., on the
   ingress PE, the MAC destination address lookup yields, in addition to
   the forwarding adjacency, a flag which indicates whether the target
   MAC is associated with a Leaf site or not. The ingress PE cross-
   checks this flag with the status of the originating AC, and if both
   are leafs, then the packet is not forwarded.

   In situations where MAC moves are allowed among Leaf and Root sites
   (e.g., non-static MAC), PEs can receive multiple MAC/IP
   Advertisement routes for the same MAC address with different
   Leaf/Root indications (and possibly different ESIs for multi-homing
   scenarios). In such situations, MAC mobility procedures (section 15
   of [RFC7432]) take precedence to first identify the location of the
   MAC before associating that MAC with a Root or a Leaf site.
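
   The ingress cross-check described in this section can be illustrated
   with a minimal Python sketch. It is not part of the specification:
   the MacEntry type, the dictionary used as a MAC-VRF, and the returned
   decision strings are hypothetical names chosen for illustration only.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class MacEntry:
    """Hypothetical MAC-VRF entry learned from a MAC/IP Advertisement route."""
    next_hop: str   # forwarding adjacency (illustrative: remote PE name)
    is_leaf: bool   # True if the route carried the Leaf indication flag


def ingress_decision(mac_vrf: Dict[str, MacEntry],
                     dst_mac: str,
                     src_ac_is_leaf: bool) -> str:
    """Apply the E-Tree ingress check for a frame received on a local AC."""
    entry = mac_vrf.get(dst_mac)
    if entry is None:
        # Not known unicast; such frames follow the BUM procedures instead.
        return "unknown-unicast"
    if src_ac_is_leaf and entry.is_leaf:
        # Both ends are leafs: the packet is not forwarded.
        return "drop"
    return "forward:" + entry.next_hop


# Illustrative table: one MAC behind a Leaf site, one behind a Root site.
mac_vrf = {
    "00:00:5e:00:53:01": MacEntry(next_hop="PE2", is_leaf=True),
    "00:00:5e:00:53:02": MacEntry(next_hop="PE2", is_leaf=False),
}
```

   With this table, a frame from a Leaf AC toward 00:00:5e:00:53:01 is
   dropped at the ingress PE, while the same frame from a Root AC is
   forwarded toward PE2.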

   To support the above ingress filtering functionality, a new E-Tree
   Extended Community with a Leaf indication flag is introduced (section
   5.1). This new Extended Community MUST be advertised with each MAC/IP
   Advertisement route learned from a Leaf site. Besides the MAC/IP
   Advertisement route, no other EVPN routes are required to carry this
   new extended community.

3.2  Broadcast, Unknown, and Multicast (BUM) Traffic

   In this specification, the support for filtering BUM (Broadcast,
   Unknown, and Multicast) traffic does not include ingress filtering
   because it is not possible to do so, due to the multi-destination
   nature of BUM traffic. As such, the solution relies on egress
   filtering. In order to apply the proper egress filtering, which
   varies based on whether a packet is sent from a Leaf AC or a root AC,
   the MPLS-encapsulated frames MUST be tagged with an indication when
   they originated from a Leaf AC - i.e., to be tagged with a Leaf label
   as specified in section 5.1. This Leaf label allows the disposition
   PE (e.g., egress PE) to perform the necessary egress filtering
   function in the data-plane, similar to the ESI label in [RFC7432].
   The allocation of the Leaf label is on a per PE basis (e.g.,
   independent of ESI and EVI) as described in the following sections.

   The Leaf label can be upstream assigned for P2MP LSP or downstream
   assigned for ingress replication tunnels. The main difference between
   a downstream and an upstream assigned Leaf label is that in the case
   of downstream assignment not all egress PE devices need to receive
   the label in MPLS encapsulated BUM packets, just like the ESI label
   for ingress replication procedures defined in [RFC7432].

   On the ingress PE, the PE needs to place all its Leaf ACs for a given
   bridge domain in a single split-horizon group in order to prevent
   intra-PE forwarding among its Leaf ACs.
   This intra-PE split-horizon
   filtering applies to BUM traffic as well as known-unicast traffic.

   There are four scenarios to consider as follows. In all these
   scenarios, the ingress PE imposes the right MPLS label associated
   with the originating Ethernet Segment (ES) depending on whether the
   Ethernet frame originated from a Root or a Leaf site on that Ethernet
   Segment (ESI label or Leaf label). The mechanism by which the PE
   identifies whether a given frame originated from a Root or a Leaf
   site on the segment is based on the AC identifier for that segment
   (e.g., Ethernet Tag of the frame for 802.1Q frames). Other mechanisms
   for identifying Root or Leaf sites, such as the use of the source MAC
   address of the received frame, are optional. The scenarios below are
   described in the context of Root/Leaf AC; however, they can be
   extended to Root/Leaf MAC address if needed.

3.2.1  BUM Traffic Originated from a Single-homed Site on a Leaf AC

   In this scenario, the ingress PE adds a Leaf label advertised using
   the E-Tree Extended Community (section 5.1) indicating a Leaf site.
   This Leaf label, used for single-homing scenarios, is not on a per ES
   basis but rather on a per PE basis - i.e., a single Leaf MPLS label
   is used for all single-homed ES's on that PE. This Leaf label is
   advertised to other PE devices, using the E-Tree Extended Community
   (section 5.1) along with an Ethernet A-D per ES route with ESI of
   zero and a set of Route Targets (RTs) corresponding to all EVIs on
   the PE where each EVI has at least one Leaf site. Multiple Ethernet
   A-D per ES routes will need to be advertised if the number of Route
   Targets (RTs) that need to be carried exceeds the limit on a single
   route per [RFC7432]. The ESI for the Ethernet A-D per ES route is set
   to zero to indicate single-homed sites.

   When a PE receives this special Leaf label in the data path, it
   blocks the packet if the destination AC is of type Leaf; otherwise,
   it forwards the packet.

3.2.2  BUM Traffic Originated from a Single-homed Site on a Root AC

   In this scenario, the ingress PE does not add any ESI label or Leaf
   label and it operates per [RFC7432] procedures.

3.2.3  BUM Traffic Originated from a Multi-homed Site on a Leaf AC

   In this scenario, it is assumed that while different ACs (VLANs) on
   the same ES could have different Root/Leaf designations (some being
   roots and some being leafs), the same VLAN does have the same
   Root/Leaf designation on all PEs on the same ES. Furthermore, it is
   assumed that there is no forwarding among subnets - i.e., the service
   is EVPN L2 and not EVPN IRB [EVPN-IRB]. IRB use cases described in
   [EVPN-IRB] are outside the scope of this document.

   In this scenario, if a multicast or broadcast packet is originated
   from a Leaf AC, then it only needs to carry the Leaf label described
   in section 3.2.1. This label is sufficient to provide the necessary
   egress filtering, preventing BUM traffic from being sent to Leaf ACs,
   including the Leaf AC on the same Ethernet Segment.

3.2.4  BUM Traffic Originated from a Multi-homed Site on a Root AC

   In this scenario, both the ingress and egress PE devices follow the
   procedure defined in [RFC7432] for adding and/or processing an ESI
   MPLS label - i.e., existing procedures for BUM traffic in [RFC7432]
   are sufficient and there is no need to add a Leaf label.
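
   The four BUM cases of sections 3.2.1 through 3.2.4 can be summarized
   in a short Python sketch. This is only an illustration under stated
   assumptions: the function names, the label values, and the set used
   to hold Leaf labels are hypothetical, and the ESI-label split-horizon
   check of [RFC7432] at the egress PE is deliberately left out.

```python
from typing import Optional, Set


def bum_label_to_impose(src_ac_is_leaf: bool,
                        multi_homed: bool,
                        leaf_label: int,
                        esi_label: Optional[int]) -> Optional[int]:
    """Pick the extra MPLS label the ingress PE imposes on a BUM frame."""
    if src_ac_is_leaf:
        # Sections 3.2.1 and 3.2.3: the per-PE Leaf label is used for
        # both single-homed and multi-homed Leaf ACs.
        return leaf_label
    if multi_homed:
        # Section 3.2.4: Root AC on a multi-homed ES, plain RFC 7432 ESI label.
        return esi_label
    # Section 3.2.2: single-homed Root AC, no ESI label and no Leaf label.
    return None


def egress_forwards(received_label: Optional[int],
                    leaf_labels: Set[int],
                    dest_ac_is_leaf: bool) -> bool:
    """Egress check: block Leaf-originated BUM frames toward Leaf ACs."""
    from_leaf = received_label is not None and received_label in leaf_labels
    return not (from_leaf and dest_ac_is_leaf)


# Illustrative values only; real labels come from the E-Tree Extended
# Community and from RFC 7432 ESI label advertisement.
LEAF_LABEL = 3001
ESI_LABEL = 2001
```

   For example, a frame arriving with LEAF_LABEL is blocked toward a
   Leaf AC but delivered to a Root AC on the same egress PE.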

3.3  E-Tree Traffic Flows for EVPN

   Per [RFC7387], a generic E-Tree service supports all of the following
   traffic flows:

        - Known unicast traffic from Root to Roots & Leafs
        - Known unicast traffic from Leaf to Root
        - BUM traffic from Root to Roots & Leafs
        - BUM traffic from Leaf to Roots

   A particular E-Tree service may need to support all of the above
   types of flows or only a select subset, depending on the target
   application. In the case where only multicast and broadcast flows
   need to be supported, the L2VPN PEs can avoid performing any MAC
   learning function.

   The following subsections describe the operation of EVPN to
   support E-Tree service with and without MAC learning.

3.3.1  E-Tree with MAC Learning

   The PEs implementing an E-Tree service must perform MAC learning when
   unicast traffic flows must be supported among Root and Leaf sites. In
   this case, the PE(s) with Root sites perform MAC learning in the
   data-path over the Ethernet Segments, and advertise reachability in
   EVPN MAC/IP Advertisement Routes. These routes will be imported by
   all PEs for that EVI (i.e., PEs that have Leaf sites as well as PEs
   that have Root sites). Similarly, the PEs with Leaf sites perform MAC
   learning in the data-path over their Ethernet Segments, and advertise
   reachability in EVPN MAC/IP Advertisement Routes. For scenarios where
   two different RTs are used per EVI (one to designate Root site and
   another to designate Leaf site), the MAC/IP Advertisement routes
   advertised by PEs with Leaf sites are imported only by PEs with at
   least one Root site in the EVI - i.e., a PE with only Leaf sites will
   not import these routes. PEs with Root and/or Leaf sites may use the
   Ethernet A-D routes for aliasing (in the case of multi-homed
   segments) and for mass MAC withdrawal per [RFC7432].
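
   The two-RT import/export rule referred to above (and introduced in
   section 2.1) can be sketched in Python as follows. The sketch covers
   only PEs that are Root-only or Leaf-only for the EVI; the function
   names and the RT strings are hypothetical and chosen for
   illustration.

```python
from typing import Dict, Iterable, Set


def rt_policy(site_type: str, root_rt: str, leaf_rt: str) -> Dict[str, Set[str]]:
    """Per-EVI RT policy for a PE that is Root-only or Leaf-only."""
    if site_type == "root":
        # A Root PE exports the Root RT and imports both RTs.
        return {"export": {root_rt}, "import": {root_rt, leaf_rt}}
    if site_type == "leaf":
        # A Leaf PE exports the Leaf RT and imports only the Root RT.
        return {"export": {leaf_rt}, "import": {root_rt}}
    raise ValueError("site_type must be 'root' or 'leaf'")


def imports_route(policy: Dict[str, Set[str]], route_rts: Iterable[str]) -> bool:
    """A route is imported when it carries at least one imported RT."""
    return bool(policy["import"] & set(route_rts))


# Illustrative RT values for one EVI.
ROOT_RT, LEAF_RT = "65000:100", "65000:101"
root_pe = rt_policy("root", ROOT_RT, LEAF_RT)
leaf_pe = rt_policy("leaf", ROOT_RT, LEAF_RT)
```

   Under this policy, a route exported by one Leaf-only PE is not
   imported by another Leaf-only PE, which is the topology constraint
   the two RTs are meant to enforce.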

   To support multicast/broadcast from Root to Leaf sites, either a P2MP
   tree rooted at the PE(s) with the Root site(s) (e.g., Root PEs) or
   ingress replication can be used (section 16 of [RFC7432]). The
   multicast tunnels are set up through the exchange of the EVPN
   Inclusive Multicast route, as defined in [RFC7432].

   To support multicast/broadcast from Leaf to Root sites, either
   ingress replication tunnels from each Leaf PE or a P2MP tree rooted
   at each Leaf PE can be used. The following two paragraphs describe
   when each of these tunneling schemes can be used and how to signal
   them.

   When there are only a few Root PEs with a small amount of
   multicast/broadcast traffic from Leaf PEs toward Root PEs, then
   ingress replication tunnels from Leaf PEs toward Root PEs should be
   sufficient. Therefore, if a Root PE needs to support a P2MP tunnel in
   the transmit direction from itself to Leaf PEs and at the same time
   it wants to support ingress-replication tunnels in the receive
   direction, the Root PE can signal it efficiently by using a new
   composite tunnel type defined in section 5.2. This new composite
   tunnel type is advertised by the Root PE to simultaneously indicate a
   P2MP tunnel in the transmit direction and an ingress-replication
   tunnel in the receive direction for the BUM traffic.

   If the number of Root PEs is large, P2MP tunnels (e.g., mLDP or RSVP-
   TE) originated at the Leaf PEs may be used and thus there will be no
   need to use the modified PMSI tunnel attribute and the composite
   tunnel type values defined in section 5.2.

3.3.2  E-Tree without MAC Learning

   The PEs implementing an E-Tree service need not perform MAC learning
   when the traffic flows between Root and Leaf sites are mainly
   multicast or broadcast. In this case, the PEs do not exchange EVPN
   MAC/IP Advertisement Routes. Instead, the Inclusive Multicast
   Ethernet Tag route is used to support BUM traffic.

   The fields of this route are populated per the procedures defined in
   [RFC7432], and the multicast tunnel setup criteria are as described
   in the previous section.

   Just as in the previous section, if there are only a few Root PEs
   and thus ingress replication is desired from Leaf PEs to these
   Root PEs, then the modified PMSI attribute and the composite tunnel
   type values defined in section 5.2 should be used.

4  Operation for PBB-EVPN

   In PBB-EVPN, the PE advertises a Root/Leaf indication along with each
   B-MAC Advertisement route, to indicate whether the associated B-MAC
   address corresponds to a Root or a Leaf site. Just like the EVPN
   case, the new E-Tree Extended Community defined in section 5.1 is
   advertised with each EVPN MAC/IP Advertisement route.

   In the case where a multi-homed Ethernet Segment has both Root and
   Leaf sites attached, two B-MAC addresses are advertised: one B-MAC
   address is per ES as specified in [RFC7623] and implicitly denotes
   Root, and the other B-MAC address is per PE and explicitly denotes
   Leaf. The former B-MAC address is not advertised with the E-Tree
   extended community but the latter B-MAC denoting Leaf is advertised
   with the new E-Tree extended community where the "Leaf-indication"
   flag is set. In such multi-homing scenarios where an Ethernet Segment
   has both Root and Leaf ACs, it is assumed that while different ACs
   (VLANs) on the same ES could have different Root/Leaf designations
   (some being Roots and some being Leafs), the same VLAN does have the
   same Root/Leaf designation on all PEs on the same ES. Furthermore, it
   is assumed that there is no forwarding among subnets - i.e., the
   service is L2 and not IRB. The IRB use case is outside the scope of
   this document.

   The ingress PE uses the right B-MAC source address depending on
   whether the Ethernet frame originated from the Root or Leaf AC on
   that Ethernet Segment.
   The mechanism by which the PE identifies
   whether a given frame originated from a Root or Leaf site on the
   segment is based on the Ethernet Tag associated with the frame. Other
   mechanisms of identification, beyond the Ethernet Tag, are outside
   the scope of this document.

   Furthermore, a PE advertises two special global B-MAC addresses: one
   for Root and another for Leaf, and tags the Leaf one as such in the
   MAC Advertisement route. These B-MAC addresses are used as source
   addresses for traffic originating from single-homed segments. The B-
   MAC address used for indicating Leaf sites can be the same for both
   single-homed and multi-homed segments.

4.1  Known Unicast Traffic

   For known unicast traffic, the PEs perform ingress filtering: On the
   ingress PE, the C-MAC destination address lookup yields, in addition
   to the target B-MAC address and forwarding adjacency, a flag which
   indicates whether the target B-MAC is associated with a Root or a
   Leaf site. The ingress PE also checks the status of the originating
   site, and if both are Leafs, then the packet is not forwarded.

4.2  Broadcast, Unknown, and Multicast (BUM) Traffic

   For BUM traffic, the PEs must perform egress filtering. When a PE
   receives an EVPN MAC/IP Advertisement route (which will be used as a
   source B-MAC for BUM traffic), it updates its egress filtering (based
   on the source B-MAC address), as follows:

   - If the EVPN MAC/IP Advertisement route indicates that the
   advertised B-MAC is a Leaf, and the local Ethernet Segment is a Leaf
   as well, then the source B-MAC address is added to its B-MAC list
   used for egress filtering - i.e., to block traffic from that B-MAC
   address.

   - Otherwise, the B-MAC filtering list is not updated.

   - If the EVPN MAC/IP Advertisement route indicates that the
   advertised B-MAC has changed its designation from a Leaf to a Root
   and the local Ethernet Segment is a Leaf, then the source B-MAC
   address is removed from the B-MAC list corresponding to the local
   Ethernet Segment used for egress filtering, i.e., to unblock traffic
   from that B-MAC address.

   When the egress PE receives the packet, it examines the B-MAC source
   address to check whether it should filter or forward the frame. Note
   that this uses the same filtering logic as baseline [RFC7623] for an
   ESI and does not require any additional flags in the data-plane.

   Just as in section 3.2, the PE places all Leaf Ethernet Segments of a
   given bridge domain in a single split-horizon group in order to
   prevent intra-PE forwarding among Leaf segments. This split-horizon
   function applies to BUM traffic as well as known unicast traffic.

4.3  E-Tree without MAC Learning

   In scenarios where the traffic of interest is only multicast and/or
   broadcast, the PEs implementing an E-Tree service do not need to do
   any MAC learning. In such scenarios, the filtering must be performed
   on egress PEs. For PBB-EVPN, the handling of such traffic is per
   section 4.2, without the need for C-MAC learning (in the data-plane)
   in the I-component (C-bridge table) of PBB-EVPN PEs (at both ingress
   and egress PEs).

5  BGP Encoding

   This document defines a new BGP Extended Community for EVPN.

5.1  E-Tree Extended Community

   This Extended Community is a new transitive Extended Community
   [RFC4360] having a Type field value of 0x06 (EVPN) and the Sub-Type
   0x05. It is used for Leaf indication of known unicast and BUM
   traffic. It indicates that the frame originated from a Leaf site.

   The E-Tree Extended Community is encoded as an 8-octet value as
   follows:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Type=0x06     | Sub-Type=0x05 | Flags(1 Octet)|  Reserved=0   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Reserved=0   |           Leaf Label                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                Figure 4: E-Tree Extended Community

   The Flags field has the following format:

     0 1 2 3 4 5 6 7
    +-+-+-+-+-+-+-+-+
    |   MBZ       |L|   (MBZ = Must Be Zero)
    +-+-+-+-+-+-+-+-+

   This document defines the following flags:

      + Leaf-Indication (L)

   A value of one indicates a Leaf AC/Site. The rest of the flag bits
   are reserved and should be set to zero.

   When this Extended Community (EC) is advertised along with a MAC/IP
   Advertisement route (for known unicast traffic) per section 3.1, the
   Leaf-Indication flag MUST be set to one and the Leaf Label SHOULD be
   set to zero. The value of the 20-bit MPLS label is encoded in the
   high-order 20 bits of the Leaf Label field. The receiving PE SHOULD
   ignore the Leaf Label and process only the Leaf-Indication flag. A
   value of zero for the Leaf-Indication flag is invalid when sent along
   with a MAC/IP Advertisement route, and an error should be logged.

   When this EC is advertised along with an Ethernet A-D per ES route
   (with an ESI of zero) for BUM traffic to enable egress filtering on
   disposition PEs per sections 3.2.1 and 3.2.3, the Leaf Label MUST be
   set to a valid MPLS label (i.e., a non-reserved assigned MPLS label
   [RFC3032]) and the Leaf-Indication flag SHOULD be set to zero. The
   receiving PE SHOULD ignore the Leaf-Indication flag. An invalid MPLS
   label, when sent along with the Ethernet A-D per ES route, should be
   ignored and logged as an error.
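
   The encoding rules above can be illustrated with a short sketch. It
   is not part of the specification; the function names
   encode_etree_ec and decode_etree_ec, the constant names, and the
   label value 1000 are hypothetical and chosen only for illustration.
   The sketch assumes the usual IETF bit numbering, in which bit 0 is
   the most significant bit of an octet, so that the L flag (bit 7 of
   the Flags octet) has the value 0x01.

```python
from typing import Tuple

EVPN_TYPE = 0x06            # Type field value for EVPN
ETREE_SUB_TYPE = 0x05       # Sub-Type of the E-Tree Extended Community
LEAF_INDICATION = 0x01      # L flag: bit 7 of the Flags octet (assumed LSB)
MAX_LABEL = (1 << 20) - 1   # largest value that fits in a 20-bit MPLS label


def encode_etree_ec(leaf: bool, label: int = 0) -> bytes:
    """Build the 8-octet E-Tree Extended Community (hypothetical helper)."""
    if not 0 <= label <= MAX_LABEL:
        raise ValueError("MPLS label must fit in 20 bits")
    flags = LEAF_INDICATION if leaf else 0
    # The 20-bit label occupies the high-order 20 bits of the 3-octet field.
    leaf_label = (label << 4).to_bytes(3, "big")
    # Type, Sub-Type, Flags, two Reserved octets (zero), then Leaf Label.
    return bytes([EVPN_TYPE, ETREE_SUB_TYPE, flags, 0, 0]) + leaf_label


def decode_etree_ec(value: bytes) -> Tuple[bool, int]:
    """Return (leaf_indication, label); reserved bits are ignored."""
    if len(value) != 8:
        raise ValueError("an Extended Community is 8 octets long")
    if value[0] != EVPN_TYPE or value[1] != ETREE_SUB_TYPE:
        raise ValueError("not an E-Tree Extended Community")
    leaf = bool(value[2] & LEAF_INDICATION)
    label = int.from_bytes(value[5:8], "big") >> 4
    return leaf, label


# Known unicast case (MAC/IP Advertisement route): L = 1, Leaf Label = 0.
mac_ip_ec = encode_etree_ec(leaf=True)
# BUM case (Ethernet A-D per ES route): L = 0, Leaf Label = a valid label.
ad_es_ec = encode_etree_ec(leaf=False, label=1000)
print(mac_ip_ec.hex())   # 0605010000000000
print(ad_es_ec.hex())    # 0605000000003e80
```

   A receiver following the text above would, for a MAC/IP
   Advertisement route, use only the first element returned by
   decode_etree_ec, and for an Ethernet A-D per ES route, only the
   second.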

   The reserved bits SHOULD be set to zero by the transmitter and SHOULD
   be ignored by the receiver.

5.2  PMSI Tunnel Attribute

   [RFC6514] defines the PMSI Tunnel attribute, which is an optional
   transitive attribute with the following format:

      +---------------------------------+
      |  Flags (1 octet)                |
      +---------------------------------+
      |  Tunnel Type (1 octet)          |
      +---------------------------------+
      |  Ingress Replication MPLS Label |
      |  (3 octets)                     |
      +---------------------------------+
      |  Tunnel Identifier (variable)   |
      +---------------------------------+

                  Figure 5: PMSI Tunnel Attribute

   This document defines a new Composite tunnel type by introducing a
   new 'Composite Tunnel' bit in the Tunnel Type field and adding an
   MPLS label to the Tunnel Identifier field of the PMSI Tunnel
   attribute, as detailed below. This document uses all other remaining
   fields per their existing definitions. The Composite tunnel type is
   advertised by the Root PE to simultaneously indicate a
   non-ingress-replication tunnel (e.g., a P2MP tunnel) in the transmit
   direction and an ingress-replication tunnel in the receive direction
   for the BUM traffic.

   When a receiver ingress-replication label is needed, the high-order
   bit of the Tunnel Type field (Composite Tunnel bit) is set while the
   remaining low-order seven bits indicate the tunnel type as before
   (for the existing tunnel types). When this Composite Tunnel bit is
   set, the Tunnel Identifier field begins with a three-octet label,
   followed by the actual tunnel identifier for the transmit tunnel.
   PEs that don't understand the new meaning of the high-order bit would
   treat the tunnel type as an undefined tunnel type and would treat the
   PMSI Tunnel attribute as a malformed attribute [RFC6514]. That is why
   the Composite Tunnel bit is allocated in the Tunnel Type field rather
   than the Flags field.
   For the PEs that do understand the new meaning
   of the high-order bit, if ingress replication is desired when sending
   BUM traffic, the PE will use the label in the Tunnel Identifier field
   when sending its BUM traffic.

   Using the Composite Tunnel bit for Tunnel Types 0x00 'no tunnel
   information present' and 0x06 'Ingress Replication' is invalid. A PE
   that receives a PMSI Tunnel attribute with such information considers
   it malformed and SHOULD treat this Update as though all the routes
   contained in this Update had been withdrawn, per section 5 of
   [RFC6514].

6  Acknowledgement

   We would like to thank Eric Rosen, Jeffrey Zhang, Dennis Cai, and
   Antoni Przygienda for their valuable comments. The authors would also
   like to thank Thomas Morin for shepherding this document and
   providing valuable comments.

7  Security Considerations

   Since this document uses the EVPN constructs of [RFC7432] and
   [RFC7623], the same security considerations in these documents are
   also applicable here. Furthermore, this document provides an
   additional security check by allowing sites (or ACs) of an EVPN
   instance to be designated as "Root" or "Leaf" and preventing any
   traffic exchange among "Leaf" sites of that VPN through ingress
   filtering for known unicast traffic and egress filtering for BUM
   traffic.

8  IANA Considerations

   IANA has allocated value 5 in the "EVPN Extended Community Sub-Types"
   registry defined in [RFC7153] as follows:

      SUB-TYPE VALUE     NAME                        Reference

      0x05               E-Tree Extended Community   This document

   This document creates a one-octet registry called "E-Tree Flags".
   New registrations will be made through the "RFC Required" procedure
   defined in [RFC8126].
   Initial registrations are as follows:

      bit               Name                         Reference

      0-6               Unassigned
      7                 Leaf-Indication              This document

8.1  Considerations for PMSI Tunnel Types

   The "P-Multicast Service Interface Tunnel (PMSI Tunnel) Tunnel Types"
   registry in the "Border Gateway Protocol (BGP) Parameters" registry
   needs to be updated to reflect the use of the most significant bit as
   the "Composite Tunnel" bit (section 5.2).

   For this purpose, this document updates [RFC7385].

   The registry is to be updated by removing the entries for 0xFB-0xFE
   and 0xFF, and replacing them with:

      Value          Meaning                         Reference
      0x0C-0x7A      Unassigned
      0x7B-0x7E      Experimental                    this document
      0x7F           Reserved                        this document
      0x80-0xFA      Reserved for Composite tunnel   this document
      0xFB-0xFE      Experimental                    [RFC7385]
      0xFF           Reserved                        [RFC7385]

   The allocation policy for values 0x00 to 0x7A is IETF Review
   [RFC8126]. The range for experimental use is now 0x7B-0x7E, and
   values in this range are not to be assigned. The status of 0x7F may
   only be changed through Standards Action [RFC8126].

9  References

9.1  Normative References

   [KEYWORDS] Bradner, S., "Key words for use in RFCs to Indicate
   Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC8126] Cotton et al., "Guidelines for Writing an IANA
   Considerations Section in RFCs", June 2017.

   [RFC7387] Key et al., "A Framework for E-Tree Service over MPLS
   Network", October 2014.

   [MEF6.1] Metro Ethernet Forum, "Ethernet Services Definitions - Phase
   2", MEF 6.1, April 2008,
   https://mef.net/PDF_Documents/technical-specifications/MEF6-1.pdf

   [RFC7432] Sajassi et al., "BGP MPLS Based Ethernet VPN", February
   2015.

   [RFC7623] Sajassi et al., "Provider Backbone Bridging Combined with
   Ethernet VPN (PBB-EVPN)", September 2015.

   [RFC7385] Andersson et al., "IANA Registry for P-Multicast Service
   Interface (PMSI) Tunnel Type Code Points", October 2014.

   [RFC7153] Rosen et al., "IANA Registries for BGP Extended
   Communities", March 2014.

   [RFC6514] Aggarwal et al., "BGP Encodings and Procedures for
   Multicast in MPLS/BGP IP VPNs", February 2012.

   [RFC4360] Sangli et al., "BGP Extended Communities Attribute",
   February 2006.

9.2  Informative References

   [RFC3032] Rosen et al., "MPLS Label Stack Encoding", January 2001.

   [RFC7796] Jiang et al., "Ethernet-Tree (E-Tree) Support in Virtual
   Private LAN Service (VPLS)", March 2016.

   [EVPN-IRB] Sajassi et al., "Integrated Routing and Bridging in
   EVPN", draft-ietf-bess-evpn-inter-subnet-forwarding-03, February 8,
   2017.

   [802.1ah] IEEE, "IEEE Standard for Local and metropolitan area
   networks - Media Access Control (MAC) Bridges and Virtual Bridged
   Local Area Networks", Clauses 25 and 26, IEEE Std 802.1Q, DOI
   10.1109/IEEESTD.2011.6009146.

Appendix-A

   When two MAC-VRFs (two bridge tables per VLAN) are used for an
   E-Tree service (one for Root ACs and another for Leaf ACs) on a given
   PE, then the following complications in the data-plane path can
   result.

   Maintaining two MAC-VRFs (two bridge tables) per VLAN (when both Leaf
   and Root ACs exist for that VLAN) would require either that two
   lookups be performed per MAC address in each direction in case of a
   miss, or that many MAC addresses be duplicated between the two bridge
   tables belonging to the same VLAN (same E-Tree instance). Unless two
   lookups are made, duplication of MAC addresses would be needed for
   both locally learned and remotely learned MAC addresses. Locally
   learned MAC addresses from Leaf ACs need to be duplicated onto the
   Root bridge table, and locally learned MAC addresses from Root ACs
   need to be duplicated onto the Leaf bridge table.
   Remotely learned MAC addresses
   from Root ACs need to be copied onto both Root and Leaf bridge
   tables. Because of the potential inefficiencies associated with a
   data-plane implementation of an additional MAC lookup or duplication
   of MAC entries, this option is not believed to be implementable
   without data-plane performance inefficiencies on some platforms, and
   thus this document introduces the coloring as described in section
   2.2 and detailed in section 3.1.

Contributors

   In addition to the authors listed on the front page, the following
   co-authors have also contributed to this document:

   Wim Henderickx
   Nokia

   Aldrin Isaac
   Wen Lin
   Juniper

Authors' Addresses

   Ali Sajassi
   Cisco
   Email: sajassi@cisco.com

   Samer Salam
   Cisco
   Email: ssalam@cisco.com

   John Drake
   Juniper
   Email: jdrake@juniper.net

   Jim Uttaro
   AT&T
   Email: ju1738@att.com

   Sami Boutros
   VMware
   Email: sboutros@vmware.com

   Jorge Rabadan
   Nokia
   Email: jorge.rabadan@nokia.com