BESS Working Group                                            A. Sajassi
Internet-Draft                                           LA. Burdet, Ed.
Intended status: Standards Track                                   Cisco
Expires: January 13, 2022                                       J. Drake
                                                                 Juniper
                                                              J. Rabadan
                                                                   Nokia
                                                           July 12, 2021

                      BGP MPLS-Based Ethernet VPN
                     draft-ietf-bess-rfc7432bis-01

Abstract

   This document describes procedures for BGP MPLS-based Ethernet VPNs
   (EVPN). The procedures described here meet the requirements
   specified in RFC 7209 -- "Requirements for Ethernet VPN (EVPN)".

Note to Readers

   _RFC EDITOR: please remove this section before publication_

   The complete and detailed set of all changes between this version and
   RFC7432 may be found as an Annotated Diff (rfcdiff) here [1].

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on January 13, 2022.

Copyright Notice

   Copyright (c) 2021 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Summary of changes from RFC 7432
   2.  Specification of Requirements
   3.  Terminology
   4.  BGP MPLS-Based EVPN Overview
   5.  Ethernet Segment
   6.  Ethernet Tag ID
     6.1.  VLAN-Based Service Interface
     6.2.  VLAN Bundle Service Interface
       6.2.1.  Port-Based Service Interface
     6.3.  VLAN-Aware Bundle Service Interface
       6.3.1.  Port-Based VLAN-Aware Service Interface
     6.4.  EVPN PE Model
   7.  BGP EVPN Routes
     7.1.  Ethernet Auto-discovery Route
     7.2.  MAC/IP Advertisement Route
     7.3.  Inclusive Multicast Ethernet Tag Route
     7.4.  Ethernet Segment Route
     7.5.  ESI Label Extended Community
     7.6.  ES-Import Route Target
     7.7.  MAC Mobility Extended Community
     7.8.  Default Gateway Extended Community
     7.9.  Route Distinguisher Assignment per MAC-VRF
     7.10. Route Targets
       7.10.1.  Auto-derivation from the Ethernet Tag (VLAN ID)
     7.11. EVPN Layer 2 Attributes Extended Community
       7.11.1.  EVPN Layer 2 Attributes Partitioning
     7.12. Route Prioritization
   8.  Multihoming Functions
     8.1.  Multihomed Ethernet Segment Auto-discovery
       8.1.1.  Constructing the Ethernet Segment Route
     8.2.  Fast Convergence
       8.2.1.  Constructing Ethernet A-D per Ethernet Segment Route
         8.2.1.1.  Ethernet A-D Route Targets
     8.3.  Split Horizon
       8.3.1.  ESI Label Assignment
         8.3.1.1.  Ingress Replication
         8.3.1.2.  P2MP MPLS LSPs
         8.3.1.3.  MP2MP MPLS LSPs
     8.4.  Aliasing and Backup Path
       8.4.1.  Constructing Ethernet A-D per EVPN Instance Route
     8.5.  Designated Forwarder Election
     8.6.  Signaling Primary and Backup DF Elected PEs
     8.7.  Interoperability with Single-Homing PEs
   9.  Determining Reachability to Unicast MAC Addresses
     9.1.  Local Learning
     9.2.  Remote Learning
       9.2.1.  Constructing MAC/IP Address Advertisement
       9.2.2.  Route Resolution
   10. ARP and ND
     10.1.  Default Gateway
       10.1.1.  Best Path selection for Default Gateway
   11. Handling of Multi-destination Traffic
     11.1.  Constructing Inclusive Multicast Ethernet Tag Route
     11.2.  P-Tunnel Identification
   12. Processing of Unknown Unicast Packets
     12.1.  Ingress Replication
     12.2.  P2MP MPLS LSPs
   13. Forwarding Unicast Packets
     13.1.  Forwarding Packets Received from a CE
     13.2.  Forwarding Packets Received from a Remote PE
       13.2.1.  Unknown Unicast Forwarding
       13.2.2.  Known Unicast Forwarding
   14. Load Balancing of Unicast Packets
     14.1.  Load Balancing of Traffic from a PE to Remote CEs
       14.1.1.  Single-Active Redundancy Mode
       14.1.2.  All-Active Redundancy Mode
     14.2.  Load Balancing of Traffic between a PE and a Local CE
       14.2.1.  Data-Plane Learning
       14.2.2.  Control-Plane Learning
   15. MAC Mobility
     15.1.  MAC Duplication Issue
     15.2.  Sticky MAC Addresses
     15.3.  Loop Protection
   16. Multicast and Broadcast
     16.1.  Ingress Replication
     16.2.  P2MP or MP2MP LSPs
       16.2.1.  Inclusive Trees
   17. Convergence
     17.1.  Transit Link and Node Failures between PEs
     17.2.  PE Failures
     17.3.  PE-to-CE Network Failures
   18. Frame Ordering
     18.1.  Flow Label
   19. Use of Domain-wide Common Block (DCB) Labels
   20. Security Considerations
   21. IANA Considerations
   22. References
     22.1.  Normative References
     22.2.  Informative References
     22.3.  URIs
   Appendix A.  Acknowledgments for This Document (2021)
   Appendix B.  Contributors for This Document (2021)
   Appendix C.  Acknowledgments from the First Edition (2015)
     C.1.  Contributors from the First Edition (2015)
     C.2.  Authors from the First Edition (2015)
   Authors' Addresses

1.  Introduction

   Virtual Private LAN Service (VPLS), as defined in [RFC4664],
   [RFC4761], and [RFC4762], is a proven and widely deployed technology.
   However, the existing solution has a number of limitations when it
   comes to multihoming and redundancy, multicast optimization,
   provisioning simplicity, flow-based load balancing, and multipathing;
   these limitations are important considerations for Data Center (DC)
   deployments. [RFC7209] describes the motivation for a new solution
   to address these limitations. It also outlines a set of requirements
   that the new solution must address.

   This document describes procedures for a BGP MPLS-based solution
   called Ethernet VPN (EVPN) to address the requirements specified in
   [RFC7209]. Please refer to [RFC7209] for the detailed requirements
   and motivation. EVPN requires extensions to existing IP/MPLS
   protocols as described in this document. In addition to these
   extensions, EVPN uses several building blocks from existing MPLS
   technologies.

1.1.  Summary of changes from RFC 7432

   This section describes the significant changes between [RFC7432] and
   this document.

   -  Updates to Terminology, including BD, EVI, Ethernet Tag ID,
      P-tunnel, DF/BDF/NDF, and DCB;

   -  Added Section 6.4 for description and disambiguation of EVPN
      bridging terminology;

   -  More precise 'encoding' language for all references to 'Label'
      fields;

   -  Added Section 7.11 for usage of the EVPN Layer 2 Attributes
      Extended Community in EVPN bridging;

   -  Added Section 7.12, which proposes relative order-of-magnitude
      route priority and processing to help achieve fast convergence;

   -  Corrected Section 8.2.1 to include a reference to the E-TREE
      exception;

   -  Updated Section 8.5 to add Backup- and Non-Designated Forwarder
      roles to the DF-Election algorithm, with a description of those
      roles and signaling updates;

   -  Added Section 8.3.1.3 for MP2MP MPLS LSPs and updated
      Section 12.2;

   -  Addressed conflicts in the Best Path algorithm for Default
      Gateway in Section 10.1.1;

   -  Updated the redundancy mode description in Section 14.1.1;

   -  Added Section 15.3 describing a loop detection and protection
      mechanism;

   -  Added Section 18.1 describing Flow-label usage and signaling (see
      also new Section 7.11);

   -  Section 19 specifies use of Domain-wide Common Block (DCB) labels
      for several cases;

   -  Restructuring, namely moving material from Section 8.5 to
      Section 5 and simplifying all Ethernet Tag ID references to point
      to Section 6; and

   -  Cross-references and editorial changes.

2.  Specification of Requirements

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

3.  Terminology

   BD: Broadcast Domain. In a bridged network, the broadcast domain
      corresponds to a Virtual LAN (VLAN), where a VLAN is typically
      represented by a single VLAN ID (VID) but can be represented by
      several VIDs where Shared VLAN Learning (SVL) is used per
      [IEEE.802.1Q_2014].

   Bridge Table: An instantiation of a broadcast domain on a MAC-VRF.

   CE: Customer Edge device, e.g., a host, router, or switch.

   EVI: An EVPN instance spanning the Provider Edge (PE) devices
      participating in that EVPN. An EVI may be comprised of one BD
      (VLAN-based, VLAN Bundle, or Port-based services) or multiple BDs
      (VLAN-aware Bundle or Port-based VLAN-Aware services).

   MAC-VRF: A Virtual Routing and Forwarding table for Media Access
      Control (MAC) addresses on a PE.

   Ethernet Segment (ES): When a customer site (device or network) is
      connected to one or more PEs via a set of Ethernet links, then
      that set of links is referred to as an 'Ethernet segment'.

   Ethernet Segment Identifier (ESI): A unique non-zero identifier that
      identifies an Ethernet segment is called an 'Ethernet Segment
      Identifier'.

   VID: VLAN Identifier.

   Ethernet Tag: Used to represent a BD that is configured on a given
      ES for the purposes of DF election and identification
      for frames received from the CE. Note that any of the following
      may be used to represent a BD: VIDs (including Q-in-Q tags),
      configured IDs, VNIs (Virtual Extensible Local Area Network
      (VXLAN) Network Identifiers), normalized VIDs, I-SIDs (Service
      Instance Identifiers), etc., as long as the representation of the
      BDs is configured consistently across the multihomed PEs attached
      to that ES.

   Ethernet Tag ID: Normalized network wide ID that is used to identify
      a BD within an EVI and carried in EVPN routes.

   LACP: Link Aggregation Control Protocol.

   MP2MP: Multipoint to Multipoint.

   MP2P: Multipoint to Point.

   P2MP: Point to Multipoint.

   P2P: Point to Point.

   P-tunnel: A tunnel through the network of one or more SPs. In this
      document, P-tunnels are instantiated as bidirectional multicast
      distribution trees.

   PE: Provider Edge device.

   Single-Active Redundancy Mode: When only a single PE, among all the
      PEs attached to an Ethernet segment, is allowed to forward traffic
      to/from that Ethernet segment for a given VLAN, then the Ethernet
      segment is defined to be operating in Single-Active redundancy
      mode.

   All-Active Redundancy Mode: When all PEs attached to an Ethernet
      segment are allowed to forward known unicast traffic to/from that
      Ethernet segment for a given VLAN, then the Ethernet segment is
      defined to be operating in All-Active redundancy mode.

   BUM: Broadcast, unknown unicast, and multicast.

   DF: Designated Forwarder.

   Backup-DF (BDF): Backup-Designated Forwarder.

   Non-DF (NDF): Non-Designated Forwarder.

   DCB: Domain-wide Common Block (of labels), as in
      [I-D.ietf-bess-mvpn-evpn-aggregation-label].

   AC: Attachment Circuit.

4.  BGP MPLS-Based EVPN Overview

   This section provides an overview of EVPN. An EVPN instance
   comprises Customer Edge devices (CEs) that are connected to Provider
   Edge devices (PEs) that form the edge of the MPLS infrastructure. A
   CE may be a host, a router, or a switch. The PEs provide virtual
   Layer 2 bridged connectivity between the CEs. There may be multiple
   EVPN instances in the provider's network.

   The PEs may be connected by an MPLS Label Switched Path (LSP)
   infrastructure, which provides the benefits of MPLS technology, such
   as fast reroute, resiliency, etc. The PEs may also be connected by
   an IP infrastructure, in which case IP/GRE (Generic Routing
   Encapsulation) tunneling or other IP tunneling can be used between
   the PEs. The detailed procedures in this document are specified only
   for MPLS LSPs as the tunneling technology. However, these procedures
   are designed to be extensible to IP tunneling as the Packet Switched
   Network (PSN) tunneling technology.

   In an EVPN, MAC learning between PEs occurs not in the data plane (as
   happens with traditional bridging in VPLS [RFC4761] [RFC4762]) but in
   the control plane. Control-plane learning offers greater control
   over the MAC learning process, such as restricting who learns what,
   and the ability to apply policies. Furthermore, the control plane
   chosen for advertising MAC reachability information is multi-protocol
   (MP) BGP (similar to IP VPNs [RFC4364]). This provides flexibility
   and the ability to preserve the "virtualization" or isolation of
   groups of interacting agents (hosts, servers, virtual machines) from
   each other. In EVPN, PEs advertise the MAC addresses learned from
   the CEs that are connected to them, along with an MPLS label, to
   other PEs in the control plane using Multiprotocol BGP (MP-BGP).
   Control-plane learning enables load balancing of traffic to and from
   CEs that are multihomed to multiple PEs. This is in addition to load
   balancing across the MPLS core via multiple LSPs between the same
   pair of PEs. In other words, it allows CEs to connect to multiple
   active points of attachment. It also improves convergence times in
   the event of certain network failures.

   However, learning between PEs and CEs is done by the method best
   suited to the CE: data-plane learning, IEEE 802.1x, the Link Layer
   Discovery Protocol (LLDP), IEEE 802.1aq, Address Resolution Protocol
   (ARP), management plane, or other protocols.

   It is a local decision as to whether the Layer 2 forwarding table on
   a PE is populated with all the MAC destination addresses known to the
   control plane, or whether the PE implements a cache-based scheme.
   For instance, the MAC forwarding table may be populated only with the
   MAC destinations of the active flows transiting a specific PE.

   The policy attributes of EVPN are very similar to those of IP-VPN.
   An EVPN instance requires a Route Distinguisher (RD) that is unique
   per MAC-VRF and one or more globally unique Route Targets (RTs). A
   CE attaches to a BD on a PE, on an Ethernet interface that may be
   configured for one or more Ethernet tags. If the Ethernet tags are
   VLAN IDs, some deployment scenarios guarantee uniqueness of VLAN IDs
   across EVPN instances: all points of attachment for a given EVPN
   instance use the same VLAN ID, and no other EVPN instance uses this
   VLAN ID. This document refers to this case as a "Unique VLAN EVPN"
   and describes simplified procedures to optimize for it. See, for
   example, Section 7.10.1, which describes automatically deriving the
   RT(s) for each EVPN instance from the corresponding VID.

5.  Ethernet Segment

   As indicated in [RFC7209], each Ethernet segment needs a unique
   identifier in an EVPN. This section defines how such identifiers are
   assigned and how they are encoded for use in EVPN signaling. Later
   sections of this document describe the protocol mechanisms that
   utilize the identifiers.

   When a customer site is connected to one or more PEs via a set of
   Ethernet links, then this set of Ethernet links constitutes an
   "Ethernet segment". For a multihomed site, each Ethernet segment
   (ES) is identified by a unique non-zero identifier called an Ethernet
   Segment Identifier (ESI). An ESI is encoded as a 10-octet integer in
   line format with the most significant octet sent first. The
   following two ESI values are reserved:

   -  ESI 0 denotes a single-homed site.

   -  ESI {0xFF} (repeated 10 times) is known as MAX-ESI and is
      reserved.

   In general, an Ethernet segment SHOULD have a non-reserved ESI that
   is unique network wide (i.e., across all EVPN instances on all the
   PEs).
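
   As an illustration only, the 10-octet encoding and the two reserved
   values described above might be handled as in the following Python
   sketch. The names esi_to_wire, is_reserved_esi, ESI_ZERO, and
   MAX_ESI are hypothetical and are not defined by this document.

```python
# Illustrative sketch only; all names here are hypothetical and are not
# part of this specification.
ESI_LEN = 10                    # an ESI occupies 10 octets on the wire
ESI_ZERO = bytes(ESI_LEN)       # ESI 0: denotes a single-homed site
MAX_ESI = b"\xff" * ESI_LEN     # MAX-ESI: 0xFF repeated 10 times, reserved


def esi_to_wire(value: int) -> bytes:
    """Encode an ESI integer as 10 octets, most significant octet first."""
    return value.to_bytes(ESI_LEN, byteorder="big")


def is_reserved_esi(esi: bytes) -> bool:
    """Return True if `esi` is one of the two reserved values."""
    if len(esi) != ESI_LEN:
        raise ValueError("an ESI must be exactly 10 octets")
    return esi in (ESI_ZERO, MAX_ESI)
```

   Under these assumptions, a multihomed Ethernet segment would be
   expected to carry an ESI for which is_reserved_esi() returns False.
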
   If the CE(s) constituting an Ethernet segment is (are) managed
   by the network operator, then ESI uniqueness should be guaranteed;
   however, if the CE(s) is (are) not managed, then the operator MUST
   configure a network-wide unique ESI for that Ethernet segment. This
   is required to enable auto-discovery of Ethernet segments and
   Designated Forwarder (DF) election.

   In a network with managed and non-managed CEs, the ESI has the
   following format:

       +---+---+---+---+---+---+---+---+---+---+
       | T |          ESI Value                |
       +---+---+---+---+---+---+---+---+---+---+

   Where:

   T (ESI Type) is a 1-octet field (most significant octet) that
   specifies the format of the remaining 9 octets (ESI Value). The
   following six ESI types can be used:

   -  Type 0 (T=0x00) - This type indicates an arbitrary 9-octet ESI
      value, which is managed and configured by the operator.

   -  Type 1 (T=0x01) - When IEEE 802.1AX LACP is used between the PEs
      and CEs, this ESI type indicates an auto-generated ESI value
      determined from LACP by concatenating the following parameters:

      +  CE LACP System MAC address (6 octets). The CE LACP System MAC
         address MUST be encoded in the high-order 6 octets of the ESI
         Value field.

      +  CE LACP Port Key (2 octets). The CE LACP port key MUST be
         encoded in the 2 octets next to the System MAC address.

      +  The remaining octet will be set to 0x00.

      As far as the CE is concerned, it would treat the multiple PEs
      that it is connected to as the same switch. This allows the CE to
      aggregate links that are attached to different PEs in the same
      bundle.

      This mechanism could be used only if it produces ESIs that satisfy
      the uniqueness requirement specified above.

   -  Type 2 (T=0x02) - This type is used in the case of indirectly
      connected hosts via a bridged LAN between the CEs and the PEs.
      The ESI Value is auto-generated and determined based on the
      Layer 2 bridge protocol as follows: If the Multiple Spanning Tree
      Protocol (MSTP) is used in the bridged LAN, then the value of the
      ESI is derived by listening to Bridge PDUs (BPDUs) on the Ethernet
      segment. To achieve this, the PE is not required to run MSTP.
      However, the PE must learn the Root Bridge MAC address and Bridge
      Priority of the root of the Internal Spanning Tree (IST) by
      listening to the BPDUs. The ESI Value is constructed as follows:

      +  Root Bridge MAC address (6 octets). The Root Bridge MAC
         address MUST be encoded in the high-order 6 octets of the ESI
         Value field.

      +  Root Bridge Priority (2 octets). The CE Root Bridge Priority
         MUST be encoded in the 2 octets next to the Root Bridge MAC
         address.

      +  The remaining octet will be set to 0x00.

      This mechanism could be used only if it produces ESIs that satisfy
      the uniqueness requirement specified above.

   -  Type 3 (T=0x03) - This type indicates a MAC-based ESI Value that
      can be auto-generated or configured by the operator. The ESI
      Value is constructed as follows:

      +  System MAC address (6 octets). The PE MAC address MUST be
         encoded in the high-order 6 octets of the ESI Value field.

      +  Local Discriminator value (3 octets). The Local Discriminator
         value MUST be encoded in the low-order 3 octets of the ESI
         Value.

      This mechanism could be used only if it produces ESIs that satisfy
      the uniqueness requirement specified above.

   -  Type 4 (T=0x04) - This type indicates a router-ID ESI Value that
      can be auto-generated or configured by the operator. The ESI
      Value is constructed as follows:

      +  Router ID (4 octets). The system router ID MUST be encoded in
         the high-order 4 octets of the ESI Value field.

      +  Local Discriminator value (4 octets). The Local Discriminator
         value MUST be encoded in the 4 octets next to the IP address.

      +  The low-order octet of the ESI Value will be set to 0x00.

      This mechanism could be used only if it produces ESIs that satisfy
      the uniqueness requirement specified above.

   -  Type 5 (T=0x05) - This type indicates an Autonomous System
      (AS)-based ESI Value that can be auto-generated or configured by
      the operator. The ESI Value is constructed as follows:

      +  AS number (4 octets). This is an AS number owned by the system
         and MUST be encoded in the high-order 4 octets of the ESI Value
         field. If a 2-octet AS number is used, the high-order extra
         2 octets will be 0x0000.

      +  Local Discriminator value (4 octets). The Local Discriminator
         value MUST be encoded in the 4 octets next to the AS number.

      +  The low-order octet of the ESI Value will be set to 0x00.

      This mechanism could be used only if it produces ESIs that satisfy
      the uniqueness requirement specified above.

   Note that a CE always sends packets belonging to a specific flow
   using a single link towards a PE. For instance, if the CE is a host,
   then, as mentioned earlier, the host treats the multiple links that
   it uses to reach the PEs as a Link Aggregation Group (LAG). The CE
   employs a local hashing function to map traffic flows onto links in
   the LAG.

   If a bridged network is multihomed to more than one PE in an EVPN
   network via switches, then the support of All-Active redundancy mode
   requires the bridged network to be connected to two or more PEs using
   a LAG.

   If a bridged network does not connect to the PEs using a LAG, then
   only one of the links between the bridged network and the PEs must be
   the active link for a given <ES, VLAN>. In this case, the set of
   Ethernet A-D per ES routes advertised by each PE MUST have the
   "Single-Active" bit in the flags of the ESI Label extended community
   set to 1.

6.  Ethernet Tag ID

   An Ethernet Tag ID is a 32-bit field containing either a 12-bit or
   24-bit identifier that identifies a particular broadcast domain
   (e.g., a VLAN) in an EVPN instance. The 12-bit identifier is called
   the VLAN ID (VID). An EVPN instance consists of one or more
   broadcast domains (one or more VLANs). VLANs are assigned to a given
   EVPN instance by the provider of the EVPN service. A given VLAN can
   itself be represented by multiple VIDs. In such cases, the PEs
   participating in that VLAN for a given EVPN instance are responsible
   for performing VLAN ID translation to/from locally attached CE
   devices.

   The following subsections discuss the relationship between broadcast
   domains (e.g., VLANs), Ethernet Tag IDs (e.g., VIDs), and MAC-VRFs as
   well as the setting of the Ethernet Tag ID, in the various EVPN BGP
   routes (defined in Section 7), for the different types of service
   interfaces described in [RFC7209].

   The following Ethernet Tag ID value is reserved:

   -  Ethernet Tag ID {0xFFFFFFFF} is known as MAX-ET.

6.1.  VLAN-Based Service Interface

   With this service interface, an EVPN instance consists of only a
   single broadcast domain (e.g., a single VLAN). Therefore, there is a
   one-to-one mapping between a VID on this interface and a MAC-VRF.
   Since a MAC-VRF corresponds to a single VLAN, it consists of a single
   bridge table corresponding to that VLAN. If the VLAN is represented
   by multiple VIDs (e.g., a different VID per Ethernet segment per PE),
   then each PE needs to perform VID translation for frames destined to
   its Ethernet segment(s). In such scenarios, the Ethernet frames
   transported over an MPLS/IP network SHOULD remain tagged with the
   originating VID, and a VID translation MUST be supported in the data
   path and MUST be performed on the disposition PE. The Ethernet Tag
   ID in all EVPN routes MUST be set to 0.

6.2.  VLAN Bundle Service Interface

   With this service interface, an EVPN instance corresponds to multiple
   broadcast domains (e.g., multiple VLANs); however, only a single
   bridge table is maintained per MAC-VRF, which means multiple VLANs
   share the same bridge table. This implies that MAC addresses MUST be
   unique across all VLANs for that EVI in order for this service to
   work. In other words, there is a many-to-one mapping between VLANs
   and a MAC-VRF, and the MAC-VRF consists of a single bridge table.
   Furthermore, a single VLAN must be represented by a single VID --
   e.g., no VID translation is allowed for this service interface type.
   The MPLS-encapsulated frames MUST remain tagged with the originating
   VID. Tag translation is NOT permitted. The Ethernet Tag ID in all
   EVPN routes MUST be set to 0.

6.2.1.  Port-Based Service Interface

   This service interface is a special case of the VLAN bundle service
   interface, where all of the VLANs on the port are part of the same
   service and map to the same bundle. The procedures are identical to
   those described in Section 6.2.

6.3.  VLAN-Aware Bundle Service Interface

   With this service interface, an EVPN instance consists of multiple
   broadcast domains (e.g., multiple VLANs) with each VLAN having its
   own bridge table -- i.e., multiple bridge tables (one per VLAN) are
   maintained by a single MAC-VRF corresponding to the EVPN instance.

   Broadcast, unknown unicast, or multicast (BUM) traffic is sent only
   to the CEs in a given broadcast domain; however, the broadcast
   domains within an EVI either MAY each have their own P-Tunnel or MAY
   share P-Tunnels -- e.g., all of the broadcast domains in an EVI MAY
   share a single P-Tunnel.

   In the case where a single VLAN is represented by a single VID and
   thus no VID translation is required, an MPLS-encapsulated packet MUST
   carry that VID.
   The Ethernet Tag ID in all EVPN routes MUST be set to that VID. The
   advertising PE MAY advertise the MPLS Label1 in the MAC/IP
   Advertisement route representing ONLY the EVI or representing both
   the Ethernet Tag ID and the EVI. This decision is purely a local
   matter for the advertising PE (which is also the disposition PE) and
   does not affect any other PEs.

   In the case where a single VLAN is represented by different VIDs on
   different CEs and thus VID translation is required, a normalized
   Ethernet Tag ID (VID) MUST be carried in the EVPN BGP routes.
   Furthermore, the advertising PE advertises the MPLS Label1 in the
   MAC/IP Advertisement route representing both the Ethernet Tag ID and
   the EVI, so that upon receiving an MPLS-encapsulated packet, it can
   identify the corresponding bridge table from the MPLS EVPN label and
   perform Ethernet Tag ID translation ONLY at the disposition PE --
   i.e., the Ethernet frames transported over the MPLS/IP network MUST
   remain tagged with the originating VID, and VID translation is
   performed on the disposition PE. The Ethernet Tag ID in all EVPN
   routes MUST be set to the normalized Ethernet Tag ID assigned by the
   EVPN provider.

6.3.1.  Port-Based VLAN-Aware Service Interface

   This service interface is a special case of the VLAN-aware bundle
   service interface, where all of the VLANs on the port are part of the
   same service and are mapped to a single bundle but without any VID
   translation. The procedures are a subset of those described in
   Section 6.3.

6.4.  EVPN PE Model

   Since this document discusses EVPN operation in relationship to MAC-
   VRF, EVI, Broadcast Domain (BD), and Bridge Table (BT), it is
   important to understand the relationship between these terms.
   Therefore, the PE model below illustrates the relationship among
   them.

644 +--------------------------------------------------+ 645 | | 646 | +------------------+ EVPN PE | 647 | Attachment | +------------------+ | 648 | Circuit(AC1) | | +----------+ | MPLS/NVO tnl 649 ----------------------*Bridge | | +----- 650 | | | |Table(BT1)| | / \ \ 651 | | | | |<------------------> |Eth| 652 | | | | VLAN x | | \ / / 653 | | | +----------+ | +----- 654 | | | ... | | 655 | | | +----------+ | MPLS/NVO tnl 656 | | | |Bridge | | +----- 657 | | | |Table(BT2)| | / \ \ 658 | | | | |<-------------------> |Eth| 659 ----------------------* VLAN y | | \ / / 660 | AC2 | | +----------+ | +----- 661 | | | MAC-VRF1 | | 662 | +-+ RD1/RT1 | | 663 | +------------------+ | 664 | | 665 | | 666 +---------------------------------------------------+ 668 Figure 1: EVPN PE Model 670 A tenant configured for an EVPN service instance (i.e., an EVI) on a PE 671 is instantiated by a single MAC Virtual Routing and Forwarding table 672 (MAC-VRF) on that PE. A MAC-VRF consists of one or more Bridge 673 Tables (BTs) where each BT corresponds to a VLAN (broadcast domain - 674 BD). If a service interface for an EVPN PE is configured in VLAN- 675 Based mode (i.e., Section 6.1), then there is only a single BT per 676 MAC-VRF (per EVI) - i.e., there is only one tenant VLAN per EVI. 677 However, if a service interface for an EVPN PE is configured in VLAN- 678 Aware Bundle mode (i.e., Section 6.3), then there are several BTs per 679 MAC-VRF (per EVI) - i.e., there are several tenant VLANs per EVI. 680 The relationship among these terms can be summarized as follows: 682 - An EVI consists of one or more BDs and a MAC-VRF consists of one or 683 more BTs, one for each BD. A BD is identified by an Ethernet Tag 684 ID, which is typically represented by a single VLAN ID (VID); 685 however, it can be represented by multiple VIDs (i.e., Shared VLAN 686 Learning (SVL) mode in 802.1Q). 688 - In VLAN-based mode, there is one EVI per VLAN and thus one BD/BT 689 per VLAN.
Furthermore, there is one BT per MAC-VRF. 691 - The VLAN-bundle service can be considered analogous to SVL 692 mode in 802.1Q, i.e., one BD per EVI and one BT per MAC-VRF with 693 multiple VIDs representing that BD. 695 - In VLAN-aware bundle service, there is one EVI with multiple BDs 696 where each BD is represented by a VLAN. Furthermore, there are 697 multiple BTs in a single MAC-VRF. 699 Since a single tenant subnet is typically (and in this document) 700 represented by a VLAN (and thus supported by a single BT), for a 701 given tenant there are as many BTs as there are subnets, as shown in 702 the PE model above. 704 A MAC-VRF is identified by its corresponding route target and route 705 distinguisher. If operating in EVPN VLAN-Based mode, then a 706 PE that receives an EVPN route with the MAC-VRF route target 707 can identify the corresponding BT; however, if operating in EVPN 708 VLAN-Aware Bundle mode, then the receiving PE needs both the MAC-VRF 709 route target and Ethernet Tag ID in order to identify the 710 corresponding BT. 712 7. BGP EVPN Routes 714 This document defines a new BGP Network Layer Reachability 715 Information (NLRI) called the EVPN NLRI. 717 The format of the EVPN NLRI is as follows: 719 +-----------------------------------+ 720 | Route Type (1 octet) | 721 +-----------------------------------+ 722 | Length (1 octet) | 723 +-----------------------------------+ 724 | Route Type specific (variable) | 725 +-----------------------------------+ 727 The Route Type field defines the encoding of the rest of the EVPN 728 NLRI (Route Type specific EVPN NLRI). 730 The Length field indicates the length in octets of the Route Type 731 specific field of the EVPN NLRI.
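The Route Type / Length / Route Type specific layout above can be illustrated with a minimal, non-normative sketch. The function name iter_evpn_nlri and the idea of walking a buffer that holds several back-to-back EVPN NLRIs are assumptions made for illustration only; they are not defined by this document.

```python
from typing import Iterator, Tuple


def iter_evpn_nlri(buf: bytes) -> Iterator[Tuple[int, bytes]]:
    """Yield (route_type, route_type_specific) for each EVPN NLRI in buf.

    Hypothetical helper: each NLRI is assumed to be laid out as
    Route Type (1 octet), Length (1 octet), then Length octets of
    Route Type specific data, with NLRIs packed back to back.
    """
    offset = 0
    while offset < len(buf):
        if len(buf) - offset < 2:
            raise ValueError("truncated EVPN NLRI header")
        route_type = buf[offset]
        length = buf[offset + 1]  # length of the Route Type specific field only
        start = offset + 2
        end = start + length
        if end > len(buf):
            raise ValueError("truncated Route Type specific field")
        yield route_type, buf[start:end]
        offset = end
```

Because the Length field covers only the Route Type specific field, a receiver could in principle step over a route type it does not understand; how such routes are actually handled is outside this sketch.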
733 This document defines the following Route Types: 735 + 1 - Ethernet Auto-Discovery (A-D) route 736 + 2 - MAC/IP Advertisement route 737 + 3 - Inclusive Multicast Ethernet Tag route 738 + 4 - Ethernet Segment route 740 The detailed encoding and procedures for these route types are 741 described in subsequent sections. 743 The EVPN NLRI is carried in BGP [RFC4271] using BGP Multiprotocol 744 Extensions [RFC4760] with an Address Family Identifier (AFI) of 25 745 (L2VPN) and a Subsequent Address Family Identifier (SAFI) of 70 746 (EVPN). The NLRI field in the MP_REACH_NLRI/MP_UNREACH_NLRI 747 attribute contains the EVPN NLRI (encoded as specified above). 749 In order for two BGP speakers to exchange labeled EVPN NLRI, they 750 must use BGP Capabilities Advertisements to ensure that they both are 751 capable of properly processing such NLRI. This is done as specified 752 in [RFC4760], by using capability code 1 (multiprotocol BGP) with an 753 AFI of 25 (L2VPN) and a SAFI of 70 (EVPN). 755 7.1. Ethernet Auto-discovery Route 757 An Ethernet A-D route type specific EVPN NLRI consists of the 758 following: 760 +---------------------------------------+ 761 | Route Distinguisher (RD) (8 octets) | 762 +---------------------------------------+ 763 |Ethernet Segment Identifier (10 octets)| 764 +---------------------------------------+ 765 | Ethernet Tag ID (4 octets) | 766 +---------------------------------------+ 767 | MPLS Label (3 octets) | 768 +---------------------------------------+ 770 For the purpose of BGP route key processing, only the Ethernet 771 Segment Identifier and the Ethernet Tag ID are considered to be part 772 of the prefix in the NLRI. The MPLS Label field is to be treated as 773 a route attribute as opposed to being part of the route. 775 The MPLS Label field is encoded as 3 octets, where the high-order 776 20 bits contain the label value. 
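As a non-normative illustration of the Ethernet A-D route layout and the 3-octet label encoding described above, the following sketch decodes the Route Type specific field of such a route. The names EthernetADRoute, decode_label, parse_ethernet_ad and route_key are hypothetical, and the sketch simply ignores the low-order 4 bits of the label field.

```python
from typing import NamedTuple, Tuple


class EthernetADRoute(NamedTuple):
    rd: bytes              # Route Distinguisher, 8 octets
    esi: bytes             # Ethernet Segment Identifier, 10 octets
    ethernet_tag_id: int   # 4 octets
    mpls_label: int        # 20-bit label value


def decode_label(field: bytes) -> int:
    """Return the label value held in the high-order 20 bits of a 3-octet field."""
    if len(field) != 3:
        raise ValueError("MPLS Label field must be 3 octets")
    return int.from_bytes(field, "big") >> 4


def parse_ethernet_ad(body: bytes) -> EthernetADRoute:
    """Decode the fixed 25-octet body: RD (8) + ESI (10) + Tag (4) + Label (3)."""
    if len(body) != 25:
        raise ValueError("Ethernet A-D route body must be 25 octets")
    return EthernetADRoute(
        rd=body[0:8],
        esi=body[8:18],
        ethernet_tag_id=int.from_bytes(body[18:22], "big"),
        mpls_label=decode_label(body[22:25]),
    )


def route_key(route: EthernetADRoute) -> Tuple[bytes, int]:
    """Fields the text above treats as the prefix for route key processing."""
    return (route.esi, route.ethernet_tag_id)
```

Keeping the label out of route_key mirrors the statement above that the MPLS Label field is treated as a route attribute rather than as part of the route.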
778 For procedures and usage of this route, please see Sections 8.2 779 ("Fast Convergence") and 8.4 ("Aliasing and Backup Path"). 781 7.2. MAC/IP Advertisement Route 783 A MAC/IP Advertisement route type specific EVPN NLRI consists of the 784 following: 786 +---------------------------------------+ 787 | RD (8 octets) | 788 +---------------------------------------+ 789 |Ethernet Segment Identifier (10 octets)| 790 +---------------------------------------+ 791 | Ethernet Tag ID (4 octets) | 792 +---------------------------------------+ 793 | MAC Address Length (1 octet) | 794 +---------------------------------------+ 795 | MAC Address (6 octets) | 796 +---------------------------------------+ 797 | IP Address Length (1 octet) | 798 +---------------------------------------+ 799 | IP Address (0, 4, or 16 octets) | 800 +---------------------------------------+ 801 | MPLS Label1 (3 octets) | 802 +---------------------------------------+ 803 | MPLS Label2 (0 or 3 octets) | 804 +---------------------------------------+ 806 For the purpose of BGP route key processing, only the Ethernet Tag 807 ID, MAC Address Length, MAC Address, IP Address Length, and IP 808 Address fields are considered to be part of the prefix in the NLRI. 809 The Ethernet Segment Identifier, MPLS Label1, and MPLS Label2 fields 810 are to be treated as route attributes as opposed to being part of the 811 "route". Both the IP and MAC address lengths are in bits. 813 The MPLS Label1 and MPLS Label2 fields are encoded as 3 octets, where 814 the high-order 20 bits contain the label value. 816 For procedures and usage of this route, please see Sections 9 817 ("Determining Reachability to Unicast MAC Addresses") and 14 818 ("Load Balancing of Unicast Packets"). 820 7.3. 
Inclusive Multicast Ethernet Tag Route 822 An Inclusive Multicast Ethernet Tag route type specific EVPN NLRI 823 consists of the following: 825 +---------------------------------------+ 826 | RD (8 octets) | 827 +---------------------------------------+ 828 | Ethernet Tag ID (4 octets) | 829 +---------------------------------------+ 830 | IP Address Length (1 octet) | 831 +---------------------------------------+ 832 | Originating Router's IP Address | 833 | (4 or 16 octets) | 834 +---------------------------------------+ 836 For procedures and usage of this route, please see Sections 11 837 ("Handling of Multi-destination Traffic"), 12 838 ("Processing of Unknown Unicast Packets"), and 16 839 ("Multicast and Broadcast"). The IP address length is in bits. For 840 the purpose of BGP route key processing, only the Ethernet Tag ID, IP 841 Address Length, and Originating Router's IP Address fields are 842 considered to be part of the prefix in the NLRI. 844 7.4. Ethernet Segment Route 846 An Ethernet Segment route type specific EVPN NLRI consists of the 847 following: 849 +---------------------------------------+ 850 | RD (8 octets) | 851 +---------------------------------------+ 852 |Ethernet Segment Identifier (10 octets)| 853 +---------------------------------------+ 854 | IP Address Length (1 octet) | 855 +---------------------------------------+ 856 | Originating Router's IP Address | 857 | (4 or 16 octets) | 858 +---------------------------------------+ 860 For procedures and usage of this route, please see Section 8.5 861 ("Designated Forwarder Election"). The IP address length is in bits. 862 For the purpose of BGP route key processing, only the Ethernet 863 Segment ID, IP Address Length, and Originating Router's IP Address 864 fields are considered to be part of the prefix in the NLRI. 866 7.5. ESI Label Extended Community 868 This Extended Community is a new transitive Extended Community having 869 a Type field value of 0x06 and the Sub-Type 0x01. 
It may be 870 advertised along with Ethernet Auto-discovery routes, and it enables 871 split-horizon procedures for multihomed sites as described in 872 Section 8.3 ("Split Horizon"). The ESI Label field represents an ES 873 by the advertising PE, and it is used in split-horizon filtering by 874 other PEs that are connected to the same multihomed Ethernet segment. 876 The ESI Label field is encoded as 3 octets, where the high-order 877 20 bits contain the label value. 879 The ESI label value MAY be zero if no split-horizon filtering 880 procedures are required in any of the VLANs of the Ethernet Segment. 881 This is the case in [RFC8214] or Ethernet Segments using Local Bias 882 procedures in [I-D.ietf-bess-evpn-mh-split-horizon]. 884 Each ESI Label extended community is encoded as an 8-octet value, as 885 follows: 887 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 888 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 889 | Type=0x06 | Sub-Type=0x01 | Flags(1 octet)| Reserved=0 | 890 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 891 | Reserved=0 | ESI Label | 892 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 894 The low-order bit of the Flags octet is defined as the 895 "Single-Active" bit. A value of 0 means that the multihomed site 896 is operating in All-Active redundancy mode, and a value of 1 means 897 that the multihomed site is operating in Single-Active redundancy 898 mode. 900 7.6. ES-Import Route Target 902 This is a new transitive Route Target extended community carried with 903 the Ethernet Segment route. When used, it enables all the PEs 904 connected to the same multihomed site to import the Ethernet Segment 905 routes. The value is derived automatically for the ESI Types 1, 2, 906 and 3, by encoding the high-order 6-octet portion of the 9-octet ESI 907 Value, which corresponds to a MAC address, in the ES-Import Route 908 Target. 
The format of this Extended Community is as follows: 910 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 911 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 912 | Type=0x06 | Sub-Type=0x02 | ES-Import | 913 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 914 | ES-Import Cont'd | 915 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 917 This document expands the definition of the Route Target extended 918 community to allow the value of the high-order octet (Type field) to 919 be 0x06 (in addition to the values specified in [RFC4360]). The 920 low-order octet (Sub-Type field) value 0x02 indicates that this 921 Extended Community is of type "Route Target". The new Type field 922 value 0x06 indicates that the structure of this RT is a 6-octet value 923 (e.g., a MAC address). A BGP speaker that implements RT Constraint 924 [RFC4684] MUST apply the RT Constraint procedures to the ES-Import RT 925 as well. 927 For procedures and usage of this attribute, please see Section 8.1 928 ("Multihomed Ethernet Segment Auto-discovery"). 930 7.7. MAC Mobility Extended Community 932 This Extended Community is a new transitive Extended Community having 933 a Type field value of 0x06 and the Sub-Type 0x00. It may be 934 advertised along with MAC/IP Advertisement routes. The procedures 935 for using this Extended Community are described in Section 15 936 ("MAC Mobility"). 938 The MAC Mobility extended community is encoded as an 8-octet value, 939 as follows: 941 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 942 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 943 | Type=0x06 | Sub-Type=0x00 |Flags(1 octet)| Reserved=0 | 944 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 945 | Sequence Number | 946 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 948 The low-order bit of the Flags octet is defined as the 949 "Sticky/static" flag and may be set to 1. 
A value of 1 means that 950 the MAC address is static and cannot move. The sequence number is 951 used to ensure that PEs retain the correct MAC/IP Advertisement route 952 when multiple updates occur for the same MAC address. 954 7.8. Default Gateway Extended Community 956 The Default Gateway community is an Extended Community of an Opaque 957 Type (see Section 3.3 of [RFC4360]). It is a transitive community, 958 which means that the first octet is 0x03. The value of the second 959 octet (Sub-Type) is 0x0d (Default Gateway) as assigned by IANA. The 960 Value field of this community is reserved (set to 0 by the senders, 961 ignored by the receivers). For procedures and usage of this 962 attribute, please see Section 10.1 ("Default Gateway"). 964 7.9. Route Distinguisher Assignment per MAC-VRF 966 The Route Distinguisher (RD) MUST be set to the RD of the MAC-VRF 967 that is advertising the NLRI. An RD MUST be assigned for a given 968 MAC-VRF on a PE. This RD MUST be unique across all MAC-VRFs on a PE. 969 It is RECOMMENDED to use the Type 1 RD [RFC4364]. The value field 970 comprises an IP address of the PE (typically, the loopback address) 971 followed by a number unique to the PE. This number may be generated 972 by the PE. In case of VLAN-based or VLAN Bundle services, this 973 number may also be generated out of the Ethernet Tag ID for the BD as 974 long as the value does not exceed a length of 16 bits. Or, in the 975 Unique VLAN EVPN case, the low-order 12 bits may be the 12-bit VLAN 976 ID, with the remaining high-order 4 bits set to 0. 978 7.10. Route Targets 980 The EVPN route MAY carry one or more Route Target (RT) attributes. 981 RTs may be configured (as in IP VPNs) or may be derived 982 automatically. 984 If a PE uses RT Constraint, the PE advertises all such RTs using RT 985 Constraints per [RFC4684]. 
The use of RT Constraints allows each 986 EVPN route to reach only those PEs that are configured to import at 987 least one RT from the set of RTs carried in the EVPN route. 989 7.10.1. Auto-derivation from the Ethernet Tag (VLAN ID) 991 For the "Unique VLAN EVPN" scenario (Section 4), it is highly 992 desirable to auto-derive the RT from the Ethernet Tag (VLAN ID). The 993 procedure for performing such auto-derivation is as follows: 995 + The Global Administrator field of the RT MUST be set to the 996 Autonomous System (AS) number with which the PE is associated. 998 + The 12-bit VLAN ID MUST be encoded in the lowest 12 bits of the 999 Local Administrator field, with the remaining bits set to zero. 1001 For VLAN-based and VLAN Bundle services, the RT may also be auto- 1002 derived as per the above rules but replacing the 12-bit VLAN ID with 1003 a 16-bit Ethernet Tag ID configured for the BD. If the Ethernet Tag 1004 ID length is 24 bits, the RT for the MAC-VRF can be auto-derived as 1005 per [RFC8365] section 5.1.2.1. 1007 7.11. EVPN Layer 2 Attributes Extended Community 1009 [RFC8214] defines this extended community ("L2-Attr"), to be included 1010 with per-EVI Ethernet A-D routes and mandatory if multihoming is 1011 enabled. 1013 Usage and applicability of this Extended community to Bridging is 1014 clarified here. 1016 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 1017 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1018 | MBZ |RSV|RSV|F|C|P|B| (MBZ = MUST Be Zero) 1019 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1021 The following bits in Control Flags from [RFC8214] are listed here 1022 for completeness only: 1024 Name Meaning 1025 --------------------------------------------------------------- 1026 P If set to 1 in multihoming Single-Active scenarios, 1027 this flag indicates that the advertising PE is the 1028 primary PE. MUST be set to 1 for multihoming 1029 All-Active scenarios by all active PE(s). 
1031 B If set to 1 in multihoming Single-Active scenarios, 1032 this flag indicates that the advertising PE is the 1033 backup PE. 1035 C If set to 1, a control word [RFC4448] MUST be present 1036 when sending EVPN packets to this PE. It is 1037 recommended that the control word be included in the 1038 absence of an entropy label [RFC6790]. 1040 The bits in Control Flags are extended by the following defined bits: 1042 Name Meaning 1043 --------------------------------------------------------------- 1044 F If set to 1, a Flow Label MUST be present 1045 when sending EVPN packets to this PE. 1047 For procedures and usage of this attribute, with respect to Control 1048 Word and Flow Label, please see Section 18 ("Frame Ordering"). 1050 For procedures and usage of this attribute, with respect to 1051 Primary-Backup bits, please see Section 8.5 1052 ("Designated Forwarder Election"). 1054 7.11.1. EVPN Layer 2 Attributes Partitioning 1056 The information carried in the L2-Attr Extended Community may be ESI- 1057 specific or BD/MAC-VRF-specific. In order to minimize the processing 1058 overhead of configuration-time items such as MTU, which are not expected to 1059 change at runtime based on failures, the Extended Community from 1060 [RFC8214] is partitioned, with a subset of information carried over 1061 each of the Ethernet A-D per EVI and Inclusive Multicast routes. 1063 When the EVPN Layer 2 Attributes Extended Community is added to an 1064 Inclusive Multicast route: 1066 - BD/MAC-VRF attributes MTU, Control Word and Flow Label are 1067 conveyed, and; 1069 - per-ESI attributes P, B MUST be zero.
1071 +-------------------------------------------+ 1072 | Type (0x06) / Sub-type (0x04) (2 octets) | 1073 +-------------------------------------------+ 1074 | Control Flags (2 octets) | 1075 +-------------------------------------------+ 1076 | L2 MTU (2 octets) | 1077 +-------------------------------------------+ 1078 | Reserved (2 octets) | 1079 +-------------------------------------------+ 1081 1 1 1 1 1 1082 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 1083 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1084 | MBZ | MBZ |F|C|MBZ| (MBZ = MUST Be Zero) 1085 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1087 The EVPN Layer 2 Attributes Extended Community is included on 1088 Ethernet A-D per EVI route and: 1090 - per-ESI attributes P, B are conveyed, and; 1092 - BD/MAC-VRF attributes MTU, Control Word and Flow Label MUST be 1093 zero. 1095 +-------------------------------------------+ 1096 | Type (0x06) / Sub-type (0x04) (2 octets) | 1097 +-------------------------------------------+ 1098 | Control Flags (2 octets) | 1099 +-------------------------------------------+ 1100 | MBZ (2 octets) | 1101 +-------------------------------------------+ 1102 | Reserved (2 octets) | 1103 +-------------------------------------------+ 1105 1 1 1 1 1 1106 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 1107 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1108 | MBZ | MBZ |P|B| (MBZ = MUST Be Zero) 1109 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1111 Note that in both of the above cases, the values conveyed in this 1112 extended community are at the granularity of an individual EVI (or 1113 [EVI, BD] for VLAN-aware bundle) and hence may vary for different 1114 EVIs. 1116 7.12. Route Prioritization 1118 In order to achieve the Fast Convergence referred to in (Section 8.2 1119 ("Fast Convergence")), BGP speakers SHOULD prioritise advertisement, 1120 processing and redistribution of routes based on relative scale of 1121 priority vs. expected or average scale. 1123 1. 
Ethernet A-D per ES (Mass-Withdraw Route Type 1) and Ethernet 1124 Segment (Route Type 4) are lower scale and highly convergence 1125 affecting, and SHOULD be handled in first order of priority 1127 2. Ethernet A-D per EVI, Inclusive Multicast Ethernet Tag route, and 1128 IP Prefix route defined in 1129 [I-D.ietf-bess-evpn-prefix-advertisement] are sent for each 1130 Bridge or AC at medium scale and may be convergence affecting, 1131 and SHOULD be handled in second order of priority 1133 3. MAC advertisement route (zero and nonzero IP portion), Multicast 1134 Join Sync and Multicast Leave Sync routes defined in 1135 [I-D.ietf-bess-evpn-igmp-mld-proxy] are considered 'individual 1136 routes' and very-high scale or of relatively low importance for 1137 fast convergence and SHOULD be handled in last order of priority. 1139 8. Multihoming Functions 1141 This section discusses the functions, procedures, and associated BGP 1142 routes used to support multihoming in EVPN. This covers both 1143 multihomed device (MHD) and multihomed network (MHN) scenarios. 1145 8.1. Multihomed Ethernet Segment Auto-discovery 1147 PEs connected to the same Ethernet segment can automatically discover 1148 each other with minimal to no configuration through the exchange of 1149 the Ethernet Segment route. 1151 8.1.1. Constructing the Ethernet Segment Route 1153 The Route Distinguisher (RD) MUST be a Type 1 RD [RFC4364]. The 1154 value field comprises an IP address of the PE (typically, the 1155 loopback address) followed by a number unique to the PE. 1157 The Ethernet Segment Identifier (ESI) MUST be set to the 10-octet 1158 value described in Section 5. 1160 The BGP advertisement that advertises the Ethernet Segment route MUST 1161 also carry an ES-Import Route Target, as defined in Section 7.6. 1163 The Ethernet Segment route filtering MUST be done such that the 1164 Ethernet Segment route is imported only by the PEs that are 1165 multihomed to the same Ethernet segment. 
To that end, each PE that 1166 is connected to a particular Ethernet segment constructs an import 1167 filtering rule to import a route that carries the ES-Import Route 1168 Target, constructed from the ESI. 1170 8.2. Fast Convergence 1172 In EVPN, MAC address reachability is learned via the BGP control 1173 plane over the MPLS network. As such, in the absence of any fast 1174 protection mechanism, the network convergence time is a function of 1175 the number of MAC/IP Advertisement routes that must be withdrawn by 1176 the PE encountering a failure. For highly scaled environments, this 1177 scheme yields slow convergence. 1179 To alleviate this, EVPN defines a mechanism to efficiently and 1180 quickly signal, to remote PE nodes, the need to update their 1181 forwarding tables upon the occurrence of a failure in connectivity to 1182 an Ethernet segment. This is done by having each PE advertise a set 1183 of one or more Ethernet A-D per ES routes for each locally attached 1184 Ethernet segment (refer to Section 8.2.1 below for details on how 1185 these routes are constructed). A PE may need to advertise more than 1186 one Ethernet A-D per ES route for a given ES because the ES may be in 1187 a multiplicity of EVIs and the RTs for all of these EVIs may not fit 1188 into a single route. Advertising a set of Ethernet A-D per ES routes 1189 for the ES allows each route to contain a subset of the complete set 1190 of RTs. Each Ethernet A-D per ES route is differentiated from the 1191 other routes in the set by a different Route Distinguisher (RD). 1193 Upon a failure in connectivity to the attached segment, the PE 1194 withdraws the corresponding set of Ethernet A-D per ES routes. This 1195 triggers all PEs that receive the withdrawal to update their next-hop 1196 adjacencies for all MAC addresses associated with the Ethernet 1197 segment in question. 
If no other PE had advertised an Ethernet A-D 1198 per ES route for the same segment, then the PE that received the 1199 withdrawal simply invalidates the MAC entries for that segment. 1200 Otherwise, the PE updates its next-hop adjacencies accordingly. 1202 8.2.1. Constructing Ethernet A-D per Ethernet Segment Route 1204 This section describes the procedures used to construct the Ethernet 1205 A-D per ES route, which is used for fast convergence (as discussed 1206 above) and for advertising the ESI label used for split-horizon 1207 filtering (as discussed in Section 8.3). Support of this route is 1208 REQUIRED. 1210 The Route Distinguisher (RD) MUST be a Type 1 RD [RFC4364]. The 1211 value field comprises an IP address of the PE (typically, the 1212 loopback address) followed by a number unique to the PE. 1214 The Ethernet Segment Identifier MUST be a 10-octet entity as 1215 described in Section 5 ("Ethernet Segment"). The Ethernet A-D route 1216 is not needed when the Segment Identifier is set to 0 (e.g., single- 1217 homed scenarios). An exception to this rule is described in 1218 [RFC8317]. 1220 The Ethernet Tag ID MUST be set to MAX-ET. 1222 The MPLS label in the NLRI MUST be set to 0. 1224 The ESI Label extended community MUST be included in the route. If 1225 All-Active redundancy mode is desired, then the "Single-Active" bit 1226 in the flags of the ESI Label extended community MUST be set to 0 and 1227 the MPLS label in that Extended Community MUST be set to a valid MPLS 1228 label value. The MPLS label in this Extended Community is referred 1229 to as the ESI label and MUST have the same value in each Ethernet A-D 1230 per ES route advertised for the ES. This label MUST be a downstream 1231 assigned MPLS label if the advertising PE is using ingress 1232 replication for receiving multicast, broadcast, or unknown unicast 1233 traffic from other PEs. 
If the advertising PE is using P2MP MPLS 1234 LSPs for sending multicast, broadcast, or unknown unicast traffic, 1235 then this label MUST be an upstream assigned MPLS label, unless DCB 1236 allocated labels are used. The usage of this label is described in 1237 Section 8.3. 1239 If Single-Active redundancy mode is desired, then the "Single-Active" 1240 bit in the flags of the ESI Label extended community MUST be set to 1 1241 and the ESI label SHOULD be set to a valid MPLS label value. 1243 8.2.1.1. Ethernet A-D Route Targets 1245 Each Ethernet A-D per ES route MUST carry one or more Route Target 1246 (RT) attributes. The set of Ethernet A-D per ES routes MUST carry 1247 the entire set of RTs for all the EVPN instances to which the 1248 Ethernet segment belongs. 1250 8.3. Split Horizon 1252 Consider a CE that is multihomed to two or more PEs on an Ethernet 1253 segment ES1 operating in All-Active redundancy mode. If the CE sends 1254 a broadcast, unknown unicast, or multicast (BUM) packet to one of the 1255 Non-Designated Forwarder (Non-DF) PEs, say PE1, then PE1 will forward 1256 that packet to all or a subset of the other PEs in that EVPN 1257 instance, including the DF PE for that Ethernet segment. In this 1258 case, the DF PE to which the CE is multihomed MUST drop the packet 1259 and not forward it back to the CE. This filtering is referred to as 1260 "split-horizon filtering" in this document. 1262 When a set of PEs is operating in Single-Active redundancy mode, the 1263 use of this split-horizon filtering mechanism is highly recommended 1264 because it prevents transient loops at the time of failure or 1265 recovery that would impact the Ethernet segment -- e.g., when two PEs 1266 each believe they are the DF for that segment before the DF election 1267 procedure settles down.
1269 In order to achieve this split-horizon function, every BUM packet 1270 originating from a Non-DF PE is encapsulated with an MPLS label that 1271 identifies the Ethernet segment of origin (i.e., the segment from 1272 which the frame entered the EVPN network). This label is referred to 1273 as the ESI label and MUST be distributed by all PEs when operating in 1274 All-Active redundancy mode using a set of Ethernet A-D per ES routes, 1275 per Section 8.2.1 above. The ESI label SHOULD be distributed by all 1276 PEs when operating in Single-Active redundancy mode using a set of 1277 Ethernet A-D per ES routes. These routes are imported by the PEs 1278 connected to the Ethernet segment and also by the PEs that have at 1279 least one EVPN instance in common with the Ethernet segment in the 1280 route. As described in Section 8.2.1, the route MUST carry an ESI 1281 Label extended community with a valid ESI label. The disposition PE 1282 relies on the value of the ESI label to determine whether or not a 1283 BUM frame is allowed to egress a specific Ethernet segment. 1285 8.3.1. ESI Label Assignment 1287 The following subsections describe the assignment procedures for the 1288 ESI label, which differ depending on the type of tunnels being used 1289 to deliver multi-destination packets in the EVPN network. 1291 8.3.1.1. Ingress Replication 1293 Each PE that operates in All-Active or Single-Active redundancy mode 1294 and that uses ingress replication to receive BUM traffic advertises a 1295 downstream assigned ESI label in the set of Ethernet A-D per ES 1296 routes for its attached ES. This label MUST be programmed in the 1297 platform label space by the advertising PE, and the forwarding entry 1298 for this label must result in NOT forwarding packets received with 1299 this label onto the Ethernet segment for which the label was 1300 distributed.
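The effect of such a forwarding entry on the advertising (egress) PE can be pictured with the following non-normative sketch. The function allowed_egress_segments and the table esi_label_table, which maps each downstream assigned ESI label to the local Ethernet segment it was distributed for, are hypothetical names introduced for illustration. Designated Forwarder filtering is deliberately not modeled, and a label that is absent from the table is treated as out of scope here (the lookup raises KeyError).

```python
from typing import Dict, List, Optional


def allowed_egress_segments(
    local_segments: List[str],
    esi_label_table: Dict[int, str],
    esi_label: Optional[int],
) -> List[str]:
    """Return the local Ethernet segments a received BUM packet may egress.

    esi_label is the ESI label found in the packet, or None when the
    ingress PE did not include one.
    """
    if esi_label is None:
        # No ESI label: split-horizon filtering excludes nothing.
        return list(local_segments)
    # The segment this label was distributed for must not receive the packet.
    blocked = esi_label_table[esi_label]
    return [es for es in local_segments if es != blocked]
```

For instance, with labels 1001 and 1002 distributed for ES1 and ES2, a packet carrying label 1001 would still be eligible for ES2 and any other local segment, but not for ES1.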
1302 The rules for the inclusion of the ESI label in a BUM packet by the 1303 ingress PE operating in All-Active redundancy mode are as follows: 1305 - A Non-DF ingress PE MUST include the ESI label distributed by the 1306 DF egress PE in the copy of a BUM packet sent to it. 1308 - An ingress PE (DF or Non-DF) SHOULD include the ESI label 1309 distributed by each Non-DF egress PE in the copy of a BUM packet 1310 sent to it. 1312 The rule for the inclusion of the ESI label in a BUM packet by the 1313 ingress PE operating in Single-Active redundancy mode is as follows: 1315 - An ingress DF PE SHOULD include the ESI label distributed by the 1316 egress PE in the copy of a BUM packet sent to it. 1318 In both All-Active and Single-Active redundancy mode, an ingress PE 1319 MUST NOT include an ESI label in the copy of a BUM packet sent to an 1320 egress PE that is not attached to the ES through which the BUM packet 1321 entered the EVI. 1323 As an example, consider PE1 and PE2, which are multihomed to CE1 on 1324 ES1 and operating in All-Active multihoming mode. Further, consider 1325 that PE1 is using P2P or MP2P LSPs to send packets to PE2. Consider 1326 that PE1 is the Non-DF for VLAN1 and PE2 is the DF for VLAN1, and PE1 1327 receives a BUM packet from CE1 on VLAN1 on ES1. In this scenario, 1328 PE2 distributes an Inclusive Multicast Ethernet Tag route for VLAN1 1329 corresponding to an EVPN instance. So, when PE1 sends a BUM packet 1330 that it receives from CE1, it MUST first push onto the MPLS label 1331 stack the ESI label that PE2 has distributed for ES1. It MUST then 1332 push onto the MPLS label stack the MPLS label distributed by PE2 in 1333 the Inclusive Multicast Ethernet Tag route for VLAN1. The resulting 1334 packet is further encapsulated in the P2P or MP2P LSP label stack 1335 required to transmit the packet to PE2. 
When PE2 receives this 1336 packet, it determines, from the top MPLS label, the set of ESIs to 1337 which it will replicate the packet after any P2P or MP2P LSP labels 1338 have been removed. If the next label is the ESI label assigned by 1339 PE2 for ES1, then PE2 MUST NOT forward the packet onto ES1. If the 1340 next label is an ESI label that has not been assigned by PE2, then 1341 PE2 MUST drop the packet. It should be noted that in this scenario, 1342 if PE2 receives a BUM packet for VLAN1 from CE1, then it SHOULD 1343 encapsulate the packet with an ESI label received from PE1 when 1344 sending it to PE1 in order to avoid any transient loops during a 1345 failure scenario that would impact ES1 (e.g., port or link failure). 1347 8.3.1.2. P2MP MPLS LSPs 1349 The Non-DF PEs that operate in All-Active redundancy mode and that 1350 use P2MP LSPs to send BUM traffic advertise an upstream assigned ESI 1351 label in the set of Ethernet A-D per ES routes for their common 1352 attached ES. This label is upstream assigned by the PE that 1353 advertises the route. This label MUST be programmed by the other PEs 1354 that are connected to the ESI advertised in the route, in the context 1355 label space for the advertising PE. Further, the forwarding entry 1356 for this label must result in NOT forwarding packets received with 1357 this label onto the Ethernet segment for which the label was 1358 distributed. This label MUST also be programmed by the other PEs 1359 that import the route but are not connected to the ESI advertised in 1360 the route, in the context label space for the advertising PE. 1361 Further, the forwarding entry for this label must be a label pop with 1362 no other associated action. 1364 The DF PE that operates in Single-Active redundancy mode and that 1365 uses P2MP LSPs to send BUM traffic should advertise an upstream 1366 assigned ESI label in the set of Ethernet A-D per ES routes for its 1367 attached ES, just as described in the previous paragraph. 
1369 As an example, consider PE1 and PE2, which are multihomed to CE1 on 1370 ES1 and operating in All-Active multihoming mode. Also, consider 1371 that PE3 belongs to one of the EVPN instances of ES1. Further, 1372 assume that PE1, which is the Non-DF, is using P2MP MPLS LSPs to send 1373 BUM packets. When PE1 sends a BUM packet that it receives from CE1, 1374 it MUST first push onto the MPLS label stack the ESI label that it 1375 has assigned for the ESI on which the packet was received. The 1376 resulting packet is further encapsulated in the P2MP MPLS label stack 1377 necessary to transmit the packet to the other PEs. Penultimate hop 1378 popping MUST be disabled on the P2MP LSPs used in the MPLS transport 1379 infrastructure for EVPN. When PE2 receives this packet, it 1380 decapsulates the top MPLS label and forwards the packet using the 1381 context label space determined by the top label. If the next label 1382 is the ESI label assigned by PE1 to ES1, then PE2 MUST NOT forward 1383 the packet onto ES1. When PE3 receives this packet, it decapsulates 1384 the top MPLS label and forwards the packet using the context label 1385 space determined by the top label. If the next label is the ESI 1386 label assigned by PE1 to ES1 and PE3 is not connected to ES1, then 1387 PE3 MUST pop the label and flood the packet over all local ESIs in 1388 that EVPN instance. It should be noted that when PE2 sends a BUM 1389 frame over a P2MP LSP, it should encapsulate the frame with an ESI 1390 label even though it is the DF for that VLAN, in order to avoid any 1391 transient loops during a failure scenario that would impact ES1 1392 (e.g., port or link failure). 1394 8.3.1.3. MP2MP MPLS LSPs 1396 The procedures for MP2MP tunnels follow Section 8.3.1.2, with the 1397 exceptions described in this section. 1399 When MP2MP tunnels are used, ESI-labels MUST be allocated from a DCB 1400 and the same label must be used by all the PEs attached to the same 1401 Ethernet Segment. 
1403 In that way, any egress PE with local Ethernet Segments can identify 1404 the source ES of the received BUM packets. 1406 8.4. Aliasing and Backup Path 1408 In the case where a CE is multihomed to multiple PE nodes, using a 1409 Link Aggregation Group (LAG) with All-Active redundancy, it is 1410 possible that only a single PE learns a set of the MAC addresses 1411 associated with traffic transmitted by the CE. This leads to a 1412 situation where remote PE nodes receive MAC/IP Advertisement routes 1413 for these addresses from a single PE, even though multiple PEs are 1414 connected to the multihomed segment. As a result, the remote PEs are 1415 not able to effectively load balance traffic among the PE nodes 1416 connected to the multihomed Ethernet segment. This could be the 1417 case, for example, when the PEs perform data-plane learning on the 1418 access, and the load-balancing function on the CE hashes traffic from 1419 a given source MAC address to a single PE. 1421 Another scenario where this occurs is when the PEs rely on control- 1422 plane learning on the access (e.g., using ARP), since ARP traffic 1423 will be hashed to a single link in the LAG. 1425 To address this issue, EVPN introduces the concept of 'aliasing', 1426 which is the ability of a PE to signal that it has reachability to an 1427 EVPN instance on a given ES even when it has learned no MAC addresses 1428 from that EVI/ES. The Ethernet A-D per EVI route is used for this 1429 purpose. A remote PE that receives a MAC/IP Advertisement route with 1430 a non-reserved ESI SHOULD consider the advertised MAC address to be 1431 reachable via all PEs that have advertised reachability to that MAC 1432 address's EVI/ES/Ethernet Tag ID via the combination of an Ethernet 1433 A-D per EVI route for that EVI/ES/Ethernet Tag ID AND Ethernet A-D 1434 per ES routes for that ES with the "Single-Active" bit in the flags 1435 of the ESI Label extended community set to 0. 
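The aliasing rule above can be sketched as a set intersection. This is a minimal illustration only: the record types and the function name are hypothetical, ESIs and PEs are represented as plain strings, and the sketch covers nothing beyond the combination of routes named in the preceding paragraph.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AdPerEvi:
    pe: str
    esi: str
    ethernet_tag: int


@dataclass(frozen=True)
class AdPerEs:
    pe: str
    esi: str
    single_active: bool  # "Single-Active" bit of the ESI Label extended community


def aliasing_next_hops(esi, ethernet_tag, ad_per_evi_routes, ad_per_es_routes):
    """PEs via which a MAC on (esi, ethernet_tag) is considered reachable."""
    # PEs that advertised Ethernet A-D per ES routes with Single-Active = 0.
    all_active = {r.pe for r in ad_per_es_routes
                  if r.esi == esi and not r.single_active}
    # PEs that advertised an Ethernet A-D per EVI route for this ES/Ethernet Tag.
    per_evi = {r.pe for r in ad_per_evi_routes
               if r.esi == esi and r.ethernet_tag == ethernet_tag}
    # Both route types are required, so only the intersection qualifies.
    return all_active & per_evi
```

A PE that has advertised only one of the two route types is therefore not used for load balancing.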
1437 Note that the Ethernet A-D per EVI route may be received by a remote 1438 PE before it receives the set of Ethernet A-D per ES routes. 1439 Therefore, in order to handle corner cases and race conditions, the 1440 Ethernet A-D per EVI route MUST NOT be used for traffic forwarding by 1441 a remote PE until it also receives the associated set of Ethernet A-D 1442 per ES routes. 1444 The backup path is a closely related function, but it is used in 1445 Single-Active redundancy mode. In this case, a PE also advertises 1446 that it has reachability to a given EVI/ES using the same combination 1447 of Ethernet A-D per EVI route and Ethernet A-D per ES route as 1448 discussed above, but with the "Single-Active" bit in the flags of the 1449 ESI Label extended community set to 1. A remote PE that receives a 1450 MAC/IP Advertisement route with a non-reserved ESI SHOULD consider 1451 the advertised MAC address to be reachable via any PE that has 1452 advertised this combination of Ethernet A-D routes, and it SHOULD 1453 install a backup path for that MAC address. 1455 Please see Section 14.1.1 for a description of the backup path 1456 operation. 1458 8.4.1. Constructing Ethernet A-D per EVPN Instance Route 1460 This section describes the procedures used to construct the Ethernet 1461 A-D per EVPN instance (EVI) route, which is used for aliasing (as 1462 discussed above). Support of this route is OPTIONAL. 1464 The Route Distinguisher (RD) MUST be set per Section 7.9. 1466 The Ethernet Segment Identifier MUST be a 10-octet entity as 1467 described in Section 5 ("Ethernet Segment"). The Ethernet A-D route 1468 is not needed when the Segment Identifier is set to 0. 1470 The Ethernet Tag ID is set as defined in Section 6. 1472 Note that the above allows the Ethernet A-D per EVI route to be 1473 advertised with one of the following granularities: 1475 + One Ethernet A-D route per <ESI, Ethernet Tag ID> tuple per 1476 MAC-VRF.
This is applicable when the PE uses MPLS-based 1477 disposition with VID translation or may be applicable when the 1478 PE uses MAC-based disposition with VID translation. 1480 + One Ethernet A-D route for each <ESI> per MAC-VRF (where the 1481 Ethernet Tag ID is set to 0). This is applicable when the PE uses 1482 MAC-based disposition or MPLS-based disposition without VID 1483 translation. 1485 The usage of the MPLS label is described in Section 14 1486 ("Load Balancing of Unicast Packets"). 1488 The Next Hop field of the MP_REACH_NLRI attribute of the route MUST 1489 be set to the IPv4 or IPv6 address of the advertising PE. 1491 The Ethernet A-D per EVI route MUST carry one or more Route Target 1492 (RT) attributes, per Section 7.10. 1494 8.5. Designated Forwarder Election 1496 Consider a CE that is a host or a router that is multihomed directly 1497 to more than one PE in an EVPN instance on a given Ethernet segment. 1498 In this scenario, only one of the PEs, referred to as the Designated 1499 Forwarder (DF), is responsible for certain actions: 1501 - Sending broadcast and multicast traffic for a given EVI to that CE. 1503 - If the flooding of unknown unicast traffic (i.e., traffic for which 1504 a PE does not know the destination MAC address, see Section 12) is 1505 allowed, sending unknown unicast traffic for a given EVI to that 1506 CE. 1508 - If the multihoming mode is Single-Active, sending (known) unicast 1509 traffic for a given EVI to that CE. 1511 Note that this behavior, which allows selecting a DF at the 1512 granularity of for is the default behavior in this 1513 specification. 1515 In this same scenario, a second PE, referred to as the 1516 Backup-Designated Forwarder (Backup-DF or BDF), is responsible for 1517 assuming the role of DF in the event of the DF's failure. Until this 1518 occurs, the Backup-DF PE is treated as, and behaves like, a Non-DF 1519 PE for all forwarding considerations.
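As a rough illustration of the modulo rule that steps 3 and 4 of the service carving procedure in this section specify, the following Python sketch computes the DF and the Backup-DF for one EVI. The function name and the return shape are hypothetical; the sketch assumes that all candidate addresses belong to the same address family and that the Ethernet tag V is available as an integer.

```python
import ipaddress


def service_carving(pe_addresses, ethernet_tag):
    """Return (df, bdf) for one EVI, following the modulo rule of steps 3 and 4.

    pe_addresses: "Originating Router's IP address" values taken from the
                  Ethernet Segment routes of all PEs on the segment
                  (including the local PE).
    ethernet_tag: the value V used in the modulo function.
    """
    # Step 3: order candidates by increasing numeric value; ordinal 0 is lowest.
    candidates = sorted({ipaddress.ip_address(a) for a in pe_addresses}, key=int)
    if not candidates:
        raise ValueError("no PE is attached to the Ethernet segment")
    df = candidates[ethernet_tag % len(candidates)]

    # Step 4: remove the DF and repeat the rule over the remaining M PEs.
    backups = [pe for pe in candidates if pe != df]
    bdf = backups[ethernet_tag % len(backups)] if backups else None
    return str(df), (str(bdf) if bdf is not None else None)
```

With PEs 10.0.0.1, 10.0.0.2, and 10.0.0.3 and V = 100, this yields 10.0.0.2 as the DF (100 mod 3 = 1) and 10.0.0.1 as the Backup-DF (100 mod 2 = 0). A segment with a single PE has a DF but no Backup-DF.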
1521 All other PEs, referred to as Non-Designated Forwarders (Non-DF or 1522 NDF), are not responsible for any forwarding or for assuming any 1523 functionality from the DF in the event of its failure. 1525 The default procedure for DF election at the granularity of 1526 is referred to as "service carving". With service carving, it is 1527 possible to perform load-balancing of traffic destined to a given 1528 segment. The load-balancing procedure carves the set of EVIs on that 1529 ES among the PE nodes evenly such that every PE is the DF for a 1530 disjoint and distinct set of EVIs for that ES. The procedure for 1531 service carving, which follows the DF Election Finite 1532 State Machine defined in [RFC8584] Section 2.1, is as follows: 1534 1. When a PE discovers the ESI of the attached Ethernet segment, 1535 it advertises an Ethernet Segment route with the associated 1536 ES-Import extended community attribute. 1538 2. The PE then starts a timer (default value = 3 seconds) to allow 1539 the reception of Ethernet Segment routes from other PE nodes 1540 connected to the same Ethernet segment. This timer value should 1541 be the same across all PEs connected to the same Ethernet 1542 segment. 1544 3. When the timer expires, each PE builds an ordered list of the IP 1545 addresses of all the PE nodes connected to the Ethernet segment 1546 (including itself), in increasing numeric value. Each IP address 1547 in this list is extracted from the "Originating Router's IP 1548 address" field of the advertised Ethernet Segment route. Every 1549 PE is then given an ordinal indicating its position in the 1550 ordered list, starting with 0 as the ordinal for the PE with the 1551 numerically lowest IP address.
The ordinals are used to 1552 determine which PE node will be the DF for a given EVPN instance 1553 on the Ethernet segment, using the following rule: 1555 Assuming a redundancy group of N PE nodes, the PE with ordinal i 1556 is the DF for an when (V mod N) = i, where V is the 1557 Ethernet tag for that EVI. For VLAN-Aware Bundle service, then 1558 the numerically lowest Ethernet tag in that EVI MUST be used in 1559 the modulo function. 1561 It should be noted that using the "Originating Router's IP 1562 address" field in the Ethernet Segment route to get the PE IP 1563 address needed for the ordered list allows for a CE to be 1564 multihomed across different ASes if such a need ever arises. 1566 4. For each EVPN instance, a second list of the IP addresses of all 1567 the PE nodes connected to the Ethernet segment is built. The PE 1568 which was determined as DF above is removed from that ordered 1569 candidate list, forming a backup redundancy group of M PE nodes. 1570 Every remaining PE is then given a second ordinal indicating its 1571 position in the secondary ordered list according to the same 1572 criteria as in step 3 above. 1574 The second ordinals are used to determine which PE nodes will be 1575 the BDF for a given EVPN instance on the Ethernet segment, using 1576 the same modulo rule as above, (V mod M) = i. 1578 5. The PE that is elected as a DF for a given will unblock 1579 BUM traffic, or all traffic if in Single-Active mode, for that 1580 EVI on the corresponding ES. Note that the DF PE unblocks BUM 1581 traffic in the egress direction towards the segment. All Non-DF 1582 PEs, including the Backup-DF PE, continue to drop 1583 multi-destination traffic in the egress direction towards that 1584 . 1586 In the case of link or port failure, the affected PE withdraws 1587 its Ethernet Segment route. 
This will re-trigger the service 1588 carving procedures on all the PEs in the redundancy group: the 1589 expected new DF will be the BDF previously calculated in step 4. For 1590 PE node failure, or upon PE commissioning or decommissioning, the 1591 PEs re-trigger the service carving. In the case of Single-Active 1592 multihoming, when a service moves from one PE in the redundancy 1593 group to another PE as a result of re-carving, the PE that ends 1594 up being the elected DF for the service SHOULD trigger a MAC 1595 address flush notification towards the associated Ethernet 1596 segment. This can be done, for example, using the IEEE 802.1ak 1597 Multiple VLAN Registration Protocol (MVRP) 'new' declaration. 1599 It is RECOMMENDED that all future DF Election algorithms specify 1600 how to select one Designated Forwarder (DF) PE, one Backup-DF 1601 PE, and a residual number of Non-DF PE(s). 1603 8.6. Signaling Primary and Backup DF Elected PEs 1605 Once the Primary and Backup DF Elected PEs for a given are 1606 determined, the multihomed PEs for that ES will each advertise an 1607 Ethernet A-D per EVI route for that EVI and each will include an 1608 L2-Attr extended community with the P and B bits set to reflect the 1609 advertising PE's role for that EVI. 1611 It should be noted that if the L2-Attr extended community is included for 1612 All-Active mode, then the P bit must be set for all PEs in the redundancy 1613 group. 1615 8.7. Interoperability with Single-Homing PEs 1617 Let's refer to PEs that only support single-homed CE devices as 1618 single-homing PEs.
For single-homing PEs, all the above multihoming 1619 procedures can be omitted; however, to allow for single-homing PEs 1620 to fully interoperate with multihoming PEs, some of the multihoming 1621 procedures described above SHOULD be supported even by single- 1622 homing PEs: 1624 - procedures related to processing Ethernet A-D routes for the 1625 purpose of fast convergence (Section 8.2 ("Fast Convergence")), to 1626 let single-homing PEs benefit from fast convergence 1628 - procedures related to processing Ethernet A-D routes for the 1629 purpose of aliasing (Section 8.4 ("Aliasing and Backup Path")), to 1630 let single-homing PEs benefit from load balancing 1632 - procedures related to processing Ethernet A-D routes for the 1633 purpose of a backup path (Section 8.4 1634 ("Aliasing and Backup Path")), to let single-homing PEs benefit 1635 from the corresponding convergence improvement 1637 9. Determining Reachability to Unicast MAC Addresses 1639 PEs forward packets that they receive based on the destination MAC 1640 address. This implies that PEs must be able to learn how to reach a 1641 given destination unicast MAC address. 1643 There are two components to MAC address learning -- "local learning" 1644 and "remote learning": 1646 9.1. Local Learning 1648 A particular PE must be able to learn the MAC addresses from the CEs 1649 that are connected to it. This is referred to as local learning. 1651 The PEs in a particular EVPN instance MUST support local data-plane 1652 learning using standard IEEE Ethernet learning procedures. A PE must 1653 be capable of learning MAC addresses in the data plane when it 1654 receives packets such as the following from the CE network: 1656 - DHCP requests 1658 - An ARP Request for its own MAC 1660 - An ARP Request for a peer 1662 Alternatively, PEs MAY learn the MAC addresses of the CEs in the 1663 control plane or via management-plane integration between the PEs and 1664 the CEs. 
1666 There are applications where a MAC address that is reachable via a 1667 given PE on a locally attached segment (e.g., with ESI X) may move, 1668 such that it becomes reachable via another PE on another segment 1669 (e.g., with ESI Y). This is referred to as "MAC Mobility". 1670 Procedures to support this are described in Section 15 1671 ("MAC Mobility"). 1673 9.2. Remote Learning 1675 A particular PE must be able to determine how to send traffic to MAC 1676 addresses that belong to or are behind CEs connected to other PEs, 1677 i.e., to remote CEs or hosts behind remote CEs. We call such MAC 1678 addresses "remote" MAC addresses. 1680 This document requires a PE to learn remote MAC addresses in the 1681 control plane. In order to achieve this, each PE advertises the MAC 1682 addresses it learns from its locally attached CEs in the control 1683 plane, to all the other PEs in that EVPN instance, using MP-BGP and, 1684 specifically, the MAC/IP Advertisement route. 1686 9.2.1. Constructing MAC/IP Address Advertisement 1688 BGP is extended to advertise these MAC addresses using the MAC/IP 1689 Advertisement route type in the EVPN NLRI. 1691 The RD MUST be set per Section 7.9. 1693 The Ethernet Segment Identifier is set to the 10-octet ESI described 1694 in Section 5 ("Ethernet Segment"). 1696 The Ethernet Tag ID is set as defined in Section 6. 1698 The MAC Address Length field is in bits, and it is set to 48. MAC 1699 address length values other than 48 bits are outside the scope of 1700 this document. The encoding of a MAC address MUST be the 6-octet MAC 1701 address specified by [IEEE.802.1Q_2014] and [IEEE.802.1D_2004]. 1703 The IP Address field is optional. By default, the IP Address Length 1704 field is set to 0, and the IP Address field is omitted from the 1705 route. When a valid IP address needs to be advertised, it is then 1706 encoded in this route. 
When an IP address is present, the IP Address 1707 Length field is in bits, and it is set to 32 or 128 bits. Other IP 1708 Address Length values are outside the scope of this document. The 1709 encoding of an IP address MUST be either 4 octets for IPv4 or 1710 16 octets for IPv6. The Length field of the EVPN NLRI (which is in 1711 octets and is described in Section 7) is sufficient to determine 1712 whether an IP address is encoded in this route and, if so, whether 1713 the encoded IP address is IPv4 or IPv6. 1715 The MPLS Label1 field is encoded as 3 octets, where the high-order 1716 20 bits contain the label value. The MPLS Label1 MUST be downstream 1717 assigned, and it is associated with the MAC address being advertised 1718 by the advertising PE. The advertising PE uses this label when it 1719 receives an MPLS-encapsulated packet to perform forwarding based on 1720 the destination MAC address toward the CE. The forwarding procedures 1721 are specified in Sections 13 and 14. 1723 A PE may advertise the same single EVPN label for all MAC addresses 1724 in a given MAC-VRF. This label assignment is referred to as a per 1725 MAC-VRF label assignment. Alternatively, a PE may advertise a unique 1726 EVPN label per combination. This label 1727 assignment is referred to as a per label 1728 assignment. As a third option, a PE may advertise a unique EVPN 1729 label per combination. This label assignment is 1730 referred to as a per label assignment. As a 1731 fourth option, a PE may advertise a unique EVPN label per MAC 1732 address. This label assignment is referred to as a per MAC label 1733 assignment. All of these label assignment methods have their 1734 trade-offs. The choice of a particular label assignment methodology 1735 is purely local to the PE that originates the route. 1737 An assignment per MAC-VRF label requires the least number of EVPN 1738 labels but requires a MAC lookup in addition to an MPLS lookup on an 1739 egress PE for forwarding. 
On the other hand, a unique label per 1740 or a unique label per MAC allows an egress PE to 1741 forward a packet that it receives from another PE, to the connected 1742 CE, after looking up only the MPLS labels without having to perform a 1743 MAC lookup. This includes the capability to perform appropriate VLAN 1744 ID translation on egress to the CE. 1746 The MPLS Label2 field is an optional field. If it is present, then 1747 it is encoded as 3 octets, where the high-order 20 bits contain the 1748 label value. 1750 The Next Hop field of the MP_REACH_NLRI attribute of the route MUST 1751 be set to the IPv4 or IPv6 address of the advertising PE. 1753 The BGP advertisement for the MAC/IP Advertisement route MUST also 1754 carry one or more Route Target (RT) attributes. RTs may be 1755 configured (as in IP VPNs) or may be derived automatically in the 1756 "Unique VLAN EVPN" case from the Ethernet Tag (VLAN ID), as described 1757 in Section 7.10.1. 1759 It is to be noted that this document does not require PEs to create 1760 forwarding state for remote MACs when they are learned in the control 1761 plane. When this forwarding state is actually created is a local 1762 implementation matter. 1764 9.2.2. Route Resolution 1766 If the Ethernet Segment Identifier field in a received MAC/IP 1767 Advertisement route is set to the reserved ESI value of 0 or MAX-ESI, 1768 then if the receiving PE decides to install forwarding state for the 1769 associated MAC address, it MUST be based on the MAC/IP Advertisement 1770 route alone. 1772 If the Ethernet Segment Identifier field in a received MAC/IP 1773 Advertisement route is set to a non-reserved ESI, and the receiving 1774 PE is locally attached to the same ESI, then the PE does not alter 1775 its forwarding state based on the received route. This ensures that 1776 local routes are preferred to remote routes. 
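The resolution cases of this section, including the dependency on Ethernet A-D per ES routes that the section also requires for a non-reserved ESI, can be summarized in a small decision function. This is a sketch under stated assumptions: the function name and the returned strings are hypothetical, ESIs are modelled as 80-bit integers, and MAX-ESI is taken to be the all-ones 10-octet value.

```python
RESERVED_ESIS = {0, (1 << 80) - 1}  # 0 and MAX-ESI (ten octets of 0xFF)


def resolve_mac_route(esi, local_esis, ad_per_es_received):
    """Classify a received MAC/IP Advertisement route for installation.

    esi:                ESI carried in the received route (integer).
    local_esis:         ESIs the receiving PE is locally attached to.
    ad_per_es_received: True once the associated set of Ethernet A-D per ES
                        routes has been received.
    """
    if esi in RESERVED_ESIS:
        # Forwarding state, if installed, is based on the MAC/IP route alone.
        return "may-install-from-mac-route-alone"
    if esi in local_esis:
        # Local routes are preferred: the received route changes nothing.
        return "keep-local-state"
    if ad_per_es_received:
        return "may-install"
    # Wait for the Ethernet A-D per ES routes before installing anything.
    return "defer"
```

In every "may-install" case, whether forwarding state is actually created remains a local decision of the receiving PE.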
1778 If the Ethernet Segment Identifier field in a received MAC/IP 1779 Advertisement route is set to a non-reserved ESI, then if the 1780 receiving PE decides to install forwarding state for the associated 1781 MAC address, it MUST be when both the MAC/IP Advertisement route AND 1782 the associated set of Ethernet A-D per ES routes have been received. 1783 The dependency of MAC route installation on Ethernet A-D per ES 1784 routes is to ensure that MAC routes don't get accidentally installed 1785 during a mass withdraw period. 1787 To illustrate this with an example, consider two PEs (PE1 and PE2) 1788 connected to a multihomed Ethernet segment ES1. All-Active 1789 redundancy mode is assumed. A given MAC address M1 is learned by PE1 1790 but not PE2. On PE3, the following states may arise: 1792 T1 When the MAC/IP Advertisement route from PE1 and the set of 1793 Ethernet A-D per ES routes and Ethernet A-D per EVI routes from 1794 PE1 and PE2 are received, PE3 can forward traffic destined to 1795 M1 to both PE1 and PE2. 1797 T2 If after T1 PE1 withdraws its set of Ethernet A-D per ES 1798 routes, then PE3 forwards traffic destined to M1 to PE2 only. 1800 T2' If after T1 PE2 withdraws its set of Ethernet A-D per ES 1801 routes, then PE3 forwards traffic destined to M1 to PE1 only. 1803 T2'' If after T1 PE1 withdraws its MAC/IP Advertisement route, then 1804 PE3 treats traffic to M1 as unknown unicast. 1806 T3 PE2 also advertises a MAC route for M1, and then PE1 withdraws 1807 its MAC route for M1. PE3 continues forwarding traffic 1808 destined to M1 to both PE1 and PE2. In other words, despite M1 1809 withdrawal by PE1, PE3 forwards the traffic destined to M1 to 1810 both PE1 and PE2. This is because a flow from the CE, 1811 resulting in M1 traffic getting hashed to PE1, can get 1812 terminated, resulting in M1 being aged out in PE1; however, M1 1813 can be reachable by both PE1 and PE2. 1815 10. 
ARP and ND 1817 The IP Address field in the MAC/IP Advertisement route may optionally 1818 carry one of the IP addresses associated with the MAC address. This 1819 provides an option that can be used to minimize the flooding of ARP 1820 or Neighbor Discovery (ND) messages over the MPLS network and to 1821 remote CEs. This option also minimizes ARP (or ND) message 1822 processing on end-stations/hosts connected to the EVPN network. A PE 1823 may learn the IP address associated with a MAC address in the control 1824 or management plane between the CE and the PE. Or, it may learn this 1825 binding by snooping certain messages to or from a CE. When a PE 1826 learns the IP address associated with a MAC address of a locally 1827 connected CE, it may advertise this address to other PEs by including 1828 it in the MAC/IP Advertisement route. The IP address may be an IPv4 1829 address encoded using 4 octets or an IPv6 address encoded using 1830 16 octets. For ARP and ND purposes, the IP Address Length field MUST 1831 be set to 32 for an IPv4 address or 128 for an IPv6 address. 1833 If there are multiple IP addresses associated with a MAC address, 1834 then multiple MAC/IP Advertisement routes MUST be generated, one for 1835 each IP address. For instance, this may be the case when there are 1836 both an IPv4 and an IPv6 address associated with the same MAC address 1837 for dual-IP-stack scenarios. When the IP address is dissociated with 1838 the MAC address, then the MAC/IP Advertisement route with that 1839 particular IP address MUST be withdrawn. 1841 Note that a MAC-only route can be advertised along with, but 1842 independent from, a MAC/IP route for scenarios where the MAC learning 1843 over an access network/node is done in the data plane and independent 1844 from ARP snooping that generates a MAC/IP route. In such scenarios, 1845 when the ARP entry times out and causes the MAC/IP to be withdrawn, 1846 then the MAC information will not be lost. 
In scenarios where the 1847 host MAC/IP is learned via the management or control plane, then the 1848 sender PE may only generate and advertise the MAC/IP route. If the 1849 receiving PE receives both the MAC-only route and the MAC/IP route, 1850 then when it receives a withdraw message for the MAC/IP route, it 1851 MUST delete the corresponding entry from the ARP table but not the 1852 MAC entry from the MAC-VRF table, unless it receives a withdraw 1853 message for the MAC-only route. 1855 When a PE receives an ARP Request for an IP address from a CE, and if 1856 the PE has the MAC address binding for that IP address, the PE SHOULD 1857 perform ARP proxy by responding to the ARP Request. 1859 In the same way, when a PE receives a Neighbor Solicitation for an IP 1860 address from a CE, the PE SHOULD perform ND proxy and respond if the 1861 PE has the binding information for the IP. 1863 10.1. Default Gateway 1865 When a PE needs to perform inter-subnet forwarding where each subnet 1866 is represented by a different broadcast domain (e.g., a different 1867 VLAN), the inter-subnet forwarding is performed at Layer 3, and the 1868 PE that performs such a function is called the default gateway for 1869 the EVPN instance. In this case, when the PE receives an ARP Request 1870 for the IP address configured as the default gateway address, the PE 1871 originates an ARP Reply. 1873 Each PE that acts as a default gateway for a given EVPN instance MAY 1874 advertise in the EVPN control plane its default gateway MAC address 1875 using the MAC/IP Advertisement route, and each such PE indicates that 1876 such a route is associated with the default gateway. This is 1877 accomplished by requiring the route to carry the Default Gateway 1878 extended community defined in Section 7.8 1879 ("Default Gateway Extended Community"). The ESI field is set to zero 1880 when advertising the MAC route with the Default Gateway extended 1881 community. 
1883 The IP Address field of the MAC/IP Advertisement route is set to the 1884 default gateway IP address for that subnet (e.g., an EVPN instance). 1885 For a given subnet (e.g., a VLAN or EVPN instance), the default 1886 gateway IP address is the same across all the participant PEs. The 1887 inclusion of this IP address enables the receiving PE to check its 1888 configured default gateway IP address against the one received in the 1889 MAC/IP Advertisement route for that subnet (or EVPN instance), and if 1890 there is a discrepancy, then the PE SHOULD notify the operator and 1891 log an error message. 1893 Unless it is known a priori (by means outside of this document) that 1894 all PEs of a given EVPN instance act as a default gateway for that 1895 EVPN instance, the MPLS label MUST be set to a valid downstream 1896 assigned label. 1898 Furthermore, even if all PEs of a given EVPN instance do act as a 1899 default gateway for that EVPN instance, but only some, but not all, 1900 of these PEs have sufficient (routing) information to provide 1901 inter-subnet routing for all the inter-subnet traffic originated 1902 within the subnet associated with the EVPN instance, then when such a 1903 PE advertises in the EVPN control plane its default gateway MAC 1904 address using the MAC/IP Advertisement route and indicates that such 1905 a route is associated with the default gateway, the route MUST carry 1906 a valid downstream assigned label. 1908 If all PEs of a given EVPN instance act as a default gateway for that 1909 EVPN instance, and the same default gateway MAC address is used 1910 across all gateway devices, then no such advertisement is needed. 1911 However, if each default gateway uses a different MAC address, then 1912 each default gateway needs to be aware of other gateways' MAC 1913 addresses and thus the need for such an advertisement. This is 1914 called MAC address aliasing, since a single default gateway can be 1915 represented by multiple MAC addresses. 
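The default gateway IP address check described in this section could look roughly as follows. The function name, the logger name, and the message text are illustrative assumptions; how the operator is notified is left to the implementation.

```python
import ipaddress
import logging

log = logging.getLogger("evpn.default_gateway")  # hypothetical logger name


def check_default_gateway(configured_ip, received_ip, subnet_name):
    """Compare the configured gateway IP with the one received in the route.

    Returns True when both denote the same address; otherwise logs an error,
    which stands in for notifying the operator, and returns False.
    """
    # Parsing normalizes textual variants (e.g., compressed IPv6 notation).
    same = ipaddress.ip_address(configured_ip) == ipaddress.ip_address(received_ip)
    if not same:
        log.error("default gateway mismatch for %s: configured %s, received %s",
                  subnet_name, configured_ip, received_ip)
    return same
```

Comparing parsed addresses rather than strings avoids false alarms when two PEs write the same IPv6 address in different textual forms.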
1917 Each PE that receives this route and imports it as per procedures 1918 specified in this document follows the procedures in this section 1919 when replying to ARP Requests that it receives. 1921 Each PE that acts as a default gateway for a given EVPN instance that 1922 receives this route and imports it as per procedures specified in 1923 this document MUST create MAC forwarding state that enables it to 1924 apply IP forwarding to the packets destined to the MAC address 1925 carried in the route. 1927 10.1.1. Best Path selection for Default Gateway 1929 A default gateway MAC address that is assigned to an IRB interface (for 1930 a subnet) in a PE MUST be unique in the context of that subnet. In other 1931 words, the same MAC address cannot be used by a host, either 1932 intentionally or accidentally. Therefore, in case such a conflict 1933 arises, there needs to be a scheme to detect it and resolve it. In 1934 order to properly detect such conflicts, the following BGP best path 1935 selection MUST be applied. 1937 * When comparing two routes, the route which has the Default Gateway 1938 extended community is preferred over a route which does not have 1939 the extended community. The PE that has advertised the MAC route 1940 without the Default Gateway extended community, upon receiving the 1941 route with the Default Gateway extended community, SHALL withdraw its 1942 route and raise an alarm. 1944 * When comparing two routes where both routes have the Default 1945 Gateway extended community, normal BGP best path processing is 1946 applied. 1948 * When comparing local and remote routes with the Default Gateway 1949 extended community, the local route is always preferred. 1951 * The MAC Mobility extended community SHALL NOT be attached to routes 1952 which also have the Default Gateway extended community on the sending 1953 side and SHALL be ignored on the receiving side. 1955 11.
Handling of Multi-destination Traffic 1957 Procedures are required for a given PE to flood broadcast or 1958 multicast traffic received from a CE and with a given Ethernet tag to 1959 the other PEs in the associated [EVI, BD] (EVPN instance). In 1960 certain scenarios, as described in Section 12 1961 ("Processing of Unknown Unicast Packets"), a given PE may also need 1962 to flood unknown unicast traffic to other PEs. 1964 The PEs in a particular EVPN instance may use ingress replication, 1965 P2MP LSPs, or MP2MP LSPs to send unknown unicast, broadcast, or 1966 multicast traffic to other PEs. 1968 Each PE MUST advertise an "Inclusive Multicast Ethernet Tag route" to 1969 enable the above. The following subsection provides the procedures 1970 to construct the Inclusive Multicast Ethernet Tag route. Subsequent 1971 subsections describe its usage in further detail. 1973 11.1. Constructing Inclusive Multicast Ethernet Tag Route 1975 The RD MUST be set per Section 7.9. 1977 The Ethernet Tag ID is set as defined in Section 6. 1979 The Originating Router's IP Address field value MUST be set to an IP 1980 address of the PE that should be common for all the EVIs on the PE 1981 (e.g., this address may be the PE's loopback address). The IP 1982 Address Length field is in bits. 1984 The Next Hop field of the MP_REACH_NLRI attribute of the route MUST 1985 be set to the IPv4 or IPv6 address of the advertising PE. 1987 The BGP advertisement for the Inclusive Multicast Ethernet Tag route 1988 MUST also carry one or more Route Target (RT) attributes. The 1989 assignment of RTs as described in Section 7.10 MUST be followed. 1991 11.2. P-Tunnel Identification 1993 In order to identify the P-tunnel used for sending broadcast, unknown 1994 unicast, or multicast traffic, the Inclusive Multicast Ethernet Tag 1995 route MUST carry a Provider Multicast Service Interface (PMSI) Tunnel 1996 attribute as specified in [RFC6514]. 
1998 Depending on the technology used for the P-tunnel for the EVPN 1999 instance on the PE, the PMSI Tunnel attribute of the Inclusive 2000 Multicast Ethernet Tag route is constructed as follows. 2002 + If the PE that originates the advertisement uses a P-multicast tree 2003 for the P-tunnel for EVPN, the PMSI Tunnel attribute MUST contain 2004 the identity of the tree (note that the PE could create the 2005 identity of the tree prior to the actual instantiation of the 2006 tree). 2008 + A PE that uses a P-multicast tree for the P-tunnel MAY aggregate 2009 two or more Broadcast Domains (BDs) present on the PE onto the same 2010 tree. In this case, in addition to carrying the identity of the 2011 tree, the PMSI Tunnel attribute MUST carry an MPLS label, which the 2012 PE has bound uniquely to the BD associated with this update (as 2013 determined by its RTs and Ethernet Tag ID). The assigned MPLS 2014 label is upstream allocated unless the procedures in section 19 2015 (Use of Domain-wide Common Block (DCB) Labels) are followed. If 2016 the PE has already advertised Inclusive Multicast Ethernet Tag 2017 routes for two or more BDs that it now desires to aggregate, then 2018 the PE MUST re-advertise those routes. The re-advertised routes 2019 MUST be the same as the original ones, except for the PMSI Tunnel 2020 attribute and the label carried in that attribute. 2022 + If the PE that originates the advertisement uses ingress 2023 replication for the P-tunnel for EVPN, the route MUST include the 2024 PMSI Tunnel attribute with the Tunnel Type set to Ingress 2025 Replication and the Tunnel Identifier set to a routable address of 2026 the PE. The PMSI Tunnel attribute MUST carry a downstream assigned 2027 MPLS label. This label is used to demultiplex the broadcast, 2028 multicast, or unknown unicast EVPN traffic received over an MP2P 2029 tunnel by the PE. 2031 12. 
Processing of Unknown Unicast Packets 2033 The procedures in this document do not require the PEs to flood 2034 unknown unicast traffic to other PEs. If PEs learn CE MAC addresses 2035 via a control-plane protocol, the PEs can then distribute MAC 2036 addresses via BGP, and all unicast MAC addresses will be learned 2037 prior to traffic to those destinations. 2039 However, if a destination MAC address of a received packet is not 2040 known by the PE, the PE may have to flood the packet. When flooding, 2041 one must take into account "split-horizon forwarding" as follows: The 2042 principles behind the following procedures are borrowed from the 2043 split-horizon forwarding rules in VPLS solutions [RFC4761] [RFC4762]. 2044 When a PE capable of flooding (say PEx) receives an unknown 2045 destination MAC address, it floods the frame. If the frame arrived 2046 from an attached CE, PEx must send a copy of that frame on every 2047 Ethernet segment (belonging to that EVI) for which it is the DF, 2048 other than the Ethernet segment on which it received the frame. In 2049 addition, the PE must flood the frame to all other PEs participating 2050 in that EVPN instance. If, on the other hand, the frame arrived from 2051 another PE (say PEy), PEx must send a copy of the packet on each 2052 Ethernet segment (belonging to that EVI) for which it is the DF. PEx 2053 MUST NOT send the frame to other PEs, since PEy would have already 2054 done so. Split-horizon forwarding rules apply to unknown MAC 2055 addresses. 2057 Whether or not to flood packets to unknown destination MAC addresses 2058 should be an administrative choice, depending on how learning happens 2059 between CEs and PEs. 2061 The PEs in a particular EVPN instance may use ingress replication 2062 using RSVP-TE P2P LSPs or LDP MP2P LSPs for sending unknown unicast 2063 traffic to other PEs. Or, they may use RSVP-TE P2MP or LDP P2MP for 2064 sending such traffic to other PEs. 2066 12.1. 
Ingress Replication 2068 If ingress replication is in use, the P-tunnel attribute, carried in 2069 the Inclusive Multicast Ethernet Tag routes for the EVPN instance, 2070 specifies the downstream label that the other PEs can use to send 2071 unknown unicast, multicast, or broadcast traffic for that EVPN 2072 instance to this particular PE. 2074 The PE that receives a packet with this particular MPLS label MUST 2075 treat the packet as a broadcast, multicast, or unknown unicast 2076 packet. Further, if the MAC address is a unicast MAC address, the PE 2077 MUST treat the packet as an unknown unicast packet. 2079 12.2. P2MP MPLS LSPs 2081 The procedures for using P2MP or MP2MP LSPs are very similar to the 2082 VPLS procedures described in [RFC7117]. The P-tunnel attribute used 2083 by a PE for sending unknown unicast, broadcast, or multicast traffic 2084 for a particular EVPN instance is advertised in the Inclusive 2085 Multicast Ethernet Tag route as described in Section 11 2086 ("Handling of Multi-destination Traffic"). 2088 The P-tunnel attribute specifies the P2MP or MP2MP LSP identifier. 2089 This is the equivalent of an Inclusive tree as described in 2090 [RFC7117]. Note that multiple BDs in the same or different EVIs may 2091 use the same P2MP or MP2MP LSP, using upstream labels [RFC7117] or 2092 DCB labels [I-D.ietf-bess-mvpn-evpn-aggregation-label]. This is the 2093 equivalent of an Aggregate Inclusive tree [RFC7117]. When P2MP or 2094 MP2MP LSPs are used for flooding unknown unicast traffic, packet 2095 reordering is possible. 2097 The PE that receives a packet on the P2MP or MP2MP LSP specified in 2098 the PMSI Tunnel attribute MUST treat the packet as a broadcast, 2099 multicast, or unknown unicast packet. Further, if the MAC address is 2100 a unicast MAC address, the PE MUST treat the packet as an unknown 2101 unicast packet. 2103 13. 
Forwarding Unicast Packets 2105 This section describes procedures for forwarding unicast packets by 2106 PEs, where such packets are received from either directly connected 2107 CEs or some other PEs. 2109 13.1. Forwarding Packets Received from a CE 2111 When a PE receives a packet from a CE with a given Ethernet Tag, it 2112 must first look up the packet's source MAC address. In certain 2113 environments that enable MAC security, the source MAC address MAY be 2114 used to validate the host identity and determine that traffic from 2115 the host can be allowed into the network. Source MAC lookup MAY also 2116 be used for local MAC address learning. 2118 If the PE decides to forward the packet, the destination MAC address 2119 of the packet must be looked up. If the PE has received MAC address 2120 advertisements for this destination MAC address from one or more 2121 other PEs or has learned it from locally connected CEs, the MAC 2122 address is considered a known MAC address. Otherwise, it is 2123 considered an unknown MAC address. 2125 For known MAC addresses, the PE forwards this packet to one of the 2126 remote PEs or to a locally attached CE. When forwarding to a remote 2127 PE, the packet is encapsulated in the EVPN MPLS label advertised by 2128 the remote PE, for that MAC address, and in the MPLS LSP label stack 2129 to reach the remote PE. 2131 If the MAC address is unknown and if the administrative policy on the 2132 PE requires flooding of unknown unicast traffic, then: 2134 - The PE MUST flood the packet to other PEs. The PE MUST first 2135 encapsulate the packet in the ESI MPLS label as described in 2136 Section 8.3. If ingress replication is used, the packet MUST be 2137 replicated to each remote PE, with the VPN label being the MPLS 2138 label advertised by the remote PE in a PMSI Tunnel attribute in the 2139 Inclusive Multicast Ethernet Tag route for the [EVI, BD] associated 2140 with the received packet's Ethernet tag. 
2142 If P2MP LSPs are being used, the packet MUST be sent on the P2MP 2143 LSP of which the PE is the root, for the [EVI, BD] associated with 2144 the received packet's Ethernet tag. If the same P2MP LSP is used 2145 for all the BDs in the EVI, then all the PEs in the EVI MUST be 2146 the leaves of the P2MP LSP. If a different P2MP LSP is used for a 2147 given BD in the EVI, then only the PEs in that BD MUST be the 2148 leaves of the P2MP LSP. The packet MUST be encapsulated in the 2149 P2MP LSP label stack. 2151 If the MAC address is unknown, then, if the administrative policy on 2152 the PE does not allow flooding of unknown unicast traffic: 2154 - the PE MUST drop the packet. 2156 13.2. Forwarding Packets Received from a Remote PE 2158 This section describes the procedures for forwarding known and 2159 unknown unicast packets received from a remote PE. 2161 13.2.1. Unknown Unicast Forwarding 2163 When a PE receives an MPLS packet from a remote PE, then, after 2164 processing the MPLS label stack, if the top MPLS label ends up being 2165 a P2MP LSP label associated with an EVPN instance or -- in the case 2166 of ingress replication -- the downstream label advertised in the 2167 P-tunnel attribute, and after performing the split-horizon procedures 2168 described in Section 8.3: 2170 - If the PE is the designated forwarder of BUM traffic on a 2171 particular set of ESes for the [EVI, BD], the default behavior is 2172 for the PE to flood that traffic to these ESes. In other words, 2173 the default behavior is for the PE to assume that for BUM traffic 2174 it is not required to perform a destination MAC address lookup. As 2175 an option, the PE may perform a destination MAC lookup to flood the 2176 packet to only a subset of these ESes. For instance, the PE may 2177 decide to not flood a BUM packet on certain Ethernet segments even 2178 if it is the DF on the Ethernet segment, based on administrative 2179 policy.
2181 - If the PE is not the designated forwarder for any ES associated 2182 with the [EVI, BD], the default behavior is for it to drop the BUM 2183 traffic. 2185 13.2.2. Known Unicast Forwarding 2187 If the top MPLS label ends up being an EVPN label that was advertised 2188 in the unicast MAC advertisements, then the PE either forwards the 2189 packet based on CE next-hop forwarding information associated with 2190 the label or does a destination MAC address lookup to forward the 2191 packet to a CE. 2193 14. Load Balancing of Unicast Packets 2195 This section specifies the load-balancing procedures for sending 2196 known unicast packets to a multihomed CE. 2198 14.1. Load Balancing of Traffic from a PE to Remote CEs 2200 When a remote PE imports a MAC/IP Advertisement route for a given ES 2201 in a MAC-VRF, it MUST examine all imported Ethernet A-D routes for 2202 that ESI in order to determine the load-balancing characteristics of 2203 the Ethernet segment. 2205 14.1.1. Single-Active Redundancy Mode 2207 For a given ES, if a remote PE has imported the set of Ethernet A-D 2208 per ES routes from at least one PE, where the "Single-Active" flag in 2209 the ESI Label extended community is set, then that remote PE MUST 2210 deduce that the ES is operating in Single-Active redundancy mode. 2212 This means that for a given [EVI, BD], a given MAC address is 2213 reachable only via the PE announcing the associated MAC/IP 2214 Advertisement route - this PE will also have advertised an Ethernet 2215 A-D per EVI route for that [EVI, BD] with an L2-Attr extended 2216 community in which the P bit is set. That is, the Primary DF Elected PE 2217 is also responsible for sending known unicast frames to the CE and 2218 receiving unicast and BUM frames from it. Similarly, the Backup DF 2219 Elected PE will have advertised an Ethernet A-D per EVI route for 2220 [EVI, BD] with an L2-Attr extended community in which the B bit is 2221 set.
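The Single-Active deduction and the role of the P and B bits described above can be sketched as follows. This is a minimal illustration, not a definitive implementation: the record types, field names, and PE/ES identifiers are hypothetical, and the Ethernet A-D per EVI routes passed in are assumed to have been filtered already to a single [EVI, BD].

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class AdPerEsRoute:
    """Hypothetical view of an imported Ethernet A-D per ES route."""
    pe: str
    esi: str
    single_active: bool  # "Single-Active" flag of the ESI Label ext. community


@dataclass(frozen=True)
class AdPerEviRoute:
    """Hypothetical view of an imported Ethernet A-D per EVI route."""
    pe: str
    esi: str
    p_bit: bool  # P bit of the L2-Attr extended community
    b_bit: bool  # B bit of the L2-Attr extended community


def is_single_active(esi: str, per_es: Iterable[AdPerEsRoute]) -> bool:
    """True if at least one PE advertised the ES with the flag set."""
    return any(r.esi == esi and r.single_active for r in per_es)


def primary_and_backup(
    esi: str, per_evi: Iterable[AdPerEviRoute]
) -> Tuple[Optional[str], Optional[str]]:
    """Return the PEs that signaled the P bit and the B bit for the ES."""
    routes = [r for r in per_evi if r.esi == esi]
    primary = next((r.pe for r in routes if r.p_bit), None)
    backup = next((r.pe for r in routes if r.b_bit), None)
    return primary, backup


# Example with made-up routes for an ES attached to PE1 and PE2.
per_es = [AdPerEsRoute("PE1", "ES1", True), AdPerEsRoute("PE2", "ES1", True)]
per_evi = [
    AdPerEviRoute("PE1", "ES1", p_bit=True, b_bit=False),
    AdPerEviRoute("PE2", "ES1", p_bit=False, b_bit=True),
]
print(is_single_active("ES1", per_es), primary_and_backup("ES1", per_evi))
```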
2223 If the Primary DF Elected PE loses connectivity to the CE it SHOULD 2224 withdraw its set of Ethernet A-D per ES routes for the affected ES 2225 prior to withdrawing the affected MAC/IP Advertisement routes. The 2226 Backup DF Elected PE (which is now the Primary DF Elected PE) needs 2227 to advertise an Ethernet A-D per EVI route for [EVI, BD] with an 2228 L2-Attr extended community in which the P bit is set. Furthermore, 2229 the new Backup DF Elected PE needs to advertise an Ethernet A-D per 2230 EVI route for [EVI, BD] with an L2-Attr extended community in which 2231 the B bit is set. 2233 A remote PE SHOULD use the Primary DF Elected PE's withdrawal of its 2234 set of Ethernet A-D per ES routes as a trigger to update its 2235 forwarding entries for the associated MAC addresses to point at the 2236 Backup DF Elected PE. As the Backup DF Elected PE starts learning 2237 the MAC addresses over its attached ES, it will start sending MAC/IP 2238 Advertisement routes while the failed PE withdraws its routes. This 2239 mechanism minimizes the flooding of traffic during fail-over events. 2241 14.1.2. All-Active Redundancy Mode 2243 For a given ES, if the remote PE has imported the set of Ethernet A-D 2244 per ES routes from one or more PEs and none of them have the 2245 "Single-Active" flag in the ESI Label extended community set, then 2246 the remote PE MUST deduce that the ES is operating in All-Active 2247 redundancy mode. A remote PE that receives a MAC/IP Advertisement 2248 route with a non-reserved ESI SHOULD consider the advertised MAC 2249 address to be reachable via all PEs that have advertised reachability 2250 to that MAC address's EVI/ES/Ethernet Tag ID via the combination of 2251 an Ethernet A-D per EVI route for that EVI/ES/Ethernet Tag ID AND an 2252 Ethernet A-D per ES route for that ES. 
The remote PE MUST use 2253 received MAC/IP Advertisement routes and Ethernet A-D per EVI/per ES 2254 routes to construct the set of next hops for the advertised MAC 2255 address. 2257 Each next hop comprises an MPLS label stack that is to be used to 2258 reach a given egress PE and allow it to forward a packet. The 2259 portion of the MPLS label stack that is to be used by that egress PE 2260 to forward a packet is constructed by the remote PE as follows: 2262 - If a MAC/IP Advertisement route was received from that PE, then its 2263 label stack MUST be used in the next hop. 2265 - Otherwise, the label stack from the Ethernet A-D per EVI route that 2266 matches the MAC address' EVI/ES/Ethernet Tag ID MUST be used in the 2267 next hop. 2269 The following example explains the above. 2271 Consider a CE (CE1) that is dual-homed to two PEs (PE1 and PE2) on a 2272 LAG interface (ES1), and is sending packets with source MAC address 2273 MAC1 on VLAN1 (mapped to EVI1). A remote PE, say PE3, is able to 2274 learn that MAC1 is reachable via PE1 and PE2. Both PE1 and PE2 may 2275 advertise MAC1 if they receive packets with MAC1 from CE1. If this 2276 is not the case, and if MAC1 is advertised only by PE1, PE3 still 2277 considers MAC1 as reachable via both PE1 and PE2, as both PE1 and PE2 2278 advertise a set of Ethernet A-D per ES routes for ES1 as well as an 2279 Ethernet A-D per EVI route for . 2281 The MPLS label stack to send the packets to PE1 is the MPLS LSP stack 2282 to get to PE1 (at the top of the stack) followed by the EVPN label 2283 advertised by PE1 for CE1's MAC. 2285 The MPLS label stack to send packets to PE2 is the MPLS LSP stack to 2286 get to PE2 (at the top of the stack) followed by the MPLS label in 2287 the Ethernet A-D route advertised by PE2 for , if PE2 has 2288 not advertised MAC1 in BGP. 2290 We will refer to these label stacks as MPLS next hops. 
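The next-hop construction and the PE1/PE2 example above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the data layout, and the label values are hypothetical. Only the EVPN-label portion of each stack is modeled, and the transport LSP labels toward each egress PE are omitted. The CRC-based hash is just one possible way to map flow information to a next hop.

```python
import zlib
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass(frozen=True)
class NextHop:
    pe: str
    evpn_label: int
    label_source: str  # "mac-ip" or "ad-per-evi"


def build_next_hops(mac_labels: Dict[str, int],
                    per_es_pes: Set[str],
                    per_evi_labels: Dict[str, int]) -> List[NextHop]:
    """Build the set of MPLS next hops for one MAC address.

    mac_labels:     PE -> label from that PE's MAC/IP Advertisement route
    per_es_pes:     PEs that advertised Ethernet A-D per ES routes for the ES
    per_evi_labels: PE -> label from that PE's Ethernet A-D per EVI route
                    matching the MAC address's EVI/ES/Ethernet Tag ID
    """
    hops = []
    # A PE qualifies only if it advertised both kinds of A-D routes.
    for pe in sorted(per_es_pes & set(per_evi_labels)):
        if pe in mac_labels:
            hops.append(NextHop(pe, mac_labels[pe], "mac-ip"))
        else:
            hops.append(NextHop(pe, per_evi_labels[pe], "ad-per-evi"))
    return hops


def pick_next_hop(hops: List[NextHop], flow: Tuple) -> NextHop:
    """Hash N-tuple flow information onto one of the next hops."""
    if not hops:
        raise ValueError("no next hops available")
    return hops[zlib.crc32(repr(flow).encode()) % len(hops)]


# MAC1 is advertised only by PE1; both PEs advertised A-D routes for ES1.
hops = build_next_hops(
    mac_labels={"PE1": 3001},
    per_es_pes={"PE1", "PE2"},
    per_evi_labels={"PE1": 4001, "PE2": 4002},
)
flow = ("198.51.100.7", "203.0.113.9", 6, 49152, 443)
print(hops, pick_next_hop(hops, flow).pe)
```

In this example PE3 would reach PE1 with the label from PE1's MAC/IP Advertisement route and PE2 with the label from PE2's Ethernet A-D per EVI route, matching the two cases listed above.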
2292 The remote PE (PE3) can now load balance the traffic it receives from 2293 its CEs, destined for CE1, between PE1 and PE2. PE3 may use N-tuple 2294 flow information to hash traffic into one of the MPLS next hops for 2295 load balancing of IP traffic. Alternatively, PE3 may rely on the 2296 source MAC addresses for load balancing. 2298 Note that once PE3 decides to send a particular packet to PE1 or PE2, 2299 it can pick one out of multiple possible paths to reach the 2300 particular remote PE using regular MPLS procedures. For instance, if 2301 the tunneling technology is based on RSVP-TE LSPs and PE3 decides to 2302 send a particular packet to PE1, then PE3 can choose from multiple 2303 RSVP-TE LSPs that have PE1 as their destination. 2305 When PE1 or PE2 receives the packet destined for CE1 from PE3, if the 2306 packet is a known unicast, it is forwarded to CE1. 2308 14.2. Load Balancing of Traffic between a PE and a Local CE 2310 A CE may be configured with more than one interface connected to 2311 different PEs or the same PE for load balancing, using a technology 2312 such as a LAG. The PE(s) and the CE can load balance traffic onto 2313 these interfaces using one of the following mechanisms. 2315 14.2.1. Data-Plane Learning 2317 Consider that the PEs perform data-plane learning for local MAC 2318 addresses learned from local CEs. This enables the PE(s) to learn a 2319 particular MAC address and associate it with one or more interfaces, 2320 if the technology between the PE and the CE supports multipathing. 2321 The PEs can now load balance traffic destined to that MAC address on 2322 the multiple interfaces. 2324 Whether the CE can load balance traffic that it generates on the 2325 multiple interfaces is dependent on the CE implementation. 2327 14.2.2. Control-Plane Learning 2329 The CE can be a host that advertises the same MAC address using a 2330 control protocol on all interfaces. 
This enables the PE(s) to learn 2331 the host's MAC address and associate it with all interfaces. The PEs 2332 can now load balance traffic destined to the host on all these 2333 interfaces. The host can also load balance the traffic it generates 2334 onto these interfaces, and the PE that receives the traffic employs 2335 EVPN forwarding procedures to forward the traffic. 2337 15. MAC Mobility 2339 It is possible for a given host or end-station (as defined by its MAC 2340 address) to move from one Ethernet segment to another; this is 2341 referred to as 'MAC Mobility' or 'MAC move', and it is different from 2342 the multihoming situation in which a given MAC address is reachable 2343 via multiple PEs for the same Ethernet segment. In a MAC move, there 2344 would be two sets of MAC/IP Advertisement routes -- one set with the 2345 new Ethernet segment and one set with the previous Ethernet segment 2346 -- and the MAC address would appear to be reachable via each of these 2347 segments. 2349 In order to allow all of the PEs in the EVPN instance to correctly 2350 determine the current location of the MAC address, all advertisements 2351 of it being reachable via the previous Ethernet segment MUST be 2352 withdrawn by the PEs, for the previous Ethernet segment, that had 2353 advertised it. 2355 If local learning is performed using the data plane, these PEs will 2356 not be able to detect that the MAC address has moved to another 2357 Ethernet segment, and the receipt of MAC/IP Advertisement routes, 2358 with the MAC Mobility extended community attribute, from other PEs 2359 serves as the trigger for these PEs to withdraw their advertisements. 2360 If local learning is performed using the control or management 2361 planes, these interactions serve as the trigger for these PEs to 2362 withdraw their advertisements. 
2364 In a situation where there are multiple moves of a given MAC, 2365 possibly between the same two Ethernet segments, there may be 2366 multiple withdrawals and re-advertisements. In order to ensure that 2367 all PEs in the EVPN instance receive all of these correctly through 2368 the intervening BGP infrastructure, introducing a sequence number 2369 into the MAC Mobility extended community attribute is necessary. 2371 In order to process mobility events correctly, an implementation MUST 2372 handle scenarios in which sequence number wraparound occurs. 2374 Every MAC mobility event for a given MAC address will contain a 2375 sequence number that is set using the following rules: 2377 - A PE advertising a MAC address for the first time advertises it 2378 with no MAC Mobility extended community attribute. 2380 - A PE detecting a locally attached MAC address for which it had 2381 previously received a MAC/IP Advertisement route with a different 2382 Ethernet segment identifier advertises the MAC address in a MAC/IP 2383 Advertisement route tagged with a MAC Mobility extended community 2384 attribute with a sequence number one greater than the sequence 2385 number in the MAC Mobility extended community attribute of the 2386 received MAC/IP Advertisement route. In the case of the first 2387 mobility event for a given MAC address, where the received MAC/IP 2388 Advertisement route does not carry a MAC Mobility extended 2389 community attribute, the value of the sequence number in the 2390 received route is assumed to be 0 for the purpose of this 2391 processing. 2393 - A PE detecting a locally attached MAC address for which it had 2394 previously received a MAC/IP Advertisement route with the same 2395 non-zero Ethernet segment identifier advertises it with: 2397 1. no MAC Mobility extended community attribute, if the received 2398 route did not carry said attribute. 2400 2. 
a MAC Mobility extended community attribute with the sequence 2401 number equal to the highest of the sequence number(s) in the 2402 received MAC/IP Advertisement route(s), if the received route(s) 2403 is (are) tagged with a MAC Mobility extended community 2404 attribute. 2406 - A PE detecting a locally attached MAC address for which it had 2407 previously received a MAC/IP Advertisement route with the same zero 2408 Ethernet segment identifier (single-homed scenarios) advertises it 2409 with a MAC Mobility extended community attribute with the sequence 2410 number set properly. In the case of single-homed scenarios, there 2411 is no need for ESI comparison. ESI comparison is done for 2412 multihoming in order to prevent false detection of MAC moves among 2413 the PEs attached to the same multihomed site. 2415 A PE receiving a MAC/IP Advertisement route for a MAC address with a 2416 different Ethernet segment identifier and a higher sequence number 2417 than that which it had previously advertised withdraws its MAC/IP 2418 Advertisement route. If two (or more) PEs advertise the same MAC 2419 address with the same sequence number but different Ethernet segment 2420 identifiers, a PE that receives these routes selects the route 2421 advertised by the PE with the lowest IP address as the best route. 2422 If the PE is the originator of the MAC route and it receives the same 2423 MAC address with the same sequence number that it generated, it will 2424 compare its own IP address with the IP address of the remote PE and 2425 will select the lowest IP. If its own route is not the best one, it 2426 will withdraw the route. 2428 15.1. MAC Duplication Issue 2430 A situation may arise where the same MAC address is learned by 2431 different PEs in the same VLAN because of two (or more) hosts being 2432 misconfigured with the same (duplicate) MAC address. 
In such a 2433 situation, the traffic originating from these hosts would trigger 2434 continuous MAC moves among the PEs attached to these hosts. It is 2435 important to recognize such a situation and avoid incrementing the 2436 sequence number (in the MAC Mobility extended community attribute) to 2437 infinity. In order to remedy such a situation, a PE that detects a 2438 MAC mobility event via local learning starts an M-second timer (with 2439 a default value of M = 180), and if it detects N MAC moves before the 2440 timer expires (with a default value of N = 5), it concludes that a 2441 duplicate-MAC situation has occurred. The PE MUST alert the operator 2442 and stop sending and processing any BGP MAC/IP Advertisement routes 2443 for that MAC address until a corrective action is taken by the 2444 operator. The values of M and N MUST be configurable to allow for 2445 flexibility in operator control. Note that the other PEs in the EVPN 2446 instance will forward the traffic for the duplicate MAC address to 2447 one of the PEs advertising the duplicate MAC address. 2449 15.2. Sticky MAC Addresses 2451 There are scenarios in which it is desired to configure some MAC 2452 addresses as static so that they are not subjected to MAC moves. In 2453 such scenarios, these MAC addresses are advertised with a MAC 2454 Mobility extended community where the static flag is set to 1 and the 2455 sequence number is set to zero. If a PE receives such advertisements 2456 and later learns the same MAC address(es) via local learning, then 2457 the PE MUST alert the operator. 2459 15.3. Loop Protection 2461 The EVPN MAC Duplication procedure in section 15.1 prevents an 2462 endless EVPN MAC/IP route advertisement exchange for a duplicate MAC 2463 between two (or more) PEs. 
While this helps the control plane 2464 settle, in case there is a backdoor link (loop) between two or more PEs 2465 attached to the same BD, BUM frames being sent by a CE are still 2466 endlessly looped within the BD through the backdoor link and among 2467 the PEs. This may cause unpredictable issues in the CEs connected to 2468 the affected BD. 2470 The EVPN MAC Duplication Mechanism in section 15.1 MAY be extended 2471 with a Loop-protection action that is applied on the duplicate-MAC 2472 addresses. This additional mechanism resolves loops created by 2473 accidental or intentional backdoor links and SHOULD be enabled in all 2474 the PEs attached to the BD. 2476 After following the procedure in section 15.1, when a PE detects a 2477 MAC M as duplicate, the PE behaves as follows: 2479 a) Stops advertising M and logs a duplicate event. 2481 b) Initializes a retry-timer, R seconds. 2483 c) Since Loop Protection is enabled, the PE executes a Loop 2484 Protection action, which we refer to as "Black-Holing" M. 2486 When the PE programs M as a Black-Hole MAC in the Bridge Table, M is 2487 no longer associated to the backdoor Attachment Circuit (AC), but to 2488 a Black-Hole destination. 2490 At this point and while M is in Black-Hole state: 2492 a) If a new frame is received (from the EVPN network or the backdoor 2493 AC) with MAC SA = M, the PE identifies M to be Black-Holed and 2494 discards the frame, ending the loop. 2496 b) Optionally, instead of simply discarding the frame with MAC SA = 2497 M, the PE MAY bring down the AC on which the offending frame is 2498 seen last. 2500 c) Optionally, any frame that arrives at the PE with MAC DA = M 2501 SHOULD be discarded too. 2503 When the retry-timer R for M expires, the PE flushes M from the 2504 Bridge Table and the process is restarted. In general, a Black-Hole 2505 MAC M can be flushed from the Bridge Table if any of the following 2506 events occur: 2508 o Retry-timer R for duplicate-MAC M expires (as discussed).
R is 2509 initialized when M is detected as duplicate-MAC. Its value is 2510 configurable and SHOULD be at least three times the EVPN MAC 2511 Duplication M-timer window. 2513 o The operator manually flushes a Black-Hole MAC M. This should be 2514 done only if the conditions under which M was identified as 2515 duplicate have been cleared. 2517 o The remote PE withdraws the MAC/IP route for M and there are no 2518 other remote MAC/IP routes for M. 2520 o The remote PE sends a MAC/IP route update for M with the sticky-bit 2521 set (in the MAC Mobility extended community). 2523 16. Multicast and Broadcast 2525 The PEs in a particular EVPN instance may use ingress replication or 2526 P2MP or MP2MP LSPs to send multicast traffic to other PEs. 2528 16.1. Ingress Replication 2530 The PEs may use ingress replication for flooding BUM traffic as 2531 described in Section 11 ("Handling of Multi-destination Traffic"). A 2532 given broadcast packet must be sent to all the remote PEs. However, 2533 a given multicast packet for a multicast flow may be sent to only a 2534 subset of the PEs. Specifically, a given multicast flow may be sent 2535 to only those PEs that have receivers that are interested in the 2536 multicast flow. Determining which of the PEs have receivers for a 2537 given multicast flow is done using the procedures of 2538 [I-D.ietf-bess-evpn-igmp-mld-proxy]. 2540 16.2. P2MP or MP2MP LSPs 2542 A PE may use an "Inclusive" tree for sending a BUM packet. This 2543 terminology is borrowed from [RFC7117]. 2545 A variety of transport technologies may be used in the service 2546 provider (SP) network. For Inclusive P-multicast trees, these 2547 transport technologies include point-to-multipoint LSPs created by 2548 RSVP-TE or Multipoint LDP (mLDP) or BIER. 2550 16.2.1. 
Inclusive Trees 2552 An Inclusive tree allows the use of a single multicast distribution 2553 tree, referred to as an Inclusive P-multicast tree, in the SP network 2554 to carry all the multicast traffic from a specified set of EVPN 2555 instances on a given PE. A particular P-multicast tree can be set up 2556 to carry the traffic originated by sites belonging to a single EVPN 2557 instance, or to carry the traffic originated by sites belonging to 2558 several EVPN instances. The ability to carry the traffic of more 2559 than one EVPN instance on the same tree is termed 'Aggregation', and 2560 the tree is called an Aggregate Inclusive P-multicast tree or 2561 Aggregate Inclusive tree for short. The Aggregate Inclusive tree 2562 needs to include every PE that is a member of any of the EVPN 2563 instances that are using the tree. This implies that a PE may 2564 receive BUM traffic even if it doesn't have any receivers that are 2565 interested in receiving that traffic. 2567 An Inclusive or Aggregate Inclusive tree as defined in this document 2568 is a P2MP tree. A P2MP or MP2MP tree is used to carry traffic only 2569 for EVPN CEs that are connected to the PE that is the root of the 2570 tree. 2572 The procedures for signaling an Inclusive tree are the same as those 2573 in [RFC7117], with the VPLS A-D route replaced with the Inclusive 2574 Multicast Ethernet Tag route. The P-tunnel attribute [RFC7117] for 2575 an Inclusive tree is advertised with the Inclusive Multicast Ethernet 2576 Tag route as described in Section 11 2577 ("Handling of Multi-destination Traffic"). Note that for an 2578 Aggregate Inclusive tree, a PE can "aggregate" multiple EVPN 2579 instances on the same P2MP LSP using upstream labels or DCB allocated 2580 labels [I-D.ietf-bess-mvpn-evpn-aggregation-label]. The procedures 2581 for aggregation are the same as those described in [RFC7117], with 2582 VPLS A-D routes replaced by EVPN Inclusive Multicast Ethernet Tag 2583 routes. 2585 17. 
Convergence 2587 This section describes failure recovery from different types of 2588 network failures. 2590 17.1. Transit Link and Node Failures between PEs 2592 The use of existing MPLS fast-reroute mechanisms can provide failure 2593 recovery on the order of 50 ms, in the event of transit link and node 2594 failures in the infrastructure that connects the PEs. 2596 17.2. PE Failures 2598 Consider a host CE1 that is dual-homed to PE1 and PE2. If PE1 fails, 2599 a remote PE, PE3, can discover this based on the failure of the BGP 2600 session. This failure detection can be in the sub-second range if 2601 Bidirectional Forwarding Detection (BFD) is used to detect BGP 2602 session failures. PE3 can update its forwarding state to start 2603 sending all traffic for CE1 to only PE2. 2605 17.3. PE-to-CE Network Failures 2607 If the connectivity between the multihomed CE and one of the PEs to 2608 which it is attached fails, the PE MUST withdraw the set of Ethernet 2609 A-D per ES routes that had been previously advertised for that ES. 2610 This enables the remote PEs to remove the MPLS next hop to this 2611 particular PE from the set of MPLS next hops that can be used to 2612 forward traffic to the CE. When the MAC entry on the PE ages out, 2613 the PE MUST withdraw the MAC address from BGP. 2615 When an EVI is decommissioned on an Ethernet segment the PE MUST 2616 withdraw the Ethernet A-D per EVI route(s) announced for that . In addition, the PE MUST also withdraw the MAC/IP Advertisement 2618 routes that are impacted by the decommissioning. 2620 The Ethernet A-D per ES routes should be used by an implementation to 2621 optimize the withdrawal of MAC/IP Advertisement routes. When a PE 2622 receives a withdrawal of a particular Ethernet A-D route from an 2623 advertising PE, it SHOULD consider all the MAC/IP Advertisement 2624 routes that are learned from the same ESI as in the Ethernet A-D 2625 route from the advertising PE as having been withdrawn. 
2626 Treating those routes as withdrawn optimizes network convergence
2627 times in the event of PE-to-CE failures.

2629 18. Frame Ordering

2631 If the value of the first nibble (bits 8 through 5) of the most
2632 significant octet of the destination MAC address (which
2633 follows the last MPLS label) happens to be 0x4 or 0x6, then the
2634 Ethernet frame can be misinterpreted as an IPv4 or IPv6 packet by
2635 intermediate P nodes performing ECMP based on deep packet inspection.
2636 Such nodes may then load-balance packets belonging to the same flow
2637 across different ECMP paths, subjecting those packets to different
2638 delays. Therefore, packets belonging to the same flow can arrive at
2639 the destination out of order. This out-of-order delivery can happen
2640 during steady state in the absence of any failures, resulting in
2641 significant impact on network operations.

2643 In order to avoid the frame misordering described above, the
2644 following network-wide rules are applied:

2646 - If a network uses deep packet inspection for its ECMP, then the
2647   "Preferred PW MPLS Control Word" [RFC4385] MUST be used with the
2648   value 0 (e.g., a 4-octet field with a value of zero) when sending
2649   unicast EVPN-encapsulated packets over an MP2P LSP.

2651 - When sending EVPN-encapsulated packets over a P2MP or P2P RSVP-TE
2652   LSP, the control word SHOULD NOT be used.

2654 - When sending EVPN-encapsulated packets over a P2MP LSP (e.g., using
2655   mLDP signaling), the control word SHOULD be used.

2657 - If a network uses entropy labels per [RFC6790], then the control
2658   word SHOULD NOT be used when sending EVPN-encapsulated packets over
2659   an MP2P LSP.

2661 18.1. Flow Label

2663 The Flow Label adds entropy to divisible flows and enables
2664 ECMP load balancing in the network. The Flow Label MAY be used in
2665 EVPN networks to achieve better load balancing in the network, when
2666 transit nodes perform deep packet inspection for ECMP hashing.
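
The deep packet inspection behavior that Section 18 guards against can be sketched as follows. This is a minimal illustration under stated assumptions, not a description of any particular router: the function names guess_payload_type and encapsulate are hypothetical, and the example frame is made up. It shows why a destination MAC address whose first nibble is 0x4 or 0x6 can be mistaken for an IP header, and how a control word with the value 0 avoids that.

```python
CONTROL_WORD = bytes(4)  # control word with value 0: four zero octets


def guess_payload_type(payload: bytes) -> str:
    """Guess what follows the last MPLS label, as a DPI-based P node might."""
    if not payload:
        return "unknown"
    first_nibble = payload[0] >> 4  # bits 8 through 5 of the first octet
    if first_nibble == 0x4:
        return "ipv4"
    if first_nibble == 0x6:
        return "ipv6"
    return "non-ip"


def encapsulate(frame: bytes, use_control_word: bool) -> bytes:
    """Return the bytes that follow the last MPLS label (hypothetical helper)."""
    return CONTROL_WORD + frame if use_control_word else frame


# Made-up Ethernet frame: destination MAC 4e:00:00:00:00:01,
# source MAC 02:00:00:00:00:02, EtherType 0x0800.
frame = (
    bytes.fromhex("4e0000000001")
    + bytes.fromhex("020000000002")
    + bytes.fromhex("0800")
)

print(guess_payload_type(encapsulate(frame, use_control_word=False)))  # ipv4
print(guess_payload_type(encapsulate(frame, use_control_word=True)))   # non-ip
```

Without the control word, the first octet seen after the label stack is the first octet of the destination MAC address, so the heuristic classifies the frame as IPv4 and may hash on bytes that are not IP header fields. With the control word, the first nibble is zero and the frame is not treated as IP.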
2667 The following rules apply:

2669 - When the F-bit is set to 1, the PE announces the capability of both
2670   sending and receiving the Flow Label for known unicast. If the PE is
2671   capable of supporting the Flow Label, then upon receiving the F-bit
2672   from a remote PE, it MUST send known unicast packets to that PE
2673   with Flow Labels and it MUST NOT send BUM packets to that PE with
2674   Flow Labels.

2676 - An ingress PE will push the Flow Label at the bottom of the stack
2677   of the EVPN-encapsulated known unicast packets sent to an egress PE
2678   that previously signaled the F-bit set to 1.

2680 - The Flow Label MUST NOT be used for EVPN-encapsulated BUM packets.

2682 - If a PE receives a unicast packet with two labels, then it can
2683   differentiate between [VPN label + ESI label] and [VPN label + Flow
2684   Label] and there should be no ambiguity between ESI and Flow Labels
2685   even if they overlap. The reason for this is that the downstream
2686   assigned VPN label for known unicast is different from that for BUM
2687   traffic, and the ESI label (if present) comes after the BUM VPN label.
2688   Therefore, from the VPN label, the receiving PE knows whether the
2689   next label is an ESI label or a Flow Label; i.e., if the VPN label
2690   is for known unicast, then the next label MUST be a Flow Label and
2691   if the VPN label is for BUM traffic, then the next label MUST be an
2692   ESI label because BUM packets are not sent with Flow Labels.

2694 - When sending EVPN-encapsulated packets over a P2MP LSP (either
2695   RSVP-TE or mLDP), the Flow Label SHOULD NOT be used. This is
2696   independent of any F-bit signaling in the L2-Attr Extended
2697   Community, which would still apply to unicast.

2699 19. Use of Domain-wide Common Block (DCB) Labels

2701 The use of DCB labels as described in
2702 [I-D.ietf-bess-mvpn-evpn-aggregation-label] is RECOMMENDED in the
2703 following cases:

2705 + Aggregate P-multicast trees: A P-multicast tree MAY aggregate the
2706   traffic of two or more BDs on a given ingress PE.
2707   When aggregation is needed, DCB Labels [I-D.ietf-bess-mvpn-evpn-aggregation-label]
2708   MAY be used in the MPLS label field of the Inclusive Multicast
2709   Ethernet Tag route's PMSI Tunnel Attribute. The use of DCB Labels,
2710   instead of upstream allocated labels, can greatly reduce the number
2711   of labels that the egress PEs need to process when P-multicast
2712   tunnel aggregation is used in a network with a large number of BDs.

2714 + BIER tunnels: As described in [I-D.ietf-bier-evpn], the use of
2715   labels with BIER tunnels in EVPN networks is similar to aggregate
2716   tunnels, since the ingress PE uses upstream allocated labels to
2717   identify the BD. As described in [I-D.ietf-bier-evpn], DCB labels
2718   can be allocated instead of upstream labels in the PMSI Tunnel
2719   Attribute so that the number of labels required on the egress PEs
2720   can be reduced.

2722 + ESI labels: The ESI labels advertised with EVPN A-D per ES routes
2723   MAY be allocated as DCB labels in general, and are RECOMMENDED to
2724   be allocated as DCB labels when used in combination with P2MP/BIER
2725   tunnels.

2727 When MP2MP tunnels are used, ESI labels MUST be allocated from a DCB
2728 and the same label must be used by all the PEs attached to the same
2729 Ethernet Segment. In that way, any egress PE with local Ethernet
2730 Segments can identify the source ES of the received BUM packets.

2732 20. Security Considerations

2734 Security considerations discussed in [RFC4761] and [RFC4762] apply to
2735 this document for MAC learning in the data plane over an Attachment
2736 Circuit (AC) and for flooding of unknown unicast and ARP messages
2737 over the MPLS/IP core. Security considerations discussed in
2738 [RFC4364] apply to this document for MAC learning in the control
2739 plane over the MPLS/IP core. This section describes additional
2740 considerations.

2742 As mentioned in [RFC4761], there are two aspects to achieving data
2743 privacy and protecting against denial-of-service attacks in a VPN:
2744 securing the control plane and protecting the forwarding path.
2745 Compromise of the control plane could result in a PE sending customer
2746 data belonging to some EVPN to another EVPN, or black-holing EVPN
2747 customer data, or even sending it to an eavesdropper, none of which
2748 are acceptable from a data privacy point of view. In addition,
2749 compromise of the control plane could provide opportunities for
2750 unauthorized EVPN data usage (e.g., exploiting traffic replication
2751 within a multicast tree to amplify a denial-of-service attack based
2752 on sending large amounts of traffic).

2754 The mechanisms in this document use BGP for the control plane.
2755 Hence, techniques such as those discussed in [RFC5925] help
2756 authenticate BGP messages, making it harder to spoof updates (which
2757 can be used to divert EVPN traffic to the wrong EVPN instance) or
2758 withdrawals (denial-of-service attacks). In the multi-AS backbone
2759 options (b) and (c) [RFC4364], this also means protecting the
2760 inter-AS BGP sessions between the Autonomous System Border Routers
2761 (ASBRs), the PEs, or the Route Reflectors.

2763 Further discussion of security considerations for BGP may be found in
2764 the BGP specification itself [RFC4271] and in the security analysis
2765 for BGP [RFC4272]. The TCP Authentication Option, which replaces the
2766 earlier TCP MD5 signature option for protecting BGP sessions, is
2767 specified in [RFC5925], while [RFC6952] includes an analysis of BGP
2768 keying and authentication issues.

2770 Note that [RFC5925] will not help in keeping MPLS labels private --
2771 knowing the labels, one can eavesdrop on EVPN traffic. Such
2772 eavesdropping additionally requires access to the data path within an
2773 SP network.
2774 Users of VPN services are expected to take appropriate precautions
2775 (such as encryption) to protect the data exchanged over a VPN.

2777 One of the requirements for protecting the data plane is that the
2778 MPLS labels be accepted only from valid interfaces. For a PE, valid
2779 interfaces comprise links from other routers in the PE's own AS. For
2780 an ASBR, valid interfaces comprise links from other routers in the
2781 ASBR's own AS, and links from other ASBRs in ASes that have instances
2782 of a given EVPN. It is especially important in the case of multi-AS
2783 EVPN instances that one accept EVPN packets only from valid
2784 interfaces.

2786 It is also important to limit malicious traffic entering a network
2787 on behalf of an impostor MAC address. The mechanism described in Section 15.1
2788 shows how duplicate MAC addresses can be detected and continuous
2789 false MAC mobility can be prevented. The mechanism described in
2790 Section 15.2 shows how MAC addresses can be pinned to a given
2791 Ethernet segment, such that if they appear behind any other Ethernet
2792 segments, the traffic for those MAC addresses can be prevented from
2793 entering the EVPN network from the other Ethernet segments.

2795 21. IANA Considerations

2797 This document defines a new NLRI, called "EVPN", to be carried in BGP
2798 using multiprotocol extensions. This NLRI uses the existing AFI of
2799 25 (L2VPN). IANA has assigned BGP EVPNs a SAFI value of 70.

2801 IANA has allocated the following EVPN Extended Community sub-types in
2802 [RFC7153], and this document is the only reference for them, in
2803 addition to [RFC7432].

2805 0x00   MAC Mobility             [RFC7432]
2806 0x01   ESI Label                [RFC7432]
2807 0x02   ES-Import Route Target   [RFC7432]

2809 This document creates a registry called "EVPN Route Types". New
2810 registrations will be made through the "RFC Required" procedure
2811 defined in [RFC5226]. The registry has a maximum value of 255.
2812 Initial registrations from [RFC7432] are as follows:

2814 0   Reserved                           [RFC7432]
2815 1   Ethernet Auto-discovery            [RFC7432]
2816 2   MAC/IP Advertisement               [RFC7432]
2817 3   Inclusive Multicast Ethernet Tag   [RFC7432]
2818 4   Ethernet Segment                   [RFC7432]

2820 This document requests allocation of bit 3 in the "EVPN Layer 2
2821 Attributes Control Flags" registry with name F:

2823 F   Flow Label MUST be present

2825 22. References

2827 22.1. Normative References

2829 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
2830    Requirement Levels", BCP 14, RFC 2119,
2831    DOI 10.17487/RFC2119, March 1997.

2834 [RFC4271] Rekhter, Y., Ed., Li, T., Ed., and S. Hares, Ed., "A
2835    Border Gateway Protocol 4 (BGP-4)", RFC 4271,
2836    DOI 10.17487/RFC4271, January 2006.

2839 [RFC4360] Sangli, S., Tappan, D., and Y. Rekhter, "BGP Extended
2840    Communities Attribute", RFC 4360, DOI 10.17487/RFC4360,
2841    February 2006.

2843 [RFC4364] Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private
2844    Networks (VPNs)", RFC 4364, DOI 10.17487/RFC4364, February
2845    2006.

2847 [RFC4760] Bates, T., Chandra, R., Katz, D., and Y. Rekhter,
2848    "Multiprotocol Extensions for BGP-4", RFC 4760,
2849    DOI 10.17487/RFC4760, January 2007.

2852 [RFC4761] Kompella, K., Ed. and Y. Rekhter, Ed., "Virtual Private
2853    LAN Service (VPLS) Using BGP for Auto-Discovery and
2854    Signaling", RFC 4761, DOI 10.17487/RFC4761, January 2007.

2857 [RFC4762] Lasserre, M., Ed. and V. Kompella, Ed., "Virtual Private
2858    LAN Service (VPLS) Using Label Distribution Protocol (LDP)
2859    Signaling", RFC 4762, DOI 10.17487/RFC4762, January 2007.

2862 [RFC7153] Rosen, E. and Y. Rekhter, "IANA Registries for BGP
2863    Extended Communities", RFC 7153, DOI 10.17487/RFC7153,
2864    March 2014.

2866 [RFC7432] Sajassi, A., Ed., Aggarwal, R., Bitar, N., Isaac, A.,
2867    Uttaro, J., Drake, J., and W. Henderickx, "BGP MPLS-Based
2868    Ethernet VPN", RFC 7432, DOI 10.17487/RFC7432, February
2869    2015.

2871 [RFC8214] Boutros, S., Sajassi, A., Salam, S., Drake, J., and J.
2872    Rabadan, "Virtual Private Wire Service Support in Ethernet
2873    VPN", RFC 8214, DOI 10.17487/RFC8214, August 2017.

2876 [RFC8584] Rabadan, J., Ed., Mohanty, S., Ed., Sajassi, A., Drake,
2877    J., Nagaraj, K., and S. Sathappan, "Framework for Ethernet
2878    VPN Designated Forwarder Election Extensibility",
2879    RFC 8584, DOI 10.17487/RFC8584, April 2019.

2882 22.2. Informative References

2884 [I-D.ietf-bess-evpn-igmp-mld-proxy]
2885    Sajassi, A., Thoria, S., Mishra, M., Patel, K., Drake, J.,
2886    and W. Lin, "IGMP and MLD Proxy for EVPN",
2887    draft-ietf-bess-evpn-igmp-mld-proxy-07 (work in progress), April
2888    2021.

2890 [I-D.ietf-bess-evpn-mh-split-horizon]
2891    Rabadan, J., Nagaraj, K., Lin, W., and A. Sajassi, "EVPN
2892    Multi-Homing Extensions for Split Horizon Filtering",
2893    draft-ietf-bess-evpn-mh-split-horizon-01 (work in
2894    progress), April 2021.

2896 [I-D.ietf-bess-evpn-prefix-advertisement]
2897    Rabadan, J., Henderickx, W., Drake, J., Lin, W., and A.
2898    Sajassi, "IP Prefix Advertisement in EVPN",
2899    draft-ietf-bess-evpn-prefix-advertisement-11 (work in progress), May
2900    2018.

2902 [I-D.ietf-bess-mvpn-evpn-aggregation-label]
2903    Zhang, Z., Rosen, E., Lin, W., Li, Z., and I. Wijnands,
2904    "MVPN/EVPN Tunnel Aggregation with Common Labels",
2905    draft-ietf-bess-mvpn-evpn-aggregation-label-06 (work in
2906    progress), April 2021.

2908 [I-D.ietf-bier-evpn]
2909    Zhang, Z., Przygienda, T., Sajassi, A., and J. Rabadan,
2910    "EVPN BUM Using BIER", draft-ietf-bier-evpn-04 (work in
2911    progress), December 2020.

2913 [IEEE.802.1D_2004]
2914    IEEE, "IEEE Standard for Local and metropolitan area
2915    networks: Media Access Control (MAC) Bridges", IEEE
2916    802.1D-2004, DOI 10.1109/ieeestd.2004.94569, July 2004.

2919 [IEEE.802.1Q_2014]
2920    IEEE, "IEEE Standard for Local and metropolitan area
2921    networks--Bridges and Bridged Networks", IEEE 802.1Q-2014,
2922    DOI 10.1109/ieeestd.2014.6991462, December 2014.

2926 [RFC4272] Murphy, S., "BGP Security Vulnerabilities Analysis",
2927    RFC 4272, DOI 10.17487/RFC4272, January 2006.

2930 [RFC4385] Bryant, S., Swallow, G., Martini, L., and D. McPherson,
2931    "Pseudowire Emulation Edge-to-Edge (PWE3) Control Word for
2932    Use over an MPLS PSN", RFC 4385, DOI 10.17487/RFC4385,
2933    February 2006.

2935 [RFC4664] Andersson, L., Ed. and E. Rosen, Ed., "Framework for Layer
2936    2 Virtual Private Networks (L2VPNs)", RFC 4664,
2937    DOI 10.17487/RFC4664, September 2006.

2940 [RFC4684] Marques, P., Bonica, R., Fang, L., Martini, L., Raszuk,
2941    R., Patel, K., and J. Guichard, "Constrained Route
2942    Distribution for Border Gateway Protocol/MultiProtocol
2943    Label Switching (BGP/MPLS) Internet Protocol (IP) Virtual
2944    Private Networks (VPNs)", RFC 4684, DOI 10.17487/RFC4684,
2945    November 2006.

2947 [RFC5226] Narten, T. and H. Alvestrand, "Guidelines for Writing an
2948    IANA Considerations Section in RFCs", RFC 5226,
2949    DOI 10.17487/RFC5226, May 2008.

2952 [RFC5925] Touch, J., Mankin, A., and R. Bonica, "The TCP
2953    Authentication Option", RFC 5925, DOI 10.17487/RFC5925,
2954    June 2010.

2956 [RFC6514] Aggarwal, R., Rosen, E., Morin, T., and Y. Rekhter, "BGP
2957    Encodings and Procedures for Multicast in MPLS/BGP IP
2958    VPNs", RFC 6514, DOI 10.17487/RFC6514, February 2012.

2961 [RFC6790] Kompella, K., Drake, J., Amante, S., Henderickx, W., and
2962    L. Yong, "The Use of Entropy Labels in MPLS Forwarding",
2963    RFC 6790, DOI 10.17487/RFC6790, November 2012.

2966 [RFC6952] Jethanandani, M., Patel, K., and L.
2967    Zheng, "Analysis of BGP, LDP, PCEP, and MSDP Issues According to
2968    the Keying and Authentication for Routing Protocols (KARP) Design
2969    Guide", RFC 6952, DOI 10.17487/RFC6952, May 2013.

2972 [RFC7117] Aggarwal, R., Ed., Kamite, Y., Fang, L., Rekhter, Y., and
2973    C. Kodeboniya, "Multicast in Virtual Private LAN Service
2974    (VPLS)", RFC 7117, DOI 10.17487/RFC7117, February 2014.

2977 [RFC7209] Sajassi, A., Aggarwal, R., Uttaro, J., Bitar, N.,
2978    Henderickx, W., and A. Isaac, "Requirements for Ethernet
2979    VPN (EVPN)", RFC 7209, DOI 10.17487/RFC7209, May 2014.

2982 [RFC8317] Sajassi, A., Ed., Salam, S., Drake, J., Uttaro, J.,
2983    Boutros, S., and J. Rabadan, "Ethernet-Tree (E-Tree)
2984    Support in Ethernet VPN (EVPN) and Provider Backbone
2985    Bridging EVPN (PBB-EVPN)", RFC 8317, DOI 10.17487/RFC8317,
2986    January 2018.

2988 [RFC8365] Sajassi, A., Ed., Drake, J., Ed., Bitar, N., Shekhar, R.,
2989    Uttaro, J., and W. Henderickx, "A Network Virtualization
2990    Overlay Solution Using Ethernet VPN (EVPN)", RFC 8365,
2991    DOI 10.17487/RFC8365, March 2018.

2994 22.3. URIs

2996 [1] https://tools.ietf.org/rfcdiff?url1=https://www.rfc-editor.org/rfc/rfc7432.txt&url2=https://www.ietf.org/archive/id/draft-ietf-bess-rfc7432bis-01.txt

3000 Appendix A. Acknowledgments for This Document (2021)

3002 Appendix B. Contributors for This Document (2021)

3004 In addition to the authors listed on the front page, the following
3005 co-authors have also contributed to this document:

3007 Appendix C. Acknowledgments from the First Edition (2015)

3009 Special thanks to Yakov Rekhter for reviewing this document several
3010 times and providing valuable comments, and for his very engaging
3011 discussions on several topics of this document that helped shape this
3012 document.
3013 We would also like to thank Pedro Marques, Kaushik Ghosh, Nischal Sheth,
3014 Robert Raszuk, Amit Shukla, and Nadeem Mohammed for discussions that
3015 helped shape this document. We would also like to thank Han Nguyen
3016 for his comments and support of this work. We would also like to
3017 thank Steve Kensil and Reshad Rahman for their reviews. We would like
3018 to thank Jorge Rabadan for his contribution to Section 5 of this
3019 document. We would like to thank Thomas Morin for his review of this
3020 document and his contribution of Section 8.7. Many thanks to Jakob
3021 Heitz for his help to improve several sections of this document.

3023 We would also like to thank Clarence Filsfils, Dennis Cai, Quaizar
3024 Vohra, Kireeti Kompella, and Apurva Mehta for their contributions to
3025 this document.

3027 Last but not least, special thanks to Giles Heron (our WG chair) for
3028 his detailed review of this document in preparation for WG Last Call
3029 and for making many valuable suggestions.

3031 C.1. Contributors from the First Edition (2015)

3033 In addition to the authors listed on the front page, the following
3034 co-authors have also contributed to this document:

3036 Keyur Patel
3037 Samer Salam
3038 Sami Boutros
3039 Cisco

3041 Yakov Rekhter
3042 Ravi Shekhar
3043 Juniper Networks

3045 Florin Balus
3046 Nuage Networks

3048 C.2.
Authors from the First Edition (2015)

3050 Original Authors:

3052 Ali Sajassi
3053 Cisco
3054 EMail: sajassi@cisco.com

3056 Rahul Aggarwal
3057 Arktan

3059 EMail: raggarwa_1@yahoo.com

3061 Nabil Bitar
3062 Verizon Communications

3064 EMail: nabil.n.bitar@verizon.com

3066 Aldrin Isaac
3067 Bloomberg

3069 EMail: aisaac71@bloomberg.net

3071 James Uttaro
3072 AT&T

3074 EMail: uttaro@att.com

3076 John Drake
3077 Juniper Networks

3079 EMail: jdrake@juniper.net

3081 Wim Henderickx
3082 Alcatel-Lucent

3084 EMail: wim.henderickx@alcatel-lucent.com

3086 Authors' Addresses

3088 Ali Sajassi
3089 Cisco

3091 Email: sajassi@cisco.com

3093 Luc Andre Burdet (editor)
3094 Cisco

3096 Email: lburdet@cisco.com

3097 John Drake
3098 Juniper

3100 Email: jdrake@juniper.net

3102 Jorge Rabadan
3103 Nokia

3105 Email: jorge.rabadan@nokia.com