idnits 2.17.1 draft-ietf-bess-rfc7432bis-00.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- No issues found here. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD', or 'RECOMMENDED' is not an accepted usage according to RFC 2119. Please use uppercase 'NOT' together with RFC 2119 keywords (if that is what you mean). Found 'SHALL not' in this paragraph: * MAC Mobility EC SHALL not be attached to routes having GMAC EC on the sending side and SHALL be ignored on the receiving side. -- The document date (December 21, 2020) is 1220 days in the past. Is this intentional? 
Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Missing Reference: 'EVI' is mentioned on line 2167, but not defined == Missing Reference: 'VLAN' is mentioned on line 974, but not defined == Missing Reference: 'BD' is mentioned on line 2167, but not defined == Missing Reference: 'DCB' is mentioned on line 2653, but not defined == Missing Reference: 'I-D.ietf-bier-evpn' is mentioned on line 2668, but not defined == Unused Reference: 'I-D.ietf-bess-evpn-vpws-fxc' is defined on line 2779, but no explicit reference was found in the text == Outdated reference: A later version (-08) exists of draft-ietf-bess-evpn-vpws-fxc-01 == Outdated reference: A later version (-21) exists of draft-ietf-bess-evpn-igmp-mld-proxy-05 == Outdated reference: A later version (-14) exists of draft-ietf-bess-mvpn-evpn-aggregation-label-04 -- Obsolete informational reference (is this intentional?): RFC 5226 (Obsoleted by RFC 8126) Summary: 0 errors (**), 0 flaws (~~), 11 warnings (==), 2 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 BESS Working Group A. Sajassi, Ed. 3 Internet-Draft Cisco 4 Intended status: Standards Track J. Drake 5 Expires: June 24, 2021 Juniper 6 J. Rabadan 7 Nokia 8 December 21, 2020 10 BGP MPLS-Based Ethernet VPN 11 draft-ietf-bess-rfc7432bis-00 13 Abstract 15 This document describes procedures for BGP MPLS-based Ethernet VPNs 16 (EVPN). The procedures described here meet the requirements 17 specified in RFC 7209 -- "Requirements for Ethernet VPN (EVPN)". 19 Status of This Memo 21 This Internet-Draft is submitted in full conformance with the 22 provisions of BCP 78 and BCP 79. 
24 Internet-Drafts are working documents of the Internet Engineering 25 Task Force (IETF). Note that other groups may also distribute 26 working documents as Internet-Drafts. The list of current Internet- 27 Drafts is at https://datatracker.ietf.org/drafts/current/. 29 Internet-Drafts are draft documents valid for a maximum of six months 30 and may be updated, replaced, or obsoleted by other documents at any 31 time. It is inappropriate to use Internet-Drafts as reference 32 material or to cite them other than as "work in progress." 34 This Internet-Draft will expire on June 24, 2021. 36 Copyright Notice 38 Copyright (c) 2020 IETF Trust and the persons identified as the 39 document authors. All rights reserved. 41 This document is subject to BCP 78 and the IETF Trust's Legal 42 Provisions Relating to IETF Documents 43 (https://trustee.ietf.org/license-info) in effect on the date of 44 publication of this document. Please review these documents 45 carefully, as they describe your rights and restrictions with respect 46 to this document. Code Components extracted from this document must 47 include Simplified BSD License text as described in Section 4.e of 48 the Trust Legal Provisions and are provided without warranty as 49 described in the Simplified BSD License.
51 Table of Contents
53   1. Introduction
54   2. Specification of Requirements
55   3. Terminology
56   4. BGP MPLS-Based EVPN Overview
57   5. Ethernet Segment
58   6. Ethernet Tag ID
59     6.1. VLAN-Based Service Interface
60     6.2. VLAN Bundle Service Interface
61       6.2.1. Port-Based Service Interface
62     6.3. VLAN-Aware Bundle Service Interface
63       6.3.1. Port-Based VLAN-Aware Service Interface
64     6.4. EVPN PE Model
65   7. BGP EVPN Routes
66     7.1. Ethernet Auto-discovery Route
67     7.2. MAC/IP Advertisement Route
68     7.3. Inclusive Multicast Ethernet Tag Route
69     7.4. Ethernet Segment Route
70     7.5. ESI Label Extended Community
71     7.6. ES-Import Route Target
72     7.7. MAC Mobility Extended Community
73     7.8. Default Gateway Extended Community
74     7.9. EVPN Layer 2 Attributes Extended Community
75       7.9.1. EVPN Layer 2 Attributes Partitioning
76     7.10. Route Distinguisher Assignment per MAC-VRF
77     7.11. Route Targets
78       7.11.1. Auto-derivation from the Ethernet Tag ID
79     7.12. Route Prioritization
80   8. Multihoming Functions
81     8.1. Multihomed Ethernet Segment Auto-discovery
82       8.1.1. Constructing the Ethernet Segment Route
83     8.2. Fast Convergence
84       8.2.1. Constructing Ethernet A-D per Ethernet Segment Route
85         8.2.1.1. Ethernet A-D Route Targets
86     8.3. Split Horizon
87       8.3.1. ESI Label Assignment
88         8.3.1.1. Ingress Replication
89         8.3.1.2. P2MP MPLS LSPs
90         8.3.1.3. MP2MP MPLS LSPs
91     8.4. Aliasing and Backup Path
92       8.4.1. Constructing Ethernet A-D per EVPN Instance Route
93     8.5. Designated Forwarder Election
94     8.6. Signaling Primary and Backup DF Elected PEs
95     8.7. Interoperability with Single-Homing PEs
97   9. Determining Reachability to Unicast MAC Addresses
98     9.1. Local Learning
99     9.2. Remote Learning
100       9.2.1. Constructing MAC/IP Address Advertisement
101       9.2.2. Route Resolution
102   10. ARP and ND
103     10.1. Default Gateway
104       10.1.1. Best Path selection for Default Gateway
105   11. Handling of Multi-destination Traffic
106     11.1. Constructing Inclusive Multicast Ethernet Tag Route
107     11.2. P-Tunnel Identification
108   12. Processing of Unknown Unicast Packets
109     12.1. Ingress Replication
110     12.2. P2MP MPLS LSPs
111   13. Forwarding Unicast Packets
112     13.1. Forwarding Packets Received from a CE
113     13.2. Forwarding Packets Received from a Remote PE
114       13.2.1. Unknown Unicast Forwarding
115       13.2.2. Known Unicast Forwarding
116   14. Load Balancing of Unicast Packets
117     14.1. Load Balancing of Traffic from a PE to Remote CEs
118       14.1.1. Single-Active Redundancy Mode
119       14.1.2. All-Active Redundancy Mode
120     14.2. Load Balancing of Traffic between a PE and a Local CE
121       14.2.1. Data-Plane Learning
122       14.2.2. Control-Plane Learning
123   15. MAC Mobility
124     15.1. MAC Duplication Issue
125     15.2. Sticky MAC Addresses
126     15.3. Loop Protection
127   16. Multicast and Broadcast
128     16.1. Ingress Replication
129     16.2. P2MP or MP2MP LSPs
130       16.2.1. Inclusive Trees
131   17. Convergence
132     17.1. Transit Link and Node Failures between PEs
133     17.2. PE Failures
134     17.3. PE-to-CE Network Failures
135   18. Frame Ordering
136     18.1. Flow Label
137   19. Use of Domain-wide Common Block (DCB) Labels
138   20. Security Considerations
139   21. IANA Considerations
140   22. References
141     22.1. Normative References
142     22.2. Informative References
143   Appendix A. Acknowledgments for This Document (2020)
144   Appendix B. Contributors
145   Appendix C. Acknowledgments from the First Edition (2015)
146     C.1. Contributors from the First Edition (2015)
147     C.2. Authors from the First Edition (2015)
148   Authors' Addresses
150 1.
Introduction 152 Virtual Private LAN Service (VPLS), as defined in [RFC4664], 153 [RFC4761], and [RFC4762], is a proven and widely deployed technology. 154 However, the existing solution has a number of limitations when it 155 comes to multihoming and redundancy, multicast optimization, 156 provisioning simplicity, flow-based load balancing, and multipathing; 157 these limitations are important considerations for Data Center (DC) 158 deployments. [RFC7209] describes the motivation for a new solution 159 to address these limitations. It also outlines a set of requirements 160 that the new solution must address. 162 This document describes procedures for a BGP MPLS-based solution 163 called Ethernet VPN (EVPN) to address the requirements specified in 164 [RFC7209]. Please refer to [RFC7209] for the detailed requirements 165 and motivation. EVPN requires extensions to existing IP/MPLS 166 protocols as described in this document. In addition to these 167 extensions, EVPN uses several building blocks from existing MPLS 168 technologies. 170 2. Specification of Requirements 172 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 173 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 174 document are to be interpreted as described in [RFC2119]. 176 3. Terminology 178 BD: Broadcast Domain. In a bridged network, the broadcast domain 179 corresponds to a Virtual LAN (VLAN), where a VLAN is typically 180 represented by a single VLAN ID (VID) but can be represented by 181 several VIDs where Shared VLAN Learning (SVL) is used per 182 [IEEE.802.1Q_2014]. 184 Bridge Table: An instantiation of a broadcast domain on a MAC-VRF. 186 CE: Customer Edge device, e.g., a host, router, or switch. 188 EVI: An EVPN instance spanning the Provider Edge (PE) devices 189 participating in that EVPN. An EVI may be comprised of one BD 190 (VLAN-based or VLAN Bundle services) or multiple BDs (VLAN-aware 191 Bundle services). 
193 MAC-VRF: A Virtual Routing and Forwarding table for Media Access 194 Control (MAC) addresses on a PE. 196 Ethernet Segment (ES): When a customer site (device or network) is 197 connected to one or more PEs via a set of Ethernet links, then 198 that set of links is referred to as an 'Ethernet segment'. 200 Ethernet Segment Identifier (ESI): A unique non-zero identifier that 201 identifies an Ethernet segment is called an 'Ethernet Segment 202 Identifier'. 204 Ethernet Tag ID: A normalized network-wide ID that represents 205 a BD and can be any of the following IDs: VLAN IDs (including 206 Q-in-Q tags), configured IDs, VNIs (Virtual Extensible Local Area 207 Network (VXLAN) Network Identifiers), normalized VIDs, I-SIDs 208 (Service Instance Identifiers), etc. When used for the purpose of 209 DF Election, the Ethernet Tag ID for a given BD is configured 210 consistently across the multihomed PEs attached to that ES. It 211 also refers to the non-zero identifier used in the EVPN routes for 212 VLAN-aware bundle service. 214 LACP: Link Aggregation Control Protocol. 216 MP2MP: Multipoint to Multipoint. 218 MP2P: Multipoint to Point. 220 P2MP: Point to Multipoint. 222 P2P: Point to Point. 224 PE: Provider Edge device. 226 Single-Active Redundancy Mode: When only a single PE, among all the 227 PEs attached to an Ethernet segment, is allowed to forward traffic 228 to/from that Ethernet segment for a given VLAN, then the Ethernet 229 segment is defined to be operating in Single-Active redundancy 230 mode. 232 All-Active Redundancy Mode: When all PEs attached to an Ethernet 233 segment are allowed to forward known unicast traffic to/from that 234 Ethernet segment for a given VLAN, then the Ethernet segment is 235 defined to be operating in All-Active redundancy mode. 237 BUM: Broadcast, unknown unicast, and multicast. 239 DF: Designated Forwarder. 241 NDF: Non-Designated Forwarder. 243 VID: VLAN Identifier.
245 DCB: Domain-wide Common Block (of labels), as in [I-D.ietf-bess- 246 mvpn-evpn-aggregation-label]. 248 AC: Attachment Circuit. 250 4. BGP MPLS-Based EVPN Overview 252 This section provides an overview of EVPN. An EVPN instance 253 comprises Customer Edge devices (CEs) that are connected to Provider 254 Edge devices (PEs) that form the edge of the MPLS infrastructure. A 255 CE may be a host, a router, or a switch. The PEs provide virtual 256 Layer 2 bridged connectivity between the CEs. There may be multiple 257 EVPN instances in the provider's network. 259 The PEs may be connected by an MPLS Label Switched Path (LSP) 260 infrastructure, which provides the benefits of MPLS technology, such 261 as fast reroute, resiliency, etc. The PEs may also be connected by 262 an IP infrastructure, in which case IP/GRE (Generic Routing 263 Encapsulation) tunneling or other IP tunneling can be used between 264 the PEs. The detailed procedures in this document are specified only 265 for MPLS LSPs as the tunneling technology. However, these procedures 266 are designed to be extensible to IP tunneling as the Packet Switched 267 Network (PSN) tunneling technology. 269 In an EVPN, MAC learning between PEs occurs not in the data plane (as 270 happens with traditional bridging in VPLS [RFC4761] [RFC4762]) but in 271 the control plane. Control-plane learning offers greater control 272 over the MAC learning process, such as restricting who learns what, 273 and the ability to apply policies. Furthermore, the control plane 274 chosen for advertising MAC reachability information is multi-protocol 275 (MP) BGP (similar to IP VPNs [RFC4364]). This provides flexibility 276 and the ability to preserve the "virtualization" or isolation of 277 groups of interacting agents (hosts, servers, virtual machines) from 278 each other. 
In EVPN, PEs advertise the MAC addresses learned from 279 the CEs that are connected to them, along with an MPLS label, to 280 other PEs in the control plane using Multiprotocol BGP (MP-BGP). 281 Control-plane learning enables load balancing of traffic to and from 282 CEs that are multihomed to multiple PEs. This is in addition to load 283 balancing across the MPLS core via multiple LSPs between the same 284 pair of PEs. In other words, it allows CEs to connect to multiple 285 active points of attachment. It also improves convergence times in 286 the event of certain network failures. 288 However, learning between PEs and CEs is done by the method best 289 suited to the CE: data-plane learning, IEEE 802.1x, the Link Layer 290 Discovery Protocol (LLDP), IEEE 802.1aq, Address Resolution Protocol 291 (ARP), management plane, or other protocols. 293 It is a local decision as to whether the Layer 2 forwarding table on 294 a PE is populated with all the MAC destination addresses known to the 295 control plane, or whether the PE implements a cache-based scheme. 296 For instance, the MAC forwarding table may be populated only with the 297 MAC destinations of the active flows transiting a specific PE. 299 The policy attributes of EVPN are very similar to those of IP-VPN. 300 An EVPN instance requires a Route Distinguisher (RD) that is unique 301 per MAC-VRF and one or more globally unique Route Targets (RTs). A 302 CE attaches to a BD on a PE, on an Ethernet interface that may be 303 configured for one or more Ethernet tags. If the Ethernet Tags are 304 VLAN IDs, some deployment scenarios guarantee uniqueness of VLAN IDs 305 across EVPN instances: all points of attachment for a given EVPN 306 instance use the same VLAN ID, and no other EVPN instance uses this 307 VLAN ID. This document refers to this case as a "Unique VLAN EVPN" 308 and describes simplified procedures to optimize for it. 310 5. 
Ethernet Segment 312 As indicated in [RFC7209], each Ethernet segment needs a unique 313 identifier in an EVPN. This section defines how such identifiers are 314 assigned and how they are encoded for use in EVPN signaling. Later 315 sections of this document describe the protocol mechanisms that 316 utilize the identifiers. 318 When a customer site is connected to one or more PEs via a set of 319 Ethernet links, then this set of Ethernet links constitutes an 320 "Ethernet segment". For a multihomed site, each Ethernet segment 321 (ES) is identified by a unique non-zero identifier called an Ethernet 322 Segment Identifier (ESI). An ESI is encoded as a 10-octet integer in 323 line format with the most significant octet sent first. The 324 following two ESI values are reserved: 326 - ESI 0 denotes a single-homed site. 328 - ESI {0xFF} (repeated 10 times) is known as MAX-ESI and is reserved. 330 In general, an Ethernet segment SHOULD have a non-reserved ESI that 331 is unique network wide (i.e., across all EVPN instances on all the 332 PEs). If the CE(s) constituting an Ethernet segment is (are) managed 333 by the network operator, then ESI uniqueness should be guaranteed; 334 however, if the CE(s) is (are) not managed, then the operator MUST 335 configure a network-wide unique ESI for that Ethernet segment. This 336 is required to enable auto-discovery of Ethernet segments and 337 Designated Forwarder (DF) election. 339 In a network with managed and non-managed CEs, the ESI has the 340 following format:
342   +---+---+---+---+---+---+---+---+---+---+
343   | T |          ESI Value                |
344   +---+---+---+---+---+---+---+---+---+---+
346 Where: 348 T (ESI Type) is a 1-octet field (most significant octet) that 349 specifies the format of the remaining 9 octets (ESI Value). The 350 following six ESI types can be used: 352 - Type 0 (T=0x00) - This type indicates an arbitrary 9-octet ESI 353 value, which is managed and configured by the operator.
355 - Type 1 (T=0x01) - When IEEE 802.1AX LACP is used between the PEs 356 and CEs, this ESI type indicates an auto-generated ESI value 357 determined from LACP by concatenating the following parameters: 359 + CE LACP System MAC address (6 octets). The CE LACP System MAC 360 address MUST be encoded in the high-order 6 octets of the ESI 361 Value field. 363 + CE LACP Port Key (2 octets). The CE LACP port key MUST be 364 encoded in the 2 octets next to the System MAC address. 366 + The remaining octet will be set to 0x00. 368 As far as the CE is concerned, it would treat the multiple PEs that 369 it is connected to as the same switch. This allows the CE to 370 aggregate links that are attached to different PEs in the same 371 bundle. 373 This mechanism could be used only if it produces ESIs that satisfy 374 the uniqueness requirement specified above. 376 - Type 2 (T=0x02) - This type is used in the case of indirectly 377 connected hosts via a bridged LAN between the CEs and the PEs. The 378 ESI Value is auto-generated and determined based on the Layer 2 379 bridge protocol as follows: If the Multiple Spanning Tree Protocol 380 (MSTP) is used in the bridged LAN, then the value of the ESI is 381 derived by listening to Bridge PDUs (BPDUs) on the Ethernet 382 segment. To achieve this, the PE is not required to run MSTP. 384 However, the PE must learn the Root Bridge MAC address and Bridge 385 Priority of the root of the Internal Spanning Tree (IST) by 386 listening to the BPDUs. The ESI Value is constructed as follows: 388 + Root Bridge MAC address (6 octets). The Root Bridge MAC address 389 MUST be encoded in the high-order 6 octets of the ESI Value 390 field. 392 + Root Bridge Priority (2 octets). The CE Root Bridge Priority 393 MUST be encoded in the 2 octets next to the Root Bridge MAC 394 address. 396 + The remaining octet will be set to 0x00. 398 This mechanism could be used only if it produces ESIs that satisfy 399 the uniqueness requirement specified above. 
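As an illustration only (not part of the specification), the Type 1 and Type 2 layouts described above can be sketched in Python. The function and constant names are hypothetical. Encoding the 2-octet Port Key and Root Bridge Priority in network byte order is an assumption, based on the statement that the ESI is sent most significant octet first.

```python
import struct

ESI_LEN = 10
ESI_ZERO = bytes(ESI_LEN)        # reserved: denotes a single-homed site
MAX_ESI = b"\xff" * ESI_LEN      # reserved: MAX-ESI


def mac_to_bytes(mac: str) -> bytes:
    """Parse a MAC address such as '00:11:22:33:44:55' into 6 octets."""
    raw = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    if len(raw) != 6:
        raise ValueError("a MAC address must be exactly 6 octets")
    return raw


def _mac_plus_u16(esi_type: int, mac: str, value: int) -> bytes:
    # Layout shared by Type 1 and Type 2:
    # T (1) | MAC (6, high-order) | 2-octet value | 0x00 (1)
    # Assumption: the 2-octet value is packed in network byte order.
    return bytes([esi_type]) + mac_to_bytes(mac) + struct.pack("!H", value) + b"\x00"


def esi_type1(ce_system_mac: str, ce_port_key: int) -> bytes:
    """Type 1: CE LACP System MAC address followed by the CE LACP Port Key."""
    return _mac_plus_u16(0x01, ce_system_mac, ce_port_key)


def esi_type2(root_bridge_mac: str, root_bridge_priority: int) -> bytes:
    """Type 2: Root Bridge MAC address followed by the Root Bridge Priority."""
    return _mac_plus_u16(0x02, root_bridge_mac, root_bridge_priority)


def is_reserved(esi: bytes) -> bool:
    """True for the two reserved values (ESI 0 and MAX-ESI)."""
    return esi in (ESI_ZERO, MAX_ESI)
```

For example, esi_type1("00:11:22:33:44:55", 1) yields the 10 octets 01 00 11 22 33 44 55 00 01 00. The sketch only builds the octet string; it does not check the network-wide uniqueness requirement, which remains the operator's responsibility.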
401 - Type 3 (T=0x03) - This type indicates a MAC-based ESI Value that 402 can be auto-generated or configured by the operator. The ESI Value 403 is constructed as follows: 405 + System MAC address (6 octets). The PE MAC address MUST be 406 encoded in the high-order 6 octets of the ESI Value field. 408 + Local Discriminator value (3 octets). The Local Discriminator 409 value MUST be encoded in the low-order 3 octets of the ESI Value. 411 This mechanism could be used only if it produces ESIs that satisfy 412 the uniqueness requirement specified above. 414 - Type 4 (T=0x04) - This type indicates a router-ID ESI Value that 415 can be auto-generated or configured by the operator. The ESI Value 416 is constructed as follows: 418 + Router ID (4 octets). The system router ID MUST be encoded in 419 the high-order 4 octets of the ESI Value field. 421 + Local Discriminator value (4 octets). The Local Discriminator 422 value MUST be encoded in the 4 octets next to the IP address. 424 + The low-order octet of the ESI Value will be set to 0x00. 426 This mechanism could be used only if it produces ESIs that satisfy 427 the uniqueness requirement specified above. 429 - Type 5 (T=0x05) - This type indicates an Autonomous System 430 (AS)-based ESI Value that can be auto-generated or configured by 431 the operator. The ESI Value is constructed as follows: 433 + AS number (4 octets). This is an AS number owned by the system 434 and MUST be encoded in the high-order 4 octets of the ESI Value 435 field. If a 2-octet AS number is used, the high-order extra 436 2 octets will be 0x0000. 438 + Local Discriminator value (4 octets). The Local Discriminator 439 value MUST be encoded in the 4 octets next to the AS number. 441 + The low-order octet of the ESI Value will be set to 0x00. 443 This mechanism could be used only if it produces ESIs that satisfy 444 the uniqueness requirement specified above. 446 6. 
Ethernet Tag ID 448 An Ethernet Tag ID is a normalized network-wide ID that represents a 449 BD. It is a non-zero 32-bit identifier which is also used in the 450 EVPN routes for VLAN-aware Bundle services. An EVI consists of one 451 or more BDs (one or more Ethernet Tags). Ethernet Tag IDs are 452 assigned to a given EVPN instance by the provider of the EVPN 453 service. A given Ethernet Tag ID can itself be represented by 454 multiple VIDs. In such cases, the PEs participating in that BD for a 455 given EVPN instance are responsible for performing VLAN ID 456 translation to/from locally attached CE devices, i.e., they are 457 responsible for translating local VIDs to a normalized Ethernet Tag 458 ID. 460 If a VLAN is represented by a single VID across all PE devices 461 participating in that VLAN for that EVPN instance, then there is no 462 need for VLAN translation at the PEs. Furthermore, some deployment 463 scenarios guarantee uniqueness of VIDs across all EVPN instances; all 464 points of attachment for a given EVPN instance use the same VID, and 465 no other EVPN instances use that VID. This allows the RT(s) for each 466 EVPN instance to be derived automatically from the corresponding VID, 467 as described in Section 7.11.1. 469 The following subsections discuss the relationship between broadcast 470 domains (e.g., VLANs), Ethernet Tag IDs (e.g., VIDs), and MAC-VRFs as 471 well as the setting of the Ethernet Tag ID, in the various EVPN BGP 472 routes (defined in Section 7), for the different types of service 473 interfaces described in [RFC7209]. 475 The following Ethernet Tag ID value is reserved: 477 - Ethernet Tag ID {0xFFFFFFFF} is known as MAX-ET. 479 6.1. VLAN-Based Service Interface 481 With this service interface, an EVPN instance consists of only a 482 single broadcast domain (e.g., a single VLAN). Therefore, there is a 483 one-to-one mapping between a VID on this interface and a MAC-VRF.
484 Since a MAC-VRF corresponds to a single VLAN, it consists of a single 485 bridge table corresponding to that VLAN. If the VLAN is represented 486 by multiple VIDs (e.g., a different VID per Ethernet segment per PE), 487 then each PE needs to perform VID translation for frames destined to 488 its Ethernet segment(s). In such scenarios, the Ethernet frames 489 transported over an MPLS/IP network SHOULD remain tagged with the 490 originating VID, and a VID translation MUST be supported in the data 491 path and MUST be performed on the disposition PE. The Ethernet Tag 492 ID in all EVPN routes MUST be set to 0. 494 6.2. VLAN Bundle Service Interface 496 With this service interface, an EVPN instance corresponds to multiple 497 broadcast domains (e.g., multiple VLANs); however, only a single 498 bridge table is maintained per MAC-VRF, which means multiple VLANs 499 share the same bridge table. This implies that MAC addresses MUST be 500 unique across all VLANs for that EVI in order for this service to 501 work. In other words, there is a many-to-one mapping between VLANs 502 and a MAC-VRF, and the MAC-VRF consists of a single bridge table. 503 Furthermore, a single VLAN must be represented by a single VID -- 504 e.g., no VID translation is allowed for this service interface type. 505 The MPLS-encapsulated frames MUST remain tagged with the originating 506 VID. Tag translation is NOT permitted. The Ethernet Tag ID in all 507 EVPN routes MUST be set to 0. 509 6.2.1. Port-Based Service Interface 511 This service interface is a special case of the VLAN bundle service 512 interface, where all of the VLANs on the port are part of the same 513 service and map to the same bundle. The procedures are identical to 514 those described in Section 6.2. 516 6.3. 
VLAN-Aware Bundle Service Interface 518 With this service interface, an EVPN instance consists of multiple 519 broadcast domains (e.g., multiple VLANs) with each VLAN having its 520 own bridge table -- i.e., multiple bridge tables (one per VLAN) are 521 maintained by a single MAC-VRF corresponding to the EVPN instance. 523 Broadcast, unknown unicast, or multicast (BUM) traffic is sent only 524 to the CEs in a given broadcast domain; however, the broadcast 525 domains within an EVI either MAY each have their own P-Tunnel or MAY 526 share P-Tunnels -- e.g., all of the broadcast domains in an EVI MAY 527 share a single P-Tunnel. 529 In the case where a single VLAN is represented by a single VID and 530 thus no VID translation is required, an MPLS-encapsulated packet MUST 531 carry that VID. The Ethernet Tag ID in all EVPN routes MUST be set 532 to that VID. The advertising PE MAY advertise the MPLS Label1 in the 533 MAC/IP Advertisement route representing ONLY the EVI or representing 534 both the Ethernet Tag ID and the EVI. This decision is only a local 535 matter by the advertising PE (which is also the disposition PE) and 536 doesn't affect any other PEs. 538 In the case where a single VLAN is represented by different VIDs on 539 different CEs and thus VID translation is required, a normalized 540 Ethernet Tag ID (VID) MUST be carried in the EVPN BGP routes. 541 Furthermore, the advertising PE advertises the MPLS Label1 in the 542 MAC/IP Advertisement route representing both the Ethernet Tag ID and 543 the EVI, so that upon receiving an MPLS-encapsulated packet, it can 544 identify the corresponding bridge table from the MPLS EVPN label and 545 perform Ethernet Tag ID translation ONLY at the disposition PE -- 546 i.e., the Ethernet frames transported over the MPLS/IP network MUST 547 remain tagged with the originating VID, and VID translation is 548 performed on the disposition PE. 
The Ethernet Tag ID in all EVPN 549 routes MUST be set to the normalized Ethernet Tag ID assigned by the 550 EVPN provider. 552 6.3.1. Port-Based VLAN-Aware Service Interface 554 This service interface is a special case of the VLAN-aware bundle 555 service interface, where all of the VLANs on the port are part of the 556 same service and are mapped to a single bundle but without any VID 557 translation. The procedures are a subset of those described in 558 Section 6.3. 560 6.4. EVPN PE Model 562 Since this document discusses EVPN operation in relationship to 563 MAC-VRF, EVI, Broadcast Domain (BD), and Bridge Table (BT), it is important 564 to understand the relationship between these terms. The 565 PE model below illustrates the relationship 566 among them.
568    +--------------------------------------------------+
569    |                                                  |
570    |              +------------------+     EVPN PE    |
571    |  Attachment  | +------------------+              |
572    | Circuit(AC1) | |   +----------+   |            MPLS/NVO tnl
573   ----------------------*Bridge    |   |              +-----
574    |              | |   |Table(BT1)|   |             / \    \
575    |              | |   |          |<------------------> |Eth|
576    |              | |   |  VLAN x  |   |             \ /    /
577    |              | |   +----------+   |              +-----
578    |              | |       ...        |              |
579    |              | |   +----------+   |            MPLS/NVO tnl
580    |              | |   |Bridge    |   |              +-----
581    |              | |   |Table(BT2)|   |             / \    \
582    |              | |   |          |<------------------> |Eth|
583   ----------------------*  VLAN y  |   |             \ /    /
584    |  AC2         | |   +----------+   |              +-----
585    |              | |     MAC-VRF1     |              |
586    |              +-+    RD1/RT1       |              |
587    |                +------------------+              |
588    |                                                  |
589    |                                                  |
590    +--------------------------------------------------+
592 Figure 1: EVPN PE Model
594 A tenant configured for an EVPN service instance (i.e., EVI) on a PE 595 is instantiated by a single MAC Virtual Routing and Forwarding table 596 (MAC-VRF) on that PE. A MAC-VRF consists of one or more Bridge 597 Tables (BTs) where each BT corresponds to a VLAN (broadcast domain - 598 BD).
   If a service interface for an EVPN PE is configured in VLAN-Based
   mode (i.e., Section 6.1), then there is only a single BT per MAC-VRF
   (per EVI) - i.e., there is only one tenant VLAN per EVI.  However, if
   a service interface for an EVPN PE is configured in VLAN-Aware Bundle
   mode (i.e., Section 6.3), then there are several BTs per MAC-VRF (per
   EVI) - i.e., there are several tenant VLANs per EVI.  The
   relationship among these terms can be summarized as follows:

   -  An EVI can consist of one or more BDs.  Furthermore, a MAC-VRF can
      consist of one or more BTs.  A BD is identified by an Ethernet Tag
      ID which is typically represented by a single VLAN ID (VID);
      however, it can be represented by multiple VIDs (i.e., Shared VLAN
      Learning (SVL) mode in 802.1Q).

   -  In VLAN-based mode, there is one EVI per VLAN and thus one BD/BT
      per VLAN.  Furthermore, there is one BT per MAC-VRF.

   -  The VLAN-bundle service can be considered analogous to SVL mode in
      802.1Q, i.e., one BD per EVI and one BT per MAC-VRF with multiple
      VIDs representing that BD.

   -  In VLAN-aware bundle service, there is one EVI with multiple BDs
      where each BD is represented by a VLAN.  Furthermore, there are
      multiple BTs in a single MAC-VRF.

   Since a single tenant subnet is typically (and in this document)
   represented by a VLAN (and thus supported by a single BT), for a
   given tenant there are as many BTs as there are subnets as shown in
   the PE model above.

   A MAC-VRF is identified by its corresponding route target and route
   distinguisher.  If operating in EVPN VLAN-Based mode, then a
   receiving PE that receives an EVPN route with MAC-VRF route target
   can identify the corresponding BT; however, if operating in EVPN
   VLAN-Aware Bundle mode, then the receiving PE needs both the MAC-VRF
   route target and VLAN ID in order to identify the corresponding BT.

7.  BGP EVPN Routes

   This document defines a new BGP Network Layer Reachability
   Information (NLRI) called the EVPN NLRI.

   The format of the EVPN NLRI is as follows:

                 +-----------------------------------+
                 |    Route Type (1 octet)           |
                 +-----------------------------------+
                 |     Length (1 octet)              |
                 +-----------------------------------+
                 | Route Type specific (variable)    |
                 +-----------------------------------+

   The Route Type field defines the encoding of the rest of the EVPN
   NLRI (Route Type specific EVPN NLRI).

   The Length field indicates the length in octets of the Route Type
   specific field of the EVPN NLRI.

   This document defines the following Route Types:

      + 1 - Ethernet Auto-Discovery (A-D) route
      + 2 - MAC/IP Advertisement route
      + 3 - Inclusive Multicast Ethernet Tag route
      + 4 - Ethernet Segment route

   The detailed encoding and procedures for these route types are
   described in subsequent sections.

   The EVPN NLRI is carried in BGP [RFC4271] using BGP Multiprotocol
   Extensions [RFC4760] with an Address Family Identifier (AFI) of 25
   (L2VPN) and a Subsequent Address Family Identifier (SAFI) of 70
   (EVPN).  The NLRI field in the MP_REACH_NLRI/MP_UNREACH_NLRI
   attribute contains the EVPN NLRI (encoded as specified above).

   In order for two BGP speakers to exchange labeled EVPN NLRI, they
   must use BGP Capabilities Advertisements to ensure that they both are
   capable of properly processing such NLRI.  This is done as specified
   in [RFC4760], by using capability code 1 (multiprotocol BGP) with an
   AFI of 25 (L2VPN) and a SAFI of 70 (EVPN).

7.1.  Ethernet Auto-discovery Route

   An Ethernet A-D route type specific EVPN NLRI consists of the
   following:

                +---------------------------------------+
                |  Route Distinguisher (RD) (8 octets)  |
                +---------------------------------------+
                |Ethernet Segment Identifier (10 octets)|
                +---------------------------------------+
                |  Ethernet Tag ID (4 octets)           |
                +---------------------------------------+
                |  MPLS Label (3 octets)                |
                +---------------------------------------+

   For the purpose of BGP route key processing, only the Ethernet
   Segment Identifier and the Ethernet Tag ID are considered to be part
   of the prefix in the NLRI.  The MPLS Label field is to be treated as
   a route attribute as opposed to being part of the route.

   The MPLS Label field is encoded as 3 octets, where the high-order
   20 bits contain the label value.

   For procedures and usage of this route, please see Sections 8.2
   ("Fast Convergence") and 8.4 ("Aliasing and Backup Path").

7.2.  MAC/IP Advertisement Route

   A MAC/IP Advertisement route type specific EVPN NLRI consists of the
   following:

                +---------------------------------------+
                |  RD (8 octets)                        |
                +---------------------------------------+
                |Ethernet Segment Identifier (10 octets)|
                +---------------------------------------+
                |  Ethernet Tag ID (4 octets)           |
                +---------------------------------------+
                |  MAC Address Length (1 octet)         |
                +---------------------------------------+
                |  MAC Address (6 octets)               |
                +---------------------------------------+
                |  IP Address Length (1 octet)          |
                +---------------------------------------+
                |  IP Address (0, 4, or 16 octets)      |
                +---------------------------------------+
                |  MPLS Label1 (3 octets)               |
                +---------------------------------------+
                |  MPLS Label2 (0 or 3 octets)          |
                +---------------------------------------+

   For the purpose of BGP route key processing, only the Ethernet Tag
   ID, MAC Address Length, MAC Address, IP Address Length, and IP
   Address fields are considered to be part of the prefix in the NLRI.
   The Ethernet Segment Identifier, MPLS Label1, and MPLS Label2 fields
   are to be treated as route attributes as opposed to being part of the
   "route".  Both the IP and MAC address lengths are in bits.

   The MPLS Label1 and MPLS Label2 fields are encoded as 3 octets, where
   the high-order 20 bits contain the label value.

   For procedures and usage of this route, please see Sections 9
   ("Determining Reachability to Unicast MAC Addresses") and 14
   ("Load Balancing of Unicast Packets").

7.3.  Inclusive Multicast Ethernet Tag Route

   An Inclusive Multicast Ethernet Tag route type specific EVPN NLRI
   consists of the following:

                +---------------------------------------+
                |  RD (8 octets)                        |
                +---------------------------------------+
                |  Ethernet Tag ID (4 octets)           |
                +---------------------------------------+
                |  IP Address Length (1 octet)          |
                +---------------------------------------+
                |  Originating Router's IP Address      |
                |          (4 or 16 octets)             |
                +---------------------------------------+

   For procedures and usage of this route, please see Sections 11
   ("Handling of Multi-destination Traffic"), 12
   ("Processing of Unknown Unicast Packets"), and 16
   ("Multicast and Broadcast").  The IP address length is in bits.  For
   the purpose of BGP route key processing, only the Ethernet Tag ID, IP
   Address Length, and Originating Router's IP Address fields are
   considered to be part of the prefix in the NLRI.

7.4.  Ethernet Segment Route

   An Ethernet Segment route type specific EVPN NLRI consists of the
   following:

                +---------------------------------------+
                |  RD (8 octets)                        |
                +---------------------------------------+
                |Ethernet Segment Identifier (10 octets)|
                +---------------------------------------+
                |  IP Address Length (1 octet)          |
                +---------------------------------------+
                |  Originating Router's IP Address      |
                |          (4 or 16 octets)             |
                +---------------------------------------+

   For procedures and usage of this route, please see Section 8.5
   ("Designated Forwarder Election").  The IP address length is in bits.
   For the purpose of BGP route key processing, only the Ethernet
   Segment ID, IP Address Length, and Originating Router's IP Address
   fields are considered to be part of the prefix in the NLRI.

7.5.  ESI Label Extended Community

   This Extended Community is a new transitive Extended Community having
   a Type field value of 0x06 and the Sub-Type 0x01.
   It may be advertised along with Ethernet Auto-discovery routes, and
   it enables split-horizon procedures for multihomed sites as described
   in Section 8.3 ("Split Horizon").  The ESI Label field represents an
   ES by the advertising PE, and it is used in split-horizon filtering
   by other PEs that are connected to the same multihomed Ethernet
   segment.

   The ESI Label field is encoded as 3 octets, where the high-order
   20 bits contain the label value.

   The ESI label value MAY be zero if no split-horizon filtering
   procedures are required in any of the VLANs of the Ethernet Segment.
   This is the case in [RFC8214] or Ethernet Segments using Local Bias
   procedures in [I-D.ietf-bess-evpn-mh-split-horizon].

   Each ESI Label extended community is encoded as an 8-octet value, as
   follows:

    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Type=0x06     | Sub-Type=0x01 | Flags(1 octet)|  Reserved=0   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Reserved=0   |          ESI Label                            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The low-order bit of the Flags octet is defined as the
   "Single-Active" bit.  A value of 0 means that the multihomed site
   is operating in All-Active redundancy mode, and a value of 1 means
   that the multihomed site is operating in Single-Active redundancy
   mode.

7.6.  ES-Import Route Target

   This is a new transitive Route Target extended community carried with
   the Ethernet Segment route.  When used, it enables all the PEs
   connected to the same multihomed site to import the Ethernet Segment
   routes.  The value is derived automatically for the ESI Types 1, 2,
   and 3, by encoding the high-order 6-octet portion of the 9-octet ESI
   Value, which corresponds to a MAC address, in the ES-Import Route
   Target.
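   The auto-derivation just described can be sketched in a few lines of
   Python.  This is a minimal illustration, not a normative encoding:
   the function name es_import_rt_from_esi is hypothetical, and the
   sketch assumes the ESI layout of Section 5 (a 1-octet ESI Type
   followed by the 9-octet ESI Value) together with the Type=0x06 /
   Sub-Type=0x02 header of this Extended Community.

```python
def es_import_rt_from_esi(esi: bytes) -> bytes:
    """Derive the 8-octet ES-Import Route Target from a 10-octet ESI.

    Hypothetical helper for illustration only. Assumes the ESI is a
    1-octet ESI Type followed by a 9-octet ESI Value (Section 5).
    """
    if len(esi) != 10:
        raise ValueError("ESI must be exactly 10 octets")
    esi_type, esi_value = esi[0], esi[1:]
    if esi_type not in (1, 2, 3):
        # The text defines automatic derivation only for ESI Types 1, 2, 3.
        raise ValueError("auto-derivation applies to ESI Types 1, 2, and 3")
    mac_portion = esi_value[:6]  # high-order 6 octets of the 9-octet value
    # Type=0x06, Sub-Type=0x02, then the 6-octet ES-Import value.
    return bytes([0x06, 0x02]) + mac_portion
```

   For example, a Type 1 ESI whose value starts with the MAC address
   00:11:22:33:44:55 yields the octets 06 02 00 11 22 33 44 55.  PEs
   attached to the same segment derive the same value and can therefore
   use it as a common import filter.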
   The format of this Extended Community is as follows:

    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Type=0x06     | Sub-Type=0x02 |          ES-Import            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     ES-Import Cont'd                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   This document expands the definition of the Route Target extended
   community to allow the value of the high-order octet (Type field) to
   be 0x06 (in addition to the values specified in [RFC4360]).  The
   low-order octet (Sub-Type field) value 0x02 indicates that this
   Extended Community is of type "Route Target".  The new Type field
   value 0x06 indicates that the structure of this RT is a 6-octet value
   (e.g., a MAC address).  A BGP speaker that implements RT Constraint
   [RFC4684] MUST apply the RT Constraint procedures to the ES-Import RT
   as well.

   For procedures and usage of this attribute, please see Section 8.1
   ("Multihomed Ethernet Segment Auto-discovery").

7.7.  MAC Mobility Extended Community

   This Extended Community is a new transitive Extended Community having
   a Type field value of 0x06 and the Sub-Type 0x00.  It may be
   advertised along with MAC/IP Advertisement routes.  The procedures
   for using this Extended Community are described in Section 15
   ("MAC Mobility").

   The MAC Mobility extended community is encoded as an 8-octet value,
   as follows:

    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Type=0x06     | Sub-Type=0x00 | Flags(1 octet)|  Reserved=0   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                       Sequence Number                         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The low-order bit of the Flags octet is defined as the
   "Sticky/static" flag and may be set to 1.
   A value of 1 means that the MAC address is static and cannot move.
   The sequence number is used to ensure that PEs retain the correct
   MAC/IP Advertisement route when multiple updates occur for the same
   MAC address.

7.8.  Default Gateway Extended Community

   The Default Gateway community is an Extended Community of an Opaque
   Type (see Section 3.3 of [RFC4360]).  It is a transitive community,
   which means that the first octet is 0x03.  The value of the second
   octet (Sub-Type) is 0x0d (Default Gateway) as assigned by IANA.  The
   Value field of this community is reserved (set to 0 by the senders,
   ignored by the receivers).  For procedures and usage of this
   attribute, please see Section 10.1 ("Default Gateway").

7.9.  EVPN Layer 2 Attributes Extended Community

   [RFC8214] defines this extended community ("L2-Attr"), to be included
   with per-EVI Ethernet A-D routes and mandatory if multihoming is
   enabled.

   The usage and applicability of this Extended Community to bridging
   are clarified here.

         0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        |   MBZ         |RSV|RSV|F|C|P|B|  (MBZ = MUST Be Zero)
        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The bits in Control Flags are extended by the following defined bits:

        Name     Meaning
        ---------------------------------------------------------------
        F        If set to 1, a Flow Label MUST be present
                 when sending EVPN packets to this PE.

   For procedures and usage of this attribute, with respect to Control
   Word and Flow Label, please see Section 18 ("Frame Ordering").

   For procedures and usage of this attribute, with respect to
   Primary-Backup bits, please see Section 8.5
   ("Designated Forwarder Election").

7.9.1.  EVPN Layer 2 Attributes Partitioning

   The information carried in the L2-Attr Extended Community may be
   ESI-specific or BD/MAC-VRF specific.
   In order to minimize the processing overhead of configuration-time
   items (such as MTU) that are not expected to change at runtime as a
   result of failures, the Extended Community from [RFC8214] is
   partitioned, with a subset of the information carried in each of the
   Ethernet A-D per EVI and Inclusive Multicast routes.

   When the EVPN Layer 2 Attributes Extended Community is added to an
   Inclusive Multicast route:

   -  BD/MAC-VRF attributes MTU, Control Word and Flow Label are
      conveyed, and

   -  per-ESI attributes P, B MUST be zero.

        +-------------------------------------------+
        |  Type (0x06) / Sub-type (0x04) (2 octets) |
        +-------------------------------------------+
        |  Control Flags (2 octets)                 |
        +-------------------------------------------+
        |  L2 MTU (2 octets)                        |
        +-------------------------------------------+
        |  Reserved (2 octets)                      |
        +-------------------------------------------+

                             1 1 1 1 1 1
         0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        |   MBZ         |  MBZ  |F|C|MBZ|  (MBZ = MUST Be Zero)
        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   When the EVPN Layer 2 Attributes Extended Community is included in an
   Ethernet A-D per EVI route:

   -  per-ESI attributes P, B are conveyed, and

   -  BD/MAC-VRF attributes MTU, Control Word and Flow Label MUST be
      zero.

        +-------------------------------------------+
        |  Type (0x06) / Sub-type (0x04) (2 octets) |
        +-------------------------------------------+
        |  Control Flags (2 octets)                 |
        +-------------------------------------------+
        |  MBZ (2 octets)                           |
        +-------------------------------------------+
        |  Reserved (2 octets)                      |
        +-------------------------------------------+

                             1 1 1 1 1 1
         0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5
        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        |   MBZ         |    MBZ    |P|B|  (MBZ = MUST Be Zero)
        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Note that in both of the above cases, the values conveyed in this
   extended community are at the granularity of an individual EVI (or
   [EVI, VLAN] for VLAN-aware bundle) and hence may vary for different
   EVIs.

7.10.  Route Distinguisher Assignment per MAC-VRF

   The Route Distinguisher (RD) MUST be set to the RD of the MAC-VRF
   that is advertising the NLRI.  An RD MUST be assigned for a given
   MAC-VRF on a PE.  This RD MUST be unique across all MAC-VRFs on a PE.
   It is RECOMMENDED to use the Type 1 RD [RFC4364].  The value field
   comprises an IP address of the PE (typically, the loopback address)
   followed by a number unique to the PE.  This number may be generated
   by the PE.  In case of VLAN-based or VLAN Bundle services, this
   number may also be generated out of the Ethernet Tag ID for the BD as
   long as the value does not exceed a length of 16 bits.  Or, in the
   Unique VLAN EVPN case, the low-order 12 bits may be the 12-bit VLAN
   ID, with the remaining high-order 4 bits set to 0.

7.11.  Route Targets

   The EVPN route MAY carry one or more Route Target (RT) attributes.
   RTs may be configured (as in IP VPNs) or may be derived
   automatically.

   If a PE uses RT Constraint, the PE advertises all such RTs using RT
   Constraints per [RFC4684].
   The use of RT Constraints allows each EVPN route to reach only those
   PEs that are configured to import at least one RT from the set of RTs
   carried in the EVPN route.

7.11.1.  Auto-derivation from the Ethernet Tag ID

   For the "Unique VLAN EVPN" scenario, it is highly desirable to
   auto-derive the RT from the Ethernet Tag ID (VLAN ID) for that EVPN
   instance.  The procedure for performing such auto-derivation is as
   follows:

   +  The Global Administrator field of the RT MUST be set to the
      Autonomous System (AS) number with which the PE is associated.

   +  The 12-bit VLAN ID MUST be encoded in the lowest 12 bits of the
      Local Administrator field, with the remaining bits set to zero.

   For VLAN-based and VLAN Bundle services, the RT may also be
   auto-derived as per the above rules but replacing the 12-bit VLAN ID
   with a 16-bit Ethernet Tag ID configured for the BD.  If the Ethernet
   Tag ID length is 24 bits, the RT for the MAC-VRF can be auto-derived
   as per [RFC8365], Section 5.1.2.1.

7.12.  Route Prioritization

   In order to achieve the Fast Convergence referred to in Section 8.2
   ("Fast Convergence"), BGP speakers SHOULD prioritize the
   advertisement, processing, and redistribution of routes based on each
   route type's importance to convergence relative to its expected or
   average scale.

   1.  Ethernet A-D per ES (Mass-Withdraw Route Type 1) and Ethernet
       Segment (Route Type 4) routes are lower scale and highly
       convergence affecting, and SHOULD be handled in first order of
       priority.

   2.  Ethernet A-D per EVI, Inclusive Multicast Ethernet Tag route, and
       IP Prefix route defined in
       [I-D.ietf-bess-evpn-prefix-advertisement] are sent for each
       Bridge or AC at medium scale and may be convergence affecting,
       and SHOULD be handled in second order of priority.

   3.  MAC advertisement route (zero and nonzero IP portion), Multicast
       Join Sync and Multicast Leave Sync routes defined in
       [I-D.ietf-bess-evpn-igmp-mld-proxy] are considered 'individual
       routes' and very-high scale or of relatively low importance for
       fast convergence and SHOULD be handled in last order of priority.

8.  Multihoming Functions

   This section discusses the functions, procedures, and associated BGP
   routes used to support multihoming in EVPN.  This covers both
   multihomed device (MHD) and multihomed network (MHN) scenarios.

8.1.  Multihomed Ethernet Segment Auto-discovery

   PEs connected to the same Ethernet segment can automatically discover
   each other with minimal to no configuration through the exchange of
   the Ethernet Segment route.

8.1.1.  Constructing the Ethernet Segment Route

   The Route Distinguisher (RD) MUST be a Type 1 RD [RFC4364].  The
   value field comprises an IP address of the PE (typically, the
   loopback address) followed by a number unique to the PE.

   The Ethernet Segment Identifier (ESI) MUST be set to the 10-octet
   value described in Section 5.

   The BGP advertisement that advertises the Ethernet Segment route MUST
   also carry an ES-Import Route Target, as defined in Section 7.6.

   The Ethernet Segment route filtering MUST be done such that the
   Ethernet Segment route is imported only by the PEs that are
   multihomed to the same Ethernet segment.  To that end, each PE that
   is connected to a particular Ethernet segment constructs an import
   filtering rule to import a route that carries the ES-Import Route
   Target, constructed from the ESI.

8.2.  Fast Convergence

   In EVPN, MAC address reachability is learned via the BGP control
   plane over the MPLS network.
   As such, in the absence of any fast protection mechanism, the network
   convergence time is a function of the number of MAC/IP Advertisement
   routes that must be withdrawn by the PE encountering a failure.  For
   highly scaled environments, this scheme yields slow convergence.

   To alleviate this, EVPN defines a mechanism to efficiently and
   quickly signal, to remote PE nodes, the need to update their
   forwarding tables upon the occurrence of a failure in connectivity to
   an Ethernet segment.  This is done by having each PE advertise a set
   of one or more Ethernet A-D per ES routes for each locally attached
   Ethernet segment (refer to Section 8.2.1 below for details on how
   these routes are constructed).  A PE may need to advertise more than
   one Ethernet A-D per ES route for a given ES because the ES may be in
   a multiplicity of EVIs and the RTs for all of these EVIs may not fit
   into a single route.  Advertising a set of Ethernet A-D per ES routes
   for the ES allows each route to contain a subset of the complete set
   of RTs.  Each Ethernet A-D per ES route is differentiated from the
   other routes in the set by a different Route Distinguisher (RD).

   Upon a failure in connectivity to the attached segment, the PE
   withdraws the corresponding set of Ethernet A-D per ES routes.  This
   triggers all PEs that receive the withdrawal to update their next-hop
   adjacencies for all MAC addresses associated with the Ethernet
   segment in question.  If no other PE had advertised an Ethernet A-D
   route for the same segment, then the PE that received the withdrawal
   simply invalidates the MAC entries for that segment.  Otherwise, the
   PE updates its next-hop adjacencies accordingly.

8.2.1.  Constructing Ethernet A-D per Ethernet Segment Route

   This section describes the procedures used to construct the Ethernet
   A-D per ES route, which is used for fast convergence (as discussed
   above) and for advertising the ESI label used for split-horizon
   filtering (as discussed in Section 8.3).  Support of this route is
   REQUIRED.

   The Route Distinguisher (RD) MUST be a Type 1 RD [RFC4364].  The
   value field comprises an IP address of the PE (typically, the
   loopback address) followed by a number unique to the PE.

   The Ethernet Segment Identifier MUST be a 10-octet entity as
   described in Section 5 ("Ethernet Segment").  The Ethernet A-D route
   is not needed when the Segment Identifier is set to 0 (e.g.,
   single-homed scenarios).  An exception to this rule is described in
   [RFC8317].

   The Ethernet Tag ID MUST be set to MAX-ET.

   The MPLS label in the NLRI MUST be set to 0.

   The ESI Label extended community MUST be included in the route.  If
   All-Active redundancy mode is desired, then the "Single-Active" bit
   in the flags of the ESI Label extended community MUST be set to 0 and
   the MPLS label in that Extended Community MUST be set to a valid MPLS
   label value.  The MPLS label in this Extended Community is referred
   to as the ESI label and MUST have the same value in each Ethernet A-D
   per ES route advertised for the ES.  This label MUST be a downstream
   assigned MPLS label if the advertising PE is using ingress
   replication for receiving multicast, broadcast, or unknown unicast
   traffic from other PEs.  If the advertising PE is using P2MP MPLS
   LSPs for sending multicast, broadcast, or unknown unicast traffic,
   then this label MUST be an upstream assigned MPLS label, unless DCB
   allocated labels are used.  The usage of this label is described in
   Section 8.3.
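   As an illustration of the construction rules in this section, the
   following Python sketch encodes an Ethernet A-D per ES NLRI and the
   accompanying ESI Label extended community.  It is a minimal sketch
   under stated assumptions rather than a reference encoder: all
   function names are hypothetical, MAX-ET is assumed to be the all-ones
   value 0xFFFFFFFF, the 4 low-order bits of each 3-octet label field
   are assumed to be sent as zero, and the Type 1 RD layout (2-octet
   type, 4-octet IPv4 address, 2-octet number) is taken from [RFC4364].

```python
import ipaddress
import struct

MAX_ET = 0xFFFFFFFF  # assumed all-ones encoding of MAX-ET


def encode_label(label: int) -> bytes:
    """3-octet label field: label value in the high-order 20 bits."""
    if not 0 <= label < (1 << 20):
        raise ValueError("label must fit in 20 bits")
    return (label << 4).to_bytes(3, "big")  # low-order 4 bits left as zero


def type1_rd(pe_ipv4: str, number: int) -> bytes:
    """Type 1 RD: type=1, PE IPv4 address, 16-bit PE-unique number."""
    return struct.pack("!H4sH", 1, ipaddress.IPv4Address(pe_ipv4).packed, number)


def ead_per_es_nlri(rd: bytes, esi: bytes) -> bytes:
    """Route Type 1 NLRI with Ethernet Tag ID = MAX-ET and MPLS label = 0."""
    if len(rd) != 8 or len(esi) != 10:
        raise ValueError("RD must be 8 octets and ESI must be 10 octets")
    body = rd + esi + struct.pack("!I", MAX_ET) + encode_label(0)
    return bytes([1, len(body)]) + body  # Route Type, Length, type-specific


def esi_label_ext_community(esi_label: int, single_active: bool) -> bytes:
    """Type 0x06, Sub-Type 0x01, Flags, 2 reserved octets, 3-octet label."""
    flags = 0x01 if single_active else 0x00  # low-order bit: Single-Active
    return bytes([0x06, 0x01, flags, 0x00, 0x00]) + encode_label(esi_label)
```

   With these helpers, a PE at 192.0.2.1 (a documentation address used
   only as an example) would build the RD with type1_rd("192.0.2.1", 7),
   and the resulting NLRI carries a Length of 25 octets (8 + 10 + 4 + 3).
   The same ESI label would be passed to esi_label_ext_community() for
   every Ethernet A-D per ES route in the set, matching the requirement
   that the label be identical across those routes.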

   If Single-Active redundancy mode is desired, then the "Single-Active"
   bit in the flags of the ESI Label extended community MUST be set to 1
   and the ESI label SHOULD be set to a valid MPLS label value.

8.2.1.1.  Ethernet A-D Route Targets

   Each Ethernet A-D per ES route MUST carry one or more Route Target
   (RT) attributes.  The set of Ethernet A-D routes per ES MUST carry
   the entire set of RTs for all the EVPN instances to which the
   Ethernet segment belongs.

8.3.  Split Horizon

   Consider a CE that is multihomed to two or more PEs on an Ethernet
   segment ES1 operating in All-Active redundancy mode.  If the CE sends
   a broadcast, unknown unicast, or multicast (BUM) packet to one of the
   non-Designated Forwarder (non-DF) PEs, say PE1, then PE1 will forward
   that packet to all or a subset of the other PEs in that EVPN
   instance, including the DF PE for that Ethernet segment.  In this
   case, the DF PE to which the CE is multihomed MUST drop the packet
   and not forward back to the CE.  This filtering is referred to as
   "split-horizon filtering" in this document.

   When a set of PEs are operating in Single-Active redundancy mode, the
   use of this split-horizon filtering mechanism is highly recommended
   because it prevents transient loops at the time of failure or
   recovery that would impact the Ethernet segment -- e.g., when two PEs
   think that both are DFs for that segment before the DF election
   procedure settles down.

   In order to achieve this split-horizon function, every BUM packet
   originating from a non-DF PE is encapsulated with an MPLS label that
   identifies the Ethernet segment of origin (i.e., the segment from
   which the frame entered the EVPN network).
   This label is referred to as the ESI label and MUST be distributed by
   all PEs when operating in All-Active redundancy mode using a set of
   Ethernet A-D per ES routes, per Section 8.2.1 above.  The ESI label
   SHOULD be distributed by all PEs when operating in Single-Active
   redundancy mode using a set of Ethernet A-D per ES routes.  These
   routes are imported by the PEs connected to the Ethernet segment and
   also by the PEs that have at least one EVPN instance in common with
   the Ethernet segment in the route.  As described in Section 8.2.1,
   the route MUST carry an ESI Label extended community with a valid ESI
   label.  The disposition PE relies on the value of the ESI label to
   determine whether or not a BUM frame is allowed to egress a specific
   Ethernet segment.

8.3.1.  ESI Label Assignment

   The following subsections describe the assignment procedures for the
   ESI label, which differ depending on the type of tunnels being used
   to deliver multi-destination packets in the EVPN network.

8.3.1.1.  Ingress Replication

   Each PE that operates in All-Active or Single-Active redundancy mode
   and that uses ingress replication to receive BUM traffic advertises a
   downstream assigned ESI label in the set of Ethernet A-D per ES
   routes for its attached ES.  This label MUST be programmed in the
   platform label space by the advertising PE, and the forwarding entry
   for this label must result in NOT forwarding packets received with
   this label onto the Ethernet segment for which the label was
   distributed.

   The rules for the inclusion of the ESI label in a BUM packet by the
   ingress PE operating in All-Active redundancy mode are as follows:

   -  A non-DF ingress PE MUST include the ESI label distributed by the
      DF egress PE in the copy of a BUM packet sent to it.

   -  An ingress PE (DF or non-DF) SHOULD include the ESI label
      distributed by each non-DF egress PE in the copy of a BUM packet
      sent to it.

   The rule for the inclusion of the ESI label in a BUM packet by the
   ingress PE operating in Single-Active redundancy mode is as follows:

   -  An ingress DF PE SHOULD include the ESI label distributed by the
      egress PE in the copy of a BUM packet sent to it.

   In both All-Active and Single-Active redundancy mode, an ingress PE
   MUST NOT include an ESI label in the copy of a BUM packet sent to an
   egress PE that is not attached to the ES through which the BUM packet
   entered the EVI.

   As an example, consider PE1 and PE2, which are multihomed to CE1 on
   ES1 and operating in All-Active multihoming mode.  Further, consider
   that PE1 is using P2P or MP2P LSPs to send packets to PE2.  Consider
   that PE1 is the non-DF for VLAN1 and PE2 is the DF for VLAN1, and PE1
   receives a BUM packet from CE1 on VLAN1 on ES1.  In this scenario,
   PE2 distributes an Inclusive Multicast Ethernet Tag route for VLAN1
   corresponding to an EVPN instance.  So, when PE1 sends a BUM packet
   that it receives from CE1, it MUST first push onto the MPLS label
   stack the ESI label that PE2 has distributed for ES1.  It MUST then
   push onto the MPLS label stack the MPLS label distributed by PE2 in
   the Inclusive Multicast Ethernet Tag route for VLAN1.  The resulting
   packet is further encapsulated in the P2P or MP2P LSP label stack
   required to transmit the packet to PE2.  When PE2 receives this
   packet, it determines, from the top MPLS label, the set of ESIs to
   which it will replicate the packet after any P2P or MP2P LSP labels
   have been removed.  If the next label is the ESI label assigned by
   PE2 for ES1, then PE2 MUST NOT forward the packet onto ES1.  If the
   next label is an ESI label that has not been assigned by PE2, then
   PE2 MUST drop the packet.
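   The label handling in this example can be modeled with a short
   Python sketch.  It is illustrative only: the names
   ingress_label_stack and EgressPE, the label values, and the segment
   names are hypothetical, label stacks are represented as lists with
   the top of stack first, and any label found beneath the Inclusive
   Multicast label is assumed to be an ESI label.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


def ingress_label_stack(transport: List[int], bum_label: int,
                        esi_label: Optional[int]) -> List[int]:
    """Return the stack sent by the ingress PE, top of stack first.

    Push order follows the text: the ESI label first (innermost), then
    the label from the Inclusive Multicast Ethernet Tag route, then the
    P2P or MP2P LSP labels.
    """
    inner = [bum_label] + ([esi_label] if esi_label is not None else [])
    return list(transport) + inner


@dataclass
class EgressPE:
    # Label advertised in the IMET route -> local segments to flood to.
    bum_label_to_ess: Dict[int, List[str]]
    # ESI labels this PE assigned -> the local segment each one protects.
    esi_label_to_es: Dict[int, str]

    def flood_targets(self, labels: List[int]) -> List[str]:
        """Decide egress segments once the transport labels are removed."""
        bum_label, rest = labels[0], labels[1:]
        targets = list(self.bum_label_to_ess[bum_label])
        if not rest:
            return targets  # no ESI label: flood to every local segment
        esi_label = rest[0]
        if esi_label not in self.esi_label_to_es:
            return []  # ESI label not assigned by this PE: drop
        source_es = self.esi_label_to_es[esi_label]
        # Split horizon: never send back onto the segment of origin.
        return [es for es in targets if es != source_es]
```

   In this model, PE2 could be EgressPE({200: ["ES1", "ES2"]},
   {100: "ES1"}).  A packet sent by PE1 with
   ingress_label_stack([300], 200, 100) arrives at PE2 as [200, 100]
   after the transport label is removed, and flood_targets() then
   excludes ES1 while still permitting ES2.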
   It should be noted that in this scenario, if PE2 receives a BUM
   packet for VLAN1 from CE1, then it SHOULD encapsulate the packet with
   an ESI label received from PE1 when sending it to PE1 in order to
   avoid any transient loops during a failure scenario that would impact
   ES1 (e.g., port or link failure).

8.3.1.2.  P2MP MPLS LSPs

   The non-DF PEs that operate in All-Active redundancy mode and that
   use P2MP LSPs to send BUM traffic advertise an upstream assigned ESI
   label in the set of Ethernet A-D per ES routes for their common
   attached ES.  This label is upstream assigned by the PE that
   advertises the route.  This label MUST be programmed by the other PEs
   that are connected to the ESI advertised in the route, in the context
   label space for the advertising PE.  Further, the forwarding entry
   for this label must result in NOT forwarding packets received with
   this label onto the Ethernet segment for which the label was
   distributed.  This label MUST also be programmed by the other PEs
   that import the route but are not connected to the ESI advertised in
   the route, in the context label space for the advertising PE.
   Further, the forwarding entry for this label must be a label pop with
   no other associated action.

   The DF PE that operates in Single-Active redundancy mode and that
   uses P2MP LSPs to send BUM traffic should advertise an upstream
   assigned ESI label in the set of Ethernet A-D per ES routes for its
   attached ES, just as described in the previous paragraph.

   As an example, consider PE1 and PE2, which are multihomed to CE1 on
   ES1 and operating in All-Active multihoming mode.  Also, consider
   that PE3 belongs to one of the EVPN instances of ES1.  Further,
   assume that PE1, which is the non-DF, is using P2MP MPLS LSPs to send
   BUM packets.
When PE1 sends a BUM packet that it receives from CE1, 1279 it MUST first push onto the MPLS label stack the ESI label that it 1280 has assigned for the ESI on which the packet was received. The 1281 resulting packet is further encapsulated in the P2MP MPLS label stack 1282 necessary to transmit the packet to the other PEs. Penultimate hop 1283 popping MUST be disabled on the P2MP LSPs used in the MPLS transport 1284 infrastructure for EVPN. When PE2 receives this packet, it 1285 decapsulates the top MPLS label and forwards the packet using the 1286 context label space determined by the top label. If the next label 1287 is the ESI label assigned by PE1 to ES1, then PE2 MUST NOT forward 1288 the packet onto ES1. When PE3 receives this packet, it decapsulates 1289 the top MPLS label and forwards the packet using the context label 1290 space determined by the top label. If the next label is the ESI 1291 label assigned by PE1 to ES1 and PE3 is not connected to ES1, then 1292 PE3 MUST pop the label and flood the packet over all local ESIs in 1293 that EVPN instance. It should be noted that when PE2 sends a BUM 1294 frame over a P2MP LSP, it should encapsulate the frame with an ESI 1295 label even though it is the DF for that VLAN, in order to avoid any 1296 transient loops during a failure scenario that would impact ES1 1297 (e.g., port or link failure). 1299 8.3.1.3. MP2MP MPLS LSPs 1301 The procedures for MP2MP tunnels follow section 8.3.1.2, with the 1302 exceptions described in this section. 1304 When MP2MP tunnels are used, ESI-labels MUST be allocated from a DCB 1305 and the same label must be used by all the PEs attached to the same 1306 Ethernet Segment. 1308 In that way, any egress PE with local Ethernet Segments can identify 1309 the source ES of the received BUM packets. 1311 8.4. 
Aliasing and Backup Path 1313 In the case where a CE is multihomed to multiple PE nodes, using a 1314 Link Aggregation Group (LAG) with All-Active redundancy, it is 1315 possible that only a single PE learns a set of the MAC addresses 1316 associated with traffic transmitted by the CE. This leads to a 1317 situation where remote PE nodes receive MAC/IP Advertisement routes 1318 for these addresses from a single PE, even though multiple PEs are 1319 connected to the multihomed segment. As a result, the remote PEs are 1320 not able to effectively load balance traffic among the PE nodes 1321 connected to the multihomed Ethernet segment. This could be the 1322 case, for example, when the PEs perform data-plane learning on the 1323 access, and the load-balancing function on the CE hashes traffic from 1324 a given source MAC address to a single PE. 1326 Another scenario where this occurs is when the PEs rely on control- 1327 plane learning on the access (e.g., using ARP), since ARP traffic 1328 will be hashed to a single link in the LAG. 1330 To address this issue, EVPN introduces the concept of 'aliasing', 1331 which is the ability of a PE to signal that it has reachability to an 1332 EVPN instance on a given ES even when it has learned no MAC addresses 1333 from that EVI/ES. The Ethernet A-D per EVI route is used for this 1334 purpose. A remote PE that receives a MAC/IP Advertisement route with 1335 a non-reserved ESI SHOULD consider the advertised MAC address to be 1336 reachable via all PEs that have advertised reachability to that MAC 1337 address's EVI/ES via the combination of an Ethernet A-D per EVI route 1338 for that EVI/ES (and Ethernet tag, if applicable) AND Ethernet A-D 1339 per ES routes for that ES with the "Single-Active" bit in the flags 1340 of the ESI Label extended community set to 0. 1342 Note that the Ethernet A-D per EVI route may be received by a remote 1343 PE before it receives the set of Ethernet A-D per ES routes. 
1344 Therefore, in order to handle corner cases and race conditions, the 1345 Ethernet A-D per EVI route MUST NOT be used for traffic forwarding by 1346 a remote PE until it also receives the associated set of Ethernet A-D 1347 per ES routes. 1349 The backup path is a closely related function, but it is used in 1350 Single-Active redundancy mode. In this case, a PE also advertises 1351 that it has reachability to a given EVI/ES using the same combination 1352 of Ethernet A-D per EVI route and Ethernet A-D per ES route as 1353 discussed above, but with the "Single-Active" bit in the flags of the 1354 ESI Label extended community set to 1. A remote PE that receives a 1355 MAC/IP Advertisement route with a non-reserved ESI SHOULD consider 1356 the advertised MAC address to be reachable via any PE that has 1357 advertised this combination of Ethernet A-D routes, and it SHOULD 1358 install a backup path for that MAC address. 1360 Please see Section 14.1.1 for a description of the operation of backup 1361 paths. 1363 8.4.1. Constructing Ethernet A-D per EVPN Instance Route 1365 This section describes the procedures used to construct the Ethernet 1366 A-D per EVPN instance (EVI) route, which is used for aliasing (as 1367 discussed above). Support of this route is OPTIONAL. 1369 The Route Distinguisher (RD) MUST be set per Section 7.10. 1371 The Ethernet Segment Identifier MUST be a 10-octet entity as 1372 described in Section 5 ("Ethernet Segment"). The Ethernet A-D route 1373 is not needed when the Segment Identifier is set to 0. 1375 The Ethernet Tag ID is the identifier of an Ethernet tag on the 1376 Ethernet segment. This value may be a 12-bit VLAN ID, in which case 1377 the low-order 12 bits are set to the VLAN ID and the high-order 1378 20 bits are set to 0. Or, it may be another Ethernet tag used by the 1379 EVPN. It MAY be set to the default Ethernet tag on the Ethernet 1380 segment or to the value 0.
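The VLAN ID encoding of the Ethernet Tag ID described above can be illustrated with a minimal sketch. It assumes the field is carried as a 4-octet value in network byte order (12 low-order bits plus 20 high-order bits); the function names are hypothetical and chosen only for this illustration.

```python
import struct

VLAN_ID_MAX = 0xFFF  # a VLAN ID occupies 12 bits


def encode_ethernet_tag_id(vlan_id: int) -> bytes:
    """Return a 4-octet Ethernet Tag ID carrying a 12-bit VLAN ID.

    Assumption: the field is 32 bits wide and sent in network byte order.
    """
    if not 0 <= vlan_id <= VLAN_ID_MAX:
        raise ValueError(f"VLAN ID {vlan_id} does not fit in 12 bits")
    # "!I" packs an unsigned 32-bit integer in network byte order, so the
    # high-order 20 bits are 0 and the low-order 12 bits hold the VLAN ID.
    return struct.pack("!I", vlan_id)


def decode_vlan_id(field: bytes) -> int:
    """Recover the VLAN ID, rejecting values that use the high-order 20 bits."""
    (value,) = struct.unpack("!I", field)
    if value > VLAN_ID_MAX:
        raise ValueError("high-order 20 bits are not zero; not a plain VLAN ID")
    return value
```

For instance, encode_ethernet_tag_id(100) yields the octets 00 00 00 64, and an Ethernet Tag ID of 0 is simply four zero octets.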
1382 Note that the above allows the Ethernet A-D route to be advertised 1383 with one of the following granularities: 1385 + One Ethernet A-D route per <ESI, Ethernet Tag ID> tuple per 1386 MAC-VRF. This is applicable when the PE uses MPLS-based 1387 disposition with VID translation or may be applicable when the 1388 PE uses MAC-based disposition with VID translation. 1390 + One Ethernet A-D route for each <ESI> per MAC-VRF (where the 1391 Ethernet Tag ID is set to 0). This is applicable when the PE uses 1392 MAC-based disposition or MPLS-based disposition without VID 1393 translation. 1395 The usage of the MPLS label is described in Section 14 1396 ("Load Balancing of Unicast Packets"). 1398 The Next Hop field of the MP_REACH_NLRI attribute of the route MUST 1399 be set to the IPv4 or IPv6 address of the advertising PE. 1401 The Ethernet A-D route MUST carry one or more Route Target (RT) 1402 attributes, per Section 7.11. 1404 8.5. Designated Forwarder Election 1406 Consider a CE that is a host or a router that is multihomed directly 1407 to more than one PE in an EVPN instance on a given Ethernet segment. 1408 One or more Ethernet tags may be configured on the Ethernet segment. 1409 In this scenario, only one of the PEs, referred to as the Designated 1410 Forwarder (DF), is responsible for certain actions: 1412 - Sending multicast and broadcast traffic, on a given Ethernet tag on 1413 a particular Ethernet segment, to the CE. 1415 - Flooding unknown unicast traffic (i.e., traffic for which a PE does 1416 not know the destination MAC address), on a given Ethernet tag on a 1417 particular Ethernet segment to the CE, if the environment requires 1418 flooding of unknown unicast traffic. 1420 Note that this behavior, which allows selecting a DF at the 1421 granularity of <ES, VLAN> or <ES, VLAN bundle> for multicast, 1422 broadcast, and unknown unicast traffic, is the default behavior in 1423 this specification.
1425 In this same scenario, a second PE, referred to as the Backup 1426 Designated Forwarder (BDF), is responsible for assuming those actions 1427 of the DF upon the DF's failure. 1429 Note that a CE always sends packets belonging to a specific flow 1430 using a single link towards a PE. For instance, if the CE is a host, 1431 then, as mentioned earlier, the host treats the multiple links that 1432 it uses to reach the PEs as a Link Aggregation Group (LAG). The CE 1433 employs a local hashing function to map traffic flows onto links in 1434 the LAG. 1436 If a bridged network is multihomed to more than one PE in an EVPN 1437 network via switches, then the support of All-Active redundancy mode 1438 requires the bridged network to be connected to two or more PEs using 1439 a LAG. 1441 If a bridged network does not connect to the PEs using a LAG, then 1442 only one of the links between the bridged network and the PEs must be 1443 the active link for a given <ES, VLAN> or <ES, VLAN bundle>. In this 1444 case, the set of Ethernet A-D per ES routes advertised by each PE 1445 MUST have the "Single-Active" bit in the flags of the ESI Label 1446 extended community set to 1. 1448 The default procedure for DF election at the granularity of <ES, VLAN> for VLAN-based service or <ES, VLAN bundle> for VLAN-(aware) 1450 bundle service is referred to as "service carving". With service 1451 carving, it is possible to elect multiple DFs per Ethernet segment 1452 (one per VLAN or VLAN bundle) in order to perform load balancing of 1453 multi-destination traffic destined to a given segment. The load- 1454 balancing procedures carve up the VLAN space per ES among the PE 1455 nodes evenly, in such a way that every PE is the DF for a disjoint 1456 set of VLANs or VLAN bundles for that ES. The procedure for service 1457 carving, which follows the DF Election Finite State 1458 Machine defined in [RFC8584] Section 2.1, is as follows: 1460 1.
When a PE discovers the ESI of the attached Ethernet segment, 1461 it advertises an Ethernet Segment route with the associated 1462 ES-Import extended community attribute. 1464 2. The PE then starts a timer (default value = 3 seconds) to allow 1465 the reception of Ethernet Segment routes from other PE nodes 1466 connected to the same Ethernet segment. This timer value should 1467 be the same across all PEs connected to the same Ethernet 1468 segment. 1470 3. When the timer expires, each PE builds an ordered list of the IP 1471 addresses of all the PE nodes connected to the Ethernet segment 1472 (including itself), in increasing numeric value. Each IP address 1473 in this list is extracted from the "Originating Router's IP 1474 address" field of the advertised Ethernet Segment route. Every 1475 PE is then given an ordinal indicating its position in the 1476 ordered list, starting with 0 as the ordinal for the PE with the 1477 numerically lowest IP address. The ordinals are used to 1478 determine which PE node will be the DF for a given EVPN instance 1479 on the Ethernet segment, using the following rule: 1481 Assuming a redundancy group of N PE nodes, for VLAN-based 1482 service, the PE with ordinal i is the DF for an <ES, VLAN V> when 1483 (V mod N) = i. In the case of VLAN-(aware) bundle service, 1484 the numerically lowest VLAN value in that bundle on that ES MUST 1485 be used in the modulo function. 1487 It should be noted that using the "Originating Router's IP 1488 address" field in the Ethernet Segment route to get the PE IP 1489 address needed for the ordered list allows for a CE to be 1490 multihomed across different ASes if such a need ever arises. 1492 4. For each EVPN instance, a second list of the IP addresses of all 1493 the PE nodes connected to the Ethernet segment is built. The PE 1494 that was determined to be the DF above is removed from that ordered 1495 candidate list, forming a backup redundancy group of M PE nodes.
1497 Every remaining PE is then given a second ordinal indicating its 1498 position in the secondary ordered list according to the same 1499 criteria as in step 3 above. 1501 The second ordinals are used to determine which PE node will be 1502 the BDF for a given EVPN instance on the Ethernet segment, using 1503 the same modulo rule as above, (V mod M) = i. 1505 5. The PE that is elected as a DF for a given <ES, VLAN> or <ES, VLAN bundle> will unblock multi-destination traffic for that VLAN 1507 or VLAN bundle on the corresponding ES. Note that the DF PE 1508 unblocks multi-destination traffic in the egress direction 1509 towards the segment. All non-DF PEs continue to drop 1510 multi-destination traffic in the egress direction towards that 1511 <ES, VLAN> or <ES, VLAN bundle>. 1513 In the case of link or port failure, the affected PE withdraws 1514 its Ethernet Segment route. This will re-trigger the service 1515 carving procedures on all the PEs in the redundancy group: the 1516 expected new DF will be the BDF previously calculated in step 4. For 1517 PE node failure, or upon PE commissioning or decommissioning, the 1518 PEs re-trigger the service carving. In the case of Single-Active 1519 multihoming, when a service moves from one PE in the redundancy 1520 group to another PE as a result of re-carving, the PE that ends 1521 up being the elected DF for the service SHOULD trigger a MAC 1522 address flush notification towards the associated Ethernet 1523 segment. This can be done, for example, using the IEEE 802.1ak 1524 Multiple VLAN Registration Protocol (MVRP) 'new' declaration. 1526 It is RECOMMENDED that all future DF Election algorithms specify an 1527 algorithm to select one DF-elected PE, one Backup-DF-elected PE and 1528 Non-DF-elected PE(s). 1530 8.6.
Signaling Primary and Backup DF Elected PEs 1532 Once the Primary and Backup DF Elected PEs for a given EVI are 1533 determined, the multihomed PEs for that ES will each advertise an 1534 Ethernet A-D per EVI route for that EVI and each will include an 1535 L2-Attrib extended community with the P and B bits set to reflect 1536 each PE's DF role for that EVI. 1538 It should be noted that if the L2-Attrib extended community is included for 1539 All-Active mode, then the P bit must be set for all PEs in the 1540 redundancy group. 1542 8.7. Interoperability with Single-Homing PEs 1544 Let's refer to PEs that only support single-homed CE devices as 1545 single-homing PEs. For single-homing PEs, all the above multihoming 1546 procedures can be omitted; however, to allow for single-homing PEs 1547 to fully interoperate with multihoming PEs, some of the multihoming 1548 procedures described above SHOULD be supported even by single- 1549 homing PEs: 1551 - procedures related to processing Ethernet A-D routes for the 1552 purpose of fast convergence (Section 8.2 ("Fast Convergence")), to 1553 let single-homing PEs benefit from fast convergence 1555 - procedures related to processing Ethernet A-D routes for the 1556 purpose of aliasing (Section 8.4 ("Aliasing and Backup Path")), to 1557 let single-homing PEs benefit from load balancing 1559 - procedures related to processing Ethernet A-D routes for the 1560 purpose of a backup path (Section 8.4 1561 ("Aliasing and Backup Path")), to let single-homing PEs benefit 1562 from the corresponding convergence improvement 1564 9. Determining Reachability to Unicast MAC Addresses 1566 PEs forward packets that they receive based on the destination MAC 1567 address. This implies that PEs must be able to learn how to reach a 1568 given destination unicast MAC address. 1570 There are two components to MAC address learning -- "local learning" 1571 and "remote learning": 1573 9.1.
Local Learning 1575 A particular PE must be able to learn the MAC addresses from the CEs 1576 that are connected to it. This is referred to as local learning. 1578 The PEs in a particular EVPN instance MUST support local data-plane 1579 learning using standard IEEE Ethernet learning procedures. A PE must 1580 be capable of learning MAC addresses in the data plane when it 1581 receives packets such as the following from the CE network: 1583 - DHCP requests 1585 - An ARP Request for its own MAC 1587 - An ARP Request for a peer 1588 Alternatively, PEs MAY learn the MAC addresses of the CEs in the 1589 control plane or via management-plane integration between the PEs and 1590 the CEs. 1592 There are applications where a MAC address that is reachable via a 1593 given PE on a locally attached segment (e.g., with ESI X) may move, 1594 such that it becomes reachable via another PE on another segment 1595 (e.g., with ESI Y). This is referred to as "MAC Mobility". 1596 Procedures to support this are described in Section 15 1597 ("MAC Mobility"). 1599 9.2. Remote Learning 1601 A particular PE must be able to determine how to send traffic to MAC 1602 addresses that belong to or are behind CEs connected to other PEs, 1603 i.e., to remote CEs or hosts behind remote CEs. We call such MAC 1604 addresses "remote" MAC addresses. 1606 This document requires a PE to learn remote MAC addresses in the 1607 control plane. In order to achieve this, each PE advertises the MAC 1608 addresses it learns from its locally attached CEs in the control 1609 plane, to all the other PEs in that EVPN instance, using MP-BGP and, 1610 specifically, the MAC/IP Advertisement route. 1612 9.2.1. Constructing MAC/IP Address Advertisement 1614 BGP is extended to advertise these MAC addresses using the MAC/IP 1615 Advertisement route type in the EVPN NLRI. 1617 The RD MUST be set per Section 7.10. 1619 The Ethernet Segment Identifier is set to the 10-octet ESI described 1620 in Section 5 ("Ethernet Segment"). 
1622 The Ethernet Tag ID may be zero or may represent a valid Ethernet 1623 Tag ID. This field may be non-zero when there are multiple bridge 1624 tables in the MAC-VRF (i.e., the PE needs to support VLAN-aware 1625 bundle service for that EVI). 1627 When the Ethernet Tag ID in the NLRI is set to a non-zero value for a 1628 particular broadcast domain, then this Ethernet Tag ID may be either 1629 the CE's Ethernet tag value (e.g., CE VLAN ID) or the EVPN provider's 1630 Ethernet tag value (e.g., provider VLAN ID). The latter would be the 1631 case if the CE Ethernet tags (e.g., CE VLAN ID) for a particular 1632 broadcast domain are different on different CEs. 1634 The MAC Address Length field is in bits, and it is set to 48. MAC 1635 address length values other than 48 bits are outside the scope of 1636 this document. The encoding of a MAC address MUST be the 6-octet MAC 1637 address specified by [IEEE.802.1Q_2014] and [IEEE.802.1D_2004]. 1639 The IP Address field is optional. By default, the IP Address Length 1640 field is set to 0, and the IP Address field is omitted from the 1641 route. When a valid IP address needs to be advertised, it is then 1642 encoded in this route. When an IP address is present, the IP Address 1643 Length field is in bits, and it is set to 32 or 128 bits. Other IP 1644 Address Length values are outside the scope of this document. The 1645 encoding of an IP address MUST be either 4 octets for IPv4 or 1646 16 octets for IPv6. The Length field of the EVPN NLRI (which is in 1647 octets and is described in Section 7) is sufficient to determine 1648 whether an IP address is encoded in this route and, if so, whether 1649 the encoded IP address is IPv4 or IPv6. 1651 The MPLS Label1 field is encoded as 3 octets, where the high-order 1652 20 bits contain the label value. The MPLS Label1 MUST be downstream 1653 assigned, and it is associated with the MAC address being advertised 1654 by the advertising PE. 
The advertising PE uses this label when it 1655 receives an MPLS-encapsulated packet to perform forwarding based on 1656 the destination MAC address toward the CE. The forwarding procedures 1657 are specified in Sections 13 and 14. 1659 A PE may advertise the same single EVPN label for all MAC addresses 1660 in a given MAC-VRF. This label assignment is referred to as a per 1661 MAC-VRF label assignment. Alternatively, a PE may advertise a unique 1662 EVPN label per <MAC-VRF, Ethernet tag> combination. This label 1663 assignment is referred to as a per <MAC-VRF, Ethernet tag> label 1664 assignment. As a third option, a PE may advertise a unique EVPN 1665 label per <ESI, Ethernet tag> combination. This label assignment is 1666 referred to as a per <ESI, Ethernet tag> label assignment. As a 1667 fourth option, a PE may advertise a unique EVPN label per MAC 1668 address. This label assignment is referred to as a per MAC label 1669 assignment. All of these label assignment methods have their 1670 trade-offs. The choice of a particular label assignment methodology 1671 is purely local to the PE that originates the route. 1673 A per MAC-VRF label assignment requires the fewest EVPN 1674 labels but requires a MAC lookup in addition to an MPLS lookup on an 1675 egress PE for forwarding. On the other hand, a unique label per 1676 <ESI, Ethernet tag> or a unique label per MAC allows an egress PE to 1677 forward a packet that it receives from another PE, to the connected 1678 CE, after looking up only the MPLS labels without having to perform a 1679 MAC lookup. This includes the capability to perform appropriate VLAN 1680 ID translation on egress to the CE. 1682 The MPLS Label2 field is an optional field. If it is present, then 1683 it is encoded as 3 octets, where the high-order 20 bits contain the 1684 label value. 1686 The Next Hop field of the MP_REACH_NLRI attribute of the route MUST 1687 be set to the IPv4 or IPv6 address of the advertising PE. 1689 The BGP advertisement for the MAC/IP Advertisement route MUST also 1690 carry one or more Route Target (RT) attributes.
RTs may be 1691 configured (as in IP VPNs) or may be derived automatically from the 1692 Ethernet Tag ID, in the Unique VLAN case, as described in 1693 Section 7.11.1. 1695 It is to be noted that this document does not require PEs to create 1696 forwarding state for remote MACs when they are learned in the control 1697 plane. When this forwarding state is actually created is a local 1698 implementation matter. 1700 9.2.2. Route Resolution 1702 If the Ethernet Segment Identifier field in a received MAC/IP 1703 Advertisement route is set to the reserved ESI value of 0 or MAX-ESI, 1704 then if the receiving PE decides to install forwarding state for the 1705 associated MAC address, it MUST be based on the MAC/IP Advertisement 1706 route alone. 1708 If the Ethernet Segment Identifier field in a received MAC/IP 1709 Advertisement route is set to a non-reserved ESI, and the receiving 1710 PE is locally attached to the same ESI, then the PE does not alter 1711 its forwarding state based on the received route. This ensures that 1712 local routes are preferred to remote routes. 1714 If the Ethernet Segment Identifier field in a received MAC/IP 1715 Advertisement route is set to a non-reserved ESI, then if the 1716 receiving PE decides to install forwarding state for the associated 1717 MAC address, it MUST be when both the MAC/IP Advertisement route AND 1718 the associated set of Ethernet A-D per ES routes have been received. 1719 The dependency of MAC route installation on Ethernet A-D per ES 1720 routes is to ensure that MAC routes don't get accidentally installed 1721 during a mass withdraw period. 1723 To illustrate this with an example, consider two PEs (PE1 and PE2) 1724 connected to a multihomed Ethernet segment ES1. All-Active 1725 redundancy mode is assumed. A given MAC address M1 is learned by PE1 1726 but not PE2. 
On PE3, the following states may arise: 1728 T1 When the MAC/IP Advertisement route from PE1 and the set of 1729 Ethernet A-D per ES routes and Ethernet A-D per EVI routes from 1730 PE1 and PE2 are received, PE3 can forward traffic destined to 1731 M1 to both PE1 and PE2. 1733 T2 If after T1 PE1 withdraws its set of Ethernet A-D per ES 1734 routes, then PE3 forwards traffic destined to M1 to PE2 only. 1736 T2' If after T1 PE2 withdraws its set of Ethernet A-D per ES 1737 routes, then PE3 forwards traffic destined to M1 to PE1 only. 1739 T2'' If after T1 PE1 withdraws its MAC/IP Advertisement route, then 1740 PE3 treats traffic to M1 as unknown unicast. 1742 T3 PE2 also advertises a MAC route for M1, and then PE1 withdraws 1743 its MAC route for M1. PE3 continues forwarding traffic 1744 destined to M1 to both PE1 and PE2. In other words, despite M1 1745 withdrawal by PE1, PE3 forwards the traffic destined to M1 to 1746 both PE1 and PE2. This is because a flow from the CE, 1747 resulting in M1 traffic getting hashed to PE1, can get 1748 terminated, resulting in M1 being aged out in PE1; however, M1 1749 can be reachable by both PE1 and PE2. 1751 10. ARP and ND 1753 The IP Address field in the MAC/IP Advertisement route may optionally 1754 carry one of the IP addresses associated with the MAC address. This 1755 provides an option that can be used to minimize the flooding of ARP 1756 or Neighbor Discovery (ND) messages over the MPLS network and to 1757 remote CEs. This option also minimizes ARP (or ND) message 1758 processing on end-stations/hosts connected to the EVPN network. A PE 1759 may learn the IP address associated with a MAC address in the control 1760 or management plane between the CE and the PE. Or, it may learn this 1761 binding by snooping certain messages to or from a CE. 
When a PE 1762 learns the IP address associated with a MAC address of a locally 1763 connected CE, it may advertise this address to other PEs by including 1764 it in the MAC/IP Advertisement route. The IP address may be an IPv4 1765 address encoded using 4 octets or an IPv6 address encoded using 1766 16 octets. For ARP and ND purposes, the IP Address Length field MUST 1767 be set to 32 for an IPv4 address or 128 for an IPv6 address. 1769 If there are multiple IP addresses associated with a MAC address, 1770 then multiple MAC/IP Advertisement routes MUST be generated, one for 1771 each IP address. For instance, this may be the case when there are 1772 both an IPv4 and an IPv6 address associated with the same MAC address 1773 for dual-IP-stack scenarios. When the IP address is dissociated from 1774 the MAC address, then the MAC/IP Advertisement route with that 1775 particular IP address MUST be withdrawn. 1777 Note that a MAC-only route can be advertised along with, but 1778 independent from, a MAC/IP route for scenarios where the MAC learning 1779 over an access network/node is done in the data plane and independent 1780 from ARP snooping that generates a MAC/IP route. In such scenarios, 1781 when the ARP entry times out and causes the MAC/IP to be withdrawn, 1782 then the MAC information will not be lost. In scenarios where the 1783 host MAC/IP is learned via the management or control plane, then the 1784 sender PE may generate and advertise only the MAC/IP route. If the 1785 receiving PE receives both the MAC-only route and the MAC/IP route, 1786 then when it receives a withdraw message for the MAC/IP route, it 1787 MUST delete the corresponding entry from the ARP table but not the 1788 MAC entry from the MAC-VRF table, unless it receives a withdraw 1789 message for the MAC-only route.
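The withdraw handling for MAC-only and MAC/IP routes described above can be sketched as follows. This is an illustrative model only, not a definitive implementation: the class and method names are hypothetical, routes are reduced to (MAC, IP) pairs, and the sketch assumes that a MAC entry is retained as long as at least one route for that MAC remains (a case the text above does not spell out when only MAC/IP routes are left).

```python
from typing import Dict, Optional, Set, Tuple


class RemoteMacState:
    """Hypothetical per-MAC-VRF view of routes received from remote PEs."""

    def __init__(self) -> None:
        # Each received route is a (mac, ip) pair; ip is None for a
        # MAC-only route.
        self.routes: Set[Tuple[str, Optional[str]]] = set()

    def advertise(self, mac: str, ip: Optional[str] = None) -> None:
        self.routes.add((mac, ip))

    def withdraw(self, mac: str, ip: Optional[str] = None) -> None:
        self.routes.discard((mac, ip))

    def arp_table(self) -> Dict[str, str]:
        """IP-to-MAC bindings; only MAC/IP routes contribute entries."""
        return {ip: mac for mac, ip in self.routes if ip is not None}

    def mac_table(self) -> Set[str]:
        """MAC entries; assumed present while any route for the MAC remains."""
        return {mac for mac, _ in self.routes}
```

With both a MAC-only route and a MAC/IP route installed for the same MAC address, withdrawing the MAC/IP route removes the ARP binding but leaves the MAC entry in place; the MAC entry disappears only once the MAC-only route is withdrawn as well.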
1791 When a PE receives an ARP Request for an IP address from a CE, and if 1792 the PE has the MAC address binding for that IP address, the PE SHOULD 1793 perform ARP proxy by responding to the ARP Request. 1795 In the same way, when a PE receives a Neighbor Solicitation for an IP 1796 address from a CE, the PE SHOULD perform ND proxy and respond if the 1797 PE has the binding information for the IP. 1799 10.1. Default Gateway 1801 When a PE needs to perform inter-subnet forwarding where each subnet 1802 is represented by a different broadcast domain (e.g., a different 1803 VLAN), the inter-subnet forwarding is performed at Layer 3, and the 1804 PE that performs such a function is called the default gateway for 1805 the EVPN instance. In this case, when the PE receives an ARP Request 1806 for the IP address configured as the default gateway address, the PE 1807 originates an ARP Reply. 1809 Each PE that acts as a default gateway for a given EVPN instance MAY 1810 advertise in the EVPN control plane its default gateway MAC address 1811 using the MAC/IP Advertisement route, and each such PE indicates that 1812 such a route is associated with the default gateway. This is 1813 accomplished by requiring the route to carry the Default Gateway 1814 extended community defined in Section 7.8 1815 ("Default Gateway Extended Community"). The ESI field is set to zero 1816 when advertising the MAC route with the Default Gateway extended 1817 community. 1819 The IP Address field of the MAC/IP Advertisement route is set to the 1820 default gateway IP address for that subnet (e.g., an EVPN instance). 1821 For a given subnet (e.g., a VLAN or EVPN instance), the default 1822 gateway IP address is the same across all the participant PEs. 
The 1823 inclusion of this IP address enables the receiving PE to check its 1824 configured default gateway IP address against the one received in the 1825 MAC/IP Advertisement route for that subnet (or EVPN instance), and if 1826 there is a discrepancy, then the PE SHOULD notify the operator and 1827 log an error message. 1829 Unless it is known a priori (by means outside of this document) that 1830 all PEs of a given EVPN instance act as a default gateway for that 1831 EVPN instance, the MPLS label MUST be set to a valid downstream 1832 assigned label. 1834 Furthermore, even if all PEs of a given EVPN instance do act as a 1835 default gateway for that EVPN instance, but only some, but not all, 1836 of these PEs have sufficient (routing) information to provide 1837 inter-subnet routing for all the inter-subnet traffic originated 1838 within the subnet associated with the EVPN instance, then when such a 1839 PE advertises in the EVPN control plane its default gateway MAC 1840 address using the MAC/IP Advertisement route and indicates that such 1841 a route is associated with the default gateway, the route MUST carry 1842 a valid downstream assigned label. 1844 If all PEs of a given EVPN instance act as a default gateway for that 1845 EVPN instance, and the same default gateway MAC address is used 1846 across all gateway devices, then no such advertisement is needed. 1847 However, if each default gateway uses a different MAC address, then 1848 each default gateway needs to be aware of other gateways' MAC 1849 addresses and thus the need for such an advertisement. This is 1850 called MAC address aliasing, since a single default gateway can be 1851 represented by multiple MAC addresses. 1853 Each PE that receives this route and imports it as per procedures 1854 specified in this document follows the procedures in this section 1855 when replying to ARP Requests that it receives. 
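The default gateway IP address check described earlier in this section can be sketched as below. This is a minimal illustration under stated assumptions: the function name, the logger name, and the use of textual addresses are hypothetical, and "notify the operator" is reduced here to writing an error-level log record.

```python
import ipaddress
import logging

# Hypothetical logger name, used only for this illustration.
log = logging.getLogger("evpn.default_gateway")


def gateway_ip_matches(configured_ip: str, received_ip: str, subnet: str) -> bool:
    """Compare the locally configured default gateway IP address with the
    one received in a MAC/IP Advertisement route for the same subnet."""
    configured = ipaddress.ip_address(configured_ip)
    received = ipaddress.ip_address(received_ip)
    if configured == received:
        return True
    # On a discrepancy, the PE SHOULD notify the operator and log an error.
    log.error(
        "default gateway IP mismatch for %s: configured %s, received %s",
        subnet, configured, received,
    )
    return False
```

Parsing the addresses before comparing them means that two textual spellings of the same IPv6 address are not reported as a discrepancy.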
1857 Each PE that acts as a default gateway for a given EVPN instance that 1858 receives this route and imports it as per procedures specified in 1859 this document MUST create MAC forwarding state that enables it to 1860 apply IP forwarding to the packets destined to the MAC address 1861 carried in the route. 1863 10.1.1. Best Path selection for Default Gateway 1865 A default gateway MAC address that is assigned to an IRB interface (for 1866 a subnet) in a PE MUST be unique in the context of that subnet. In other 1867 words, the same MAC address cannot be used by a host, either 1868 intentionally or accidentally. Therefore, if such a conflict 1869 arises, there needs to be a scheme to detect and resolve it. In 1870 order to properly detect such conflicts, the following BGP best path 1871 selection MUST be applied. 1873 * When comparing two routes, the route having the Gateway GMAC EC is 1874 preferred over the route that does not have the GMAC EC. The PE that 1875 has advertised the MAC route without the GMAC EC, upon receiving the 1876 route with the GMAC EC, SHALL withdraw its route and raise an alarm. 1878 * When comparing two routes that both have the GMAC EC, normal 1879 BGP best path processing is applied. 1881 * When comparing a local route and a remote route having the Gateway GMAC EC, the 1882 local route is always preferred. 1884 * The MAC Mobility EC SHALL NOT be attached to routes having the GMAC EC on 1885 the sending side and SHALL be ignored on the receiving side. 1887 11. Handling of Multi-destination Traffic 1889 Procedures are required for a given PE to send broadcast or multicast 1890 traffic received from a CE encapsulated in a given Ethernet tag 1891 (VLAN) in an EVPN instance to all the other PEs that span that 1892 Ethernet tag (VLAN) in that EVPN instance. In certain scenarios, as 1893 described in Section 12 ("Processing of Unknown Unicast Packets"), a 1894 given PE may also need to flood unknown unicast traffic to other PEs.
1896 The PEs in a particular EVPN instance may use ingress replication, 1897 P2MP LSPs, or MP2MP LSPs to send unknown unicast, broadcast, or 1898 multicast traffic to other PEs. 1900 Each PE MUST advertise an "Inclusive Multicast Ethernet Tag route" to 1901 enable the above. The following subsection provides the procedures 1902 to construct the Inclusive Multicast Ethernet Tag route. Subsequent 1903 subsections describe its usage in further detail. 1905 11.1. Constructing Inclusive Multicast Ethernet Tag Route 1907 The RD MUST be set per Section 7.10. 1909 The Ethernet Tag ID is the identifier of the Ethernet tag. It may be 1910 set to 0 or to a valid Ethernet tag value. 1912 The Originating Router's IP Address field value MUST be set to an IP 1913 address of the PE that should be common for all the EVIs on the PE 1914 (e.g., this address may be the PE's loopback address). The IP 1915 Address Length field is in bits. 1917 The Next Hop field of the MP_REACH_NLRI attribute of the route MUST 1918 be set to the IPv4 or IPv6 address of the advertising PE. 1920 The BGP advertisement for the Inclusive Multicast Ethernet Tag route 1921 MUST also carry one or more Route Target (RT) attributes. The 1922 assignment of RTs as described in Section 7.11 MUST be followed. 1924 11.2. P-Tunnel Identification 1926 In order to identify the P-tunnel used for sending broadcast, unknown 1927 unicast, or multicast traffic, the Inclusive Multicast Ethernet Tag 1928 route MUST carry a Provider Multicast Service Interface (PMSI) Tunnel 1929 attribute as specified in [RFC6514]. 1931 Depending on the technology used for the P-tunnel for the EVPN 1932 instance on the PE, the PMSI Tunnel attribute of the Inclusive 1933 Multicast Ethernet Tag route is constructed as follows. 
1935 + If the PE that originates the advertisement uses a P-multicast tree 1936 for the P-tunnel for EVPN, the PMSI Tunnel attribute MUST contain 1937 the identity of the tree (note that the PE could create the 1938 identity of the tree prior to the actual instantiation of the 1939 tree). 1941 + A PE that uses a P-multicast tree for the P-tunnel MAY aggregate 1942 two or more Broadcast Domains (BDs) present on the PE onto the same 1943 tree. In this case, in addition to carrying the identity of the 1944 tree, the PMSI Tunnel attribute MUST carry an MPLS label, which the 1945 PE has bound uniquely to the BD associated with this update (as 1946 determined by its RTs and Ethernet Tag ID). The assigned MPLS 1947 label is upstream allocated unless the procedures in section 19 1948 (Use of Domain-wide Common Block (DCB) Labels) are followed. If 1949 the PE has already advertised Inclusive Multicast Ethernet Tag 1950 routes for two or more BDs that it now desires to aggregate, then 1951 the PE MUST re-advertise those routes. The re-advertised routes 1952 MUST be the same as the original ones, except for the PMSI Tunnel 1953 attribute and the label carried in that attribute. 1955 + If the PE that originates the advertisement uses ingress 1956 replication for the P-tunnel for EVPN, the route MUST include the 1957 PMSI Tunnel attribute with the Tunnel Type set to Ingress 1958 Replication and the Tunnel Identifier set to a routable address of 1959 the PE. The PMSI Tunnel attribute MUST carry a downstream assigned 1960 MPLS label. This label is used to demultiplex the broadcast, 1961 multicast, or unknown unicast EVPN traffic received over an MP2P 1962 tunnel by the PE. 1964 12. Processing of Unknown Unicast Packets 1966 The procedures in this document do not require the PEs to flood 1967 unknown unicast traffic to other PEs. 
If PEs learn CE MAC addresses 1968 via a control-plane protocol, the PEs can then distribute MAC 1969 addresses via BGP, and all unicast MAC addresses will be learned 1970 prior to traffic to those destinations. 1972 However, if a destination MAC address of a received packet is not 1973 known by the PE, the PE may have to flood the packet. When flooding, 1974 one must take into account "split-horizon forwarding" as follows: The 1975 principles behind the following procedures are borrowed from the 1976 split-horizon forwarding rules in VPLS solutions [RFC4761] [RFC4762]. 1977 When a PE capable of flooding (say PEx) receives an unknown 1978 destination MAC address, it floods the frame. If the frame arrived 1979 from an attached CE, PEx must send a copy of that frame on every 1980 Ethernet segment (belonging to that EVI) for which it is the DF, 1981 other than the Ethernet segment on which it received the frame. In 1982 addition, the PE must flood the frame to all other PEs participating 1983 in that EVPN instance. If, on the other hand, the frame arrived from 1984 another PE (say PEy), PEx must send a copy of the packet on each 1985 Ethernet segment (belonging to that EVI) for which it is the DF. PEx 1986 MUST NOT send the frame to other PEs, since PEy would have already 1987 done so. Split-horizon forwarding rules apply to unknown MAC 1988 addresses. 1990 Whether or not to flood packets to unknown destination MAC addresses 1991 should be an administrative choice, depending on how learning happens 1992 between CEs and PEs. 1994 The PEs in a particular EVPN instance may use ingress replication 1995 using RSVP-TE P2P LSPs or LDP MP2P LSPs for sending unknown unicast 1996 traffic to other PEs. Or, they may use RSVP-TE P2MP or LDP P2MP for 1997 sending such traffic to other PEs. 1999 12.1. 
Ingress Replication 2001 If ingress replication is in use, the P-tunnel attribute, carried in 2002 the Inclusive Multicast Ethernet Tag routes for the EVPN instance, 2003 specifies the downstream label that the other PEs can use to send 2004 unknown unicast, multicast, or broadcast traffic for that EVPN 2005 instance to this particular PE. 2007 The PE that receives a packet with this particular MPLS label MUST 2008 treat the packet as a broadcast, multicast, or unknown unicast 2009 packet. Further, if the MAC address is a unicast MAC address, the PE 2010 MUST treat the packet as an unknown unicast packet. 2012 12.2. P2MP MPLS LSPs 2014 The procedures for using P2MP LSPs (or MP2MP LSPs for that matter) 2015 are very similar to the VPLS procedures described in [RFC7117]. The 2016 P-tunnel attribute used by a PE for sending unknown unicast, 2017 broadcast, or multicast traffic for a particular EVPN instance is 2018 advertised in the Inclusive Multicast Ethernet Tag route as described 2019 in Section 11 ("Handling of Multi-destination Traffic"). 2021 The P-tunnel attribute specifies the P2MP or MP2MP LSP identifier. 2022 This is the equivalent of an Inclusive tree as described in 2023 [RFC7117]. Note that multiple Ethernet tags, which may be in 2024 different EVPN instances, may use the same P2MP or MP2MP LSP, using 2025 upstream labels [RFC7117] or DCB labels 2026 [I-D.ietf-bess-mvpn-evpn-aggregation-label]. This is the equivalent 2027 of an Aggregate Inclusive tree [RFC7117]. When P2MP or MP2MP LSPs 2028 are used for flooding unknown unicast traffic, packet reordering is 2029 possible. 2031 The PE that receives a packet on the P2MP or MP2MP LSP specified in 2032 the PMSI Tunnel attribute MUST treat the packet as a broadcast, 2033 multicast, or unknown unicast packet. Further, if the MAC address is 2034 a unicast MAC address, the PE MUST treat the packet as an unknown 2035 unicast packet. 2037 13. 
Forwarding Unicast Packets 2039 This section describes procedures for forwarding unicast packets by 2040 PEs, where such packets are received from either directly connected 2041 CEs or some other PEs. 2043 13.1. Forwarding Packets Received from a CE 2045 When a PE receives a packet from a CE, on a given Ethernet Tag ID, it 2046 must first look up the source MAC address of the packet. In certain 2047 environments that enable MAC security, the source MAC address MAY be 2048 used to validate the host identity and determine that traffic from 2049 the host can be allowed into the network. Source MAC lookup MAY also 2050 be used for local MAC address learning. 2052 If the PE decides to forward the packet, the destination MAC address 2053 of the packet must be looked up. If the PE has received MAC address 2054 advertisements for this destination MAC address from one or more 2055 other PEs or has learned it from locally connected CEs, the MAC 2056 address is considered a known MAC address. Otherwise, it is 2057 considered an unknown MAC address. 2059 For known MAC addresses, the PE forwards this packet to one of the 2060 remote PEs or to a locally attached CE. When forwarding to a remote 2061 PE, the packet is encapsulated in the EVPN MPLS label advertised by 2062 the remote PE, for that MAC address, and in the MPLS LSP label stack 2063 to reach the remote PE. 2065 If the MAC address is unknown and if the administrative policy on the 2066 PE requires flooding of unknown unicast traffic, then: 2068 - The PE MUST flood the packet to other PEs. The PE MUST first 2069 encapsulate the packet in the ESI MPLS label as described in 2070 Section 8.3. If ingress replication is used, the packet MUST be 2071 replicated to each remote PE, with the VPN label being an MPLS 2072 label determined as follows: This is the MPLS label advertised by 2073 the remote PE in a PMSI Tunnel attribute in the Inclusive Multicast 2074 Ethernet Tag route for a or 2075 combination. 
2077 The Ethernet tag in the route may be the same as the Ethernet tag 2078 associated with the interface on which the ingress PE receives the 2079 packet. If P2MP LSPs are being used, the packet MUST be sent on 2080 the P2MP LSP of which the PE is the root, for the Ethernet tag in 2081 the EVPN instance. If the same P2MP LSP is used for all Ethernet 2082 tags, then all the PEs in the EVPN instance MUST be the leaves of 2083 the P2MP LSP. If a distinct P2MP LSP is used for a given Ethernet 2084 tag in the EVPN instance, then only the PEs in the Ethernet tag 2085 MUST be the leaves of the P2MP LSP. The packet MUST be 2086 encapsulated in the P2MP LSP label stack. 2088 If the MAC address is unknown, then, if the administrative policy on 2089 the PE does not allow flooding of unknown unicast traffic: 2091 - the PE MUST drop the packet. 2093 13.2. Forwarding Packets Received from a Remote PE 2095 This section describes the procedures for forwarding known and 2096 unknown unicast packets received from a remote PE. 2098 13.2.1. Unknown Unicast Forwarding 2100 When a PE receives an MPLS packet from a remote PE, then, after 2101 processing the MPLS label stack, if the top MPLS label ends up being 2102 a P2MP LSP label associated with an EVPN instance or -- in the case 2103 of ingress replication -- the downstream label advertised in the 2104 P-tunnel attribute, and after performing the split-horizon procedures 2105 described in Section 8.3: 2107 - If the PE is the designated forwarder of BUM traffic on a 2108 particular set of ESIs for the Ethernet tag, the default behavior 2109 is for the PE to flood the packet on these ESIs. In other words, 2110 the default behavior is for the PE to assume that for BUM traffic 2111 it is not required to perform a destination MAC address lookup. As 2112 an option, the PE may perform a destination MAC lookup to flood the 2113 packet to only a subset of the CE interfaces in the Ethernet tag. 
2114 For instance, the PE may decide to not flood a BUM packet on 2115 certain Ethernet segments even if it is the DF on the Ethernet 2116 segment, based on administrative policy. 2118 - If the PE is not the designated forwarder on any of the ESIs for 2119 the Ethernet tag, the default behavior is for it to drop the 2120 packet. 2122 13.2.2. Known Unicast Forwarding 2124 If the top MPLS label ends up being an EVPN label that was advertised 2125 in the unicast MAC advertisements, then the PE either forwards the 2126 packet based on CE next-hop forwarding information associated with 2127 the label or does a destination MAC address lookup to forward the 2128 packet to a CE. 2130 14. Load Balancing of Unicast Packets 2132 This section specifies the load-balancing procedures for sending 2133 known unicast packets to a multihomed CE. 2135 14.1. Load Balancing of Traffic from a PE to Remote CEs 2137 Whenever a remote PE imports a MAC/IP Advertisement route for a given 2138 in a MAC-VRF, it MUST examine all imported 2139 Ethernet A-D routes for that ESI in order to determine the load- 2140 balancing characteristics of the Ethernet segment. 2142 14.1.1. Single-Active Redundancy Mode 2144 For a given ES, if a remote PE has imported the set of Ethernet A-D 2145 per ES routes from at least one PE, where the "Single-Active" flag in 2146 the ESI Label extended community is set, then that remote PE MUST 2147 deduce that the ES is operating in Single-Active redundancy mode. 2149 This means that for a given [EVI, BD], a given MAC address is 2150 reachable only via the PE announcing the associated MAC/IP 2151 Advertisement route; this PE will also have advertised an Ethernet 2152 A-D per EVI route for that [EVI, BD] with an L2-Attrib extended 2153 community in which the P bit is set. That is, the Primary DF Elected PE 2154 is also responsible for sending known unicast frames to the CE and 2155 receiving unicast and BUM frames from it.
Similarly, the Backup DF 2156 Elected PE will have advertised an Ethernet A-D per EVI route for 2157 [EVI, BD] with an L2-Attrib extended community in which the B bit is 2158 set. 2160 If the Primary DF Elected PE loses connectivity to the CE, it SHOULD 2161 withdraw its set of Ethernet A-D per ES routes for the affected ES 2162 prior to withdrawing the affected MAC/IP Advertisement routes. The 2163 Backup DF Elected PE (which is now the Primary DF Elected PE) needs 2164 to advertise an Ethernet A-D per EVI route for [EVI, BD] with an 2165 L2-Attrib extended community in which the P bit is set. Furthermore, 2166 the new Backup DF Elected PE needs to advertise an Ethernet A-D per 2167 EVI route for [EVI, BD] with an L2-Attrib extended community in which 2168 the B bit is set. 2170 A remote PE SHOULD use the Primary DF Elected PE's withdrawal of its 2171 set of Ethernet A-D per ES routes as a trigger to update its 2172 forwarding entries for the associated MAC addresses to point at the 2173 Backup DF Elected PE. As the Backup DF Elected PE starts learning 2174 the MAC addresses over its attached ES, it will start sending MAC/IP 2175 Advertisement routes while the failed PE withdraws its routes. This 2176 mechanism minimizes the flooding of traffic during fail-over events. 2178 14.1.2. All-Active Redundancy Mode 2180 For a given ES, if the remote PE has imported the set of Ethernet A-D 2181 per ES routes from one or more PEs and none of them have the 2182 "Single-Active" flag in the ESI Label extended community set, then 2183 the remote PE MUST deduce that the ES is operating in All-Active 2184 redundancy mode.
A remote PE that receives a MAC/IP Advertisement 2185 route with a non-reserved ESI SHOULD consider the advertised MAC 2186 address to be reachable via all PEs that have advertised reachability 2187 to that MAC address's EVI/ES via the combination of an Ethernet A-D 2188 per EVI route for that EVI/ES (and Ethernet tag, if applicable) AND 2189 an Ethernet A-D per ES route for that ES. The remote PE MUST use 2190 received MAC/IP Advertisement routes and Ethernet A-D per EVI/per ES 2191 routes to construct the set of next hops for the advertised MAC 2192 address. 2194 Each next hop comprises an MPLS label stack that is to be used by the 2195 egress PE to forward the packet. This label stack is determined as 2196 follows: 2198 - If the next hop is constructed as a result of a MAC route, then 2199 this label stack MUST be used. However, if the MAC route doesn't 2200 exist for that PE, then the next hop and the MPLS label stack are 2201 constructed as a result of the Ethernet A-D routes. Note that the 2202 following description applies to determining the label stack for a 2203 particular next hop to reach a given PE, from which the remote PE 2204 has received and imported Ethernet A-D routes that have the same 2205 ESI and Ethernet tag as the ones present in the MAC advertisement. 2206 The Ethernet A-D routes mentioned in the following description 2207 refer to the ones imported from this given PE. 2209 - If a set of Ethernet A-D per ES routes for that ES AND an Ethernet 2210 A-D route per EVI exist, only then must the label from that latter 2211 route be used. 2213 The following example explains the above. 2215 Consider a CE (CE1) that is dual-homed to two PEs (PE1 and PE2) on a 2216 LAG interface (ES1), and is sending packets with source MAC address 2217 MAC1 on VLAN1 (mapped to EVI1). A remote PE, say PE3, is able to 2218 learn that MAC1 is reachable via PE1 and PE2. Both PE1 and PE2 may 2219 advertise MAC1 in BGP if they receive packets with MAC1 from CE1. 
If 2220 this is not the case, and if MAC1 is advertised only by PE1, PE3 2221 still considers MAC1 as reachable via both PE1 and PE2, as both PE1 2222 and PE2 advertise a set of Ethernet A-D per ES routes for ES1 as well 2223 as an Ethernet A-D per EVI route for . 2225 The MPLS label stack to send the packets to PE1 is the MPLS LSP stack 2226 to get to PE1 (at the top of the stack) followed by the EVPN label 2227 advertised by PE1 for CE1's MAC. 2229 The MPLS label stack to send packets to PE2 is the MPLS LSP stack to 2230 get to PE2 (at the top of the stack) followed by the MPLS label in 2231 the Ethernet A-D route advertised by PE2 for , if PE2 has 2232 not advertised MAC1 in BGP. 2234 We will refer to these label stacks as MPLS next hops. 2236 The remote PE (PE3) can now load balance the traffic it receives from 2237 its CEs, destined for CE1, between PE1 and PE2. PE3 may use N-tuple 2238 flow information to hash traffic into one of the MPLS next hops for 2239 load balancing of IP traffic. Alternatively, PE3 may rely on the 2240 source MAC addresses for load balancing. 2242 Note that once PE3 decides to send a particular packet to PE1 or PE2, 2243 it can pick one out of multiple possible paths to reach the 2244 particular remote PE using regular MPLS procedures. For instance, if 2245 the tunneling technology is based on RSVP-TE LSPs and PE3 decides to 2246 send a particular packet to PE1, then PE3 can choose from multiple 2247 RSVP-TE LSPs that have PE1 as their destination. 2249 When PE1 or PE2 receives the packet destined for CE1 from PE3, if the 2250 packet is a known unicast, it is forwarded to CE1. If it is a BUM 2251 packet, then only one of PE1 or PE2 must forward the packet to the 2252 CE. Whether PE1 or PE2 forwards this packet to the CE is determined 2253 based on which of the two is the DF. 2255 14.2. 
Load Balancing of Traffic between a PE and a Local CE 2257 A CE may be configured with more than one interface connected to 2258 different PEs or the same PE for load balancing, using a technology 2259 such as a LAG. The PE(s) and the CE can load balance traffic onto 2260 these interfaces using one of the following mechanisms. 2262 14.2.1. Data-Plane Learning 2264 Consider that the PEs perform data-plane learning for local MAC 2265 addresses learned from local CEs. This enables the PE(s) to learn a 2266 particular MAC address and associate it with one or more interfaces, 2267 if the technology between the PE and the CE supports multipathing. 2268 The PEs can now load balance traffic destined to that MAC address on 2269 the multiple interfaces. 2271 Whether the CE can load balance traffic that it generates on the 2272 multiple interfaces is dependent on the CE implementation. 2274 14.2.2. Control-Plane Learning 2276 The CE can be a host that advertises the same MAC address using a 2277 control protocol on all interfaces. This enables the PE(s) to learn 2278 the host's MAC address and associate it with all interfaces. The PEs 2279 can now load balance traffic destined to the host on all these 2280 interfaces. The host can also load balance the traffic it generates 2281 onto these interfaces, and the PE that receives the traffic employs 2282 EVPN forwarding procedures to forward the traffic. 2284 15. MAC Mobility 2286 It is possible for a given host or end-station (as defined by its MAC 2287 address) to move from one Ethernet segment to another; this is 2288 referred to as 'MAC Mobility' or 'MAC move', and it is different from 2289 the multihoming situation in which a given MAC address is reachable 2290 via multiple PEs for the same Ethernet segment. 
In a MAC move, there 2291 would be two sets of MAC/IP Advertisement routes -- one set with the 2292 new Ethernet segment and one set with the previous Ethernet segment 2293 -- and the MAC address would appear to be reachable via each of these 2294 segments. 2296 In order to allow all of the PEs in the EVPN instance to correctly 2297 determine the current location of the MAC address, all advertisements 2298 of it being reachable via the previous Ethernet segment MUST be 2299 withdrawn by the PEs, for the previous Ethernet segment, that had 2300 advertised it. 2302 If local learning is performed using the data plane, these PEs will 2303 not be able to detect that the MAC address has moved to another 2304 Ethernet segment, and the receipt of MAC/IP Advertisement routes, 2305 with the MAC Mobility extended community attribute, from other PEs 2306 serves as the trigger for these PEs to withdraw their advertisements. 2307 If local learning is performed using the control or management 2308 planes, these interactions serve as the trigger for these PEs to 2309 withdraw their advertisements. 2311 In a situation where there are multiple moves of a given MAC, 2312 possibly between the same two Ethernet segments, there may be 2313 multiple withdrawals and re-advertisements. In order to ensure that 2314 all PEs in the EVPN instance receive all of these correctly through 2315 the intervening BGP infrastructure, introducing a sequence number 2316 into the MAC Mobility extended community attribute is necessary. 2318 In order to process mobility events correctly, an implementation MUST 2319 handle scenarios in which sequence number wraparound occurs. 2321 Every MAC mobility event for a given MAC address will contain a 2322 sequence number that is set using the following rules: 2324 - A PE advertising a MAC address for the first time advertises it 2325 with no MAC Mobility extended community attribute. 
2327 - A PE detecting a locally attached MAC address for which it had 2328 previously received a MAC/IP Advertisement route with a different 2329 Ethernet segment identifier advertises the MAC address in a MAC/IP 2330 Advertisement route tagged with a MAC Mobility extended community 2331 attribute with a sequence number one greater than the sequence 2332 number in the MAC Mobility extended community attribute of the 2333 received MAC/IP Advertisement route. In the case of the first 2334 mobility event for a given MAC address, where the received MAC/IP 2335 Advertisement route does not carry a MAC Mobility extended 2336 community attribute, the value of the sequence number in the 2337 received route is assumed to be 0 for the purpose of this 2338 processing. 2340 - A PE detecting a locally attached MAC address for which it had 2341 previously received a MAC/IP Advertisement route with the same 2342 non-zero Ethernet segment identifier advertises it with: 2344 1. no MAC Mobility extended community attribute, if the received 2345 route did not carry said attribute. 2347 2. a MAC Mobility extended community attribute with the sequence 2348 number equal to the highest of the sequence number(s) in the 2349 received MAC/IP Advertisement route(s), if the received route(s) 2350 is (are) tagged with a MAC Mobility extended community 2351 attribute. 2353 - A PE detecting a locally attached MAC address for which it had 2354 previously received a MAC/IP Advertisement route with the same zero 2355 Ethernet segment identifier (single-homed scenarios) advertises it 2356 with a MAC Mobility extended community attribute with the sequence 2357 number set properly. In the case of single-homed scenarios, there 2358 is no need for ESI comparison. ESI comparison is done for 2359 multihoming in order to prevent false detection of MAC moves among 2360 the PEs attached to the same multihomed site. 
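As a non-normative illustration, the sequence-number rules above can be sketched in Python. The function name, the (ESI, sequence number) tuple model, and the 32-bit modulus are assumptions of this sketch rather than part of the procedures. In particular, the single-homed case is assumed to increment exactly as a move between segments does, and comparison of sequence numbers across a wraparound is not addressed.

```python
from typing import Optional, Sequence, Tuple

ZERO_ESI = bytes(10)       # all-zero ESI: single-homed site
SEQ_MODULUS = 2 ** 32      # assumption: sequence number taken to be 32 bits wide

# A received route is modelled as (esi, seq); seq is None when the route
# carried no MAC Mobility extended community attribute.
Received = Tuple[bytes, Optional[int]]


def local_sequence_number(local_esi: bytes,
                          received: Sequence[Received]) -> Optional[int]:
    """Return the sequence number to advertise, or None for 'no attribute'."""
    if not received:
        return None  # first advertisement of this MAC address

    # A received route without the attribute counts as sequence number 0.
    seqs = [0 if seq is None else seq for _, seq in received]

    same_segment = (local_esi != ZERO_ESI
                    and all(esi == local_esi for esi, _ in received))
    if same_segment:
        # Same non-zero ESI: multihoming, not a move, so mirror what was received.
        if all(seq is None for _, seq in received):
            return None
        return max(seqs)

    # Different ESI, or zero ESI (single-homed): handled as a move in this sketch.
    return (max(seqs) + 1) % SEQ_MODULUS
```

Under this sketch, a PE that locally learns a MAC address previously received from a different Ethernet segment with sequence number 4 would advertise it with sequence number 5.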
2362 A PE receiving a MAC/IP Advertisement route for a MAC address with a 2363 different Ethernet segment identifier and a higher sequence number 2364 than that which it had previously advertised withdraws its MAC/IP 2365 Advertisement route. If two (or more) PEs advertise the same MAC 2366 address with the same sequence number but different Ethernet segment 2367 identifiers, a PE that receives these routes selects the route 2368 advertised by the PE with the lowest IP address as the best route. 2369 If the PE is the originator of the MAC route and it receives the same 2370 MAC address with the same sequence number that it generated, it will 2371 compare its own IP address with the IP address of the remote PE and 2372 will select the lowest IP. If its own route is not the best one, it 2373 will withdraw the route. 2375 15.1. MAC Duplication Issue 2377 A situation may arise where the same MAC address is learned by 2378 different PEs in the same VLAN because of two (or more) hosts being 2379 misconfigured with the same (duplicate) MAC address. In such a 2380 situation, the traffic originating from these hosts would trigger 2381 continuous MAC moves among the PEs attached to these hosts. It is 2382 important to recognize such a situation and avoid incrementing the 2383 sequence number (in the MAC Mobility extended community attribute) to 2384 infinity. In order to remedy such a situation, a PE that detects a 2385 MAC mobility event via local learning starts an M-second timer (with 2386 a default value of M = 180), and if it detects N MAC moves before the 2387 timer expires (with a default value of N = 5), it concludes that a 2388 duplicate-MAC situation has occurred. The PE MUST alert the operator 2389 and stop sending and processing any BGP MAC/IP Advertisement routes 2390 for that MAC address until a corrective action is taken by the 2391 operator. The values of M and N MUST be configurable to allow for 2392 flexibility in operator control. 
Note that the other PEs in the EVPN 2393 instance will forward the traffic for the duplicate MAC address to 2394 one of the PEs advertising the duplicate MAC address. 2396 15.2. Sticky MAC Addresses 2398 There are scenarios in which it is desired to configure some MAC 2399 addresses as static so that they are not subjected to MAC moves. In 2400 such scenarios, these MAC addresses are advertised with a MAC 2401 Mobility extended community where the static flag is set to 1 and the 2402 sequence number is set to zero. If a PE receives such advertisements 2403 and later learns the same MAC address(es) via local learning, then 2404 the PE MUST alert the operator. 2406 15.3. Loop Protection 2408 The EVPN MAC Duplication procedure in Section 15.1 prevents an 2409 endless EVPN MAC/IP route advertisement exchange for a duplicate MAC 2410 between two (or more) PEs. While this helps the control plane 2411 settle, if there is a backdoor link (loop) between two or more PEs 2412 attached to the same BD, BUM frames being sent by a CE are still 2413 endlessly looped within the BD through the backdoor link and among 2414 the PEs. This may cause unpredictable issues in the CEs connected to 2415 the affected BD. 2417 The EVPN MAC Duplication mechanism in Section 15.1 MAY be extended 2418 with a Loop Protection action that is applied to the duplicate-MAC 2419 addresses. This additional mechanism resolves loops created by 2420 accidental or intentional backdoor links and SHOULD be enabled in all 2421 the PEs attached to the BD. 2423 After following the procedure in Section 15.1, when a PE detects a 2424 MAC M as duplicate, the PE behaves as follows: 2426 a) Stops advertising M and logs a duplicate event. 2428 b) Initializes a retry-timer, R seconds. 2430 c) Since Loop Protection is enabled, the PE executes a Loop 2431 Protection action, which we refer to as "Black-Holing" M.
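The detection window of the MAC Duplication procedure, which triggers the actions listed above, can be illustrated with a short non-normative Python sketch. The class and method names are hypothetical, time is passed in explicitly so the behavior is easy to follow, and the move that starts the timer is assumed to count as the first of the N moves. Only the defaults M = 180 and N = 5 come from the text.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple


@dataclass
class DuplicateMacDetector:
    window_seconds: float = 180.0   # M, default value from the text
    max_moves: int = 5              # N, default value from the text
    # Per-MAC state: (time the current window started, moves seen in it).
    _windows: Dict[str, Tuple[float, int]] = field(default_factory=dict)
    duplicates: Set[str] = field(default_factory=set)

    def record_move(self, mac: str, now: float) -> bool:
        """Record one locally detected MAC move; return True if mac is duplicate."""
        if mac in self.duplicates:
            return True  # remains duplicate until clear() is called
        start, count = self._windows.get(mac, (now, 0))
        if now - start >= self.window_seconds:
            start, count = now, 0  # timer expired: open a new window
        count += 1
        if count >= self.max_moves:
            self.duplicates.add(mac)
            self._windows.pop(mac, None)
            return True
        self._windows[mac] = (start, count)
        return False

    def clear(self, mac: str) -> None:
        """Forget mac, e.g. after operator action or retry-timer expiry."""
        self.duplicates.discard(mac)
        self._windows.pop(mac, None)
```

With the defaults, five moves of the same MAC address within 180 seconds mark it as duplicate, whereas five moves spaced 100 seconds apart do not, because the window restarts before the count reaches N.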
2433 When the PE programs M as a Black-Hole MAC in the Bridge Table, M is 2434 no longer associated with the backdoor Attachment Circuit (AC), but with 2435 a Black-Hole destination. 2437 At this point and while M is in Black-Hole state: 2439 a) If a new frame is received (from the EVPN network or the backdoor 2440 AC) with MAC SA = M, the PE identifies M to be Black-Holed and 2441 discards the frame, ending the loop. 2443 b) Optionally, instead of simply discarding the frame with MAC SA = 2444 M, the PE MAY bring down the AC on which the offending frame was 2445 last seen. 2447 c) Optionally, any frame that arrives at the PE with MAC DA = M 2448 SHOULD be discarded too. 2450 When the retry-timer R for M expires, the PE flushes M from the 2451 Bridge Table and the process is restarted. In general, a Black-Hole 2452 MAC M can be flushed from the Bridge Table if any of the following 2453 events occur: 2455 o Retry-timer R for duplicate-MAC M expires (as discussed). R is 2456 initialized when M is detected as duplicate-MAC. Its value is 2457 configurable and SHOULD be at least three times the EVPN MAC 2458 Duplication M-timer window. 2460 o The operator manually flushes a Black-Hole MAC M. This should be 2461 done only if the conditions under which M was identified as 2462 duplicate have been cleared. 2464 o The remote PE withdraws the MAC/IP route for M and there are no 2465 other remote MAC/IP routes for M. 2467 o The remote PE sends a MAC/IP route update for M with the sticky-bit 2468 set (in the MAC Mobility extended community). 2470 16. Multicast and Broadcast 2472 The PEs in a particular EVPN instance may use ingress replication or 2473 P2MP or MP2MP LSPs to send multicast traffic to other PEs. 2475 16.1. Ingress Replication 2477 The PEs may use ingress replication for flooding BUM traffic as 2478 described in Section 11 ("Handling of Multi-destination Traffic"). A 2479 given broadcast packet must be sent to all the remote PEs.
However, 2480 a given multicast packet for a multicast flow may be sent to only a 2481 subset of the PEs. Specifically, a given multicast flow may be sent 2482 to only those PEs that have receivers that are interested in the 2483 multicast flow. Determining which of the PEs have receivers for a 2484 given multicast flow is done using the procedures of 2485 [I-D.ietf-bess-evpn-igmp-mld-proxy]. 2487 16.2. P2MP or MP2MP LSPs 2489 A PE may use an "Inclusive" tree for sending a BUM packet. This 2490 terminology is borrowed from [RFC7117]. 2492 A variety of transport technologies may be used in the service 2493 provider (SP) network. For Inclusive P-multicast trees, these 2494 transport technologies include point-to-multipoint LSPs created by 2495 RSVP-TE or Multipoint LDP (mLDP) or BIER. 2497 16.2.1. Inclusive Trees 2499 An Inclusive tree allows the use of a single multicast distribution 2500 tree, referred to as an Inclusive P-multicast tree, in the SP network 2501 to carry all the multicast traffic from a specified set of EVPN 2502 instances on a given PE. A particular P-multicast tree can be set up 2503 to carry the traffic originated by sites belonging to a single EVPN 2504 instance, or to carry the traffic originated by sites belonging to 2505 several EVPN instances. The ability to carry the traffic of more 2506 than one EVPN instance on the same tree is termed 'Aggregation', and 2507 the tree is called an Aggregate Inclusive P-multicast tree or 2508 Aggregate Inclusive tree for short. The Aggregate Inclusive tree 2509 needs to include every PE that is a member of any of the EVPN 2510 instances that are using the tree. This implies that a PE may 2511 receive BUM traffic even if it doesn't have any receivers that are 2512 interested in receiving that traffic. 2514 An Inclusive or Aggregate Inclusive tree as defined in this document 2515 is a P2MP tree. 
   A P2MP or MP2MP tree is used to carry traffic only for EVPN CEs that
   are connected to the PE that is the root of the tree.

   The procedures for signaling an Inclusive tree are the same as those
   in [RFC7117], with the VPLS A-D route replaced with the Inclusive
   Multicast Ethernet Tag route.  The P-tunnel attribute [RFC7117] for
   an Inclusive tree is advertised with the Inclusive Multicast Ethernet
   Tag route as described in Section 11
   ("Handling of Multi-destination Traffic").  Note that for an
   Aggregate Inclusive tree, a PE can "aggregate" multiple EVPN
   instances on the same P2MP LSP using upstream labels or DCB allocated
   labels [I-D.ietf-bess-mvpn-evpn-aggregation-label].  The procedures
   for aggregation are the same as those described in [RFC7117], with
   VPLS A-D routes replaced by EVPN Inclusive Multicast Ethernet Tag
   routes.

17.  Convergence

   This section describes failure recovery from different types of
   network failures.

17.1.  Transit Link and Node Failures between PEs

   The use of existing MPLS fast-reroute mechanisms can provide failure
   recovery on the order of 50 ms, in the event of transit link and node
   failures in the infrastructure that connects the PEs.

17.2.  PE Failures

   Consider a host CE1 that is dual-homed to PE1 and PE2.  If PE1 fails,
   a remote PE, PE3, can discover this based on the failure of the BGP
   session.  This failure detection can be in the sub-second range if
   Bidirectional Forwarding Detection (BFD) is used to detect BGP
   session failures.  PE3 can update its forwarding state to start
   sending all traffic for CE1 to only PE2.

17.3.  PE-to-CE Network Failures

   If the connectivity between the multihomed CE and one of the PEs to
   which it is attached fails, the PE MUST withdraw the set of Ethernet
   A-D per ES routes that had been previously advertised for that ES.
   This enables the remote PEs to remove the MPLS next hop to this
   particular PE from the set of MPLS next hops that can be used to
   forward traffic to the CE.  When the MAC entry on the PE ages out,
   the PE MUST withdraw the MAC address from BGP.

   When an Ethernet tag is decommissioned on an Ethernet segment, then
   the PE MUST withdraw the Ethernet A-D per EVI route(s) announced for
   the <ESI, Ethernet tags> that are impacted by the decommissioning.
   In addition, the PE MUST also withdraw the MAC/IP Advertisement
   routes that are impacted by the decommissioning.

   The Ethernet A-D per ES routes should be used by an implementation to
   optimize the withdrawal of MAC/IP Advertisement routes.  When a PE
   receives a withdrawal of a particular Ethernet A-D route from an
   advertising PE, it SHOULD consider all the MAC/IP Advertisement
   routes that are learned from the same ESI as in the Ethernet A-D
   route from the advertising PE as having been withdrawn.  This
   optimizes the network convergence times in the event of PE-to-CE
   failures.

18.  Frame Ordering

   If the first nibble (bits 8 through 5) of the most significant octet
   of the destination MAC address (which follows the last MPLS label)
   happens to be 0x4 or 0x6, then the Ethernet frame can be
   misinterpreted as an IPv4 or IPv6 packet by intermediate P nodes
   performing ECMP based on deep packet inspection.  Such nodes may then
   load balance packets belonging to the same flow over different ECMP
   paths, subjecting those packets to different delays.  Therefore,
   packets belonging to the same flow can arrive at the destination out
   of order.  This out-of-order delivery can happen during steady state
   in the absence of any failures, resulting in significant impact on
   network operations.

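   The misinterpretation described above can be illustrated with a
   minimal sketch.  It is not part of any specification: the function
   names are hypothetical, and it assumes that a transit node guesses
   the payload type solely from the first nibble that follows the last
   MPLS label.  It also shows why a 4-octet control word with value 0,
   placed ahead of the frame, removes the ambiguity.

```python
IPV4_NIBBLE = 0x4  # IP version field value for IPv4
IPV6_NIBBLE = 0x6  # IP version field value for IPv6


def parse_mac(text: str) -> bytes:
    """Parse 'aa:bb:cc:dd:ee:ff' (or '-' separated) into 6 octets."""
    mac = bytes.fromhex(text.replace(":", "").replace("-", ""))
    if len(mac) != 6:
        raise ValueError("expected a 6-octet MAC address")
    return mac


def first_nibble_after_labels(dest_mac: bytes, control_word: bool) -> int:
    """Return the nibble a transit node sees right after the label stack.

    With a control word (assumed all-zero here), the first four octets
    after the last label are zero, so the nibble is 0.  Without one, it
    is the high-order nibble of the first octet of the destination MAC.
    """
    payload = (bytes(4) if control_word else b"") + dest_mac
    return payload[0] >> 4


def may_be_mistaken_for_ip(dest_mac: bytes, control_word: bool = False) -> bool:
    """True if a node guessing the payload type from that nibble would
    treat the Ethernet frame as an IPv4 or IPv6 packet."""
    nibble = first_nibble_after_labels(dest_mac, control_word)
    return nibble in (IPV4_NIBBLE, IPV6_NIBBLE)
```

   For example, may_be_mistaken_for_ip(parse_mac("40:00:00:00:00:01"))
   is true, while the same call with control_word=True is false.
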
   In order to avoid the frame misordering described above, the
   following network-wide rules are applied:

   -  If a network uses deep packet inspection for its ECMP, then the
      "Preferred PW MPLS Control Word" [RFC4385] MUST be used with the
      value 0 (e.g., a 4-octet field with a value of zero) when sending
      unicast EVPN-encapsulated packets over an MP2P LSP.

   -  When sending EVPN-encapsulated packets over a P2MP or P2P RSVP-TE
      LSP, the control word SHOULD NOT be used.

   -  When sending EVPN-encapsulated packets over a P2MP LSP (e.g.,
      using mLDP signaling), the control word SHOULD be used.

   -  If a network uses entropy labels per [RFC6790], then the control
      word SHOULD NOT be used when sending EVPN-encapsulated packets
      over an MP2P LSP.

18.1.  Flow Label

   The Flow Label adds entropy to divisible flows and enables ECMP load
   balancing in the network.  The Flow Label MAY be used in EVPN
   networks to achieve better load balancing when transit nodes perform
   deep packet inspection for ECMP hashing.  The following rules apply:

   -  When the F-bit is set to 1, the PE announces that it is capable of
      both sending and receiving the Flow Label for known unicast.  If
      the PE is capable of supporting the Flow Label, then upon
      receiving the F-bit from a remote PE, it MUST send known unicast
      packets to that PE with Flow Labels and it MUST NOT send BUM
      packets to that PE with Flow Labels.

   -  An ingress PE will push the Flow Label at the bottom of the stack
      of the EVPN-encapsulated known unicast packets sent to an egress
      PE that previously signaled the F-bit set to 1.

   -  The Flow Label MUST NOT be used for EVPN-encapsulated BUM packets.

   -  If a PE receives a unicast packet with two labels, then it can
      differentiate between [VPN label + ESI label] and [VPN label +
      Flow label], and there should be no ambiguity between ESI and
      Flow labels even if they overlap.  The reason for this is that
      the downstream-assigned VPN label for known unicast is different
      from the one for BUM traffic, and the ESI label (if present) comes
      after the BUM VPN label.  Therefore, from the VPN label, the
      receiving PE knows whether the next label is an ESI label or a
      Flow label, i.e., if the VPN label is for known unicast, then the
      next label MUST be a Flow label, and if the VPN label is for BUM
      traffic, then the next label MUST be an ESI label because BUM
      packets are not sent with Flow labels.

   -  When sending EVPN-encapsulated packets over a P2MP LSP (either
      RSVP-TE or mLDP), the Flow label SHOULD NOT be used.  This is
      independent of any F-bit signaling in the L2-Attr Extended
      Community, which would still apply to unicast.

   If a network uses entropy labels per [RFC6790], then the control word
   MUST NOT be used when sending EVPN-encapsulated packets over an MP2P
   LSP.

19.  Use of Domain-wide Common Block (DCB) Labels

   The use of DCB labels as in [DCB] is RECOMMENDED in the following
   cases:

   +  Aggregate P-multicast trees: A P-multicast tree MAY aggregate the
      traffic of two or more BDs on a given ingress PE.  When
      aggregation is needed, DCB Labels
      [I-D.ietf-bess-mvpn-evpn-aggregation-label] MAY be used in the
      MPLS label field of the PMSI Tunnel Attribute of the Inclusive
      Multicast Ethernet Tag routes.  The use of DCB Labels, instead of
      upstream allocated labels, can greatly reduce the number of labels
      that the egress PEs need to process when P-multicast tunnel
      aggregation is used in a network with a large number of BDs.

   +  BIER tunnels: As described in [I-D.ietf-bier-evpn], the use of
      labels with BIER tunnels in EVPN networks is similar to aggregate
      tunnels, since the ingress PE uses upstream allocated labels to
      identify the BD.  As described in [I-D.ietf-bier-evpn], DCB labels
      can be allocated instead of upstream labels in the PMSI Tunnel
      Attribute so that the number of labels required on the egress PEs
      can be reduced.

   +  ESI labels: The ESI labels advertised with EVPN A-D per ES routes
      MAY be allocated as DCB labels in general, and are RECOMMENDED to
      be allocated as DCB labels when used in combination with P2MP/BIER
      tunnels.

   When MP2MP tunnels are used, ESI-labels MUST be allocated from a DCB
   and the same label must be used by all the PEs attached to the same
   Ethernet Segment.  In that way, any egress PE with local Ethernet
   Segments can identify the source ES of the received BUM packets.

20.  Security Considerations

   Security considerations discussed in [RFC4761] and [RFC4762] apply to
   this document for MAC learning in the data plane over an Attachment
   Circuit (AC) and for flooding of unknown unicast and ARP messages
   over the MPLS/IP core.  Security considerations discussed in
   [RFC4364] apply to this document for MAC learning in the control
   plane over the MPLS/IP core.  This section describes additional
   considerations.

   As mentioned in [RFC4761], there are two aspects to achieving data
   privacy and protecting against denial-of-service attacks in a VPN:
   securing the control plane and protecting the forwarding path.
   Compromise of the control plane could result in a PE sending customer
   data belonging to some EVPN to another EVPN, or black-holing EVPN
   customer data, or even sending it to an eavesdropper, none of which
   are acceptable from a data privacy point of view.
   In addition, compromise of the control plane could provide
   opportunities for unauthorized EVPN data usage (e.g., exploiting
   traffic replication within a multicast tree to amplify a
   denial-of-service attack based on sending large amounts of traffic).

   The mechanisms in this document use BGP for the control plane.
   Hence, techniques such as those discussed in [RFC5925] help
   authenticate BGP messages, making it harder to spoof updates (which
   can be used to divert EVPN traffic to the wrong EVPN instance) or
   withdrawals (denial-of-service attacks).  In the multi-AS backbone
   options (b) and (c) [RFC4364], this also means protecting the
   inter-AS BGP sessions between the Autonomous System Border Routers
   (ASBRs), the PEs, or the Route Reflectors.

   Further discussion of security considerations for BGP may be found in
   the BGP specification itself [RFC4271] and in the security analysis
   for BGP [RFC4272].  The TCP Authentication Option (TCP-AO), which
   replaces the older TCP MD5 signature option for protecting BGP
   sessions, is specified in [RFC5925], while [RFC6952] includes an
   analysis of BGP keying and authentication issues.

   Note that [RFC5925] will not help in keeping MPLS labels private --
   knowing the labels, one can eavesdrop on EVPN traffic.  Such
   eavesdropping additionally requires access to the data path within an
   SP network.  Users of VPN services are expected to take appropriate
   precautions (such as encryption) to protect the data exchanged over
   a VPN.

   One of the requirements for protecting the data plane is that the
   MPLS labels be accepted only from valid interfaces.  For a PE, valid
   interfaces comprise links from other routers in the PE's own AS.  For
   an ASBR, valid interfaces comprise links from other routers in the
   ASBR's own AS, and links from other ASBRs in ASes that have instances
   of a given EVPN.
   It is especially important in the case of multi-AS EVPN instances
   that one accept EVPN packets only from valid interfaces.

   It is also important to help limit malicious traffic into a network
   for an impostor MAC address.  The mechanism described in Section 15.1
   shows how duplicate MAC addresses can be detected and continuous
   false MAC mobility can be prevented.  The mechanism described in
   Section 15.2 shows how MAC addresses can be pinned to a given
   Ethernet segment, such that if they appear behind any other Ethernet
   segments, the traffic for those MAC addresses can be prevented from
   entering the EVPN network from the other Ethernet segments.

21.  IANA Considerations

   This document defines a new NLRI, called "EVPN", to be carried in BGP
   using multiprotocol extensions.  This NLRI uses the existing AFI of
   25 (L2VPN).  IANA has assigned BGP EVPNs a SAFI value of 70.

   IANA has allocated the following EVPN Extended Community sub-types in
   [RFC7153], and this document is the only reference for them.

      0x00     MAC Mobility                       [RFC7432]
      0x01     ESI Label                          [RFC7432]
      0x02     ES-Import Route Target             [RFC7432]

   This document creates a registry called "EVPN Route Types".  New
   registrations will be made through the "RFC Required" procedure
   defined in [RFC5226].  The registry has a maximum value of 255.
   Initial registrations are as follows:

      0     Reserved                              [RFC7432]
      1     Ethernet Auto-discovery               [RFC7432]
      2     MAC/IP Advertisement                  [RFC7432]
      3     Inclusive Multicast Ethernet Tag      [RFC7432]
      4     Ethernet Segment                      [RFC7432]

   This document requests allocation of bit 3 in the "EVPN Layer 2
   Attributes Control Flags" registry with name F:

      F     Flow Label MUST be present

22.  References

22.1.  Normative References

   [I-D.ietf-bess-evpn-vpws-fxc]
              Sajassi, A., Brissette, P., Uttaro, J., Drake, J., Lin,
              W., Boutros, S., and J.
              Rabadan, "EVPN VPWS Flexible Cross-Connect Service",
              draft-ietf-bess-evpn-vpws-fxc-01 (work in progress),
              June 2019.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC4271]  Rekhter, Y., Ed., Li, T., Ed., and S. Hares, Ed., "A
              Border Gateway Protocol 4 (BGP-4)", RFC 4271,
              DOI 10.17487/RFC4271, January 2006.

   [RFC4360]  Sangli, S., Tappan, D., and Y. Rekhter, "BGP Extended
              Communities Attribute", RFC 4360, DOI 10.17487/RFC4360,
              February 2006.

   [RFC4364]  Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private
              Networks (VPNs)", RFC 4364, DOI 10.17487/RFC4364,
              February 2006.

   [RFC4760]  Bates, T., Chandra, R., Katz, D., and Y. Rekhter,
              "Multiprotocol Extensions for BGP-4", RFC 4760,
              DOI 10.17487/RFC4760, January 2007.

   [RFC4761]  Kompella, K., Ed. and Y. Rekhter, Ed., "Virtual Private
              LAN Service (VPLS) Using BGP for Auto-Discovery and
              Signaling", RFC 4761, DOI 10.17487/RFC4761, January 2007.

   [RFC4762]  Lasserre, M., Ed. and V. Kompella, Ed., "Virtual Private
              LAN Service (VPLS) Using Label Distribution Protocol (LDP)
              Signaling", RFC 4762, DOI 10.17487/RFC4762, January 2007.

   [RFC7153]  Rosen, E. and Y. Rekhter, "IANA Registries for BGP
              Extended Communities", RFC 7153, DOI 10.17487/RFC7153,
              March 2014.

   [RFC7432]  Sajassi, A., Ed., Aggarwal, R., Bitar, N., Isaac, A.,
              Uttaro, J., Drake, J., and W. Henderickx, "BGP MPLS-Based
              Ethernet VPN", RFC 7432, DOI 10.17487/RFC7432,
              February 2015.

   [RFC8214]  Boutros, S., Sajassi, A., Salam, S., Drake, J., and J.
              Rabadan, "Virtual Private Wire Service Support in Ethernet
              VPN", RFC 8214, DOI 10.17487/RFC8214, August 2017.

   [RFC8584]  Rabadan, J., Ed., Mohanty, S., Ed., Sajassi, A., Drake,
              J., Nagaraj, K., and S.
              Sathappan, "Framework for Ethernet VPN Designated
              Forwarder Election Extensibility", RFC 8584,
              DOI 10.17487/RFC8584, April 2019.

22.2.  Informative References

   [I-D.ietf-bess-evpn-igmp-mld-proxy]
              Sajassi, A., Thoria, S., Patel, K., Drake, J., and W. Lin,
              "IGMP and MLD Proxy for EVPN", draft-ietf-bess-evpn-igmp-
              mld-proxy-05 (work in progress), April 2020.

   [I-D.ietf-bess-evpn-prefix-advertisement]
              Rabadan, J., Henderickx, W., Drake, J., Lin, W., and A.
              Sajassi, "IP Prefix Advertisement in EVPN", draft-ietf-
              bess-evpn-prefix-advertisement-11 (work in progress), May
              2018.

   [I-D.ietf-bess-mvpn-evpn-aggregation-label]
              Zhang, Z., Rosen, E., Lin, W., Li, Z., and I. Wijnands,
              "MVPN/EVPN Tunnel Aggregation with Common Labels", draft-
              ietf-bess-mvpn-evpn-aggregation-label-04 (work in
              progress), November 2020.

   [IEEE.802.1D_2004]
              IEEE, "IEEE Standard for Local and metropolitan area
              networks: Media Access Control (MAC) Bridges", IEEE
              802.1D-2004, DOI 10.1109/ieeestd.2004.94569, July 2004.

   [IEEE.802.1Q_2014]
              IEEE, "IEEE Standard for Local and metropolitan area
              networks--Bridges and Bridged Networks", IEEE 802.1Q-2014,
              DOI 10.1109/ieeestd.2014.6991462, December 2014.

   [RFC4272]  Murphy, S., "BGP Security Vulnerabilities Analysis",
              RFC 4272, DOI 10.17487/RFC4272, January 2006.

   [RFC4385]  Bryant, S., Swallow, G., Martini, L., and D. McPherson,
              "Pseudowire Emulation Edge-to-Edge (PWE3) Control Word for
              Use over an MPLS PSN", RFC 4385, DOI 10.17487/RFC4385,
              February 2006.

   [RFC4664]  Andersson, L., Ed. and E. Rosen, Ed., "Framework for Layer
              2 Virtual Private Networks (L2VPNs)", RFC 4664,
              DOI 10.17487/RFC4664, September 2006.

   [RFC4684]  Marques, P., Bonica, R., Fang, L., Martini, L., Raszuk,
              R., Patel, K., and J.
              Guichard, "Constrained Route Distribution for Border
              Gateway Protocol/MultiProtocol Label Switching (BGP/MPLS)
              Internet Protocol (IP) Virtual Private Networks (VPNs)",
              RFC 4684, DOI 10.17487/RFC4684, November 2006.

   [RFC5226]  Narten, T. and H. Alvestrand, "Guidelines for Writing an
              IANA Considerations Section in RFCs", RFC 5226,
              DOI 10.17487/RFC5226, May 2008.

   [RFC5925]  Touch, J., Mankin, A., and R. Bonica, "The TCP
              Authentication Option", RFC 5925, DOI 10.17487/RFC5925,
              June 2010.

   [RFC6514]  Aggarwal, R., Rosen, E., Morin, T., and Y. Rekhter, "BGP
              Encodings and Procedures for Multicast in MPLS/BGP IP
              VPNs", RFC 6514, DOI 10.17487/RFC6514, February 2012.

   [RFC6790]  Kompella, K., Drake, J., Amante, S., Henderickx, W., and
              L. Yong, "The Use of Entropy Labels in MPLS Forwarding",
              RFC 6790, DOI 10.17487/RFC6790, November 2012.

   [RFC6952]  Jethanandani, M., Patel, K., and L. Zheng, "Analysis of
              BGP, LDP, PCEP, and MSDP Issues According to the Keying
              and Authentication for Routing Protocols (KARP) Design
              Guide", RFC 6952, DOI 10.17487/RFC6952, May 2013.

   [RFC7117]  Aggarwal, R., Ed., Kamite, Y., Fang, L., Rekhter, Y., and
              C. Kodeboniya, "Multicast in Virtual Private LAN Service
              (VPLS)", RFC 7117, DOI 10.17487/RFC7117, February 2014.

   [RFC7209]  Sajassi, A., Aggarwal, R., Uttaro, J., Bitar, N.,
              Henderickx, W., and A. Isaac, "Requirements for Ethernet
              VPN (EVPN)", RFC 7209, DOI 10.17487/RFC7209, May 2014.

   [RFC8317]  Sajassi, A., Ed., Salam, S., Drake, J., Uttaro, J.,
              Boutros, S., and J. Rabadan, "Ethernet-Tree (E-Tree)
              Support in Ethernet VPN (EVPN) and Provider Backbone
              Bridging EVPN (PBB-EVPN)", RFC 8317, DOI 10.17487/RFC8317,
              January 2018.

   [RFC8365]  Sajassi, A., Ed., Drake, J., Ed., Bitar, N., Shekhar, R.,
              Uttaro, J., and W.
              Henderickx, "A Network Virtualization Overlay Solution
              Using Ethernet VPN (EVPN)", RFC 8365,
              DOI 10.17487/RFC8365, March 2018.

Appendix A.  Acknowledgments for This Document (2020)

Appendix B.  Contributors

   In addition to the authors listed on the front page, the following
   co-authors have also contributed to this document:

   Luc Andre Cisco

Appendix C.  Acknowledgments from the First Edition (2015)

   Special thanks to Yakov Rekhter for reviewing this document several
   times and providing valuable comments, and for his very engaging
   discussions on several topics of this document that helped shape this
   document.  We would also like to thank Pedro Marques, Kaushik Ghosh,
   Nischal Sheth, Robert Raszuk, Amit Shukla, and Nadeem Mohammed for
   discussions that helped shape this document.  We would also like to
   thank Han Nguyen for his comments and support of this work.  We would
   also like to thank Steve Kensil and Reshad Rahman for their reviews.
   We would like to thank Jorge Rabadan for his contribution to
   Section 5 of this document.  We would like to thank Thomas Morin for
   his review of this document and his contribution of Section 8.7.
   Many thanks to Jakob Heitz for his help to improve several sections
   of this document.

   We would also like to thank Clarence Filsfils, Dennis Cai, Quaizar
   Vohra, Kireeti Kompella, and Apurva Mehta for their contributions to
   this document.

   Last but not least, special thanks to Giles Heron (our WG chair) for
   his detailed review of this document in preparation for WG Last Call
   and for making many valuable suggestions.

C.1.  Contributors from the First Edition (2015)

   In addition to the authors listed on the front page, the following
   co-authors have also contributed to this document:

   Keyur Patel
   Samer Salam
   Sami Boutros
   Cisco

   Yakov Rekhter
   Ravi Shekhar
   Juniper Networks

   Florin Balus
   Nuage Networks

C.2.  Authors from the First Edition (2015)

   Original Authors:

   Ali Sajassi
   Cisco

   EMail: sajassi@cisco.com

   Rahul Aggarwal
   Arktan

   EMail: raggarwa_1@yahoo.com

   Nabil Bitar
   Verizon Communications

   EMail: nabil.n.bitar@verizon.com

   Aldrin Isaac
   Bloomberg

   EMail: aisaac71@bloomberg.net

   James Uttaro
   AT&T

   EMail: uttaro@att.com

   John Drake
   Juniper Networks

   EMail: jdrake@juniper.net

   Wim Henderickx
   Alcatel-Lucent

   EMail: wim.henderickx@alcatel-lucent.com

Authors' Addresses

   Ali Sajassi (editor)
   Cisco

   Email: sajassi@cisco.com

   John Drake
   Juniper

   Email: jdrake@juniper.net

   Jorge Rabadan
   Nokia

   Email: jorge.rabadan@nokia.com