idnits 2.17.1 draft-ietf-l2vpn-evpn-00.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack a Security Considerations section. ** The document seems to lack an IANA Considerations section. (See Section 2.2 of https://www.ietf.org/id-info/checklist for how to handle the case when there are no actions for IANA.) ** The document seems to lack separate sections for Informative/Normative References. All references will be assumed normative when checking for downward references. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year -- The document date (February 24, 2012) is 4443 days in the past. Is this intentional? 
Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Missing Reference: 'RFC2119' is mentioned on line 142, but not defined == Missing Reference: 'BGP-VPLS-MH' is mentioned on line 252, but not defined == Missing Reference: 'RFC4271' is mentioned on line 318, but not defined == Missing Reference: 'RFC4760' is mentioned on line 327, but not defined == Missing Reference: 'MPLS-ENCAPS' is mentioned on line 749, but not defined == Missing Reference: 'BGP MVPN' is mentioned on line 1002, but not defined == Missing Reference: 'MLDP' is mentioned on line 1526, but not defined == Unused Reference: 'RFC4761' is defined on line 1731, but no explicit reference was found in the text == Unused Reference: 'RFC4762' is defined on line 1735, but no explicit reference was found in the text == Unused Reference: 'VPLS-MULTIHOMING' is defined on line 1739, but no explicit reference was found in the text == Unused Reference: 'PIM-SNOOPING' is defined on line 1743, but no explicit reference was found in the text == Unused Reference: 'IGMP-SNOOPING' is defined on line 1746, but no explicit reference was found in the text == Outdated reference: A later version (-01) exists of draft-sajassi-raggarwa-l2vpn-evpn-req-00 -- Possible downref: Normative reference to a draft: ref. 'E-VPN-REQ' == Outdated reference: A later version (-16) exists of draft-ietf-l2vpn-vpls-mcast-04 == Outdated reference: A later version (-07) exists of draft-ietf-l2vpn-vpls-multihoming-00 == Outdated reference: A later version (-07) exists of draft-ietf-l2vpn-vpls-pim-snooping-01 ** Downref: Normative reference to an Informational draft: draft-ietf-l2vpn-vpls-pim-snooping (ref. 'PIM-SNOOPING') ** Downref: Normative reference to an Informational RFC: RFC 4541 (ref. 
'IGMP-SNOOPING') Summary: 5 errors (**), 0 flaws (~~), 17 warnings (==), 2 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Network Working Group R. Aggarwal 3 INTERNET-DRAFT Arktan 4 Category: Standards Track 5 Expires: August 25, 2012 A. Sajassi 6 Cisco 8 J. Uttaro W. Henderickx 9 AT&T Alcatel-Lucent 11 A. Isaac N. Bitar 12 Bloomberg Verizon 14 F. Balus R. Shekhar 15 Alcatel-Lucent J. Drake 16 Juniper Networks 17 S. Boutros 18 K. Patel 19 Cisco February 24, 2012 21 BGP MPLS Based Ethernet VPN 22 draft-ietf-l2vpn-evpn-00 24 Status of this Memo 26 This Internet-Draft is submitted to IETF in full conformance with the 27 provisions of BCP 78 and BCP 79. 29 Internet-Drafts are working documents of the Internet Engineering 30 Task Force (IETF), its areas, and its working groups. Note that 31 other groups may also distribute working documents as 32 Internet-Drafts. 34 Internet-Drafts are draft documents valid for a maximum of six months 35 and may be updated, replaced, or obsoleted by other documents at any 36 time. It is inappropriate to use Internet-Drafts as reference 37 material or to cite them other than as "work in progress." 39 The list of current Internet-Drafts can be accessed at 40 http://www.ietf.org/1id-abstracts.html 42 The list of Internet-Draft Shadow Directories can be accessed at 43 http://www.ietf.org/shadow.html 45 Copyright and License Notice 47 Copyright (c) 2012 IETF Trust and the persons identified as the 48 document authors. All rights reserved. 50 This document is subject to BCP 78 and the IETF Trust's Legal 51 Provisions Relating to IETF Documents 52 (http://trustee.ietf.org/license-info) in effect on the date of 53 publication of this document. Please review these documents 54 carefully, as they describe your rights and restrictions with respect 55 to this document. 
Code Components extracted from this document must 56 include Simplified BSD License text as described in Section 4.e of 57 the Trust Legal Provisions and are provided without warranty as 58 described in the Simplified BSD License. 60 Abstract 62 This document describes procedures for BGP MPLS based Ethernet VPNs 63 (E-VPN). 65 Table of Contents 67 1. Specification of requirements . . . . . . . . . . . . . . . . . 4 68 2. Contributors . . . . . . . . . . . . . . . . . . . . . . . . . 4 69 3. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 70 4. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . . 4 71 5. BGP MPLS Based E-VPN Overview . . . . . . . . . . . . . . . . . 4 72 6. Ethernet Segment Identifier . . . . . . . . . . . . . . . . . . 6 73 7. BGP E-VPN NLRI . . . . . . . . . . . . . . . . . . . . . . . . 7 74 7.1. Ethernet Auto-Discovery Route . . . . . . . . . . . . . . . 8 75 7.2. MAC Advertisement Route . . . . . . . . . . . . . . . . . 8 76 7.3. Inclusive Multicast Ethernet Tag Route . . . . . . . . . . 9 77 8. ESI MPLS Label Extended Community . . . . . . . . . . . . . . . 9 78 9. Auto-Discovery . . . . . . . . . . . . . . . . . . . . . . . . 9 79 10. Auto-Discovery of Ethernet Tags on Ethernet Segments . . . . . 10 80 10.1. Constructing the Ethernet A-D Route . . . . . . . . . . . 10 81 10.1.1. Ethernet A-D Route per E-VPN . . . . . . . . . . . . . 11 82 10.1.1.1. Ethernet A-D Route Targets . . . . . . . . . . . . 12 83 10.1.2. Ethernet A-D Route per Ethernet Segment . . . . . . . 12 84 10.1.2.1. Ethernet A-D Route Targets . . . . . . . . . . . . 13 85 10.2. Motivations for Ethernet A-D Route per Ethernet Segment . 13 86 10.2.1. Multi-Homing . . . . . . . . . . . . . . . . . . . . . 14 87 10.2.2. Optimizing Control Plane Convergence . . . . . . . . . 14 88 10.2.3. Reducing Number of Ethernet A-D Routes . . . . . . . . 14 89 11. Determining Reachability to Unicast MAC Addresses . . . . . . 14 90 11.1. Local Learning . . . . . . . 
. . . . . . . . . . . . . . . 15 91 11.2. Remote learning . . . . . . . . . . . . . . . . . . . . . 15 92 11.2.1. Constructing the BGP E-VPN MAC Address Advertisement . 15 93 12. Optimizing ARP . . . . . . . . . . . . . . . . . . . . . . . . 17 94 13. Designated Forwarder Election . . . . . . . . . . . . . . . . 18 95 13.1. DF Election Performed by All MESes . . . . . . . . . . . . 19 96 13.2. DF Election Performed Only on Multi-Homed MESes . . . . . 20 97 14. Handling of Multi-Destination Traffic . . . . . . . . . . . . 21 98 14.1. Construction of the Inclusive Multicast Ethernet Tag 99 Route . . . . . . . . . . . . . . . . . . . . . . . . . . 21 100 14.2. P-Tunnel Identification . . . . . . . . . . . . . . . . . 22 101 14.3. Ethernet Segment Identifier and Ethernet Tag . . . . . . . 22 102 15. Processing of Unknown Unicast Packets . . . . . . . . . . . . 23 103 15.1. Ingress Replication . . . . . . . . . . . . . . . . . . . 24 104 15.2. P2MP MPLS LSPs . . . . . . . . . . . . . . . . . . . . . . 24 105 16. Forwarding Unicast Packets . . . . . . . . . . . . . . . . . . 24 106 16.1. Forwarding packets received from a CE . . . . . . . . . . 24 107 16.2. Forwarding packets received from a remote MES . . . . . . 25 108 16.2.1. Unknown Unicast Forwarding . . . . . . . . . . . . . . 25 109 16.2.2. Known Unicast Forwarding . . . . . . . . . . . . . . . 26 110 17. Split Horizon . . . . . . . . . . . . . . . . . . . . . . . . 26 111 17.1. ESI MPLS Label: Ingress Replication . . . . . . . . . . . 26 112 17.2. ESI MPLS Label: P2MP MPLS LSPs . . . . . . . . . . . . . . 27 113 17.3. ESI MPLS Label: MP2MP LSPs . . . . . . . . . . . . . . . . 28 114 18. Load Balancing of Unicast Packets . . . . . . . . . . . . . . 28 115 18.1. Load balancing of traffic from an MES to remote CEs . . . 28 116 18.2. Load balancing of traffic between an MES and a local CE . 30 117 18.2.1. Data plane learning . . . . . . . . . . . . . . . . . 31 118 18.2.2. Control plane learning . . . . . . . . . . . 
. . . . . 31 119 19. MAC Moves . . . . . . . . . . . . . . . . . . . . . . . . . . 31 120 20. Multicast . . . . . . . . . . . . . . . . . . . . . . . . . . 32 121 20.1. Ingress Replication . . . . . . . . . . . . . . . . . . . 32 122 20.2. P2MP LSPs . . . . . . . . . . . . . . . . . . . . . . . . 32 123 20.3. MP2MP LSPs . . . . . . . . . . . . . . . . . . . . . . . . 32 124 20.3.1. Inclusive Trees . . . . . . . . . . . . . . . . . . . 33 125 20.3.2. Selective Trees . . . . . . . . . . . . . . . . . . . 33 126 20.4. Explicit Tracking . . . . . . . . . . . . . . . . . . . . 34 127 21. Convergence . . . . . . . . . . . . . . . . . . . . . . . . . 34 128 21.1. Transit Link and Node Failures between MESes . . . . . . . 34 129 21.2. MES Failures . . . . . . . . . . . . . . . . . . . . . . . 34 130 21.2.1. Local Repair . . . . . . . . . . . . . . . . . . . . . 35 131 21.3. MES to CE Network Failures . . . . . . . . . . . . . . . . 35 132 22. LACP State Synchronization . . . . . . . . . . . . . . . . . . 35 133 23. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 36 134 24. References . . . . . . . . . . . . . . . . . . . . . . . . . . 37 135 25. Author's Address . . . . . . . . . . . . . . . . . . . . . . . 37 137 1. Specification of requirements 139 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 140 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 141 document are to be interpreted as described in [RFC2119]. 143 2. Contributors 145 In addition to the authors listed above, the following individuals 146 also contributed to this document: 147 Quaizar Vohra 148 Kireeti Kompella 149 Apurva Mehta 150 Juniper Networks 152 Samer Salam 153 Cisco 155 3. Introduction 157 This document describes procedures for BGP MPLS based Ethernet VPNs 158 (E-VPN). The procedures described here are intended to meet the 159 requirements specified in [E-VPN-REQ]. Please refer to [E-VPN-REQ] 160 for the detailed requirements and motivation. 
162 This document proposes an MPLS based technology, referred to as MPLS-
163 based E-VPN (E-VPN). E-VPN requires extensions to existing IP/MPLS
164 protocols as described in this document. In addition to these
165 extensions, E-VPN uses several building blocks from existing MPLS
166 technologies.

168 4. Terminology

170 CE: Customer Edge device, e.g., a host, router, or switch
171 MES: MPLS Edge Switch
172 EVI: E-VPN Instance
173 ESI: Ethernet Segment Identifier
174 LACP: Link Aggregation Control Protocol
175 MP2MP: Multipoint to Multipoint
176 P2MP: Point to Multipoint
177 P2P: Point to Point

179 5. BGP MPLS Based E-VPN Overview

181 This section provides an overview of E-VPN.

183 An E-VPN comprises CEs that are connected to PEs, or MPLS Edge
184 Switches (MES), that form the edge of the MPLS infrastructure. A CE
185 may be a host, a router or a switch. The MPLS Edge Switches provide
186 layer 2 virtual bridge connectivity between the CEs. There may be
187 multiple E-VPNs in the provider's network. An E-VPN routing and
188 forwarding instance on an MES is referred to as an E-VPN Instance
189 (EVI).

191 The MESes may be connected by an MPLS LSP infrastructure which
192 provides the benefits of MPLS LSP technology such as fast-reroute,
193 resiliency, etc. The MESes may also be connected by an IP
194 infrastructure in which case IP/GRE tunneling is used between the
195 MESes. The detailed procedures in this version of this document are
196 specified only for MPLS LSPs as the tunneling technology. However,
197 these procedures are designed to be extensible to IP/GRE as the
198 tunneling technology.

200 In an E-VPN, MAC learning between MESes occurs not in the data plane
201 (as happens with traditional bridging) but in the control plane.
202 Control plane learning offers greater control over the MAC learning
203 process, such as restricting who learns what, and the ability to
204 apply policies.
Furthermore, the control plane chosen for
205 advertising MAC reachability information is multi-protocol (MP) BGP
206 (very similar to IP VPNs (RFC 4364)), providing greater scale, and
207 the ability to preserve the "virtualization" or isolation of groups
208 of interacting agents (hosts, servers, Virtual Machines) from each
209 other. In E-VPNs, MESes advertise the MAC addresses learned from the
210 CEs that are connected to them, along with an MPLS label, to other
211 MESes in the control plane using MP-BGP. Control plane learning
212 enables load balancing of traffic to and from CEs that are multi-
213 homed to multiple MESes. This is in addition to load balancing across
214 the MPLS core via multiple LSPs between the same pair of MESes. In
215 other words, it allows CEs to connect to multiple active points of
216 attachment. It also improves convergence times in the event of
217 certain network failures.

219 However, learning between MESes and CEs is done by the method best
220 suited to the CE: data plane learning, IEEE 802.1x, LLDP, 802.1aq,
221 ARP, management plane or other protocols.

223 It is a local decision as to whether the Layer 2 forwarding table on
224 an MES is populated with all the MAC destinations known to the control
225 plane or whether the MES implements a cache based scheme. For
226 instance, the MAC forwarding table may be populated only with the MAC
227 destinations of the active flows transiting a specific MES.

229 The policy attributes of an E-VPN are very similar to those of an IP
230 VPN. An E-VPN instance requires a Route-Distinguisher (RD) and an E-
231 VPN requires one or more Route-Targets (RTs). A CE attaches to an E-
232 VPN instance (EVI) on an MES, on an Ethernet interface which may be
233 configured for one or more Ethernet Tags, e.g., VLANs. Some
234 deployment scenarios guarantee uniqueness of VLANs across E-VPNs: all
235 points of attachment of a given E-VPN use the same VLAN, and no other
236 E-VPN uses this VLAN.
This document refers to this case as a "Unique
237 Single VLAN E-VPN" and describes simplified procedures to optimize
238 for it.

240 6. Ethernet Segment Identifier

242 If a CE is multi-homed to two or more MESes, the set of Ethernet
243 links constitutes an "Ethernet segment". An Ethernet segment may
244 appear to the CE as a Link Aggregation Group (LAG). Ethernet
245 segments have an identifier, called the "Ethernet Segment Identifier"
246 (ESI), which is encoded as a ten-octet integer. A single-homed CE is
247 considered to be attached to an Ethernet segment with ESI 0.
248 Otherwise, an Ethernet segment MUST have a unique non-zero ESI. The
249 ESI can be assigned using various mechanisms:
250 1. The ESI may be configured. For instance, when E-VPNs are used to
251 provide a VPLS service, the ESI is fairly analogous to the Multi-
252 homing site ID in [BGP-VPLS-MH].

254 2. If IEEE 802.1AX LACP is used between the MESes and CEs, then
255 the ESI is determined from LACP by concatenating the following
256 parameters:

258 + CE LACP System Identifier comprised of two bytes of System
259 Priority and six bytes of System MAC address, where the System
260 Priority is encoded in the most significant two bytes. The CE
261 LACP identifier MUST be encoded in the high order eight bytes
262 of the ESI.

264 + CE LACP two byte Port Key. The CE LACP port key MUST be
265 encoded in the low order two bytes of the ESI.

267 As far as the CE is concerned, it treats the multiple MESes
268 that it is connected to as the same switch. This allows the CE
269 to aggregate links that are attached to different MESes in the
270 same bundle.

272 3. If LLDP is used between the MESes and CEs that are hosts, then
273 the ESI is determined by LLDP. The ESI will be specified in a
274 following version.

276 4.
In the case of indirectly connected hosts via a bridged LAN 277 between the CEs and the MESes, the ESI is determined based on the 278 Layer 2 bridge protocol as follows: If STP is used in the bridged 279 LAN then the value of the ESI is derived by listening to BPDUs on 280 the Ethernet segment. To achieve this the MES is not required to 281 run STP. However the MES must learn the Switch ID, MSTP ID and 282 Root Bridge ID by listening to STP BPDUs. The ESI is constructed 283 as follows: 285 {Switch ID (6 bits), MSTP ID (6 bits), Root Bridge ID (48 bits)} 287 7. BGP E-VPN NLRI 289 This document defines a new BGP NLRI, called the E-VPN NLRI. 291 Following is the format of the E-VPN NLRI: 293 +-----------------------------------+ 294 | Route Type (1 octet) | 295 +-----------------------------------+ 296 | Length (1 octet) | 297 +-----------------------------------+ 298 | Route Type specific (variable) | 299 +-----------------------------------+ 301 The Route Type field defines encoding of the rest of E-VPN NLRI 302 (Route Type specific E-VPN NLRI). 304 The Length field indicates the length in octets of the Route Type 305 specific field of E-VPN NLRI. 307 This document defines the following Route Types: 309 + 1 - Ethernet Auto-Discovery (A-D) route 310 + 2 - MAC advertisement route 311 + 3 - Inclusive Multicast Route 312 + 5 - Selective Multicast Auto-Discovery (A-D) Route 313 + 6 - Leaf Auto-Discovery (A-D) Route 315 The detailed encoding and procedures for these route types are 316 described in subsequent sections. 318 The E-VPN NLRI is carried in BGP [RFC4271] using BGP Multiprotocol 319 Extensions [RFC4760] with an AFI of TBD and an SAFI of E-VPN (To be 320 assigned by IANA). The NLRI field in the 321 MP_REACH_NLRI/MP_UNREACH_NLRI attribute contains the E-VPN NLRI 322 (encoded as specified above). 
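The Route Type, Length and Route Type specific layout described above can be sketched in Python as follows. This is a non-normative illustration only: the names encode_evpn_nlri and decode_evpn_nlri and the ROUTE_TYPE_* constants are hypothetical and are not defined by this document, and the route type specific bytes are treated as opaque.

```python
import struct

# Route Type values taken from the list in this section.
ROUTE_TYPE_ETHERNET_AD = 1
ROUTE_TYPE_MAC_ADVERTISEMENT = 2
ROUTE_TYPE_INCLUSIVE_MULTICAST = 3


def encode_evpn_nlri(route_type: int, route_specific: bytes) -> bytes:
    """Prefix the route type specific bytes with Route Type and Length (1 octet each)."""
    if not 0 <= route_type <= 0xFF:
        raise ValueError("Route Type must fit in one octet")
    if len(route_specific) > 0xFF:
        raise ValueError("Route Type specific field exceeds 255 octets")
    return struct.pack("!BB", route_type, len(route_specific)) + route_specific


def decode_evpn_nlri(data: bytes):
    """Return (route_type, route_specific, remaining_bytes) for the first NLRI in data."""
    if len(data) < 2:
        raise ValueError("truncated E-VPN NLRI header")
    route_type, length = struct.unpack("!BB", data[:2])
    body = data[2:2 + length]
    if len(body) != length:
        raise ValueError("truncated Route Type specific field")
    return route_type, body, data[2 + length:]
```

Because the Length field covers only the Route Type specific part, a receiver in this sketch can skip a route type it does not understand and continue with the bytes returned as remaining_bytes.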
324 In order for two BGP speakers to exchange labeled E-VPN NLRI, they
325 must use BGP Capabilities Advertisement to ensure that they both are
326 capable of properly processing such NLRI. This is done as specified
327 in [RFC4760], by using capability code 1 (multiprotocol BGP) with an
328 AFI of TBD and an SAFI of E-VPN.

330 7.1. Ethernet Auto-Discovery Route

332 An Ethernet A-D route type specific E-VPN NLRI consists of the
333 following:

335 +---------------------------------------+
336 |             RD (8 octets)             |
337 +---------------------------------------+
338 |Ethernet Segment Identifier (10 octets)|
339 +---------------------------------------+
340 |      Ethernet Tag ID (4 octets)       |
341 +---------------------------------------+
342 |         MPLS Label (3 octets)         |
343 +---------------------------------------+

345 For procedures and usage of this route please see the sections on
346 "Auto-Discovery of Ethernet Tags on Ethernet Segments", "Designated
347 Forwarder Election" and "Load Balancing of Unicast Packets".

349 7.2. MAC Advertisement Route

351 A MAC advertisement route type specific E-VPN NLRI consists of the
352 following:

354 +---------------------------------------+
355 |             RD (8 octets)             |
356 +---------------------------------------+
357 |Ethernet Segment Identifier (10 octets)|
358 +---------------------------------------+
359 |      Ethernet Tag ID (4 octets)       |
360 +---------------------------------------+
361 |     MAC Address Length (1 octet)      |
362 +---------------------------------------+
363 |        MAC Address (6 octets)         |
364 +---------------------------------------+
365 |      IP Address Length (1 octet)      |
366 +---------------------------------------+
367 |      IP Address (4 or 16 octets)      |
368 +---------------------------------------+
369 |       MPLS Label (n * 3 octets)       |
370 +---------------------------------------+

372 For procedures and usage of this route please see the sections on
373 "Determining Reachability to Unicast MAC Addresses" and "Load
374 Balancing of Unicast Packets".

376 7.3.
Inclusive Multicast Ethernet Tag Route

378 An Inclusive Multicast Ethernet Tag route type specific E-VPN NLRI
379 consists of the following:

381 +---------------------------------------+
382 |             RD (8 octets)             |
383 +---------------------------------------+
384 |Ethernet Segment Identifier (10 octets)|
385 +---------------------------------------+
386 |      Ethernet Tag ID (4 octets)       |
387 +---------------------------------------+
388 |     Originating Router's IP Addr      |
389 |           (4 or 16 octets)            |
390 +---------------------------------------+

392 For procedures and usage of this route please see the sections on
393 "Handling of Multi-Destination Traffic", "Processing of Unknown
394 Unicast Packets" and "Multicast".

396 8. ESI MPLS Label Extended Community

398 This extended community is a new transitive extended community. It
399 may be advertised along with Ethernet Auto-Discovery routes. When
400 used, it carries properties associated with the ESI. Specifically, it
401 enables split horizon procedures for multi-homed sites. The
402 procedures for using this Extended Community are described in
403 the following sections.

405 Each ESI MPLS Label Extended Community is encoded as an 8-octet value
406 as follows:

408  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
409 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
410 |     0x44      |   Sub-Type    | Flags (One Octet) |Reserved=0 |
411 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
412 | Reserved = 0|                 ESI MPLS label                  |
413 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

415 The low order bit of the flags octet is defined as the "Active-
416 Standby" bit and may be set to 1. The other bits must be set to 0.

418 9. Auto-Discovery

420 E-VPN requires the following types of auto-discovery procedures:
421 + E-VPN Auto-Discovery, which allows an MES to discover the other
422 MESes in the E-VPN. Each MES advertises one or more "Inclusive
423 Multicast Ethernet Tag Routes".
The procedures for advertising these
424 routes are described in the section on "Handling of Multi-
425 Destination Traffic".

427 + Auto-Discovery of Ethernet Tags on Ethernet Segments, in a
428 particular E-VPN. The procedures are described in section
429 "Auto-Discovery of Ethernet Tags on Ethernet Segments".

431 + Ethernet Segment Auto-Discovery used for auto-discovery of MESes
432 that are multi-homed to the same Ethernet segment. The
433 procedures are described in section "Auto-Discovery of Ethernet
434 Tags on Ethernet Segments".

436 10. Auto-Discovery of Ethernet Tags on Ethernet Segments

438 If a CE is multi-homed to two or more MESes on a particular Ethernet
439 segment, each MES MUST advertise, to other MESes in the E-VPN, the
440 information about the Ethernet Tags that are associated with that
441 Ethernet segment. An Ethernet Tag identifies a particular broadcast
442 domain. An example of an Ethernet Tag is a VLAN ID. The MES MAY
443 advertise each Ethernet Tag associated with the Ethernet Segment, or
444 it may advertise a wildcard to cover all the Ethernet Tags enabled on
445 the segment. If a CE is single-homed, then the MES that it is
446 attached to MAY advertise the information about Ethernet Tags
447 (e.g., VLANs) on the Ethernet segment connected to the CE.

449 The information about an Ethernet Tag on a particular Ethernet
450 segment is advertised using an "Ethernet Auto-Discovery route
451 (Ethernet A-D route)". This route is advertised using the E-VPN NLRI.

453 The Ethernet Tag Auto-discovery information SHOULD be used to enable
454 active-active load-balancing among MESes as described in section
455 "Load Balancing of Unicast Packets". In the case of a multi-homed CE,
456 this route MUST also carry the "ESI MPLS Label Extended Community" to
457 enable split horizon as described in section "Split Horizon". Also,
458 the route can be used for Designated Forwarder (DF) election as
459 described in section "Designated Forwarder Election".
Further, it MAY
460 be used to optimize the withdrawal of MAC addresses upon failure as
461 described in section "Convergence".

463 This section describes procedures for advertising one or more
464 Ethernet A-D routes per Ethernet tag per E-VPN. We refer to this as
465 "Ethernet A-D route per E-VPN". This section also describes
466 procedures to advertise and withdraw a single Ethernet A-D route per
467 Ethernet Segment. We refer to this as "Ethernet A-D route per
468 Ethernet Segment".

470 10.1. Constructing the Ethernet A-D Route
471 The format of the Ethernet A-D NLRI is specified in section "BGP E-
472 VPN NLRI".

474 10.1.1. Ethernet A-D Route per E-VPN

476 This section describes procedures to construct the Ethernet A-D route
477 when one or more such routes are advertised by an MES for a given E-
478 VPN instance.

480 Route-Distinguisher (RD) MUST be set to the RD of the E-VPN instance
481 that is advertising the NLRI. An RD MUST be assigned for a given E-VPN
482 instance on an MES. This RD MUST be unique across all E-VPN instances
483 on an MES. It is RECOMMENDED to use the Type 1 RD [RFC4364]. The
484 value field comprises an IP address of the MES (typically, the
485 loopback address) followed by a number unique to the MES. This
486 number may be generated by the MES. Or in the Unique Single VLAN E-
487 VPN case, the low order 12 bits may be the 12 bit VLAN ID, with the
488 remaining high order 4 bits set to 0.

490 Ethernet Segment Identifier MAY be set to 0. When it is not zero, the
491 Ethernet Segment Identifier MUST be a ten octet entity as described
492 in section "Ethernet Segment Identifier".

494 The Ethernet Tag ID is the identifier of an Ethernet Tag on the
495 Ethernet segment. This value may be a 12 bit VLAN ID, in which case
496 the low order 12 bits are set to the VLAN ID and the high order 20
497 bits are set to 0. Or it may be another Ethernet Tag used by the E-
498 VPN. It MAY be set to the default Ethernet Tag on the Ethernet
499 segment or 0.
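The RD and Ethernet Tag ID encodings described above can be sketched in Python as follows. This is a non-normative illustration: the function names build_type1_rd and ethernet_tag_id_from_vlan are hypothetical, the address 192.0.2.1 and VLAN ID 100 are example values, and the Type 1 RD layout (a two-octet type field of 1, a four-octet IPv4 address and a two-octet number) is taken from [RFC4364]. The sketch assumes the Unique Single VLAN E-VPN case, where the VLAN ID is reused as the RD number.

```python
import ipaddress
import struct


def build_type1_rd(mes_ipv4: str, number: int) -> bytes:
    """8-octet Type 1 RD: 2-octet type (1), 4-octet IPv4 address, 2-octet number."""
    if not 0 <= number <= 0xFFFF:
        raise ValueError("RD number must fit in 16 bits")
    address = ipaddress.IPv4Address(mes_ipv4).packed
    return struct.pack("!H4sH", 1, address, number)


def ethernet_tag_id_from_vlan(vlan_id: int) -> bytes:
    """4-octet Ethernet Tag ID: VLAN ID in the low order 12 bits, high order 20 bits zero."""
    if not 0 <= vlan_id <= 0xFFF:
        raise ValueError("VLAN ID must fit in 12 bits")
    return struct.pack("!I", vlan_id)


# Unique Single VLAN E-VPN example: the 12 bit VLAN ID occupies the low order
# bits of the RD number, and the remaining high order 4 bits stay 0.
rd = build_type1_rd("192.0.2.1", 100)
tag = ethernet_tag_id_from_vlan(100)
print(rd.hex(), tag.hex())
```

Range checks are used instead of masking so that an out-of-range VLAN ID is rejected rather than silently truncated.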
501 Note that the above allows the Ethernet A-D route to be advertised
502 with one of the following granularities:

504 + One Ethernet A-D route for a given {ESI, Ethernet Tag ID} tuple
505 per E-VPN

507 + One Ethernet A-D route for a given {Ethernet Tag ID} in a given
508 E-VPN, for all associated Ethernet segments, where the ESI is
509 set to 0.

511 + One Ethernet A-D route for the E-VPN where both ESI and Ethernet
512 Tag ID are set to 0.

514 E-VPN supports both the non-qualified and qualified learning models.
515 When non-qualified learning is used, the Ethernet Tag Identifier
516 specified in this section and in other places in this document MUST
517 be set to the default Ethernet Tag, e.g., VLAN ID. When qualified
518 learning is used, and the Ethernet Tags between MESes and CEs in the
519 E-VPN are consistently assigned for a given broadcast domain, the
520 Ethernet Tag Identifier MUST be set to the Ethernet Tag, e.g., VLAN
521 ID, for the concerned broadcast domain between the advertising MES and
522 the CE. When qualified learning is used, and the Ethernet Tags,
523 e.g., VLAN IDs, between MESes and CEs in the E-VPN are not
524 consistently assigned for a given broadcast domain, the Ethernet Tag
525 Identifier, e.g., VLAN ID, MUST be set to a common E-VPN provider
526 assigned tag that maps locally on the advertising MES to an Ethernet
527 broadcast domain identifier such as a VLAN ID. The usage of the MPLS
528 label is described in the section on "Load Balancing of Unicast Packets".

530 The Next Hop field of the MP_REACH_NLRI attribute of the route MUST
531 be set to the IPv4 or IPv6 address of the advertising MES.

533 10.1.1.1. Ethernet A-D Route Targets

535 The Ethernet A-D route MUST carry one or more Route Target (RT)
536 attributes. RTs may be configured (as in IP VPNs), or may be derived
537 automatically.

539 If an MES uses Route Target Constrain [RT-CONSTRAIN], the MES SHOULD
540 advertise all such RTs using Route Target Constrains.
The use of RT 541 Constrains allows each Ethernet A-D route to reach only those MESes 542 that are configured to import at least one RT from the set of RTs 543 carried in the Ethernet A-D route. 545 10.1.1.1.1. Auto-Derivation from the Ethernet Tag ID 547 The following is the procedure for deriving the RT attribute 548 automatically from the Ethernet Tag ID associated with the 549 advertisement: 551 + The Global Administrator field of the RT MUST 552 be set to the Autonomous System (AS) number that the MES 553 belongs to. 555 + The Local Administrator field of the RT contains a 4 556 octets long number that encodes the Ethernet Tag-ID. If the 557 Ethernet Tag-ID is a two octet VLAN ID then it MUST be 558 encoded in the lower two octets of the Local Administrator 559 field and the higher two octets MUST be set to zero. 561 For the "Unique Single VLAN E-VPN" this results in auto-deriving the 562 RT from the Ethernet Tag, e.g., VLAN ID for that E-VPN. 564 10.1.2. Ethernet A-D Route per Ethernet Segment 566 This section describes procedures to construct the Ethernet A-D route 567 when a single such route is advertised by an MES for a given Ethernet 568 Segment. 570 Route-Distinguisher (RD) MUST be a Type 1 RD [RFC4364]. The value 571 field comprises an IP address of the MES (typically, the loopback 572 address) followed by 0. The reason for such encoding is that the RD 573 cannot be that of a given E-VPN since the ESI can span across one or 574 more E-VPNs. 576 Ethernet Segment Identifier MUST be a non-zero ten octet entity as 577 described in section "Ethernet Segment Identifier". 579 The Ethernet Tag ID MUST be set to 0. 581 If the Ethernet Segment is connected to more than one MES then the 582 "ESI MPLS Label Extended Community" MUST be included in the route. 
If 583 the Ethernet Segment is connected to more than one MES and active- 584 active multi-homing is desired then the MPLS label in the ESI MPLS 585 Label Extended Community MUST be set to a valid MPLS label value. The 586 MPLS label in this Extended Community is referred to as an "ESI 587 label". This label MUST be a downstream assigned MPLS label if the 588 advertising MES is using ingress replication for receiving multicast, 589 broadcast or unknown unicast traffic, from other MESes. If the 590 advertising MES is using P2MP MPLS LSPs for sending multicast, 591 broadcast or unknown unicast traffic, then this label MUST be an 592 upstream assigned MPLS label. The usage of this label is described in 593 section "Split Horizon". 595 If the Ethernet Segment is connected to more than one MES and active- 596 standby multi-homing is desired then the "Active-Standby" bit in the 597 flags of the ESI MPLS Label Extended Community MUST be set to 1. 599 If the per Ethernet Segment Ethernet A-D route is used in conjunction 600 with the per {ESI, VLAN} Ethernet A-D route, for reasons described 601 below, then the MPLS label in the NLRI MUST be set to 0. 603 10.1.2.1. Ethernet A-D Route Targets 605 The Ethernet A-D route MUST carry one or more Route Target (RT) 606 attributes. These RTs MUST be the set of RTs associated with all the 607 E-VPN instances to which the Ethernet Segment, corresponding to the 608 Ethernet A-D route, belongs. 610 10.2. Motivations for Ethernet A-D Route per Ethernet Segment 612 This section describes various scenarios in which the Ethernet A-D 613 route should be advertised per Ethernet Segment. 615 10.2.1. Multi-Homing 617 The per Ethernet Segment Ethernet A-D route MUST be advertised when 618 the Ethernet Segment is multi-homed. This allows Multi-Homed Ethernet 619 Segment Auto-Discovery. It allows the set of MESes connected to the 620 same customer site i.e., CE, to discover each other automatically 621 with minimal to no configuration. 
It also allows other MESes that have at least one E-VPN in common with the multi-homed Ethernet Segment to discover the properties of the multi-homed Ethernet Segment.

For active-active multi-homing this route is required for split horizon procedures as described in section "Split Horizon" and MUST carry the ESI MPLS Label Extended Community with a valid ESI MPLS label. For active-standby multi-homing this route is required to indicate that active-standby multi-homing and not active-active multi-homing is desired.

This route will be enhanced to carry LAG specific information such as LACP parameters, which will be encoded as new BGP attributes or communities, in the future. Note that this information will be propagated to all MESes that have one or more sites in the VLANs connected to the Ethernet Segment. All the MESes other than the ones that are connected to the Ethernet Segment will discard this information.

10.2.2.  Optimizing Control Plane Convergence

Ethernet A-D route per Ethernet Segment should be advertised when it is desired to optimize the control plane convergence of the withdrawal of the Ethernet A-D routes. If this is done then when an Ethernet segment fails, the single Ethernet A-D route corresponding to the segment can be withdrawn first. This allows all MESes that receive this withdrawal to invalidate the MAC routes learned from the Ethernet segment.

Note that the Ethernet A-D route per Ethernet Segment, when used to optimize control plane convergence, MAY be advertised in addition to the Ethernet Tag A-D routes per E-VPN or MAY be advertised on its own.

10.2.3.  Reducing Number of Ethernet A-D Routes

In certain scenarios advertising Ethernet A-D routes per Ethernet segment, instead of per E-VPN, may reduce the number of Ethernet A-D routes in the network.
In these scenarios Ethernet A-D routes may be advertised per Ethernet segment instead of per E-VPN.

11.  Determining Reachability to Unicast MAC Addresses

MESes forward packets that they receive based on the destination MAC address. This implies that MESes must be able to learn how to reach a given destination unicast MAC address.

There are two components to MAC address learning, "local learning" and "remote learning":

11.1.  Local Learning

A particular MES must be able to learn the MAC addresses from the CEs that are connected to it. This is referred to as local learning.

The MESes in a particular E-VPN MUST support local data plane learning using standard IEEE Ethernet learning procedures. An MES must be capable of learning MAC addresses in the data plane when it receives packets such as the following from the CE network:

- DHCP requests

- gratuitous ARP request for its own MAC.

- ARP request for a peer.

Alternatively MESes MAY learn the MAC addresses of the CEs in the control plane or via management plane integration between the MESes and the CEs.

There are applications where a MAC address that is reachable via a given MES on a locally attached Segment (e.g. with ESI X) may move such that it becomes reachable via the same MES or another MES on another Segment (e.g. with ESI Y). This is referred to as a "MAC Move". Procedures to support this are described in section "MAC Moves".

11.2.  Remote learning

A particular MES must be able to determine how to send traffic to MAC addresses that belong to or are behind CEs connected to other MESes, i.e. to remote CEs or hosts behind remote CEs. We refer to such MAC addresses as "remote" MAC addresses.

This document requires an MES to learn remote MAC addresses in the control plane.
In order to achieve this each MES advertises the MAC addresses it learns from its locally attached CEs in the control plane, to all the other MESes in the E-VPN, using MP-BGP and the MAC address advertisement route.

11.2.1.  Constructing the BGP E-VPN MAC Address Advertisement

BGP is extended to advertise these MAC addresses using the MAC advertisement route type in the E-VPN-NLRI.

The RD MUST be the RD of the E-VPN instance that is advertising the NLRI. The procedures for setting the RD for a given E-VPN are described in section "Ethernet A-D Route per E-VPN".

The Ethernet Segment Identifier is set to the ten octet ESI identifier described in section "Ethernet Segment Identifier".

The Ethernet Tag ID may be zero or may represent a valid Ethernet Tag ID. This field may be non-zero when there are multiple bridge domains in the E-VPN instance (e.g., the MES needs to perform qualified learning for the VLANs in that E-VPN instance).

When the Ethernet Tag ID in the NLRI is set to a non-zero value, for a particular bridge domain, then this Ethernet Tag may either be the Ethernet tag value associated with the CE, e.g., VLAN ID, or it may be the Ethernet Tag Identifier, e.g., VLAN ID assigned by the E-VPN provider and mapped to the CE's Ethernet tag. The latter would be the case if the CE Ethernet tags, e.g., VLAN ID, for a particular bridge domain are different on different CEs.

The MAC address length field is typically set to 48. However this specification enables specifying the MAC address as a prefix, in which case the MAC address length field is set to the length of the prefix. This provides the ability to aggregate MAC addresses if the deployment environment supports that. The encoding of a MAC address MUST be the 6-octet MAC address specified by IEEE 802 documents [802.1D-ORIG] [802.1D-REV].
If the MAC address is advertised as a prefix then the trailing bits of the prefix MUST be set to 0 to ensure that the entire prefix is encoded as 6 octets.

The MPLS Label Length field value is set to the number of octets in the MPLS Label field. The MPLS label field carries one or more labels (corresponding to the stack of labels [MPLS-ENCAPS]). Each label is encoded as 3 octets, where the high-order 20 bits contain the label value, and the low order bit contains "Bottom of Stack" (as defined in [MPLS-ENCAPS]).

The MPLS label stack MUST be the downstream assigned E-VPN MPLS label stack that is used by the MES to forward MPLS encapsulated Ethernet packets received from remote MESes, where the destination MAC address in the Ethernet packet is the MAC address advertised in the above NLRI. The forwarding procedures are specified in sections "Forwarding Unicast Packets" and "Load Balancing of Unicast Packets".

An MES may advertise the same single E-VPN label for all MAC addresses in a given E-VPN instance. This label assignment methodology is referred to as a per EVI label assignment. Alternatively an MES may advertise a unique E-VPN label per <ESI, Ethernet Tag> combination. This label assignment methodology is referred to as a per <ESI, Ethernet Tag> label assignment. Or an MES may advertise a unique E-VPN label per MAC address. All of these methodologies have their tradeoffs.

Per EVI label assignment requires the least number of E-VPN labels, but requires a MAC lookup in addition to an MPLS lookup on an egress MES for forwarding. On the other hand a unique label per <ESI, Ethernet Tag> or a unique label per MAC allows an egress MES to forward a packet that it receives from another MES, to the connected CE, after looking up only the MPLS labels and not having to do a MAC lookup, as well as to insert the appropriate VLAN ID on egress to the CE.

An MES may also advertise more than one label for a given MAC address.
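
The label encoding rule stated above (3 octets per label, label value in the high-order 20 bits, "Bottom of Stack" in the low order bit) can be illustrated with a minimal sketch. It is not normative. It assumes that the three bits between the label value and the Bottom of Stack bit are sent as zero and that only the last label in the stack has the Bottom of Stack bit set. The function names are hypothetical.

```python
def encode_label_stack(labels):
    """Encode a list of 20-bit label values as consecutive 3-octet entries."""
    if not labels:
        raise ValueError("at least one label is required")
    out = bytearray()
    for index, label in enumerate(labels):
        if not 0 <= label < (1 << 20):
            raise ValueError("label value must fit in 20 bits")
        # Assumption: Bottom of Stack is set only on the last label.
        bottom_of_stack = 1 if index == len(labels) - 1 else 0
        # Label value occupies the high-order 20 bits of the 24-bit entry;
        # the three bits above the Bottom of Stack bit are left as zero.
        entry = (label << 4) | bottom_of_stack
        out += entry.to_bytes(3, "big")
    return bytes(out)


def label_length_field(labels):
    """MPLS Label Length: the number of octets in the MPLS Label field."""
    return len(encode_label_stack(labels))


if __name__ == "__main__":
    stack = [16, 17]  # illustrative label values
    print(label_length_field(stack), encode_label_stack(stack).hex())
```

With two labels the MPLS Label Length field would carry 6, and with a single label it would carry 3.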
For instance an MES may advertise two labels, one of which is for the ESI corresponding to the MAC address and the second is for the Ethernet Tag on the ESI that the MAC address is learnt on.

The IP Address field is optional. By default the IP Address length is set to 0 and the IP address is excluded. When a valid IP address is included it is encoded as specified in the section "Optimizing ARP".

The Next Hop field of the MP_REACH_NLRI attribute of the route MUST be set to the IPv4 or IPv6 address of the advertising MES.

The BGP advertisement that advertises the MAC advertisement route MUST also carry one or more Route Target (RT) attributes. RTs may be configured (as in IP VPNs), or may be derived automatically from the Ethernet Tag ID, in the Unique Single VLAN case as described in section "Ethernet A-D Route per E-VPN".

It is to be noted that this document does not require MESes to create forwarding state for remote MACs when they are learnt in the control plane. When this forwarding state is actually created is a local implementation matter.

12.  Optimizing ARP

The IP address field in the MAC advertisement route may optionally carry one of the IP addresses associated with the MAC address. This provides an option which can be used to minimize the flooding of ARP messages to MAC VPN CEs and to MESes. This option also minimizes ARP message processing on MAC VPN CEs. An MES may learn the IP address associated with a MAC address in the control or management plane between the CE and the MES. Or it may learn this binding by snooping certain messages to or from a CE. When an MES learns the IP address associated with a MAC address of a locally connected CE, it may advertise it to other MESes by including it in the MAC route advertisement. The IP Address may be an IPv4 address encoded using four octets or an IPv6 address encoded using sixteen octets.
The IP Address length field MUST be set to 32 for an IPv4 address and 128 for an IPv6 address.

If there are multiple IP addresses associated with a MAC address then multiple MAC advertisement routes MUST be generated, one for each IP address. For instance this may be the case when there is both an IPv4 and an IPv6 address associated with the MAC address. When the IP address is disassociated from the MAC address then the MAC advertisement route with that particular IP address MUST be withdrawn.

When an MES receives an ARP request for an IP address from a CE, and if the MES has the MAC address binding for that IP address, the MES should perform ARP proxy and respond to the ARP request.

Further detailed procedures will be specified in a later version.

13.  Designated Forwarder Election

Consider a CE that is a host or a router that is multi-homed directly to more than one MES in an E-VPN on a given Ethernet segment. One or more Ethernet Tags may be configured on the Ethernet segment. In this scenario only one of the MESes, referred to as the Designated Forwarder (DF), is responsible for certain actions:

- Sending multicast and broadcast traffic, on a given Ethernet Tag on a particular Ethernet segment, to the CE. Note that this behavior, which allows selecting a DF at the granularity of <ESI, Ethernet Tag> for multicast and broadcast traffic, is the default behavior in this specification. Optional mechanisms, which will be specified in the future, will allow selecting a DF at the granularity of .

- Flooding unknown unicast traffic (i.e. traffic for which an MES does not know the destination MAC address), on a given Ethernet Tag on a particular Ethernet segment to the CE, if the environment requires flooding of unknown unicast traffic.

Note that a CE always sends packets belonging to a specific flow using a single link towards an MES.
For instance, if the CE is a host then, as mentioned earlier, the host treats the multiple links that it uses to reach the MESes as a Link Aggregation Group (LAG). The CE employs a local hashing function to map traffic flows onto links in the LAG.

If a bridge network is multi-homed to more than one MES in an E-VPN via switches, then the support of active-active points of attachments as described in this specification requires the bridge network to be connected to two or more MESes using a LAG. In this case the reasons for doing DF election are the same as those described above when a CE is a host or a router.

If a bridge network does not connect to the MESes using LAG, then only one of the links between the switched bridged network and the MESes must be the active link. In this case the per Ethernet Segment Ethernet Tag routes MUST be advertised with the "Active-Standby" flag set to one. Procedures for supporting active-active points of attachments, when a bridge network does not connect to the MESes using LAG, are for further study.

The granularity of the DF election MUST be at least the Ethernet segment via which the CE is multi-homed to the MESes. If the DF election is done at the Ethernet segment granularity then a single MES MUST be elected as the DF on the Ethernet segment.

If there are one or more Ethernet Tags (e.g., VLANs) on the Ethernet segment then the granularity of the DF election SHOULD be the combination of the Ethernet segment and Ethernet Tag on that Ethernet segment. In this case a single MES MUST be elected as the DF for a particular Ethernet Tag on that Ethernet segment.

There are two specified mechanisms for performing DF election.

13.1.  DF Election Performed by All MESes

The MESes perform a designated forwarder (DF) election, for an Ethernet segment or <ESI, Ethernet Tag> combination, using the Ethernet Tag A-D BGP route described in section "Auto-Discovery of Ethernet Tags on Ethernet Segments".

The DF election for a particular ESI or a particular <ESI, Ethernet Tag> combination proceeds as follows. First an MES constructs a candidate list of MESes. This comprises all the Ethernet A-D routes with that particular ESI or <ESI, Ethernet Tag> tuple that an MES imports in an E-VPN instance, including the Ethernet A-D route(s) generated by the MES itself, if any. The DF MES is chosen from this candidate list. Note that DF election is carried out by all the MESes that import the DF route.

The default procedure is to choose as the DF the MES with the highest IP address of all the MESes in the candidate list. This procedure MUST be implemented. It ensures that, except during routing transients, each MES chooses the same DF MES for a given ESI and Ethernet Tag combination.

Other alternative procedures for performing DF election are possible and will be described in the future.

13.2.  DF Election Performed Only on Multi-Homed MESes

As an MES discovers other MESes that are members of the same multi-homed segment, using per Ethernet Segment Ethernet A-D Routes, it starts building an ordered list based on the originating MES IP addresses. This list is used to select a DF and a backup DF (BDF) on a per group of Ethernet Tag basis. For example, the MES with the numerically highest IP address is considered the DF for a given group of VLANs for that Ethernet segment and the next MES in the list is considered the BDF. To that end, the range of Ethernet Tags associated with the CE must be partitioned into disjoint sets.
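
The default election and the ordered-list selection can be illustrated with a minimal sketch. It is not normative. It assumes IPv4 originator addresses, an ordered list sorted from numerically highest to lowest address, and a simple "tag index mod m" rule as one possible way of partitioning the Ethernet Tags into disjoint sets. The function names are hypothetical.

```python
import ipaddress


def ordered_mes_list(originator_ips):
    """Order MES originator addresses from numerically highest to lowest."""
    addrs = {ipaddress.IPv4Address(ip) for ip in originator_ips}
    return sorted(addrs, reverse=True)


def default_df(candidate_ips):
    """Default election: the candidate with the highest IP address is the DF."""
    return str(ordered_mes_list(candidate_ips)[0])


def assign_df_bdf(originator_ips, ethernet_tags):
    """Map each Ethernet Tag to a (DF, BDF) pair.

    Assumption: the tag at index i (tags sorted ascending) is assigned to
    the MES at position i mod m in the ordered list, and the next MES in
    the list (wrapping around) acts as the BDF.
    """
    mes = ordered_mes_list(originator_ips)
    m = len(mes)
    assignment = {}
    for i, tag in enumerate(sorted(ethernet_tags)):
        df = mes[i % m]
        bdf = mes[(i + 1) % m] if m > 1 else None
        assignment[tag] = (str(df), str(bdf) if bdf is not None else None)
    return assignment


if __name__ == "__main__":
    peers = ["10.0.0.1", "10.0.0.3", "10.0.0.2"]  # illustrative addresses
    for tag, roles in assign_df_bdf(peers, [100, 101, 102, 103]).items():
        print(tag, roles)
```

Because every MES that holds the same set of originator addresses computes the same ordered list, each MES can derive the same assignment locally. Addresses are compared numerically rather than as strings, so 10.0.0.10 ranks above 10.0.0.9.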
The size of each set is a function of the total number of CE Ethernet Tags and the total number of MESes that the Ethernet segment is multi-homed to. The DF can employ any distribution function that achieves an even distribution of Ethernet Tags across the MESes that are multi-homed to the Ethernet segment. The DF takes over the Ethernet Tag set of any MES encountering either a node failure or a link/Ethernet segment failure causing that MES to be isolated from the multi-homed segment. In case of a failure that affects the DF, the BDF takes over the DF VLAN set.

It should be noted that once all the MESes participating in an Ethernet segment have the same ordered list for that site, then Ethernet Tag groups can be assigned to each member of that list deterministically without any need to explicitly distribute Ethernet Tags among the member MESes of that list. In other words, the DF election for a group of Ethernet Tags is a local matter and can be done deterministically. As an example, consider that the ordered list consists of m MESes (MES1, MES2, ..., MESm), and there are n Ethernet Tags for that site (V0, V1, V2, ..., Vn-1). Then MES1 and MES2 can be the DF and the BDF respectively for all the Ethernet Tags Vi with (i mod m) = 0, for i from 0 to n-1. MES2 and MES3 can be the DF and the BDF respectively for all the Ethernet Tags Vi with (i mod m) = 1, and so on until the last MES in the ordered list is reached. As a result MESm and MES1 are the DF and the BDF respectively for all the Ethernet Tags Vi with (i mod m) = m-1.

14.  Handling of Multi-Destination Traffic

Procedures are required for a given MES to send broadcast or multicast traffic, received from a CE encapsulated in a given Ethernet Tag in an E-VPN, to all the other MESes that span that Ethernet Tag in the E-VPN.
In certain scenarios, described in section "Processing of Unknown Unicast Packets", a given MES may also need to flood unknown unicast traffic to other MESes.

The MESes in a particular E-VPN may use ingress replication or P2MP LSPs or MP2MP LSPs to send unknown unicast, broadcast or multicast traffic to other MESes.

Each MES MUST advertise an "Inclusive Multicast Ethernet Tag Route" to enable the above. The next section provides procedures to construct the Inclusive Multicast Ethernet Tag route. Subsequent sections describe its usage in further detail.

14.1.  Construction of the Inclusive Multicast Ethernet Tag Route

The RD MUST be the RD of the E-VPN instance that is advertising the NLRI. The procedures for setting the RD for a given E-VPN are described in section "Ethernet A-D Route per E-VPN".

The Ethernet Segment Identifier MAY be set to the ten octet ESI identifier described in section "Ethernet Segment Identifier". Or it MAY be set to 0. It MUST be set to 0 if the Ethernet Tag is set to 0.

The Ethernet Tag ID is the identifier of the Ethernet Tag. It MAY be set to 0 in which case an egress MES MUST perform a MAC lookup to forward the packet.

The Originating Router's IP address MUST be set to an IP address of the PE. This address SHOULD be common for all the EVIs on the PE (e.g., this address may be the PE's loopback address).

The Next Hop field of the MP_REACH_NLRI attribute of the route MUST be set to the same IP address as the one carried in the Originating Router's IP Address field.

The BGP advertisement that advertises the Inclusive Multicast Ethernet Tag route MUST also carry one or more Route Target (RT) attributes. The assignment of RTs described in the section on "Constructing the BGP E-VPN MAC Address Advertisement" MUST be followed.

14.2.
P-Tunnel Identification

In order to identify the P-Tunnel used for sending broadcast, unknown unicast or multicast traffic, the Inclusive Multicast Ethernet Tag route MUST carry a "PMSI Tunnel Attribute" specified in [BGP MVPN].

Depending on the technology used for the P-tunnel for the E-VPN on the PE, the PMSI Tunnel attribute of the Inclusive Multicast Ethernet Tag route is constructed as follows.

+ If the PE that originates the advertisement uses a P-Multicast tree for the P-tunnel for the E-VPN, the PMSI Tunnel attribute MUST contain the identity of the tree (note that the PE could create the identity of the tree prior to the actual instantiation of the tree).

+ A PE that uses a P-Multicast tree for the P-tunnel MAY aggregate two or more Ethernet Tags in the same or different E-VPNs present on the PE onto the same tree. In this case in addition to carrying the identity of the tree, the PMSI Tunnel attribute MUST carry an MPLS upstream assigned label which the PE has bound uniquely to the <ESI, Ethernet Tag> for the E-VPN associated with this update (as determined by its RTs).

  If the PE has already advertised Inclusive Multicast Ethernet Tag routes for two or more Ethernet Tags that it now desires to aggregate, then the PE MUST re-advertise those routes. The re-advertised routes MUST be the same as the original ones, except for the PMSI Tunnel attribute and the label carried in that attribute.

+ If the PE that originates the advertisement uses ingress replication for the P-tunnel for the E-VPN, the route MUST include the PMSI Tunnel attribute with the Tunnel Type set to Ingress Replication and Tunnel Identifier set to a routable address of the PE. The PMSI Tunnel attribute MUST carry a downstream assigned MPLS label.
This label is used to demultiplex the broadcast, multicast or unknown unicast E-VPN traffic received over a unicast tunnel by the PE.

+ The Leaf Information Required flag of the PMSI Tunnel attribute MUST be set to zero, and MUST be ignored on receipt.

14.3.  Ethernet Segment Identifier and Ethernet Tag

As described above the encoding rules allow setting the Ethernet Segment Identifier and Ethernet Tag to either non-zero valid values or to 0. If the Ethernet Tag is set to a non-zero valid value, then an egress MES can forward the packet to the set of egress ESIs in the Ethernet Tag, in the E-VPN, by performing an MPLS lookup only. Further if the ESI is also set to non-zero then the egress MES does not need to replicate the packet as it is destined for a given Ethernet segment. If both Ethernet Tag and ESI are set to 0 then an egress MES MUST perform a MAC lookup in the EVI determined by the MPLS label, after the MPLS lookup, to forward the packet.

If an MES advertises multiple Inclusive Ethernet Tag routes for a given E-VPN then the PMSI Tunnel Attributes for these routes MUST be distinct.

15.  Processing of Unknown Unicast Packets

The procedures in this document do not require MESes to flood unknown unicast traffic to other MESes. If MESes learn CE MAC addresses via a control plane, the MESes can then distribute MAC addresses via BGP, and all unicast MAC addresses will be learnt prior to traffic to those destinations.

However, if a destination MAC address of a received packet is not known by the MES, the MES may have to flood the packet. Flooding must take into account "split horizon forwarding" as follows. The principles behind the following procedures are borrowed from the split horizon forwarding rules in VPLS solutions [RFC 4761, RFC 4762].
When an MES capable of flooding (say MESx) receives a broadcast Ethernet frame, or one with an unknown destination MAC address, it must flood the frame. If the frame arrived from an attached CE, MESx must send a copy of the frame to every other attached CE, on a different ESI than the one it received the frame on, as well as to all other MESes participating in the E-VPN. If, on the other hand, the frame arrived from another MES (say MESy), MESx must send a copy of the packet only to attached CEs. MESx MUST NOT send the frame to other MESes, since MESy would have already done so. Split horizon forwarding rules apply to broadcast and multicast packets, as well as packets to an unknown MAC address.

Whether or not to flood packets to unknown destination MAC addresses should be an administrative choice, depending on how learning happens between CEs and MESes.

The MESes in a particular E-VPN may use ingress replication using RSVP-TE P2P LSPs or LDP MP2P LSPs for sending broadcast, multicast and unknown unicast traffic to other MESes. Or they may use RSVP-TE P2MP or LDP P2MP or LDP MP2MP LSPs for sending such traffic to other MESes.

15.1.  Ingress Replication

If ingress replication is in use, the P-Tunnel attribute, carried in the Inclusive Multicast Ethernet Tag routes for the E-VPN, specifies the downstream label that the other MESes can use to send unknown unicast, multicast or broadcast traffic for the E-VPN to this particular MES.

The MES that receives a packet with this particular MPLS label MUST treat the packet as a broadcast, multicast or unknown unicast packet. Further if the MAC address is a unicast MAC address, the MES MUST treat the packet as an unknown unicast packet.

15.2.  P2MP MPLS LSPs

The procedures for using P2MP LSPs are very similar to VPLS procedures [VPLS-MCAST].
The P-Tunnel attribute used by an MES for sending unknown unicast, broadcast or multicast traffic for a particular Ethernet segment, is advertised in the Inclusive Ethernet Tag Multicast route as described in section "Handling of Multi-Destination Traffic".

The P-Tunnel attribute specifies the P2MP LSP identifier. This is the equivalent of an Inclusive tree in [VPLS-MCAST]. Note that multiple Ethernet Tags, which may be in different E-VPNs, may use the same P2MP LSP, using upstream labels [VPLS-MCAST]. When P2MP LSPs are used for flooding unknown unicast traffic, packet re-ordering is possible.

The MES that receives a packet on the P2MP LSP specified in the PMSI Tunnel Attribute MUST treat the packet as a broadcast, multicast or unknown unicast packet. Further if the MAC address is a unicast MAC address, the MES MUST treat the packet as an unknown unicast packet.

16.  Forwarding Unicast Packets

16.1.  Forwarding packets received from a CE

When an MES receives a packet from a CE, on a given Ethernet Tag, it must first look up the source MAC address of the packet. In certain environments the source MAC address MAY be used to authenticate the CE and determine that traffic from the host can be allowed into the network. Source MAC lookup MAY also be used for local MAC address learning.

If the MES decides to forward the packet the destination MAC address of the packet must be looked up. If the MES has received MAC address advertisements for this destination MAC address from one or more other MESes or learned it from locally connected CEs, it is considered as a known MAC address. Else the MAC address is considered as an unknown MAC address.

For known MAC addresses the MES forwards this packet to one of the remote MESes or to a locally attached CE.
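
The known/unknown classification of a destination MAC address can be illustrated with a minimal sketch. It is not normative. The MacEntry structure, the table keyed by lower-case MAC string, and the choice to prefer a locally learned entry over a remotely advertised one are all assumptions made for the example and are not specified by this document.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class MacEntry:
    # Set when the MAC was learned from a locally connected CE.
    local_esi: Optional[str] = None
    # Set when the MAC was learned from a MAC advertisement route.
    remote_mes: Optional[str] = None
    evpn_label: Optional[int] = None


def classify_destination(mac_table: Dict[str, MacEntry],
                         dst_mac: str) -> Tuple[str, object]:
    """Return ("local", esi), ("remote", (mes, label)) or ("unknown", None)."""
    entry = mac_table.get(dst_mac.lower())
    if entry is None:
        return ("unknown", None)
    if entry.local_esi is not None:
        # Assumption: a locally learned entry is preferred in this sketch.
        return ("local", entry.local_esi)
    return ("remote", (entry.remote_mes, entry.evpn_label))


if __name__ == "__main__":
    table = {
        "00:11:22:33:44:55": MacEntry(local_esi="ESI-X"),
        "00:11:22:33:44:66": MacEntry(remote_mes="192.0.2.2", evpn_label=300),
    }
    for mac in ("00:11:22:33:44:55", "00:11:22:33:44:66", "00:11:22:33:44:77"):
        print(mac, classify_destination(table, mac))
```

A "remote" result carries the advertising MES and the E-VPN label from its MAC advertisement, which is the information needed to build the encapsulation.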
When forwarding to remote MESes, the packet is encapsulated in the E-VPN MPLS label advertised by the remote MES, for that MAC address, and in the MPLS LSP label stack to reach the remote MES.

If the MAC address is unknown then, if the administrative policy on the MES requires flooding of unknown unicast traffic:

- The MES MUST flood the packet to other MESes. If the ESI over which the MES receives the packet is multi-homed, then the MES MUST first encapsulate the packet in the ESI MPLS label as described in section "Split Horizon".

  If ingress replication is used the packet MUST be replicated one or more times to each remote MES with the bottom label of the stack being an MPLS label determined as follows. This is the MPLS label advertised by the remote MES in a PMSI Tunnel Attribute in the Inclusive Multicast Ethernet Tag route for an <ESI, Ethernet Tag> combination. The Ethernet Tag in the route must be the same as the Ethernet Tag advertised by the ingress MES in its Ethernet Tag A-D route associated with the interface on which the ingress MES receives the packet.

  If P2MP LSPs are being used the packet MUST be sent on the P2MP LSP that the MES is the root of for the Ethernet Tag in the E-VPN. If the same P2MP LSP is used for all Ethernet Tags then all the MESes in the E-VPN MUST be the leaves of the P2MP LSP. If a distinct P2MP LSP is used for a given Ethernet Tag in the E-VPN then only the MESes in the Ethernet Tag MUST be the leaves of the P2MP LSP. The packet MUST be encapsulated in the P2MP LSP label stack.

If the MAC address is unknown then, if the administrative policy on the MES does not allow flooding of unknown unicast traffic:

- The MES MUST drop the packet.

16.2.  Forwarding packets received from a remote MES

16.2.1.  Unknown Unicast Forwarding

When an MES receives an MPLS packet from a remote MES then, after processing the MPLS label stack, if the top MPLS label ends up being a P2MP LSP label associated with an E-VPN or the downstream label advertised in the P-Tunnel attribute and after performing the split horizon procedures described in section "Split Horizon":

- If the MES is the designated forwarder of unknown unicast, broadcast or multicast traffic, on a particular set of ESIs for the Ethernet Tag, the default behavior is for the MES to flood the packet on the ESIs. In other words the default behavior is for the MES to assume that the destination MAC address is unknown unicast, broadcast or multicast and it is not required to do a destination MAC address lookup, as long as the granularity of the MPLS label included the Ethernet Tag. As an option the MES may do a destination MAC lookup to flood the packet to only a subset of the CE interfaces in the Ethernet Tag. For instance the MES may decide to not flood an unknown unicast packet on certain Ethernet segments even if it is the DF on the Ethernet segment, based on administrative policy.

- If the MES is not the designated forwarder on any of the ESIs for the Ethernet Tag, the default behavior is for it to drop the packet.

16.2.2.  Known Unicast Forwarding

If the top MPLS label ends up being an E-VPN label that was advertised in the unicast MAC advertisements, then the MES either forwards the packet based on CE next-hop forwarding information associated with the label or does a destination MAC address lookup to forward the packet to a CE.

17.  Split Horizon

Consider a CE that is multi-homed to two or more MESes on an Ethernet segment ES1.
If the CE sends a multicast, broadcast or unknown unicast packet to a particular MES, say MES1, then MES1 will forward that packet to all or a subset of the other MESes in the E-VPN. In this case the MESes, other than MES1, that the CE is multi-homed to MUST drop the packet and not forward it back to the CE. This is referred to as "split horizon" in this document.

In order to accomplish this each MES distributes to other MESes the "per Ethernet Segment Ethernet A-D route" as per the procedures in the section "Ethernet A-D Route per Ethernet Segment". This route is imported by the MESes connected to the Ethernet Segment and also by the MESes that have at least one E-VPN in common with the Ethernet Segment in the route. As described in the section "Ethernet A-D Route per Ethernet Segment", the route MUST carry an ESI MPLS Label Extended Community with a valid ESI MPLS label.

17.1.  ESI MPLS Label: Ingress Replication

An MES that is using ingress replication for sending broadcast, multicast or unknown unicast traffic, distributes to other MESes, that belong to the Ethernet segment, a downstream assigned "ESI MPLS label" in the Ethernet A-D route. This label MUST be programmed in the platform label space by the advertising MES. Further the forwarding entry for this label must result in NOT forwarding packets received with this label onto the Ethernet segment that the label was distributed for.

Consider MES1 and MES2 that are multi-homed to CE1 on ES1. Further consider that MES1 is using P2P or MP2P LSPs to send packets to MES2. Consider that MES1 receives a multicast, broadcast or unknown unicast packet from CE1 on VLAN1 on ESI1.

First consider the case where MES2 distributes a unique Inclusive Multicast Ethernet Tag route for VLAN1, for each Ethernet segment on MES2.
In this case MES1 MUST NOT replicate the packet to MES2 for 1252 . 1254 Next consider the case where MES2 distributes a single Inclusive 1255 Multicast Ethernet Tag route for VLAN1 for all Ethernet segments on 1256 MES2. In this case when MES1 sends a multicast, broadcast or unknown 1257 unicast packet, that it receives from CE1, it MUST first push onto 1258 the MPLS label stack the ESI label that MES2 has distributed for 1259 ESI1. It MUST then push on the MPLS label distributed by MES2 in the 1260 Inclusive Ethernet Tag Multicast route for Ethernet Tag1. The 1261 resulting packet is further encapsulated in the P2P or MP2P LSP label 1262 stack required to transmit the packet to MES2. When MES2 receives 1263 this packet it determines the set of ESIs to replicate the packet to 1264 from the top MPLS label, after any P2P or MP2P LSP labels have been 1265 removed. If the next label is the ESI label assigned by MES2 then 1266 MES2 MUST NOT forward the packet onto ESI1. 1268 17.2. ESI MPLS Label: P2MP MPLS LSPs 1270 An MES that is using P2MP LSPs for sending broadcast, multicast or 1271 unknown unicast traffic, distributes to other MESes, that belong to 1272 the Ethernet segment or have an E-VPN in common with the Ethernet 1273 Segment, an upstream assigned "ESI MPLS label" in the Ethernet A-D 1274 route. This label is upstream assigned by the MES that advertises the 1275 route. This label MUST be programmed by the other MESes, that are 1276 connected to the ESI advertised in the route, in the context label 1277 space for the advertising MES. Further the forwarding entry for this 1278 label must result in NOT forwarding packets received with this label 1279 onto the Ethernet segment that the label was distributed for. This 1280 label MUST also be programmed by the other MESes, that import the 1281 route but are not connected to the ESI advertised in the route, in 1282 the context label space for the advertising MES. 
Further the 1283 forwarding entry for this label must be a POP with no other 1284 associated action. 1286 Consider MES1 and MES2 that are multi-homed to CE1 on ES1. Also 1287 consider MES3 that is in the same E-VPN as one of the E-VPNs to which 1288 ES1 belongs. Further assume that MES1 is using P2MP MPLS LSPs to 1289 send broadcast, multicast or unknown unicast packets. When MES1 sends 1290 a multicast, broadcast or unknown unicast packet, that it receives 1291 from CE1, it MUST first push onto the MPLS label stack the ESI label 1292 that it has assigned for the ESI that the packet was received on. The 1293 resulting packet is further encapsulated in the P2MP MPLS label stack 1294 necessary to transmit the packet to the other MESes. Penultimate hop 1295 popping MUST be disabled on the P2MP LSPs used in the MPLS transport 1296 infrastructure for E-VPN. When MES2 receives this packet it 1297 decapsulates the top MPLS label and forwards the packet using the 1298 context label space determined by the top label. If the next label is 1299 the ESI label assigned by MES1 then MES2 MUST NOT forward the packet 1300 onto ESI1. When MES3 receives this packet it decapsulates the top 1301 MPLS label and forwards the packet using the context label space 1302 determined by the top label. If the next label is the ESI label 1303 assigned by MES1 then MES3 MUST pop the label. 1305 17.3. ESI MPLS Label: MP2MP LSPs 1307 The procedures for ESI MPLS Label assignment and usage for MP2MP LSPs 1308 will be described in a future version. 1310 18. Load Balancing of Unicast Packets 1312 This section specifies how load balancing is achieved to/from a CE 1313 that has more than one interface that is directly connected to one or 1314 more MESes. The CE may be a host or a router or it may be a switched 1315 network that is connected via LAG to the MESes. 1317 18.1.
Load balancing of traffic from an MES to remote CEs 1319 Whenever a remote MES imports a MAC advertisement for a given in an E-VPN instance, it MUST consider the MAC as 1321 reachable via all the MESes from which it has imported Ethernet A-D 1322 routes for that . Let us call this the initial 1323 Ethernet A-D route set for the given ESI. 1325 If for the given ESI the remote MES has imported a per Ethernet Segment 1326 Ethernet A-D route, from at least one MES, where the "Active-Standby" 1327 flag in the ESI MPLS Label Extended Community is set, then the remote 1328 MES MUST first use the procedures in the section "Designated 1329 Forwarder Election" to pick a Designated Forwarder. The eligible set 1330 of Ethernet A-D routes used in the procedures below must comprise 1331 this single Ethernet A-D route from the DF. 1333 If for the given ESI none of the per Ethernet Segment Ethernet A-D 1334 routes, imported by the remote MES, have the "Active-Standby" flag 1335 set in the ESI MPLS Label Extended Community, then the eligible set of 1336 Ethernet A-D routes is set to the initial Ethernet A-D route set. 1338 The remote MES MUST use the MAC advertisement and eligible Ethernet 1339 A-D routes to construct the set of next-hops that it can use to send 1340 the packet to the destination MAC. Each next-hop comprises an MPLS 1341 label stack, that is to be used by the egress MES to forward the 1342 packet. This label stack is determined as follows. If the next-hop is 1343 constructed as a result of a MAC route which has a valid MPLS label 1344 stack, then this label stack MUST be used. However if the MAC route 1345 doesn't exist or if it doesn't have a valid MPLS label stack then the 1346 next-hop and MPLS label stack is constructed as a result of one or 1347 more corresponding Ethernet A-D routes as follows.
Note that the 1348 following description applies to determining the label stack for a 1349 particular next-hop to reach a given MES, from which the remote MES 1350 has received and imported one or more Ethernet A-D routes that have 1351 the matching ESI and Ethernet Tag as the one present in the MAC 1352 advertisement. The Ethernet A-D routes mentioned in the following 1353 description refer to the ones imported from this given MES. 1355 If there is a corresponding Ethernet A-D route for that then that label stack MUST be used. If such an Ethernet 1357 Tag A-D route doesn't exist but Ethernet A-D routes exist for and then the label stack 1359 must be constructed by using the labels from these two routes. If 1360 this is not the case but an Ethernet A-D route exists for then the label from that route must be used. 1362 Finally if this is also not the case but an Ethernet A-D route exists 1363 for then the label from that route must 1364 be used. 1366 The following example explains the above when Ethernet A-D routes are 1367 advertised per . 1369 Consider a CE, CE1, that is dual homed to two MESes, MES1 and MES2 on 1370 a LAG interface, ES1, and is sending packets with MAC address MAC1 on 1371 VLAN1. Based on E-VPN extensions described in sections "Determining 1372 Reachability of Unicast Addresses" and "Auto-Discovery of Ethernet 1373 Tags on Ethernet Segments", a remote MES say MES3 is able to learn 1374 that MAC1 is reachable via MES1 and MES2. Both MES1 and MES2 may 1375 advertise MAC1 in BGP if they receive packets with MAC1 from CE1. If 1376 this is not the case and if MAC1 is advertised only by MES1, MES3 1377 still considers MAC1 as reachable via both MES1 and MES2 as both MES1 1378 and MES2 advertise an Ethernet A-D route for . 1380 The MPLS label stack to send the packets to MES1 is the MPLS LSP 1381 stack to get to MES1 and the E-VPN label advertised by MES1 for CE1's 1382 MAC.
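As a non-normative illustration, the eligible-set and next-hop rules of this section can be sketched as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: the names AdRoute, eligible_ad_routes, build_next_hops and elect_df are hypothetical, the label values are invented for the example, the Designated Forwarder election is passed in as a function rather than implemented, each MES is assumed to contribute one already-selected Ethernet A-D route label, and the transport LSP labels that complete the stack are omitted.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class AdRoute:
    """One imported Ethernet A-D route for the ESI (hypothetical model)."""
    mes: str                      # advertising MES
    label: int                    # MPLS label carried in the route
    active_standby: bool = False  # "Active-Standby" flag in the ESI MPLS Label Ext. Community


def eligible_ad_routes(ad_routes: List[AdRoute],
                       elect_df: Callable[[List[AdRoute]], str]) -> List[AdRoute]:
    """Return the eligible Ethernet A-D route set for one ESI."""
    # If any route has the Active-Standby flag set, only the DF's route is eligible.
    if any(route.active_standby for route in ad_routes):
        df = elect_df(ad_routes)
        return [route for route in ad_routes if route.mes == df]
    # Otherwise the eligible set is the initial Ethernet A-D route set.
    return list(ad_routes)


def build_next_hops(mac_labels: Dict[str, int],
                    ad_routes: List[AdRoute],
                    elect_df: Callable[[List[AdRoute]], str]) -> Dict[str, int]:
    """Map each eligible MES to the service label used to reach the MAC."""
    next_hops: Dict[str, int] = {}
    for route in eligible_ad_routes(ad_routes, elect_df):
        # Prefer the label from the MAC route; fall back to the A-D route label.
        next_hops[route.mes] = mac_labels.get(route.mes, route.label)
    return next_hops


# Placeholder DF election (assumption): pick the lowest MES name.
def lowest_name(routes: List[AdRoute]) -> str:
    return min(route.mes for route in routes)


# MAC1 is advertised only by MES1 (label 100); both MESes advertise A-D routes.
all_active = build_next_hops(
    {"MES1": 100}, [AdRoute("MES1", 200), AdRoute("MES2", 300)], lowest_name)
active_standby = build_next_hops(
    {"MES1": 100},
    [AdRoute("MES1", 200, True), AdRoute("MES2", 300, True)], lowest_name)
```

In the all-active case MES3 ends up with one next-hop per MES and can load balance between them; with the Active-Standby flag set only the next-hop towards the elected DF remains.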
1384 The MPLS label stack to send packets to MES2 is the MPLS LSP stack to 1385 get to MES2 and the MPLS label in the Ethernet A-D route advertised 1386 by MES2 for , if MES2 has not advertised MAC1 in BGP. 1388 We will refer to these label stacks as MPLS next-hops. 1390 The remote MES, MES3, can now load balance the traffic it receives 1391 from its CEs, destined for CE1, between MES1 and MES2. MES3 may use 1392 the IP flow information for it to hash into one of the MPLS next-hops 1393 for load balancing for IP traffic. Or MES3 may rely on the source and 1394 destination MAC addresses for load balancing. 1396 Note that once MES3 decides to send a particular packet to MES1 or 1397 MES2 it can pick from more than one path to reach the particular remote 1398 MES using regular MPLS procedures. For instance if the tunneling 1399 technology is based on RSVP-TE LSPs, and MES3 decides to send a 1400 particular packet to MES1 then MES3 can choose from multiple RSVP-TE 1401 LSPs that have MES1 as their destination. 1403 When MES1 or MES2 receives the packet destined for CE1 from MES3, if 1404 the packet is a unicast MAC packet it is forwarded to CE1. If it is 1405 a multicast or broadcast MAC packet then only one of MES1 or MES2 1406 must forward the packet to the CE. Which of MES1 or MES2 forwards this 1407 packet to the CE is determined by default based on which of the two 1408 is the DF. An alternate procedure to load balance multicast packets 1409 will be described in the future. 1411 If the connectivity between the multi-homed CE and one of the MESes 1412 that it is multi-homed to fails, the MES MUST withdraw the MAC 1413 address from BGP. In addition the MES MUST withdraw the Ethernet Tag 1414 A-D routes, that had been previously advertised, for the Ethernet 1415 Segment to the CE. Note that to aid convergence the Ethernet Tag A-D 1416 routes MAY be withdrawn before the MAC routes.
This enables the 1417 remote MESes to remove the MPLS next-hop to this particular MES from 1418 the set of MPLS next-hops that can be used to forward traffic to the 1419 CE. For further details and procedures on withdrawal of E-VPN route 1420 types in the event of MES to CE failures please see section "MES to CE 1421 Network Failures". 1423 18.2. Load balancing of traffic between an MES and a local CE 1425 A CE may be configured with more than one interface connected to 1426 different MESes or the same MES for load balancing, using a 1427 technology such as LAG. The MES(s) and the CE can load balance 1428 traffic onto these interfaces using one of the following mechanisms. 1430 18.2.1. Data plane learning 1432 Consider that the MESes perform data plane learning for local MAC 1433 addresses learned from local CEs. This enables the MES(s) to learn a 1434 particular MAC address and associate it with one or more interfaces, 1435 if the technology between the MES and the CE supports multi-pathing. 1436 The MESes can now load balance traffic destined to that MAC address 1437 on the multiple interfaces. 1439 Whether the CE can load balance traffic that it generates on the 1440 multiple interfaces is dependent on the CE implementation. 1442 18.2.2. Control plane learning 1444 The CE can be a host that advertises the same MAC address using a 1445 control protocol on both interfaces. This enables the MES(s) to learn 1446 the host's MAC address and associate it with one or more interfaces. 1447 The MESes can now load balance traffic destined to the host on the 1448 multiple interfaces. The host can also load balance the traffic it 1449 generates onto these interfaces and the MES that receives the traffic 1450 employs E-VPN forwarding procedures to forward the traffic. 1452 19.
MAC Moves 1454 In the case where a CE is a host or a switched network connected to 1455 hosts, the MAC address that is reachable via a given MES on a 1456 particular ESI may move such that it becomes reachable via another 1457 MES on another ESI. This is referred to as a "MAC Move". 1459 Remote MESes must be able to distinguish a MAC move from the case 1460 where a MAC address on an ESI is reachable via two different MESes 1461 and load balancing is performed as described in section "Load 1462 Balancing of Unicast Packets". This distinction can be made as 1463 follows. If a MAC is learned by a particular MES from multiple MESes, 1464 then the MES performs load balancing only amongst the set of MESes 1465 that advertised the MAC with the same ESI. If this is not the case 1466 then the MES chooses only one of the advertising MESes to reach the 1467 MAC as per BGP path selection. 1469 There can be traffic loss during a MAC move. Consider MAC1 that is 1470 advertised by MES1 and learned from CE1 on ESI1. If MAC1 now moves 1471 behind MES2, on ESI2, MES2 advertises the MAC in BGP. Until a remote 1472 MES, MES3, determines that the best path is via MES2, it will 1473 continue to send traffic destined for MAC1 to MES1. This will not 1474 occur deterministically until MES1 withdraws the advertisement for 1475 MAC1. 1477 One recommended optimization to reduce the traffic loss during MAC 1478 moves is the following. When an MES sees a MAC update from a 1479 locally attached CE on an ESI, which is different from the ESI on 1480 which the MES has currently learned the MAC, the corresponding entry 1481 in the local bridge forwarding table SHOULD be immediately purged 1482 causing the MES to withdraw its own E-VPN MAC advertisement route and 1483 replace it with the update. 1485 A future version of this specification will describe other optimized 1486 procedures to minimize traffic loss during MAC moves. 1488 20.
Multicast 1490 The MESes in a particular E-VPN may use ingress replication or P2MP 1491 LSPs to send multicast traffic to other MESes. 1493 20.1. Ingress Replication 1495 The MESes may use ingress replication for flooding unknown unicast, 1496 multicast or broadcast traffic as described in section "Handling of 1497 Multi-Destination Traffic". A given unknown unicast or broadcast 1498 packet must be sent to all the remote MESes. However a given 1499 multicast packet for a multicast flow may be sent to only a subset of 1500 the MESes. Specifically a given multicast flow may be sent to only 1501 those MESes that have receivers that are interested in the multicast 1502 flow. Determining which of the MESes have receivers for a given 1503 multicast flow is done using explicit tracking described below. 1505 20.2. P2MP LSPs 1507 An MES may use an "Inclusive" tree for sending an unknown unicast, 1508 broadcast or multicast packet or a "Selective" tree. This terminology 1509 is borrowed from [VPLS-MCAST]. 1511 A variety of transport technologies may be used in the SP network. 1512 For inclusive P-Multicast trees, these transport technologies include 1513 point-to-multipoint LSPs created by RSVP-TE or mLDP. For selective P- 1514 Multicast trees, only unicast MES-MES tunnels (using MPLS or IP/GRE 1515 encapsulation) and P2MP LSPs are supported, and the supported P2MP 1516 LSP signaling protocols are RSVP-TE and mLDP. 1518 20.3. MP2MP LSPs 1520 The root of the MP2MP LDP LSP advertises the Inclusive Multicast Tag 1521 route with the PMSI Tunnel attribute set to the MP2MP Tunnel 1522 identifier. This advertisement is then sent to all MESes in the E- 1523 VPN. Upon receiving the Inclusive Multicast Tag routes with a PMSI 1524 Tunnel attribute that contains the MP2MP Tunnel identifier, the 1525 receiving MESes initiate the setup of the MP2MP tunnel towards the 1526 root using the procedures in [MLDP]. 1528 20.3.1.
Inclusive Trees 1530 An Inclusive Tree allows the use of a single multicast distribution 1531 tree, referred to as an Inclusive P-Multicast tree, in the SP network 1532 to carry all the multicast traffic from a specified set of E-VPN 1533 instances on a given MES. A particular P-Multicast tree can be set up 1534 to carry the traffic originated by sites belonging to a single E-VPN, 1535 or to carry the traffic originated by sites belonging to different E- 1536 VPNs. The ability to carry the traffic of more than one E-VPN on the 1537 same tree is termed 'Aggregation'. The tree needs to include every 1538 MES that is a member of any of the E-VPNs that are using the tree. 1539 This implies that an MES may receive multicast traffic for a 1540 multicast stream even if it doesn't have any receivers that are 1541 interested in receiving traffic for that stream. 1543 An Inclusive P-Multicast tree as defined in this document is a P2MP 1544 tree. A P2MP tree is used to carry traffic only for E-VPN CEs that 1545 are connected to the MES that is the root of the tree. 1547 The procedures for signaling an Inclusive Tree are the same as those 1548 in [VPLS-MCAST] with the VPLS-AD route replaced with the Inclusive 1549 Multicast Ethernet Tag route. The P-Tunnel attribute [VPLS-MCAST] for 1550 an Inclusive tree is advertised in the Inclusive Ethernet A-D route 1551 as described in section "Handling of Multi-Destination Traffic". Note 1552 that an MES can "aggregate" multiple inclusive trees for different E- 1553 VPNs on the same P2MP LSP using upstream labels. The procedures for 1554 aggregation are the same as those described in [VPLS-MCAST], with 1555 VPLS A-D routes replaced by E-VPN Inclusive Multicast Ethernet A-D 1556 routes. 1558 20.3.2. 
Selective Trees 1560 A Selective P-Multicast tree is used by an MES to send IP multicast 1561 traffic for one or more specific IP multicast streams, originated by 1562 CEs connected to the MES, that belong to the same or different E- 1563 VPNs, to a subset of the MESes that belong to those E-VPNs. Each of 1564 the MESes in the subset should be on the path to a receiver of one or 1565 more multicast streams that are mapped onto the tree. The ability to 1566 use the same tree for multicast streams that belong to different E- 1567 VPNs is termed 'Aggregation'. Selective trees give an MES the ability to create separate SP multicast 1568 trees for specific multicast streams, e.g. high bandwidth multicast 1569 streams. This allows traffic for these multicast streams to reach 1570 only those MES routers that have receivers for these streams. This 1571 avoids flooding other MES routers in the E-VPN. 1573 An SP can use both Inclusive P-Multicast trees and Selective P- 1574 Multicast trees or either of them for a given E-VPN on an MES, based 1575 on local configuration. 1577 The granularity of a selective tree is (S, G) where S is an 1578 IP multicast source address and G is an IP multicast group address or 1579 G is a multicast MAC address. Wildcard sources and wildcard groups 1580 are supported. Selective trees require explicit tracking as described 1581 below. 1583 An E-VPN MES advertises a selective tree using an E-VPN selective A-D 1584 route. The procedures are the same as those in [VPLS-MCAST] with S- 1585 PMSI A-D routes in [VPLS-MCAST] replaced by E-VPN Selective A-D 1586 routes. The information elements of the E-VPN selective A-D route 1587 are similar to those of the VPLS S-PMSI A-D route with the following 1588 differences. An E-VPN Selective A-D route includes an optional 1589 Ethernet Tag field. Also an E-VPN selective A-D route may encode a 1590 MAC address in the Group field. The encoding details of the E-VPN 1591 selective A-D route will be described in the next revision.
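As a non-normative illustration of selective tree granularity with wildcard sources and groups, the mapping of a multicast flow to a tree can be sketched as below. This is a minimal sketch under stated assumptions: the names WILDCARD and select_tree, the tree identifiers and the addresses are hypothetical, and the match order (most specific entry first, falling back to the inclusive tree) is an assumption of the sketch, since this document does not define one.

```python
from typing import Dict, Tuple

WILDCARD = "*"  # stands for a wildcard source or a wildcard group


def select_tree(bindings: Dict[Tuple[str, str], str],
                source: str, group: str, inclusive_tree: str) -> str:
    """Return the P-Multicast tree that carries the flow (source, group)."""
    # Assumed precedence: exact (S, G), then (*, G), then (S, *), then (*, *).
    candidates = (
        (source, group),
        (WILDCARD, group),
        (source, WILDCARD),
        (WILDCARD, WILDCARD),
    )
    for key in candidates:
        if key in bindings:
            return bindings[key]
    # No selective tree is bound to this flow: use the inclusive tree.
    return inclusive_tree


# Hypothetical bindings advertised by one MES.
bindings = {
    ("10.0.0.1", "232.1.1.1"): "selective-tree-A",
    (WILDCARD, "232.2.2.2"): "selective-tree-B",
}
exact = select_tree(bindings, "10.0.0.1", "232.1.1.1", "inclusive-tree")
any_source = select_tree(bindings, "10.0.0.9", "232.2.2.2", "inclusive-tree")
unbound = select_tree(bindings, "10.0.0.9", "232.9.9.9", "inclusive-tree")
```

A group key could equally be a multicast MAC address, since the lookup treats S and G as opaque strings.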
1593 Selective trees can also be aggregated on the same P2MP LSP using 1594 aggregation as described in [VPLS-MCAST]. 1596 20.4. Explicit Tracking 1598 [VPLS-MCAST] describes procedures for explicit tracking that rely on 1599 Leaf A-D routes. The same procedures are used for explicit tracking 1600 in this specification with VPLS Leaf A-D routes replaced with E-VPN 1601 Leaf A-D routes. These procedures allow a root MES to request 1602 multicast membership information for a given (S, G), from leaf MESes. 1603 Leaf MESes rely on IGMP snooping or PIM snooping between the MES and 1604 the CE to determine the multicast membership information. Note that 1605 the procedures in [VPLS-MCAST] do not describe how explicit tracking 1606 is performed if the CEs are enabled with join suppression. The 1607 procedures for this case will be described in a future version. 1609 21. Convergence 1611 This section describes failure recovery from different types of 1612 network failures. 1614 21.1. Transit Link and Node Failures between MESes 1616 The use of existing MPLS Fast-Reroute mechanisms can provide failure 1617 recovery in the order of 50ms, in the event of transit link and node 1618 failures in the infrastructure that connects the MESes. 1620 21.2. MES Failures 1621 Consider a host host1 that is dual homed to MES1 and MES2. If MES1 1622 fails, a remote MES, MES3, can discover this based on the failure of 1623 the BGP session. This failure detection can be in the sub-second 1624 range if BFD is used to detect BGP session failure. MES3 can update 1625 its forwarding state to start sending all traffic for host1 to only 1626 MES2. It is to be noted that this failure recovery is potentially 1627 faster than what would be possible if data plane learning were to be 1628 used, since in that case MES3 would have to rely on re-learning of MAC 1629 addresses via MES2. 1631 21.2.1. Local Repair 1633 It is possible to perform local repair in the case of MES failures.
1634 Details will be specified in the future. 1636 21.3. MES to CE Network Failures 1638 When an Ethernet segment connected to an MES fails or when an Ethernet 1639 Tag is de-configured on an Ethernet segment, then the MES MUST 1640 withdraw the Ethernet A-D route(s) announced for the that are impacted by the failure or de-configuration. In 1642 addition the MES MUST also withdraw the MAC advertisement routes that 1643 are impacted by the failure or de-configuration. 1645 The Ethernet A-D routes should be used by an implementation to 1646 optimize the withdrawal of MAC advertisement routes. When an MES 1647 receives a withdrawal of a particular Ethernet A-D route from an MES 1648 it SHOULD consider all the MAC advertisement routes, that are learned 1649 from the same as in the Ethernet A-D route, from 1650 the advertising MES, as having been withdrawn. This optimizes the 1651 network convergence times in the event of MES to CE failures. 1653 22. LACP State Synchronization 1655 This section requires review and discussion amongst the authors and 1656 will be revised in the next version. 1658 To support CE multi-homing with multi-chassis Ethernet bundles, the 1659 MESes connected to a given CE should synchronize [802.1AX] LACP state 1660 amongst each other. This ensures that the MESes can present a single 1661 LACP bundle to the CE. This is required for initial system bring-up 1662 and upon any configuration change. 1664 This includes at least the following LACP specific configuration 1665 parameters: 1667 - System Identifier (MAC Address): uniquely identifies an LACP 1668 speaker. 1670 - System Priority: determines which LACP speaker's port 1671 priorities are used in the Selection logic. 1672 - Aggregator Identifier: uniquely identifies a bundle within 1673 an LACP speaker. 1674 - Aggregator MAC Address: identifies the MAC address of the 1675 bundle. 1676 - Aggregator Key: used to determine which ports can join an 1677 Aggregator.
1678 - Port Number: uniquely identifies an interface within a LACP 1679 speaker. 1680 - Port Key: determines the set of ports that can be bundled. 1681 - Port Priority: determines a port's precedence level to join 1682 a bundle in case the number of eligible ports exceeds the 1683 maximum number of links allowed in a bundle. 1685 Furthermore, the MESes should also synchronize operational (run-time) 1686 data, in order for the LACP Selection logic state-machines to 1687 execute. This operational data includes the following LACP 1688 operational parameters, on a per port basis: 1690 - Partner System Identifier: this is the CE System MAC address. 1691 - Partner System Priority: the CE LACP System Priority 1692 - Partner Port Number: CE's AC port number. 1693 - Partner Port Priority: CE's AC Port Priority. 1694 - Partner Key: CE's key for this AC. 1695 - Partner State: CE's LACP State for the AC. 1696 - Actor State: PE's LACP State for the AC. 1697 - Port State: PE's AC port status. 1699 The above state needs to be communicated between MESes forming a 1700 multi-chassis bundle during LACP initial bringup, upon any 1701 configuration change and upon the occurrence of a failure. 1703 It should be noted that the above configuration and operational state 1704 is localized in scope and is only relevant to MESes which connect to 1705 the same multi-homed CE over a given Ethernet bundle. 1707 Furthermore, the communication of state changes, upon failures, must 1708 occur with minimal latency, in order to minimize the switchover time 1709 and consequent service disruption. The protocol details for 1710 synchronizing the LACP state will be described in the following 1711 version. 1713 23. Acknowledgements 1715 We would like to thank Yakov Rekhter, Pedro Marques, Kaushik Ghosh, 1716 Nischal Sheth, Robert Raszuk and Amit Shukla for discussions that 1717 helped shape this document. We would also like to thank Han Nguyen 1718 for his comments and support of this work. 
We would also like to 1719 thank Steve Kensil for his review. 1721 24. References 1722 [E-VPN-REQ] A. Sajassi, R. Aggarwal et al., "Requirements for 1723 Ethernet VPN", draft-sajassi-raggarwa-l2vpn-evpn-req- 1724 00.txt 1726 [RFC4364] "BGP/MPLS IP VPNs", Rosen, Rekhter, et al., RFC 4364, February 2006 1728 [VPLS-MCAST] "Multicast in VPLS", R. Aggarwal et al., draft-ietf- 1729 l2vpn-vpls-mcast-04.txt 1731 [RFC4761] Kompella, K. and Y. Rekhter, "Virtual Private LAN Service 1732 (VPLS) Using BGP for Auto-Discovery and Signaling", RFC 1733 4761, January 2007. 1735 [RFC4762] Lasserre, M. and V. Kompella, "Virtual Private LAN Service 1736 (VPLS) Using Label Distribution Protocol (LDP) Signaling", 1737 RFC 4762, January 2007. 1739 [VPLS-MULTIHOMING] "BGP based Multi-homing in Virtual Private LAN 1740 Service", K. Kompella et al., draft-ietf-l2vpn-vpls- 1741 multihoming-00.txt 1743 [PIM-SNOOPING] "PIM Snooping over VPLS", V. Hemige et al., draft- 1744 ietf-l2vpn-vpls-pim-snooping-01 1746 [IGMP-SNOOPING] "Considerations for Internet Group Management 1747 Protocol (IGMP) and Multicast Listener Discovery (MLD) 1748 Snooping Switches", M. Christensen et al., RFC 4541, May 2006 1750 [RT-CONSTRAIN] P. Marques et al., "Constrained Route Distribution 1751 for Border Gateway Protocol/MultiProtocol Label Switching 1752 (BGP/MPLS) Internet Protocol (IP) Virtual Private Networks 1753 (VPNs)", RFC 4684, November 2006 1755 25. Authors' Addresses 1757 Rahul Aggarwal 1758 Email: raggarwa_1@yahoo.com 1760 Ali Sajassi 1761 Cisco 1762 170 West Tasman Drive 1763 San Jose, CA 95134, US 1764 Email: sajassi@cisco.com 1766 Wim Henderickx 1767 Alcatel-Lucent 1768 Email: wim.henderickx@alcatel-lucent.com 1770 Aldrin Isaac 1771 Bloomberg 1772 Email: aisaac71@bloomberg.net 1774 James Uttaro 1775 AT&T 1776 200 S.
Laurel Avenue 1777 Middletown, NJ 07748 1778 USA 1779 Email: uttaro@att.com 1781 Nabil Bitar 1782 Verizon Communications 1783 Email : nabil.n.bitar@verizon.com 1785 Ravi Shekhar 1786 Juniper Networks 1787 1194 N. Mathilda Ave. 1788 Sunnyvale, CA 94089 US 1789 Email: rshekhar@juniper.net 1791 John Drake 1792 Juniper Networks 1793 1194 N. Mathilda Ave. 1794 Sunnyvale, CA 94089 US 1795 Email: jdrake@juniper.net 1797 Florin Balus 1798 Alcatel-Lucent 1799 e-mail: Florin.Balus@alcatel-lucent.com 1801 Keyur Patel 1802 Cisco 1803 170 West Tasman Drive 1804 San Jose, CA 95134, US 1805 Email: keyupate@cisco.com 1807 Sami Boutros 1808 Cisco 1809 170 West Tasman Drive 1810 San Jose, CA 95134, US 1811 Email: sboutros@cisco.com