Network Working Group                                        R. Aggarwal
Internet Draft                                                    Arktan
Category: Standards Track
Expiration Date: March 2012                                   A. Sajassi
                                                                   Cisco

                                                           W. Henderickx
                                                          Alcatel-Lucent

                                                                A. Isaac
                                                               Bloomberg

                                                               J. Uttaro
                                                                    AT&T

F. Balus                                                        N. Bitar
Alcatel-Lucent                                                   Verizon

S. Boutros                                                    R. Shekhar
K. Patel                                                Juniper Networks
Cisco

                                                      September 12, 2011


                      BGP MPLS Based Ethernet VPN

                draft-raggarwa-sajassi-l2vpn-evpn-04.txt


Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that other
   groups may also distribute working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

Copyright and License Notice

   Copyright (c) 2011 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

   This document may contain material from IETF Documents or IETF
   Contributions published or made publicly available before November
   10, 2008. The person(s) controlling the copyright in some of this
   material may not have granted the IETF Trust the right to allow
   modifications of such material outside the IETF Standards Process.
   Without obtaining an adequate license from the person(s) controlling
   the copyright in such materials, this document may not be modified
   outside the IETF Standards Process, and derivative works of it may
   not be created outside the IETF Standards Process, except to format
   it for publication as an RFC or to translate it into languages other
   than English.

Abstract

   This document describes procedures for BGP MPLS based Ethernet VPNs
   (E-VPN).

Table of Contents

   1          Specification of requirements
   2          Contributors
   3          Introduction
   4          Terminology
   5          BGP MPLS Based E-VPN Overview
   6          Ethernet Segment Identifier
   7          BGP E-VPN NLRI
   7.1        Ethernet Auto-Discovery Route
   7.2        MAC Advertisement Route
   7.3        Inclusive Multicast Ethernet Tag Route
   8          ESI MPLS Label Extended Community
   9          Auto-Discovery
   10         Auto-Discovery of Ethernet Tags on Ethernet Segments
   10.1       Constructing the Ethernet A-D Route
   10.1.1     Ethernet A-D Route per E-VPN
   10.1.1.1   Ethernet A-D Route Targets
   10.1.1.1.1 Auto-Derivation from the Ethernet Tag ID
   10.1.2     Ethernet A-D Route per Ethernet Segment
   10.1.2.1   Ethernet A-D Route Targets
   10.2       Motivations for Ethernet A-D Route per Ethernet Segment
   10.2.1     Multi-Homing
   10.2.2     Optimizing Control Plane Convergence
   10.2.3     Reducing Number of Ethernet A-D Routes
   11         Determining Reachability to Unicast MAC Addresses
   11.1       Local Learning
   11.2       Remote learning
   11.2.1     Constructing the BGP E-VPN MAC Address Advertisement
   12         Optimizing ARP
   13         Designated Forwarder Election
   13.1       DF Election Performed by All MESes
   13.2       DF Election Performed Only on Multi-Homed MESes
   14         Handling of Multi-Destination Traffic
   14.1       Construction of the Inclusive Multicast Ethernet Tag Route
   14.2       P-Tunnel Identification
   14.3       Ethernet Segment Identifier and Ethernet Tag
   15         Processing of Unknown Unicast Packets
   15.1       Ingress Replication
   15.2       P2MP MPLS LSPs
   16         Forwarding Unicast Packets
   16.1       Forwarding packets received from a CE
   16.2       Forwarding packets received from a remote MES
   16.2.1     Unknown Unicast Forwarding
   16.2.2     Known Unicast Forwarding
   17         Split Horizon
   17.1       ESI MPLS Label: Ingress Replication
   17.2       ESI MPLS Label: P2MP MPLS LSPs
   17.3       ESI MPLS Label: MP2MP LSPs
   18         Load Balancing of Unicast Packets
   18.1       Load balancing of traffic from an MES to remote CEs
   18.2       Load balancing of traffic between an MES and a local CE
   18.2.1     Data plane learning
   18.2.2     Control plane learning
   19         MAC Moves
   20         Multicast
   20.1       Ingress Replication
   20.2       P2MP LSPs
   20.3       MP2MP LSPs
   20.3.1     Inclusive Trees
   20.3.2     Selective Trees
   20.4       Explicit Tracking
   21         Convergence
   21.1       Transit Link and Node Failures between MESes
   21.2       MES Failures
   21.2.1     Local Repair
   21.3       MES to CE Network Failures
   22         LACP State Synchronization
   23         Acknowledgements
   24         References
   25         Author's Address

1. Specification of requirements

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

2. Contributors

   In addition to the authors listed above, the following individuals
   also contributed to this document.

   Quaizar Vohra
   Kireeti Kompella
   Apurva Mehta
   Juniper Networks

   Samer Salam
   Cisco

3. Introduction

   This document describes procedures for BGP MPLS based Ethernet VPNs
   (E-VPN). The procedures described here are intended to meet the
   requirements specified in [E-VPN-REQ]. Please refer to [E-VPN-REQ]
   for the detailed requirements and motivation.

   This document proposes an MPLS based technology, referred to as
   MPLS-based E-VPN (E-VPN). E-VPN requires extensions to existing
   IP/MPLS protocols as described in this document. In addition to
   these extensions E-VPN uses several building blocks from existing
   MPLS technologies.

4. Terminology

   CE: Customer Edge device e.g., host or router or switch
   MES: MPLS Edge Switch
   EVI: E-VPN Instance
   ESI: Ethernet segment identifier
   LACP: Link Aggregation Control Protocol
   MP2MP: Multipoint to Multipoint
   P2MP: Point to Multipoint
   P2P: Point to Point

5. BGP MPLS Based E-VPN Overview

   This section provides an overview of E-VPN.

   An E-VPN comprises CEs that are connected to PEs, or MPLS Edge
   Switches (MES), that form the edge of the MPLS infrastructure. A CE
   may be a host, a router or a switch. The MPLS Edge Switches provide
   layer 2 virtual bridge connectivity between the CEs.
   There may be multiple E-VPNs in the provider's network. An E-VPN
   routing and forwarding instance on an MES is referred to as an E-VPN
   Instance (EVI).

   The MESes may be connected by an MPLS LSP infrastructure which
   provides the benefits of MPLS LSP technology such as fast-reroute,
   resiliency, etc. The MESes may also be connected by an IP
   infrastructure in which case IP/GRE tunneling is used between the
   MESes. The detailed procedures in this version of this document are
   specified only for MPLS LSPs as the tunneling technology. However
   these procedures are designed to be extensible to IP/GRE as the
   tunneling technology.

   In an E-VPN, MAC learning between MESes occurs not in the data plane
   (as happens with traditional bridging) but in the control plane.
   Control plane learning offers greater control over the MAC learning
   process, such as restricting who learns what, and the ability to
   apply policies. Furthermore, the control plane chosen for
   advertising MAC reachability information is multi-protocol (MP) BGP
   (very similar to IP VPNs (RFC 4364)), providing greater scale, and
   the ability to preserve the "virtualization" or isolation of groups
   of interacting agents (hosts, servers, Virtual Machines) from each
   other. In E-VPNs MESes advertise the MAC addresses learned from the
   CEs that are connected to them, along with an MPLS label, to other
   MESes in the control plane using MP-BGP. Control plane learning
   enables load balancing of traffic to and from CEs that are
   multi-homed to multiple MESes. This is in addition to load balancing
   across the MPLS core via multiple LSPs between the same pair of
   MESes. In other words it allows CEs to connect to multiple active
   points of attachment. It also improves convergence times in the
   event of certain network failures.

   However, learning between MESes and CEs is done by the method best
   suited to the CE: data plane learning, IEEE 802.1x, LLDP, 802.1aq,
   ARP, management plane or other protocols.

   It is a local decision as to whether the Layer 2 forwarding table on
   an MES is populated with all the MAC destinations known to the
   control plane or whether the MES implements a cache based scheme.
   For instance the MAC forwarding table may be populated only with the
   MAC destinations of the active flows transiting a specific MES.

   The policy attributes of an E-VPN are very similar to those of an IP
   VPN. An E-VPN instance requires a Route-Distinguisher (RD) and an
   E-VPN requires one or more Route-Targets (RTs). A CE attaches to an
   E-VPN instance (EVI) on an MES, on an Ethernet interface which may
   be configured for one or more Ethernet Tags, e.g., VLANs. Some
   deployment scenarios guarantee uniqueness of VLANs across E-VPNs:
   all points of attachment of a given E-VPN use the same VLAN, and no
   other E-VPN uses this VLAN. This document refers to this case as a
   "Unique Single VLAN E-VPN" and describes simplified procedures to
   optimize for it.

6. Ethernet Segment Identifier

   If a CE is multi-homed to two or more MESes, the set of Ethernet
   links constitutes an "Ethernet segment". An Ethernet segment may
   appear to the CE as a Link Aggregation Group (LAG). Ethernet
   segments have an identifier, called the "Ethernet Segment
   Identifier" (ESI), which is encoded as a ten-octet integer. A
   single-homed CE is considered to be attached to an Ethernet segment
   with ESI 0. Otherwise, an Ethernet segment MUST have a unique
   non-zero ESI. The ESI can be assigned using various mechanisms:

   1. The ESI may be configured. For instance when E-VPNs are used to
      provide a VPLS service the ESI is fairly analogous to the
      Multi-homing site ID in [BGP-VPLS-MH].

   2. If IEEE 802.1AX LACP is used between the MESes and CEs, then the
      ESI is determined from LACP by concatenating the following
      parameters:

      + CE LACP System Identifier comprised of two bytes of System
        Priority and six bytes of System MAC address, where the System
        Priority is encoded in the most significant two bytes. The CE
        LACP identifier MUST be encoded in the high order eight bytes
        of the ESI.

      + CE LACP two byte Port Key. The CE LACP port key MUST be encoded
        in the low order two bytes of the ESI.

      As far as the CE is concerned it would treat the multiple MESes
      that it is connected to as the same switch. This allows the CE to
      aggregate links that are attached to different MESes in the same
      bundle.

   3. If LLDP is used between the MESes and CEs that are hosts, then
      the ESI is determined by LLDP. The ESI will be specified in a
      following version.

   4. In the case of indirectly connected hosts via a bridged LAN
      between the CEs and the MESes, the ESI is determined based on the
      Layer 2 bridge protocol as follows: If STP is used in the bridged
      LAN then the value of the ESI is derived by listening to BPDUs on
      the Ethernet segment. To achieve this the MES is not required to
      run STP. However the MES must learn the Switch ID, MSTP ID and
      Root Bridge ID by listening to STP BPDUs. The ESI is constructed
      as follows:

      {Switch ID (6 bits), MSTP ID (6 bits), Root Bridge ID (48 bits)}

7. BGP E-VPN NLRI

   This document defines a new BGP NLRI, called the E-VPN NLRI.

   Following is the format of the E-VPN NLRI:

      +-----------------------------------+
      |       Route Type (1 octet)        |
      +-----------------------------------+
      |         Length (1 octet)          |
      +-----------------------------------+
      |  Route Type specific (variable)   |
      +-----------------------------------+

   The Route Type field defines encoding of the rest of E-VPN NLRI
   (Route Type specific E-VPN NLRI).

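   As a non-normative illustration, the framing shown in the format
   above (a one-octet Route Type, a one-octet Length, then the Route
   Type specific bytes) can be sketched in Python. The function names
   and the error handling are assumptions introduced for this example
   and are not defined by this document.

```python
import struct
from typing import Tuple

# Route Type values taken from the list in this document.
ROUTE_TYPE_ETHERNET_AD = 1
ROUTE_TYPE_MAC_ADVERTISEMENT = 2
ROUTE_TYPE_INCLUSIVE_MULTICAST = 3


def encode_evpn_nlri(route_type: int, route_specific: bytes) -> bytes:
    """Prefix the route-type-specific bytes with Route Type and Length."""
    if not 0 <= route_type <= 0xFF:
        raise ValueError("Route Type must fit in one octet")
    if len(route_specific) > 0xFF:
        raise ValueError("Route Type specific field exceeds one-octet Length")
    # "!BB" packs two unsigned octets in network byte order.
    return struct.pack("!BB", route_type, len(route_specific)) + route_specific


def decode_evpn_nlri(data: bytes) -> Tuple[int, bytes, bytes]:
    """Return (route_type, route_specific, remaining_bytes)."""
    if len(data) < 2:
        raise ValueError("truncated E-VPN NLRI header")
    route_type, length = struct.unpack("!BB", data[:2])
    end = 2 + length
    if len(data) < end:
        raise ValueError("truncated Route Type specific field")
    return route_type, data[2:end], data[end:]
```

   Under these assumptions, a 25-octet Route Type specific field is
   framed with a Length octet of 0x19, and the decoder hands back any
   bytes that follow the NLRI so that several NLRIs can be parsed in
   sequence.
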
   The Length field indicates the length in octets of the Route Type
   specific field of E-VPN NLRI.

   This document defines the following Route Types:

   + 1 - Ethernet Auto-Discovery (A-D) route
   + 2 - MAC advertisement route
   + 3 - Inclusive Multicast Ethernet Tag Route
   + 5 - Selective Multicast Auto-Discovery (A-D) Route
   + 6 - Leaf Auto-Discovery (A-D) Route

   The detailed encoding and procedures for these route types are
   described in subsequent sections.

   The E-VPN NLRI is carried in BGP [RFC4271] using BGP Multiprotocol
   Extensions [RFC4760] with an AFI of TBD and an SAFI of E-VPN (To be
   assigned by IANA). The NLRI field in the
   MP_REACH_NLRI/MP_UNREACH_NLRI attribute contains the E-VPN NLRI
   (encoded as specified above).

   In order for two BGP speakers to exchange labeled E-VPN NLRI, they
   must use BGP Capabilities Advertisement to ensure that they both are
   capable of properly processing such NLRI. This is done as specified
   in [RFC4760], by using capability code 1 (multiprotocol BGP) with an
   AFI of TBD and an SAFI of E-VPN.

7.1. Ethernet Auto-Discovery Route

   An Ethernet A-D route type specific E-VPN NLRI consists of the
   following:

      +---------------------------------------+
      |             RD (8 octets)             |
      +---------------------------------------+
      |Ethernet Segment Identifier (10 octets)|
      +---------------------------------------+
      |      Ethernet Tag ID (4 octets)       |
      +---------------------------------------+
      |         MPLS Label (3 octets)         |
      +---------------------------------------+

   For procedures and usage of this route please see the sections on
   "Auto-Discovery of Ethernet Tags on Ethernet Segments", "Designated
   Forwarder Election" and "Load Balancing of Unicast Packets".

7.2. MAC Advertisement Route

   A MAC advertisement route type specific E-VPN NLRI consists of the
   following:

      +---------------------------------------+
      |             RD (8 octets)             |
      +---------------------------------------+
      |Ethernet Segment Identifier (10 octets)|
      +---------------------------------------+
      |      Ethernet Tag ID (4 octets)       |
      +---------------------------------------+
      |     MAC Address Length (1 octet)      |
      +---------------------------------------+
      |        MAC Address (6 octets)         |
      +---------------------------------------+
      |      IP Address Length (1 octet)      |
      +---------------------------------------+
      |      IP Address (4 or 16 octets)      |
      +---------------------------------------+
      |       MPLS Label (n * 3 octets)       |
      +---------------------------------------+

   For procedures and usage of this route please see the sections on
   "Determining Reachability to Unicast MAC Addresses" and "Load
   Balancing of Unicast Packets".

7.3. Inclusive Multicast Ethernet Tag Route

   An Inclusive Multicast Ethernet Tag route type specific E-VPN NLRI
   consists of the following:

      +---------------------------------------+
      |             RD (8 octets)             |
      +---------------------------------------+
      |Ethernet Segment Identifier (10 octets)|
      +---------------------------------------+
      |      Ethernet Tag ID (4 octets)       |
      +---------------------------------------+
      |     Originating Router's IP Addr      |
      |           (4 or 16 octets)            |
      +---------------------------------------+

   For procedures and usage of this route please see the sections on
   "Handling of Multi-Destination Traffic", "Processing of Unknown
   Unicast Packets" and "Multicast".

8. ESI MPLS Label Extended Community

   This extended community is a new transitive extended community. It
   may be advertised along with Ethernet Auto-Discovery routes. When
   used it carries properties associated with the ESI. Specifically it
   enables split horizon procedures for multi-homed sites.
   The procedures for using this Extended Community are described in
   following sections.

   Each ESI MPLS Label Extended Community is encoded as an 8-octet
   value as follows:

    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     0x44      |   Sub-Type    |Flags (1 octet)|  Reserved=0   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Reserved=0   |                ESI MPLS label                 |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The low order bit of the flags octet is defined as the
   "Active-Standby" bit and may be set to 1. The other bits must be set
   to 0.

9. Auto-Discovery

   E-VPN requires the following types of auto-discovery procedures:

   + E-VPN Auto-Discovery, which allows an MES to discover the other
     MESes in the E-VPN. Each MES advertises one or more "Inclusive
     Multicast Ethernet Tag Routes". The procedures for advertising
     these routes are described in the section on "Handling of
     Multi-Destination Traffic".

   + Auto-Discovery of Ethernet Tags on Ethernet Segments, in a
     particular E-VPN. The procedures are described in section
     "Auto-Discovery of Ethernet Tags on Ethernet Segments".

   + Ethernet Segment Auto-Discovery used for auto-discovery of MESes
     that are multi-homed to the same Ethernet segment. The procedures
     are described in section "Auto-Discovery of Ethernet Tags on
     Ethernet Segments".

10. Auto-Discovery of Ethernet Tags on Ethernet Segments

   If a CE is multi-homed to two or more MESes on a particular Ethernet
   segment, each MES MUST advertise, to other MESes in the E-VPN, the
   information about the Ethernet Tags that are associated with that
   Ethernet segment. An Ethernet Tag identifies a particular broadcast
   domain. An example of an Ethernet Tag is a VLAN ID.
   The MES MAY advertise each Ethernet Tag associated with the Ethernet
   Segment, or it may advertise a wildcard to cover all the Ethernet
   Tags enabled on the segment. If a CE is single-homed, then the MES
   that it is attached to MAY advertise the information about Ethernet
   Tags (e.g., VLANs) on the Ethernet segment connected to the CE.

   The information about an Ethernet Tag on a particular Ethernet
   segment is advertised using an "Ethernet Auto-Discovery route
   (Ethernet A-D route)". This route is advertised using the E-VPN
   NLRI.

   The Ethernet Tag Auto-discovery information SHOULD be used to enable
   active-active load-balancing among MESes as described in section
   "Load Balancing of Unicast Packets". In the case of a multi-homed CE
   this route MUST also carry the "ESI MPLS Label Extended Community"
   to enable split horizon as described in section "Split Horizon".
   Also, the route can be used for Designated Forwarder (DF) election
   as described in section "Designated Forwarder Election". Further, it
   MAY be used to optimize the withdrawal of MAC addresses upon failure
   as described in section "Convergence".

   This section describes procedures for advertising one or more
   Ethernet A-D routes per Ethernet tag per E-VPN. We call this the
   "Ethernet A-D route per E-VPN". This section also describes
   procedures to advertise and withdraw a single Ethernet A-D route per
   Ethernet Segment. We call this the "Ethernet A-D route per Ethernet
   Segment".

10.1. Constructing the Ethernet A-D Route

   The format of the Ethernet A-D NLRI is specified in section "BGP
   E-VPN NLRI".

10.1.1. Ethernet A-D Route per E-VPN

   This section describes procedures to construct the Ethernet A-D
   route when one or more such routes are advertised by an MES for a
   given E-VPN instance.

   Route-Distinguisher (RD) MUST be set to the RD of the E-VPN instance
   that is advertising the NLRI.
   An RD MUST be assigned for a given E-VPN instance on an MES. This RD
   MUST be unique across all E-VPN instances on an MES. It is
   RECOMMENDED to use the Type 1 RD [RFC4364]. The value field
   comprises an IP address of the MES (typically, the loopback address)
   followed by a number unique to the MES. This number may be generated
   by the MES. Or in the Unique Single VLAN E-VPN case, the low order
   12 bits may be the 12 bit VLAN ID, with the remaining high order 4
   bits set to 0.

   Ethernet Segment Identifier MAY be set to 0. When it is not zero the
   Ethernet Segment Identifier MUST be a ten octet entity as described
   in section "Ethernet Segment Identifier".

   The Ethernet Tag ID is the identifier of an Ethernet Tag on the
   Ethernet segment. This value may be a 12 bit VLAN ID, in which case
   the low order 12 bits are set to the VLAN ID and the high order 20
   bits are set to 0. Or it may be another Ethernet Tag used by the
   E-VPN. It MAY be set to the default Ethernet Tag on the Ethernet
   segment or 0.

   Note that the above allows the Ethernet A-D route to be advertised
   with one of the following granularities:

   + One Ethernet A-D route for a given {ESI, Ethernet Tag ID} tuple
     per E-VPN.

   + One Ethernet A-D route for a given Ethernet Tag ID in a given
     E-VPN, for all associated Ethernet segments, where the ESI is set
     to 0.

   + One Ethernet A-D route for the E-VPN where both ESI and Ethernet
     Tag ID are set to 0.

   E-VPN supports both the non-qualified and qualified learning models.
   When non-qualified learning is used, the Ethernet Tag Identifier
   specified in this section and in other places in this document MUST
   be set to the default Ethernet Tag, e.g., VLAN ID.
   When qualified learning is used, and the Ethernet Tags between MESes
   and CEs in the E-VPN are consistently assigned for a given broadcast
   domain, the Ethernet Tag Identifier MUST be set to the Ethernet Tag,
   e.g., VLAN ID, for the concerned broadcast domain between the
   advertising MES and the CE. When qualified learning is used, and the
   Ethernet Tags, e.g., VLAN IDs, between MESes and CEs in the E-VPN
   are not consistently assigned for a given broadcast domain, the
   Ethernet Tag Identifier, e.g., VLAN ID, MUST be set to a common
   E-VPN provider assigned tag that maps locally on the advertising MES
   to an Ethernet broadcast domain identifier such as a VLAN ID.

   The usage of the MPLS label is described in the section on "Load
   Balancing of Unicast Packets".

   The Next Hop field of the MP_REACH_NLRI attribute of the route MUST
   be set to the IPv4 or IPv6 address of the advertising MES.

10.1.1.1. Ethernet A-D Route Targets

   The Ethernet A-D route MUST carry one or more Route Target (RT)
   attributes. RTs may be configured (as in IP VPNs), or may be derived
   automatically.

   If an MES uses Route Target Constrain [RT-CONSTRAIN], the MES SHOULD
   advertise all such RTs using Route Target Constrains. The use of RT
   Constrains allows each Ethernet A-D route to reach only those MESes
   that are configured to import at least one RT from the set of RTs
   carried in the Ethernet A-D route.

10.1.1.1.1. Auto-Derivation from the Ethernet Tag ID

   The following is the procedure for deriving the RT attribute
   automatically from the Ethernet Tag ID associated with the
   advertisement:

   + The Global Administrator field of the RT MUST be set to the
     Autonomous System (AS) number that the MES belongs to.

   + The Local Administrator field of the RT contains a 4-octet number
     that encodes the Ethernet Tag-ID.
     If the Ethernet Tag-ID is a two octet VLAN ID then it MUST be
     encoded in the lower two octets of the Local Administrator field
     and the higher two octets MUST be set to zero.

   For the "Unique Single VLAN E-VPN" this results in auto-deriving the
   RT from the Ethernet Tag, e.g., VLAN ID for that E-VPN.

10.1.2. Ethernet A-D Route per Ethernet Segment

   This section describes procedures to construct the Ethernet A-D
   route when a single such route is advertised by an MES for a given
   Ethernet Segment.

   Route-Distinguisher (RD) MUST be a Type 1 RD [RFC4364]. The value
   field comprises an IP address of the MES (typically, the loopback
   address) followed by 0. The reason for such encoding is that the RD
   cannot be that of a given E-VPN since the ESI can span across one or
   more E-VPNs.

   Ethernet Segment Identifier MUST be a non-zero ten octet entity as
   described in section "Ethernet Segment Identifier".

   The Ethernet Tag ID MUST be set to 0.

   If the Ethernet Segment is connected to more than one MES then the
   "ESI MPLS Label Extended Community" MUST be included in the route.

   If the Ethernet Segment is connected to more than one MES and
   active-active multi-homing is desired then the MPLS label in the ESI
   MPLS Label Extended Community MUST be set to a valid MPLS label
   value. The MPLS label in this Extended Community is referred to as
   an "ESI label". This label MUST be a downstream assigned MPLS label
   if the advertising MES is using ingress replication for receiving
   multicast, broadcast or unknown unicast traffic, from other MESes.
   If the advertising MES is using P2MP MPLS LSPs for sending
   multicast, broadcast or unknown unicast traffic, then this label
   MUST be an upstream assigned MPLS label. The usage of this label is
   described in section "Split Horizon".

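   The RD, ESI and Ethernet Tag ID rules given above for this route can
   be sketched, non-normatively, in Python. The sketch assumes an IPv4
   MES address, the Type 1 RD layout of [RFC4364] (two-octet type,
   four-octet IPv4 address, two-octet number), and the field order
   from section "Ethernet Auto-Discovery Route". The function names,
   the example address and ESI used with it, and the treatment of the
   MPLS Label field as three opaque octets are assumptions made for
   illustration only.

```python
import ipaddress
import struct


def type1_rd(mes_ipv4: str, assigned_number: int) -> bytes:
    """Type 1 RD: 2-octet type (1), 4-octet IPv4 address, 2-octet number."""
    addr = ipaddress.IPv4Address(mes_ipv4).packed
    return struct.pack("!H4sH", 1, addr, assigned_number)


def per_segment_ad_route(mes_ipv4: str, esi: bytes,
                         mpls_label_field: bytes = b"\x00\x00\x00") -> bytes:
    """Route Type specific bytes of an Ethernet A-D route per Ethernet Segment.

    mpls_label_field is passed through as three raw octets (assumption:
    the caller has already encoded the label); the default is all zero.
    """
    if len(esi) != 10 or esi == bytes(10):
        raise ValueError("ESI must be a non-zero ten-octet value")
    if len(mpls_label_field) != 3:
        raise ValueError("MPLS Label field is three octets")
    rd = type1_rd(mes_ipv4, 0)              # MES IP address followed by 0
    ethernet_tag_id = struct.pack("!I", 0)  # Ethernet Tag ID MUST be 0
    return rd + esi + ethernet_tag_id + mpls_label_field
```

   For a hypothetical MES loopback of 192.0.2.1, the result is 25
   octets long: 8 for the RD, 10 for the ESI, 4 for the Ethernet Tag ID
   and 3 for the MPLS Label.
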
620 If the Ethernet Segment is connected to more than one MES and active- 621 standby multi-homing is desired then the "Active-Standby" bit in the 622 flags of the ESI MPLS Label Extended Community MUST be set to 1. 624 If the per Ethernet Segment Ethernet A-D route is used in conjunction 625 with the per {ESI, VLAN} Ethernet A-D route, for reasons described 626 below, then the MPLS label in the NLRI MUST be set to 0. 628 10.1.2.1. Ethernet A-D Route Targets 630 The Ethernet A-D route MUST carry one or more Route Target (RT) 631 attributes. These RTs MUST be the set of RTs associated with all the 632 E-VPN instances to which the Ethernet Segment, corresponding to the 633 Ethernet A-D route, belongs. 635 10.2. Motivations for Ethernet A-D Route per Ethernet Segment 637 This section describes various scenarios in which the Ethernet A-D 638 route should be advertised per Ethernet Segment. 640 10.2.1. Multi-Homing 642 The per Ethernet Segment Ethernet A-D route MUST be advertised when 643 the Ethernet Segment is multi-homed. This allows Multi-Homed Ethernet 644 Segment Auto-Discovery. It allows the set of MESes connected to the 645 same customer site i.e., CE, to discover each other automatically 646 with minimal to no configuration. It also allows other MESes that 647 have at least one E-VPN in common with the multi-homed Ethernet 648 Segment to discover the properties of the multi-homed Ethernet 649 Segment. 651 For active-active multi-homing this route is required for split 652 horizon procedures as described in section "Split Horizon" and MUST 653 carry the ESI MPLS Label Extended Community with a valid ESI MPLS 654 label. For active-standby multi-homing this route is required to 655 indicate that active-standby multi-homing and not active-active 656 multi-homing is desired. 658 This route will be enhanced to carry LAG specific information such as 659 LACP parameters, which will be encoded as new BGP attributes or 660 communities, in the future. 
Note that this information will be 661 propagated to all MESes that have one or more sites in the VLANs 662 connected to the Ethernet Segment. All the MESes other than the ones 663 that are connected to the MESes will discard this information. 665 10.2.2. Optimizing Control Plane Convergence 667 Ethernet A-D route per Ethernet Segment should be advertised when it 668 is desired to optimize the control plane convergence of the 669 withdrawal of the Ethernet A-D routes. If this is done then when an 670 Ethernet segment fails, the single Ethernet A-D route corresponding 671 to the segment can be withdrawn first. This allows all MESes that 672 receive this withdrawal to invalidate the MAC routes learned from the 673 Ethernet segment. 675 Note that the Ethernet A-D route per Ethernet Segment, when used to 676 optimize control plane convergence, MAY be advertised in addition to 677 the Ethernet Tag A-D routes per E-VPN or MAY be advertised on its 678 own. 680 10.2.3. Reducing Number of Ethernet A-D Routes 682 In certain scenarios advertising Ethernet A-D routes per Ethernet 683 segment, instead of per E-VPN, may reduce the number of Ethernet A-D 684 routes in the network. In these scenarios Ethernet A-D routes may be 685 advertised per Ethernet segment instead of per E-VPN. 687 11. Determining Reachability to Unicast MAC Addresses 689 MESes forward packets that they receive based on the destination MAC 690 address. This implies that MESes must be able to learn how to reach a 691 given destination unicast MAC address. 693 There are two components to MAC address learning, "local learning" 694 and "remote learning": 696 11.1. Local Learning 698 A particular MES must be able to learn the MAC addresses from the CEs 699 that are connected to it. This is referred to as local learning. 701 The MESes in a particular E-VPN MUST support local data plane 702 learning using standard IEEE Ethernet learning procedures. 
An MES 703 must be capable of learning MAC addresses in the data plane when it 704 receives packets such as the following from the CE network: 706 - DHCP requests 708 - gratuitous ARP request for its own MAC. 710 - ARP request for a peer. 712 Alternatively MESes MAY learn the MAC addresses of the CEs in the 713 control plane or via management plane integration between the MESes 714 and the CEs. 716 There are applications where a MAC address that is reachable via a 717 given MES on a locally attached Segment (e.g. with ESI X) may move 718 such that it becomes reachable via the same MES or another MES on 719 another Segment (e.g. with ESI Y). This is referred to as a "MAC 720 Move". Procedures to support this are described in section "MAC 721 Moves". 723 11.2. Remote learning 725 A particular MES must be able to determine how to send traffic to MAC 726 addresses that belong to or are behind CEs connected to other MESes 727 i.e. to remote CEs or hosts behind remote CEs. We call such MAC 728 addresses as "remote" MAC addresses. 730 This document requires an MES to learn remote MAC addresses in the 731 control plane. In order to achieve this each MES advertises the MAC 732 addresses it learns from its locally attached CEs in the control 733 plane, to all the other MESes in the E-VPN, using MP-BGP and the MAC 734 address advertisement route. 736 11.2.1. Constructing the BGP E-VPN MAC Address Advertisement 738 BGP is extended to advertise these MAC addresses using the MAC 739 advertisement route type in the E-VPN-NLRI. 741 The RD MUST be the RD of the E-VPN instance that is advertising the 742 NLRI. The procedures for setting the RD for a given E-VPN are 743 described in section "Ethernet A-D Route per E-VPN". 745 The Ethernet Segment Identifier is set to the ten octet ESI 746 identifier described in section "Ethernet Segment Identifier". 748 The Ethernet Tag ID may be zero or may represent a valid Ethernet Tag 749 ID. 
This field may be non-zero when there are multiple bridge 750 domains in the E-VPN instance (e.g., the MES needs to perform 751 qualified learning for the VLANs in that E-VPN instance). 753 When the Ethernet Tag ID in the NLRI is set to a non-zero value, 754 for a particular bridge domain, then this Ethernet Tag may either be 755 the Ethernet tag value associated with the CE, e.g., VLAN ID, or it 756 may be the Ethernet Tag Identifier, e.g., VLAN ID assigned by the E- 757 VPN provider and mapped to the CE's Ethernet tag. The latter would be 758 the case if the CE Ethernet tags, e.g., VLAN ID, for a particular 759 bridge domain are different on different CEs. 761 The MAC address length field is typically set to 48. However this 762 specification enables specifying the MAC address as a prefix in which 763 case the MAC address length field is set to the length of the prefix. 764 This provides the ability to aggregate MAC addresses if the 765 deployment environment supports that. The encoding of a MAC address 766 MUST be the 6-octet MAC address specified by IEEE 802 documents 767 [802.1D-ORIG] [802.1D-REV]. If the MAC address is advertised as a 768 prefix then the trailing bits of the prefix MUST be set to 0 to 769 ensure that the entire prefix is encoded as 6 octets. 771 The MPLS Label Length field value is set to the number of octets in 772 the MPLS Label field. The MPLS label field carries one or more labels 773 (that correspond to the stack of labels [MPLS-ENCAPS]). Each label 774 is encoded as 3 octets, where the high-order 20 bits contain the 775 label value, and the low order bit contains "Bottom of Stack" (as 776 defined in [MPLS-ENCAPS]). 778 The MPLS label stack MUST be the downstream assigned E-VPN MPLS label 779 stack that is used by the MES to forward MPLS encapsulated Ethernet 780 packets received from remote MESes, where the destination MAC address 781 in the Ethernet packet is the MAC address advertised in the above 782 NLRI.
The forwarding procedures are specified in sections "Forwarding 783 Unicast Packets" and "Load Balancing of Unicast Packets". 785 An MES may advertise the same single E-VPN label for all MAC 786 addresses in a given E-VPN instance. This label assignment 787 methodology is referred to as a per EVI label assignment. 788 Alternatively an MES may advertise a unique E-VPN label per {ESI, Ethernet Tag} combination. This label assignment methodology is 790 referred to as a per {ESI, Ethernet Tag} label assignment. Or an MES 791 may advertise a unique E-VPN label per MAC address. All of these 792 methodologies have their tradeoffs. 794 Per EVI label assignment requires the least number of E-VPN labels, 795 but requires a MAC lookup in addition to an MPLS lookup on an egress 796 MES for forwarding. On the other hand a unique label per {ESI, Ethernet Tag} or a unique label per MAC allows an egress MES to 798 forward a packet that it receives from another MES, to the connected 799 CE, after looking up only the MPLS labels and not having to do a MAC 800 lookup, 802 as well as to insert the appropriate VLAN ID on egress to the CE. 804 An MES may also advertise more than one label for a given MAC address. 805 For instance an MES may advertise two labels, one of which is for the 806 ESI corresponding to the MAC address and the second is for the 807 Ethernet Tag on the ESI that the MAC address is learnt on. 809 The IP Address field is optional. By default the IP Address length is 810 set to 0 and the IP address is excluded. When a valid IP address is 811 included it is encoded as specified in the section "Optimizing ARP". 813 The Next Hop field of the MP_REACH_NLRI attribute of the route MUST 814 be set to the IPv4 or IPv6 address of the advertising MES. 816 The BGP advertisement that advertises the MAC advertisement route 817 MUST also carry one or more Route Target (RT) attributes.
RTs may be 818 configured (as in IP VPNs), or may be derived automatically from the 819 Ethernet Tag ID, in the Unique Single VLAN case as described in 820 section "Ethernet A-D Route per E-VPN". 822 It is to be noted that this document does not require MESes to create 823 forwarding state for remote MACs when they are learnt in the control 824 plane. When this forwarding state is actually created is a local 825 implementation matter. 827 12. Optimizing ARP 829 The IP address field in the MAC advertisement route may optionally 830 carry one of the IP addresses associated with the MAC address. This 831 provides an option which can be used to minimize the flooding of ARP 832 messages to MAC VPN CEs and to MESes. This option also minimizes ARP 833 message processing on MAC VPN CEs. A MES may learn the IP address 834 associated with a MAC address in the control or management plane 835 between the CE and the MES. Or it may learn this binding by snooping 836 certain messages to or from a CE. When a MES learns the IP address 837 associated with a MAC address, of a locally connected CE, it may 838 advertise it to other MESes by including it in the MAC route 839 advertisement. The IP Address may be an IPv4, encoded using four 840 octets or an IPv6 address encoded using sixteen octets. The IP 841 Address length field MUST be set to 32 for an IPv4 address and 128 842 for an IPv6 address. 844 If there are multiple IP addresses associated with a MAC address then 845 multiple MAC advertisement routes MUST be generated, one for each IP 846 address. For instance this may be the case when there is both an IPv4 847 and an IPv6 address associated with the MAC address. When the IP 848 address is dis-associated with the MAC address then the MAC 849 advertisement route with that particular IP address MUST be 850 withdrawn. 
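The per-address rule above, together with the 3-octet label encoding given for the MAC advertisement route, can be sketched as follows. This is an illustrative, non-normative sketch. `MacRoute`, `encode_label`, and `mac_routes` are hypothetical names, and the record holds only the fields discussed here rather than the full NLRI.

```python
import ipaddress
from typing import List, NamedTuple, Sequence


class MacRoute(NamedTuple):
    """Subset of MAC advertisement route fields used in this sketch."""
    mac: bytes          # 6-octet MAC address
    mac_length: int     # 48 for a full MAC address
    ip_length: int      # 0, 32 (IPv4) or 128 (IPv6)
    ip: bytes           # empty, 4 octets, or 16 octets


def encode_label(label: int, bottom_of_stack: bool) -> bytes:
    """Encode one MPLS label as 3 octets: the high-order 20 bits carry the
    label value and the low-order bit carries Bottom of Stack."""
    if not 0 <= label < (1 << 20):
        raise ValueError("MPLS label must fit in 20 bits")
    value = (label << 4) | (1 if bottom_of_stack else 0)
    return value.to_bytes(3, "big")


def mac_routes(mac: str, ips: Sequence[str]) -> List[MacRoute]:
    """Return one route per associated IP address, or a single route with
    IP Address length 0 when no IP address is known."""
    mac_bytes = bytes.fromhex(mac.replace(":", ""))
    if len(mac_bytes) != 6:
        raise ValueError("MAC address must be 6 octets")
    if not ips:
        return [MacRoute(mac_bytes, 48, 0, b"")]
    routes = []
    for text in ips:
        address = ipaddress.ip_address(text)
        # max_prefixlen is 32 for IPv4 and 128 for IPv6.
        routes.append(
            MacRoute(mac_bytes, 48, address.max_prefixlen, address.packed))
    return routes
```

When an IP address is later dis-associated from the MAC address, the corresponding element of the returned list identifies the route to withdraw.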
852 When an MES receives an ARP request for an IP address from a CE, and 853 if the MES has the MAC address binding for that IP address, the MES 854 should perform ARP proxy and respond to the ARP request. 856 Further detailed procedures will be specified in a later version. 858 13. Designated Forwarder Election 860 Consider a CE that is a host or a router that is multi-homed directly 861 to more than one MES in an E-VPN on a given Ethernet segment. One or 862 more Ethernet Tags may be configured on the Ethernet segment. In this 863 scenario only one of the MESes, referred to as the Designated 864 Forwarder (DF), is responsible for certain actions: 866 - Sending multicast and broadcast traffic, on a given Ethernet 867 Tag on a particular Ethernet segment, to the CE. Note that 868 this behavior, which allows selecting a DF at the 869 granularity of {ESI, Ethernet Tag} for multicast and 870 broadcast traffic, is the default behavior in this 871 specification. Optional mechanisms, which will be 872 specified in the future, will allow selecting a DF 873 at a different granularity. 875 - Flooding unknown unicast traffic (i.e. traffic for 876 which an MES does not know the destination MAC address), 877 on a given Ethernet Tag on a particular Ethernet segment 878 to the CE, if the environment requires flooding of 879 unknown unicast traffic. 881 Note that a CE always sends packets belonging to a specific flow 882 using a single link towards an MES. For instance, if the CE is a host 883 then, as mentioned earlier, the host treats the multiple links that 884 it uses to reach the MESes as a Link Aggregation Group (LAG). The CE 885 employs a local hashing function to map traffic flows onto links in 886 the LAG. 888 If a bridge network is multi-homed to more than one MES in an E-VPN 889 via switches, then the support of active-active points of attachments 890 as described in this specification requires the bridge network to be 891 connected to two or more MESes using a LAG.
In this case the reasons 892 for doing DF election are the same as those described above when a CE 893 is a host or a router. 895 If a bridge network does not connect to the MESes using LAG, then 896 only one of the links between the switched bridged network and the 897 MESes must be the active link. In this case the per Ethernet Segment 898 Ethernet Tag routes MUST be advertised with the "Active-Standby" flag 899 set to one. Procedures for supporting active-active points of 900 attachments, when a bridge network does not connect to the MESes 901 using LAG, are for further study. 903 The granularity of the DF election MUST be at least the Ethernet 904 segment via which the CE is multi-homed to the MESes. If the DF 905 election is done at the Ethernet segment granularity then a single 906 MES MUST be elected as the DF on the Ethernet segment. 908 If there are one or more Ethernet Tags (e.g., VLANs) on the Ethernet 909 segment then the granularity of the DF election SHOULD be the 910 combination of the Ethernet segment and Ethernet Tag on that Ethernet 911 segment. In this case a single MES MUST be elected as the DF for a 912 particular Ethernet Tag on that Ethernet segment. 914 There are two specified mechanisms for performing DF election. 916 13.1. DF Election Performed by All MESes 918 The MESes perform a designated forwarder (DF) election, for an 919 Ethernet segment, or {ESI, Ethernet Tag} combination using the 920 Ethernet Tag A-D BGP route described in section "Auto-Discovery of 921 Ethernet Tags on Ethernet Segments". 923 The DF election for a particular ESI or a particular {ESI, Ethernet Tag} combination proceeds as follows. First an MES constructs a 925 candidate list of MESes. This comprises all the Ethernet A-D routes 926 with that particular ESI or {ESI, Ethernet Tag} tuple that an MES 927 imports in an E-VPN instance, including the Ethernet A-D route(s) 928 generated by the MES itself, if any. The DF MES is chosen from this 929 candidate list.
Note that DF election is carried out by all the MESes 930 that import the DF route. 932 The default procedure is to choose as the DF the MES with the highest 933 IP address among all the MESes in the candidate list. This procedure 934 MUST be implemented. It ensures that, except during routing 935 transients, each MES chooses the same DF MES for a given ESI and 936 Ethernet Tag combination. 938 Other alternative procedures for performing DF election are possible 939 and will be described in the future. 941 13.2. DF Election Performed Only on Multi-Homed MESes 943 As an MES discovers other MESes that are members of the same multi- 944 homed segment, using per Ethernet Segment Ethernet A-D Routes, it 945 starts building an ordered list based on the originating MES IP 946 addresses. This list is used to select a DF and a backup DF (BDF) on 947 a per Ethernet Tag group basis. For example, the MES with the 948 numerically highest IP address is considered the DF for a given group 949 of VLANs for that Ethernet segment and the next MES in the list is 950 considered the BDF. To that end, the range of Ethernet Tags 951 associated with the CE must be partitioned into disjoint sets. The 952 size of each set is a function of the total number of CE Ethernet 953 Tags and the total number of MESes that the Ethernet segment is multi- 954 homed to. The DF can employ any distribution function that achieves 955 an even distribution of Ethernet Tags across the MESes that are 956 multi-homed to the Ethernet segment. The DF takes over the Ethernet 957 Tag set of any MES encountering either a node failure or a 958 link/Ethernet segment failure causing that MES to be isolated from 959 the multi-homed segment. If a failure affects the 960 DF, the BDF takes over the DF's VLAN set.
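The two election styles in this section can be sketched as follows. This is a hedged illustration, not a normative algorithm: the function names are hypothetical, and the modulo-based partitioning is only one possible distribution function that spreads Ethernet Tags evenly over the ordered list.

```python
import ipaddress
from typing import Dict, Iterable, List, Optional, Tuple


def order_candidates(mes_ips: Iterable[str]) -> List[str]:
    """Order candidate MESes by IP address, numerically highest first."""
    return sorted(set(mes_ips),
                  key=lambda ip: int(ipaddress.ip_address(ip)),
                  reverse=True)


def elect_df(mes_ips: Iterable[str]) -> str:
    """Default procedure: the MES with the highest IP address is the DF."""
    return order_candidates(mes_ips)[0]


def assign_df_bdf(
    mes_ips: Iterable[str], ethernet_tags: Iterable[int]
) -> Dict[int, Tuple[str, Optional[str]]]:
    """Map each Ethernet Tag to a (DF, BDF) pair.

    The i-th tag (in sorted order) goes to list position i mod m, and the
    next MES in the ordered list, wrapping around, acts as the BDF.
    """
    ordered = order_candidates(mes_ips)
    m = len(ordered)
    assignment = {}
    for i, tag in enumerate(sorted(ethernet_tags)):
        df = ordered[i % m]
        bdf = ordered[(i + 1) % m] if m > 1 else None
        assignment[tag] = (df, bdf)
    return assignment
```

Because every MES sorts the same set of addresses, each one computes the same assignment locally, without exchanging the tag groups themselves.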
962 It should be noted that once all the MESes participating in an 963 Ethernet segment have the same ordered list for that site, then 964 Ethernet Tag groups can be assigned to each member of that list 965 deterministically without any need to explicitly distribute Ethernet 966 Tags among the member MESes of that list. In other words, the DF 967 election for a group of Ethernet Tags is a local matter and can be 968 done deterministically. As an example, consider that the ordered 969 list consists of m MESes (MES1, MES2, ..., MESm), and that there are n 970 Ethernet Tags for that site (V0, V1, V2, ..., Vn-1). Then MES1 and MES2 971 can be the DF and the BDF respectively for all the Ethernet Tags Vi 972 with (i mod m) = 0, for i from 0 to n-1. MES2 and MES3 can be the 973 DF and the BDF respectively for all the Ethernet Tags Vi 974 with (i mod m) = 1, and so on until the last MES in the ordered list is 975 reached. As a result MESm and MES1 are the DF and the BDF respectively 976 for all the Ethernet Tags Vi with (i mod m) = m-1. 978 14. Handling of Multi-Destination Traffic 980 Procedures are required for a given MES to send broadcast or 981 multicast traffic, received from a CE encapsulated in a given 982 Ethernet Tag in an E-VPN, to all the other MESes that span that 983 Ethernet Tag in the E-VPN. In certain scenarios, described in section 984 "Processing of Unknown Unicast Packets", a given MES may also need to 985 flood unknown unicast traffic to other MESes. 987 The MESes in a particular E-VPN may use ingress replication or P2MP 988 LSPs or MP2MP LSPs to send unknown unicast, broadcast or multicast 989 traffic to other MESes. 991 Each MES MUST advertise an "Inclusive Multicast Ethernet Tag Route" 992 to enable the above. The next section provides procedures to construct 993 the Inclusive Multicast Ethernet Tag route. Subsequent sections 994 describe its usage in further detail. 996 14.1.
Construction of the Inclusive Multicast Ethernet Tag Route 998 The RD MUST be the RD of the E-VPN instance that is advertising the 999 NLRI. The procedures for setting the RD for a given E-VPN are 1000 described in section "Ethernet A-D Route per E-VPN". 1002 The Ethernet Segment Identifier MAY be set to the ten octet ESI 1003 identifier described in section "Ethernet Segment Identifier". Or it 1004 MAY be set to 0. It MUST be set to 0 if the Ethernet Tag is set to 1005 0. 1007 The Ethernet Tag ID is the identifier of the Ethernet Tag. It MAY be 1008 set to 0 in which case an egress MES MUST perform a MAC lookup to 1009 forward the packet. 1011 The Originating Router's IP address MUST be set to an IP address of 1012 the PE. This address SHOULD be common for all the EVIs on the PE 1013 (e.g., this address may be the PE's loopback address). 1015 The Next Hop field of the MP_REACH_NLRI attribute of the route MUST 1016 be set to the same IP address as the one carried in the Originating 1017 Router's IP Address field. 1019 The BGP advertisement that advertises the Inclusive Multicast 1020 Ethernet Tag route MUST also carry one or more Route Target (RT) 1021 attributes. The assignment of RTs described in the section on 1022 "Constructing the BGP E-VPN MAC Address Advertisement" MUST be 1023 followed. 1025 14.2. P-Tunnel Identification 1027 In order to identify the P-Tunnel used for sending broadcast, unknown 1028 unicast or multicast traffic, the Inclusive Multicast Ethernet Tag 1029 route MUST carry a "PMSI Tunnel Attribute" specified in [BGP MVPN]. 1031 Depending on the technology used for the P-tunnel for the E-VPN on 1032 the PE, the PMSI Tunnel attribute of the Inclusive Multicast Ethernet 1033 Tag route is constructed as follows.
1035 + If the PE that originates the advertisement uses a P-Multicast 1036 tree for the P-tunnel for the E-VPN, the PMSI Tunnel attribute 1037 MUST contain the identity of the tree (note that the PE could 1038 create the identity of the tree prior to the actual instantiation 1039 of the tree). 1041 + A PE that uses a P-Multicast tree for the P-tunnel MAY aggregate 1042 two or more Ethernet Tags in the same or different E-VPNs present 1043 on the PE onto the same tree. In this case in addition to 1044 carrying the identity of the tree, the PMSI Tunnel attribute MUST 1045 carry an MPLS upstream assigned label which the PE has bound 1046 uniquely to the Ethernet Tag for the E-VPN associated with 1047 this update (as determined by its RTs). 1049 If the PE has already advertised Inclusive Multicast Ethernet Tag 1050 routes for two or more Ethernet Tags that it now desires to 1051 aggregate, then the PE MUST re-advertise those routes. The re- 1052 advertised routes MUST be the same as the original ones, except 1053 for the PMSI Tunnel attribute and the label carried in that 1054 attribute. 1056 + If the PE that originates the advertisement uses ingress 1057 replication for the P-tunnel for the E-VPN, the route MUST 1058 include the PMSI Tunnel attribute with the Tunnel Type set to 1059 Ingress Replication and Tunnel Identifier set to a routable 1060 address of the PE. The PMSI Tunnel attribute MUST carry a 1061 downstream assigned MPLS label. This label is used to demultiplex 1062 the broadcast, multicast or unknown unicast E-VPN traffic 1063 received over a unicast tunnel by the PE. 1065 + The Leaf Information Required flag of the PMSI Tunnel attribute 1066 MUST be set to zero, and MUST be ignored on receipt. 1068 14.3. Ethernet Segment Identifier and Ethernet Tag 1070 As described above the encoding rules allow setting the Ethernet 1071 Segment Identifier and Ethernet Tag to either non-zero valid values 1072 or to 0.
If the Ethernet Tag is set to a non-zero valid value, then 1073 an egress MES can forward the packet to the set of egress ESIs in the 1074 Ethernet Tag, in the E-VPN, by performing an MPLS lookup only. 1075 Further if the ESI is also set to non zero then the egress MES does 1076 not need to replicate the packet as it is destined for a given 1077 Ethernet segment. If both Ethernet Tag and ESI are set to 0 then an 1078 egress MES MUST perform a MAC lookup in the EVI determined by the 1079 MPLS label, after the MPLS lookup, to forward the packet. 1081 If an MES advertises multiple Inclusive Ethernet Tag routes for a 1082 given E-VPN then the PMSI Tunnel Attributes for these routes MUST be 1083 distinct. 1085 15. Processing of Unknown Unicast Packets 1087 The procedures in this document do not require MESes to flood unknown 1088 unicast traffic to other MESes. If MESes learn CE MAC addresses via a 1089 control plane, the MESes can then distribute MAC addresses via BGP, 1090 and all unicast MAC addresses will be learnt prior to traffic to 1091 those destinations. 1093 However, if a destination MAC address of a received packet is not 1094 known by the MES, the MES may have to flood the packet. Flooding must 1095 take into account "split horizon forwarding" as follows. The 1096 principles behind the following procedures are borrowed from the 1097 split horizon forwarding rules in VPLS solutions [RFC 4761, RFC 1098 4762]. When an MES capable of flooding (say MESx) receives a 1099 broadcast Ethernet frame, or one with an unknown destination MAC 1100 address, it must flood the frame. If the frame arrived from an 1101 attached CE, MESx must send a copy of the frame to every other 1102 attached CE, on a different ESI than the one it received the frame 1103 on, as well as to all other MESs participating in the E-VPN. If, on 1104 the other hand, the frame arrived from another MES (say MESy), MESx 1105 must send a copy of the packet only to attached CEs. 
MESx MUST NOT 1106 send the frame to other MESs, since MESy would have already done so. 1107 Split horizon forwarding rules apply to broadcast and multicast 1108 packets, as well as packets to an unknown MAC address. 1110 Whether or not to flood packets to unknown destination MAC addresses 1111 should be an administrative choice, depending on how learning happens 1112 between CEs and MESes. 1114 The MESes in a particular E-VPN may use ingress replication using 1115 RSVP-TE P2P LSPs or LDP MP2P LSPs for sending broadcast, multicast 1116 and unknown unicast traffic to other MESes. Or they may use RSVP-TE 1117 P2MP or LDP P2MP or LDP MP2MP LSPs for sending such traffic to other 1118 MESes. 1120 15.1. Ingress Replication 1122 If ingress replication is in use, the P-Tunnel attribute, carried in 1123 the Inclusive Multicast Ethernet Tag routes for the E-VPN, specifies 1124 the downstream label that the other MESes can use to send unknown 1125 unicast, multicast or broadcast traffic for the E-VPN to this 1126 particular MES. 1128 The MES that receives a packet with this particular MPLS label MUST 1129 treat the packet as a broadcast, multicast or unknown unicast packet. 1130 Further if the MAC address is a unicast MAC address, the MES MUST 1131 treat the packet as an unknown unicast packet. 1133 15.2. P2MP MPLS LSPs 1135 The procedures for using P2MP LSPs are very similar to VPLS 1136 procedures [VPLS-MCAST]. The P-Tunnel attribute used by an MES for 1137 sending unknown unicast, broadcast or multicast traffic for a 1138 particular Ethernet segment, is advertised in the Inclusive Ethernet 1139 Tag Multicast route as described in section "Handling of Multi- 1140 Destination Traffic". 1142 The P-Tunnel attribute specifies the P2MP LSP identifier. This is the 1143 equivalent of an Inclusive tree in [VPLS-MCAST]. Note that multiple 1144 Ethernet Tags, which may be in different E-VPNs, may use the same 1145 P2MP LSP, using upstream labels [VPLS-MCAST]. 
When P2MP LSPs are used 1146 for flooding unknown unicast traffic, packet re-ordering is possible. 1148 The MES that receives a packet on the P2MP LSP specified in the PMSI 1149 Tunnel Attribute MUST treat the packet as a broadcast, multicast or 1150 unknown unicast packet. Further if the MAC address is a unicast MAC 1151 address, the MES MUST treat the packet as an unknown unicast packet. 1153 16. Forwarding Unicast Packets 1155 16.1. Forwarding packets received from a CE 1157 When an MES receives a packet from a CE, on a given Ethernet Tag, it 1158 must first look up the source MAC address of the packet. In certain 1159 environments the source MAC address MAY be used to authenticate the 1160 CE and determine that traffic from the host can be allowed into the 1161 network. Source MAC lookup MAY also be used for local MAC address 1162 learning. 1164 If the MES decides to forward the packet the destination MAC address 1165 of the packet must be looked up. If the MES has received MAC address 1166 advertisements for this destination MAC address from one or more 1167 other MESes or learned it from locally connected CEs, it is 1168 considered a known MAC address. Otherwise the MAC address is considered 1169 an unknown MAC address. 1171 For known MAC addresses the MES forwards this packet to one of the 1172 remote MESes or to a locally attached CE. When forwarding to remote 1173 MESes, the packet is encapsulated in the E-VPN MPLS label advertised 1174 by the remote MES, for that MAC address, and in the MPLS LSP label 1175 stack to reach the remote MES. 1177 If the MAC address is unknown then, if the administrative policy on 1178 the MES requires flooding of unknown unicast traffic: 1179 - The MES MUST flood the packet to other MESes. If the ESI over 1181 which the MES receives the packet is multi-homed, then the MES MUST 1182 first encapsulate the packet in the ESI MPLS label as described in 1183 section "Split Horizon".
If ingress replication is used the packet 1184 MUST be replicated one or more times to each remote MES with the 1185 bottom label of the stack being an MPLS label determined as follows. 1186 This is the MPLS label advertised by the remote MES in a PMSI Tunnel 1187 Attribute in the Inclusive Multicast Ethernet Tag route for an {E-VPN, Ethernet Tag} combination. The Ethernet Tag in the route must be the 1189 same as the Ethernet Tag advertised by the ingress MES in its 1190 Ethernet Tag A-D route associated with the interface on which the 1191 ingress MES receives the packet. If P2MP LSPs are being used the 1192 packet MUST be sent on the P2MP LSP that the MES is the root of for 1193 the Ethernet Tag in the E-VPN. If the same P2MP LSP is used for all 1194 Ethernet Tags then all the MESes in the E-VPN MUST be the leaves of 1195 the P2MP LSP. If a distinct P2MP LSP is used for a given Ethernet Tag 1196 in the E-VPN then only the MESes in the Ethernet Tag MUST be the 1197 leaves of the P2MP LSP. The packet MUST be encapsulated in the P2MP 1198 LSP label stack. 1200 If the MAC address is unknown then, if the administrative policy on 1201 the MES does not allow flooding of unknown unicast traffic: 1202 - The MES MUST drop the packet. 1204 16.2. Forwarding packets received from a remote MES 1206 16.2.1. Unknown Unicast Forwarding 1208 When an MES receives an MPLS packet from a remote MES then, after 1209 processing the MPLS label stack, if the top MPLS label ends up being 1210 a P2MP LSP label associated with an E-VPN or the downstream label 1211 advertised in the P-Tunnel attribute and after performing the split 1212 horizon procedures described in section "Split Horizon": 1214 - If the MES is the designated forwarder of unknown unicast, 1215 broadcast or multicast traffic, on a particular set of ESIs for the 1216 Ethernet Tag, the default behavior is for the MES to flood the packet 1217 on the ESIs.
In other words the default behavior is for the MES to 1218 assume that the destination MAC address is unknown unicast, broadcast 1219 or multicast and it is not required to do a destination MAC address 1220 lookup, as long as the granularity of the MPLS label included the 1221 Ethernet Tag. As an option the MES may do a destination MAC lookup to 1222 flood the packet to only a subset of the CE interfaces in the 1223 Ethernet Tag. For instance the MES may decide to not flood an unknown 1224 unicast packet on certain Ethernet segments even if it is the DF on 1225 the Ethernet segment, based on administrative policy. 1227 - If the MES is not the designated forwarder on any of the ESIs 1229 for the Ethernet Tag, the default behavior is for it to drop the 1230 packet. 1232 16.2.2. Known Unicast Forwarding 1234 If the top MPLS label ends up being an E-VPN label that was 1235 advertised in the unicast MAC advertisements, then the MES either 1236 forwards the packet based on CE next-hop forwarding information 1237 associated with the label or does a destination MAC address lookup to 1238 forward the packet to a CE. 1240 17. Split Horizon 1242 Consider a CE that is multi-homed to two or more MESes on an Ethernet 1243 segment ES1. If the CE sends a multicast, broadcast or unknown 1244 unicast packet to a particular MES, say MES1, then MES1 will forward 1245 that packet to all or subset of the other MESes in the E-VPN. In this 1246 case the MESes, other than MES1, that the CE is multi-homed to MUST 1247 drop the packet and not forward back to the CE. This is referred to 1248 as "split horizon" in this document. 1250 In order to accomplish this each MES distributes to other MESes the 1251 "per Ethernet Segment Ethernet A-D route" as per the procedures in 1252 the section "Ethernet A-D Route per Ethernet Segment". 
This route is imported by the MESes connected to the Ethernet Segment and also by the MESes that have at least one E-VPN in common with the Ethernet Segment in the route. As described in the section "Ethernet A-D Route per Ethernet Segment", the route MUST carry an ESI MPLS Label Extended Community with a valid ESI MPLS label.

17.1. ESI MPLS Label: Ingress Replication

An MES that is using ingress replication for sending broadcast, multicast or unknown unicast traffic, distributes to other MESes, that belong to the Ethernet segment, a downstream assigned "ESI MPLS label" in the Ethernet A-D route. This label MUST be programmed in the platform label space by the advertising MES. Further the forwarding entry for this label must result in NOT forwarding packets received with this label onto the Ethernet segment that the label was distributed for.

Consider MES1 and MES2 that are multi-homed to CE1 on ES1. Further consider that MES1 is using P2P or MP2P LSPs to send packets to MES2. Consider that MES1 receives a multicast, broadcast or unknown unicast packet from CE1 on VLAN1 on ESI1.

First consider the case where MES2 distributes a unique Inclusive Multicast Ethernet Tag route for VLAN1, for each Ethernet segment on MES2. In this case MES1 MUST NOT replicate the packet to MES2 for .

Next consider the case where MES2 distributes a single Inclusive Multicast Ethernet Tag route for VLAN1 for all Ethernet segments on MES2. In this case when MES1 sends a multicast, broadcast or unknown unicast packet, that it receives from CE1, it MUST first push onto the MPLS label stack the ESI label that MES2 has distributed for ESI1. It MUST then push on the MPLS label distributed by MES2 in the Inclusive Multicast Ethernet Tag route for Ethernet Tag1.
The resulting packet is further encapsulated in the P2P or MP2P LSP label stack required to transmit the packet to MES2. When MES2 receives this packet it determines the set of ESIs to replicate the packet to from the top MPLS label, after any P2P or MP2P LSP labels have been removed. If the next label is the ESI label assigned by MES2 then MES2 MUST NOT forward the packet onto ESI1.

17.2. ESI MPLS Label: P2MP MPLS LSPs

An MES that is using P2MP LSPs for sending broadcast, multicast or unknown unicast traffic, distributes to other MESes, that belong to the Ethernet segment or have an E-VPN in common with the Ethernet Segment, an upstream assigned "ESI MPLS label" in the Ethernet A-D route. This label is upstream assigned by the MES that advertises the route. This label MUST be programmed by the other MESes, that are connected to the ESI advertised in the route, in the context label space for the advertising MES. Further the forwarding entry for this label must result in NOT forwarding packets received with this label onto the Ethernet segment that the label was distributed for. This label MUST also be programmed by the other MESes, that import the route but are not connected to the ESI advertised in the route, in the context label space for the advertising MES. Further the forwarding entry for this label must be a POP with no other associated action.

Consider MES1 and MES2 that are multi-homed to CE1 on ES1. Also consider MES3 that is in the same E-VPN as one of the E-VPNs to which ES1 belongs. Further assume that MES1 is using P2MP MPLS LSPs to send broadcast, multicast or unknown unicast packets. When MES1 sends a multicast, broadcast or unknown unicast packet, that it receives from CE1, it MUST first push onto the MPLS label stack the ESI label that it has assigned for the ESI that the packet was received on.
The resulting packet is further encapsulated in the P2MP MPLS label stack necessary to transmit the packet to the other MESes. Penultimate hop popping MUST be disabled on the P2MP LSPs used in the MPLS transport infrastructure for E-VPN. When MES2 receives this packet it decapsulates the top MPLS label and forwards the packet using the context label space determined by the top label. If the next label is the ESI label assigned by MES1 then MES2 MUST NOT forward the packet onto ESI1. When MES3 receives this packet it decapsulates the top MPLS label and forwards the packet using the context label space determined by the top label. If the next label is the ESI label assigned by MES1 then MES3 MUST pop the label.

17.3. ESI MPLS Label: MP2MP LSPs

The procedures for ESI MPLS Label assignment and usage for MP2MP LSPs will be described in a future version.

18. Load Balancing of Unicast Packets

This section specifies how load balancing is achieved to/from a CE that has more than one interface that is directly connected to one or more MESes. The CE may be a host or a router or it may be a switched network that is connected via LAG to the MESes.

18.1. Load balancing of traffic from an MES to remote CEs

Whenever a remote MES imports a MAC advertisement for a given in an E-VPN instance, it MUST consider the MAC as reachable via all the MESes from which it has imported Ethernet A-D routes for that . Let us call this the initial Ethernet A-D route set for the given ESI.

If for the given ESI the remote MES has imported a per Ethernet Segment Ethernet A-D route, from at least one MES, where the "Active-Standby" flag in the ESI MPLS Label Extended Community is set, then the remote MES MUST first use the procedures in the section "Designated Forwarder Election" to pick a Designated Forwarder.
The eligible set of Ethernet A-D routes used in the procedures below must comprise this single Ethernet A-D route from the DF.

If for the given ESI none of the per Ethernet Segment Ethernet A-D routes, imported by the remote MES, have the "Active-Standby" flag set in the ESI MPLS Label Extended Community, then the eligible set of Ethernet A-D routes is set to the initial Ethernet A-D route set.

The remote MES MUST use the MAC advertisement and eligible Ethernet A-D routes to construct the set of next-hops that it can use to send the packet to the destination MAC. Each next-hop comprises an MPLS label stack, that is to be used by the egress MES to forward the packet. This label stack is determined as follows. If the next-hop is constructed as a result of a MAC route which has a valid MPLS label stack, then this label stack MUST be used. However if the MAC route doesn't exist or if it doesn't have a valid MPLS label stack then the next-hop and MPLS label stack are constructed as a result of one or more corresponding Ethernet A-D routes as follows. Note that the following description applies to determining the label stack for a particular next-hop to reach a given MES, from which the remote MES has received and imported one or more Ethernet A-D routes that have the matching ESI and Ethernet Tag as the one present in the MAC advertisement. The Ethernet A-D routes mentioned in the following description refer to the ones imported from this given MES.

If there is a corresponding Ethernet A-D route for that then that label stack MUST be used. If such an Ethernet Tag A-D route doesn't exist but Ethernet A-D routes exist for and then the label stack must be constructed by using the labels from these two routes. If this is not the case but an Ethernet A-D route exists for then the label from that route must be used.
Finally if this is also not the case but an Ethernet A-D route exists for then the label from that route must be used.

The following example explains the above when Ethernet A-D routes are advertised per .

Consider a CE, CE1, that is dual homed to two MESes, MES1 and MES2 on a LAG interface, ES1, and is sending packets with MAC address MAC1 on VLAN1. Based on E-VPN extensions described in sections "Determining Reachability of Unicast Addresses" and "Auto-Discovery of Ethernet Tags on Ethernet Segments", a remote MES say MES3 is able to learn that MAC1 is reachable via MES1 and MES2. Both MES1 and MES2 may advertise MAC1 in BGP if they receive packets with MAC1 from CE1. If this is not the case and if MAC1 is advertised only by MES1, MES3 still considers MAC1 as reachable via both MES1 and MES2 as both MES1 and MES2 advertise an Ethernet A-D route for .

The MPLS label stack to send the packets to MES1 is the MPLS LSP stack to get to MES1 and the E-VPN label advertised by MES1 for CE1's MAC.

The MPLS label stack to send packets to MES2 is the MPLS LSP stack to get to MES2 and the MPLS label in the Ethernet A-D route advertised by MES2 for , if MES2 has not advertised MAC1 in BGP.

We will refer to these label stacks as MPLS next-hops.

The remote MES, MES3, can now load balance the traffic it receives from its CEs, destined for CE1, between MES1 and MES2. For IP traffic, MES3 may hash the IP flow information to select one of the MPLS next-hops. Alternatively MES3 may rely on the source and destination MAC addresses for load balancing.

Note that once MES3 decides to send a particular packet to MES1 or MES2 it can pick from more than one path to reach the particular remote MES using regular MPLS procedures.
For instance if the tunneling technology is based on RSVP-TE LSPs, and MES3 decides to send a particular packet to MES1 then MES3 can choose from multiple RSVP-TE LSPs that have MES1 as their destination.

When MES1 or MES2 receives the packet destined for CE1 from MES3, if the packet is a unicast MAC packet it is forwarded to CE1. If it is a multicast or broadcast MAC packet then only one of MES1 or MES2 must forward the packet to the CE. Which of MES1 or MES2 forwards this packet to the CE is determined by default based on which of the two is the DF. An alternate procedure to load balance multicast packets will be described in the future.

If the connectivity between the multi-homed CE and one of the MESes that it is multi-homed to fails, the MES MUST withdraw the MAC address from BGP. In addition the MES MUST withdraw the Ethernet Tag A-D routes, that had been previously advertised, for the Ethernet Segment to the CE. Note that to aid convergence the Ethernet Tag A-D routes MAY be withdrawn before the MAC routes. This enables the remote MESes to remove the MPLS next-hop to this particular MES from the set of MPLS next-hops that can be used to forward traffic to the CE. For further details and procedures on withdrawal of E-VPN route types in the event of MES to CE failures please see section "MES to CE Network Failures".

18.2. Load balancing of traffic between an MES and a local CE

A CE may be configured with more than one interface connected to different MESes or the same MES for load balancing, using a technology such as LAG. The MES(s) and the CE can load balance traffic onto these interfaces using one of the following mechanisms.

18.2.1. Data plane learning

Consider that the MESes perform data plane learning for local MAC addresses learned from local CEs.
This enables the MES(s) to learn a particular MAC address and associate it with one or more interfaces, if the technology between the MES and the CE supports multi-pathing. The MESes can now load balance traffic destined to that MAC address on the multiple interfaces.

Whether the CE can load balance traffic that it generates on the multiple interfaces is dependent on the CE implementation.

18.2.2. Control plane learning

The CE can be a host that advertises the same MAC address using a control protocol on both interfaces. This enables the MES(s) to learn the host's MAC address and associate it with one or more interfaces. The MESes can now load balance traffic destined to the host on the multiple interfaces. The host can also load balance the traffic it generates onto these interfaces and the MES that receives the traffic employs E-VPN forwarding procedures to forward the traffic.

19. MAC Moves

In the case where a CE is a host or a switched network connected to hosts, the MAC address that is reachable via a given MES on a particular ESI may move such that it becomes reachable via another MES on another ESI. This is referred to as a "MAC Move".

Remote MESes must be able to distinguish a MAC move from the case where a MAC address on an ESI is reachable via two different MESes and load balancing is performed as described in section "Load Balancing of Unicast Packets". This distinction can be made as follows. If a MAC is learned by a particular MES from multiple MESes, then the MES performs load balancing only amongst the set of MESes that advertised the MAC with the same ESI. If this is not the case then the MES chooses only one of the advertising MESes to reach the MAC as per BGP path selection.

There can be traffic loss during a MAC move. Consider MAC1 that is advertised by MES1 and learned from CE1 on ESI1.
If MAC1 now moves behind MES2, on ESI2, MES2 advertises the MAC in BGP. Until a remote MES, MES3, determines that the best path is via MES2, it will continue to send traffic destined for MAC1 to MES1. This switch will not occur deterministically until MES1 withdraws the advertisement for MAC1.

One recommended optimization to reduce the traffic loss during MAC moves is the following. When an MES sees a MAC update from a locally attached CE on an ESI, which is different from the ESI on which the MES has currently learned the MAC, the corresponding entry in the local bridge forwarding table SHOULD be immediately purged causing the MES to withdraw its own E-VPN MAC advertisement route and replace it with the update.

A future version of this specification will describe other optimized procedures to minimize traffic loss during MAC moves.

20. Multicast

The MESes in a particular E-VPN may use ingress replication or P2MP LSPs to send multicast traffic to other MESes.

20.1. Ingress Replication

The MESes may use ingress replication for flooding unknown unicast, multicast or broadcast traffic as described in section "Handling of Multi-Destination Traffic". A given unknown unicast or broadcast packet must be sent to all the remote MESes. However a given multicast packet for a multicast flow may be sent to only a subset of the MESes. Specifically a given multicast flow may be sent to only those MESes that have receivers that are interested in the multicast flow. Determining which of the MESes have receivers for a given multicast flow is done using explicit tracking described below.

20.2. P2MP LSPs

An MES may use an "Inclusive" tree for sending an unknown unicast, broadcast or multicast packet or a "Selective" tree. This terminology is borrowed from [VPLS-MCAST].
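
As an illustration of the ingress replication rule in section 20.1 (unknown unicast and broadcast go to every remote MES, while a multicast flow may go only to the MESes with interested receivers), the following Python sketch may help. It is not part of the specification. The function name, the traffic-kind strings and the shape of the tracking table are hypothetical, and the fallback to all remote MESes when no tracking data exists is an assumption made for this sketch.

```python
from typing import Dict, Optional, Set, Tuple

# Hypothetical key for a multicast flow: (source, group).
Flow = Tuple[str, str]


def replication_targets(
    kind: str,
    remote_meses: Set[str],
    flow: Optional[Flow] = None,
    interested: Optional[Dict[Flow, Set[str]]] = None,
) -> Set[str]:
    """Return the remote MESes that should receive a copy of the packet."""
    if kind in ("unknown_unicast", "broadcast"):
        # These packets must be sent to all the remote MESes.
        return set(remote_meses)
    if kind == "multicast":
        if interested is not None and flow in interested:
            # Explicit tracking data exists: send only to interested MESes.
            return interested[flow] & remote_meses
        # Assumption of this sketch: with no tracking data, send to all.
        return set(remote_meses)
    raise ValueError(f"unexpected traffic kind: {kind}")


remotes = {"MES2", "MES3", "MES4"}
tracking = {("10.0.0.1", "239.1.1.1"): {"MES3"}}
print(sorted(replication_targets("broadcast", remotes)))
print(sorted(replication_targets(
    "multicast", remotes, ("10.0.0.1", "239.1.1.1"), tracking)))
```

The tracking table stands in for whatever state explicit tracking produces; how that state is learned is outside the sketch.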

A variety of transport technologies may be used in the SP network. For inclusive P-Multicast trees, these transport technologies include point-to-multipoint LSPs created by RSVP-TE or mLDP. For selective P-Multicast trees, only unicast MES-MES tunnels (using MPLS or IP/GRE encapsulation) and P2MP LSPs are supported, and the supported P2MP LSP signaling protocols are RSVP-TE and mLDP.

20.3. MP2MP LSPs

The root of the MP2MP LDP LSP advertises the Inclusive Multicast Tag route with the PMSI Tunnel attribute set to the MP2MP Tunnel identifier. This advertisement is then sent to all MESes in the E-VPN. Upon receiving the Inclusive Multicast Tag routes with a PMSI Tunnel attribute that contains the MP2MP Tunnel identifier, the receiving MESes initiate the setup of the MP2MP tunnel towards the root using the procedures in [MLDP].

20.3.1. Inclusive Trees

An Inclusive Tree allows the use of a single multicast distribution tree, referred to as an Inclusive P-Multicast tree, in the SP network to carry all the multicast traffic from a specified set of E-VPN instances on a given MES. A particular P-Multicast tree can be set up to carry the traffic originated by sites belonging to a single E-VPN, or to carry the traffic originated by sites belonging to different E-VPNs. The ability to carry the traffic of more than one E-VPN on the same tree is termed 'Aggregation'. The tree needs to include every MES that is a member of any of the E-VPNs that are using the tree. This implies that an MES may receive multicast traffic for a multicast stream even if it doesn't have any receivers that are interested in receiving traffic for that stream.

An Inclusive P-Multicast tree as defined in this document is a P2MP tree. A P2MP tree is used to carry traffic only for E-VPN CEs that are connected to the MES that is the root of the tree.
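
The membership rule for an aggregated Inclusive tree (the tree includes every MES that is a member of any E-VPN using it) can be sketched in Python as a set union. This is an illustrative sketch only; the function name, the membership table and the E-VPN names are hypothetical.

```python
from typing import Dict, Iterable, Set


def inclusive_tree_leaves(
    root: str,
    evpn_members: Dict[str, Set[str]],
    evpns_on_tree: Iterable[str],
) -> Set[str]:
    """Leaves of an Inclusive P2MP tree rooted at `root`.

    The leaf set is the union of the MES members of every E-VPN
    aggregated onto the tree, excluding the root itself.
    """
    leaves: Set[str] = set()
    for evpn in evpns_on_tree:
        leaves |= evpn_members[evpn]
    leaves.discard(root)
    return leaves


members = {
    "evpn-a": {"MES1", "MES2", "MES3"},
    "evpn-b": {"MES1", "MES4"},
}
print(sorted(inclusive_tree_leaves("MES1", members, ["evpn-a", "evpn-b"])))
```

In this example MES4 is a leaf only because of evpn-b, yet it also receives evpn-a traffic sent on the shared tree. That is the cost of aggregation noted above.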

The procedures for signaling an Inclusive Tree are the same as those in [VPLS-MCAST] with the VPLS-AD route replaced with the Inclusive Multicast Ethernet Tag route. The P-Tunnel attribute [VPLS-MCAST] for an Inclusive tree is advertised in the Inclusive Ethernet A-D route as described in section "Handling of Multi-Destination Traffic". Note that an MES can "aggregate" multiple inclusive trees for different E-VPNs on the same P2MP LSP using upstream labels. The procedures for aggregation are the same as those described in [VPLS-MCAST], with VPLS A-D routes replaced by E-VPN Inclusive Multicast Ethernet A-D routes.

20.3.2. Selective Trees

A Selective P-Multicast tree is used by an MES to send IP multicast traffic for one or more specific IP multicast streams, originated by CEs connected to the MES, that belong to the same or different E-VPNs, to a subset of the MESes that belong to those E-VPNs. Each of the MESes in the subset should be on the path to a receiver of one or more multicast streams that are mapped onto the tree. The ability to use the same tree for multicast streams that belong to different E-VPNs is termed 'Aggregation'. A Selective tree gives an MES the ability to create separate SP multicast trees for specific multicast streams, e.g. high bandwidth multicast streams. This allows traffic for these multicast streams to reach only those MES routers that have receivers in these streams. This avoids flooding other MES routers in the E-VPN.

An SP can use both Inclusive P-Multicast trees and Selective P-Multicast trees or either of them for a given E-VPN on an MES, based on local configuration.

The granularity of a selective tree is where S is an IP multicast source address and G is an IP multicast group address or G is a multicast MAC address. Wildcard sources and wildcard groups are supported.
Selective trees require explicit tracking as described below.

An E-VPN MES advertises a selective tree using an E-VPN selective A-D route. The procedures are the same as those in [VPLS-MCAST] with S-PMSI A-D routes in [VPLS-MCAST] replaced by E-VPN Selective A-D routes. The information elements of the E-VPN selective A-D route are similar to those of the VPLS S-PMSI A-D route with the following differences. An E-VPN Selective A-D route includes an optional Ethernet Tag field. Also an E-VPN selective A-D route may encode a MAC address in the Group field. The encoding details of the E-VPN selective A-D route will be described in the next revision.

Selective trees can also be aggregated on the same P2MP LSP using aggregation as described in [VPLS-MCAST].

20.4. Explicit Tracking

[VPLS-MCAST] describes procedures for explicit tracking that rely on Leaf A-D routes. The same procedures are used for explicit tracking in this specification with VPLS Leaf A-D routes replaced with E-VPN Leaf A-D routes. These procedures allow a root MES to request multicast membership information for a given (S, G), from leaf MESes. Leaf MESes rely on IGMP snooping or PIM snooping between the MES and the CE to determine the multicast membership information. Note that the procedures in [VPLS-MCAST] do not describe how explicit tracking is performed if the CEs are enabled with join suppression. The procedures for this case will be described in a future version.

21. Convergence

This section describes failure recovery from different types of network failures.

21.1. Transit Link and Node Failures between MESes

The use of existing MPLS Fast-Reroute mechanisms can provide failure recovery in the order of 50ms, in the event of transit link and node failures in the infrastructure that connects the MESes.

21.2. MES Failures

Consider a host host1 that is dual homed to MES1 and MES2. If MES1 fails, a remote MES, MES3, can discover this based on the failure of the BGP session. This failure detection can be in the sub-second range if BFD is used to detect BGP session failure. MES3 can update its forwarding state to start sending all traffic for host1 to only MES2. It is to be noted that this failure recovery is potentially faster than what would be possible if data plane learning were to be used, as in that case MES3 would have to rely on re-learning of MAC addresses via MES2.

21.2.1. Local Repair

It is possible to perform local repair in the case of MES failures. Details will be specified in the future.

21.3. MES to CE Network Failures

When an Ethernet segment connected to an MES fails or when an Ethernet Tag is de-configured on an Ethernet segment, then the MES MUST withdraw the Ethernet A-D route(s) announced for the that are impacted by the failure or de-configuration. In addition the MES MUST also withdraw the MAC advertisement routes that are impacted by the failure or de-configuration.

The Ethernet A-D routes should be used by an implementation to optimize the withdrawal of MAC advertisement routes. When an MES receives a withdrawal of a particular Ethernet A-D route from an MES it SHOULD consider all the MAC advertisement routes, that are learned from the same as in the Ethernet A-D route, from the advertising MES, as having been withdrawn. This optimizes the network convergence times in the event of MES to CE failures.

22. LACP State Synchronization

This section requires review and discussion amongst the authors and will be revised in the next version.

To support CE multi-homing with multi-chassis Ethernet bundles, the MESes connected to a given CE should synchronize [802.1AX] LACP state amongst each other.
This ensures that the MESes can present a single LACP bundle to the CE. This is required for initial system bring-up and upon any configuration change.

This includes at least the following LACP specific configuration parameters:

- System Identifier (MAC Address): uniquely identifies a LACP speaker.
- System Priority: determines which LACP speaker's port priorities are used in the Selection logic.
- Aggregator Identifier: uniquely identifies a bundle within a LACP speaker.
- Aggregator MAC Address: identifies the MAC address of the bundle.
- Aggregator Key: used to determine which ports can join an Aggregator.
- Port Number: uniquely identifies an interface within a LACP speaker.
- Port Key: determines the set of ports that can be bundled.
- Port Priority: determines a port's precedence level to join a bundle in case the number of eligible ports exceeds the maximum number of links allowed in a bundle.

Furthermore, the MESes should also synchronize operational (run-time) data, in order for the LACP Selection logic state-machines to execute. This operational data includes the following LACP operational parameters, on a per port basis:

- Partner System Identifier: this is the CE System MAC address.
- Partner System Priority: the CE LACP System Priority.
- Partner Port Number: CE's AC port number.
- Partner Port Priority: CE's AC Port Priority.
- Partner Key: CE's key for this AC.
- Partner State: CE's LACP State for the AC.
- Actor State: PE's LACP State for the AC.
- Port State: PE's AC port status.

The above state needs to be communicated between MESes forming a multi-chassis bundle during LACP initial bring-up, upon any configuration change and upon the occurrence of a failure.
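
The two parameter lists above can be pictured as a pair of records that an MES keeps per bundle port and compares against a peer's copy. The Python sketch below is illustrative only: the class names, field names and field types are assumptions, and it is not a wire format, since the protocol details for synchronizing LACP state are left to a later version of this document.

```python
from dataclasses import asdict, dataclass, replace
from typing import List


@dataclass(frozen=True)
class LacpConfig:
    """Configuration parameters listed above (types are assumed)."""
    system_id: str          # System Identifier (MAC address)
    system_priority: int
    aggregator_id: int
    aggregator_mac: str
    aggregator_key: int
    port_number: int
    port_key: int
    port_priority: int


@dataclass(frozen=True)
class LacpPortOperState:
    """Per-port operational parameters listed above (types are assumed)."""
    partner_system_id: str  # CE System MAC address
    partner_system_priority: int
    partner_port_number: int
    partner_port_priority: int
    partner_key: int
    partner_state: int
    actor_state: int
    port_state: str


def changed_fields(old, new) -> List[str]:
    """Names of the fields that differ between two snapshots of one record."""
    before, after = asdict(old), asdict(new)
    return sorted(name for name in after if before[name] != after[name])


oper = LacpPortOperState("00:11:22:33:44:55", 32768, 1, 128, 10, 0x3D, 0x3D, "up")
failed = replace(oper, port_state="down")
print(changed_fields(oper, failed))
```

Comparing snapshots this way is one possible trigger for re-synchronization after a failure or configuration change; the document itself does not prescribe one.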

It should be noted that the above configuration and operational state is localized in scope and is only relevant to MESes which connect to the same multi-homed CE over a given Ethernet bundle.

Furthermore, the communication of state changes, upon failures, must occur with minimal latency, in order to minimize the switchover time and consequent service disruption. The protocol details for synchronizing the LACP state will be described in the following version.

23. Acknowledgements

We would like to thank Yakov Rekhter, Pedro Marques, Kaushik Ghosh, Nischal Sheth, Robert Raszuk and Amit Shukla for discussions that helped shape this document. We would also like to thank Han Nguyen for his comments and support of this work. We would also like to thank Steve Kensil for his review.

24. References

[E-VPN-REQ] A. Sajassi, R. Aggarwal et al., "Requirements for Ethernet VPN", draft-sajassi-raggarwa-l2vpn-evpn-req-00.txt

[RFC4364] "BGP/MPLS IP VPNs", Rosen, Rekhter, et al., February 2006

[VPLS-MCAST] "Multicast in VPLS", R. Aggarwal et al., draft-ietf-l2vpn-vpls-mcast-04.txt

[RFC4761] Kompella, K. and Y. Rekhter, "Virtual Private LAN Service (VPLS) Using BGP for Auto-Discovery and Signaling", RFC 4761, January 2007.

[RFC4762] Lasserre, M. and V. Kompella, "Virtual Private LAN Service (VPLS) Using Label Distribution Protocol (LDP) Signaling", RFC 4762, January 2007.

[VPLS-MULTIHOMING] "BGP based Multi-homing in Virtual Private LAN Service", K. Kompella et al., draft-ietf-l2vpn-vpls-multihoming-00.txt

[PIM-SNOOPING] "PIM Snooping over VPLS", V. Hemige et al., draft-ietf-l2vpn-vpls-pim-snooping-01

[IGMP-SNOOPING] "Considerations for Internet Group Management Protocol (IGMP) and Multicast Listener Discovery (MLD) Snooping Switches", M. Christensen et al., RFC 4541

[RT-CONSTRAIN] P. Marques et al., "Constrained Route Distribution for Border Gateway Protocol/MultiProtocol Label Switching (BGP/MPLS) Internet Protocol (IP) Virtual Private Networks (VPNs)", RFC 4684, November 2006

25. Authors' Addresses

Rahul Aggarwal
Email: raggarwa_1@yahoo.com

Ali Sajassi
Cisco
170 West Tasman Drive
San Jose, CA 95134, US
Email: sajassi@cisco.com

Wim Henderickx
Alcatel-Lucent
Email: wim.henderickx@alcatel-lucent.com

Aldrin Isaac
Bloomberg
Email: aisaac71@bloomberg.net

James Uttaro
AT&T
200 S. Laurel Avenue
Middletown, NJ 07748
USA
Email: uttaro@att.com

Nabil Bitar
Verizon Communications
Email: nabil.n.bitar@verizon.com

Ravi Shekhar
Juniper Networks
1194 N. Mathilda Ave.
Sunnyvale, CA 94089 US
Email: rshekhar@juniper.net

Florin Balus
Alcatel-Lucent
Email: Florin.Balus@alcatel-lucent.com

Keyur Patel
Cisco
170 West Tasman Drive
San Jose, CA 95134, US
Email: keyupate@cisco.com

Sami Boutros
Cisco
170 West Tasman Drive
San Jose, CA 95134, US
Email: sboutros@cisco.com