idnits 2.17.1 draft-raggarwa-mac-vpn-01.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack a Security Considerations section. ** The document seems to lack an IANA Considerations section. (See Section 2.2 of https://www.ietf.org/id-info/checklist for how to handle the case when there are no actions for IANA.) ** The document seems to lack separate sections for Informative/Normative References. All references will be assumed normative when checking for downward references. ** There are 2 instances of too long lines in the document, the longest one being 4 characters in excess of 72. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year == Line 164 has weird spacing: '...earning do no...' == The document seems to contain a disclaimer for pre-RFC5378 work, but was first submitted on or after 10 November 2008. The disclaimer is usually necessary only for documents that revise or obsolete older RFCs, and that take significant amounts of text from those RFCs. If you can contact all authors of the source material and they are willing to grant the BCP78 rights to the IETF Trust, you can and should remove the disclaimer. Otherwise, the disclaimer is needed and you can ignore this comment. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) 
-- The document date (June 2, 2010) is 5077 days in the past. Is this intentional? Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Missing Reference: 'RFC2119' is mentioned on line 133, but not defined == Missing Reference: 'BGP-VPLS' is mentioned on line 267, but not defined == Missing Reference: 'BGP-VPLS-MH' is mentioned on line 267, but not defined == Missing Reference: 'RFC4271' is mentioned on line 325, but not defined == Missing Reference: 'RFC4760' is mentioned on line 334, but not defined == Missing Reference: 'BGP MVPN' is mentioned on line 690, but not defined == Unused Reference: 'RFC4761' is defined on line 1316, but no explicit reference was found in the text == Unused Reference: 'RFC4762' is defined on line 1320, but no explicit reference was found in the text == Unused Reference: 'VPLS-MULTIHOMING' is defined on line 1324, but no explicit reference was found in the text == Unused Reference: 'PIM-SNOOPING' is defined on line 1328, but no explicit reference was found in the text == Unused Reference: 'IGMP-SNOOPING' is defined on line 1331, but no explicit reference was found in the text == Outdated reference: A later version (-16) exists of draft-ietf-l2vpn-vpls-mcast-04 == Outdated reference: A later version (-07) exists of draft-ietf-l2vpn-vpls-multihoming-00 == Outdated reference: A later version (-07) exists of draft-ietf-l2vpn-vpls-pim-snooping-01 ** Downref: Normative reference to an Informational draft: draft-ietf-l2vpn-vpls-pim-snooping (ref. 'PIM-SNOOPING') ** Downref: Normative reference to an Informational RFC: RFC 4541 (ref. 'IGMP-SNOOPING') Summary: 6 errors (**), 0 flaws (~~), 17 warnings (==), 1 comment (--). Run idnits with the --verbose option for more detailed information about the items above. 
-------------------------------------------------------------------------------- 2 Network Working Group R. Aggarwal (Editor) 3 Internet Draft Juniper Networks 4 Category: Standards Track 5 Expiration Date: December 2010 A. Isaac 6 Bloomberg 8 J. Uttaro 9 AT&T 11 R. Shekhar 12 Juniper Networks 14 F. Balus 15 Alcatel-Lucent 17 W. Henderickx 18 Alcatel-Lucent 20 June 2, 2010 22 BGP MPLS Based MAC VPN 24 draft-raggarwa-mac-vpn-01.txt 26 Status of this Memo 28 This Internet-Draft is submitted to IETF in full conformance with the 29 provisions of BCP 78 and BCP 79. 31 Internet-Drafts are working documents of the Internet Engineering 32 Task Force (IETF), its areas, and its working groups. Note that other 33 groups may also distribute working documents as Internet-Drafts. 35 Internet-Drafts are draft documents valid for a maximum of six months 36 and may be updated, replaced, or obsoleted by other documents at any 37 time. It is inappropriate to use Internet-Drafts as reference 38 material or to cite them other than as "work in progress." 40 The list of current Internet-Drafts can be accessed at 41 http://www.ietf.org/ietf/1id-abstracts.txt. 43 The list of Internet-Draft Shadow Directories can be accessed at 44 http://www.ietf.org/shadow.html. 46 Copyright and License Notice 48 Copyright (c) 2010 IETF Trust and the persons identified as the 49 document authors. All rights reserved. 51 This document is subject to BCP 78 and the IETF Trust's Legal 52 Provisions Relating to IETF Documents 53 (http://trustee.ietf.org/license-info) in effect on the date of 54 publication of this document. Please review these documents 55 carefully, as they describe your rights and restrictions with respect 56 to this document. Code Components extracted from this document must 57 include Simplified BSD License text as described in Section 4.e of 58 the Trust Legal Provisions and are provided without warranty as 59 described in the Simplified BSD License. 
61 This document may contain material from IETF Documents or IETF 62 Contributions published or made publicly available before November 63 10, 2008. The person(s) controlling the copyright in some of this 64 material may not have granted the IETF Trust the right to allow 65 modifications of such material outside the IETF Standards Process. 66 Without obtaining an adequate license from the person(s) controlling 67 the copyright in such materials, this document may not be modified 68 outside the IETF Standards Process, and derivative works of it may 69 not be created outside the IETF Standards Process, except to format 70 it for publication as an RFC or to translate it into languages other 71 than English. 73 Abstract 75 This document describes procedures for BGP MPLS based MAC VPNs (MAC- 76 VPN). 78 Table of Contents 80 1 Specification of requirements ......................... 4 81 2 Contributors .......................................... 4 82 3 Introduction .......................................... 4 83 4 Terminology ........................................... 5 84 5 BGP MPLS Based MAC-VPN ................................ 6 85 6 Ethernet Segment Identifier ........................... 7 86 7 BGP MAC-VPN NLRI ...................................... 8 87 8 Auto-Discovery of Ethernet Tags on Ethernet Segments .. 9 88 9 Determining Reachability to Unicast MAC Addresses ..... 11 89 9.1 Local Learning ........................................ 11 90 9.2 Remote learning ....................................... 11 91 9.2.1 BGP MAC-VPN MAC Address Advertisement ................. 12 92 10 Designated Forwarder Election ......................... 13 93 11 Handling of Broadcast, Multicast and Unknown Unicast Traffic 15 94 11.1 P-Tunnel Identification ............................... 16 95 11.2 Ethernet Segment Identifier and Ethernet Tag .......... 17 96 12 Processing of Unknown Unicast Packets ................. 17 97 12.1 Ingress Replication ................................... 
18 98 12.2 P2MP MPLS LSPs ........................................ 18 99 13 Forwarding Unicast Packets ............................ 19 100 13.1 Forwarding packets received from a CE ................. 19 101 13.2 Forwarding packets received from a remote MES ......... 20 102 13.2.1 Unknown Unicast Forwarding ............................ 20 103 13.2.2 Known Unicast Forwarding .............................. 20 104 14 Split Horizon ......................................... 21 105 14.1 ESI MPLS Label: Ingress Replication ................... 22 106 14.2 ESI MPLS Label: P2MP MPLS LSPs ........................ 23 107 15 Load Balancing of Unicast Packets ..................... 23 108 15.1 Load balancing of traffic from a MES to remote CEs .... 23 109 15.2 Load balancing of traffic between a MES and a local CE ....25 110 15.2.1 Data plane learning ................................... 25 111 15.2.2 Control plane learning ................................ 25 112 16 MAC Moves ............................................. 25 113 17 Multicast ............................................. 26 114 17.1 Ingress Replication ................................... 26 115 17.2 P2MP LSPs ............................................. 27 116 17.2.1 Inclusive Trees ....................................... 27 117 17.2.2 Selective Trees ....................................... 28 118 17.3 Explicit Tracking ..................................... 28 119 18 Convergence ........................................... 29 120 18.1 Transit Link and Node Failures between MESes .......... 29 121 18.2 MES Failures .......................................... 29 122 18.2.1 Local Repair .......................................... 29 123 18.3 MES to CE Network Failures ............................ 29 124 19 Acknowledgements ...................................... 30 125 20 References ............................................ 30 126 21 Author's Address ...................................... 31 128 1. 
Specification of requirements 130 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 131 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 132 document are to be interpreted as described in [RFC2119]. 134 2. Contributors 136 In addition to the authors listed above, the following individuals 137 also contributed to this document. 139 Quaizar Vohra 140 Kireeti Kompella 141 Apurva Mehta 142 Juniper Networks 144 3. Introduction 146 This document describes procedures for BGP MPLS based MAC VPNs (MAC- 147 VPN). 149 There is a desire by Service Providers (SP) and data center providers 150 to provide MPLS based bridged / LAN services and/or infrastructure 151 such that they meet the requirements listed below. An example of such 152 a service is a VPLS service offered by an SP. Another example is an 153 MPLS based infrastructure in a data center. Here are the 154 requirements: 156 - Minimal or no configuration required. MPLS implementations have 157 reduced the amount of configuration over the years. There is a need 158 for greater auto-configuration. 160 - Support of multiple active points of attachment for CEs which may 161 be hosts, switches or routers. Current MPLS technologies such as 162 VPLS do not support this. Multiple active points of attachment allow load-balancing among 163 multiple active paths. Regular Ethernet switching technologies based 164 on MAC learning do not allow the same MAC to be learned from two 165 different PEs and be active at the same time in the same switching 166 instance. 168 - Ability to span a VLAN across multiple racks in different 169 geographic locations, which may not be in the same data center. 171 - Minimize or eliminate flooding of unknown unicast traffic. 173 - Allow hosts and Virtual Machines (VMs) in a data center to relocate 174 without requiring renumbering. For instance, VMs may be moved for load 175 or failure reasons.
177 - Ability to scale up to hundreds of thousands of hosts or more 178 across multiple data centers, where connectivity is required between 179 hosts in different data centers. 181 - Support for virtualization. This includes the ability to separate 182 hosts and VMs working together from other such groups, and the 183 ability to have overlapping IP and MAC addresses. 185 - Fast convergence. 187 This document proposes an MPLS based technology, referred to as MPLS- 188 based MAC VPN (MAC-VPN), for meeting the requirements described in 189 this section. MAC-VPN requires extensions to existing IP/MPLS 190 protocols as described in section 5. In addition to these extensions 191 MAC-VPN uses several building blocks from existing MPLS technologies. 193 4. Terminology 195 MES: MPLS Edge Switch 196 CE: Host or router or switch 197 MVI: MAC VPN Instance 198 ESI: Ethernet segment identifier 200 5. BGP MPLS Based MAC-VPN 202 This section describes the framework of MAC-VPN to meet the 203 requirements described in section 3. 205 A MAC-VPN comprises CEs that are connected to PEs or MPLS Edge 206 Switches (MES) that form the edge of the MPLS infrastructure. A 207 CE may be a host, a router or a switch. The MPLS Edge Switches 208 provide layer 2 virtual bridge connectivity between the CEs. There 209 may be multiple MAC VPNs in the provider's network. This document 210 uses the terms MAC-VPN and MAC VPN interchangeably. A MAC VPN 211 routing and forwarding instance on a MES is referred to as a MAC VPN 212 Instance (MVI). 214 The MESes are connected by an MPLS LSP infrastructure which provides 215 the benefits of MPLS such as fast-reroute, resiliency, etc. 217 In a MAC VPN, learning between MESes occurs not in the data plane (as 218 happens with traditional bridging) but in the control plane. Control 219 plane learning offers much greater control over the learning process, 220 such as restricting who learns what, and the ability to apply 221 policies.
Furthermore, the control plane chosen for this is BGP 222 (very similar to IP VPNs (RFC 4364)), providing much greater scale, 223 and the ability to "virtualize" or isolate groups of interacting 224 agents (hosts, servers, Virtual Machines) from each other. In MAC 225 VPNs MESes advertise the MAC addresses learned from the CEs that are 226 connected to them, along with a MPLS label, to other MESes in the 227 control plane. Control plane learning enables load balancing and 228 allows CEs to connect to multiple active points of attachment. It 229 also improves convergence times in the event of certain network 230 failures. 232 However, learning between MESes and CEs is done by the method best 233 suited to the CE: data plane learning, IEEE 802.1x, LLDP, 802.1aq or 234 other protocols. 236 It is a local decision as to whether the Layer 2 forwarding table on 237 a MES contains all the MAC destinations known to the control plane or 238 implements a cache based scheme. For instance the forwarding table 239 may be populated only with the MAC destinations of the active flows 240 transiting a specific MES. 242 The policy attributes of a MAC VPN are very similar to an IP VPN. A 243 MAC-VPN instance requires a Route-Distinguisher (RD) and a MAC-VPN 244 requires one or more Route-Targets (RTs). A CE attaches to a MAC-VPN 245 on a MES in a particular MVI on a VLAN or simply an ethernet 246 interface. When the point of attachment is a VLAN there may be one or 247 more VLANs in a particular MAC-VPN. Some deployment scenarios 248 guarantee uniqueness of VLANs across MAC-VPNs: all points of 249 attachment of a given MAC VPN use the same VLAN, and no other MAC VPN 250 uses this VLAN. This document refers to this case as a "Default 251 Single VLAN MAC-VPN" and describes simplified procedures to optimize 252 for it. 254 6. Ethernet Segment Identifier 256 If a CE is multi-homed to two or more MESes, the set of attachment 257 circuits constitutes an "Ethernet segment". 
An Ethernet segment may 258 appear to the CE as a Link Aggregation Group (LAG). Ethernet 259 segments have an identifier, called the "Ethernet Segment Identifier" 260 (ESI). A single-homed CE is considered to be attached to a Ethernet 261 segment with ESI 0. Otherwise, an Ethernet segment MUST have a 262 unique non-zero ESI. The ESI can be assigned using various 263 mechanisms: 265 1. The ESI may be configured. For instance when MAC VPNs are used to 266 provide a VPLS service the ESI is fairly analogous to the VE ID used 267 for the procedures in [BGP-VPLS] or the Multi-homing site ID in [BGP- 268 VPLS-MH]. 270 2. If LACP is used, between the MESes and CEs that are hosts, then 271 the ESI is determined by LACP. This is the 48 bit virtual MAC address 272 of the host for the LACP link bundle. As far as the host is concerned 273 it would treat the multiple MESes that it is homed to as the same 274 switch. This allows the host to aggregate links to different MESes 275 in the same bundle. 277 3. If LLDP is used, between the MESes and CEs that are hosts, then 278 the ESI is determined by LLDP. The ESI will be specified in a 279 following version. 281 4. In the case of indirectly connected hosts and a bridged LAN 282 between the hosts and the MESes, the ESI is determined based on the 283 Layer 2 bridge protocol as follows: 285 If STP is used then the value of the ESI is derived by listening 286 to BPDUs on the ethernet segment. The MES does not run STP. However 287 it does learn the Switch ID, MSTP ID and Root Bridge ID by listening 288 to BPDUs. The ESI is as follows: 290 {Switch ID (6 bits), MSTP ID (6 bits), Root Bridge ID (48 291 bits)} 293 7. BGP MAC-VPN NLRI 295 This document defines a new BGP NLRI, called the MAC-VPN NLRI. 
297 Following is the format of the MAC-VPN NLRI: 299 +-----------------------------------+ 300 | Route Type (1 octet) | 301 +-----------------------------------+ 302 | Length (1 octet) | 303 +-----------------------------------+ 304 | Route Type specific (variable) | 305 +-----------------------------------+ 307 The Route Type field defines the encoding of the rest of the MAC-VPN NLRI 308 (Route Type specific MAC-VPN NLRI). 310 The Length field indicates the length in octets of the Route Type 311 specific field of the MAC-VPN NLRI. 313 This document defines the following Route Types: 315 + 1 - Ethernet Tag Auto-Discovery (A-D) route 316 + 2 - MAC advertisement route 317 + 3 - Inclusive Multicast Ethernet Tag Route 318 + 4 - Ethernet Segment Route 319 + 5 - Selective Multicast Auto-Discovery (A-D) Route 320 + 6 - Leaf Auto-Discovery (A-D) Route 322 The detailed encoding and procedures for these route types are 323 described in subsequent sections. 325 The MAC-VPN NLRI is carried in BGP [RFC4271] using BGP Multiprotocol 326 Extensions [RFC4760] with an AFI of TBD and an SAFI of MAC-VPN (to be 327 assigned by IANA). The NLRI field in the 328 MP_REACH_NLRI/MP_UNREACH_NLRI attribute contains the MAC-VPN NLRI 329 (encoded as specified above). 331 In order for two BGP speakers to exchange labeled MAC-VPN NLRI, they 332 must use BGP Capabilities Advertisement to ensure that they both are 333 capable of properly processing such NLRI. This is done as specified 334 in [RFC4760], by using capability code 1 (multiprotocol BGP) with an 335 AFI of TBD and an SAFI of MAC-VPN. 337 8. Auto-Discovery of Ethernet Tags on Ethernet Segments 339 If a CE is multi-homed to two or more MESes on a particular ethernet 340 segment, each MES MUST advertise to the other MESes in the MAC VPN the 341 information about the Ethernet Tags (e.g., VLANs) on that ethernet 342 segment.
If a CE is not multi-homed, then the MES that it is 343 attached to MAY advertise the information about Ethernet Tags (e.g., 344 VLANs) on the ethernet segment connected to the CE. 346 The information about an Ethernet Tag on a particular ethernet 347 segment is advertised using an "Ethernet Tag Auto-Discovery route 348 (Ethernet Tag A-D route)". This route is advertised using the MAC-VPN 349 NLRI. 351 MAC VPNs support both the non-qualified and qualified learning models. 352 When non-qualified learning is used the Ethernet Tag Identifier 353 specified in this section and in other places in this document MUST 354 be set to a default value. When qualified learning is used the 355 Ethernet Tag Identifier, when required, MUST be set to a MAC VPN 356 provider assigned tag that maps locally on the advertising MES to an 357 ethernet broadcast domain identifier such as a VLAN ID. 359 The Ethernet Tag Auto-discovery information is used for Designated 360 Forwarder (DF) election as described in section 10. It is also used 361 to enable equal cost multi-path as described in section 15. Further, 362 it can be used to optimize withdrawal of MAC addresses as described in 363 section 18. 365 An Ethernet Tag A-D route type specific MAC-VPN NLRI consists of the 366 following: 368 +---------------------------------------+ 369 | RD (8 octets) | 370 +---------------------------------------+ 371 | Ethernet Segment Identifier (8 octets)| 372 +---------------------------------------+ 373 | Ethernet Tag ID (4 octets) | 374 +---------------------------------------+ 375 | MPLS Label (3 octets) | 376 +---------------------------------------+ 377 | Originating Router's IP Addr | 378 +---------------------------------------+ 380 The Route-Distinguisher (RD) MUST be set to the RD of the MAC-VPN 381 instance that is advertising the NLRI. An RD MUST be assigned for a 382 given MAC-VPN instance on a MES. This RD MUST be unique across all 383 MAC-VPN instances on a MES.
This can be accomplished by using a Type 384 1 RD [RFC4364]. The value field comprises an IP address of the MES 385 (typically, the loopback address) followed by a number unique to the 386 MES. This number may be generated by the MES, or, in the Default 387 Single VLAN MAC-VPN case, may be the 12 bit VLAN ID, with the 388 remaining 4 bits set to 0. 390 The Ethernet Segment Identifier MUST be an 8 octet entity as described in 391 section 6. 393 The Ethernet Tag ID is the identifier of an Ethernet Tag on the 394 ethernet segment. This value may be a two octet VLAN ID or it may be 395 another Ethernet Tag used by the MAC VPN provider. It MAY be set to 396 the default Ethernet Tag on the ethernet segment. 398 The usage of the MPLS label is described in section 15. 400 The Originating Router's IP address MUST be set to an IP address of 401 the PE. This address SHOULD be common for all the MVIs on the PE 402 (e.g., this address may be the PE's loopback address). 404 The Next Hop field of the MP_REACH_NLRI attribute of the route MUST 405 be set to the same IP address as the one carried in the Originating 406 Router's IP Address field. 408 The Ethernet Tag A-D route MUST carry one or more Route Target (RT) 409 attributes. RTs may be configured (as in IP VPNs), or may be derived 410 automatically from the Ethernet Tag ID associated with the 411 advertisement. 413 The following is the procedure for deriving the RT attribute 414 automatically from the Ethernet Tag ID associated with the 415 advertisement: 417 + The Global Administrator field of the RT MUST 418 be set to the Autonomous System (AS) number that the MES 419 belongs to. 421 + The Local Administrator field of the RT contains a 4 422 octet number that encodes the Ethernet Tag ID. 424 The above auto-configuration of the RT implies that a different RT is 425 used for every Ethernet Tag in a MAC-VPN, if the MAC-VPN contains 426 multiple Ethernet Tags.
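The RT auto-derivation procedure above can be sketched as follows. This is a minimal illustration, not part of the specification: the text does not state which Route Target encoding is used, so the sketch assumes the two-octet-AS-specific extended community of RFC 4360 (type 0x00, sub-type 0x02), which has a 2 octet Global Administrator and a 4 octet Local Administrator. The function name derive_route_target is hypothetical.

```python
import struct


def derive_route_target(as_number: int, ethernet_tag_id: int) -> bytes:
    """Build an 8-octet Route Target from an AS number and an Ethernet Tag ID.

    Assumption: two-octet-AS-specific extended community (RFC 4360).
    """
    if not 0 <= as_number <= 0xFFFF:
        raise ValueError("this sketch only handles 2-octet AS numbers")
    if not 0 <= ethernet_tag_id <= 0xFFFFFFFF:
        raise ValueError("Ethernet Tag ID must fit in 4 octets")
    # Type 0x00, Sub-Type 0x02 (Route Target), Global Administrator = AS
    # number, Local Administrator = Ethernet Tag ID, all in network order.
    return struct.pack("!BBHI", 0x00, 0x02, as_number, ethernet_tag_id)


rt = derive_route_target(64512, 100)
print(rt.hex())  # 0002fc0000000064
```

Because the Ethernet Tag ID is the only per-tag input, two Ethernet Tags in the same MAC-VPN yield two different RTs under this sketch.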
For the "Default Single VLAN MAC-VPN" this 427 results in auto-deriving the RT from the Ethernet Tag for that MAC- 428 VPN. 430 9. Determining Reachability to Unicast MAC Addresses 432 MESes forward packets that they receive based on the destination MAC 433 address. This implies that MESes must be able to learn how to reach a 434 given destination unicast MAC address. 436 There are two components to MAC address learning, "local learning" 437 and "remote learning": 439 9.1. Local Learning 441 A particular MES must be able to learn the MAC addresses from the CEs 442 that are connected to it. This is referred to as local learning. 444 The MESes in a particular MAC-VPN MUST support local data plane 445 learning using standard Ethernet learning procedures. A MES must be 446 capable of learning MAC addresses in the data plane when it receives 447 packets such as the following from the CE network: 449 - DHCP requests. 451 - Gratuitous ARP request for its own MAC. 453 - ARP request for a peer. 455 Alternatively, if a CE is a host then MESes MAY learn the MAC 456 addresses of the host in the control plane. 458 In the case where a CE is a host or a switched network connected 459 on ESI X to hosts, the MAC address that is reachable via a given 460 MES may move such that it becomes reachable via the same MES or 461 another MES on ESI Y. This is referred to as a "MAC Move". 462 Procedures to support this are described in section 16. 464 9.2. Remote learning 466 A particular MES must be able to determine how to send traffic to MAC 467 addresses that belong to or are behind CEs connected to other MESes, 468 i.e., to remote CEs or hosts behind remote CEs. Such MAC 469 addresses are called "remote" MAC addresses. 471 This document requires a MES to learn remote MAC addresses in the 472 control plane. In order to achieve this each MES advertises the MAC 473 addresses it learns from its locally attached CEs in the control 474 plane, to all the other MESes in the MAC-VPN, using BGP.
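The split between local and remote learning described in this section can be sketched as two tables per MES. This is an illustrative model only: the class and field names (MacAdvertisement, Mes, next_hop) are hypothetical, BGP is replaced by a direct method call, and a single label per MES is assumed for simplicity.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class MacAdvertisement:
    """Hypothetical in-memory form of one MAC advertisement sent over BGP."""
    mvi: str      # MAC VPN Instance the MAC belongs to
    mac: str      # MAC address learned from a locally attached CE
    esi: int      # Ethernet Segment Identifier (0 = single-homed CE)
    origin: str   # IP address of the advertising MES
    label: int    # MPLS label the advertising MES expects for this MAC


@dataclass
class Mes:
    address: str
    label: int  # assumption: one label for everything this MES advertises
    local: Dict[Tuple[str, str], int] = field(default_factory=dict)
    remote: Dict[Tuple[str, str], MacAdvertisement] = field(default_factory=dict)

    def learn_local(self, mvi: str, mac: str, esi: int) -> MacAdvertisement:
        """Local learning: record the MAC and return the advertisement to send."""
        self.local[(mvi, mac)] = esi
        return MacAdvertisement(mvi, mac, esi, self.address, self.label)

    def receive(self, adv: MacAdvertisement) -> None:
        """Remote learning: install a MAC advertised by another MES."""
        if adv.origin != self.address:
            self.remote[(adv.mvi, adv.mac)] = adv

    def next_hop(self, mvi: str, mac: str) -> Optional[Tuple[str, int]]:
        """Return (remote MES address, label) for a remote MAC, else None."""
        adv = self.remote.get((mvi, mac))
        return (adv.origin, adv.label) if adv else None


mes1 = Mes("192.0.2.1", label=3001)
mes2 = Mes("192.0.2.2", label=3002)
adv = mes1.learn_local("blue", "00:11:22:33:44:55", esi=0)
mes2.receive(adv)
print(mes2.next_hop("blue", "00:11:22:33:44:55"))  # ('192.0.2.1', 3001)
```

In this model mes2 never sees a data packet from the host behind mes1; it knows the MAC only because mes1 advertised it, which is the property the control plane learning model relies on.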
476 9.2.1. BGP MAC-VPN MAC Address Advertisement 478 BGP is extended to advertise these MAC addresses using the MAC 479 advertisement route type in the MAC-VPN-NLRI. 481 A MAC advertisement route type specific MAC-VPN NLRI consists of the 482 following: 484 +---------------------------------------+ 485 | RD (8 octets) | 486 +---------------------------------------+ 487 | Ethernet Segment Identifier (8 octets)| 488 +---------------------------------------+ 489 | Ethernet Tag ID (4 octets) | 490 +---------------------------------------+ 491 | MAC Address (6 octets) | 492 +---------------------------------------+ 493 | MPLS Label (3 octets) | 494 +---------------------------------------+ 495 | Originating Router's IP Addr | 496 +---------------------------------------+ 498 The RD MUST be the RD of the MAC-VPN instance that is advertising the 499 NLRI. The procedures for setting the RD for a given MAC VPN are 500 described in section 8. 502 The Ethernet Segment Identifier is set to the eight octet ESI 503 identifier described in section 6. 505 If qualified learning is used and the MAC address that is learned 506 from the CE is associated with an Ethernet Tag, the Ethernet Tag ID 507 MUST be the Ethernet Tag Identifier, assigned by the MAC VPN provider 508 and mapped to the CE's ethernet tag. If non-qualified learning is 509 used the Ethernet Tag identifier SHOULD be set to the default 510 Ethernet Tag on the ethernet segment. 512 The encoding of a MAC address is the 6-octet MAC address specified by 513 IEEE 802 documents [802.1D-ORIG] [802.1D-REV]. 515 The MPLS label MUST be the downstream assigned MAC-VPN MPLS label 516 that is used by the MES to forward MPLS encapsulated ethernet packets 517 received from remote MESes, where the destination MAC address in the 518 ethernet packet is the MAC address advertised in the above NLRI. The 519 forwarding procedures are specified in section 13. 
A MES may 520 advertise the same MAC-VPN label for all MAC addresses in a given 521 MAC-VPN instance. This label assignment methodology is referred to as 522 a per MVI label assignment. Or a MES may advertise a unique MAC-VPN 523 label per <ESI, Ethernet Tag> combination. This label assignment methodology is 524 referred to as a per <ESI, Ethernet Tag> label assignment. Or a MES 525 may advertise a unique MAC-VPN label per MAC address. All of these 526 methodologies have their tradeoffs. 528 Per MVI label assignment requires the fewest MAC-VPN labels, 529 but requires a MAC lookup in addition to an MPLS lookup on an egress 530 MES for forwarding. On the other hand a unique label per <ESI, Ethernet Tag> or a unique label per MAC allows an egress MES to 532 forward a packet that it receives from another MES to the connected 533 CE after looking up only the MPLS labels, without having to do a MAC 534 lookup. 536 The Originating Router's IP address MUST be set to an IP address of 537 the PE. This address SHOULD be common for all the MVIs on the PE 538 (e.g., this address may be the PE's loopback address). 540 The Next Hop field of the MP_REACH_NLRI attribute of the route MUST 541 be set to the same IP address as the one carried in the Originating 542 Router's IP Address field. 544 The BGP advertisement that advertises the MAC advertisement route 545 MUST also carry one or more Route Target (RT) attributes. The 546 assignment of RTs described in section 8 MUST be followed. 548 It is to be noted that this document does not require MESes to create 549 forwarding state for remote MACs when they are learned in the control 550 plane. When this forwarding state is actually created is a local 551 implementation matter. 553 10. Designated Forwarder Election 555 Consider a CE that is a host or a router that is multi-homed directly 556 to more than one MES in a MAC-VPN on a given ethernet segment. One or 557 more Ethernet Tags may be configured on the ethernet segment.
In this 558 scenario only one of the MESes, referred to as the Designated 559 Forwarder (DF), is responsible for certain actions: 561 - Sending multicast and broadcast traffic, on a given Ethernet 562 Tag 563 on a particular ethernet segment, to the CE. Note that 564 this behavior, which allows selecting a DF at the 565 granularity of for multicast and 566 broadcast 567 traffic is the default behavior in this specification. 568 Optional mechanisms, which will be specified in the 569 future, will allow selecting a DF at the granularity of 570 . 572 - Flooding unknown unicast traffic (i.e. traffic for 573 which a MES does not know the destination MAC address), 574 on a given Ethernet Tag on a particular ethernet segment to 575 the CE, 576 if the environment requires flooding of unknown unicast 577 traffic. 579 Note that a CE always sends packets using a single link. For instance 580 if the CE is a host then, as mentioned earlier, the host treats the 581 multiple links that it uses to reach the MESes as a Link Aggregation 582 Group (LAG). 584 If a bridge network is multi-homed to more than one MES in a MAC-VPN 585 via switches, then the support of active-active points of attachments 586 as described in this specification requires the bridge network to be 587 connected to two or more MESes using a LAG. In this case the reasons 588 for doing DF election are the same as those described above when a CE 589 is a host or a router. 591 If a bridge network does not connect to the MESes using LAG, then 592 only one of the links between a CE that is a switch and the MESes 593 must be the active link. Procedures for supporting active-active 594 points of attachments, when a bridge network does not connect to the 595 MESes using LAG, are for further study. 597 The granularity of the DF election MUST be at least the ethernet 598 segment via which the CE is multi-homed to the MESes. 
If the DF 599 election is done at the ethernet segment granularity then a single 600 MES MUST be elected as the DF on the ethernet segment. 602 If there are one or more Ethernet Tags (e.g., VLANs) on the ethernet 603 segment then the granularity of the DF election SHOULD be the 604 combination of the ethernet segment and Ethernet Tag on that ethernet 605 segment. In this case the same MES MUST be elected as the DF for a 606 particular Ethernet Tag on that ethernet segment. 608 The MESes perform a designated forwarder (DF) election for an 609 ethernet segment, or for a combination of ethernet segment and Ethernet Tag, using 610 the Ethernet Tag A-D BGP route described in section 8. 612 The DF election for a particular ESI or a particular <ESI, Ethernet Tag> combination proceeds as follows. First a MES constructs a 614 candidate list of MESes. This comprises all the Ethernet Tag A-D 615 routes with that particular ESI or <ESI, Ethernet Tag> tuple that a 616 MES imports in a MAC-VPN instance, including the Ethernet Tag A-D 617 route generated by the MES itself, if any. The DF MES is chosen from 618 this candidate list. Note that DF election is carried out by all the 619 MESes that import the DF route. 621 The default procedure is to choose as the DF the MES with the highest 622 IP address among all the MESes in the candidate list. This procedure 623 MUST be implemented. It ensures that, except during routing transients, 624 each MES chooses the same DF MES for a given ESI and Ethernet Tag 625 combination. 627 Alternative procedures for performing DF election are possible 628 and will be described in the future. 630 11. Handling of Broadcast, Multicast and Unknown Unicast Traffic 632 Procedures are required for a given MES to send broadcast or 633 multicast traffic, received from a CE encapsulated in a given 634 Ethernet Tag in a MAC VPN, to all the other MESes that span that 635 Ethernet Tag in the MAC VPN.
In certain scenarios, described in section 12, a given MES may also
need to flood unknown unicast traffic to other MESes.

The MESes in a particular MAC-VPN may use ingress replication or P2MP
LSPs to send unknown unicast, broadcast or multicast traffic to other
MESes.

Each MES MUST advertise an "Inclusive Multicast Ethernet Tag Route"
to enable the above. This section provides the encoding and an
overview of the Inclusive Multicast Ethernet Tag route. Subsequent
sections describe its usage in further detail.

An Inclusive Multicast Ethernet Tag route type specific MAC-VPN NLRI
consists of the following:

   +---------------------------------------+
   | RD (8 octets)                         |
   +---------------------------------------+
   | Ethernet Segment Identifier (8 octets)|
   +---------------------------------------+
   | Ethernet Tag ID (4 octets)            |
   +---------------------------------------+
   | Originating Router's IP Addr          |
   +---------------------------------------+

The RD MUST be the RD of the MAC-VPN instance that is advertising the
NLRI. The procedures for setting the RD for a given MAC VPN are
described in section 8.

The Ethernet Segment Identifier MAY be set to the eight octet ESI
identifier described in section 6, or it MAY be set to 0. It MUST be
set to 0 if the Ethernet Tag is set to 0.

The Ethernet Tag ID is the identifier of the Ethernet Tag. It MAY be
set to 0, in which case an egress MES MUST perform a MAC lookup to
forward the packet.

The Originating Router's IP address MUST be set to an IP address of
the PE. This address SHOULD be common for all the MVIs on the PE
(e.g., this address may be the PE's loopback address).

The Next Hop field of the MP_REACH_NLRI attribute of the route MUST
be set to the same IP address as the one carried in the Originating
Router's IP Address field.
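
As an illustration of the field layout above, the following Python
sketch packs the four fields in the order shown and enforces the rule
that the ESI is 0 whenever the Ethernet Tag is 0. It is not part of
this specification: the function name, the use of a textual IP
address as input, and the omission of any route type or length octets
that may precede these fields on the wire are assumptions made only
for the example.

```python
import ipaddress
import struct


def encode_imet_fields(rd: bytes, esi: bytes, ethernet_tag: int,
                       originator_ip: str) -> bytes:
    """Pack RD, ESI, Ethernet Tag ID and originator IP (hypothetical helper)."""
    if len(rd) != 8:
        raise ValueError("RD must be 8 octets")
    if len(esi) != 8:
        raise ValueError("ESI must be 8 octets")
    if not 0 <= ethernet_tag <= 0xFFFFFFFF:
        raise ValueError("Ethernet Tag ID must fit in 4 octets")
    # The ESI MUST be 0 if the Ethernet Tag is 0.
    if ethernet_tag == 0 and esi != bytes(8):
        raise ValueError("ESI must be 0 when the Ethernet Tag is 0")
    # Assumption: the address is carried in its packed form
    # (4 octets for IPv4, 16 octets for IPv6).
    ip_octets = ipaddress.ip_address(originator_ip).packed
    # Network byte order, fields in the order of the diagram.
    return rd + esi + struct.pack("!I", ethernet_tag) + ip_octets
```

For example, encode_imet_fields(rd, bytes(8), 100, "192.0.2.1")
yields 24 octets for an IPv4 originator: 8 for the RD, 8 for the ESI,
4 for the Ethernet Tag ID and 4 for the address.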

The BGP advertisement that advertises the Inclusive Multicast
Ethernet Tag route MUST also carry one or more Route Target (RT)
attributes. The assignment of RTs described in section 8 MUST be
followed.

11.1. P-Tunnel Identification

In order to identify the P-Tunnel used for sending broadcast, unknown
unicast or multicast traffic, the Inclusive Multicast Ethernet Tag
route MUST carry a "PMSI Tunnel Attribute" specified in [BGP MVPN].

Depending on the technology used for the P-tunnel for the MAC VPN on
the PE, the PMSI Tunnel attribute of the Inclusive Multicast Ethernet
Tag route is constructed as follows.

   + If the PE that originates the advertisement uses a P-Multicast
     tree for the P-tunnel for the MAC VPN, the PMSI Tunnel attribute
     MUST contain the identity of the tree (note that the PE could
     create the identity of the tree prior to the actual
     instantiation of the tree).

   + A PE that uses a P-Multicast tree for the P-tunnel MAY aggregate
     two or more Ethernet Tags in the same or different MAC VPNs
     present on the PE onto the same tree. In this case, in addition
     to carrying the identity of the tree, the PMSI Tunnel attribute
     MUST carry an MPLS upstream assigned label which the PE has
     bound uniquely to the <ESI, Ethernet Tag> for the MAC VPN
     associated with this update (as determined by its RTs).

     If the PE has already advertised Inclusive Multicast Ethernet
     Tag routes for two or more Ethernet Tags that it now desires to
     aggregate, then the PE MUST re-advertise those routes. The
     re-advertised routes MUST be the same as the original ones,
     except for the PMSI Tunnel attribute and the label carried in
     that attribute.

   + If the PE that originates the advertisement uses ingress
     replication for the P-tunnel for the MAC VPN, the route MUST
     include the PMSI Tunnel attribute with the Tunnel Type set to
     Ingress Replication and the Tunnel Identifier set to a routable
     address of the PE. The PMSI Tunnel attribute MUST carry a
     downstream assigned MPLS label. This label is used to
     demultiplex the broadcast, multicast or unknown unicast MAC VPN
     traffic received over a unicast tunnel by the PE.

   + The Leaf Information Required flag of the PMSI Tunnel attribute
     MUST be set to zero, and MUST be ignored on receipt.

11.2. Ethernet Segment Identifier and Ethernet Tag

As described above, the encoding rules allow setting the Ethernet
Segment Identifier and Ethernet Tag to either valid values or to 0.
If the Ethernet Tag is set to a valid value, then an egress MES can
forward the packet to the set of egress ESIs in the Ethernet Tag, in
the MAC VPN, by performing an MPLS lookup alone. Further, if the ESI
is also set to a non-zero value then the egress MES does not need to
replicate the packet, as it is destined for a given ethernet segment.
If both Ethernet Tag and ESI are set to 0 then an egress MES MUST
perform a MAC lookup in the MVI determined by the MPLS label, after
the MPLS lookup, to forward the packet.

If a MES advertises multiple Inclusive Multicast Ethernet Tag routes
for a given MAC VPN then the PMSI Tunnel Attributes for these routes
MUST be distinct.

12. Processing of Unknown Unicast Packets

The procedures in this document do not require MESes to flood unknown
unicast traffic to other MESes. If MESes learn CE MAC addresses via a
control plane, the MESes can then distribute MAC addresses via BGP,
and all unicast MAC addresses will be learnt prior to traffic to
those destinations.

However, if a destination MAC address of a received packet is not
known by the MES, the MES may have to flood the packet. Flooding must
take into account "split horizon forwarding" as follows. The
principles behind the following procedures are borrowed from the
split horizon forwarding rules in VPLS solutions [RFC4761] [RFC4762].
When a MES capable of flooding (say MESx) receives a broadcast
Ethernet frame, or one with an unknown destination MAC address, it
must flood the frame. If the frame arrived from an attached CE, MESx
must send a copy of the frame to every other attached CE, as well as
to all other MESes participating in the MAC VPN. If, on the other
hand, the frame arrived from another MES (say MESy), MESx must send a
copy of the packet only to attached CEs. MESx MUST NOT send the frame
to other MESes, since MESy would have already done so. Split horizon
forwarding rules apply to broadcast and multicast packets, as well as
packets to an unknown MAC address.

Whether or not to flood packets to unknown destination MAC addresses
should be an administrative choice, depending on how learning happens
between CEs and MESes.

The MESes in a particular MAC VPN may use ingress replication using
RSVP-TE P2P LSPs or LDP MP2P LSPs for sending broadcast, multicast
and unknown unicast traffic to other MESes. Or they may use RSVP-TE
or LDP P2MP LSPs for sending such traffic to other MESes.

12.1. Ingress Replication

If ingress replication is in use, the P-Tunnel attribute, carried in
the Inclusive Multicast Ethernet Tag routes (section 11) for the MAC
VPN, specifies the downstream label that the other MESes can use to
send unknown unicast, multicast or broadcast traffic for the MAC VPN
to this particular MES.

The MES that receives a packet with this particular MPLS label MUST
treat the packet as a broadcast, multicast or unknown unicast packet.
Further, if the destination MAC address is a unicast MAC address, the
MES MUST treat the packet as an unknown unicast packet.

12.2. P2MP MPLS LSPs

The procedures for using P2MP LSPs are very similar to VPLS
procedures [VPLS-MCAST]. The P-Tunnel attribute used by a MES for
sending unknown unicast, broadcast or multicast traffic for a
particular ethernet segment is advertised in the Inclusive Multicast
Ethernet Tag route as described in section 11.

The P-Tunnel attribute specifies the P2MP LSP identifier. This is the
equivalent of an Inclusive tree in [VPLS-MCAST]. Note that multiple
Ethernet Tags, which may be in different MAC-VPNs, may use the same
P2MP LSP, using upstream labels [VPLS-MCAST]. When P2MP LSPs are used
for flooding unknown unicast traffic, packet re-ordering is possible.

The MES that receives a packet on the P2MP LSP specified in the PMSI
Tunnel Attribute MUST treat the packet as a broadcast, multicast or
unknown unicast packet. Further, if the destination MAC address is a
unicast MAC address, the MES MUST treat the packet as an unknown
unicast packet.

13. Forwarding Unicast Packets

13.1. Forwarding packets received from a CE

When a MES receives a packet from a CE, on a given Ethernet Tag, it
must first look up the source MAC address of the packet. In certain
environments the source MAC address may be used to authenticate the
CE and determine that traffic from the host can be allowed into the
network.

If the MES decides to forward the packet, the destination MAC address
of the packet must be looked up. If the MES has received MAC address
advertisements for this destination MAC address from one or more
other MESes or learned it from locally connected CEs, it is
considered a known MAC address. Otherwise the MAC address is
considered an unknown MAC address.

For known MAC addresses the MES forwards this packet to one of the
remote MESes.
The packet is encapsulated in the MAC-VPN MPLS label advertised by
the remote MES, for that MAC address, and in the MPLS LSP label stack
to reach the remote MES.

If the MAC address is unknown then, if the administrative policy on
the MES requires flooding of unknown unicast traffic:

   - The MES MUST flood the packet to other MESes. If the ESI over
     which the MES receives the packet is multi-homed, then the MES
     MUST first encapsulate the packet in the ESI MPLS label as
     described in section 14. If ingress replication is used the
     packet MUST be replicated one or more times to each remote MES
     with the bottom label of the stack being an MPLS label
     determined as follows. This is the MPLS label advertised by the
     remote MES in a PMSI Tunnel Attribute in the Inclusive Multicast
     Ethernet Tag route for an <ESI, Ethernet Tag> combination. The
     Ethernet Tag in the route must be the same as the Ethernet Tag
     advertised by the ingress MES in its Ethernet Tag A-D route
     associated with the interface on which the ingress MES receives
     the packet. If P2MP LSPs are being used the packet MUST be sent
     on the P2MP LSP that the MES is the root of for the Ethernet Tag
     in the MAC-VPN. If the same P2MP LSP is used for all Ethernet
     Tags then all the MESes in the MAC VPN MUST be the leaves of the
     P2MP LSP. If a distinct P2MP LSP is used for a given Ethernet
     Tag in the MAC VPN then only the MESes in the Ethernet Tag MUST
     be the leaves of the P2MP LSP. The packet MUST be encapsulated
     in the P2MP LSP label stack.

If the MAC address is unknown then, if the administrative policy on
the MES does not allow flooding of unknown unicast traffic:

   - The MES MUST drop the packet.

13.2. Forwarding packets received from a remote MES

13.2.1. Unknown Unicast Forwarding

When a MES receives an MPLS packet from a remote MES then, after
processing the MPLS label stack, if the top MPLS label ends up being
a P2MP LSP label associated with a MAC-VPN or the downstream label
advertised in the P-Tunnel attribute, and after performing the split
horizon procedures described in section 14:

   - If the MES is the designated forwarder of unknown unicast,
     broadcast or multicast traffic, on a particular set of ESIs for
     the Ethernet Tag, the default behavior is for the MES to flood
     the packet on the ESIs. In other words the default behavior is
     for the MES to assume that the destination MAC address is
     unknown unicast, broadcast or multicast, and it is not required
     to do a destination MAC address lookup, as long as the
     granularity of the MPLS label included the Ethernet Tag. As an
     option the MES may do a destination MAC lookup to flood the
     packet to only a subset of the CE interfaces in the Ethernet
     Tag. For instance the MES may decide not to flood an unknown
     unicast packet on certain ethernet segments even if it is the DF
     on the ethernet segment, based on administrative policy.

   - If the MES is not the designated forwarder on any of the ESIs
     for the Ethernet Tag, the default behavior is for it to drop the
     packet.

13.2.2. Known Unicast Forwarding

If the top MPLS label ends up being a MAC-VPN label that was
advertised in the unicast MAC advertisements, then the MES either
forwards the packet based on CE next-hop forwarding information
associated with the label or does a destination MAC address lookup to
forward the packet to a CE.

14. Split Horizon

Consider a CE that is multi-homed to two or more MESes on an ethernet
segment ES1. If the CE sends a multicast, broadcast or unknown
unicast packet to a particular MES, say MES1, then MES1 will forward
that packet to all or a subset of the other MESes in the MAC VPN.
In this case the MESes, other than MES1, that the CE is multi-homed
to MUST drop the packet and not forward it back to the CE. This is
referred to as "split horizon" in this document.

In order to accomplish this each MES distributes to other MESes that
are connected to the ethernet segment an "Ethernet Segment Route".

An Ethernet Segment route type specific MAC-VPN NLRI consists of the
following:

   +---------------------------------------+
   | RD (8 octets)                         |
   +---------------------------------------+
   | Ethernet Segment Identifier (8 octets)|
   +---------------------------------------+
   | MPLS Label (3 octets)                 |
   +---------------------------------------+
   | Originating Router's IP Addr          |
   +---------------------------------------+

The RD MUST be the RD of the MAC-VPN instance that is advertising the
NLRI. The procedures for setting the RD for a given MAC VPN are
described in section 8.

The Ethernet Segment Identifier MUST be set to the eight octet ESI
identifier described in section 6.

The MPLS label is referred to as an "ESI label". This label MUST be a
downstream assigned MPLS label if the advertising MES is using
ingress replication for sending multicast, broadcast or unknown
unicast traffic to other MESes. If the advertising MES is using P2MP
MPLS LSPs for the same, then this label MUST be an upstream assigned
MPLS label. The usage of this label is described below.

The Originating Router's IP address MUST be set to an IP address of
the PE. This address SHOULD be common for all the MVIs on the PE
(e.g., this address may be the PE's loopback address).

The Next Hop field of the MP_REACH_NLRI attribute of the route MUST
be set to the same IP address as the one carried in the Originating
Router's IP Address field.

The BGP advertisement that advertises the Ethernet Segment route
MUST also carry one Route Target (RT) attribute.
The construction of this RT will be specified in the next version.

This route will be enhanced to carry LAG specific information such as
LACP parameters in the future.

14.1. ESI MPLS Label: Ingress Replication

A MES that is using ingress replication for sending broadcast,
multicast or unknown unicast traffic distributes to other MESes that
belong to the ethernet segment a downstream assigned "ESI MPLS label"
in the Ethernet Segment route. This label MUST be programmed in the
platform label space by the advertising MES. Further, the forwarding
entry for this label must result in NOT forwarding packets received
with this label onto the ethernet segment that the label was
distributed for.

Consider MES1 and MES2, to which CE1 is multi-homed on ES1. Further
consider that MES1 is using P2P or MP2P LSPs to send packets to MES2.
Consider that MES1 receives a multicast, broadcast or unknown unicast
packet from CE1 on VLAN1 on ESI1.

First consider the case where MES2 distributes a unique Inclusive
Multicast Ethernet Tag route for VLAN1, for each ethernet segment on
MES2. In this case MES1 MUST NOT replicate the packet to MES2 for
<ESI1, VLAN1>.

Next consider the case where MES2 distributes a single Inclusive
Multicast Ethernet Tag route for VLAN1 for all ethernet segments on
MES2. In this case when MES1 sends a multicast, broadcast or unknown
unicast packet that it receives from CE1, it MUST first push onto
the MPLS label stack the ESI label that MES2 has distributed for
ESI1. It MUST then push on the MPLS label distributed by MES2 in the
Inclusive Multicast Ethernet Tag route for VLAN1. The resulting
packet is further encapsulated in the P2P or MP2P LSP label stack
required to transmit the packet to MES2.
When MES2 receives this packet it determines the set of ESIs to
replicate the packet to from the top MPLS label, after any P2P or
MP2P LSP labels have been removed. If the next label is the ESI label
assigned by MES2 then MES2 MUST NOT forward the packet onto ESI1.

14.2. ESI MPLS Label: P2MP MPLS LSPs

A MES that is using P2MP LSPs for sending broadcast, multicast or
unknown unicast traffic distributes to other MESes that belong to
the ethernet segment an upstream assigned "ESI MPLS label" in the
Ethernet Segment route. This label is upstream assigned by the MES
that advertises the route. This label MUST be programmed by the other
MESes that are connected to the ESI advertised in the route, in the
context label space for the advertising MES. Further, the forwarding
entry for this label must result in NOT forwarding packets received
with this label onto the ethernet segment that the label was
distributed for.

Consider MES1 and MES2, to which CE1 is multi-homed on ES1. Further
assume that MES1 is using P2MP MPLS LSPs to send broadcast, multicast
or unknown unicast packets. When MES1 sends a multicast, broadcast or
unknown unicast packet that it receives from CE1, it MUST first push
onto the MPLS label stack the ESI label that it has assigned for the
ESI that the packet was received on. The resulting packet is further
encapsulated in the P2MP MPLS label stack necessary to transmit the
packet to the other MESes. Penultimate hop popping MUST be disabled
on the P2MP LSPs used in the MPLS transport infrastructure for MAC
VPN. When MES2 receives this packet it decapsulates the top MPLS
label and forwards the packet using the context label space
determined by the top label. If the next label is the ESI label
assigned by MES1 then MES2 MUST NOT forward the packet onto ESI1.

15. Load Balancing of Unicast Packets

This section specifies how load balancing is achieved to/from a CE
that has more than one interface that is directly connected to one or
more MESes. The CE may be a host or a router or it may be a switched
network that is connected via LAG to the MESes.

15.1. Load balancing of traffic from a MES to remote CEs

Whenever a remote MES imports a MAC advertisement for a given
<ESI, Ethernet Tag> in a MAC VPN instance, it MUST consider the MAC
as reachable via all the MESes from which it has imported Ethernet
Tag A-D routes for that <ESI, Ethernet Tag>. Further, the remote MES
MUST use these MAC advertisement and Ethernet Tag A-D routes to
construct the set of next-hops that it can use to send the packet to
the destination MAC. Each next-hop comprises an MPLS label that is to
be used by the egress MES to forward the packet. This label is
determined as follows. If the next-hop is constructed as a result of
a MAC route which has a valid MPLS label, then this label MUST be
used. However, if the MAC route doesn't have a valid MPLS label or if
the next-hop is constructed as a result of an Ethernet Tag A-D route,
then the MPLS label from the Ethernet Tag A-D route MUST be used.

Consider a CE, CE1, that is dual homed to two MESes, MES1 and MES2,
on a LAG interface, ES1, and is sending packets with MAC address MAC1
on VLAN1. Based on the MAC-VPN extensions described in sections 8 and
9, a remote MES, say MES3, is able to learn that MAC1 is reachable
via MES1 and MES2. Both MES1 and MES2 may advertise MAC1 in BGP if
they receive packets with MAC1 from CE1. If this is not the case and
if MAC1 is advertised only by MES1, MES3 still considers MAC1 as
reachable via both MES1 and MES2, as both MES1 and MES2 advertise an
Ethernet Tag A-D route for <ESI1, VLAN1>.
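
The next-hop construction described above can be sketched in Python
as follows. This is an illustrative sketch only, not a normative
procedure: the record types MacRoute and AdRoute, the function names,
and the use of CRC-32 over an opaque flow key to pick a next-hop are
all assumptions introduced for the example. The sketch also assumes
that every MAC route for the MAC carries the same ESI; the MAC move
case of section 16 is not handled.

```python
import zlib
from typing import Dict, List, NamedTuple, Optional, Tuple


class MacRoute(NamedTuple):      # hypothetical model of a MAC advertisement
    mes: str                     # advertising MES
    esi: str
    tag: int                     # Ethernet Tag
    mac: str
    label: Optional[int]         # None models "no valid MPLS label"


class AdRoute(NamedTuple):       # hypothetical model of an Ethernet Tag A-D route
    mes: str
    esi: str
    tag: int
    label: int


def build_next_hops(mac: str, mac_routes: List[MacRoute],
                    ad_routes: List[AdRoute]) -> Dict[str, int]:
    """Return {MES: MPLS label} for every MES via which the MAC is reachable."""
    next_hops: Dict[str, int] = {}
    segments = {(r.esi, r.tag) for r in mac_routes if r.mac == mac}
    # Every MES with an A-D route for the same <ESI, Ethernet Tag> is a
    # next-hop; start with the label from its A-D route.
    for ad in ad_routes:
        if (ad.esi, ad.tag) in segments:
            next_hops[ad.mes] = ad.label
    # A valid label in a MAC route takes precedence for that MES.
    for r in mac_routes:
        if r.mac == mac and r.label is not None:
            next_hops[r.mes] = r.label
    return next_hops


def pick_next_hop(next_hops: Dict[str, int], flow_key: bytes) -> Tuple[str, int]:
    """Hash a flow key (e.g. IP flow or MAC pair bytes) onto one next-hop."""
    if not next_hops:
        raise LookupError("no next-hop available for this MAC")
    ordered = sorted(next_hops.items())
    return ordered[zlib.crc32(flow_key) % len(ordered)]
```

With MAC1 advertised only by MES1 and Ethernet Tag A-D routes for the
same segment from both MES1 and MES2, build_next_hops returns two
entries: MES1 with the label from its MAC route and MES2 with the
label from its A-D route.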

The MPLS label stack to send the packets to MES1 is the MPLS LSP
stack to get to MES1 and the MAC-VPN label advertised by MES1 for
CE1's MAC.

The MPLS label stack to send packets to MES2 is the MPLS LSP stack to
get to MES2 and the upstream assigned label in the Ethernet Tag A-D
route advertised by MES2 for <ESI1, VLAN1>, if MES2 has not
advertised MAC1 in BGP.

We will refer to these label stacks as MPLS next-hops.

The remote MES, MES3, can now load balance the traffic it receives
from its CEs, destined for CE1, between MES1 and MES2. For IP
traffic, MES3 may use the IP flow information to hash into one of the
MPLS next-hops. Or MES3 may rely on the source and destination MAC
addresses for load balancing.

Note that once MES3 decides to send a particular packet to MES1 or
MES2 it can pick from more than one path to reach the particular
remote MES using regular MPLS procedures. For instance, if the
tunneling technology is based on RSVP-TE LSPs, and MES3 decides to
send a particular packet to MES1, then MES3 can choose from multiple
RSVP-TE LSPs that have MES1 as their destination.

When MES1 or MES2 receives the packet destined for CE1 from MES3, if
the packet is a unicast MAC packet it is forwarded to CE1. If it is
a multicast or broadcast MAC packet then only one of MES1 or MES2
must forward the packet to the CE. Which of MES1 or MES2 forwards
this packet to the CE is determined by default based on which of the
two is the DF. An alternate procedure to load balance multicast
packets will be described in the future.

If the connectivity between the multi-homed CE and one of the MESes
that it is multi-homed to fails, the MES MUST withdraw the MAC
address from BGP.
This enables the remote MESes to remove the MPLS next-hop to this
particular MES from the set of MPLS next-hops that can be used to
forward traffic to the CE. For further details and procedures on
withdrawal of MAC VPN route types in the event of MES to CE failures
please see section 18.3.

15.2. Load balancing of traffic between a MES and a local CE

A CE may be configured with more than one interface connected to
different MESes or the same MES for load balancing. The MES(s) and
the CE can load balance traffic onto these interfaces using one of
the following mechanisms.

15.2.1. Data plane learning

Consider that the MESes perform data plane learning for local MAC
addresses learned from local CEs. This enables the MES(s) to learn a
particular MAC address and associate it with one or more interfaces.
The MESes can now load balance traffic destined to that MAC address
on the multiple interfaces.

Whether the CE can load balance traffic that it generates on the
multiple interfaces is dependent on the CE implementation.

15.2.2. Control plane learning

The CE can be a host that advertises the same MAC address using a
control protocol on both interfaces. This enables the MES(s) to learn
the host's MAC address and associate it with one or more interfaces.
The MESes can now load balance traffic destined to the host on the
multiple interfaces. The host can also load balance the traffic it
generates onto these interfaces, and the MES that receives the
traffic employs MAC-VPN forwarding procedures to forward the traffic.

16. MAC Moves

In the case where a CE is a host or a switched network connected to
hosts, the MAC address that is reachable via a given MES on a
particular ESI may move such that it becomes reachable via another
MES on another ESI. This is referred to as a "MAC Move".

Remote MESes must be able to distinguish a MAC move from the case
where a MAC address on an ESI is reachable via two different MESes
and load balancing is performed as described in section 15. This
distinction can be made as follows. If a MAC is learned by a
particular MES from multiple MESes, then the MES performs load
balancing only amongst the set of MESes that advertised the MAC with
the same ESI. If this is not the case then the MES chooses only one
of the advertising MESes to reach the MAC as per BGP path selection.

There can be traffic loss during a MAC move. Consider MAC1 that is
advertised by MES1 and learned from CE1 on ESI1. If MAC1 now moves
behind MES2, on ESI2, MES2 advertises the MAC in BGP. Until a remote
MES, MES3, determines that the best path is via MES2, it will
continue to send traffic destined for MAC1 to MES1. This will not
occur deterministically until MES1 withdraws the advertisement for
MAC1.

One recommended optimization to reduce the traffic loss during MAC
moves is the following. When a MES sees a MAC update from a CE on an
ESI which is different from the ESI on which the MES has currently
learned the MAC, the corresponding entry in the local bridge
forwarding table SHOULD be immediately purged, causing the MES to
withdraw its own MAC-VPN MAC advertisement route and replace it with
the update.

A future version of this specification will describe other optimized
procedures to minimize traffic loss during MAC moves.

17. Multicast

The MESes in a particular MAC-VPN may use ingress replication or P2MP
LSPs to send multicast traffic to other MESes.

17.1. Ingress Replication

The MESes may use ingress replication for flooding unknown unicast,
multicast or broadcast traffic as described in section 11. A given
unknown unicast or broadcast packet must be sent to all the remote
MESes.
However, a given multicast packet for a multicast flow may be sent to
only a subset of the MESes. Specifically, a given multicast flow may
be sent to only those MESes that have receivers that are interested
in the multicast flow. Determining which of the MESes have receivers
for a given multicast flow is done using explicit tracking, described
below.

17.2. P2MP LSPs

A MES may use an "Inclusive" tree for sending an unknown unicast,
broadcast or multicast packet or a "Selective" tree. This terminology
is borrowed from [VPLS-MCAST].

A variety of transport technologies may be used in the SP network.
For inclusive P-Multicast trees, these transport technologies include
point-to-multipoint LSPs created by RSVP-TE or mLDP. For selective
P-Multicast trees, only unicast MES-MES tunnels (using MPLS or IP/GRE
encapsulation) and P2MP LSPs are supported, and the supported P2MP
LSP signaling protocols are RSVP-TE and mLDP.

17.2.1. Inclusive Trees

An Inclusive Tree allows the use of a single multicast distribution
tree, referred to as an Inclusive P-Multicast tree, in the SP network
to carry all the multicast traffic from a specified set of MAC VPN
instances on a given MES. A particular P-Multicast tree can be set up
to carry the traffic originated by sites belonging to a single MAC
VPN, or to carry the traffic originated by sites belonging to
different MAC VPNs. The ability to carry the traffic of more than one
MAC VPN on the same tree is termed 'Aggregation'. The tree needs to
include every MES that is a member of any of the MAC VPNs that are
using the tree. This implies that a MES may receive multicast traffic
for a multicast stream even if it doesn't have any receivers that are
interested in receiving traffic for that stream.

An Inclusive P-Multicast tree as defined in this document is a P2MP
tree.
A P2MP tree is used to carry traffic only for MAC VPN CEs that are
connected to the MES that is the root of the tree.

The procedures for signaling an Inclusive Tree are the same as those
in [VPLS-MCAST] with the VPLS A-D route replaced with the Inclusive
Multicast Ethernet Tag route. The P-Tunnel attribute [VPLS-MCAST] for
an Inclusive tree is advertised in the Inclusive Multicast Ethernet
Tag route as described in section 11. Note that a MES can "aggregate"
multiple inclusive trees for different MAC-VPNs on the same P2MP LSP
using upstream labels. The procedures for aggregation are the same as
those described in [VPLS-MCAST], with VPLS A-D routes replaced by
MAC-VPN Inclusive Multicast Ethernet Tag routes.

17.2.2. Selective Trees

A Selective P-Multicast tree is used by a MES to send IP multicast
traffic for one or more specific IP multicast streams, originated by
CEs connected to the MES, that belong to the same or different MAC
VPNs, to a subset of the MESes that belong to those MAC VPNs. Each of
the MESes in the subset should be on the path to a receiver of one or
more multicast streams that are mapped onto the tree. The ability to
use the same tree for multicast streams that belong to different MAC
VPNs is termed 'Aggregation'. A Selective P-Multicast tree gives a
MES the ability to create separate SP multicast trees for specific
multicast streams, e.g. high bandwidth multicast streams. This allows
traffic for these multicast streams to reach only those MES routers
that have receivers in these streams. This avoids flooding other MES
routers in the MAC VPN.

An SP can use both Inclusive P-Multicast trees and Selective
P-Multicast trees, or either of them, for a given MAC VPN on a MES,
based on local configuration.

The granularity of a selective tree is a tuple that includes S and G,
where S is an IP multicast source address and G is an IP multicast
group address or G is a multicast MAC address.
Wildcard sources and wildcard groups are supported. Selective trees
require explicit tracking as described below.

A MAC-VPN MES advertises a selective tree using a MAC-VPN Selective
A-D route. The procedures are the same as those in [VPLS-MCAST] with
S-PMSI A-D routes in [VPLS-MCAST] replaced by MAC-VPN Selective A-D
routes. The information elements of the MAC VPN Selective A-D route
are similar to those of the VPLS S-PMSI A-D route with the following
differences. A MAC VPN Selective A-D route includes an optional
Ethernet Tag field. Also a MAC VPN Selective A-D route may encode a
MAC address in the Group field. The encoding details of the MAC VPN
Selective A-D route will be described in the next revision.

Selective trees can also be aggregated on the same P2MP LSP using
aggregation as described in [VPLS-MCAST].

17.3. Explicit Tracking

[VPLS-MCAST] describes procedures for explicit tracking that rely on
Leaf A-D routes. The same procedures are used for explicit tracking
in this specification with VPLS Leaf A-D routes replaced with MAC-VPN
Leaf A-D routes. These procedures allow a root MES to request
multicast membership information for a given (S, G) from leaf MESes.
Leaf MESes rely on IGMP snooping or PIM snooping between the MES and
the CE to determine the multicast membership information. Note that
the procedures in [VPLS-MCAST] do not describe how explicit tracking
is performed if the CEs are enabled with join suppression. The
procedures for this case will be described in a future version.

18. Convergence

This section describes failure recovery from different types of
network failures.

18.1. Transit Link and Node Failures between MESes

The use of existing MPLS Fast-Reroute mechanisms can provide failure
recovery in the order of 50ms, in the event of transit link and node
failures in the infrastructure that connects the MESes.

18.2. MES Failures

Consider a host, host1, that is dual homed to MES1 and MES2. If MES1
fails, a remote MES, MES3, can discover this based on the failure of
the BGP session. This failure detection can be in the sub-second
range if BFD is used to detect BGP session failure. MES3 can update
its forwarding state to start sending all traffic for host1 to only
MES2. It is to be noted that this failure recovery is potentially
faster than what would be possible if data plane learning were to be
used, as in that case MES3 would have to rely on re-learning of MAC
addresses via MES2.

18.2.1. Local Repair

It is possible to perform local repair in the case of MES failures.
Details will be specified in the future.

18.3. MES to CE Network Failures

When an ethernet segment connected to a MES fails or when an Ethernet
Tag is deconfigured on an ethernet segment, then the MES MUST
withdraw the Ethernet Tag A-D route(s) announced for the
<ESI, Ethernet Tag> combinations that are impacted by the failure or
de-configuration. In addition the MES MUST also withdraw the MAC
advertisement routes that are impacted by the failure or
de-configuration.

The Ethernet Tag A-D routes should be used by an implementation to
optimize the withdrawal of MAC advertisement routes. When a MES
receives a withdrawal of a particular Ethernet Tag A-D route it
SHOULD consider all the MAC advertisement routes that are learned
from the same <ESI, Ethernet Tag> as in the Ethernet Tag A-D route
as having been withdrawn. This optimizes the network convergence
times in the event of MES to CE failures.

19. Acknowledgements

We would like to thank Yakov Rekhter, Kaushik Ghosh, Nischal Sheth
and Amit Shukla for discussions that helped shape this document. We
would also like to thank Han Nguyen for his comments and support of
this work.

20. References

[RFC4364] Rosen, E., Rekhter, Y., et al., "BGP/MPLS IP VPNs",
RFC 4364, February 2006.

[VPLS-MCAST] Aggarwal, R., et al., "Multicast in VPLS",
draft-ietf-l2vpn-vpls-mcast-04.txt

[RFC4761] Kompella, K. and Y. Rekhter, "Virtual Private LAN Service
(VPLS) Using BGP for Auto-Discovery and Signaling", RFC 4761,
January 2007.

[RFC4762] Lasserre, M. and V. Kompella, "Virtual Private LAN Service
(VPLS) Using Label Distribution Protocol (LDP) Signaling", RFC 4762,
January 2007.

[VPLS-MULTIHOMING] Kompella, K., et al., "BGP based Multi-homing in
Virtual Private LAN Service",
draft-ietf-l2vpn-vpls-multihoming-00.txt

[PIM-SNOOPING] Hemige, V., et al., "PIM Snooping over VPLS",
draft-ietf-l2vpn-vpls-pim-snooping-01

[IGMP-SNOOPING] Christensen, M., et al., "Considerations for Internet
Group Management Protocol (IGMP) and Multicast Listener Discovery
(MLD) Snooping Switches", RFC 4541.

21. Author's Address

Rahul Aggarwal
Juniper Networks
1194 N. Mathilda Ave.
Sunnyvale, CA 94089 US
Email: rahul@juniper.net

Aldrin Isaac
Bloomberg
Email: aisaac71@bloomberg.net

James Uttaro
AT&T
200 S. Laurel Avenue
Middletown, NJ 07748
USA
Email: uttaro@att.com

Ravi Shekhar
Juniper Networks
1194 N. Mathilda Ave.
Sunnyvale, CA 94089 US

Wim Henderickx
Alcatel-Lucent
Email: wim.henderickx@alcatel-lucent.be

Florin Balus
Alcatel-Lucent
Email: Florin.Balus@alcatel-lucent.be