Internet Working Group                                  Ali Sajassi, Ed.
Internet Draft                                               Samer Salam
Category: Standards Track                                          Cisco
                                                             Nabil Bitar
                                                                 Verizon
                                                            Aldrin Isaac
                                                               Bloomberg
                                                          Wim Henderickx
                                                          Alcatel-Lucent
                                                             Lizhong Jin
                                                                     ZTE
Expires: December 18, 2014                                 June 18, 2014


                                PBB-EVPN
                      draft-ietf-l2vpn-pbb-evpn-07

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/1id-abstracts.html

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

Copyright and License Notice

   Copyright (c) 2013 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Abstract

   This document discusses how Ethernet Provider Backbone Bridging
   [802.1ah] can be combined with EVPN in order to reduce the number of
   BGP MAC advertisement routes by aggregating Customer/Client MAC
   (C-MAC) addresses via Provider Backbone MAC address (B-MAC), provide
   client MAC address mobility using C-MAC aggregation and B-MAC
   sub-netting, confine the scope of C-MAC learning to only active
   flows, offer per site policies and avoid C-MAC address flushing on
   topology changes. The combined solution is referred to as PBB-EVPN.

Conventions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119.

Table of Contents

   1. Introduction
   2. Contributors
   3. Terminology
   4. Requirements
     4.1. MAC Advertisement Route Scalability
     4.2. C-MAC Mobility with MAC Summarization
     4.3. C-MAC Address Learning and Confinement
     4.4. Per Site Policy Support
     4.5. Avoiding C-MAC Address Flushing
   5. Solution Overview
   6. BGP Encoding
     6.1. Ethernet Auto-Discovery Route
     6.2. BGP MAC Advertisement Route
     6.3. Inclusive Multicast Ethernet Tag Route
     6.4. Ethernet Segment Route
     6.5. ESI Label Extended Community
     6.6. ES-Import Route Target
     6.7. MAC Mobility Extended Community
     6.8. Default Gateway Extended Community
   7. Operation
     7.1. MAC Address Distribution over Core
     7.2. Device Multi-homing
       7.2.1 Flow-based Load-balancing
         7.2.1.1 PE B-MAC Address Assignment
         7.2.1.2. Automating B-MAC Address Assignment
         7.2.1.3 Split Horizon and Designated Forwarder Election
       7.2.2 I-SID Based Load-balancing
         7.2.2.1 PE B-MAC Address Assignment
         7.2.2.2 Split Horizon and Designated Forwarder Election
         7.2.2.3 Handling Failure Scenarios
     7.3. Network Multi-homing
     7.4. Frame Forwarding
       7.4.1. Unicast
       7.4.2. Multicast/Broadcast
   8. Minimizing ARP Broadcast
   9. Seamless Interworking with IEEE 802.1aq/802.1Qbp
     9.1 B-MAC Address Assignment
     9.2 IEEE 802.1aq / 802.1Qbp B-MAC Advertisement Route
     9.3 Operation:
   10. Solution Advantages
     10.1. MAC Advertisement Route Scalability
     10.2. C-MAC Mobility with MAC Sub-netting
     10.3. C-MAC Address Learning and Confinement
     10.4. Seamless Interworking with TRILL and 802.1aq Access Networks
     10.5. Per Site Policy Support
     10.6. Avoiding C-MAC Address Flushing
   11. Acknowledgements
   12. Security Considerations
   13. IANA Considerations
   14. Intellectual Property Considerations
   15. Normative References
   16. Informative References
   17. Authors' Addresses

1. Introduction

   [EVPN] introduces a solution for multipoint L2VPN services, with
   advanced multi-homing capabilities, using BGP for distributing
   customer/client MAC address reachability information over the core
   MPLS/IP network. [PBB] defines an architecture for Ethernet Provider
   Backbone Bridging (PBB), where MAC tunneling is employed to improve
   service instance and MAC address scalability in Ethernet as well as
   VPLS networks [PBB-VPLS].

   In this document, we discuss how PBB can be combined with EVPN in
   order to: reduce the number of BGP MAC advertisement routes by
   aggregating Customer/Client MAC (C-MAC) addresses via Provider
   Backbone MAC address (B-MAC), provide client MAC address mobility
   using C-MAC aggregation and B-MAC sub-netting, confine the scope of
   C-MAC learning to only active flows, offer per site policies and
   avoid C-MAC address flushing on topology changes. The combined
   solution is referred to as PBB-EVPN.

2. Contributors

   In addition to the authors listed above, the following individuals
   also contributed to this document.

   Sami Boutros, Cisco
   Dennis Cai, Cisco
   Keyur Patel, Cisco
   Clarence Filsfils, Cisco
   Sam Aldrin, Huawei
   Himanshu Shah, Ciena
   Florin Balus, ALU

3. Terminology

   BEB: Backbone Edge Bridge
   B-MAC: Backbone MAC Address
   CE: Customer Edge
   C-MAC: Customer/Client MAC Address
   DHD: Dual-homed Device
   DHN: Dual-homed Network
   LACP: Link Aggregation Control Protocol
   LSM: Label Switched Multicast
   MDT: Multicast Delivery Tree
   MP2MP: Multipoint to Multipoint
   P2MP: Point to Multipoint
   P2P: Point to Point
   PE: Provider Edge
   PoA: Point of Attachment
   PW: Pseudowire
   EVPN: Ethernet VPN

   Single-Active Redundancy Mode: When only a single PE, among a group
   of PEs attached to an Ethernet segment, is allowed to forward traffic
   to/from that Ethernet Segment, then the Ethernet segment is defined
   to be operating in Single-Active redundancy mode.

   All-Active Redundancy Mode: When all PEs attached to an Ethernet
   segment are allowed to forward traffic to/from that Ethernet Segment,
   then the Ethernet segment is defined to be operating in All-Active
   redundancy mode.

4. Requirements

   The requirements for PBB-EVPN include all the requirements for EVPN
   that were described in [EVPN-REQ], in addition to the following:

4.1. MAC Advertisement Route Scalability

   In typical operation, an [EVPN] PE sends a BGP MAC Advertisement
   Route per customer/client MAC (C-MAC) address. In certain
   applications, this poses scalability challenges, as is the case in
   data center interconnect (DCI) scenarios where the number of virtual
   machines (VMs), and hence the number of C-MAC addresses, can be in
   the millions. In such scenarios, it is required to reduce the number
   of BGP MAC Advertisement routes by relying on a 'MAC summarization'
   scheme, as is provided by PBB.

4.2. C-MAC Mobility with MAC Summarization

   Certain applications, such as virtual machine mobility, require
   support for fast C-MAC address mobility.
   For these applications, the exact virtual machine MAC address needs
   to be transmitted in the BGP MAC Advertisement route. Otherwise,
   traffic would be forwarded to the wrong segment when a virtual
   machine moves from one Ethernet segment to another. This means MAC
   address prefixes cannot be used in data center applications.

   In order to support C-MAC address mobility, while retaining the
   scalability benefits of MAC summarization, PBB technology is used. It
   defines a Backbone MAC (B-MAC) address space that is independent of
   the C-MAC address space, aggregates C-MAC addresses via a B-MAC
   address, and then applies summarization to B-MAC addresses.

4.3. C-MAC Address Learning and Confinement

   In EVPN, all the PE nodes participating in the same EVPN instance are
   exposed to all the C-MAC addresses learnt by any one of these PE
   nodes, because a C-MAC learned by one of the PE nodes is advertised
   in BGP to other PE nodes in that EVPN instance. This is the case even
   if some of the PE nodes for that EVPN instance are not involved in
   forwarding traffic to, or from, these C-MAC addresses. Even if an
   implementation does not install hardware forwarding entries for C-MAC
   addresses that are not part of active traffic flows on that PE, the
   device memory is still consumed by keeping record of the C-MAC
   addresses in the routing table (RIB). In network applications with
   millions of C-MAC addresses, this introduces a non-trivial waste of
   PE resources. As such, it is required to confine the scope of
   visibility of C-MAC addresses only to those PE nodes that are
   actively involved in forwarding traffic to, or from, these addresses.

4.4. Per Site Policy Support

   In many applications, it is required to be able to enforce
   connectivity policy rules at the granularity of a site (or segment).
   This includes the ability to control which PE nodes in the network
   can forward traffic to, or from, a given site. PBB-EVPN is capable of
   providing this granularity of policy control. In the case where per
   C-MAC address granularity is required, the EVI can always continue to
   operate in EVPN mode.

4.5. Avoiding C-MAC Address Flushing

   It is required to avoid C-MAC address flushing upon link, port or
   node failure for All-Active multi-homed devices and networks. This is
   in order to speed up re-convergence upon failure.

5. Solution Overview

   The solution involves incorporating IEEE Backbone Edge Bridge (BEB)
   functionality on the EVPN PE nodes similar to PBB-VPLS, where BEB
   functionality is incorporated in the VPLS PE nodes. The PE devices
   would then receive 802.1Q Ethernet frames from their attachment
   circuits, encapsulate them in the PBB header and forward the frames
   over the IP/MPLS core. On the egress EVPN PE, the PBB header is
   removed following the MPLS disposition, and the original 802.1Q
   Ethernet frame is delivered to the customer equipment.

                  BEB   +--------------+  BEB
                  ||    |              |  ||
                  \/    |              |  \/
      +----+ AC1 +----+ |              | +----+   +----+
      | CE1|-----|    | |              | |    |---| CE2|
      +----+\    | PE1| |   IP/MPLS    | | PE3|   +----+
             \   +----+ |   Network    | +----+
              \         |              |
            AC2\ +----+ |              |
                \|    | |              |
                 | PE2| |              |
                 +----+ |              |
                   /\   +--------------+
                   ||
                   BEB
      <-802.1Q-> <------PBB over MPLS------> <-802.1Q->

                     Figure 1: PBB-EVPN Network

   The PE nodes perform the following functions:

   - Learn customer/client MAC addresses (C-MACs) over the attachment
     circuits in the data-plane, per normal bridge operation.

   - Learn remote C-MAC to B-MAC bindings in the data-plane for traffic
     received from the core per [PBB] bridging operation.

   - Advertise local B-MAC address reachability information in BGP to
     all other PE nodes in the same set of service instances.
     Note that every PE has a set of local B-MAC addresses that uniquely
     identify the device. More on the PE addressing in Section 7.2.1.1.

   - Build a forwarding table from remote BGP advertisements received
     associating remote B-MAC addresses with remote PE IP addresses and
     the associated MPLS label(s).

6. BGP Encoding

   PBB-EVPN leverages the same BGP Routes and Attributes defined in
   [EVPN], adapted as follows:

6.1. Ethernet Auto-Discovery Route

   This route and all of its associated modes are not needed in
   PBB-EVPN.

   The receiving PE knows that it need not wait for the receipt of the
   Ethernet A-D route for route resolution by means of the reserved ESI
   encoded in the MAC Advertisement route: the ESI values of 0 and
   MAX-ESI indicate that the receiving PE can resolve the path without
   an Ethernet A-D route.

6.2. BGP MAC Advertisement Route

   The EVPN MAC Advertisement Route is used to distribute B-MAC
   addresses of the PE nodes instead of the C-MAC addresses of
   end-stations/hosts. This is because the C-MAC addresses are learnt in
   the data-plane for traffic arriving from the core. The MAC
   Advertisement Route is encoded as follows:

   - The MAC address field contains the B-MAC address.
   - The Ethernet Tag field is set to 0.
   - The Ethernet Segment Identifier field must be set either to 0 (for
     single-homed Segments or multi-homed Segments with per-ISID
     load-balancing) or to MAX-ESI (for multi-homed Segments with
     per-flow load-balancing). All other values are not permitted.
   - All other fields are set as defined in [EVPN].

   This route is tagged with the RT corresponding to its EVI. This EVI
   is analogous to a B-VID.

6.3. Inclusive Multicast Ethernet Tag Route

   This route is used for multicast pruning per I-SID.
   It is used for auto-discovery of PEs participating in a given I-SID
   so that a multicast tunnel (MP2P, P2P, P2MP, or MP2MP LSP) can be set
   up for that I-SID. [PBB-VPLS] uses multicast pruning per I-SID based
   on [MMRP], which is a soft-state protocol. The advantages of
   multicast pruning using this BGP route over [MMRP] are that a) it
   scales very well for a large number of PEs and b) it works with any
   type of LSP (MP2P, P2P, P2MP, or MP2MP), whereas [MMRP] only works
   over P2P PWs. The Inclusive Multicast Ethernet Tag Route is encoded
   as follows:

   - The Ethernet Tag field is set with the appropriate I-SID value.
   - All other fields are set as defined in [EVPN].

   This route is tagged with an RT. This RT SHOULD be set to a value
   corresponding to its EVI (which is analogous to a B-VID). The RT for
   this route MAY also be auto-derived from the corresponding Ethernet
   Tag (I-SID) based on the procedure specified in section 9.4.1.1.1 of
   [EVPN].

6.4. Ethernet Segment Route

   This route is used as defined in [EVPN].

6.5. ESI Label Extended Community

   This extended community is not used in PBB-EVPN. In [EVPN], this
   extended community is used along with the Ethernet AD route to
   advertise an MPLS label for the purpose of split-horizon filtering.
   Since in PBB-EVPN, the split-horizon filtering is performed natively
   using B-MAC SA, there is no need for this extended community.

6.6. ES-Import Route Target

   This RT is used as defined in [EVPN].

6.7. MAC Mobility Extended Community

   This extended community is defined in [EVPN] and it is used with a
   MAC route (B-MAC route in case of PBB-EVPN). The B-MAC route is
   tagged with the RT corresponding to its EVI (which is analogous to a
   B-VID).
   When this extended community is used along with a B-MAC route in
   PBB-EVPN, it indicates that all C-MAC addresses associated with that
   B-MAC address across all corresponding I-SIDs must be flushed.

6.8. Default Gateway Extended Community

   This extended community is not used in PBB-EVPN.

7. Operation

   This section discusses the operation of PBB-EVPN, specifically in
   areas where it differs from [EVPN].

7.1. MAC Address Distribution over Core

   In PBB-EVPN, host MAC addresses (i.e. C-MAC addresses) need not be
   distributed in BGP. Rather, every PE independently learns the C-MAC
   addresses in the data-plane via normal bridging operation. Every PE
   has a set of one or more unicast B-MAC addresses associated with it,
   and those are the addresses distributed over the core in MAC
   Advertisement routes.

7.2. Device Multi-homing

7.2.1 Flow-based Load-balancing

   This section describes the procedures for supporting device
   multi-homing in an All-Active redundancy mode (i.e., flow-based
   load-balancing).

7.2.1.1 PE B-MAC Address Assignment

   In [PBB] every BEB is uniquely identified by one or more B-MAC
   addresses. These addresses are usually locally administered by the
   Service Provider. For PBB-EVPN, the choice of B-MAC address(es) for
   the PE nodes must be examined carefully as it has implications on the
   proper operation of multi-homing. In particular, for the scenario
   where a CE is multi-homed to a number of PE nodes with All-Active
   redundancy mode, a given C-MAC address would be reachable via
   multiple PE nodes concurrently. Given that any given remote PE will
   bind the C-MAC address to a single B-MAC address, then the various PE
   nodes connected to the same CE must share the same B-MAC address.
   Otherwise, the MAC address table of the remote PE nodes will keep
   oscillating between the B-MAC addresses of the various PE devices.
   For example, consider the network of Figure 1, and assume that PE1
   has B-MAC BM1 and PE2 has B-MAC BM2. Also, assume that both links
   from CE1 to the PE nodes are part of the same Ethernet link
   aggregation group. If BM1 is not equal to BM2, the consequence is
   that the MAC address table on PE3 will keep oscillating such that the
   C-MAC address M1 of CE1 would flip-flop between BM1 and BM2,
   depending on the load-balancing decision on CE1 for traffic destined
   to the core.

   Considering that there could be multiple sites (e.g. CEs) that are
   multi-homed to the same set of PE nodes, then it is required for all
   the PE devices in a Redundancy Group to have a unique B-MAC address
   per site. This way, it is possible to achieve fast convergence in the
   case where a link or port failure impacts the attachment circuit
   connecting a single site to a given PE.

                          +---------+
           +-------+  PE1 | IP/MPLS |
          /               |         |
       CE1                | Network |     PEr
       M1 \               |         |
           +-------+  PE2 |         |
           /-------+      |         |
          /               |         |
       CE2                |         |
       M2 \               |         |
           \              |         |
            +------+  PE3 +---------+

                Figure 2: B-MAC Address Assignment

   In the example network shown in Figure 2 above, two sites
   corresponding to CE1 and CE2 are dual-homed to PE1/PE2 and PE2/PE3,
   respectively. Assume that BM1 is the B-MAC used for the site
   corresponding to CE1. Similarly, BM2 is the B-MAC used for the site
   corresponding to CE2. On PE1, a single B-MAC address (BM1) is
   required for the site corresponding to CE1. On PE2, two B-MAC
   addresses (BM1 and BM2) are required, one per site. Whereas on PE3, a
   single B-MAC address (BM2) is required for the site corresponding to
   CE2. All three PE nodes would advertise their respective B-MAC
   addresses in BGP using the MAC Advertisement routes defined in
   [EVPN]. The remote PE, PEr, would learn via BGP that BM1 is reachable
   via PE1 and PE2, whereas BM2 is reachable via both PE2 and PE3.
   Furthermore, PEr establishes, via the PBB bridge learning procedure,
   that C-MAC M1 is reachable via BM1, and C-MAC M2 is reachable via
   BM2. As a result, PEr can load-balance traffic destined to M1 between
   PE1 and PE2, as well as traffic destined to M2 between both PE2 and
   PE3. In the case of a failure that causes, for example, CE1 to be
   isolated from PE1, the latter can withdraw the route it has
   advertised for BM1. This way, PEr would update its path list for BM1,
   and will send all traffic destined to M1 over to PE2 only.

   For single-homed sites, it is possible to assign a unique B-MAC
   address per site, or have all the single-homed sites connected to a
   given PE share a single B-MAC address. The advantage of the first
   model over the second model is the ability to avoid C-MAC destination
   address lookup on the disposition PE (even though source C-MAC
   learning is still required in the data-plane). Also, by assigning the
   B-MAC addresses from a contiguous range, it is possible to advertise
   a single B-MAC subnet for all single-homed sites, thereby rendering
   the number of MAC advertisement routes required on par with the
   second model.

   In summary, every PE may use a unicast B-MAC address shared by all
   single-homed CEs or a unicast B-MAC address per single-homed CE and,
   in addition, a unicast B-MAC address per All-Active multi-homed CE.
   In the latter case, the B-MAC address MUST be the same for all PE
   nodes in a Redundancy Group connected to the same CE.

7.2.1.2. Automating B-MAC Address Assignment

   The PE B-MAC address used for single-homed sites can be automatically
   derived from the hardware (using, e.g., the backplane's address).
   However, the B-MAC address used for multi-homed sites must be
   coordinated among the RG members.
   To automate the assignment of this latter address, the PE can derive
   this B-MAC address from the MAC Address portion of the CE's LACP
   System Identifier by flipping the 'Locally Administered' bit of the
   CE's address. This guarantees the uniqueness of the B-MAC address
   within the network, and ensures that all PE nodes connected to the
   same multi-homed CE use the same value for the B-MAC address.

   Note that with this automatic provisioning of the B-MAC address
   associated with multi-homed CEs, it is not possible to support the
   uncommon scenario where a CE has multiple bundles towards the PE
   nodes, and the service involves hair-pinning traffic from one bundle
   to another. This is because the split-horizon filtering relies on
   B-MAC addresses rather than Site-ID Labels (as will be described in
   the next section). The operator must explicitly configure the B-MAC
   address for this fairly uncommon service scenario.

   Whenever a B-MAC address is provisioned on the PE, either manually or
   automatically (as an outcome of CE auto-discovery), the PE MUST
   transmit a MAC Advertisement Route for the B-MAC address with a
   downstream assigned MPLS label that uniquely identifies that address
   on the advertising PE. The route is tagged with the RTs of the
   associated EVIs as described above.

7.2.1.3 Split Horizon and Designated Forwarder Election

   [EVPN] relies on access split horizon, where the Ethernet Segment
   Label is used for egress filtering on the attachment circuit in order
   to prevent forwarding loops. In PBB-EVPN, the B-MAC source address
   can be used for the same purpose, as it uniquely identifies the
   originating site of a given frame. As such, Ethernet Segment (ES)
   Labels are not used in PBB-EVPN, and the egress split-horizon
   filtering is done based on the B-MAC source address.
   It is worth noting here that [PBB] defines this B-MAC address based
   filtering function as part of the I-Component options, hence no new
   functions are required to support split-horizon beyond what is
   already defined in [PBB]. Given that the ES label is not used in
   PBB-EVPN, the PE sets the Label field in the Ethernet Segment Route
   to 0.

   The Designated Forwarder election procedures are defined in [EVPN].

7.2.2 I-SID Based Load-balancing

   This section describes the procedures for supporting device
   multi-homing in a Single-Active redundancy mode with per-ISID
   load-balancing.

7.2.2.1 PE B-MAC Address Assignment

   In the case where per-ISID load-balancing is desired among the PE
   nodes in a given redundancy group, multiple unicast B-MAC addresses
   are allocated per multi-homed Ethernet Segment: Each PE connected to
   the multi-homed segment is assigned a unique B-MAC. Every PE then
   advertises its B-MAC address using the BGP MAC advertisement route.
   In this mode of operation, two B-MAC address assignment models are
   possible:

   - The PE may use a shared B-MAC address for multiple Ethernet
     Segments. This includes the single-homed segments as well as the
     multi-homed segments operating with per-ISID load-balancing mode.

   - The PE may use a dedicated B-MAC address for each Ethernet Segment
     operating with per-ISID load-balancing mode.

   All PE implementations MUST support the shared B-MAC address model
   and MAY support the dedicated B-MAC address model.

   A remote PE initially floods traffic to a destination C-MAC address,
   located in a given multi-homed Ethernet Segment, to all the PE nodes
   configured with that I-SID. Then, when reply traffic arrives at the
   remote PE, it learns (in the data-path) the B-MAC address and
   associated next-hop PE to use for said C-MAC address.
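
   The flood-then-learn behaviour of the remote PE described above can
   be sketched as follows. This is a minimal illustration under stated
   assumptions, not part of the specification: the class RemotePE, its
   table names and its method names are hypothetical, the B-MAC to
   next-hop table is assumed to have been populated from BGP MAC
   Advertisement routes, and the per-I-SID membership is assumed to
   come from Inclusive Multicast Ethernet Tag routes. Encapsulation,
   DF filtering and ageing are omitted.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class RemotePE:
    """Hypothetical model of the tables a remote PE keeps for one EVI."""

    # I-SID -> PEs known to participate in that I-SID (assumed to be
    # discovered via Inclusive Multicast Ethernet Tag routes).
    isid_members: Dict[int, Set[str]] = field(default_factory=dict)
    # B-MAC -> next-hop PE (assumed to come from BGP MAC Advertisement
    # routes; one unique B-MAC per PE in per-ISID load-balancing mode).
    bmac_next_hop: Dict[str, str] = field(default_factory=dict)
    # (I-SID, C-MAC) -> B-MAC, learned only in the data plane.
    cmac_to_bmac: Dict[Tuple[int, str], str] = field(default_factory=dict)

    def forward(self, isid: int, dst_cmac: str) -> List[str]:
        """Return the PE(s) a frame for dst_cmac is sent to."""
        bmac = self.cmac_to_bmac.get((isid, dst_cmac))
        if bmac is None or bmac not in self.bmac_next_hop:
            # Unknown destination: flood to every PE in the I-SID.
            return sorted(self.isid_members.get(isid, set()))
        return [self.bmac_next_hop[bmac]]

    def learn_from_core(self, isid: int, src_cmac: str, src_bmac: str) -> None:
        """Bind the source C-MAC of a received frame to its source B-MAC."""
        self.cmac_to_bmac[(isid, src_cmac)] = src_bmac


# Example: I-SID 100 is served by PE1 (B-MAC BM1) and PE2 (B-MAC BM2).
pe = RemotePE()
pe.isid_members[100] = {"PE1", "PE2"}
pe.bmac_next_hop = {"BM1": "PE1", "BM2": "PE2"}

first = pe.forward(100, "M1")          # no binding yet, so flood
pe.learn_from_core(100, "M1", "BM1")   # reply traffic arrives via PE1
second = pe.forward(100, "M1")         # now unicast to PE1 only
```

   In this sketch the first lookup returns both PE1 and PE2, and the
   lookup after learning returns PE1 alone, mirroring the transition
   from flooding to known-unicast forwarding.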

7.2.2.2 Split Horizon and Designated Forwarder Election

   The procedures are similar to the flow-based load-balancing case,
   with the only difference being that the DF filtering must be applied
   to unicast as well as multicast traffic, and in both core-to-segment
   as well as segment-to-core directions.

7.2.2.3 Handling Failure Scenarios

   When a PE connected to a multi-homed Ethernet Segment loses
   connectivity to the segment, due to link or port failure, it needs to
   notify the remote PEs to trigger C-MAC address flushing. This can be
   achieved in one of two ways, depending on the B-MAC assignment model:

   - If the PE uses a shared B-MAC address for multiple Ethernet
     Segments, then the C-MAC flushing is signaled by means of having
     the failed PE re-advertise the MAC Advertisement route for the
     associated B-MAC, tagged with the MAC Mobility Extended Community
     attribute. The value of the Counter field in that attribute must be
     incremented prior to advertisement. This causes the remote PE nodes
     to flush all C-MAC addresses associated with the B-MAC in question.
     This is done across all I-SIDs that are mapped to the EVI of the
     withdrawn MAC route.

   - If the PE uses a dedicated B-MAC address for each Ethernet Segment
     operating under per-ISID load-balancing mode, then the failed PE
     simply withdraws the B-MAC route previously advertised for that
     segment. This causes the remote PE nodes to flush all C-MAC
     addresses associated with the B-MAC in question. This is done
     across all I-SIDs that are mapped to the EVI of the withdrawn MAC
     route.

   When a PE connected to a multi-homed Ethernet Segment fails (i.e.
   node failure) or when the PE becomes completely isolated from the
   EVPN network, the remote PEs will start purging the MAC Advertisement
   routes that were advertised by the failed PE.
   This is done either as an outcome of the remote PEs detecting that
   the BGP session to the failed PE has gone down, or by having a Route
   Reflector withdrawing all the routes that were advertised by the
   failed PE. The remote PEs, in this case, will perform C-MAC address
   flushing as an outcome of the MAC Advertisement route withdrawals.

   For all failure scenarios (link/port failure, node failure and PE
   node isolation), when the fault condition clears, the recovered PE
   re-advertises the associated Ethernet Segment route to other members
   of its Redundancy Group. This triggers the backup PE(s) in the
   Redundancy Group to block the I-SIDs for which the recovered PE is a
   DF. When a backup PE blocks the I-SIDs, it triggers a C-MAC address
   flush notification to the remote PEs by re-advertising the MAC
   Advertisement route for the associated B-MAC, with the MAC Mobility
   Extended Community attribute. The value of the Counter field in that
   attribute must be incremented prior to advertisement. This causes the
   remote PE nodes to flush all C-MAC addresses associated with the
   B-MAC in question. This is done across all I-SIDs that are mapped to
   the EVI of the withdrawn MAC route.

7.3. Network Multi-homing

   When an Ethernet network is multi-homed to a set of PE nodes running
   PBB-EVPN, a single-active redundancy model can be supported with per
   service instance (i.e. I-SID) load-balancing. In this model, DF
   election is performed to ensure that a single PE node in the
   redundancy group is responsible for forwarding traffic associated
   with a given I-SID. This guarantees that no forwarding loops are
   created.
   Filtering based on DF state applies to both unicast and multicast
   traffic, and in both access-to-core as well as core-to-access
   directions (unlike the multi-homed device scenario where DF
   filtering is limited to multi-destination frames in the
   core-to-access direction). Similar to the multi-homed device
   scenario, with I-SID based load-balancing, a unique B-MAC address is
   assigned to each of the PE nodes connected to the multi-homed
   network (Segment).

7.4. Frame Forwarding

   The frame forwarding functions are divided between the Bridge
   Module, which hosts the [PBB] Backbone Edge Bridge (BEB)
   functionality, and the MPLS Forwarder, which handles the MPLS
   imposition/disposition. The details of frame forwarding for unicast
   and multi-destination frames are discussed next.

7.4.1. Unicast

   Known unicast traffic received from the AC will be PBB-encapsulated
   by the PE using the B-MAC source address corresponding to the
   originating site. The unicast B-MAC destination address is
   determined based on a lookup of the C-MAC destination address (the
   binding of the two is done via transparent learning of reverse
   traffic). The resulting frame is then encapsulated with an LSP
   tunnel label and the MPLS label which uniquely identifies the B-MAC
   destination address on the egress PE. If per flow load-balancing
   over ECMPs in the MPLS core is required, then a flow label is added
   as the end of stack label.

   For unknown unicast traffic, the PE forwards these frames over the
   MPLS core. When these frames are to be forwarded, the same set of
   options used for forwarding multicast/broadcast frames (as described
   in the next section) is used.

7.4.2. Multicast/Broadcast

   Multi-destination frames received from the AC will be
   PBB-encapsulated by the PE using the B-MAC source address
   corresponding to the originating site.
   The multicast B-MAC destination address is selected based on the
   value of the I-SID as defined in [PBB]. The resulting frame is then
   forwarded over the MPLS core using one of the following two options:

   Option 1: the MPLS Forwarder can perform ingress replication over a
   set of MP2P tunnel LSPs. The frame is encapsulated with a tunnel LSP
   label and the EVPN ingress replication label advertised in the
   Inclusive Multicast Route.

   Option 2: the MPLS Forwarder can use a P2MP tunnel LSP per the
   procedures defined in [EVPN]. This includes the use of either
   Inclusive or Aggregate Inclusive trees.

   Note that the same procedures for advertising and handling the
   Inclusive Multicast Route defined in [EVPN] apply here.

8. Minimizing ARP Broadcast

   The PE nodes implement an ARP-proxy function in order to minimize
   the volume of ARP traffic that is broadcast over the MPLS network.
   This is achieved by having each PE node snoop on ARP request and
   response messages received over the access interfaces or the MPLS
   core. The PE builds a cache of IP / MAC address bindings from these
   snooped messages. The PE then uses this cache to respond to ARP
   requests that ingress on access ports and target hosts in remote
   sites. If the PE finds a match for the IP address in its ARP cache,
   it responds to the requesting host and drops the request. Otherwise,
   the request is flooded over the MPLS network using either ingress
   replication or LSM.

9. Seamless Interworking with IEEE 802.1aq/802.1Qbp

                             +--------------+
                             |              |
             +---------+     |     MPLS     |    +---------+
     +----+  |         |   +----+        +----+  |         |  +----+
     |SW1 |--|         |   | PE1|        | PE2|  |         |--| SW3|
     +----+  | 802.1aq |---|    |        |    |--| 802.1aq |  +----+
     +----+  |  .1Qbp  |   +----+        +----+  |  .1Qbp  |  +----+
     |SW2 |--|         |     |   Backbone   |    |         |--| SW4|
     +----+  +---------+     +--------------+    +---------+  +----+

     |<------ IS-IS -------->|<-----BGP----->|<------ IS-IS ------>|  CP

     |<------------------------- PBB -------------------------->|  DP
                             |<----MPLS----->|

     Legend: CP = Control Plane View
             DP = Data Plane View

   Figure 7: Interconnecting 802.1aq/802.1Qbp Networks with PBB-EVPN

9.1 B-MAC Address Assignment

   For the same reasons cited in the TRILL section, the B-MAC addresses
   need to be globally unique across all the IEEE 802.1aq / 802.1Qbp
   networks. The same hierarchical address assignment scheme depicted
   above is proposed for B-MAC addresses as well.

9.2 IEEE 802.1aq / 802.1Qbp B-MAC Advertisement Route

   B-MAC addresses associated with 802.1aq / 802.1Qbp switches are
   advertised using the BGP MAC Advertisement route already defined in
   [EVPN].

   The encapsulation for the transport of PBB frames over MPLS is
   similar to that of classical Ethernet, albeit with the additional
   PBB header, as shown in the figure below:

                     +------------------+
                     |  IP/MPLS Header  |
                     +------------------+
                     |    PBB Header    |
                     +------------------+
                     | Ethernet Header  |
                     +------------------+
                     | Ethernet Payload |
                     +------------------+
                     |   Ethernet FCS   |
                     +------------------+

                Figure 8: PBB over MPLS Encapsulation

9.3 Operation:

   When a PE receives a PBB-encapsulated Ethernet frame from the access
   side, it performs a lookup on the B-MAC destination address to
   identify the next hop. If the lookup yields that the next hop is a
   remote PE, the local PE would then encapsulate the PBB frame in
   MPLS.
   The label stack comprises the VPN label (advertised by the remote
   PE), followed by an LSP/IGP label. From that point onwards, regular
   MPLS forwarding is applied.

   On the disposition PE, assuming penultimate-hop-popping is employed,
   the PE receives the MPLS-encapsulated PBB frame with a single label:
   the VPN label. The value of the label indicates to the disposition
   PE that this is a PBB frame, so the label is popped, the TTL field
   (in the 802.1Qbp F-Tag) is reinitialized and normal PBB processing
   is employed from this point onwards.

10. Solution Advantages

   In this section, we discuss the advantages of the PBB-EVPN solution
   in the context of the requirements set forth in section 3 above.

10.1. MAC Advertisement Route Scalability

   In PBB-EVPN the number of MAC Advertisement Routes is a function of
   the number of segments (sites), rather than the number of
   hosts/servers. This is because the B-MAC addresses of the PEs,
   rather than the C-MAC addresses (of hosts/servers), are being
   advertised in BGP. And, as discussed above, there is a one-to-one
   mapping between multi-homed segments and B-MAC addresses, whereas
   there is a one-to-one or many-to-one mapping between single-homed
   segments and B-MAC addresses for a given PE. As a result, the volume
   of MAC Advertisement Routes in PBB-EVPN is multiple orders of
   magnitude less than in EVPN.

10.2. C-MAC Mobility with MAC Sub-netting

   In PBB-EVPN, if a PE allocates its B-MAC addresses from a contiguous
   range, then it can advertise a MAC prefix rather than individual
   48-bit addresses. It should be noted that B-MAC addresses can easily
   be assigned from a contiguous range because PE nodes are within the
   provider administrative domain; however, CE devices and hosts are
   typically not within the provider administrative domain.
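   To make the idea of a MAC prefix concrete, the following sketch
   computes the longest prefix that covers a set of B-MAC addresses
   taken from a contiguous range. It is an illustration only: the
   function names and the example addresses are hypothetical, and how
   such a prefix would be carried in a MAC Advertisement route is
   outside the scope of the sketch.

```python
def mac_to_int(mac):
    """Convert 'aa:bb:cc:dd:ee:ff' to a 48-bit integer."""
    return int(mac.replace(":", ""), 16)


def covering_prefix(macs):
    """Return (prefix, prefix_len) of the longest prefix covering all MACs.

    The bits shared by the smallest and largest address are shared by
    every address in between, so only those two need to be compared.
    """
    values = [mac_to_int(m) for m in macs]
    lowest, highest = min(values), max(values)
    host_bits = (lowest ^ highest).bit_length()  # bits that differ
    base = (lowest >> host_bits) << host_bits    # clear the host bits
    text = f"{base:012x}"
    prefix = ":".join(text[i:i + 2] for i in range(0, 12, 2))
    return prefix, 48 - host_bits


# Hypothetical B-MACs allocated by one PE from a contiguous range.
b_macs = ["00:11:22:33:44:00", "00:11:22:33:44:01",
          "00:11:22:33:44:02", "00:11:22:33:44:03"]
print(covering_prefix(b_macs))
```

   For the four addresses above, a single 46-bit prefix covers the
   whole range, so one advertisement could stand in for four.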
   The advantage of such MAC address sub-netting can be maintained even
   as C-MAC addresses move from one Ethernet segment to another. This
   is because the C-MAC address to B-MAC address association is learnt
   in the data-plane and C-MAC addresses are not advertised in BGP. To
   illustrate how this compares to EVPN, consider the following
   example:

   If a PE running EVPN advertises reachability for a MAC subnet that
   spans N addresses via a particular segment, and then 50% of the MAC
   addresses in that subnet move to other segments (e.g. due to virtual
   machine mobility), then in the worst case, N/2 additional MAC
   Advertisement routes need to be sent for the MAC addresses that have
   moved. This defeats the purpose of the sub-netting. With PBB-EVPN,
   on the other hand, the sub-netting applies to the B-MAC addresses,
   which are statically associated with PE nodes and are not subject to
   mobility. As C-MAC addresses move from one segment to another, the
   binding of C-MAC to B-MAC addresses is updated via data-plane
   learning.

10.3. C-MAC Address Learning and Confinement

   In PBB-EVPN, C-MAC address reachability information is built via
   data-plane learning. As such, PE nodes not participating in active
   conversations involving a particular C-MAC address will purge that
   address from their forwarding tables. Furthermore, since C-MAC
   addresses are not distributed in BGP, PE nodes will not maintain any
   record of them in the control-plane routing table.

10.4. Seamless Interworking with TRILL and 802.1aq Access Networks

   Consider the scenario where two access networks, one running MPLS
   and the other running 802.1aq, are interconnected via an MPLS
   backbone network. The figure below shows such an example network.

                             +--------------+
                             |              |
             +---------+     |     MPLS     |    +---------+
     +----+  |         |   +----+        +----+  |         |  +----+
     | CE |--|         |   | PE1|        | PE2|  |         |--| CE |
     +----+  | 802.1aq |---|    |        |    |--|  MPLS   |  +----+
     +----+  |         |   +----+        +----+  |         |  +----+
     | CE |--|         |     |   Backbone   |    |         |--| CE |
     +----+  +---------+     +--------------+    +---------+  +----+

                Figure 9: Interoperability with 802.1aq

   If the MPLS backbone network employs EVPN, then the 802.1aq
   data-plane encapsulation must be terminated on PE1 or the edge
   device connecting to PE1. Either way, all the PE nodes that are part
   of the associated service instances will be exposed to all the C-MAC
   addresses of all hosts/servers connected to the access networks.
   However, if the MPLS backbone network employs PBB-EVPN, then the
   802.1aq encapsulation can be extended over the MPLS backbone,
   thereby maintaining C-MAC address transparency on PE1. If PBB-EVPN
   is also extended over the MPLS access network on the right, then
   C-MAC addresses would be transparent to PE2 as well.

   Interoperability with TRILL access networks will be described in a
   future revision of this draft.

10.5. Per Site Policy Support

   In PBB-EVPN, a unique B-MAC address can be associated with every
   site (single-homed or multi-homed). Given that the B-MAC addresses
   are sent in BGP MAC Advertisement routes, it is possible to define
   per site (i.e. B-MAC) forwarding policies, including policies for
   E-TREE service.

10.6. Avoiding C-MAC Address Flushing

   With PBB-EVPN, it is possible to avoid C-MAC address flushing upon
   a topology change affecting a multi-homed device. To illustrate
   this, consider the example network of Figure 1. Both PE1 and PE2
   advertise the same B-MAC address (BM1) to PE3. PE3 then learns the
   C-MAC addresses of the servers/hosts behind CE1 via data-plane
   learning.
   If AC1 fails, then PE3 does not need to flush any of the C-MAC
   addresses learnt and associated with BM1. This is because PE1 will
   withdraw the MAC Advertisement routes associated with BM1, thereby
   leading PE3 to have a single adjacency (to PE2) for this B-MAC
   address. Therefore, the topology change is communicated to PE3 and
   no C-MAC address flushing is required.

11. Acknowledgements

   The authors would like to thank Jose Liste and Patrice Brissette for
   their reviews of and comments on this document.

12. Security Considerations

   There are no additional security aspects beyond those of VPLS/H-VPLS
   that need to be discussed here.

13. IANA Considerations

   This document requires IANA to assign a new SAFI value for L2VPN_MAC
   SAFI.

14. Intellectual Property Considerations

   This document is being submitted for use in IETF standards
   discussions.

15. Normative References

   [PBB]      Clauses 25 and 26 of "IEEE Standard for Local and
              metropolitan area networks - Media Access Control (MAC)
              Bridges and Virtual Bridged Local Area Networks", IEEE
              Std 802.1Q, 2013.

16. Informative References

   [PBB-VPLS] Sajassi, et al., "VPLS Interoperability with Provider
              Backbone Bridges",
              draft-ietf-l2vpn-pbb-vpls-interop-05.txt, work in
              progress, October, 2013.

   [EVPN-REQ] Sajassi, et al., "Requirements for Ethernet VPN (EVPN)",
              draft-ietf-l2vpn-evpn-req-05.txt, work in progress,
              October, 2013.

   [EVPN]     Sajassi, et al., "BGP MPLS Based Ethernet VPN",
              draft-ietf-l2vpn-evpn-04.txt, work in progress, July,
              2013.

   [MMRP]     Clause 10 of "IEEE Standard for Local and metropolitan
              area networks - Media Access Control (MAC) Bridges and
              Virtual Bridged Local Area Networks", IEEE Std 802.1Q,
              2013.

17. Authors' Addresses

   Ali Sajassi
   Cisco
   170 West Tasman Drive
   San Jose, CA 95134, US
   Email: sajassi@cisco.com

   Samer Salam
   Cisco
   595 Burrard Street, Suite # 2123
   Vancouver, BC V7X 1J1, Canada
   Email: ssalam@cisco.com

   Nabil Bitar
   Verizon Communications
   Email: nabil.n.bitar@verizon.com

   Aldrin Isaac
   Bloomberg
   Email: aisaac71@bloomberg.net

   Wim Henderickx
   Alcatel-Lucent
   Email: wim.henderickx@alcatel-lucent.be

   Lizhong Jin
   ZTE Corporation
   889, Bibo Road
   Shanghai, 201203, China
   Email: lizhong.jin@zte.com.cn