Internet Working Group                                  Ali Sajassi, Ed.
Internet Draft                                               Samer Salam
Category: Standards Track                                          Cisco
                                                             Nabil Bitar
                                                                 Verizon
                                                            Aldrin Isaac
                                                               Bloomberg
                                                          Wim Henderickx
                                                          Alcatel-Lucent
                                                             Lizhong Jin
                                                                     ZTE
Expires: April 24, 2015                                 October 24, 2014

                                PBB-EVPN
                      draft-ietf-l2vpn-pbb-evpn-09

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/1id-abstracts.html

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

Copyright and License Notice

   Copyright (c) 2014 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Abstract

   This document discusses how Ethernet Provider Backbone Bridging (PBB)
   can be combined with Ethernet VPN (EVPN) in order to reduce the
   number of BGP MAC advertisement routes by aggregating Customer/Client
   MAC (C-MAC) addresses via Provider Backbone MAC address (B-MAC),
   provide client MAC address mobility using C-MAC aggregation, confine
   the scope of C-MAC learning to only active flows, offer per site
   policies and avoid C-MAC address flushing on topology changes. The
   combined solution is referred to as PBB-EVPN.
Conventions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119.

Table of Contents

   1. Introduction
   2. Contributors
   3. Terminology
   4. Requirements
     4.1. MAC Advertisement Route Scalability
     4.2. C-MAC Mobility Independent of B-MAC Advertisements
     4.3. C-MAC Address Learning and Confinement
     4.4. Per Site Policy Support
     4.5. No C-MAC Address Flushing for All-Active Multi-Homing
   5. Solution Overview
   6. BGP Encoding
     6.1. Ethernet Auto-Discovery Route
     6.2. MAC/IP Advertisement Route
     6.3. Inclusive Multicast Ethernet Tag Route
     6.4. Ethernet Segment Route
     6.5. ESI Label Extended Community
     6.6. ES-Import Route Target
     6.7. MAC Mobility Extended Community
     6.8. Default Gateway Extended Community
   7. Operation
     7.1. MAC Address Distribution over Core
     7.2. Device Multi-homing
       7.2.1. Flow-based Load-balancing
         7.2.1.1. PE B-MAC Address Assignment
         7.2.1.2. Automating B-MAC Address Assignment
         7.2.1.3. Split Horizon and Designated Forwarder Election
       7.2.2. I-SID Based Load-balancing
         7.2.2.1. PE B-MAC Address Assignment
         7.2.2.2. Split Horizon and Designated Forwarder Election
         7.2.2.3. Handling Failure Scenarios
     7.3. Network Multi-homing
     7.4. Frame Forwarding
       7.4.1. Unicast
       7.4.2. Multicast/Broadcast
     7.5. MPLS Encapsulation of PBB Frames
   8. Minimizing ARP Broadcast
   9. Seamless Interworking with IEEE 802.1aq/802.1Qbp
     9.1. B-MAC Address Assignment
     9.2. IEEE 802.1aq / 802.1Qbp B-MAC Address Advertisement
     9.4. Operation
   10. Solution Advantages
     10.1. MAC Advertisement Route Scalability
     10.2. C-MAC Mobility Independent of B-MAC Advertisements
     10.3. C-MAC Address Learning and Confinement
     10.4. Seamless Interworking with 802.1aq Access Networks
     10.5. Per Site Policy Support
     10.6. No C-MAC Address Flushing for All-Active Multi-Homing
   11. Acknowledgements
   12. Security Considerations
   13. IANA Considerations
   14. Normative References
   15. Informative References
   16. Authors' Addresses

1. Introduction

   [EVPN] introduces a solution for multipoint L2VPN services, with
   advanced multi-homing capabilities, using BGP for distributing
   customer/client MAC address reachability information over the core
   MPLS/IP network. [PBB] defines an architecture for Ethernet Provider
   Backbone Bridging (PBB), where MAC tunneling is employed to improve
   service instance and MAC address scalability in Ethernet as well as
   VPLS networks [RFC7080].

   In this document, we discuss how PBB can be combined with EVPN in
   order to: reduce the number of BGP MAC advertisement routes by
   aggregating Customer/Client MAC (C-MAC) addresses via Provider
   Backbone MAC address (B-MAC), provide client MAC address mobility
   using C-MAC aggregation, confine the scope of C-MAC learning to only
   active flows, offer per site policies and avoid C-MAC address
   flushing on topology changes. The combined solution is referred to as
   PBB-EVPN.

2. Contributors

   In addition to the authors listed above, the following individuals
   also contributed to this document.

   Sami Boutros, Cisco
   Dennis Cai, Cisco
   Keyur Patel, Cisco
   Clarence Filsfils, Cisco
   Sam Aldrin, Huawei
   Himanshu Shah, Ciena
   Florin Balus, ALU

3. Terminology

   BEB: Backbone Edge Bridge
   B-MAC: Backbone MAC Address
   CE: Customer Edge
   C-MAC: Customer/Client MAC Address
   ES: Ethernet Segment
   ESI: Ethernet Segment Identifier
   LSP: Label Switched Path
   MP2MP: Multipoint to Multipoint
   MP2P: Multipoint to Point
   P2MP: Point to Multipoint
   P2P: Point to Point
   PE: Provider Edge
   EVPN: Ethernet VPN
   EVI: EVPN Instance
   RT: Route Target

   Single-Active Redundancy Mode: When only a single PE, among a group
   of PEs attached to an Ethernet segment, is allowed to forward traffic
   to/from that Ethernet Segment, then the Ethernet segment is defined
   to be operating in Single-Active redundancy mode.

   All-Active Redundancy Mode: When all PEs attached to an Ethernet
   segment are allowed to forward traffic to/from that Ethernet Segment,
   then the Ethernet segment is defined to be operating in All-Active
   redundancy mode.

4. Requirements

   The requirements for PBB-EVPN include all the requirements for EVPN
   that were described in [RFC7209], in addition to the following:

4.1. MAC Advertisement Route Scalability

   In typical operation, an [EVPN] PE sends a BGP MAC Advertisement
   Route per customer/client MAC (C-MAC) address. In certain
   applications, this poses scalability challenges, as is the case in
   data center interconnect (DCI) scenarios where the number of virtual
   machines (VMs), and hence the number of C-MAC addresses, can be in
   the millions. In such scenarios, it is required to reduce the number
   of BGP MAC Advertisement routes by relying on a 'MAC summarization'
   scheme, as is provided by PBB.

4.2. C-MAC Mobility Independent of B-MAC Advertisements

   Certain applications, such as virtual machine mobility, require
   support for fast C-MAC address mobility. For these applications, when
   using EVPN, the virtual machine MAC address needs to be transmitted
   in a BGP MAC Advertisement route. Otherwise, traffic would be
   forwarded to the wrong segment when a virtual machine moves from one
   Ethernet segment to another. This means MAC address prefixes cannot
   be used in data center applications.

   In order to support C-MAC address mobility, while retaining the
   scalability benefits of MAC summarization, PBB technology is used. It
   defines a Backbone MAC (B-MAC) address space that is independent of
   the C-MAC address space, and aggregates C-MAC addresses via a single
   B-MAC address.

4.3. C-MAC Address Learning and Confinement

   In EVPN, all the PE nodes participating in the same EVPN instance are
   exposed to all the C-MAC addresses learnt by any one of these PE
   nodes because a C-MAC learned by one of the PE nodes is advertised in
   BGP to other PE nodes in that EVPN instance. This is the case even if
   some of the PE nodes for that EVPN instance are not involved in
   forwarding traffic to, or from, these C-MAC addresses. Even if an
   implementation does not install hardware forwarding entries for C-MAC
   addresses that are not part of active traffic flows on that PE, the
   device memory is still consumed by keeping record of the C-MAC
   addresses in the routing table (RIB). In network applications with
   millions of C-MAC addresses, this introduces a non-trivial waste of
   PE resources. As such, it is required to confine the scope of
   visibility of C-MAC addresses only to those PE nodes that are
   actively involved in forwarding traffic to, or from, these addresses.

4.4. Per Site Policy Support

   In many applications, it is required to be able to enforce
   connectivity policy rules at the granularity of a site (or segment).
   This includes the ability to control which PE nodes in the network
   can forward traffic to, or from, a given site. Both EVPN and PBB-EVPN
   are capable of providing this granularity of policy control. In the
   case where the policy needs to be at the granularity of per C-MAC
   address, then C-MAC address learning in control-plane (in BGP) per
   [EVPN] should be used.

4.5. No C-MAC Address Flushing for All-Active Multi-Homing

   Just as in [EVPN], it is required to avoid C-MAC address flushing
   upon link, port or node failure for All-Active multi-homed segments.

5. Solution Overview

   The solution involves incorporating IEEE Backbone Edge Bridge (BEB)
   functionality on the EVPN PE nodes similar to PBB-VPLS, where BEB
   functionality is incorporated in the VPLS PE nodes. The PE devices
   would then receive 802.1Q Ethernet frames from their attachment
   circuits, encapsulate them in the PBB header and forward the frames
   over the IP/MPLS core. On the egress EVPN PE, the PBB header is
   removed following the MPLS disposition, and the original 802.1Q
   Ethernet frame is delivered to the customer equipment.

                    BEB   +--------------+  BEB
                    ||    |              |  ||
                    \/    |              |  \/
        +----+ AC1 +----+ |              | +----+   +----+
        | CE1|-----|    | |              | |    |---| CE2|
        +----+\    | PE1| |   IP/MPLS    | | PE3|   +----+
               \   +----+ |   Network    | +----+
                \         |              |
              AC2\ +----+ |              |
                  \|    | |              |
                   | PE2| |              |
                   +----+ |              |
                     /\   +--------------+
                     ||
                     BEB
        <-802.1Q-> <------PBB over MPLS------> <-802.1Q->

                      Figure 1: PBB-EVPN Network

   The PE nodes perform the following functions:

   - Learn customer/client MAC addresses (C-MACs) over the attachment
   circuits in the data-plane, per normal bridge operation.

   - Learn remote C-MAC to B-MAC bindings in the data-plane for traffic
   received from the core per [PBB] bridging operation.

   - Advertise local B-MAC address reachability information in BGP to
   all other PE nodes in the same set of service instances. Note that
   every PE has a set of B-MAC addresses that uniquely identify the
   device. B-MAC address assignment is described in detail in section
   7.2.2.

   - Build a forwarding table from remote BGP advertisements received
   associating remote B-MAC addresses with remote PE IP addresses and
   the associated MPLS label(s).

6. BGP Encoding

   PBB-EVPN leverages the same BGP Routes and Attributes defined in
   [EVPN], adapted as follows:

6.1. Ethernet Auto-Discovery Route

   This route and all of its associated modes are not needed in PBB-EVPN
   because PBB encapsulation provides the required level of indirection
   for C-MAC addresses - i.e., an ES can be represented by a B-MAC
   address for the purpose of data-plane learning/forwarding.

   The receiving PE knows that it need not wait for the receipt of the
   Ethernet A-D route for route resolution by means of the reserved
   Ethernet Segment Identifier (ESI) encoded in the MAC Advertisement
   route: the ESI values of 0 and MAX-ESI indicate that the receiving PE
   can resolve the path without an Ethernet A-D route.

6.2. MAC/IP Advertisement Route

   The EVPN MAC/IP Advertisement Route is used to distribute B-MAC
   addresses of the PE nodes instead of the C-MAC addresses of end-
   stations/hosts. This is because the C-MAC addresses are learnt in the
   data-plane for traffic arriving from the core. The MAC Advertisement
   Route is encoded as follows:

   - The MAC address field contains the B-MAC address.
   - The Ethernet Tag field is set to 0.
   - The Ethernet Segment Identifier field must be set either to 0 (for
   single-homed segments or multi-homed segments with per-ISID load-
   balancing) or to MAX-ESI (for multi-homed segments with per-flow
   load-balancing). All other values are not permitted.
   - All other fields are set as defined in [EVPN].

   This route is tagged with the Route Target (RT) corresponding to its
   EVI. This EVI is analogous to a B-VID.

6.3. Inclusive Multicast Ethernet Tag Route

   This route is used for multicast pruning per I-SID. It is used for
   auto-discovery of PEs participating in a given I-SID so that a
   multicast tunnel (MP2P, P2P, P2MP, or MP2MP LSP) can be set up for
   that I-SID. [RFC7080] uses multicast pruning per I-SID based on
   [MMRP], which is a soft-state protocol.
   The advantages of multicast pruning using this BGP route over [MMRP]
   are that a) it scales very well for a large number of PEs and b) it
   works with any type of LSP (MP2P, P2P, P2MP, or MP2MP), whereas
   [MMRP] only works over P2P pseudowires. The Inclusive Multicast
   Ethernet Tag Route is encoded as follows:

   - The Ethernet Tag field is set with the appropriate I-SID value.
   - All other fields are set as defined in [EVPN].

   This route is tagged with an RT. This RT SHOULD be set to a value
   corresponding to its EVI (which is analogous to a B-VID). The RT for
   this route MAY also be auto-derived from the corresponding Ethernet
   Tag (I-SID) based on the procedure specified in section 8.4.1.1.1 of
   [EVPN].

6.4. Ethernet Segment Route

   This route is used for auto-discovery of member PEs belonging to a
   given redundancy group (e.g., attached to a given Ethernet Segment)
   per [EVPN].

6.5. ESI Label Extended Community

   This extended community is not used in PBB-EVPN. In [EVPN], this
   extended community is used along with the Ethernet AD route to
   advertise an MPLS label for the purpose of split-horizon filtering.
   Since in PBB-EVPN, the split-horizon filtering is performed natively
   using B-MAC SA, there is no need for this extended community.

6.6. ES-Import Route Target

   This RT is used as defined in [EVPN].

6.7. MAC Mobility Extended Community

   This extended community is defined in [EVPN] and it is used with a
   MAC route (B-MAC route in case of PBB-EVPN). The B-MAC route is
   tagged with the RT corresponding to its EVI (which is analogous to a
   B-VID). When this extended community is used along with a B-MAC route
   in PBB-EVPN, it indicates that all C-MAC addresses associated with
   that B-MAC address across all corresponding I-SIDs must be flushed.

6.8. Default Gateway Extended Community

   This extended community is not used in PBB-EVPN.

7. Operation

   This section discusses the operation of PBB-EVPN, specifically in
   areas where it differs from [EVPN].

7.1. MAC Address Distribution over Core

   In PBB-EVPN, host MAC addresses (i.e. C-MAC addresses) need not be
   distributed in BGP. Rather, every PE independently learns the C-MAC
   addresses in the data-plane via normal bridging operation. Every PE
   has a set of one or more unicast B-MAC addresses associated with it,
   and those are the addresses distributed over the core in MAC
   Advertisement routes.

7.2. Device Multi-homing

7.2.1. Flow-based Load-balancing

   This section describes the procedures for supporting device multi-
   homing in an All-Active redundancy mode (i.e., flow-based load-
   balancing).

7.2.1.1. PE B-MAC Address Assignment

   In [PBB] every BEB is uniquely identified by one or more B-MAC
   addresses. These addresses are usually locally administered by the
   Service Provider. For PBB-EVPN, the choice of B-MAC address(es) for
   the PE nodes must be examined carefully as it has implications on the
   proper operation of multi-homing. In particular, for the scenario
   where a CE is multi-homed to a number of PE nodes with All-Active
   redundancy mode, a given C-MAC address would be reachable via
   multiple PE nodes concurrently. Given that any given remote PE will
   bind the C-MAC address to a single B-MAC address, the various PE
   nodes connected to the same CE must share the same B-MAC address.
   Otherwise, the MAC address table of the remote PE nodes will keep
   oscillating between the B-MAC addresses of the various PE devices.
   For example, consider the network of Figure 1, and assume that PE1
   has B-MAC BM1 and PE2 has B-MAC BM2. Also, assume that both links
   from CE1 to the PE nodes are part of the same Ethernet link
   aggregation group.
   If BM1 is not equal to BM2, the consequence is that the MAC address
   table on PE3 will keep oscillating such that the C-MAC address M1 of
   CE1 would flip-flop between BM1 and BM2, depending on the load-
   balancing decision on CE1 for traffic destined to the core.

   Considering that there could be multiple sites (e.g. CEs) that are
   multi-homed to the same set of PE nodes, it is required for all
   the PE devices in a Redundancy Group to have a unique B-MAC address
   per site. This way, it is possible to achieve fast convergence in the
   case where a link or port failure impacts the attachment circuit
   connecting a single site to a given PE.

                                +---------+
                +-------+  PE1  | IP/MPLS |
               /                |         |
            CE1                 | Network |     PEr
           M1  \                |         |
                +-------+  PE2  |         |
                /-------+       |         |
               /                |         |
            CE2                 |         |
           M2  \                |         |
                \               |         |
                 +------+  PE3  +---------+

                 Figure 2: B-MAC Address Assignment

   In the example network shown in Figure 2 above, two sites
   corresponding to CE1 and CE2 are dual-homed to PE1/PE2 and PE2/PE3,
   respectively. Assume that BM1 is the B-MAC used for the site
   corresponding to CE1. Similarly, BM2 is the B-MAC used for the site
   corresponding to CE2. On PE1, a single B-MAC address (BM1) is
   required for the site corresponding to CE1. On PE2, two B-MAC
   addresses (BM1 and BM2) are required, one per site. On PE3, a
   single B-MAC address (BM2) is required for the site corresponding to
   CE2. All three PE nodes would advertise their respective B-MAC
   addresses in BGP using the MAC Advertisement routes defined in
   [EVPN]. The remote PE, PEr, would learn via BGP that BM1 is reachable
   via PE1 and PE2, whereas BM2 is reachable via both PE2 and PE3.
   Furthermore, PEr establishes, via the PBB bridge learning procedure,
   that C-MAC M1 is reachable via BM1, and C-MAC M2 is reachable via
   BM2. As a result, PEr can load-balance traffic destined to M1 between
   PE1 and PE2, as well as traffic destined to M2 between both PE2 and
   PE3. In the case of a failure that causes, for example, CE1 to be
   isolated from PE1, the latter can withdraw the route it has
   advertised for BM1. This way, PEr would update its path list for BM1,
   and will send all traffic destined to M1 over to PE2 only.

   For Single-Homed or Single-Active sites, it is possible to assign a
   unique B-MAC address per site, or have all the Single-Homed sites or
   Single-Active sites connected to a given PE share a single B-MAC
   address. The advantage of the first model over the second model is
   the ability to avoid C-MAC destination address lookup on the
   disposition PE (even though source C-MAC learning is still required
   in the data-plane). The disadvantage of the first model compared to
   the second model is additional B-MAC advertisements in BGP.

   In summary, every PE may use a unicast B-MAC address shared by all
   single-homed sites or a unicast B-MAC address per single-homed site
   and, in addition, a unicast B-MAC address per All-Active multi-homed
   site. In the latter case, the B-MAC address MUST be the same for all
   PE nodes in a Redundancy Group connected to the same site.

7.2.1.2. Automating B-MAC Address Assignment

   The PE B-MAC address used for Single-Homed or Single-Active sites can
   be automatically derived from the hardware (using, e.g., the
   backplane's address and/or the PE's reserved MAC pool). However, the
   B-MAC address used for All-Active sites must be coordinated among the
   RG members. To automate the assignment of this latter address, the PE
   can derive this B-MAC address from the MAC Address portion of the
   CE's Link Aggregation Control Protocol (LACP) System Identifier by
   flipping the 'Locally Administered' bit of the CE's address. This
   guarantees the uniqueness of the B-MAC address within the network,
   and ensures that all PE nodes connected to the same All-Active CE use
   the same value for the B-MAC address.

   Note that with this automatic provisioning of the B-MAC address
   associated with All-Active CEs, it is not possible to support the
   uncommon scenario where a CE has multiple link bundles within the
   same LACP session towards the PE nodes, and the service involves
   hair-pinning traffic from one bundle to another. This is because the
   split-horizon filtering relies on B-MAC addresses rather than Site-ID
   Labels (as will be described in the next section). The operator must
   explicitly configure the B-MAC address for this fairly uncommon
   service scenario.

   Whenever a B-MAC address is provisioned on the PE, either manually or
   automatically (as an outcome of CE auto-discovery), the PE MUST
   transmit a MAC Advertisement Route for the B-MAC address with a
   downstream assigned MPLS label that uniquely identifies that address
   on the advertising PE. The route is tagged with the RTs of the
   associated EVIs as described above.

7.2.1.3. Split Horizon and Designated Forwarder Election

   [EVPN] relies on access split horizon, where the Ethernet Segment
   Label is used for egress filtering on the attachment circuit in order
   to prevent forwarding loops. In PBB-EVPN, the B-MAC source address
   can be used for the same purpose, as it uniquely identifies the
   originating site of a given frame. As such, Ethernet Segment (ES)
   Labels are not used in PBB-EVPN, and the egress split-horizon
   filtering is done based on the B-MAC source address. It is worth
   noting here that [PBB] defines this B-MAC address based filtering
   function as part of the I-Component options, hence no new functions
   are required to support split-horizon beyond what is already defined
   in [PBB].
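
   As an illustration only, the B-MAC derivation of Section 7.2.1.2 and
   the B-MAC based egress filter described above can be sketched in
   Python. The function names and the string representation of MAC
   addresses are hypothetical and not part of this specification. The
   filter assumes one reading of the text: a frame received from the
   core is not forwarded to a local segment when its B-MAC source
   address equals the B-MAC assigned to that segment.

```python
def derive_bmac(lacp_system_mac: str) -> str:
    """Derive an All-Active site B-MAC from the CE's LACP system MAC.

    Hypothetical helper: flips the Locally Administered (U/L) bit,
    which is bit 0x02 of the first octet of the address.
    """
    octets = [int(part, 16) for part in lacp_system_mac.split(":")]
    if len(octets) != 6 or any(not 0 <= o <= 0xFF for o in octets):
        raise ValueError(f"not a MAC address: {lacp_system_mac!r}")
    octets[0] ^= 0x02  # flip the U/L bit, leave every other bit unchanged
    return ":".join(f"{o:02x}" for o in octets)


def egress_allowed(frame_bmac_sa: str, segment_bmac: str) -> bool:
    """Split-horizon check on the egress attachment circuit.

    Assumption: a frame whose B-MAC SA matches the segment's own B-MAC
    originated from that site and must not be sent back to it.
    """
    return frame_bmac_sa.lower() != segment_bmac.lower()
```

   Because every PE attached to the same CE sees the same LACP system
   identifier, each PE arrives at the same B-MAC without any
   coordination protocol, which is the property the text relies on.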

   The Designated Forwarder election procedures are defined in [EVPN].

7.2.2. I-SID Based Load-balancing

   This section describes the procedures for supporting device multi-
   homing in a Single-Active redundancy mode with per-ISID load-
   balancing.

7.2.2.1. PE B-MAC Address Assignment

   In the case where per-ISID load-balancing is desired among the PE
   nodes in a given redundancy group, multiple unicast B-MAC addresses
   are allocated per multi-homed Ethernet Segment: each PE connected to
   the multi-homed segment is assigned a unique B-MAC. Every PE then
   advertises its B-MAC address using the BGP MAC advertisement route.
   In this mode of operation, two B-MAC address assignment models are
   possible:

   - The PE may use a shared B-MAC address for multiple Ethernet
   Segments (ES's). This includes the single-homed segments as well as
   the multi-homed segments operating with per-ISID load-balancing mode.

   - The PE may use a dedicated B-MAC address for each ES operating with
   per-ISID load-balancing mode.

   A PE implementation MAY choose to support either the shared B-MAC
   address model or the dedicated B-MAC address model without causing
   any interoperability issues.

   A remote PE initially floods traffic to a destination C-MAC address,
   located in a given multi-homed Ethernet Segment, to all the PE nodes
   configured with that I-SID. Then, when reply traffic arrives at the
   remote PE, it learns (in the data-path) the B-MAC address and
   associated next-hop PE to use for said C-MAC address.

7.2.2.2. Split Horizon and Designated Forwarder Election

   The procedures are similar to the flow-based load-balancing case,
   with the only difference being that the DF filtering must be applied
   to unicast as well as multicast traffic, and in both core-to-segment
   as well as segment-to-core directions.

7.2.2.3. Handling Failure Scenarios

   When a PE connected to a multi-homed Ethernet Segment loses
   connectivity to the segment, due to link or port failure, it needs to
   notify the remote PEs to trigger C-MAC address flushing. This can be
   achieved in one of two ways, depending on the B-MAC assignment model:

   - If the PE uses a shared B-MAC address for multiple ES's, then the
   C-MAC flushing is signaled by means of having the failed PE re-
   advertise the MAC Advertisement route for the associated B-MAC,
   tagged with the MAC Mobility Extended Community attribute. The value
   of the Counter field in that attribute must be incremented prior to
   advertisement. This causes the remote PE nodes to flush all C-MAC
   addresses associated with the B-MAC in question. This is done across
   all I-SIDs that are mapped to the EVI of the withdrawn MAC route.

   - If the PE uses a dedicated B-MAC address for each Ethernet Segment
   operating under per-ISID load-balancing mode, then the failed PE
   simply withdraws the B-MAC route previously advertised for that
   segment. This causes the remote PE nodes to flush all C-MAC addresses
   associated with the B-MAC in question. This is done across all I-SIDs
   that are mapped to the EVI of the withdrawn MAC route.

   When a PE connected to a multi-homed Ethernet Segment fails (i.e.
   node failure) or when the PE becomes completely isolated from the
   EVPN network, the remote PEs will start purging the MAC Advertisement
   routes that were advertised by the failed PE. This is done either as
   an outcome of the remote PEs detecting that the BGP session to the
   failed PE has gone down, or by having a Route Reflector withdraw
   all the routes that were advertised by the failed PE. The remote PEs,
   in this case, will perform C-MAC address flushing as an outcome of
   the MAC Advertisement route withdrawals.
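
   The remote PE behaviour in the cases above can be sketched as
   follows. This is a simplified model under stated assumptions, not a
   definitive implementation: the class, its method names, and the
   dictionary-based tables are hypothetical; the table is taken to cover
   the I-SIDs of a single EVI; a route without the MAC Mobility Extended
   Community is modelled as counter 0; and counter wrap-around is
   ignored.

```python
class RemotePeCmacTable:
    """Hypothetical C-MAC table of a remote PE for one EVI."""

    def __init__(self):
        # (isid, cmac) -> bmac, populated by data-plane learning
        self.cmac_to_bmac = {}
        # bmac -> last MAC Mobility counter seen (0 if none was present)
        self.mobility_counter = {}

    def learn(self, isid, cmac, bmac):
        """Record a C-MAC to B-MAC binding learned from core traffic."""
        self.cmac_to_bmac[(isid, cmac)] = bmac

    def flush_bmac(self, bmac):
        """Remove every C-MAC bound to bmac, across all I-SIDs."""
        stale = [key for key, value in self.cmac_to_bmac.items()
                 if value == bmac]
        for key in stale:
            del self.cmac_to_bmac[key]
        return len(stale)

    def on_bmac_advertised(self, bmac, counter=0):
        """Shared B-MAC model: flush when the counter has increased."""
        previous = self.mobility_counter.get(bmac, 0)
        self.mobility_counter[bmac] = counter
        return self.flush_bmac(bmac) if counter > previous else 0

    def on_bmac_withdrawn(self, bmac):
        """Dedicated B-MAC model or node failure: flush on withdrawal."""
        self.mobility_counter.pop(bmac, None)
        return self.flush_bmac(bmac)
```

   Both triggers converge on the same flush operation, which reflects
   the text: the assignment model changes only how the flush is
   signaled, not what the remote PE removes.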
611 For all failure scenarios (link/port failure, node failure and PE 612 node isolation), when the fault condition clears, the recovered PE 613 re-advertises the associated Ethernet Segment route to other members 614 of its Redundancy Group. This triggers the backup PE(s) in the 615 Redundancy Group to block the I-SIDs for which the recovered PE is a 616 DF. When a backup PE blocks the I-SIDs, it triggers a C-MAC address 617 flush notification to the remote PEs by re-advertising the MAC 618 Advertisement route for the associated B-MAC, with the MAC Mobility 619 Extended Community attribute. The value of the Counter field in that 620 attribute must be incremented prior to advertisement. This causes the 621 remote PE nodes to flush all C-MAC addresses associated with the B- 622 MAC in question. This is done across all I-SIDs that are mapped to 623 the EVI of the withdrawn/readvertised MAC route. 625 7.3. Network Multi-homing 627 When an Ethernet network is multi-homed to a set of PE nodes running 628 PBB-EVPN, Single-Active redundancy model can be supported with per 629 service instance (i.e. I-SID) load-balancing. In this model, DF 630 election is performed to ensure that a single PE node in the 631 redundancy group is responsible for forwarding traffic associated 632 with a given I-SID. This guarantees that no forwarding loops are 633 created. Filtering based on DF state applies to both unicast and 634 multicast traffic, and in both access-to-core as well as core-to- 635 access directions just like Single-Active multi-homed device scenario 636 (but unlike All-Active multi-homed device scenario where DF filtering 637 is limited to multi-destination frames in the core-to-access 638 direction). Similar to Single-Active multi-homed device scenario, 639 with I-SID based load-balancing, a unique B-MAC address is assigned 640 to each of the PE nodes connected to the multi-homed network 641 (Segment). 643 7.4. 
Frame Forwarding

   The frame forwarding functions are divided between the Bridge
   Module, which hosts the [PBB] Backbone Edge Bridge (BEB)
   functionality, and the MPLS Forwarder, which handles the MPLS
   imposition/disposition. The details of frame forwarding for unicast
   and multi-destination frames are discussed next.

7.4.1. Unicast

   Known unicast traffic received from the AC will be PBB-encapsulated
   by the PE using the B-MAC source address corresponding to the
   originating site. The unicast B-MAC destination address is determined
   based on a lookup of the C-MAC destination address (the binding of
   the two is done via transparent learning of reverse traffic). The
   resulting frame is then encapsulated with an LSP tunnel label and the
   MPLS label which uniquely identifies the B-MAC destination address on
   the egress PE. If per flow load-balancing over ECMPs in the MPLS core
   is required, then a flow label is added below the label associated
   with the B-MAC address in the label stack.

   For unknown unicast traffic, the PE forwards these frames over the
   MPLS core. When these frames are to be forwarded, the same set of
   options used for forwarding multicast/broadcast frames (as described
   in the next section) is used.

7.4.2. Multicast/Broadcast

   Multi-destination frames received from the AC will be
   PBB-encapsulated by the PE using the B-MAC source address
   corresponding to the originating site. The multicast B-MAC
   destination address is selected based on the value of the I-SID as
   defined in [PBB]. The resulting frame is then forwarded over the MPLS
   core using one of the following two options:

   Option 1: the MPLS Forwarder can perform ingress replication over a
   set of MP2P or P2P tunnel LSPs. The frame is encapsulated with a
   tunnel LSP label and the EVPN ingress replication label advertised in
   the Inclusive Multicast Route.

   Option 2: the MPLS Forwarder can use a P2MP tunnel LSP per the
   procedures defined in [EVPN]. This includes either the use of
   Inclusive or Aggregate Inclusive trees. Furthermore, the MPLS
   Forwarder can use an MP2MP tunnel LSP if Inclusive trees are used.

   Note that the same procedures for advertising and handling the
   Inclusive Multicast Route defined in [EVPN] apply here.

7.5. MPLS Encapsulation of PBB Frames

   The encapsulation for the transport of PBB frames over MPLS is
   similar to that of classical Ethernet, albeit with the additional PBB
   header, as shown in the figure below:

                        +------------------+
                        | IP/MPLS Header   |
                        +------------------+
                        | PBB Header       |
                        +------------------+
                        | Ethernet Header  |
                        +------------------+
                        | Ethernet Payload |
                        +------------------+
                        | Ethernet FCS     |
                        +------------------+

                 Figure 8: PBB over MPLS Encapsulation

8. Minimizing ARP Broadcast

   The PE nodes implement an ARP-proxy function in order to minimize the
   volume of ARP traffic that is broadcast over the MPLS network. This
   is achieved by having each PE node snoop on ARP request and response
   messages received over the access interfaces or the MPLS core. The PE
   builds a cache of IP / MAC address bindings from these snooped
   messages. The PE then uses this cache to respond to ARP requests that
   are received on access ports and that target hosts in remote sites.
   If the PE finds a match for the IP address in its ARP cache, it
   responds to the requesting host and drops the request. Otherwise, if
   it does not find a match, then the request is flooded over the MPLS
   network using either ingress replication or P2MP LSPs.

9.
Seamless Interworking with IEEE 802.1aq/802.1Qbp

                           +--------------+
                           |              |
           +---------+     |     MPLS     |     +---------+
   +----+  |         |  +----+          +----+  |         |  +----+
   |SW1 |--|         |  | PE1|          | PE2|  |         |--| SW3|
   +----+  | 802.1aq |--|    |          |    |--| 802.1aq |  +----+
   +----+  |  .1Qbp  |  +----+          +----+  |  .1Qbp  |  +----+
   |SW2 |--|         |     |   Backbone   |     |         |--| SW4|
   +----+  +---------+     +--------------+     +---------+  +----+

   |<------ IS-IS -------->|<----BGP----->|<------ IS-IS -------->|  CP

   |<--------------------------- PBB ---------------------------->|  DP
                           |<----MPLS---->|

   Legend: CP = Control Plane View
           DP = Data Plane View

   Figure 7: Interconnecting 802.1aq/802.1Qbp Networks with PBB-EVPN

9.1. B-MAC Address Assignment

   The B-MAC addresses need to be globally unique across all networks
   including PBB-EVPN and IEEE 802.1aq / 802.1Qbp networks. The B-MAC
   addresses used for Single-Homed and Single-Active Ethernet Segments
   should be unique because they are typically auto-derived from the
   PE's pools of reserved MAC addresses that are unique. The B-MAC
   addresses used for All-Active Ethernet Segments should also be unique
   given that each network operator typically has its own assigned
   Organizationally Unique Identifier (OUI) and thus can assign its own
   unique MAC addresses.

9.2. IEEE 802.1aq / 802.1Qbp B-MAC Address Advertisement

   B-MAC addresses associated with 802.1aq / 802.1Qbp switches are
   advertised using the EVPN MAC/IP route advertisement already defined
   in [EVPN].

9.3. Operation

   When a PE receives a PBB-encapsulated Ethernet frame from the access
   side, it performs a lookup on the B-MAC destination address to
   identify the next hop. If the lookup yields that the next hop is a
   remote PE, the local PE would then encapsulate the PBB frame in MPLS.
   The label stack consists of the VPN label (advertised by the remote
   PE), followed by an LSP/IGP label.
   From that point onwards, regular MPLS forwarding is applied.

   On the disposition PE, assuming penultimate-hop-popping is employed,
   the PE receives the MPLS-encapsulated PBB frame with a single label:
   the VPN label. The value of the label indicates to the disposition PE
   that this is a PBB frame, so the label is popped, the TTL field (in
   the 802.1Qbp F-Tag) is reinitialized and normal PBB processing is
   employed from this point onwards.

10. Solution Advantages

   In this section, we discuss the advantages of the PBB-EVPN solution
   in the context of the requirements set forth in Section 3 above.

10.1. MAC Advertisement Route Scalability

   In PBB-EVPN, the number of MAC Advertisement Routes is a function of
   the number of Ethernet Segments (e.g., sites), rather than the number
   of hosts/servers. This is because the B-MAC addresses of the PEs,
   rather than the C-MAC addresses (of hosts/servers), are advertised in
   BGP. As discussed above, there is a one-to-one mapping between
   All-Active multi-homed segments and their associated B-MAC addresses;
   there can be either a one-to-one or a many-to-one mapping between
   Single-Active multi-homed segments and their associated B-MAC
   addresses; and there is a many-to-one mapping between single-homed
   sites and their associated B-MAC address on a given PE. This means a
   single B-MAC is associated with one or more segments, where each
   segment can be associated with many C-MAC addresses. As a result, the
   volume of MAC Advertisement Routes in PBB-EVPN may be multiple orders
   of magnitude lower than in EVPN.

10.2. C-MAC Mobility Independent of B-MAC Advertisements

   As described above, in PBB-EVPN, a single B-MAC address can aggregate
   many C-MAC addresses. Given that B-MAC addresses are associated with
   segments attached to a PE or to the PE itself, their locations are
   fixed and thus are not impacted whatsoever by C-MAC mobility.
   Therefore, C-MAC mobility does not affect B-MAC addresses (e.g., it
   does not trigger any re-advertisement of them). This is because the
   C-MAC address to B-MAC address association is learnt in the
   data-plane and C-MAC addresses are not advertised in BGP. As a
   result, C-MAC mobility generates no BGP route updates in PBB-EVPN,
   whereas in EVPN every C-MAC move requires a new MAC Advertisement
   route.

   To illustrate how this compares to EVPN, consider the following
   example:

   If a PE running EVPN advertises reachability for N MAC addresses via
   a particular segment, and then 50% of the MAC addresses in that
   segment move to other segments (e.g., due to virtual machine
   mobility), then N/2 additional MAC Advertisement routes need to be
   sent for the MAC addresses that have moved. With PBB-EVPN, on the
   other hand, the B-MAC addresses, which are statically associated with
   PE nodes, are not subject to mobility. As C-MAC addresses move from
   one segment to another, the binding of C-MAC to B-MAC addresses is
   updated via data-plane learning in PBB-EVPN.

10.3. C-MAC Address Learning and Confinement

   In PBB-EVPN, C-MAC address reachability information is built via
   data-plane learning. As such, PE nodes not participating in active
   conversations involving a particular C-MAC address will purge that
   address from their forwarding tables. Furthermore, since C-MAC
   addresses are not distributed in BGP, PE nodes will not maintain any
   record of them in the control-plane routing table.

10.4. Seamless Interworking with 802.1aq Access Networks

   Consider the scenario where two access networks, one running MPLS and
   the other running 802.1aq, are interconnected via an MPLS backbone
   network. The figure below shows such an example network.

                           +--------------+
                           |              |
           +---------+     |     MPLS     |     +---------+
   +----+  |         |  +----+          +----+  |         |  +----+
   | CE |--|         |  | PE1|          | PE2|  |         |--| CE |
   +----+  | 802.1aq |--|    |          |    |--|  MPLS   |  +----+
   +----+  |         |  +----+          +----+  |         |  +----+
   | CE |--|         |     |   Backbone   |     |         |--| CE |
   +----+  +---------+     +--------------+     +---------+  +----+

                Figure 9: Interoperability with 802.1aq

   If the MPLS backbone network employs EVPN, then the 802.1aq
   data-plane encapsulation must be terminated on PE1 or the edge device
   connecting to PE1. Either way, all the PE nodes that are part of the
   associated service instances will be exposed to all the C-MAC
   addresses of all hosts/servers connected to the access networks.
   However, if the MPLS backbone network employs PBB-EVPN, then the
   802.1aq encapsulation can be extended over the MPLS backbone, thereby
   maintaining C-MAC address transparency on PE1. If PBB-EVPN is also
   extended over the MPLS access network on the right, then C-MAC
   addresses would be transparent to PE2 as well.

10.5. Per Site Policy Support

   In PBB-EVPN, per site policy can be supported by assigning a unique
   B-MAC address to every site/segment (typically multi-homed but can
   also be single-homed). Given that the B-MAC addresses are sent in BGP
   MAC/IP route advertisements, it is possible to define per site (i.e.,
   per B-MAC) forwarding policies, including policies for E-TREE
   service.

10.6. No C-MAC Address Flushing for All-Active Multi-Homing

   Just as in [EVPN], with PBB-EVPN, it is possible to avoid C-MAC
   address flushing upon a topology change affecting an All-Active
   multi-homed segment. To illustrate this, consider the example network
   of Figure 1. Both PE1 and PE2 advertise the same B-MAC address (BM1)
   to PE3. PE3 then learns the C-MAC addresses of the servers/hosts
   behind CE1 via data-plane learning.
   If AC1 fails, then PE3 does not need to flush any of the C-MAC
   addresses learnt and associated with BM1. This is because PE1 will
   withdraw the MAC Advertisement routes associated with BM1, thereby
   leading PE3 to have a single adjacency (to PE2) for this B-MAC
   address. Therefore, the topology change is communicated to PE3 and no
   C-MAC address flushing is required.

11. Acknowledgements

   The authors would like to thank Sami Boutros, Jose Liste, and Patrice
   Brissette for their reviews of and comments on this document. We
   would also like to thank Giles Heron for several rounds of reviews
   and for providing valuable input to get this draft ready for IESG
   submission.

12. Security Considerations

   All the security considerations in [EVPN] apply directly to this
   document because this document leverages the [EVPN] control plane and
   its associated procedures, although only a subset of them rather than
   the complete set.

13. IANA Considerations

   There are no additional IANA considerations for PBB-EVPN beyond what
   is already described in [EVPN].

14. Normative References

   [EVPN]    Sajassi, A., et al., "BGP MPLS Based Ethernet VPN",
             draft-ietf-l2vpn-evpn-11.txt, work in progress, October
             2014.

15. Informative References

   [RFC7080] Sajassi, A., et al., "Virtual Private LAN Service (VPLS)
             Interoperability with Provider Backbone Bridges", RFC
             7080, December 2013.

   [RFC7209] Sajassi, A., et al., "Requirements for Ethernet VPN
             (EVPN)", RFC 7209, May 2014.

   [PBB]     Clauses 25 and 26 of "IEEE Standard for Local and
             metropolitan area networks - Media Access Control (MAC)
             Bridges and Virtual Bridged Local Area Networks", IEEE Std
             802.1Q, 2013.

   [MMRP]    Clause 10 of "IEEE Standard for Local and metropolitan
             area networks - Media Access Control (MAC) Bridges and
             Virtual Bridged Local Area Networks", IEEE Std 802.1Q,
             2013.

16.
Authors' Addresses

   Ali Sajassi
   Cisco
   170 West Tasman Drive
   San Jose, CA 95134, US
   Email: sajassi@cisco.com

   Samer Salam
   Cisco
   595 Burrard Street, Suite # 2123
   Vancouver, BC V7X 1J1, Canada
   Email: ssalam@cisco.com

   Nabil Bitar
   Verizon Communications
   Email: nabil.n.bitar@verizon.com

   Aldrin Isaac
   Bloomberg
   Email: aisaac71@bloomberg.net

   Wim Henderickx
   Alcatel-Lucent
   Email: wim.henderickx@alcatel-lucent.be

   Lizhong Jin
   Shanghai,
   China
   Email: lizho.jin@gmail.com