Internet Working Group                                  Ali Sajassi, Ed.
Internet Draft                                               Samer Salam
Category: Standards Track                                          Cisco
                                                             Nabil Bitar
                                                                 Verizon
                                                            Aldrin Isaac
                                                               Bloomberg
                                                          Wim Henderickx
                                                          Alcatel-Lucent
Expires: November 14, 2015                                  May 14, 2015

    Provider Backbone Bridging Combined with Ethernet VPN (PBB-EVPN)
                      draft-ietf-l2vpn-pbb-evpn-10

Status of this Memo

This Internet-Draft is submitted to IETF in full conformance with the
provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as
Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at
http://www.ietf.org/1id-abstracts.html

The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html

Copyright and License Notice

Copyright (c) 2015 IETF Trust and the persons identified as the
document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.

Abstract

This document discusses how Ethernet Provider Backbone Bridging (PBB)
can be combined with Ethernet VPN (EVPN) in order to reduce the
number of BGP MAC advertisement routes by aggregating Customer/Client
MAC (C-MAC) addresses via Provider Backbone MAC address (B-MAC),
provide client MAC address mobility using C-MAC aggregation, confine
the scope of C-MAC learning to only active flows, offer per site
policies and avoid C-MAC address flushing on topology changes. The
combined solution is referred to as PBB-EVPN.

Conventions

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in RFC 2119.

Table of Contents

1. Introduction
2. Contributors
3. Terminology
4. Requirements
   4.1. MAC Advertisement Route Scalability
   4.2. C-MAC Mobility Independent of B-MAC Advertisements
   4.3. C-MAC Address Learning and Confinement
   4.4. Per Site Policy Support
   4.5. No C-MAC Address Flushing for All-Active Multi-Homing
5. Solution Overview
6. BGP Encoding
   6.1. Ethernet Auto-Discovery Route
   6.2. MAC/IP Advertisement Route
   6.3. Inclusive Multicast Ethernet Tag Route
   6.4. Ethernet Segment Route
   6.5. ESI Label Extended Community
   6.6. ES-Import Route Target
   6.7. MAC Mobility Extended Community
   6.8. Default Gateway Extended Community
7. Operation
   7.1. MAC Address Distribution over Core
   7.2. Device Multi-homing
      7.2.1. Flow-based Load-balancing
         7.2.1.1. PE B-MAC Address Assignment
         7.2.1.2. Automating B-MAC Address Assignment
         7.2.1.3. Split Horizon and Designated Forwarder Election
      7.2.2. I-SID Based Load-balancing
         7.2.2.1. PE B-MAC Address Assignment
         7.2.2.2. Split Horizon and Designated Forwarder Election
         7.2.2.3. Handling Failure Scenarios
   7.3. Network Multi-homing
   7.4. Frame Forwarding
      7.4.1. Unicast
      7.4.2. Multicast/Broadcast
   7.5. MPLS Encapsulation of PBB Frames
8. Minimizing ARP/ND Broadcast
9. Seamless Interworking with IEEE 802.1aq/802.1Qbp
   9.1. B-MAC Address Assignment
   9.2. IEEE 802.1aq / 802.1Qbp B-MAC Address Advertisement
   9.4. Operation:
10. Solution Advantages
   10.1. MAC Advertisement Route Scalability
   10.2. C-MAC Mobility Independent of B-MAC Advertisements
   10.3. C-MAC Address Learning and Confinement
   10.4. Seamless Interworking with 802.1aq Access Networks
   10.5. Per Site Policy Support
   10.6. No C-MAC Address Flushing for All-Active Multi-Homing
11. Acknowledgements
12. Security Considerations
13. IANA Considerations
14. Normative References
15. Informative References
16. Authors' Addresses

1. Introduction

[RFC7432] introduces a solution for multipoint L2VPN services, with
advanced multi-homing capabilities, using BGP for distributing
customer/client MAC address reachability information over the core
MPLS/IP network. [PBB] defines an architecture for Ethernet Provider
Backbone Bridging (PBB), where MAC tunneling is employed to improve
service instance and MAC address scalability in Ethernet as well as
VPLS networks [RFC7080].

In this document, we discuss how PBB can be combined with EVPN in
order to: reduce the number of BGP MAC advertisement routes by
aggregating Customer/Client MAC (C-MAC) addresses via Provider
Backbone MAC address (B-MAC), provide client MAC address mobility
using C-MAC aggregation, confine the scope of C-MAC learning to only
active flows, offer per site policies and avoid C-MAC address
flushing on topology changes. The combined solution is referred to as
PBB-EVPN.

2. Contributors

In addition to the authors listed above, the following individuals
also contributed to this document.

Lizhong Jin, ZTE
Sami Boutros, Cisco
Dennis Cai, Cisco
Keyur Patel, Cisco
Sam Aldrin, Huawei
Himanshu Shah, Ciena
Jorge Rabadan, ALU

3. Terminology

ARP: Address Resolution Protocol
BEB: Backbone Edge Bridge
B-MAC: Backbone MAC Address
CE: Customer Edge
C-MAC: Customer/Client MAC Address
ES: Ethernet Segment
ESI: Ethernet Segment Identifier
LSP: Label Switched Path
MP2MP: Multipoint to Multipoint
MP2P: Multipoint to Point
ND: Neighbor Discovery
NA: Neighbor Advertisement
P2MP: Point to Multipoint
P2P: Point to Point
PE: Provider Edge
EVPN: Ethernet VPN
EVI: EVPN Instance
RT: Route Target

Single-Active Redundancy Mode: When only a single PE, among a group
of PEs attached to an Ethernet segment, is allowed to forward traffic
to/from that Ethernet Segment, then the Ethernet segment is defined
to be operating in Single-Active redundancy mode.

All-Active Redundancy Mode: When all PEs attached to an Ethernet
segment are allowed to forward traffic to/from that Ethernet Segment,
then the Ethernet segment is defined to be operating in All-Active
redundancy mode.

4. Requirements

The requirements for PBB-EVPN include all the requirements for EVPN
that were described in [RFC7209], in addition to the following:

4.1. MAC Advertisement Route Scalability

In typical operation, an EVPN PE sends a BGP MAC Advertisement Route
per customer/client MAC (C-MAC) address. In certain applications,
this poses scalability challenges, as is the case in data center
interconnect (DCI) scenarios where the number of virtual machines
(VMs), and hence the number of C-MAC addresses, can be in the
millions. In such scenarios, it is required to reduce the number of
BGP MAC Advertisement routes by relying on a 'MAC summarization'
scheme, as is provided by PBB.

4.2. C-MAC Mobility Independent of B-MAC Advertisements

Certain applications, such as virtual machine mobility, require
support for fast C-MAC address mobility.
For these applications, when using EVPN, the virtual machine MAC
address needs to be transmitted in a BGP MAC Advertisement route.
Otherwise, traffic would be forwarded to the wrong segment when a
virtual machine moves from one Ethernet segment to another. This
means MAC address prefixes cannot be used in data center
applications.

In order to support C-MAC address mobility, while retaining the
scalability benefits of MAC summarization, PBB technology is used. It
defines a Backbone MAC (B-MAC) address space that is independent of
the C-MAC address space, and aggregates C-MAC addresses via a single
B-MAC address.

4.3. C-MAC Address Learning and Confinement

In EVPN, all the PE nodes participating in the same EVPN instance are
exposed to all the C-MAC addresses learnt by any one of these PE
nodes because a C-MAC learned by one of the PE nodes is advertised in
BGP to other PE nodes in that EVPN instance. This is the case even if
some of the PE nodes for that EVPN instance are not involved in
forwarding traffic to, or from, these C-MAC addresses. Even if an
implementation does not install hardware forwarding entries for C-MAC
addresses that are not part of active traffic flows on that PE, the
device memory is still consumed by keeping record of the C-MAC
addresses in the routing table (RIB). In network applications with
millions of C-MAC addresses, this introduces a non-trivial waste of
PE resources. As such, it is required to confine the scope of
visibility of C-MAC addresses only to those PE nodes that are
actively involved in forwarding traffic to, or from, these addresses.

4.4. Per Site Policy Support

In many applications, it is required to be able to enforce
connectivity policy rules at the granularity of a site (or segment).
This includes the ability to control which PE nodes in the network
can forward traffic to, or from, a given site.
Both EVPN and PBB-EVPN are capable of providing this granularity of
policy control. In the case where the policy needs to be at the
granularity of a C-MAC address, C-MAC address learning in the
control plane (in BGP) per [RFC7432] should be used.

4.5. No C-MAC Address Flushing for All-Active Multi-Homing

Just as in [RFC7432], it is required to avoid C-MAC address flushing
upon link, port or node failure for All-Active multi-homed segments.

5. Solution Overview

The solution involves incorporating IEEE Backbone Edge Bridge (BEB)
functionality on the EVPN PE nodes similar to PBB-VPLS, where BEB
functionality is incorporated in the VPLS PE nodes. The PE devices
would then receive 802.1Q Ethernet frames from their attachment
circuits, encapsulate them in the PBB header and forward the frames
over the IP/MPLS core. On the egress EVPN PE, the PBB header is
removed following the MPLS disposition, and the original 802.1Q
Ethernet frame is delivered to the customer equipment.

              BEB   +--------------+  BEB
              ||    |              |  ||
              \/    |              |  \/
  +----+ AC1 +----+ |              | +----+   +----+
  | CE1|-----|    | |              | |    |---| CE2|
  +----+\    | PE1| |   IP/MPLS    | | PE3|   +----+
         \   +----+ |   Network    | +----+
          \         |              |
        AC2\ +----+ |              |
            \|    | |              |
             | PE2| |              |
             +----+ |              |
               /\   +--------------+
               ||
               BEB
  <-802.1Q-> <------PBB over MPLS------> <-802.1Q->

                    Figure 1: PBB-EVPN Network

The PE nodes perform the following functions:

- Learn customer/client MAC addresses (C-MACs) over the attachment
circuits in the data-plane, per normal bridge operation.

- Learn remote C-MAC to B-MAC bindings in the data-plane for traffic
received from the core per [PBB] bridging operation.

- Advertise local B-MAC address reachability information in BGP to
all other PE nodes in the same set of service instances. Note that
every PE has a set of B-MAC addresses that uniquely identify the
device.
B-MAC address assignment is described in detail in Section 7.2.2.

- Build a forwarding table from remote BGP advertisements received
associating remote B-MAC addresses with remote PE IP addresses and
the associated MPLS label(s).

6. BGP Encoding

PBB-EVPN leverages the same BGP Routes and Attributes defined in
[RFC7432], adapted as follows:

6.1. Ethernet Auto-Discovery Route

This route and all of its associated modes are not needed in PBB-EVPN
because PBB encapsulation provides the required level of indirection
for C-MAC addresses - i.e., an ES can be represented by a B-MAC
address for the purpose of data-plane learning/forwarding.

The receiving PE knows that it need not wait for the receipt of the
Ethernet A-D route for route resolution by means of the reserved
Ethernet Segment Identifier (ESI) encoded in the MAC Advertisement
route: the ESI values of 0 and MAX-ESI indicate that the receiving PE
can resolve the path without an Ethernet A-D route.

6.2. MAC/IP Advertisement Route

The EVPN MAC/IP Advertisement Route is used to distribute B-MAC
addresses of the PE nodes instead of the C-MAC addresses of
end-stations/hosts. This is because the C-MAC addresses are learnt in
the data-plane for traffic arriving from the core. The MAC
Advertisement Route is encoded as follows:

- The MAC address field contains the B-MAC address.
- The Ethernet Tag field is set to 0.
- The Ethernet Segment Identifier field must be set either to 0 (for
single-homed segments or multi-homed segments with per-ISID
load-balancing) or to MAX-ESI (for multi-homed segments with per-flow
load-balancing). All other values are not permitted.
- All other fields are set as defined in [RFC7432].

This route is tagged with the Route Target (RT) corresponding to its
EVI. This EVI is analogous to a B-VID.

6.3. Inclusive Multicast Ethernet Tag Route

This route is used for multicast pruning per I-SID. It is used for
auto-discovery of PEs participating in a given I-SID so that a
multicast tunnel (MP2P, P2P, P2MP, or MP2MP LSP) can be set up for
that I-SID. [RFC7080] uses multicast pruning per I-SID based on
[MMRP], which is a soft-state protocol. The advantages of multicast
pruning using this BGP route over [MMRP] are that a) it scales very
well for a large number of PEs and b) it works with any type of LSP
(MP2P, P2P, P2MP, or MP2MP), whereas [MMRP] only works over P2P
pseudowires. The Inclusive Multicast Ethernet Tag Route is encoded as
follows:

- The Ethernet Tag field is set with the appropriate I-SID value.
- All other fields are set as defined in [RFC7432].

This route is tagged with an RT. This RT SHOULD be set to a value
corresponding to its EVI (which is analogous to a B-VID). The RT for
this route MAY also be auto-derived from the corresponding Ethernet
Tag (I-SID) based on the procedure specified in section 5.1.2.1 of
[OVERLAY].

6.4. Ethernet Segment Route

This route is used for auto-discovery of member PEs belonging to a
given redundancy group (e.g., attached to a given Ethernet Segment)
per [RFC7432].

6.5. ESI Label Extended Community

This extended community is not used in PBB-EVPN. In [RFC7432], this
extended community is used along with the Ethernet AD route to
advertise an MPLS label for the purpose of split-horizon filtering.
Since in PBB-EVPN, the split-horizon filtering is performed natively
using B-MAC SA, there is no need for this extended community.

6.6. ES-Import Route Target

This RT is used as defined in [RFC7432].

6.7. MAC Mobility Extended Community

This extended community is defined in [RFC7432] and it is used with a
MAC route (B-MAC route in case of PBB-EVPN).
The B-MAC route is tagged with the RT corresponding to its EVI (which
is analogous to a B-VID). When this extended community is used along
with a B-MAC route in PBB-EVPN, it indicates that all C-MAC addresses
associated with that B-MAC address across all corresponding I-SIDs
must be flushed.

When a PE first advertises a B-MAC, it MAY advertise it with this
Extended Community where the sticky/static flag is set to 1 and the
sequence number is set to zero. In such cases, where the PE wants to
signal the stickiness of a B-MAC, when a flush indication is needed
the PE advertises the B-MAC along with the MAC Mobility extended
community where the sticky/static flag remains set and the sequence
number is incremented.

6.8. Default Gateway Extended Community

This extended community is not used in PBB-EVPN.

7. Operation

This section discusses the operation of PBB-EVPN, specifically in
areas where it differs from [RFC7432].

7.1. MAC Address Distribution over Core

In PBB-EVPN, host MAC addresses (i.e. C-MAC addresses) need not be
distributed in BGP. Rather, every PE independently learns the C-MAC
addresses in the data-plane via normal bridging operation. Every PE
has a set of one or more unicast B-MAC addresses associated with it,
and those are the addresses distributed over the core in MAC
Advertisement routes.

7.2. Device Multi-homing

7.2.1. Flow-based Load-balancing

This section describes the procedures for supporting device
multi-homing in an All-Active redundancy mode (i.e., flow-based
load-balancing).

7.2.1.1. PE B-MAC Address Assignment

In [PBB] every BEB is uniquely identified by one or more B-MAC
addresses. These addresses are usually locally administered by the
Service Provider.
For PBB-EVPN, the choice of B-MAC address(es) for the PE nodes must
be examined carefully as it has implications on the proper operation
of multi-homing. In particular, for the scenario where a CE is
multi-homed to a number of PE nodes with All-Active redundancy mode,
a given C-MAC address would be reachable via multiple PE nodes
concurrently. Given that any given remote PE will bind the C-MAC
address to a single B-MAC address, the various PE nodes connected to
the same CE must share the same B-MAC address. Otherwise, the MAC
address table of the remote PE nodes will keep oscillating between
the B-MAC addresses of the various PE devices. For example, consider
the network of Figure 1, and assume that PE1 has B-MAC BM1 and PE2
has B-MAC BM2. Also, assume that both links from CE1 to the PE nodes
are part of the same Ethernet link aggregation group. If BM1 is not
equal to BM2, the consequence is that the MAC address table on PE3
will keep oscillating such that the C-MAC address M1 of CE1 would
flip-flop between BM1 and BM2, depending on the load-balancing
decision on CE1 for traffic destined to the core.

Considering that there could be multiple sites (e.g. CEs) that are
multi-homed to the same set of PE nodes, it is required for all the
PE devices in a Redundancy Group to use a distinct B-MAC address for
each site (shared among the PE nodes attached to that site). This
way, it is possible to achieve fast convergence in the case where a
link or port failure impacts the attachment circuit connecting a
single site to a given PE.

                       +---------+
       +-------+  PE1  | IP/MPLS |
      /                |         |
   CE1                 | Network |     PEr
   M1 \                |         |
       +-------+  PE2  |         |
       /-------+       |         |
      /                |         |
   CE2                 |         |
   M2 \                |         |
       \               |         |
        +------+  PE3  +---------+

              Figure 2: B-MAC Address Assignment

In the example network shown in Figure 2 above, two sites
corresponding to CE1 and CE2 are dual-homed to PE1/PE2 and PE2/PE3,
respectively.
Assume that BM1 is the B-MAC used for the site corresponding to CE1.
Similarly, BM2 is the B-MAC used for the site corresponding to CE2.
On PE1, a single B-MAC address (BM1) is required for the site
corresponding to CE1. On PE2, two B-MAC addresses (BM1 and BM2) are
required, one per site. On PE3, a single B-MAC address (BM2) is
required for the site corresponding to CE2. All three PE nodes would
advertise their respective B-MAC addresses in BGP using the MAC
Advertisement routes defined in [RFC7432]. The remote PE, PEr, would
learn via BGP that BM1 is reachable via PE1 and PE2, whereas BM2 is
reachable via both PE2 and PE3. Furthermore, PEr establishes, via the
PBB bridge learning procedure, that C-MAC M1 is reachable via BM1,
and C-MAC M2 is reachable via BM2. As a result, PEr can load-balance
traffic destined to M1 between PE1 and PE2, as well as traffic
destined to M2 between both PE2 and PE3. In the case of a failure
that causes, for example, CE1 to be isolated from PE1, the latter can
withdraw the route it has advertised for BM1. This way, PEr would
update its path list for BM1, and would send all traffic destined to
M1 over to PE2 only.

For Single-Homed or Single-Active sites, it is possible to assign a
unique B-MAC address per site, or have all the Single-Homed sites or
Single-Active sites connected to a given PE share a single B-MAC
address. The advantage of the first model over the second model is
the ability to avoid C-MAC destination address lookup on the
disposition PE (even though source C-MAC learning is still required
in the data-plane). The disadvantage of the first model compared to
the second model is additional B-MAC advertisements in BGP.
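
The Figure 2 example can be illustrated with a minimal sketch of the
state kept by the remote PE. This is a toy model written in Python for
illustration only: the class and method names (RemotePE,
bgp_advertise, bgp_withdraw, learn, next_hops) are hypothetical and
are not defined by this document or by [RFC7432]. It assumes that the
control plane supplies B-MAC to PE reachability and that the data
plane supplies C-MAC to B-MAC bindings.

```python
from collections import defaultdict


class RemotePE:
    """Toy model of PEr in Figure 2 (all names are illustrative)."""

    def __init__(self):
        # B-MAC -> set of PEs that advertised it in BGP (control plane).
        self.bmac_paths = defaultdict(set)
        # C-MAC -> B-MAC binding learned from PBB frames (data plane).
        self.cmac_to_bmac = {}

    def bgp_advertise(self, pe, bmac):
        self.bmac_paths[bmac].add(pe)

    def bgp_withdraw(self, pe, bmac):
        # Only the path list shrinks; C-MAC bindings are left untouched.
        self.bmac_paths[bmac].discard(pe)

    def learn(self, cmac, bmac):
        self.cmac_to_bmac[cmac] = bmac

    def next_hops(self, cmac):
        bmac = self.cmac_to_bmac.get(cmac)
        if bmac is None:
            return []  # unknown unicast: would be flooded instead
        return sorted(self.bmac_paths[bmac])


per = RemotePE()
for pe, bmac in [("PE1", "BM1"), ("PE2", "BM1"),
                 ("PE2", "BM2"), ("PE3", "BM2")]:
    per.bgp_advertise(pe, bmac)
per.learn("M1", "BM1")
per.learn("M2", "BM2")

print(per.next_hops("M1"))      # ['PE1', 'PE2']
per.bgp_withdraw("PE1", "BM1")  # CE1 becomes isolated from PE1
print(per.next_hops("M1"))      # ['PE2']
```

In this sketch the withdrawal changes only the path list for BM1. The
binding of M1 to BM1 is kept, which is the property that lets an
All-Active site avoid C-MAC flushing on a link or port failure.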

In summary, every PE may use a unicast B-MAC address shared by all
single-homed sites or a unicast B-MAC address per single-homed site
and, in addition, a unicast B-MAC address per All-Active multi-homed
site. In the latter case, the B-MAC address MUST be the same for all
PE nodes in a Redundancy Group connected to the same site.

7.2.1.2. Automating B-MAC Address Assignment

The PE B-MAC address used for Single-Homed or Single-Active sites can
be automatically derived from the hardware (using, e.g., the
backplane's address and/or the PE's reserved MAC pool). However, the
B-MAC address used for All-Active sites must be coordinated among the
RG members. To automate the assignment of this latter address, the PE
can derive this B-MAC address from the MAC Address portion of the
CE's Link Aggregation Control Protocol (LACP) System Identifier by
flipping the 'Locally Administered' bit of the CE's address. This
guarantees the uniqueness of the B-MAC address within the network,
and ensures that all PE nodes connected to the same All-Active CE use
the same value for the B-MAC address.

Note that with this automatic provisioning of the B-MAC address
associated with All-Active CEs, it is not possible to support the
uncommon scenario where a CE has multiple link bundles within the
same LACP session towards the PE nodes, and the service involves
hair-pinning traffic from one bundle to another. This is because the
split-horizon filtering relies on B-MAC addresses rather than Site-ID
Labels (as will be described in the next section). The operator must
explicitly configure the B-MAC address for this fairly uncommon
service scenario.
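
The automatic derivation of the All-Active B-MAC address described in
this section can be sketched as follows. This is an illustration in
Python under stated assumptions, not a normative procedure: the
function name derive_all_active_bmac and the colon-separated textual
MAC format are hypothetical, and the example address is arbitrary.
The only behavior taken from the text above is flipping the 'Locally
Administered' bit, which is bit 0x02 of the first octet of a MAC
address.

```python
def derive_all_active_bmac(lacp_system_mac: str) -> str:
    """Flip the Locally Administered bit of the CE's LACP system MAC.

    Hypothetical helper for illustration; the name and the textual
    MAC format are assumptions, not part of this document.
    """
    octets = [int(part, 16) for part in lacp_system_mac.split(":")]
    if len(octets) != 6 or not all(0 <= o <= 0xFF for o in octets):
        raise ValueError("expected six colon-separated octets")
    # The Locally Administered (U/L) bit is 0x02 in the first octet.
    octets[0] ^= 0x02
    return ":".join(f"{o:02x}" for o in octets)


# Every PE attached to the same CE sees the same LACP system MAC,
# so each PE independently derives the same B-MAC.
ce_system_mac = "00:11:22:33:44:55"  # example value only
bmac_on_pe1 = derive_all_active_bmac(ce_system_mac)
bmac_on_pe2 = derive_all_active_bmac(ce_system_mac)
print(bmac_on_pe1)  # 02:11:22:33:44:55
```

Because the derivation is a pure function of the CE's address, no
coordination protocol is needed between the members of the Redundancy
Group for this value.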

Whenever a B-MAC address is provisioned on the PE, either manually or
automatically (as an outcome of CE auto-discovery), the PE MUST
transmit a MAC Advertisement Route for the B-MAC address with a
downstream assigned MPLS label that uniquely identifies that address
on the advertising PE. The route is tagged with the RTs of the
associated EVIs as described above.

7.2.1.3. Split Horizon and Designated Forwarder Election

[RFC7432] relies on access split horizon, where the Ethernet Segment
Label is used for egress filtering on the attachment circuit in order
to prevent forwarding loops. In PBB-EVPN, the B-MAC source address
can be used for the same purpose, as it uniquely identifies the
originating site of a given frame. As such, Ethernet Segment (ES)
Labels are not used in PBB-EVPN, and the egress split-horizon
filtering is done based on the B-MAC source address. It is worth
noting here that [PBB] defines this B-MAC address based filtering
function as part of the I-Component options, hence no new functions
are required to support split-horizon beyond what is already defined
in [PBB].

The Designated Forwarder election procedures are defined in
[RFC7432].

7.2.2. I-SID Based Load-balancing

This section describes the procedures for supporting device
multi-homing in a Single-Active redundancy mode with per-ISID
load-balancing.

7.2.2.1. PE B-MAC Address Assignment

In the case where per-ISID load-balancing is desired among the PE
nodes in a given redundancy group, multiple unicast B-MAC addresses
are allocated per multi-homed Ethernet Segment: Each PE connected to
the multi-homed segment is assigned a unique B-MAC. Every PE then
advertises its B-MAC address using the BGP MAC advertisement route.
In this mode of operation, two B-MAC address assignment models are
possible:

- The PE may use a shared B-MAC address for multiple Ethernet
Segments (ES's). This includes the single-homed segments as well as
the multi-homed segments operating with per-ISID load-balancing mode.

- The PE may use a dedicated B-MAC address for each ES operating with
per-ISID load-balancing mode.

A PE implementation MAY choose to support either the shared B-MAC
address model or the dedicated B-MAC address model without causing
any interoperability issues.

A remote PE initially floods traffic to a destination C-MAC address,
located in a given multi-homed Ethernet Segment, to all the PE nodes
configured with that I-SID. Then, when reply traffic arrives at the
remote PE, it learns (in the data-path) the B-MAC address and
associated next-hop PE to use for said C-MAC address.

7.2.2.2. Split Horizon and Designated Forwarder Election

The procedures are similar to the flow-based load-balancing case,
with the only difference being that the DF filtering must be applied
to unicast as well as multicast traffic, and in both core-to-segment
as well as segment-to-core directions.

7.2.2.3. Handling Failure Scenarios

When a PE connected to a multi-homed Ethernet Segment loses
connectivity to the segment, due to link or port failure, it needs to
notify the remote PEs to trigger C-MAC address flushing. This can be
achieved in one of two ways, depending on the B-MAC assignment model:

- If the PE uses a shared B-MAC address for multiple ES's, then the
C-MAC flushing is signaled by means of having the failed PE
re-advertise the MAC Advertisement route for the associated B-MAC,
tagged with the MAC Mobility Extended Community attribute. The value
of the Counter field in that attribute must be incremented prior to
advertisement.
This causes the remote PE nodes to flush all C-MAC addresses
associated with the B-MAC in question. This is done across all I-SIDs
that are mapped to the EVI of the withdrawn MAC route.

- If the PE uses a dedicated B-MAC address for each Ethernet Segment
operating under per-ISID load-balancing mode, the failed PE simply
withdraws the B-MAC route previously advertised for that segment.
This causes the remote PE nodes to flush all C-MAC addresses
associated with the B-MAC in question. This is done across all I-SIDs
that are mapped to the EVI of the withdrawn MAC route.

When a PE connected to a multi-homed Ethernet Segment fails (i.e.
node failure) or when the PE becomes completely isolated from the
EVPN network, the remote PEs will start purging the MAC Advertisement
routes that were advertised by the failed PE. This is done either as
an outcome of the remote PEs detecting that the BGP session to the
failed PE has gone down, or by having a Route Reflector withdraw all
the routes that were advertised by the failed PE. The remote PEs, in
this case, will perform C-MAC address flushing as an outcome of the
MAC Advertisement route withdrawals.

For all failure scenarios (link/port failure, node failure and PE
node isolation), when the fault condition clears, the recovered PE
re-advertises the associated Ethernet Segment route to other members
of its Redundancy Group. This triggers the backup PE(s) in the
Redundancy Group to block the I-SIDs for which the recovered PE is a
DF. When a backup PE blocks the I-SIDs, it triggers a C-MAC address
flush notification to the remote PEs by re-advertising the MAC
Advertisement route for the associated B-MAC, with the MAC Mobility
Extended Community attribute. The value of the Counter field in that
attribute must be incremented prior to advertisement.
This causes the remote PE nodes to flush all C-MAC addresses
associated with the B-MAC in question. This is done across all I-SIDs
that are mapped to the EVI of the withdrawn/readvertised MAC route.

7.3. Network Multi-homing

When an Ethernet network is multi-homed to a set of PE nodes running
PBB-EVPN, the Single-Active redundancy model can be supported with
per service instance (i.e. I-SID) load-balancing. In this model, DF
election is performed to ensure that a single PE node in the
redundancy group is responsible for forwarding traffic associated
with a given I-SID. This guarantees that no forwarding loops are
created. Filtering based on DF state applies to both unicast and
multicast traffic, and in both access-to-core as well as
core-to-access directions, just like the Single-Active multi-homed
device scenario (but unlike the All-Active multi-homed device
scenario, where DF filtering is limited to multi-destination frames
in the core-to-access direction). Similar to the Single-Active
multi-homed device scenario, with I-SID based load-balancing, a
unique B-MAC address is assigned to each of the PE nodes connected to
the multi-homed network (Segment).

7.4. Frame Forwarding

The frame forwarding functions are divided between the Bridge Module,
which hosts the [PBB] Backbone Edge Bridge (BEB) functionality, and
the MPLS Forwarder, which handles the MPLS imposition/disposition.
The details of frame forwarding for unicast and multi-destination
frames are discussed next.

7.4.1. Unicast

Known unicast traffic received from the AC will be PBB-encapsulated
by the PE using the B-MAC source address corresponding to the
originating site. The unicast B-MAC destination address is determined
based on a lookup of the C-MAC destination address (the binding of
the two is done via transparent learning of reverse traffic).
The resulting frame is then encapsulated with an LSP tunnel label and
the MPLS label which uniquely identifies the B-MAC destination
address on the egress PE. If per flow load-balancing over ECMPs in
the MPLS core is required, then a flow label is added below the label
associated with the B-MAC address in the label stack.

For unknown unicast traffic, the PE forwards these frames over the
MPLS core, using the same set of options used for forwarding
multicast/broadcast frames (as described in the next section).

7.4.2. Multicast/Broadcast

Multi-destination frames received from the AC will be
PBB-encapsulated by the PE using the B-MAC source address
corresponding to the originating site. The multicast B-MAC
destination address is selected based on the value of the I-SID as
defined in [PBB]. The resulting frame is then forwarded over the MPLS
core using one of the following two options:

Option 1: the MPLS Forwarder can perform ingress replication over a
set of MP2P or P2P tunnel LSPs. The frame is encapsulated with a
tunnel LSP label and the EVPN ingress replication label advertised in
the Inclusive Multicast Route.

Option 2: the MPLS Forwarder can use P2MP tunnel LSP per the
procedures defined in [RFC7432]. This includes either the use of
Inclusive or Aggregate Inclusive trees. Furthermore, the MPLS
Forwarder can use MP2MP tunnel LSP if Inclusive trees are used.

Note that the same procedures for advertising and handling the
Inclusive Multicast Route defined in [RFC7432] apply here.

7.5. MPLS Encapsulation of PBB Frames

The encapsulation for the transport of PBB frames over MPLS is
similar to that of classical Ethernet, albeit with the additional PBB
header, as shown in the figure below:

          +------------------+
          | IP/MPLS Header   |
          +------------------+
          | PBB Header       |
          +------------------+
          | Ethernet Header  |
          +------------------+
          | Ethernet Payload |
          +------------------+
          | Ethernet FCS     |
          +------------------+

          Figure 8: PBB over MPLS Encapsulation

8. Minimizing ARP/ND Broadcast

The PE nodes MAY implement an ARP/ND-proxy function in order to
minimize the volume of ARP/ND traffic that is broadcast over the
MPLS network. In case of ARP proxy, this is achieved by having each
PE node snoop on ARP request and response messages received over the
access interfaces or the MPLS core. The PE builds a cache of IP / MAC
address bindings from these snooped messages. The PE then uses this
cache to respond to ARP requests that are received on access ports
and that target hosts in remote sites. If the PE finds a match for
the IP address in its ARP cache, it responds back to the requesting
host and drops the request. Otherwise, if it does not find a match,
then the request is flooded over the MPLS network using either
ingress replication or P2MP LSPs. In case of ND proxy, this is
achieved in a similar manner but with ND/NA messages per [RFC4389].

9. Seamless Interworking with IEEE 802.1aq/802.1Qbp

                         +--------------+
                         |              |
        +---------+      |     MPLS     |     +---------+
+----+  |         |   +----+          +----+  |         |  +----+
|SW1 |--|         |   | PE1|          | PE2|  |         |--| SW3|
+----+  | 802.1aq |---|    |          |    |--| 802.1aq |  +----+
+----+  |  .1Qbp  |   +----+          +----+  |  .1Qbp  |  +----+
|SW2 |--|         |      |   Backbone   |     |         |--| SW4|
+----+  +---------+      +--------------+     +---------+  +----+

|<------ IS-IS -------->|<-----BGP----->|<------ IS-IS ------>|  CP

|<------------------------- PBB -------------------------->|  DP
                        |<----MPLS----->|

Legend: CP = Control Plane View
        DP = Data Plane View

Figure 7: Interconnecting 802.1aq/802.1Qbp Networks with PBB-EVPN

9.1. B-MAC Address Assignment

The B-MAC addresses need to be globally unique across all networks
including PBB-EVPN and IEEE 802.1aq / 802.1Qbp networks. The B-MAC
addresses used for Single-Home and Single-Active Ethernet Segments
should be unique because they are typically auto-derived from the
PE's pools of reserved MAC addresses that are unique. The B-MAC
addresses used for All-Active Ethernet Segments should also be unique
given that each network operator typically has its own assigned
Organizationally Unique Identifier (OUI) and thus can assign its own
unique MAC addresses.

9.2. IEEE 802.1aq / 802.1Qbp B-MAC Address Advertisement

B-MAC addresses associated with 802.1aq / 802.1Qbp switches are
advertised using the EVPN MAC/IP route advertisement already defined
in [RFC7432].

9.4. Operation:

When a PE receives a PBB-encapsulated Ethernet frame from the access
side, it performs a lookup on the B-MAC destination address to
identify the next hop. If the lookup yields that the next hop is a
remote PE, the local PE would then encapsulate the PBB frame in MPLS.
The label stack comprises the VPN label (advertised by the remote
PE), followed by an LSP/IGP label.
From that point onwards, regular MPLS forwarding is applied.

On the disposition PE, assuming penultimate-hop-popping is employed,
the PE receives the MPLS-encapsulated PBB frame with a single label:
the VPN label. The value of the label indicates to the disposition PE
that this is a PBB frame, so the label is popped, the TTL field (in
the 802.1Qbp F-Tag) is reinitialized and normal PBB processing is
employed from this point onwards.

10. Solution Advantages

In this section, we discuss the advantages of the PBB-EVPN solution
in the context of the requirements set forth in section 3 above.

10.1. MAC Advertisement Route Scalability

In PBB-EVPN the number of MAC Advertisement Routes is a function of
the number of Ethernet Segments (e.g., sites), rather than the number
of hosts/servers. This is because the B-MAC addresses of the PEs,
rather than C-MAC addresses (of hosts/servers), are being advertised
in BGP. As discussed above, there is a one-to-one mapping between
All-Active multi-homed segments and their associated B-MAC addresses,
there can be either a one-to-one or many-to-one mapping between
Single-Active multi-homed segments and their associated B-MAC
addresses, and finally there is a many-to-one mapping between
single-home sites and their associated B-MAC addresses on a given PE.
This means a single B-MAC is associated with one or more segments
where each segment can be associated with many C-MAC addresses. As a
result, the volume of MAC Advertisement Routes in PBB-EVPN may be
multiple orders of magnitude less than EVPN.

10.2. C-MAC Mobility Independent of B-MAC Advertisements

As described above, in PBB-EVPN, a single B-MAC address can aggregate
many C-MAC addresses. Given that B-MAC addresses are associated with
segments attached to a PE or to the PE itself, their locations are
fixed and thus not impacted whatsoever by C-MAC mobility.
Therefore, C-MAC mobility does not affect B-MAC addresses (e.g., any
re-advertisements of them). This is because the C-MAC address to
B-MAC address association is learnt in the data-plane and C-MAC
addresses are not advertised in BGP. Aggregation via B-MAC addresses
in PBB-EVPN therefore handles C-MAC mobility with fewer BGP route
updates than EVPN.

To illustrate how this compares to EVPN, consider the following
example:

If a PE running EVPN advertises reachability for N MAC addresses via
a particular segment, and then 50% of the MAC addresses in that
segment move to other segments (e.g. due to virtual machine
mobility), then N/2 additional MAC Advertisement routes need to be
sent for the MAC addresses that have moved. With PBB-EVPN, on the
other hand, the B-MAC addresses, which are statically associated with
PE nodes, are not subject to mobility. As C-MAC addresses move from
one segment to another, the binding of C-MAC to B-MAC addresses is
updated via data-plane learning in PBB-EVPN.

10.3. C-MAC Address Learning and Confinement

In PBB-EVPN, C-MAC address reachability information is built via
data-plane learning. As such, PE nodes not participating in active
conversations involving a particular C-MAC address will purge that
address from their forwarding tables. Furthermore, since C-MAC
addresses are not distributed in BGP, PE nodes will not maintain any
record of them in the control-plane routing table.

10.4. Seamless Interworking with 802.1aq Access Networks

Consider the scenario where two access networks, one running MPLS and
the other running 802.1aq, are interconnected via an MPLS backbone
network. The figure below shows such an example network.

                         +--------------+
                         |              |
        +---------+      |     MPLS     |     +---------+
+----+  |         |   +----+          +----+  |         |  +----+
| CE |--|         |   | PE1|          | PE2|  |         |--| CE |
+----+  | 802.1aq |---|    |          |    |--|  MPLS   |  +----+
+----+  |         |   +----+          +----+  |         |  +----+
| CE |--|         |      |   Backbone   |     |         |--| CE |
+----+  +---------+      +--------------+     +---------+  +----+

Figure 9: Interoperability with 802.1aq

If the MPLS backbone network employs EVPN, then the 802.1aq
data-plane encapsulation must be terminated on PE1 or the edge device
connecting to PE1. Either way, all the PE nodes that are part of the
associated service instances will be exposed to all the C-MAC
addresses of all hosts/servers connected to the access networks.
However, if the MPLS backbone network employs PBB-EVPN, then the
802.1aq encapsulation can be extended over the MPLS backbone, thereby
maintaining C-MAC address transparency on PE1. If PBB-EVPN is also
extended over the MPLS access network on the right, then C-MAC
addresses would be transparent to PE2 as well.

10.5. Per Site Policy Support

In PBB-EVPN, per site policy can be supported by assigning a unique
B-MAC address to every site/segment (typically multi-homed but can
also be single-homed). Given that the B-MAC addresses are sent in BGP
MAC/IP route advertisement, it is possible to define per site (i.e.
B-MAC) forwarding policies including policies for E-TREE service.

10.6. No C-MAC Address Flushing for All-Active Multi-Homing

Just as in [RFC7432], with PBB-EVPN, it is possible to avoid C-MAC
address flushing upon a topology change affecting an All-Active
multi-homed segment. To illustrate this, consider the example network
of Figure 1. Both PE1 and PE2 advertise the same B-MAC address (BM1)
to PE3. PE3 then learns the C-MAC addresses of the servers/hosts
behind CE1 via data-plane learning.
If AC1 fails, then PE3 does not need to
flush any of the C-MAC addresses learnt and associated with BM1. This
is because PE1 will withdraw the MAC Advertisement routes associated
with BM1, thereby leading PE3 to have a single adjacency (to PE2) for
this B-MAC address. Therefore, the topology change is communicated to
PE3 and no C-MAC address flushing is required.

11. Acknowledgements

The authors would like to thank Jose Liste and Patrice Brissette for
their reviews of and comments on this document. We would also like to
thank Giles Heron for several rounds of reviews and for providing
valuable input to get this draft ready for IESG submission.

12. Security Considerations

All the security considerations in [RFC7432] apply directly to this
document because this document leverages the [RFC7432] control plane
and its associated procedures - although not the complete set but
rather a subset.

This draft does not introduce any new security considerations beyond
those of [RFC7432] and [RFC4761] because advertisement and processing
of B-MAC addresses follow that of [RFC7432], and processing of C-MAC
addresses follows that of [RFC4761] - i.e., B-MAC addresses are
learned in the control plane and C-MAC addresses are learned in the
data plane.

13. IANA Considerations

There are no additional IANA considerations for PBB-EVPN beyond what
is already described in [RFC7432].

14. Normative References

[RFC7432] A. Sajassi, et al., "BGP MPLS Based Ethernet VPN", RFC
7432, February 2015.

[PBB] Clauses 25 and 26 of "IEEE Standard for Local and
metropolitan area networks - Media Access Control (MAC)
Bridges and Virtual Bridged Local Area Networks", IEEE Std
802.1Q, 2013.

15. Informative References

[RFC7080] A. Sajassi, et al., "Virtual Private LAN Service (VPLS)
Interoperability with Provider Backbone Bridges", RFC
7080, December 2013.

[RFC7209] A. Sajassi, et al., "Requirements for Ethernet VPN (EVPN)",
RFC 7209, May 2014.

[RFC4389] D. Thaler, et al., "Neighbor Discovery Proxies (ND
Proxy)", RFC 4389, April 2006.

[RFC4761] K. Kompella, et al., "Virtual Private LAN Service (VPLS)
Using BGP for Auto-Discovery and Signaling", RFC 4761,
January 2007.

[OVERLAY] A. Sajassi, et al., "A Network Virtualization Overlay
Solution using EVPN", draft-ietf-bess-evpn-overlay-01,
work in progress, February 2015.

[MMRP] Clause 10 of "IEEE Standard for Local and metropolitan
area networks - Media Access Control (MAC) Bridges and
Virtual Bridged Local Area Networks", IEEE Std 802.1Q,
2013.

16. Authors' Addresses

Ali Sajassi
Cisco
170 West Tasman Drive
San Jose, CA 95134, US
Email: sajassi@cisco.com

Samer Salam
Cisco
595 Burrard Street, Suite # 2123
Vancouver, BC V7X 1J1, Canada
Email: ssalam@cisco.com

Nabil Bitar
Verizon Communications
Email: nabil.n.bitar@verizon.com

Aldrin Isaac
Bloomberg
Email: aisaac71@bloomberg.net

Wim Henderickx
Alcatel-Lucent
Email: wim.henderickx@alcatel-lucent.be