Internet Working Group                                       Ali Sajassi
Internet Draft                                               Samer Salam
Category: Standards Track                                   Sami Boutros
                                                                   Cisco

Florin Balus                                                 Nabil Bitar
Wim Henderickx                                                   Verizon
Alcatel-Lucent
                                                            Aldrin Isaac
Clarence Filsfils                                              Bloomberg
Dennis Cai
Cisco                                                        Lizhong Jin
                                                                     ZTE

Expires: August 25, 2013                               February 25, 2013

                                PBB-EVPN
                      draft-ietf-l2vpn-pbb-evpn-04

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/1id-abstracts.html

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

Copyright and License Notice

   Copyright (c) 2013 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Abstract

   This document discusses how Ethernet Provider Backbone Bridging
   [802.1ah] can be combined with E-VPN in order to reduce the number of
   BGP MAC advertisement routes by aggregating Customer/Client MAC
   (C-MAC) addresses via Provider Backbone MAC address (B-MAC), provide
   client MAC address mobility using C-MAC aggregation and B-MAC
   sub-netting, confine the scope of C-MAC learning to only active
   flows, offer per-site policies and avoid C-MAC address flushing on
   topology changes. The combined solution is referred to as PBB-EVPN.

Conventions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119.

Table of Contents

   1. Introduction
   2. Contributors
   3. Terminology
   4. Requirements
     4.1. MAC Advertisement Route Scalability
     4.2. C-MAC Mobility with MAC Summarization
     4.3. C-MAC Address Learning and Confinement
     4.4. Per Site Policy Support
     4.5. Avoiding C-MAC Address Flushing
   5. Solution Overview
   6. BGP Encoding
     6.1. BGP MAC Advertisement Route
     6.2. Ethernet Auto-Discovery Route
     6.3. Per VPN Route Targets
     6.4. MAC Mobility Extended Community
   7. Operation
     7.1. MAC Address Distribution over Core
     7.2. Device Multi-homing
       7.2.1 Flow-based Load-balancing
         7.2.1.1 PE B-MAC Address Assignment
         7.2.1.2. Automating B-MAC Address Assignment
         7.2.1.3 Split Horizon and Designated Forwarder Election
       7.2.2 I-SID Based Load-balancing
         7.2.2.1 PE B-MAC Address Assignment
         7.2.2.2 Split Horizon and Designated Forwarder Election
     7.3. Network Multi-homing
     7.4. Frame Forwarding
       7.4.1. Unicast
       7.4.2. Multicast/Broadcast
   8. Minimizing ARP Broadcast
   9. Seamless Interworking with IEEE 802.1aq/802.1Qbp
     9.1 B-MAC Address Assignment
     9.2 IEEE 802.1aq / 802.1Qbp B-MAC Advertisement Route
     9.3 Operation:
   10. Solution Advantages
     10.1. MAC Advertisement Route Scalability
     10.2. C-MAC Mobility with MAC Sub-netting
     10.3. C-MAC Address Learning and Confinement
     10.4. Seamless Interworking with TRILL and 802.1aq Access
           Networks
     10.5. Per Site Policy Support
     10.6. Avoiding C-MAC Address Flushing
   11. Acknowledgements
   12. Security Considerations
   13. IANA Considerations
   14. Intellectual Property Considerations
   15. Normative References
   16. Informative References
   17. Authors' Addresses

1. Introduction

   [E-VPN] introduces a solution for multipoint L2VPN services, with
   advanced multi-homing capabilities, using BGP for distributing
   customer/client MAC address reachability information over the core
   MPLS/IP network. [802.1ah] defines an architecture for Ethernet
   Provider Backbone Bridging (PBB), where MAC tunneling is employed to
   improve service instance and MAC address scalability in Ethernet as
   well as VPLS networks [PBB-VPLS].

   In this document, we discuss how PBB can be combined with E-VPN in
   order to: reduce the number of BGP MAC advertisement routes by
   aggregating Customer/Client MAC (C-MAC) addresses via Provider
   Backbone MAC address (B-MAC), provide client MAC address mobility
   using C-MAC aggregation and B-MAC sub-netting, confine the scope of
   C-MAC learning to only active flows, offer per-site policies and
   avoid C-MAC address flushing on topology changes. The combined
   solution is referred to as PBB-EVPN.

2. Contributors

   In addition to the authors listed above, the following individuals
   also contributed to this document.

   Keyur Patel, Cisco
   Sam Aldrin, Huawei
   Himanshu Shah, Ciena

3. Terminology

   BEB:   Backbone Edge Bridge
   B-MAC: Backbone MAC Address
   CE:    Customer Edge
   C-MAC: Customer/Client MAC Address
   DHD:   Dual-homed Device
   DHN:   Dual-homed Network
   LACP:  Link Aggregation Control Protocol
   LSM:   Label Switched Multicast
   MDT:   Multicast Delivery Tree
   MP2MP: Multipoint to Multipoint
   P2MP:  Point to Multipoint
   P2P:   Point to Point
   PE:    Provider Edge
   PoA:   Point of Attachment
   PW:    Pseudowire
   E-VPN: Ethernet VPN

4. Requirements

   The requirements for PBB-EVPN include all the requirements for E-VPN
   that were described in [EVPN-REQ], in addition to the following:

4.1. MAC Advertisement Route Scalability

   In typical operation, an [E-VPN] PE sends a BGP MAC Advertisement
   Route per customer/client MAC (C-MAC) address. In certain
   applications, this poses scalability challenges, as is the case in
   virtualized data center environments where the number of virtual
   machines (VMs), and hence the number of C-MAC addresses, can be in
   the millions. In such scenarios, it is required to reduce the number
   of BGP MAC Advertisement routes by relying on a 'MAC summarization'
   scheme, as is provided by PBB. Note that the MAC summarization
   capability already built into E-VPN is not sufficient in those
   environments, as will be discussed next.

4.2. C-MAC Mobility with MAC Summarization

   Certain applications, such as virtual machine mobility, require
   support for fast C-MAC address mobility. For these applications, it
   is not possible to use MAC address summarization in E-VPN, i.e.
   advertise reachability to a MAC address prefix. Rather, the exact
   virtual machine MAC address needs to be transmitted in the BGP MAC
   Advertisement route. Otherwise, traffic would be forwarded to the
   wrong segment when a virtual machine moves from one Ethernet segment
   to another. This hinders the scalability benefits of summarization.

   It is required to support C-MAC address mobility, while retaining the
   scalability benefits of MAC summarization. This can be achieved by
   leveraging PBB technology, which defines a Backbone MAC (B-MAC)
   address space that is independent of the C-MAC address space, by
   aggregating C-MAC addresses via a B-MAC address, and then by applying
   summarization to B-MAC addresses.

4.3. C-MAC Address Learning and Confinement

   In E-VPN, all the PE nodes participating in the same E-VPN instance
   are exposed to all the C-MAC addresses learnt by any one of these PE
   nodes because a C-MAC learned by one of the PE nodes is advertised in
   BGP to other PE nodes in that E-VPN instance. This is the case even
   if some of the PE nodes for that E-VPN instance are not involved in
   forwarding traffic to, or from, these C-MAC addresses. Even if an
   implementation does not install hardware forwarding entries for C-MAC
   addresses that are not part of active traffic flows on that PE, the
   device memory is still consumed by keeping record of the C-MAC
   addresses in the routing table (RIB). In network applications with
   millions of C-MAC addresses, this introduces a non-trivial waste of
   PE resources. As such, it is required to confine the scope of
   visibility of C-MAC addresses only to those PE nodes that are
   actively involved in forwarding traffic to, or from, these addresses.

4.4. Per Site Policy Support

   In many applications, it is required to be able to enforce
   connectivity policy rules at the granularity of a site (or segment).
   This includes the ability to control which PE nodes in the network
   can forward traffic to, or from, a given site. PBB-EVPN is capable of
   providing this granularity of policy control. In the case where per
   C-MAC address granularity is required, the EVI can always continue to
   operate in E-VPN mode.

4.5. Avoiding C-MAC Address Flushing

   It is required to avoid C-MAC address flushing upon link, port or
   node failure for multi-homed devices and networks. This is in order
   to speed up re-convergence upon failure.

5. Solution Overview

   The solution involves incorporating IEEE 802.1ah Backbone Edge Bridge
   (BEB) functionality on the E-VPN PE nodes similar to PBB-VPLS, where
   BEB functionality is incorporated in the VPLS PE nodes.
   The PE devices would then receive 802.1Q Ethernet frames from their
   attachment circuits, encapsulate them in the PBB header and forward
   the frames over the IP/MPLS core. On the egress E-VPN PE, the PBB
   header is removed following the MPLS disposition, and the original
   802.1Q Ethernet frame is delivered to the customer equipment.

                   BEB   +--------------+  BEB
                   ||    |              |  ||
                   \/    |              |  \/
       +----+ AC1 +----+ |              | +----+   +----+
       | CE1|-----|    | |              | |    |---| CE2|
       +----+\    | PE1| |   IP/MPLS    | | PE3|   +----+
              \   +----+ |   Network    | +----+
               \         |              |
             AC2\ +----+ |              |
                 \|    | |              |
                  | PE2| |              |
                  +----+ |              |
                    /\   +--------------+
                    ||
                    BEB
       <-802.1Q-> <------PBB over MPLS------> <-802.1Q->

                     Figure 1: PBB-EVPN Network

   The PE nodes perform the following functions:

   - Learn customer/client MAC addresses (C-MACs) over the attachment
   circuits in the data-plane, per normal bridge operation.

   - Learn remote C-MAC to B-MAC bindings in the data-plane from traffic
   arriving from the core, per [802.1ah] bridging operation.

   - Advertise local B-MAC address reachability information in BGP to
   all other PE nodes in the same set of service instances. Note that
   every PE has a set of local B-MAC addresses that uniquely identify
   the device. PE addressing is discussed further in Section 7.2.1.1.

   - Build a forwarding table from remote BGP advertisements received
   associating remote B-MAC addresses with remote PE IP addresses and
   the associated MPLS label(s).

6. BGP Encoding

   PBB-EVPN leverages the same BGP Routes and Attributes defined in
   [E-VPN], adapted as follows:

6.1. BGP MAC Advertisement Route

   The E-VPN MAC Advertisement Route is used to distribute B-MAC
   addresses of the PE nodes instead of the C-MAC addresses of
   end-stations/hosts. This is because the C-MAC addresses are learnt in
   the data-plane for traffic arriving from the core.
   The MAC Advertisement Route is encoded as follows:

   - The MAC address field contains the B-MAC address.
   - The Ethernet Tag field is set to 0.
   - The Ethernet Segment Identifier field must be set either to 0 (for
   single-homed Segments) or to MAX-ESI (for multi-homed Segments). All
   other values are not permitted.

   The route is tagged with the RT corresponding to the EVI associated
   with the B-MAC address.

   All other fields are set as defined in [E-VPN].

6.2. Ethernet Auto-Discovery Route

   This route and all of its associated modes are not needed in
   PBB-EVPN.

   The receiving PE knows that it need not wait for the receipt of the
   Ethernet A-D route for route resolution by means of the reserved ESI
   encoded in the MAC Advertisement route: the ESI values of 0 and
   MAX-ESI indicate that the receiving PE can resolve the path without
   an Ethernet A-D route.

6.3. Per VPN Route Targets

   PBB-EVPN uses the same set of route targets defined in [E-VPN]. A
   future revision of this document will describe new RT types.

6.4. MAC Mobility Extended Community

   This extended community is defined in [E-VPN]. When used in PBB-EVPN,
   it indicates that the C-MAC forwarding tables for the I-SIDs
   associated with the RT tagging the MAC Advertisement route must be
   flushed.

   Note that all other BGP messages and/or attributes are used as
   defined in [E-VPN].

7. Operation

   This section discusses the operation of PBB-EVPN, specifically in
   areas where it differs from [E-VPN].

7.1. MAC Address Distribution over Core

   In PBB-EVPN, host MAC addresses (i.e. C-MAC addresses) need not be
   distributed in BGP. Rather, every PE independently learns the C-MAC
   addresses in the data-plane via normal bridging operation. Every PE
   has a set of one or more unicast B-MAC addresses associated with it,
   and those are the addresses distributed over the core in MAC
   Advertisement routes.
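
   As an illustration only, the encoding rules of Section 6.1 and the
   distribution model of Section 7.1 can be sketched in Python as shown
   below. The sketch models just the fields that PBB-EVPN constrains.
   The names MacAdvRoute, make_bmac_route and is_valid_pbb_evpn_route
   are hypothetical, and the 10-octet ESI width and the all-ones value
   used for MAX-ESI are assumptions based on the E-VPN specification
   rather than statements made by this document.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple

ESI_LEN = 10                  # assumption: ESI is 10 octets wide
ESI_ZERO = bytes(ESI_LEN)     # ESI 0: single-homed segment
MAX_ESI = b"\xff" * ESI_LEN   # assumption: MAX-ESI is all ones


@dataclass(frozen=True)
class MacAdvRoute:
    """Hypothetical subset of the MAC Advertisement Route fields."""
    mac: bytes                        # carries a B-MAC, never a C-MAC
    ethernet_tag: int                 # always 0 in PBB-EVPN
    esi: bytes                        # 0 or MAX-ESI only
    mpls_label: int                   # label identifying the B-MAC on the PE
    route_targets: Tuple[str, ...]    # RT(s) of the associated EVI(s)


def make_bmac_route(bmac: str, multi_homed: bool, mpls_label: int,
                    route_targets: Iterable[str]) -> MacAdvRoute:
    """Build the route a PE would advertise for one local B-MAC."""
    mac = bytes.fromhex(bmac.replace(":", ""))
    if len(mac) != 6:
        raise ValueError("a B-MAC address is 48 bits long")
    esi = MAX_ESI if multi_homed else ESI_ZERO
    return MacAdvRoute(mac, 0, esi, mpls_label, tuple(route_targets))


def is_valid_pbb_evpn_route(route: MacAdvRoute) -> bool:
    """Check the two constraints of Section 6.1 on a received route."""
    return route.ethernet_tag == 0 and route.esi in (ESI_ZERO, MAX_ESI)
```

   A receiving PE could apply a check such as is_valid_pbb_evpn_route
   before installing the B-MAC. Since only the two reserved ESI values
   are accepted, such a route never needs to wait for an Ethernet A-D
   route to be resolved, as noted in Section 6.2.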

7.2. Device Multi-homing

7.2.1 Flow-based Load-balancing

   This section describes the procedures for supporting device
   multi-homing in an all-active redundancy model with flow-based
   load-balancing.

7.2.1.1 PE B-MAC Address Assignment

   In [802.1ah] every BEB is uniquely identified by one or more B-MAC
   addresses. These addresses are usually locally administered by the
   Service Provider. For PBB-EVPN, the choice of B-MAC address(es) for
   the PE nodes must be examined carefully as it has implications on the
   proper operation of multi-homing. In particular, for the scenario
   where a CE is multi-homed to a number of PE nodes with all-active
   redundancy and flow-based load-balancing, a given C-MAC address would
   be reachable via multiple PE nodes concurrently. Because a given
   remote PE binds the C-MAC address to a single B-MAC address, the
   various PE nodes connected to the same CE must share the same B-MAC
   address. Otherwise, the MAC address table of the remote PE nodes
   will keep oscillating between the B-MAC addresses of the various PE
   devices. For example, consider the network of Figure 1, and assume
   that PE1 has B-MAC BM1 and PE2 has B-MAC BM2. Also, assume that both
   links from CE1 to the PE nodes are part of an all-active
   multi-chassis Ethernet link aggregation group. If BM1 is not equal to
   BM2, the consequence is that the MAC address table on PE3 will keep
   oscillating such that the C-MAC address CM of CE1 would flip-flop
   between BM1 and BM2, depending on the load-balancing decision on CE1
   for traffic destined to the core.

   Considering that there could be multiple sites (e.g. CEs) that are
   multi-homed to the same set of PE nodes, the PE devices in a
   Redundancy Group are required to use a distinct B-MAC address for
   each site.
   This way, it is possible to achieve fast convergence in the
   case where a link or port failure impacts the attachment circuit
   connecting a single site to a given PE.

                          +---------+
           +-------+ PE1  | IP/MPLS |
          /               |         |
       CE1                | Network |   PEr
        M1 \              |         |
            +-------+ PE2 |         |
            /-------+     |         |
           /              |         |
       CE2                |         |
        M2 \              |         |
            \             |         |
             +------+ PE3 +---------+

                Figure 2: B-MAC Address Assignment

   In the example network shown in Figure 2 above, two sites
   corresponding to CE1 and CE2 are dual-homed to PE1/PE2 and PE2/PE3,
   respectively. Assume that BM1 is the B-MAC used for the site
   corresponding to CE1. Similarly, BM2 is the B-MAC used for the site
   corresponding to CE2. On PE1, a single B-MAC address (BM1) is
   required for the site corresponding to CE1. On PE2, two B-MAC
   addresses (BM1 and BM2) are required, one per site. Whereas on PE3, a
   single B-MAC address (BM2) is required for the site corresponding to
   CE2. All three PE nodes would advertise their respective B-MAC
   addresses in BGP using the MAC Advertisement routes defined in
   [E-VPN]. The remote PE, PEr, would learn via BGP that BM1 is
   reachable via PE1 and PE2, whereas BM2 is reachable via both PE2 and
   PE3. Furthermore, PEr establishes via the normal bridge learning that
   C-MAC M1 is reachable via BM1, and C-MAC M2 is reachable via BM2. As
   a result, PEr can load-balance traffic destined to M1 between PE1 and
   PE2, as well as traffic destined to M2 between both PE2 and PE3. In
   the case of a failure that causes, for example, CE1 to be isolated
   from PE1, the latter can withdraw the route it has advertised for
   BM1. This way, PEr would update its path list for BM1, and will send
   all traffic destined to M1 over to PE2 only.

   For single-homed sites, it is possible to assign a unique B-MAC
   address per site, or have all the single-homed sites connected to a
   given PE share a single B-MAC address.
   The advantage of the first
   model over the second model is the ability to avoid C-MAC destination
   address lookup on the disposition PE (even though source C-MAC
   learning is still required in the data-plane). Also, by assigning the
   B-MAC addresses from a contiguous range, it is possible to advertise
   a single B-MAC subnet for all single-homed sites, thereby bringing
   the number of MAC advertisement routes required on par with the
   second model.

   In summary, every PE may use a unicast B-MAC address shared by all
   single-homed CEs or a unicast B-MAC address per single-homed CE and,
   in addition, a unicast B-MAC address per dual-homed CE. In the latter
   case, the B-MAC address MUST be the same for all PE nodes in a
   Redundancy Group connected to the same CE.

7.2.1.2. Automating B-MAC Address Assignment

   The PE B-MAC address used for single-homed sites can be automatically
   derived from the hardware (e.g. from the backplane's address).
   However, the B-MAC address used for multi-homed sites must be
   coordinated among the RG members. To automate the assignment of this
   latter address, the PE can derive this B-MAC address from the MAC
   Address portion of the CE's LACP System Identifier by flipping the
   'Locally Administered' bit of the CE's address. This guarantees the
   uniqueness of the B-MAC address within the network, and ensures that
   all PE nodes connected to the same multi-homed CE use the same value
   for the B-MAC address.

   Note that with this automatic provisioning of the B-MAC address
   associated with multi-homed CEs, it is not possible to support the
   uncommon scenario where a CE has multiple bundles towards the PE
   nodes, and the service involves hair-pinning traffic from one bundle
   to another. This is because the split-horizon filtering relies on
   B-MAC addresses rather than Site-ID Labels (as will be described in
   the next section).
   The operator must explicitly configure the B-MAC
   address for this fairly uncommon service scenario.

   Whenever a B-MAC address is provisioned on the PE, either manually or
   automatically (as an outcome of CE auto-discovery), the PE MUST
   transmit a MAC Advertisement Route for the B-MAC address with a
   downstream assigned MPLS label that uniquely identifies that address
   on the advertising PE. The route is tagged with the RTs of the
   associated EVIs as described above.

7.2.1.3 Split Horizon and Designated Forwarder Election

   [E-VPN] relies on access split horizon, where the Ethernet Segment
   Label is used for egress filtering on the attachment circuit in order
   to prevent forwarding loops. In PBB-EVPN, the B-MAC source address
   can be used for the same purpose, as it uniquely identifies the
   originating site of a given frame. As such, Segment Labels are not
   used in PBB-EVPN, and the egress split-horizon filtering is done
   based on the B-MAC source address. It is worth noting here that
   [802.1ah] defines this B-MAC address based filtering function as part
   of the I-Component options, hence no new functions are required to
   support split-horizon beyond what is already defined in [802.1ah].
   Given that the Segment Label is not used in PBB-EVPN, the PE sets the
   Label field in the Ethernet Segment Route to 0.

   The Designated Forwarder election procedures are defined in
   [I-D-Segment-Route].

7.2.2 I-SID Based Load-balancing

   This section describes the procedures for supporting device
   multi-homing in an all-active redundancy model with per-ISID
   load-balancing.

7.2.2.1 PE B-MAC Address Assignment

   In the case where per-ISID load-balancing is desired among the PE
   nodes in a given redundancy group, multiple unicast B-MAC addresses
   are allocated per multi-homed Ethernet Segment: Each PE connected to
   the multi-homed segment is assigned a unique B-MAC.
   Every PE then
   advertises its B-MAC address using the BGP MAC advertisement route.

   A remote PE initially floods traffic to a destination C-MAC address,
   located in a given multi-homed Ethernet Segment, to all the PE nodes
   connected to that segment. Then, when reply traffic arrives at the
   remote PE, it learns (in the data-path) the B-MAC address and
   associated next-hop PE to use for said C-MAC address. When a PE
   connected to a multi-homed Ethernet Segment loses connectivity to the
   segment, due to link or port failure, it withdraws the B-MAC route
   previously advertised for that segment. This causes the remote PE
   nodes to flush all C-MAC addresses associated with the B-MAC in
   question. This is done across all I-SIDs that are mapped to the EVI
   of the withdrawn MAC route.

7.2.2.2 Split Horizon and Designated Forwarder Election

   The procedures are similar to the flow-based load-balancing case,
   with the only difference being that the DF filtering must be applied
   to unicast as well as multicast traffic, and in both core-to-segment
   as well as segment-to-core directions.

7.3. Network Multi-homing

   When an Ethernet network is multi-homed to a set of PE nodes running
   PBB-EVPN, an all-active redundancy model can be supported with per
   service instance (i.e. I-SID) load-balancing. In this model, DF
   election is performed to ensure that a single PE node in the
   redundancy group is responsible for forwarding traffic associated
   with a given I-SID. This guarantees that no forwarding loops are
   created. Filtering based on DF state applies to both unicast and
   multicast traffic, and in both access-to-core as well as
   core-to-access directions (unlike the multi-homed device scenario
   where DF filtering is limited to multi-destination frames in the
   core-to-access direction).
   Similar to the multi-homed device scenario, with
   I-SID based load-balancing, a unique B-MAC address is assigned to
   each of the PE nodes connected to the multi-homed network (Segment).

7.4. Frame Forwarding

   The frame forwarding functions are divided between the Bridge
   Module, which hosts the [802.1ah] Backbone Edge Bridge (BEB)
   functionality, and the MPLS Forwarder, which handles the MPLS
   imposition/disposition. The details of frame forwarding for unicast
   and multi-destination frames are discussed next.

7.4.1. Unicast

   Known unicast traffic received from the AC will be PBB-encapsulated
   by the PE using the B-MAC source address corresponding to the
   originating site. The unicast B-MAC destination address is determined
   based on a lookup of the C-MAC destination address (the binding of
   the two is done via transparent learning of reverse traffic). The
   resulting frame is then encapsulated with an LSP tunnel label and the
   MPLS label which uniquely identifies the B-MAC destination address on
   the egress PE. If per flow load-balancing over ECMPs in the MPLS core
   is required, then a flow label is added as the end of stack label.

   The PE forwards unknown unicast frames over the MPLS core using the
   same set of options used for forwarding multicast/broadcast frames,
   as described in the next section.

7.4.2. Multicast/Broadcast

   Multi-destination frames received from the AC will be
   PBB-encapsulated by the PE using the B-MAC source address
   corresponding to the originating site. The multicast B-MAC
   destination address is selected based on the value of the I-SID as
   defined in [802.1ah]. The resulting frame is then forwarded over the
   MPLS core using one of the following two options:

   Option 1: the MPLS Forwarder can perform ingress replication over a
   set of MP2P tunnel LSPs.
   The frame is encapsulated with a tunnel LSP
   label and the E-VPN ingress replication label advertised in the
   Inclusive Multicast Route.

   Option 2: the MPLS Forwarder can use P2MP tunnel LSP per the
   procedures defined in [E-VPN]. This includes either the use of
   Inclusive or Aggregate Inclusive trees.

   Note that the same procedures for advertising and handling the
   Inclusive Multicast Route defined in [E-VPN] apply here.

8. Minimizing ARP Broadcast

   The PE nodes implement an ARP-proxy function in order to minimize the
   volume of ARP traffic that is broadcast over the MPLS network. This
   is achieved by having each PE node snoop on ARP request and response
   messages received over the access interfaces or the MPLS core. The PE
   builds a cache of IP / MAC address bindings from these snooped
   messages. The PE then uses this cache to respond to ARP requests that
   arrive on access ports and target hosts in remote sites. If the PE
   finds a match for the IP address in its ARP cache, it responds to the
   requesting host and drops the request. Otherwise, if it does not find
   a match, the request is flooded over the MPLS network using either
   ingress replication or LSM.

9. Seamless Interworking with IEEE 802.1aq/802.1Qbp

                           +--------------+
                           |              |
           +---------+     |     MPLS     |    +---------+
   +----+  |         |   +----+        +----+  |         |  +----+
   |SW1 |--|         |   | PE1|        | PE2|  |         |--| SW3|
   +----+  | 802.1aq |---|    |        |    |--| 802.1aq |  +----+
   +----+  |  .1Qbp  |   +----+        +----+  |  .1Qbp  |  +----+
   |SW2 |--|         |     |   Backbone   |    |         |--| SW4|
   +----+  +---------+     +--------------+    +---------+  +----+

   |<------ IS-IS -------->|<-----BGP----->|<------ IS-IS ------>|  CP

   |<------------------------- PBB -------------------------->|  DP
                           |<----MPLS----->|

   Legend: CP = Control Plane View
           DP = Data Plane View

   Figure 7: Interconnecting 802.1aq/802.1Qbp Networks with PBB-EVPN

9.1 B-MAC Address Assignment

   For the same reasons cited in the TRILL section, the B-MAC addresses
   need to be globally unique across all the IEEE 802.1aq / 802.1Qbp
   networks. The same hierarchical address assignment scheme depicted
   above is proposed for B-MAC addresses as well.

9.2 IEEE 802.1aq / 802.1Qbp B-MAC Advertisement Route

   B-MAC addresses associated with 802.1aq / 802.1Qbp switches are
   advertised using the BGP MAC Advertisement route already defined in
   [E-VPN].

   The encapsulation for the transport of PBB frames over MPLS is
   similar to that of classical Ethernet, albeit with the additional PBB
   header, as shown in the figure below:

                     +------------------+
                     | IP/MPLS Header   |
                     +------------------+
                     | PBB Header       |
                     +------------------+
                     | Ethernet Header  |
                     +------------------+
                     | Ethernet Payload |
                     +------------------+
                     | Ethernet FCS     |
                     +------------------+

                Figure 8: PBB over MPLS Encapsulation

9.3 Operation:

   When a PE receives a PBB-encapsulated Ethernet frame from the access
   side, it performs a lookup on the B-MAC destination address to
   identify the next hop. If the lookup yields that the next hop is a
   remote PE, the local PE would then encapsulate the PBB frame in MPLS.
   The label stack comprises the VPN label (advertised by the remote
   PE), followed by an LSP/IGP label. From that point onwards, regular
   MPLS forwarding is applied.

   On the disposition PE, assuming penultimate-hop-popping is
   employed, the PE receives the MPLS-encapsulated PBB frame with a
   single label: the VPN label. The value of the label indicates to
   the disposition PE that this is a PBB frame, so the label is
   popped, the TTL field (in the 802.1Qbp F-Tag) is reinitialized, and
   normal PBB processing is employed from this point onwards.

10. Solution Advantages

   In this section, we discuss the advantages of the PBB-EVPN solution
   in the context of the requirements set forth in section 3 above.

10.1. MAC Advertisement Route Scalability

   In PBB-EVPN, the number of MAC Advertisement Routes is a function
   of the number of segments (sites), rather than the number of
   hosts/servers. This is because the B-MAC addresses of the PEs,
   rather than the C-MAC addresses of hosts/servers, are advertised in
   BGP. As discussed above, there is a one-to-one mapping between
   multi-homed segments and B-MAC addresses, whereas there is a
   one-to-one or many-to-one mapping between single-homed segments and
   B-MAC addresses for a given PE. As a result, the volume of MAC
   Advertisement Routes in PBB-EVPN is multiple orders of magnitude
   lower than in E-VPN.

10.2. C-MAC Mobility with MAC Sub-netting

   In PBB-EVPN, if a PE allocates its B-MAC addresses from a
   contiguous range, then it can advertise a MAC prefix rather than
   individual 48-bit addresses. It should be noted that B-MAC
   addresses can easily be assigned from a contiguous range because PE
   nodes are within the provider administrative domain; however, CE
   devices and hosts are typically not within the provider
   administrative domain.
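
   As a rough illustration of the prefix advertisement described
   above, the following Python sketch derives the shortest covering
   MAC prefix for a set of B-MAC addresses. The helper names
   (mac_to_int, int_to_mac, covering_prefix) and the "prefix/length"
   notation are assumptions made for this example only; this document
   does not define how a MAC prefix is encoded in the route.

```python
def mac_to_int(mac: str) -> int:
    """Convert a colon- or dash-separated MAC address to a 48-bit integer."""
    return int(mac.replace(":", "").replace("-", ""), 16)


def int_to_mac(value: int) -> str:
    """Convert a 48-bit integer back to colon-separated notation."""
    raw = f"{value:012x}"
    return ":".join(raw[i:i + 2] for i in range(0, 12, 2))


def covering_prefix(macs):
    """Return (prefix, prefix_len) for the longest prefix covering all macs.

    Hypothetical helper: if the range is not aligned to a power of two,
    the returned prefix also covers addresses outside the input set.
    """
    values = [mac_to_int(m) for m in macs]
    lo, hi = min(values), max(values)
    # Bits in which the lowest and highest address differ are host bits.
    host_bits = (lo ^ hi).bit_length()
    prefix_len = 48 - host_bits
    mask = ((1 << 48) - 1) ^ ((1 << host_bits) - 1)
    return int_to_mac(lo & mask), prefix_len


if __name__ == "__main__":
    base = mac_to_int("00:1b:2c:00:10:00")
    b_macs = [int_to_mac(base + i) for i in range(16)]
    prefix, length = covering_prefix(b_macs)
    print(f"{prefix}/{length}")
```

   Under these assumptions, sixteen consecutive B-MAC addresses
   starting at 00:1b:2c:00:10:00 collapse into the single prefix
   00:1b:2c:00:10:00/44, so one advertisement stands in for sixteen.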
   The advantage of such MAC address sub-netting can be maintained
   even as C-MAC addresses move from one Ethernet segment to another.
   This is because the C-MAC address to B-MAC address association is
   learnt in the data-plane and C-MAC addresses are not advertised in
   BGP. To illustrate how this compares to E-VPN, consider the
   following example:

   If a PE running E-VPN advertises reachability for a MAC subnet that
   spans N addresses via a particular segment, and then 50% of the MAC
   addresses in that subnet move to other segments (e.g. due to
   virtual machine mobility), then in the worst case, N/2 additional
   MAC Advertisement routes need to be sent for the MAC addresses that
   have moved. This defeats the purpose of the sub-netting. With
   PBB-EVPN, on the other hand, the sub-netting applies to the B-MAC
   addresses, which are statically associated with PE nodes and are
   not subject to mobility. As C-MAC addresses move from one segment
   to another, the binding of C-MAC to B-MAC addresses is updated via
   data-plane learning.

10.3. C-MAC Address Learning and Confinement

   In PBB-EVPN, C-MAC address reachability information is built via
   data-plane learning. As such, PE nodes not participating in active
   conversations involving a particular C-MAC address will purge that
   address from their forwarding tables. Furthermore, since C-MAC
   addresses are not distributed in BGP, PE nodes will not maintain
   any record of them in the control-plane routing table.

10.4. Seamless Interworking with TRILL and 802.1aq Access Networks

   Consider the scenario where two access networks, one running MPLS
   and the other running 802.1aq, are interconnected via an MPLS
   backbone network. The figure below shows such an example network.

                           +--------------+
                           |              |
           +---------+     |     MPLS     |    +---------+
   +----+  |         |   +----+        +----+  |         |  +----+
   | CE |--|         |   | PE1|        | PE2|  |         |--| CE |
   +----+  | 802.1aq |---|    |        |    |--|  MPLS   |  +----+
   +----+  |         |   +----+        +----+  |         |  +----+
   | CE |--|         |     |   Backbone   |    |         |--| CE |
   +----+  +---------+     +--------------+    +---------+  +----+

   Figure 9: Interoperability with 802.1aq

   If the MPLS backbone network employs E-VPN, then the 802.1aq
   data-plane encapsulation must be terminated on PE1 or the edge
   device connecting to PE1. Either way, all the PE nodes that are
   part of the associated service instances will be exposed to all the
   C-MAC addresses of all hosts/servers connected to the access
   networks. However, if the MPLS backbone network employs PBB-EVPN,
   then the 802.1aq encapsulation can be extended over the MPLS
   backbone, thereby maintaining C-MAC address transparency on PE1. If
   PBB-EVPN is also extended over the MPLS access network on the
   right, then C-MAC addresses would be transparent to PE2 as well.

   Interoperability with TRILL access networks will be described in a
   future revision of this draft.

10.5. Per Site Policy Support

   In PBB-EVPN, a unique B-MAC address can be associated with every
   site (single-homed or multi-homed). Given that the B-MAC addresses
   are sent in BGP MAC Advertisement routes, it is possible to define
   per site (i.e. B-MAC) forwarding policies, including policies for
   E-TREE service.

10.6. Avoiding C-MAC Address Flushing

   With PBB-EVPN, it is possible to avoid C-MAC address flushing upon
   a topology change affecting a multi-homed device. To illustrate
   this, consider the example network of Figure 1. Both PE1 and PE2
   advertise the same B-MAC address (BM1) to PE3. PE3 then learns the
   C-MAC addresses of the servers/hosts behind CE1 via data-plane
   learning.
   If AC1 fails, then PE3 does not need to flush any of the C-MAC
   addresses learnt and associated with BM1. This is because PE1 will
   withdraw the MAC Advertisement routes associated with BM1, thereby
   leading PE3 to have a single adjacency (to PE2) for this B-MAC
   address. Therefore, the topology change is communicated to PE3 and
   no C-MAC address flushing is required.

11. Acknowledgements

   TBD.

12. Security Considerations

   There are no additional security aspects beyond those of
   VPLS/H-VPLS that need to be discussed here.

13. IANA Considerations

   This document requires IANA to assign a new SAFI value for
   L2VPN_MAC SAFI.

14. Intellectual Property Considerations

   This document is being submitted for use in IETF standards
   discussions.

15. Normative References

   [802.1ah]  "Virtual Bridged Local Area Networks Amendment 7:
              Provider Backbone Bridges", IEEE Std. 802.1ah-2008,
              August 2008.

16. Informative References

   [PBB-VPLS] Sajassi et al., "VPLS Interoperability with Provider
              Backbone Bridges",
              draft-ietf-l2vpn-vpls-pbb-interop-02.txt, work in
              progress, July 2011.

   [EVPN-REQ] Sajassi et al., "Requirements for Ethernet VPN (E-VPN)",
              draft-sajassi-raggarwa-l2vpn-evpn-req-01.txt, work in
              progress, July 2011.

   [E-VPN]    Aggarwal et al., "BGP MPLS Based Ethernet VPN",
              draft-ietf-l2vpn-evpn-00.txt, work in progress,
              February 2012.

17. Authors' Addresses

   Ali Sajassi
   Cisco
   170 West Tasman Drive
   San Jose, CA 95134, US
   Email: sajassi@cisco.com

   Samer Salam
   Cisco
   595 Burrard Street, Suite 2123
   Vancouver, BC V7X 1J1, Canada
   Email: ssalam@cisco.com

   Sami Boutros
   Cisco
   170 West Tasman Drive
   San Jose, CA 95134, US
   Email: sboutros@cisco.com

   Nabil Bitar
   Verizon Communications
   Email: nabil.n.bitar@verizon.com

   Aldrin Isaac
   Bloomberg
   Email: aisaac71@bloomberg.net

   Florin Balus
   Alcatel-Lucent
   701 E. Middlefield Road
   Mountain View, CA, USA 94043
   Email: florin.balus@alcatel-lucent.com

   Wim Henderickx
   Alcatel-Lucent
   Email: wim.henderickx@alcatel-lucent.be

   Clarence Filsfils
   Cisco
   Email: cfilsfil@cisco.com

   Dennis Cai
   Cisco
   Email: dcai@cisco.com

   Lizhong Jin
   ZTE Corporation
   889, Bibo Road
   Shanghai, 201203, China
   Email: lizhong.jin@zte.com.cn