Internet Working Group                                       Ali Sajassi
Internet Draft                                               Samer Salam
Category: Standards Track                                   Sami Boutros
                                                                   Cisco

Florin Balus                                                 Nabil Bitar
Wim Henderickx                                                   Verizon
Alcatel-Lucent
                                                            Aldrin Isaac
Clarence Filsfils                                              Bloomberg
Dennis Cai
Cisco                                                        Lizhong Jin
                                                                     ZTE

Expires: September 29, 2012                               March 29, 2012


                               PBB E-VPN
                     draft-ietf-l2vpn-pbb-evpn-02

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/1id-abstracts.html

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

Copyright and License Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Abstract

   This document discusses how Ethernet Provider Backbone Bridging
   [802.1ah] can be combined with E-VPN in order to reduce the number of
   BGP MAC advertisement routes by aggregating Customer/Client MAC
   (C-MAC) addresses via Provider Backbone MAC address (B-MAC), provide
   client MAC address mobility using C-MAC aggregation and B-MAC
   sub-netting, confine the scope of C-MAC learning to only active
   flows, offer per site policies, and avoid C-MAC address flushing on
   topology changes. The combined solution is referred to as PBB-EVPN.

Conventions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119.

Table of Contents

   1. Introduction
   2. Contributors
   3. Terminology
   4. Requirements
     4.1. MAC Advertisement Route Scalability
     4.2. C-MAC Mobility with MAC Summarization
     4.3. C-MAC Address Learning and Confinement
     4.4. Interworking with TRILL and 802.1aq Access Networks with
          C-MAC Address Transparency
     4.5. Per Site Policy Support
     4.6. Avoiding C-MAC Address Flushing
   5. Solution Overview
   6. BGP Encoding
     6.1. BGP MAC Advertisement Route
     6.2. Ethernet Auto-Discovery Route
     6.3. Per VPN Route Targets
     6.4. MAC Mobility Extended Community
   7. Operation
     7.1. MAC Address Distribution over Core
     7.2. Device Multi-homing
       7.2.1 Flow-based Load-balancing
         7.2.1.1 MES B-MAC Address Assignment
         7.2.1.2. Automating B-MAC Address Assignment
         7.2.1.3 Split Horizon and Designated Forwarder Election
       7.2.2 I-SID Based Load-balancing
         7.2.2.1 MES B-MAC Address Assignment
         7.2.2.2 Split Horizon and Designated Forwarder Election
     7.3. Network Multi-homing
     7.4. Frame Forwarding
       7.4.1. Unicast
       7.4.2. Multicast/Broadcast
   8. Minimizing ARP Broadcast
   9. Seamless Interworking with TRILL
     9.1 TRILL Nickname Assignment
     9.2 TRILL Nickname Advertisement Route
     9.3 Frame Format
     9.4 Unicast Forwarding
     9.5 Handling Multicast
       9.5.1 Multicast Stitching with Per-Source Load Balancing
       9.5.2 Multicast Stitching with Per-VLAN Load Balancing
       9.5.3 Multicast Stitching with Per-Flow Load Balancing
       9.5.4 Multicast Stitching with Per-Tree Load Balancing
   10. Seamless Interworking with IEEE 802.1aq/802.1Qbp
     10.1 B-MAC Address Assignment
     10.2 IEEE 802.1aq / 802.1Qbp B-MAC Advertisement Route
     10.3 Operation:
   11. Solution Advantages
     11.1. MAC Advertisement Route Scalability
     11.2. C-MAC Mobility with MAC Sub-netting
     11.3. C-MAC Address Learning and Confinement
     11.4. Seamless Interworking with TRILL and 802.1aq Access
           Networks
     11.5. Per Site Policy Support
     11.6. Avoiding C-MAC Address Flushing
   12. Acknowledgements
   13. Security Considerations
   14. IANA Considerations
   15. Intellectual Property Considerations
   16. Normative References
   17. Informative References
   18. Authors' Addresses

1. Introduction

   [E-VPN] introduces a solution for multipoint L2VPN services, with
   advanced multi-homing capabilities, using BGP for distributing
   customer/client MAC address reachability information over the core
   MPLS/IP network. [802.1ah] defines an architecture for Ethernet
   Provider Backbone Bridging (PBB), where MAC tunneling is employed to
   improve service instance and MAC address scalability in Ethernet as
   well as VPLS networks [PBB-VPLS].

   In this document, we discuss how PBB can be combined with E-VPN in
   order to: reduce the number of BGP MAC advertisement routes by
   aggregating Customer/Client MAC (C-MAC) addresses via Provider
   Backbone MAC address (B-MAC), provide client MAC address mobility
   using C-MAC aggregation and B-MAC sub-netting, confine the scope of
   C-MAC learning to only active flows, offer per site policies and
   avoid C-MAC address flushing on topology changes. The combined
   solution is referred to as PBB-EVPN.

2. Contributors

   In addition to the authors listed above, the following individuals
   also contributed to this document.

   Keyur Patel, Cisco

3. Terminology

   BEB: Backbone Edge Bridge
   B-MAC: Backbone MAC Address
   CE: Customer Edge
   C-MAC: Customer/Client MAC Address
   DHD: Dual-homed Device
   DHN: Dual-homed Network
   LACP: Link Aggregation Control Protocol
   LSM: Label Switched Multicast
   MDT: Multicast Delivery Tree
   MES: MPLS Edge Switch
   MP2MP: Multipoint to Multipoint
   P2MP: Point to Multipoint
   P2P: Point to Point
   PoA: Point of Attachment
   PW: Pseudowire
   E-VPN: Ethernet VPN

4. Requirements

   The requirements for PBB-EVPN include all the requirements for E-VPN
   that were described in [EVPN-REQ], in addition to the following:

4.1. MAC Advertisement Route Scalability

   In typical operation, an [E-VPN] MES sends a BGP MAC Advertisement
   Route per customer/client MAC (C-MAC) address. In certain
   applications, this poses scalability challenges, as is the case in
   virtualized data center environments where the number of virtual
   machines (VMs), and hence the number of C-MAC addresses, can be in
   the millions. In such scenarios, it is required to reduce the number
   of BGP MAC Advertisement routes by relying on a 'MAC summarization'
   scheme, as is provided by PBB.
   Note that the MAC summarization capability already built into E-VPN
   is not sufficient in those environments, as will be discussed next.

4.2. C-MAC Mobility with MAC Summarization

   Certain applications, such as virtual machine mobility, require
   support for fast C-MAC address mobility. For these applications, it
   is not possible to use MAC address summarization in E-VPN, i.e., to
   advertise reachability to a MAC address prefix. Rather, the exact
   virtual machine MAC address needs to be transmitted in the BGP MAC
   Advertisement route. Otherwise, traffic would be forwarded to the
   wrong segment when a virtual machine moves from one Ethernet segment
   to another. This hinders the scalability benefits of summarization.

   It is required to support C-MAC address mobility, while retaining the
   scalability benefits of MAC summarization. This can be achieved by
   leveraging PBB technology, which defines a Backbone MAC (B-MAC)
   address space that is independent of the C-MAC address space: C-MAC
   addresses are aggregated via a B-MAC address, and summarization is
   then applied to the B-MAC addresses.

4.3. C-MAC Address Learning and Confinement

   In E-VPN, all the MES nodes participating in the same E-VPN instance
   are exposed to all the C-MAC addresses learnt by any one of these MES
   nodes because a C-MAC learned by one of the MES nodes is advertised
   in BGP to other MES nodes in that E-VPN instance. This is the case
   even if some of the MES nodes for that E-VPN instance are not
   involved in forwarding traffic to, or from, these C-MAC addresses.
   Even if an implementation does not install hardware forwarding
   entries for C-MAC addresses that are not part of active traffic flows
   on that MES, the device memory is still consumed by keeping record of
   the C-MAC addresses in the routing table (RIB). In network
   applications with millions of C-MAC addresses, this introduces a
   non-trivial waste of MES resources.
   As such, it is required to confine the scope of visibility of C-MAC
   addresses only to those MES nodes that are actively involved in
   forwarding traffic to, or from, these addresses.

4.4. Interworking with TRILL and 802.1aq Access Networks with C-MAC
     Address Transparency

   [TRILL] and [802.1aq] define next generation Ethernet bridging
   technologies that offer optimal forwarding using an IS-IS control
   plane, and C-MAC address transparency via Ethernet tunneling
   technologies. When access networks based on TRILL or 802.1aq are
   interconnected over an MPLS/IP network, it is required to guarantee
   C-MAC address transparency on the hand-off point and the edge (i.e.
   MES) of the MPLS network. As such, solutions that require termination
   of the access data-plane encapsulation (i.e. TRILL or 802.1aq) at the
   hand-off to the MPLS network do not meet this transparency
   requirement, and expose the MPLS edge devices to the MAC address
   scalability problem.

   PBB-EVPN supports seamless interconnect with these next generation
   Ethernet solutions while guaranteeing C-MAC address transparency on
   the MES nodes.

4.5. Per Site Policy Support

   In many applications, it is required to be able to enforce
   connectivity policy rules at the granularity of a site (or segment).
   This includes the ability to control which MES nodes in the network
   can forward traffic to, or from, a given site. PBB-EVPN is capable of
   providing this granularity of policy control. In the case where per
   C-MAC address granularity is required, the EVI can always continue to
   operate in E-VPN mode.

4.6. Avoiding C-MAC Address Flushing

   It is required to avoid C-MAC address flushing upon link, port or
   node failure for multi-homed devices and networks. This is in order
   to speed up re-convergence upon failure.

5. Solution Overview

   The solution involves incorporating IEEE 802.1ah Backbone Edge Bridge
   (BEB) functionality on the E-VPN MES nodes similar to PBB-VPLS, where
   BEB functionality is incorporated in the VPLS PE nodes. The MES
   devices would then receive 802.1Q Ethernet frames from their
   attachment circuits, encapsulate them in the PBB header and forward
   the frames over the IP/MPLS core. On the egress E-VPN MES, the PBB
   header is removed following the MPLS disposition, and the original
   802.1Q Ethernet frame is delivered to the customer equipment.

                     BEB    +--------------+  BEB
                      ||    |              |   ||
                      \/    |              |   \/
         +----+ AC1 +----+  |              | +----+   +----+
         | CE1|-----|    |  |              | |    |---| CE2|
         +----+\    |MES1|  |   IP/MPLS    | |MES3|   +----+
                \   +----+  |   Network    | +----+
                 \          |              |
               AC2\ +----+  |              |
                   \|    |  |              |
                    |MES2|  |              |
                    +----+  |              |
                      /\    +--------------+
                      ||
                     BEB
         <-802.1Q-> <------PBB over MPLS------> <-802.1Q->

                       Figure 1: PBB-EVPN Network

   The MES nodes perform the following functions:

   - Learn customer/client MAC addresses (C-MACs) over the attachment
   circuits in the data-plane, per normal bridge operation.

   - Learn remote C-MAC to B-MAC bindings in the data-plane from traffic
   arriving from the core, per [802.1ah] bridging operation.

   - Advertise local B-MAC address reachability information in BGP to
   all other MES nodes in the same set of service instances. Note that
   every MES has a set of local B-MAC addresses that uniquely identify
   the device. MES B-MAC addressing is discussed further in Section
   7.2.1.1.

   - Build a forwarding table from remote BGP advertisements received
   associating remote B-MAC addresses with remote MES IP addresses and
   the associated MPLS label(s).

6. BGP Encoding

   PBB-EVPN leverages the same BGP Routes and Attributes defined in
   [E-VPN], adapted as follows:

6.1. BGP MAC Advertisement Route

   The E-VPN MAC Advertisement Route is used to distribute B-MAC
   addresses of the MES nodes instead of the C-MAC addresses of
   end-stations/hosts. This is because the C-MAC addresses are learnt in
   the data-plane for traffic arriving from the core. The MAC
   Advertisement Route is encoded as follows:

   - The MAC address field contains the B-MAC address.
   - The Ethernet Tag field is set to 0.

   The route is tagged with the RT corresponding to the EVI associated
   with the B-MAC address.

   All other fields are set as defined in [E-VPN].

6.2. Ethernet Auto-Discovery Route

   This route and all of its associated modes are not needed in
   PBB-EVPN.

6.3. Per VPN Route Targets

   PBB-EVPN uses the same set of route targets defined in [E-VPN]. A
   future revision of this document will describe new RT types.

6.4. MAC Mobility Extended Community

   This extended community is a new transitive extended community. It
   may be advertised along with the MAC Advertisement route. When used
   in PBB-EVPN, it indicates that the C-MAC forwarding tables for the
   I-SIDs associated with the RT tagging the MAC Advertisement route
   must be flushed. This extended community is encoded in 8 bytes as
   follows:

   - Type (1 byte) = Pending IANA assignment.
   - Sub-Type (1 byte) = Pending IANA assignment.
   - Reserved (2 bytes)
   - Counter (4 bytes)

   Note that all other BGP messages and/or attributes are used as
   defined in [E-VPN].

7. Operation

   This section discusses the operation of PBB-EVPN, specifically in
   areas where it differs from [E-VPN].

7.1. MAC Address Distribution over Core

   In PBB-EVPN, host MAC addresses (i.e. C-MAC addresses) need not be
   distributed in BGP. Rather, every MES independently learns the C-MAC
   addresses in the data-plane via normal bridging operation.
   Every MES has a set of one or more unicast B-MAC addresses associated
   with it, and those are the addresses distributed over the core in MAC
   Advertisement routes.

7.2. Device Multi-homing

7.2.1 Flow-based Load-balancing

   This section describes the procedures for supporting device
   multi-homing in an all-active redundancy model with flow-based
   load-balancing.

7.2.1.1 MES B-MAC Address Assignment

   In [802.1ah] every BEB is uniquely identified by one or more B-MAC
   addresses. These addresses are usually locally administered by the
   Service Provider. For PBB-EVPN, the choice of B-MAC address(es) for
   the MES nodes must be examined carefully as it has implications on
   the proper operation of multi-homing. In particular, for the scenario
   where a CE is multi-homed to a number of MES nodes with all-active
   redundancy and flow-based load-balancing, a given C-MAC address would
   be reachable via multiple MES nodes concurrently. Given that any
   given remote MES will bind the C-MAC address to a single B-MAC
   address, the various MES nodes connected to the same CE must share
   the same B-MAC address. Otherwise, the MAC address table of the
   remote MES nodes will keep oscillating between the B-MAC addresses of
   the various MES devices. For example, consider the network of Figure
   1, and assume that MES1 has B-MAC BM1 and MES2 has B-MAC BM2. Also,
   assume that both links from CE1 to the MES nodes are part of an
   all-active multi-chassis Ethernet link aggregation group. If BM1 is
   not equal to BM2, the consequence is that the MAC address table on
   MES3 will keep oscillating such that the C-MAC address CM of CE1
   would flip-flop between BM1 and BM2, depending on the load-balancing
   decision on CE1 for traffic destined to the core.

   Considering that there could be multiple sites (e.g. CEs) that are
   multi-homed to the same set of MES nodes, it is required for all the
   MES devices in a Redundancy Group to have a unique B-MAC address per
   site. This way, it is possible to achieve fast convergence in the
   case where a link or port failure impacts the attachment circuit
   connecting a single site to a given MES.

                                +---------+
                 +-------+ MES1 | IP/MPLS |
                /               |         |
             CE1                | Network |    MESr
          M1    \               |         |
                 +-------+ MES2 |         |
                 /-------+      |         |
                /               |         |
             CE2                |         |
          M2    \               |         |
                 \              |         |
                  +------+ MES3 +---------+

                  Figure 2: B-MAC Address Assignment

   In the example network shown in Figure 2 above, two sites
   corresponding to CE1 and CE2 are dual-homed to MES1/MES2 and
   MES2/MES3, respectively. Assume that BM1 is the B-MAC used for the
   site corresponding to CE1. Similarly, BM2 is the B-MAC used for the
   site corresponding to CE2. On MES1, a single B-MAC address (BM1) is
   required for the site corresponding to CE1. On MES2, two B-MAC
   addresses (BM1 and BM2) are required, one per site. Whereas on MES3,
   a single B-MAC address (BM2) is required for the site corresponding
   to CE2. All three MES nodes would advertise their respective B-MAC
   addresses in BGP using the MAC Advertisement routes defined in
   [E-VPN]. The remote MES, MESr, would learn via BGP that BM1 is
   reachable via MES1 and MES2, whereas BM2 is reachable via both MES2
   and MES3. Furthermore, MESr establishes via the normal bridge
   learning that C-MAC M1 is reachable via BM1, and C-MAC M2 is
   reachable via BM2. As a result, MESr can load-balance traffic
   destined to M1 between MES1 and MES2, as well as traffic destined to
   M2 between both MES2 and MES3. In the case of a failure that causes,
   for example, CE1 to be isolated from MES1, the latter can withdraw
   the route it has advertised for BM1. This way, MESr would update its
   path list for BM1, and will send all traffic destined to M1 over to
   MES2 only.

   For single-homed sites, it is possible to assign a unique B-MAC
   address per site, or have all the single-homed sites connected to a
   given MES share a single B-MAC address. The advantage of the first
   model over the second model is the ability to avoid C-MAC destination
   address lookup on the disposition PE (even though source C-MAC
   learning is still required in the data-plane). Also, by assigning the
   B-MAC addresses from a contiguous range, it is possible to advertise
   a single B-MAC subnet for all single-homed sites, thereby rendering
   the number of MAC advertisement routes required at par with the
   second model.

   In summary, every MES may use a unicast B-MAC address shared by all
   single-homed CEs or a unicast B-MAC address per single-homed CE and,
   in addition, a unicast B-MAC address per dual-homed CE. In the latter
   case, the B-MAC address MUST be the same for all MES nodes in a
   Redundancy Group connected to the same CE.

7.2.1.2. Automating B-MAC Address Assignment

   The MES B-MAC address used for single-homed sites can be
   automatically derived from the hardware (using, e.g., the backplane's
   address). However, the B-MAC address used for multi-homed sites must
   be coordinated among the RG members. To automate the assignment of
   this latter address, the MES can derive this B-MAC address from the
   MAC Address portion of the CE's LACP System Identifier by flipping
   the 'Locally Administered' bit of the CE's address. This guarantees
   the uniqueness of the B-MAC address within the network, and ensures
   that all MES nodes connected to the same multi-homed CE use the same
   value for the B-MAC address.
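
   The derivation described in the preceding paragraph can be sketched
   as follows. This is a minimal illustration, not a normative
   procedure: it assumes the MAC Address portion of the CE's LACP
   System Identifier is available as a colon-separated string, and the
   function name is hypothetical. The 'Locally Administered' bit is the
   second least-significant bit (0x02) of the first octet of a 48-bit
   MAC address.

```python
LOCALLY_ADMINISTERED_BIT = 0x02  # U/L bit in the first octet of a MAC


def derive_bmac_from_lacp_system_mac(system_mac: str) -> str:
    """Return the B-MAC obtained by flipping the U/L bit of system_mac.

    Hypothetical helper for illustration; the draft defines no API.
    """
    octets = [int(part, 16) for part in system_mac.split(":")]
    if len(octets) != 6 or any(o < 0 or o > 0xFF for o in octets):
        raise ValueError(f"not a 48-bit MAC address: {system_mac!r}")
    octets[0] ^= LOCALLY_ADMINISTERED_BIT
    return ":".join(f"{o:02x}" for o in octets)


# Every MES attached to the same CE observes the same LACP System
# Identifier, so each one derives the same B-MAC independently.
bmac_on_mes1 = derive_bmac_from_lacp_system_mac("00:1b:2c:3d:4e:5f")
bmac_on_mes2 = derive_bmac_from_lacp_system_mac("00:1b:2c:3d:4e:5f")
```

   Because the input is identical on all members of the Redundancy
   Group, no coordination protocol is needed for this step in the
   sketch; the example MAC address is made up.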

   Note that with this automatic provisioning of the B-MAC address
   associated with multi-homed CEs, it is not possible to support the
   uncommon scenario where a CE has multiple bundles towards the MES
   nodes, and the service involves hair-pinning traffic from one bundle
   to another. This is because the split-horizon filtering relies on
   B-MAC addresses rather than Site-ID Labels (as will be described in
   the next section). The operator must explicitly configure the B-MAC
   address for this fairly uncommon service scenario.

   Whenever a B-MAC address is provisioned on the MES, either manually
   or automatically (as an outcome of CE auto-discovery), the MES MUST
   transmit a MAC Advertisement Route for the B-MAC address with a
   downstream assigned MPLS label that uniquely identifies that address
   on the advertising MES. The route is tagged with the RTs of the
   associated EVIs as described above.

7.2.1.3 Split Horizon and Designated Forwarder Election

   [E-VPN] relies on access split horizon, where the Ethernet Segment
   Label is used for egress filtering on the attachment circuit in order
   to prevent forwarding loops. In PBB-EVPN, the B-MAC source address
   can be used for the same purpose, as it uniquely identifies the
   originating site of a given frame. As such, Segment Labels are not
   used in PBB-EVPN, and the egress split-horizon filtering is done
   based on the B-MAC source address. It is worth noting here that
   [802.1ah] defines this B-MAC address based filtering function as part
   of the I-Component options, hence no new functions are required to
   support split-horizon beyond what is already defined in [802.1ah].
   Given that the Segment Label is not used in PBB-EVPN, the MES sets
   the Label field in the Ethernet Segment Route to 0.

   The Designated Forwarder election procedures are defined in
   [I-D-Segment-Route].
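
   The B-MAC based split-horizon check described in this section can be
   sketched as follows. This is only an illustration under stated
   assumptions: the PbbFrame type and the function name are
   hypothetical, only the header fields needed for the check are
   modeled, and Designated Forwarder state is deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PbbFrame:
    """Only the PBB header fields this check needs (hypothetical type)."""
    b_sa: str   # B-MAC source address, identifies the originating site
    i_sid: int  # service instance identifier


def passes_split_horizon(frame: PbbFrame, segment_bmac: str) -> bool:
    """Return True if a frame from the core may be sent onto a segment.

    A frame whose B-MAC source address equals the B-MAC assigned to the
    local segment originated at that same site, so it is filtered.
    """
    return frame.b_sa.lower() != segment_bmac.lower()


# Figure 2 scenario as seen on MES2: BM1 serves CE1's site, BM2 serves
# CE2's site. The address values are made up for the example.
BM1 = "02:00:00:00:00:01"
BM2 = "02:00:00:00:00:02"

# A multi-destination frame from CE1 that MES1 sent into the core.
flooded_from_ce1 = PbbFrame(b_sa=BM1, i_sid=100)

to_ce1_segment = passes_split_horizon(flooded_from_ce1, BM1)
to_ce2_segment = passes_split_horizon(flooded_from_ce1, BM2)
```

   In this sketch the frame is not echoed back to CE1's segment but
   remains eligible for CE2's segment, where the Designated Forwarder
   rules would still apply.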

7.2.2 I-SID Based Load-balancing

   This section describes the procedures for supporting device
   multi-homing in an all-active redundancy model with per-ISID
   load-balancing.

7.2.2.1 MES B-MAC Address Assignment

   In the case where per-ISID load-balancing is desired among the MES
   nodes in a given redundancy group, multiple unicast B-MAC addresses
   are allocated per multi-homed Ethernet Segment: each MES connected to
   the multi-homed segment is assigned a unique B-MAC. Every MES then
   advertises its B-MAC address using the BGP MAC advertisement route.

   A remote MES initially floods traffic to a destination C-MAC address,
   located in a given multi-homed Ethernet Segment, to all the MES nodes
   connected to that segment. Then, when reply traffic arrives at the
   remote MES, it learns (in the data-path) the B-MAC address and
   associated next-hop MES to use for said C-MAC address. When a MES
   connected to a multi-homed Ethernet Segment loses connectivity to the
   segment, due to link or port failure, it withdraws the B-MAC route
   previously advertised for that segment. This causes the remote MES
   nodes to flush all C-MAC addresses associated with the B-MAC in
   question. This is done across all I-SIDs that are mapped to the EVI
   of the withdrawn MAC route.

7.2.2.2 Split Horizon and Designated Forwarder Election

   The procedures are similar to the flow-based load-balancing case,
   with the only difference being that the DF filtering must be applied
   to unicast as well as multicast traffic, and in both core-to-segment
   as well as segment-to-core directions.

7.3. Network Multi-homing

   When an Ethernet network is multi-homed to a set of MES nodes running
   PBB-EVPN, an all-active redundancy model can be supported with per
   service instance (i.e. I-SID) load-balancing.
In this model, DF 553 election is performed to ensure that a single MES node in the 554 redundancy group is responsible for forwarding traffic associated 555 with a given I-SID. This guarantees that no forwarding loops are 556 created. Filtering based on DF state applies to both unicast and 557 multicast traffic, and in both access-to-core as well as core-to- 558 access directions (unlike the multi-homed device scenario where DF 559 filtering is limited to multi-destination frames in the core-to- 560 access direction). Similar to the multi-homed device scenario, with 561 I-SID based load-balancing, a unique B-MAC address is assigned to 562 each of the MES nodes connected to the multi-homed network (Segment). 564 7.4. Frame Forwarding 566 The frame forwarding functions are divided in between the Bridge 567 Module, which hosts the [802.1ah] Backbone Edge Bridge (BEB) 568 functionality, and the MPLS Forwarder which handles the MPLS 569 imposition/disposition. The details of frame forwarding for unicast 570 and multi-destination frames are discussed next. 572 7.4.1. Unicast 574 Known unicast traffic received from the AC will be PBB-encapsulated 575 by the MES using the B-MAC source address corresponding to the 576 originating site. The unicast B-MAC destination address is determined 577 based on a lookup of the C-MAC destination address (the binding of 578 the two is done via transparent learning of reverse traffic). The 579 resulting frame is then encapsulated with an LSP tunnel label and the 580 MPLS label which uniquely identifies the B-MAC destination address on 581 the egress MES. If per flow load-balancing over ECMPs in the MPLS 582 core is required, then a flow label is added as the end of stack 583 label. 585 For unknown unicast traffic, the MES forwards these frames over MPLS 586 core. When these frames are to be forwarded, then the same set of 587 options used for forwarding multicast/broadcast frames (as described 588 in next section) are used. 590 7.4.2. 
Multicast/Broadcast 592 Multi-destination frames received from the AC will be PBB- 593 encapsulated by the MES using the B-MAC source address corresponding 594 to the originating site. The multicast B-MAC destination address is 595 selected based on the value of the I-SID as defined in [802.1ah]. The 596 resulting frame is then forwarded over the MPLS core using one out of 597 the following two options: 599 Option 1: the MPLS Forwarder can perform ingress replication over a 600 set of MP2P tunnel LSPs. The frame is encapsulated with a tunnel LSP 601 label and the E-VPN ingress replication label advertised in the 602 Inclusive Multicast Route. 604 Option 2: the MPLS Forwarder can use P2MP tunnel LSP per the 605 procedures defined in [E-VPN]. This includes either the use of 606 Inclusive or Aggregate Inclusive trees. 608 Note that the same procedures for advertising and handling the 609 Inclusive Multicast Route defined in [E-VPN] apply here. 611 8. Minimizing ARP Broadcast 613 The MES nodes implement an ARP-proxy function in order to minimize 614 the volume of ARP traffic that is broadcasted over the MPLS network. 615 This is achieved by having each MES node snoop on ARP request and 616 response messages received over the access interfaces or the MPLS 617 core. The MES builds a cache of IP / MAC address bindings from these 618 snooped messages. The MES then uses this cache to respond to ARP 619 requests ingress on access ports and targeting hosts that are in 620 remote sites. If the MES finds a match for the IP address in its ARP 621 cache, it responds back to the requesting host and drops the request. 622 Otherwise, if it does not find a match, then the request is flooded 623 over the MPLS network using either ingress replication or LSM. 625 9. 
Seamless Interworking with TRILL

   PBB-EVPN enables seamless connectivity of TRILL networks over an
   MPLS/IP core while ensuring control-plane separation among these
   networks, and maintaining C-MAC address transparency on the MES
   nodes.

   Every TRILL network that is connected to the MPLS core runs an
   independent instance of the IS-IS control-plane. Each MES
   participates in the TRILL IS-IS control plane of its local site. The
   MES peers, in IS-IS protocol, with the RBridges internal to the site,
   but does not terminate the TRILL data-plane encapsulation. So, from a
   control-plane viewpoint, the MES appears as an edge RBridge; whereas,
   from a data-plane viewpoint, the MES appears as a core RBridge to the
   TRILL network. The MES nodes encapsulate TRILL frames with MPLS in
   the imposition path, and decapsulate them in the disposition path.

                          +--------------+
                          |              |
           +---------+    |     MPLS     |    +---------+
   +----+  |         |  +----+        +----+  |         |  +----+
   |RB1 |--|         |  |MES1|        |MES2|  |         |--| RB3|
   +----+  |  TRILL  |--|    |        |    |--|  TRILL  |  +----+
   +----+  |         |  +----+        +----+  |         |  +----+
   |RB2 |--|         |    |   Backbone   |    |         |--| RB4|
   +----+  +---------+    +--------------+    +---------+  +----+

   |<------ IS-IS ------->|<----BGP----->|<------ IS-IS ------->|  CP

   |<-------------------------- TRILL ------------------------->|  DP
                          |<----MPLS---->|

   Legend: CP = Control Plane View
           DP = Data Plane View

        Figure 4: Interconnecting TRILL Networks with PBB-EVPN

9.1. TRILL Nickname Assignment

   In TRILL, edge RBridges build forwarding tables that associate remote
   C-MAC addresses with remote edge RBridge nicknames via data-path
   learning (except if the optional ESADI function is in use).
   When different TRILL networks are interconnected over an MPLS/IP
   network using a seamless hand-off, the edge RBridges (corresponding
   to the ingress and egress RBridges of particular traffic flows) may
   very well reside in different TRILL networks. Therefore, in order to
   guarantee correct connectivity, the TRILL Nicknames must be globally
   unique across all the interconnected TRILL islands in a given EVI.
   This can be achieved, for instance, by using a hierarchical Nickname
   assignment paradigm, and encoding a Site ID in the high-order bits of
   the Nickname:

      Nickname = [ Site ID : RBridge ID ]

   The Site ID uniquely identifies a TRILL network, whereas the RBridge
   ID portion of the Nickname has local significance to a TRILL site,
   and can be reused in different sites to designate different RBridges.
   However, the fully qualified Nickname is globally unique in the
   entire domain of interconnected TRILL networks for a given EVI.

   It is worth noting here that this hierarchical Nickname encoding
   scheme guarantees that Nickname collisions do not occur between
   different TRILL islands. Therefore, there is no need to define TRILL
   Nickname collision detection/resolution mechanisms to operate across
   separate TRILL islands interconnected via PBB-EVPN.

   Another point to note is that there are proposals to achieve per-site
   Nickname significance; however, these proposals either require C-MAC
   learning on the border RBridge (i.e., they violate the C-MAC address
   transparency requirement), or require a completely new encapsulation
   and associated data-path for TRILL [TRILL-PERLMAN-MULTILEVEL].
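
   As an illustration of the hierarchical encoding described in this
   section, the following Python sketch packs a Site ID and an RBridge
   ID into a 16-bit Nickname (the 2-octet size matches the RBridge
   Nickname field of the route defined in this document). The
   6-bit/10-bit split, the constant names and the function names are
   assumptions made for the example only; this document does not
   specify how the bits are divided, and a real allocation would also
   have to avoid any Nickname values that TRILL reserves.

```python
NICKNAME_BITS = 16                               # RBridge Nickname is 2 octets
SITE_ID_BITS = 6                                 # assumption: not specified by the draft
RBRIDGE_ID_BITS = NICKNAME_BITS - SITE_ID_BITS   # remaining low-order bits


def make_nickname(site_id, rbridge_id):
    """Build Nickname = [ Site ID : RBridge ID ], Site ID in the high-order bits."""
    if not 0 <= site_id < (1 << SITE_ID_BITS):
        raise ValueError("site_id does not fit in the Site ID field")
    if not 0 <= rbridge_id < (1 << RBRIDGE_ID_BITS):
        raise ValueError("rbridge_id does not fit in the RBridge ID field")
    return (site_id << RBRIDGE_ID_BITS) | rbridge_id


def split_nickname(nickname):
    """Recover (site_id, rbridge_id) from a fully qualified Nickname."""
    return nickname >> RBRIDGE_ID_BITS, nickname & ((1 << RBRIDGE_ID_BITS) - 1)


# The same local RBridge ID reused in two sites yields two distinct Nicknames.
site1_rb5 = make_nickname(1, 5)
site2_rb5 = make_nickname(2, 5)
print(hex(site1_rb5), hex(site2_rb5))
```

   With this layout, the RBridge ID can be reused freely inside each
   site, while the Site ID keeps the fully qualified Nicknames distinct
   across the interconnected islands.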

9.2. TRILL Nickname Advertisement Route

   A new BGP route is defined to support the interconnection of TRILL
   networks over PBB-EVPN: the 'TRILL Nickname Advertisement' route,
   encoded as follows:

      +---------------------------------------+
      |             RD (8 octets)             |
      +---------------------------------------+
      |Ethernet Segment Identifier (10 octets)|
      +---------------------------------------+
      |      Ethernet Tag ID (4 octets)       |
      +---------------------------------------+
      |       Nickname Length (1 octet)       |
      +---------------------------------------+
      |      RBridge Nickname (2 octets)      |
      +---------------------------------------+
      |       MPLS Label (n * 3 octets)       |
      +---------------------------------------+

            Figure 5: TRILL Nickname Advertisement Route

   The MES uses this route to advertise the reachability of TRILL
   RBridge nicknames to other MES nodes in the EVI. The MPLS label
   advertised in this route is allocated on a per-EVI basis and
   identifies to the disposition MES that the MPLS-encapsulated packet
   carries a TRILL frame.

9.3. Frame Format

   The encapsulation for the transport of TRILL frames over MPLS is
   encoded as shown in the figure below:

      +------------------+
      |  IP/MPLS Header  |
      +------------------+
      |   TRILL Header   |
      +------------------+
      | Ethernet Header  |
      +------------------+
      | Ethernet Payload |
      +------------------+
      |   Ethernet FCS   |
      +------------------+

        Figure 6: TRILL over MPLS Encapsulation

   It is worth noting here that while it is possible to transport
   Ethernet-encapsulated TRILL frames over MPLS, that approach
   unnecessarily wastes 16 bytes per packet. That approach further
   requires either the use of well-known MAC addresses or having the MES
   nodes advertise in BGP their device MAC addresses, in order to
   resolve the TRILL next-hop L2 adjacency.
   To that end, it is simpler and more efficient to transport TRILL
   natively over MPLS, and this is the reason why a new BGP route for
   TRILL Nickname advertisement is defined.

9.4. Unicast Forwarding

   Every MES advertises in BGP the Nicknames of all RBridges local to
   its site in the TRILL Nickname Advertisement routes. Furthermore, the
   MES advertises in IS-IS, to the local island, the RBridge nicknames
   of all remote switches in all the other TRILL islands that the MES
   has learned via BGP. This is required since TRILL [RFC6325] currently
   does not define the concept of default routes. However, if the
   concept of default routes is added to TRILL, then the MES can
   advertise itself as a border RBridge, and all the other RBridges in
   the TRILL network would install a default route pointing to the MES.
   The default route would be used for all unknown destination
   Nicknames. This would eliminate the need to redistribute Nicknames
   learnt via BGP into TRILL IS-IS.

   Note that by having multiple MES nodes (connected to the same TRILL
   island) advertise routes to the same RBridge nickname, with equal BGP
   Local_Pref attribute, it is possible to perform active/active
   load-balancing to/from the MPLS core.

   When an MES receives an Ethernet-encapsulated TRILL frame from the
   access side, it removes the Ethernet encapsulation (i.e., the outer
   MAC header), and performs a lookup on the egress RBridge nickname in
   the TRILL header to identify the next hop. If the lookup yields that
   the next hop is a remote MES, the local MES then encapsulates the
   TRILL frame in MPLS. The label stack comprises the VPN label
   (advertised by the remote MES), followed by an LSP/IGP label. From
   that point onwards, regular MPLS forwarding is applied.

   On the disposition MES, assuming penultimate-hop-popping is employed,
   the MES receives the MPLS-encapsulated TRILL frame with a single
   label: the VPN label.
   The value of the label indicates to the disposition MES that this is
   a TRILL packet, so the label is popped, the TTL field (in the TRILL
   header) is reinitialized and normal TRILL processing is employed from
   this point onwards.

9.5. Handling Multicast

   Each TRILL network independently builds its shared multicast trees.
   The number of these trees need not match in the different
   interconnected TRILL islands. In the MPLS/IP network, multiple
   options are available for the delivery of multicast traffic:

   - Ingress replication
   - LSM with Inclusive trees
   - LSM with Aggregate Inclusive trees
   - LSM with Selective trees
   - LSM with Aggregate Selective trees

   When LSM is used, the trees may be either P2MP or MP2MP.

   The MES nodes are responsible for stitching the TRILL multicast
   trees, on the access side, to the ingress replication tunnels or LSM
   trees in the MPLS/IP core. The stitching must ensure that the
   following characteristics are maintained at all times:

   1. Avoiding Packet Duplication: In the case where the TRILL network
      is multi-homed to multiple MES nodes, if all of the MES nodes
      forward the same multicast frame, then packet duplication would
      arise. This applies to multicast traffic from site to core as
      well as from core to site.

   2. Avoiding Forwarding Loops: In the case of TRILL network
      multi-homing, the solution must ensure that a multicast frame
      forwarded by a given MES to the MPLS core is not forwarded back
      by another MES (in the same TRILL network) to the TRILL network
      of origin. The same applies for traffic in the core to site
      direction.

   3. Pacifying TRILL RPF Checks: For multicast traffic originating
      from a different TRILL network, the RPF checks must be performed
      against the disposition MES (i.e., the MES through which the
      traffic enters the destination TRILL network).

   There are several approaches by which the above characteristics can
   be guaranteed. They differ in the granularity of load-balancing that
   they offer: per source, per VLAN, per flow, or per tree.

9.5.1. Multicast Stitching with Per-Source Load Balancing

   The MES nodes, connected to a multi-homed TRILL network, perform BGP
   DF election to decide which MES is responsible for forwarding
   multicast traffic from a given source RBridge. An MES would only
   forward multicast traffic from source RBridges for which it is the
   DF, in both the site to core as well as core to site directions. This
   solves both the issue of avoiding packet duplication as well as the
   issue of avoiding forwarding loops.

   In addition, the MES node advertises in IS-IS the nicknames of remote
   RBridges, learnt in BGP, for which it is the elected DF. This allows
   all RBridges in the local TRILL network to build the correct RPF
   state for these remote RBridge nicknames. Note that this results in
   all unicast traffic to a given remote RBridge being forwarded to the
   DF MES only (i.e., load-balancing of unicast traffic would not be
   possible in the site to core direction).

   Alternatively, all MES nodes in a redundancy group can advertise the
   nicknames of all remote RBridges learnt in BGP. In addition, each MES
   advertises the Affinity sub-TLV, defined in [TRILL-CMT], on behalf of
   each of the remote RBridges for which it is the elected DF. This
   ensures that the RPF check state is set up correctly in the TRILL
   network, while allowing load-balancing of unicast traffic among the
   MES nodes.

   In this approach, all MES nodes in a given redundancy group can
   forward and receive traffic on all TRILL trees.

9.5.2. Multicast Stitching with Per-VLAN Load Balancing

   The MES nodes, connected to a multi-homed TRILL network, perform BGP
   DF election to decide which MES node is responsible for forwarding
   multicast traffic associated with a given VLAN.
   An MES would forward multicast traffic for a given VLAN only when it
   is the DF for this VLAN. This forwarding rule applies in both the
   site to core as well as core to site directions.

   In addition, the MES nodes in the redundancy group partition among
   themselves the set of TRILL multicast trees so that each MES only
   sends traffic on a unique set of trees. This can be done using the RP
   Election Protocol as discussed in [TRILL-MULTILEVEL]. Alternatively,
   the BGP DF election could be used for that purpose. Each MES then
   advertises to the local TRILL network a Default Affinity sub-TLV, per
   [TRILL-MULTILEVEL], listing the trees that it will be using for
   multicast traffic originating from remote RBridges.

   In this approach, each MES node in a given TRILL network receives
   traffic from all TRILL trees but forwards traffic on only a dedicated
   subset of trees. Hence, the TRILL network must have at least as many
   multicast trees as the number of directly attached MES nodes.

9.5.3. Multicast Stitching with Per-Flow Load Balancing

   This approach is similar to the per-VLAN load-balancing approach
   described above, with the difference being that the MES nodes perform
   the BGP DF election on a per-flow basis. The flow is identified by an
   N-Tuple comprising Layer 2 and Layer 3 addresses in addition to
   Layer 4 ports. This can be done by treating the N-Tuple as a numeric
   value, and performing, for example, a modulo hash function against
   the number of MES nodes in the redundancy group in order to identify
   the index of the MES that is the DF for a given N-Tuple.

   In this approach, each MES node in a given TRILL network receives
   traffic from all TRILL trees but forwards traffic on only a dedicated
   subset of trees. Hence, the TRILL network must have at least as many
   multicast trees as the number of directly attached MES nodes.
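
   The modulo-based selection described in Section 9.5.3 can be
   sketched as follows. This is a minimal Python illustration rather
   than a normative procedure: the choice of fields, their
   concatenation order, and the names n_tuple_value, df_index and
   group_size are assumptions of the example. The text above only
   requires that the N-Tuple be treated as a numeric value and reduced
   modulo the number of nodes in the redundancy group.

```python
import ipaddress


def n_tuple_value(src_mac, dst_mac, src_ip, dst_ip, src_port, dst_port):
    """Treat the flow's N-Tuple as one large unsigned integer.

    The field order is an assumption of this sketch; every node in the
    redundancy group must use the same order to reach the same result.
    """
    raw = (
        bytes.fromhex(src_mac.replace(":", ""))
        + bytes.fromhex(dst_mac.replace(":", ""))
        + ipaddress.ip_address(src_ip).packed
        + ipaddress.ip_address(dst_ip).packed
        + src_port.to_bytes(2, "big")
        + dst_port.to_bytes(2, "big")
    )
    return int.from_bytes(raw, "big")


def df_index(value, group_size):
    """Return the index of the node that is the DF for this flow."""
    if group_size <= 0:
        raise ValueError("redundancy group must contain at least one node")
    return value % group_size


# Example flow between two hosts (addresses taken from documentation ranges).
flow = n_tuple_value("00:11:22:33:44:55", "66:77:88:99:aa:bb",
                     "192.0.2.1", "198.51.100.7", 49152, 80)
print(df_index(flow, 2))
```

   Because every node evaluates the same function over the same
   fields, all nodes in the group agree on the DF for a flow without
   exchanging per-flow state. A plain modulo over concatenated fields
   may spread flows unevenly; a deployment might prefer to apply a
   mixing hash to the value first.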

9.5.4. Multicast Stitching with Per-Tree Load Balancing

   The MES nodes, connected to a multi-homed TRILL network, perform BGP
   DF election to decide which MES node is responsible for forwarding
   multicast traffic associated with a given TRILL multicast tree. An
   MES would forward multicast traffic with a given destination RBridge
   nickname only when it is the DF for this nickname. This forwarding
   rule applies in both the site to core as well as core to site
   directions. The outcome of the BGP DF election is then used to drive
   TRILL IS-IS advertisements: the MES advertises to the local TRILL
   network a Default Affinity sub-TLV, per [TRILL-MULTILEVEL], listing
   the trees for which it is the elected DF.

   Note that on the egress MES, the destination RBridge Nickname in
   multicast frames identifies the multicast tree of the remote TRILL
   network from which the frame originated. If the TRILL tree
   identifiers are not coordinated between sites, then the egress
   Nickname has no meaning in the directly attached (destination) TRILL
   network. So, the MES needs to select a new tree (after the MPLS
   disposition) based on a hash function, and rewrite the frame with
   this new destination Nickname before forwarding the traffic. This may
   be necessary in certain deployments to ensure complete decoupling
   between the TRILL sites connected to the MPLS core. On the other
   hand, if the TRILL tree identifiers are coordinated between sites,
   then the MES does not need to rewrite the destination nickname in the
   TRILL header after the MPLS disposition.

   In this approach, each MES node in a given redundancy group forwards
   and receives traffic on a disjoint set of TRILL trees. At a minimum,
   the TRILL network must have as many multicast trees as the number of
   directly attached MES nodes.

10.
Seamless Interworking with IEEE 802.1aq/802.1Qbp

                          +--------------+
                          |              |
           +---------+    |     MPLS     |    +---------+
   +----+  |         |  +----+        +----+  |         |  +----+
   |SW1 |--|         |  |MES1|        |MES2|  |         |--| SW3|
   +----+  | 802.1aq |--|    |        |    |--| 802.1aq |  +----+
   +----+  |  .1Qbp  |  +----+        +----+  |  .1Qbp  |  +----+
   |SW2 |--|         |    |   Backbone   |    |         |--| SW4|
   +----+  +---------+    +--------------+    +---------+  +----+

   |<------ IS-IS ------->|<----BGP----->|<------ IS-IS ------->|  CP

   |<--------------------------- PBB -------------------------->|  DP
                          |<----MPLS---->|

   Legend: CP = Control Plane View
           DP = Data Plane View

   Figure 7: Interconnecting 802.1aq/802.1Qbp Networks with PBB-EVPN

10.1. B-MAC Address Assignment

   For the same reasons cited in the TRILL section, the B-MAC addresses
   need to be globally unique across all the IEEE 802.1aq / 802.1Qbp
   networks. The same hierarchical address assignment scheme described
   for TRILL Nicknames is proposed for B-MAC addresses as well.

10.2. IEEE 802.1aq / 802.1Qbp B-MAC Advertisement Route

   B-MAC addresses associated with 802.1aq / 802.1Qbp switches are
   advertised using the BGP MAC Advertisement route already defined in
   [E-VPN].

   The encapsulation for the transport of PBB frames over MPLS is
   similar to that of classical Ethernet, albeit with the additional PBB
   header, as shown in the figure below:

      +------------------+
      |  IP/MPLS Header  |
      +------------------+
      |    PBB Header    |
      +------------------+
      | Ethernet Header  |
      +------------------+
      | Ethernet Payload |
      +------------------+
      |   Ethernet FCS   |
      +------------------+

        Figure 8: PBB over MPLS Encapsulation

10.3. Operation

   When an MES receives a PBB-encapsulated Ethernet frame from the
   access side, it performs a lookup on the B-MAC destination address to
   identify the next hop.
   If the lookup yields that the next hop is a remote MES, the local MES
   then encapsulates the PBB frame in MPLS. The label stack comprises
   the VPN label (advertised by the remote MES), followed by an LSP/IGP
   label. From that point onwards, regular MPLS forwarding is applied.

   On the disposition MES, assuming penultimate-hop-popping is employed,
   the MES receives the MPLS-encapsulated PBB frame with a single label:
   the VPN label. The value of the label indicates to the disposition
   MES that this is a PBB frame, so the label is popped, the TTL field
   (in the 802.1Qbp F-Tag) is reinitialized and normal PBB processing is
   employed from this point onwards.

11. Solution Advantages

   In this section, we discuss the advantages of the PBB-EVPN solution
   in the context of the requirements set forth in Section 3.

11.1. MAC Advertisement Route Scalability

   In PBB-EVPN, the number of MAC Advertisement Routes is a function of
   the number of segments (sites), rather than the number of
   hosts/servers. This is because the B-MAC addresses of the MES nodes,
   rather than the C-MAC addresses of hosts/servers, are advertised in
   BGP. As discussed above, there is a one-to-one mapping between
   multi-homed segments and B-MAC addresses, whereas there is a
   one-to-one or many-to-one mapping between single-homed segments and
   B-MAC addresses for a given MES. As a result, the volume of MAC
   Advertisement Routes in PBB-EVPN is multiple orders of magnitude
   lower than in E-VPN.

11.2. C-MAC Mobility with MAC Sub-netting

   In PBB-EVPN, if an MES allocates its B-MAC addresses from a
   contiguous range, then it can advertise a MAC prefix rather than
   individual 48-bit addresses.
   It should be noted that B-MAC addresses can easily be assigned from a
   contiguous range because MES nodes are within the provider
   administrative domain, whereas CE devices and hosts typically are
   not. The advantage of such MAC address sub-netting can be maintained
   even as C-MAC addresses move from one Ethernet segment to another.
   This is because the C-MAC address to B-MAC address association is
   learnt in the data-plane and C-MAC addresses are not advertised in
   BGP. To illustrate how this compares to E-VPN, consider the following
   example:

   If an MES running E-VPN advertises reachability for a MAC subnet that
   spans N addresses via a particular segment, and then 50% of the MAC
   addresses in that subnet move to other segments (e.g., due to virtual
   machine mobility), then in the worst case, N/2 additional MAC
   Advertisement routes need to be sent for the MAC addresses that have
   moved. This defeats the purpose of the sub-netting. With PBB-EVPN, on
   the other hand, the sub-netting applies to the B-MAC addresses, which
   are statically associated with MES nodes and are not subject to
   mobility. As C-MAC addresses move from one segment to another, the
   binding of C-MAC to B-MAC addresses is updated via data-plane
   learning.

11.3. C-MAC Address Learning and Confinement

   In PBB-EVPN, C-MAC address reachability information is built via
   data-plane learning. As such, MES nodes not participating in active
   conversations involving a particular C-MAC address will purge that
   address from their forwarding tables. Furthermore, since C-MAC
   addresses are not distributed in BGP, MES nodes will not maintain any
   record of them in the control-plane routing table.

11.4.
Seamless Interworking with TRILL and 802.1aq Access Networks

   Consider the scenario where two access networks, one running MPLS and
   the other running 802.1aq, are interconnected via an MPLS backbone
   network. The figure below shows such an example network.

                          +--------------+
                          |              |
           +---------+    |     MPLS     |    +---------+
   +----+  |         |  +----+        +----+  |         |  +----+
   | CE |--|         |  |MES1|        |MES2|  |         |--| CE |
   +----+  | 802.1aq |--|    |        |    |--|  MPLS   |  +----+
   +----+  |         |  +----+        +----+  |         |  +----+
   | CE |--|         |    |   Backbone   |    |         |--| CE |
   +----+  +---------+    +--------------+    +---------+  +----+

               Figure 9: Interoperability with 802.1aq

   If the MPLS backbone network employs E-VPN, then the 802.1aq
   data-plane encapsulation must be terminated on MES1 or the edge
   device connecting to MES1. Either way, all the MES nodes that are
   part of the associated service instances will be exposed to all the
   C-MAC addresses of all hosts/servers connected to the access
   networks. However, if the MPLS backbone network employs PBB-EVPN,
   then the 802.1aq encapsulation can be extended over the MPLS
   backbone, thereby maintaining C-MAC address transparency on MES1. If
   PBB-EVPN is also extended over the MPLS access network on the right,
   then C-MAC addresses would be transparent to MES2 as well.

   Interoperability with TRILL access networks will be described in a
   future revision of this draft.

11.5. Per Site Policy Support

   In PBB-EVPN, a unique B-MAC address can be associated with every site
   (single-homed or multi-homed). Given that the B-MAC addresses are
   sent in BGP MAC Advertisement routes, it is possible to define
   per-site (i.e., per-B-MAC) forwarding policies, including policies
   for E-TREE service.

11.6. Avoiding C-MAC Address Flushing

   With PBB-EVPN, it is possible to avoid C-MAC address flushing upon a
   topology change affecting a multi-homed device.
   To illustrate this, consider the example network of Figure 1. Both
   MES1 and MES2 advertise the same B-MAC address (BM1) to MES3. MES3
   then learns the C-MAC addresses of the servers/hosts behind CE1 via
   data-plane learning. If AC1 fails, then MES3 does not need to flush
   any of the C-MAC addresses learnt and associated with BM1. This is
   because MES1 will withdraw the MAC Advertisement routes associated
   with BM1, thereby leading MES3 to have a single adjacency (to MES2)
   for this B-MAC address. Therefore, the topology change is
   communicated to MES3 and no C-MAC address flushing is required.

12. Acknowledgements

   TBD.

13. Security Considerations

   There are no additional security aspects beyond those of VPLS/H-VPLS
   that need to be discussed here.

14. IANA Considerations

   This document requires IANA to assign a new SAFI value for L2VPN_MAC
   SAFI.

15. Intellectual Property Considerations

   This document is being submitted for use in IETF standards
   discussions.

16. Normative References

   [802.1ah]  "Virtual Bridged Local Area Networks Amendment 7: Provider
              Backbone Bridges", IEEE Std. 802.1ah-2008, August 2008.

17. Informative References

   [PBB-VPLS] Sajassi et al., "VPLS Interoperability with Provider
              Backbone Bridges", draft-ietf-l2vpn-vpls-pbb-interop-
              02.txt, work in progress, July 2011.

   [EVPN-REQ] Sajassi et al., "Requirements for Ethernet VPN (E-VPN)",
              draft-sajassi-raggarwa-l2vpn-evpn-req-01.txt, work in
              progress, July 2011.

   [E-VPN]    Aggarwal et al., "BGP MPLS Based Ethernet VPN",
              draft-ietf-l2vpn-evpn-00.txt, work in progress,
              February 2012.

   [TRILL-CMT] Senevirathne et al., "Coordinated Multicast Trees for
              TRILL", draft-tissa-trill-cmt-00.txt, work in progress,
              January 2012.

   [TRILL-MULTILEVEL] Senevirathne et al., "Default Nickname Based
              Approach for Multilevel TRILL", draft-tissa-trill-
              multilevel-00.txt, work in progress, February 2012.

18. Authors' Addresses

   Ali Sajassi
   Cisco
   170 West Tasman Drive
   San Jose, CA 95134, US
   Email: sajassi@cisco.com

   Samer Salam
   Cisco
   595 Burrard Street, Suite 2123
   Vancouver, BC V7X 1J1, Canada
   Email: ssalam@cisco.com

   Sami Boutros
   Cisco
   170 West Tasman Drive
   San Jose, CA 95134, US
   Email: sboutros@cisco.com

   Nabil Bitar
   Verizon Communications
   Email: nabil.n.bitar@verizon.com

   Aldrin Isaac
   Bloomberg
   Email: aisaac71@bloomberg.net

   Florin Balus
   Alcatel-Lucent
   701 E. Middlefield Road
   Mountain View, CA, USA 94043
   Email: florin.balus@alcatel-lucent.com

   Wim Henderickx
   Alcatel-Lucent
   Email: wim.henderickx@alcatel-lucent.be

   Clarence Filsfils
   Cisco
   Email: cfilsfil@cisco.com

   Dennis Cai
   Cisco
   Email: dcai@cisco.com

   Lizhong Jin
   ZTE Corporation
   889, Bibo Road
   Shanghai, 201203, China
   Email: lizhong.jin@zte.com.cn