L2VPN Workgroup                                              Ali Sajassi
INTERNET-DRAFT                                               Samer Salam
Intended Status: Standards Track                            Samir Thoria
                                                                   Cisco
Wim Henderickx
Jorge Rabadan                                              Yakov Rekhter
Alcatel-Lucent                                                John Drake
                                                                 Juniper
Florin Balus
Nuage Networks                                                 Lucy Yong
                                                            Linda Dunbar
Dennis Cai                                                        Huawei
Cisco

Expires: April 2, 2015                                   October 2, 2014


                Integrated Routing and Bridging in EVPN
          draft-sajassi-l2vpn-evpn-inter-subnet-forwarding-05

Abstract

   EVPN provides an extensible and flexible multi-homing VPN solution
   for intra-subnet connectivity among hosts/VMs over an MPLS/IP
   network. However, there are scenarios in which inter-subnet
   forwarding among hosts/VMs across different IP subnets is required,
   while maintaining the multi-homing capabilities of EVPN. This
   document describes an Integrated Routing and Bridging (IRB) solution
   based on EVPN to address such requirements.

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/1id-abstracts.html

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

Copyright and License Notice

   Copyright (c) 2014 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1  Introduction
   2  Inter-Subnet Forwarding Scenarios
     2.1  Switching among Subnets within a DC
     2.2  Switching among EVIs in different DCs without route
          aggregation
     2.3  Switching among EVIs in different DCs with route
          aggregation
     2.4  Switching among IP-VPN sites and EVIs with route
          aggregation
   3  Default L3 Gateway Addressing
     3.1  Homogeneous Environment
     3.2  Heterogeneous Environment
   4  Operational Models for Asymmetric Inter-Subnet Forwarding
     4.1  Among EVPN NVEs within a DC
     4.2  Among EVPN NVEs in Different DCs Without Route Aggregation
     4.3  Among EVPN NVEs in Different DCs with Route Aggregation
     4.4  Among IP-VPN Sites and EVPN NVEs with Route Aggregation
     4.5  Use of Centralized Gateway
   5  Operational Models for Symmetric Inter-Subnet Forwarding
     5.1  IRB forwarding on NVEs for Tenant Systems
       5.1.1  Control Plane Operation
       5.1.2  Data Plane Operation
       5.1.3  TS Move Operation
     5.2  IRB forwarding on NVEs for Subnets behind Tenant Systems
       5.2.1  Control Plane Operation
       5.2.2  Data Plane Operation
   6  BGP Encoding
     6.1  Router's MAC Extended Community
   7  TS Mobility
     7.1  TS Mobility & Optimum Forwarding for TS Outbound Traffic
     7.2  TS Mobility & Optimum Forwarding for TS Inbound Traffic
       7.2.1  Mobility without Route Aggregation
       7.2.2  Mobility with Route Aggregation
   8  Acknowledgements
   9  Security Considerations
   10 IANA Considerations
   11 References
     11.1  Normative References
     11.2  Informative References
   Authors' Addresses

Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

   EVI: EVPN Instance

   IRB: Integrated Routing and Bridging

   MAC-VRF: A Virtual Routing and Forwarding table for MAC addresses on
   a PE for an EVI

   IP-VRF: A Virtual Routing and Forwarding table for IP addresses on a
   PE that is associated with one or more EVIs

   IRB Interface: A virtual interface that connects the MAC-VRF and the
   IP-VRF on an NVE

   NVE: Network Virtualization Endpoint

   TS: Tenant System

1  Introduction

   EVPN provides an extensible and flexible multi-homing VPN solution
   for intra-subnet connectivity among Tenant Systems (TS's) over an
   MPLS/IP network. However, there are scenarios where, in addition to
   intra-subnet forwarding, inter-subnet forwarding is required among
   TS's across different IP subnets at the EVPN PE nodes (referred to as
   EVPN NVE nodes throughout this document), while maintaining the
   multi-homing capabilities of EVPN. This document describes an
   Integrated Routing and Bridging (IRB) solution based on EVPN to
   address such requirements.

   Inter-subnet communication is traditionally achieved at centralized
   L3 Gateway (L3GW) nodes, where all the inter-subnet communication
   policies are enforced.
   When two Tenant Systems (TS's) belonging to two different subnets
   and connected to the same PE node want to communicate, their traffic
   needs to be backhauled from the PE node all the way to the
   centralized gateway nodes, where inter-subnet switching is
   performed, and then back to the PE node. For today's large
   multi-tenant data centers, this scheme is very inefficient and
   sometimes impractical.

   To overcome the drawback of the centralized approach, IRB
   functionality is needed on the PE nodes (i.e., NVE devices), as close
   to the TS's as possible, to avoid unnecessary hair-pinning of user
   traffic. Under this design, all traffic between hosts attached to
   one NVE can be routed and bridged locally, thus avoiding the traffic
   hair-pinning issue at the centralized L3GW.

   There can be scenarios where both centralized and decentralized
   approaches are preferred simultaneously. For example, NVEs may
   switch inter-subnet traffic belonging to one tenant or one security
   zone locally, while backhauling inter-subnet traffic belonging to
   two different tenants or security zones to the centralized gateway
   nodes, where it is switched after being subjected to Firewall (FW)
   or Deep Packet Inspection (DPI).

   Some TS's run non-IP protocols in conjunction with their IP traffic.
   Therefore, it is important to handle both kinds of traffic optimally,
   e.g., to bridge non-IP traffic and to route IP traffic.

   Accordingly, the solution needs to meet the following requirements:

   R1: The solution MUST allow for inter-subnet traffic to be locally
   switched at NVEs.

   R2: The solution MUST allow for both inter-subnet and intra-subnet
   traffic belonging to the same tenant to be locally routed and bridged
   respectively. The solution MUST provide IP routing for inter-subnet
   traffic and Ethernet Bridging for intra-subnet traffic.

   R3: The solution MUST support bridging of non-IP traffic.

   R4: The solution MUST allow inter-subnet switching to be disabled on
   a per-VLAN basis on NVEs where the traffic needs to be backhauled to
   another node (e.g., for performing FW or DPI functionality).

2  Inter-Subnet Forwarding Scenarios

   The inter-subnet forwarding scenarios performed by an EVPN NVE can be
   divided into the following five categories. The last scenario, along
   with its corresponding solution, is described in
   [EVPN-IPVPN-INTEROP]. The solutions for the first four scenarios are
   the focus of this document.

   1. Switching among EVIs (subnets) within a DC

   2. Switching among EVIs (subnets) in different DCs without route
   aggregation

   3. Switching among EVIs (subnets) in different DCs with route
   aggregation

   4. Switching among IP-VPN instances and EVIs with route aggregation

   5. Switching among IP-VPN instances and EVIs without route
   aggregation

   In the above scenarios, the term "route aggregation" refers to the
   case where a node situated at the WAN edge of the data center network
   behaves as a default gateway for all the destinations that are
   outside the data center. The absence of route aggregation refers to
   the scenario where NVEs within a data center maintain individual
   (host) routes to destinations that are outside of the data center.

   In case (4), the WAN edge node also performs route aggregation for
   all the destinations within its own data center, and acts as an
   interworking unit between EVPN and IP VPN (it implements both EVPN
   and IP VPN functionality).

                                  +---+    Enterprise Site 1
                                  |PE1|----- H1
                                  +---+
                                    /
                              ,---------.          Enterprise Site 2
                            ,'           `.    +---+
              ,---------. /(    MPLS/IP    )---|PE2|----- H2
             '   DCN 3   `./ `.   Core    ,'   +---+
              `-+------+'     `-+------+'
              __/__           / /      \ \
             :NVE4 :      +---+         \ \
             '-----' ,----|GW |.         \ \
                |  ,'     +---+ `.      ,---------.
               TS6 (    DCN 1     )   ,'           `.
                    `.           ,'  (    DCN 2      )
                      `-+------+'     `.           ,'
                      __/__             `-+------+'
                     :NVE1 :            __/__   __\__
                     '-----'           :NVE2 : :NVE3 :
                      |  |             '-----' '-----'
                     TS1 TS2            |  |     |
                                       TS3 TS4  TS5

                  Figure 1: Interoperability Use-Cases

   In what follows, we will describe scenarios 1 through 4 in more
   detail.

2.1  Switching among Subnets within a DC

   In this scenario, connectivity is required between TS's in the same
   data center, where those hosts belong to different IP subnets. All
   these subnets belong to the same tenant or are part of the same IP
   VPN. Each subnet is associated with a single EVPN instance (EVI)
   realized by a collection of MAC-VRFs (one per NVE) residing on the
   NVEs configured for that EVI.

   As an example, consider TS3 and TS5 of Figure 1 above. Assume that
   connectivity is required between these two TS's, where TS3 belongs to
   IP subnet 3 (SN3) and TS5 belongs to IP subnet 5 (SN5). Both SN3 and
   SN5 belong to the same tenant (e.g., are part of the same IP VPN).
   NVE2 has an EVI3 associated with SN3, and this EVI is represented by
   a MAC-VRF which is associated with an IP-VRF (for that IP VPN) via an
   IRB interface. Similarly, NVE3 has an EVI5 associated with SN5, and
   this EVI is represented by a MAC-VRF which is associated with an
   IP-VRF (for the same IP VPN) via an IRB interface.

2.2  Switching among EVIs in different DCs without route aggregation

   This case is similar to that of Section 2.1 above, except that the
   TS's belong to different data centers that are interconnected over a
   WAN (e.g., an MPLS/IP PSN). The data centers in question here are
   seamlessly interconnected to the WAN, i.e., the WAN edge devices do
   not maintain any TS-specific addresses in the forwarding path (e.g.,
   there are no WAN edge GWs between these DCs).

   As an example, consider TS3 and TS6 of Figure 1 above. Assume that
   connectivity is required between these two TS's, where TS3 belongs to
   SN3 and TS6 belongs to SN6. NVE2 has an EVI3 associated with SN3, and
   NVE4 has an EVI6 associated with SN6. Both SN3 and SN6 are part of
   the same IP VPN.

2.3  Switching among EVIs in different DCs with route aggregation

   In this scenario, connectivity is required between TS's in different
   data centers, and those hosts belong to different IP subnets. What
   makes this case different from that of Section 2.2 is that (in the
   context of a given IP-VRF) at least one of the data centers in
   question has a gateway as the WAN edge switch. Because of that, the
   NVEs' IP-VRFs within each data center need not maintain (host) routes
   to individual TS's outside of the data center.

   As an example, consider TS1 and TS5 of Figure 1 above. Assume that
   connectivity is required between these two TS's, where TS1 belongs to
   SN1 and TS5 belongs to SN5, and both SN1 and SN5 belong to the same
   IP VPN. NVE3 has an EVI5 associated with SN5, and this EVI is
   represented by the MAC-VRF which is connected to the IP-VRF via an
   IRB interface. NVE1 has an EVI1 associated with SN1, and this EVI is
   represented by the MAC-VRF which is connected to the IP-VRF
   representing the same IP VPN. Due to the gateway at the edge of DCN
   1, NVE1's IP-VRF does not need to have the address of TS5; instead,
   it has a default route with the GW as the next hop.

2.4  Switching among IP-VPN sites and EVIs with route aggregation

   In this scenario, connectivity is required between TS's in a data
   center and hosts in an enterprise site that belongs to a given
   IP-VPN. The NVE within the data center is an EVPN NVE, whereas the
   enterprise site has an IP-VPN PE. Furthermore, the data center in
   question has a gateway as the WAN edge switch. Because of that, the
   NVE in the data center does not need to maintain individual IP
   prefixes advertised by enterprise sites (by IP-VPN PEs).

   As an example, consider end-station H1 and TS2 of Figure 1. Assume
   that connectivity is required between the end-station and the TS,
   where TS2 belongs to SN2, which is realized using EVPN, whereas H1
   belongs to an IP VPN site connected to PE1 (PE1 maintains an IP-VRF
   associated with that IP VPN). NVE1 has an EVI2 associated with SN2.
   Moreover, EVI2 on NVE1 is connected to an IP-VRF associated with
   that IP VPN. PE1 originates a VPN-IP route that covers H1. The
   gateway at the edge of DCN 1 performs the interworking function
   between IP-VPN and EVPN. As a result, a default route in the IP-VRF
   on NVE1, pointing to the gateway as the next hop, and a route to TS2
   (or maybe SN2) in PE1's IP-VRF are sufficient for the connectivity
   between H1 and TS2. In this scenario, NVE1's IP-VRF does not need to
   maintain a route to H1 because it has the default route to the
   gateway.

3  Default L3 Gateway Addressing

3.1  Homogeneous Environment

   This is an environment where all NVEs to which an EVPN instance could
   potentially be attached (or moved) perform inter-subnet switching.
   Therefore, inter-subnet traffic can be locally switched by the EVPN
   NVE connecting the TS's belonging to different subnets.

   To support such inter-subnet forwarding, the NVE behaves as an IP
   Default Gateway from the perspective of the attached TS's. Two models
   are possible:

   1. All the NVEs of a given EVPN instance use the same anycast default
   gateway IP address and the same anycast default gateway MAC address.
   On each NVE, these default gateway IP and MAC addresses correspond to
   the IRB interface of the MAC-VRF associated with that EVI.

   2. Each NVE of a given EVPN instance uses its own default gateway IP
   and MAC addresses, and these addresses are aliased to the same
   conceptual gateway through the use of the Default Gateway extended
   community as specified in [EVPN], which is carried in the EVPN MAC
   Advertisement routes. On each NVE, these default gateway IP and MAC
   addresses correspond to the IRB interface of the MAC-VRF associated
   with that EVI.

   Both of these models enable a packet forwarding paradigm for
   asymmetric IRB forwarding, where a packet can bypass the IP-VRF
   processing on the egress (i.e., disposition) NVE. The egress NVE
   merely needs to perform a lookup in the associated MAC-VRF and
   forward the Ethernet frames unmodified, i.e., without rewriting the
   source MAC address. This is different from symmetric IRB forwarding,
   where a packet is forwarded through the MAC-VRF followed by the
   IP-VRF on the ingress NVE, and then forwarded through the IP-VRF
   followed by the MAC-VRF on the egress NVE.

   It is worth noting that if the applications that are running on the
   TS's are employing or relying on any form of MAC security, then the
   first model (i.e., using anycast addresses) would be required to
   ensure that the applications receive traffic from the same source MAC
   address that they are sending to.

3.2  Heterogeneous Environment

   In large data centers with thousands of servers and ToR (or access)
   switches, some of the switches may not have the capability of
   maintaining or enforcing policies for inter-subnet switching. Even
   though policies among multiple subnets belonging to the same tenant
   can be simpler, hosts belonging to one tenant can also send traffic
   to peers belonging to different tenants or security zones. An L3GW
   not only needs to enforce policies for communication among subnets
   belonging to a single tenant, but also needs to know how to handle
   traffic destined towards peers in different tenants.
   Therefore, there can be a mixed environment where an NVE performs
   inter-subnet switching for some EVPN instances but not others.

4  Operational Models for Asymmetric Inter-Subnet Forwarding

4.1  Among EVPN NVEs within a DC

   When an EVPN MAC advertisement route is received by the NVE, the IP
   address associated with the route is used to populate the IP-VRF
   table, whereas the MAC address associated with the route is used to
   populate both the MAC-VRF table and the adjacency associated with the
   IP route in the IP-VRF table.

   When an Ethernet frame is received by an ingress NVE, it performs a
   lookup on the destination MAC address in the associated MAC-VRF for
   that EVI. If the MAC address corresponds to its IRB Interface MAC
   address, the ingress NVE deduces that the packet MUST be inter-subnet
   routed. Hence, the ingress NVE performs an IP lookup in the
   associated IP-VRF table. The lookup identifies both the next-hop
   (i.e., egress) NVE to which the packet must be forwarded and an
   adjacency that contains a MAC rewrite and an MPLS label stack. The
   MAC rewrite holds the MAC address associated with the destination
   host (as populated by the EVPN MAC route), instead of the MAC address
   of the next-hop NVE. The ingress NVE then rewrites the destination
   MAC address in the packet with the address specified in the
   adjacency. It also rewrites the source MAC address with its IRB
   Interface MAC address. The ingress NVE then forwards the frame to
   the next-hop (i.e., egress) NVE after encapsulating it with the MPLS
   label stack. Note that this label stack includes the LSP label as
   well as the EVPN label that was advertised by the egress NVE. When
   the MPLS encapsulated packet is received by the egress NVE, it uses
   the EVPN label to identify the MAC-VRF table.
   It then performs a MAC lookup in that table, which yields the
   outbound interface to which the Ethernet frame must be forwarded.
   Figure 2 below depicts the packet flow, where NVE1 and NVE2 are the
   ingress and egress NVEs, respectively.

           NVE1                     NVE2
      +------------+           +------------+
      |            |           |            |
      |(MAC - (IP  |           |(IP  - (MAC |
      | VRF)   VRF)|           | VRF)   VRF)|
      |  |      |  |           |  |      |  |
      +------------+           +------------+
         ^      v                 ^      V
         |      |                 |      |
   TS1->-+      +-->--------------+      +->-TS2

      Figure 2: Inter-Subnet Forwarding Among EVPN NVEs within a DC

   Note that the forwarding behavior on the egress NVE is similar to
   EVPN intra-subnet forwarding. In other words, all the packet
   processing associated with the inter-subnet forwarding semantics is
   confined to the ingress NVE, which is why it is called Asymmetric
   IRB.

   It should also be noted that [EVPN] provides different levels of
   granularity for the EVPN label. Besides identifying the bridge domain
   table, it can be used to identify the egress interface or a
   destination MAC address on that interface. If the EVPN label is used
   for egress interface or destination MAC address identification, then
   no MAC lookup is needed in the egress EVI and the packet can be
   directly forwarded to the egress interface based on the EVPN label
   lookup alone.

4.2  Among EVPN NVEs in Different DCs Without Route Aggregation

   When an EVPN MAC advertisement route is received by the NVE, the IP
   address associated with the route is used to populate the IP-VRF
   table, whereas the MAC address associated with the route is used to
   populate both the MAC-VRF table and the adjacency associated with the
   IP route in the IP-VRF table.

   When an Ethernet frame is received by an ingress NVE, it performs a
   lookup on the destination MAC address in the associated EVI.
   If the MAC address corresponds to its IRB Interface MAC address, the
   ingress NVE deduces that the packet MUST be inter-subnet routed.
   Hence, the ingress NVE performs an IP lookup in the associated IP-VRF
   table. The lookup identifies both the next-hop (i.e., egress) Gateway
   to which the packet must be forwarded and an adjacency that contains
   a MAC rewrite and an MPLS label stack. The MAC rewrite holds the MAC
   address associated with the destination host (as populated by the
   EVPN MAC route), instead of the MAC address of the next-hop Gateway.
   The ingress NVE then rewrites the destination MAC address in the
   packet with the address specified in the adjacency. It also rewrites
   the source MAC address with its IRB Interface MAC address. The
   ingress NVE then forwards the frame to the next-hop (i.e., egress)
   Gateway after encapsulating it with the MPLS label stack.

   Note that this label stack includes the LSP label as well as an EVPN
   label. The EVPN label could be either advertised by the ingress
   Gateway, if inter-AS option B is used, or advertised by the egress
   NVE, if inter-AS option C is used. When the MPLS encapsulated packet
   is received by the ingress Gateway, the processing again differs
   depending on whether inter-AS option B or option C is employed. In
   the former case, the ingress Gateway swaps the EVPN label in the
   packet with the EVPN label value received from the egress Gateway.
   In the latter case, the ingress Gateway does not modify the EVPN
   label and performs normal label switching on the LSP label.
   Similarly, on the egress Gateway, for option B, the egress Gateway
   swaps the EVPN label with the value advertised by the egress NVE,
   whereas, for option C, the egress Gateway does not modify the EVPN
   label and performs normal label switching on the LSP label.
   When the MPLS encapsulated packet is received by the egress NVE, it
   uses the EVPN label to identify the bridge-domain table. It then
   performs a MAC lookup in that table, which yields the outbound
   interface to which the Ethernet frame must be forwarded. Figure 3
   below depicts the packet flow.

          NVE1            GW1             GW2             NVE2
     +------------+  +------------+  +------------+  +------------+
     |            |  |            |  |            |  |            |
     |(MAC - (IP  |  |    [LS]    |  |    [LS]    |  |(IP  - (MAC |
     | VRF)   VRF)|  |            |  |            |  | VRF)   VRF)|
     |  |      |  |  |  |      |  |  |  |      |  |  |  |      |  |
     +------------+  +------------+  +------------+  +------------+
        ^      v        ^      V        ^      V        ^      V
        |      |        |      |        |      |        |      |
  TS1->-+      +-->-----+      +--------+      +--------+      +->-TS2

   Figure 3: Inter-Subnet Forwarding Among EVPN NVEs in Different DCs
                     without Route Aggregation

4.3  Among EVPN NVEs in Different DCs with Route Aggregation

   In this scenario, the NVEs within a given data center do not have
   entries for the MAC/IP addresses of hosts in remote data centers.
   Rather, the NVEs have a default IP route pointing to the WAN gateway
   for each VRF. This is accomplished by the WAN gateway advertising,
   for a given EVPN that spans multiple DCs, a default VPN-IP route that
   is imported by the NVEs of that EVPN that are in the gateway's own
   DC.

   When an Ethernet frame is received by an ingress NVE, it performs a
   lookup on the destination MAC address in the associated MAC-VRF
   table. If the MAC address corresponds to the IRB Interface MAC
   address, the ingress NVE deduces that the packet MUST be inter-subnet
   routed. Hence, the ingress NVE performs an IP lookup in the
   associated IP-VRF table. The lookup, in this case, matches the
   default route which points to the local WAN gateway. The ingress NVE
   then rewrites the destination MAC address in the packet with the IRB
   Interface MAC address of the local WAN gateway.
   It also rewrites the source MAC address with its own IRB Interface
   MAC address. The ingress NVE then forwards the frame to the WAN
   gateway after encapsulating it with the MPLS label stack. Note that
   this label stack includes the LSP label as well as the IP-VPN label
   that was advertised by the local WAN gateway. When the MPLS
   encapsulated packet is received by the local WAN gateway, it uses the
   IP-VPN label to identify the IP-VRF table. It then performs an IP
   lookup in that table. The lookup identifies both the remote WAN
   gateway (of the remote data center) to which the packet must be
   forwarded and an adjacency that contains a MAC rewrite and an MPLS
   label stack. The MAC rewrite holds the MAC address associated with
   the ultimate destination host (as populated by the EVPN MAC route).
   The local WAN gateway then rewrites the destination MAC address in
   the packet with the address specified in the adjacency. It also
   rewrites the source MAC address with its IRB Interface MAC address.

   The local WAN gateway then forwards the frame to the remote WAN
   gateway after encapsulating it with the MPLS label stack. Note that
   this label stack includes the LSP label as well as an EVPN label that
   was advertised by the remote WAN gateway. When the MPLS encapsulated
   packet is received by the remote WAN gateway, it simply swaps the
   EVPN label and forwards the packet to the egress NVE. This implies
   that GW1 (the local WAN gateway in Figure 4) needs to keep the remote
   host MAC addresses along with the corresponding EVPN labels in the
   adjacency entries of the IP-VRF table. The remote WAN gateway then
   forwards the packet to the egress NVE. The egress NVE then performs a
   MAC lookup in the MAC-VRF (identified by the received EVPN label) to
   determine the outbound port on which to send the traffic.

   Figure 4 below depicts the forwarding model.

          NVE1            GW1             GW2             NVE2
     +------------+  +------------+  +------------+  +------------+
     |            |  |            |  |            |  |            |
     |(MAC - (IP  |  |(IP  - (MAC |  |    [LS]    |  |(IP  - (MAC |
     | VRF)   VRF)|  | VRF)   VRF)|  |  |      |  |  | VRF)   VRF)|
     |  |      |  |  |  |      |  |  |  |      |  |  |  |      |  |
     +------------+  +------------+  +------------+  +------------+
        ^      v        ^      V        ^      V        ^      V
        |      |        |      |        |      |        |      |
  TS1->-+      +-->-----+      +--------+      +--------+      +->-TS2

   Figure 4: Inter-Subnet Forwarding Among EVPN NVEs in Different DCs
                       with Route Aggregation

4.4  Among IP-VPN Sites and EVPN NVEs with Route Aggregation

   In this scenario, the NVEs within a given data center do not have
   entries for the IP addresses of hosts in remote enterprise sites.
   Rather, the NVEs have a default IP route pointing to the WAN gateway
   for each IP-VRF.

   When an Ethernet frame is received by an ingress NVE, it performs a
   lookup on the destination MAC address in the associated MAC-VRF
   table. If the MAC address corresponds to the IRB Interface MAC
   address, the ingress NVE deduces that the packet MUST be inter-subnet
   routed. Hence, the ingress NVE performs an IP lookup in the
   associated IP-VRF table. The lookup, in this case, matches the
   default route which points to the local WAN gateway. The ingress NVE
   then rewrites the destination MAC address in the packet with the IRB
   Interface MAC address of the local WAN gateway. It also rewrites the
   source MAC address with its own IRB Interface MAC address. The
   ingress NVE then forwards the frame to the local WAN gateway after
   encapsulating it with the MPLS label stack. Note that this label
   stack includes the LSP label as well as the IP-VPN label that was
   advertised by the local WAN gateway. When the MPLS encapsulated
   packet is received by the local WAN gateway, it uses the IP-VPN label
   to identify the VRF table. It then performs an IP lookup in that
   table.
The lookup identifies the next hop ASBR to which the packet
594 must be forwarded. The local gateway in this case strips the Ethernet
595 encapsulation, performs an IP lookup in its IP-VRF, and forwards the
596 IP packet to the ASBR using a label stack comprising an LSP label
597 and an IP-VPN label that was advertised by the ASBR. When the MPLS
598 encapsulated packet is received by the ASBR, it simply swaps the IP-
599 VPN label with the one advertised by the egress PE. This implies that
600 the remote WAN gateway must allocate the VPN label at least at the
601 granularity of a (VRF, egress PE) tuple. The ASBR then forwards the
602 packet to the egress PE. The egress PE then performs an IP lookup in
603 the IP-VRF (identified by the received IP-VPN label) to determine
604 where to forward the traffic.

606 Figure 5 below depicts the forwarding model.

608         NVE1            GW1             ASBR            NVE2
609    +------------+  +------------+  +------------+  +------------+
610    |            |  |            |  |            |  |            |
611    |(MAC - (IP  |  |(IP - (MAC  |  |    [LS]    |  |       (IP  |
612    | VRF)   VRF)|  | VRF)   VRF)|  |    |  |    |  |        VRF)|
613    |  |     |   |  |  |     |   |  |    |  |    |  |  |     |   |
614    +------------+  +------------+  +------------+  +------------+
615       ^     v         ^     V           ^  V          ^     V
616       |     |         |     |           |  |          |     |
617 TS1->-+     +-->------+     +-----------+  +----------+     +->-H1

619 Figure 5: Inter-Subnet Forwarding Among IP-VPN Sites and EVPN NVEs
620 with Route Aggregation

622 4.5 Use of Centralized Gateway

624 In this scenario, the NVEs within a given data center need to forward
625 traffic in L2 to a centralized L3GW for one of two reasons: a) they
626 do not have IRB capabilities, or b) they do not have the required
627 policy for switching traffic between different tenants or security
628 zones. The centralized L3GW performs both the IRB function for
629 switching traffic among different EVPN instances and the interworking
630 function when the traffic needs to be switched between IP-VPN sites
631 and EVPN instances.
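
The aggregated-route cases in sections 4.3 and 4.4 rely on an ordinary longest-prefix match in the IP-VRF, with the default route catching every destination for which the NVE holds no specific entry. The Python sketch below illustrates that behavior under stated assumptions: the prefixes, the next-hop names, and the function ip_vrf_lookup are hypothetical and are not defined by this document.

```python
import ipaddress


def ip_vrf_lookup(ip_vrf, destination):
    """Return the next hop of the longest matching prefix, or None."""
    addr = ipaddress.ip_address(destination)
    best_net, best_hop = None, None
    for prefix, next_hop in ip_vrf.items():
        net = ipaddress.ip_network(prefix)
        # Skip prefixes of the other address family or that do not match.
        if net.version != addr.version or addr not in net:
            continue
        if best_net is None or net.prefixlen > best_net.prefixlen:
            best_net, best_hop = net, next_hop
    return best_hop


# Hypothetical IP-VRF on an ingress NVE: one locally attached subnet plus
# the default route that points to the local WAN gateway.
nve1_ip_vrf = {
    "10.1.1.0/24": "local-irb",
    "0.0.0.0/0": "local-wan-gateway",
}

print(ip_vrf_lookup(nve1_ip_vrf, "10.1.1.7"))      # local subnet entry
print(ip_vrf_lookup(nve1_ip_vrf, "198.51.100.9"))  # falls to the default route
```

With route aggregation the NVE never needs a host entry for the remote destination; the gateway that owns the default route performs the second, more specific lookup.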
633 5 Operational Models for Symmetric Inter-Subnet Forwarding

635 The following sections describe the main symmetric IRB forwarding
636 scenarios.

638 5.1 IRB forwarding on NVEs for Tenant Systems

640 This section covers the symmetric IRB procedures for the scenario
641 where each Tenant System (TS) is attached to one or more NVEs and its
642 host IP and MAC addresses are learned by the attached NVEs and are
643 distributed to all other NVEs that are interested in participating in
644 both intra-subnet and inter-subnet communications with that TS.

646 In this scenario, for a given tenant (e.g., an IP-VPN service), an
647 NVE has one MAC-VRF for each of the tenant's subnets (VLANs) that it
648 is configured for. Assuming VLAN-based service, which is typically
649 the case for VxLAN and NVGRE encapsulation, each MAC-VRF consists of a
650 single bridge domain. In the case of MPLS encapsulation with
651 VLAN-aware bundling, each MAC-VRF consists of multiple bridge domains
652 (one bridge domain per VLAN). The MAC-VRFs on an NVE for a given tenant
653 are associated with an IP-VRF corresponding to that tenant (or IP-VPN
654 service) via their IRB interfaces.

656 Each NVE MUST support QoS, Security, and OAM policies per IP-VRF
657 to/from the core network. This is not to be confused with the QoS,
658 Security, and OAM policies per Attachment Circuit (AC) to/from the
659 Tenant Systems. How this requirement is met is an implementation
660 choice and is outside the scope of this document.

662 Since VxLAN and NVGRE encapsulations require an inner Ethernet header
663 (inner MAC SA/DA), and since the TS MAC address cannot be used for
664 inter-subnet traffic, the ingress NVE's MAC address is used as the
665 inner MAC SA. The NVE's MAC address is the device MAC address and is
666 common across all MAC-VRFs and IP-VRFs. This MAC address is advertised
667 using the new EVPN Router's MAC Extended Community (section 6.1).
669 Figure 6 below illustrates this scenario, where a given tenant (e.g.,
670 an IP-VPN service) has three subnets represented by MAC-VRF1, MAC-VRF2,
671 and MAC-VRF3 across two NVEs. There are five TS's that are associated
672 with these three MAC-VRFs - i.e., TS1 and TS5 are associated with MAC-
673 VRF1 on NVE1, TS4 is associated with MAC-VRF1 on NVE2, TS2 is
674 associated with MAC-VRF2 on NVE1, and TS3 is associated with MAC-VRF3
675 on NVE2. MAC-VRF1 and MAC-VRF2 on NVE1 are in turn associated with
676 IP-VRF1 on NVE1, and MAC-VRF1 and MAC-VRF3 on NVE2 are associated with
677 IP-VRF1 on NVE2. When TS1, TS5, and TS4 exchange traffic with each
678 other, only the L2 forwarding (bridging) part of the IRB solution is
679 exercised because all these TS's sit on the same subnet. However,
680 when TS1 wants to exchange traffic with TS2 or TS3, which belong to
681 different subnets, both the bridging and routing parts of the IRB
682 solution are exercised. The following subsections describe the
683 control and data plane operations for this IRB scenario in detail.
685                 NVE1          +---------+
686            +-------------+    |         |
687    TS1-----|         MACx|    |         |         NVE2
688   (IP1/M1) |(MAC-        |    |         |   +-------------+
689    TS5-----| VRF1)\      |    |  MPLS/  |   |MACy  (MAC-  |-----TS3
690   (IP5/M5) |       \     |    |  VxLAN/ |   |     / VRF3) | (IP3/M3)
691            |    (IP-VRF1)|----|  NVGRE  |---|(IP-VRF1)    |
692            |       /     |    |         |   |     \       |
693    TS2-----|(MAC- /      |    |         |   |      (MAC-  |-----TS4
694   (IP2/M2) | VRF2)       |    |         |   |       VRF1) | (IP4/M4)
695            +-------------+    |         |   +-------------+
696                               |         |
697                               +---------+

699 Figure 6: IRB forwarding on NVEs without core-facing IRB Interface

701 5.1.1 Control Plane Operation

703 Each NVE advertises a Route Type-2 (RT-2, MAC/IP Advertisement Route)
704 for each of its TS's with the following fields set:

706 - RD and ESI per [EVPN]
707 - Ethernet Tag = 0; assuming VLAN-based service
708 - MAC Address Length = 48
709 - MAC Address = Mi ; where i = 1,2,3,4, or 5 in the above example
710 - IP Address Length = 32 or 128
711 - IP Address = IPi ; where i = 1,2,3,4, or 5 in the above example
712 - Label-1 = MPLS Label or VNID corresponding to MAC-VRF
713 - Label-2 = MPLS Label or VNID corresponding to IP-VRF

715 Each NVE advertises an RT-2 route with two Route Targets (one
716 corresponding to its MAC-VRF and the other corresponding to its IP-
717 VRF). Furthermore, the RT-2 is advertised with two BGP Extended
718 Communities. The first BGP Extended Community identifies the tunnel
719 type per section 4.5 of [RFC5512], and the second BGP Extended
720 Community includes the MAC address of the NVE (e.g., MACx for NVE1 or
721 MACy for NVE2) and is defined in section 6.1. This second Extended
722 Community (for the MAC address of NVE) is only required when the
723 tunnel encapsulation is of type VxLAN or NVGRE where an inner MAC
724 address is needed.

726 Upon receiving this advertisement, the receiving NVE performs the
727 following:

729 - It uses Route Targets corresponding to its MAC-VRF and IP-VRF for
730 identifying these tables and subsequently importing this route into
731 them.
733 - It imports the MAC address into the MAC-VRF with BGP Next Hop
734 address as underlay tunnel destination address (e.g., VTEP DA for
735 VxLAN encapsulation) and Label-1 as VNID for VxLAN encapsulation or
736 EVPN label for MPLS encapsulation.

738 - If the route carries the new Router's MAC Extended Community, and
739 if the receiving NVE is going to use VxLAN encapsulation, then the
740 receiving NVE imports the IP address into IP-VRF with NVE's MAC
741 address (from the new Router's MAC Extended Community) as inner MAC
742 DA, BGP Next Hop address as underlay tunnel destination address
743 (VTEP DA for VxLAN encapsulation), and Label-2 as IP-VPN VNID for
744 VxLAN encapsulation.

746 - If the receiving NVE is going to use MPLS encapsulation, then the
747 receiving NVE imports the IP address into IP-VRF with BGP Next Hop
748 address as underlay tunnel destination address, and Label-2 as IP-VPN
749 label for MPLS encapsulation.

751 If the receiving NVE receives an RT-2 with only a single Route Target
752 corresponding to IP-VRF and Label-1, then it must discard this route
753 and log an error. If the receiving NVE receives an RT-2 with only a
754 single Route Target corresponding to MAC-VRF but with both Label-1
755 and Label-2, then it must discard this route and log an error. If the
756 receiving NVE receives an RT-2 with MAC Address Length of zero, then
757 it must discard this route and log an error.

759 5.1.2 Data Plane Operation

761 The following description of the data-plane operation describes just
762 the logical functions and the actual implementation may differ. Let's
763 consider data-plane operation when TS1 in subnet-1 (MAC-VRF1) on NVE1
764 wants to send traffic to TS3 in subnet-3 (MAC-VRF3) on NVE2.

766 - TS1 sends a packet with MAC DA corresponding to the MAC-VRF1 IRB
767 interface on NVE1 (the interface between MAC-VRF1 and IP-VRF1), and
768 VLAN-tag corresponding to MAC-VRF1.
770 - Upon receiving the packet, NVE1 uses the VLAN-tag to identify
771 MAC-VRF1. It then looks up the MAC DA and forwards the frame to its
772 IRB interface.

774 - The Ethernet header of the packet is stripped and the packet is
775 fed to the IP-VRF (iVRF) where an IP lookup is performed on the
776 destination address. This lookup yields a MAC address to be used as
777 inner MAC DA for VxLAN/NVGRE encapsulation, an IP address to be used
778 as VTEP DA for VxLAN encap or tunnel label for MPLS encap, and a VPN-
779 ID to be used as VNID for VxLAN encap or VPN label for MPLS encap.

781 - The packet is then encapsulated with the proper header based on
782 the above info. The inner MAC SA and VTEP SA are set to the NVE's MAC
783 and IP addresses, respectively. The packet is then forwarded to the
784 egress NVE.

786 - On the egress NVE, if the packet is VxLAN encapsulated, the VxLAN
787 header is removed. Since the inner MAC DA is the egress NVE's MAC
788 address, the egress NVE knows that it needs to perform an IP lookup.
789 It uses the VNID to identify the IP-VRF (iVRF) table and then performs
790 an IP lookup which results in the destination TS (TS3) MAC address and
791 the access-facing IRB interface over which the packet is sent.

793 - The IP packet is encapsulated with an Ethernet header with MAC SA
794 set to NVE2's MAC address (MACy) and MAC DA set to the
795 destination TS (TS3) MAC address. The packet is sent to the
796 corresponding MAC-VRF3 and, after a lookup of MAC DA, is forwarded to
797 the destination TS (TS3) over the corresponding interface.

799 In this scenario, inter-subnet forwarding traffic between NVEs will
800 always use the IP-VRF VNID/MPLS label, even if the IP DA belongs to a
801 subnet defined in both NVEs. For instance, traffic from TS2 to TS4
802 will be encapsulated by NVE1 using NVE2's IP-VRF VNID/MPLS label as
803 opposed to the MAC-VRF1 VNID/MPLS label, as long as TS4's host IP is
804 present in NVE1's IP-VRF.
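
The data-plane steps of section 5.1.2 can be sketched for the VxLAN case as two logical functions, one per NVE. This is a minimal Python illustration, not an implementation: the class and function names, the underlay addresses 192.0.2.1 and 192.0.2.2, the host address 10.3.0.3, and the VNID 5000 are all hypothetical, and the tables are plain dictionaries keyed by host IP address.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RemoteHostRoute:
    """IP-VRF entry installed from a received RT-2 (VxLAN case)."""
    router_mac: str  # egress NVE's MAC, from the Router's MAC Extended Community
    vtep_ip: str     # BGP next hop, used as VTEP DA
    vnid: int        # Label-2, the VNID of the IP-VRF


@dataclass(frozen=True)
class VxlanPacket:
    vtep_sa: str
    vtep_da: str
    vnid: int
    inner_mac_sa: str
    inner_mac_da: str
    ip_da: str


def ingress_route(ip_vrf, nve_mac, nve_ip, ip_da):
    """Ingress NVE: IP lookup in the IP-VRF, then VxLAN encapsulation."""
    route = ip_vrf[ip_da]  # host-route lookup; raises KeyError if absent
    return VxlanPacket(vtep_sa=nve_ip, vtep_da=route.vtep_ip, vnid=route.vnid,
                       inner_mac_sa=nve_mac, inner_mac_da=route.router_mac,
                       ip_da=ip_da)


def egress_route(packet, nve_mac, ip_vrf_by_vnid):
    """Egress NVE: select the IP-VRF by VNID, then rebuild the Ethernet header."""
    if packet.inner_mac_da != nve_mac:
        raise ValueError("inner MAC DA is not this NVE's MAC address")
    ts_mac, mac_vrf = ip_vrf_by_vnid[packet.vnid][packet.ip_da]
    return {"mac_sa": nve_mac, "mac_da": ts_mac, "mac_vrf": mac_vrf}


# Hypothetical state for NVE1, NVE2 and TS3 of Figure 6.
nve1_ip_vrf = {"10.3.0.3": RemoteHostRoute("MACy", "192.0.2.2", 5000)}
nve2_ip_vrfs = {5000: {"10.3.0.3": ("M3", "MAC-VRF3")}}

packet = ingress_route(nve1_ip_vrf, "MACx", "192.0.2.1", "10.3.0.3")
frame = egress_route(packet, "MACy", nve2_ip_vrfs)
print(packet)
print(frame)
```

Both NVEs perform an IP lookup, which is what makes the model symmetric: the ingress lookup selects the tunnel, and the egress lookup selects the destination TS and its MAC-VRF.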
806 5.1.3 TS Move Operation

808 When a TS moves from one NVE to another, it is important that the MAC
809 mobility procedures are properly executed and the corresponding MAC-
810 VRF and IP-VRF tables on all participating NVEs are updated. [EVPN]
811 describes the MAC mobility procedures for L2-only services for both
812 single-homed TS and All-Active multi-homed TS. This section
813 describes the incremental procedures and BGP Extended Communities
814 needed to handle the MAC mobility for a mix of L2 and L3 services,
815 known as Integrated Routing and Bridging (IRB). In order to place
816 the emphasis on the differences between L2-only and L2-and-L3 use
817 cases, the incremental procedure is described for single-homed TS
818 with the expectation that the reader can easily extrapolate multi-
819 homed TS based on the procedures described in section 15 of [EVPN].

821 Let's consider TS1 in Figure 6 above, where it moves from NVE1 to NVE2.

823 In such a move, NVE2 discovers IP1/MAC1 of TS1, realizes that it is
824 a MAC move, and advertises a MAC/IP route per section 5.1.1 above
825 with MAC Mobility Extended Community. In this IRB use case, both MAC
826 and IP addresses of the TS are included in the EVPN MAC/IP
827 Advertisement route, as opposed to the L2-only use case where only the
828 MAC address of the TS is included. Furthermore, besides MAC mobility
829 Extended Community and Route Target corresponding to the MAC-VRF, the
830 following additional BGP Extended Communities are advertised along
831 with the MAC/IP Advertisement route:

833 - Route Target associated with IP-VRF
834 - Router's MAC Extended Community
835 - Tunnel Type Extended Community

837 Since NVE2 learns TS1's MAC/IP addresses locally, it updates its MAC-
838 VRF1 and IP-VRF1 for TS1 with its local interface.
840 If the local learning at NVE1 is performed using control or
841 management planes, then these interactions serve as the trigger for
842 NVE1 to withdraw the MAC/IP addresses associated with TS1. However,
843 if the local learning at NVE1 is performed using data-plane learning,
844 then the reception of the MAC/IP Advertisement route for TS1 with MAC
845 Mobility extended community serves as the trigger for NVE1 to withdraw
846 the MAC/IP addresses associated with TS1.

848 All other remote NVE devices, upon receiving the MAC/IP advertisement
849 route for TS1 from NVE2 with MAC Mobility extended community, compare
850 the sequence number in this advertisement with the one previously
851 received. If the new sequence number is greater than the old one,
852 then they update the MAC/IP addresses of TS1 in their corresponding
853 MAC-VRFs and IP-VRFs to point to NVE2. Furthermore, upon receiving
854 the MAC/IP withdraw for TS1 from NVE1, these remote NVEs clean up
855 their BGP tables.

857 5.2 IRB forwarding on NVEs for Subnets behind Tenant Systems

859 This section covers the symmetric IRB procedures for the scenario
860 where some Tenant Systems (TS's) support one or more subnets and
861 these TS's are associated with one or more NVEs. Therefore, besides
862 the advertisement of MAC/IP addresses for each TS, which can be in the
863 presence of All-Active multi-homing, the associated NVE also needs to
864 advertise the subnets behind each TS.

866 The main difference between this scenario and the previous one is the
867 additional advertisement corresponding to each subnet. These subnet
868 advertisements are accomplished using the EVPN IP Prefix route defined
869 in [EVPN-PREFIX]. These subnet prefixes are advertised with the IP
870 address of their associated TS (which is in overlay address space) as
871 their next hop.
The receiving NVEs perform recursive route resolution
872 to resolve the subnet prefix to its associated ingress NVE so that
873 they know which NVE to forward the packets to when they are destined
874 for that subnet prefix.

876 The advantage of this recursive route resolution is that when a TS
877 moves from one NVE to another, there is no need to re-advertise any
878 of the subnet prefixes for that TS. All that is needed is to advertise
879 the IP/MAC addresses associated with the TS itself and exercise MAC
880 mobility procedures for that TS. The recursive route resolution
881 automatically takes care of the updates for the subnet prefixes of
882 that TS.

884 Figure 7 below illustrates this scenario, where a given tenant (e.g.,
885 an IP-VPN service) has three subnets represented by MAC-VRF1, MAC-VRF2,
886 and MAC-VRF3 across two NVEs. There are four TS's associated with
887 these three MAC-VRFs - i.e., TS1 is connected to MAC-VRF1 on
888 NVE1, TS2 is connected to MAC-VRF2 on NVE1, TS3 is connected to MAC-
889 VRF3 on NVE2, and TS4 is connected to MAC-VRF1 on NVE2. TS1 has two
890 subnet prefixes (SN1 and SN2) and TS3 has a single subnet prefix,
891 SN3. The MAC-VRFs on each NVE are associated with their corresponding
892 IP-VRF using their IRB interfaces. When TS4 and TS1 exchange intra-
893 subnet traffic, only the L2 forwarding (bridging) part of the IRB
894 solution is used (i.e., the traffic only goes through their MAC-
895 VRFs); however, when TS4 wants to forward traffic to SN1 or SN2
896 sitting behind TS1 (inter-subnet traffic), both the bridging and
897 routing parts of the IRB solution are exercised (i.e., the traffic
898 goes through the corresponding MAC-VRFs and IP-VRFs). The following
899 subsections describe the control and data plane operations for this
900 IRB scenario in detail.
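
The recursive route resolution described in this section can be sketched as two chained lookups. The Python fragment below is only an illustration under stated assumptions: the function resolve_prefix and the two dictionaries are hypothetical, and the symbolic names SN1, IP1, NVE1 and so on stand in for real prefixes and addresses.

```python
def resolve_prefix(prefix_routes, host_routes, prefix):
    """Resolve a subnet prefix to the NVE that currently serves it.

    prefix_routes: subnet prefix -> TS IP address (the overlay next hop)
    host_routes:   TS IP address -> NVE (learned from MAC/IP routes)
    """
    ts_ip = prefix_routes[prefix]
    return host_routes[ts_ip]


# Hypothetical state on a receiving NVE.
prefix_routes = {"SN1": "IP1", "SN2": "IP1", "SN3": "IP3"}
host_routes = {"IP1": "NVE1", "IP3": "NVE2"}

before = resolve_prefix(prefix_routes, host_routes, "SN1")

# TS1 moves to NVE2: only its MAC/IP route changes, the prefix routes do not.
host_routes["IP1"] = "NVE2"
after = resolve_prefix(prefix_routes, host_routes, "SN1")

print(before, after)
```

Because the prefix entries point at the TS address rather than at an NVE, a single host-route update redirects every prefix behind that TS.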
902                         NVE1         +----------+
903    SN1--+          +-------------+   |          |
904         |--TS1-----|(MAC- \      |   |          |
905    SN2--+  IP1/M1  | VRF1) \     |   |          |
906                    |     (IP-VRF)|---|          |
907                    |       /     |   |          |
908            TS2-----|(MAC- /      |   |  MPLS/   |
909            IP2/M2  | VRF2)       |   |  VxLAN/  |
910                    +-------------+   |  NVGRE   |
911                    +-------------+   |          |
912    SN3--+--TS3-----|(MAC-\       |   |          |
913            IP3/M3  | VRF3)\      |   |          |
914                    |       (iVRF)|---|          |
915                    |       /     |   |          |
916            TS4-----|(MAC- /      |   |          |
917            IP4/M4  | VRF1)       |   |          |
918                    +-------------+   +----------+
919                         NVE2

921 Figure 7: IRB forwarding on NVEs with core-facing IRB Interface

923 5.2.1 Control Plane Operation

925 Each NVE advertises a Route Type-5 (RT-5, IP Prefix Route defined in
926 [EVPN-PREFIX]) for each of its subnet prefixes with the IP address of
927 its TS as the next hop (gateway address field) as follows:

929 - RD per VPN
930 - ESI = 0
931 - Ethernet Tag = 0
932 - IP Prefix Length = 32 or 128
933 - IP Prefix = SNi
934 - Gateway Address = IPi; IP address of TS
935 - Label = 0

937 This RT-5 is advertised with a Route Target corresponding to the IP-
938 VPN service.

940 Each NVE also advertises an RT-2 (MAC/IP Advertisement Route) along
941 with its associated Route Targets and Extended Communities for each
942 of its TS's exactly as described in section 5.1.1.

944 Upon receiving the RT-5 advertisement, the receiving NVE performs the
945 following:

947 - It uses the Route Target to identify the corresponding IP-VRF.
948 - It imports the IP prefix into its corresponding IP-VRF with the IP
949 address of the associated TS as its next hop.

951 Upon receiving the RT-2 advertisement, the receiving NVE imports
952 MAC/IP addresses of the TS into the corresponding MAC-VRF and IP-VRF
953 per section 5.1.1. Furthermore, it performs recursive route
954 resolution to resolve the IP prefix (received in RT-5) to its
955 corresponding NVE's IP address (e.g., its BGP next hop).
BGP next hop
956 will be used as underlay tunnel destination address (e.g., VTEP DA
957 for VxLAN encapsulation) and Router's MAC will be used as inner MAC
958 DA for VxLAN encapsulation.

960 5.2.2 Data Plane Operation

962 The following description of the data-plane operation describes just
963 the logical functions and the actual implementation may differ. Let's
964 consider data-plane operation when a host on SN1 sitting behind TS1
965 wants to send traffic to a host on SN3 sitting behind TS3.

967 - TS1 sends a packet with MAC DA corresponding to the MAC-VRF1 IRB
968 interface of NVE1, and VLAN-tag corresponding to MAC-VRF1.

970 - Upon receiving the packet, the ingress NVE1 uses the VLAN-tag to
971 identify MAC-VRF1. It then looks up the MAC DA and forwards the
972 frame to its IRB interface, as in section 5.1.2.

974 - The Ethernet header of the packet is stripped and the packet is fed
975 to the IP-VRF, where an IP lookup is performed on the destination
976 address. This lookup yields the fields needed for VxLAN encapsulation
977 with NVE2's MAC address as the inner MAC DA, NVE2's IP address as the
978 VTEP DA, and the VNID. MAC SA is set to NVE1's MAC address and VTEP
979 SA is set to NVE1's IP address.

981 - The packet is then encapsulated with the proper header based on
982 the above info and is forwarded to the egress NVE (NVE2).

984 - On the egress NVE (NVE2), assuming the packet is VxLAN
985 encapsulated, the VxLAN and the inner Ethernet headers are removed
986 and the resultant IP packet is fed to the IP-VRF associated with that
987 VNID.

989 - Next, a lookup is performed based on IP DA (which is in SN3) in the
990 associated IP-VRF of NVE2. The IP lookup yields the destination TS
991 (TS3) MAC address and the access-facing IRB interface over which the
992 packet needs to be sent.
994 - The IP packet is encapsulated with an Ethernet header with the MAC
995 SA set to that of the access-facing IRB interface of the egress NVE
996 (NVE2) and the MAC DA set to the destination TS (TS3) MAC
997 address. The packet is sent to the corresponding MAC-VRF3 and, after a
998 lookup of MAC DA, is forwarded to the destination TS (TS3) over the
999 corresponding interface.

1001 6 BGP Encoding

1003 This document defines one new BGP Extended Community for EVPN.

1005 6.1 Router's MAC Extended Community

1007 A new EVPN BGP Extended Community called Router's MAC is introduced
1008 here. This new extended community is a transitive extended community
1009 with the Type field of 0x06 (EVPN) and the Sub-Type of 0x03. It may
1010 be advertised along with BGP Encapsulation Extended Community defined
1011 in section 4.5 of [RFC5512].

1013 The Router's MAC Extended Community is encoded as an 8-octet value as
1014 follows:

1016  0                   1                   2                   3
1017  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
1018 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1019 | Type=0x06     | Sub-Type=0x03 |        Router's MAC           |
1020 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1021 |                      Router's MAC Cont'd                      |
1022 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

1024 This extended community is used to carry the NVE's MAC address for
1025 symmetric IRB scenarios and it is sent with RT-2 as described in
1026 sections 5.1.1 and 5.2.1.

1028 7 TS Mobility

1030 7.1 TS Mobility & Optimum Forwarding for TS Outbound Traffic

1032 Optimum forwarding for the TS outbound traffic, upon TS mobility, can
1033 be achieved using either the anycast default Gateway MAC and IP
1034 addresses, or using the address aliasing as discussed in [DC-
1035 MOBILITY].
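
As an aside on the encoding in section 6.1, the 8-octet Router's MAC Extended Community can be packed and parsed as sketched below. This is a minimal Python illustration; the function names, the constant names, and the example MAC address 00:11:22:33:44:55 are hypothetical, while the Type and Sub-Type values are the ones given in section 6.1.

```python
import struct

EVPN_EC_TYPE = 0x06          # Type field, per section 6.1
ROUTERS_MAC_SUB_TYPE = 0x03  # Sub-Type field, per section 6.1


def encode_routers_mac(mac):
    """Pack Type, Sub-Type and a 6-octet MAC into an 8-octet value."""
    octets = bytes.fromhex(mac.replace(":", ""))
    if len(octets) != 6:
        raise ValueError("a MAC address is exactly 6 octets")
    return struct.pack("!BB6s", EVPN_EC_TYPE, ROUTERS_MAC_SUB_TYPE, octets)


def decode_routers_mac(value):
    """Return the MAC carried in a Router's MAC Extended Community."""
    if len(value) != 8:
        raise ValueError("an extended community is exactly 8 octets")
    ec_type, sub_type, octets = struct.unpack("!BB6s", value)
    if (ec_type, sub_type) != (EVPN_EC_TYPE, ROUTERS_MAC_SUB_TYPE):
        raise ValueError("not a Router's MAC Extended Community")
    return ":".join(f"{octet:02x}" for octet in octets)


encoded = encode_routers_mac("00:11:22:33:44:55")
print(encoded.hex())
print(decode_routers_mac(encoded))
```

The format string "!BB6s" uses network byte order with two single-octet fields followed by the six MAC octets, which matches the layout of the diagram in section 6.1.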
1037 7.2 TS Mobility & Optimum Forwarding for TS Inbound Traffic

1038 For optimum forwarding of the TS inbound traffic, upon TS mobility,
1039 all the NVEs and/or IP-VPN PEs need to know the up-to-date location
1040 of the TS. Two scenarios must be considered, as discussed next.

1042 In what follows, we use the following terminology:

1044 - source NVE refers to the NVE behind which the TS used to reside
1045 prior to the TS mobility event.

1047 - target NVE refers to the new NVE behind which the TS has moved
1048 after the mobility event.

1050 7.2.1 Mobility without Route Aggregation

1052 In this scenario, when a target NVE detects that a MAC mobility event
1053 has occurred, it initiates the MAC mobility handshake in BGP as
1054 specified in [EVPN]. The WAN Gateways, acting as ASBRs in this case,
1055 re-advertise the MAC route of the target NVE with the MAC Mobility
1056 extended community attribute unmodified. Because the WAN Gateway for
1057 a given data center re-advertises BGP routes received from the WAN
1058 into the data center, the source NVE will receive the MAC
1059 Advertisement route of the target NVE (with the next hop attribute
1060 adjusted depending on which inter-AS option is employed). The source
1061 NVE will then withdraw its original MAC Advertisement route as a
1062 result of evaluating the Sequence Number field of the MAC Mobility
1063 extended community in the received MAC Advertisement route. This is
1064 per the procedures already defined in [EVPN].

1066 7.2.2 Mobility with Route Aggregation

1068 This section will be completed in the next revision.

1070 8 Acknowledgements

1072 The authors would like to thank Sami Boutros for his valuable
1073 comments.

1075 9 Security Considerations

1077 10 IANA Considerations

1079 IANA has allocated a new transitive extended community Type of 0x06
1080 and Sub-Type of 0x03 for EVPN Router's MAC Extended Community.
1082 11 References
1083 11.1 Normative References

1085 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
1086 Requirement Levels", BCP 14, RFC 2119, March 1997.

1088 11.2 Informative References

1090 [EVPN] Sajassi et al., "BGP MPLS Based Ethernet VPN", draft-ietf-
1091 l2vpn-evpn-04.txt, work in progress, July, 2014.

1093 [EVPN-IPVPN-INTEROP] Sajassi et al., "EVPN Seamless Interoperability
1094 with IP-VPN", draft-sajassi-l2vpn-evpn-ipvpn-interop-01, work in
1095 progress, October, 2012.

1097 [DC-MOBILITY] Aggarwal et al., "Data Center Mobility based on
1098 BGP/MPLS, IP Routing and NHRP", draft-raggarwa-data-center-mobility-
1099 05.txt, work in progress, June, 2013.

1101 [EVPN-PREFIX] Rabadan et al., "IP Prefix Advertisement in EVPN",
1102 draft-rabadan-l2vpn-evpn-prefix-advertisement-02, July, 2014.

1104 Authors' Addresses

1106 Ali Sajassi
1107 Cisco
1108 Email: sajassi@cisco.com

1110 Samer Salam
1111 Cisco
1112 Email: ssalam@cisco.com

1114 Yakov Rekhter
1115 Juniper Networks
1116 Email: yakov@juniper.net

1118 John E. Drake
1119 Juniper Networks
1120 Email: jdrake@juniper.net

1122 Lucy Yong
1123 Huawei Technologies
1124 Email: lucy.yong@huawei.com

1125 Linda Dunbar
1126 Huawei Technologies
1127 Email: linda.dunbar@huawei.com

1129 Wim Henderickx
1130 Alcatel-Lucent
1131 Email: wim.henderickx@alcatel-lucent.com

1133 Florin Balus
1134 Alcatel-Lucent
1135 Email: Florin.Balus@alcatel-lucent.com

1137 Samir Thoria
1138 Cisco
1139 Email: sthoria@cisco.com