L2VPN Workgroup                                              Ali Sajassi
INTERNET-DRAFT                                               Samer Salam
Intended Status: Standards Track                            Samir Thoria
                                                                   Cisco

Wim Henderickx                                             Yakov Rekhter
Alcatel-Lucent                                                John Drake
                                                                 Juniper
Florin Balus
Nuage Networks                                                 Lucy Yong
                                                            Linda Dunbar
                                                                  Huawei

Expires: August 13, 2014                               February 13, 2014


                   IP Inter-Subnet Forwarding in EVPN
          draft-sajassi-l2vpn-evpn-inter-subnet-forwarding-03

Abstract

   EVPN provides an extensible and flexible multi-homing VPN solution
   for intra-subnet connectivity among hosts/VMs over an MPLS/IP
   network. However, there are scenarios in which inter-subnet
   forwarding among hosts/VMs across different IP subnets is required,
   while maintaining the multi-homing capabilities of EVPN. This
   document describes an IRB solution based on EVPN to address such
   requirements.

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/1id-abstracts.html

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

Copyright and License Notice

   Copyright (c) 2013 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1  Introduction
     1.1  Traditional Inter-Subnet Forwarding
     1.2  Scenarios of EVPN NVEs as L3GW
   2  Inter-Subnet Forwarding Scenarios
     2.1  Switching among EVIs within a DC
     2.2  Switching among EVIs in different DCs without route
          aggregation
     2.3  Switching among EVIs in different DCs with route
          aggregation
     2.4  Switching among IP-VPN sites and EVIs with route
          aggregation
   3  Default L3 Gateway Addressing
     3.1  Homogeneous Environment
     3.2  Heterogeneous Environment
   4  Operational Models for Asymmetric Inter-Subnet Forwarding
     4.1  Among EVPN NVEs within a DC
     4.2  Among EVPN NVEs in Different DCs Without Route Aggregation
     4.3  Among EVPN NVEs in Different DCs with Route Aggregation
     4.4  Among IP-VPN Sites and EVPN NVEs with Route Aggregation
     4.5  Use of Centralized Gateway
   5  Operational Models for Symmetric Inter-Subnet Forwarding
     5.1  Among EVPN NVEs within a DC
   6  VM Mobility
     6.1  VM Mobility & Optimum Forwarding for VM's Outbound Traffic
     6.2  VM Mobility & Optimum Forwarding for VM's Inbound Traffic
       6.2.1  Mobility without Route Aggregation
       6.2.2  Mobility with Route Aggregation
   7  Acknowledgements
   8  Security Considerations
   9  IANA Considerations
   10  References
     10.1  Normative References
     10.2  Informative References
   Authors' Addresses

Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

   IRB: Integrated Routing and Bridging

   IRB Interface: A virtual interface that connects the bridging module
   and the routing module on an NVE.

   NVE: Network Virtualization Endpoint

1  Introduction

   EVPN provides an extensible and flexible multi-homing VPN solution
   for intra-subnet connectivity among hosts/VMs over an MPLS/IP
   network. However, there are scenarios where, in addition to
   intra-subnet forwarding, inter-subnet forwarding is required among
   hosts/VMs across different IP subnets at the EVPN PE nodes (referred
   to as EVPN NVE nodes throughout this document), while maintaining
   the multi-homing capabilities of EVPN. This document describes an IRB
   solution based on EVPN to address such requirements.

1.1  Traditional Inter-Subnet Forwarding

   Inter-subnet communication is traditionally achieved at the L3
   Gateway nodes, where all the inter-subnet communication policies are
   enforced. Even for different subnets belonging to one IP-VPN or
   tenant, traffic may need to go through a FW or IPS between the
   trusted and untrusted zones.

   Some operators may prefer a centralized approach, i.e., having only a
   set of default L3 gateways (whose redundancy is typically achieved by
   VRRP) through which all inter-subnet traffic passes. Usually there
   are FW, IPS, or other network appliances directly attached to the
   centralized L3 Gateway nodes. The centralized approach makes it
   easier to maintain consistent policies and is less prone to
   configuration errors. However, it suffers from the major drawback of
   requiring all traffic to be hair-pinned to the L3GW nodes.

   Some operators may prefer a fully distributed L3 gateway design,
   e.g., allowing all NVEs to have the policies to route traffic across
   subnets. Under this design, all traffic between hosts attached to one
   NVE can be routed locally, thus avoiding the traffic hair-pinning
   issue at the centralized L3GW. The perceived drawback of this fully
   distributed approach is the extra effort required to maintain policy
   consistency across all the NVEs.

   Some operators may prefer something in the middle, i.e., allowing
   NVEs to route traffic across only selected subnets, for example among
   subnets belonging to one tenant or one security zone.

1.2  Scenarios of EVPN NVEs as L3GW

   When an EVPN NVE node is not the L3GW for the subnets attached, the
   EVPN NVE performs only the L2 switching function for the traffic
   initiated from or destined to the hosts attached to the NVE.

   Some EVPN NVEs can be the default L3GWs for some subnets.
   In this situation, the EVPN NVEs can route traffic across the subnets
   for which they are default L3GWs.

   When there are multiple subnets attached to an EVPN NVE, some of the
   subnets may have the EVPN NVE as their L3GW while other subnets do
   not. For example, "Subnet-X" can communicate with "Subnet-Y" via NVE
   "A", but "Subnet-X" cannot communicate with "Subnet-Z" via NVE "A".
   So when "Subnet-X" needs to communicate with "Subnet-Z", the traffic
   might need to be routed through another device (e.g., FW, IPS, or
   another L3GW node).

   1. When the EVPN NVE is the L3GW for "Subnet-X", hosts within
   "Subnet-X" will have the NVE's IRB MAC address (or NVE's MAC address)
   as their default GW MAC address when they send data frames towards
   targets in different subnets.

   2. When the EVPN NVE is not the L3GW for "Subnet-Y", hosts within
   "Subnet-Y" (even though still attached to the NVE) will use their
   own designated L3GW MAC address (which is different from the NVE's
   IRB address) in data frames destined towards targets in different
   subnets.

2  Inter-Subnet Forwarding Scenarios

   The inter-subnet forwarding scenarios performed by an EVPN NVE can be
   divided into the following five categories. The last scenario, along
   with its corresponding solution, is described in
   [EVPN-IPVPN-INTEROP]. The solutions for the first four scenarios are
   the focus of this document.

   1. Switching among EVPN instances (subnets) within a DC

   2. Switching among EVPN instances in different DCs without route
   aggregation

   3. Switching among EVPN instances in different DCs with route
   aggregation

   4. Switching among IP-VPN sites and EVPN instances with route
   aggregation

   5. Switching among IP-VPN sites and EVPN instances without route
   aggregation

   In the above scenarios, the term "route aggregation" refers to the
   case where, for a given IP-VRF, a node situated at the WAN edge of
   the data center network behaves as a default gateway for all the
   destinations that are outside the data center. The absence of route
   aggregation refers to the scenario where a given IP-VRF within a data
   center has (host) routes to individual VMs that are outside of the
   data center.

   In case (4), the WAN edge node also performs route aggregation for
   all the destinations within its own data center, and acts as an
   interworking unit between EVPN and IP VPN (it implements both EVPN
   and IP VPN functionality).

                           +---+   Enterprise Site 1
                           |PE1|----- H1
                           +---+
                            /
                      ,---------.            Enterprise Site 2
                    ,'           `.    +---+
     ,---------.   /(   MPLS/IP    )---|PE2|----- H2
    '   DCN 3   `./  `.   Core   ,'    +---+
     `-+------+'       `-+------+'
     __/__              / /      \ \
    :NVE4 :         +---+         \ \
    '-----'    ,----|GW |.         \ \
       |     ,'     +---+ `.      ,---------.
      VM6   (     DCN 1     )   ,'           `.
             `.           ,'   (     DCN 2     )
               `-+------+'      `.           ,'
                 __/__            `-+------+'
                :NVE1 :           __/__   __\__
                '-----'          :NVE2 : :NVE3 :
                 |   |           '-----' '-----'
                VM1 VM2           |  |      |
                                 VM3 VM4   VM5

                 Figure 1: Interoperability Use-Cases

   In what follows, we will describe scenarios 1 through 4 in more
   detail.

2.1  Switching among EVIs within a DC

   In this scenario, connectivity is required between hosts (e.g., VMs)
   in the same data center, where those hosts belong to different IP
   subnets. All these subnets are part of the same IP VPN. Each subnet
   is associated with a single EVPN instance, where each such EVI is
   realized by a collection of MAC-VRFs residing on appropriate NVEs.

   As an example, consider VM3 and VM5 of Figure 1 above. Assume that
   connectivity is required between these two VMs, where VM3 belongs to
   IP subnet 3 (SN3) and VM5 belongs to IP subnet 5 (SN5). Both SN3 and
   SN5 are part of the same IP VPN. NVE2 has an EVI3 associated with
   SN3, and this EVI is represented by a MAC-VRF that is connected to an
   IP-VRF (for that IP VPN) via an IRB interface. Likewise, NVE3 has an
   EVI5 associated with SN5, and this EVI is represented by a MAC-VRF
   that is connected to an IP-VRF (for the same IP VPN) via an IRB
   interface.

2.2  Switching among EVIs in different DCs without route aggregation

   This case is similar to that of Section 2.1 above, except that the
   hosts belong to different data centers that are interconnected over
   a WAN (e.g., MPLS/IP PSN). The data centers in question here are
   seamlessly interconnected to the WAN, i.e., the WAN edge devices do
   not maintain any host/VM-specific addresses in the forwarding path;
   e.g., there is no WAN edge GW(s) between these DCs.

   As an example, consider VM3 and VM6 of Figure 1 above. Assume that
   connectivity is required between these two VMs, where VM3 belongs to
   SN3 and VM6 belongs to SN6. NVE2 has an EVI3 associated with SN3 and
   NVE4 has an EVI6 associated with SN6. Both SN3 and SN6 are part of
   the same IP VPN.

2.3  Switching among EVIs in different DCs with route aggregation

   In this scenario, connectivity is required between hosts (e.g., VMs)
   in different data centers, and those hosts belong to different IP
   subnets. What makes this case different from that of Section 2.2 is
   that (in the context of a given IP-VRF) at least one of the data
   centers in question has a gateway as the WAN edge switch. Because of
   that, the NVE's IP-VRF within each data center need not maintain
   (host) routes to individual VMs outside of the data center.

   As an example, consider VM1 and VM5 of Figure 1 above. Assume that
   connectivity is required between these two VMs, where VM1 belongs to
   SN1 and VM5 belongs to SN5; SN1 and SN5 belong to the same IP VPN.
   NVE3 has an EVI5 associated with SN5, and this EVI is represented by
   the MAC-VRF that is connected to the IP-VRF via an IRB interface.
   NVE1 has an EVI1 associated with SN1, and this EVI is represented by
   the MAC-VRF that is connected to the IP-VRF representing the same IP
   VPN. Due to the gateway at the edge of DCN 1, NVE1's IP-VRF does not
   need to have the address of VM5; instead, it has a default route in
   its IP-VRF with the next hop being the GW.

2.4  Switching among IP-VPN sites and EVIs with route aggregation

   In this scenario, connectivity is required between hosts (e.g., VMs)
   in a data center and hosts in an enterprise site that belongs to a
   given IP-VPN. The NVE within the data center is an EVPN NVE, whereas
   the enterprise site has an IP-VPN PE. Furthermore, the data center in
   question has a gateway as the WAN edge switch. Because of that, the
   NVE in the data center does not need to maintain individual IP
   prefixes advertised by enterprise sites (by IP-VPN PEs).

   As an example, consider end-station H1 and VM2 of Figure 1. Assume
   that connectivity is required between the end-station and the VM,
   where VM2 belongs to SN2, which is realized using EVPN, whereas H1
   belongs to an IP VPN site connected to PE1 (PE1 maintains an IP-VRF
   associated with that IP VPN). NVE1 has an EVI2 associated with SN2.
   Moreover, EVI2 on NVE1 is connected to an IP-VRF associated with
   that IP VPN. PE1 originates a VPN-IP route that covers H1. The
   gateway at the edge of DCN 1 performs the interworking function
   between IP-VPN and EVPN.
   As a result, a default route in the IP-VRF on NVE1, pointing to the
   gateway as the next hop, and a route to VM2 (or to SN2) in PE1's
   IP-VRF are sufficient for connectivity between H1 and VM2. In this
   scenario, NVE1's IP-VRF does not need to maintain a route to H1
   because it has the default route to the gateway.

3  Default L3 Gateway Addressing

3.1  Homogeneous Environment

   This is an environment where all NVEs to which an EVPN instance could
   potentially be attached (or moved) perform inter-subnet switching.
   Therefore, inter-subnet traffic can be locally switched by the EVPN
   NVE connecting the VMs belonging to different subnets.

   To support such inter-subnet forwarding, the NVE behaves as an IP
   Default Gateway from the perspective of the attached end-stations
   (e.g., VMs). Two models are possible, as discussed in [DC-MOBILITY]:

   1. All the EVIs of a given EVPN instance use the same anycast default
   gateway IP address and the same anycast default gateway MAC address.
   On each NVE, this default gateway IP/MAC address corresponds to the
   IRB interface of the EVI associated with that EVPN instance.

   2. Each EVI of a given EVPN instance uses its own default gateway IP
   and MAC addresses, and these addresses are aliased to the same
   conceptual gateway through the use of the Default Gateway extended
   community as specified in [EVPN], which is carried in the EVPN MAC
   Advertisement routes. On each NVE, this default gateway IP/MAC
   address corresponds to the IRB interface of the EVI associated with
   that EVPN instance.

   Both of these models enable a packet forwarding paradigm where
   inter-subnet traffic can bypass the VRF processing on the egress
   (i.e., disposition) NVE. The egress NVE merely needs to perform a
   lookup in the associated EVI and forward the Ethernet frames
   unmodified, i.e., without rewriting the source MAC address.
   This is different from traditional IRB forwarding, where a packet is
   forwarded through the bridging module followed by the routing module
   on the ingress NVE, and then forwarded through the routing module
   followed by the bridging module on the egress NVE. For inter-subnet
   forwarding using EVPN, the routing module on the egress NVE can be
   completely bypassed.

   It is worth noting that if the applications running on the hosts
   (e.g., VMs) employ or rely on any form of MAC security, then the
   first model (i.e., using anycast addresses) would be required to
   ensure that the applications receive traffic from the same source
   MAC address that they are sending to.

3.2  Heterogeneous Environment

   In large data centers with thousands of servers and ToR (or Access)
   switches, some of the switches may not have the capability of
   maintaining or enforcing policies for inter-subnet switching. Even
   though policies among multiple subnets belonging to the same tenant
   can be simpler, hosts belonging to one tenant can also send traffic
   to peers belonging to different tenants or security zones. An L3GW
   not only needs to enforce policies for communication among subnets
   belonging to a single tenant, but also needs to know how to handle
   traffic destined towards peers in different tenants. Therefore,
   there can be a mixed environment where an NVE performs inter-subnet
   switching for some EVPN instances but not others.

4  Operational Models for Asymmetric Inter-Subnet Forwarding

4.1  Among EVPN NVEs within a DC

   When an EVPN MAC advertisement route is received by the NVE, the IP
   address associated with the route is used to populate the IP-VRF
   table, whereas the MAC address associated with the route is used to
   populate both the MAC-VRF table and the adjacency associated with
   the IP route in the IP-VRF table.
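
   The table population described above can be sketched as follows.
   This is only an illustrative model, not a data structure defined by
   [EVPN] or by this document: the names MacIpRoute, Adjacency, Nve,
   mac_vrf, ip_vrf, and install are hypothetical, a single MAC-VRF and
   a single IP-VRF are assumed per NVE, and the addresses and label
   value are made-up documentation values.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class MacIpRoute:
    """Hypothetical, simplified view of a received EVPN MAC advertisement route."""
    mac: str        # host MAC address carried in the route
    ip: str         # host IP address carried in the route
    next_hop: str   # advertising (egress) NVE
    evi_label: int  # MPLS label advertised with the route


@dataclass(frozen=True)
class Adjacency:
    """Rewrite information attached to an IP-VRF entry."""
    dest_mac: str   # MAC written into the frame on the ingress NVE
    next_hop: str
    evi_label: int


@dataclass
class Nve:
    mac_vrf: Dict[str, str] = field(default_factory=dict)       # MAC -> next-hop NVE
    ip_vrf: Dict[str, Adjacency] = field(default_factory=dict)  # host IP -> adjacency

    def install(self, route: MacIpRoute) -> None:
        # The MAC address populates the MAC-VRF table ...
        self.mac_vrf[route.mac] = route.next_hop
        # ... and the IP address populates the IP-VRF table, with an
        # adjacency that carries the destination host's MAC address
        # (asymmetric model), not the MAC address of the next-hop NVE.
        self.ip_vrf[route.ip] = Adjacency(route.mac, route.next_hop, route.evi_label)


nve1 = Nve()
nve1.install(MacIpRoute(mac="00:00:5e:00:53:05", ip="192.0.2.5",
                        next_hop="NVE3", evi_label=1005))
adj = nve1.ip_vrf["192.0.2.5"]
print(adj.dest_mac, adj.next_hop, adj.evi_label)
```

   In this sketch, one received route yields two table entries, and the
   IP-VRF adjacency already holds everything the ingress NVE needs to
   rewrite and encapsulate a routed frame.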

   When an Ethernet frame is received by an ingress NVE, it performs a
   lookup on the destination MAC address in the associated MAC-VRF for
   that EVI. If the MAC address corresponds to its IRB Interface MAC
   address, the ingress NVE deduces that the packet MUST be inter-subnet
   routed. Hence, the ingress NVE performs an IP lookup in the
   associated IP-VRF table. The lookup identifies both the next-hop
   (i.e., egress) NVE to which the packet must be forwarded and an
   adjacency that contains a MAC rewrite and an MPLS label stack. The
   MAC rewrite holds the MAC address associated with the destination
   host (as populated by the EVPN MAC route), instead of the MAC address
   of the next-hop NVE. The ingress NVE then rewrites the destination
   MAC address in the packet with the address specified in the
   adjacency. It also rewrites the source MAC address with its IRB
   Interface MAC address. The ingress NVE then forwards the frame to
   the next-hop (i.e., egress) NVE after encapsulating it with the MPLS
   label stack. Note that this label stack includes the LSP label as
   well as the EVI label that was advertised by the egress NVE. When the
   MPLS encapsulated packet is received by the egress NVE, it uses the
   EVI label to identify the MAC-VRF table. It then performs a MAC
   lookup in that table, which yields the outbound interface to which
   the Ethernet frame must be forwarded. Figure 2 below depicts the
   packet flow, where NVE1 and NVE2 are the ingress and egress NVEs,
   respectively.

               NVE1               NVE2
          +------------+     +------------+
          |  ...  ...  |     |  ...  ...  |
          |(EVI)-(VRF) |     |(VRF)-(EVI) |
          | .|.   .|.  |     |  ...  |..| |
          +------------+     +------------+
             ^     v                 ^  V
             |     |                 |  |
       VM1->-+     +-->--------------+  +->-VM2

     Figure 2: Inter-Subnet Forwarding Among EVPN NVEs within a DC

   Note that the forwarding behavior on the egress NVE is similar to
   EVPN intra-subnet forwarding.
   In other words, all the packet processing associated with the
   inter-subnet forwarding semantics is confined to the ingress NVE,
   which is why this model is called Asymmetric IRB.

   It should also be noted that [EVPN] provides different levels of
   granularity for the EVI label. Besides identifying the bridge domain
   table, it can be used to identify the egress interface or a
   destination MAC address on that interface. If the EVI label is used
   for egress interface or destination MAC address identification, then
   no MAC lookup is needed in the egress EVI and the packet can be
   forwarded directly to the egress interface based on the EVI label
   lookup alone.

4.2  Among EVPN NVEs in Different DCs Without Route Aggregation

   When an EVPN MAC advertisement route is received by the NVE, the IP
   address associated with the route is used to populate the IP-VRF
   table, whereas the MAC address associated with the route is used to
   populate both the MAC-VRF table and the adjacency associated with
   the IP route in the IP-VRF table.

   When an Ethernet frame is received by an ingress NVE, it performs a
   lookup on the destination MAC address in the associated EVI. If the
   MAC address corresponds to its IRB Interface MAC address, the ingress
   NVE deduces that the packet MUST be inter-subnet routed. Hence, the
   ingress NVE performs an IP lookup in the associated IP-VRF table. The
   lookup identifies both the next-hop (i.e., egress) Gateway to which
   the packet must be forwarded and an adjacency that contains a MAC
   rewrite and an MPLS label stack. The MAC rewrite holds the MAC
   address associated with the destination host (as populated by the
   EVPN MAC route), instead of the MAC address of the next-hop Gateway.
   The ingress NVE then rewrites the destination MAC address in the
   packet with the address specified in the adjacency.
   It also rewrites the source MAC address with its IRB Interface MAC
   address. The ingress NVE then forwards the frame to the next-hop
   (i.e., egress) Gateway after encapsulating it with the MPLS label
   stack.

   Note that this label stack includes the LSP label as well as an EVI
   label. The EVI label is advertised either by the ingress Gateway, if
   inter-AS option B is used, or by the egress NVE, if inter-AS option
   C is used. When the MPLS encapsulated packet is received by the
   ingress Gateway, the processing again differs depending on whether
   inter-AS option B or option C is employed: in the former case, the
   ingress Gateway swaps the EVI label in the packet with the EVI label
   value received from the egress Gateway. In the latter case, the
   ingress Gateway does not modify the EVI label and performs normal
   label switching on the LSP label. Similarly, for option B, the
   egress Gateway swaps the EVI label with the value advertised by the
   egress NVE, whereas for option C, the egress Gateway does not modify
   the EVI label and performs normal label switching on the LSP label.
   When the MPLS encapsulated packet is received by the egress NVE, it
   uses the EVI label to identify the bridge-domain table. It then
   performs a MAC lookup in that table, which yields the outbound
   interface to which the Ethernet frame must be forwarded. Figure 3
   below depicts the packet flow.

           NVE1            GW1             GW2             NVE2
      +------------+  +------------+  +------------+  +------------+
      |  ...  ...  |  |    ...     |  |    ...     |  |  ...  ...  |
      |(EVI)-(VRF) |  |   [LS ]    |  |   [LS ]    |  |(VRF)-(EVI) |
      | .|.   .|.  |  |   |..|     |  |   |..|     |  |  ...  |..| |
      +------------+  +------------+  +------------+  +------------+
         ^     v          ^  V            ^  V                ^  V
         |     |          |  |            |  |                |  |
   VM1->-+     +-->-------+  +------------+  +----------------+  +->-VM2

   Figure 3: Inter-Subnet Forwarding Among EVPN NVEs in Different DCs
             without Route Aggregation

4.3  Among EVPN NVEs in Different DCs with Route Aggregation

   In this scenario, the NVEs within a given data center do not have
   entries for the MAC/IP addresses of hosts in remote data centers.
   Rather, the NVEs have a default IP route pointing to the WAN gateway
   for each VRF. This is accomplished by the WAN gateway advertising,
   for a given EVPN that spans multiple DCs, a default VPN-IP route that
   is imported by the NVEs of that EVPN that are in the gateway's own
   DC.

   When an Ethernet frame is received by an ingress NVE, it performs a
   lookup on the destination MAC address in the associated MAC-VRF
   table. If the MAC address corresponds to the IRB Interface MAC
   address, the ingress NVE deduces that the packet MUST be inter-subnet
   routed. Hence, the ingress NVE performs an IP lookup in the
   associated IP-VRF table. The lookup, in this case, matches the
   default route, which points to the local WAN gateway. The ingress
   NVE then rewrites the destination MAC address in the packet with the
   IRB Interface MAC address of the local WAN gateway. It also rewrites
   the source MAC address with its own IRB Interface MAC address. The
   ingress NVE then forwards the frame to the WAN gateway after
   encapsulating it with the MPLS label stack. Note that this label
   stack includes the LSP label as well as the IP-VPN label that was
   advertised by the local WAN gateway. When the MPLS encapsulated
   packet is received by the local WAN gateway, it uses the IP-VPN label
   to identify the IP-VRF table. It then performs an IP lookup in that
   table.
   The lookup identifies both the remote WAN gateway (of the remote
   data center) to which the packet must be forwarded and an adjacency
   that contains a MAC rewrite and an MPLS label stack. The MAC rewrite
   holds the MAC address associated with the ultimate destination host
   (as populated by the EVPN MAC route). The local WAN gateway then
   rewrites the destination MAC address in the packet with the address
   specified in the adjacency. It also rewrites the source MAC address
   with its IRB Interface MAC address. The local WAN gateway then
   forwards the frame to the remote WAN gateway after encapsulating it
   with the MPLS label stack. Note that this label stack includes the
   LSP label as well as an EVI label that was advertised by the remote
   WAN gateway. When the MPLS encapsulated packet is received by the
   remote WAN gateway, it simply swaps the EVI label and forwards the
   packet to the egress NVE. This implies that GW1 needs to keep the
   remote host MAC addresses along with the corresponding EVI labels in
   the adjacency entries of the IP-VRF table. The egress NVE then
   performs a MAC lookup in the MAC-VRF (identified by the received EVI
   label) to determine the outbound port on which to send the traffic.

   Figure 4 below depicts the forwarding model.

           NVE1            GW1             GW2             NVE2
      +------------+  +------------+  +------------+  +------------+
      |  ...  ...  |  |  ...  ...  |  |    ...     |  |  ...  ...  |
      |(EVI)-(VRF) |  |(VRF)-(EVI) |  |   [LS ]    |  |(VRF)-(EVI) |
      | .|.   .|.  |  |   |..|     |  |   |...|    |  |  ...  |..| |
      +------------+  +------------+  +------------+  +------------+
         ^     v          ^  V            ^   V               ^  V
         |     |          |  |            |   |               |  |
   VM1->-+     +-->-------+  +------------+   +---------------+  +->-VM2

   Figure 4: Inter-Subnet Forwarding Among EVPN NVEs in Different DCs
             with Route Aggregation

4.4  Among IP-VPN Sites and EVPN NVEs with Route Aggregation

   In this scenario, the NVEs within a given data center do not have
   entries for the IP addresses of hosts in remote enterprise sites.
   Rather, the NVEs have a default IP route pointing to the WAN gateway
   for each IP-VRF.

   When an Ethernet frame is received by an ingress NVE, it performs a
   lookup on the destination MAC address in the associated MAC-VRF
   table. If the MAC address corresponds to the IRB Interface MAC
   address, the ingress NVE deduces that the packet MUST be inter-subnet
   routed. Hence, the ingress NVE performs an IP lookup in the
   associated IP-VRF table. The lookup, in this case, matches the
   default route, which points to the local WAN gateway. The ingress
   NVE then rewrites the destination MAC address in the packet with the
   IRB Interface MAC address of the local WAN gateway. It also rewrites
   the source MAC address with its own IRB Interface MAC address. The
   ingress NVE then forwards the frame to the local WAN gateway after
   encapsulating it with the MPLS label stack. Note that this label
   stack includes the LSP label as well as the IP-VPN label that was
   advertised by the local WAN gateway. When the MPLS encapsulated
   packet is received by the local WAN gateway, it uses the IP-VPN label
   to identify the VRF table. It then performs an IP lookup in that
   table. The lookup identifies the next-hop ASBR to which the packet
   must be forwarded.
   The local gateway in this case strips the Ethernet encapsulation,
   performs an IP lookup in its IP-VRF, and forwards the IP packet to
   the ASBR using a label stack comprising an LSP label and an IP-VPN
   label that was advertised by the ASBR. When the MPLS encapsulated
   packet is received by the ASBR, it simply swaps the IP-VPN label
   with the one advertised by the egress PE. This implies that the
   remote WAN gateway must allocate the VPN label at least at the
   granularity of a (VRF, egress PE) tuple. The ASBR then forwards the
   packet to the egress PE. The egress PE then performs an IP lookup in
   the IP-VRF (identified by the received IP-VPN label) to determine
   where to forward the traffic.

   Figure 5 below depicts the forwarding model.

           NVE1            GW1             ASBR            NVE2
      +------------+  +------------+  +------------+  +------------+
      |  ...  ...  |  |  ...  ...  |  |    ...     |  |       ...  |
      |(EVI)-(VRF) |  |(VRF)-(EVI) |  |   [LS ]    |  |       (VRF)|
      | .|.   .|.  |  |   |..|     |  |   |...|    |  |       |..| |
      +------------+  +------------+  +------------+  +------------+
         ^     v          ^  V            ^   V               ^  V
         |     |          |  |            |   |               |  |
   VM1->-+     +-->-------+  +------------+   +---------------+  +->-H1

   Figure 5: Inter-Subnet Forwarding Among IP-VPN Sites and EVPN NVEs
             with Route Aggregation

4.5  Use of Centralized Gateway

   In this scenario, the NVEs within a given data center need to forward
   traffic in L2 to a centralized L3GW for one of the following reasons:
   a) they do not have IRB capabilities, or b) they do not have the
   required policy for switching traffic between different tenants or
   security zones. The centralized L3GW performs both the IRB function
   for switching traffic among different EVPN instances and the
   interworking function when traffic needs to be switched between
   IP-VPN sites and EVPN instances.
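
   The two conditions above amount to a simple per-NVE decision, which
   the following sketch illustrates. It is a minimal model under stated
   assumptions, not behavior specified by this document: the names
   NveCapabilities, routable_zone_pairs, and forwarding_mode are
   hypothetical, and policy is reduced to a set of (source zone,
   destination zone) pairs that the NVE is allowed to route locally.

```python
from dataclasses import dataclass, field
from typing import Set, Tuple


@dataclass
class NveCapabilities:
    """Hypothetical description of what a given NVE can do locally."""
    has_irb: bool
    # (source zone, destination zone) pairs this NVE has policy to route.
    routable_zone_pairs: Set[Tuple[str, str]] = field(default_factory=set)


def forwarding_mode(nve: NveCapabilities, src_zone: str, dst_zone: str) -> str:
    """Return how inter-subnet traffic between two zones is handled."""
    if not nve.has_irb:
        return "l2-to-central-l3gw"   # reason (a): no IRB capability
    if (src_zone, dst_zone) not in nve.routable_zone_pairs:
        return "l2-to-central-l3gw"   # reason (b): no policy for this pair
    return "route-locally"


tor = NveCapabilities(has_irb=False)
leaf = NveCapabilities(has_irb=True,
                       routable_zone_pairs={("tenant-a", "tenant-a")})

print(forwarding_mode(tor, "tenant-a", "tenant-a"))
print(forwarding_mode(leaf, "tenant-a", "tenant-a"))
print(forwarding_mode(leaf, "tenant-a", "tenant-b"))
```

   Only the second call results in local routing; in the other two
   cases the NVE bridges the frame in L2 towards the centralized L3GW.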

5 Operational Models for Symmetric Inter-Subnet Forwarding

5.1 Among EVPN NVEs within a DC

   When an EVPN MAC advertisement route is received by the NVE, the IP
   address associated with the route is used to populate the IP-VRF
   table, whereas the MAC address associated with the route is used to
   populate the MAC-VRF table. However, the received MAC address is not
   used to populate the adjacency associated with the IP route in the
   IP-VRF table; instead, the remote NVE's MAC address is used for this
   purpose.

   When an Ethernet frame is received by an ingress NVE, it performs a
   lookup on the destination MAC address in the associated MAC-VRF
   table. If the MAC address corresponds to its IRB Interface MAC
   address, the ingress NVE deduces that the packet MUST be inter-subnet
   routed. Hence, the ingress NVE performs an IP lookup in the
   associated IP-VRF table. The lookup identifies both the next-hop
   (i.e., egress) NVE to which the packet must be forwarded and an
   adjacency that contains a MAC rewrite and an MPLS label stack. The
   MAC rewrite holds the MAC address associated with the next-hop NVE
   (egress NVE). The ingress NVE then rewrites the destination MAC
   address in the packet with the address specified in the adjacency. It
   also rewrites the source MAC address with its IRB Interface MAC
   address. The ingress NVE then forwards the frame to the next-hop
   (i.e., egress) NVE after encapsulating it with the MPLS label stack.
   Note that this label stack includes the LSP label as well as the
   IP-VPN label (label2 in the MAC route) that was advertised by the
   egress NVE. When the MPLS encapsulated packet is received by the
   egress NVE, it uses the IP-VPN label to identify the IP-VRF table. It
   then performs an IP lookup in that table, which yields the outbound
   IRB interface to which the Ethernet frame must be forwarded.
   Next, a MAC lookup is performed on the destination MAC
   address of the frame in the MAC-VRF table, which yields the outbound
   interface to which the Ethernet frame must be forwarded. Figure 2
   below depicts the packet flow, where NVE1 and NVE2 are the ingress
   and egress NVEs, respectively.

           NVE1            NVE2
      +------------+  +------------+
      |  ...  ...  |  |  ...  ...  |
      |(EVI)-(VRF) |  |(VRF)-(EVI) |
      | .|.   .|.  |  | ... |..|   |
      +------------+  +------------+
         ^     v            ^  V
         |     |            |  |
   VM1->-+     +-->---------+  +->-VM2

   Figure 2: Inter-Subnet Forwarding Among EVPN NVEs within a DC

   Note that the forwarding behavior on the egress NVE is similar to
   EVPN intra-subnet forwarding. In other words, all the packet
   processing associated with the inter-subnet forwarding semantics is
   confined to the ingress NVE and that is why it is called Asymmetric
   IRB.

   It should also be noted that [EVPN] provides different levels of
   granularity for the EVI label. Besides identifying the bridge domain
   table, it can be used to identify the egress interface or a
   destination MAC address on that interface. If the EVI label is used
   for egress interface or destination MAC address identification, then
   no MAC lookup is needed in the egress EVI and the packet can be
   forwarded directly to the egress interface based on the EVI label
   lookup alone.

6 VM Mobility

6.1 VM Mobility & Optimum Forwarding for VM's Outbound Traffic

   Optimum forwarding for the VM's outbound traffic, upon VM mobility,
   can be achieved using either the anycast default Gateway MAC and IP
   addresses, or address aliasing as discussed in [DC-MOBILITY].

6.2 VM Mobility & Optimum Forwarding for VM's Inbound Traffic

   For optimum forwarding of the VM's inbound traffic, upon VM mobility,
   all the NVEs and/or IP-VPN PEs need to know the up-to-date location
   of the VM. Two scenarios must be considered, as discussed next.

   In what follows, we use the following terminology:

   - source NVE refers to the NVE behind which the VM used to reside
     prior to the VM mobility event.

   - target NVE refers to the new NVE behind which the VM has moved
     after the mobility event.

6.2.1 Mobility without Route Aggregation

   In this scenario, when a target NVE detects that a MAC mobility event
   has occurred, it initiates the MAC mobility handshake in BGP as
   specified in [EVPN]. The WAN Gateways, acting as ASBRs in this case,
   re-advertise the MAC route of the target NVE with the MAC Mobility
   extended community attribute unmodified. Because the WAN Gateway for
   a given data center re-advertises BGP routes received from the WAN
   into the data center, the source NVE will receive the MAC
   Advertisement route of the target NVE (with the next hop attribute
   adjusted depending on which inter-AS option is employed). The source
   NVE will then withdraw its original MAC Advertisement route as a
   result of evaluating the Sequence Number field of the MAC Mobility
   extended community in the received MAC Advertisement route. This is
   per the procedures already defined in [EVPN].

6.2.2 Mobility with Route Aggregation

   This section will be completed in the next revision.

7 Acknowledgements

   The authors would like to thank Sami Boutros for his valuable
   comments.

8 Security Considerations

9 IANA Considerations

10 References

10.1 Normative References

   [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
   Requirement Levels", BCP 14, RFC 2119, March 1997.

10.2 Informative References

   [EVPN] Sajassi et al., "BGP MPLS Based Ethernet VPN",
   draft-ietf-l2vpn-evpn-04.txt, work in progress, July, 2014.

   [EVPN-IPVPN-INTEROP] Sajassi et al., "EVPN Seamless Interoperability
   with IP-VPN", draft-sajassi-l2vpn-evpn-ipvpn-interop-01, work in
   progress, October, 2012.

   [DC-MOBILITY] Aggarwal et al., "Data Center Mobility based on
   BGP/MPLS, IP Routing and NHRP",
   draft-raggarwa-data-center-mobility-05.txt, work in progress, June,
   2013.

Authors' Addresses

   Ali Sajassi
   Cisco
   Email: sajassi@cisco.com

   Samer Salam
   Cisco
   Email: ssalam@cisco.com

   Yakov Rekhter
   Juniper Networks
   Email: yakov@juniper.net

   John E. Drake
   Juniper Networks
   Email: jdrake@juniper.net

   Lucy Yong
   Huawei Technologies
   Email: lucy.yong@huawei.com

   Linda Dunbar
   Huawei Technologies
   Email: linda.dunbar@huawei.com

   Wim Henderickx
   Alcatel-Lucent
   Email: wim.henderickx@alcatel-lucent.com

   Florin Balus
   Alcatel-Lucent
   Email: Florin.Balus@alcatel-lucent.com

   Samir Thoria
   Cisco
   Email: sthoria@cisco.com