L2VPN Workgroup                                              Ali Sajassi
INTERNET-DRAFT                                               Samer Salam
Intended Status: Standards Track                                   Cisco

                                                           Yakov Rekhter
Wim Henderickx                                                John Drake
Alcatel-Lucent                                                   Juniper

                                                               Lucy Yong
Florin Balus                                                Linda Dunbar
Nuage Networks                                                    Huawei

Expires: January 15, 2014                                  July 15, 2013


                   IP Inter-Subnet Forwarding in EVPN
          draft-sajassi-l2vpn-evpn-inter-subnet-forwarding-02

Abstract

   EVPN provides an extensible and flexible multi-homing VPN solution
   for intra-subnet connectivity among hosts/VMs over an MPLS/IP
   network.  However, there are scenarios in which inter-subnet
   forwarding among hosts/VMs across different IP subnets is required,
   while maintaining the multi-homing capabilities of EVPN.  This
   document describes an IRB solution based on EVPN to address such
   requirements.

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with
   the provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/1id-abstracts.html

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

Copyright and License Notice

   Copyright (c) 2013 IETF Trust and the persons identified as the
   document authors.  All rights reserved.
   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.  Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

Table of Contents

   1      Introduction
   1.1    Traditional Inter-Subnet Forwarding
   1.2    Scenarios of EVPN NVEs as L3GW
   2      Inter-Subnet Forwarding Scenarios
   2.1    Switching among EVIs within a DC
   2.2    Switching among EVIs in different DCs without route
          aggregation
   2.3    Switching among EVIs in different DCs with route aggregation
   2.4    Switching among IP-VPN sites and EVIs with route aggregation
   3      Default L3 Gateway Addressing
   3.1    Homogeneous Environment
   3.2    Heterogeneous Environment
   4      Operational Models for Inter-Subnet Forwarding
   4.1    Among EVPN NVEs within a DC
   4.2    Among EVPN NVEs in Different DCs Without Route Aggregation
   4.3    Among EVPN NVEs in Different DCs with Route Aggregation
   4.4    Among IP-VPN Sites and EVPN NVEs with Route Aggregation
   4.5    Use of Centralized Gateway
   5      VM Mobility
   5.1    VM Mobility & Optimum Forwarding for VM's Outbound Traffic
   5.2    VM Mobility & Optimum Forwarding for VM's Inbound Traffic
   5.2.1  Mobility without Route Aggregation
   5.2.2  Mobility with Route Aggregation
   6      Acknowledgements
   7      Security Considerations
   8      IANA Considerations
   9      References
   9.1    Normative References
   9.2    Informative References
          Authors' Addresses

Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

   IRB: Integrated Routing and Bridging

   IRB Interface: A virtual interface that connects the bridging module
   and the routing module on an NVE.

   NVE: Network Virtualization Endpoint

1 Introduction

   EVPN provides an extensible and flexible multi-homing VPN solution
   for intra-subnet connectivity among hosts/VMs over an MPLS/IP
   network.
   However, there are scenarios where, in addition to intra-subnet
   forwarding, inter-subnet forwarding is required among hosts/VMs
   across different IP subnets at the EVPN PE nodes (also referred to
   as EVPN NVE nodes throughout this document), while maintaining the
   multi-homing capabilities of EVPN.  This document describes an IRB
   solution based on EVPN to address such requirements.

1.1 Traditional Inter-Subnet Forwarding

   Inter-subnet communication is traditionally achieved at L3 Gateway
   nodes, where all the inter-subnet communication policies are
   enforced.  Even for different subnets belonging to one IP-VPN or
   tenant, traffic may need to go through a firewall (FW) or Intrusion
   Prevention System (IPS) between the trusted and un-trusted zones.

   Some operators may prefer a centralized approach, i.e., a set of
   default L3 gateways (whose redundancy is typically achieved by VRRP)
   through which all inter-subnet traffic goes.  Usually there are FW,
   IPS, or other network appliances directly attached to the
   centralized L3 Gateway nodes.  The centralized approach makes it
   easier to maintain consistent policies and is less prone to
   configuration errors.  However, such a centralized approach suffers
   from a major drawback: all inter-subnet traffic must be hair-pinned
   through the L3GW nodes.

   Some operators may prefer a fully distributed L3 gateway design,
   i.e., all NVEs have the policies needed to route traffic across
   subnets.  Under this design, all traffic between hosts attached to
   one NVE can be routed locally, thus avoiding the traffic
   hair-pinning issue of the centralized L3GW.  The perceived drawback
   of this fully distributed approach is the extra effort required to
   maintain policy consistency across all the NVEs.

   Some operators may prefer an approach in between, i.e., allowing
   NVEs to route traffic across only selected subnets; for example,
   allowing NVEs to route traffic among subnets belonging to one tenant
   or one security zone.

1.2 Scenarios of EVPN NVEs as L3GW

   When an EVPN NVE node is not the L3GW for any of the attached
   subnets, the EVPN NVE performs only the L2 switching function for
   traffic initiated from or destined to the hosts attached to it.

   Some EVPN NVEs can be the default L3GWs for some subnets.  In this
   situation, the EVPN NVEs can route traffic across the subnets for
   which they are the default L3GWs.

   When there are multiple subnets attached to an EVPN NVE, some of the
   subnets may have the EVPN NVE as their L3GW while others do not.
   For example, "Subnet-X" can communicate with "Subnet-Y" via NVE "A",
   but "Subnet-X" cannot communicate with "Subnet-Z" via NVE "A".  So
   when "Subnet-X" needs to communicate with "Subnet-Z", the traffic
   might need to be routed through another device (e.g., a FW, an IPS,
   or another L3GW node).

   1. When the EVPN NVE is the L3GW for "Subnet-X", hosts within
      "Subnet-X" will use the NVE's IRB MAC address as their default GW
      MAC address when they send data frames towards targets in
      different subnets.

   2. When the EVPN NVE is not the L3GW for "Subnet-Y", hosts within
      "Subnet-Y" (even though still attached to the NVE) will use their
      own designated L3GW MAC address (which is different from the
      NVE's IRB MAC address) in data frames destined towards targets in
      different subnets.
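   The two cases above only determine which MAC address the hosts of a
   given subnet use as their default-gateway MAC.  The following
   minimal Python sketch is purely illustrative; the data structures
   and names are hypothetical and are not defined by this document or
   by [EVPN].  It simply restates cases 1 and 2 as a per-subnet lookup:

      # Hypothetical illustration: which default-gateway MAC the hosts
      # of a subnet use, depending on whether the attached NVE is that
      # subnet's default L3GW (case 1) or not (case 2).
      from dataclasses import dataclass

      @dataclass
      class Subnet:
          name: str
          nve_is_l3gw: bool              # True: this NVE is the L3GW
          designated_l3gw_mac: str = ""  # used when the NVE is not

      def default_gw_mac(subnet: Subnet, nve_irb_mac: str) -> str:
          """MAC the hosts of 'subnet' use for off-subnet targets."""
          if subnet.nve_is_l3gw:
              return nve_irb_mac             # case 1 above
          return subnet.designated_l3gw_mac  # case 2 above

      if __name__ == "__main__":
          irb_mac = "00:aa:bb:cc:dd:01"
          subnet_x = Subnet("Subnet-X", nve_is_l3gw=True)
          subnet_y = Subnet("Subnet-Y", nve_is_l3gw=False,
                            designated_l3gw_mac="00:11:22:33:44:99")
          print(default_gw_mac(subnet_x, irb_mac))  # NVE's IRB MAC
          print(default_gw_mac(subnet_y, irb_mac))  # designated L3GW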
2 Inter-Subnet Forwarding Scenarios

   The inter-subnet forwarding scenarios performed by an EVPN NVE can
   be divided into the following five categories.  The last scenario,
   along with its corresponding solution, is described in
   [EVPN-IPVPN-INTEROP].  The solutions for the first four scenarios
   are the focus of this document.

   1. Switching among EVPN instances within a DC

   2. Switching among EVPN instances in different DCs without route
      aggregation

   3. Switching among EVPN instances in different DCs with route
      aggregation

   4. Switching among IP-VPN sites and EVPN instances with route
      aggregation

   5. Switching among IP-VPN sites and EVPN instances without route
      aggregation

   In the above scenarios, the term "route aggregation" refers to the
   case where, for a given EVI/VRF, a node situated at the WAN edge of
   the data center network behaves as a default gateway for all the
   destinations that are outside the data center.  The absence of route
   aggregation refers to the scenario where a given EVI/VRF within a
   data center has (host) routes to individual VMs that are outside of
   the data center.

   In case (4), the WAN edge node also performs route aggregation for
   all the destinations within its own data center, and acts as an
   interworking unit between EVPN and IP VPN (it implements both EVPN
   and IP VPN functionality).

                                +---+  Enterprise Site 1
                                |PE1|----- H1
                                +---+
                                  /
                         ,---------.         Enterprise Site 2
                       ,'           `.       +---+
     ,---------.      (   MPLS/IP    )-------|PE2|----- H2
    '  DCN 3    `-----`.    Core    ,'       +---+
     `-+------+-'       `-+------+-'
       __/__              /       \
      :NVE4 :         +---+         \
      '-----'    ,----|GW |.          \
         |      ,'    +---+ `.     ,---------.
        VM6    (     DCN 1    )  ,'           `.
                `.           ,' (     DCN 2     )
                 `-+------+-'    `.            ,'
                   __/__           `-+------+-'
                  :NVE1 :           __/__    __\__
                  '-----'          :NVE2 :  :NVE3 :
                   |   |           '-----'  '-----'
                 VM1   VM2          |   |      |
                                   VM3  VM4   VM5

                  Figure 1: Interoperability Use-Cases

   In what follows, we will describe scenarios 1 through 4 in more
   detail.

2.1 Switching among EVIs within a DC

   In this scenario, connectivity is required between hosts (e.g., VMs)
   in the same data center, where those hosts belong to different IP
   subnets.  All these subnets are part of the same IP VPN.  Each
   subnet is associated with a single EVPN instance, where each such
   instance is realized by a collection of EVIs residing on the
   appropriate NVEs.

   As an example, consider VM3 and VM5 of Figure 1 above.  Assume that
   connectivity is required between these two VMs, where VM3 belongs to
   the IP3 subnet whereas VM5 belongs to the IP5 subnet.  Both the IP3
   and IP5 subnets are part of the same IP VPN.  NVE2 has an EVI3
   associated with the IP3 subnet and NVE3 has an EVI5 associated with
   the IP5 subnet.

2.2 Switching among EVIs in different DCs without route aggregation

   This case is similar to that of Section 2.1 above, except that the
   hosts belong to different data centers that are interconnected over
   a WAN (e.g., an MPLS/IP PSN).  The data centers in question are
   seamlessly interconnected to the WAN, i.e., the WAN edge does not
   maintain any host/VM-specific addresses in the forwarding path.

   As an example, consider VM3 and VM6 of Figure 1 above.  Assume that
   connectivity is required between these two VMs, where VM3 belongs to
   the IP3 subnet whereas VM6 belongs to the IP6 subnet.  NVE2 has an
   EVI3 associated with the IP3 subnet and NVE4 has an EVI6 associated
   with the IP6 subnet.
   Both the IP3 and IP6 subnets are part of the same IP VPN, and both
   EVI3 and EVI6 are associated with their VRFs for that IP VPN.

2.3 Switching among EVIs in different DCs with route aggregation

   In this scenario, connectivity is required between hosts (e.g., VMs)
   in different data centers, and those hosts belong to different IP
   subnets.  What makes this case different from that of Section 2.2 is
   that (in the context of a given EVI/VRF) at least one of the data
   centers in question has a gateway as the WAN edge switch.  Because
   of that, the EVIs/VRFs within each data center need not maintain
   (host) routes to individual VMs outside of the data center.

   As an example, consider VM1 and VM5 of Figure 1 above.  Assume that
   connectivity is required between these two VMs, where VM1 belongs to
   the IP1 subnet whereas VM5 belongs to the IP5 subnet; the IP1 and
   IP5 subnets belong to the same IP VPN.  NVE3 has an EVI5 associated
   with the IP5 subnet and NVE1 has an EVI1 associated with the IP1
   subnet.  Both EVI1 and EVI5 are associated with VRFs that belong to
   the IP VPN that includes the IP1 and IP5 subnets.  Due to the
   gateway at the edge of DCN 1, NVE1 does not have the address of VM5
   in its VRF table; instead, it has a default route in its VRF with
   the next hop being the GW.

2.4 Switching among IP-VPN sites and EVIs with route aggregation

   In this scenario (within the context of a particular EVPN instance),
   connectivity is required between hosts (e.g., VMs) in a data center
   and hosts in an enterprise site that belongs to a given IP-VPN.  The
   NVE within the data center is an EVPN NVE, whereas the enterprise
   site is attached to an IP-VPN PE.  Furthermore, the data center in
   question has a gateway as the WAN edge switch.  Because of that, the
   NVE in the data center does not need to maintain the individual IP
   prefixes advertised by the enterprise sites (by the IP-VPN PEs).

   As an example, consider end-station H1 and VM2 of Figure 1.  Assume
   that connectivity is required between the end-station and the VM,
   where VM2 belongs to the IP2 subnet that is realized using EVPN,
   whereas H1 belongs to an IP VPN site connected to PE1 (PE1 maintains
   an IP VPN VRF associated with that IP VPN).  NVE1 has an EVI2
   associated with the IP2 subnet.  Moreover, NVE1 maintains a VRF
   associated with EVI2.  PE1 originates a VPN-IP route that covers H1.
   The gateway at the edge of DCN 1 performs the interworking function
   between IP-VPN and EVPN.  As a result, a default route in the VRF
   associated with EVI2, pointing to the gateway as the next hop, and a
   route to VM2 (or to the IP2 subnet) in H1's VRF on PE1 are
   sufficient for connectivity between H1 and VM2.

3 Default L3 Gateway Addressing

3.1 Homogeneous Environment

   This is an environment where all NVEs to which an EVPN instance
   could potentially be attached (or moved) perform inter-subnet
   switching.  Therefore, inter-subnet traffic can be locally switched
   by the EVPN NVE connecting the VMs belonging to different subnets.

   To support such inter-subnet forwarding, the NVE behaves as an IP
   Default Gateway from the perspective of the attached end-stations
   (e.g., VMs).  Two models are possible, as discussed in
   [DC-MOBILITY]:

   1. All the EVIs of a given EVPN instance use the same anycast
      default gateway IP address and the same anycast default gateway
      MAC address.
      On each NVE, this default gateway IP/MAC address corresponds to
      the IRB interface of the EVI associated with that EVPN instance.

   2. Each EVI of a given EVPN instance uses its own default gateway IP
      and MAC addresses, and these addresses are aliased to the same
      conceptual gateway through the use of the Default Gateway
      extended community as specified in [EVPN], which is carried in
      the EVPN MAC Advertisement routes.  On each NVE, this default
      gateway IP/MAC address corresponds to the IRB interface of the
      EVI associated with that EVPN instance.

   Both of these models enable a packet-forwarding paradigm in which
   inter-subnet traffic can bypass the VRF processing on the egress
   (i.e., disposition) NVE.  The egress NVE merely needs to perform a
   lookup in the associated EVI and forward the Ethernet frames
   unmodified, i.e., without rewriting the source MAC address.  This is
   different from traditional IRB forwarding, where a packet is
   forwarded through the bridging module followed by the routing module
   on the ingress NVE, and then through the routing module followed by
   the bridging module on the egress NVE.  For inter-subnet forwarding
   using EVPN, the routing module on the egress NVE can be completely
   bypassed.

   It is worth noting that if the applications running on the hosts
   (e.g., VMs) employ or rely on any form of MAC security, then the
   first model (i.e., using anycast addresses) would be required to
   ensure that the applications receive traffic from the same source
   MAC address that they are sending to.

3.2 Heterogeneous Environment

   In large data centers with thousands of servers and ToR (or access)
   switches, some of the switches may not be capable of maintaining or
   enforcing policies for inter-subnet switching.  Even though policies
   among multiple subnets belonging to the same tenant can be simpler,
   hosts belonging to one tenant can also send traffic to peers
   belonging to different tenants or security zones.  An L3GW not only
   needs to enforce policies for communication among subnets belonging
   to a single tenant, but also needs to know how to handle traffic
   destined towards peers in different tenants.  Therefore, there can
   be a mixed environment where an NVE performs inter-subnet switching
   for some EVPN instances but not others.

4 Operational Models for Inter-Subnet Forwarding

4.1 Among EVPN NVEs within a DC

   When an EVPN MAC advertisement route is received by the NVE, the IP
   address associated with the route is used to populate the VRF,
   whereas the MAC address associated with the route is used to
   populate both the bridge-domain MAC table and the adjacency
   associated with the IP route in the VRF.

   When an Ethernet frame is received by an ingress NVE, it performs a
   lookup on the destination MAC address in the associated EVI.  If the
   MAC address corresponds to its IRB Interface MAC address, the
   ingress NVE deduces that the packet MUST be inter-subnet routed.
   Hence, the ingress NVE performs an IP lookup in the associated VRF
   table.  The lookup identifies both the next-hop (i.e., egress) NVE
   to which the packet must be forwarded and an adjacency that contains
   a MAC rewrite and an MPLS label stack.  The MAC rewrite holds the
   MAC address associated with the destination host (as populated by
   the EVPN MAC route), instead of the MAC address of the next-hop NVE.
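   The control-plane installation and the ingress classification just
   described can be summarized with the following minimal Python
   sketch.  It is purely illustrative: the table structures, field
   names, and values are hypothetical and are not defined by this
   document or by [EVPN].

      # Hypothetical sketch: installing a received EVPN MAC/IP
      # Advertisement route and classifying an incoming frame on an NVE.
      from dataclasses import dataclass

      @dataclass
      class Adjacency:
          host_mac: str    # MAC rewrite: the destination host's MAC
          next_hop: str    # egress NVE (BGP next hop)
          lsp_label: int   # transport label towards the next hop
          evi_label: int   # EVI label advertised by the egress NVE

      class Nve:
          def __init__(self, irb_mac):
              self.irb_mac = irb_mac
              self.bridge_mac_table = {}  # MAC -> next hop / local port
              self.vrf = {}               # host IP -> Adjacency

          def on_mac_route(self, mac, ip, next_hop, lsp_label, evi_label):
              # The MAC populates the bridge-domain table; the IP
              # populates the VRF, with the MAC kept in the adjacency
              # for the later rewrite.
              self.bridge_mac_table[mac] = next_hop
              self.vrf[ip] = Adjacency(mac, next_hop, lsp_label, evi_label)

          def classify(self, dest_mac):
              # A destination MAC equal to the IRB Interface MAC means
              # the frame is inter-subnet routed; otherwise it is bridged.
              return "route" if dest_mac == self.irb_mac else "bridge"

      if __name__ == "__main__":
          nve1 = Nve(irb_mac="00:aa:bb:cc:dd:01")
          nve1.on_mac_route("00:00:00:00:00:02", "10.2.0.2",
                            next_hop="NVE2", lsp_label=300, evi_label=16001)
          print(nve1.classify("00:aa:bb:cc:dd:01"))   # -> 'route'
          print(nve1.vrf["10.2.0.2"])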
   The ingress NVE then rewrites the destination MAC address in the
   packet with the address specified in the adjacency.  It also
   rewrites the source MAC address with its IRB Interface MAC address.
   The ingress NVE then forwards the frame to the next-hop (i.e.,
   egress) NVE after encapsulating it with the MPLS label stack.  Note
   that this label stack includes the LSP label as well as the EVI
   label that was advertised by the egress NVE.  When the MPLS
   encapsulated packet is received by the egress NVE, it uses the EVI
   label to identify the bridge-domain table.  It then performs a MAC
   lookup in that table, which yields the outbound interface to which
   the Ethernet frame must be forwarded.  Figure 2 below depicts the
   packet flow, where NVE1 and NVE2 are the ingress and egress NVEs,
   respectively.

            NVE1                              NVE2
       +------------+                   +------------+
       |  ...  ...  |                   |  ...  ...  |
       |(EVI)-(VRF) |                   |(VRF)-(EVI) |
       | .|.   .|.  |                   | ...   |..| |
       +------------+                   +------------+
         ^       v                        ^        v
         |       |                        |        |
   VM1->-+       +-->---------------------+        +->-VM2

     Figure 2: Inter-Subnet Forwarding Among EVPN NVEs within a DC

   Note that the forwarding behavior on the egress NVE is similar to
   EVPN intra-subnet forwarding.  In other words, all the packet
   processing associated with the inter-subnet forwarding semantics is
   confined to the ingress NVE.

   It should also be noted that [EVPN] provides different levels of
   granularity for the EVI label.  Besides identifying the bridge-
   domain table, it can be used to identify the egress interface or a
   destination MAC address on that interface.  If the EVI label is used
   for egress interface or destination MAC address identification, then
   no MAC lookup is needed in the egress EVI, and the packet can be
   forwarded to the egress interface based on the EVI label lookup
   alone.

4.2 Among EVPN NVEs in Different DCs Without Route Aggregation

   When an EVPN MAC advertisement route is received by the NVE, the IP
   address associated with the route is used to populate the VRF,
   whereas the MAC address associated with the route is used to
   populate both the bridge-domain MAC table and the adjacency
   associated with the IP route in the VRF.

   When an Ethernet frame is received by an ingress NVE, it performs a
   lookup on the destination MAC address in the associated EVI.  If the
   MAC address corresponds to its IRB Interface MAC address, the
   ingress NVE deduces that the packet MUST be inter-subnet routed.
   Hence, the ingress NVE performs an IP lookup in the associated VRF
   table.  The lookup identifies both the next-hop (i.e., egress)
   Gateway to which the packet must be forwarded and an adjacency that
   contains a MAC rewrite and an MPLS label stack.  The MAC rewrite
   holds the MAC address associated with the destination host (as
   populated by the EVPN MAC route), instead of the MAC address of the
   next-hop Gateway.  The ingress NVE then rewrites the destination MAC
   address in the packet with the address specified in the adjacency.
   It also rewrites the source MAC address with its IRB Interface MAC
   address.  The ingress NVE then forwards the frame to the next-hop
   (i.e., egress) Gateway after encapsulating it with the MPLS label
   stack.  Note that this label stack includes the LSP label as well as
   an EVI label.
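   The rewrite and encapsulation steps performed by the ingress NVE,
   which are common to Section 4.1 and this section, can be sketched as
   follows.  Again, this is only a hypothetical illustration; the frame
   and adjacency representations are made up for the example and are
   not defined by this document.

      # Hypothetical sketch of the ingress NVE rewrite/encapsulation.
      from collections import namedtuple

      Adjacency = namedtuple("Adjacency",
                             "host_mac next_hop lsp_label evi_label")

      def route_and_encapsulate(frame, vrf, irb_mac):
          """frame: dict with 'dst_mac', 'src_mac', 'dst_ip', 'payload'."""
          adj = vrf[frame["dst_ip"]]              # IP lookup in the VRF
          rewritten = dict(frame,
                           dst_mac=adj.host_mac,  # destination host's MAC
                           src_mac=irb_mac)       # ingress IRB MAC
          # Label stack: the outer LSP label carries the packet to the
          # next hop (egress NVE or Gateway); the inner EVI label
          # identifies the EVI / bridge domain at the disposition end.
          return {"labels": [adj.lsp_label, adj.evi_label],
                  "inner": rewritten,
                  "next_hop": adj.next_hop}

      if __name__ == "__main__":
          vrf = {"10.2.0.2":
                 Adjacency("00:00:00:00:00:02", "NVE2", 300, 16001)}
          frame = {"dst_mac": "00:aa:bb:cc:dd:01",   # ingress IRB MAC
                   "src_mac": "00:00:00:00:00:01",
                   "dst_ip": "10.2.0.2", "payload": b"..."}
          print(route_and_encapsulate(frame, vrf,
                                      irb_mac="00:aa:bb:cc:dd:01"))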
   The EVI label is advertised either by the ingress Gateway, if
   inter-AS option B is used, or by the egress NVE, if inter-AS option
   C is used.  When the MPLS encapsulated packet is received by the
   ingress Gateway, the processing again differs depending on whether
   inter-AS option B or option C is employed: in the former case, the
   ingress Gateway swaps the EVI label in the packet with the EVI label
   value received from the egress Gateway; in the latter case, the
   ingress Gateway does not modify the EVI label and performs normal
   label switching on the LSP label.  Similarly, on the egress Gateway:
   for option B, the egress Gateway swaps the EVI label with the value
   advertised by the egress NVE, whereas for option C, it does not
   modify the EVI label and performs normal label switching on the LSP
   label.  When the MPLS encapsulated packet is received by the egress
   NVE, it uses the EVI label to identify the bridge-domain table.  It
   then performs a MAC lookup in that table, which yields the outbound
   interface to which the Ethernet frame must be forwarded.  Figure 3
   below depicts the packet flow.

         NVE1              GW1              GW2              NVE2
    +------------+    +------------+   +------------+   +------------+
    |  ...  ...  |    |    ...     |   |    ...     |   |  ...  ...  |
    |(EVI)-(VRF) |    |   [LS ]    |   |   [LS ]    |   |(VRF)-(EVI) |
    | .|.   .|.  |    |   |..|     |   |   |..|     |   | ...   |..| |
    +------------+    +------------+   +------------+   +------------+
      ^       v         ^        v       ^        v       ^        v
      |       |         |        |       |        |       |        |
 VM1->-+      +-->-------+        +-------+        +-------+      +->-VM2

    Figure 3: Inter-Subnet Forwarding Among EVPN NVEs in Different DCs
                         without Route Aggregation

4.3 Among EVPN NVEs in Different DCs with Route Aggregation

   In this scenario, the NVEs within a given data center do not have
   entries for the MAC/IP addresses of hosts in remote data centers.
   Rather, the NVEs have a default IP route pointing to the WAN gateway
   for each VRF.  This is accomplished by the WAN gateway advertising,
   for a given EVPN instance that spans multiple DCs, a default VPN-IP
   route that is imported by the NVEs of that EVPN instance that are in
   the gateway's own DC.

   When an Ethernet frame is received by an ingress NVE, it performs a
   lookup on the destination MAC address in the associated EVI.  If the
   MAC address corresponds to the IRB Interface MAC address, the
   ingress NVE deduces that the packet MUST be inter-subnet routed.
   Hence, the ingress NVE performs an IP lookup in the associated VRF
   table.  The lookup, in this case, matches the default route, which
   points to the local WAN gateway.  The ingress NVE then rewrites the
   destination MAC address in the packet with the IRB Interface MAC
   address of the local WAN gateway.  It also rewrites the source MAC
   address with its own IRB Interface MAC address.  The ingress NVE
   then forwards the frame to the WAN gateway after encapsulating it
   with the MPLS label stack.  Note that this label stack includes the
   LSP label as well as the IP-VPN label that was advertised by the
   local WAN gateway.  When the MPLS encapsulated packet is received by
   the local WAN gateway, it uses the IP-VPN label to identify the VRF
   table.  It then performs an IP lookup in that table.  The lookup
   identifies both the remote WAN gateway (of the remote data center)
   to which the packet must be forwarded and an adjacency that contains
   a MAC rewrite and an MPLS label stack.
   The MAC rewrite holds the MAC address associated with the ultimate
   destination host (as populated by the EVPN MAC route).  The local
   WAN gateway then rewrites the destination MAC address in the packet
   with the address specified in the adjacency.  It also rewrites the
   source MAC address with its IRB Interface MAC address.  The local
   WAN gateway then forwards the frame to the remote WAN gateway after
   encapsulating it with the MPLS label stack.  Note that this label
   stack includes the LSP label as well as a VPN label that was
   advertised by the remote WAN gateway.  When the MPLS encapsulated
   packet is received by the remote WAN gateway, it simply swaps the
   VPN label with the EVI label advertised by the egress NVE.  This
   implies that the remote WAN gateway must allocate the VPN label at
   least at the granularity of a (VRF, egress NVE) tuple.  The remote
   WAN gateway then forwards the packet to the egress NVE.  The egress
   NVE then performs a MAC lookup in the EVI (identified by the
   received EVI label) to determine the outbound port to send the
   traffic on.

   Figure 4 below depicts the forwarding model.

         NVE1              GW1              GW2              NVE2
    +------------+    +------------+   +------------+   +------------+
    |  ...  ...  |    |  ...  ...  |   |    ...     |   |  ...  ...  |
    |(EVI)-(VRF) |    |(VRF)-(EVI) |   |   [LS ]    |   |(VRF)-(EVI) |
    | .|.   .|.  |    |   |..|     |   |   |...|    |   | ...   |..| |
    +------------+    +------------+   +------------+   +------------+
      ^       v         ^        v       ^        v       ^        v
      |       |         |        |       |        |       |        |
 VM1->-+      +-->-------+        +-------+        +-------+      +->-VM2

    Figure 4: Inter-Subnet Forwarding Among EVPN NVEs in Different DCs
                          with Route Aggregation

4.4 Among IP-VPN Sites and EVPN NVEs with Route Aggregation

   In this scenario, the NVEs within a given data center do not have
   entries for the IP addresses of hosts in remote enterprise sites.
   Rather, the NVEs have a default IP route pointing to the WAN gateway
   for each VRF.

   When an Ethernet frame is received by an ingress NVE, it performs a
   lookup on the destination MAC address in the associated EVI.  If the
   MAC address corresponds to the IRB Interface MAC address, the
   ingress NVE deduces that the packet MUST be inter-subnet routed.
   Hence, the ingress NVE performs an IP lookup in the associated VRF
   table.  The lookup, in this case, matches the default route, which
   points to the local WAN gateway.  The ingress NVE then rewrites the
   destination MAC address in the packet with the IRB Interface MAC
   address of the local WAN gateway.  It also rewrites the source MAC
   address with its own IRB Interface MAC address.  The ingress NVE
   then forwards the frame to the WAN gateway after encapsulating it
   with the MPLS label stack.  Note that this label stack includes the
   LSP label as well as the IP-VPN label that was advertised by the
   local WAN gateway.  When the MPLS encapsulated packet is received by
   the local WAN gateway, it uses the IP-VPN label to identify the VRF
   table.  It then performs an IP lookup in that table.  The lookup
   identifies the next-hop ASBR to which the packet must be forwarded.
   The local gateway in this case strips the Ethernet encapsulation and
   forwards the IP packet to the ASBR using a label stack comprising an
   LSP label and a VPN label that was advertised by the ASBR.  When the
   MPLS encapsulated packet is received by the ASBR, it simply swaps
   the VPN label with the IP-VPN label advertised by the egress PE.
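   The label handling at the node performing this swap (the remote WAN
   gateway in Section 4.3, the ASBR in this section) can be sketched as
   follows.  This is a hypothetical illustration only; the class, label
   values, and structures are made up for the example and are not
   defined by this document.

      # Hypothetical sketch: the swapping node allocates one local VPN
      # label per (VRF, egress node) tuple and later swaps it, in the
      # forwarding path, for the label advertised by that egress node.
      class SwappingNode:
          def __init__(self, first_label=17000):
              self._next_label = first_label
              self._per_tuple = {}  # (vrf, egress) -> local label
              self.swap_table = {}  # local label -> (egress, out label)

          def allocate(self, vrf, egress, egress_label):
              key = (vrf, egress)
              if key not in self._per_tuple:
                  label = self._next_label
                  self._next_label += 1
                  self._per_tuple[key] = label
                  self.swap_table[label] = (egress, egress_label)
              return self._per_tuple[key]  # advertised upstream

          def forward(self, in_label, packet):
              # Pure label swap: no IP or MAC lookup is needed here.
              egress, out_label = self.swap_table[in_label]
              return egress, out_label, packet

      if __name__ == "__main__":
          asbr = SwappingNode()
          local = asbr.allocate("VRF-1", "PE1", egress_label=24001)
          print(asbr.forward(local, b"packet"))  # ('PE1', 24001, ...)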
   This implies that the ASBR must allocate the VPN label at least at
   the granularity of a (VRF, egress PE) tuple.  The ASBR then forwards
   the packet to the egress PE.  The egress PE then performs an IP
   lookup in the VRF (identified by the received IP-VPN label) to
   determine where to forward the traffic.

   Figure 5 below depicts the forwarding model.

         NVE1              GW1             ASBR              PE1
    +------------+    +------------+   +------------+   +------------+
    |  ...  ...  |    |  ...  ...  |   |    ...     |   |    ...     |
    |(EVI)-(VRF) |    |(VRF)-(EVI) |   |   [LS ]    |   |    (VRF)   |
    | .|.   .|.  |    |   |..|     |   |   |...|    |   |    |..|    |
    +------------+    +------------+   +------------+   +------------+
      ^       v         ^        v       ^        v       ^        v
      |       |         |        |       |        |       |        |
 VM1->-+      +-->-------+        +-------+        +-------+       +->-H1

     Figure 5: Inter-Subnet Forwarding Among IP-VPN Sites and EVPN NVEs
                          with Route Aggregation

4.5 Use of Centralized Gateway

   In this scenario, the NVEs within a given data center need to
   forward traffic at L2 to a centralized L3GW for a number of reasons:
   a) they do not have IRB capabilities, or b) they do not have the
   required policy for switching traffic between different tenants or
   security zones.  The centralized L3GW performs the IRB function for
   switching traffic among different EVPN instances, and it also
   performs the interworking function when traffic needs to be switched
   between IP-VPN sites and EVPN instances.

5 VM Mobility

5.1 VM Mobility & Optimum Forwarding for VM's Outbound Traffic

   Optimum forwarding for the VM's outbound traffic, upon VM mobility,
   can be achieved using either the anycast default gateway MAC and IP
   addresses or the address aliasing discussed in [DC-MOBILITY].

5.2 VM Mobility & Optimum Forwarding for VM's Inbound Traffic

   For optimum forwarding of the VM's inbound traffic, upon VM
   mobility, all the NVEs and/or IP-VPN PEs need to know the up-to-date
   location of the VM.  Two scenarios must be considered, as discussed
   next.

   In what follows, we use the following terminology:

   - source NVE refers to the NVE behind which the VM resided prior to
     the VM mobility event.

   - target NVE refers to the new NVE behind which the VM has moved
     after the mobility event.

5.2.1 Mobility without Route Aggregation

   In this scenario, when a target NVE detects that a MAC mobility
   event has occurred, it initiates the MAC mobility handshake in BGP
   as specified in [EVPN].  The WAN Gateways, acting as ASBRs in this
   case, re-advertise the MAC route of the target NVE with the MAC
   Mobility extended community attribute unmodified.  Because the WAN
   Gateway for a given data center re-advertises BGP routes received
   from the WAN into the data center, the source NVE will receive the
   MAC Advertisement route of the target NVE (with the next-hop
   attribute adjusted depending on which inter-AS option is employed).
   The source NVE will then withdraw its original MAC Advertisement
   route as a result of evaluating the Sequence Number field of the MAC
   Mobility extended community in the received MAC Advertisement route.
   This is per the procedures already defined in [EVPN].

5.2.2 Mobility with Route Aggregation

   This section will be completed in the next revision.

6 Acknowledgements

   The authors would like to thank Sami Boutros for his valuable
   comments.
7 Security Considerations

8 IANA Considerations

9 References

9.1 Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

9.2 Informative References

   [EVPN]     Sajassi, A., et al., "BGP MPLS Based Ethernet VPN",
              draft-ietf-l2vpn-evpn-04, work in progress, July 2013.

   [EVPN-IPVPN-INTEROP]
              Sajassi, A., et al., "EVPN Seamless Interoperability with
              IP-VPN", draft-sajassi-l2vpn-evpn-ipvpn-interop-01, work
              in progress, October 2012.

   [DC-MOBILITY]
              Aggarwal, R., et al., "Data Center Mobility based on
              BGP/MPLS, IP Routing and NHRP",
              draft-raggarwa-data-center-mobility-05, work in progress,
              June 2013.

Authors' Addresses

   Ali Sajassi
   Cisco
   Email: sajassi@cisco.com

   Samer Salam
   Cisco
   Email: ssalam@cisco.com

   Yakov Rekhter
   Juniper Networks
   Email: yakov@juniper.net

   John E. Drake
   Juniper Networks
   Email: jdrake@juniper.net

   Lucy Yong
   Huawei Technologies
   Email: lucy.yong@huawei.com

   Linda Dunbar
   Huawei Technologies
   Email: linda.dunbar@huawei.com

   Wim Henderickx
   Alcatel-Lucent
   Email: wim.henderickx@alcatel-lucent.com

   Florin Balus
   Alcatel-Lucent
   Email: Florin.Balus@alcatel-lucent.com