Internet Engineering Task Force                          T. Narten, Ed.
Internet-Draft                                                       IBM
Intended status: Informational                             M. Sridharan
Expires: December 17, 2012                                     Microsoft
                                                                 D. Dutt
                                                                D. Black
                                                                     EMC
                                                              L. Kreeger
                                                                   Cisco
                                                           June 15, 2012

        Problem Statement: Overlays for Network Virtualization
           draft-narten-nvo3-overlay-problem-statement-02

Abstract

This document describes issues associated with providing multi-tenancy in large data center networks and an overlay-based network virtualization approach to addressing them. A key multi-tenancy requirement is traffic isolation, so that a tenant's traffic is not visible to any other tenant. This isolation can be achieved by assigning one or more virtual networks to each tenant such that traffic within a virtual network is isolated from traffic in other virtual networks. The primary functionality required is provisioning virtual networks, associating a virtual machine's NIC with the appropriate virtual network, and maintaining that association as the virtual machine is activated, migrated and/or deactivated. Use of an overlay-based approach enables scalable deployment on large network infrastructures.

Status of this Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on December 17, 2012.

Copyright Notice

Copyright (c) 2012 IETF Trust and the persons identified as the document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (http://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Problem Details
       2.1.  Multi-tenant Environment Scale
       2.2.  Virtual Machine Mobility Requirements
       2.3.  Span of Virtual Networks
       2.4.  Inadequate Forwarding Table Sizes in Switches
       2.5.  Decoupling Logical and Physical Configuration
       2.6.  Support Communication Between VMs and Non-virtualized Devices
       2.7.  Overlay Design Characteristics
   3.  Network Overlays
       3.1.  Limitations of Existing Virtual Network Models
       3.2.  Benefits of Network Overlays
       3.3.  Overlay Networking Work Areas
   4.  Related Work
       4.1.  IEEE 802.1aq - Shortest Path Bridging
       4.2.  ARMD
       4.3.  TRILL
       4.4.  L2VPNs
       4.5.  Proxy Mobile IP
       4.6.  LISP
       4.7.  Individual Submissions
   5.  Further Work
   6.  Summary
   7.  Acknowledgments
   8.  IANA Considerations
   9.  Security Considerations
   10. Informative References
   Appendix A.  Change Log
       A.1.  Changes from -01
   Authors' Addresses

1. Introduction

Server virtualization is increasingly becoming the norm in data centers. With server virtualization, each physical server supports multiple virtual machines (VMs), each running its own operating system, middleware and applications. Virtualization is a key enabler of workload agility, i.e., allowing any server to host any application and providing the flexibility of adding, shrinking, or moving services within the physical infrastructure.
Server virtualization provides numerous benefits, including higher utilization, increased data security, reduced user downtime, reduced power usage, etc.

Large-scale multi-tenant data centers are taking advantage of the benefits of server virtualization to provide a new kind of hosting, a virtual hosted data center. Multi-tenant data centers are ones where individual tenants could belong to a different company (in the case of a public provider) or a different department (in the case of an internal company data center). Each tenant has the expectation of a level of security and privacy separating their resources from those of other tenants. For example, one tenant's traffic must never be exposed to another tenant, except through carefully controlled interfaces, such as a security gateway.

To a tenant, virtual data centers are similar to their physical counterparts, consisting of end stations attached to a network, complete with services such as load balancers and firewalls. But unlike a physical data center, end stations connect to a virtual network. To end stations, a virtual network looks like a normal network (e.g., providing an Ethernet service), except that the only end stations connected to the virtual network are those belonging to the tenant.

A tenant is the administrative entity that is responsible for and manages a specific virtual network instance and its associated services (whether virtual or physical). In a cloud environment, a tenant would correspond to the customer that has defined and is using a particular virtual network. A tenant may also find it useful to create multiple different virtual network instances. Hence, there is a one-to-many mapping between tenants and virtual network instances: a single tenant may operate multiple individual virtual network instances, each associated with a different service.

How a virtual network is implemented does not matter to the tenant. It could be a pure routed network, a pure bridged network, or a combination of bridged and routed networks. The key requirement is that each individual virtual network instance be isolated from other virtual network instances.

This document outlines the problems encountered in scaling the number of isolated networks in a data center, as well as the problems of managing the creation/deletion, membership, and span of these networks. It makes the case that an overlay-based approach, in which individual virtual networks are implemented as overlays dynamically controlled by a standardized control plane, provides a number of advantages over current approaches. The purpose of this document is to identify the set of problems that any solution has to address in building multi-tenant data centers; the goal is to enable standardized, interoperable implementations for building such data centers.

Section 2 describes the problem space details. Section 3 describes network overlays in more detail and the potential work areas. Sections 4 and 5 review related and further work, while Section 6 closes with a summary.

2. Problem Details

The following subsections describe aspects of multi-tenant networking that pose problems for large scale network infrastructure.
Different problem aspects may arise based on the network architecture and scale.

2.1. Multi-tenant Environment Scale

Cloud computing involves on-demand elastic provisioning of resources for multi-tenant environments. A common example of cloud computing is the public cloud, where a cloud service provider offers these elastic services to multiple customers over the same infrastructure. This elastic, on-demand nature, in conjunction with trusted hypervisors controlling network access by VMs, calls for resilient, distributed network control mechanisms.

2.2. Virtual Machine Mobility Requirements

A key benefit of server virtualization is virtual machine (VM) mobility. A VM can be migrated from one server to another "live", i.e., while it continues to run, without shutting down the VM and restarting it at the new location. A key requirement for live migration is that a VM retain its IP address(es) and MAC address(es) in its new location (to avoid tearing down existing communication). Today, servers are assigned IP addresses based on their physical location, typically based on the ToR (Top of Rack) switch for the server rack or the VLAN configured to the server. This works well for physical servers, which cannot move, but it restricts the placement and movement of the more mobile VMs within the data center (DC). Any solution for a scalable multi-tenant DC must allow a VM to be placed (or moved to) anywhere within the data center, without being constrained by the subnet boundary concerns of the host servers.

2.3. Span of Virtual Networks

Another use case is cross-pod expansion. A pod typically consists of one or more racks of servers with its associated network and storage connectivity. Tenants may start off on a pod and, due to expansion, require servers/VMs on other pods, especially when tenants on the other pods are not fully utilizing all their resources. This use case requires that virtual networks span multiple pods in order to provide connectivity to all of the tenant's servers/VMs.

2.4. Inadequate Forwarding Table Sizes in Switches

Today's virtualized environments place additional demands on the forwarding tables of switches. Instead of just one link-layer address per server, the switching infrastructure has to learn addresses of the individual VMs (which could range in the 100s per server). This is a requirement because traffic to and from the VMs and the rest of the physical network traverses the physical network infrastructure. This places a much larger demand on the switches' forwarding table capacity compared to non-virtualized environments, causing more traffic to be flooded or dropped when the addresses in use exceed the forwarding table capacity.
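To make the scale concern concrete, the back-of-the-envelope sketch below uses purely illustrative numbers (the rack, server, VM, and table-size figures are assumptions for this example, not values from this document):

```python
# Illustrative arithmetic only; all counts below are assumed for the example.

racks = 40                 # racks reachable through an aggregation switch
servers_per_rack = 40      # physical servers per rack
vms_per_server = 50        # VMs per server ("100s per server" is possible)

macs_physical_only = racks * servers_per_rack
macs_with_vms = racks * servers_per_rack * vms_per_server

mac_table_capacity = 32_000   # hypothetical switch forwarding table size

print(f"MAC entries, physical servers only: {macs_physical_only:,}")  # 1,600
print(f"MAC entries, virtualized servers:   {macs_with_vms:,}")       # 80,000
print("table exceeded (flooding/drops likely):",
      macs_with_vms > mac_table_capacity)                             # True
```

Even modest per-server VM counts can push the number of learned addresses well past typical forwarding table capacities, which is the flooding/drop risk described above.
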
2.5. Decoupling Logical and Physical Configuration

Data center operators must be able to achieve high utilization of server and network capacity. For efficient and flexible allocation, operators should be able to spread a virtual network instance across servers in any rack in the data center. It should also be possible to migrate compute workloads to any server anywhere in the network while retaining the workload's addresses. This can be achieved today by stretching VLANs (e.g., by using TRILL or SPB).

However, in order to limit the broadcast domain of each VLAN, multi-destination frames within a VLAN should optimally flow only to those devices that have that VLAN configured. When workloads migrate, the physical network (e.g., access lists) may need to be reconfigured, which is typically time consuming and error prone.

2.6. Support Communication Between VMs and Non-virtualized Devices

Within data centers, not all communication will be between VMs. Network operators will continue to use non-virtualized servers for various reasons, as well as traditional routers providing L2VPN and L3VPN services, traditional load balancers, firewalls, intrusion detection engines, and so on. Any virtual network solution should be capable of working with these existing systems.

2.7. Overlay Design Characteristics

Layer 2 overlay protocols already exist, but they were not necessarily designed to solve this problem in the environment of a highly virtualized data center. Below are some characteristics of such environments that the overlay technology must take into account:

1. Highly distributed systems. The overlay should work in an environment where there could be many thousands of access switches (e.g., residing within the hypervisors) and many more end systems (e.g., VMs) connected to them. This calls for a distributed mapping system that puts low overhead on the overlay tunnel endpoints.

2. Many highly distributed virtual networks with sparse membership. Each virtual network could be highly dispersed inside the data center. Also, while many virtual networks are expected, the number of end systems connected to any one virtual network is expected to be relatively low; therefore, the percentage of access switches participating in any given virtual network would also be expected to be low (a rough illustration follows this list). For this reason, efficient pruning of multi-destination traffic should be taken into consideration.

3. Highly dynamic end systems. End systems connected to virtual networks can be very dynamic, both in terms of creation/deletion/power-on/off and in terms of mobility across the access switches.

4. Work with existing, widely deployed Ethernet switches and IP routers without requiring wholesale replacement. The first hop switch that adds and removes the overlay header will require new equipment and/or new software.

5. Network infrastructure administered by a single administrative domain. This is consistent with operation within a data center, and not across the Internet.
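As a rough illustration of the sparse-membership point in item 2 above, the short sketch below uses assumed numbers (not values from this document) to estimate how few access switches participate in any one virtual network:

```python
# Assumed, illustrative numbers only.
access_switches = 10_000      # e.g., one virtual switch per hypervisor
end_systems_in_vn = 50        # end systems attached to one virtual network

# Worst case for dispersion: every end system of this virtual network
# sits behind a different access switch.
participating = min(end_systems_in_vn, access_switches)
share = participating / access_switches

print(f"Access switches participating in this virtual network: {share:.2%}")
# -> 0.50%.  Flooding multi-destination traffic to all switches would mostly
#    reach switches with no members, hence the interest in efficient pruning.
```
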
3. Network Overlays

Virtual networks are used to isolate a tenant's traffic from that of other tenants (or even traffic within the same tenant that requires isolation). There are two main characteristics of virtual networks:

1. Providing network address space that is isolated from other virtual networks. The same network addresses may be used in different virtual networks on the same underlying network infrastructure.

2. Limiting the scope of frames sent on the virtual network. Frames sent by end systems attached to a virtual network are delivered as expected to other end systems on that virtual network and may exit a virtual network only through controlled exit points, such as a security gateway. Likewise, frames sourced outside of the virtual network may enter the virtual network only through controlled entry points, such as a security gateway.

3.1. Limitations of Existing Virtual Network Models

Virtual networks are not new to networking. For example, VLANs are a well known construct in the networking industry. A VLAN is an L2 bridging construct that provides some of the semantics of virtual networks mentioned above: a MAC address is unique within a VLAN, but not necessarily across VLANs. Traffic sourced within a VLAN (including broadcast and multicast traffic) remains within the VLAN it originates from. Traffic forwarded from one VLAN to another typically involves router (L3) processing. The forwarding table lookup operation is keyed on {VLAN, MAC address} tuples.

But there are problems and limitations with L2 VLANs. VLANs are a pure L2 bridging construct, and VLAN identifiers are carried along with data frames to allow each forwarding point to know what VLAN the frame belongs to. A VLAN today is defined as a 12-bit number, limiting the total number of VLANs to 4096 (though typically this number is 4094, since 0 and 4095 are reserved). Due to the large number of tenants that a cloud provider might service, the 4094 VLAN limit is often inadequate. In addition, there is often a need for multiple VLANs per tenant, which exacerbates the issue.

In the case of IP networks, many routers provide a Virtual Routing and Forwarding (VRF) service. The same router operates multiple instances of forwarding tables, one for each tenant. Each forwarding table instance is populated separately via routing protocols, either running (conceptually) as separate instances for each VRF, or as a single instance-aware routing protocol that supports VRFs directly (e.g., [RFC4364]). Each VRF instance provides address and traffic isolation. The forwarding table lookup operation is keyed on {VRF, IP address} tuples.

VRFs are a pure routing construct and do not have end-to-end significance, in the sense that the data plane does not carry a VRF indicator on an end-to-end basis. Instead, the VRF is derived at each hop using a combination of the incoming interface and some information in the frame (e.g., a local VLAN tag). Furthermore, the VRF model has typically assumed that a separate control plane governs the population of the forwarding table within that VRF. Thus, a traditional VRF model assumes multiple, independent control planes and has no specific tag within a data frame to identify the VRF of the frame.

There are a number of VPN approaches that provide some of the desired semantics of virtual networks (e.g., [RFC4364]). But VPN approaches have traditionally been deployed across WANs and have not seen widespread deployment within enterprise data centers. They are not necessarily seen as supporting the characteristics outlined in Section 2.7.
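The two lookup keyings described above can be made concrete with a small data-structure sketch. This is illustrative only (the VLAN/VRF names, MAC addresses, and IP addresses are made up): a bridge keys forwarding state on {VLAN, MAC address}, a VRF-capable router keys it on {VRF, IP address}, and in both cases the same inner address can safely recur in different virtual contexts:

```python
# Illustrative only: made-up VLAN/VRF names, MAC and IP addresses.

# L2 bridging: forwarding state keyed on (VLAN ID, MAC address).
l2_fdb = {
    (100, "52:54:00:aa:bb:01"): "port-1",
    (100, "52:54:00:aa:bb:02"): "port-7",
    (200, "52:54:00:aa:bb:01"): "port-3",       # same MAC, different VLAN
}

# L3 VRF: per-tenant forwarding state keyed on (VRF, destination IP).
vrf_fib = {
    ("tenant-red",  "10.1.1.10"): "nexthop-A",
    ("tenant-blue", "10.1.1.10"): "nexthop-B",  # same IP, different VRF
}

print(l2_fdb[(200, "52:54:00:aa:bb:01")])       # -> port-3
print(vrf_fib[("tenant-blue", "10.1.1.10")])    # -> nexthop-B
```
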
The rest 366 of the network forwards the frame based on the encapsulation header 367 and can be oblivious to the payload that is carried inside. To avoid 368 belaboring the point each time, the first hop network device can be a 369 traditional switch or router or the virtual switch residing inside a 370 hypervisor. Furthermore, the endpoint can be a VM or it can be a 371 physical server. Some examples of network overlays are tunnels such 372 as IP GRE [RFC2784], LISP [I-D.ietf-lisp] or TRILL [RFC6325]. 374 With the overlay, a virtual network identifier (or VNID) can be 375 carried as part of the overlay header so that every data frame 376 explicitly identifies the specific virtual network the frame belongs 377 to. Since both routed and bridged semantics can be supported by a 378 virtual data center, the original frame carried within the overlay 379 header can be an Ethernet frame complete with MAC addresses or just 380 the IP packet. 382 The use of a large (e.g., 24-bit) VNID would allow 16 million 383 distinct virtual networks within a single data center, eliminating 384 current VLAN size limitations. This VNID needs to be carried in the 385 data plane along with the packet. Adding an overlay header provides 386 a place to carry this VNID. 388 A key aspect of overlays is the decoupling of the "virtual" MAC and 389 IP addresses used by VMs from the physical network infrastructure and 390 the infrastructure IP addresses used by the data center. If a VM 391 changes location, the switches at the edge of the overlay simply 392 update their mapping tables to reflect the new location of the VM 393 within the data center's infrastructure space. Because an overlay 394 network is used, a VM can now be located anywhere in the data center 395 that the overlay reaches without regards to traditional constraints 396 implied by L2 properties such as VLAN numbering, or the span of an L2 397 broadcast domain scoped to a single pod or access switch. 399 Multi-tenancy is supported by isolating the traffic of one virtual 400 network instance from traffic of another. Traffic from one virtual 401 network instance cannot be delivered to another instance without 402 (conceptually) exiting the instance and entering the other instance 403 via an entity that has connectivity to both virtual network 404 instances. Without the existence of this entity, tenant traffic 405 remains isolated within each individual virtual network instance. 406 External communications (from a VM within a virtual network instance 407 to a machine outside of any virtual network instance, e.g. on the 408 Internet) is handled by having an ingress switch forward traffic to 409 an external router, where an egress switch decapsulates a tunneled 410 packet and delivers it to the router for normal processing. This 411 router is external to the overlay, and behaves much like existing 412 external facing routers in data centers today. 414 Overlays are designed to allow a set of VMs to be placed within a 415 single virtual network instance, whether that virtual network 416 provides the bridged network or a routed network. 418 3.3. Overlay Networking Work Areas 420 There are three specific and separate potential work areas needed to 421 realize an overlay solution. The areas correspond to different 422 possible "on-the-wire" protocols, where distinct entities interact 423 with each other. 
A key aspect of overlays is the decoupling of the "virtual" MAC and IP addresses used by VMs from the physical network infrastructure and the infrastructure IP addresses used by the data center. If a VM changes location, the switches at the edge of the overlay simply update their mapping tables to reflect the new location of the VM within the data center's infrastructure space. Because an overlay network is used, a VM can now be located anywhere in the data center that the overlay reaches, without regard to traditional constraints implied by L2 properties such as VLAN numbering, or the span of an L2 broadcast domain scoped to a single pod or access switch.

Multi-tenancy is supported by isolating the traffic of one virtual network instance from traffic of another. Traffic from one virtual network instance cannot be delivered to another instance without (conceptually) exiting the instance and entering the other instance via an entity that has connectivity to both virtual network instances. Without the existence of this entity, tenant traffic remains isolated within each individual virtual network instance. External communication (from a VM within a virtual network instance to a machine outside of any virtual network instance, e.g., on the Internet) is handled by having an ingress switch forward traffic to an external router, where an egress switch decapsulates a tunneled packet and delivers it to the router for normal processing. This router is external to the overlay and behaves much like existing external-facing routers in data centers today.

Overlays are designed to allow a set of VMs to be placed within a single virtual network instance, whether that virtual network provides a bridged network or a routed network.

3.3. Overlay Networking Work Areas

There are three specific and separate potential work areas needed to realize an overlay solution. The areas correspond to different possible "on-the-wire" protocols, where distinct entities interact with each other.

One area of work concerns the address dissemination protocol an NVE uses to build and maintain the mapping tables it uses to deliver encapsulated frames to their proper destination. One approach is to build mapping tables entirely via learning (as is done in 802.1 networks). But to provide better scaling properties, a more sophisticated approach is needed, i.e., the use of a specialized control plane protocol. While there are some advantages to using or leveraging an existing protocol for maintaining mapping tables, the fact that large numbers of NVEs will likely reside in hypervisors places constraints on the resources (CPU and memory) that can be dedicated to such functions. For example, routing protocols (e.g., IS-IS, BGP) may have scaling difficulties if implemented directly in all NVEs, based on both flooding and convergence time concerns. This suggests that use of a standard lookup protocol between NVEs and a smaller number of network nodes that implement the actual routing protocol (or the directory-based "oracle") is a more promising approach at larger scale.

From an architectural perspective, one can view the address mapping dissemination problem as having two distinct and separable components. The first component consists of a back-end "oracle" that is responsible for distributing and maintaining the mapping information for the entire overlay system. The second component consists of the on-the-wire protocols an NVE uses when interacting with the oracle.

The back-end oracle could provide high performance, high resiliency, failover, etc. and could be implemented in different ways. For example, one model uses a traditional, centralized "directory-based" database, using replicated instances for reliability and failover (e.g., LISP-XXX). A second model involves using and possibly extending an existing routing protocol (e.g., BGP, IS-IS, etc.). To support different architectural models, it is useful to have one standard protocol for the NVE-oracle interaction while allowing different protocols and architectural approaches for the oracle itself. Separating the two allows NVEs to interact with different types of oracles, i.e., either of the two architectural models described above. Having separate protocols also allows for a simplified NVE that only interacts with the oracle for the mapping table entries it needs and allows the oracle (and its associated protocols) to evolve independently over time with minimal impact to the NVEs.

A third work area considers the attachment and detachment of VMs (or Tenant End Systems [I-D.lasserre-nvo3-framework] more generally) from a specific virtual network instance. When a VM attaches, the Network Virtualization Edge (NVE) [I-D.lasserre-nvo3-framework] associates the VM with a specific overlay for the purposes of tunneling traffic sourced from or destined to the VM. When a VM disconnects, it is removed from the overlay and the NVE effectively terminates any tunnels associated with the VM. To achieve this functionality, a standardized interaction between the NVE and hypervisor may be needed, for example in the case where the NVE resides on a separate device from the VM.
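To tie the work areas together before the summary below, here is a small illustrative sketch of the state an NVE might keep and how it might interact with an oracle and with local VM attach/detach events. The class and method names are hypothetical and no specific protocol is implied; this is a sketch of the concepts, not a proposed design:

```python
# Hypothetical, illustrative sketch only; the names, data structures, and
# oracle interface below are assumptions, not protocols defined by this draft.

class Oracle:
    """Stand-in for the back-end mapping service (directory- or routing-based)."""

    def __init__(self):
        self.mappings = {}                    # (vnid, inner_mac) -> outer NVE IP

    def register(self, vnid, inner_mac, nve_ip):
        self.mappings[(vnid, inner_mac)] = nve_ip

    def deregister(self, vnid, inner_mac):
        self.mappings.pop((vnid, inner_mac), None)

    def resolve(self, vnid, inner_mac):
        return self.mappings.get((vnid, inner_mac))


class NVE:
    """Network Virtualization Edge: holds only the mappings it actually needs."""

    def __init__(self, my_ip, oracle):
        self.my_ip = my_ip
        self.oracle = oracle
        self.cache = {}                       # (vnid, inner_mac) -> outer NVE IP
        self.local_vms = {}                   # inner_mac -> vnid

    # Work area: VM attach/detach (hypervisor <-> NVE interaction).
    def vm_attach(self, inner_mac, vnid):
        self.local_vms[inner_mac] = vnid
        self.oracle.register(vnid, inner_mac, self.my_ip)

    def vm_detach(self, inner_mac):
        vnid = self.local_vms.pop(inner_mac)
        self.oracle.deregister(vnid, inner_mac)

    # Work area: NVE <-> oracle lookup instead of data-plane flood-and-learn.
    def outer_destination(self, vnid, dest_mac):
        key = (vnid, dest_mac)
        if key not in self.cache:
            self.cache[key] = self.oracle.resolve(vnid, dest_mac)
        return self.cache[key]                # None: unknown; query again or flood


oracle = Oracle()
nve_a, nve_b = NVE("192.0.2.1", oracle), NVE("192.0.2.2", oracle)
nve_b.vm_attach("52:54:00:aa:bb:02", vnid=5001)
print(nve_a.outer_destination(5001, "52:54:00:aa:bb:02"))    # -> 192.0.2.2
```
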
In summary, there are three areas of potential work. The first area concerns the oracle itself and any on-the-wire protocols it needs. A second area concerns the interaction between the oracle and NVEs. The third work area concerns protocols associated with attaching and detaching a VM from a particular virtual network instance. The latter two items are the priority work areas and can be done largely independently of any oracle-related work.

4. Related Work

4.1. IEEE 802.1aq - Shortest Path Bridging

Shortest Path Bridging (SPB) is an IS-IS-based overlay for L2 Ethernets. SPB supports multi-pathing and addresses a number of shortcomings in the original Ethernet Spanning Tree Protocol. SPB-M uses IEEE 802.1ah MAC-in-MAC encapsulation and supports a 24-bit I-SID, which can be used to identify virtual network instances. SPB is entirely L2 based, extending the L2 Ethernet bridging model.

4.2. ARMD

ARMD is chartered to look at data center scaling issues with a focus on address resolution. ARMD is currently chartered to develop a problem statement and is not currently developing solutions. While an overlay-based approach may address some of the "pain points" that have been raised in ARMD (e.g., better support for multi-tenancy), an overlay approach may also push some of the L2 scaling concerns (e.g., excessive flooding) to the IP level (flooding via IP multicast). Analysis will be needed to understand the scaling trade-offs of an overlay-based approach compared with existing approaches. On the other hand, existing IP-based approaches such as proxy ARP may help mitigate some concerns.

4.3. TRILL

TRILL is an L2-based approach aimed at improving deficiencies and limitations with current Ethernet networks and STP in particular. Although it differs from Shortest Path Bridging in many architectural and implementation details, it is similar in that it provides an L2-based service to end systems. TRILL, as defined today, supports only the standard (and limited) 12-bit VLAN model. Approaches to extend TRILL to support more than 4094 VLANs are currently under investigation [I-D.ietf-trill-fine-labeling].

4.4. L2VPNs

The IETF has specified a number of approaches for connecting L2 domains together as part of the L2VPN Working Group. That group, however, has historically been focused on provider-provisioned L2 VPNs, where the service provider participates in management and provisioning of the VPN. In addition, much of the target environment for such deployments involves carrying L2 traffic over WANs. Overlay approaches are intended to be used within data centers, where the overlay network is managed by the data center operator rather than by an outside party. While overlays can run across the Internet as well, they will extend well into the data center itself (e.g., up to and including hypervisors) and include large numbers of machines within the data center itself.

Other L2VPN approaches, such as L2TP [RFC2661], require significant tunnel state at the encapsulating and decapsulating end points. Overlays require less tunnel state than other approaches, which is important to allow overlays to scale to hundreds of thousands of end points. It is assumed that smaller switches (i.e., virtual switches in hypervisors or the physical switches to which VMs connect) will be part of the overlay network and be responsible for encapsulating and decapsulating packets.
4.5. Proxy Mobile IP

Proxy Mobile IP [RFC5213] [RFC5844] makes use of the GRE Key Field [RFC5845] [RFC6245], but not in a way that supports multi-tenancy.

4.6. LISP

LISP [I-D.ietf-lisp] essentially provides an IP-over-IP overlay where the internal addresses are end station identifiers and the outer IP addresses represent the location of the end station within the core IP network topology. The LISP overlay header includes a 24-bit Instance ID used to support overlapping inner IP addresses.

4.7. Individual Submissions

Many individual submissions also aim to address some or all of the issues discussed in this document. Examples of such drafts are VXLAN [I-D.mahalingam-dutt-dcops-vxlan], NVGRE [I-D.sridharan-virtualization-nvgre], and Virtual Machine Mobility in L3 Networks [I-D.wkumari-dcops-l3-vmmobility].

5. Further Work

It is believed that overlay-based approaches may be able to reduce the overall amount of flooding and other multicast- and broadcast-related traffic (e.g., ARP and ND) currently experienced within data centers with a large flat L2 network. Further analysis is needed to characterize expected improvements.

6. Summary

This document has argued that network virtualization using L3 overlays addresses a number of issues being faced as data centers scale in size. In addition, careful consideration of a number of issues would lead to the development of interoperable implementations of virtualization overlays.

Three potential work areas were identified. The first involves the interactions that take place when a VM attaches to or detaches from an overlay. A second involves the protocol an NVE would use to communicate with a backend "oracle" to learn and disseminate mapping information about the VMs the NVE communicates with. The third potential work area involves the backend oracle itself, i.e., how it provides failover and how it interacts with oracles in other domains.

7. Acknowledgments

Helpful comments and improvements to this document have come from Ariel Hendel, Vinit Jain, and Benson Schliesser.

8. IANA Considerations

This memo includes no request to IANA.

9. Security Considerations

TBD

10. Informative References

[I-D.ietf-lisp]
   Farinacci, D., Fuller, V., Meyer, D., and D. Lewis, "Locator/ID Separation Protocol (LISP)", draft-ietf-lisp-23 (work in progress), May 2012.

[I-D.ietf-trill-fine-labeling]
   Eastlake, D., Zhang, M., Agarwal, P., Perlman, R., and D. Dutt, "TRILL: Fine-Grained Labeling", draft-ietf-trill-fine-labeling-01 (work in progress), June 2012.

[I-D.kreeger-nvo3-overlay-cp]
   Black, D., Dutt, D., Kreeger, L., Sridharan, M., and T. Narten, "Network Virtualization Overlay Control Protocol Requirements", draft-kreeger-nvo3-overlay-cp-00 (work in progress), January 2012.

[I-D.lasserre-nvo3-framework]
   Lasserre, M., Balus, F., Morin, T., Bitar, N., Rekhter, Y., and Y. Ikejiri, "Framework for DC Network Virtualization", draft-lasserre-nvo3-framework-01 (work in progress), March 2012.

[I-D.mahalingam-dutt-dcops-vxlan]
   Sridhar, T., Bursell, M., Kreeger, L., Dutt, D., Wright, C., Mahalingam, M., Duda, K., and P. Agarwal, "VXLAN: A Framework for Overlaying Virtualized Layer 2 Networks over Layer 3 Networks", draft-mahalingam-dutt-dcops-vxlan-01 (work in progress), February 2012.
[I-D.sridharan-virtualization-nvgre]
   Sridharan, M., Duda, K., Ganga, I., Greenberg, A., Lin, G., Pearson, M., Thaler, P., Tumuluri, C., and Y. Wang, "NVGRE: Network Virtualization using Generic Routing Encapsulation", draft-sridharan-virtualization-nvgre-00 (work in progress), September 2011.

[I-D.wkumari-dcops-l3-vmmobility]
   Kumari, W. and J. Halpern, "Virtual Machine mobility in L3 Networks", draft-wkumari-dcops-l3-vmmobility-00 (work in progress), August 2011.

[RFC2661]
   Townsley, W., Valencia, A., Rubens, A., Pall, G., Zorn, G., and B. Palter, "Layer Two Tunneling Protocol "L2TP"", RFC 2661, August 1999.

[RFC2784]
   Farinacci, D., Li, T., Hanks, S., Meyer, D., and P. Traina, "Generic Routing Encapsulation (GRE)", RFC 2784, March 2000.

[RFC4364]
   Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private Networks (VPNs)", RFC 4364, February 2006.

[RFC5213]
   Gundavelli, S., Leung, K., Devarapalli, V., Chowdhury, K., and B. Patil, "Proxy Mobile IPv6", RFC 5213, August 2008.

[RFC5844]
   Wakikawa, R. and S. Gundavelli, "IPv4 Support for Proxy Mobile IPv6", RFC 5844, May 2010.

[RFC5845]
   Muhanna, A., Khalil, M., Gundavelli, S., and K. Leung, "Generic Routing Encapsulation (GRE) Key Option for Proxy Mobile IPv6", RFC 5845, June 2010.

[RFC6245]
   Yegani, P., Leung, K., Lior, A., Chowdhury, K., and J. Navali, "Generic Routing Encapsulation (GRE) Key Extension for Mobile IPv4", RFC 6245, May 2011.

[RFC6325]
   Perlman, R., Eastlake, D., Dutt, D., Gai, S., and A. Ghanwani, "Routing Bridges (RBridges): Base Protocol Specification", RFC 6325, July 2011.

Appendix A. Change Log

A.1. Changes from -01

1. Removed Section 4.2 (Standardization Issues) and Section 5 (Control Plane), as those are more appropriately covered in and overlap with material in [I-D.lasserre-nvo3-framework] and [I-D.kreeger-nvo3-overlay-cp].

2. Expanded the introduction and better explained terms such as tenant and virtual network instance. These had been covered in a section that has since been removed.

3. Added Section 3.3 "Overlay Networking Work Areas" to better articulate the three separable work components (or "on-the-wire protocols") where work is needed.

4. Added a section on Shortest Path Bridging to the Related Work section.

5. Revised some of the terminology to be consistent with [I-D.lasserre-nvo3-framework] and [I-D.kreeger-nvo3-overlay-cp].

Authors' Addresses

Thomas Narten (editor)
IBM

Email: narten@us.ibm.com

Murari Sridharan
Microsoft

Email: muraris@microsoft.com

Dinesh Dutt

Email: ddutt.ietf@hobbesdutt.com

David Black
EMC

Email: david.black@emc.com

Lawrence Kreeger
Cisco

Email: kreeger@cisco.com