Network Working Group                                         Y. Rekhter
Internet Draft                                          Juniper Networks
Category: Standards Track
Expiration Date: June 2014
                                                           W. Henderickx
                                                          Alcatel-Lucent

                                                              R. Shekhar
                                                        Juniper Networks

                                                             Luyuan Fang
                                                           Cisco Systems

                                                            Linda Dunbar
                                                                  Huawei

                                                             Ali Sajassi
                                                           Cisco Systems

                                                         December 2 2013

                   Network-related VM Mobility Issues

               draft-ietf-nvo3-vm-mobility-issues-02.txt

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that other
   groups may also distribute working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

Copyright and License Notice

   Copyright (c) 2011 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

   This document may contain material from IETF Documents or IETF
   Contributions published or made publicly available before November
   10, 2008. The person(s) controlling the copyright in some of this
   material may not have granted the IETF Trust the right to allow
   modifications of such material outside the IETF Standards Process.
   Without obtaining an adequate license from the person(s) controlling
   the copyright in such materials, this document may not be modified
   outside the IETF Standards Process, and derivative works of it may
   not be created outside the IETF Standards Process, except to format
   it for publication as an RFC or to translate it into languages other
   than English.

Abstract

   This document describes a set of network-related issues presented by
   the desire to support seamless Virtual Machine mobility in the data
   center and between data centers. In particular, it looks at the
   implications of meeting the requirements for "seamless mobility".

Table of Contents

    1      Specification of requirements
    2      Introduction
    2.1    Terminology
    3      Problem Statement
    3.1    Usage of VLAN-IDs
    3.2    Maintaining Connectivity in the Presence of VM Mobility
    3.3    Layer 2 Extension
    3.4    Optimal IP Routing
    3.5    Preserving Policies
    4      IANA Considerations
    5      Security Considerations
    6      Acknowledgements
    7      References
    8      Author's Address

1. Specification of requirements

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

2. Introduction

   An important feature of data centers identified in [nvo3-problem] is
   the support of Virtual Machine (VM) mobility within the data center
   and between data centers. This document describes a set of
   network-related issues presented by the desire to support seamless
   Virtual Machine mobility in the data center, where seamless mobility
   is defined as the ability to move a VM from one server in the data
   center to another server in the same or a different data center,
   while retaining the IP and MAC addresses of the VM. In the context of
   this document the term mobility, or a reference to moving a VM,
   should be considered to imply seamless mobility, unless otherwise
   stated.

   Note that in the scenario where a VM is moved between servers located
   in different data centers, there are certain issues related to the
   current state of the art of the Virtual Machine technology, the
   bandwidth that may be available between the data centers, the
   distance between the data centers, the ability to manage and operate
   such VM mobility, storage-related issues (the moved VM has to have
   access to the same virtual disk), etc. Discussion of these issues is
   outside the scope of this document.

2.1. Terminology

   In this document the term "Top of Rack Switch (ToR)" is used to refer
   to a switch in a data center that is connected to the servers that
   host VMs. A data center may have multiple ToRs. When External Bridge
   Port Extenders (as defined by IEEE 802.1BR) are used to connect the
   servers to the data center network, the ToR switch is the Controlling
   Bridge.

   Several data centers could be connected by a network. In addition to
   providing interconnect among the data centers, such a network could
   provide connectivity between the VMs hosted in these data centers and
   the sites that contain hosts communicating with such VMs. Each data
   center has one or more Data Center Border Routers (DCBRs) that
   connect the data center to the network and provide (a) connectivity
   between VMs hosted in the data center and VMs hosted in other data
   centers, and (b) connectivity between VMs hosted in the data center
   and hosts communicating with these VMs.

   The following figure illustrates the above:

              __________
             (          )
           ( Data Center  )
          (  Interconnect  )---------------------------
           (   Network    )                           |
             (__________)                             |
               |      |                               |
           -----      -----                           |
           |              |                           |
   --------+--------------+----------------    ---------------
   |       |              |     Data      |    |             |
   |     ------         ------  Center    |    | Data Center |
   |    | DCBR |       | DCBR |           |    |             |
   |     ------         ------            |    ---------------
   |       |              |               |
   |       -----      -----               |
   |       ____|______|____               |
   |      (                )              |
   |     (   Data Center    )             |
   |      (    Network     )              |
   |       (______________)               |
   |           |      |                   |
   |       -----      -----               |
   |       |              |               |
   |  ------------      -----             |
   | | ToR Switch |    | ToR |            |
   |  ------------      -----             |
   |   |                 |                |
   |   |  ----------     |  ----------    |
   |   |--| Server |     |--| Server |    |
   |   |  |        |     |  ----------    |
   |   |  |  ----  |     |                |
   |   |  | | VM | |     |  ----------    |
   |   |  |  ----  |     ---| Server |    |
   |   |  | | VM | |        ----------    |
   |   |  |  ----  |                      |
   |   |  | | VM | |                      |
   |   |  |  ----  |                      |
   |   |  ----------                      |
   |   |                                  |
   |   |  ----------                      |
   |   |--| Server |                      |
   |   |  ----------                      |
   |   |                                  |
   |   |  ----------                      |
   |   ---| Server |                      |
   |      ----------                      |
   |                                      |
   ----------------------------------------

   The data centers and the network that interconnects them may be
   either (a) under the same administrative control, or (b) controlled
   by different administrations.

   Consider a set of VMs that (as a matter of policy) are allowed to
   communicate with each other, and a collection of devices that
   interconnect these VMs. If communication among any VMs in that set
   could be accomplished in such a way as to preserve MAC source and
   destination addresses in the Ethernet header of the packets exchanged
   among these VMs (as these packets traverse from their sources to
   their destinations), we will refer to such a set of VMs as a Layer 2
   based Closed User Group (L2-based CUG).

   A given VM may be a member of more than one L2-based CUG.

   In terms of IP address assignment this document assumes that all VMs
   of a given L2-based CUG have their IP addresses assigned out of a
   single IP prefix. Thus, in the context of this document a single IP
   subnet corresponds to a single L2-based CUG. If a given VM is a
   member of more than one L2-based CUG, this VM would have multiple IP
   addresses and multiple logical interfaces, one IP address and one
   logical interface per such CUG.

   A VM that is a member of a given L2-based CUG may (as a matter of
   policy) be allowed to communicate with VMs that belong to other
   L2-based CUGs, or with other hosts. Such communication involves IP
   forwarding, and thus would result in changing MAC source and
   destination addresses in the Ethernet header of the packets being
   exchanged.

   In this document the term "L2 physical domain" refers to a collection
   of interconnected devices that perform forwarding based on the
   information carried in the Ethernet header. A trivial L2 physical
   domain consists of just one server. In a non-trivial L2 physical
   domain (a domain that contains multiple forwarding entities)
   forwarding could be provided by layer 2 technologies such as the
   Spanning Tree Protocol (STP). Note that a multi-chassis LAG cannot
   span more than one L2 physical domain. This document assumes that a
   layer 2 access domain is an L2 physical domain.

   A physical server connected to a given L2 physical domain may host
   VMs that belong to different L2-based CUGs (while each of these CUGs
   may span multiple L2 physical domains). If an L2 physical domain
   contains servers that host VMs belonging to different L2-based CUGs,
   then enforcing L2-based CUG boundaries among these VMs within that
   domain is accomplished by relying on Layer 2 mechanisms (e.g.,
   VLANs).

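   As an illustration only, the address-assignment assumption stated
   earlier in this section (one IP prefix per L2-based CUG, and one IP
   address and logical interface per CUG membership) could be modelled
   as follows. The class names, the function name, and the sample
   prefixes are hypothetical and are not defined by this document.

```python
import ipaddress
from dataclasses import dataclass, field


@dataclass
class Cug:
    """An L2-based CUG; assumed here to own exactly one IP prefix."""
    name: str
    prefix: ipaddress.IPv4Network


@dataclass
class Vm:
    """A VM with one logical interface (modelled as one address) per CUG."""
    name: str
    addresses: dict = field(default_factory=dict)  # CUG name -> IPv4Address


def misassigned(vm, cugs):
    """Return the names of the VM's CUGs whose prefix does not cover
    the address the VM uses in that CUG."""
    return sorted(
        name for name, addr in vm.addresses.items()
        if addr not in cugs[name].prefix
    )


# Hypothetical sample data.
cugs = {
    "red": Cug("red", ipaddress.ip_network("10.1.0.0/24")),
    "blue": Cug("blue", ipaddress.ip_network("10.2.0.0/24")),
}
vm = Vm("vm1", {
    "red": ipaddress.ip_address("10.1.0.7"),
    "blue": ipaddress.ip_address("10.9.0.7"),  # outside blue's prefix
})
print(misassigned(vm, cugs))
```

   In this sketch the VM's address in the "blue" CUG falls outside that
   CUG's prefix, so the check reports "blue" as inconsistent with the
   single-prefix assumption.
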
   We say that an L2 physical domain contains a given VM (or that a
   given VM is in a given L2 physical domain), if the server presently
   hosting this VM is part of that domain, or the server is connected to
   a ToR that is part of that domain.

   We say that a given L2-based CUG is present within a given data
   center if one or more VMs that are part of that CUG are presently
   hosted by the servers located in that data center.

   In the context of this document when we talk about the VLAN-ID used
   by a given VM, we refer to the VLAN-ID carried by the traffic that is
   within the same L2 physical domain as the VM, and that is either
   originated by or destined to that VM; i.e., the VLAN-ID has only
   local significance within the L2 physical domain, unless stated
   otherwise.

3. Problem Statement

   This section describes the specific problems/issues that need to be
   addressed to enable seamless VM mobility.

3.1. Usage of VLAN-IDs

   This document assumes that within a given non-trivial L2 physical
   domain traffic from/to VMs that are in that domain, and belong to the
   same L2-based CUG MUST have the same VLAN-ID. This document assumes
   that in different non-trivial L2 physical domains traffic from/to VMs
   that are in these domains and belong to the same L2-based CUG MAY
   have either the same or different VLAN-IDs. Thus when a given VM
   moves from one non-trivial L2 physical domain to another, the VLAN-ID
   of the traffic from/to the VM in the former may differ from that in
   the latter, and thus cannot be assumed to stay the same.

   This document assumes that within a trivial L2 physical domain
   traffic from/to VMs that are in this domain may not have VLAN-IDs at
   all.

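   The local significance of VLAN-IDs described above can be sketched,
   under simplifying assumptions, as a per-domain lookup table. The
   table contents, domain names, and function names below are
   hypothetical; a value of None stands for a trivial L2 physical
   domain whose traffic carries no VLAN-ID.

```python
# Per-domain table: CUG name -> VLAN-ID used inside that L2 physical
# domain. Keying by CUG means every VM of a CUG in a given domain shares
# one VLAN-ID. None models a trivial domain (a single server) whose
# traffic may carry no VLAN-ID at all. All values are made up.
VLAN_TABLE = {
    "domain-a": {"red": 100, "blue": 200},
    "domain-b": {"red": 300, "blue": 200},
    "server-x": None,
}


def vlan_for(domain, cug):
    """VLAN-ID carried by traffic of `cug` inside `domain`, or None."""
    table = VLAN_TABLE[domain]
    return None if table is None else table[cug]


def vlan_on_move(cug, old_domain, new_domain):
    """Return (VLAN-ID before, VLAN-ID after) for a VM of `cug` that
    moves between two L2 physical domains."""
    return vlan_for(old_domain, cug), vlan_for(new_domain, cug)


print(vlan_on_move("red", "domain-a", "domain-b"))
print(vlan_on_move("blue", "domain-a", "domain-b"))
print(vlan_on_move("red", "domain-a", "server-x"))
```

   The three calls show the cases discussed above: the VLAN-ID changes
   across domains, happens to stay the same, or disappears when the
   destination is a trivial domain.
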
   If a given VM's Guest OS sends packets that carry a VLAN-ID, then
   when the VM moves from one L2 physical domain to another the VLAN-ID
   used by the Guest OS cannot change (this is irrespective of whether
   the L2 physical domains are trivial or non-trivial). In other words,
   the VLAN-IDs used by a tagged VM network interface are part of the
   VM's state and cannot be changed when the VM moves from one L2
   physical domain to another, even though it is possible for an entity,
   such as a hypervisor virtual switch, to change the VLAN-ID from the
   value used by the NVE to the value expected by the VM (in contrast, a
   VLAN tag assigned by a hypervisor for use with an untagged VM network
   interface can change). If the L2 physical domain is extended to
   include VM tagged interfaces, the hypervisor virtual switch, and the
   DC bridged network, then special consideration is needed in the
   assignment of VLAN tags for the VMs, the L2 physical domain, and
   other domains into which the VM may move.

   This document assumes that within a given non-trivial L2 physical
   domain traffic from/to VMs that are in that domain, and belong to
   different L2-based CUGs MUST have different VLAN-IDs.

   The above assumptions about VLAN-IDs are driven by (a) the assumption
   that within a given L2 physical domain VLANs are used to identify
   individual L2-based CUGs, and (b) the need to overcome the limitation
   on the number of different VLAN-IDs.

3.2. Maintaining Connectivity in the Presence of VM Mobility

   In the context of this document the ability to maintain connectivity
   in the presence of VM mobility means the ability to exchange traffic
   between a VM and its peer(s), as the VM moves from one server to
   another, where the peer(s) may be either other VM(s) or hosts.
   Furthermore, the peer(s) need not be within the same data center as
   the VM itself.

   A given VM could be moved from one server to another in stopped or
   suspended state ("cold" VM mobility), or the hypervisors might move a
   running VM ("hot" VM mobility). IP address preservation is sometimes
   highly desired for cold VM mobility; it is mandatory to preserve
   transport connections when a running VM is moved.

   VM mobility may result in transient loss of IP connectivity between
   the VM and its peers. In the case of hot VM mobility the upper bound
   on the duration of such transients is (much) lower than in the case
   of cold VM mobility (due to the requirement of preserving transport
   connections and potential additional application requirements).

   Furthermore, while with cold VM mobility one may assume that the VM's
   ARP cache gets flushed once the VM moves to another server, one
   cannot make such an assumption with hot VM mobility.

3.3. Layer 2 Extension

   Consider a scenario where a VM that is a member of a given L2-based
   CUG moves from one server to another, and these two servers are in
   different L2 physical domains, where these domains may be located in
   the same or different data centers. In order to enable communication
   between this VM and other VMs of that L2-based CUG, the new L2
   physical domain must become interconnected with the other L2 physical
   domain(s) that presently contain the rest of the VMs of that CUG, and
   the interconnect must not violate the L2-based CUG requirement to
   preserve source and destination MAC addresses in the Ethernet header
   of the packets exchanged between this VM and other members of that
   CUG.

   Moreover, if the previous L2 physical domain no longer contains any
   VMs of that CUG, the previous domain no longer needs to be
   interconnected with the other L2 physical domain(s) that contain the
   rest of the VMs of that CUG.

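   The bookkeeping implied by the two paragraphs above can be sketched
   as follows. This is a minimal illustration, assuming a simple
   VM-to-domain placement map; the function names and sample data are
   hypothetical, and the sketch says nothing about how the interconnect
   itself is realized.

```python
def domains_of(cug_members, placement):
    """Set of L2 physical domains that presently contain VMs of a CUG."""
    return {placement[vm] for vm in cug_members}


def move_vm(cug_members, placement, vm, new_domain):
    """Move `vm` to `new_domain`.

    Returns the updated placement, the domains that must newly be
    interconnected for the CUG, and the domains that no longer need to
    be interconnected because they hold no VM of the CUG any more.
    """
    before = domains_of(cug_members, placement)
    new_placement = {**placement, vm: new_domain}
    after = domains_of(cug_members, new_placement)
    return new_placement, after - before, before - after


# Hypothetical sample data: three VMs of one CUG in two domains.
members = {"vm1", "vm2", "vm3"}
placement = {"vm1": "d1", "vm2": "d1", "vm3": "d2"}

placement, joined, left = move_vm(members, placement, "vm3", "d3")
print(sorted(joined), sorted(left))
```

   Here moving vm3 from d2 to d3 adds d3 to the set of domains that
   must be interconnected for the CUG and removes d2, since d2 no
   longer contains any VM of that CUG.
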
   Note that supporting VM mobility implies that the set of L2 physical
   domains that contain VMs that belong to a given L2-based CUG may
   change over time (new domains added, old domains deleted).

   We will refer to this as the "layer 2 extension problem".

   Note that the layer 2 extension problem is a special case of
   maintaining connectivity in the presence of VM mobility, as the
   former restricts communicating VMs to a single/common L2-based CUG,
   while the latter does not.

3.4. Optimal IP Routing

   In the context of this document optimal IP routing, or just optimal
   routing, in the presence of VM mobility could be partitioned into two
   problems:

     + Optimal routing of a VM's outbound traffic. This means that as a
       given VM moves from one server to another, the VM's default
       gateway should be in close topological proximity to the ToR
       that connects the server presently hosting that VM. Note that
       when we talk about optimal routing of the VM's outbound traffic,
       we mean traffic from that VM to the destinations that are outside
       of the VM's L2-based CUG. This document refers to this problem as
       the VM default gateway problem.

     + Optimal routing of a VM's inbound traffic. This means that as a
       given VM moves from one server to another, the (inbound) traffic
       originated outside of the VM's L2-based CUG, and destined to that
       VM, should be routed via the router of the VM's L2-based CUG that
       is in close topological proximity to the ToR that connects the
       server presently hosting that VM, without first traversing some
       other router of that L2-based CUG (the router of the VM's
       L2-based CUG may be either a DCBR or the ToR itself). This is
       also known as avoiding "triangular routing". This document refers
       to this problem as the triangular routing problem.

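   The triangular routing problem described above can be illustrated
   with a toy calculation. The sketch below assumes that "topological
   proximity" can be approximated by a hop count between each router of
   the CUG and each ToR; that metric, the function names, and the
   sample topology are assumptions made for illustration only.

```python
def nearest_router(hops, routers, tor):
    """Router of the CUG with the fewest hops to the given ToR."""
    return min(routers, key=lambda r: hops[(r, tor)])


def triangular_detour(hops, routers, old_tor, new_tor):
    """Extra hops incurred if inbound traffic still enters via the
    router nearest the VM's old ToR after the VM has moved behind
    `new_tor`, compared with entering via the router nearest the new
    ToR."""
    stale = nearest_router(hops, routers, old_tor)
    best = nearest_router(hops, routers, new_tor)
    return hops[(stale, new_tor)] - hops[(best, new_tor)]


# Hypothetical hop counts between two CUG routers and two ToRs.
hops = {
    ("r1", "tor1"): 1, ("r1", "tor2"): 5,
    ("r2", "tor1"): 5, ("r2", "tor2"): 1,
}
print(triangular_detour(hops, ["r1", "r2"], "tor1", "tor2"))
```

   With these sample values, inbound traffic that keeps entering via r1
   after the VM moves behind tor2 travels four hops more than traffic
   entering via r2. A detour of zero would mean that no triangular
   routing occurs under this simplified metric.
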
   Note that optimal routing is a special case of maintaining
   connectivity in the presence of VM mobility, as the former assumes
   not only the ability to maintain connectivity, but also that this
   connectivity is maintained using optimal routing. On the other hand,
   maintaining connectivity does not make optimal routing a
   prerequisite.

   The ability to deliver optimal routing (as defined above) in the
   presence of stateful devices is outside the scope of this document.

3.5. Preserving Policies

   Moving a VM from one L2 physical domain to another means (among other
   things) that the NVE in the new domain that provides connectivity
   between this VM and VMs in other L2 physical domains must be able to
   implement the policies that control connectivity between this VM and
   VMs in other L2 physical domains. In other words, the policies that
   control connectivity between a given VM and its peers MUST NOT change
   as the VM moves from one L2 physical domain to another. Moreover,
   policies, if any, within the L2 physical domain that contains a given
   VM MUST NOT preclude realization of the policies that control
   connectivity between this VM and its peers. All of the above is
   irrespective of whether the L2 physical domains are trivial or not.

4. IANA Considerations

   This document introduces no new IANA Considerations.

5. Security Considerations

   TBD.

6. Acknowledgements

   The authors would like to thank Adrian Farrel for his review and
   comments. The authors would also like to thank Ivan Pepelnjak and
   David Black for their contributions to this document.

7. References

   [nvo3-problem] Narten, T., et al., "Overlays for Network
   Virtualization", draft-narten-nvo3-overlay-problem-statement, work in
   progress.

8. Author's Address

   Yakov Rekhter
   Juniper Networks
   1194 North Mathilda Ave.
   Sunnyvale, CA 94089
   Email: yakov@juniper.net

   Wim Henderickx
   Alcatel-Lucent
   Email: wim.henderickx@alcatel-lucent.com

   Ravi Shekhar
   Juniper Networks
   1194 North Mathilda Ave.
   Sunnyvale, CA 94089
   Email: rshekhar@juniper.net

   Luyuan Fang
   Cisco Systems
   111 Wood Avenue South
   Iselin, NJ 08830
   Email: lufang@cisco.com

   Linda Dunbar
   Huawei Technologies
   5340 Legacy Drive, Suite 175
   Plano, TX 75024, USA
   Phone: (469) 277 5840
   Email: ldunbar@huawei.com

   Ali Sajassi
   Cisco Systems
   Email: sajassi@cisco.com

   Rahul Aggarwal
   Arktan, Inc
   Email: raggarwa_1@yahoo.com