Internet Engineering Task Force                               L. Kreeger
Internet-Draft                                             Cisco Systems
Intended status: Informational                                   D. Dutt
Expires: January 3, 2016                                Cumulus Networks
                                                               T. Narten
                                                                     IBM
                                                                D. Black
                                                                     EMC
                                                            July 2, 2015

    Network Virtualization NVE to NVA Control Protocol Requirements
                   draft-ietf-nvo3-nve-nva-cp-req-04

Abstract

   "Problem Statement: Overlays for Network Virtualization" (RFC 7364)
   discusses the need for network virtualization using overlay networks
   in highly virtualized data centers.  The problem statement outlines
   a need for control protocols to facilitate running these overlay
   networks.  This document outlines the high-level requirements to be
   fulfilled by the control protocols related to building and managing
   the mapping tables and other state information used by the Network
   Virtualization Edge to transmit encapsulated packets across the
   underlying network.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   This Internet-Draft will expire on January 3, 2016.

Copyright Notice

   Copyright (c) 2015 IETF Trust and the persons identified as the
   document authors.  All rights reserved.
   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document.  Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  Control Plane Protocol Functionality
     3.1.  Inner to Outer Address Mapping
     3.2.  Underlying Network Multi-Destination Delivery Address(es)
     3.3.  VN Connect/Disconnect Notification
     3.4.  VN Name to VN ID Mapping
   4.  Control Plane Characteristics
   5.  Security Considerations
   6.  Acknowledgements
   7.  Informative References
   Appendix A.  Change Log
     A.1.  Changes from draft-ietf-nvo3-nve-nva-cp-req-01 to -02
     A.2.  Changes from draft-ietf-nvo3-nve-nva-cp-req-02 to -03
     A.3.  Changes from draft-ietf-nvo3-nve-nva-cp-req-03 to -04
   Authors' Addresses

1.  Introduction

   "Problem Statement: Overlays for Network Virtualization" [RFC7364]
   discusses the need for network virtualization using overlay networks
   in highly virtualized data centers and provides a general motivation
   for building such networks.  "Framework for DC Network
   Virtualization" [RFC7365] provides a framework for discussing
   overlay networks generally and the various components that must work
   together in building such systems.  "An Architecture for Overlay
   Networks (NVO3)" [I-D.ietf-nvo3-arch] presents a high-level
   architecture for building NVO3 overlay networks.  The reader is
   assumed to be familiar with these documents.

   Section 4.5 of [RFC7364] describes three separate work areas that
   fall under the general category of a control protocol for NVO3.
   This document focuses entirely on those aspects of the control
   protocol related to building and distributing the mapping tables an
   NVE uses to tunnel traffic from one VM to another.  Specifically,
   this document focuses on work area 2 given in Section 4.5 of
   [RFC7364] and discussed in Section 8 of [I-D.ietf-nvo3-arch].  Work
   area 2 covers the interaction between an NVE and the Network
   Virtualization Authority (NVA), while work area 1 concerns operation
   of the NVA itself.  Requirements related to the interaction between
   a hypervisor and an NVE when the two entities reside on separate
   physical devices (work area 3) are covered in
   [I-D.ietf-nvo3-hpvr2nve-cp-req].

2.  Terminology

   This document uses the same terminology as found in [RFC7365] and
   [I-D.ietf-nvo3-arch].  This section defines additional terminology
   used by this document.

   Network Service Appliance:  A stand-alone physical device or a
      virtual device that provides a network service, such as a
      firewall or load balancer.
      Such appliances may embed Network Virtualization Edge (NVE)
      functionality within them in order to operate more efficiently as
      part of a virtualized network.

   VN Alias:  A string name for a VN as used by administrators and
      customers to name a specific VN.  A VN Alias is a human-usable
      string that can be listed in contracts, customer forms, email,
      configuration files, etc., and that can be easily communicated
      verbally (e.g., over the phone).  A VN Alias is independent of
      the underlying technology used to implement a VN and will
      generally not be carried in protocol fields of control protocols
      used in virtual networks.  Rather, a VN Alias will be mapped into
      a VN Name where precision is required.

   VN Name:  A globally unique identifier for a VN suitable for use
      within network protocols.  A VN Name will usually be paired with
      a VN Alias, with the VN Alias used by humans as a shorthand way
      to name and identify a specific VN.  A VN Name should have a
      compact representation to minimize protocol overhead where a VN
      Name is carried in a protocol field.  Using a Universally Unique
      Identifier (UUID), as discussed in RFC 4122, may work well
      because a UUID is compact, has a fixed size, and can be generated
      locally with a very high likelihood of global uniqueness.

   VN ID:  A unique and compact identifier for a VN within the scope of
      a specific NVO3 administrative domain.  It will generally be more
      efficient to carry VN IDs as fields in control protocols than VN
      Names or VN Aliases.  There is a one-to-one mapping between a VN
      Name and a VN ID within an NVO3 administrative domain.  Depending
      on the technology used to implement an overlay network, the VN ID
      could be used as the VN Context in the data plane, or it would
      need to be mapped to a locally significant context ID.
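   To make the relationship among these identifiers concrete, the
   following Python sketch shows one purely illustrative realization of
   a per-domain registry in which a VN Name is an RFC 4122 UUID paired
   one-to-one with a compact VN ID.  The sketch is not part of the
   requirements; all class and attribute names are hypothetical.

      # Illustrative only; nothing here is mandated by this document.
      import uuid

      class VnRegistry:
          """Per-administrative-domain pairing of VN Names and VN IDs."""

          def __init__(self):
              self.alias_to_name = {}  # VN Alias (human string) -> VN Name
              self.name_to_id = {}     # VN Name (UUID) -> VN ID (integer)
              self._next_vn_id = 1

          def register(self, vn_alias):
              """Create a VN Name (RFC 4122 UUID) and a domain-unique VN ID."""
              vn_name = uuid.uuid4()    # compact, fixed size, globally unique
              vn_id = self._next_vn_id  # one-to-one with the VN Name
              self._next_vn_id += 1
              self.alias_to_name[vn_alias] = vn_name
              self.name_to_id[vn_name] = vn_id
              return vn_name, vn_id

      registry = VnRegistry()
      vn_name, vn_id = registry.register("tenant-a-web-tier")

   Whether the VN ID can be used directly as the VN Context or must be
   mapped to a locally significant context ID depends on the overlay
   technology, as noted in the VN ID definition above.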
3.  Control Plane Protocol Functionality

   The NVO3 problem statement [RFC7364] discusses the need for a
   control plane protocol (or protocols) to populate each NVE with the
   state needed to perform its functions.

   In one common scenario, an NVE provides overlay encapsulation/
   decapsulation and packet forwarding services to Tenant Systems that
   are co-resident with the NVE on the same End Device, for example,
   when the NVE is embedded within a hypervisor or a Network Service
   Appliance, as depicted in Figure 1 and Figure 2 below.
   Alternatively, a Tenant System may use an externally connected NVE,
   for example, an NVE residing on a physical network switch connected
   to the End Device, as depicted in Figure 3 and Figure 4 below.

   There are two control plane aspects for an NVE.  One is the protocol
   between the NVE and its NVA used to populate the NVE's mapping
   tables for tunneling traffic across the underlying network.  Another
   is the protocol between an End Device (e.g., a hypervisor) and an
   external NVE used to promptly update the NVE about Tenant System
   Interface (TSI) status.  This latter control plane aspect is not
   discussed in this document, but is covered in
   [I-D.ietf-nvo3-hpvr2nve-cp-req].  The functional requirements for
   the NVE-to-NVA control plane are the same regardless of whether the
   NVE is embedded within an End Device or in an external device, as
   depicted in Figure 1 through Figure 4 below.

               Hypervisor
    +-------------------------+
    |  +--+   +-------+---+   |
    |  |VM|---|       |   |   |
    |  +--+   |Virtual|NVE|------- Underlying
    |  +--+   |Switch |   |   |    Network
    |  |VM|---|       |   |   |
    |  +--+   +-------+---+   |
    +-------------------------+

          Hypervisor with an Embedded NVE.

                   Figure 1

       Network Service Appliance
    +---------------------------+
    | +------------+   +-----+  |
    | |Net Service |---|     |  |
    | |Instance    |   |     |  |
    | +------------+   | NVE |------ Underlying
    | +------------+   |     |  |    Network
    | |Net Service |---|     |  |
    | |Instance    |   |     |  |
    | +------------+   +-----+  |
    +---------------------------+

      Network Service Appliance (physical or virtual) with an
      Embedded NVE.

                   Figure 2

        Hypervisor                    Access Switch
   +------------------+        +-----+-------+
   | +--+   +-------+ |        |     |       |
   | |VM|---|       | |  VLAN  |     |       |
   | +--+   |Virtual|----------+ NVE |       +--- Underlying
   | +--+   |Switch | |  Trunk |     |       |    Network
   | |VM|---|       | |        |     |       |
   | +--+   +-------+ |        |     |       |
   +------------------+        +-----+-------+

             Hypervisor with an External NVE.

                   Figure 3

  Network Service Appliance           Access Switch
 +--------------------------+        +-----+-------+
 | +------------+    |\     |        |     |       |
 | |Net Service |----| \    |        |     |       |
 | |Instance    |    |  \   |  VLAN  |     |       |
 | +------------+    |   |-----------+ NVE |       +--- Underlying
 | +------------+    |   |  |  Trunk |     |       |    Network
 | |Net Service |----|  /   |        |     |       |
 | |Instance    |    | /    |        |     |       |
 | +------------+    |/     |        |     |       |
 +--------------------------+        +-----+-------+

      Physical Network Service Appliance with an External NVE.

                   Figure 4

   To support an NVE, a control plane protocol is necessary to provide
   the NVE with the information it needs to maintain the internal state
   required to carry out its forwarding functions, as explained in
   detail below and illustrated by the sketch that follows the list.

   1.  An NVE maintains a per-VN table of mappings from TSI (inner)
       addresses to Underlying Network (outer) addresses of remote
       NVEs.

   2.  An NVE maintains per-VN state for delivering tenant multicast
       and broadcast packets to other Tenant Systems.  Such state could
       include a list of multicast addresses and/or unicast addresses
       on the Underlying Network for the NVEs associated with a
       particular VN.

   3.  End Devices (such as a Hypervisor or Network Service Appliance)
       utilizing an external NVE need to "attach to" and "detach from"
       an NVE.  Specifically, a mechanism is needed to notify an NVE
       when a TSI attaches to or detaches from a specific VN.  Such a
       mechanism provides the NVE with the information it needs to
       provide service to a particular TSI.  The details of such a
       mechanism are out of scope for this document and are covered in
       [I-D.ietf-nvo3-hpvr2nve-cp-req].

   4.  An NVE needs a mapping from each unique VN Name to the VN
       Context value used within encapsulated data packets in the
       administrative domain in which the VN is instantiated.
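   The following Python sketch illustrates the per-NVE state described
   in items 1 through 4 above.  It is illustrative only; the class and
   field names are hypothetical and do not imply any particular
   implementation, encoding, or address family.

      # Illustrative sketch of per-NVE state (items 1-4 above).
      from dataclasses import dataclass, field
      from typing import Dict, List

      @dataclass
      class VnState:
          vn_context: int               # VN Context carried in data packets
          # TSI (inner) address -> Underlying Network (outer) address of
          # the remote NVE (item 1)
          inner_to_outer: Dict[str, str] = field(default_factory=dict)
          # underlay multicast group(s) and/or unicast replication
          # targets for tenant broadcast/multicast (item 2)
          multi_dest_addresses: List[str] = field(default_factory=list)

      @dataclass
      class NveState:
          # keyed by VN Name; populated and kept current by the
          # NVE-to-NVA control protocol
          vns: Dict[str, VnState] = field(default_factory=dict)

          def outer_address(self, vn_name: str, inner_dst: str) -> str:
              """Item 1: resolve a tenant (inner) destination to a remote NVE."""
              return self.vns[vn_name].inner_to_outer[inner_dst]

          def multi_dest_targets(self, vn_name: str) -> List[str]:
              """Item 2: delivery addresses for tenant broadcast/multicast."""
              return self.vns[vn_name].multi_dest_addresses

          def vn_context(self, vn_name: str) -> int:
              """Item 4: VN Name to VN Context mapping."""
              return self.vns[vn_name].vn_context

   Item 3 (attach/detach notification) is not shown because its details
   are covered in [I-D.ietf-nvo3-hpvr2nve-cp-req] rather than in this
   document.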
   The NVE-to-NVA control protocol operates directly over the underlay
   network.  The NVA is expected to be connected to the same underlay
   network as the NVEs.

   Each NVE communicates with only a single logical NVA; however, the
   NVA can be centralized or distributed across multiple entities for
   redundancy purposes.  When the NVA is made up of multiple entities,
   better resiliency may be achieved by physically separating them,
   which may require each entity to be connected to a different IP
   subnet of the underlay network.  For this reason, each NVE should be
   allowed to be configured with more than one IP address for its
   logical NVA.  An NVE should be able to switch between these IP
   addresses when it detects that the address it is currently using for
   the NVA is unreachable.  How the NVA represents itself externally is
   discussed in Section 7.3 of [I-D.ietf-nvo3-arch].
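   As a purely illustrative example of this behavior, the following
   Python sketch shows an NVE cycling through a configured list of NVA
   addresses when the address currently in use becomes unreachable.
   The reachability check is a placeholder; no particular failure-
   detection mechanism is implied, and all names and addresses (taken
   from the documentation address ranges) are hypothetical.

      # Illustrative only: failover across the NVA addresses configured
      # on an NVE.  'is_reachable' stands in for whatever liveness
      # check an implementation uses.
      class NvaEndpointSelector:
          def __init__(self, nva_addresses):
              if not nva_addresses:
                  raise ValueError("at least one NVA address is required")
              self.nva_addresses = list(nva_addresses)
              self.current = 0

          def active_address(self):
              return self.nva_addresses[self.current]

          def failover(self, is_reachable):
              """Switch to another configured address if the current one fails."""
              for _ in range(len(self.nva_addresses)):
                  if is_reachable(self.active_address()):
                      return self.active_address()
                  self.current = (self.current + 1) % len(self.nva_addresses)
              raise RuntimeError("no configured NVA address is reachable")

      selector = NvaEndpointSelector(["192.0.2.1", "198.51.100.7"])
      nva_address = selector.failover(lambda addr: addr == "198.51.100.7")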
   Note that a single device could contain both NVE and NVA
   functionality, but the functional interaction between the NVE and
   NVA within that device should operate in the same way as when the
   NVE and NVA are implemented in separate devices.

3.1.  Inner to Outer Address Mapping

   When presented with a data packet to forward to a TSI within a VN,
   the NVE needs to know the mapping of the TSI destination (inner)
   address to the (outer) address on the Underlying Network of the
   remote NVE that can deliver the packet to the destination Tenant
   System.  In addition, the NVE needs to know what VN Context to use
   when sending to a destination Tenant System.

   A protocol is needed to provide this inner-to-outer mapping and VN
   Context to each NVE that requires it and to keep the mapping updated
   in a timely manner.  Timely updates are important for maintaining
   connectivity between Tenant Systems when one Tenant System is a VM.

   Note that one technique that could be used to create this mapping
   without the need for a control protocol is data plane learning;
   however, the learning approach requires packets to be flooded to all
   NVEs participating in the VN when no mapping exists.  One goal of
   using a control protocol is to eliminate this flooding.

3.2.  Underlying Network Multi-Destination Delivery Address(es)

   Each NVE needs a way to deliver multi-destination packets (i.e.,
   tenant broadcast/multicast) within a given VN to each remote NVE
   that has a destination TSI for these packets.  Three possible ways
   of accomplishing this are:

   o  Use the multicast capabilities of the Underlying Network.

   o  Have each NVE replicate the packets and send a copy across the
      Underlying Network to each remote NVE currently participating in
      the VN.

   o  Use one or more distribution servers that replicate the packets
      on behalf of the NVEs.

   Whichever method is used, a protocol is needed to provide, on a
   per-VN basis, one or more multicast addresses (assuming the
   Underlying Network supports multicast) and/or one or more unicast
   addresses of either the remote NVEs that are not reachable via
   multicast or of one or more distribution servers for the VN.

   The protocol must also keep the list of addresses up to date in a
   timely manner as the set of NVEs for a given VN changes over time.
   For example, the set of NVEs for a VN could change as VMs power on/
   off or migrate to different hypervisors.

3.3.  VN Connect/Disconnect Notification

   For the purposes of this document, it is assumed that an NVE
   receives appropriate notifications when a TSI attaches to or
   detaches from a specific VN.  The details of how that is done are
   orthogonal to the NVE-to-NVA control plane, so long as such a
   notification provides the information needed by the control plane.
   As one example, the attach/detach notification would presumably
   include a VN Name that identifies the specific VN to which the
   attach/detach operation applies.

3.4.  VN Name to VN ID Mapping

   Once an NVE (embedded or external) receives a VN connect indication
   with a specified VN Name, the NVE must determine what VN Context
   value and other information to use to forward Tenant System traffic
   to remote NVEs.  In one approach, the NVE-to-NVA protocol uses VN
   Names directly when interacting, with the NVA providing such
   information as the VN Context (or VN ID) along with the egress NVE's
   address.  Alternatively, it may be desirable for the NVE-to-NVA
   protocol to use a more compact representation of the VN Name, that
   is, a VN ID.  In such a case, a specific NVE-to-NVA operation might
   be needed to first map the VN Name into a VN ID, with subsequent
   NVE-to-NVA operations using the VN ID directly.  Thus, it may be
   useful for the NVE-to-NVA protocol to support an operation that maps
   VN Names into VN IDs.
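   The following Python sketch illustrates this kind of exchange: the
   NVE resolves a VN Name to a compact VN ID once and then uses the VN
   ID in subsequent requests.  The sketch is illustrative only; the NVA
   interface and all names and values shown are hypothetical, and no
   protocol encoding is defined by this document.

      # Illustrative only: map a VN Name to a VN ID once, then use the
      # VN ID in later NVE-to-NVA requests.
      class NvaClient:
          """Stand-in for the NVE side of a hypothetical NVE-to-NVA protocol."""

          def __init__(self, nva):
              self.nva = nva           # object answering queries (stub below)
              self.vn_name_to_id = {}  # cache of VN Name -> VN ID

          def vn_id_for(self, vn_name):
              """One NVE-to-NVA operation: map a VN Name to a VN ID."""
              if vn_name not in self.vn_name_to_id:
                  self.vn_name_to_id[vn_name] = self.nva.resolve_vn_name(vn_name)
              return self.vn_name_to_id[vn_name]

          def lookup_mapping(self, vn_name, inner_dst):
              """Subsequent operations carry the VN ID, not the VN Name."""
              vn_id = self.vn_id_for(vn_name)
              return self.nva.lookup(vn_id, inner_dst)  # (outer addr, VN Context)

      class InMemoryNva:
          """Toy NVA with static tables, used only to exercise the client."""

          def __init__(self, names, mappings):
              self.names = names        # VN Name -> VN ID
              self.mappings = mappings  # (VN ID, inner addr) -> (outer, ctx)

          def resolve_vn_name(self, vn_name):
              return self.names[vn_name]

          def lookup(self, vn_id, inner_dst):
              return self.mappings[(vn_id, inner_dst)]

      nva = InMemoryNva({"vn-blue": 10},
                        {(10, "10.1.1.5"): ("203.0.113.7", 5001)})
      client = NvaClient(nva)
      outer_address, vn_context = client.lookup_mapping("vn-blue", "10.1.1.5")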
4.  Control Plane Characteristics

   NVEs are expected to be implemented within both hypervisors (or
   Network Service Appliances) and access switches.  Any resources used
   by these protocols (e.g., processing or memory) take away resources
   that could be better used by these devices to perform their intended
   functions (e.g., providing resources for hosted VMs).

   A large-scale data center may contain hundreds of thousands of these
   NVEs (which may be several independent implementations); therefore,
   any savings in per-NVE resources can be multiplied hundreds of
   thousands of times.

   Given this, the control plane protocol(s) implemented by NVEs to
   provide the functionality discussed above should have the
   characteristics listed below.  (A sketch illustrating the caching
   and subscription behavior described in items 1 through 4 follows the
   list.)

   1.   Minimize the amount of state needed to be stored on each NVE.
        The NVE should only be required to cache state that it is
        actively using and should be able to discard any cached state
        when it is no longer required.  For example, an NVE should only
        need to maintain an inner-to-outer address mapping for
        destinations to which it is actively sending traffic, as
        opposed to maintaining mappings for all possible destinations.

   2.   Fast acquisition of needed state.  For example, when a TSI
        emits a packet destined to an inner address for which the NVE
        does not have a mapping, the NVE should be able to acquire the
        needed mapping quickly.

   3.   Fast detection/update of stale cached state information.  This
        only applies if the cached state is actually being used.  For
        example, when a VM moves so that it is connected to a different
        NVE, the inner-to-outer mapping for this VM's address that is
        cached on other NVEs must be updated in a timely manner (if it
        is actively in use).  If the update is not timely, those NVEs
        will forward data to the wrong NVE until the mapping is
        updated.

   4.   Minimize processing overhead.  This means that an NVE should
        only be required to perform protocol processing directly
        related to maintaining state for the TSIs it is actively
        communicating with.  For example, if the NVA provides
        unsolicited information to the NVEs, one way to minimize the
        processing on the NVE is for it to subscribe to these mappings
        on a per-VN basis.  Consequently, an NVE is not required to
        maintain state for all VNs within a domain; it only needs to
        maintain state (or participate in protocol exchanges) for the
        VNs to which it is currently attached.  If the NVE obtains
        mappings on demand from the NVA, it only needs to obtain the
        information relevant to the traffic flows that are currently
        active.  This requirement applies to the NVE functionality
        only.  The network node that contains the NVE may be involved
        in other functionality for the underlying network that
        maintains connectivity that the NVE is not actively using
        (e.g., routing and multicast distribution protocols for the
        underlying network).

   5.   Highly scalable.  This means scaling to hundreds of thousands
        of NVEs and several million VNs within a single administrative
        domain.  As the number of NVEs and/or VNs within a data center
        grows, the protocol overhead at any one NVE should not increase
        significantly.

   6.   Minimize the complexity of the implementation.  This argues for
        using the smallest number of protocols that can achieve all of
        the functionality listed above; ideally, a single protocol
        should suffice.  The less complex the protocol is on the NVE,
        the more likely it is that interoperable implementations will
        be created in a timely manner.

   7.   Extensible.  The protocol should easily accommodate extensions
        to meet related future requirements.  For example, access
        control or QoS policies, or new address families for either
        inner or outer addresses, should be easy to add while
        maintaining interoperability with NVEs running older versions.

   8.   Simple protocol configuration.  A minimal amount of
        configuration should be required to provision a new NVE.
        Existing NVEs should not require any configuration changes when
        a new NVE is provisioned.  Ideally, NVEs should be able to
        configure themselves automatically.

   9.   Do not rely on IP multicast in the Underlying Network.  Many
        data centers do not have IP multicast routing enabled.  If the
        Underlying Network is an IP network, the protocol should allow
        for, but not require, the presence of IP multicast services
        within the data center.

   10.  Flexible mapping sources.  It should be possible for either
        NVEs themselves or third-party entities (e.g., data center
        management or orchestration systems) to create inner-to-outer
        address mappings in the NVA.  The protocol should allow
        mappings created by an NVE to be automatically removed from all
        other NVEs if that NVE fails or is brought down unexpectedly.

   11.  Secure.  See the Security Considerations section below.
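   The following Python sketch illustrates the caching and subscription
   behavior described in items 1 through 4 above: the NVE subscribes
   only to the VNs to which it is attached, resolves mappings on
   demand, applies updates for in-use entries, and discards state that
   is no longer in use.  It is illustrative only; every name is
   hypothetical and no protocol semantics are implied.

      # Illustrative only: per-VN subscription, on-demand resolution,
      # timely update, and discard of unused cache entries.
      import time

      class MappingCache:
          def __init__(self, nva, max_idle_seconds=300):
              self.nva = nva                # answers resolve(vn, inner_dst)
              self.max_idle = max_idle_seconds
              self.subscribed_vns = set()   # only VNs with attached TSIs
              self.entries = {}             # (vn, inner) -> (outer, last_used)

          def attach_vn(self, vn):
              """Item 4: subscribe only for VNs this NVE is attached to."""
              self.subscribed_vns.add(vn)

          def detach_vn(self, vn):
              self.subscribed_vns.discard(vn)
              self.entries = {k: v for k, v in self.entries.items()
                              if k[0] != vn}

          def resolve(self, vn, inner_dst):
              """Item 2: fetch a mapping from the NVA only on a cache miss."""
              if vn not in self.subscribed_vns:
                  raise LookupError("NVE is not attached to this VN")
              key = (vn, inner_dst)
              if key not in self.entries:
                  self.entries[key] = (self.nva.resolve(vn, inner_dst),
                                       time.time())
              outer, _ = self.entries[key]
              self.entries[key] = (outer, time.time())
              return outer

          def update(self, vn, inner_dst, new_outer):
              """Item 3: apply a timely update from the NVA for in-use state."""
              key = (vn, inner_dst)
              if key in self.entries:
                  self.entries[key] = (new_outer, self.entries[key][1])

          def expire_idle(self):
              """Item 1: discard cached state that is no longer actively used."""
              now = time.time()
              self.entries = {k: v for k, v in self.entries.items()
                              if now - v[1] <= self.max_idle}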
5.  Security Considerations

   Editor's Note: This is an initial start on the security
   considerations section; it will need to be expanded, and suggestions
   for material to add are welcome.

   The protocol(s) should protect the integrity of the mappings against
   both off-path and on-path attacks.  They should authenticate the
   systems that create mappings and should rely on lightweight security
   mechanisms to minimize the impact on scalability and allow for
   simple configuration.

   Use of an overlay exposes virtual networks to attacks on the
   underlying network beyond attacks on the control protocol that is
   the subject of this document.  In addition to the directly
   applicable security considerations for the networks involved, the
   use of an overlay enables attacks on encapsulated virtual networks
   via the underlying network.  Examples of such attacks include
   injecting traffic into a virtual network by injecting encapsulated
   traffic into the underlying network and modifying underlying network
   traffic so as to forward traffic among virtual networks that should
   have no connectivity.  The control protocol should provide
   functionality to help counter some of these attacks, e.g.,
   distribution of per-virtual-network NVE access control lists so that
   packets from non-participating NVEs can be discarded, but the
   primary security measures need to be applied to the underlying
   network itself.  For example, if the underlying network includes
   connectivity across the public Internet, use of secure gateways
   (e.g., based on IPsec [RFC4301]) may be appropriate.

   The inner-to-outer address mappings used for forwarding data towards
   a remote NVE could also be used to filter incoming traffic, ensuring
   that a packet with a given inner source address arrives from the
   correct NVE (outer) source address and allowing traffic that does
   not originate from the correct NVE to be discarded.  This
   destination filtering functionality should be optional to use.

6.  Acknowledgements

   Thanks to the following people for reviewing and providing feedback:
   Fabio Maino, Victor Moreno, Ajit Sanzgiri, and Chris Wright.

7.  Informative References

   [I-D.ietf-nvo3-arch]
              Black, D., Hudson, J., Kreeger, L., Lasserre, M., and T.
              Narten, "An Architecture for Overlay Networks (NVO3)",
              draft-ietf-nvo3-arch-03 (work in progress), March 2015.

   [I-D.ietf-nvo3-hpvr2nve-cp-req]
              Yizhou, L., Yong, L., Kreeger, L., Narten, T., and D.
              Black, "Hypervisor to NVE Control Plane Requirements",
              draft-ietf-nvo3-hpvr2nve-cp-req-02 (work in progress),
              February 2015.

   [RFC4301]  Kent, S. and K. Seo, "Security Architecture for the
              Internet Protocol", RFC 4301, December 2005.

   [RFC7364]  Narten, T., Gray, E., Black, D., Fang, L., Kreeger, L.,
              and M. Napierala, "Problem Statement: Overlays for
              Network Virtualization", RFC 7364, October 2014.

   [RFC7365]  Lasserre, M., Balus, F., Morin, T., Bitar, N., and Y.
              Rekhter, "Framework for Data Center (DC) Network
              Virtualization", RFC 7365, October 2014.

Appendix A.  Change Log

A.1.  Changes from draft-ietf-nvo3-nve-nva-cp-req-01 to -02

   1.  Added references to the architecture document
       [I-D.ietf-nvo3-arch].

   2.  Terminology: Usage of "TSI" in several places.

A.2.  Changes from draft-ietf-nvo3-nve-nva-cp-req-02 to -03

   1.  Updated references to the framework, problem statement, and
       merged WG hypervisor-to-nve document.

A.3.  Changes from draft-ietf-nvo3-nve-nva-cp-req-03 to -04

   1.  Minor editorial tweaks.

Authors' Addresses

   Lawrence Kreeger
   Cisco Systems

   Email: kreeger@cisco.com

   Dinesh Dutt
   Cumulus Networks

   Email: ddutt@cumulusnetworks.com

   Thomas Narten
   IBM

   Email: narten@us.ibm.com

   David Black
   EMC

   Email: david.black@emc.com