2 Internet Engineering Task Force L. Kreeger 3 Internet-Draft Cisco Systems 4 Intended status: Informational D. Dutt 5 Expires: April 30, 2015 Cumulus Networks 6 T. Narten 7 IBM 8 D. 
Black 9 EMC 10 October 27, 2014 12 Network Virtualization NVE to NVA Control Protocol Requirements 13 draft-ietf-nvo3-nve-nva-cp-req-03 15 Abstract 17 The document "Problem Statement: Overlays for Network Virtualization" 18 discusses the needs for network virtualization using overlay networks 19 in highly virtualized data centers. The problem statement outlines a 20 need for control protocols to facilitate running these overlay 21 networks. This document outlines the high level requirements to be 22 fulfilled by the control protocols related to building and managing 23 the mapping tables and other state information used by the Network 24 Virtualization Edge to transmit encapsulated packets across the 25 underlying network. 27 Status of This Memo 29 This Internet-Draft is submitted in full conformance with the 30 provisions of BCP 78 and BCP 79. 32 Internet-Drafts are working documents of the Internet Engineering 33 Task Force (IETF). Note that other groups may also distribute 34 working documents as Internet-Drafts. The list of current Internet- 35 Drafts is at http://datatracker.ietf.org/drafts/current/. 37 Internet-Drafts are draft documents valid for a maximum of six months 38 and may be updated, replaced, or obsoleted by other documents at any 39 time. It is inappropriate to use Internet-Drafts as reference 40 material or to cite them other than as "work in progress." 42 This Internet-Draft will expire on April 30, 2015. 44 Copyright Notice 46 Copyright (c) 2014 IETF Trust and the persons identified as the 47 document authors. All rights reserved. 49 This document is subject to BCP 78 and the IETF Trust's Legal 50 Provisions Relating to IETF Documents 51 (http://trustee.ietf.org/license-info) in effect on the date of 52 publication of this document. Please review these documents 53 carefully, as they describe your rights and restrictions with respect 54 to this document. 
Code Components extracted from this document must 55 include Simplified BSD License text as described in Section 4.e of 56 the Trust Legal Provisions and are provided without warranty as 57 described in the Simplified BSD License. 59 Table of Contents 61 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 62 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 3 63 3. Control Plane Protocol Functionality . . . . . . . . . . . . 4 64 3.1. Inner to Outer Address Mapping . . . . . . . . . . . . . 7 65 3.2. Underlying Network Multi-Destination Delivery Address(es) 7 66 3.3. VN Connect/Disconnect Notification . . . . . . . . . . . 8 67 3.4. VN Name to VN ID Mapping . . . . . . . . . . . . . . . . 8 68 4. Control Plane Characteristics . . . . . . . . . . . . . . . . 8 69 5. Security Considerations . . . . . . . . . . . . . . . . . . . 10 70 6. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 11 71 7. Informative References . . . . . . . . . . . . . . . . . . . 11 72 Appendix A. Change Log . . . . . . . . . . . . . . . . . . . . . 12 73 A.1. Changes from draft-ietf-nvo3-nve-nva-cp-req-01 to -02 . . 12 74 A.2. Changes from draft-ietf-nvo3-nve-nva-cp-req-02 to -03 . . 12 75 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 12 77 1. Introduction 79 "Problem Statement: Overlays for Network Virtualization" [RFC7364] 80 discusses the needs for network virtualization using overlay networks 81 in highly virtualized data centers and provides a general motivation 82 for building such networks. "Framework for DC Network 83 Virtualization" [RFC7365] provides a framework for discussing overlay 84 networks generally and the various components that must work together 85 in building such systems. "An Architecture for Overlay Networks 86 (NVO3)" [I-D.ietf-nvo3-arch] presents a high-level architecture for 87 building NVO3 Overlay networks. The reader is assumed to be familiar 88 with these documents. 
90 Section 4.5 of [RFC7364] describes three separate work areas that 91 fall under the general category of a control protocol for NVO3. This 92 document focuses entirely on those aspects of the control protocol 93 related to building and distributing the mapping tables an NVE 94 uses to tunnel traffic from one VM to another. Specifically, this 95 document focuses on work area 2 given in Section 4.5 of [RFC7364], 96 and discussed in Section 8 of [I-D.ietf-nvo3-arch]. Work area 2 97 covers the interaction between an NVE and the Network Virtualization 98 Authority (NVA), while work area 1 concerns operation of the NVA 99 itself. Requirements related to the interaction between a hypervisor and an 100 NVE when the two entities reside on separate physical devices (work 101 area 3) are covered in [I-D.ietf-nvo3-hpvr2nve-cp-req]. 103 2. Terminology 105 This document uses the same terminology as found in [RFC7365] and 106 [I-D.ietf-nvo3-arch]. This section defines additional terminology 107 used by this document. 109 Network Service Appliance: A stand-alone physical device or a 110 virtual device that provides a network service, such as a 111 firewall, load balancer, etc. Such appliances may embed Network 112 Virtualization Edge (NVE) functionality within them in order to 113 operate more efficiently as part of a virtualized network. 115 VN Alias: A string name for a VN as used by administrators and 116 customers to name a specific VN. A VN Alias is a human-usable 117 string that can be listed in contracts, customer forms, email, 118 configuration files, etc. and that can be communicated easily 119 vocally (e.g., over the phone). A VN Alias is independent of the 120 underlying technology used to implement a VN and will generally 121 not be carried in protocol fields of control protocols used in 122 virtual networks. Rather, a VN Alias will be mapped into a VN 123 Name where precision is required. 
125 VN Name: A globally unique identifier for a VN suitable for use 126 within network protocols. A VN Name will usually be paired with a 127 VN Alias, with the VN Alias used by humans as a shorthand way to 128 name and identify a specific VN. A VN Name should have a compact 129 representation to minimize protocol overhead where a VN Name is 130 carried in a protocol field. Using a Universally Unique 131 Identifier (UUID) as discussed in RFC 4122 may work well because 132 it is both compact and of fixed size and can be generated locally 133 with a very high likelihood of global uniqueness. 135 VN ID: A unique and compact identifier for a VN within the scope of 136 a specific NVO3 administrative domain. It will generally be more 137 efficient to carry VN IDs as fields in control protocols than VN 138 Names or VN Aliases. There is a one-to-one mapping between a VN 139 Name and a VN ID within an NVO3 Administrative Domain. Depending 140 on the technology used to implement an overlay network, the VN ID 141 could be used as the VN Context in the data plane, or would need 142 to be mapped to a locally-significant context ID. 144 3. Control Plane Protocol Functionality 146 The NVO3 problem statement [RFC7364] discusses the needs for a 147 control plane protocol (or protocols) to populate each NVE with the 148 state needed to perform its functions. 150 In one common scenario, an NVE provides overlay encapsulation/ 151 decapsulation packet forwarding services to Tenant Systems that are 152 co-resident with the NVE on the same End Device, for example, when 153 the NVE is embedded within a hypervisor or a Network Service 154 Appliance, as depicted in Figure 1 and Figure 2 below. 155 Alternatively, a Tenant System may use an externally connected NVE, 156 for example, an NVE residing on a physical Network Switch connected 157 to the End Device, as depicted in Figure 3 and Figure 4 below. 159 There are two control plane aspects for an NVE. 
One is the protocol 160 between the NVE and its NVA used to populate the NVE's mapping tables 161 for tunneling traffic across the underlying network. Another is the 162 protocol between an End Device (e.g., a hypervisor) and an external NVE 163 used to promptly update the NVE of Tenant System Interface (TSI) 164 status. This latter control plane aspect is not discussed in this 165 document, but is covered in [I-D.ietf-nvo3-hpvr2nve-cp-req]. The 166 functional requirements for the NVE to NVA control plane are the same 167 regardless of whether the NVE is embedded within an End Device or in 168 an external device, as depicted in Figure 1 through Figure 4 below.

170                       Hypervisor
171                +-----------------------+
172                | +--+   +-------+---+  |
173                | |VM|---|       |   |  |
174                | +--+   |Virtual|NVE|----- Underlying
175                | +--+   |Switch |   |  |   Network
176                | |VM|---|       |   |  |
177                | +--+   +-------+---+  |
178                +-----------------------+

180             Hypervisor with an Embedded NVE.

182                        Figure 1

184              Network Service Appliance
185            +---------------------------+
186            | +------------+   +-----+  |
187            | |Net Service |---|     |  |
188            | |Instance    |   |     |  |
189            | +------------+   | NVE |------ Underlying
190            | +------------+   |     |  |    Network
191            | |Net Service |---|     |  |
192            | |Instance    |   |     |  |
193            | +------------+   +-----+  |
194            +---------------------------+

196 Network Service Appliance (physical or virtual) with an Embedded NVE.

198                        Figure 2

200       Hypervisor               Access Switch
201  +------------------+         +-----+-------+
202  | +--+   +-------+ |         |     |       |
203  | |VM|---|       | |  VLAN   |     |       |
204  | +--+   |Virtual|-----------+ NVE |       +--- Underlying
205  | +--+   |Switch | |  Trunk  |     |       |    Network
206  | |VM|---|       | |         |     |       |
207  | +--+   +-------+ |         |     |       |
208  +------------------+         +-----+-------+

210           Hypervisor with an External NVE. 
212                        Figure 3

214   Network Service Appliance           Access Switch
215  +--------------------------+        +-----+-------+
216  | +------------+    |\     |        |     |       |
217  | |Net Service |----| \    |        |     |       |
218  | |Instance    |    |  \   |  VLAN  |     |       |
219  | +------------+    |   |-----------+ NVE |       +--- Underlying
220  | +------------+    |   |  |  Trunk |     |       |    Network
221  | |Net Service |----|  /   |        |     |       |
222  | |Instance    |    | /    |        |     |       |
223  | +------------+    |/     |        |     |       |
224  +--------------------------+        +-----+-------+

226  Physical Network Service Appliance with an External NVE.

228                        Figure 4

230 To support an NVE, a control plane protocol is necessary to provide 231 an NVE with the information it needs to maintain the internal 232 state necessary to carry out its forwarding functions, as explained in 233 detail below. 235 1. An NVE maintains a per-VN table of mappings from TSI (inner) 236 addresses to Underlying Network (outer) addresses of remote NVEs. 238 2. An NVE maintains per-VN state for delivering tenant multicast and 239 broadcast packets to other Tenant Systems. Such state could 240 include a list of multicast addresses and/or unicast addresses on 241 the Underlying Network for the NVEs associated with a particular 242 VN. 244 3. End Devices (such as a Hypervisor or Network Service Appliance) 245 utilizing an external NVE need to "attach to" and "detach from" 246 an NVE. Specifically, a mechanism is needed to notify an NVE 247 when a TSI attaches to or detaches from a specific VN. Such a 248 mechanism would provide the NVE with the information 249 it needs to provide service to a particular TSI. The details of 250 such a mechanism are out of scope for this document and are 251 covered in [I-D.ietf-nvo3-hpvr2nve-cp-req]. 253 4. An NVE needs a mapping from each unique VN Name to the VN Context 254 value used within encapsulated data packets in the 255 administrative domain in which the VN is instantiated. 257 The NVE to NVA control protocol operates directly over the underlay 258 network. 
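The four kinds of per-VN state enumerated above can be pictured as a small set of tables. The following Python sketch is purely illustrative and not part of any specified protocol; all class, field, and example values (VN names, addresses, contexts) are hypothetical.

```python
# Illustrative sketch of the per-VN state an NVE maintains (items 1-4
# above). All names and values here are hypothetical, not from the draft.
from dataclasses import dataclass, field

@dataclass
class VnState:
    vn_context: int                                      # item 4: VN Context used in encapsulated packets
    inner_to_outer: dict = field(default_factory=dict)   # item 1: TSI (inner) addr -> remote NVE (outer) addr
    delivery_addrs: set = field(default_factory=set)     # item 2: multicast/unicast delivery addresses
    attached_tsis: set = field(default_factory=set)      # item 3: local TSIs attached to this VN

class NveState:
    """State for one NVE; entries would be populated via the NVE-to-NVA protocol."""
    def __init__(self):
        self.vns = {}                                    # VN Name -> VnState

    def attach(self, vn_name, vn_context, tsi):
        # Invoked on a VN connect notification for a local TSI.
        vn = self.vns.setdefault(vn_name, VnState(vn_context))
        vn.attached_tsis.add(tsi)
        return vn

nve = NveState()
vn = nve.attach("tenant-a", 5001, "tsi-1")
vn.inner_to_outer["10.1.5.2"] = "192.0.2.7"   # mapping learned from the NVA
```

How these tables are encoded and exchanged on the wire is exactly what the requirements in this document constrain; the sketch only shows what state ends up resident on the NVE.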
The NVA is expected to be connected to the same underlay 259 network as the NVEs. 261 Each NVE communicates with only a single logical NVA; however, the 262 NVA can be centralized or distributed across multiple entities for 263 redundancy purposes. When the NVA is made up of multiple entities, 264 better resiliency may be achieved by physically separating them, 265 which may require each entity to be connected to a different IP 266 subnet of the underlay network. For this reason, each NVE should be 267 allowed to be configured with more than one IP address for its 268 logical NVA. An NVE should be able to switch between these IP 269 addresses when it detects that the address it is currently using for 270 the NVA is unreachable. How the NVA represents itself externally is 271 discussed in Section 7.3 of [I-D.ietf-nvo3-arch]. 273 Note that a single device could contain both NVE and NVA 274 functionality, but the functional interaction between the NVE and NVA 275 within that device should operate similarly to when the NVE and NVA 276 are implemented in separate devices. 278 3.1. Inner to Outer Address Mapping 280 When presented with a data packet to forward to a TSI within a VN, 281 the NVE needs to know the mapping of the TSI destination (inner) 282 address to the (outer) address on the Underlying Network of the 283 remote NVE which can deliver the packet to the destination Tenant 284 System. In addition, the NVE needs to know what VN Context to use 285 when sending to a destination Tenant System. 287 A protocol is needed to provide this inner to outer mapping and VN 288 Context to each NVE that requires it and to keep the mapping updated in 289 a timely manner. Timely updates are important for maintaining 290 connectivity between Tenant Systems when one Tenant System is a VM. 
292 Note that one technique that could be used to create this mapping 293 without the need for a control protocol is data plane learning; 294 however, the learning approach requires packets to be flooded to all 295 NVEs participating in the VN when no mapping exists. One goal of 296 using a control protocol is to eliminate this flooding. 298 3.2. Underlying Network Multi-Destination Delivery Address(es) 300 Each NVE needs a way to deliver multi-destination packets (i.e., 301 tenant broadcast/multicast) within a given VN to each remote NVE 302 which has a destination TSI for these packets. Three possible ways 303 of accomplishing this are: 305 o Use the multicast capabilities of the Underlying Network. 307 o Have each NVE replicate the packets and send a copy across the 308 Underlying Network to each remote NVE currently participating in 309 the VN. 311 o Use one or more distribution servers that replicate the packets on 312 behalf of the NVEs. 314 Whichever method is used, a protocol is needed to provide, on a per-VN 315 basis, one or more multicast addresses (assuming the Underlying 316 Network supports multicast), and/or one or more unicast addresses of 317 either the remote NVEs which are not multicast reachable, or of one 318 or more distribution servers for the VN. 320 The protocol must also keep the list of addresses up to date in a 321 timely manner as the set of NVEs for a given VN changes over time. 322 For example, the set of NVEs for a VN could change as VMs power on/ 323 off or migrate to different hypervisors. 325 3.3. VN Connect/Disconnect Notification 327 For the purposes of this document, it is assumed that an NVE receives 328 appropriate notifications when a TSI attaches to or detaches from a 329 specific VN. The details of how that is done are orthogonal to the 330 NVE-to-NVA control plane, so long as such notification provides the 331 information needed by the control plane. 
As one example, 332 the attach/detach notification would presumably include a VN Name 333 that identifies the specific VN to which the attach/detach operation 334 applies. 336 3.4. VN Name to VN ID Mapping 338 Once an NVE (embedded or external) receives a VN connect indication 339 with a specified VN Name, the NVE must determine what VN Context 340 value and other necessary information to use to forward Tenant System 341 traffic to remote NVEs. In one approach, the NVE-to-NVA protocol 342 uses VN Names directly when interacting, with the NVA providing such 343 information as the VN Context (or VN ID) along with the egress NVE's 344 address. Alternatively, it may be desirable for the NVE-to-NVA 345 protocol to use a more compact representation of the VN Name, that 346 is, a VN ID. In such a case, a specific NVE-to-NVA operation might 347 be needed to first map the VN Name into a VN ID, with subsequent 348 NVE-to-NVA operations utilizing the VN ID directly. Thus, it may be 349 useful for the NVE-to-NVA protocol to support an operation that maps 350 VN Names into VN IDs. 352 4. Control Plane Characteristics 354 NVEs are expected to be implemented within both hypervisors (or 355 Network Service Appliances) and within access switches. Any 356 resources used by these protocols (e.g., processing or memory) take 357 away resources that could be better used by these devices to perform 358 their intended functions (e.g., providing resources for hosted VMs). 360 A large-scale data center may contain hundreds of thousands of these 361 NVEs (which may be several independent implementations); therefore, 362 any savings in per-NVE resources can be multiplied hundreds of 363 thousands of times. 365 Given this, the control plane protocol(s) implemented by NVEs to 366 provide the functionality discussed above should have the following 367 characteristics. 369 1. Minimize the amount of state needed to be stored on each NVE. 
370 The NVE should only be required to cache state that it is 371 actively using, and be able to discard any cached state when it 372 is no longer required. For example, an NVE should only need to 373 maintain an inner-to-outer address mapping for destinations to 374 which it is actively sending traffic, as opposed to maintaining 375 mappings for all possible destinations. 377 2. Fast acquisition of needed state. For example, when a TSI emits 378 a packet destined to an inner address that the NVE does not have 379 a mapping for, the NVE should be able to acquire the needed 380 mapping quickly. 382 3. Fast detection/update of stale cached state information. This 383 only applies if the cached state is actually being used. For 384 example, when a VM moves such that it is connected to a 385 different NVE, the inner to outer mapping for this VM's address 386 that is cached on other NVEs must be updated in a timely manner 387 (if it is actively in use). If the update is not timely, other 388 NVEs will forward data to the wrong NVE until the mapping is updated. 390 4. Minimize processing overhead. This means that an NVE should 391 only be required to perform protocol processing directly related 392 to maintaining state for the TSIs it is actively communicating 393 with. For example, if the NVA provides unsolicited information 394 to the NVEs, then one way to minimize the processing on the NVE 395 is for it to subscribe to receive these mappings on a per-VN 396 basis. Consequently, an NVE is not required to maintain state 397 for all VNs within a domain. An NVE only needs to maintain 398 state (or participate in protocol exchanges) for the VNs it is 399 currently attached to. If the NVE obtains mappings on demand 400 from the NVA, then it only needs to obtain the information 401 relevant to the traffic flows that are currently active. This 402 requirement is for the NVE functionality only. 
The network node 403 that contains the NVE may be involved in other functionality for 404 the underlying network that maintains connectivity that the NVE 405 is not actively using (e.g., routing and multicast distribution 406 protocols for the underlying network). 408 5. Highly scalable. This means scaling to hundreds of thousands of 409 NVEs and several million VNs within a single administrative 410 domain. As the number of NVEs and/or VNs within a data center 411 grows, the protocol overhead at any one NVE should not increase 412 significantly. 414 6. Minimize the complexity of the implementation. This argues for 415 using the least number of protocols to achieve all the 416 functionality listed above. Ideally, a single protocol should 417 suffice. The less complex the protocol is on the NVE, 418 the more likely interoperable implementations will be created in 419 a timely manner. 421 7. Extensible. The protocol should easily accommodate extension to 422 meet related future requirements. For example, access control 423 or QoS policies, or new address families for either inner or 424 outer addresses, should be easy to add while maintaining 425 interoperability with NVEs running older versions. 427 8. Simple protocol configuration. A minimal amount of 428 configuration should be required for a new NVE to be 429 provisioned. Existing NVEs should not require any configuration 430 changes when a new NVE is provisioned. Ideally, NVEs should be 431 able to auto-configure themselves. 433 9. Do not rely on IP Multicast in the Underlying Network. Many 434 data centers do not have IP multicast routing enabled. If the 435 Underlying Network is an IP network, the protocol should allow 436 for, but not require, the presence of IP multicast services 437 within the data center. 439 10. Flexible mapping sources. It should be possible for either NVEs 440 themselves, or other third-party entities (e.g. 
data center 441 management or orchestration systems) to create inner to outer 442 address mappings in the NVA. The protocol should allow for 443 mappings created by an NVE to be automatically removed from all 444 other NVEs if the NVE that created them fails or is brought down unexpectedly. 446 11. Secure. See the Security Considerations section below. 448 5. Security Considerations 450 Editor's Note: This is an initial start on the security 451 considerations section; it will need to be expanded, and suggestions 452 for material to add are welcome. 454 The protocol(s) should protect the integrity of the mapping against 455 both off-path and on-path attacks. It should authenticate the 456 systems that are creating mappings, and rely on lightweight security 457 mechanisms to minimize the impact on scalability and allow for simple 458 configuration. 460 Use of an overlay exposes virtual networks to attacks on the 461 underlying network beyond attacks on the control protocol that is the 462 subject of this draft. In addition to the directly applicable 463 security considerations for the networks involved, the use of an 464 overlay enables attacks on encapsulated virtual networks via the 465 underlying network. Examples of such attacks include injection of 466 traffic into a virtual network by injecting encapsulated 467 traffic into the underlying network, and modification of underlying network 468 traffic to forward traffic among virtual networks that should have no 469 connectivity. The control protocol should provide functionality to 470 help counter some of these attacks, e.g., distribution of NVE access 471 control lists for each virtual network so that packets from 472 non-participating NVEs can be discarded, but the primary security measures 473 against attacks via the underlying network must be applied in 474 the underlying network itself. 
For example, if the underlying network includes 475 connectivity across the public Internet, use of secure gateways 476 (e.g., based on IPsec [RFC4301]) may be appropriate. 478 The inner to outer address mappings used for forwarding data towards 479 a remote NVE could also be used to filter incoming traffic, to ensure 480 that a packet with a given inner source address arrives from the correct NVE 481 (outer) source address, allowing access control to discard traffic that does not 482 originate from the correct NVE. This destination filtering 483 functionality should be optional to use. 485 6. Acknowledgements 487 Thanks to the following people for reviewing and providing feedback: 488 Fabio Maino, Victor Moreno, Ajit Sanzgiri, Chris Wright. 490 7. Informative References 492 [I-D.ietf-nvo3-arch] 493 Black, D., Hudson, J., Kreeger, L., Lasserre, M., and T. 494 Narten, "An Architecture for Overlay Networks (NVO3)", 495 draft-ietf-nvo3-arch-01 (work in progress), February 2014. 497 [I-D.ietf-nvo3-hpvr2nve-cp-req] 498 Yizhou, L., Yong, L., Kreeger, L., Narten, T., and D. 499 Black, "Hypervisor to NVE Control Plane Requirements", 500 draft-ietf-nvo3-hpvr2nve-cp-req-00 (work in progress), 501 July 2014. 503 [RFC4301] Kent, S. and K. Seo, "Security Architecture for the 504 Internet Protocol", RFC 4301, December 2005. 506 [RFC7364] Narten, T., Gray, E., Black, D., Fang, L., Kreeger, L., 507 and M. Napierala, "Problem Statement: Overlays for Network 508 Virtualization", RFC 7364, October 2014. 510 [RFC7365] Lasserre, M., Balus, F., Morin, T., Bitar, N., and Y. 511 Rekhter, "Framework for Data Center (DC) Network 512 Virtualization", RFC 7365, October 2014. 514 Appendix A. Change Log 516 A.1. Changes from draft-ietf-nvo3-nve-nva-cp-req-01 to -02 518 1. Added references to the architecture document 519 [I-D.ietf-nvo3-arch]. 521 2. Terminology: Usage of "TSI" in several places. 523 A.2. Changes from draft-ietf-nvo3-nve-nva-cp-req-02 to -03 525 1. 
Updated references to the framework, problem statement and merged 526 WG hypervisor-to-nve document. 528 Authors' Addresses 530 Lawrence Kreeger 531 Cisco Systems 533 Email: kreeger@cisco.com 535 Dinesh Dutt 536 Cumulus Networks 538 Email: ddutt@cumulusnetworks.com 540 Thomas Narten 541 IBM 543 Email: narten@us.ibm.com 545 David Black 546 EMC 548 Email: david.black@emc.com