2 Internet Engineering Task Force D. Black 3 Internet-Draft Dell EMC 4 Intended status: Informational J. Hudson 5 Expires: March 24, 2017 Independent 6 L. Kreeger 7 Cisco 8 M. Lasserre 9 Independent 10 T. Narten 11 IBM 12 September 20, 2016 14 An Architecture for Data Center Network Virtualization Overlays (NVO3) 15 draft-ietf-nvo3-arch-08 17 Abstract 19 This document presents a high-level overview architecture for 20 building data center network virtualization overlay (NVO3) networks. 21 The architecture is given at a high-level, showing the major 22 components of an overall system. An important goal is to divide the 23 space into individual smaller components that can be implemented 24 independently with clear inter-component interfaces and interactions. 25 It should be possible to build and implement individual components in 26 isolation and have them interoperate with other independently 27 implemented components. That way, implementers have flexibility in 28 implementing individual components and can optimize and innovate 29 within their respective components without requiring changes to other 30 components. 32 Status of This Memo 34 This Internet-Draft is submitted in full conformance with the 35 provisions of BCP 78 and BCP 79. 37 Internet-Drafts are working documents of the Internet Engineering 38 Task Force (IETF). Note that other groups may also distribute 39 working documents as Internet-Drafts. The list of current Internet- 40 Drafts is at http://datatracker.ietf.org/drafts/current/. 42 Internet-Drafts are draft documents valid for a maximum of six months 43 and may be updated, replaced, or obsoleted by other documents at any 44 time. It is inappropriate to use Internet-Drafts as reference 45 material or to cite them other than as "work in progress." 47 This Internet-Draft will expire on March 24, 2017. 49 Copyright Notice 51 Copyright (c) 2016 IETF Trust and the persons identified as the 52 document authors. All rights reserved.
54 This document is subject to BCP 78 and the IETF Trust's Legal 55 Provisions Relating to IETF Documents 56 (http://trustee.ietf.org/license-info) in effect on the date of 57 publication of this document. Please review these documents 58 carefully, as they describe your rights and restrictions with respect 59 to this document. Code Components extracted from this document must 60 include Simplified BSD License text as described in Section 4.e of 61 the Trust Legal Provisions and are provided without warranty as 62 described in the Simplified BSD License. 64 Table of Contents 66 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 3 67 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 3 68 3. Background . . . . . . . . . . . . . . . . . . . . . . . . . 4 69 3.1. VN Service (L2 and L3) . . . . . . . . . . . . . . . . . 6 70 3.1.1. VLAN Tags in L2 Service . . . . . . . . . . . . . . . 7 71 3.1.2. Packet Lifetime Considerations . . . . . . . . . . . 7 72 3.2. Network Virtualization Edge (NVE) . . . . . . . . . . . . 8 73 3.3. Network Virtualization Authority (NVA) . . . . . . . . . 9 74 3.4. VM Orchestration Systems . . . . . . . . . . . . . . . . 10 75 4. Network Virtualization Edge (NVE) . . . . . . . . . . . . . . 11 76 4.1. NVE Co-located With Server Hypervisor . . . . . . . . . . 11 77 4.2. Split-NVE . . . . . . . . . . . . . . . . . . . . . . . . 12 78 4.2.1. Tenant VLAN handling in Split-NVE Case . . . . . . . 13 79 4.3. NVE State . . . . . . . . . . . . . . . . . . . . . . . . 13 80 4.4. Multi-Homing of NVEs . . . . . . . . . . . . . . . . . . 14 81 4.5. Virtual Access Point (VAP) . . . . . . . . . . . . . . . 15 82 5. Tenant System Types . . . . . . . . . . . . . . . . . . . . . 15 83 5.1. Overlay-Aware Network Service Appliances . . . . . . . . 15 84 5.2. Bare Metal Servers . . . . . . . . . . . . . . . . . . . 16 85 5.3. Gateways . . . . . . . . . . . . . . . . . . . . . . . . 16 86 5.3.1. Gateway Taxonomy . . . . . . . . . . . . . . . . . . 17 87 5.3.1.1. L2 Gateways (Bridging) . . . . . . . . . . . . . 17 88 5.3.1.2. L3 Gateways (Only IP Packets) . . . . . . . . . . 17 89 5.4. Distributed Inter-VN Gateways . . . . . . . . . . . . . . 18 90 5.5. ARP and Neighbor Discovery . . . . . . . . . . . . . . . 19 91 6. NVE-NVE Interaction . . . . . . . . . . . . . . . . . . . . . 19 92 7. Network Virtualization Authority . . . . . . . . . . . . . . 20 93 7.1. How an NVA Obtains Information . . . . . . . . . . . . . 20 94 7.2. Internal NVA Architecture . . . . . . . . . . . . . . . . 21 95 7.3. NVA External Interface . . . . . . . . . . . . . . . . . 21 96 8. NVE-to-NVA Protocol . . . . . . . . . . . . . . . . . . . . . 23 97 8.1. NVE-NVA Interaction Models . . . . . . . . . . . . . . . 23 98 8.2. Direct NVE-NVA Protocol . . . . . . . . . . . . . . . . . 24 99 8.3. Propagating Information Between NVEs and NVAs . . . . . . 24 100 9. Federated NVAs . . . . . . . . . . . . . . . . . . . . . . . 25 101 9.1. Inter-NVA Peering . . . . . . . . . . . . . . . . . . . . 28 102 10. Control Protocol Work Areas . . . . . . . . . . . . . . . . . 28 103 11. NVO3 Data Plane Encapsulation . . . . . . . . . . . . . . . . 28 104 12. Operations, Administration and Maintenance (OAM) . . . . . . 29 105 13. Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . 30 106 14. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 30 107 15. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 30 108 16. Security Considerations . . . . . . . . . . . . . . . . . . . 
30 109 17. Informative References . . . . . . . . . . . . . . . . . . . 31 110 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 33 112 1. Introduction 114 This document presents a high-level architecture for building data 115 center network virtualization overlay (NVO3) networks. The 116 architecture is given at a high-level, showing the major components 117 of an overall system. An important goal is to divide the space into 118 smaller individual components that can be implemented independently 119 with clear inter-component interfaces and interactions. It should be 120 possible to build and implement individual components in isolation 121 and have them interoperate with other independently implemented 122 components. That way, implementers have flexibility in implementing 123 individual components and can optimize and innovate within their 124 respective components without requiring changes to other components. 126 The motivation for overlay networks is given in "Problem Statement: 127 Overlays for Network Virtualization" [RFC7364]. "Framework for DC 128 Network Virtualization" [RFC7365] provides a framework for discussing 129 overlay networks generally and the various components that must work 130 together in building such systems. This document differs from the 131 framework document in that it doesn't attempt to cover all possible 132 approaches within the general design space. Rather, it describes one 133 particular approach that the NVO3 WG has focused on. 135 2. Terminology 137 This document uses the same terminology as [RFC7365]. In addition, 138 the following terms are used: 140 NV Domain A Network Virtualization Domain is an administrative 141 construct that defines a Network Virtualization Authority (NVA), 142 the set of Network Virtualization Edges (NVEs) associated with 143 that NVA, and the set of virtual networks the NVA manages and 144 supports. NVEs are associated with a (logically centralized) NVA, 145 and an NVE supports communication for any of the virtual networks 146 in the domain. 148 NV Region A region over which information about a set of virtual 149 networks is shared. The degenerate case of a single NV Domain 150 corresponds to an NV region corresponding to that domain. The 151 more interesting case occurs when two or more NV Domains share 152 information about part or all of a set of virtual networks that 153 they manage. Two NVAs share information about particular virtual 154 networks for the purpose of supporting connectivity between 155 tenants located in different NV Domains. NVAs can share 156 information about an entire NV domain, or just individual virtual 157 networks. 159 Tenant System Interface (TSI) Interface to a Virtual Network as 160 presented to a Tenant System (TS, see [RFC7365]). The TSI 161 logically connects to the NVE via a Virtual Access Point (VAP). 162 To the Tenant System, the TSI is like a Network Interface Card 163 (NIC); the TSI presents itself to a Tenant System as a normal 164 network interface. 166 VLAN Unless stated otherwise, the terms VLAN and VLAN Tag are used 167 in this document to denote a C-VLAN [IEEE-802.1Q] and the terms 168 are used interchangeably to improve readability. 170 3. Background 172 Overlay networks are an approach for providing network virtualization 173 services to a set of Tenant Systems (TSs) [RFC7365]. With overlays, 174 data traffic between tenants is tunneled across the underlying data 175 center's IP network. 
The use of tunnels provides a number of 176 benefits by decoupling the network as viewed by tenants from the 177 underlying physical network across which they communicate. 178 Additional discussion of some NVO3 use cases can be found in 179 [I-D.ietf-nvo3-use-case]. 181 Tenant Systems connect to Virtual Networks (VNs), with each VN having 182 associated attributes defining properties of the network, such as the 183 set of members that connect to it. Tenant Systems connected to a 184 virtual network typically communicate freely with other Tenant 185 Systems on the same VN, but communication between Tenant Systems on 186 one VN and those external to the VN (whether on another VN or 187 connected to the Internet) is carefully controlled and governed by 188 policy. The NVO3 architecture does not impose any restrictions to 189 the application of policy controls even within a VN. 191 A Network Virtualization Edge (NVE) [RFC7365] is the entity that 192 implements the overlay functionality. An NVE resides at the boundary 193 between a Tenant System and the overlay network as shown in Figure 1. 194 An NVE creates and maintains local state about each Virtual Network 195 for which it is providing service on behalf of a Tenant System. 197 +--------+ +--------+ 198 | Tenant +--+ +----| Tenant | 199 | System | | (') | System | 200 +--------+ | ................ ( ) +--------+ 201 | +-+--+ . . +--+-+ (_) 202 | | NVE|--. .--| NVE| | 203 +--| | . . | |---+ 204 +-+--+ . . +--+-+ 205 / . . 206 / . L3 Overlay . +--+-++--------+ 207 +--------+ / . Network . | NVE|| Tenant | 208 | Tenant +--+ . .- -| || System | 209 | System | . . +--+-++--------+ 210 +--------+ ................ 211 | 212 +----+ 213 | NVE| 214 | | 215 +----+ 216 | 217 | 218 ===================== 219 | | 220 +--------+ +--------+ 221 | Tenant | | Tenant | 222 | System | | System | 223 +--------+ +--------+ 225 Figure 1: NVO3 Generic Reference Model 227 The following subsections describe key aspects of an overlay system 228 in more detail. Section 3.1 describes the service model (Ethernet 229 vs. IP) provided to Tenant Systems. Section 3.2 describes NVEs in 230 more detail. Section 3.3 introduces the Network Virtualization 231 Authority, from which NVEs obtain information about virtual networks. 232 Section 3.4 provides background on Virtual Machine (VM) orchestration 233 systems and their use of virtual networks. 235 3.1. VN Service (L2 and L3) 237 A Virtual Network provides either L2 or L3 service to connected 238 tenants. For L2 service, VNs transport Ethernet frames, and a Tenant 239 System is provided with a service that is analogous to being 240 connected to a specific L2 C-VLAN. L2 broadcast frames are generally 241 delivered to all (and multicast frames delivered to a subset of) the 242 other Tenant Systems on the VN. To a Tenant System, it appears as if 243 they are connected to a regular L2 Ethernet link. Within the NVO3 244 architecture, tenant frames are tunneled to remote NVEs based on the 245 MAC addresses of the frame headers as originated by the Tenant 246 System. On the underlay, NVO3 packets are forwarded between NVEs 247 based on the outer addresses of tunneled packets. 249 For L3 service, VNs are routed networks that transport IP datagrams, 250 and a Tenant System is provided with a service that supports only IP 251 traffic. 
Within the NVO3 architecture, tenant frames are tunneled to 252 remote NVEs based on the IP addresses of the packet originated by the 253 Tenant System; any L2 destination addresses provided by Tenant 254 Systems are effectively ignored by the NVEs and overlay network. For 255 L3 service, the Tenant System will be configured with an IP subnet 256 that is effectively a point-to-point link, i.e., having only the 257 Tenant System and a next-hop router address on it. 259 L2 service is intended for systems that need native L2 Ethernet 260 service and the ability to run protocols directly over Ethernet 261 (i.e., not based on IP). L3 service is intended for systems in which 262 all the traffic can safely be assumed to be IP. It is important to 263 note that whether an NVO3 network provides L2 or L3 service to a 264 Tenant System, the Tenant System does not generally need to be aware 265 of the distinction. In both cases, the virtual network presents 266 itself to the Tenant System as an L2 Ethernet interface. An Ethernet 267 interface is used in both cases simply as a widely supported 268 interface type that essentially all Tenant Systems already support. 269 Consequently, no special software is needed on Tenant Systems to use 270 an L3 vs. an L2 overlay service. 272 NVO3 can also provide a combined L2 and L3 service to tenants. A 273 combined service provides L2 service for intra-VN communication, but 274 also provides L3 service for L3 traffic entering or leaving the VN. 275 Architecturally, the handling of a combined L2/L3 service within the 276 NVO3 architecture is intended to match what is commonly done today in 277 non-overlay environments by devices providing a combined bridge/ 278 router service. With combined service, the virtual network itself 279 retains the semantics of L2 service and all traffic is processed 280 according to its L2 semantics. In addition, however, traffic 281 requiring IP processing is also processed at the IP level. 283 The IP processing for a combined service can be implemented on a 284 standalone device attached to the virtual network (e.g., an IP 285 router) or implemented locally on the NVE (see Section 5.4 on 286 Distributed Gateways). For unicast traffic, NVE implementation of a 287 combined service may result in a packet being delivered to another 288 Tenant System attached to the same NVE (on either the same or a 289 different VN) or tunneled to a remote NVE, or even forwarded outside 290 the NV domain. For multicast or broadcast packets, the combination 291 of NVE L2 and L3 processing may result in copies of the packet 292 receiving both L2 and L3 treatments to realize delivery to all of the 293 destinations involved. This distributed NVE implementation of IP 294 routing results in the same network delivery behavior as if the L2 295 processing of the packet included delivery of the packet to an IP 296 router attached to the L2 VN as a Tenant System, with the router 297 having additional network attachments to other networks, either 298 virtual or not. 300 3.1.1. VLAN Tags in L2 Service 302 An NVO3 L2 virtual network service may include encapsulated L2 VLAN 303 tags provided by a Tenant System, but does not use encapsulated tags 304 in deciding where and how to forward traffic. Such VLAN tags can be 305 passed through, so that Tenant Systems that send or expect to receive 306 them can be supported as appropriate. 308 The processing of VLAN tags that an NVE receives from a TS is 309 controlled by settings associated with the VAP. 
Just as in the case 310 with ports on Ethernet switches, a number of settings are possible. 311 For example, C-TAGs can be passed through transparently, they could 312 always be stripped upon receipt from a Tenant System, they could be 313 compared against a list of explicitly configured tags, etc. 315 Note that that there are additional considerations when VLAN tags are 316 used to identify both the VN and a Tenant System VLAN within that VN, 317 as described in Section 4.2.1 below. 319 3.1.2. Packet Lifetime Considerations 321 For L3 service, Tenant Systems should expect the IPv4 TTL (Time to 322 Live) or IPv6 Hop Limit in the packets they send to be decremented by 323 at least 1. For L2 service, neither the TTL nor the Hop Limit (when 324 the packet is IP) are modified. The underlay network manages TTLs 325 and Hop Limits in the outer IP encapsulation - the values in these 326 fields could be independent from or related to the values in the same 327 fields of tenant IP packets. 329 3.2. Network Virtualization Edge (NVE) 331 Tenant Systems connect to NVEs via a Tenant System Interface (TSI). 332 The TSI logically connects to the NVE via a Virtual Access Point 333 (VAP) and each VAP is associated with one Virtual Network as shown in 334 Figure 2. To the Tenant System, the TSI is like a NIC; the TSI 335 presents itself to a Tenant System as a normal network interface. On 336 the NVE side, a VAP is a logical network port (virtual or physical) 337 into a specific virtual network. Note that two different Tenant 338 Systems (and TSIs) attached to a common NVE can share a VAP (e.g., 339 TS1 and TS2 in Figure 2) so long as they connect to the same Virtual 340 Network. 342 | Data Center Network (IP) | 343 | | 344 +-----------------------------------------+ 345 | | 346 | Tunnel Overlay | 347 +------------+---------+ +---------+------------+ 348 | +----------+-------+ | | +-------+----------+ | 349 | | Overlay Module | | | | Overlay Module | | 350 | +---------+--------+ | | +---------+--------+ | 351 | | | | | | 352 NVE1 | | | | | | NVE2 353 | +--------+-------+ | | +--------+-------+ | 354 | | VNI1 VNI2 | | | | VNI1 VNI2 | | 355 | +-+----------+---+ | | +-+-----------+--+ | 356 | | VAP1 | VAP2 | | | VAP1 | VAP2| 357 +----+----------+------+ +----+-----------+-----+ 358 | | | | 359 |\ | | | 360 | \ | | /| 361 -------+--\-------+-------------------+---------/-+------- 362 | \ | Tenant | / | 363 TSI1 |TSI2\ | TSI3 TSI1 TSI2/ TSI3 364 +---+ +---+ +---+ +---+ +---+ +---+ 365 |TS1| |TS2| |TS3| |TS4| |TS5| |TS6| 366 +---+ +---+ +---+ +---+ +---+ +---+ 368 Figure 2: NVE Reference Model 370 The Overlay Module performs the actual encapsulation and 371 decapsulation of tunneled packets. The NVE maintains state about the 372 virtual networks it is a part of so that it can provide the Overlay 373 Module with such information as the destination address of the NVE to 374 tunnel a packet to and the Context ID that should be placed in the 375 encapsulation header to identify the virtual network that a tunneled 376 packet belongs to. 378 On the data center network side, the NVE sends and receives native IP 379 traffic. When ingressing traffic from a Tenant System, the NVE 380 identifies the egress NVE to which the packet should be sent, adds an 381 overlay encapsulation header, and sends the packet on the underlay 382 network. When receiving traffic from a remote NVE, an NVE strips off 383 the encapsulation header, and delivers the (original) packet to the 384 appropriate Tenant System. 
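The following non-normative sketch illustrates the per-VN mapping state and the ingress/egress processing described above. It is an illustration only; the names (e.g., MappingTable, Encapsulated) are hypothetical and nothing here defines an NVO3 interface.

   from dataclasses import dataclass
   from typing import Dict, Optional

   @dataclass
   class Encapsulated:
       outer_dst: str    # underlay IP address of the egress NVE
       context_id: int   # VN Context ID carried in the overlay header
       payload: bytes    # original tenant frame or packet

   class MappingTable:
       """Per-VN map of tenant (inner) address -> remote NVE (outer) address."""
       def __init__(self) -> None:
           self._map: Dict[int, Dict[str, str]] = {}

       def learn(self, context_id: int, inner: str, outer: str) -> None:
           self._map.setdefault(context_id, {})[inner] = outer

       def lookup(self, context_id: int, inner: str) -> Optional[str]:
           return self._map.get(context_id, {}).get(inner)

   def ingress(table: MappingTable, context_id: int,
               inner_dst: str, frame: bytes) -> Optional[Encapsulated]:
       """Tenant -> underlay: pick the egress NVE and add the overlay header."""
       outer = table.lookup(context_id, inner_dst)
       if outer is None:
           return None   # unknown destination: consult the NVA (Section 7)
       return Encapsulated(outer_dst=outer, context_id=context_id, payload=frame)

   def egress(vaps: Dict[int, str], pkt: Encapsulated) -> str:
       """Underlay -> tenant: strip the overlay header and deliver via the VAP."""
       return vaps[pkt.context_id]   # VAP (and hence TSI) for this VN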
When the source and destination Tenant 385 Systems are on the same NVE, no encapsulation is needed and the NVE 386 forwards traffic directly. 388 Conceptually, the NVE is a single entity implementing the NVO3 389 functionality. In practice, there are a number of different 390 implementation scenarios, as described in detail in Section 4. 392 3.3. Network Virtualization Authority (NVA) 394 Address dissemination refers to the process of learning, building and 395 distributing the mapping/forwarding information that NVEs need in 396 order to tunnel traffic to each other on behalf of communicating 397 Tenant Systems. For example, in order to send traffic to a remote 398 Tenant System, the sending NVE must know the destination NVE for that 399 Tenant System. 401 One way to build and maintain mapping tables is to use learning, as 402 802.1 bridges do [IEEE-802.1Q]. When forwarding traffic to multicast 403 or unknown unicast destinations, an NVE could simply flood traffic. 404 While flooding works, it can lead to traffic hot spots and can lead 405 to problems in larger networks (e.g., excessive amounts of flooded 406 traffic). 408 Alternatively, to reduce the scope of where flooding must take place, 409 or to eliminate it altogether, NVEs can make use of a Network 410 Virtualization Authority (NVA). An NVA is the entity that provides 411 address mapping and other information to NVEs. NVEs interact with an 412 NVA to obtain any required address mapping information they need in 413 order to properly forward traffic on behalf of tenants. The term NVA 414 refers to the overall system, without regard to its scope or how it 415 is implemented. NVAs provide a service, and NVEs access that service 416 via an NVE-to-NVA protocol as discussed in Section 8. 418 Even when an NVA is present, Ethernet bridge MAC address learning 419 could be used as a fallback mechanism, should the NVA be unable to 420 provide an answer or for other reasons. This document does not 421 consider flooding approaches in detail, as there are a number of 422 benefits in using an approach that depends on the presence of an NVA. 424 For the rest of this document, it is assumed that an NVA exists and 425 will be used. NVAs are discussed in more detail in Section 7. 427 3.4. VM Orchestration Systems 429 VM orchestration systems manage server virtualization across a set of 430 servers. Although VM management is a separate topic from network 431 virtualization, the two areas are closely related. Managing the 432 creation, placement, and movement of VMs also involves creating, 433 attaching to and detaching from virtual networks. A number of 434 existing VM orchestration systems have incorporated aspects of 435 virtual network management into their systems. 437 Note also that although this section uses the terms "VM" and 438 "hypervisor" throughout, the same issues apply to other 439 virtualization approaches, including Linux Containers (LXC), BSD 440 Jails, Network Service Appliances as discussed in Section 5.1, etc. 441 From an NVO3 perspective, it should be assumed that where the 442 document uses the terms "VM" and "hypervisor", the intention is that 443 the discussion also applies to other systems, where, e.g., the host 444 operating system plays the role of the hypervisor in supporting 445 virtualization, and a container plays a role equivalent to that of a VM.
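The following non-normative sketch illustrates the kind of per-VM network metadata, discussed in the remainder of this section, that an orchestration system might hold and pass down to the hypervisor and NVE. All field names and values are illustrative assumptions, not part of any orchestration system's actual interface.

   from dataclasses import dataclass, field
   from typing import List

   @dataclass
   class VnicMetadata:
       mac: str              # MAC address the VM will use on this vNIC
       ip: str               # IP address, if known to the orchestration system
       virtual_network: str  # name of the virtual network this vNIC attaches to

   @dataclass
   class VmNetworkMetadata:
       vm_name: str
       vnics: List[VnicMetadata] = field(default_factory=list)

   # Example record that might accompany a "start VM" request to a hypervisor
   # (documentation addresses only).
   vm = VmNetworkMetadata(
       vm_name="tenant-a-web-01",
       vnics=[VnicMetadata(mac="00:00:5E:00:53:01",
                           ip="192.0.2.10",
                           virtual_network="tenant-a-frontend")])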
447 When a new VM image is started, the VM orchestration system 448 determines where the VM should be placed, interacts with the 449 hypervisor on the target server to load and start the VM and controls 450 when a VM should be shutdown or migrated elsewhere. VM orchestration 451 systems also have knowledge about how a VM should connect to a 452 network, possibly including the name of the virtual network to which 453 a VM is to connect. The VM orchestration system can pass such 454 information to the hypervisor when a VM is instantiated. VM 455 orchestration systems have significant (and sometimes global) 456 knowledge over the domain they manage. They typically know on what 457 servers a VM is running, and meta data associated with VM images can 458 be useful from a network virtualization perspective. For example, 459 the meta data may include the addresses (MAC and IP) the VMs will use 460 and the name(s) of the virtual network(s) they connect to. 462 VM orchestration systems run a protocol with an agent running on the 463 hypervisor of the servers they manage. That protocol can also carry 464 information about what virtual network a VM is associated with. When 465 the orchestrator instantiates a VM on a hypervisor, the hypervisor 466 interacts with the NVE in order to attach the VM to the virtual 467 networks it has access to. In general, the hypervisor will need to 468 communicate significant VM state changes to the NVE. In the reverse 469 direction, the NVE may need to communicate network connectivity 470 information back to the hypervisor. Examples of deployed VM 471 orchestration systems include VMware's vCenter Server, Microsoft's 472 System Center Virtual Machine Manager, and systems based on OpenStack 473 and its associated plugins (e.g., Nova and Neutron). Each can pass 474 information about what virtual networks a VM connects to down to the 475 hypervisor. The protocol used between the VM orchestration system 476 and hypervisors is generally proprietary. 478 It should be noted that VM orchestration systems may not have direct 479 access to all networking related information a VM uses. For example, 480 a VM may make use of additional IP or MAC addresses that the VM 481 management system is not aware of. 483 4. Network Virtualization Edge (NVE) 485 As introduced in Section 3.2 an NVE is the entity that implements the 486 overlay functionality. This section describes NVEs in more detail. 487 An NVE will have two external interfaces: 489 Tenant System Facing: On the Tenant System facing side, an NVE 490 interacts with the hypervisor (or equivalent entity) to provide 491 the NVO3 service. An NVE will need to be notified when a Tenant 492 System "attaches" to a virtual network (so it can validate the 493 request and set up any state needed to send and receive traffic on 494 behalf of the Tenant System on that VN). Likewise, an NVE will 495 need to be informed when the Tenant System "detaches" from the 496 virtual network so that it can reclaim state and resources 497 appropriately. 499 Data Center Network Facing: On the data center network facing side, 500 an NVE interfaces with the data center underlay network, sending 501 and receiving tunneled packets to and from the underlay. The NVE 502 may also run a control protocol with other entities on the 503 network, such as the Network Virtualization Authority. 505 4.1. 
NVE Co-located With Server Hypervisor 507 When server virtualization is used, the entire NVE functionality will 508 typically be implemented as part of the hypervisor and/or virtual 509 switch on the server. In such cases, the Tenant System interacts 510 with the hypervisor and the hypervisor interacts with the NVE. 511 Because the interaction between the hypervisor and NVE is implemented 512 entirely in software on the server, there is no "on-the-wire" 513 protocol between Tenant Systems (or the hypervisor) and the NVE that 514 needs to be standardized. While there may be APIs between the NVE 515 and hypervisor to support necessary interaction, the details of such 516 an API are not in-scope for the NVO3 WG at the time of publication of 517 this memo. 519 Implementing NVE functionality entirely on a server has the 520 disadvantage that server CPU resources must be spent implementing the 521 NVO3 functionality. Experimentation with overlay approaches and 522 previous experience with TCP and checksum adapter offloads suggests 523 that offloading certain NVE operations (e.g., encapsulation and 524 decapsulation operations) onto the physical network adapter can 525 produce performance advantages. As has been done with checksum and/ 526 or TCP server offload and other optimization approaches, there may be 527 benefits to offloading common operations onto adapters where 528 possible. Just as important, the addition of an overlay header can 529 disable existing adapter offload capabilities that are generally not 530 prepared to handle the addition of a new header or other operations 531 associated with an NVE. 533 While the exact details of how to split the implementation of 534 specific NVE functionality between a server and its network adapters 535 is an implementation matter and outside the scope of IETF 536 standardization, the NVO3 architecture should be cognizant of and 537 support such separation. Ideally, it may even be possible to bypass 538 the hypervisor completely on critical data path operations so that 539 packets between a Tenant System and its VN can be sent and received 540 without having the hypervisor involved in each individual packet 541 operation. 543 4.2. Split-NVE 545 Another possible scenario leads to the need for a split NVE 546 implementation. An NVE running on a server (e.g. within a 547 hypervisor) could support NVO3 service towards the tenant, but not 548 perform all NVE functions (e.g., encapsulation) directly on the 549 server; some of the actual NVO3 functionality could be implemented on 550 (i.e., offloaded to) an adjacent switch to which the server is 551 attached. While one could imagine a number of link types between a 552 server and the NVE, one simple deployment scenario would involve a 553 server and NVE separated by a simple L2 Ethernet link. A more 554 complicated scenario would have the server and NVE separated by a 555 bridged access network, such as when the NVE resides on a top of rack 556 (ToR) switch, with an embedded switch residing between servers and 557 the ToR switch. 559 For the split NVE case, protocols will be needed that allow the 560 hypervisor and NVE to negotiate and setup the necessary state so that 561 traffic sent across the access link between a server and the NVE can 562 be associated with the correct virtual network instance. 
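As described in the next paragraph, one way to make this association is to use a per-VN VLAN tag on the access link. The following non-normative sketch shows the VLAN-to-VN mapping state that the hypervisor and NVE might each maintain once such a tag has been negotiated; the names and the allocation strategy are illustrative assumptions, not a protocol definition.

   from typing import Dict, Optional

   class AccessLinkVlanMap:
       """VLAN ID used on the server/NVE access link -> VN Context ID."""
       def __init__(self) -> None:
           self._vlan_to_vn: Dict[int, int] = {}

       def assign(self, vn_context_id: int) -> int:
           """Pick an unused C-TAG value for a newly attached VN instance."""
           for vid in range(2, 4095):        # 0, 1, and 4095 avoided for simplicity
               if vid not in self._vlan_to_vn:
                   self._vlan_to_vn[vid] = vn_context_id
                   return vid
           raise RuntimeError("no free VLAN ID on this access link")

       def vn_for_vlan(self, vid: int) -> Optional[int]:
           """Map a received C-TAG back to the VN it was negotiated for."""
           return self._vlan_to_vn.get(vid)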
563 Specifically, on the access link, traffic belonging to a specific 564 Tenant System would be tagged with a specific VLAN C-TAG that 565 identifies which specific NVO3 virtual network instance it connects 566 to. The hypervisor-NVE protocol would negotiate which VLAN C-TAG to 567 use for a particular virtual network instance. More details of the 568 protocol requirements for functionality between hypervisors and NVEs 569 can be found in [I-D.ietf-nvo3-nve-nva-cp-req]. 571 4.2.1. Tenant VLAN handling in Split-NVE Case 573 Preserving tenant VLAN tags across an NVO3 VN as described in 574 Section 3.1.1 poses additional complications in the split-NVE case. 575 The portion of the NVE that performs the encapsulation function needs 576 access to the specific VLAN tags that the Tenant System is using in 577 order to include them in the encapsulated packet. When an NVE is 578 implemented entirely within the hypervisor, the NVE has access to the 579 complete original packet (including any VLAN tags) sent by the 580 tenant. In the split-NVE case, however, the VLAN tag used between 581 the hypervisor and offloaded portions of the NVE normally only 582 identifies the specific VN that traffic belongs to. In order to 583 allow a tenant to preserve VLAN information from end to end between 584 Tenant Systems in the split-NVE case, additional mechanisms would be 585 needed (e.g., carry an additional VLAN tag by carrying both a C-Tag 586 and an S-Tag as specified in [IEEE-802.1Q] where the C-Tag identifies 587 the tenant VLAN end-to-end and the S-Tag identifies the VN locally 588 between each Tenant System and the corresponding NVE). 590 4.3. NVE State 592 NVEs maintain internal data structures and state to support the 593 sending and receiving of tenant traffic. An NVE may need some or all 594 of the following information: 596 1. An NVE keeps track of which attached Tenant Systems are connected 597 to which virtual networks. When a Tenant System attaches to a 598 virtual network, the NVE will need to create or update local 599 state for that virtual network. When the last Tenant System 600 detaches from a given VN, the NVE can reclaim state associated 601 with that VN. 603 2. For tenant unicast traffic, an NVE maintains a per-VN table of 604 mappings from Tenant System (inner) addresses to remote NVE 605 (outer) addresses. 607 3. For tenant multicast (or broadcast) traffic, an NVE maintains a 608 per-VN table of mappings and other information on how to deliver 609 tenant multicast (or broadcast) traffic. If the underlying 610 network supports IP multicast, the NVE could use IP multicast to 611 deliver tenant traffic. In such a case, the NVE would need to 612 know what IP underlay multicast address to use for a given VN. 613 Alternatively, if the underlying network does not support 614 multicast, a source NVE could use unicast replication to deliver 615 traffic. In such a case, an NVE would need to know which remote 616 NVEs are participating in the VN. An NVE could use both 617 approaches, switching from one mode to the other depending on 618 such factors as bandwidth efficiency and group membership 619 sparseness. [I-D.ietf-nvo3-mcast-framework] discusses the 620 subject of multicast handling in NVO3 in further detail. 622 4. An NVE maintains necessary information to encapsulate outgoing 623 traffic, including what type of encapsulation and what value to 624 use for a Context ID to identify the VN within the encapsulation 625 header. 627 5. 
In order to deliver incoming encapsulated packets to the correct 628 Tenant Systems, an NVE maintains the necessary information to map 629 incoming traffic to the appropriate VAP (i.e., Tenant System 630 Interface). 632 6. An NVE may find it convenient to maintain additional per-VN 633 information such as QoS settings, Path MTU information, ACLs, 634 etc. 636 4.4. Multi-Homing of NVEs 638 NVEs may be multi-homed. That is, an NVE may have more than one IP 639 address associated with it on the underlay network. Multihoming 640 happens in two different scenarios. First, an NVE may have multiple 641 interfaces connecting it to the underlay. Each of those interfaces 642 will typically have a different IP address, resulting in a specific 643 Tenant Address (on a specific VN) being reachable through the same 644 NVE but through more than one underlay IP address. Second, a 645 specific tenant system may be reachable through more than one NVE, 646 each having one or more underlay addresses. In both cases, NVE 647 address mapping functionality needs to support one-to-many mappings 648 and enable a sending NVE to (at a minimum) be able to fail over from 649 one IP address to another, e.g., should a specific NVE underlay 650 address become unreachable. 652 Finally, multi-homed NVEs introduce complexities when source unicast 653 replication is used to implement tenant multicast as described in 654 Section 4.3. Specifically, an NVE should only receive one copy of a 655 replicated packet. 657 Multi-homing is needed to support important use cases. First, a bare 658 metal server may have multiple uplink connections to either the same 659 or different NVEs. Having only a single physical path to an upstream 660 NVE, or indeed, having all traffic flow through a single NVE would be 661 considered unacceptable in highly-resilient deployment scenarios that 662 seek to avoid single points of failure. Moreover, in today's 663 networks, the availability of multiple paths would require that they 664 be usable in an active-active fashion (e.g., for load balancing). 666 4.5. Virtual Access Point (VAP) 668 The VAP is the NVE side of the interface between the NVE and the TS. 669 Traffic to and from the tenant flows through the VAP. If an NVE runs 670 into difficulties sending traffic received on the VAP, it may need to 671 signal such errors back to the VAP. Because the VAP is an emulation 672 of a physical port, its ability to signal NVE errors is limited and 673 lacks sufficient granularity to reflect all possible errors an NVE 674 may encounter (e.g., inability to reach a particular destination). Some 675 errors, such as an NVE losing all of its connections to the underlay, 676 could be reflected back to the VAP by effectively disabling it. This 677 state change would reflect itself on the TS as an interface going 678 down, allowing the TS to implement interface error handling, e.g., 679 failover, in the same manner as when a physical interface becomes 680 disabled. 682 5. Tenant System Types 684 This section describes a number of special Tenant System types and 685 how they fit into an NVO3 system. 687 5.1. Overlay-Aware Network Service Appliances 689 Some Network Service Appliances [I-D.ietf-nvo3-nve-nva-cp-req] 690 (virtual or physical) provide tenant-aware services. That is, the 691 specific service they provide depends on the identity of the tenant 692 making use of the service.
For example, firewalls are now becoming 693 available that support multi-tenancy where a single firewall provides 694 virtual firewall service on a per-tenant basis, using per-tenant 695 configuration rules and maintaining per-tenant state. Such 696 appliances will be aware of the VN an activity corresponds to while 697 processing requests. Unlike server virtualization, which shields VMs 698 from needing to know about multi-tenancy, a Network Service Appliance 699 may explicitly support multi-tenancy. In such cases, the Network 700 Service Appliance itself will be aware of network virtualization and 701 either embed an NVE directly, or implement a split NVE as described 702 in Section 4.2. Unlike server virtualization, however, the Network 703 Service Appliance may not be running a hypervisor and the VM 704 orchestration system may not interact with the Network Service 705 Appliance. The NVE on such appliances will need to support a control 706 plane to obtain the necessary information needed to fully participate 707 in an NV Domain. 709 5.2. Bare Metal Servers 711 Many data centers will continue to have at least some servers 712 operating as non-virtualized (or "bare metal") machines running a 713 traditional operating system and workload. In such systems, there 714 will be no NVE functionality on the server, and the server will have 715 no knowledge of NVO3 (including whether overlays are even in use). 716 In such environments, the NVE functionality can reside on the first- 717 hop physical switch. In such a case, the network administrator would 718 (manually) configure the switch to enable the appropriate NVO3 719 functionality on the switch port connecting the server and associate 720 that port with a specific virtual network. Such configuration would 721 typically be static, since the server is not virtualized, and once 722 configured, is unlikely to change frequently. Consequently, this 723 scenario does not require any protocol or standards work. 725 5.3. Gateways 727 Gateways on VNs relay traffic onto and off of a virtual network. 728 Tenant Systems use gateways to reach destinations outside of the 729 local VN. Gateways receive encapsulated traffic from one VN, remove 730 the encapsulation header, and send the native packet out onto the 731 data center network for delivery. Outside traffic enters a VN in a 732 reverse manner. 734 Gateways can be either virtual (i.e., implemented as a VM) or 735 physical (i.e., as a standalone physical device). For performance 736 reasons, standalone hardware gateways may be desirable in some cases. 737 Such gateways could consist of a simple switch forwarding traffic 738 from a VN onto the local data center network, or could embed router 739 functionality. On such gateways, network interfaces connecting to 740 virtual networks will (at least conceptually) embed NVE (or split- 741 NVE) functionality within them. As in the case with Network Service 742 Appliances, gateways may not support a hypervisor and will need an 743 appropriate control plane protocol to obtain the information needed 744 to provide NVO3 service. 746 Gateways handle several different use cases. For example, one use 747 case consists of systems supporting overlays together with systems 748 that do not (e.g., bare metal servers). Gateways could be used to 749 connect legacy systems supporting, e.g., L2 VLANs, to specific 750 virtual networks, effectively making them part of the same virtual 751 network. 
Gateways could also forward traffic between a virtual 752 network and other hosts on the data center network or relay traffic 753 between different VNs. Finally, gateways can provide external 754 connectivity such as Internet or VPN access. 756 5.3.1. Gateway Taxonomy 758 As can be seen from the discussion above, there are several types of 759 gateways that can exist in an NVO3 environment. This section breaks 760 them down into the various types that could be supported. Note that 761 each of the types below could be implemented in either a centralized 762 manner or distributed to co-exist with the NVEs. 764 5.3.1.1. L2 Gateways (Bridging) 766 L2 Gateways act as layer 2 bridges to forward Ethernet frames based 767 on the MAC addresses present in them. 769 L2 VN to Legacy L2: This type of gateway bridges traffic between L2 770 VNs and other legacy L2 networks such as VLANs or L2 VPNs. 772 L2 VN to L2 VN: The main motivation for this type of gateway is to 773 create separate groups of Tenant Systems using L2 VNs such that 774 the gateway can enforce network policies between each L2 VN. 776 5.3.1.2. L3 Gateways (Only IP Packets) 778 L3 Gateways forward IP packets based on the IP addresses present in 779 the packets. 781 L3 VN to Legacy L2: This type of gateway forwards packets between L3 782 VNs and legacy L2 networks such as VLANs or L2 VPNs. The 783 original sender's destination MAC address in any frames that 784 the gateway forwards from a legacy L2 network would be the MAC 785 address of the gateway. 787 L3 VN to Legacy L3: This type of gateway forwards packets between L3 788 VNs and legacy L3 networks. These legacy L3 networks could be 789 local to the data center, in the WAN, or an L3 VPN. 791 L3 VN to L2 VN: This type of gateway forwards packets between L3 792 VNs and L2 VNs. The original sender's destination MAC address 793 in any frames that the gateway forwards from an L2 VN would be 794 the MAC address of the gateway. 796 L2 VN to L2 VN: This type of gateway acts similarly to a traditional 797 router that forwards between L2 interfaces. The original 798 sender's destination MAC address in any frames that the gateway 799 forwards from any of the L2 VNs would be the MAC address of the 800 gateway. 802 L3 VN to L3 VN: The main motivation for this type of gateway is to 803 create separate groups of Tenant Systems using L3 VNs such that 804 the gateway can enforce network policies between each L3 VN. 806 5.4. Distributed Inter-VN Gateways 808 The relaying of traffic from one VN to another deserves special 809 consideration. Whether traffic is permitted to flow from one VN to 810 another is a matter of policy, and would not (by default) be allowed 811 unless explicitly enabled. In addition, NVAs are the logical place 812 to maintain policy information about allowed inter-VN communication. 813 Policy enforcement for inter-VN communication can be handled in (at 814 least) two different ways. Explicit gateways could be the central 815 point for such enforcement, with all inter-VN traffic forwarded to 816 such gateways for processing. Alternatively, the NVA can provide 817 such information directly to NVEs, by either providing a mapping for 818 a target Tenant System (TS) on another VN, or indicating that such 819 communication is disallowed by policy. 821 When inter-VN gateways are centralized, traffic between TSs on 822 different VNs can take suboptimal paths, i.e., triangular routing 823 results in paths that always traverse the gateway.
In the worst 824 case, traffic between two TSs connected to the same NVE can be hair- 825 pinned through an external gateway. As an optimization, individual 826 NVEs can be part of a distributed gateway that performs such 827 relaying, reducing or completely eliminating triangular routing. In 828 a distributed gateway, each ingress NVE can perform such relaying 829 activity directly, so long as it has access to the policy information 830 needed to determine whether cross-VN communication is allowed. 831 Having individual NVEs be part of a distributed gateway allows them 832 to tunnel traffic directly to the destination NVE without the need to 833 take suboptimal paths. 835 The NVO3 architecture supports distributed gateways for the case of 836 inter-VN communication. Such support requires that NVO3 control 837 protocols include mechanisms for the maintenance and distribution of 838 policy information about what type of cross-VN communication is 839 allowed so that NVEs acting as distributed gateways can tunnel 840 traffic from one VN to another as appropriate. 842 Distributed gateways could also be used to distribute other 843 traditional router services to individual NVEs. The NVO3 844 architecture does not preclude such implementations, but does not 845 define or require them as they are outside the scope of the NVO3 846 architecture. 848 5.5. ARP and Neighbor Discovery 850 For an L2 service, strictly speaking, special processing of Address 851 Resolution Protocol (ARP) [RFC0826] (and IPv6 Neighbor Discovery (ND) 852 [RFC4861]) is not required. ARP requests are broadcast, and an NVO3 853 can deliver ARP requests to all members of a given L2 virtual 854 network, just as it does for any packet sent to an L2 broadcast 855 address. Similarly, ND requests are sent via IP multicast, which 856 NVO3 can support by delivering via L2 multicast. However, as a 857 performance optimization, an NVE can intercept ARP (or ND) requests 858 from its attached TSs and respond to them directly using information 859 in its mapping tables. Since an NVE will have mechanisms for 860 determining the NVE address associated with a given TS, the NVE can 861 leverage the same mechanisms to suppress sending ARP and ND requests 862 for a given TS to other members of the VN. The NVO3 architecture 863 supports such a capability. 865 6. NVE-NVE Interaction 867 Individual NVEs will interact with each other for the purposes of 868 tunneling and delivering traffic to remote TSs. At a minimum, a 869 control protocol may be needed for tunnel setup and maintenance. For 870 example, tunneled traffic may need to be encrypted or integrity 871 protected, in which case it will be necessary to set up appropriate 872 security associations between NVE peers. It may also be desirable to 873 perform tunnel maintenance (e.g., continuity checks) on a tunnel in 874 order to detect when a remote NVE becomes unreachable. Such generic 875 tunnel setup and maintenance functions are not generally 876 NVO3-specific. Hence, the NVO3 architecture expects to leverage 877 existing tunnel maintenance protocols rather than defining new ones. 879 Some NVE-NVE interactions may be specific to NVO3 (and in particular 880 be related to information kept in mapping tables) and agnostic to the 881 specific tunnel type being used. For example, when tunneling traffic 882 for TS-X to a remote NVE, it is possible that TS-X is not presently 883 associated with the remote NVE. 
Normally, this should not happen, 884 but there could be race conditions where the information an NVE has 885 learned from the NVA is out-of-date relative to actual conditions. 886 In such cases, the remote NVE could return an error or warning 887 indication, allowing the sending NVE to attempt a recovery or 888 otherwise attempt to mitigate the situation. 890 The NVE-NVE interaction could signal a range of indications, for 891 example: 893 o "No such TS here", upon receipt of a tunneled packet for an 894 unknown TS. 896 o "TS-X not here, try the following NVE instead" (i.e., a redirect). 898 o Delivered to the correct NVE, but could not deliver the packet to TS-X. 900 When an NVE receives information from a remote NVE that conflicts 901 with the information it has in its own mapping tables, it should 902 consult with the NVA to resolve those conflicts. In particular, it 903 should confirm that the information it has is up-to-date, and it 904 might indicate the error to the NVA, so as to nudge the NVA into 905 following up (as appropriate). While it might make sense for an NVE 906 to update its mapping table temporarily in response to an error from 907 a remote NVE, any changes must be handled carefully as doing so can 908 raise security considerations if the received information cannot be 909 authenticated. That said, a sending NVE might still take steps to 910 mitigate a problem, such as applying rate limiting to data traffic 911 towards a particular NVE or TS. 913 7. Network Virtualization Authority 915 Before sending traffic to and receiving traffic from a virtual network, an 916 NVE must obtain the information needed to build its internal 917 forwarding tables and state as listed in Section 4.3. An NVE can 918 obtain such information from a Network Virtualization Authority. 920 The Network Virtualization Authority (NVA) is the entity that is 921 expected to provide address mapping and other information to NVEs. 922 NVEs can interact with an NVA to obtain any required information they 923 need in order to properly forward traffic on behalf of tenants. The 924 term NVA refers to the overall system, without regard to its scope 925 or how it is implemented. 927 7.1. How an NVA Obtains Information 929 There are two primary ways in which an NVA can obtain the address 930 dissemination information it manages. The NVA can obtain information 931 from the VM orchestration system and/or directly from the 932 NVEs themselves. 934 On virtualized systems, the NVA may be able to obtain the address 935 mapping information associated with VMs from the VM orchestration 936 system itself. If the VM orchestration system contains a master 937 database for all the virtualization information, having the NVA 938 obtain information directly from the orchestration system would be a 939 natural approach. Indeed, the NVA could effectively be co-located 940 with the VM orchestration system itself. In such systems, the VM 941 orchestration system communicates with the NVE indirectly through the 942 hypervisor. 944 However, as described in Section 4, not all NVEs are associated with 945 hypervisors. In such cases, NVAs cannot leverage VM orchestration 946 protocols to interact with an NVE and will instead need to peer 947 directly with them. By peering directly with an NVE, NVAs can obtain 948 information about the TSs connected to that NVE and can distribute 949 information to the NVE about the VNs those TSs are associated with.
950 For example, whenever a Tenant System attaches to an NVE, that NVE 951 would notify the NVA that the TS is now associated with that NVE. 952 Likewise when a TS detaches from an NVE, that NVE would inform the 953 NVA. By communicating directly with NVEs, both the NVA and the NVE 954 are able to maintain up-to-date information about all active tenants 955 and the NVEs to which they are attached. 957 7.2. Internal NVA Architecture 959 For reliability and fault tolerance reasons, an NVA would be 960 implemented in a distributed or replicated manner without single 961 points of failure. How the NVA is implemented, however, is not 962 important to an NVE so long as the NVA provides a consistent and 963 well-defined interface to the NVE. For example, an NVA could be 964 implemented via database techniques whereby a server stores address 965 mapping information in a traditional (possibly replicated) database. 966 Alternatively, an NVA could be implemented in a distributed fashion 967 using an existing (or modified) routing protocol to maintain and 968 distribute mappings. So long as there is a clear interface between 969 the NVE and NVA, how an NVA is architected and implemented is not 970 important to an NVE. 972 A number of architectural approaches could be used to implement NVAs 973 themselves. NVAs manage address bindings and distribute them to 974 where they need to go. One approach would be to use Border Gateway 975 Protocol (BGP) [RFC4364] (possibly with extensions) and route 976 reflectors. Another approach could use a transaction-based database 977 model with replicated servers. Because the implementation details 978 are local to an NVA, there is no need to pick exactly one solution 979 technology, so long as the external interfaces to the NVEs (and 980 remote NVAs) are sufficiently well defined to achieve 981 interoperability. 983 7.3. NVA External Interface 985 Conceptually, from the perspective of an NVE, an NVA is a single 986 entity. An NVE interacts with the NVA, and it is the NVA's 987 responsibility for ensuring that interactions between the NVE and NVA 988 result in consistent behavior across the NVA and all other NVEs using 989 the same NVA. Because an NVA is built from multiple internal 990 components, an NVA will have to ensure that information flows to all 991 internal NVA components appropriately. 993 One architectural question is how the NVA presents itself to the NVE. 994 For example, an NVA could be required to provide access via a single 995 IP address. If NVEs only have one IP address to interact with, it 996 would be the responsibility of the NVA to handle NVA component 997 failures, e.g., by using a "floating IP address" that migrates among 998 NVA components to ensure that the NVA can always be reached via the 999 one address. Having all NVA accesses through a single IP address, 1000 however, adds constraints to implementing robust failover, load 1001 balancing, etc. 1003 In the NVO3 architecture, an NVA is accessed through one or more IP 1004 addresses (or IP address/port combination). If multiple IP addresses 1005 are used, each IP address provides equivalent functionality, meaning 1006 that an NVE can use any of the provided addresses to interact with 1007 the NVA. Should one address stop working, an NVE is expected to 1008 failover to another. While the different addresses result in 1009 equivalent functionality, one address may respond more quickly than 1010 another, e.g., due to network conditions, load on the server, etc. 
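The following non-normative sketch shows one way an NVE might try its configured NVA addresses in turn and fail over when one does not respond. The transport, port number, and message format are purely illustrative assumptions.

   import socket
   from typing import Iterable, Optional, Tuple

   def query_nva(addresses: Iterable[Tuple[str, int]], request: bytes,
                 timeout: float = 2.0) -> Optional[bytes]:
       """Try each configured NVA address in turn; return the first reply."""
       for host, port in addresses:
           try:
               with socket.create_connection((host, port), timeout=timeout) as s:
                   s.sendall(request)
                   return s.recv(65535)   # simplistic single-message exchange
           except OSError:
               continue                   # this address failed; fail over
       return None                        # no configured NVA address answered

   # Example: a primary and a backup NVA address (documentation prefixes and
   # an arbitrary, hypothetical port).
   # reply = query_nva([("192.0.2.1", 65432), ("198.51.100.1", 65432)], b"...")

The priority scheme described in the next paragraph could be layered on top of such a scheme by ordering the address list accordingly.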
1012 To provide some control over load balancing, NVA addresses may have 1013 an associated priority. Addresses are used in order of priority, 1014 with no explicit preference among NVA addresses having the same 1015 priority. To provide basic load-balancing among NVAs of equal 1016 priorities, NVEs could use some randomization input to select among 1017 equal-priority NVAs. Such a priority scheme facilitates failover and 1018 load balancing, for example, allowing a network operator to specify a 1019 set of primary and backup NVAs. 1021 It may be desirable to have individual NVA addresses responsible for 1022 a subset of information about an NV Domain. In such a case, NVEs 1023 would use different NVA addresses for obtaining or updating 1024 information about particular VNs or TS bindings. A key question with 1025 such an approach is how information would be partitioned, and how an 1026 NVE could determine which address to use to get the information it 1027 needs. 1029 Another possibility is to treat the information on which NVA 1030 addresses to use as cached (soft-state) information at the NVEs, so 1031 that any NVA address can be used to obtain any information, but NVEs 1032 are informed of preferences for which addresses to use for particular 1033 information on VNs or TS bindings. That preference information would 1034 be cached for future use to improve behavior - e.g., if all requests 1035 for a specific subset of VNs are forwarded to a specific NVA 1036 component, the NVE can optimize future requests within that subset by 1037 sending them directly to that NVA component via its address. 1039 8. NVE-to-NVA Protocol 1041 As outlined in Section 4.3, an NVE needs certain information in order 1042 to perform its functions. To obtain such information from an NVA, an 1043 NVE-to-NVA protocol is needed. The NVE-to-NVA protocol provides two 1044 functions. First it allows an NVE to obtain information about the 1045 location and status of other TSs with which it needs to communicate. 1046 Second, the NVE-to-NVA protocol provides a way for NVEs to provide 1047 updates to the NVA about the TSs attached to that NVE (e.g., when a 1048 TS attaches or detaches from the NVE), or about communication errors 1049 encountered when sending traffic to remote NVEs. For example, an NVE 1050 could indicate that a destination it is trying to reach at a 1051 destination NVE is unreachable for some reason. 1053 While having a direct NVE-to-NVA protocol might seem straightforward, 1054 the existence of existing VM orchestration systems complicates the 1055 choices an NVE has for interacting with the NVA. 1057 8.1. NVE-NVA Interaction Models 1059 An NVE interacts with an NVA in at least two (quite different) ways: 1061 o NVEs embedded within the same server as the hypervisor can obtain 1062 necessary information entirely through the hypervisor-facing side 1063 of the NVE. Such an approach is a natural extension to existing 1064 VM orchestration systems supporting server virtualization because 1065 an existing protocol between the hypervisor and VM orchestration 1066 system already exists and can be leveraged to obtain any needed 1067 information. Specifically, VM orchestration systems used to 1068 create, terminate and migrate VMs already use well-defined (though 1069 typically proprietary) protocols to handle the interactions 1070 between the hypervisor and VM orchestration system. 
For such 1071 systems, it is a natural extension to leverage the existing 1072 orchestration protocol as a sort of proxy protocol for handling 1073 the interactions between an NVE and the NVA. Indeed, existing 1074 implementations can already do this.

1076 o Alternatively, an NVE can obtain needed information by interacting 1077 directly with an NVA via a protocol operating over the data center 1078 underlay network. Such an approach is needed to support NVEs that 1079 are not associated with systems performing server virtualization 1080 (e.g., as in the case of a standalone gateway) or where the NVE 1081 needs to communicate directly with the NVA for other reasons.

1083 The NVO3 architecture will focus on support for the second model 1084 above. Existing virtualization environments already use the 1085 first model, but that model is not sufficient to cover the case of 1086 standalone gateways; such gateways may not support virtualization 1087 and do not interface with existing VM orchestration systems.

1089 8.2. Direct NVE-NVA Protocol

1091 An NVE can interact directly with an NVA via an NVE-to-NVA protocol. 1092 Such a protocol can be either independent of the NVA's internal 1093 protocol or an extension of it. Using a purpose-specific protocol 1094 would provide architectural separation and independence between the 1095 NVE and NVA. The NVE and NVA interact in a well-defined way, and 1096 changes in the NVA (or NVE) need not impact the other. Using 1097 a dedicated protocol also ensures that both NVE and NVA 1098 implementations can evolve independently and without dependencies on 1099 each other. Such independence is important because the upgrade path 1100 for NVEs and NVAs is quite different. Upgrading all the NVEs at a 1101 site will likely be more difficult in practice than upgrading NVAs 1102 because of their large number (one on each end device). In practice, 1103 it would be prudent to assume that once an NVE has been implemented 1104 and deployed, it may be challenging to get subsequent NVE extensions 1105 and changes implemented and deployed, whereas an NVA (and its 1106 associated internal protocols) is more likely to evolve over time as 1107 experience is gained from usage, and upgrades will involve fewer 1108 nodes.

1110 Requirements for a direct NVE-NVA protocol can be found in 1111 [I-D.ietf-nvo3-nve-nva-cp-req].

1113 8.3. Propagating Information Between NVEs and NVAs

1115 Information flows between NVEs and NVAs in both directions. The NVA 1116 maintains information about all VNs in the NV Domain, so that NVEs do 1117 not need to do so themselves. NVEs obtain from the NVA information 1118 about where a given remote TS destination resides. NVAs, in turn, 1119 obtain information from NVEs about the individual TSs attached to 1120 those NVEs.

1122 While the NVA could push information relevant to every virtual 1123 network to every NVE, such an approach scales poorly and is 1124 unnecessary. In practice, a given NVE will only need and want to 1125 know about VNs to which it is attached. Thus, an NVE should be able 1126 to subscribe to updates only for the virtual networks in which it is 1127 interested. The NVO3 architecture supports 1128 a model where an NVE is not required to have full mapping tables for 1129 all virtual networks in an NV Domain.

1131 Before sending unicast traffic to a remote TS (or TSs for broadcast 1132 or multicast traffic), an NVE must know where the remote TS(s) 1133 currently reside.
When a TS attaches to a virtual network, the NVE 1134 obtains information about that VN from the NVA. The NVA can provide 1135 that information to the NVE at the time the TS attaches to the VN, 1136 either because the NVE requests the information when the attach 1137 operation occurs, or because the VM orchestration system has 1138 initiated the attach operation and provides associated mapping 1139 information to the NVE at the same time.

1141 There are scenarios where an NVE may wish to query the NVA about 1142 individual mappings within a VN. For example, when sending traffic 1143 to a remote TS on a remote NVE, that TS may become unavailable (e.g., 1144 because it has migrated elsewhere or has been shut down, in which case 1145 the remote NVE may return an error indication). In such situations, 1146 the NVE may need to query the NVA to obtain updated mapping 1147 information for a specific TS, or to verify that the information is 1148 still correct despite the error condition. Note that such a query 1149 could also be used by the NVA as an indication that there may be an 1150 inconsistency in the network and that it should take steps to verify 1151 that the information it has about the current state and location of a 1152 specific TS is still correct.

1154 For very large virtual networks, the amount of state an NVE needs to 1155 maintain for a given virtual network could be significant. Moreover, 1156 an NVE may only be communicating with a small subset of the TSs on 1157 such a virtual network. In such cases, the NVE may find it desirable 1158 to maintain state only for those destinations with which it is actively 1159 communicating, rather than 1160 maintaining full mapping information about all destinations on the VN. 1161 Should it then need to communicate with a destination for which it 1162 does not have mapping information, however, it will need to be able 1163 to query the NVA on demand for the missing information on a per- 1164 destination basis.

1166 The NVO3 architecture will need to support a range of operations 1167 between the NVE and NVA. Requirements for those operations can be 1168 found in [I-D.ietf-nvo3-nve-nva-cp-req].

1170 9. Federated NVAs

1172 An NVA provides service to the set of NVEs in its NV Domain. Each 1173 NVA manages network virtualization information for the virtual 1174 networks within its NV Domain. An NV Domain is administered by a 1175 single entity.

1177 In some cases, it will be necessary to expand the scope of a specific 1178 VN or even an entire NV Domain beyond a single NVA. For example, 1179 an administrator managing multiple data centers may wish to 1180 operate all of those data centers as a single NV Region. Such cases 1181 are handled by having different NVAs peer with each other to exchange 1182 mapping information about specific VNs. NVAs operate in a federated 1183 manner, with a set of NVAs operating as a loosely coupled federation 1184 of individual NVAs. If a virtual network spans multiple NVAs (e.g., 1185 located at different data centers), and an NVE needs to deliver 1186 tenant traffic to an NVE that is part of a different NV Domain, it 1187 still interacts only with its NVA, even when obtaining mappings for 1188 NVEs associated with a different NV Domain.

1190 Figure 3 shows a scenario where two separate NV Domains (A and B) 1191 share information about Virtual Network "1217". VM1 and VM2 both 1192 connect to the same Virtual Network 1217, even though the two VMs are 1193 in separate NV Domains.
There are two cases to consider. In the 1194 first case, NV Domain B does not allow NVE-A to tunnel traffic 1195 directly to NVE-B. There could be a number of reasons for this. For 1196 example, NV Domains A and B may not share a common address space 1197 (i.e., traversal through a NAT device is required), or for policy 1198 reasons, a domain might require that all traffic between separate NV 1199 Domains be funneled through a particular device (e.g., a firewall). 1200 In such cases, NVA-2 will advertise to NVA-1 that VM2 on Virtual 1201 Network 1217 is reachable, and direct that traffic between the two 1202 VMs go through IP-G. IP-G would then decapsulate received traffic 1203 from one NV Domain, translate it appropriately for the other domain, 1204 and re-encapsulate the packet for delivery.

1206 xxxxxx xxxx +-----+ 1207 +-----+ xxxxxx xxxxxx xxxxxx xxxxx | VM2 | 1208 | VM1 | xx xx xxx xx |-----| 1209 |-----| xx x xx x |NVE-B| 1210 |NVE-A| x x +----+ x x +-----+ 1211 +--+--+ x NV Domain A x |IP-G|--x x | 1212 +-------x xx--+ | x xx | 1213 x x +----+ x NV Domain B x | 1214 +---x xx xx x---+ 1215 | xxxx xx +->xx xx 1216 | xxxxxxxx | xx xx 1217 +---+-+ | xx xx 1218 |NVA-1| +--+--+ xx xxx 1219 +-----+ |NVA-2| xxxx xxxx 1220 +-----+ xxxxx

1222 Figure 3: VM1 and VM2 are in different NV Domains.

1224 NVAs at one site share information and interact with NVAs at other 1225 sites, but only in a controlled manner. It is expected that policy 1226 and access control will be applied at the boundaries between 1227 different sites (and NVAs) so as to minimize dependencies on external 1228 NVAs that could negatively impact the operation within a site. It is 1229 an architectural principle that operations involving NVAs at one site 1230 not be immediately impacted by failures or errors at another site. 1231 (Of course, communication between NVEs in different NV Domains may be 1232 impacted by such failures or errors.) It is a strong requirement 1233 that an NVA continue to operate properly for local NVEs even if 1234 external communication is interrupted (e.g., should communication 1235 between a local and remote NVA fail).

1237 At a high level, a federation of interconnected NVAs has some 1238 analogies to BGP and Autonomous Systems. Like an Autonomous System, 1239 NVAs at one site are managed by a single administrative entity and do 1240 not interact with external NVAs except as allowed by policy. 1241 Likewise, the interface between NVAs at different sites is well 1242 defined, so that the internal details of operations at one site are 1243 largely hidden from other sites. Finally, an NVA peers only with other 1244 NVAs with which it has a trusted relationship, i.e., where a VN is 1245 intended to span multiple NVAs.

1247 Reasons for using a federated model include:

1249 o Provide isolation among NVAs operating at different sites in 1250 different geographic locations.

1252 o Control the quantity and rate of information updates that flow 1253 (and must be processed) between different NVAs in different data 1254 centers.

1256 o Control the set of external NVAs (and external sites) a site peers 1257 with. A site will peer only with other sites that are cooperating 1258 in providing an overlay service.

1260 o Allow policy to be applied between sites. A site will want to 1261 carefully control what information it exports (and to whom) as 1262 well as what information it is willing to import (and from whom), as 1263 illustrated in the sketch following this list.

1264 o Allow different protocols and architectures to be used for intra- 1265 vs. inter-NVA communication. For example, within a single data 1266 center, a replicated transaction server using database techniques 1267 might be an attractive implementation option for an NVA, and 1268 protocols optimized for intra-NVA communication would likely be 1269 different from protocols involving inter-NVA communication between 1270 different sites.

1272 o Allow for optimized protocols, rather than using a one-size-fits-all 1273 approach. Within a data center, networks tend to have lower 1274 latency, higher speed, and higher redundancy when compared with WAN 1275 links interconnecting data centers. The design constraints and 1276 tradeoffs for a protocol operating within a data center network 1277 are different from those for a protocol operating over WAN links. While a single 1278 protocol could be used for both cases, there could be advantages 1279 to using different and more specialized protocols for the intra- 1280 and inter-NVA cases.
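As a purely illustrative sketch of the policy point above, the following shows one way an NVA implementation might filter the per-VN mapping information it exports to, and imports from, a federated peer. The function names, the dictionary-based data model, and the per-peer policy tables are hypothetical; the architecture only requires that such controls exist at site boundaries, not how they are implemented.

   def filter_exports(local_mappings, export_policy, peer):
       """Return only the mappings for VNs this site agrees to share
       with the given peer NVA."""
       allowed_vns = export_policy.get(peer, set())
       return {vn: entries for vn, entries in local_mappings.items()
               if vn in allowed_vns}

   def filter_imports(advertised_mappings, import_policy, peer):
       """Accept only advertisements for VNs this site is willing to
       learn from the given peer NVA."""
       allowed_vns = import_policy.get(peer, set())
       return {vn: entries for vn, entries in advertised_mappings.items()
               if vn in allowed_vns}

   # Example: NVA-1 shares only Virtual Network 1217 with NVA-2.
   # filter_exports(all_local_mappings, {"NVA-2": {1217}}, "NVA-2")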
1282 9.1. Inter-NVA Peering

1284 To support peering between different NVAs, an inter-NVA protocol is 1285 needed. The inter-NVA protocol defines what information is exchanged 1286 between NVAs. It is assumed that the protocol will be used to share 1287 addressing information between data centers and must scale well over 1288 WAN links.

1290 10. Control Protocol Work Areas

1292 The NVO3 architecture consists of two major distinct entities: NVEs 1293 and NVAs. In order to provide isolation and independence between 1294 these two entities, the NVO3 architecture calls for well-defined 1295 protocols for interfacing between them. For an individual NVA, the 1296 architecture calls for a logically centralized entity that could be 1297 implemented in a distributed or replicated fashion. While the IETF 1298 may choose to define one or more specific architectural approaches to 1299 building individual NVAs, there is little need for it to pick exactly 1300 one approach to the exclusion of others. An NVA for a single domain 1301 will likely be deployed as a single vendor's product, and thus there is 1302 little benefit in standardizing the internal structure of an NVA.

1304 Individual NVAs peer with each other in a federated manner. The NVO3 1305 architecture calls for a well-defined interface between NVAs.

1307 Finally, a hypervisor-to-NVE protocol is needed to cover the split- 1308 NVE scenario described in Section 4.2.

1310 11. NVO3 Data Plane Encapsulation

1312 When tunneling tenant traffic, NVEs add an encapsulation header to the 1313 original tenant packet. The exact encapsulation used for NVO3 is 1314 not critical to this architecture. The main requirement is that the 1315 encapsulation support a Context ID of sufficient size. A number of 1316 encapsulations already exist that provide a VN Context of sufficient 1317 size for NVO3. For example, VXLAN [RFC7348] has a 24-bit VXLAN 1318 Network Identifier (VNI). NVGRE [RFC7637] has a 24-bit Virtual 1319 Subnet ID (VSID). MPLS-over-GRE provides a 20-bit label field. 1320 While there is widespread recognition that a 12-bit VN Context would 1321 be too small (only 4096 distinct values), it is generally agreed that 1322 20 bits (1 million distinct values) and 24 bits (16.8 million 1323 distinct values) are sufficient for a wide variety of deployment 1324 scenarios.
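As a concrete illustration of these identifier sizes, the following sketch packs a 24-bit VN Context value into a VXLAN-style 8-octet header (flags, reserved bits, 24-bit VNI); see RFC 7348 for the authoritative encapsulation format. The sketch is illustrative only and is not an implementation of any particular NVO3 encapsulation.

   import struct

   VNI_FLAG = 0x08   # "VNI present" (I) flag bit in the first octet

   def pack_vxlan_style_header(vn_context):
       """Build an 8-octet VXLAN-style header carrying a 24-bit VNI."""
       if not 0 <= vn_context < 2**24:      # 2**24 = 16,777,216 values
           raise ValueError("VN Context must fit in 24 bits")
       # flags (1 octet) + reserved (3) + VNI (3) + reserved (1)
       return struct.pack("!B3s3sB", VNI_FLAG, b"\x00" * 3,
                          vn_context.to_bytes(3, "big"), 0)

   def unpack_vn_context(header):
       """Extract the 24-bit VNI from octets 4-6 of the header."""
       return int.from_bytes(header[4:7], "big")

   assert unpack_vn_context(pack_vxlan_style_header(0x123456)) == 0x123456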
1326 12. Operations, Administration and Maintenance (OAM)

1328 The simplicity of operating and debugging overlay networks will be 1329 critical for successful deployment.

1331 Overlay networks are based on tunnels between NVEs, so the OAM 1332 (Operations, Administration and Maintenance) [RFC6291] framework for 1333 overlay networks can draw from prior IETF OAM work for tunnel-based 1334 networks, specifically L2VPN OAM [RFC6136]. RFC 6136 focuses on 1335 Fault Management and Performance Management as fundamental to L2VPN 1336 service delivery, leaving the Configuration Management, Accounting 1337 Management, and Security Management components of the OSI "FCAPS" 1338 taxonomy [M.3400] for further study. This section does likewise for 1339 NVO3 OAM, but those three areas continue to be important parts of 1340 complete OAM functionality for NVO3.

1342 The relationship between the overlay and underlay networks is a 1343 consideration for fault and performance management: a fault in the 1344 underlay may manifest as fault and/or performance issues in the 1345 overlay. Diagnosing and fixing such issues are complicated by NVO3 1346 abstracting the underlay network away from the overlay network (e.g., 1347 intermediate nodes on the underlay network path between NVEs are 1348 hidden from overlay VNs).

1350 NVO3-specific OAM techniques, protocol constructs, and tools are 1351 needed to provide visibility beyond this abstraction to diagnose and 1352 correct problems that appear in the overlay. Two examples are 1353 underlay-aware traceroute 1354 [I-D.nordmark-nvo3-transcending-traceroute] and ping protocol 1355 constructs for overlay networks [I-D.jain-nvo3-vxlan-ping] 1356 [I-D.kumar-nvo3-overlay-ping].

1358 NVO3-specific tools and techniques are best viewed as complements to 1359 (i.e., not as replacements for) single-network tools that apply to 1360 the overlay and/or underlay networks. Coordination among the 1361 individual network tools (for the overlay and underlay networks) and 1362 NVO3-aware dual-network tools is required to achieve effective 1363 monitoring and fault diagnosis. For example, the defect detection 1364 intervals and performance measurement intervals ought to be 1365 coordinated among all tools involved in order to provide consistency 1366 and comparability of results.

1368 For further discussion of NVO3 OAM requirements, see 1369 [I-D.ashwood-nvo3-oam-requirements].

1371 13. Summary

1373 This document presents the overall architecture for Network 1374 Virtualization Overlays (NVO3). The architecture calls for three 1375 main areas of protocol work:

1377 1. A hypervisor-to-NVE protocol to support the split-NVE scenario 1378 described in Section 4.2.

1380 2. An NVE-to-NVA protocol for disseminating VN information (e.g., 1381 inner-to-outer address mappings).

1383 3. An NVA-to-NVA protocol for exchange of information about specific 1384 virtual networks between federated NVAs.

1386 It should be noted that existing protocols or extensions of existing 1387 protocols are applicable.

1389 14. Acknowledgments

1391 Helpful comments and improvements to this document have come from 1392 Alia Atlas, Abdussalam Baryun, Spencer Dawkins, Linda Dunbar, Stephen 1393 Farrell, Anton Ivanov, Lizhong Jin, Suresh Krishnan, Mirja Kuehlewind, 1394 Greg Mirsky, Carlos Pignataro, Dennis (Xiaohong) Qin, Erik Smith, 1395 Takeshi Takahashi, Ziye Yang and Lucy Yong.

1397 15. IANA Considerations

1399 This memo includes no request to IANA.

1401 16. Security Considerations

1403 The data plane and control plane described in this architecture will 1404 need to address potential security threats.
1406 For the data plane, tunneled application traffic may need protection 1407 against misdelivery, modification, or exposure of its contents 1408 to an inappropriate third party. In all cases, encryption between 1409 authenticated tunnel endpoints (e.g., via use of IPsec [RFC4301]) and 1410 enforcement of policies that control which endpoints and VNs are permitted 1411 to exchange traffic can be used to mitigate risks.

1413 For the control plane, a combination of authentication and encryption 1414 can be used between NVAs, between the NVA and NVE, as well as 1415 between the different components of the split-NVE approach. All entities will need 1416 to authenticate properly with each other and enable encryption for 1417 their interactions as appropriate to protect sensitive information.

1419 Leakage of sensitive information about users or other entities 1420 associated with VMs whose traffic is virtualized can also be mitigated 1421 by using encryption for the control plane protocols and enforcing 1422 policies that control which NVO3 components are permitted to exchange 1423 control plane traffic.

1425 Control plane elements such as NVEs and NVAs need to collect 1426 performance and other data in order to carry out their functions. 1427 This data can sometimes be unexpectedly sensitive, for example, 1428 allowing non-obvious inferences as to activity within a VM. This 1429 provides a reason to minimize the data collected in some environments 1430 in order to limit potential exposure of sensitive information. As 1431 noted briefly in RFC 6973 [RFC6973] and RFC 7258 [RFC7258], there is 1432 an inevitable tension between privacy sensitivity and network 1433 operations that needs to be taken into account in NVO3 protocol 1434 development.

1436 See the NVO3 framework security considerations in RFC 7365 [RFC7365] 1437 for further discussion.

1439 17. Informative References

1441 [I-D.ashwood-nvo3-oam-requirements] 1442 Chen, H., Ashwood-Smith, P., Xia, L., Iyengar, R., Tsou, 1443 T., Sajassi, A., Boucadair, M., Jacquenet, C., Daikoku, 1444 M., Ghanwani, A., and R. Krishnan, "NVO3 Operations, 1445 Administration, and Maintenance Requirements", draft- 1446 ashwood-nvo3-oam-requirements-04 (work in progress), 1447 October 2015.

1449 [I-D.ietf-nvo3-mcast-framework] 1450 Ghanwani, A., Dunbar, L., McBride, M., Bannai, V., and R. 1451 Krishnan, "A Framework for Multicast in Network 1452 Virtualization Overlays", draft-ietf-nvo3-mcast- 1453 framework-05 (work in progress), May 2016.

1455 [I-D.ietf-nvo3-nve-nva-cp-req] 1456 Kreeger, L., Dutt, D., Narten, T., and D. Black, "Network 1457 Virtualization NVE to NVA Control Protocol Requirements", 1458 draft-ietf-nvo3-nve-nva-cp-req-05 (work in progress), 1459 March 2016.

1461 [I-D.ietf-nvo3-use-case] 1462 Yong, L., Dunbar, L., Toy, M., Isaac, A., and V. Manral, 1463 "Use Cases for Data Center Network Virtualization 1464 Overlays", draft-ietf-nvo3-use-case-09 (work in progress), 1465 September 2016.

1467 [I-D.jain-nvo3-vxlan-ping] 1468 Jain, P., Singh, K., Balus, F., Henderickx, W., and V. 1469 Bannai, "Detecting VXLAN Segment Failure", draft-jain- 1470 nvo3-vxlan-ping-00 (work in progress), June 2013.

1472 [I-D.kumar-nvo3-overlay-ping] 1473 Kumar, N., Pignataro, C., Rao, D., and S. Aldrin, 1474 "Detecting NVO3 Overlay Data Plane failures", draft-kumar- 1475 nvo3-overlay-ping-01 (work in progress), January 2014.

1477 [I-D.nordmark-nvo3-transcending-traceroute] 1478 Nordmark, E., Appanna, C., Lo, A., Boutros, S., and A.
1479 Dubey, "Layer-Transcending Traceroute for Overlay Networks 1480 like VXLAN", draft-nordmark-nvo3-transcending- 1481 traceroute-03 (work in progress), July 2016.

1483 [IEEE-802.1Q] 1484 IEEE Std 802.1Q-2014, "IEEE Standard for Local and 1485 metropolitan area networks: Bridges and Bridged Networks", 1486 November 2014.

1488 [M.3400] ITU-T Recommendation M.3400, "TMN management functions", 1489 February 2000.

1491 [RFC0826] Plummer, D., "Ethernet Address Resolution Protocol: Or 1492 Converting Network Protocol Addresses to 48.bit Ethernet 1493 Address for Transmission on Ethernet Hardware", STD 37, 1494 RFC 826, DOI 10.17487/RFC0826, November 1982.

1497 [RFC4301] Kent, S. and K. Seo, "Security Architecture for the 1498 Internet Protocol", RFC 4301, DOI 10.17487/RFC4301, 1499 December 2005.

1501 [RFC4364] Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private 1502 Networks (VPNs)", RFC 4364, DOI 10.17487/RFC4364, February 1503 2006.

1505 [RFC4861] Narten, T., Nordmark, E., Simpson, W., and H. Soliman, 1506 "Neighbor Discovery for IP version 6 (IPv6)", RFC 4861, 1507 DOI 10.17487/RFC4861, September 2007.

1510 [RFC6136] Sajassi, A., Ed. and D. Mohan, Ed., "Layer 2 Virtual 1511 Private Network (L2VPN) Operations, Administration, and 1512 Maintenance (OAM) Requirements and Framework", RFC 6136, 1513 DOI 10.17487/RFC6136, March 2011.

1516 [RFC6291] Andersson, L., van Helvoort, H., Bonica, R., Romascanu, 1517 D., and S. Mansfield, "Guidelines for the Use of the "OAM" 1518 Acronym in the IETF", BCP 161, RFC 6291, 1519 DOI 10.17487/RFC6291, June 2011.

1522 [RFC6973] Cooper, A., Tschofenig, H., Aboba, B., Peterson, J., 1523 Morris, J., Hansen, M., and R. Smith, "Privacy 1524 Considerations for Internet Protocols", RFC 6973, 1525 DOI 10.17487/RFC6973, July 2013.

1528 [RFC7258] Farrell, S. and H. Tschofenig, "Pervasive Monitoring Is an 1529 Attack", BCP 188, RFC 7258, DOI 10.17487/RFC7258, May 1530 2014.

1532 [RFC7348] Mahalingam, M., Dutt, D., Duda, K., Agarwal, P., Kreeger, 1533 L., Sridhar, T., Bursell, M., and C. Wright, "Virtual 1534 eXtensible Local Area Network (VXLAN): A Framework for 1535 Overlaying Virtualized Layer 2 Networks over Layer 3 1536 Networks", RFC 7348, DOI 10.17487/RFC7348, August 2014.

1539 [RFC7364] Narten, T., Ed., Gray, E., Ed., Black, D., Fang, L., 1540 Kreeger, L., and M. Napierala, "Problem Statement: 1541 Overlays for Network Virtualization", RFC 7364, 1542 DOI 10.17487/RFC7364, October 2014.

1545 [RFC7365] Lasserre, M., Balus, F., Morin, T., Bitar, N., and Y. 1546 Rekhter, "Framework for Data Center (DC) Network 1547 Virtualization", RFC 7365, DOI 10.17487/RFC7365, October 1548 2014.

1550 [RFC7637] Garg, P., Ed. and Y. Wang, Ed., "NVGRE: Network 1551 Virtualization Using Generic Routing Encapsulation", 1552 RFC 7637, DOI 10.17487/RFC7637, September 2015.

1555 Authors' Addresses

1557 David Black 1558 Dell EMC

1560 Email: david.black@dell.com

1561 Jon Hudson 1562 Independent

1564 Email: jon.hudson@gmail.com

1566 Lawrence Kreeger 1567 Cisco

1569 Email: kreeger@cisco.com

1571 Marc Lasserre 1572 Independent

1574 Email: mmlasserre@gmail.com

1576 Thomas Narten 1577 IBM

1579 Email: narten@us.ibm.com