Network Working Group                                      J. Gross, Ed.
Internet-Draft
Intended status: Standards Track                           I. Ganga, Ed.
Expires: March 15, 2020                                            Intel
                                                         T. Sridhar, Ed.
                                                                  VMware
                                                      September 12, 2019

          Geneve: Generic Network Virtualization Encapsulation
                       draft-ietf-nvo3-geneve-14

Abstract

   Network virtualization involves the cooperation of devices with a
   wide variety of capabilities such as software and hardware tunnel
   endpoints, transit fabrics, and centralized control clusters. As a
   result of their role in tying together different elements in the
   system, the requirements on tunnels are influenced by all of these
   components. Flexibility is therefore the most important aspect of a
   tunnel protocol if it is to keep pace with the evolution of the
   system. This document describes Geneve, an encapsulation protocol
   designed to recognize and accommodate these changing capabilities and
   needs.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on March 15, 2020.

Copyright Notice

   Copyright (c) 2019 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.
   Code Components extracted from this document must include Simplified
   BSD License text as described in Section 4.e of the Trust Legal
   Provisions and are provided without warranty as described in the
   Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Requirements Language
     1.2.  Terminology
   2.  Design Requirements
     2.1.  Control Plane Independence
     2.2.  Data Plane Extensibility
       2.2.1.  Efficient Implementation
     2.3.  Use of Standard IP Fabrics
   3.  Geneve Encapsulation Details
     3.1.  Geneve Packet Format Over IPv4
     3.2.  Geneve Packet Format Over IPv6
     3.3.  UDP Header
     3.4.  Tunnel Header Fields
     3.5.  Tunnel Options
       3.5.1.  Options Processing
   4.  Implementation and Deployment Considerations
     4.1.  Applicability Statement
     4.2.  Congestion Control Functionality
     4.3.  UDP Checksum
       4.3.1.  UDP Zero Checksum Handling with IPv6
     4.4.  Encapsulation of Geneve in IP
       4.4.1.  IP Fragmentation
       4.4.2.  DSCP, ECN and TTL
       4.4.3.  Broadcast and Multicast
       4.4.4.  Unidirectional Tunnels
     4.5.  Constraints on Protocol Features
       4.5.1.  Constraints on Options
     4.6.  NIC Offloads
     4.7.  Inner VLAN Handling
   5.  Interoperability Issues
   6.  Security Considerations
     6.1.  Data Confidentiality
       6.1.1.  Inter-Data Center Traffic
     6.2.  Data Integrity
     6.3.  Authentication of NVE peers
     6.4.  Options Interpretation by Transit Devices
     6.5.  Multicast/Broadcast
     6.6.  Control Plane Communications
   7.  IANA Considerations
   8.  Contributors
   9.  Acknowledgements
   10. References
     10.1.  Normative References
     10.2.  Informative References
   Authors' Addresses

1.  Introduction

   Networking has long featured a variety of tunneling, tagging, and
   other encapsulation mechanisms. However, the advent of network
   virtualization has caused a surge of renewed interest and a
   corresponding increase in the introduction of new protocols.
   The large number of protocols in this space, ranging all the way from
   VLANs [IEEE.802.1Q_2014] and MPLS [RFC3031] through the more recent
   VXLAN [RFC7348] (Virtual eXtensible Local Area Network) and NVGRE
   [RFC7637] (Network Virtualization Using Generic Routing
   Encapsulation), often leads to questions about the need for new
   encapsulation formats and what it is about network virtualization in
   particular that leads to their proliferation.

   While many encapsulation protocols seek to simply partition the
   underlay network or bridge between two domains, network
   virtualization views the transit network as providing connectivity
   between multiple components of a distributed system. In many ways
   this system is similar to a chassis switch with the IP underlay
   network playing the role of the backplane and tunnel endpoints on the
   edge as line cards. When viewed in this light, the requirements
   placed on the tunnel protocol are significantly different in terms of
   the quantity of metadata necessary and the role of transit nodes.

   Current work such as [VL2] (A Scalable and Flexible Data Center
   Network) and the NVO3 Data Plane Requirements
   [I-D.ietf-nvo3-dataplane-requirements] have described some of the
   properties that the data plane must have to support network
   virtualization. However, one additional defining requirement is the
   need to carry system state along with the packet data. The use of
   some metadata is certainly not a foreign concept - nearly all
   protocols used for virtualization have at least 24 bits of identifier
   space as a way to partition between tenants. This is often described
   as overcoming the limits of 12-bit VLANs, and when seen in that
   context, or any context where it is a true tenant identifier, 16
   million possible entries is a large number.
   However, the reality is that the metadata is not exclusively used to
   identify tenants and encoding other information quickly starts to
   crowd the space. In fact, when compared to the tags used to exchange
   metadata between line cards on a chassis switch, 24-bit identifiers
   start to look quite small. There are nearly endless uses for this
   metadata, ranging from storing input ports for simple security
   policies to service-based context for interposing advanced
   middleboxes.

   Existing tunnel protocols have each attempted to solve different
   aspects of these new requirements, only to be quickly rendered out of
   date by changing control plane implementations and advancements.
   Furthermore, software and hardware components and controllers all
   have different advantages and rates of evolution - a fact that should
   be viewed as a benefit, not a liability or limitation. This draft
   describes Geneve, a protocol which seeks to avoid these problems by
   providing a framework for tunneling for network virtualization rather
   than being prescriptive about the entire system.

1.1.  Requirements Language

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

1.2.  Terminology

   The NVO3 framework [RFC7365] defines many of the concepts commonly
   used in network virtualization. In addition, the following terms are
   specifically meaningful in this document:

   Checksum offload. An optimization implemented by many NICs (Network
   Interface Controllers) which enables computation and verification of
   upper layer protocol checksums in hardware on transmit and receive,
   respectively.
   This typically includes IP and TCP/UDP checksums which would
   otherwise be computed by the protocol stack in software.

   Clos network. A technique for composing network fabrics larger than
   a single switch while maintaining non-blocking bandwidth across
   connection points. ECMP is used to divide traffic across the
   multiple links and switches that constitute the fabric. Sometimes
   termed "leaf and spine" or "fat tree" topologies.

   ECMP. Equal Cost Multipath. A routing mechanism for selecting from
   among multiple best next hop paths by hashing packet headers in order
   to better utilize network bandwidth while avoiding reordering of
   packets within a flow.

   Geneve. Generic Network Virtualization Encapsulation. The tunnel
   protocol described in this document.

   LRO. Large Receive Offload. The receive-side equivalent function of
   LSO, in which multiple protocol segments (primarily TCP) are
   coalesced into larger data units.

   NIC. Network Interface Controller. Also called a Network Interface
   Card or Network Adapter. A NIC could be part of a tunnel endpoint or
   transit device and can either process Geneve packets or aid in the
   processing of Geneve packets.

   Transit device. A forwarding element (e.g. router or switch) along
   the path of the tunnel making up part of the Underlay Network. A
   transit device MAY be capable of understanding the Geneve packet
   format but does not originate or terminate Geneve packets.

   LSO. Large Segmentation Offload. A function provided by many
   commercial NICs that allows data units larger than the MTU to be
   passed to the NIC to improve performance, the NIC being responsible
   for creating smaller segments of size less than or equal to the MTU
   with correct protocol headers. When referring specifically to TCP/
   IP, this feature is often known as TSO (TCP Segmentation Offload).

   Tunnel endpoint. A component performing encapsulation and
   decapsulation of packets, such as Ethernet frames or IP datagrams, in
   Geneve headers. As the ultimate consumer of any tunnel metadata,
   tunnel endpoints have the highest level of requirements for parsing
   and interpreting tunnel headers. Tunnel endpoints may consist of
   either software or hardware implementations or a combination of the
   two. Tunnel endpoints are frequently a component of an NVE (Network
   Virtualization Edge) but may also be found in middleboxes or other
   elements making up an NVO3 Network.

   VM. Virtual Machine.

2.  Design Requirements

   Geneve is designed to support network virtualization use cases, where
   tunnels are typically established to act as a backplane between the
   virtual switches residing in hypervisors, physical switches, or
   middleboxes or other appliances. An arbitrary IP network can be used
   as an underlay although Clos networks composed using ECMP links are a
   common choice to provide consistent bisectional bandwidth across all
   connection points. Many of the concepts of network virtualization
   overlays over Layer 3 IP networks are described in the NVO3
   framework [RFC7365]. Figure 1 shows an example of a hypervisor, top
   of rack switch for connectivity to physical servers, and a WAN uplink
   connected using Geneve tunnels over a simplified Clos network. These
   tunnels are used to encapsulate and forward frames from the attached
   components such as VMs or physical links.

   +---------------------+           +-------+  +------+
   | +--+  +-------+---+ |           |Transit|--|Top of|==Physical
   | |VM|--|       |   | | +------+ /|Router |  | Rack |==Servers
   | +--+  |Virtual|NIC|---|Top of|/ +-------+\/+------+
   | +--+  |Switch |   | | | Rack |\ +-------+/\+------+
   | |VM|--|       |   | | +------+ \|Transit|  |Uplink|   WAN
   | +--+  +-------+---+ |           |Router |--|      |=========>
   +---------------------+           +-------+  +------+
          Hypervisor

                ()===================================()
                     Switch-Switch Geneve Tunnels

                   Figure 1: Sample Geneve Deployment

   To support the needs of network virtualization, the tunnel protocol
   should be able to take advantage of the differing (and evolving)
   capabilities of each type of device in both the underlay and overlay
   networks. This results in the following requirements being placed on
   the data plane tunneling protocol:

   o  The data plane is generic and extensible enough to support current
      and future control planes.

   o  Tunnel components are efficiently implementable in both hardware
      and software without restricting capabilities to the lowest common
      denominator.

   o  High performance over existing IP fabrics.

   These requirements are described further in the following
   subsections.

2.1.  Control Plane Independence

   Although some protocols for network virtualization have included a
   control plane as part of the tunnel format specification (most
   notably, the VXLAN spec prescribed a multicast learning-based
   control plane), these specifications have largely been treated as
   describing only the data format. The VXLAN packet format has
   actually seen a wide variety of control planes built on top of it.

   There is a clear advantage in settling on a data format: most of the
   protocols are only superficially different and there is little
   advantage in duplicating effort. However, the same cannot be said of
   control planes, which are diverse in very fundamental ways.
   The case for standardization is also less clear given the wide
   variety in requirements, goals, and deployment scenarios.

   As a result of this reality, Geneve is a pure tunnel format
   specification that is capable of fulfilling the needs of many control
   planes by explicitly not selecting any one of them. This
   simultaneously promotes a shared data format and reduces the chance
   of obsolescence by future control plane enhancements.

2.2.  Data Plane Extensibility

   Achieving the level of flexibility needed to support current and
   future control planes effectively requires an options infrastructure
   to allow new metadata types to be defined, deployed, and either
   finalized or retired. Options also allow for differentiation of
   products by encouraging independent development in each vendor's core
   specialty, leading to an overall faster pace of advancement. By far
   the most common mechanism for implementing options is Type-Length-
   Value (TLV) format.

   It should be noted that while options can be used to support non-
   wirespeed control packets, they are equally important on data packets
   to segregate and direct forwarding (for instance, the examples given
   before of input-port-based security policies and service
   interposition both require tags to be placed on data packets).
   Therefore, while it would be desirable to limit the extensibility to
   only control packets for the purposes of simplifying the datapath,
   that would not satisfy the design requirements.

2.2.1.  Efficient Implementation

   There is often a conflict between software flexibility and hardware
   performance that is difficult to resolve. For a given set of
   functionality, it is obviously desirable to maximize performance.
   However, that does not mean new features that cannot be run at a
   desired speed today should be disallowed.
   Therefore, for a protocol to be efficiently implementable means that
   a set of common capabilities can be reasonably handled across
   platforms along with a graceful mechanism to handle more advanced
   features in the appropriate situations.

   The use of a variable length header and options in a protocol often
   raises questions about whether it is truly efficiently implementable
   in hardware. To answer this question in the context of Geneve, it is
   important to first divide "hardware" into two categories: tunnel
   endpoints and transit devices.

   Tunnel endpoints must be able to parse the variable header, including
   any options, and take action. Since these devices are actively
   participating in the protocol, they are the most affected by Geneve.

   However, as tunnel endpoints are the ultimate consumers of the data,
   transmitters can tailor their output to the capabilities of the
   recipient. As new functionality becomes sufficiently well defined to
   add to tunnel endpoints, supporting options can be designed using
   ordering restrictions and other techniques to ease parsing.

   Options, if present in the packet, MUST only be generated and
   terminated by tunnel endpoints. Transit devices MAY be able to
   interpret the options. However, as non-terminating devices, transit
   devices do not originate or terminate the Geneve packet and hence
   MUST NOT modify Geneve headers and MUST NOT insert or delete options,
   which is the responsibility of tunnel endpoints. The participation
   of transit devices in interpreting options is OPTIONAL.

   Further, either tunnel endpoints or transit devices MAY use offload
   capabilities of NICs such as checksum offload to improve the
   performance of Geneve packet processing. The presence of a Geneve
   variable length header SHOULD NOT prevent the tunnel endpoints and
   transit devices from using such offload capabilities.

2.3.  Use of Standard IP Fabrics

   IP has clearly cemented its place as the dominant transport mechanism
   and many techniques have evolved over time to make it robust,
   efficient, and inexpensive. As a result, it is natural to use IP
   fabrics as a transit network for Geneve. Fortunately, the use of IP
   encapsulation and addressing is enough to achieve the primary goal of
   delivering packets to the correct point in the network through
   standard switching and routing.

   In addition, nearly all underlay fabrics are designed to exploit
   parallelism in traffic to spread load across multiple links without
   introducing reordering in individual flows. These equal cost
   multipathing (ECMP) techniques typically involve parsing and hashing
   the addresses and port numbers from the packet to select an outgoing
   link. However, the use of tunnels often results in poor ECMP
   performance without additional knowledge of the protocol as the
   encapsulated traffic is hidden from the fabric by design and only
   tunnel endpoint addresses are available for hashing.

   Since it is desirable for Geneve to perform well on these existing
   fabrics, it is necessary for entropy from encapsulated packets to be
   exposed in the tunnel header. The most common technique for this is
   to use the UDP source port, which is discussed further in
   Section 3.3.

3.  Geneve Encapsulation Details

   The Geneve packet format consists of a compact tunnel header
   encapsulated in UDP over either IPv4 or IPv6. A small fixed tunnel
   header provides control information plus a base level of
   functionality and interoperability with a focus on simplicity. This
   header is then followed by a set of variable options to allow for
   future innovation. Finally, the payload consists of a protocol data
   unit of the indicated type, such as an Ethernet frame.
   Section 3.1 and Section 3.2 illustrate the Geneve packet format
   transported (for example) over Ethernet along with an Ethernet
   payload.

3.1.  Geneve Packet Format Over IPv4

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   Outer Ethernet Header:
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                 Outer Destination MAC Address                 |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Outer Destination MAC Address |   Outer Source MAC Address    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                   Outer Source MAC Address                    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |Optional Ethertype=C-Tag 802.1Q|  Outer VLAN Tag Information   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |       Ethertype=0x0800        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Outer IPv4 Header:
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |Version|  IHL  |Type of Service|         Total Length          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |         Identification        |Flags|      Fragment Offset    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Time to Live |Protocol=17 UDP|        Header Checksum        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                   Outer Source IPv4 Address                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                Outer Destination IPv4 Address                 |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Outer UDP Header:
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |      Source Port = xxxx       |       Dest Port = 6081        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          UDP Length           |         UDP Checksum          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Geneve Header:
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |Ver|  Opt Len  |O|C|   Rsvd.   |         Protocol Type         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |       Virtual Network Identifier (VNI)        |   Reserved    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Variable Length Options                    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Inner Ethernet Header (example payload):
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                 Inner Destination MAC Address                 |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Inner Destination MAC Address |   Inner Source MAC Address    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                   Inner Source MAC Address                    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |Optional Ethertype=C-Tag 802.1Q|  Inner VLAN Tag Information   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Payload:
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Ethertype of Original Payload |                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               |
   |                   Original Ethernet Payload                   |
   |                                                               |
   | (Note that the original Ethernet Frame's FCS is not included) |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Frame Check Sequence:
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    New FCS (Frame Check Sequence) for Outer Ethernet Frame    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

3.2.  Geneve Packet Format Over IPv6

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   Outer Ethernet Header:
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                 Outer Destination MAC Address                 |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Outer Destination MAC Address |   Outer Source MAC Address    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                   Outer Source MAC Address                    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |Optional Ethertype=C-Tag 802.1Q|  Outer VLAN Tag Information   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |       Ethertype=0x86DD        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Outer IPv6 Header:
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |Version| Traffic Class |              Flow Label               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |        Payload Length         | NxtHdr=17 UDP |   Hop Limit   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   +                                                               +
   |                                                               |
   +                   Outer Source IPv6 Address                   +
   |                                                               |
   +                                                               +
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   +                                                               +
   |                                                               |
   +                Outer Destination IPv6 Address                 +
   |                                                               |
   +                                                               +
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Outer UDP Header:
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |      Source Port = xxxx       |       Dest Port = 6081        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          UDP Length           |         UDP Checksum          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Geneve Header:
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |Ver|  Opt Len  |O|C|   Rsvd.   |         Protocol Type         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |       Virtual Network Identifier (VNI)        |   Reserved    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Variable Length Options                    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Inner Ethernet Header (example payload):
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                 Inner Destination MAC Address                 |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Inner Destination MAC Address |   Inner Source MAC Address    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                   Inner Source MAC Address                    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |Optional Ethertype=C-Tag 802.1Q|  Inner VLAN Tag Information   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Payload:
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Ethertype of Original Payload |                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+                               |
   |                   Original Ethernet Payload                   |
   |                                                               |
   | (Note that the original Ethernet Frame's FCS is not included) |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Frame Check Sequence:
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    New FCS (Frame Check Sequence) for Outer Ethernet Frame    |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

3.3.  UDP Header

   The use of an encapsulating UDP [RFC0768] header follows the
   connectionless semantics of Ethernet and IP in addition to providing
   entropy to routers performing ECMP. The header fields are therefore
   interpreted as follows:

   Source port: A source port selected by the originating tunnel
      endpoint. This source port SHOULD be the same for all packets
      belonging to a single encapsulated flow to prevent reordering due
      to the use of different paths.
      To encourage an even distribution of flows across multiple links,
      the source port SHOULD be calculated using a hash of the
      encapsulated packet headers using, for example, a traditional
      5-tuple. Since the port represents a flow identifier rather than
      a true UDP connection, the entire 16-bit range MAY be used to
      maximize entropy.

   Dest port: IANA has assigned port 6081 as the fixed well-known
      destination port for Geneve. Although the well-known value should
      be used by default, it is RECOMMENDED that implementations make
      this configurable. The chosen port is used for identification of
      Geneve packets and MUST NOT be reversed for different ends of a
      connection as is done with TCP.

   UDP length: The length of the UDP packet including the UDP header.

   UDP checksum: In order to protect the Geneve header, options and
      payload from potential data corruption, UDP checksum SHOULD be
      generated as specified in [RFC0768] and [RFC1112] when Geneve is
      encapsulated in IPv4. To protect the IP header, Geneve header,
      options and payload from potential data corruption, the UDP
      checksum MUST be generated by default as specified in [RFC0768]
      and [RFC2460] when Geneve is encapsulated in IPv6. Upon receiving
      such packets with non-zero UDP checksum, the receiving tunnel
      endpoints MUST validate the checksum. If the checksum is not
      correct, the packet MUST be dropped, otherwise the packet MUST be
      accepted for decapsulation.

      Under certain conditions, the UDP checksum MAY be set to zero on
      transmit for packets encapsulated in both IPv4 and IPv6 [RFC6935].
      See Section 4.3 for additional requirements that apply for using
      zero UDP checksum with IPv4 and IPv6. Disabling the use of UDP
      checksums is an operational consideration that should take into
      account the risks and effects of packet corruption.

3.4.  Tunnel Header Fields

   Ver (2 bits): The current version number is 0.
      Packets received by a tunnel endpoint with an unknown version MUST
      be dropped. Transit devices interpreting Geneve packets with an
      unknown version number MUST treat them as UDP packets with an
      unknown payload.

   Opt Len (6 bits): The length of the options fields, expressed in
      four byte multiples, not including the eight byte fixed tunnel
      header. This results in a minimum total Geneve header size of 8
      bytes and a maximum of 260 bytes. The start of the payload
      headers can be found using this offset from the end of the base
      Geneve header.

   O (1 bit): Control packet. This packet contains a control message.
      Control messages are sent between tunnel endpoints. Tunnel
      endpoints MUST NOT forward the payload and transit devices MUST
      NOT attempt to interpret it. Since these are infrequent control
      messages, it is RECOMMENDED that tunnel endpoints direct these
      packets to a high priority control queue (for example, to direct
      the packet to a general purpose CPU from a forwarding ASIC or to
      separate out control traffic on a NIC). Transit devices MUST NOT
      alter forwarding behavior on the basis of this bit, such as ECMP
      link selection.

   C (1 bit): Critical options present. One or more options have the
      critical bit set (see Section 3.5). If this bit is set then
      tunnel endpoints MUST parse the options list to interpret any
      critical options. On tunnel endpoints where option parsing is not
      supported the packet MUST be dropped on the basis of the 'C' bit
      in the base header. If the bit is not set tunnel endpoints MAY
      strip all options using 'Opt Len' and forward the decapsulated
      packet. Transit devices MUST NOT drop packets on the basis of
      this bit.

      The critical bit allows hardware implementations the flexibility
      to handle options processing in the hardware fastpath or in the
      exception (slow) path without the need to process all the options.
      For example, a critical option such as a secure hash to provide a
      Geneve header integrity check must be processed by tunnel
      endpoints and is typically processed in the hardware fastpath.

   Rsvd. (6 bits): Reserved field, which MUST be zero on transmission
      and MUST be ignored on receipt.

   Protocol Type (16 bits): The type of the protocol data unit
      appearing after the Geneve header. This follows the EtherType
      [ETYPES] convention with Ethernet itself being represented by the
      value 0x6558.

   Virtual Network Identifier (VNI) (24 bits): An identifier for a
      unique element of a virtual network. In many situations this may
      represent an L2 segment; however, the control plane defines the
      forwarding semantics of decapsulated packets. The VNI MAY be used
      as part of ECMP forwarding decisions or MAY be used as a mechanism
      to distinguish between overlapping address spaces contained in the
      encapsulated packet when load balancing across CPUs.

   Reserved (8 bits): Reserved field which MUST be zero on transmission
      and ignored on receipt.

   Transit devices MUST maintain consistent forwarding behavior
   irrespective of the value of 'Opt Len', including ECMP link
   selection. These devices SHOULD be able to forward packets
   containing options without resorting to a slow path.

3.5. Tunnel Options

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          Option Class         |      Type     |R|R|R| Length  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      Variable Option Data                     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                             Geneve Option

   The base Geneve header is followed by zero or more options in Type-
   Length-Value format. Each option consists of a four byte option
   header and a variable amount of option data interpreted according to
   the type.
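
   As an informal illustration only (not part of the protocol
   definition), the option layout shown in the figure can be decoded
   along the following lines. This is a minimal sketch in Python. The
   names GeneveOption, parse_option_header and walk_options are
   hypothetical and chosen for this example, and the caller is assumed
   to pass exactly the 'Opt Len' * 4 bytes that follow the base header.
   The field semantics used (the 'C' bit as the high order bit of
   'Type', and 'Length' counted in four byte multiples excluding the
   option header) are those given in the field definitions of this
   section.

```python
import struct
from dataclasses import dataclass
from typing import List, Tuple

OPT_HDR_LEN = 4  # fixed four byte option header


@dataclass
class GeneveOption:
    """Illustrative container for one decoded option (hypothetical name)."""
    opt_class: int  # 16-bit Option Class
    opt_type: int   # full 8-bit Type, with the C bit included
    critical: bool  # high order bit of Type
    data: bytes     # Variable Option Data


def parse_option_header(buf: bytes, offset: int = 0) -> Tuple[int, int, bool, int]:
    """Decode one option header; returns (class, type, critical, data length in bytes)."""
    opt_class, opt_type, flags_len = struct.unpack_from("!HBB", buf, offset)
    critical = bool(opt_type & 0x80)
    # The low 5 bits carry Length in four byte multiples; the three R
    # bits are ignored on receipt.
    data_len = (flags_len & 0x1F) * 4
    return opt_class, opt_type, critical, data_len


def walk_options(options: bytes) -> List[GeneveOption]:
    """Walk the options area and check that the options fill it exactly.

    Raises ValueError when the option lengths do not add up to the size
    of the area, which corresponds to the "invalid packet" case.
    """
    parsed: List[GeneveOption] = []
    offset = 0
    while offset < len(options):
        if len(options) - offset < OPT_HDR_LEN:
            raise ValueError("truncated option header")
        opt_class, opt_type, critical, data_len = parse_option_header(options, offset)
        end = offset + OPT_HDR_LEN + data_len
        if end > len(options):
            raise ValueError("option overruns the options area")
        parsed.append(
            GeneveOption(opt_class, opt_type, critical, options[offset + OPT_HDR_LEN:end])
        )
        offset = end
    return parsed
```

   In this sketch, a receiving tunnel endpoint would treat a ValueError
   as a reason to drop the packet, and would then separately decide
   whether any returned option with critical set is one it recognizes.
   That policy is deliberately left to the caller, since which options
   are known depends on the control plane.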

   Option Class (16 bits): Namespace for the 'Type' field. IANA will
      be requested to create a "Geneve Option Class" registry to
      allocate identifiers for organizations, technologies, and vendors
      that have an interest in creating types for options. Each
      organization may allocate types independently to allow
      experimentation and rapid innovation. It is expected that over
      time certain options will become well known and a given
      implementation may use option types from a variety of sources. In
      addition, IANA will be requested to reserve specific ranges for
      standardized and experimental options.

   Type (8 bits): Type indicating the format of the data contained in
      this option. Options are primarily designed to encourage future
      extensibility and innovation and so standardized forms of these
      options will be defined in a separate document.

      The high order bit of the option type indicates that this is a
      critical option. If the receiving tunnel endpoint does not
      recognize this option and this bit is set then the packet MUST be
      dropped. If the 'C' bit (critical bit) is set in any option then
      the 'C' bit in the Geneve base header MUST also be set. Transit
      devices MUST NOT drop packets on the basis of this bit. The
      following figure shows the location of the 'C' bit in the 'Type'
      field:

       0 1 2 3 4 5 6 7
      +-+-+-+-+-+-+-+-+
      |C|    Type     |
      +-+-+-+-+-+-+-+-+

      The requirement to drop a packet with an unknown option with the
      'C' bit set applies to the entire tunnel endpoint system and not a
      particular component of the implementation. For example, in a
      system composed of a forwarding ASIC and a general purpose CPU,
      this does not mean that the packet must be dropped in the ASIC.
      An implementation may send the packet to the CPU using a rate-
      limited control channel for slow-path exception handling.

   R (3 bits): Option control flags reserved for future use.
      MUST be zero on transmission and ignored on receipt.

   Length (5 bits): Length of the option, expressed in four byte
      multiples excluding the option header. The total length of each
      option may be between 4 and 128 bytes. A value of 0 in the Length
      field implies an option with only the option header without the
      variable option data. Packets in which the total length of all
      options is not equal to the 'Opt Len' in the base header are
      invalid and MUST be silently dropped if received by a tunnel
      endpoint that processes the options.

   Variable Option Data: Option data interpreted according to 'Type'.

3.5.1. Options Processing

   Geneve options are intended to be originated and processed by tunnel
   endpoints. However, options MAY be interpreted by transit devices
   along the tunnel path. Transit devices not interpreting Geneve
   headers (that may or may not include options) MUST handle Geneve
   packets as any other UDP packet and maintain consistent forwarding
   behavior.

   In tunnel endpoints, the generation and interpretation of options is
   determined by the control plane, which is out of the scope of this
   document. However, to ensure interoperability between heterogeneous
   devices some requirements are imposed on options and the devices that
   process them:

   o  Receiving tunnel endpoints MUST drop packets containing unknown
      options with the 'C' bit set in the option type. Conversely,
      transit devices MUST NOT drop packets as a result of encountering
      unknown options, including those with the 'C' bit set.

   o  Some options may be defined in such a way that the position in the
      option list is significant. Options MUST NOT be changed by
      transit devices.

   o  An option SHOULD NOT be dependent upon any other option in the
      packet, i.e., options can be processed independently of one
      another. Architecturally, options are intended to be self-
      descriptive and independent.
      This enables parallelism in option processing and reduces
      implementation complexity.

   When designing a Geneve option, it is important to consider how the
   option will evolve in the future. Once an option is defined it is
   reasonable to expect that implementations may come to depend on a
   specific behavior. As a result, the scope of any future changes must
   be carefully described upfront.

   Unexpectedly significant interoperability issues may result from
   changing the length of an option that was defined to be a certain
   size. A particular option is specified to have either a fixed
   length, which is constant, or a variable length, which may change
   over time or for different use cases. This property is part of the
   definition of the option and conveyed by the 'Type'. For fixed
   length options, some implementations may choose to ignore the length
   field in the option header and instead parse based on the well known
   length associated with the type. In this case, redefining the length
   will impact not only parsing of the option in question but also any
   options that follow. Therefore, options that are defined to be fixed
   length in size MUST NOT be redefined to a different length. Instead,
   a new 'Type' should be allocated.

   Options may be processed by NIC hardware utilizing offloads (e.g.,
   LSO and LRO) as described in Section 4.6. Careful consideration
   should be given to how the offload capabilities outlined in
   Section 4.6 impact an option's design.

4. Implementation and Deployment Considerations

4.1. Applicability Statement

   Geneve is a network virtualization overlay encapsulation protocol
   designed to establish tunnels between NVEs over an existing IP
   network. It is intended for use in public or private data center
   environments, for deploying multi-tenant overlay networks over an
   existing IP underlay network.

   Geneve is a UDP based encapsulation protocol transported over
   existing IPv4 and IPv6 networks. Hence, as a UDP based protocol,
   Geneve adheres to the UDP usage guidelines as specified in [RFC8085].
   The applicability of these guidelines is dependent on the underlay
   IP network and the nature of the Geneve payload protocol (for
   example, TCP/IP or IP/Ethernet).

   [RFC8085] outlines two applicability scenarios for UDP applications,
   1) general Internet and 2) controlled environment. The controlled
   environment means a single administrative domain or adjacent set of
   cooperating domains. A network in a controlled environment can be
   managed to operate under certain conditions whereas in the general
   Internet this cannot be done. Hence requirements for a tunnel
   protocol operating under a controlled environment can be less
   restrictive than the requirements of the general Internet.

   Geneve is intended to be deployed in a data center network
   environment operated by a single operator or adjacent set of
   cooperating network operators that fits with the definition of
   controlled environments in [RFC8085].

   For the purpose of this document, a traffic-managed controlled
   environment (TMCE) is defined as an IP network that is traffic-
   engineered and/or otherwise managed (e.g., via use of traffic rate
   limiters) to avoid congestion. The concept of TMCE is outlined in
   [RFC8086]. Significant portions of text in Section 4.1 through
   Section 4.3 are based on [RFC8086] as applicable to Geneve.

   It is the responsibility of the operator to ensure that the
   guidelines/requirements in this section are followed as applicable to
   their Geneve deployment(s).

4.2. Congestion Control Functionality

   Geneve does not natively provide congestion control functionality and
   relies on the payload protocol traffic for congestion control.
   As such Geneve MUST be used with congestion controlled traffic or
   within a network that is traffic managed to avoid congestion (TMCE).
   An operator of a traffic managed network (TMCE) may avoid congestion
   by careful provisioning of their networks, rate-limiting of user data
   traffic and traffic engineering according to path capacity.

4.3. UDP Checksum

   In order to provide integrity of Geneve headers, options and payload,
   for example to avoid mis-delivery of payload to different tenant
   systems in case of data corruption, outer UDP checksum SHOULD be used
   with Geneve when transported over IPv4. The UDP checksum provides a
   statistical guarantee that a payload was not corrupted in transit.
   These integrity checks are not strong from a coding or cryptographic
   perspective and are not designed to detect physical-layer errors or
   malicious modification of the datagram (see Section 3.4 of
   [RFC8085]). In deployments where such a risk exists, an operator
   SHOULD use additional data integrity mechanisms such as offered by
   IPsec (see Section 6.2).

   An operator MAY choose to disable UDP checksum and use zero checksum
   if Geneve packet integrity is provided by other data integrity
   mechanisms such as IPsec or additional checksums or if one of the
   conditions a, b, or c in Section 4.3.1 is met.

   By default, UDP checksum MUST be used when Geneve is transported over
   IPv6. A tunnel endpoint MAY be configured for use with zero UDP
   checksum if additional requirements in Section 4.3.1 are met.

4.3.1. UDP Zero Checksum Handling with IPv6

   When Geneve is used over IPv6, UDP checksum is used to protect IPv6
   headers, UDP headers and Geneve headers, options and payload from
   potential data corruption. As such by default Geneve MUST use UDP
   checksum when transported over IPv6.
   An operator MAY choose to configure a tunnel endpoint to operate with
   zero UDP checksum when operating in a traffic-managed controlled
   environment, as stated in Section 4.1, if one of the following
   conditions is met:

   a. It is known that the packet corruption is exceptionally unlikely
      (perhaps based on knowledge of equipment types in their underlay
      network) and the operator is willing to take a risk of undetected
      packet corruption.

   b. It is judged through observational measurements (perhaps through
      historic or current traffic flows that use non zero checksum)
      that the level of packet corruption is tolerably low and where
      the operator is willing to take the risk of undetected
      corruption.

   c. Geneve payload is carrying applications that are tolerant of
      misdelivered or corrupted packets (perhaps through higher layer
      checksum validation and/or reliability through retransmission).

   In addition Geneve tunnel implementations using Zero UDP checksum
   MUST meet the following requirements:

   1. Use of UDP checksum over IPv6 MUST be the default configuration
      for all Geneve tunnels.

   2. If Geneve is used with zero UDP checksum over IPv6 then such
      tunnel endpoint implementation MUST meet all the requirements
      specified in section 4 of [RFC6936] and requirement 1 as
      specified in section 5 of [RFC6936].

   3. The Geneve tunnel endpoint that decapsulates the tunnel SHOULD
      check that the source and destination IPv6 addresses are valid
      for the Geneve tunnel that is configured to receive Zero UDP
      checksum and discard other packets for which such check fails.

   4. The Geneve tunnel endpoint that encapsulates the tunnel MAY use
      different IPv6 source addresses for each Geneve tunnel that uses
      Zero UDP checksum mode in order to strengthen the decapsulator's
      check of the IPv6 source address (i.e., the same IPv6 source
      address is not to be used with more than one IPv6 destination
      address, irrespective of whether that destination address is a
      unicast or multicast address). When this is not possible, it is
      RECOMMENDED to use each source address for as few Geneve tunnels
      that use zero UDP checksum as is feasible.

   5. Measures SHOULD be taken to prevent Geneve traffic over IPv6 with
      zero UDP checksum from escaping into the general Internet.
      Examples of such measures include employing packet filters at the
      gateways or edge of the Geneve network and/or keeping logical or
      physical separation of the Geneve network from networks carrying
      general Internet traffic.

   The above requirements do not change either the requirements
   specified in [RFC2460] as modified by [RFC6935] or the requirements
   specified in [RFC6936].

   The requirement to check the source IPv6 address in addition to the
   destination IPv6 address, plus the recommendation against reuse of
   source IPv6 addresses among Geneve tunnels collectively provide some
   mitigation for the absence of UDP checksum coverage of the IPv6
   header. A traffic-managed controlled environment that satisfies at
   least one of the three conditions listed at the beginning of this
   section provides additional assurance.

   Editorial Note (The following paragraph to be removed by the RFC
   Editor before publication)

   It was discussed during TSVART early review if the level of
   requirement for using different IPv6 source addresses for different
   tunnel destinations would need to be "MAY" or "SHOULD".
   The discussion concluded that it was appropriate to keep this as
   "MAY", since it was not considered realistic for control planes to
   have to maintain a high level of state on a per tunnel destination
   basis. In addition, the text above provides sufficient guidance to
   operators and implementors on possible mitigations.

4.4. Encapsulation of Geneve in IP

   As an IP-based tunnel protocol, Geneve shares many properties and
   techniques with existing protocols. The application of some of these
   is described in further detail, although in general most concepts
   applicable to the IP layer or to IP tunnels generally also function
   in the context of Geneve.

4.4.1. IP Fragmentation

   It is strongly RECOMMENDED that Path MTU Discovery ([RFC1191],
   [RFC8201]) be used by setting the DF bit in the IP header when Geneve
   packets are transmitted over IPv4 (this is the default with IPv6).
   The use of Path MTU Discovery on the transit network provides the
   encapsulating tunnel endpoint with soft-state about the link that it
   may use to prevent or minimize fragmentation depending on its role in
   the virtualized network. The NVE control plane MAY use a
   configuration mechanism or path discovery information to maintain the
   MTU size of the tunnel link(s) associated with the tunnel endpoint,
   so if a tenant system sends large packets that when encapsulated
   exceed the MTU size of the tunnel link, the tunnel endpoint can
   discard such packets and send exception messages to the tenant
   system(s). If the tunnel endpoint is associated with a routing or
   forwarding function and/or has the capability to send ICMP messages,
   the encapsulating tunnel endpoint MAY send ICMP fragmentation needed
   [RFC0792] or Packet Too Big [RFC4443] messages to the tenant
   system(s).
   For example, recommendations/guidance for handling fragmentation in
   similar overlay encapsulation services like PWE3 are provided in
   section 5.3 of [RFC3985].

   Note that some implementations may not be capable of supporting
   fragmentation or other less common features of the IP header, such as
   options and extension headers. For example, some of the issues
   associated with MTU size and fragmentation in IP tunneling and use of
   ICMP messages are outlined in section 4.2 of
   [I-D.ietf-intarea-tunnels].

   Editorial Note (The following paragraph to be removed by the RFC
   Editor before publication)

   It was discussed during TSVART early review if the level of
   requirement for maintaining tunnel MTU at the ingress has to be "MAY"
   or "SHOULD". The discussion concluded that it was appropriate to
   leave this as "MAY", considering the high level of state to be
   maintained.

4.4.2. DSCP, ECN and TTL

   When encapsulating IP (including over Ethernet) packets in Geneve,
   there are several considerations for propagating DSCP and ECN bits
   from the inner header to the tunnel on transmission and the reverse
   on reception.

   [RFC2983] provides guidance for mapping DSCP between inner and outer
   IP headers. Network virtualization is typically more closely aligned
   with the Pipe model described, where the DSCP value on the tunnel
   header is set based on a policy (which may be a fixed value, one
   based on the inner traffic class, or some other mechanism for
   grouping traffic). Aspects of the Uniform model (which treats the
   inner and outer DSCP value as a single field by copying on ingress
   and egress) may also apply, such as the ability to remark the inner
   header on tunnel egress based on transit marking.
   However, the Uniform model is not conceptually consistent with
   network virtualization, which seeks to provide strong isolation
   between encapsulated traffic and the physical network.

   [RFC6040] describes the mechanism for exposing ECN capabilities on IP
   tunnels and propagating congestion markers to the inner packets.
   This behavior MUST be followed for IP packets encapsulated in Geneve.

   Though Uniform or Pipe models could be used for TTL (or Hop Limit in
   case of IPv6) handling when tunneling IP packets, Pipe model is more
   aligned with network virtualization. [RFC2003] provides guidance on
   handling TTL between inner IP header and outer IP tunnels; this model
   is more aligned with the Pipe model and is recommended for use with
   Geneve for network virtualization applications.

4.4.3. Broadcast and Multicast

   Geneve tunnels may either be point-to-point unicast between two
   tunnel endpoints or may utilize broadcast or multicast addressing.
   It is not required that inner and outer addressing match in this
   respect. For example, in physical networks that do not support
   multicast, encapsulated multicast traffic may be replicated into
   multiple unicast tunnels or forwarded by policy to a unicast location
   (possibly to be replicated there).

   With physical networks that do support multicast it may be desirable
   to use this capability to take advantage of hardware replication for
   encapsulated packets. In this case, multicast addresses may be
   allocated in the physical network corresponding to tenants,
   encapsulated multicast groups, or some other factor. The allocation
   of these groups is a component of the control plane and therefore
   outside of the scope of this document.
   When physical multicast is in use, the 'C' bit in the Geneve header
   may be used with groups of devices with heterogeneous capabilities as
   each device can interpret only the options that are significant to it
   if they are not critical.

   In addition, [RFC8293] provides examples of various mechanisms that
   can be used for multicast handling in network virtualization overlay
   networks.

4.4.4. Unidirectional Tunnels

   Generally speaking, a Geneve tunnel is a unidirectional concept. IP
   is not a connection oriented protocol and it is possible for two
   tunnel endpoints to communicate with each other using different paths
   or to have one side not transmit anything at all. As Geneve is an
   IP-based protocol, the tunnel layer inherits these same
   characteristics.

   It is possible for a tunnel to encapsulate a protocol, such as TCP,
   which is connection oriented and maintains session state at that
   layer. In addition, implementations MAY model Geneve tunnels as
   connected, bidirectional links, such as to provide the abstraction of
   a virtual port. In both of these cases, bidirectionality of the
   tunnel is handled at a higher layer and does not affect the operation
   of Geneve itself.

4.5. Constraints on Protocol Features

   Geneve is intended to be flexible to a wide range of current and
   future applications. As a result, certain constraints may be placed
   on the use of metadata or other aspects of the protocol in order to
   optimize for a particular use case. For example, some applications
   may limit the types of options which are supported or enforce a
   maximum number or length of options. Other applications may only
   handle certain encapsulated payload types, such as Ethernet or IP.
   This could be either globally throughout the system or, for example,
   restricted to certain classes of devices or network paths.

   These constraints may be communicated to tunnel endpoints either
   explicitly through a control plane or implicitly by the nature of the
   application. As Geneve is defined as a data plane protocol that is
   control plane agnostic, the exact mechanism is not defined in this
   document.

4.5.1. Constraints on Options

   While Geneve options are flexible, a control plane may restrict
   the number of option TLVs as well as the order and size of the TLVs,
   between tunnel endpoints, to make it simpler for a data plane
   implementation in software or hardware to handle
   [I-D.ietf-nvo3-encap]. For example, there may be some critical
   information such as a secure hash that must be processed in a certain
   order to provide lowest latency.

   A control plane may negotiate a subset of option TLVs and certain TLV
   ordering, and may also limit the total number of option TLVs present
   in the packet, for example, to accommodate hardware capable of
   processing fewer options [I-D.ietf-nvo3-encap]. Hence, a control
   plane needs to have the ability to describe the supported TLV subset
   and their order to the tunnel endpoints. In the absence of a control
   plane, alternative configuration mechanisms may be used for this
   purpose. The exact mechanism is not defined in this document.

4.6. NIC Offloads

   Modern NICs currently provide a variety of offloads to enable the
   efficient processing of packets. The implementation of many of these
   offloads requires only that the encapsulated packet be easily parsed
   (for example, checksum offload). However, optimizations such as LSO
   and LRO involve some processing of the options themselves since they
   must be replicated/merged across multiple packets. In these
   situations, it is desirable to not require changes to the offload
   logic to handle the introduction of new options.
   To enable this, some constraints are placed on the definitions of
   options to allow for simple processing rules:

   o  When performing LSO, a NIC MUST replicate the entire Geneve header
      and all options, including those unknown to the device, onto each
      resulting segment. However, a given option definition may
      override this rule and specify different behavior in supporting
      devices. Conversely, when performing LRO, a NIC MAY assume that a
      binary comparison of the options (including unknown options) is
      sufficient to ensure equality and MAY merge packets with equal
      Geneve headers.

   o  Options MUST NOT be reordered during the course of offload
      processing, including when merging packets for the purpose of LRO.

   o  NICs performing offloads MUST NOT drop packets with unknown
      options, including those marked as critical, unless explicitly
      configured.

   There is no requirement that a given implementation of Geneve employ
   the offloads listed as examples above. However, as these offloads
   are currently widely deployed in commercially available NICs, the
   rules described here are intended to enable efficient handling of
   current and future options across a variety of devices.

4.7. Inner VLAN Handling

   Geneve is capable of encapsulating a wide range of protocols and
   therefore a given implementation is likely to support only a small
   subset of the possibilities. However, as Ethernet is expected to be
   widely deployed, it is useful to describe the behavior of VLANs
   inside encapsulated Ethernet frames.

   As with any protocol, support for inner VLAN headers is OPTIONAL. In
   many cases, the use of encapsulated VLANs may be disallowed due to
   security or implementation considerations. However, in other cases
   trunking of VLAN frames across a Geneve tunnel can prove useful.
   As a result, the processing of inner VLAN tags upon ingress or egress
   from a tunnel endpoint is based upon the configuration of the tunnel
   endpoint and/or control plane and not explicitly defined as part of
   the data format.

5. Interoperability Issues

   Viewed exclusively from the data plane, Geneve does not introduce any
   interoperability issues as it appears to most devices as UDP packets.
   However, as there are already a number of tunnel protocols deployed
   in network virtualization environments, there is a practical question
   of transition and coexistence.

   Since Geneve is a superset of the functionality of the most common
   protocols used for network virtualization (VXLAN, NVGRE) it should be
   straightforward to port an existing control plane to run on top of it
   with minimal effort. With both the old and new packet formats
   supporting the same set of capabilities, there is no need for a hard
   transition - tunnel endpoints directly communicating with each other
   use any common protocol, which may be different even within a single
   overall system. As transit devices are primarily forwarding packets
   on the basis of the IP header, all protocols appear similar and these
   devices do not introduce additional interoperability concerns.

   To assist with this transition, it is strongly suggested that
   implementations support simultaneous operation of both Geneve and
   existing tunnel protocols as it is expected to be common for a single
   node to communicate with a mixture of other nodes. Eventually, older
   protocols may be phased out as they are no longer in use.

6. Security Considerations

   As encapsulated within a UDP/IP packet, Geneve does not have any
   inherent security mechanisms. As a result, an attacker with access
   to the underlay network transporting the IP packets has the ability
   to snoop or inject packets.
   Compromised tunnel endpoints may also spoof identifiers in the tunnel
   header to gain access to networks owned by other tenants.

   Within a particular security domain, such as a data center operated
   by a single service provider, the most common and highest performing
   security mechanism is isolation of trusted components. Tunnel
   traffic can be carried over a separate VLAN and filtered at any
   untrusted boundaries. In addition, tunnel endpoints should only be
   operated in environments controlled by the service provider, such as
   the hypervisor itself rather than within a customer VM.

   When crossing an untrusted link, such as the public Internet, IPsec
   [RFC4301] may be used to provide authentication and/or encryption of
   the IP packets formed as part of Geneve encapsulation.

   Geneve does not otherwise affect the security of the encapsulated
   packets. As per the guidelines of BCP 72 [RFC3552], the following
   sections describe potential security risks that may be applicable to
   Geneve deployments and approaches to mitigate such risks. It is also
   noted that not all such risks are applicable to all Geneve deployment
   scenarios, i.e., only a subset may be applicable to certain
   deployments. So an operator has to make an assessment based on their
   network environment and determine the risks that are applicable to
   their specific environment and use appropriate mitigation approaches
   as applicable.

6.1. Data Confidentiality

   Geneve is a network virtualization overlay encapsulation protocol
   designed to establish tunnels between NVEs over an existing IP
   network. It can be used to deploy multi-tenant overlay networks over
   an existing IP underlay network in a public or private data center.
   The overlay service is typically provided by a service provider, for
   example a cloud services provider or a private data center operator,
   which may or may not be the same provider as the underlay service
   provider. Due to the nature of multi-tenancy in such environments, a
   tenant system may expect data confidentiality to ensure its packet
   data is not tampered with (active attack) in transit or a target of
   unauthorized monitoring (passive attack). A tenant may expect the
   overlay service provider to provide data confidentiality as part of
   the service or a tenant may bring its own data confidentiality
   mechanisms like IPsec or TLS to protect the data end to end between
   its tenant systems.

   If an operator determines data confidentiality is necessary in their
   environment based on their risk analysis, for example as in multi-
   tenant environments, then an encryption mechanism SHOULD be used to
   encrypt the tenant data end to end between the NVEs. The NVEs may
   use existing well established encryption mechanisms such as IPsec,
   DTLS, etc.

6.1.1. Inter-Data Center Traffic

   A tenant system in a customer premises (private data center) may want
   to connect to tenant systems on their tenant overlay network in a
   public cloud data center or a tenant may want to have its tenant
   systems located in multiple geographically separated data centers for
   high availability. Geneve data traffic between tenant systems across
   such separated networks should be protected from threats when
   traversing public networks. Any Geneve overlay data leaving the data
   center network beyond the operator's security domain SHOULD be
   secured by encryption mechanisms such as IPsec or other VPN
   mechanisms to protect the communications between the NVEs when they
   are geographically separated over untrusted network links.
   Specification of data protection mechanisms employed between data
   centers is beyond the scope of this document.

6.2. Data Integrity

   Geneve encapsulation is used between NVEs to establish overlay
   tunnels over an existing IP underlay network. In a multi-tenant data
   center, a rogue or compromised tenant system may try to launch a
   passive attack such as monitoring the traffic of other tenants, or an
   active attack such as injecting unauthorized Geneve encapsulated
   traffic (for example, spoofed or replayed packets) into the network.
   To prevent such attacks, an NVE MUST NOT propagate Geneve packets
   beyond the NVE to tenant systems and SHOULD employ packet filtering
   mechanisms so as not to forward unauthorized traffic between tenant
   systems in different tenant networks.

   A compromised network node or a transit device within a data center
   may launch an active attack trying to tamper with the Geneve packet
   data between NVEs. Malicious tampering with Geneve header fields may
   cause the packet from one tenant to be forwarded to a different
   tenant network. If an operator determines the possibility of such a
   threat in their environment, the operator may choose to employ data
   integrity mechanisms between NVEs. In order to prevent such risks, a
   data integrity mechanism SHOULD be used in such environments to
   protect the integrity of Geneve packets, including packet headers,
   options, and payload, on communications between NVE pairs. A
   cryptographic data protection mechanism such as IPsec may be used to
   provide data integrity protection. A data center operator may choose
   to deploy any other data integrity mechanisms as applicable and
   supported in their underlay networks.

6.3. Authentication of NVE Peers

   A rogue network device or a compromised NVE in a data center
   environment might be able to spoof Geneve packets as if they came
   from a legitimate NVE.
   In order to mitigate such a risk, an operator SHOULD use an
   authentication mechanism, such as IPsec, to ensure that the Geneve
   packet originated from the intended NVE peer, in environments where
   the operator determines that spoofing or rogue devices are a
   potential threat. Other simpler source checks such as ingress
   filtering for VLAN/MAC/IP address, reverse path forwarding checks,
   etc., may be used in certain trusted environments to ensure Geneve
   packets originated from the intended NVE peer.

6.4. Options Interpretation by Transit Devices

   Options, if present in the packet, are generated and terminated by
   tunnel endpoints. As indicated in Section 2.2.1, transit devices may
   interpret the options. However, if the packet is protected by tunnel
   endpoint to tunnel endpoint encryption, for example through IPsec,
   transit devices will not have visibility into the Geneve header or
   options in the packet. In such cases, transit devices MUST handle
   Geneve packets as any other IP packet and maintain consistent
   forwarding behavior. In cases where options are interpreted by
   transit devices, the operator MUST ensure that transit devices are
   trusted and not compromised. Implementation of a mechanism to ensure
   this trust is beyond the scope of this document.

6.5. Multicast/Broadcast

   In typical data center networks where IP multicasting is not
   supported in the underlay network, multicasting may be supported
   using multiple unicast tunnels. The same security requirements as
   described in the above sections can be used to protect Geneve
   communications between NVE peers.
   If IP multicasting is supported in the underlay network and the
   operator chooses to use it for multicast traffic among tunnel
   endpoints, then the operator in such environments may use data
   protection mechanisms such as IPsec with Multicast extensions
   [RFC5374] to protect multicast traffic among Geneve NVE groups.

6.6. Control Plane Communications

   A Network Virtualization Authority (NVA) as outlined in [RFC8014] may
   be used as a control plane for configuring and managing the Geneve
   NVEs. The data center operator is expected to use security
   mechanisms to protect the communications between the NVA and NVEs and
   use authentication mechanisms to detect any rogue or compromised NVEs
   within their administrative domain. Data protection mechanisms for
   control plane communication or authentication mechanisms between the
   NVA and the NVEs are beyond the scope of this document.

7. IANA Considerations

   IANA has allocated UDP port 6081 as the well-known destination port
   for Geneve. Upon publication, the registry should be updated to cite
   this document. The original request was:

      Service Name: geneve
      Transport Protocol(s): UDP
      Assignee: Jesse Gross
      Contact: Jesse Gross
      Description: Generic Network Virtualization Encapsulation (Geneve)
      Reference: This document
      Port Number: 6081

   In addition, IANA is requested to create a "Geneve Option Class"
   registry to allocate Option Classes. This shall be a registry of
   16-bit hexadecimal values along with descriptive strings. The
   identifiers 0x0000-0x00FF are to be reserved for standardized options
   for allocation by IETF Review [RFC8126] and 0xFFF0-0xFFFF for
   Experimental Use. Otherwise, identifiers are to be assigned to any
   organization with an interest in creating Geneve options on a First
   Come First Served basis.
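   As an illustration only, the allocation ranges described above could
   be checked in code along the following lines. This is a minimal
   sketch in Python; the names GENEVE_UDP_PORT and option_class_policy
   and the returned policy strings are hypothetical helpers introduced
   here, not part of this document or of any IANA interface.

```python
# Well-known UDP destination port allocated by IANA for Geneve.
GENEVE_UDP_PORT = 6081


def option_class_policy(option_class: int) -> str:
    """Return the allocation policy for a 16-bit Geneve Option Class.

    Hypothetical helper: the function name and the returned strings are
    illustrative assumptions, mirroring the ranges given in the text.
    """
    if not 0 <= option_class <= 0xFFFF:
        raise ValueError("Option Class must be a 16-bit value")
    if option_class <= 0x00FF:
        # 0x0000-0x00FF: reserved for standardized options.
        return "IETF Review"
    if option_class >= 0xFFF0:
        # 0xFFF0-0xFFFF: reserved for experiments.
        return "Experimental Use"
    # Everything in between is assigned on request.
    return "First Come First Served"
```

   A receiver could, for example, use such a check to log whether an
   unrecognized option falls in the experimental range.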
   The registry is to be populated with the following initial values:

   +----------------+--------------------------------------+
   | Option Class   | Description                          |
   +----------------+--------------------------------------+
   | 0x0000..0x00FF | Unassigned - IETF Review             |
   | 0x0100         | Linux                                |
   | 0x0101         | Open vSwitch (OVS)                   |
   | 0x0102         | Open Virtual Networking (OVN)        |
   | 0x0103         | In-band Network Telemetry (INT)      |
   | 0x0104         | VMware, Inc.                         |
   | 0x0105         | Amazon.com, Inc.                     |
   | 0x0106         | Cisco Systems, Inc.                  |
   | 0x0107         | Oracle Corporation                   |
   | 0x0108..0x0110 | Amazon.com, Inc.                     |
   | 0x0111..0xFFEF | Unassigned - First Come First Served |
   | 0xFFF0..0xFFFF | Experimental                         |
   +----------------+--------------------------------------+

8. Contributors

   The following individuals were authors of an earlier version of this
   document and made significant contributions:

   Pankaj Garg
   Microsoft Corporation
   1 Microsoft Way
   Redmond, WA 98052
   USA

   Email: pankajg@microsoft.com

   Chris Wright
   Red Hat Inc.
   1801 Varsity Drive
   Raleigh, NC 27606
   USA

   Email: chrisw@redhat.com

   Kenneth Duda
   Arista Networks
   5453 Great America Parkway
   Santa Clara, CA 95054
   USA

   Email: kduda@arista.com

   Dinesh G. Dutt
   Independent

   Email: didutt@gmail.com

   Jon Hudson
   Independent

   Email: jon.hudson@gmail.com

   Ariel Hendel
   Facebook, Inc.
   1 Hacker Way
   Menlo Park, CA 94025
   USA

   Email: ahendel@fb.com

9. Acknowledgements

   The authors wish to thank Martin Casado, Bruce Davie and Dave Thaler
   for their input, feedback, and helpful suggestions.

   The authors would like to thank Magnus Nystrom for his reviews and
   feedback.

   Thanks to Daniel Migault, Anoop Ghanwani, Greg Mirsky, Puneet
   Agarwal, and Tal Mizrahi for their reviews, comments and feedback.

   The authors would like to thank David Black for his detailed reviews
   and valuable inputs.

   Thanks to Sami Boutros for his inputs and helpful feedback.

   The authors would like to thank Matthew Bocci, Sam Aldrin, Benson
   Schliesser, Martin Vigoureux, and Alia Atlas for their guidance
   throughout the process.

10. References

10.1. Normative References

   [RFC0768]  Postel, J., "User Datagram Protocol", STD 6, RFC 768,
              DOI 10.17487/RFC0768, August 1980.

   [RFC0792]  Postel, J., "Internet Control Message Protocol", STD 5,
              RFC 792, DOI 10.17487/RFC0792, September 1981.

   [RFC1112]  Deering, S., "Host extensions for IP multicasting", STD 5,
              RFC 1112, DOI 10.17487/RFC1112, August 1989.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC4443]  Conta, A., Deering, S., and M. Gupta, Ed., "Internet
              Control Message Protocol (ICMPv6) for the Internet
              Protocol Version 6 (IPv6) Specification", STD 89,
              RFC 4443, DOI 10.17487/RFC4443, March 2006.

   [RFC6935]  Eubanks, M., Chimento, P., and M. Westerlund, "IPv6 and
              UDP Checksums for Tunneled Packets", RFC 6935,
              DOI 10.17487/RFC6935, April 2013.

   [RFC6936]  Fairhurst, G. and M. Westerlund, "Applicability Statement
              for the Use of IPv6 UDP Datagrams with Zero Checksums",
              RFC 6936, DOI 10.17487/RFC6936, April 2013.

   [RFC8085]  Eggert, L., Fairhurst, G., and G. Shepherd, "UDP Usage
              Guidelines", BCP 145, RFC 8085, DOI 10.17487/RFC8085,
              March 2017.

   [RFC8126]  Cotton, M., Leiba, B., and T. Narten, "Guidelines for
              Writing an IANA Considerations Section in RFCs", BCP 26,
              RFC 8126, DOI 10.17487/RFC8126, June 2017.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

10.2. Informative References

   [ETYPES]   The IEEE Registration Authority, "IEEE 802 Numbers", 2013.

   [I-D.ietf-intarea-tunnels]
              Touch, J. and M. Townsley, "IP Tunnels in the Internet
              Architecture", draft-ietf-intarea-tunnels-09 (work in
              progress), July 2018.

   [I-D.ietf-nvo3-dataplane-requirements]
              Bitar, N., Lasserre, M., Balus, F., Morin, T., Jin, L.,
              and B. Khasnabish, "NVO3 Data Plane Requirements",
              draft-ietf-nvo3-dataplane-requirements-03 (work in
              progress), April 2014.

   [I-D.ietf-nvo3-encap]
              Boutros, S., "NVO3 Encapsulation Considerations",
              draft-ietf-nvo3-encap-02 (work in progress), September
              2018.

   [IEEE.802.1Q_2014]
              IEEE, "IEEE Standard for Local and metropolitan area
              networks--Bridges and Bridged Networks", IEEE 802.1Q-2014,
              DOI 10.1109/ieeestd.2014.6991462, December 2014.

   [RFC1191]  Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191,
              DOI 10.17487/RFC1191, November 1990.

   [RFC2003]  Perkins, C., "IP Encapsulation within IP", RFC 2003,
              DOI 10.17487/RFC2003, October 1996.

   [RFC2460]  Deering, S. and R. Hinden, "Internet Protocol, Version 6
              (IPv6) Specification", RFC 2460, DOI 10.17487/RFC2460,
              December 1998.

   [RFC2983]  Black, D., "Differentiated Services and Tunnels",
              RFC 2983, DOI 10.17487/RFC2983, October 2000.

   [RFC3031]  Rosen, E., Viswanathan, A., and R. Callon, "Multiprotocol
              Label Switching Architecture", RFC 3031,
              DOI 10.17487/RFC3031, January 2001.

   [RFC3552]  Rescorla, E. and B. Korver, "Guidelines for Writing RFC
              Text on Security Considerations", BCP 72, RFC 3552,
              DOI 10.17487/RFC3552, July 2003.

   [RFC3985]  Bryant, S., Ed. and P. Pate, Ed., "Pseudo Wire Emulation
              Edge-to-Edge (PWE3) Architecture", RFC 3985,
              DOI 10.17487/RFC3985, March 2005.

   [RFC4301]  Kent, S. and K.
              Seo, "Security Architecture for the Internet Protocol",
              RFC 4301, DOI 10.17487/RFC4301, December 2005.

   [RFC5374]  Weis, B., Gross, G., and D. Ignjatic, "Multicast
              Extensions to the Security Architecture for the Internet
              Protocol", RFC 5374, DOI 10.17487/RFC5374, November 2008.

   [RFC6040]  Briscoe, B., "Tunnelling of Explicit Congestion
              Notification", RFC 6040, DOI 10.17487/RFC6040, November
              2010.

   [RFC7348]  Mahalingam, M., Dutt, D., Duda, K., Agarwal, P., Kreeger,
              L., Sridhar, T., Bursell, M., and C. Wright, "Virtual
              eXtensible Local Area Network (VXLAN): A Framework for
              Overlaying Virtualized Layer 2 Networks over Layer 3
              Networks", RFC 7348, DOI 10.17487/RFC7348, August 2014.

   [RFC7365]  Lasserre, M., Balus, F., Morin, T., Bitar, N., and Y.
              Rekhter, "Framework for Data Center (DC) Network
              Virtualization", RFC 7365, DOI 10.17487/RFC7365, October
              2014.

   [RFC7637]  Garg, P., Ed. and Y. Wang, Ed., "NVGRE: Network
              Virtualization Using Generic Routing Encapsulation",
              RFC 7637, DOI 10.17487/RFC7637, September 2015.

   [RFC8014]  Black, D., Hudson, J., Kreeger, L., Lasserre, M., and T.
              Narten, "An Architecture for Data-Center Network
              Virtualization over Layer 3 (NVO3)", RFC 8014,
              DOI 10.17487/RFC8014, December 2016.

   [RFC8086]  Yong, L., Ed., Crabbe, E., Xu, X., and T. Herbert,
              "GRE-in-UDP Encapsulation", RFC 8086,
              DOI 10.17487/RFC8086, March 2017.

   [RFC8201]  McCann, J., Deering, S., Mogul, J., and R. Hinden, Ed.,
              "Path MTU Discovery for IP version 6", STD 87, RFC 8201,
              DOI 10.17487/RFC8201, July 2017.

   [RFC8293]  Ghanwani, A., Dunbar, L., McBride, M., Bannai, V., and R.
              Krishnan, "A Framework for Multicast in Network
              Virtualization over Layer 3", RFC 8293,
              DOI 10.17487/RFC8293, January 2018.

   [VL2]      "VL2: A Scalable and Flexible Data Center Network", ACM
              SIGCOMM Computer Communication Review,
              DOI 10.1145/1594977.1592576, 2009.

Authors' Addresses

   Jesse Gross (editor)

   Email: jesse@kernel.org

   Ilango Ganga (editor)
   Intel Corporation
   2200 Mission College Blvd.
   Santa Clara, CA 95054
   USA

   Email: ilango.s.ganga@intel.com

   T. Sridhar (editor)
   VMware, Inc.
   3401 Hillview Ave.
   Palo Alto, CA 94304
   USA

   Email: tsridhar@vmware.com