idnits 2.17.1 draft-ietf-nvo3-encap-01.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack an IANA Considerations section. (See Section 2.2 of https://www.ietf.org/id-info/checklist for how to handle the case when there are no actions for IANA.) ** There are 46 instances of too long lines in the document, the longest one being 2 characters in excess of 72. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year == Line 404 has weird spacing: '...of TLVs from ...' == Line 534 has weird spacing: '...lso, if the...' -- Couldn't find a document date in the document -- date freshness check skipped. 
Checking references for intended status: Informational ---------------------------------------------------------------------------- == Missing Reference: 'RFC2119' is mentioned on line 176, but not defined == Missing Reference: 'I-D.ietf-nvo3-vxlan-gpe' is mentioned on line 749, but not defined == Missing Reference: 'I-D.herbert-gue-extensions' is mentioned on line 739, but not defined == Missing Reference: 'I-D.ietf-nvo3-gue' is mentioned on line 763, but not defined == Unused Reference: 'KEYWORDS' is defined on line 650, but no explicit reference was found in the text == Unused Reference: 'Geneve' is defined on line 655, but no explicit reference was found in the text == Unused Reference: 'GUE' is defined on line 657, but no explicit reference was found in the text == Unused Reference: 'NSH' is defined on line 658, but no explicit reference was found in the text == Unused Reference: 'VXLAN-GPE' is defined on line 659, but no explicit reference was found in the text == Outdated reference: A later version (-05) exists of draft-smith-vxlan-group-policy-03 Summary: 2 errors (**), 0 flaws (~~), 13 warnings (==), 1 comment (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 INTERNET-DRAFT Sami Boutros(Ed.) 3 Intended Status: Informational VMware 5 Ilango Ganga 6 Intel 8 Pankaj Garg 9 Microsoft 11 Rajeev Manur 12 Broadcom 14 Tal Mizrahi 15 Marvell 17 David Mozes 19 Erik Nordmark 21 Michael Smith 22 Cisco 24 Sam Aldrin 25 Google 27 Ignas Bagdonas 28 Equinix 30 Expires: April 27, 2018 October 24, 2017 32 NVO3 Encapsulation Considerations 33 draft-ietf-nvo3-encap-01 35 Abstract 37 As communicated by WG Chairs, the IETF NVO3 chairs and Routing Area 38 director have chartered a design team to take forward the 39 encapsulation discussion and see if there is potential to design a 40 common encapsulation that addresses the various technical concerns. 
42 There are implications of different encapsulations in real 43 environments consisting of both software and hardware implementations 44 and spanning multiple data centers. For example, OAM functions such 45 as path MTU discovery become challenging with multiple encapsulations 46 along the data path. 48 The design team recommend Geneve with few modifications as the common 49 encapsulation, more details are described in section 7. 51 Status of this Memo 53 This Internet-Draft is submitted to IETF in full conformance with the 54 provisions of BCP 78 and BCP 79. 56 Internet-Drafts are working documents of the Internet Engineering 57 Task Force (IETF), its areas, and its working groups. Note that 58 other groups may also distribute working documents as 59 Internet-Drafts. 61 Internet-Drafts are draft documents valid for a maximum of six months 62 and may be updated, replaced, or obsoleted by other documents at any 63 time. It is inappropriate to use Internet-Drafts as reference 64 material or to cite them other than as "work in progress." 66 The list of current Internet-Drafts can be accessed at 67 http://www.ietf.org/1id-abstracts.html 69 The list of Internet-Draft Shadow Directories can be accessed at 70 http://www.ietf.org/shadow.html 72 Copyright and License Notice 74 Copyright (c) 2017 IETF Trust and the persons identified as the 75 document authors. All rights reserved. 77 This document is subject to BCP 78 and the IETF Trust's Legal 78 Provisions Relating to IETF Documents 79 (http://trustee.ietf.org/license-info) in effect on the date of 80 publication of this document. Please review these documents 81 carefully, as they describe your rights and restrictions with respect 82 to this document. Code Components extracted from this document must 83 include Simplified BSD License text as described in Section 4.e of 84 the Trust Legal Provisions and are provided without warranty as 85 described in the Simplified BSD License. 87 Table of Contents 89 1. 
Problem Statement
90 2. Design Team Goals
91 3. Terminology
92 4. Abbreviations
93 5. Issues with current Encapsulations
94 5.1 Geneve
95 5.2 GUE
96 5.3 VXLAN-GPE
97 6. Common Encapsulation Considerations
98 6.1 Current Encapsulations
99 6.2 Useful Extensions Use cases
100 6.2.1. Telemetry extensions.
101 6.2.2. Security/Integrity extensions
102 6.2.3. Group Based Policy
103 6.3 Hardware Considerations
104 6.4 Extension Size
105 6.5 Extension Ordering
106 6.6 TLV vs Bit Fields
107 6.7 Control Plane Considerations
108 6.8 Split NVE
109 6.9 Larger VNI Considerations
110 6.10 NAT traversal Considerations
111 7. Design team recommendations
112 8. Acknowledgements
113 9. Security Considerations
114 10. References
115 10.1 Normative References
116 10.2 Informative References
117 11.
Appendix A
118 11.1. Overview
119 11.2. Extensibility
120 11.2.1. Native Extensibility Support
121 11.2.2. Extension Parsing
122 11.2.3. Critical Extensions
123 11.2.4. Maximal Header Length
124 11.3. Encapsulation Header
125 11.3.1. Virtual Network Identifier (VNI)
126 11.3.2. Next Protocol
127 11.3.3. Other Header Fields
128 11.4. Comparison Summary
129 Authors' Addresses (In alphabetical order)
131 1. Problem Statement 133 As communicated by WG Chairs, the NVO3 WG charter states that it may 134 produce requirements for network virtualization data planes based on 135 encapsulation of virtual network traffic over an IP-based underlay 136 data plane. Such requirements should consider OAM and security. Based 137 on these requirements the WG will select, extend, and/or develop one 138 or more data plane encapsulation format(s). 140 This has led to drafts describing three encapsulations being adopted 141 by the working group: 143 - draft-ietf-nvo3-geneve-03 145 - draft-ietf-nvo3-gue-04 147 - draft-ietf-nvo3-vxlan-gpe-02 149 Discussion on the list and in face-to-face meetings has identified a 150 number of technical problems with each of these encapsulations. 151 Furthermore, there was clear consensus at the IETF meeting in Berlin 152 that it is undesirable for the working group to progress more than 153 one data plane encapsulation. Although consensus could not be reached 154 on the list, the overall consensus was for a single encapsulation 155 (RFC2418, Section 3.3).
Nonetheless, there has been resistance to 156 converging on a single encapsulation format. 158 2. Design Team Goals 160 As communicated by WG Chairs, the design team should take one of the 161 proposed encapsulations and enhance it to address the technical 162 concerns. The simple evolution of deployed networks as well as 163 applicability to all locations in the NVO3 architecture are goals. 164 The DT should specifically avoid a design that is burdensome on 165 hardware implementations, but should allow future extensibility. The 166 chosen design should also operate well with ICMP and in ECMP 167 environments. If further extensibility is required, then it should be 168 done in such a manner that it does not require the consent of an 169 entity outside of the IETF. 171 3. Terminology 173 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 174 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 175 document are to be interpreted as described in RFC 2119 [RFC2119]. 177 4. Abbreviations 178 NVO3 Network Virtualization Overlays over Layer 3 180 OAM Operations, Administration, and Maintenance 182 TLV Type, Length, and Value 184 VNI Virtual Network Identifier 186 NVE Network Virtualization Edge 188 NVA Network Virtualization Authority 190 NIC Network interface card 192 Transit device Underlay network devices between NVE(s). 194 5. Issues with current Encapsulations 196 As summarized by WG Chairs. 198 5.1 Geneve 200 - Can't be implemented cost-effectively in all use cases because 201 the variable-length header and the order of the TLVs make it costly (in 202 terms of number of gates) to implement in hardware. 204 - The header doesn't fit into the largest commonly available parse buffer 205 (256 bytes in a NIC). Doubling the buffer size cannot be justified unless it is 206 mandatory for hardware to process additional option fields.
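As a rough illustration of the parse-buffer concern in Section 5.1, the following sketch adds up the outer headers that sit in front of the inner payload. It assumes the 260-octet maximal Geneve header given in Section 11.2.4 and the usual minimum sizes of the untagged Ethernet, IPv4, IPv6, and UDP headers; the constant and function names are illustrative only and come from no specification.

```python
# Illustrative arithmetic only; the names below are not defined by any draft.
ETHERNET = 14        # untagged Ethernet II header, in bytes
IPV4 = 20            # IPv4 header without options
IPV6 = 40            # fixed IPv6 header
UDP = 8
GENEVE_MIN = 8       # Geneve base header with no options
GENEVE_MAX = 260     # maximal Geneve header, per Section 11.2.4
PARSE_BUFFER = 256   # parse buffer size cited for NICs in Section 5.1


def outer_bytes(ip_header: int, encap_header: int) -> int:
    """Bytes a parser must cover before it reaches the inner payload."""
    return ETHERNET + ip_header + UDP + encap_header


for name, ip in (("IPv4", IPV4), ("IPv6", IPV6)):
    total = outer_bytes(ip, GENEVE_MAX)
    print(f"{name} underlay, maximal Geneve header: {total} bytes, "
          f"fits in buffer: {total <= PARSE_BUFFER}")
```

Under these assumptions a maximal Geneve header alone exceeds the 256-byte buffer, while a header with few or no options leaves ample room, which is one reason the later sections look at limiting the options sent to a given receiver.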
208 5.2 GUE 210 - There were a significant number of objections related to the 211 complexity of implementation in hardware, similar to those noted for 212 Geneve above. 214 5.3 VXLAN-GPE 216 - GPE is not day-1 backwards compatible with VXLAN. Although the 217 frame format is similar, it uses a different UDP port, so it would 218 require changes to existing implementations even if the rest of the 219 GPE frame is the same. 221 - GPE is insufficiently extensible. Numerous extensions and options 222 have been designed for GUE and Geneve. Note that these have not yet 223 been validated by the WG. 225 - Security, e.g., of the VNI, has not been addressed by GPE. Although a 226 shim header could be used for security and other extensions, this has 227 not been defined yet and its implications for offloading in NICs are 228 not understood. 230 6. Common Encapsulation Considerations 232 6.1 Current Encapsulations 234 Appendix A includes a detailed comparison between the three proposed 235 encapsulations. The comparison indicates several common properties, 236 but also three major differences among the encapsulations: 238 - Extensibility: Geneve and GUE were defined with built-in 239 extensibility, while VXLAN-GPE is not inherently extensible. Note 240 that any of the three encapsulations can be extended using the 241 Network Service Header (NSH). 243 - Extension method: Geneve is extensible using Type/Length/Value 244 (TLV) fields, while GUE uses a small set of possible extensions, and 245 a set of flags that indicate which extension is present. 247 - Length field: Geneve and GUE include a Length field, indicating the 248 length of the encapsulation header, while VXLAN-GPE does not include 249 such a field. 251 6.2 Useful Extensions Use cases 253 Non-vendor-specific TLVs MUST follow the standardization process. The 254 following use cases for extensions show that there is a strong 255 requirement to support variable-length extensions with possibly 256 different subtypes.
258 6.2.1. Telemetry extensions. 260 In several scenarios it is beneficial to make information about the 261 path a packet took through the network or through a network device, as 262 well as associated telemetry information, available to the operator. 264 This includes not only tasks like debugging, troubleshooting, 265 network planning, and network optimization but also policy or 266 service level agreement compliance checks. 268 Packet scheduling algorithms, especially for balancing traffic across 269 equal cost paths or links, often leverage information contained 270 within the packet, such as protocol number, IP-address or MAC- 271 address. Probe packets would thus either need to be sent from the 272 exact same endpoints with the exact same parameters, or probe packets 273 would need to be artificially constructed as "fake" packets and 274 inserted along the path. Both approaches are often not feasible from 275 an operational perspective, be it that access to the end-system is 276 not feasible, or that the diversity of parameters and associated 277 probe packets to be created is simply too large. An in-band telemetry 278 mechanism in extensions is an alternative in those cases. 280 6.2.2. Security/Integrity extensions 282 Since the currently proposed NVO3 encapsulations do not protect their 283 headers, a single-bit corruption in the VNI field could deliver a 284 packet to the wrong tenant. Extensions are needed to support any 285 more sophisticated security mechanism. 287 The possibility of VNI spoofing with an NVO3 protocol is exacerbated 288 by the use of UDP. Systems typically have no restrictions on 289 applications being able to send to any UDP port, so an unprivileged 290 application can trivially spoof, for instance, VXLAN packets, 291 including ones with arbitrary VNIs. 293 One can envision HMAC-like support in some NVO3 extension to 294 authenticate the header and the outer IP addresses, thereby 295 preventing attackers from injecting packets with spoofed VNIs.
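The HMAC-like idea above can be sketched as follows. This is only an illustration under stated assumptions: no such NVO3 extension is defined in this document, so the choice of HMAC-SHA-256, the input ordering (source address, destination address, encapsulation header), the function names, and the sample 8-octet header bytes are all hypothetical.

```python
import hashlib
import hmac
import ipaddress


def header_tag(key: bytes, src_ip: str, dst_ip: str, encap_header: bytes) -> bytes:
    """Tag covering the outer IP addresses and the encapsulation header."""
    # Hypothetical input layout: packed source address, packed destination
    # address, then the encapsulation header bytes (which include the VNI).
    msg = (ipaddress.ip_address(src_ip).packed
           + ipaddress.ip_address(dst_ip).packed
           + encap_header)
    return hmac.new(key, msg, hashlib.sha256).digest()


def verify(key: bytes, src_ip: str, dst_ip: str,
           encap_header: bytes, tag: bytes) -> bool:
    """Constant-time check of a received tag against the recomputed one."""
    expected = header_tag(key, src_ip, dst_ip, encap_header)
    return hmac.compare_digest(expected, tag)


key = b"shared-secret-for-illustration"
header = bytes.fromhex("0000655800002a00")   # made-up header carrying VNI 0x00002a
spoofed = bytes.fromhex("0000655800002b00")  # same header with a different VNI
tag = header_tag(key, "192.0.2.1", "192.0.2.2", header)

print(verify(key, "192.0.2.1", "192.0.2.2", header, tag))   # True
print(verify(key, "192.0.2.1", "192.0.2.2", spoofed, tag))  # False
```

A sender without the key cannot produce a valid tag for a header with a different VNI, which is the property the paragraph above is after. Key distribution and the encoding of the tag as an extension are left open here.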
Another aspect of security is payload security. Essentially this is 298 to make packets that look like IP|UDP|NVO3 Encap|DTLS/IPSEC-ESP 299 Extension|payload. This is attractive since we still have the UDP header 300 for ECMP, the NVO3 header is in plain text so it can be read by 301 network elements, and different security or other payload transforms 302 can be supported on a single UDP port (we don't need a separate UDP 303 port for DTLS/IPSEC). 305 6.2.3. Group Based Policy 307 Another use case would be to carry the Group Based Policy (GBP) 308 source group information within an NVO3 header extension in a similar 309 manner as has been implemented for VXLAN [VXLAN-GBP]. This allows 310 various forms of policy such as access control and QoS to be applied 311 between abstract groups rather than coupled to specific endpoint 312 addresses. 314 6.3 Hardware Considerations 316 Hardware restrictions should be taken into consideration along with 317 future hardware enhancements that may provide more flexible metadata 318 processing. However, the set of options that need to and will be 319 implemented in hardware will be a subset of what is implemented in 320 software, since software NVEs are likely to grow features, and hence 321 option support, at a more rapid rate. 323 We note that it is hard to predict which options will be implemented 324 in which piece of hardware and when. That depends on whether the 325 hardware will be in the form of a NIC providing increasing offload 326 capabilities to software NVEs, or a switch chip being used as an NVE 327 gateway towards non-NVO3 parts of the network, or even a transit 328 device that participates in the NVO3 dataplane, e.g., for OAM 329 purposes.
331 A result of this is that it doesn't look useful to prescribe some 332 order of the options so that the ones that are likely to be 333 implemented in hardware come first; we can't decide such an order 334 when we define the options. However, a control plane can enforce such 335 an order for some hardware implementations. 337 We do know that hardware initially needs to be able to efficiently 338 skip over the NVO3 header to find the inner payload. That is needed 339 both for NICs doing, e.g., TCP offload and for transit devices and NVEs 340 applying policy/ACLs to the inner payload. 342 6.4 Extension Size 344 Extension header length has a significant impact on hardware and 345 software implementations. A total header length that is too small 346 will unnecessarily constrain software flexibility. A total header 347 length that is too large will place a nontrivial cost on hardware 348 implementations. Thus, the design team recommends that a 349 minimum and a maximum total extension header length be selected. The 350 maximum total header length is determined by the bits allocated for 351 the total extension header length field. The risk with this approach 352 is that it may be difficult to extend the total header size in the 353 future. The minimum total header length is determined by a 354 requirement in the specifications that all implementations must meet. 355 The risk with this approach is that all implementations will only 356 implement the minimum total header length, which would then become the 357 de facto maximum total header length. The recommended minimum total 358 header length is 64 bytes. 360 The size of a single extension should always be 4-byte aligned. 362 The maximum length of a single option should be large enough to meet 363 the different extension use case requirements, e.g., in-band telemetry 364 and future use.
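The 4-byte alignment recommendation in Section 6.4 can be illustrated with a small padding helper. This is a sketch only: the function names are hypothetical, and zero-filled padding and a length expressed in 4-byte words are assumptions made for the example, not requirements stated in this document.

```python
def pad_to_4(value: bytes) -> bytes:
    """Pad an option value with zero bytes so its length is a multiple of 4."""
    return value + b"\x00" * (-len(value) % 4)


def length_in_words(value: bytes) -> int:
    """Length of the padded value in 4-byte words, as a length field might carry it."""
    return len(pad_to_4(value)) // 4


print(pad_to_4(b"abcde"))         # b'abcde\x00\x00\x00'
print(length_in_words(b"abcde"))  # 2
```

Keeping every extension on a 4-byte boundary means a parser can step from one extension to the next using word counts alone, without per-extension byte-offset arithmetic.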
366 6.5 Extension Ordering 368 To support hardware nodes at the tunnel endpoint or at the 369 transit that can process only one or a few extension TLVs in TCAM, a 370 control plane in such a deployment can signal a capability to ensure 371 that a specific TLV will always appear in a specific position, for example as the 372 first one in the packet. 374 The order of the TLVs should be HW friendly for both the sender and 375 the receiver and possibly the transit node too. 377 Transit nodes do not participate in control plane communication 378 between the end points and are not required to process the options. 379 However, if they do, they need to process only a small subset of the 380 options that will be consumed by tunnel endpoints. 382 6.6 TLV vs Bit Fields 384 If there is a well-known initial set of options that are likely to be 385 implemented in software and in hardware, it can be efficient to use 386 the bit-field approach as in GUE. However, as described in section 387 6.3, if options are added over time and different subsets of options 388 are likely to be implemented in different pieces of hardware, then it 389 would be hard for the IETF to specify which options should get the 390 early bit fields. TLVs are a lot more flexible, which avoids the need 391 to determine the relative importance of different options. However, 392 general TLVs with arbitrary order, size, and repetition 393 are difficult to implement in hardware. A middle ground is to 394 use TLVs with restrictions on the size and alignment, observing that 395 individual TLVs can have a fixed length, and support in the control 396 plane such that an NVE will only receive options that it needs and 397 implements. The control plane approach can potentially be used to 398 control the order of the TLVs sent to a particular NVE.
Note that 399 transit devices are not likely to participate in the control plane; 400 hence, to the extent that they need to participate in option 401 processing, they need more effort. But transit devices would have 402 issues with future GUE bits being defined for future options as well. 404 A benefit of TLVs from a HW perspective is that they are self- 405 describing, i.e., all the information is in the TLV. In a bit-field 406 approach the hardware needs to look up the bit to determine the 407 length of the data associated with the bit through some separate 408 table, which would add hardware complexity. 410 There are use cases where multiple modules of software are running on 411 an NVE. These can be modules such as a diagnostic module by one vendor 412 that does packet sampling and another module from a different vendor 413 that implements a firewall. Using a TLV format, it is easier to have 414 different software modules process different TLVs, which could be 415 standard extensions or vendor-specific extensions defined by the 416 different vendors, without conflicting with each other. This can help 417 with hardware modularity as well. There are some implementations with 418 options that allow different software, like MAC learning and security, 419 to handle different options. 421 6.7 Control Plane Considerations 423 Given that we want to allow large flexibility and extensibility for, 424 e.g., software NVEs, yet be able to support key extensions in less 425 flexible, e.g., hardware NVEs, it is useful to consider the control 426 plane. By control plane in this context we mean both protocols such 427 as EVPN and others, and also deployment-specific configuration.
429 If each NVE can express in the control plane that they only care 430 about particular extensions (could be a single extension, or a few), 431 and the source NVEs only include requested extensions in the NVO3 432 packets, then the target NVE can both use a simpler parser (e.g., a 433 TCAM might be usable to look for a single NVO3 extension) and the 434 depth of the inner payload in the NVO3 packet will be minimized. 435 Furthermore, if the target NVE cares about a few extensions and can 436 express in the control plane the desired order of those extensions in 437 the NVO3 packets, then it can provide useful functionality with 438 minimal hardware requirements. 440 Note that transit devices that are not aware of the NVO3 extensions 441 somewhat benefit from such an approach, since the inner payload is 442 less deep in the packet if no extraneous extensions are included in 443 the packet. However, in general a transit device is not likely to 444 participate in the NVO3 control plane. (However, configuration 445 mechanisms can take into account limitations of the transit devices 446 used in particular deployments.) 448 Note that in this approach different NVEs could desire different 449 (sets of) extensions, which means that the source NVE needs to be 450 able to place different sets of extensions in different NVO3 packets, 451 and perhaps in different order. It also assumes that underlay 452 multicast or replication servers are not used together with NVO3 453 extensions. 455 There is a need to consider mandatory extensions versus optional 456 extensions. Mandatory extensions require the receiver to drop the 457 packet if the extension is unknown. A control plane mechanism can 458 prevent the need for dropping unknown extensions, since they would 459 not be included to targets that do not support them. 461 The control planes defined today need to add the ability to describe 462 the different encapsulations. 
Thus perhaps EVPN, and any other 463 control plane protocol that the IETF defines, should have a way to 464 enumerate the supported NVO3 extensions and their order. 466 The WG should consider developing a separate draft on guidance for 467 option processing and control plane participation. This should 468 provide examples/guidance on a range of usage models and deployment 469 scenarios for specific options and ordering that are relevant for 470 that specific deployment. This includes end points and middle boxes 471 using the options. So, having the control plane negotiate the 472 constraints is the most appropriate and flexible way to address these 473 requirements. 475 6.8 Split NVE 477 If the working group sees a need for having the hosts send and 478 receive options in a split NVE case, this is possible using any of 479 the existing extensible encapsulations (Geneve, GUE, GPE+NSH) by 480 defining a way to carry those over other transports. NSH can already 481 be used over different transports. 483 If we need to do this with other encapsulations, it can be done by 484 defining an Ethertype for other encapsulations so that they can be 485 carried over Ethernet and 802.1Q. 487 If we need to carry other encapsulations over MPLS, it would require 488 an EVPN control plane to signal that the other encapsulation header + 489 options will be present in front of the L2 packet. The VNI can be 490 ignored in the header, and the MPLS label will be the one used to 491 identify the EVPN L2 instance. 493 6.9 Larger VNI Considerations 495 We discussed whether we should make the VNI 32 bits or larger. The 496 benefit of a 24-bit VNI would be to avoid unnecessary changes to 497 existing proposals and implementations, almost all of which, if not 498 all, use a 24-bit VNI. If we need a larger VNI, an extension can 499 be used to support that. 501 6.10 NAT traversal Considerations TBD 503 7.
Design team recommendations 505 We concluded that Geneve is most suitable as a starting point for 506 a proposed standard for network virtualization, for the following 507 reasons: 509 1. We studied whether the VNI should be in the base header or in extensions 510 and whether it should be 24-bit or 32-bit. The design team agreed 511 that the VNI is critical information for network virtualization and MUST 512 be present in all packets. The design team also agreed that a 24-bit VNI 513 matches the existing widely used encapsulation formats, i.e., VXLAN and 514 NVGRE, and hence is more suitable to use going forward. 516 2. Geneve has a total options length field that allows skipping over the 517 options for NIC offload operations, and allows transit devices to 518 view flow information in the inner payload. 520 3. We considered the option of using NSH with VXLAN-GPE but given 521 that NSH is targeted at service chaining and contains service 522 chaining information, it is less suitable for the network 523 virtualization use case. The other downside of VXLAN-GPE is the lack of 524 a header length field, which makes skipping over the headers 525 to process the inner payload more difficult. A Total Option Length is 526 present in Geneve. It is not possible to skip any options in the 527 middle with VXLAN-GPE. In principle a split between a base header and 528 a header with options is interesting (whether that options header is 529 NSH or some new header without ties to a service path). We explored 530 whether it would make sense to either use NSH for this, or define a 531 new NVO3 options header. However, we observed that this makes it 532 slightly harder to find the inner payload since the length field is 533 not in the NVO3 header itself. Thus one more field would have to be 534 extracted to compute the start of the inner payload.
Also, if the 535 experience with IPv6 extension headers is any guide, there would be 536 a risk that key pieces of hardware might not implement the options 537 header, resulting in future calls to deprecate its use. Making the 538 options part of the base NVO3 header has fewer of those issues. Even 539 though the implementation of any particular option cannot be 540 predicted ahead of time, the option mechanism and the ability to skip the 541 options are likely to be broadly implemented. 543 4. We compared the TLV and bit-field styles of extension, and it was 544 deemed that parsing either TLVs or bit-fields is expensive. While 545 bit-fields may be simpler to parse, they are also more restrictive and 546 require guessing which extensions will be widely implemented so that they 547 can get early bit assignments. Given that half the bits are already 548 assigned in GUE, a widely deployed extension may appear in a flag 549 extension, and this will require extra processing to dig the flag 550 out of the flag extension and then look for the extension itself. In 551 addition, bit-fields are not flexible enough to address the requirements 552 from OAM, telemetry, and security extensions for variable-length 553 options and different subtypes of the same option. While TLVs are more 554 flexible, a control plane can restrict the number of option TLVs as 555 well as the order and size of the TLVs to make it simpler for a 556 dataplane implementation to handle. 558 5. We briefly discussed the multi-vendor NVE case, and the need to allow 559 vendors to put their own extensions in the NVE header. This is 560 possible with TLVs. 562 6. We also agreed that the C bit in Geneve is helpful to allow a 563 receiver NVE to easily decide whether to process options or not. For 564 example, an optional extension such as a UUID-based packet trace 565 can be ignored by a receiver NVE, which makes it easy for the NVE 566 to skip over the options. Thus the C-bit remains as defined in 567 Geneve. 569 7.
There are already some extensions of varying sizes being discussed (see 570 section 6.2). By using Geneve options it is possible 571 to get in-band parameters like switch id, ingress port, egress port, 572 internal delay, and queue in a telemetry-defined extension TLV from 573 switches. It is also possible to add security extension TLVs like 574 HMAC and DTLS/IPSEC to authenticate the Geneve packet header and 575 secure the Geneve packet payload by software or hardware tunnel 576 endpoints. In addition, a Group Based Policy extension TLV can be 577 carried. 579 8. There are Geneve options implemented in production today. There 580 is also new HW supporting Geneve TLV parsing. In addition, the In-band 581 Telemetry (INT) specification being developed by P4.org illustrates 582 the option of INT metadata carried over Geneve. OVN/OVS have also 583 defined some option TLV(s) for Geneve. 585 9. The DT has addressed the usage models while considering the 586 requirements and implementations in general, including software 587 and hardware. 589 There seems to be interest in standardizing some well-known secure 590 option TLVs to secure the header and payload, to guarantee 591 encapsulation header integrity and tenant data privacy. The design 592 team recommends that the working group consider standardizing such 593 option(s). 595 We recommend the following enhancements to Geneve to make it more 596 suitable for hardware and yet provide the flexibility for software: 598 We propose text such as: while TLVs are more flexible, a 599 control plane can restrict the number of option TLVs as well as the 600 order and size of the TLVs to make it simpler for a data plane 601 implementation in software or hardware to handle. For example, there 602 may be some critical information such as a secure hash that must be 603 processed in a certain order at the lowest latency.
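The kind of control-plane restriction proposed above can be sketched from the sender's side. This is a minimal illustration under loud assumptions: no control-plane encoding is defined here, so the per-receiver "requested order" list, the "maximum number of options" limit, the option type codes, and the function name are all hypothetical.

```python
from typing import Dict, List, Tuple

Option = Tuple[int, bytes]  # (hypothetical option type code, option value)


def select_options(available: Dict[int, bytes],
                   requested_order: List[int],
                   max_options: int) -> List[Option]:
    """Keep only the options a receiver asked for, in its order, up to its limit."""
    chosen = [(t, available[t]) for t in requested_order if t in available]
    return chosen[:max_options]


# Options the source NVE is able to attach (type codes are made up).
available = {1: b"telemetry", 2: b"hmac-tag", 3: b"gbp"}

# A hardware NVE that wants the integrity option first and at most two options.
print(select_options(available, [2, 3], 2))
# [(2, b'hmac-tag'), (3, b'gbp')]

# A more constrained receiver that can process a single option.
print(select_options(available, [2, 1, 3], 1))
# [(2, b'hmac-tag')]
```

The source NVE applies a different restriction per remote NVE, so a constrained receiver only ever sees the subset, order, and count of TLVs it has declared it can handle.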
605 A control plane can negotiate a subset of option TLVs and a certain TLV 606 ordering, and can also limit the total number of option TLVs present 607 in the packet, for example, to accommodate hardware capable of processing 608 fewer options. Hence, the control planes need to have the ability to 609 describe the supported subset of TLVs and their order. 611 The Geneve draft could specify that the subset and order of option 612 TLVs should be configurable for each remote NVE in the absence of a 613 protocol control plane. 615 We recommend that Geneve follow the fragmentation recommendations for 616 overlay services like PWE3 and the L2/L3 VPN recommendations to guarantee 617 a larger MTU for the tunnel overhead 618 (https://tools.ietf.org/html/rfc3985#section-5.3). 620 We request that Geneve provide a recommendation for critical bit 621 processing; the text could describe how critical bits can be used with 622 a control plane specifying the critical options. 624 Given that there is a telemetry option use case for a length of 256 625 bytes, we recommend that Geneve increase the single TLV option length 626 to 256 bytes. 628 We request that Geneve address the OAM requirements for 629 alternate marking and for performance measurements that need 2 bits 630 in the header, and clarify the need for the current OAM bit in the 631 Geneve header. 633 We recommend that the WG work on security options for Geneve. 635 8. Acknowledgements 637 The authors would like to thank Tom Herbert for providing the 638 motivation for the Security/Integrity extension, and for his valuable 639 comments, and would like to thank T. Sridhar for his valuable 640 comments and feedback. 642 9. Security Considerations 644 This document does not introduce any additional security constraints. 646 10. References 648 10.1 Normative References 650 [KEYWORDS] Bradner, S., "Key words for use in RFCs to Indicate 651 Requirement Levels", BCP 14, RFC 2119, March 1997.
653  10.2 Informative References

655     [Geneve] Generic Network Virtualization Encapsulation
656     [I-D.ietf-nvo3-geneve]
657     [GUE] Generic UDP Encapsulation [I-D.ietf-nvo3-gue]
658     [NSH] Network Service Header [I-D.ietf-sfc-nsh]
659     [VXLAN-GPE] Virtual eXtensible Local Area Network - Generic Protocol
660     Extension [I-D.ietf-nvo3-vxlan-gpe]

662     [VXLAN-GBP] VXLAN Group Policy Option
663     [I-D.draft-smith-vxlan-group-policy-03]

665  11. Appendix A

667  11.1. Overview

669     This section presents a comparison of the three NVO3 encapsulation
670     proposals: Geneve, GUE, and VXLAN-GPE. All three encapsulations use
671     an outer UDP/IP transport. Geneve and VXLAN-GPE use an 8-octet
672     header, while GUE uses a 4-octet header. In addition to the base
673     header, optional extensions may be included in the encapsulation, as
674     discussed in Section 11.2 below.

676  11.2. Extensibility

678  11.2.1. Native Extensibility Support

680     The Geneve and GUE encapsulations both enable optional headers to be
681     incorporated at the end of the base encapsulation header.

683     VXLAN-GPE does not provide native support for header extensions.
684     However, as discussed in [I-D.ietf-nvo3-vxlan-gpe], extensibility can
685     be attained to some extent if the Network Service Header (NSH)
686     [I-D.ietf-sfc-nsh] is used immediately following the VXLAN-GPE header.
687     NSH supports either a fixed-size extension (MD Type 1) or a
688     variable-size TLV-based extension (MD Type 2). It should be noted
689     that NSH-over-VXLAN-GPE adds the overhead of the
690     8-octet NSH header on top of the VXLAN-GPE header.

692  11.2.2. Extension Parsing

694     The Geneve Variable Length Options are defined as
695     Type/Length/Value (TLV) extensions. Similarly, VXLAN-GPE, when using
696     NSH, can include NSH TLV-based extensions.
In contrast, GUE defines
697     a small set of possible extension fields (proposed in
698     [I-D.herbert-gue-extensions]), and a set of flags in the GUE header
699     that indicate, for each extension type, whether it is present.

701     TLV-based extensions, as defined in Geneve, provide the flexibility
702     for a large number of possible extension types. Similar behavior can
703     be supported in NSH-over-VXLAN-GPE when using MD Type 2. The
704     flag-based approach taken in GUE strives to simplify implementations
705     by defining a small number of possible extensions, used in a fixed
706     order.

708     The Geneve and GUE headers both include a length field, defining the
709     total length of the encapsulation, including the optional extensions.

711     The length field simplifies parsing by transit devices that skip
712     the encapsulation header without parsing its extensions.

714  11.2.3. Critical Extensions

716     The Geneve encapsulation header includes the 'C' field, which
717     indicates whether the current Geneve header includes critical
718     options, which must be parsed by the tunnel endpoint. If the endpoint
719     is not able to process a critical option, the packet is discarded.

721  11.2.4. Maximal Header Length

723     The maximal header length in Geneve, including options, is 260
724     octets. GUE defines the maximal header to be 128 octets. VXLAN-GPE
725     uses a fixed-length header of 8 octets, unless NSH-over-VXLAN-GPE is
726     used, yielding an encapsulation header of up to 264 octets.

728  11.3. Encapsulation Header

730  11.3.1. Virtual Network Identifier (VNI)

732     The Geneve and VXLAN-GPE headers both include a 24-bit VNI field.
733     GUE, on the other hand, enables the use of a 32-bit field called
734     VNID; this field is not included in the GUE header, but was defined
735     as an optional extension in [I-D.herbert-gue-extensions].

737     The VXLAN-GPE header includes the 'I' bit, indicating that the VNI
738     field is valid in the current header.
A similar indicator is defined
739     as a flag in the GUE header [I-D.herbert-gue-extensions].

741  11.3.2. Next Protocol

743     The three encapsulation headers include a field that specifies the
744     type of the next protocol header, which resides after the NVO3
745     encapsulation header. The Geneve header includes a 16-bit field that
746     uses the IEEE Ethertype convention. GUE uses an 8-bit field, which
747     uses the IANA Internet protocol numbering. The VXLAN-GPE header
748     incorporates an 8-bit Next Protocol field, using a VXLAN-GPE-specific
749     registry, defined in [I-D.ietf-nvo3-vxlan-gpe].

751     The VXLAN-GPE header also includes the 'P' bit, which explicitly
752     indicates whether the Next Protocol field is present in the current
753     header.

755  11.3.3. Other Header Fields

757     The OAM bit, which is defined in Geneve and in VXLAN-GPE, indicates
758     whether the current packet is an OAM packet. The GUE header includes
759     a similar field, but uses different terminology; the GUE 'C-bit'
760     specifies whether the current packet is a control packet. Note that
761     the GUE control bit can potentially be used in a large set of
762     protocols that are not OAM protocols. However, the control packet
763     examples discussed in [I-D.ietf-nvo3-gue] are OAM-related.

765     Each of the three NVO3 encapsulation headers includes a 2-bit Version
766     field, which is currently defined to be zero.

768     The Geneve and VXLAN-GPE headers include reserved fields; 14 bits in
769     the Geneve header, and 27 bits in the VXLAN-GPE header are reserved.

771  11.4. Comparison Summary

773     The following table summarizes the comparison between the three NVO3
774     encapsulations.
776     +----------------+----------------+----------------+----------------+
777     |                |     Geneve     |      GUE       |   VXLAN-GPE    |
778     +----------------+----------------+----------------+----------------+
779     | Outer transport|     UDP/IP     |     UDP/IP     |     UDP/IP     |
780     +----------------+----------------+----------------+----------------+
781     | Base header    |    8 octets    |    4 octets    |    8 octets    |
782     | length         |                |                |  (16 octets    |
783     |                |                |                |   using NSH)   |
784     +----------------+----------------+----------------+----------------+
785     | Extensibility  |Variable length |Extension fields| No native      |
786     |                |    options     |                | extensibility. |
787     |                |                |                | Extensible     |
788     |                |                |                | using NSH.     |
789     +----------------+----------------+----------------+----------------+
790     | Extension      |   TLV-based    |   Flag-based   |   TLV-based    |
791     | parsing method |                |                |(using NSH with |
792     |                |                |                |   MD Type 2)   |
793     +----------------+----------------+----------------+----------------+
794     | Extension      |    Variable    |     Fixed      |    Variable    |
795     | order          |                |                |  (using NSH)   |
796     +----------------+----------------+----------------+----------------+
797     | Length field   |       +        |       +        |       -        |
798     +----------------+----------------+----------------+----------------+
799     | Max Header     |   260 octets   |   128 octets   |    8 octets    |
800     | Length         |                |                |(264 using NSH) |
801     +----------------+----------------+----------------+----------------+
802     | Critical       |       +        |       -        |       -        |
803     | extension bit  |                |                |                |
804     +----------------+----------------+----------------+----------------+
805     | VNI field size |    24 bits     |    32 bits     |    24 bits     |
806     |                |                |  (extension)   |                |
807     +----------------+----------------+----------------+----------------+
808     | Next protocol  |    16 bits     |     8 bits     |     8 bits     |
809     | field          |   Ethertype    | Internet prot- |  New registry  |
810     |                |    registry    | ocol registry  |                |
811     +----------------+----------------+----------------+----------------+
812     | Next protocol  |       -        |       -        |       +        |
813     | indicator      |                |                |                |
814     +----------------+----------------+----------------+----------------+
815     | OAM / control  |    OAM bit     |  Control bit   |    OAM bit     |
816     | field          |                |                |                |
817     +----------------+----------------+----------------+----------------+
818     | Version field  |     2 bits     |     2 bits     |     2 bits     |
819     +----------------+----------------+----------------+----------------+
820     | Reserved bits  |    14 bits     |       -        |    27 bits     |
821     +----------------+----------------+----------------+----------------+

823     Figure 1: NVO3 Encapsulation Comparison

825  Authors' Addresses (In alphabetical order)

827     Sami Boutros
828     VMware
829     Email: sboutros@vmware.com

831     Ilango Ganga
832     Intel
833     Email: ilango.s.ganga@intel.com
834     Pankaj Garg
835     Microsoft
836     Email: pankajg@microsoft.com

838     Rajeev Manur
839     Broadcom
840     Email: rajeev.manur@broadcom.com

842     Tal Mizrahi
843     Marvell
844     Email: talmi@marvell.com

846     David Mozes
847     Email: mosesster@gmail.com

849     Erik Nordmark
850     Email: nordmark@sonic.net

852     Michael Smith
853     Cisco
854     Email: michsmit@cisco.com

856     Sam Aldrin
857     Google
858     Email: aldrin.ietf@gmail.com

860     Ignas Bagdonas
861     Equinix
862     Email: ibagdona.ietf@gmail.com