NVO3 Workgroup                                           S. Boutros, Ed.
Internet-Draft                                                     Ciena
Intended Status: Informational                          D. Eastlake, Ed.
                                                               Futurewei
Expires: December 8, 2021                                   June 9, 2021


                   NVO3 Encapsulation Considerations
                        draft-ietf-nvo3-encap-06

Abstract

As communicated by the WG Chairs, the IETF NVO3 chairs and Routing
Area director have chartered a design team to take forward the
encapsulation discussion and see if there is potential to design a
common encapsulation that addresses the various technical concerns.

There are implications of different encapsulations in real
environments consisting of both software and hardware implementations
and spanning multiple data centers.
For example, OAM functions such as path MTU discovery become
challenging with multiple encapsulations along the data path.

The design team recommends Geneve with a few modifications as the
common encapsulation. This document provides more details,
particularly in Section 7.

Status of This Document

This Internet-Draft is submitted in full conformance with the
provisions of BCP 78 and BCP 79.

Distribution of this document is unlimited. Comments should be sent
to the authors or the IDR Working Group mailing list.

Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as
Internet-Drafts.

Internet-Drafts are draft documents valid for a maximum of six months
and may be updated, replaced, or obsoleted by other documents at any
time. It is inappropriate to use Internet-Drafts as reference
material or to cite them other than as "work in progress."

The list of current Internet-Drafts can be accessed at
https://www.ietf.org/1id-abstracts.html. The list of Internet-Draft
Shadow Directories can be accessed at
https://www.ietf.org/shadow.html.

Copyright Notice

Copyright (c) 2021 IETF Trust and the persons identified as the
document authors. All rights reserved.

This document is subject to BCP 78 and the IETF Trust's Legal
Provisions Relating to IETF Documents
(http://trustee.ietf.org/license-info) in effect on the date of
publication of this document. Please review these documents
carefully, as they describe your rights and restrictions with respect
to this document. Code Components extracted from this document must
include Simplified BSD License text as described in Section 4.e of
the Trust Legal Provisions and are provided without warranty as
described in the Simplified BSD License.

Table of Contents

1. Introduction
2. Design Team Goals
3. Terminology
4. Abbreviations and Acronyms
5. Issues with Current Encapsulations
   5.1. Geneve
   5.2. GUE
   5.3. VXLAN-GPE
6. Common Encapsulation Considerations
   6.1. Current Encapsulations
   6.2. Useful Extensions Use Cases
      6.2.1. Telemetry Extensions
      6.2.2. Security/Integrity Extensions
      6.2.3. Group Based Policy
   6.3. Hardware Considerations
   6.4. Extension Size
   6.5. Extension Ordering
   6.6. TLV versus Bit Fields
   6.7. Control Plane Considerations
   6.8. Split NVE
   6.9. Larger VNI Considerations
7. Design Team Recommendations
8. Acknowledgements
9. Security Considerations
10. IANA Considerations
11. References
   11.1 Normative References
   11.2 Informative References
Appendix A: Encapsulations Comparison
   A.1. Overview
   A.2. Extensibility
      A.2.1. Native Extensibility Support
      A.2.2. Extension Parsing
      A.2.3. Critical Extensions
      A.2.4. Maximal Header Length
   A.3. Encapsulation Header
      A.3.1. Virtual Network Identifier (VNI)
      A.3.2. Next Protocol
      A.3.3. Other Header Fields
   A.4. Comparison Summary
Contributors

1. Introduction

As communicated by the WG Chairs, the NVO3 WG Charter states that it
may produce requirements for network virtualization data planes based
on encapsulation of virtual network traffic over an IP-based underlay
data plane. Such requirements should consider OAM and security.
Based on these requirements the WG will select, extend, and/or
develop one or more data plane encapsulation format(s).

This has led to WG drafts and an RFC describing three encapsulations
as follows:

- [RFC8926] Geneve: Generic Network Virtualization Encapsulation

- [I-D.ietf-intarea-gue] Generic UDP Encapsulation

- [I-D.ietf-nvo3-vxlan-gpe] Generic Protocol Extension for VXLAN
  (VXLAN-GPE)

Discussion on the list and in face-to-face meetings has identified a
number of technical problems with each of these encapsulations.
Furthermore, there was clear consensus at the 96th IETF meeting in
Berlin that it is undesirable for the working group to progress more
than one data plane encapsulation.
Although consensus could not be reached on the list, the overall
consensus was for a single encapsulation ([RFC2418], Section 3.3).

Nonetheless there has been resistance to converging on a single
encapsulation format.

2. Design Team Goals

As communicated by the WG Chairs, the design team should take one of
the proposed encapsulations and enhance it to address the technical
concerns. The simple evolution of deployed networks as well as
applicability to all locations in the NVO3 architecture are goals.
The DT should specifically avoid a design that is burdensome on
hardware implementations but should allow future extensibility. The
chosen design should also operate well with ICMP and in ECMP
environments. If further extensibility is required, then it should
be done in such a manner that it does not require the consent of an
entity outside of the IETF.

3. Terminology

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in BCP
14 [RFC2119] [RFC8174] when, and only when, they appear in all
capitals, as shown here.

4. Abbreviations and Acronyms

DT      NVO3 encapsulation Design Team
NVO3    Network Virtualization Overlays over Layer 3
OAM     Operations, Administration, and Maintenance
TLV     Type, Length, and Value
VNI     Virtual Network Identifier
NVE     Network Virtualization Edge
NVA     Network Virtualization Authority
NIC     Network Interface Card
TCAM    Ternary Content-Addressable Memory

Transit device - Underlay network devices between NVE(s).

5. Issues with Current Encapsulations

The following subsections describe issues with current encapsulations
as summarized by the WG Chairs:

5.1. Geneve

- Can't be implemented cost-effectively in all use cases because the
  variable-length header and the order of the TLVs make it costly (in
  terms of number of gates) to implement in hardware.

- The header doesn't fit into the largest commonly available parse
  buffer (256 bytes in a NIC). Doubling the buffer size cannot be
  justified unless it is mandatory for hardware to process additional
  option fields.

5.2. GUE

- There were a significant number of objections related to the
  complexity of implementation in hardware, similar to those noted for
  Geneve above.

5.3. VXLAN-GPE

- GPE is not day-1 backwards compatible with VXLAN. Although the
  frame format is similar, it uses a different UDP port, so it would
  require changes to existing implementations even if the rest of the
  GPE frame is the same.

- GPE is insufficiently extensible. Numerous extensions and options
  have been designed for GUE and Geneve. Note that these have not yet
  been validated by the WG.

- Security, e.g., of the VNI, has not been addressed by GPE.
  Although a shim header could be used for security and other
  extensions, this has not been defined yet and its implications on
  offloading in NICs are not understood.

6. Common Encapsulation Considerations

6.1. Current Encapsulations

Appendix A includes a detailed comparison between the three proposed
encapsulations. The comparison indicates several common properties
but also three major differences among the encapsulations:

- Extensibility: Geneve and GUE were defined with built-in
  extensibility, while VXLAN-GPE is not inherently extensible. Note
  that any of the three encapsulations can be extended using the
  Network Service Header (NSH) [RFC8300].

- Extension method: Geneve is extensible using Type/Length/Value
  (TLV) fields, while GUE uses a small set of possible extensions, and
  a set of flags that indicate which extensions are present.

- Length field: Geneve and GUE include a Length field, indicating the
  length of the encapsulation header, while VXLAN-GPE does not include
  such a field.

6.2. Useful Extensions Use Cases

Non-vendor-specific TLVs MUST follow the standardization process.
The following use cases for extensions show that there is a strong
requirement to support variable-length extensions with possibly
different subtypes.

6.2.1. Telemetry Extensions

In several scenarios it is beneficial to make information about the
path a packet took through the network or through a network device as
well as associated telemetry information available to the operator.

This includes not only tasks like debugging, troubleshooting, and
network planning and optimization but also policy or service level
agreement compliance checks.

Packet scheduling algorithms, especially for balancing traffic across
equal cost paths or links, often leverage information contained
within the packet, such as protocol number, IP address, or MAC
address. Probe packets would thus either need to be sent between the
exact same endpoints with the exact same parameters, or probe packets
would need to be artificially constructed as "fake" packets and
inserted along the path. Both approaches are often not feasible from
an operational perspective, be it that access to the end-system is
not feasible, or that the diversity of parameters and associated
probe packets to be created is simply too large. An extension
providing an in-band telemetry mechanism is an alternative in those
cases.

6.2.2. Security/Integrity Extensions

Since the currently proposed NVO3 encapsulations do not protect their
headers, a single bit corruption in the VNI field could deliver a
packet to the wrong tenant. Extensions are needed to use any
sophisticated security.

The possibility of VNI spoofing with an NVO3 protocol is exacerbated
by using UDP. Systems typically have no restrictions on applications
being able to send to any UDP port, so an unprivileged application
can trivially spoof VXLAN packets, for instance, including using
arbitrary VNIs.

One can envision HMAC-like support in some NVO3 extension to
authenticate the header and the outer IP addresses, thereby
preventing attackers from injecting packets with spoofed VNIs.

Another aspect of security is payload security. Essentially this is
to make packets that look like IP|UDP|NVO3 Encap|DTLS/IPSEC-ESP
Extension|payload. This is attractive since we still have the UDP
header for ECMP, the NVO3 header is in plain text so it can be read
by network elements, and different security or other payload
transforms can be supported on a single UDP port (we don't need a
separate UDP port for DTLS/IPSEC).

6.2.3. Group Based Policy

Another use case would be to carry the Group Based Policy (GBP)
source group information within an NVO3 header extension in a similar
manner as has been implemented for VXLAN
[I-D.smith-vxlan-group-policy]. This allows various forms of policy
such as access control and QoS to be applied between abstract groups
rather than coupled to specific endpoint addresses.

6.3. Hardware Considerations

Hardware restrictions should be taken into consideration along with
future hardware enhancements that may provide more flexible metadata
processing.
However, the set of options that need to and will be implemented in
hardware will be a subset of what is implemented in software, since
software NVEs are likely to grow features, and hence option support,
at a more rapid rate.

We note that it is hard to predict which options will be implemented
in which piece of hardware and when. That depends on whether the
hardware will be in the form of a NIC providing increasing offload
capabilities to software NVEs, or a switch chip being used as an NVE
gateway towards non-NVO3 parts of the network, or even a transit
device that participates in the NVO3 dataplane, e.g., for OAM
purposes.

As a result, it does not look useful to prescribe an order for the
options so that the ones likely to be implemented in hardware come
first; we can't decide such an order when we define the options.
However, a control plane can enforce such an order for some hardware
implementations.

We do know that hardware needs to initially be able to efficiently
skip over the NVO3 header to find the inner payload. That is needed
both for NICs doing TCP offload and for transit devices and NVEs
applying policy/ACLs to the inner payload.

6.4. Extension Size

Extension header length has a significant impact on hardware and
software implementations. A total header length that is too small
will unnecessarily constrain software flexibility. A total header
length that is too large will place a nontrivial cost on hardware
implementations. Thus, the design team recommends that there be a
minimum and maximum total extension header length selected. The
maximum total header length is determined by the bits allocated for
the total extension header length field. The risk with this approach
is that it may be difficult to extend the total header size in the
future.
The minimum total header length is determined by a requirement in the
specifications that all implementations must meet. The risk with
this approach is that all implementations will only implement the
minimum total header length, which would then become the de facto
maximum total header length. The recommended minimum total header
length is 64 bytes.

The size of a single extension should always be 4-byte aligned.

The maximum length of a single option should be large enough to meet
the different extension use case requirements, e.g., in-band
telemetry and future use.

6.5. Extension Ordering

To support hardware nodes at the tunnel endpoint or at a transit
device that can process one or a few extension TLVs in TCAM, a
control plane in such a deployment can signal a capability to ensure
a specific TLV will always appear in a specific order, for example
the first one in the packet.

The order of the TLVs should be hardware friendly for both the sender
and the receiver and possibly the transit device also.

Transit devices do not participate in control plane communication
between the end points and are not required to process the options;
however, if they do, they need to process only a small subset of
options that will be consumed by tunnel endpoints.

6.6. TLV versus Bit Fields

If there is a well-known initial set of options that are likely to be
implemented in software and in hardware, it can be efficient to use
the bit-field approach as in GUE. However, as described in Section
6.3, if options are added over time and different subsets of options
are likely to be implemented in different pieces of hardware, then it
would be hard for the IETF to specify which options should get the
early bit fields. TLVs are a lot more flexible, which avoids the
need to determine the relative importance of different options.
However, general TLVs of arbitrary order, size, and repetition are
difficult to implement in hardware. A middle ground is to use TLVs
with restrictions on their size and alignment, observing that
individual TLVs can have a fixed length, and support in the control
plane such that an NVE will only receive options that it needs and
implements. The control plane approach can potentially be used to
control the order of the TLVs sent to a particular NVE. Note that
transit devices are not likely to participate in the control plane;
hence, to the extent that they need to participate in option
processing, they need more effort. Transit devices would have issues
with future GUE bits being defined for future options as well.

A benefit of TLVs from a hardware perspective is that they are
self-describing, i.e., all the information is in the TLV. In a
bit-field approach the hardware needs to look up the bit to determine
the length of the data associated with the bit through some separate
table, which would add hardware complexity.

There are use cases where multiple modules of software are running on
an NVE. These can be modules such as a diagnostic module from one
vendor that does packet sampling and another module from a different
vendor that implements a firewall. Using a TLV format, it is easier
to have different software modules process different TLVs, which
could be standard extensions or vendor-specific extensions defined by
the different vendors, without conflicting with each other. This can
help with hardware modularity as well. There are some
implementations with options that allow different software, like MAC
learning and security, to handle different options.

6.7. Control Plane Considerations

Given that we want to allow considerable flexibility and
extensibility for, e.g., software NVEs, yet be able to support
important extensions in less flexible contexts such as hardware NVEs,
it is useful to consider the control plane. By control plane in this
section we mean both protocols, such as EVPN and others, and
deployment-specific configuration.

If each NVE can express in the control plane that it only cares about
particular extensions (could be a single extension, or a few), and
the source NVEs only include requested extensions in the NVO3
packets, then the target NVE can use a simpler parser (e.g., a TCAM
might be usable to look for a single NVO3 extension) and the depth of
the inner payload in the NVO3 packet will be minimized. Furthermore,
if the target NVE cares about a few extensions and can express in the
control plane the desired order of those extensions in the NVO3
packets, then it can provide useful functionality with minimal
hardware requirements.

Note that transit devices that are not aware of the NVO3 extensions
somewhat benefit from such an approach, since the inner payload is
less deep in the packet if no extraneous extensions are included in
the packet. In general, a transit device is not likely to
participate in the NVO3 control plane. (However, configuration
mechanisms can take into account limitations of the transit devices
used in particular deployments.)

Note that in this approach different NVEs could desire different
extensions or sets of extensions, which means that the source NVE
needs to be able to place different sets of extensions in different
NVO3 packets, and perhaps in different order. It also assumes that
underlay multicast or replication servers are not used together with
NVO3 extensions.
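
The per-target selection described above can be sketched as follows.
This is only an illustration under stated assumptions, not part of any
NVO3 control plane: the names Option, ADVERTISED, and select_options,
the NVE identifiers, and the extension names are all hypothetical. It
assumes each target NVE has advertised the extensions it wants, in the
order it wants to receive them, and that a target with no
advertisement receives no extensions.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Option:
    """One extension a source NVE could attach (hypothetical model)."""
    name: str
    data: bytes


# Hypothetical control-plane state: for each target NVE, the extensions
# it asked for, listed in the order it wants to receive them.
ADVERTISED: Dict[str, List[str]] = {
    "nve-hw": ["hmac"],                      # hardware NVE, one extension
    "nve-sw": ["telemetry", "hmac", "gbp"],  # software NVE, several
}


def select_options(target: str, available: List[Option]) -> List[Option]:
    """Return only the extensions the target requested, in its order.

    Extensions the target did not request are left out, so the inner
    payload sits no deeper in the packet than necessary. A target with
    no advertisement gets no extensions at all.
    """
    by_name = {opt.name: opt for opt in available}
    wanted = ADVERTISED.get(target, [])
    return [by_name[name] for name in wanted if name in by_name]


if __name__ == "__main__":
    candidates = [
        Option("gbp", b"\x00\x01"),
        Option("hmac", b"\xaa" * 4),
        Option("telemetry", b"\x00" * 8),
    ]
    for nve in ("nve-hw", "nve-sw", "nve-unknown"):
        print(nve, [o.name for o in select_options(nve, candidates)])
```

Because the selection is made per target, the same source NVE emits
different option sets, in different orders, toward different targets,
which is the behavior the paragraph above calls for.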

There is a need to consider mandatory extensions versus optional
extensions. Mandatory extensions require the receiver to drop the
packet if the extension is unknown. A control plane mechanism can
prevent the need for dropping unknown extensions, since they would
not be included in packets sent to targets that do not support them.

The control planes defined today need to add the ability to describe
the different encapsulations. Thus, perhaps EVPN and any other
control plane protocol that the IETF defines should have a way to
enumerate the supported NVO3 extensions and their order.

The WG should consider developing a separate draft on guidance for
option processing and control plane participation. This should
provide examples/guidance on a range of usage models and deployment
scenarios for specific options and ordering that are relevant for
that specific deployment. This includes end points and middle boxes
using the options. So, having the control plane negotiate the
constraints is the most appropriate and flexible way to address these
requirements.

6.8. Split NVE

If the working group sees a need for having the hosts send and
receive options in a split NVE case, this is possible using any of
the existing extensible encapsulations (Geneve, GUE, GPE+NSH) by
defining a way to carry those over other transports. NSH can already
be used over different transports.

If we need to do this with other encapsulations, it can be done by
defining an Ethertype for other encapsulations so that they can be
carried over Ethernet and 802.1Q.

If we need to carry other encapsulations over MPLS, it would require
an EVPN control plane to signal that other encapsulation header +
options will be present in front of the L2 packet.
The VNI can be ignored in the header, and the MPLS label will be the
one used to identify the EVPN L2 instance.

6.9. Larger VNI Considerations

We discussed whether we should make the VNI 32 bits or larger. The
benefit of a 24-bit VNI would be to avoid unnecessary changes to
existing proposals and implementations that are almost all, if not
all, using a 24-bit VNI. If we need a larger VNI, an extension can
be used to support that.

7. Design Team Recommendations

We concluded that Geneve is most suitable as a starting point for a
proposed standard for network virtualization, for the following
reasons:

1. We studied whether VNI should be in the base header or in
extensions and whether it should be 24-bit or 32-bit. The design
team agreed that VNI is critical information for network
virtualization and MUST be present in all packets. The design team
also agreed that a 24-bit VNI matches the existing widely used
encapsulation formats, i.e., VXLAN and NVGRE, and hence is more
suitable to use going forward.

2. The Geneve header carries the total options length, which allows
skipping over the options for NIC offload operations and allows
transit devices to view flow information in the inner payload.

3. We considered the option of using NSH [RFC8300] with VXLAN-GPE
but given that NSH is targeted at service chaining and contains
service chaining information, it is less suitable for the network
virtualization use case. The other downside of VXLAN-GPE is its lack
of a header length field, which makes skipping over the headers to
process the inner payload more difficult. Total Option Length is
present in Geneve. It is not possible to skip any options in the
middle with VXLAN-GPE.
In principle a split between a base header and a header with options
is interesting (whether that options header is NSH or some new header
without ties to a service path). We explored whether it would make
sense to either use NSH for this, or define a new NVO3 options
header. However, we observed that this makes it slightly harder to
find the inner payload since the length field is not in the NVO3
header itself. Thus, one more field would have to be extracted to
compute the start of the inner payload. Also, if the experience with
IPv6 extension headers is a guide, there would be a risk that key
pieces of hardware might not implement the options header, resulting
in future calls to deprecate its use. Making the options part of the
base NVO3 header has fewer of those issues. Even though the
implementation of any particular option cannot be predicted ahead of
time, the option mechanism and ability to skip the options is likely
to be broadly implemented.

4. We compared the TLV and bit-field styles of extension and deemed
that parsing either is expensive. While bit fields may be simpler to
parse, they are also more restrictive and require guessing which
extensions will be widely implemented so that they can get early bit
assignments. Given that half the bits are already assigned in GUE, a
widely deployed extension may appear in a flag extension, which
requires extra processing to dig the flag out of the flag extension
and then look for the extension itself. Also, bit fields are not
flexible enough to address the requirements from OAM, telemetry, and
security extensions for variable-length options and different
subtypes of the same option.
While TLVs are more flexible, a control plane can restrict the number
of option TLVs as well as the order and size of the TLVs to make it
simpler for a dataplane implementation to handle.

5. We briefly discussed the multi-vendor NVE case, and the need to
allow vendors to put their own extensions in the NVE header. This is
possible with TLVs.

6. We also agreed that the C bit in Geneve is helpful in allowing a
receiver NVE to easily decide whether to process options or not. For
example, an optional extension such as a UUID-based packet trace can
be ignored by a receiver NVE, which makes it easy for the NVE to skip
over the options. Thus, the C bit remains as defined in Geneve.

7. There are already some extensions of varying sizes that are being
discussed (see Section 6.2). By using Geneve options, it is possible
to get in-band parameters such as switch ID, ingress port, egress
port, internal delay, and queue from switches in a telemetry
extension TLV. It is also possible to add security extension TLVs
like HMAC and DTLS/IPSEC to authenticate the Geneve packet header and
secure the Geneve packet payload by software or hardware tunnel
endpoints. A Group Based Policy extension TLV can be carried as
well.

8. Geneve options are implemented in production today. There is
also new hardware supporting Geneve TLV parsing. In addition, an
In-band Telemetry (INT) specification is being developed by P4.org
that illustrates the option of INT metadata carried over Geneve.
OVN/OVS have also defined some option TLV(s) for Geneve.

9. The DT has addressed the usage models while considering the
requirements and implementations in general, including software and
hardware.
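
Points 2 and 6 in the list above (skipping the options by length and
checking the C bit) can be illustrated with a short parsing sketch.
It assumes the fixed 8-byte Geneve header layout published in
[RFC8926]: Ver (2 bits), Opt Len (6 bits, in 4-byte units), O and C
flag bits, Protocol Type (16 bits), VNI (24 bits), and a reserved
byte. The names GeneveHeader, parse_geneve, and can_ignore_options
are illustrative only and do not come from any existing library.

```python
import struct
from typing import NamedTuple

GENEVE_BASE_LEN = 8  # fixed part of the Geneve header, in bytes


class GeneveHeader(NamedTuple):
    version: int
    opt_len: int         # total length of all options, in bytes
    oam: bool            # O bit
    critical: bool       # C bit: critical options are present
    protocol: int        # Protocol Type of the inner payload
    vni: int             # 24-bit Virtual Network Identifier
    payload_offset: int  # where the inner payload starts in the buffer


def parse_geneve(buf: bytes) -> GeneveHeader:
    """Parse the fixed header and locate the inner payload.

    No option is examined: the payload offset follows from Opt Len
    alone, which is what lets simple hardware skip the options.
    """
    if len(buf) < GENEVE_BASE_LEN:
        raise ValueError("truncated Geneve header")
    first, flags, protocol, vni_rsvd = struct.unpack("!BBHI", buf[:8])
    opt_len = (first & 0x3F) * 4  # Opt Len is counted in 4-byte units
    payload_offset = GENEVE_BASE_LEN + opt_len
    if len(buf) < payload_offset:
        raise ValueError("options extend past end of buffer")
    return GeneveHeader(
        version=first >> 6,
        opt_len=opt_len,
        oam=bool(flags & 0x80),
        critical=bool(flags & 0x40),
        protocol=protocol,
        vni=vni_rsvd >> 8,  # drop the trailing reserved byte
        payload_offset=payload_offset,
    )


def can_ignore_options(hdr: GeneveHeader) -> bool:
    """A receiver implementing no options may skip them if C is clear."""
    return not hdr.critical


if __name__ == "__main__":
    # Opt Len = 2 units (8 bytes), C bit clear, Ethernet payload.
    fixed = struct.pack("!BBHI", 0x02, 0x00, 0x6558, 0xABCDEF << 8)
    packet = fixed + bytes(8) + b"inner"
    hdr = parse_geneve(packet)
    print(hex(hdr.vni), hdr.payload_offset, can_ignore_options(hdr))
```

The sketch reads a single length field to find the inner payload,
whereas an options header separate from the base header would require
one more field to be extracted, as noted in point 3.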

There seems to be interest in standardizing some well-known secure
option TLVs to secure the header and payload to guarantee
encapsulation header integrity and tenant data privacy. The design
team recommends that the working group consider standardizing such
option(s).

We recommend the following enhancements to Geneve to make it more
suitable to hardware and yet provide the flexibility for software:

We would propose text such as: while TLVs are more flexible, a
control plane can restrict the number of option TLVs as well as the
order and size of the TLVs to make it simpler for a data plane
implementation in software or hardware to handle. For example, there
may be some critical information such as a secure hash that must be
processed in a certain order at lowest latency.

A control plane can negotiate a subset of option TLVs and certain TLV
ordering, as well as limiting the total number of option TLVs present
in the packet, for example, to allow for hardware capable of
processing fewer options. Hence, the control plane needs to have the
ability to describe the supported TLV subset and their order.

The Geneve draft could specify that the subset and order of option
TLVs should be configurable for each remote NVE in the absence of a
protocol control plane.

We recommend that Geneve follow fragmentation recommendations in
overlay services like PWE3 and the L2/L3 VPN recommendations to
guarantee a larger MTU for the tunnel overhead ([RFC3985] Section
5.3).

We request that Geneve provide a recommendation for critical bit
processing; text could specify how critical bits can be used, with
the control plane specifying the critical options.

Given that there is a telemetry option use case for a length of 256
bytes, we recommend that Geneve increase the single TLV option length
to 256 bytes.
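
The size limits behind the 256-byte recommendation can be worked out
from the field widths. The following sketch assumes the widths
published in [RFC8926]: a 6-bit Opt Len in the base header and a
5-bit Length in each option header, both counted in 4-byte units,
with the option Length excluding the 4-byte option header. The
constant and function names are illustrative only.

```python
UNIT = 4                # both length fields count 4-byte units
BASE_HEADER_LEN = 8     # fixed Geneve header, in bytes
OPTION_HEADER_LEN = 4   # per-option header (class, type, length)
OPT_LEN_BITS = 6        # base header "Opt Len" field width
OPTION_LENGTH_BITS = 5  # per-option "Length" field width


def max_field_bytes(bits: int) -> int:
    """Largest byte count expressible in a field of the given width."""
    return ((1 << bits) - 1) * UNIT


MAX_OPTION_DATA = max_field_bytes(OPTION_LENGTH_BITS)    # 31 * 4 = 124
MAX_SINGLE_OPTION = OPTION_HEADER_LEN + MAX_OPTION_DATA  # 128
MAX_ALL_OPTIONS = max_field_bytes(OPT_LEN_BITS)          # 63 * 4 = 252
MAX_GENEVE_HEADER = BASE_HEADER_LEN + MAX_ALL_OPTIONS    # 260


def fits_single_option(data_len: int) -> bool:
    """True if data_len bytes, padded to 4 bytes, fit in one option."""
    padded = -(-data_len // UNIT) * UNIT  # round up to a 4-byte multiple
    return padded <= MAX_OPTION_DATA


if __name__ == "__main__":
    print("max data in one option:", MAX_OPTION_DATA)
    print("max single option incl. header:", MAX_SINGLE_OPTION)
    print("max total options:", MAX_ALL_OPTIONS)
    print("max Geneve header:", MAX_GENEVE_HEADER)
    print("256-byte option fits:", fits_single_option(256))
```

Under these assumptions a single option carries at most 124 bytes of
data, so a 256-byte telemetry option would not fit in one TLV without
a wider per-option length field. A maximal Geneve header of 260 bytes
also exceeds the 256-byte NIC parse buffer mentioned in Section 5.1,
before any outer headers are counted.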
We request that Geneve address the OAM requirements for alternate
marking and for performance measurement, which need 2 bits in the
header, and clarify the need for the current OAM bit in the Geneve
header.

We recommend that the WG work on security options for Geneve.

8. Acknowledgements

The authors would like to thank Tom Herbert for providing the
motivation for the Security/Integrity extension and for his valuable
comments, and T. Sridhar for his valuable comments and feedback.

9. Security Considerations

This document does not introduce any additional security constraints.

10. IANA Considerations

This document has no actions for IANA.

11. References

11.1 Normative References

[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
          Requirement Levels", BCP 14, RFC 2119, DOI
          10.17487/RFC2119, March 1997.

[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 2119
          Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, May
          2017.

11.2 Informative References

[I-D.herbert-gue-extensions] Herbert, T., Yong, L., and F. Templin,
          "Extensions for Generic UDP Encapsulation",
          draft-herbert-gue-extensions-01 (work in progress), October
          2016.

[I-D.ietf-intarea-gue] Herbert, T., Yong, L., and O. Zia, "Generic
          UDP Encapsulation", draft-ietf-intarea-gue (work in
          progress), October 2019.

[I-D.ietf-nvo3-vxlan-gpe] Maino, F., Kreeger, L., and U. Elzur,
          "Generic Protocol Extension for VXLAN",
          draft-ietf-nvo3-vxlan-gpe (work in progress), March 2021.

[I-D.smith-vxlan-group-policy] Smith, M. and L. Kreeger, "VXLAN Group
          Policy Option", draft-smith-vxlan-group-policy-05 (work in
          progress), October 2018.
[RFC2418] Bradner, S., "IETF Working Group Guidelines and
          Procedures", BCP 25, RFC 2418, DOI 10.17487/RFC2418,
          September 1998.

[RFC3985] Bryant, S., Ed. and P. Pate, Ed., "Pseudo Wire Emulation
          Edge-to-Edge (PWE3) Architecture", RFC 3985, DOI
          10.17487/RFC3985, March 2005.

[RFC8300] Quinn, P., Ed., Elzur, U., Ed., and C. Pignataro, Ed.,
          "Network Service Header (NSH)", RFC 8300, DOI
          10.17487/RFC8300, January 2018.

[RFC8926] Gross, J., Ed., Ganga, I., Ed., and T. Sridhar, Ed.,
          "Geneve: Generic Network Virtualization Encapsulation", RFC
          8926, DOI 10.17487/RFC8926, November 2020.

Appendix A: Encapsulations Comparison

A.1. Overview

This section presents a comparison of the three NVO3 encapsulation
proposals: Geneve, GUE, and VXLAN-GPE. All three encapsulations use
an outer UDP/IP transport. Geneve and VXLAN-GPE use an 8-octet
header, while GUE uses a 4-octet header. In addition to the base
header, optional extensions may be included in the encapsulation, as
discussed in Section A.2 below.

A.2. Extensibility

A.2.1. Native Extensibility Support

The Geneve and GUE encapsulations both enable optional headers to be
incorporated at the end of the base encapsulation header.

VXLAN-GPE does not provide native support for header extensions.
However, as discussed in [I-D.ietf-nvo3-vxlan-gpe], extensibility can
be attained to some extent if the Network Service Header (NSH)
[RFC8300] is used immediately following the VXLAN-GPE header. NSH
supports either a fixed-size extension (MD Type 1) or a variable-size
TLV-based extension (MD Type 2). It should be noted that
NSH-over-VXLAN-GPE implies an additional overhead of the 8-octet NSH
header, in addition to the VXLAN-GPE header.

A.2.2. Extension Parsing

The Geneve Variable Length Options are defined as Type/Length/Value
(TLV) extensions. Similarly, VXLAN-GPE, when using NSH, can include
NSH TLV-based extensions. In contrast, GUE defines a small set of
possible extension fields (proposed in [I-D.herbert-gue-extensions]),
and a set of flags in the GUE header that indicate, for each
extension type, whether it is present or not.

TLV-based extensions, as defined in Geneve, provide the flexibility
for a large number of possible extension types. Similar behavior can
be supported in NSH-over-VXLAN-GPE when using MD Type 2. The
flag-based approach taken in GUE strives to simplify implementations
by defining a small number of possible extensions used in a fixed
order.

The Geneve and GUE headers both include a length field, defining the
total length of the encapsulation, including the optional extensions.

The length field simplifies parsing by transit devices that skip the
encapsulation header without parsing its extensions.

A.2.3. Critical Extensions

The Geneve encapsulation header includes the 'C' field, which
indicates whether the current Geneve header includes critical
options, that is to say, options which must be parsed by the tunnel
endpoint. If the endpoint is not able to process a critical option,
the packet is discarded.

A.2.4. Maximal Header Length

The maximal header length in Geneve, including options, is 260
octets. GUE defines the maximal header to be 128 octets. VXLAN-GPE
uses a fixed-length header of 8 octets, unless NSH-over-VXLAN-GPE is
used, yielding an encapsulation header of up to 264 octets.

A.3. Encapsulation Header

A.3.1. Virtual Network Identifier (VNI)

The Geneve and VXLAN-GPE headers both include a 24-bit VNI field.
GUE, on the other hand, enables the use of a 32-bit field called
VNID; this field is not included in the GUE header, but was defined
as an optional extension in [I-D.herbert-gue-extensions].

The VXLAN-GPE header includes the 'I' bit, indicating that the VNI
field is valid in the current header. A similar indicator is defined
as a flag in the GUE header [I-D.herbert-gue-extensions].

A.3.2. Next Protocol

The three encapsulation headers include a field that specifies the
type of the next protocol header, which resides after the NVO3
encapsulation header. The Geneve header includes a 16-bit field that
uses the IEEE Ethertype convention. GUE uses an 8-bit field, which
uses the IANA Internet protocol numbering. The VXLAN-GPE header
incorporates an 8-bit Next Protocol field, using a VXLAN-GPE-specific
registry, defined in [I-D.ietf-nvo3-vxlan-gpe].

The VXLAN-GPE header also includes the 'P' bit, which explicitly
indicates whether the Next Protocol field is present in the current
header.

A.3.3. Other Header Fields

The OAM bit, which is defined in Geneve and in VXLAN-GPE, indicates
whether the current packet is an OAM packet. The GUE header includes
a similar field, but uses different terminology; the GUE 'C-bit'
specifies whether the current packet is a control packet. Note that
the GUE control bit can potentially be used in a large set of
protocols that are not OAM protocols. However, the control packet
examples discussed in [I-D.ietf-intarea-gue] are OAM-related.

Each of the three NVO3 encapsulation headers includes a 2-bit Version
field, which is currently defined to be zero.

The Geneve and VXLAN-GPE headers include reserved fields; 14 bits in
the Geneve header, and 27 bits in the VXLAN-GPE header are reserved.

A.4. Comparison Summary

The following table summarizes the comparison between the three NVO3
encapsulations:

+----------------+----------------+----------------+----------------+
|                |     Geneve     |      GUE       |   VXLAN-GPE    |
+----------------+----------------+----------------+----------------+
| Outer transport|     UDP/IP     |     UDP/IP     |     UDP/IP     |
+----------------+----------------+----------------+----------------+
| Base header    |    8 octets    |    4 octets    |    8 octets    |
| length         |                |                |   (16 octets   |
|                |                |                |   using NSH)   |
+----------------+----------------+----------------+----------------+
| Extensibility  |Variable length |Extension fields|   No native    |
|                |    options     |                | extensibility. |
|                |                |                |   Extensible   |
|                |                |                |   using NSH.   |
+----------------+----------------+----------------+----------------+
| Extension      |   TLV-based    |   Flag-based   |   TLV-based    |
| parsing method |                |                |(using NSH with |
|                |                |                |   MD Type 2)   |
+----------------+----------------+----------------+----------------+
| Extension      |    Variable    |     Fixed      |    Variable    |
| order          |                |                |  (using NSH)   |
+----------------+----------------+----------------+----------------+
| Length field   |       +        |       +        |       -        |
+----------------+----------------+----------------+----------------+
| Max Header     |   260 octets   |   128 octets   |    8 octets    |
| Length         |                |                |(264 using NSH) |
+----------------+----------------+----------------+----------------+
| Critical       |       +        |       -        |       -        |
| extension bit  |                |                |                |
+----------------+----------------+----------------+----------------+
| VNI field size |    24 bits     |    32 bits     |    24 bits     |
|                |                |  (extension)   |                |
+----------------+----------------+----------------+----------------+
| Next protocol  |    16 bits     |     8 bits     |     8 bits     |
| field          |   Ethertype    |    Internet    |  New registry  |
|                |    registry    |    protocol    |                |
|                |                |    registry    |                |
+----------------+----------------+----------------+----------------+
| Next protocol  |       -        |       -        |       +        |
| indicator      |                |                |                |
+----------------+----------------+----------------+----------------+
| OAM / control  |    OAM bit     |  Control bit   |    OAM bit     |
| field          |                |                |                |
+----------------+----------------+----------------+----------------+
| Version field  |     2 bits     |     2 bits     |     2 bits     |
+----------------+----------------+----------------+----------------+
| Reserved bits  |    14 bits     |       -        |    27 bits     |
+----------------+----------------+----------------+----------------+

Figure 1: NVO3 Encapsulations Comparison

Contributors

The following co-authors have contributed to this document:

Ilango Ganga
Intel
Email: ilango.s.ganga@intel.com

Pankaj Garg
Microsoft
Email: pankajg@microsoft.com

Rajeev Manur
Broadcom
Email: rajeev.manur@broadcom.com

Tal Mizrahi
Marvell
Email: talmi@marvell.com

David Mozes
Email: mosesster@gmail.com

Erik Nordmark
Email: nordmark@sonic.net

Michael Smith
Cisco
Email: michsmit@cisco.com

Sam Aldrin
Google
Email: aldrin.ietf@gmail.com

Ignas Bagdonas
Equinix
Email: ibagdona.ietf@gmail.com

Authors' Addresses

Sami Boutros (editor)
Ciena
USA

Email: sboutros@ciena.com

Donald E. Eastlake, 3rd (editor)
Futurewei Technologies
2386 Panoramic Circle
Apopka, FL 32703
USA

Tel: +1-508-333-2270
Email: d3e3e3@gmail.com