NVO3 Workgroup                                           S. Boutros, Ed.
Internet-Draft                                                     Ciena
Intended status: Informational                         February 17, 2020
Expires: August 20, 2020

                   NVO3 Encapsulation Considerations
                        draft-ietf-nvo3-encap-05

Abstract

   As communicated by WG Chairs, the IETF NVO3 chairs and Routing Area
   director have chartered a design team to take forward the
   encapsulation discussion and see if there is potential to design a
   common encapsulation that addresses the various technical concerns.

   There are implications of different encapsulations in real
   environments consisting of both software and hardware implementations
   and spanning multiple data centers.  For example, OAM functions such
   as path MTU discovery become challenging with multiple encapsulations
   along the data path.

   The design team recommends Geneve with a few modifications as the
   common encapsulation; more details are described in Section 7.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on August 20, 2020.

Copyright Notice

   Copyright (c) 2020 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Problem Statement
   2.  Design Team Goals
   3.  Terminology
   4.  Abbreviations
   5.  Issues with Current Encapsulations
     5.1.  Geneve
     5.2.  GUE
     5.3.  VXLAN-GPE
   6.  Common Encapsulation Considerations
     6.1.  Current Encapsulations
     6.2.  Useful Extensions Use Cases
       6.2.1.  Telemetry Extensions
       6.2.2.  Security/Integrity Extensions
       6.2.3.  Group Based Policy
     6.3.  Hardware Considerations
     6.4.  Extension Size
     6.5.  Extension Ordering
     6.6.  TLV vs Bit Fields
     6.7.  Control Plane Considerations
     6.8.  Split NVE
     6.9.  Larger VNI Considerations
   7.  Design Team Recommendations
   8.  Acknowledgements
   9.  Security Considerations
   10. IANA Considerations
   11. Appendix A
     11.1.  Overview
     11.2.  Extensibility
       11.2.1.  Native Extensibility Support
       11.2.2.  Extension Parsing
       11.2.3.  Critical Extensions
       11.2.4.  Maximal Header Length
     11.3.  Encapsulation Header
       11.3.1.  Virtual Network Identifier (VNI)
       11.3.2.  Next Protocol
       11.3.3.  Other Header Fields
     11.4.  Comparison Summary
   12. Contributors
   13. References
     13.1.  Normative References
     13.2.  Informative References
   Author's Address

1.  Problem Statement

   As communicated by WG Chairs, the NVO3 WG charter states that it may
   produce requirements for network virtualization data planes based on
   encapsulation of virtual network traffic over an IP-based underlay
   data plane.  Such requirements should consider OAM and security.
   Based on these requirements the WG will select, extend, and/or
   develop one or more data plane encapsulation format(s).

   This has led to drafts describing three encapsulations being adopted
   by the working group:

   - [I-D.ietf-nvo3-geneve]

   - [I-D.ietf-nvo3-gue]

   - [I-D.ietf-nvo3-vxlan-gpe]

   Discussion on the list and in face-to-face meetings has identified a
   number of technical problems with each of these encapsulations.
   Furthermore, there was clear consensus at the IETF meeting in Berlin
   that it is undesirable for the working group to progress more than
   one data plane encapsulation.  Although consensus could not be
   reached on the list, the overall consensus was for a single
   encapsulation ([RFC2418], Section 3.3).

   Nonetheless, there has been resistance to converging on a single
   encapsulation format.

2.  Design Team Goals

   As communicated by WG Chairs, the design team should take one of the
   proposed encapsulations and enhance it to address the technical
   concerns.  The simple evolution of deployed networks as well as
   applicability to all locations in the NVO3 architecture are goals.
   The DT should specifically avoid a design that is burdensome on
   hardware implementations, but should allow future extensibility.  The
   chosen design should also operate well with ICMP and in ECMP
   environments.  If further extensibility is required, then it should
   be done in such a manner that it does not require the consent of an
   entity outside of the IETF.

3.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

4.  Abbreviations

   NVO3            Network Virtualization Overlays over Layer 3

   OAM             Operations, Administration, and Maintenance

   TLV             Type, Length, and Value

   VNI             Virtual Network Identifier

   NVE             Network Virtualization Edge

   NVA             Network Virtualization Authority

   NIC             Network interface card

   Transit device  Underlay network devices between NVE(s).

5.  Issues with Current Encapsulations

   As summarized by WG Chairs.

5.1.  Geneve

   - Can't be implemented cost-effectively in all use cases because the
   variable-length header and the order of the TLVs make it costly (in
   terms of number of gates) to implement in hardware.

   - Header doesn't fit into the largest commonly available parse buffer
   (256 bytes in NIC).  Cannot justify doubling buffer size unless it is
   mandatory for hardware to process additional option fields.

5.2.  GUE

   - There were a significant number of objections related to the
   complexity of implementation in hardware, similar to those noted for
   Geneve above.

5.3.  VXLAN-GPE

   - GPE is not day-1 backwards compatible with VXLAN.  Although the
   frame format is similar, it uses a different UDP port, so it would
   require changes to existing implementations even if the rest of the
   GPE frame is the same.

   - GPE is insufficiently extensible.  Numerous extensions and options
   have been designed for GUE and Geneve.  Note that these have not yet
   been validated by the WG.

   - Security, e.g., of the VNI, has not been addressed by GPE.
   Although a shim header could be used for security and other
   extensions, this has not been defined yet and its implications on
   offloading in NICs are not understood.

6.  Common Encapsulation Considerations

6.1.  Current Encapsulations

   Appendix A includes a detailed comparison between the three proposed
   encapsulations.  The comparison indicates several common properties,
   but also three major differences among the encapsulations:

   - Extensibility: Geneve and GUE were defined with built-in
   extensibility, while VXLAN-GPE is not inherently extensible.  Note
   that any of the three encapsulations can be extended using the
   Network Service Header (NSH).

   - Extension method: Geneve is extensible using Type/Length/Value
   (TLV) fields, while GUE uses a small set of possible extensions, and
   a set of flags that indicate which extension is present.

   - Length field: Geneve and GUE include a Length field, indicating the
   length of the encapsulation header, while VXLAN-GPE does not include
   such a field.

6.2.  Useful Extensions Use Cases

   Non-vendor-specific TLVs MUST follow the standardization process.
   The following use cases for extensions show that there is a strong
   requirement to support variable-length extensions with possibly
   different subtypes.

6.2.1.  Telemetry Extensions

   In several scenarios it is beneficial to make information about the
   path a packet took through the network or through a network device,
   as well as associated telemetry information, available to the
   operator.

   This includes not only tasks like debugging, troubleshooting, network
   planning, and network optimization, but also policy or service level
   agreement compliance checks.

   Packet scheduling algorithms, especially for balancing traffic across
   equal cost paths or links, often leverage information contained
   within the packet, such as protocol number, IP address, or MAC
   address.  Probe packets would thus either need to be sent from the
   exact same endpoints with the exact same parameters, or probe packets
   would need to be artificially constructed as "fake" packets and
   inserted along the path.  Both approaches are often not feasible from
   an operational perspective, either because access to the end system
   is not available or because the diversity of parameters and
   associated probe packets to be created is simply too large.  An in-
   band telemetry mechanism in extensions is an alternative in those
   cases.

6.2.2.  Security/Integrity Extensions

   Since the currently proposed NVO3 encapsulations do not protect their
   headers, a single bit corruption in the VNI field could deliver a
   packet to the wrong tenant.  Extensions are needed to use any
   sophisticated security.

   The possibility of VNI spoofing with an NVO3 protocol is exacerbated
   by the use of UDP.  Systems typically have no restrictions on
   applications being able to send to any UDP port, so an unprivileged
   application can trivially spoof, for instance, VXLAN packets,
   including using arbitrary VNIs.

   One can envision HMAC-like support in some NVO3 extension to
   authenticate the header and the outer IP addresses, thereby
   preventing attackers from injecting packets with spoofed VNIs.

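   The HMAC-like idea above can be illustrated with a minimal sketch.
   It is not a proposed option format: the shared key, the choice of
   HMAC-SHA256, the set of covered bytes, and the names header_tag and
   verify are all assumptions made for illustration.  The sketch shows
   that a tag computed over the outer IP addresses and the NVO3 header
   no longer verifies once a single VNI bit is changed.

```python
import hashlib
import hmac
import ipaddress


def header_tag(key: bytes, src_ip: str, dst_ip: str, nvo3_header: bytes) -> bytes:
    """Return an HMAC-SHA256 tag over the outer IP addresses and the NVO3 header."""
    covered = (
        ipaddress.ip_address(src_ip).packed
        + ipaddress.ip_address(dst_ip).packed
        + nvo3_header
    )
    return hmac.new(key, covered, hashlib.sha256).digest()


def verify(key: bytes, src_ip: str, dst_ip: str, nvo3_header: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    expected = header_tag(key, src_ip, dst_ip, nvo3_header)
    return hmac.compare_digest(expected, tag)


# Hypothetical key and an illustrative 8-octet base header (VNI in octets 4..6).
KEY = b"example-shared-key"
HEADER = bytes([0x00, 0x00, 0x65, 0x58, 0x00, 0xAB, 0xCD, 0x00])
TAG = header_tag(KEY, "192.0.2.1", "192.0.2.2", HEADER)

# Flip the lowest bit of the VNI, as a corruption or spoofing attempt would.
CORRUPTED = bytearray(HEADER)
CORRUPTED[6] ^= 0x01
CORRUPTED = bytes(CORRUPTED)

print(verify(KEY, "192.0.2.1", "192.0.2.2", HEADER, TAG))
print(verify(KEY, "192.0.2.1", "192.0.2.2", CORRUPTED, TAG))
```

   How the key is distributed and where the tag is carried in the
   packet are left open here; both would be part of any actual
   extension definition.
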
   Another aspect of security is payload security.  Essentially this is
   to make packets that look like IP|UDP|NVO3 Encap|DTLS/IPSEC-ESP
   Extension|payload.  This is nice since we still have the UDP header
   for ECMP, the NVO3 header is in plain text so it can be read by
   network elements, and different security or other payload transforms
   can be supported on a single UDP port (we don't need a separate UDP
   port for DTLS/IPSEC).

6.2.3.  Group Based Policy

   Another use case would be to carry the Group Based Policy (GBP)
   source group information within an NVO3 header extension in a similar
   manner as has been implemented for VXLAN
   [I-D.smith-vxlan-group-policy].  This allows various forms of policy
   such as access control and QoS to be applied between abstract groups
   rather than coupled to specific endpoint addresses.

6.3.  Hardware Considerations

   Hardware restrictions should be taken into consideration along with
   future hardware enhancements that may provide more flexible metadata
   processing.  However, the set of options that need to and will be
   implemented in hardware will be a subset of what is implemented in
   software, since software NVEs are likely to grow features, and hence
   option support, at a more rapid rate.

   We note that it is hard to predict which options will be implemented
   in which piece of hardware and when.  That depends on whether the
   hardware will be in the form of a NIC providing increasing offload
   capabilities to software NVEs, or a switch chip being used as an NVE
   gateway towards non-NVO3 parts of the network, or even a transit
   device that participates in the NVO3 dataplane, e.g., for OAM
   purposes.

   As a result, it does not look useful to prescribe an order of the
   options so that the ones that are likely to be implemented in
   hardware come first; we can't decide such an order when we define the
   options.  However, a control plane can enforce such an order for some
   hardware implementations.

   We do know that hardware needs to initially be able to efficiently
   skip over the NVO3 header to find the inner payload.  That is needed
   both for NICs doing, e.g., TCP offload and for transit devices and
   NVEs applying policy/ACLs to the inner payload.

6.4.  Extension Size

   Extension header length has a significant impact on hardware and
   software implementations.  A total header length that is too small
   will unnecessarily constrain software flexibility.  A total header
   length that is too large will place a nontrivial cost on hardware
   implementations.  Thus, the design team recommends that there be a
   minimum and maximum total extension header length selected.  The
   maximum total header length is determined by the bits allocated for
   the total extension header length field.  The risk with this approach
   is that it may be difficult to extend the total header size in the
   future.  The minimum total header length is determined by a
   requirement in the specifications that all implementations must meet.
   The risk with this approach is that all implementations will only
   implement the minimum total header length, which would then become
   the de facto maximum total header length.  The recommended minimum
   total header length is 64 bytes.

   The size of a single extension should always be 4-byte aligned.

   The maximum length of a single option should be large enough to meet
   the different extension use case requirements, e.g., in-band
   telemetry and future use.

6.5.  Extension Ordering

   To support hardware nodes at the tunnel endpoint or in transit that
   can process only one or a few extension TLVs in TCAM, a control plane
   in such a deployment can signal a capability to ensure that a
   specific TLV always appears in a specific position, for example as
   the first one in the packet.

   The order of the TLVs should be hardware friendly for both the sender
   and the receiver, and possibly the transit node too.

   Transit nodes do not participate in control plane communication
   between the end points and are not required to process the options.
   However, if they do, they need to process only a small subset of the
   options that will be consumed by tunnel endpoints.

6.6.  TLV vs Bit Fields

   If there is a well-known initial set of options that are likely to be
   implemented in software and in hardware, it can be efficient to use
   the bit-field approach as in GUE.  However, as described in Section
   6.3, if options are added over time and different subsets of options
   are likely to be implemented in different pieces of hardware, then it
   would be hard for the IETF to specify which options should get the
   early bit fields.  TLVs are a lot more flexible, which avoids the
   need to determine the relative importance of different options.
   However, general TLVs of arbitrary order, size, and repetition are
   difficult to implement in hardware.  A middle ground is to use TLVs
   with restrictions on the size and alignment, observing that
   individual TLVs can have a fixed length, and support in the control
   plane such that an NVE will only receive options that it needs and
   implements.  The control plane approach can potentially be used to
   control the order of the TLVs sent to a particular NVE.
   Note that transit devices are not likely to participate in the
   control plane; hence, to the extent that they need to participate in
   option processing, they need more effort.  However, transit devices
   would have issues with future GUE bits being defined for future
   options as well.

   A benefit of TLVs from a hardware perspective is that they are self-
   describing, i.e., all the information is in the TLV.  In a bit-field
   approach the hardware needs to look up the bit to determine the
   length of the data associated with the bit through some separate
   table, which would add hardware complexity.

   There are use cases where multiple modules of software are running on
   an NVE.  These can be modules such as a diagnostic module by one
   vendor that does packet sampling and another module from a different
   vendor that implements a firewall.  Using a TLV format, it is easier
   to have different software modules process different TLVs, which
   could be standard extensions or vendor-specific extensions defined by
   the different vendors, without conflicting with each other.  This can
   help with hardware modularity as well.  There are some
   implementations with options that allow different software
   components, such as MAC learning and security, to handle different
   options.

6.7.  Control Plane Considerations

   Given that we want to allow large flexibility and extensibility for,
   e.g., software NVEs, yet be able to support key extensions in less
   flexible, e.g., hardware NVEs, it is useful to consider the control
   plane.  By control plane in this context we mean both protocols such
   as EVPN and others, and also deployment-specific configuration.

   If each NVE can express in the control plane that it only cares about
   particular extensions (could be a single extension, or a few), and
   the source NVEs only include requested extensions in the NVO3
   packets, then the target NVE can use a simpler parser (e.g., a TCAM
   might be usable to look for a single NVO3 extension) and the depth of
   the inner payload in the NVO3 packet will be minimized.  Furthermore,
   if the target NVE cares about a few extensions and can express in the
   control plane the desired order of those extensions in the NVO3
   packets, then it can provide useful functionality with minimal
   hardware requirements.

   Note that transit devices that are not aware of the NVO3 extensions
   somewhat benefit from such an approach, since the inner payload is
   less deep in the packet if no extraneous extensions are included in
   the packet.  However, in general a transit device is not likely to
   participate in the NVO3 control plane.  (However, configuration
   mechanisms can take into account limitations of the transit devices
   used in particular deployments.)

   Note that in this approach different NVEs could desire different
   (sets of) extensions, which means that the source NVE needs to be
   able to place different sets of extensions in different NVO3 packets,
   and perhaps in a different order.  It also assumes that underlay
   multicast or replication servers are not used together with NVO3
   extensions.

   There is a need to consider mandatory extensions versus optional
   extensions.  Mandatory extensions require the receiver to drop the
   packet if the extension is unknown.  A control plane mechanism can
   prevent the need for dropping packets with unknown extensions, since
   such extensions would not be sent to targets that do not support
   them.

   The control planes defined today need to add the ability to describe
   the different encapsulations.
   Thus perhaps EVPN, and any other control plane protocol that the IETF
   defines, should have a way to enumerate the supported NVO3 extensions
   and their order.

   The WG should consider developing a separate draft on guidance for
   option processing and control plane participation.  This should
   provide examples/guidance on a range of usage models and deployment
   scenarios for specific options and ordering that are relevant for
   that specific deployment.  This includes end points and middle boxes
   using the options.  So, having the control plane negotiate the
   constraints is the most appropriate and flexible way to address these
   requirements.

6.8.  Split NVE

   If the working group sees a need for having the hosts send and
   receive options in a split NVE case, this is possible using any of
   the existing extensible encapsulations (Geneve, GUE, GPE+NSH) by
   defining a way to carry those over other transports.  NSH can already
   be used over different transports.

   If we need to do this with other encapsulations, it can be done by
   defining an Ethertype for other encapsulations so that they can be
   carried over Ethernet and 802.1Q.

   If we need to carry other encapsulations over MPLS, it would require
   an EVPN control plane to signal that the other encapsulation header +
   options will be present in front of the L2 packet.  The VNI can be
   ignored in the header, and the MPLS label will be the one used to
   identify the EVPN L2 instance.

6.9.  Larger VNI Considerations

   We discussed whether we should make the VNI 32 bits or larger.  The
   benefit of a 24-bit VNI would be to avoid unnecessary changes to
   existing proposals and implementations, almost all of which, if not
   all, use a 24-bit VNI.  If we need a larger VNI, an extension can be
   used to support that.

7.  Design Team Recommendations

   We concluded that Geneve is most suitable as a starting point for a
   proposed standard for network virtualization, for the following
   reasons:

   1.  We studied whether the VNI should be in the base header or in
   extensions and whether it should be 24-bit or 32-bit.  The design
   team agreed that the VNI is critical information for network
   virtualization and MUST be present in all packets.  The design team
   also agreed that a 24-bit VNI matches the existing widely used
   encapsulation formats, i.e., VXLAN and NVGRE, and is hence more
   suitable to use going forward.

   2.  Geneve has a total options length field that allows skipping over
   the options for NIC offload operations, and will allow transit
   devices to view flow information in the inner payload.

   3.  We considered the option of using NSH with VXLAN-GPE, but given
   that NSH is targeted at service chaining and contains service
   chaining information, it is less suitable for the network
   virtualization use case.  The other downside of VXLAN-GPE was the
   lack of a header length field, which makes skipping over the headers
   to process the inner payload more difficult.  Total Option Length is
   present in Geneve.  It is not possible to skip any options in the
   middle with VXLAN-GPE.  In principle a split between a base header
   and a header with options is interesting (whether that options header
   is NSH or some new header without ties to a service path).  We
   explored whether it would make sense to either use NSH for this, or
   define a new NVO3 options header.  However, we observed that this
   makes it slightly harder to find the inner payload since the length
   field is not in the NVO3 header itself.  Thus one more field would
   have to be extracted to compute the start of the inner payload.
   Also, if the experience with IPv6 extension headers is any guide,
   there would be a risk that key pieces of hardware might not implement
   the options header, resulting in future calls to deprecate its use.
   Making the options part of the base NVO3 header has fewer of those
   issues.  Even though the implementation of any particular option
   cannot be predicted ahead of time, the option mechanism and the
   ability to skip the options are likely to be broadly implemented.

   4.  We compared the TLV and bit-field styles of extension and deemed
   that parsing either is expensive.  While bit-fields may be simpler to
   parse, they are also more restrictive and require guessing which
   extensions will be widely implemented so that they can get early bit
   assignments.  Given that half the bits are already assigned in GUE, a
   widely deployed extension may end up in a flag extension, which
   requires extra processing to dig the flag out of the flag extension
   and then look for the extension itself.  Bit-fields are also not
   flexible enough to address the requirements of OAM, telemetry, and
   security extensions for variable-length options and different
   subtypes of the same option.  While TLVs are more flexible, a control
   plane can restrict the number of option TLVs as well as their order
   and size to make them simpler for a data plane implementation to
   handle.

   5.  We briefly discussed the multi-vendor NVE case, and the need to
   allow vendors to put their own extensions in the NVE header.  This is
   possible with TLVs.

   6.  We also agreed that the C bit in Geneve is helpful to allow a
   receiver NVE to easily decide whether to process options or not.  For
   example, an optional extension such as a UUID-based packet trace can
   be ignored by a receiver NVE, which makes it easy for the NVE to skip
   over the options.  Thus the C bit remains as defined in Geneve.

   7.  There are already some extensions of varying sizes being
   discussed (see Section 6.2).  By using Geneve options it is possible
   to get in-band parameters such as switch id, ingress port, egress
   port, internal delay, and queue from switches in a telemetry-defined
   extension TLV.  It is also possible to add security extension TLVs
   like HMAC and DTLS/IPSEC to authenticate the Geneve packet header and
   secure the Geneve packet payload by software or hardware tunnel
   endpoints.  A Group Based Policy extension TLV can be carried as
   well.

   8.  Geneve options are implemented in production today.  There is
   also new hardware supporting Geneve TLV parsing.  In addition, the
   In-band Telemetry (INT) specification being developed by P4.org
   illustrates the option of carrying INT metadata over Geneve.  OVN/OVS
   have also defined some option TLV(s) for Geneve.

   9.  The DT has addressed the usage models while considering the
   requirements and implementations in general, including software and
   hardware.

   There seems to be interest in standardizing some well-known secure
   option TLVs to secure the header and payload, to guarantee
   encapsulation header integrity and tenant data privacy.  The design
   team recommends that the working group consider standardizing such
   option(s).

   We recommend the following enhancements to Geneve to make it more
   suitable to hardware and yet provide the flexibility for software:

   We propose text such as the following: while TLVs are more flexible,
   a control plane can restrict the number of option TLVs as well as the
   order and size of the TLVs to make them simpler for a data plane
   implementation in software or hardware to handle.  For example, there
   may be some critical information, such as a secure hash, that must be
   processed in a certain order at the lowest latency.

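   As an illustration of the kind of restriction described above, the
   following sketch shows a source NVE trimming and reordering its
   options for a given remote NVE.  The type codes, the PeerPolicy
   structure, and the select_options function are hypothetical; no
   control plane protocol or Geneve option registry defines them.

```python
from typing import List, NamedTuple, Optional


class Option(NamedTuple):
    opt_type: int  # hypothetical option type code, not an assigned value
    value: bytes   # option body


class PeerPolicy(NamedTuple):
    ordered_types: List[int]    # types the remote NVE accepts, in the order it wants
    max_options: Optional[int]  # cap on options per packet, or None for no cap


def select_options(available: List[Option], policy: PeerPolicy) -> List[Option]:
    """Keep only the options the peer asked for, in the peer's order."""
    # If a type appears more than once, the last instance wins in this sketch.
    by_type = {opt.opt_type: opt for opt in available}
    chosen = [by_type[t] for t in policy.ordered_types if t in by_type]
    if policy.max_options is not None:
        chosen = chosen[: policy.max_options]
    return chosen


# Hypothetical type codes for illustration only.
TELEMETRY, SECURE_HASH, GROUP_POLICY = 0x01, 0x02, 0x03

AVAILABLE = [
    Option(TELEMETRY, b"\x00" * 8),
    Option(GROUP_POLICY, b"\x00" * 4),
    Option(SECURE_HASH, b"\x00" * 32),
]

# A hardware NVE that wants the secure hash first and handles two options.
HW_PEER = PeerPolicy(ordered_types=[SECURE_HASH, GROUP_POLICY], max_options=2)
# A software NVE that accepts everything, telemetry first.
SW_PEER = PeerPolicy(
    ordered_types=[TELEMETRY, SECURE_HASH, GROUP_POLICY], max_options=None
)

HW_OPTIONS = select_options(AVAILABLE, HW_PEER)
SW_OPTIONS = select_options(AVAILABLE, SW_PEER)
```

   In this sketch the hardware peer receives only the secure hash and
   group policy options, in that order, while the software peer receives
   all three.  The same table could be populated by static configuration
   where no protocol control plane is present.
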
   A control plane can negotiate a subset of option TLVs and a certain
   TLV ordering, and can also limit the total number of option TLVs
   present in the packet, for example, to accommodate hardware capable
   of processing fewer options.  Hence, the control planes need to have
   the ability to describe the supported subset of TLVs and their order.

   The Geneve draft could specify that the subset and order of option
   TLVs should be configurable for each remote NVE in the absence of a
   protocol control plane.

   We recommend that Geneve follow the fragmentation recommendations for
   overlay services like PWE3 and the L2/L3 VPN recommendations to
   guarantee a larger MTU for the tunnel overhead ([RFC3985], Section
   5.3).

   We request that Geneve provide a recommendation for critical bit
   processing; the text could describe how critical bits can be used
   with a control plane that specifies the critical options.

   Given that there is a telemetry option use case for a length of 256
   bytes, we recommend that Geneve increase the single TLV option length
   to 256 bytes.

   We request that Geneve address the requirements for OAM
   considerations for alternate marking and for performance measurements
   that need 2 bits in the header, and clarify the need for the current
   OAM bit in the Geneve header.

   We recommend that the WG work on security options for Geneve.

8.  Acknowledgements

   The authors would like to thank Tom Herbert for providing the
   motivation for the Security/Integrity extension, and for his valuable
   comments, and would like to thank T. Sridhar for his valuable
   comments and feedback.

9.  Security Considerations

   This document does not introduce any additional security constraints.

10.  IANA Considerations

   This document has no actions for IANA.

11.  Appendix A

11.1.  Overview

   This section presents a comparison of the three NVO3 encapsulation
   proposals, Geneve, GUE, and VXLAN-GPE.
   The three encapsulations use an outer UDP/IP transport. Geneve and
   VXLAN-GPE use an 8-octet header, while GUE uses a 4-octet header. In
   addition to the base header, optional extensions may be included in
   the encapsulation, as discussed in Section 11.2 below.

11.2. Extensibility

11.2.1. Native Extensibility Support

   The Geneve and GUE encapsulations both enable optional headers to be
   incorporated at the end of the base encapsulation header.

   VXLAN-GPE does not provide native support for header extensions.
   However, as discussed in [I-D.ietf-nvo3-vxlan-gpe], extensibility can
   be attained to some extent if the Network Service Header (NSH)
   [RFC8300] is used immediately following the VXLAN-GPE header. NSH
   supports either a fixed-size extension (MD Type 1) or a variable-size
   TLV-based extension (MD Type 2). It should be noted that
   NSH-over-VXLAN-GPE implies the additional overhead of the 8-octet NSH
   header, in addition to the VXLAN-GPE header.

11.2.2. Extension Parsing

   The Geneve Variable Length Options are defined as Type/Length/Value
   (TLV) extensions. Similarly, VXLAN-GPE, when using NSH, can include
   NSH TLV-based extensions. In contrast, GUE defines a small set of
   possible extension fields (proposed in [I-D.herbert-gue-extensions])
   and a set of flags in the GUE header that indicate, for each
   extension type, whether it is present or not.

   TLV-based extensions, as defined in Geneve, provide the flexibility
   for a large number of possible extension types. Similar behavior can
   be supported in NSH-over-VXLAN-GPE when using MD Type 2. The
   flag-based approach taken in GUE strives to simplify implementations
   by defining a small number of possible extensions, used in a fixed
   order.

   The Geneve and GUE headers both include a length field, defining the
   total length of the encapsulation, including the optional extensions.
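   As a sketch of how such a length field is used, the following
   computes where the inner payload starts in a Geneve packet without
   examining any option. It assumes the field layout as we read it in
   [I-D.ietf-nvo3-geneve]: the first octet carries the 2-bit version
   followed by a 6-bit options length expressed in 4-octet units. The
   function name geneve_payload_offset is hypothetical.

```python
GENEVE_BASE_LEN = 8  # octets in the Geneve base header


def geneve_payload_offset(packet: bytes) -> int:
    """Return the offset of the inner payload within a Geneve packet.

    Assumed layout (from our reading of the Geneve draft): the low 6 bits
    of the first octet give the total options length in 4-octet units.
    No option is parsed; the options are skipped as an opaque block.
    """
    if len(packet) < GENEVE_BASE_LEN:
        raise ValueError("truncated Geneve base header")
    opt_len_words = packet[0] & 0x3F          # 6-bit options length
    offset = GENEVE_BASE_LEN + 4 * opt_len_words
    if len(packet) < offset:
        raise ValueError("truncated Geneve options")
    return offset


# Base header announcing 2 words (8 octets) of options, then the payload.
base = bytes([0x02, 0x00, 0x65, 0x58, 0x00, 0x00, 0x01, 0x00])
packet = base + bytes(8) + b"inner frame"
offset = geneve_payload_offset(packet)
inner = packet[offset:]
```

   With the 6-bit field at its maximum of 63, the offset is 8 + 4 * 63 =
   260 octets, which matches the maximal Geneve header length given in
   Section 11.2.4.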
   The length field simplifies parsing by transit devices that skip the
   encapsulation header without parsing its extensions.

11.2.3. Critical Extensions

   The Geneve encapsulation header includes the 'C' field, which
   indicates whether the current Geneve header includes critical options
   that must be parsed by the tunnel endpoint. If the endpoint is not
   able to process a critical option, the packet is discarded.

11.2.4. Maximal Header Length

   The maximal header length in Geneve, including options, is 260
   octets. GUE defines the maximal header to be 128 octets. VXLAN-GPE
   uses a fixed-length header of 8 octets, unless NSH-over-VXLAN-GPE is
   used, in which case the encapsulation header can be up to 264 octets.

11.3. Encapsulation Header

11.3.1. Virtual Network Identifier (VNI)

   The Geneve and VXLAN-GPE headers both include a 24-bit VNI field.
   GUE, on the other hand, enables the use of a 32-bit field called
   VNID; this field is not included in the GUE header, but was defined
   as an optional extension in [I-D.herbert-gue-extensions].

   The VXLAN-GPE header includes the 'I' bit, indicating that the VNI
   field is valid in the current header. A similar indicator is defined
   as a flag in the GUE header [I-D.herbert-gue-extensions].

11.3.2. Next Protocol

   Each of the three encapsulation headers includes a field that
   specifies the type of the next protocol header, which resides after
   the NVO3 encapsulation header. The Geneve header includes a 16-bit
   field that uses the IEEE Ethertype convention. GUE uses an 8-bit
   field, which uses the IANA Internet protocol numbering. The VXLAN-GPE
   header incorporates an 8-bit Next Protocol field, using a
   VXLAN-GPE-specific registry, defined in [I-D.ietf-nvo3-vxlan-gpe].

   The VXLAN-GPE header also includes the 'P' bit, which explicitly
   indicates whether the Next Protocol field is present in the current
   header.

11.3.3. Other Header Fields

   The OAM bit, which is defined in Geneve and in VXLAN-GPE, indicates
   whether the current packet is an OAM packet. The GUE header includes
   a similar field, but uses different terminology; the GUE 'C-bit'
   specifies whether the current packet is a control packet. Note that
   the GUE control bit can potentially be used in a large set of
   protocols that are not OAM protocols. However, the control packet
   examples discussed in [I-D.ietf-nvo3-gue] are OAM-related.

   Each of the three NVO3 encapsulation headers includes a 2-bit Version
   field, which is currently defined to be zero.

   The Geneve and VXLAN-GPE headers include reserved fields: 14 bits in
   the Geneve header and 27 bits in the VXLAN-GPE header are reserved.

11.4. Comparison Summary

   The following table summarizes the comparison between the three NVO3
   encapsulations.

   +----------------+----------------+----------------+----------------+
   |                |     Geneve     |      GUE       |   VXLAN-GPE    |
   +----------------+----------------+----------------+----------------+
   | Outer transport|     UDP/IP     |     UDP/IP     |     UDP/IP     |
   +----------------+----------------+----------------+----------------+
   | Base header    |    8 octets    |    4 octets    |    8 octets    |
   | length         |                |                |   (16 octets   |
   |                |                |                |   using NSH)   |
   +----------------+----------------+----------------+----------------+
   | Extensibility  |Variable length |Extension fields|   No native    |
   |                |    options     |                | extensibility. |
   |                |                |                |   Extensible   |
   |                |                |                |   using NSH.   |
   +----------------+----------------+----------------+----------------+
   | Extension      |   TLV-based    |   Flag-based   |   TLV-based    |
   | parsing method |                |                |(using NSH with |
   |                |                |                |   MD Type 2)   |
   +----------------+----------------+----------------+----------------+
   | Extension      |    Variable    |     Fixed      |    Variable    |
   | order          |                |                |  (using NSH)   |
   +----------------+----------------+----------------+----------------+
   | Length field   |       +        |       +        |       -        |
   +----------------+----------------+----------------+----------------+
   | Max header     |   260 octets   |   128 octets   |    8 octets    |
   | length         |                |                |(264 using NSH) |
   +----------------+----------------+----------------+----------------+
   | Critical       |       +        |       -        |       -        |
   | extension bit  |                |                |                |
   +----------------+----------------+----------------+----------------+
   | VNI field size |    24 bits     |    32 bits     |    24 bits     |
   |                |                |  (extension)   |                |
   +----------------+----------------+----------------+----------------+
   | Next protocol  |    16 bits     |     8 bits     |     8 bits     |
   | field          |   Ethertype    |    Internet    |  New registry  |
   |                |    registry    |    protocol    |                |
   |                |                |    registry    |                |
   +----------------+----------------+----------------+----------------+
   | Next protocol  |       -        |       -        |       +        |
   | indicator      |                |                |                |
   +----------------+----------------+----------------+----------------+
   | OAM / control  |    OAM bit     |  Control bit   |    OAM bit     |
   | field          |                |                |                |
   +----------------+----------------+----------------+----------------+
   | Version field  |     2 bits     |     2 bits     |     2 bits     |
   +----------------+----------------+----------------+----------------+
   | Reserved bits  |    14 bits     |       -        |    27 bits     |
   +----------------+----------------+----------------+----------------+

               Figure 1: NVO3 Encapsulation Comparison

12. Contributors

   The following co-authors have contributed to this document.

   Ilango Ganga      Intel        Email: ilango.s.ganga@intel.com

   Pankaj Garg       Microsoft    Email: pankajg@microsoft.com

   Rajeev Manur      Broadcom     Email: rajeev.manur@broadcom.com

   Tal Mizrahi       Marvell      Email: talmi@marvell.com

   David Mozes                    Email: mosesster@gmail.com

   Erik Nordmark                  Email: nordmark@sonic.net

   Michael Smith     Cisco        Email: michsmit@cisco.com

   Sam Aldrin        Google       Email: aldrin.ietf@gmail.com

   Ignas Bagdonas    Equinix      Email: ibagdona.ietf@gmail.com

13. References

13.1. Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

13.2. Informative References

   [I-D.herbert-gue-extensions]
              Herbert, T., Yong, L., and F. Templin, "Extensions for
              Generic UDP Encapsulation", draft-herbert-gue-
              extensions-01 (work in progress), October 2016.

   [I-D.ietf-nvo3-geneve]
              Gross, J., Ganga, I., and T. Sridhar, "Geneve: Generic
              Network Virtualization Encapsulation", draft-ietf-
              nvo3-geneve-14 (work in progress), September 2019.

   [I-D.ietf-nvo3-gue]
              Herbert, T., Yong, L., and O. Zia, "Generic UDP
              Encapsulation", draft-ietf-nvo3-gue-05 (work in
              progress), October 2016.

   [I-D.ietf-nvo3-vxlan-gpe]
              Maino, F., Kreeger, L., and U. Elzur, "Generic Protocol
              Extension for VXLAN", draft-ietf-nvo3-vxlan-gpe-09 (work
              in progress), December 2019.

   [I-D.smith-vxlan-group-policy]
              Smith, M. and L. Kreeger, "VXLAN Group Policy Option",
              draft-smith-vxlan-group-policy-05 (work in progress),
              October 2018.

   [RFC2418]  Bradner, S., "IETF Working Group Guidelines and
              Procedures", BCP 25, RFC 2418, DOI 10.17487/RFC2418,
              September 1998.

   [RFC3985]  Bryant, S., Ed. and P. Pate, Ed., "Pseudo Wire Emulation
              Edge-to-Edge (PWE3) Architecture", RFC 3985,
              DOI 10.17487/RFC3985, March 2005.

   [RFC8300]  Quinn, P., Ed., Elzur, U., Ed., and C. Pignataro, Ed.,
              "Network Service Header (NSH)", RFC 8300,
              DOI 10.17487/RFC8300, January 2018.

Author's Address

   Sami Boutros (editor)
   Ciena
   USA

   Email: sboutros@ciena.com