idnits 2.17.1 draft-ietf-intarea-gue-07.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- No issues found here. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year -- The document date (March 7, 2019) is 1867 days in the past. Is this intentional? Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Missing Reference: 'MUTLIQ' is mentioned on line 1449, but not defined == Unused Reference: 'MULTIQ' is defined on line 1423, but no explicit reference was found in the text ** Downref: Normative reference to an Informational RFC: RFC 2983 ** Downref: Normative reference to an Informational RFC: RFC 4459 ** Obsolete normative reference: RFC 5226 (Obsoleted by RFC 8126) -- Unexpected draft version: The latest known version of draft-herbert-gue-extensions is -01, but you're referring to -06. 
-- Obsolete informational reference (is this intentional?): RFC 5389 (Obsoleted by RFC 8489) -- Obsolete informational reference (is this intentional?): RFC 5245 (Obsoleted by RFC 8445, RFC 8839) -- Obsolete informational reference (is this intentional?): RFC 6830 (Obsoleted by RFC 9300, RFC 9301) == Outdated reference: A later version (-16) exists of draft-ietf-nvo3-geneve-10 Summary: 3 errors (**), 0 flaws (~~), 4 warnings (==), 5 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Internet Area WG T. Herbert 3 Internet-Draft Quantonium 4 Intended status: Standard track L. Yong 5 Expires September 8, 2019 Independent 6 O. Zia 7 Microsoft 8 March 7, 2019 10 Generic UDP Encapsulation 11 draft-ietf-intarea-gue-07 13 Status of this Memo 15 This Internet-Draft is submitted in full conformance with the 16 provisions of BCP 78 and BCP 79. 18 Internet-Drafts are working documents of the Internet Engineering 19 Task Force (IETF), its areas, and its working groups. Note that 20 other groups may also distribute working documents as Internet- 21 Drafts. 23 Internet-Drafts are draft documents valid for a maximum of six months 24 and may be updated, replaced, or obsoleted by other documents at any 25 time. It is inappropriate to use Internet-Drafts as reference 26 material or to cite them other than as "work in progress." 28 The list of current Internet-Drafts can be accessed at 29 http://www.ietf.org/ietf/1id-abstracts.txt 31 The list of Internet-Draft Shadow Directories can be accessed at 32 http://www.ietf.org/shadow.html 34 This Internet-Draft will expire on September 8, 2019. 36 Copyright Notice 38 Copyright (c) 2019 IETF Trust and the persons identified as the 39 document authors. All rights reserved. 
48 This document is subject to BCP 78 and the IETF Trust's Legal 49 Provisions Relating to IETF Documents 50 (http://trustee.ietf.org/license-info) in effect on the date of 51 publication of this document. Please review these documents 52 carefully, as they describe your rights and restrictions with respect 53 to this document. Code Components extracted from this document must 54 include Simplified BSD License text as described in Section 4.e of 55 the Trust Legal Provisions and are provided without warranty as 56 described in the Simplified BSD License. 58 Abstract 60 This specification describes Generic UDP Encapsulation (GUE), which 61 is a scheme for using UDP to encapsulate packets of different IP 62 protocols for transport across layer 3 networks. By encapsulating 63 packets in UDP, specialized capabilities in networking hardware for 64 efficient handling of UDP packets can be leveraged. GUE specifies 65 basic encapsulation methods upon which higher level constructs, such 66 as tunnels and overlay networks for network virtualization, can be 67 constructed. GUE is extensible by allowing optional data fields as 68 part of the encapsulation, and is generic in that it can encapsulate 69 packets of various IP protocols. 71 Table of Contents 73 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 5 74 1.1. Applicability . . . . . . . . . . . . . . . . . . . . . . . 5 75 1.2. Terminology and acronyms . . . . . . . . . . . . . . . . . 6 76 1.3. Requirements Language . . . . . . . . . . . . . . . . . . . 7 77 2. Base packet format . . . . . . . . . . . . . . . . . . . . . . 8 78 2.1. GUE variant . . . . . . . . . . . . . . . . . . 
. . . . . . 8 79 3. Variant 0 . . . . . . . . . . . . . . . . . . . . . . . . . . . 9 80 3.1. Header format . . . . . . . . . . . . . . . . . . . . . . . 9 81 3.2. Proto/ctype field . . . . . . . . . . . . . . . . . . . . . 10 82 3.2.1. Proto field . . . . . . . . . . . . . . . . . . . . . . 10 83 3.2.2. Ctype field . . . . . . . . . . . . . . . . . . . . . . 11 84 3.3. Flags and extension fields . . . . . . . . . . . . . . . . 11 85 3.3.1. Requirements . . . . . . . . . . . . . . . . . . . . . 11 86 3.3.2. Example GUE header with extension fields . . . . . . . 12 87 3.4. Private data . . . . . . . . . . . . . . . . . . . . . . . 13 88 3.5. Message types . . . . . . . . . . . . . . . . . . . . . . . 13 89 3.5.1. Control messages . . . . . . . . . . . . . . . . . . . 13 90 3.5.2. Data messages . . . . . . . . . . . . . . . . . . . . . 14 91 4. Variant 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 92 4.1. Direct encapsulation of IPv4 . . . . . . . . . . . . . . . 15 93 4.2. Direct encapsulation of IPv6 . . . . . . . . . . . . . . . 16 94 5. Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . 17 95 5.1. Network tunnel encapsulation . . . . . . . . . . . . . . . 17 96 5.2. Transport layer encapsulation . . . . . . . . . . . . . . . 17 97 5.3. Encapsulator operation . . . . . . . . . . . . . . . . . . 18 98 5.4. Decapsulator operation . . . . . . . . . . . . . . . . . . 18 99 5.4.1. Processing a received data message . . . . . . . . . . 18 100 5.4.2. Processing a received control message . . . . . . . . . 19 101 5.5. Middlebox inspection . . . . . . . . . . . . . . . . . . . 19 102 5.6. Router and switch operation . . . . . . . . . . . . . . . . 20 103 5.6.1. Connection semantics . . . . . . . . . . . . . . . . . 20 104 5.6.2. NAT . . . . . . . . . . . . . . . . . . . . . . . . . . 21 105 5.7. Checksum Handling . . . . . . . . . . . . . . . . . . . . . 21 106 5.7.1. Requirements . . . . . . . . . . . . . . . . . . . . . 21 107 5.7.2. 
UDP Checksum with IPv4 . . . . . . . . . . . . . . . . 21 108 5.7.3. UDP Checksum with IPv6 . . . . . . . . . . . . . . . . 22 109 5.8. MTU and fragmentation . . . . . . . . . . . . . . . . . . . 22 110 5.9. Congestion control . . . . . . . . . . . . . . . . . . . . 23 111 5.10. Multicast . . . . . . . . . . . . . . . . . . . . . . . . 23 112 5.11. Flow entropy for ECMP . . . . . . . . . . . . . . . . . . 23 113 5.11.1. Flow classification . . . . . . . . . . . . . . . . . 24 114 5.11.2. Flow entropy properties . . . . . . . . . . . . . . . 24 115 5.12. Negotiation of acceptable flags and extension fields . . . 25 116 6. Motivation for GUE . . . . . . . . . . . . . . . . . . . . . . 26 117 6.1. Benefits of GUE . . . . . . . . . . . . . . . . . . . . . . 26 118 6.2. Comparison of GUE to other encapsulations . . . . . . . . . 26 119 7. Security Considerations . . . . . . . . . . . . . . . . . . . . 28 120 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . . 28 121 8.1. UDP source port . . . . . . . . . . . . . . . . . . . . . . 28 122 8.2. GUE variant number . . . . . . . . . . . . . . . . . . . . 29 123 8.3. Control types . . . . . . . . . . . . . . . . . . . . . . . 29 124 9. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 29 125 10. References . . . . . . . . . . . . . . . . . . . . . . . . . . 30 126 10.1. Normative References . . . . . . . . . . . . . . . . . . . 30 127 10.2. Informative References . . . . . . . . . . . . . . . . . . 31 128 Appendix A: NIC processing for GUE . . . . . . . . . . . . . . . . 34 129 A.1. Receive multi-queue . . . . . . . . . . . . . . . . . . . . 34 130 A.2. Checksum offload . . . . . . . . . . . . . . . . . . . . . 34 131 A.2.1. Transmit checksum offload . . . . . . . . . . . . . . . 35 132 A.2.2. Receive checksum offload . . . . . . . . . . . . . . . 35 133 A.3. Transmit Segmentation Offload . . . . . . . . . . . . . . . 36 134 A.4. Large Receive Offload . . . . . . . . . . . . . . . . . . . 
37 135 Appendix B: Implementation considerations . . . . . . . . . . . . 37 136 B.1. Privileged ports . . . . . . . . . . . . . . . . . . . . . 37 137 B.2. Setting flow entropy as a route selector . . . . . . . . . 38 138 B.3. Hardware protocol implementation considerations . . . . . . 38 139 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 39 141 1. Introduction 143 This specification describes Generic UDP Encapsulation (GUE), which is 144 a general method for encapsulating packets of arbitrary IP protocols 145 within User Datagram Protocol (UDP) [RFC0768] packets. Encapsulating 146 packets in UDP facilitates efficient transport across networks. 147 Networking devices widely provide protocol specific processing and 148 optimizations for UDP (as well as TCP) packets. Packets for atypical 149 IP protocols (those not usually parsed by networking hardware) can be 150 encapsulated in UDP packets to maximize deliverability and to 151 leverage flow specific mechanisms for routing and packet steering. 153 GUE provides an extensible header format for including optional data 154 in the encapsulation header. This data potentially covers items such 155 as a virtual networking identifier, security data for validating or 156 authenticating the GUE header, congestion control data, etc. GUE also 157 allows private optional data in the encapsulation header. This 158 feature can be used by a site or implementation to define local 159 custom optional data, and allows experimentation with options that may 160 eventually become standard. 162 This document does not define any specific GUE extensions. [GUEEXTEN] 163 specifies a set of initial extensions. 165 1.1. Applicability 167 GUE is a network encapsulation protocol that encapsulates packets for 168 various IP protocols. Potential use cases include network tunneling, 169 multi-tenant network virtualization, tunneling for mobility, and 170 transport layer encapsulation. 
GUE is intended for deploying overlay 171 networks in public or private data center environments, as well as 172 providing a general tunneling mechanism usable in the Internet. 174 GUE is a UDP based encapsulation protocol transported over existing 175 IPv4 and IPv6 networks. Hence, as a UDP based protocol, GUE adheres 176 to the UDP usage guidelines as specified in [RFC8085]. Applicability 177 of these guidelines is dependent on the underlay IP network and the 178 nature of GUE payload protocol (for example TCP/IP or IP/Ethernet). 180 [RFC8085] outlines two applicability scenarios for UDP applications, 181 1) general Internet and 2) controlled environment. GUE is intended to 182 allow deployment in both controlled environments and in the 183 uncontrolled Internet. The requirements of [RFC8085] pertaining to 184 deployment of a UDP encapsulation protocol in these environments are 185 applicable. Section 5 provides the specifics for satisfying 186 requirements of [RFC8085]. It is the responsibility of the operator 187 deploying GUE to ensure that the necessary operational requirements 188 are met for the environment in which GUE is being deployed. 190 GUE has much of the same applicability and benefits as GRE-in-UDP 191 [RFC8086] that are afforded by UDP encapsulation protocols. GUE 192 offers the possibility of good performance for load-balancing 193 encapsulated IP traffic in transit networks using existing Equal-Cost 194 Multipath (ECMP) mechanisms that use a hash of the five-tuple of 195 source IP address, destination IP address, UDP/TCP source port, 196 UDP/TCP destination port, and protocol number. Encapsulating packets 197 in UDP enables use of the UDP source port to provide entropy to ECMP 198 hashing. 200 In addition, GUE enables extending the use of atypical IP protocols 201 (those other than TCP and UDP) across networks that might otherwise 202 filter packets carrying those protocols. 
GUE may also be used with 203 connection oriented UDP semantics in order to facilitate traversal 204 through stateful firewalls and stateful NAT. 206 Additional motivation for the GUE protocol is provided in section 6. 208 1.2. Terminology and acronyms 210 GUE Generic UDP Encapsulation 212 GUE Header A variable length protocol header that is composed 213 of a primary four byte header and zero or more four 214 byte words of optional header data 216 GUE packet A UDP/IP packet that contains a GUE header and GUE 217 payload within the UDP payload 219 GUE variant A version of the GUE protocol or an alternate form 220 of a version 222 Encapsulator A network node that encapsulates packets in GUE 224 Decapsulator A network node that decapsulates and processes 225 packets encapsulated in GUE 227 Data message An encapsulated packet in a GUE payload that is 228 addressed to the protocol stack for an associated 229 protocol 231 Control message A formatted message in the GUE payload that is 232 implicitly addressed to the decapsulator to monitor 233 or control the state or behavior of a tunnel 235 Flags A set of bit flags in the primary GUE header 236 Extension field 237 An optional field in a GUE header whose presence is 238 indicated by corresponding flag(s) 240 C-bit A single bit flag in the primary GUE header that 241 indicates whether the GUE packet contains a control 242 message or data message 244 Hlen A field in the primary GUE header that gives the 245 length of the GUE header 247 Proto/ctype A field in the GUE header that holds either the IP 248 protocol number for a data message or a type for a 249 control message 251 Private data Optional data in the GUE header that can be used for 252 private purposes 254 Outer IP header Refers to the outer most IP header or packet when 255 encapsulating a packet over IP 257 Inner IP header Refers to an encapsulated IP header when an IP 258 packet is encapsulated 260 Outer packet Refers to an encapsulating packet 262 Inner packet 
Refers to a packet that is encapsulated 264 1.3. Requirements Language 266 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 267 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 268 document are to be interpreted as described in [RFC2119]. 270 2. Base packet format 272 A GUE packet consists of a UDP packet whose payload is a GUE 273 header followed by a payload which is either an encapsulated packet 274 of some IP protocol or a control message such as an OAM (Operations, 275 Administration, and Management) message. A GUE packet has the general 276 format: 278 +-------------------------------+ 279 | | 280 | UDP/IP header | 281 | | 282 |-------------------------------| 283 | | 284 | GUE Header | 285 | | 286 |-------------------------------| 287 | | 288 | Encapsulated packet | 289 | or control message | 290 | | 291 +-------------------------------+ 293 The GUE header is variable length as determined by the presence of 294 optional extension fields and private data. 296 2.1. GUE variant 298 The first two bits of the GUE header contain the GUE protocol variant 299 number. The variant number can indicate the version of the GUE 300 protocol as well as alternate forms of a version. 302 Variants 0 and 1 are described in this specification; variants 2 and 303 3 are reserved. 305 3. Variant 0 307 Variant 0 indicates version 0 of GUE. This variant defines a generic 308 extensible format to encapsulate packets by Internet protocol number. 310 3.1. 
Header format 312 The header format for variant 0 of GUE in UDP is: 314 0 1 2 3 315 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 316 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+\ 317 | Source port | Destination port | | 318 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ UDP 319 | Length | Checksum | | 320 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+/ 321 | 0 |C| Hlen | Proto/ctype | Flags |\ 322 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | 323 | | | 324 ~ Extensions Fields (optional) ~ | 325 | | GUE 326 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | 327 | | | 328 ~ Private data (optional) ~ | 329 | | | 330 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+/ 332 The contents of the UDP header are: 334 o Source port: If connection semantics (section 5.6.1) are applied 335 to an encapsulation, this is set to the local source port for 336 the connection. When connection semantics are not applied, the 337 source port is either set to a flow entropy value, as described 338 in section 5.11, or is set to the GUE assigned port number, 339 6080. 341 o Destination port: If connection semantics (section 5.6.1) are 342 applied to an encapsulation, this is set to the destination port 343 for the tuple. If connection semantics are not applied then the 344 destination port is set to the GUE assigned port number, 6080. 346 o Length: Canonical length of the UDP packet (length of UDP header 347 and payload). 349 o Checksum: Standard UDP checksum (handling is described in 350 section 5.7). 352 The GUE header consists of: 354 o Variant: 0 indicates GUE protocol version 0 with a header. 356 o C: C-bit: When set indicates a control message. When not set 357 indicates a data message. 359 o Hlen: Length in 32-bit words of the GUE header, including 360 optional extension fields but not the first four bytes of the 361 header. 
Computed as (header_len - 4) / 4, where header_len is 362 the total header length in bytes. All GUE headers are a multiple 363 of four bytes in length. Maximum header length is 128 bytes. 365 o Proto/ctype: When the C-bit is set, this field contains a 366 control message type for the payload (section 3.2.2). When the 367 C-bit is not set, the field holds the Internet protocol number 368 for the encapsulated packet in the payload (section 3.2.1). The 369 control message or encapsulated packet begins at the offset 370 provided by Hlen. 372 o Flags: Header flags that may be allocated for various purposes 373 and may indicate the presence of extension fields. Undefined 374 header flag bits MUST be set to zero on transmission. 376 o Extension Fields: Optional fields whose presence is indicated by 377 corresponding flags. 379 o Private data: Optional private data block (see section 3.4). If 380 the private block is present, it immediately follows the last 381 extension field present in the header. The private block is 382 considered to be part of the GUE header. The length of this data 383 is determined by subtracting the starting offset of the private 384 data from the header length. 386 3.2. Proto/ctype field 388 The proto/ctype field contains either an Internet protocol number 389 (when the C-bit is not set) or a GUE control message type (when the 390 C-bit is set). 392 3.2.1. Proto field 394 When the C-bit is not set, the proto/ctype field MUST contain an IANA 395 Internet Protocol Number [IANA-PN]. The protocol number is 396 interpreted relative to the IP protocol that encapsulates the UDP 397 packet (i.e. protocol of the outer IP header). The protocol number 398 serves as an indication of the type of the next protocol header which 399 is contained in the GUE payload at the offset indicated in Hlen. 401 IP protocol number 59 ("No next header") can be set to indicate that 402 the GUE payload does not begin with the header of an IP protocol. 
403 This would be the case, for instance, if the GUE payload were a 404 fragment when performing GUE level fragmentation. The interpretation 405 of the payload is performed through other means such as flags and 406 extension fields, and nodes MUST NOT parse packets based on the IP 407 protocol number in this case. 409 3.2.2. Ctype field 411 When the C-bit is set, the proto/ctype field MUST be set to a valid 412 control message type. A value of zero indicates that the GUE payload 413 requires further interpretation to deduce the control type. This 414 might be the case when the payload is a fragment of a control 415 message, where only the reassembled packet can be interpreted as a 416 control message. 418 Control messages will be defined in an IANA registry. Control message 419 types 1 through 127 may be defined in standards. Types 128 through 420 255 are reserved to be user defined for experimentation or private 421 control messages. 423 This document does not specify any standard control message types 424 other than type 0. Type 0 does not define a format of the control 425 message. Instead, it indicates that the GUE payload is a control 426 message, or part of a control message (as might be the case in GUE 427 fragmentation), that cannot be correctly parsed or interpreted 428 without additional context. 430 3.3. Flags and extension fields 432 Flags and associated extension fields are the primary mechanism of 433 extensibility in GUE. As mentioned in section 3.1, GUE header flags 434 indicate the presence of optional extension fields in the GUE header. 435 [GUEEXTEN] defines an initial set of GUE extensions. 437 3.3.1. Requirements 439 There are sixteen flag bits in the GUE header. Flags may indicate 440 presence of extension fields. The size of an extension field 441 indicated by a flag MUST be fixed in the specification of the flag. 443 Flags can be paired together to allow different lengths for an 444 extension field. 
For example, if two flag bits are paired, a field 445 can have one of three different lengths: a bit value of 00 446 indicates that no field is present, and 01, 10, and 11 indicate three possible 447 lengths for the field. Regardless of how flag bits are paired, the 448 lengths and offsets of extension fields corresponding to a set of 449 flags MUST be well defined and deterministic. 451 Extension fields are placed in order of the flags. New flags are to 452 be allocated from high to low order bit contiguously without holes. 453 Flags allow random access: for instance, to inspect the field 454 corresponding to the Nth flag bit, an implementation only considers 455 the previous N-1 flags to determine the offset. Flags after the Nth 456 flag are not pertinent in calculating the offset of the field for the 457 Nth flag. Random access of flags and fields permits processing of 458 optional extensions in an order that is independent of their position 459 in the packet. 461 Flags (or paired flags) are idempotent such that new flags MUST NOT 462 cause reinterpretation of old flags. Also, new flags MUST NOT alter 463 interpretation of other elements in the GUE header nor how the 464 message is parsed (for instance, in a data message the proto/ctype 465 field always holds an IP protocol number as an invariant). 467 The set of available flags can be extended in the future by defining 468 a "flag extensions bit" that refers to a field containing a new set 469 of flags. 471 3.3.2. 
Example GUE header with extension fields 473 An example GUE header for a data message encapsulating an IPv4 packet 474 and containing the Group Identifier and Security extension fields 475 (both defined in [GUEEXTEN]) is shown below: 477 0 1 2 3 478 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 479 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 480 | 0 |0| 3 | 4 |1|0 0 1| 0 | 481 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 482 | Group Identifier | 483 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 484 | | 485 + Security + 486 | | 487 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 489 In the above example, the first flag bit is set which indicates that 490 the Group Identifier extension is present which is a 32 bit field. 491 The second through fourth bits of the flags are paired flags that 492 indicate the presence of a Security field with seven possible sizes. 493 In this example 001 indicates a sixty-four bit security field. 495 3.4. Private data 497 An implementation MAY use private data for its own use. The private 498 data immediately follows the last extension field in the GUE header 499 and is not a fixed length. This data is considered part of the GUE 500 header and MUST be accounted for in header length (Hlen). The length 501 of the private data MUST be a multiple of four bytes and is 502 determined by subtracting the offset of private data in the GUE 503 header from the header length. Specifically: 505 Private_length = (Hlen * 4) - Length(flags) 507 where "Length(flags)" returns the sum of lengths of all the extension 508 fields present in the GUE header. When there is no private data 509 present, the length of the private data is zero. 511 The semantics and interpretation of private data are implementation 512 specific. The private data may be structured as necessary, for 513 instance it might contain its own set of flags and extension fields. 
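The private data length calculation above can be sketched in Python as follows. This is only an illustrative sketch, not part of the specification: the function names are hypothetical, the bit positions are read from the header diagram in section 3.1 (two variant bits, the C-bit, a five-bit Hlen, an eight-bit proto/ctype, sixteen flag bits), and the total length of the extension fields is passed in by the caller because the flag definitions themselves live in [GUEEXTEN].

```python
import struct


def parse_primary_header(data: bytes):
    """Split the first four bytes of a variant 0 GUE header into its fields.

    Hypothetical helper; bit layout taken from the diagram in section 3.1.
    """
    if len(data) < 4:
        raise ValueError("GUE primary header needs at least four bytes")
    first, proto_ctype, flags = struct.unpack("!BBH", data[:4])
    variant = first >> 6        # top two bits of the first byte
    c_bit = (first >> 5) & 0x1  # set for control messages, clear for data
    hlen = first & 0x1F         # 32-bit words following the first four bytes
    return variant, c_bit, hlen, proto_ctype, flags


def private_data_length(hlen: int, ext_fields_len: int) -> int:
    """Private_length = (Hlen * 4) - Length(flags), both terms in bytes."""
    private_len = hlen * 4 - ext_fields_len
    if private_len < 0:
        raise ValueError("Hlen is too small for the extension fields present")
    if private_len % 4:
        raise ValueError("private data length is not a multiple of four bytes")
    return private_len


# Primary header from the example in section 3.3.2:
# variant 0, C-bit clear, Hlen 3, proto 4, flags 1|001|0...0 (0x9000).
header = bytes([0x03, 0x04, 0x90, 0x00])
variant, c_bit, hlen, proto, flags = parse_primary_header(header)

# That example carries a 4-byte Group Identifier and an 8-byte Security
# field, which exactly fill the three words counted by Hlen.
no_private = private_data_length(hlen, 4 + 8)
```

With the example header, the extension fields account for all twelve bytes, so the computed private data length is zero. A packet with the same flags but Hlen set to 5 would carry eight bytes of private data.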
515 An encapsulator and decapsulator MUST agree on the meaning of private 516 data before using it. The mechanism to achieve this agreement is 517 outside the scope of this document but could include implementation- 518 defined behavior, coordinated configuration, in-band communication 519 using GUE control messages, or out-of-band messages. 521 If a decapsulator receives a GUE packet with private data, it MUST 522 validate the private data appropriately. If a decapsulator does not 523 expect private data from an encapsulator, the packet MUST be dropped. 524 If a decapsulator cannot validate the contents of private data per 525 the provided semantics, the packet MUST also be dropped. An 526 implementation MAY place security data in GUE private data which if 527 present MUST be verified for packet acceptance. 529 3.5. Message types 531 There are two message types in GUE variant 0: control messages and 532 data messages. 534 3.5.1. Control messages 536 Control messages carry formatted data that are implicitly addressed 537 to the decapsulator to monitor or control the state or behavior of a 538 tunnel (OAM). For instance, an echo request and corresponding echo 539 reply message can be defined to test for liveness. 541 Control messages are indicated in the GUE header when the C-bit is 542 set. The payload is interpreted as a control message with type 543 specified in the proto/ctype field. The format and contents of the 544 control message are indicated by the type and can be variable length. 546 Other than interpreting the proto/ctype field as a control message 547 type, the meaning and semantics of the rest of the elements in the 548 GUE header are the same as that of data messages. Forwarding and 549 routing of control messages should be the same as that of a data 550 message with the same outer IP and UDP header; this ensures that 551 control messages can be created that follow the same path through the 552 network as data messages. 554 3.5.2. 
Data messages 556 Data messages carry encapsulated packets that are addressed to the 557 protocol stack for the associated protocol. Data messages are a 558 primary means of encapsulation and can be used to create tunnels for 559 overlay networks. 561 Data messages are indicated in the GUE header when the C-bit is not set. 562 The payload of a data message is interpreted as an encapsulated 563 packet of an Internet protocol indicated in the proto/ctype field. 564 The encapsulated packet immediately follows the GUE header. 566 4. Variant 1 568 Variant 1 of GUE allows direct encapsulation of IPv4 and IPv6 in UDP. 569 In this variant there is no GUE header; a UDP packet carries an IP 570 packet. The first two bits of the UDP payload are the GUE variant 571 field and coincide with the first two bits of the version number in 572 the IP header. The first two version bits of IPv4 and IPv6 are 01, so 573 GUE variant 1 is used for direct IP encapsulation, which makes the two 574 GUE variant bits 01 as well. 576 This technique is effectively a means to compress out the GUE version 577 0 header when encapsulating IPv4 or IPv6 packets and there are no 578 flags, extension fields, or private data present. This method can be 579 used on the same port number as packets with the GUE 580 header (GUE variant 0 packets). This technique saves encapsulation 581 overhead on costly links for the common use of IP encapsulation, and 582 also obviates the need to allocate a separate UDP port number for 583 IP-over-UDP encapsulation. 585 4.1. 
Direct encapsulation of IPv4 587 The format for encapsulating IPv4 directly in UDP is: 589 0 1 2 3 590 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 591 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+\ 592 | Source port | Destination port | | 593 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ UDP 594 | Length | Checksum | | 595 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+/ 596 |0|1|0|0| IHL |Type of Service| Total Length | 597 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 598 | Identification |Flags| Fragment Offset | 599 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 600 | Time to Live | Protocol | Header Checksum | 601 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 602 | Source IPv4 Address | 603 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 604 | Destination IPv4 Address | 605 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 607 The UDP fields are set in a similar manner as described in section 608 3.1. 610 Note that the 0100 value in the first four bits of the UDP payload 611 expresses the GUE variant as 1 (bits 01) and IP version as 4 (bits 612 0100). 614 4.2. 
Direct encapsulation of IPv6 616 The format for encapsulating IPv6 directly in UDP is demonstrated 617 below: 619 0 1 2 3 620 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 621 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+\ 622 | Source port | Destination port | | 623 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ UDP 624 | Length | Checksum | | 625 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+/ 626 |0|1|1|0| Traffic Class | Flow Label | 627 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 628 | Payload Length | NextHdr | Hop Limit | 629 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 630 | | 631 + + 632 | | 633 + Source IPv6 Address + 634 | | 635 + + 636 | | 637 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 638 | | 639 + + 640 | | 641 + Destination IPv6 Address + 642 | | 643 + + 644 | | 645 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 647 The UDP fields are set in a similar manner as described in section 648 3.1. 650 Note that the 0110 value in the first four bits of the UDP 651 payload expresses the GUE variant as 1 (bits 01) and IP version as 6 652 (bits 0110). 654 5. Operation 656 The figure below illustrates the use of GUE encapsulation between two 657 hosts. Host 1 is sending packets to Host 2. An encapsulator performs 658 encapsulation of packets from Host 1. These encapsulated packets 659 traverse the network as UDP packets. At the decapsulator, packets are 660 decapsulated and sent on to Host 2. Packet flow in the reverse 661 direction need not be symmetric; for example, the reverse path might 662 not use GUE or any other form of encapsulation. 
664 +---------------+ +---------------+ 665 | | | | 666 | Host 1 | | Host 2 | 667 | | | | 668 +---------------+ +---------------+ 669 | ^ 670 V | 671 +---------------+ +---------------+ +---------------+ 672 | | | | | | 673 | Encapsulator |-->| Layer 3 |-->| Decapsulator | 674 | | | Network | | | 675 +---------------+ +---------------+ +---------------+ 677 The encapsulator and decapsulator may be co-resident with the 678 corresponding hosts, or may be on separate nodes in the network. 680 5.1. Network tunnel encapsulation 682 Network tunneling can be achieved by encapsulating layer 2 or layer 3 683 packets. In this case, the encapsulator and decapsulator nodes are 684 the tunnel endpoints. These could be routers that provide network 685 tunnels on behalf of communicating hosts. 687 5.2. Transport layer encapsulation 689 When encapsulating layer 4 packets, the encapsulator and decapsulator 690 should be co-resident with the hosts. In this case, the encapsulation 691 headers are inserted between the IP header and the transport packet. 692 The addresses in the IP header refer to both the endpoints of the 693 encapsulation and the endpoints for terminating the encapsulated 694 transport protocol. Note that the transport layer ports in the 695 encapsulated packet are independent of the UDP ports in the outer 696 packet. 698 5.3. Encapsulator operation 700 Encapsulators create GUE data messages, set the fields of the UDP 701 header, set flags and optional extension fields in the GUE header, 702 and forward packets to a decapsulator. 704 An encapsulator can be an end host originating the packets of a flow, 705 or can be a network device performing encapsulation on behalf of 706 hosts (routers implementing tunnels for instance). In either case, 707 the intended target (decapsulator) is indicated by the outer 708 destination IP address and destination port in the UDP header. 
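As a rough illustration of how an encapsulator might fill in the outer UDP ports, the sketch below uses the assigned GUE destination port (6080, section 8.1) and derives the source port from a flow key so that it falls in the ephemeral range with fourteen bits of entropy (section 5.11.2). The function names, the flow-key encoding, and the choice of CRC-32 as the hash are assumptions made for illustration only; this document does not mandate any particular hash.

```python
import zlib

GUE_PORT = 6080          # assigned GUE UDP port (section 8.1)
EPHEMERAL_BASE = 0xC000  # 49152: the high-order two bits of the port are set
ENTROPY_MASK = 0x3FFF    # the remaining fourteen bits carry flow entropy


def flow_entropy_port(flow_key: bytes, seed: int = 0) -> int:
    """Map an inner-flow key to a UDP source port in 49152..65535.

    CRC-32 is a stand-in hash (an assumption); `seed` models the random
    seeding recommended for the flow entropy function.
    """
    digest = zlib.crc32(flow_key, seed & 0xFFFFFFFF)
    return EPHEMERAL_BASE | (digest & ENTROPY_MASK)


def outer_udp_ports(flow_key: bytes, seed: int = 0) -> tuple:
    """Return (source_port, destination_port) for the outer UDP header."""
    return flow_entropy_port(flow_key, seed), GUE_PORT


# Hypothetical flow key built from the inner five-tuple.
src, dst = outer_udp_ports(b"tcp|192.0.2.1|1234|198.51.100.7|80", seed=0x5EED)
```

Packets of the same inner flow produce the same key and therefore the same source port, which is what lets devices hashing the outer five-tuple keep a flow on one path.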
710 If an encapsulator is tunneling packets -- that is encapsulating 711 packets of layer 2 or layer 3 protocols (e.g. EtherIP, IPIP, ESP 712 tunnel mode) -- it SHOULD follow standard conventions for tunneling 713 one protocol over another. For instance, if an IP packet is being 714 encapsulated in GUE then diffserv interaction [RFC2983] and ECN 715 propagation for tunnels [RFC6040] SHOULD be followed. 717 5.4. Decapsulator operation 719 A decapsulator performs decapsulation of GUE packets. A decapsulator 720 is addressed by the outer destination IP address and UDP destination 721 port of a GUE packet. The decapsulator validates packets, including 722 fields of the GUE header. 724 If a decapsulator receives a GUE packet with an unsupported variant, 725 unknown flag, bad header length (too small for included extension 726 fields), unknown control message type, bad protocol number, an 727 unsupported payload type, or an otherwise malformed header, it MUST 728 drop the packet. Such events MAY be logged subject to configuration 729 and rate limiting of logging messages. Note that set flags in a GUE 730 header that are unknown to a decapsulator MUST NOT be ignored. If a 731 GUE packet is received by a decapsulator with unknown flags, the 732 packet MUST be dropped. 734 5.4.1. Processing a received data message 736 If a valid data message is received, the UDP header and GUE header 737 are (logically) removed from the packet. The outer IP header remains 738 intact and the next protocol in the IP header is set to the protocol 739 from the proto field in the GUE header. The resulting packet is then 740 resubmitted into the protocol stack to process the packet as though 741 it was received with the protocol indicated in the GUE header. 743 As an example, consider that a data message is received where GUE 744 encapsulates an IPv4 packet using GUE variant 0. 
In this case the proto 745 field in the GUE header is set to 4 for IPv4 encapsulation: 747 +-------------------------------------+ 748 | IP header (next proto = 17,UDP) | 749 |-------------------------------------| 750 | UDP | 751 |-------------------------------------| 752 | GUE (proto = 4,IPv4 encapsulation) | 753 |-------------------------------------| 754 | IPv4 header and packet | 755 +-------------------------------------+ 757 The receiver removes the UDP and GUE headers and sets the next 758 protocol field in the IP packet to 4, which is derived from the GUE 759 proto field. The resultant packet would have the format: 761 +-------------------------------------+ 762 | IP header (next proto = 4,IPv4) | 763 |-------------------------------------| 764 | IPv4 header and packet | 765 +-------------------------------------+ 767 This packet is then resubmitted into the protocol stack to be 768 processed as an IPv4 encapsulated packet. 770 5.4.2. Processing a received control message 772 If a valid control message is received, the packet MUST be processed 773 as a control message. The specific processing to be performed depends 774 on the value in the ctype field of the GUE header. 776 5.5. Middlebox inspection 778 A middlebox MAY inspect a GUE header. A middlebox MUST NOT modify a 779 GUE header or UDP payload. 781 To inspect a GUE header, a middlebox needs to identify GUE packets. 782 The obvious method is to match the destination UDP port number against 783 the GUE port number (i.e. 6080). Per [RFC7605], transport port 784 numbers only have meaning at the endpoints of communications, so 785 inferring the type of a UDP payload based on port number may be 786 incorrect. Middleboxes MUST NOT take any action that would have 787 harmful side effects if a UDP packet were misinterpreted as being a 788 GUE packet.
In particular, a middlebox MUST NOT modify a UDP payload 789 based on inferring the payload type from the port number, lest the 790 middlebox cause silent data corruption. 792 A middlebox MAY interpret some flags and extension fields of the GUE 793 header for classification purposes, but is not required to understand 794 any of the flags or extension fields in GUE packets. A middlebox MUST 795 NOT drop a GUE packet merely because there are flags unknown to it. 796 Similarly, a middlebox MUST NOT arbitrarily filter packets based on 797 GUE flags or extension fields that are present or not present. The 798 header length in the GUE header allows a middlebox to inspect the 799 payload packet without needing to parse the flags or extension 800 fields. 802 5.6. Router and switch operation 804 Routers and switches SHOULD forward GUE packets as standard UDP/IP 805 packets. The outer five-tuple should contain sufficient information 806 to perform flow classification corresponding to the flow of the inner 807 packet. A router does not normally need to parse a GUE header, and 808 none of the flags or extension fields in the GUE header are expected 809 to affect routing. In cases where the outer five-tuple does not 810 provide sufficient entropy for flow classification, for instance when UDP 811 ports are fixed to provide connection semantics (section 5.6.1), 812 the encapsulated packet MAY be parsed to determine flow entropy. 814 A router MUST NOT modify a GUE header or payload when forwarding a 815 packet. It MAY encapsulate a GUE packet in another GUE packet, for 816 instance to implement a network tunnel (i.e. by encapsulating an IP 817 packet with a GUE payload in another IP packet as a GUE payload). In 818 this case, the router takes the role of an encapsulator, and the 819 corresponding decapsulator is the logical endpoint of the tunnel.
820 When encapsulating a GUE packet within another GUE packet, there are 821 no provisions to automatically copy flags or fields to the outer GUE 822 header. Each layer of encapsulation is considered independent. 824 5.6.1. Connection semantics 826 A middlebox might infer bidirectional connection semantics for a UDP 827 flow. For instance, a stateful firewall might create a five-tuple 828 rule to match flows on egress, and a corresponding five-tuple rule 829 for matching ingress packets where the roles of source and 830 destination are reversed for the IP addresses and UDP port numbers. 831 To operate in this environment, a GUE tunnel should be configured to 832 assume connected semantics defined by the UDP five tuple and the use 833 of GUE encapsulation needs to be symmetric between both endpoints. 834 The source port set in the UDP header MUST be the destination port 835 the peer would set for replies. In this case, the UDP source port for 836 a tunnel would be a fixed value and not set to be flow entropy as 837 described in section 5.11. 839 The selection of whether to make the UDP source port fixed or set to 840 a flow entropy value for each packet sent SHOULD be configurable for 841 a tunnel. The default MUST be to set the flow entropy value in the 842 UDP source port. 844 5.6.2. NAT 846 IP address and port translation can be performed on the UDP/IP 847 headers adhering to the requirements for NAT (Network Address 848 Translation) with UDP [RFC4787]. In the case of stateful NAT, 849 connection semantics MUST be applied to a GUE tunnel as described in 850 section 5.6.1. GUE endpoints MAY also invoke STUN [RFC5389] or ICE 851 [RFC5245] to manage NAT port mappings for encapsulations. 853 5.7. Checksum Handling 855 The potential for mis-delivery of packets due to corruption of IP, 856 UDP, or GUE headers needs to be considered. 
Historically, the UDP 857 checksum would be considered sufficient as a check against corruption 858 of either the UDP header and payload or the IP addresses. 859 Encapsulation protocols, such as GUE, can be originated or terminated 860 on devices incapable of computing the UDP checksum for a packet. This 861 section discusses the requirements around checksums and alternatives 862 that might be used when an endpoint does not support UDP checksums. 864 5.7.1. Requirements 866 One of the following requirements MUST be met: 868 o UDP checksums are enabled (for IPv4 or IPv6). 870 o The GUE header checksum is used (defined in [GUEEXTEN]). 872 o Zero UDP checksums are used. This is always permissible with IPv4; in 873 IPv6, they can only be used in accordance with applicable 874 requirements in [RFC8086], [RFC6935], and [RFC6936]. 876 5.7.2. UDP Checksum with IPv4 878 For UDP in IPv4, the UDP checksum MUST be processed as specified in 879 [RFC0768] and [RFC1122] for both transmit and receive. An 880 encapsulator MAY set the UDP checksum to zero for performance or 881 implementation considerations. The IPv4 header includes a checksum 882 that protects against mis-delivery of the packet due to corruption of 883 IP addresses. The UDP checksum potentially provides protection 884 against corruption of the UDP header, GUE header, and GUE payload. 885 Enabling or disabling the use of checksums is a deployment 886 consideration that should take into account the risk and effects of 887 packet corruption, and whether the packets in the network are already 888 adequately protected by other, possibly stronger mechanisms, such as 889 the Ethernet CRC. If an encapsulator sets a zero UDP checksum for 890 IPv4, it SHOULD use the GUE header checksum as described in 891 [GUEEXTEN] if there are no other mechanisms used that would detect 892 corruption of GUE packets. 894 When a decapsulator receives a packet, the UDP checksum field MUST be 895 processed.
If the UDP checksum is non-zero, the decapsulator MUST 896 verify the checksum before accepting the packet. By default, a 897 decapsulator SHOULD accept UDP packets with a zero checksum. A node 898 MAY be configured to disallow zero checksums per [RFC1122]. 899 Configuration of zero checksums can be selective. For instance, zero 900 checksums might be disallowed from certain hosts that are known to be 901 traversing paths subject to packet corruption. If verification of a 902 non-zero checksum fails, a decapsulator lacks the capability to 903 verify a non-zero checksum, or a packet with a zero-checksum was 904 received and the decapsulator is configured to disallow that, then 905 the packet MUST be dropped. 907 5.7.3. UDP Checksum with IPv6 909 In IPv6, there is no checksum in the IPv6 header that protects 910 against mis-delivery due to address corruption. Therefore, when GUE 911 is used over IPv6, either the UDP checksum or the GUE header checksum 912 SHOULD be used unless there are alternative mechanisms in use that 913 protect against misdelivery. The UDP checksum and GUE header checksum 914 SHOULD NOT be used at the same time since that would be mostly 915 redundant. 917 If neither the UDP checksum nor the GUE header checksum is used, then 918 the requirements for using zero IPv6 UDP checksums in [RFC6935] and 919 [RFC6936] MUST be met. 921 When a decapsulator receives a packet, the UDP checksum field MUST be 922 processed. If the UDP checksum is non-zero, the decapsulator MUST 923 verify the checksum before accepting the packet. By default a 924 decapsulator MUST only accept UDP packets with a zero checksum if the 925 GUE header checksum is used and is verified. If verification of a 926 non-zero checksum fails or a decapsulator lacks the capability to 927 verify a non-zero checksum then the packet MUST be dropped. 
If a 928 packet is received with a zero UDP checksum, no GUE header checksum, 929 and zero UDP checksums are disallowed then the packet MUST be 930 dropped. 932 5.8. MTU and fragmentation 934 Standard conventions for handling of MTU (Maximum Transmission Unit) 935 and fragmentation in conjunction with networking tunnels 936 (encapsulation of layer 2 or layer 3 packets) SHOULD be followed. 937 Details are described in MTU and Fragmentation Issues with In-the- 938 Network Tunneling [RFC4459]. 940 If a packet is fragmented before encapsulation in GUE, all the 941 related fragments MUST be encapsulated using the same UDP source 942 port. An operator SHOULD set MTU to account for encapsulation 943 overhead and reduce the likelihood of fragmentation. 945 As an alternative to IP fragmentation, the GUE fragmentation extension can 946 be used. GUE fragmentation is described in [GUEEXTEN]. 948 5.9. Congestion control 950 Per requirements of [RFC8085], if the IP traffic encapsulated with 951 GUE implements proper congestion control then no additional 952 mechanisms should be required. 954 In the case that the encapsulated traffic does not implement any, or 955 sufficient, congestion control, or it is not known whether a transmitter will 956 consistently implement proper congestion control, then congestion 957 control at the encapsulation layer MUST be provided per [RFC8085]. 958 Note that this case applies to a significant use case in network 959 virtualization in which guests run third party networking stacks that 960 cannot be implicitly trusted to implement conformant congestion 961 control. 963 Out of band mechanisms such as rate limiting, Managed Circuit Breaker 964 [RFC8084], or traffic isolation MAY be used to provide rudimentary 965 congestion control. For finer-grained congestion control that allows 966 alternate congestion control algorithms, reaction time within an RTT, 967 and interaction with ECN, in-band mechanisms might be warranted. 969 5.10.
Multicast 971 GUE packets can be multicast to decapsulators using a multicast 972 destination address in the outer IP header. Each receiving host will 973 decapsulate the packet independently following normal decapsulator 974 operations. The receiving decapsulators need to agree on the same set 975 of GUE parameters and properties; how such an agreement is reached is 976 outside the scope of this document. 978 GUE allows encapsulation of unicast, broadcast, or multicast traffic. 979 Flow entropy (the value in the UDP source port) can be generated from 980 the header of encapsulated unicast or broadcast/multicast packets at 981 an encapsulator. The mapping mechanism between the encapsulated 982 multicast traffic and the multicast capability in the IP network is 983 transparent and independent of the encapsulation and is otherwise 984 outside the scope of this document. 986 5.11. Flow entropy for ECMP 987 A major objective of using GUE is that a network device can perform 988 flow classification corresponding to the flow of the inner 989 encapsulated packet based on the contents of the outer headers. 991 5.11.1. Flow classification 993 When a packet is encapsulated with GUE and connection semantics are 994 not applied, the source port in the outer UDP packet is set to a flow 995 entropy value that corresponds to the flow of the inner packet. When 996 a device computes a five-tuple hash on the outer UDP/IP header of a 997 GUE packet, the resultant value classifies the packet per its inner 998 flow. 1000 Examples of deriving flow entropy for encapsulation are: 1002 o If the encapsulated packet is a layer 4 packet, TCP/IPv4 for 1003 instance, the flow entropy could be based on the canonical five- 1004 tuple hash of the inner packet. 1006 o If the encapsulated packet is an AH transport mode packet with 1007 TCP as next header, the flow entropy could be a hash over a 1008 three-tuple: TCP protocol and TCP ports of the encapsulated 1009 packet. 
1011 o If a node is encrypting a packet using ESP tunnel mode and GUE 1012 encapsulation, the flow entropy could be based on the contents 1013 of the clear-text packet. For instance, a canonical five-tuple 1014 hash for a TCP/IP packet could be used. 1016 [RFC6438] discusses methods to compute and set flow entropy values for 1017 IPv6 flow labels; such methods can also be used to create flow 1018 entropy values for GUE. 1020 5.11.2. Flow entropy properties 1022 The flow entropy is the value set in the UDP source port of a GUE 1023 packet. Flow entropy in the UDP source port SHOULD adhere to the 1024 following properties: 1026 o The value set in the source port is within the ephemeral port 1027 range (49152 to 65535 [RFC6335]). Since the high order two bits 1028 of the port are set to one, this provides fourteen bits of 1029 entropy for the value. 1031 o The flow entropy has a uniform distribution across encapsulated 1032 flows. 1034 o An encapsulator MAY occasionally change the flow entropy used 1035 for an inner flow per its discretion (for security, route 1036 selection, etc.). To avoid thrashing or flapping the value, the 1037 flow entropy used for a flow SHOULD NOT change more than once 1038 every thirty seconds (or a configurable value). 1040 o Decapsulators, or any networking devices, SHOULD NOT attempt to 1041 interpret flow entropy as anything more than an opaque value. 1042 Neither should they attempt to reproduce the hash calculation 1043 used by an encapsulator in creating a flow entropy value. They 1044 MAY use the value to match further received packets for steering 1045 decisions, but MUST NOT assume that the hash uniquely or 1046 permanently identifies a flow. 1048 o Input to the flow entropy calculation is not restricted to ports 1049 and addresses; input could include the flow label from an IPv6 1050 packet, SPI from an ESP packet, or other flow related state in 1051 the encapsulator that is not necessarily conveyed in the packet.
1053 o The assignment function for flow entropy SHOULD be randomly 1054 seeded to mitigate denial of service attacks. The seed SHOULD be 1055 changed periodically. 1057 5.12. Negotiation of acceptable flags and extension fields 1059 An encapsulator and decapsulator need to achieve agreement about GUE 1060 parameters that will be used in communications. Parameters include 1061 supported GUE variants, flags and extension fields that can be used, 1062 security algorithms and keys, supported protocols and control 1063 messages, etc. This document proposes different general methods to 1064 accomplish this, however the details of implementing these are 1065 considered out of scope. 1067 General methods for this are: 1069 o Configuration. The parameters used for a tunnel are configured 1070 at each endpoint. 1072 o Negotiation. A tunnel negotiation can be performed. This could 1073 be accomplished in-band of GUE using control messages. 1075 o Via a control plane. Parameters for communicating with a tunnel 1076 endpoint can be set in a control plane protocol (such as that 1077 needed for network virtualization). 1079 o Via security negotiation. Use of security typically implies a 1080 key exchange between endpoints. Other GUE parameters may be 1081 conveyed as part of that process. 1083 6. Motivation for GUE 1085 This section provides the motivation for GUE with respect to other 1086 encapsulation methods. 1088 6.1. Benefits of GUE 1090 * GUE is a generic encapsulation protocol. GUE can encapsulate 1091 protocols that are represented by an IP protocol number. This 1092 includes layer 2, layer 3, and layer 4 protocols. 1094 * GUE is an extensible encapsulation protocol. Standardized 1095 optional data such as security, virtual networking identifiers, 1096 fragmentation are defined. 1098 * For extensibility, GUE uses flag fields as opposed to TLVs as 1099 some other encapsulation protocols do. 
Flag fields are strictly 1100 ordered, allow random access, and are efficient in use of header 1101 space. 1103 * GUE allows private data to be sent as part of the encapsulation. 1104 This permits experimentation or customization in deployment. 1106 * GUE allows sending of control messages such as OAM using the 1107 same GUE header format (for routing purposes) as normal data 1108 messages. 1110 * GUE maximizes deliverability of non-UDP and non-TCP protocols. 1112 * GUE provides a means for exposing per flow entropy for ECMP for 1113 atypical protocols such as SCTP, DCCP, ESP, etc. 1115 6.2. Comparison of GUE to other encapsulations 1117 A number of different encapsulation techniques have been proposed for 1118 the encapsulation of one protocol over another. EtherIP [RFC3378] 1119 provides layer 2 tunneling of Ethernet frames over IP. GRE [RFC2784], 1120 MPLS [RFC4023], and L2TP [RFC2661] provide methods for tunneling 1121 layer 2 and layer 3 packets over IP. NVGRE [RFC7637] and VXLAN 1122 [RFC7348] are proposals for encapsulation of layer 2 packets for 1123 network virtualization. IPIP [RFC2003] and Generic packet tunneling 1124 in IPv6 [RFC2473] provide methods for tunneling IP packets over IP. 1126 Several proposals exist for encapsulating packets over UDP including 1127 ESP over UDP [RFC3948], TCP directly over UDP [TCPUDP], VXLAN 1128 [RFC7348], LISP [RFC6830] which encapsulates layer 3 packets, 1129 MPLS/UDP [RFC7510], GENEVE [GENEVE], and GRE-in-UDP Encapsulation 1130 [RFC8086]. 1132 GUE has the following discriminating features: 1134 o UDP encapsulation leverages specialized network device 1135 processing for efficient transport. The semantics for using the 1136 UDP source port for flow entropy as input to ECMP are defined in 1137 section 5.11. 1139 o GUE permits encapsulation of arbitrary IP protocols, which 1140 includes layer 2, 3, and 4 protocols. 1142 o Multiple protocols can be multiplexed over a single UDP port 1143 number. 
This is in contrast to techniques that encapsulate 1144 protocols over UDP using a protocol-specific port number (such 1145 as ESP/UDP, GRE/UDP, SCTP/UDP). GUE provides a uniform and 1146 extensible mechanism for encapsulating all IP protocols in UDP 1147 with minimal overhead (four bytes of additional header). 1149 o GUE is extensible. New flags and extension fields can be 1150 defined. 1152 o The GUE header includes a header length field. This allows a 1153 network node to inspect an encapsulated packet without needing 1154 to parse the full encapsulation header. 1156 o Private data in the encapsulation header allows local 1157 customization and experimentation while being compatible with 1158 processing in network nodes (routers and middleboxes). 1160 o GUE includes both data messages (encapsulation of packets) and 1161 control messages (such as OAM). 1163 o The flags-field model facilitates efficient implementation of 1164 extensibility in hardware. For instance, a TCAM can be used to 1165 parse a known set of N flags where the number of entries in the 1166 TCAM is 2^N. By comparison, the number of TCAM entries needed to 1167 parse a set of N arbitrarily ordered TLVs is approximately e*N!. 1169 o GUE includes a variant that encapsulates IPv4 and IPv6 packets 1170 directly within UDP. 1172 7. Security Considerations 1174 There are two important considerations of security with respect to 1175 GUE. 1177 o Authentication and integrity of the GUE header. 1179 o Authentication, integrity, and confidentiality of the GUE 1180 payload. 1182 GUE security is provided by extensions for security defined in 1183 [GUEEXTEN]. These extensions include methods to authenticate the GUE 1184 header and encrypt the GUE payload. 1186 The GUE header can be authenticated using a security extension for an 1187 HMAC (Hashed Message Authentication Code). Securing the GUE payload 1188 can be accomplished by use of the GUE Payload Transform extension.
This 1189 extension allows the use of DTLS (Datagram Transport Layer Security) 1190 to encrypt and authenticate the GUE payload. 1192 A hash function for computing flow entropy (section 5.11) SHOULD be 1193 randomly seeded to mitigate some possible denial of service attacks. 1195 8. IANA Considerations 1197 8.1. UDP source port 1199 A user UDP port number has been assigned for GUE: 1201 Service Name: gue 1202 Transport Protocol(s): UDP 1203 Assignee: Tom Herbert 1204 Contact: Tom Herbert 1205 Description: Generic UDP Encapsulation 1206 Reference: draft-herbert-gue 1207 Port Number: 6080 1208 Service Code: N/A 1209 Known Unauthorized Uses: N/A 1210 Assignment Notes: N/A 1212 8.2. GUE variant number 1214 IANA is requested to set up a registry for the GUE variant number. 1215 The GUE variant number is two bits containing four possible values. 1216 This document defines variants 0 and 1. New values are assigned in 1217 accordance with RFC Required policy [RFC5226]. 1219 +----------------+----------------+---------------+ 1220 | Variant number | Description | Reference | 1221 +----------------+----------------+---------------+ 1222 | 0 | GUE Version 0 | This document | 1223 | | with header | | 1224 | | | | 1225 | 1 | GUE Version 0 | This document | 1226 | | with direct IP | | 1227 | | encapsulation | | 1228 | | | | 1229 | 2..3 | Unassigned | | 1230 +----------------+----------------+---------------+ 1232 8.3. Control types 1234 IANA is requested to set up a registry for the GUE control types. 1235 Control types are 8-bit values. New values for control types 1-127 1236 are assigned in accordance with RFC Required policy [RFC5226].
1238 +----------------+------------------+---------------+ 1239 | Control type | Description | Reference | 1240 +----------------+------------------+---------------+ 1241 | 0 | Control payload | This document | 1242 | | needs more | | 1243 | | context for | | 1244 | | interpretation | | 1245 | | | | 1246 | 1..127 | Unassigned | | 1247 | | | | 1248 | 128..255 | User defined | This document | 1249 +----------------+------------------+---------------+ 1251 9. Acknowledgements 1253 The authors would like to thank David Liu, Erik Nordmark, Fred 1254 Templin, Adrian Farrel, Bob Briscoe, and Murray Kucherawy for 1255 valuable input on this draft. Special thanks to Fred Templin who is 1256 serving as document shepherd. 1258 10. References 1260 10.1. Normative References 1262 [RFC0768] Postel, J., "User Datagram Protocol", STD 6, RFC 768, DOI 1263 10.17487/RFC0768, August 1980, . 1266 [RFC8085] Eggert, L., Fairhurst, G., and G. Shepherd, "UDP Usage 1267 Guidelines", BCP 145, RFC 8085, DOI 10.17487/RFC8085, 1268 March 2017, . 1270 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 1271 Requirement Levels", BCP 14, RFC 2119, DOI 1272 10.17487/RFC2119, March 1997, . 1275 [RFC2983] Black, D., "Differentiated Services and Tunnels", RFC 1276 2983, DOI 10.17487/RFC2983, October 2000, . 1279 [RFC6040] Briscoe, B., "Tunnelling of Explicit Congestion 1280 Notification", RFC 6040, DOI 10.17487/RFC6040, November 1281 2010, . 1283 [RFC6935] Eubanks, M., Chimento, P., and M. Westerlund, "IPv6 and 1284 UDP Checksums for Tunneled Packets", RFC 6935, DOI 1285 10.17487/RFC6935, April 2013, . 1288 [RFC6936] Fairhurst, G. and M. Westerlund, "Applicability Statement 1289 for the Use of IPv6 UDP Datagrams with Zero Checksums", 1290 RFC 6936, DOI 10.17487/RFC6936, April 2013, 1291 . 1293 [RFC1122] Braden, R., Ed., "Requirements for Internet Hosts - 1294 Communication Layers", STD 3, RFC 1122, DOI 1295 10.17487/RFC1122, October 1989, . 
1298 [RFC4459] Savola, P., "MTU and Fragmentation Issues with In-the- 1299 Network Tunneling", RFC 4459, DOI 10.17487/RFC4459, April 1300 2006, . 1302 [RFC6335] Cotton, M., Eggert, L., Touch, J., Westerlund, M., and S. 1303 Cheshire, "Internet Assigned Numbers Authority (IANA) 1304 Procedures for the Management of the Service Name and 1305 Transport Protocol Port Number Registry", BCP 165, RFC 1306 6335, DOI 10.17487/RFC6335, August 2011, . 1309 [RFC5226] Narten, T. and H. Alvestrand, "Guidelines for Writing an 1310 IANA Considerations Section in RFCs", RFC 5226, DOI 1311 10.17487/RFC5226, May 2008, . 1314 [GUEEXTEN] Herbert, T., Yong, L., and Templin, F., "Extensions for 1315 Generic UDP Encapsulation", draft-herbert-gue-extensions- 1316 06 1318 10.2. Informative References 1320 [RFC8086] Yong, L., Ed., Crabbe, E., Xu, X., and T. Herbert, "GRE- 1321 in-UDP Encapsulation", RFC 8086, DOI 10.17487/RFC8086, 1322 March 2017, . 1324 [RFC7605] Touch, J., "Recommendations on Using Assigned Transport 1325 Port Numbers", BCP 165, RFC 7605, DOI 10.17487/RFC7605, 1326 August 2015, . 1328 [RFC4787] Audet, F., Ed., and C. Jennings, "Network Address 1329 Translation (NAT) Behavioral Requirements for Unicast 1330 UDP", BCP 127, RFC 4787, DOI 10.17487/RFC4787, January 1331 2007, . 1333 [RFC5389] Rosenberg, J., Mahy, R., Matthews, P., and D. Wing, 1334 "Session Traversal Utilities for NAT (STUN)", RFC 5389, 1335 DOI 10.17487/RFC5389, October 2008, . 1338 [RFC5245] Rosenberg, J., "Interactive Connectivity Establishment 1339 (ICE): A Protocol for Network Address Translator (NAT) 1340 Traversal for Offer/Answer Protocols", RFC 5245, DOI 1341 10.17487/RFC5245, April 2010, . 1344 [RFC8084] Fairhurst, G., "Network Transport Circuit Breakers", BCP 1345 208, RFC 8084, DOI 10.17487/RFC8084, March 2017, 1346 . 1348 [RFC6438] Carpenter, B. and S. 
Amante, "Using the IPv6 Flow Label 1349 for Equal Cost Multipath Routing and Link Aggregation in 1350 Tunnels", RFC 6438, DOI 10.17487/RFC6438, November 2011, 1351 . 1353 [RFC3378] Housley, R. and S. Hollenbeck, "EtherIP: Tunneling 1354 Ethernet Frames in IP Datagrams", RFC 3378, DOI 1355 10.17487/RFC3378, September 2002, . 1358 [RFC2784] Farinacci, D., Li, T., Hanks, S., Meyer, D., and P. 1359 Traina, "Generic Routing Encapsulation (GRE)", RFC 2784, 1360 DOI 10.17487/RFC2784, March 2000, . 1363 [RFC4023] Worster, T., Rekhter, Y., and E. Rosen, Ed., 1364 "Encapsulating MPLS in IP or Generic Routing Encapsulation 1365 (GRE)", RFC 4023, DOI 10.17487/RFC4023, March 2005, 1366 . 1368 [RFC2661] Townsley, W., Valencia, A., Rubens, A., Pall, G., Zorn, 1369 G., and B. Palter, "Layer Two Tunneling Protocol "L2TP"", 1370 RFC 2661, DOI 10.17487/RFC2661, August 1999, 1371 . 1373 [RFC7637] Garg, P., Ed., and Y. Wang, Ed., "NVGRE: Network 1374 Virtualization Using Generic Routing Encapsulation", RFC 1375 7637, DOI 10.17487/RFC7637, September 2015, 1376 . 1378 [RFC7348] Mahalingam, M., Dutt, D., Duda, K., Agarwal, P., Kreeger, 1379 L., Sridhar, T., Bursell, M., and C. Wright, "Virtual 1380 eXtensible Local Area Network (VXLAN): A Framework for 1381 Overlaying Virtualized Layer 2 Networks over Layer 3 1382 Networks", RFC 7348, August 2014, . 1385 [RFC2003] Perkins, C., "IP Encapsulation within IP", RFC 2003, DOI 1386 10.17487/RFC2003, October 1996, . 1389 [RFC2473] Conta, A. and S. Deering, "Generic Packet Tunneling in 1390 IPv6 Specification", RFC 2473, DOI 10.17487/RFC2473, 1391 December 1998, . 1393 [RFC3948] Huttunen, A., Swander, B., Volpe, V., DiBurro, L., and M. 1394 Stenberg, "UDP Encapsulation of IPsec ESP Packets", RFC 1395 3948, DOI 10.17487/RFC3948, January 2005, . 1398 [RFC6830] Farinacci, D., Fuller, V., Meyer, D., and D. Lewis, "The 1399 Locator/ID Separation Protocol (LISP)", RFC 6830, DOI 1400 10.17487/RFC6830, January 2013, . 
1403 [RFC7510] Xu, X., Sheth, N., Yong, L., Callon, R., and D. Black, 1404 "Encapsulating MPLS in UDP", RFC 7510, DOI 1405 10.17487/RFC7510, April 2015, . 1408 [IANA-PN] IANA, "Protocol Numbers", 1409 . 1411 [TCPUDP] Cheshire, S., Graessley, J., and McGuire, R., 1412 "Encapsulation of TCP and other Transport Protocols over 1413 UDP", draft-cheshire-tcp-over-udp-00 1415 [GENEVE] Gross, J., Ed., Ganga, I., Ed., and Sridhar, T., "Geneve: 1416 Generic Network Virtualization Encapsulation", draft-ietf- 1417 nvo3-geneve-10 1419 [UDPENCAP] Herbert, T., "UDP Encapsulation in Linux", 1420 1423 [MULTIQ] Herbert, T. and de Bruijn, W., "Scaling in the Linux 1424 Networking Stack", 1427 [CSUMOFF] Cree, E., "Checksum Offloads in the Linux Networking 1428 Stack", 1431 [SEGOFF] Duyck, A., "Segmentation Offloads in the Linux Networking 1432 Stack", 1435 Appendix A: NIC processing for GUE 1437 This appendix is informational and does not constitute a normative 1438 part of this document. 1440 This appendix provides some guidelines for Network Interface Cards 1441 (NICs) to implement common offloads and accelerations to support GUE. 1442 Note that most of this discussion is generally applicable to other 1443 methods of UDP-based encapsulation. An overview of UDP-based 1444 encapsulation and acceleration is in [UDPENCAP]. 1446 A.1. Receive multi-queue 1448 Contemporary NICs support multiple receive descriptor queues (multi- 1449 queue) [MULTIQ]. Multi-queue enables load balancing of network 1450 processing for a NIC across multiple CPUs. On packet reception, a NIC 1451 selects an appropriate queue for host processing. Receive Side 1452 Scaling (RSS) is a common method that uses the flow hash for a 1453 packet to index an indirection table where each entry stores a queue 1454 number. Flow Director and Accelerated Receive Flow Steering (aRFS) 1455 allow a host to program the queue that is used for a given flow, which 1456 is identified either by an explicit five-tuple or by the flow's hash.
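The RSS lookup described above can be sketched as follows. This is a minimal illustration under stated assumptions, not a description of any particular NIC: the names flow_hash, build_indirection_table, and select_queue are hypothetical, CRC-32 merely stands in for whatever hash the hardware computes (a Toeplitz hash is common), and the addresses, ports, table size, and queue count are example values.

```python
import zlib
from typing import List, Tuple

# (source IP, destination IP, source port, destination port, IP protocol)
FiveTuple = Tuple[str, str, int, int, int]


def flow_hash(flow: FiveTuple) -> int:
    """Hypothetical stand-in for the NIC hash; CRC-32 is for illustration only."""
    return zlib.crc32(repr(flow).encode("ascii"))


def build_indirection_table(num_queues: int, table_size: int = 128) -> List[int]:
    """Spread queue numbers round-robin over the table entries."""
    return [entry % num_queues for entry in range(table_size)]


def select_queue(flow: FiveTuple, table: List[int]) -> int:
    """Index the indirection table with the flow hash; the entry is the queue."""
    return table[flow_hash(flow) % len(table)]


table = build_indirection_table(num_queues=4)

# Example flows that share the same outer addresses and destination port and
# differ only in the outer UDP source port (illustrative values).
tunnel_flows = [("192.0.2.1", "198.51.100.2", 49152 + i, 6080, 17)
                for i in range(64)]
queues = [select_queue(flow, table) for flow in tunnel_flows]
```

Because the queue is read from a table rather than computed directly from the hash, a host can rebalance load by rewriting table entries without changing the hash function.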
1458 GUE encapsulation is compatible with multi-queue NICs that support 1459 five-tuple hash calculation for UDP/IP packets as input to RSS. The 1460 flow entropy in the UDP source port ensures classification of the 1461 encapsulated flow even in the case that the outer source and 1462 destination addresses are the same for all flows (e.g., all flows are 1463 going over a single tunnel). 1465 By default, UDP RSS support is often disabled in NICs to avoid out- 1466 of-order reception that can occur when UDP packets are fragmented. As 1467 discussed in Section 5.8, fragmentation of GUE packets is mostly 1468 avoided by fragmenting packets before entering a tunnel, GUE 1469 fragmentation, path MTU discovery in higher layer protocols, or 1470 operators adjusting MTUs. Other UDP traffic might not implement such 1471 procedures to avoid fragmentation, so enabling UDP RSS support in the 1472 NIC is a tradeoff to be considered during configuration. 1474 A.2. Checksum offload 1476 Many NICs provide capabilities to calculate the standard ones' 1477 complement checksum for packets in transmit or receive [CSUMOFF]. 1478 When using GUE encapsulation, there are at least two checksums that 1479 are of interest: the encapsulated packet's transport checksum, and 1480 the UDP checksum in the outer header. 1482 A.2.1. Transmit checksum offload 1484 NICs can provide a protocol-agnostic method to offload the transmit 1485 checksum (NETIF_F_HW_CSUM in Linux parlance) that can be used with 1486 GUE. In this method, the host provides checksum-related parameters in 1487 a transmit descriptor for a packet. These parameters include the 1488 starting offset of data to checksum, the length of data to checksum, 1489 and the offset in the packet where the computed checksum is to be 1490 written. The host initializes the checksum field to a pseudo header 1491 checksum.
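The division of labor between host and NIC can be sketched as follows. This is an illustration under stated assumptions rather than any driver's actual interface: TxCsumDescriptor, host_prepare, and nic_complete are hypothetical names, the 20 zero bytes are only a placeholder for an IP header, and the addresses and ports are example values. The sketch checks that seeding the checksum field with the pseudo header sum and then summing the indicated range gives the same result as computing the checksum conventionally.

```python
import struct
from typing import NamedTuple


def ones_sum(data: bytes) -> int:
    """16-bit ones' complement sum of data, folded but not inverted."""
    data = bytes(data)
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


class TxCsumDescriptor(NamedTuple):
    """Hypothetical descriptor carrying the three parameters named above."""
    start: int         # offset of the first byte to checksum
    length: int        # number of bytes to checksum
    field_offset: int  # offset in the packet where the result is written


def host_prepare(packet: bytearray, desc: TxCsumDescriptor, pseudo: bytes) -> None:
    """Host side: seed the checksum field with the pseudo header sum."""
    packet[desc.field_offset:desc.field_offset + 2] = struct.pack("!H", ones_sum(pseudo))


def nic_complete(packet: bytearray, desc: TxCsumDescriptor) -> None:
    """Device side: sum the indicated range and write the inverted result."""
    covered = bytes(packet[desc.start:desc.start + desc.length])
    csum = ~ones_sum(covered) & 0xFFFF
    packet[desc.field_offset:desc.field_offset + 2] = struct.pack("!H", csum)


# Example: a UDP datagram preceded by 20 placeholder bytes for an IP header.
src, dst = bytes([192, 0, 2, 1]), bytes([198, 51, 100, 2])
payload = b"GUE example payload"
udp = struct.pack("!HHHH", 49152, 6080, 8 + len(payload), 0) + payload
pseudo = src + dst + struct.pack("!BBH", 0, 17, len(udp))

# Conventional computation over pseudo header plus datagram, for comparison.
reference = ~ones_sum(pseudo + udp) & 0xFFFF

packet = bytearray(20) + bytearray(udp)
desc = TxCsumDescriptor(start=20, length=len(udp), field_offset=20 + 6)
host_prepare(packet, desc, pseudo)
nic_complete(packet, desc)
```

The device never needs to know which protocol it is checksumming; everything protocol-specific is folded into the seed value and the three offsets.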
1493 In the case of GUE, the checksum for an encapsulated transport-layer 1494 packet, a TCP packet for instance, can be offloaded by setting the 1495 appropriate checksum parameters. 1497 NICs typically can offload only one transmit checksum per packet, so 1498 simultaneously offloading both an inner transport packet's checksum 1499 and the outer UDP checksum is likely not possible. 1501 If an encapsulator is co-resident with a host, then checksum offload 1502 may be performed using remote checksum offload (RCO) [GUEEXTEN]. 1503 Remote checksum offload relies on NIC offload of the simple UDP/IP 1504 checksum, which is commonly supported even in legacy devices. In 1505 remote checksum offload, the outer UDP checksum is set and the GUE 1506 header includes an option indicating the start and offset of the 1507 inner "offloaded" checksum. The inner checksum is initialized to the 1508 pseudo header checksum. When a decapsulator receives a GUE packet 1509 with the remote checksum offload option, it completes the offload 1510 operation by determining the packet checksum from the indicated start 1511 point to the end of the packet, and then adds this into the checksum 1512 field at the offset given in the option. Computing the checksum from 1513 the start to the end of the packet is efficient if checksum-complete is 1514 provided on the receiver. 1516 Another alternative when an encapsulator is co-resident with a host 1517 is to perform Local Checksum Offload (LCO) [CSUMOFF]. In this method, 1518 the inner transport-layer checksum is offloaded and the outer UDP 1519 checksum can be deduced based on the fact that the portion of the 1520 packet covered by the inner transport checksum will sum to zero or at 1521 least to the bitwise "not" of the inner pseudo header checksum. 1523 A.2.2. Receive checksum offload 1525 GUE is compatible with NICs that perform a protocol-agnostic receive 1526 checksum (CHECKSUM_COMPLETE in Linux parlance).
In this technique, a 1527 NIC computes a ones complement checksum over all (or some predefined 1528 portion) of a packet. The computed value is provided to the host 1529 stack in the packet's receive descriptor. The host driver can use 1530 this checksum to "patch up" and validate any inner packet transport 1531 checksums, as well as the outer UDP checksum if it is non-zero. 1533 Many legacy NICs don't provide checksum-complete but instead provide 1534 an indication that a checksum has been verified (CHECKSUM_UNNECESSARY 1535 in Linux). Usually, such validation is only done for simple TCP/IP or 1536 UDP/IP packets. If a NIC indicates that a UDP checksum is valid, the 1537 checksum-complete value for the UDP packet is the bitwise "not" of 1538 the pseudo header checksum. In this way, checksum-unnecessary can be 1539 converted to checksum-complete. So, if the NIC provides checksum- 1540 unnecessary for the outer UDP header in an encapsulation, checksum 1541 conversion can be done so that the checksum-complete value is derived 1542 and can be used by the stack to validate checksums in the 1543 encapsulated packet. 1545 A.3. Transmit Segmentation Offload 1547 Transmit Segmentation Offload (TSO) [SEGOFF] is a NIC feature where a 1548 host provides a large (>MTU size) TCP packet to the NIC, which in 1549 turn splits the packet into separate segments and transmits each one. 1550 This is useful to reduce CPU load on the host. 1552 The process of TSO can be generalized as: 1554 - Split the TCP payload into segments of size less than or equal 1555 to MTU. 1557 - For each created segment: 1559 1. Replicate the TCP header and all preceding headers of the 1560 original packet. 1562 2. Set payload length fields in any headers to reflect the 1563 length of the segment. 1565 3. Set TCP sequence number to correctly reflect the offset of 1566 the TCP data in the stream. 1568 4. 
Recompute and set any checksums that either cover the payload 1569 of the packet or cover a header that was changed by setting a 1570 payload length. 1572 Following this general process, TSO can be extended to support TCP 1573 encapsulation in GUE. For each segment, the Ethernet, outer IP, outer UDP, 1574 GUE, inner IP (if tunneling), and TCP headers 1575 are replicated. Any packet length header fields need to be set 1576 properly (including the length in the outer UDP header), and 1577 checksums need to be set correctly (including the outer UDP checksum 1578 if it is in use). 1580 To facilitate TSO with GUE, it is recommended that extension fields 1581 do not contain values that need to be updated on a per-segment basis. 1582 For example, extension fields should not include checksums, lengths, 1583 or sequence numbers that refer to the payload. If the GUE header does 1584 not contain such fields, then the TSO engine only needs to copy the 1585 bits in the GUE header when creating each segment and does not need 1586 to parse the GUE header. 1588 A.4. Large Receive Offload 1590 Large Receive Offload (LRO) [SEGOFF] is a NIC feature where packets 1591 of a TCP connection are reassembled, or coalesced, in the NIC and 1592 delivered to the host as one large packet. This feature can reduce 1593 CPU utilization in the host. 1595 LRO requires significant protocol awareness to be implemented 1596 correctly and is difficult to generalize. Packets in the same flow 1597 need to be unambiguously identified. In the presence of tunnels or 1598 network virtualization, this may require more than a five-tuple match 1599 (for instance, packets for flows in two different virtual networks may 1600 have identical five-tuples). Additionally, a NIC needs to perform 1601 validation over packets that are being coalesced, and needs to 1602 fabricate a single meaningful header from all the coalesced packets.
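The ambiguity of a bare five-tuple match can be made concrete with a small sketch. Everything here is an assumption made for illustration: FiveTuple, LroKey, and coalesce are hypothetical names, the four-byte value carried in gue_flags_and_fields is only a stand-in for whatever field distinguishes two virtual networks, and the sketch covers flow identification only, not the validation or header fabrication that a real LRO engine also performs.

```python
from collections import defaultdict
from typing import Callable, Dict, Hashable, Iterable, NamedTuple, Tuple


class FiveTuple(NamedTuple):
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: int


class LroKey(NamedTuple):
    """Hypothetical flow key: outer encapsulation plus the inner five-tuple."""
    outer_src_ip: str
    outer_dst_ip: str
    outer_src_port: int
    outer_dst_port: int
    gue_protocol: int
    gue_flags_and_fields: bytes
    inner: FiveTuple


def coalesce(packets: Iterable[Tuple[LroKey, bytes]],
             key_fn: Callable[[LroKey], Hashable]) -> Dict[Hashable, bytes]:
    """Append payloads of packets whose keys, as seen by key_fn, are equal."""
    flows = defaultdict(bytearray)
    for key, payload in packets:
        flows[key_fn(key)] += payload
    return {k: bytes(v) for k, v in flows.items()}


# Two virtual networks that happen to use the same inner five-tuple.
inner = FiveTuple("10.0.0.1", "10.0.0.2", 40000, 80, 6)


def key_for(network_id: bytes) -> LroKey:
    return LroKey("192.0.2.1", "198.51.100.2", 49152, 6080, 4, network_id, inner)


packets = [
    (key_for(b"\x00\x00\x00\x01"), b"tenant-1 part A "),
    (key_for(b"\x00\x00\x00\x02"), b"tenant-2 part A "),
    (key_for(b"\x00\x00\x00\x01"), b"tenant-1 part B"),
]

by_full_key = coalesce(packets, key_fn=lambda k: k)          # keeps tenants apart
by_inner_only = coalesce(packets, key_fn=lambda k: k.inner)  # merges them wrongly
```

Matching on the inner five-tuple alone yields a single merged flow that mixes data from both virtual networks, while the full key yields two separate flows.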
1604 The conservative approach to supporting LRO for GUE would be to 1605 assign packets to the same flow only if they have identical five- 1606 tuples and were encapsulated the same way. That is, the outer IP 1607 addresses, the outer UDP ports, GUE protocol, GUE flags and fields, 1608 and inner five-tuple are all identical. 1610 Appendix B: Implementation considerations 1612 This appendix is informational and does not constitute a normative 1613 part of this document. 1615 B.1. Privileged ports 1617 Using the source port to contain a flow entropy value disallows the 1618 security method of a receiver enforcing that the source port be a 1619 privileged port. Privileged ports are defined by some operating 1620 systems to restrict source port binding. Unix, for instance, 1621 considers port numbers less than 1024 to be privileged. 1623 Enforcing that packets are sent from a privileged port is widely 1624 considered an inadequate security mechanism and has been mostly 1625 deprecated. To approximate this behavior, an implementation could 1626 restrict a user from sending a packet destined to the GUE port 1627 without proper credentials. 1629 B.2. Setting flow entropy as a route selector 1631 An encapsulator generating flow entropy in the UDP source port could 1632 modulate the value to perform a type of multipath source routing. 1633 Assuming that networking switches perform ECMP based on the flow 1634 hash, a sender can affect the path by altering the flow entropy. For 1635 instance, a host can store a flow hash in its protocol control block 1636 (PCB) for an inner flow, and might alter the value upon detecting 1637 that packets are traversing a lossy path. Changing the flow entropy 1638 for a flow SHOULD be subject to hysteresis (at most once every thirty 1639 seconds) to limit the number of out-of-order packets. 1641 B.3.
Hardware protocol implementation considerations 1643 Low level data path protocols, such as GUE, are often supported in 1644 high speed network device hardware. Variable length header (VLH) 1645 protocols like GUE are sometimes considered difficult to efficiently 1646 implement in hardware. In order to retain the important 1647 characteristics of an extensible and robust protocol, hardware 1648 vendors may practice "constrained flexibility". In this model, only 1649 certain combinations or protocol header parameterizations are 1650 implemented in the hardware fast path. Each such parameterization is 1651 fixed length so that the particular instance can be optimized as a 1652 fixed length protocol. In the case of GUE, this constitutes specific 1653 combinations of GUE flags, fields, and next protocol. The selected 1654 combinations would naturally be the most common cases which form the 1655 "fast path", and other combinations are assumed to take the "slow 1656 path". 1658 In time, the needs and requirements of a protocol may change which 1659 may manifest themselves as new parameterizations to be supported in 1660 the fast path. To allow this extensibility, a device practicing 1661 constrained flexibility should allow fast path parameterizations to 1662 be programmable. 1664 Authors' Addresses 1666 Tom Herbert 1667 Quantonium 1668 4701 Patrick Henry 1669 Santa Clara, CA 95054 1670 US 1672 Email: tom@herbertland.com 1674 Lucy Yong 1675 Independent 1676 Austin, TX 1677 US 1679 Osama Zia 1680 Microsoft 1681 1 Microsoft Way 1682 Redmond, WA 98029 1683 US 1685 Email: osamaz@microsoft.com