idnits 2.17.1 draft-ietf-intarea-gue-06.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- No issues found here. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year -- The document date (August 31, 2018) is 2066 days in the past. Is this intentional? Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Missing Reference: 'RFC2119' is mentioned on line 229, but not defined == Missing Reference: 'GUEXTENS' is mentioned on line 453, but not defined == Missing Reference: 'RFC2460' is mentioned on line 550, but not defined ** Obsolete undefined reference: RFC 2460 (Obsoleted by RFC 8200) == Missing Reference: 'RFC5245' is mentioned on line 837, but not defined ** Obsolete undefined reference: RFC 5245 (Obsoleted by RFC 8445, RFC 8839) == Missing Reference: 'RFC768' is mentioned on line 867, but not defined == Missing Reference: 'RFC6335' is mentioned on line 1029, but not defined == Missing Reference: 'RFC2473' is mentioned on line 1127, but not defined == Missing Reference: 'RFC5226' is mentioned on line 1238, but not defined ** Obsolete undefined reference: RFC 5226 (Obsoleted by RFC 8126) == Unused Reference: 'RFC2434' is defined on line 1272, but no explicit 
reference was found in the text == Unused Reference: 'RFC3828' is defined on line 1301, but no explicit reference was found in the text == Unused Reference: 'RFC7605' is defined on line 1313, but no explicit reference was found in the text == Unused Reference: 'RFC4340' is defined on line 1331, but no explicit reference was found in the text == Unused Reference: 'RFC5285' is defined on line 1346, but no explicit reference was found in the text == Unused Reference: 'GUE4NVO3' is defined on line 1403, but no explicit reference was found in the text == Unused Reference: 'GUESEC' is defined on line 1407, but no explicit reference was found in the text ** Obsolete normative reference: RFC 2434 (Obsoleted by RFC 5226) ** Downref: Normative reference to an Informational RFC: RFC 2983 ** Downref: Normative reference to an Informational RFC: RFC 4459 -- Obsolete informational reference (is this intentional?): RFC 5389 (Obsoleted by RFC 8489) -- Obsolete informational reference (is this intentional?): RFC 5245 (ref. 'RFC5285') (Obsoleted by RFC 8445, RFC 8839) -- Obsolete informational reference (is this intentional?): RFC 5405 (Obsoleted by RFC 8085) -- Obsolete informational reference (is this intentional?): RFC 6830 (Obsoleted by RFC 9300, RFC 9301) == Outdated reference: A later version (-01) exists of draft-herbert-gue-extensions-00 == Outdated reference: A later version (-04) exists of draft-hy-nvo3-gue-4-nvo-03 == Outdated reference: A later version (-01) exists of draft-herbert-transports-over-udp-00 == Outdated reference: A later version (-16) exists of draft-ietf-nvo3-geneve-05 Summary: 6 errors (**), 0 flaws (~~), 20 warnings (==), 5 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Internet Area WG T. Herbert 3 Internet-Draft Quantonium 4 Intended status: Standard track L. Yong 5 Expires March 4, 2019 Huawei USA 6 O. 
Zia 7 Microsoft 8 August 31, 2018 10 Generic UDP Encapsulation 11 draft-ietf-intarea-gue-06 13 Status of this Memo 15 This Internet-Draft is submitted in full conformance with the 16 provisions of BCP 78 and BCP 79. 18 Internet-Drafts are working documents of the Internet Engineering 19 Task Force (IETF), its areas, and its working groups. Note that 20 other groups may also distribute working documents as 21 Internet-Drafts. 23 Internet-Drafts are draft documents valid for a maximum of six months 24 and may be updated, replaced, or obsoleted by other documents at any 25 time. It is inappropriate to use Internet-Drafts as reference 26 material or to cite them other than as "work in progress." 28 The list of current Internet-Drafts can be accessed at 29 http://www.ietf.org/ietf/1id-abstracts.txt 31 The list of Internet-Draft Shadow Directories can be accessed at 32 http://www.ietf.org/shadow.html 34 This Internet-Draft will expire on March 4, 2019. 36 Copyright Notice 38 Copyright (c) 2018 IETF Trust and the persons identified as the 39 document authors. All rights reserved. 41 This document is subject to BCP 78 and the IETF Trust's Legal 42 Provisions Relating to IETF Documents 43 (http://trustee.ietf.org/license-info) in effect on the date of 44 publication of this document. Please review these documents 45 carefully, as they describe your rights and restrictions with respect 46 to this document. Code Components extracted from this document must 54 include Simplified BSD License text as described in Section 4.e of 55 the Trust Legal Provisions and are provided without warranty as 56 described in the Simplified BSD License.
58 Abstract 60 This specification describes Generic UDP Encapsulation (GUE), which 61 is a scheme for using UDP to encapsulate packets of different IP 62 protocols for transport across layer 3 networks. By encapsulating 63 packets in UDP, specialized capabilities in networking hardware for 64 efficient handling of UDP packets can be leveraged. GUE specifies 65 basic encapsulation methods upon which higher level constructs, such 66 as tunnels and overlay networks for network virtualization, can be 67 constructed. GUE is extensible by allowing optional data fields as 68 part of the encapsulation, and is generic in that it can encapsulate 69 packets of various IP protocols. 71 Table of Contents 73 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 5 74 1.1. Terminology and acronyms . . . . . . . . . . . . . . . . . 5 75 1.2. Requirements Language . . . . . . . . . . . . . . . . . . 6 76 2. Base packet format . . . . . . . . . . . . . . . . . . . . . . 7 77 2.1. GUE variant . . . . . . . . . . . . . . . . . . . . . . . . 7 78 3. Variant 0 . . . . . . . . . . . . . . . . . . . . . . . . . . . 8 79 3.1. Header format . . . . . . . . . . . . . . . . . . . . . . . 8 80 3.2. Proto/ctype field . . . . . . . . . . . . . . . . . . . . . 9 81 3.2.1 Proto field . . . . . . . . . . . . . . . . . . . . . . 9 82 3.2.2 Ctype field . . . . . . . . . . . . . . . . . . . . . . 10 83 3.3. Flags and extension fields . . . . . . . . . . . . . . . . 11 84 3.3.1. Requirements . . . . . . . . . . . . . . . . . . . . . 11 85 3.3.2. Example GUE header with extension fields . . . . . . . 11 86 3.4. Private data . . . . . . . . . . . . . . . . . . . . . . . 12 87 3.5. Message types . . . . . . . . . . . . . . . . . . . . . . . 13 88 3.5.1. Control messages . . . . . . . . . . . . . . . . . . . 13 89 3.5.2. Data messages . . . . . . . . . . . . . . . . . . . . . 13 90 3.6. Hiding the transport layer protocol number . . . . . . . . 13 91 4. Variant 1 . . . . . . . . . . . . . . . . 
. . . . . . . . . . . 15 92 4.1. Direct encapsulation of IPv4 . . . . . . . . . . . . . . . 15 93 4.2. Direct encapsulation of IPv6 . . . . . . . . . . . . . . . 16 94 5. Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . 17 95 5.1. Network tunnel encapsulation . . . . . . . . . . . . . . . 17 96 5.2. Transport layer encapsulation . . . . . . . . . . . . . . . 17 97 5.3. Encapsulator operation . . . . . . . . . . . . . . . . . . 18 98 5.4. Decapsulator operation . . . . . . . . . . . . . . . . . . 18 99 5.4.1. Processing a received data message . . . . . . . . . . 18 100 5.4.2. Processing a received control message . . . . . . . . . 19 101 5.5. Router and switch operation . . . . . . . . . . . . . . . . 19 102 5.6. Middlebox interactions . . . . . . . . . . . . . . . . . . 20 103 5.6.1. Inferring connection semantics . . . . . . . . . . . . 20 104 5.6.2. NAT . . . . . . . . . . . . . . . . . . . . . . . . . . 20 105 5.7. Checksum Handling . . . . . . . . . . . . . . . . . . . . . 20 106 5.7.1. Requirements . . . . . . . . . . . . . . . . . . . . . 21 107 5.7.2. UDP Checksum with IPv4 . . . . . . . . . . . . . . . . 21 108 5.7.3. UDP Checksum with IPv6 . . . . . . . . . . . . . . . . 22 109 5.8. MTU and fragmentation . . . . . . . . . . . . . . . . . . . 22 110 5.9. Congestion control . . . . . . . . . . . . . . . . . . . . 22 111 5.10. Multicast . . . . . . . . . . . . . . . . . . . . . . . . 23 112 5.11. Flow entropy for ECMP . . . . . . . . . . . . . . . . . . 23 113 5.11.1. Flow classification . . . . . . . . . . . . . . . . . 23 114 5.11.2. Flow entropy properties . . . . . . . . . . . . . . . 24 115 5.12 Negotiation of acceptable flags and extension fields . . . 25 116 6. Motivation for GUE . . . . . . . . . . . . . . . . . . . . . . 26 117 6.1. Benefits of GUE . . . . . . . . . . . . . . . . . . . . . . 26 118 6.2 Comparison of GUE to other encapsulations . . . . . . . . . 26 119 7. Security Considerations . . . . . . . . . . . . . . . . . . . . 
28 120 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . . 28 121 8.1. UDP source port . . . . . . . . . . . . . . . . . . . . . . 28 122 8.2. GUE variant number . . . . . . . . . . . . . . . . . . . . 29 123 8.3. Control types . . . . . . . . . . . . . . . . . . . . . . . 29 124 9. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 29 125 10. References . . . . . . . . . . . . . . . . . . . . . . . . . . 30 126 10.1. Normative References . . . . . . . . . . . . . . . . . . . 30 127 10.2. Informative References . . . . . . . . . . . . . . . . . . 30 128 Appendix A: NIC processing for GUE . . . . . . . . . . . . . . . . 33 129 A.1. Receive multi-queue . . . . . . . . . . . . . . . . . . . . 33 130 A.2. Checksum offload . . . . . . . . . . . . . . . . . . . . . 34 131 A.2.1. Transmit checksum offload . . . . . . . . . . . . . . . 34 132 A.2.2. Receive checksum offload . . . . . . . . . . . . . . . 35 133 A.3. Transmit Segmentation Offload . . . . . . . . . . . . . . . 35 134 A.4. Large Receive Offload . . . . . . . . . . . . . . . . . . . 36 135 Appendix B: Implementation considerations . . . . . . . . . . . . 36 136 B.1. Priveleged ports . . . . . . . . . . . . . . . . . . . . . 37 137 B.2. Setting flow entropy as a route selector . . . . . . . . . 37 138 B.3. Hardware protocol implementation considerations . . . . . . 37 139 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 38 141 1. Introduction 143 This specification describes Generic UDP Encapsulation (GUE) which is 144 a general method for encapsulating packets of arbitrary IP protocols 145 within User Datagram Protocol (UDP) [RFC0768] packets. Encapsulating 146 packets in UDP facilitates efficient transport across networks. 147 Networking devices widely provide protocol specific processing and 148 optimizations for UDP (as well as TCP) packets. 
Packets for atypical 149 IP protocols (those not usually parsed by networking hardware) can be 150 encapsulated in UDP packets to maximize deliverability and to 151 leverage flow specific mechanisms for routing and packet steering. 153 GUE provides an extensible header format for including optional data 154 in the encapsulation header. This data potentially covers items such 155 as the virtual networking identifier, security data for validating or 156 authenticating the GUE header, congestion control data, etc. GUE also 157 allows private optional data in the encapsulation header. This 158 feature can be used by a site or implementation to define local 159 custom optional data, and allows experimentation of options that may 160 eventually become standard. 162 This document does not define any specific GUE extensions. [GUEEXTEN] 163 specifies a set of initial extensions. 165 The motivation for the GUE protocol is described in section 6. 167 1.1. Terminology and acronyms 169 GUE Generic UDP Encapsulation 171 GUE Header A variable length protocol header that is composed 172 of a primary four byte header and zero or more four 173 byte words for optional header data 175 GUE packet A UDP/IP packet that contains a GUE header and GUE 176 payload within the UDP payload 178 GUE variant A version of the GUE protocol or an alternate form 179 of a version 181 Encapsulator A network node that encapsulates packets in GUE 183 Decapsulator A network node that decapsulates and processes 184 packets encapsulated in GUE 186 Data message An encapsulated packet in the GUE payload that is 187 addressed to the protocol stack for an associated 188 protocol 190 Control message A formatted message in the GUE payload that is 191 implicitly addressed to the decapsulator to monitor 192 or control the state or behavior of a tunnel 194 Flags A set of bit flags in the primary GUE header 196 Extension field 197 An optional field in a GUE header whose presence is 198 indicated by corresponding 
flag(s) 200 C-bit A single bit flag in the primary GUE header that 201 indicates whether the GUE packet contains a control 202 message or data message 204 Hlen A field in the primary GUE header that gives the 205 length of the GUE header 207 Proto/ctype A field in the GUE header that holds either the IP 208 protocol number for a data message or a type for a 209 control message 211 Private data Optional data in the GUE header that can be used for 212 private purposes 214 Outer IP header Refers to the outer most IP header or packet when 215 encapsulating a packet over IP 217 Inner IP header Refers to an encapsulated IP header when an IP 218 packet is encapsulated 220 Outer packet Refers to an encapsulating packet 222 Inner packet Refers to a packet that is encapsulated 224 1.2. Requirements Language 226 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 227 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 228 document are to be interpreted as described in [RFC2119]. 230 2. Base packet format 232 A GUE packet is comprised of a UDP packet whose payload is a GUE 233 header followed by a payload which is either an encapsulated packet 234 of some IP protocol or a control message such as an OAM (Operations, 235 Administration, and Management) message. A GUE packet has the general 236 format: 238 +-------------------------------+ 239 | | 240 | UDP/IP header | 241 | | 242 |-------------------------------| 243 | | 244 | GUE Header | 245 | | 246 |-------------------------------| 247 | | 248 | Encapsulated packet | 249 | or control message | 250 | | 251 +-------------------------------+ 253 The GUE header is variable length as determined by the presence of 254 optional extension fields. 256 2.1. GUE variant 258 The first two bits of the GUE header contain the GUE protocol variant 259 number. The variant number can indicate the version of the GUE 260 protocol as well as alternate forms of a version. 
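As a minimal sketch (not part of this specification), the variant number can be read from the two most significant bits of the first byte of the UDP payload. This assumes the usual network bit ordering, in which bit 0 is the most significant bit. The function name gue_variant is hypothetical and chosen only for illustration.

```python
def gue_variant(udp_payload: bytes) -> int:
    """Return the 2-bit GUE variant number from a UDP payload.

    Hypothetical helper: assumes the variant occupies the two most
    significant bits of the first payload byte.
    """
    if not udp_payload:
        raise ValueError("empty UDP payload")
    # Shift away the low six bits, leaving the top two bits (0..3).
    return udp_payload[0] >> 6


# 0x03 = 0b00000011: top two bits are 00
print(gue_variant(bytes([0x03])))
# 0x45 = 0b01000101: top two bits are 01
print(gue_variant(bytes([0x45])))
```

A receiver could use a check of this kind as its first dispatch step on a GUE port before any further parsing.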
262 Variants 0 and 1 are described in this specification; variants 2 and 263 3 are reserved. 265 3. Variant 0 267 Variant 0 indicates version 0 of GUE. This variant defines a generic 268 extensible format to encapsulate packets by Internet protocol number. 270 3.1. Header format 272 The header format for variant 0 of GUE in UDP is: 274 0 1 2 3 275 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 276 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+\ 277 | Source port | Destination port | | 278 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ UDP 279 | Length | Checksum | | 280 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+/ 281 | 0 |C| Hlen | Proto/ctype | Flags | 282 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 283 | | 284 ~ Extensions Fields (optional) ~ 285 | | 286 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 287 | | 288 ~ Private data (optional) ~ 289 | | 290 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 292 The contents of the UDP header are: 294 o Source port: If connection semantics (section 5.6.1) are applied 295 to an encapsulation, this is set to the local source port for 296 the connection. When connection semantics are not applied, the 297 source port is either set to a flow entropy value as described 298 in section 5.11, or it should be set to the GUE assigned port 299 number, 6080. 301 o Destination port: If connection semantics (section 5.6.1) are 302 applied to an encapsulation, this is set to the destination port 303 for the tuple. If connection semantics are not applied this is 304 set to the GUE assigned port number, 6080. 306 o Length: Canonical length of the UDP packet (length of UDP header 307 and payload). 309 o Checksum: Standard UDP checksum (handling is described in 310 section 5.7). 312 The GUE header consists of: 314 o Variant: 0 indicates GUE protocol version 0 with a header. 
316 o C: C-bit: When set, indicates a control message; when not set, 317 indicates a data message. 319 o Hlen: Length in 32-bit words of the GUE header, including 320 optional extension fields but not the first four bytes of the 321 header. Computed as (header_len - 4) / 4, where header_len is 322 the total header length in bytes. All GUE headers are a multiple 323 of four bytes in length. Maximum header length is 128 bytes. 325 o Proto/ctype: When the C-bit is set, this field contains a 326 control message type for the payload (section 3.2.2). When the 327 C-bit is not set, the field holds the Internet protocol number 328 for the encapsulated packet in the payload (section 3.2.1). The 329 control message or encapsulated packet begins at the offset 330 provided by Hlen. 332 o Flags: Header flags that may be allocated for various purposes 333 and may indicate presence of extension fields. Undefined header 334 flag bits MUST be set to zero on transmission. 336 o Extension Fields: Optional fields whose presence is indicated by 337 corresponding flags. 339 o Private data: Optional private data block (see section 3.4). If 340 the private block is present, it immediately follows the last 341 extension field present in the header. The private block is 342 considered to be part of the GUE header. The length of this data 343 is determined by subtracting the starting offset from the header 344 length. 346 3.2. Proto/ctype field 348 The proto/ctype field contains either an Internet protocol number 349 (when the C-bit is not set) or a GUE control message type (when the 350 C-bit is set). 352 3.2.1 Proto field 354 When the C-bit is not set, the proto/ctype field MUST contain an IANA 355 Internet Protocol Number. The protocol number is interpreted relative 356 to the IP protocol that encapsulates the UDP packet (i.e., the protocol of 357 the outer IP header).
The protocol number serves as an indication of 358 the type of the next protocol header which is contained in the GUE 359 payload at the offset indicated in Hlen. Intermediate devices MAY 360 parse the GUE payload per the number in the proto/ctype field, and 361 header flags cannot affect the interpretation of the proto/ctype 362 field. 364 When the outer IP protocol is IPv4, the proto field MUST be set to a 365 valid IP protocol number usable with IPv4; it MUST NOT be set to a 366 number for IPv6 extension headers or ICMPv6 options (number 58). An 367 exception is that the destination options extension header using the 368 PadN option MAY be used with IPv4 as described in section 3.6. The 369 "no next header" protocol number (59) also MAY be used with IPv4 as 370 described below. 372 When the outer IP protocol is IPv6, the proto field can be set to any 373 defined protocol number except that it MUST NOT be set to Hop-by-hop 374 options (number 0). If a received GUE packet in IPv6 contains a 375 protocol number that is an extension header (e.g. Destination 376 Options) then the extension header is processed after the GUE header 377 is processed as though the GUE header is an extension header. 379 IP protocol number 59 ("No next header") can be set to indicate that 380 the GUE payload does not begin with the header of an IP protocol. 381 This would be the case, for instance, if the GUE payload were a 382 fragment when performing GUE level fragmentation. The interpretation 383 of the payload is performed through other means (such as flags and 384 extension fields), and intermediate devices MUST NOT parse packets 385 based on the IP protocol number in this case. 387 3.2.2 Ctype field 389 When the C-bit is set, the proto/ctype field MUST be set to a valid 390 control message type. A value of zero indicates that the GUE payload 391 requires further interpretation to deduce the control type. 
This 392 might be the case when the payload is a fragment of a control 393 message, where only the reassembled packet can be interpreted as a 394 control message. 396 Control messages will be defined in an IANA registry. Control message 397 types 1 through 127 may be defined in standards. Types 128 through 398 255 are reserved to be user defined for experimentation or private 399 control messages. 401 This document does not specify any standard control message types 402 other than type 0. Type 0 does not define a format of the control 403 message. Instead, it indicates that the GUE payload is a control 404 message, or part of a control message (as might be the case in GUE 405 fragmentation), that cannot be correctly parsed or interpreted 406 without additional context. 408 3.3. Flags and extension fields 410 Flags and associated extension fields are the primary mechanism of 411 extensibility in GUE. As mentioned in section 3.1, GUE header flags 412 indicate the presence of optional extension fields in the GUE header. 413 [GUEEXTEN] defines an initial set of GUE extensions. 415 3.3.1. Requirements 417 There are sixteen flag bits in the GUE header. Flags may indicate 418 the presence of an extension field. The size of an extension field 419 indicated by a flag MUST be fixed. 421 Flags can be paired together to allow different lengths for an 422 extension field. For example, if two flag bits are paired, a field 423 can have one of three different lengths: a bit value of 00 424 indicates that no field is present; 01, 10, and 11 indicate the three possible 425 lengths for the field. Regardless of how flag bits are paired, the 426 lengths and offsets of optional fields corresponding to a set of 427 flags MUST be well defined. 429 Extension fields are placed in order of the flags. New flags are to 430 be allocated from high to low order bit contiguously without holes.
431 Flags allow random access: for instance, to inspect the field 432 corresponding to the Nth flag bit, an implementation only considers 433 the previous N-1 flags to determine the offset. Flags after the Nth 434 flag are not pertinent in calculating the offset of the field for the 435 Nth flag. Random access of flags and fields permits processing of 436 optional extensions in an order that is independent of their position 437 in the packet. 439 Flags (or paired flags) are idempotent such that new flags MUST NOT 440 cause reinterpretation of old flags. Also, new flags MUST NOT alter 441 interpretation of other elements in the GUE header nor how the 442 message is parsed (for instance, in a data message the proto/ctype 443 field always holds an IP protocol number as an invariant). 445 The set of available flags can be extended in the future by defining 446 a "flag extensions bit" that refers to a field containing a new set 447 of flags. 449 3.3.2. Example GUE header with extension fields 451 An example GUE header for a data message encapsulating an IPv4 packet 452 and containing the Group Identifier and Security extension fields 453 (both defined in [GUEEXTEN]) is shown below: 455 0 1 2 3 456 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 457 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 458 | 0 |0| 3 | 94 |1|0 0 1| 0 | 459 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 460 | Group Identifier | 461 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 462 | | 463 + Security + 464 | | 465 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 467 In the above example, the first flag bit is set, which indicates that 468 the Group Identifier extension, a 32-bit field, is present. 469 The second through fourth bits of the flags are paired flags that 470 indicate the presence of a Security field with seven possible sizes. 471 In this example, 001 indicates a 64-bit Security field.
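The example header can be checked with a short sketch. The code below is illustrative only and makes several assumptions: the names GuePrimaryHeader, parse_primary and example_extension_len are hypothetical, the Hlen field is taken to be five bits wide (consistent with the 128-byte maximum header length), and the only extension sizes it knows are the two stated in the example (a 32-bit Group Identifier for the first flag bit, a 64-bit Security field for paired value 001). All other flag values are rejected rather than guessed.

```python
import struct
from dataclasses import dataclass


@dataclass(frozen=True)
class GuePrimaryHeader:
    variant: int      # 2 bits
    control: bool     # C-bit
    hlen: int         # 32-bit words, excluding the first four bytes
    proto_ctype: int  # 8 bits
    flags: int        # 16 bits

    @property
    def header_len(self) -> int:
        # Inverse of Hlen = (header_len - 4) / 4 from section 3.1.
        return 4 + 4 * self.hlen


def parse_primary(data: bytes) -> GuePrimaryHeader:
    """Parse the primary four-byte GUE header (variant 0 layout)."""
    if len(data) < 4:
        raise ValueError("GUE primary header needs 4 bytes")
    first, proto_ctype, flags = struct.unpack("!BBH", data[:4])
    return GuePrimaryHeader(
        variant=first >> 6,
        control=bool(first & 0x20),
        hlen=first & 0x1F,
        proto_ctype=proto_ctype,
        flags=flags,
    )


def example_extension_len(flags: int) -> int:
    """Extension bytes for the two fields used in the example only."""
    if flags & 0x0FFF:
        raise ValueError("flag bits beyond the example are not handled")
    length = 0
    if flags & 0x8000:               # first flag bit: Group Identifier
        length += 4
    security = (flags >> 12) & 0x7   # second through fourth flag bits
    if security == 0b001:            # 001: 64-bit Security field
        length += 8
    elif security:
        raise ValueError("Security size not given in this document")
    return length


# Example header: variant 0, C=0, Hlen=3, proto 94, flags 1|001|0...
example = bytes([0x03, 94, 0x90, 0x00]) + bytes(4) + bytes(8)
hdr = parse_primary(example)
ext_len = example_extension_len(hdr.flags)
private_len = 4 * hdr.hlen - ext_len  # bytes left over for private data
print(hdr.header_len, ext_len, private_len)  # 16 12 0
```

With Hlen = 3 the header is 16 bytes long, the two extension fields account for all 12 optional bytes, and no private data is present.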
473 3.4. Private data 475 An implementation MAY use private data for its own use. The private 476 data immediately follows the last field in the GUE header and is not 477 a fixed length. This data is considered part of the GUE header and 478 MUST be accounted for in header length (Hlen). The length of the 479 private data MUST be a multiple of four and is determined by 480 subtracting the offset of private data in the GUE header from the 481 header length. Specifically: 483 Private_length = (Hlen * 4) - Length(flags) 485 where "Length(flags)" returns the sum of lengths of all the extension 486 fields present in the GUE header. When there is no private data 487 present, the length of the private data is zero. 489 The semantics and interpretation of private data are implementation 490 specific. The private data may be structured as necessary, for 491 instance it might contain its own set of flags and extension fields. 493 An encapsulator and decapsulator MUST agree on the meaning of private 494 data before using it. The mechanism to achieve this agreement is 495 outside the scope of this document but could include implementation- 496 defined behavior, coordinated configuration, in-band communication 497 using GUE control messages, or out-of-band messages. 499 If a decapsulator receives a GUE packet with private data, it MUST 500 validate the private data appropriately. If a decapsulator does not 501 expect private data from an encapsulator, the packet MUST be dropped. 502 If a decapsulator cannot validate the contents of private data per 503 the provided semantics, the packet MUST also be dropped. An 504 implementation MAY place security data in GUE private data which if 505 present MUST be verified for packet acceptance. 507 3.5. Message types 509 3.5.1. Control messages 511 Control messages carry formatted data that are implicitly addressed 512 to the decapsulator to monitor or control the state or behavior of a 513 tunnel (OAM). 
For instance, an echo request and corresponding echo 514 reply message can be defined to test for liveness. 516 Control messages are indicated in the GUE header when the C-bit is 517 set. The payload is interpreted as a control message with the type 518 specified in the proto/ctype field. The format and contents of the 519 control message are indicated by the type and can be variable length. 521 Other than interpreting the proto/ctype field as a control message 522 type, the meaning and semantics of the rest of the elements in the 523 GUE header are the same as those of data messages. Forwarding and 524 routing of control messages should be the same as that of a data 525 message with the same outer IP and UDP header and GUE flags; this 526 ensures that control messages can be created that follow the same 527 path as data messages. 529 3.5.2. Data messages 531 Data messages carry encapsulated packets that are addressed to the 532 protocol stack for the associated protocol. Data messages are a 533 primary means of encapsulation and can be used to create tunnels for 534 overlay networks. 536 Data messages are indicated in the GUE header when the C-bit is not set. 537 The payload of a data message is interpreted as an encapsulated 538 packet of an Internet protocol indicated in the proto/ctype field. 539 The packet immediately follows the GUE header. 541 3.6. Hiding the transport layer protocol number 543 The GUE header indicates the Internet protocol of the encapsulated 544 packet. A protocol number is either contained in the Proto/ctype 545 field of the primary GUE header or in the Payload Type field of a GUE 546 Transform extension field (used to encrypt the payload with DTLS, 547 [GUEEXTEN]). If the transport protocol number needs to be hidden from 548 the network, then a trivial destination options header can be used.
550 The PadN destination option [RFC2460] can be used to encode the 551 transport protocol as a next header of an extension header (and 552 maintain alignment of encapsulated transport headers). The 553 Proto/ctype field or Payload Type field of the GUE Transform field is 554 set to 60 to indicate that the first encapsulated header is a 555 destination options extension header. 557 The format of the extension header is below: 559 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 560 | Next Header | 2 | 1 | 0 | 561 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 563 For IPv4, it is permitted in GUE to use this precise destination 564 option to contain the obfuscated protocol number. In this case the Next 565 Header field MUST refer to a valid IP protocol for IPv4. No other extension 566 headers or destination options are permitted with IPv4. 568 4. Variant 1 570 Variant 1 of GUE allows direct encapsulation of IPv4 and IPv6 in UDP. 571 In this variant there is no GUE header; a UDP packet carries an IP 572 packet. The first two bits of the UDP payload for GUE are the GUE 573 variant and coincide with the first two bits of the version number in 574 the IP header. The first two version bits are 01 for both IPv4 (0100) and IPv6 (0110), so 575 GUE variant 1 is used for direct IP encapsulation, which makes the two bits 576 of the GUE variant 01 as well. 578 This technique is effectively a means to compress out the version 0 579 GUE header when encapsulating IPv4 or IPv6 packets with no 580 flags or extension fields present. This method can be used 581 on the same port number as packets with the GUE header (GUE variant 0 582 packets). This technique saves encapsulation overhead on costly links 583 for the common use of IP encapsulation, and also obviates the need to 584 allocate a separate port number for IP-over-UDP encapsulation. 586 4.1.
Direct encapsulation of IPv4 588 The format for encapsulating IPv4 directly in UDP is: 590 0 1 2 3 591 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 592 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+\ 593 | Source port | Destination port | | 594 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ UDP 595 | Length | Checksum | | 596 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+/ 597 |0|1|0|0| IHL |Type of Service| Total Length | 598 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 599 | Identification |Flags| Fragment Offset | 600 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 601 | Time to Live | Protocol | Header Checksum | 602 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 603 | Source IPv4 Address | 604 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 605 | Destination IPv4 Address | 606 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 608 The UDP fields are set in a manner similar to that described in section 609 3.1. 611 Note that the 0100 value in the first four bits of the UDP 612 payload expresses the GUE variant as 1 (bits 01) and IP version as 4 613 (bits 0100). 615 4.2.
Direct encapsulation of IPv6 617 The format for encapsulating IPv6 directly in UDP is demonstrated 618 below: 620 0 1 2 3 621 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 622 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+\ 623 | Source port | Destination port | | 624 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ UDP 625 | Length | Checksum | | 626 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+/ 627 |0|1|1|0| Traffic Class | Flow Label | 628 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 629 | Payload Length | NextHdr | Hop Limit | 630 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 631 | | 632 + + 633 | | 634 + Source IPv6 Address + 635 | | 636 + + 637 | | 638 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 639 | | 640 + + 641 | | 642 + Destination IPv6 Address + 643 | | 644 + + 645 | | 646 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 648 The UDP fields are set in a similar manner as described in section 649 3.1. 651 Note that the 0110 value in the first four bits of the the UDP 652 payload expresses the GUE variant as 1 (bits 01) and IP version as 6 653 (bits 0110). 655 5. Operation 657 The figure below illustrates the use of GUE encapsulation between two 658 hosts. Host 1 is sending packets to Host 2. An encapsulator performs 659 encapsulation of packets from Host 1. These encapsulated packets 660 traverse the network as UDP packets. At the decapsulator, packets are 661 decapsulated and sent on to Host 2. Packet flow in the reverse 662 direction need not be symmetric; for example, the reverse path might 663 not use GUE and/or any other form of encapsulation. 
665      +---------------+                       +---------------+
666      |               |                       |               |
667      |    Host 1     |                       |    Host 2     |
668      |               |                       |               |
669      +---------------+                       +---------------+
670              |                                       ^
671              V                                       |
672      +---------------+   +---------------+   +---------------+
673      |               |   |               |   |               |
674      | Encapsulator  |-->|    Layer 3    |-->| Decapsulator  |
675      |               |   |    Network    |   |               |
676      +---------------+   +---------------+   +---------------+

678      The encapsulator and decapsulator may be co-resident with the
679      corresponding hosts, or may be on separate nodes in the network.

681   5.1. Network tunnel encapsulation

683      Network tunneling can be achieved by encapsulating layer 2 or layer 3
684      packets. In this case the encapsulator and decapsulator nodes are the
685      tunnel endpoints. These could be routers that provide network tunnels
686      on behalf of communicating hosts.

688   5.2. Transport layer encapsulation

690      When encapsulating layer 4 packets, the encapsulator and decapsulator
691      should be co-resident with the hosts. In this case, the encapsulation
692      headers are inserted between the IP header and the transport packet.
693      The addresses in the IP header refer to both the endpoints of the
694      encapsulation and the endpoints for terminating the transport
695      protocol. Note that the transport layer ports in the encapsulated
696      packet are independent of the UDP ports in the outer packet.

698      Details about performing transport layer encapsulation are discussed
699      in [TOU].

701   5.3. Encapsulator operation

703      Encapsulators create GUE data messages, set the fields of the UDP
704      header, set flags and optional extension fields in the GUE header,
705      and forward packets to a decapsulator.

707      An encapsulator can be an end host originating the packets of a flow,
708      or can be a network device performing encapsulation on behalf of
709      hosts (routers implementing tunnels for instance). In either case,
710      the intended target (decapsulator) is indicated by the outer
711      destination IP address and destination port in the UDP header.

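One decision an encapsulator makes before building the UDP payload is whether the direct encapsulation of section 4 (variant 1) applies. The following is a minimal sketch of that check, not taken from the draft: it assumes the inner packet is available as raw bytes, and the function names and the has_flags_or_extensions parameter are hypothetical.

```python
def gue_variant(udp_payload: bytes) -> int:
    """Return the 2-bit GUE variant held in the top two bits of the first payload byte."""
    if not udp_payload:
        raise ValueError("empty UDP payload")
    return udp_payload[0] >> 6


def can_use_variant1(inner_packet: bytes, has_flags_or_extensions: bool) -> bool:
    """True when the inner packet is IPv4 or IPv6 and no GUE flags or extension fields are needed.

    Hypothetical helper: section 4 describes variant 1 as usable only for
    IP packets sent without flags or extension fields.
    """
    if has_flags_or_extensions or not inner_packet:
        return False
    ip_version = inner_packet[0] >> 4  # IP version is the first four bits
    return ip_version in (4, 6)
```

Because the IP version nibbles 0100 and 0110 both start with 01, gue_variant() yields 1 for any IPv4 or IPv6 packet placed directly in the UDP payload.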
713      If an encapsulator is tunneling packets -- that is encapsulating
714      packets of layer 2 or layer 3 protocols (e.g. EtherIP, IPIP, ESP
715      tunnel mode) -- it SHOULD follow standard conventions for tunneling
716      of one protocol over another. For instance, if an IP packet is being
717      encapsulated in GUE then diffserv interaction [RFC2983] and ECN
718      propagation for tunnels [RFC6040] SHOULD be followed.

720   5.4. Decapsulator operation

722      A decapsulator performs decapsulation of GUE packets. A decapsulator
723      is addressed by the outer destination IP address of a GUE packet.
724      The decapsulator validates packets, including fields of the GUE
725      header.

727      If a decapsulator receives a GUE packet with an unsupported variant,
728      unknown flag, bad header length (too small for included extension
729      fields), unknown control message type, bad protocol number, an
730      unsupported payload type, or an otherwise malformed header, it MUST
731      drop the packet. Such events MAY be logged subject to configuration
732      and rate limiting of logging messages. Note that set flags in a GUE
733      header that are unknown to a decapsulator MUST NOT be ignored. If a
734      GUE packet is received by a decapsulator with unknown flags, the
735      packet MUST be dropped.

737   5.4.1. Processing a received data message

739      If a valid data message is received, the UDP header and GUE header
740      are removed from the packet. The outer IP header remains intact and
741      the next protocol in the IP header is set to the protocol from the
742      proto field in the GUE header. The resulting packet is then
743      resubmitted into the protocol stack to process that packet as though
744      it was received with the protocol in the GUE header.

746      As an example, consider that a data message is received where GUE
747      encapsulates an IPv4 packet using GUE variant 0. In this case the proto
748      field in the GUE header is set to 4 for IPv4 encapsulation:

750         +-------------------------------------+
751         | IP header (next proto = 17,UDP)     |
752         |-------------------------------------|
753         |                 UDP                 |
754         |-------------------------------------|
755         | GUE (proto = 4,IPv4 encapsulation)  |
756         |-------------------------------------|
757         |       IPv4 header and packet        |
758         +-------------------------------------+

760      The receiver removes the UDP and GUE headers and sets the next
761      protocol field in the IP packet to 4, which is derived from the GUE
762      proto field. The resultant packet would have the format:

764         +-------------------------------------+
765         | IP header (next proto = 4,IPv4)     |
766         |-------------------------------------|
767         |        IP header and packet         |
768         +-------------------------------------+

770      This packet is then resubmitted into the protocol stack to be
771      processed as an IPv4 encapsulated packet.

773   5.4.2. Processing a received control message

775      If a valid control message is received, the packet MUST be processed
776      as a control message. The specific processing to be performed depends
777      on the value in the ctype field of the GUE header.

779   5.5. Router and switch operation

781      Routers and switches SHOULD forward GUE packets as standard UDP/IP
782      packets. The outer five-tuple should contain sufficient information
783      to perform flow classification corresponding to the flow of the inner
784      packet. A router does not normally need to parse a GUE header, and
785      none of the flags or extension fields in the GUE header are expected
786      to affect routing. In cases where the outer five-tuple does not
787      provide sufficient entropy for flow classification, for instance UDP
788      ports are fixed to provide connection semantics (section 5.6.1), then
789      the encapsulated packet MAY be parsed to determine flow entropy.

791      A router MUST NOT modify a GUE header when forwarding a packet. It
792      MAY encapsulate a GUE packet in another GUE packet, for instance to
793      implement a network tunnel (i.e. by encapsulating an IP packet with a
794      GUE payload in another IP packet as a GUE payload). In this case, the
795      router takes the role of an encapsulator, and the corresponding
796      decapsulator is the logical endpoint of the tunnel. When
797      encapsulating a GUE packet within another GUE packet, there are no
798      provisions to automatically copy flags or fields to the outer GUE
799      header. Each layer of encapsulation is considered independent.

801   5.6. Middlebox interactions

803      A middlebox MAY interpret some flags and extension fields of the GUE
804      header for classification purposes, but is not required to understand
805      any of the flags or extension fields in GUE packets. A middlebox MUST
806      NOT drop a GUE packet merely because there are flags unknown to it.
807      The header length in the GUE header allows a middlebox to inspect the
808      payload packet without needing to parse the flags or extension
809      fields.

811   5.6.1. Inferring connection semantics

813      A middlebox might infer bidirectional connection semantics for a UDP
814      flow. For instance, a stateful firewall might create a five-tuple
815      rule to match flows on egress, and a corresponding five-tuple rule
816      for matching ingress packets where the roles of source and
817      destination are reversed for the IP addresses and UDP port numbers.
818      To operate in this environment, a GUE tunnel should be configured to
819      assume connected semantics defined by the UDP five tuple and the use
820      of GUE encapsulation needs to be symmetric between both endpoints.
821      The source port set in the UDP header MUST be the destination port
822      the peer would set for replies. In this case, the UDP source port for
823      a tunnel would be a fixed value and not set to be flow entropy as
824      described in section 5.11.

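The two source port modes (a fixed port for connection semantics, or a per-flow entropy value as in section 5.11) can be sketched as follows. This is an illustrative sketch only. The draft does not mandate a hash function, so the keyed BLAKE2b hash from Python's hashlib is an assumption, as are the names select_source_port, flow_entropy_port, flow_key and fixed_port. The mapping into 49152 to 65535 follows the ephemeral range property of section 5.11.2.

```python
import hashlib
from typing import Optional

EPHEMERAL_BASE = 0xC000  # 49152: both high-order bits of the port set
ENTROPY_MASK = 0x3FFF    # remaining fourteen bits carry the flow entropy


def flow_entropy_port(flow_key: bytes, seed: bytes) -> int:
    """Map an inner-flow identifier to a source port in the ephemeral range.

    Assumption: a keyed BLAKE2b hash stands in for the randomly seeded
    assignment function; any well-distributed seeded hash would do.
    """
    digest = hashlib.blake2b(flow_key, digest_size=2, key=seed).digest()
    return EPHEMERAL_BASE | (int.from_bytes(digest, "big") & ENTROPY_MASK)


def select_source_port(flow_key: bytes, seed: bytes,
                       fixed_port: Optional[int] = None) -> int:
    """Return the fixed port when connection semantics are configured,
    otherwise a flow entropy value (the default mode)."""
    if fixed_port is not None:
        return fixed_port
    return flow_entropy_port(flow_key, seed)
```

In this sketch flow_key would typically be built from the inner packet's five-tuple, and seed would come from a random source such as os.urandom(16) and be rotated periodically.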
826      The selection of whether to make the UDP source port fixed or set to
827      a flow entropy value for each packet sent SHOULD be configurable for
828      a tunnel. The default MUST be to set the flow entropy value in the
829      UDP source port.

831   5.6.2. NAT

833      IP address and port translation can be performed on the UDP/IP
834      headers adhering to the requirements for NAT with UDP [RFC4787]. In
835      the case of stateful NAT, connection semantics MUST be applied to a
836      GUE tunnel as described in section 5.6.1. GUE endpoints MAY also
837      invoke STUN [RFC5389] or ICE [RFC5245] to manage NAT port mappings
838      for encapsulations.

840   5.7. Checksum Handling

842      The potential for mis-delivery of packets due to corruption of IP,
843      UDP, or GUE headers needs to be considered. Historically, the UDP
844      checksum would be considered sufficient as a check against corruption
845      of either the UDP header and payload or the IP addresses.

847      Encapsulation protocols, such as GUE, can be originated or terminated
848      on devices incapable of computing the UDP checksum for a packet. This
849      section discusses the requirements around checksums and alternatives
850      that might be used when an endpoint does not support UDP checksums.

852   5.7.1. Requirements

854      One of the following requirements MUST be met:

856         o UDP checksums are enabled (for IPv4 or IPv6).

858         o The GUE header checksum is used (defined in [GUEEXTEN]).

860         o Use zero UDP checksums. This is always permissible with IPv4; in
861           IPv6, they can only be used in accordance with applicable
862           requirements in [RFC8086], [RFC6935], and [RFC6936].

864   5.7.2. UDP Checksum with IPv4

866      For UDP in IPv4, the UDP checksum MUST be processed as specified in
867      [RFC768] and [RFC1122] for both transmit and receive. An
868      encapsulator MAY set the UDP checksum to zero for performance or
869      implementation considerations. The IPv4 header includes a checksum
870      that protects against mis-delivery of the packet due to corruption
871      of IP addresses. The UDP checksum potentially provides protection
872      against corruption of the UDP header, GUE header, and GUE payload.
873      Enabling or disabling the use of checksums is a deployment
874      consideration that should take into account the risk and effects of
875      packet corruption, and whether the packets in the network are
876      already adequately protected by other, possibly stronger mechanisms,
877      such as the Ethernet CRC. If an encapsulator sets a zero UDP
878      checksum for IPv4, it SHOULD use the GUE header checksum as
879      described in [GUEEXTEN] assuming there are no other mechanisms used
880      to protect the GUE packet.

882      When a decapsulator receives a packet, the UDP checksum field MUST
883      be processed. If the UDP checksum is non-zero, the decapsulator MUST
884      verify the checksum before accepting the packet. By default, a
885      decapsulator SHOULD accept UDP packets with a zero checksum. A node
886      MAY be configured to disallow zero checksums per [RFC1122].
887      Configuration of zero checksums can be selective. For instance, zero
888      checksums might be disallowed from certain hosts that are known to
889      be traversing paths subject to packet corruption. If verification of
890      a non-zero checksum fails, a decapsulator lacks the capability to
891      verify a non-zero checksum, or a packet with a zero-checksum was
892      received and the decapsulator is configured to disallow, then the
893      packet MUST be dropped.

895   5.7.3. UDP Checksum with IPv6

897      In IPv6, there is no checksum in the IPv6 header that protects
898      against mis-delivery due to address corruption. Therefore, when GUE
899      is used over IPv6, either the UDP checksum or the GUE header
900      checksum SHOULD be used unless there are alternative mechanisms in
901      use that protect against misdelivery. The UDP checksum and GUE
902      header checksum SHOULD NOT be used at the same time since that would
903      be mostly redundant.

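The receive-side rules of section 5.7.2, together with the default IPv6 receive behavior stated later in this section, can be summarized as a small accept/drop decision. The sketch below is illustrative only: the function name and parameters are hypothetical, checksum_ok is assumed to be None when the decapsulator cannot verify checksums at all, and the actual checksum computation is left out.

```python
from typing import Optional


def accept_udp_checksum(ip_version: int,
                        checksum_field: int,
                        checksum_ok: Optional[bool],
                        gue_checksum_verified: bool = False,
                        allow_zero_ipv4: bool = True) -> bool:
    """Return True to accept the packet, False to drop it.

    checksum_ok: result of verifying a non-zero UDP checksum, or None if
    the decapsulator lacks the capability to verify (assumed convention).
    """
    if checksum_field != 0:
        # A non-zero checksum must be verified; failure or inability means drop.
        return checksum_ok is True
    if ip_version == 4:
        # Zero checksums are accepted by default but may be disallowed by configuration.
        return allow_zero_ipv4
    if ip_version == 6:
        # By default a zero checksum is only acceptable with a verified GUE header checksum.
        return gue_checksum_verified
    raise ValueError("unsupported IP version")
```

A selective policy, such as disallowing zero checksums from particular hosts, could be layered on top by choosing allow_zero_ipv4 per source address.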
905      If neither the UDP checksum nor the GUE header checksum is used, then
906      the requirements for using zero IPv6 UDP checksums in [RFC6935] and
907      [RFC6936] MUST be met.

909      When a decapsulator receives a packet, the UDP checksum field MUST
910      be processed. If the UDP checksum is non-zero, the decapsulator MUST
911      verify the checksum before accepting the packet. By default a
912      decapsulator MUST only accept UDP packets with a zero checksum if
913      the GUE header checksum is used and is verified. If verification of
914      a non-zero checksum fails, a decapsulator lacks the capability to
915      verify a non-zero checksum, or a packet with a zero-checksum and no
916      GUE header checksum was received, the packet MUST be dropped.

918   5.8. MTU and fragmentation

920      Standard conventions for handling of MTU (Maximum Transmission Unit)
921      and fragmentation in conjunction with networking tunnels
922      (encapsulation of layer 2 or layer 3 packets) SHOULD be followed.
923      Details are described in MTU and Fragmentation Issues with In-the-
924      Network Tunneling [RFC4459].

926      If a packet is fragmented before encapsulation in GUE, all the
927      related fragments MUST be encapsulated using the same UDP source
928      port. An operator SHOULD set MTU to account for encapsulation
929      overhead and reduce the likelihood of fragmentation.

931      As an alternative to IP fragmentation, the GUE fragmentation extension
932      can be used. GUE fragmentation is described in [GUEEXTEN].

934   5.9. Congestion control

936      Per requirements of [RFC5405], if the IP traffic encapsulated with
937      GUE implements proper congestion control, no additional mechanisms
938      should be required.

940      In the case that the encapsulated traffic does not implement any or
941      sufficient control, or it is not known whether a transmitter will
942      consistently implement proper congestion control, then congestion
943      control at the encapsulation layer MUST be provided per [RFC5405].
944      Note that this case applies to a significant use case in network
945      virtualization in which guests run third party networking stacks
946      that cannot be implicitly trusted to implement conformant congestion
947      control.

949      Out of band mechanisms such as rate limiting, Managed Circuit
950      Breaker [RFC8084], or traffic isolation MAY be used to provide
951      rudimentary congestion control. For finer-grained congestion control
952      that allows alternate congestion control algorithms, reaction time
953      within an RTT, and interaction with ECN, in-band mechanisms might be
954      warranted.

956   5.10. Multicast

958      GUE packets can be multicast to decapsulators using a multicast
959      destination address in the encapsulating IP headers. Each receiving
960      host will decapsulate the packet independently following normal
961      decapsulator operations. The receiving decapsulators need to agree
962      on the same set of GUE parameters and properties; how such an
963      agreement is reached is outside the scope of this document.

965      GUE allows encapsulation of unicast, broadcast, or multicast
966      traffic. Flow entropy (the value in the UDP source port) can be
967      generated from the header of encapsulated unicast or
968      broadcast/multicast packets at an encapsulator. The mapping
969      mechanism between the encapsulated multicast traffic and the
970      multicast capability in the IP network is transparent and
971      independent of the encapsulation and is otherwise outside the scope
972      of this document.

974   5.11. Flow entropy for ECMP

976   5.11.1. Flow classification

978      A major objective of using GUE is that a network device can perform
979      flow classification corresponding to the flow of the inner
980      encapsulated packet based on the contents in the outer headers.

982      Hardware devices commonly perform hash computations on packet
983      headers to classify packets into flows or flow buckets. Flow
984      classification is done to support load balancing of flows across a
985      set of networking resources. Examples of such load balancing
986      techniques are Equal Cost Multipath routing (ECMP), port selection
987      in Link Aggregation, and NIC device Receive Side Scaling (RSS).
988      Hashes are usually either a three-tuple hash of IP protocol, source
989      address, and destination address; or a five-tuple hash consisting of
990      IP protocol, source address, destination address, source port, and
991      destination port. Typically, networking hardware will compute five-
992      tuple hashes for TCP and UDP, but only three-tuple hashes for other
993      IP protocols. Since the five-tuple hash provides more granularity,
994      load balancing can be finer-grained with better distribution. When a
995      packet is encapsulated with GUE and connection semantics are not
996      applied, the source port in the outer UDP packet is set to a flow
997      entropy value that corresponds to the flow of the inner packet. When
998      a device computes a five-tuple hash on the outer UDP/IP header of a
999      GUE packet, the resultant value classifies the packet per its inner
1000     flow.

1002     Examples of deriving flow entropy for encapsulation are:

1004        o If the encapsulated packet is a layer 4 packet, TCP/IPv4 for
1005          instance, the flow entropy could be based on the canonical five-
1006          tuple hash of the inner packet.

1008        o If the encapsulated packet is an AH transport mode packet with
1009          TCP as next header, the flow entropy could be a hash over a
1010          three-tuple: TCP protocol and TCP ports of the encapsulated
1011          packet.

1013        o If a node is encrypting a packet using ESP tunnel mode and GUE
1014          encapsulation, the flow entropy could be based on the contents
1015          of the clear-text packet. For instance, a canonical five-tuple
1016          hash for a TCP/IP packet could be used.

1018     [RFC6438] discusses methods to compute and set flow entropy value for
1019     IPv6 flow labels. Such methods can also be used to create flow
1020     entropy values for GUE.

1022  5.11.2. Flow entropy properties

1024     The flow entropy is the value set in the UDP source port of a GUE
1025     packet. Flow entropy in the UDP source port SHOULD adhere to the
1026     following properties:

1028        o The value set in the source port is within the ephemeral port
1029          range (49152 to 65535 [RFC6335]). Since the high order two bits
1030          of the port are set to one, this provides fourteen bits of
1031          entropy for the value.

1033        o The flow entropy has a uniform distribution across encapsulated
1034          flows.

1036        o An encapsulator MAY occasionally change the flow entropy used
1037          for an inner flow per its discretion (for security, route
1038          selection, etc). To avoid thrashing or flapping the value, the
1039          flow entropy used for a flow SHOULD NOT change more than once
1040          every thirty seconds (or a configurable value).

1042        o Decapsulators, or any networking devices, SHOULD NOT attempt to
1043          interpret flow entropy as anything more than an opaque value.
1044          Neither should they attempt to reproduce the hash calculation
1045          used by an encapsulator in creating a flow entropy value. They
1046          MAY use the value to match further received packets for steering
1047          decisions, but MUST NOT assume that the hash uniquely or
1048          permanently identifies a flow.

1050        o Input to the flow entropy calculation is not restricted to ports
1051          and addresses; input could include flow label from an IPv6
1052          packet, SPI from an ESP packet, or other flow related state in
1053          the encapsulator that is not necessarily conveyed in the packet.

1055        o The assignment function for flow entropy SHOULD be randomly
1056          seeded to mitigate denial of service attacks. The seed SHOULD be
1057          changed periodically.

1059  5.12. Negotiation of acceptable flags and extension fields

1061     An encapsulator and decapsulator need to achieve agreement about GUE
1062     parameters that will be used in communications. Parameters include
1063     supported GUE variants, flags and extension fields that can be used,
1064     security algorithms and keys, supported protocols and control
1065     messages, etc. This document proposes different general methods to
1066     accomplish this; however, the details of implementing these are
1067     considered out of scope.

1069     General methods for this are:

1071        o Configuration. The parameters used for a tunnel are configured
1072          at each endpoint.

1074        o Negotiation. A tunnel negotiation can be performed. This could
1075          be accomplished in-band of GUE using control messages or private
1076          data.

1078        o Via a control plane. Parameters for communicating with a tunnel
1079          endpoint can be set in a control plane protocol (such as that
1080          needed for network virtualization).

1082        o Via security negotiation. Use of security typically implies a
1083          key exchange between endpoints. Other GUE parameters may be
1084          conveyed as part of that process.

1086  6. Motivation for GUE

1088     This section presents the motivation for GUE with respect to other
1089     encapsulation methods.

1091  6.1. Benefits of GUE

1093        * GUE is a generic encapsulation protocol. GUE can encapsulate
1094          protocols that are represented by an IP protocol number. This
1095          includes layer 2, layer 3, and layer 4 protocols.

1097        * GUE is an extensible encapsulation protocol. Standardized
1098          optional data such as security, virtual networking identifiers,
1099          and fragmentation are being defined.

1101        * For extensibility, GUE uses flag fields as opposed to TLVs as
1102          some other encapsulation protocols do. Flag fields are strictly
1103          ordered, allow random access, and are efficient in use of header
1104          space.

1106        * GUE allows private data to be sent as part of the encapsulation.
1107          This permits experimentation or customization in deployment.

1109        * GUE allows sending of control messages such as OAM using the
1110          same GUE header format (for routing purposes) as normal data
1111          messages.

1113        * GUE maximizes deliverability of non-UDP and non-TCP protocols.

1115        * GUE provides a means for exposing per flow entropy for ECMP for
1116          atypical protocols such as SCTP, DCCP, ESP, etc.

1118  6.2. Comparison of GUE to other encapsulations

1120     A number of different encapsulation techniques have been proposed for
1121     the encapsulation of one protocol over another. EtherIP [RFC3378]
1122     provides layer 2 tunneling of Ethernet frames over IP. GRE [RFC2784],
1123     MPLS [RFC4023], and L2TP [RFC2661] provide methods for tunneling
1124     layer 2 and layer 3 packets over IP. NVGRE [RFC7637] and VXLAN
1125     [RFC7348] are proposals for encapsulation of layer 2 packets for
1126     network virtualization. IPIP [RFC2003] and Generic packet tunneling
1127     in IPv6 [RFC2473] provide methods for tunneling IP packets over IP.

1129     Several proposals exist for encapsulating packets over UDP including
1130     ESP over UDP [RFC3948], TCP directly over UDP [TCPUDP], VXLAN
1131     [RFC7348], LISP [RFC6830] which encapsulates layer 3 packets,
1132     MPLS/UDP [RFC7510], GENEVE [GENEVE], and GRE-in-UDP Encapsulation
1133     [RFC8086].

1135     GUE has the following discriminating features:

1137        o UDP encapsulation leverages specialized network device
1138          processing for efficient transport. The semantics for using the
1139          UDP source port for flow entropy as input to ECMP are defined in
1140          section 5.11.

1142        o GUE permits encapsulation of arbitrary IP protocols, which
1143          includes layer 2, 3, and 4 protocols.

1145        o Multiple protocols can be multiplexed over a single UDP port
1146          number. This is in contrast to techniques to encapsulate
1147          protocols over UDP using a protocol specific port number (such
1148          as ESP/UDP, GRE/UDP, SCTP/UDP). GUE provides a uniform and
1149          extensible mechanism for encapsulating all IP protocols in UDP
1150          with minimal overhead (four bytes of additional header).

1152        o GUE is extensible. New flags and extension fields can be
1153          defined.

1155        o The GUE header includes a header length field. This allows a
1156          network node to inspect an encapsulated packet without needing
1157          to parse the full encapsulation header.

1159        o Private data in the encapsulation header allows local
1160          customization and experimentation while being compatible with
1161          processing in network nodes (routers and middleboxes).

1163        o GUE includes both data messages (encapsulation of packets) and
1164          control messages (such as OAM).

1166        o The flags-field model facilitates efficient implementation of
1167          extensibility in hardware. For instance, a TCAM can be used to
1168          parse a known set of N flags where the number of entries in the
1169          TCAM is 2^N. By comparison, the number of TCAM entries needed to
1170          parse a set of N arbitrarily ordered TLVs is approximately e*N!.

1172        o GUE includes a variant that encapsulates IPv4 and IPv6 packets
1173          directly within UDP.

1175  7. Security Considerations

1177     There are two important considerations of security with respect to
1178     GUE.

1180        o Authentication and integrity of the GUE header.

1182        o Authentication, integrity, and confidentiality of the GUE
1183          payload.

1185     GUE security is provided by extensions for security defined in
1186     [GUEEXTEN]. These extensions include methods to authenticate the GUE
1187     header and encrypt the GUE payload.

1189     The GUE header can be authenticated using a security extension for an
1190     HMAC. Securing the GUE payload can be accomplished by use of the GUE
1191     Payload Transform. This extension can be used to perform DTLS in the
1192     payload of a GUE packet to encrypt the payload.

1194     A hash function for computing flow entropy (section 5.11) SHOULD be
1195     randomly seeded to mitigate some possible denial of service attacks.

1197  8. IANA Considerations

1199  8.1. UDP source port

1201     A user UDP port number assignment for GUE has been assigned:

1203        Service Name: gue
1204        Transport Protocol(s): UDP
1205        Assignee: Tom Herbert
1206        Contact: Tom Herbert
1207        Description: Generic UDP Encapsulation
1208        Reference: draft-herbert-gue
1209        Port Number: 6080
1210        Service Code: N/A
1211        Known Unauthorized Uses: N/A
1212        Assignment Notes: N/A

1214  8.2. GUE variant number

1216     IANA is requested to set up a registry for the GUE variant number.
1217     The GUE variant number is 2 bits containing four possible values.
1218     This document defines variants 0 and 1. New values are assigned in
1219     accordance with RFC Required policy [RFC5226].

1221        +----------------+----------------+---------------+
1222        | Variant number | Description    | Reference     |
1223        +----------------+----------------+---------------+
1224        | 0              | GUE Version 0  | This document |
1225        |                | with header    |               |
1226        |                |                |               |
1227        | 1              | GUE Version 0  | This document |
1228        |                | with direct IP |               |
1229        |                | encapsulation  |               |
1230        |                |                |               |
1231        | 2..3           | Unassigned     |               |
1232        +----------------+----------------+---------------+

1234  8.3. Control types

1236     IANA is requested to set up a registry for the GUE control types.
1237     Control types are 8 bit values. New values for control types 1-127
1238     are assigned in accordance with RFC Required policy [RFC5226].

1240        +----------------+------------------+---------------+
1241        | Control type   | Description      | Reference     |
1242        +----------------+------------------+---------------+
1243        | 0              | Control payload  | This document |
1244        |                | needs more       |               |
1245        |                | context for      |               |
1246        |                | interpretation   |               |
1247        |                |                  |               |
1248        | 1..127         | Unassigned       |               |
1249        |                |                  |               |
1250        | 128..255       | User defined     | This document |
1251        +----------------+------------------+---------------+

1253  9. Acknowledgements

1255     The authors would like to thank David Liu, Erik Nordmark, Fred
1256     Templin, Adrian Farrel, Bob Briscoe, and Murray Kucherawy for
1257     valuable input on this draft.
1259 10. References 1261 10.1. Normative References 1263 [RFC0768] Postel, J., "User Datagram Protocol", STD 6, RFC 768, DOI 1264 10.17487/RFC0768, August 1980, . 1267 [RFC1122] Braden, R., Ed., "Requirements for Internet Hosts - 1268 Communication Layers", STD 3, RFC 1122, DOI 1269 10.17487/RFC1122, October 1989, . 1272 [RFC2434] Narten, T. and H. Alvestrand, "Guidelines for Writing an 1273 IANA Considerations Section in RFCs", RFC 2434, DOI 1274 10.17487/RFC2434, October 1998, . 1277 [RFC2983] Black, D., "Differentiated Services and Tunnels", RFC 1278 2983, DOI 10.17487/RFC2983, October 2000, . 1281 [RFC6040] Briscoe, B., "Tunnelling of Explicit Congestion 1282 Notification", RFC 6040, DOI 10.17487/RFC6040, November 1283 2010, . 1285 [RFC6935] Eubanks, M., Chimento, P., and M. Westerlund, "IPv6 and 1286 UDP Checksums for Tunneled Packets", RFC 6935, DOI 1287 10.17487/RFC6935, April 2013, . 1290 [RFC6936] Fairhurst, G. and M. Westerlund, "Applicability Statement 1291 for the Use of IPv6 UDP Datagrams with Zero Checksums", 1292 RFC 6936, DOI 10.17487/RFC6936, April 2013, 1293 . 1295 [RFC4459] Savola, P., "MTU and Fragmentation Issues with In-the- 1296 Network Tunneling", RFC 4459, DOI 10.17487/RFC4459, April 1297 2006, . 1299 10.2. Informative References 1301 [RFC3828] Larzon, L-A., Degermark, M., Pink, S., Jonsson, L-E., Ed., 1302 and G. Fairhurst, Ed., "The Lightweight User Datagram 1303 Protocol (UDP-Lite)", RFC 3828, July 2004, 1304 . 1306 [RFC7348] Mahalingam, M., Dutt, D., Duda, K., Agarwal, P., Kreeger, 1307 L., Sridhar, T., Bursell, M., and C. Wright, "Virtual 1308 eXtensible Local Area Network (VXLAN): A Framework for 1309 Overlaying Virtualized Layer 2 Networks over Layer 3 1310 Networks", RFC 7348, August 2014, . 1313 [RFC7605] Touch, J., "Recommendations on Using Assigned Transport 1314 Port Numbers", BCP 165, RFC 7605, DOI 10.17487/RFC7605, 1315 August 2015, . 1317 [RFC7637] Garg, P., Ed., and Y. 
Wang, Ed., "NVGRE: Network 1318 Virtualization Using Generic Routing Encapsulation", RFC 1319 7637, DOI 10.17487/RFC7637, September 2015, 1320 . 1322 [RFC8086] Yong, L., Ed., Crabbe, E., Xu, X., and T. Herbert, "GRE- 1323 in-UDP Encapsulation", RFC 8086, DOI 10.17487/RFC8086, 1324 March 2017, . 1326 [RFC7510] Xu, X., Sheth, N., Yong, L., Callon, R., and D. Black, 1327 "Encapsulating MPLS in UDP", RFC 7510, DOI 1328 10.17487/RFC7510, April 2015, . 1331 [RFC4340] Kohler, E., Handley, M., and S. Floyd, "Datagram 1332 Congestion Control Protocol (DCCP)", RFC 4340, DOI 1333 10.17487/RFC4340, March 2006, . 1336 [RFC4787] Audet, F., Ed., and C. Jennings, "Network Address 1337 Translation (NAT) Behavioral Requirements for Unicast 1338 UDP", BCP 127, RFC 4787, DOI 10.17487/RFC4787, January 1339 2007, . 1341 [RFC5389] Rosenberg, J., Mahy, R., Matthews, P., and D. Wing, 1342 "Session Traversal Utilities for NAT (STUN)", RFC 5389, 1343 DOI 10.17487/RFC5389, October 2008, . 1346 [RFC5285] Rosenberg, J., "Interactive Connectivity Establishment 1347 (ICE): A Protocol for Network Address Translator (NAT) 1348 Traversal for Offer/Answer Protocols", RFC 5245, DOI 1349 10.17487/RFC5245, April 2010, . 1352 [RFC5405] Eggert, L. and G. Fairhurst, "Unicast UDP Usage Guidelines 1353 for Application Designers", BCP 145, RFC 5405, DOI 1354 10.17487/RFC5405, November 2008, . 1357 [RFC6438] Carpenter, B. and S. Amante, "Using the IPv6 Flow Label 1358 for Equal Cost Multipath Routing and Link Aggregation in 1359 Tunnels", RFC 6438, DOI 10.17487/RFC6438, November 2011, 1360 . 1362 [RFC2003] Perkins, C., "IP Encapsulation within IP", RFC 2003, DOI 1363 10.17487/RFC2003, October 1996, . 1366 [RFC3948] Huttunen, A., Swander, B., Volpe, V., DiBurro, L., and M. 1367 Stenberg, "UDP Encapsulation of IPsec ESP Packets", RFC 1368 3948, DOI 10.17487/RFC3948, January 2005, . 1371 [RFC6830] Farinacci, D., Fuller, V., Meyer, D., and D. 
Lewis, "The 1372 Locator/ID Separation Protocol (LISP)", RFC 6830, DOI 1373 10.17487/RFC6830, January 2013, . 1376 [RFC3378] Housley, R. and S. Hollenbeck, "EtherIP: Tunneling 1377 Ethernet Frames in IP Datagrams", RFC 3378, DOI 1378 10.17487/RFC3378, September 2002, . 1381 [RFC2784] Farinacci, D., Li, T., Hanks, S., Meyer, D., and P. 1382 Traina, "Generic Routing Encapsulation (GRE)", RFC 2784, 1383 DOI 10.17487/RFC2784, March 2000, . 1386 [RFC4023] Worster, T., Rekhter, Y., and E. Rosen, Ed., 1387 "Encapsulating MPLS in IP or Generic Routing Encapsulation 1388 (GRE)", RFC 4023, DOI 10.17487/RFC4023, March 2005, 1389 . 1391 [RFC2661] Townsley, W., Valencia, A., Rubens, A., Pall, G., Zorn, 1392 G., and B. Palter, "Layer Two Tunneling Protocol "L2TP"", 1393 RFC 2661, DOI 10.17487/RFC2661, August 1999, 1394 . 1396 [RFC8084] Fairhurst, G., "Network Transport Circuit Breakers", BCP 1397 208, RFC 8084, DOI 10.17487/RFC8084, March 2017, 1398 . 1400 [GUEEXTEN] Herbert, T., Yong, L., and Templin, F., "Extensions for 1401 Generic UDP Encapsulation" draft-herbert-gue-extensions-00 1403 [GUE4NVO3] Yong, L., Herbert, T., Zia, O., "Generic UDP Encapsulation 1404 (GUE) for Network Virtualization Overlay" draft-hy-nvo3- 1405 gue-4-nvo-03 1407 [GUESEC] Yong, L., Herbert, T., "Generic UDP Encapsulation (GUE) 1408 for Secure Transport" draft-hy-gue-4-secure-transport-03 1410 [TCPUDP] Cheshire, S., Graessley, J., and McGuire, R., 1411 "Encapsulation of TCP and other Transport Protocols over 1412 UDP" draft-cheshire-tcp-over-udp-00 1414 [TOU] Herbert, T., "Transport layer protocols over UDP" draft- 1415 herbert-transports-over-udp-00 1417 [GENEVE] Gross, J., Ed., Ganga, I.,
Ed., and Sridhar, T., "Geneve: 1418 Generic Network Virtualization Encapsulation", draft-ietf- 1419 nvo3-geneve-05 1421 [LCO] Cree, E., https://www.kernel.org/doc/Documentation/ 1422 networking/checksum-offloads.txt 1424 Appendix A: NIC processing for GUE 1426 This appendix provides some guidelines for Network Interface Cards 1427 (NICs) to implement common offloads and accelerations to support GUE. 1428 Note that most of this discussion is generally applicable to other 1429 methods of UDP-based encapsulation. 1431 A.1. Receive multi-queue 1433 Contemporary NICs support multiple receive descriptor queues (multi- 1434 queue). Multi-queue enables load balancing of network processing for 1435 a NIC across multiple CPUs. On packet reception, a NIC selects the 1436 appropriate queue for host processing. Receive Side Scaling (RSS) is a 1437 common method that uses the flow hash for a packet to index an 1438 indirection table where each entry stores a queue number. Flow 1439 Director and Accelerated Receive Flow Steering (aRFS) allow a host to 1440 program the queue that is used for a given flow, which is identified 1441 either by an explicit five-tuple or by the flow's hash. 1443 GUE encapsulation is compatible with multi-queue NICs that support 1444 five-tuple hash calculation for UDP/IP packets as input to RSS. The 1445 flow entropy in the UDP source port ensures classification of the 1446 encapsulated flow even in the case that the outer source and 1447 destination addresses are the same for all flows (e.g., all flows are 1448 going over a single tunnel). 1450 By default, UDP RSS support is often disabled in NICs to avoid out- 1451 of-order reception that can occur when UDP packets are fragmented. As 1452 discussed above, fragmentation of GUE packets is mostly avoided by 1453 fragmenting packets before entering a tunnel, GUE fragmentation, path 1454 MTU discovery in higher layer protocols, or operators adjusting MTUs.
1455 Other UDP traffic might not implement such procedures to avoid 1456 fragmentation, so whether to enable UDP RSS support in the NIC is a 1457 tradeoff to weigh during configuration. 1459 A.2. Checksum offload 1461 Many NICs provide capabilities to calculate the standard ones' complement 1462 payload checksum for packets on transmit or receive. When using GUE 1463 encapsulation, there are at least two checksums that are of interest: 1464 the encapsulated packet's transport checksum, and the UDP checksum in 1465 the outer header. 1467 A.2.1. Transmit checksum offload 1469 NICs can provide a protocol-agnostic method to offload transmit 1470 checksum (NETIF_F_HW_CSUM in Linux parlance) that can be used with 1471 GUE. In this method, the host provides checksum-related parameters in 1472 a transmit descriptor for a packet. These parameters include the 1473 starting offset of data to checksum, the length of data to checksum, 1474 and the offset in the packet where the computed checksum is to be 1475 written. The host initializes the checksum field to the pseudo header 1476 checksum. 1478 In the case of GUE, the checksum for an encapsulated transport layer 1479 packet, a TCP packet for instance, can be offloaded by setting the 1480 appropriate checksum parameters. 1482 NICs typically can offload only one transmit checksum per packet, so 1483 simultaneously offloading both an inner transport packet's checksum 1484 and the outer UDP checksum is likely not possible. 1486 If an encapsulator is co-resident with a host, then checksum offload 1487 may be performed using remote checksum offload (described in 1488 [GUEEXTEN]). Remote checksum offload relies on NIC offload of the 1489 simple UDP/IP checksum, which is commonly supported even in legacy 1490 devices. In remote checksum offload, the outer UDP checksum is set 1491 and the GUE header includes an option indicating the start and offset 1492 of the inner "offloaded" checksum.
The inner checksum is initialized 1493 to the pseudo header checksum. When a decapsulator receives a GUE 1494 packet with the remote checksum offload option, it completes the 1495 offload operation by determining the packet checksum from the 1496 indicated start point to the end of the packet, and then adds this 1497 into the checksum field at the offset given in the option. Computing 1498 the checksum from the start point to the end of the packet is efficient if 1499 checksum-complete is provided on the receiver. 1501 Another alternative when an encapsulator is co-resident with a host 1502 is to perform Local Checksum Offload [LCO]. In this method, the inner 1503 transport layer checksum is offloaded and the outer UDP checksum can 1504 be deduced based on the fact that the portion of the packet covered 1505 by the inner transport checksum will sum to zero (or at least the bitwise 1506 "not" of the inner pseudo header). 1508 A.2.2. Receive checksum offload 1510 GUE is compatible with NICs that perform a protocol-agnostic receive 1511 checksum (CHECKSUM_COMPLETE in Linux parlance). In this technique, a 1512 NIC computes a ones' complement checksum over all (or some predefined 1513 portion) of a packet. The computed value is provided to the host 1514 stack in the packet's receive descriptor. The host driver can use 1515 this checksum to "patch up" and validate any inner packet transport 1516 checksum, as well as the outer UDP checksum if it is non-zero. 1518 Many legacy NICs don't provide checksum-complete but instead provide 1519 an indication that a checksum has been verified (CHECKSUM_UNNECESSARY 1520 in Linux). Usually, such validation is only done for simple TCP/IP or 1521 UDP/IP packets. If a NIC indicates that a UDP checksum is valid, the 1522 checksum-complete value for the UDP packet is the "not" of the pseudo 1523 header checksum. In this way, checksum-unnecessary can be converted 1524 to checksum-complete.
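The conversion from checksum-unnecessary to checksum-complete can be illustrated with a minimal Python sketch. It is not taken from any driver or from this document: the helper names (ones_sum, pseudo_header, build_udp, unnecessary_to_complete), the use of the IPv4 pseudo header, and the addresses and ports are assumptions chosen for illustration. The sketch builds a UDP datagram with a valid checksum and derives the ones' complement sum over the datagram from the pseudo header alone.

```python
import struct


def ones_sum(data: bytes) -> int:
    """Folded 16-bit ones' complement sum of data (odd length is zero-padded)."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def pseudo_header(src: bytes, dst: bytes, udp_len: int) -> bytes:
    """IPv4 pseudo header: source, destination, zero, protocol 17, UDP length."""
    return src + dst + struct.pack("!BBH", 0, 17, udp_len)


def build_udp(src: bytes, dst: bytes, sport: int, dport: int, payload: bytes) -> bytes:
    """Build a UDP datagram (header plus payload) carrying a valid checksum."""
    length = 8 + len(payload)
    header = struct.pack("!HHHH", sport, dport, length, 0)
    csum = ~ones_sum(pseudo_header(src, dst, length) + header + payload) & 0xFFFF
    if csum == 0:
        csum = 0xFFFF  # a computed zero is transmitted as all ones
    return struct.pack("!HHHH", sport, dport, length, csum) + payload


def unnecessary_to_complete(src: bytes, dst: bytes, udp_len: int) -> int:
    """Derive the sum over a UDP datagram that the NIC reported as valid.

    A valid datagram and its pseudo header sum to zero, so the datagram's
    sum is the "not" of the pseudo header sum. 0x0000 and 0xFFFF both
    represent zero; 0xFFFF is returned to match ones_sum().
    """
    return (~ones_sum(pseudo_header(src, dst, udp_len)) & 0xFFFF) or 0xFFFF


if __name__ == "__main__":
    # Illustrative addresses, ports, and payload only.
    src, dst = bytes([192, 0, 2, 1]), bytes([198, 51, 100, 2])
    udp = build_udp(src, dst, 49152, 6080, b"inner packet")
    print(hex(unnecessary_to_complete(src, dst, len(udp))), hex(ones_sum(udp)))
```

The derived value is obtained without touching the payload, which is why a stack could reuse it to validate checksums of the encapsulated packet.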
So, if the NIC provides checksum-unnecessary 1525 for the outer UDP header in an encapsulation, checksum conversion can 1526 be done so that the checksum-complete value is derived and can be 1527 used by the stack to validate checksums in the encapsulated packet. 1529 A.3. Transmit Segmentation Offload 1531 Transmit Segmentation Offload (TSO) is a NIC feature where a host 1532 provides a large (>MTU size) TCP packet to the NIC, which in turn 1533 splits the packet into separate segments and transmits each one. This 1534 is useful to reduce CPU load on the host. 1536 The process of TSO can be generalized as: 1538 - Split the TCP payload into segments that allow packets with 1539 size less than or equal to the MTU. 1541 - For each created segment: 1543 1. Replicate the TCP header and all preceding headers of the 1544 original packet. 1546 2. Set payload length fields in any headers to reflect the 1547 length of the segment. 1549 3. Set the TCP sequence number to correctly reflect the offset of 1550 the TCP data in the stream. 1552 4. Recompute and set any checksums that either cover the payload 1553 of the packet or cover a header that was changed by setting a 1554 payload length. 1556 Following this general process, TSO can be extended to support TCP 1557 encapsulation in GUE. For each segment the Ethernet, outer IP, UDP 1558 header, GUE header, inner IP header (if tunneling), and TCP headers 1559 are replicated. Any packet length header fields need to be set 1560 properly (including the length in the outer UDP header), and 1561 checksums need to be set correctly (including the outer UDP checksum 1562 if being used). 1564 To facilitate TSO with GUE, it is recommended that extension fields 1565 do not contain values that need to be updated on a per segment basis. 1566 For example, extension fields should not include checksums, lengths, 1567 or sequence numbers that refer to the payload.
If the GUE header does 1568 not contain such fields, then the TSO engine only needs to copy the 1569 bits in the GUE header when creating each segment and does not need 1570 to parse the GUE header. 1572 A.4. Large Receive Offload 1574 Large Receive Offload (LRO) is a NIC feature where packets of a TCP 1575 connection are reassembled, or coalesced, in the NIC and delivered to 1576 the host as one large packet. This feature can reduce CPU utilization 1577 in the host. 1579 LRO requires significant protocol awareness to be implemented 1580 correctly and is difficult to generalize. Packets in the same flow 1581 need to be unambiguously identified. In the presence of tunnels or 1582 network virtualization, this may require more than a five-tuple match 1583 (for instance packets for flows in two different virtual networks may 1584 have identical five-tuples). Additionally, a NIC needs to perform 1585 validation over packets that are being coalesced, and needs to 1586 fabricate a single meaningful header from all the coalesced packets. 1588 The conservative approach to supporting LRO for GUE would be to 1589 assign packets to the same flow only if they have an identical five- 1590 tuple and were encapsulated the same way. That is, the outer IP 1591 addresses, the outer UDP ports, GUE protocol, GUE flags and fields, 1592 and inner five-tuple are all identical. 1594 Appendix B: Implementation considerations 1595 This appendix is informational and does not constitute a normative 1596 part of this document. 1598 B.1. Privileged ports 1600 Using the source port to contain a flow entropy value disallows the 1601 security method of a receiver enforcing that the source port be a 1602 privileged port. Privileged ports are defined by some operating 1603 systems to restrict source port binding. Unix, for instance, 1604 considers port numbers less than 1024 to be privileged.
1606 Enforcing that packets are sent from a privileged port is widely 1607 considered an inadequate security mechanism and has been mostly 1608 deprecated. To approximate this behavior, an implementation could 1609 restrict a user from sending a packet destined to the GUE port 1610 without proper credentials. 1612 B.2. Setting flow entropy as a route selector 1614 An encapsulator generating flow entropy in the UDP source port could 1615 modulate the value to perform a type of multipath source routing. 1616 Assuming that networking switches perform ECMP based on the flow 1617 hash, a sender can affect the path by altering the flow entropy. For 1618 instance, a host can store a flow hash in its protocol control block 1619 (PCB) for an inner flow, and might alter the value upon detecting 1620 that packets are traversing a lossy path. Changing the flow entropy 1621 for a flow SHOULD be subject to hysteresis (at most once every thirty 1622 seconds) to limit the number of out-of-order packets. 1624 B.3. Hardware protocol implementation considerations 1626 Low level data path protocols, such as GUE, are often supported in 1627 high speed network device hardware. Variable length header (VLH) 1628 protocols like GUE are often considered difficult to efficiently 1629 implement in hardware. In order to retain the important 1630 characteristics of an extensible and robust protocol, hardware 1631 vendors may practice "constrained flexibility". In this model, only 1632 certain combinations or protocol header parameterizations are 1633 implemented in the hardware fast path. Each such parameterization is 1634 fixed length so that the particular instance can be optimized as a 1635 fixed length protocol. In the case of GUE this constitutes specific 1636 combinations of GUE flags, fields, and next protocol. The selected 1637 combinations would naturally be the most common cases which form the 1638 "fast path", and other combinations are assumed to take the "slow 1639 path".
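The constrained flexibility model described above can be sketched in Python as a lookup over a small set of fixed-length parameterizations. The names GueParams and FastPathTable, the choice of (flags, header length, next protocol) as the lookup key, and the example flag value are hypothetical and serve only to illustrate the idea. Real hardware would likely realize such a table in match logic rather than software.

```python
from typing import Iterable, NamedTuple


class GueParams(NamedTuple):
    """One fixed-length parameterization of the GUE header (illustrative)."""
    flags: int   # GUE flags value
    hlen: int    # header length implied by the flags and fields present
    proto: int   # next protocol, as an IP protocol number


class FastPathTable:
    """Set of parameterizations that are handled in the fast path."""

    def __init__(self, entries: Iterable[GueParams] = ()) -> None:
        self._entries = set(entries)

    def program(self, params: GueParams) -> None:
        """Add a parameterization to the fast path."""
        self._entries.add(params)

    def classify(self, params: GueParams) -> str:
        """Return "fast" for a known parameterization, otherwise "slow"."""
        return "fast" if params in self._entries else "slow"


# Assumed common cases: no flags, no extension fields, and IPv4 (protocol 4)
# or IPv6 (protocol 41) as the encapsulated protocol.
table = FastPathTable([GueParams(0, 0, 4), GueParams(0, 0, 41)])

# A packet with an arbitrary flag set and one extra word of header is not
# in the table, so it would take the slow path.
print(table.classify(GueParams(0, 0, 4)))
print(table.classify(GueParams(0x0001, 1, 4)))
```

Because each entry fixes the flags and header length, a match tells the data path exactly where the payload starts without parsing a variable-length header.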
1641 In time, the needs and requirements of the protocol may change, which may 1642 manifest as new parameterizations to be supported in the 1643 fast path. To allow this extensibility, a device practicing 1644 constrained flexibility should allow the fast path parameterizations 1645 to be programmable. 1647 Authors' Addresses 1649 Tom Herbert 1650 Quantonium 1651 4701 Patrick Henry 1652 Santa Clara, CA 95054 1653 US 1655 Email: tom@herbertland.com 1657 Lucy Yong 1658 Huawei USA 1659 5340 Legacy Dr. 1660 Plano, TX 75024 1661 US 1663 Email: lucy.yong@huawei.com 1665 Osama Zia 1666 Microsoft 1667 1 Microsoft Way 1668 Redmond, WA 98029 1669 US 1671 Email: osamaz@microsoft.com