idnits 2.17.1 draft-ietf-intarea-gue-00.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack a both a reference to RFC 2119 and the recommended RFC 2119 boilerplate, even if it appears to use RFC 2119 keywords. RFC 2119 keyword, line 334: '... flag bits MUST be set to zero o...' RFC 2119 keyword, line 482: '...and decapsulator MUST agree on the mea...' RFC 2119 keyword, line 488: '... GUE packet with private data, it MUST...' RFC 2119 keyword, line 490: '...capsulator the packet MUST be dropped....' RFC 2119 keyword, line 492: '...ntics the packet MUST also be dropped....' (19 more instances...) Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year -- The document date (October 31, 2016) is 2734 days in the past. Is this intentional? 
Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Missing Reference: 'GUEXTENS' is mentioned on line 442, but not defined == Missing Reference: 'RFC5245' is mentioned on line 820, but not defined ** Obsolete undefined reference: RFC 5245 (Obsoleted by RFC 8445, RFC 8839) == Missing Reference: 'RFC768' is mentioned on line 849, but not defined == Missing Reference: 'RFC6335' is mentioned on line 1003, but not defined == Missing Reference: 'RFC2473' is mentioned on line 1096, but not defined -- Looks like a reference, but probably isn't: '7510' on line 1101 == Missing Reference: 'RFC5226' is mentioned on line 1228, but not defined ** Obsolete undefined reference: RFC 5226 (Obsoleted by RFC 8126) == Unused Reference: 'RFC2434' is defined on line 1276, but no explicit reference was found in the text == Unused Reference: 'RFC3828' is defined on line 1305, but no explicit reference was found in the text == Unused Reference: 'RFC7510' is defined on line 1326, but no explicit reference was found in the text == Unused Reference: 'RFC4340' is defined on line 1331, but no explicit reference was found in the text == Unused Reference: 'RFC5285' is defined on line 1346, but no explicit reference was found in the text ** Obsolete normative reference: RFC 2434 (Obsoleted by RFC 5226) ** Downref: Normative reference to an Informational RFC: RFC 2983 ** Downref: Normative reference to an Informational RFC: RFC 4459 -- Obsolete informational reference (is this intentional?): RFC 5389 (Obsoleted by RFC 8489) -- Obsolete informational reference (is this intentional?): RFC 5245 (ref. 
'RFC5285') (Obsoleted by RFC 8445, RFC 8839) -- Obsolete informational reference (is this intentional?): RFC 5405 (Obsoleted by RFC 8085) -- Obsolete informational reference (is this intentional?): RFC 6830 (Obsoleted by RFC 9300, RFC 9301) == Outdated reference: A later version (-01) exists of draft-herbert-gue-extensions-00 == Outdated reference: A later version (-04) exists of draft-hy-nvo3-gue-4-nvo-03 == Outdated reference: A later version (-01) exists of draft-herbert-transports-over-udp-00 Summary: 6 errors (**), 0 flaws (~~), 15 warnings (==), 6 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Internet Area WG T. Herbert 3 Internet-Draft Facebook 4 Intended status: Standard track L. Yong 5 Expires May 4, 2017 Huawei USA 6 O. Zia 7 Microsoft 8 October 31, 2016 10 Generic UDP Encapsulation 11 draft-ietf-intarea-gue-00 13 Status of this Memo 15 This Internet-Draft is submitted in full conformance with the 16 provisions of BCP 78 and BCP 79. 18 Internet-Drafts are working documents of the Internet Engineering 19 Task Force (IETF), its areas, and its working groups. Note that 20 other groups may also distribute working documents as Internet- 21 Drafts. 23 Internet-Drafts are draft documents valid for a maximum of six months 24 and may be updated, replaced, or obsoleted by other documents at any 25 time. It is inappropriate to use Internet-Drafts as reference 26 material or to cite them other than as "work in progress." 28 The list of current Internet-Drafts can be accessed at 29 http://www.ietf.org/ietf/1id-abstracts.txt 31 The list of Internet-Draft Shadow Directories can be accessed at 32 http://www.ietf.org/shadow.html 34 This Internet-Draft will expire on May 4, 2017. 36 Copyright Notice 38 Copyright (c) 2016 IETF Trust and the persons identified as the 39 document authors. All rights reserved. 
41 This document is subject to BCP 78 and the IETF Trust's Legal 42 Provisions Relating to IETF Documents 43 (http://trustee.ietf.org/license-info) in effect on the date of 44 publication of this document. Please review these documents 45 carefully, as they describe your rights and restrictions with respect 46 to this document. Code Components extracted from this document must 54 include Simplified BSD License text as described in Section 4.e of 55 the Trust Legal Provisions and are provided without warranty as 56 described in the Simplified BSD License. 58 Abstract 60 This specification describes Generic UDP Encapsulation (GUE), which 61 is a scheme for using UDP to encapsulate packets of different IP 62 protocols for transport across layer 3 networks. By encapsulating 63 packets in UDP, specialized capabilities in networking hardware for 64 efficient handling of UDP packets can be leveraged. GUE specifies 65 basic encapsulation methods upon which higher level constructs, such as 66 tunnels and overlay networks for network virtualization, can be 67 constructed. GUE is extensible by allowing optional data fields as 68 part of the encapsulation, and is generic in that it can encapsulate 69 packets of various IP protocols. 71 Table of Contents 73 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 5 74 1.1 Terminology . . . . . . . . . . . . . . . . . . . . . . . . 5 75 2. Base packet format . . . . . . . . . . . . . . . . . . . . . . 7 76 2.1. GUE version . . . . . . . . . . . . . . . . . . . . . . . . 7 77 3. Version 0 . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 78 3.1. Header format . . . . . . . . . . . . . . . . . . 
. . . . . 8 79 3.2. Proto/ctype field . . . . . . . . . . . . . . . . . . . . . 9 80 3.2.1 Proto field . . . . . . . . . . . . . . . . . . . . . . 9 81 3.2.2 Ctype field . . . . . . . . . . . . . . . . . . . . . . 10 82 3.3. Flags and extension fields . . . . . . . . . . . . . . . . 10 83 3.3.1. Requirements . . . . . . . . . . . . . . . . . . . . . 10 84 3.3.2. Example GUE header with extension fields . . . . . . . 11 85 3.4. Private data . . . . . . . . . . . . . . . . . . . . . . . 12 86 3.5. Message types . . . . . . . . . . . . . . . . . . . . . . . 12 87 3.5.1. Control messages . . . . . . . . . . . . . . . . . . . 12 88 3.5.2. Data messages . . . . . . . . . . . . . . . . . . . . . 13 89 3.6. Hiding the transport layer protocol number . . . . . . . . 13 90 4. Version 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 91 4.1. Direct encapsulation of IPv4 . . . . . . . . . . . . . . . 14 92 4.2. Direct encapsulation of IPv6 . . . . . . . . . . . . . . . 15 93 5. Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 94 5.1. Network tunnel encapsulation . . . . . . . . . . . . . . . 16 95 5.2. Transport layer encapsulation . . . . . . . . . . . . . . . 16 96 5.3. Encapsulator operation . . . . . . . . . . . . . . . . . . 16 97 5.4. Decapsulator operation . . . . . . . . . . . . . . . . . . 17 98 5.4.1. Processing a received data message . . . . . . . . . . 17 99 5.4.2. Processing a received control message . . . . . . . . . 18 100 5.5. Router and switch operation . . . . . . . . . . . . . . . . 18 101 5.6. Middlebox interactions . . . . . . . . . . . . . . . . . . 18 102 5.6.1. Connection semantics . . . . . . . . . . . . . . . . . 19 103 5.6.2. NAT . . . . . . . . . . . . . . . . . . . . . . . . . . 19 104 5.7. Checksum Handling . . . . . . . . . . . . . . . . . . . . . 19 105 5.7.1. Requirements . . . . . . . . . . . . . . . . . . . . . 19 106 5.7.2. UDP Checksum with IPv4 . . . . . . . . . . . . . . . . 20 107 5.7.3. 
UDP Checksum with IPv6 . . . . . . . . . . . . . . . . 20 108 5.8. MTU and fragmentation . . . . . . . . . . . . . . . . . . . 21 109 5.9. Congestion control . . . . . . . . . . . . . . . . . . . . 21 110 5.10. Multicast . . . . . . . . . . . . . . . . . . . . . . . . 21 111 5.11. Flow entropy for ECMP . . . . . . . . . . . . . . . . . . 22 112 5.11.1. Flow classification . . . . . . . . . . . . . . . . . 22 113 5.11.2. Flow entropy properties . . . . . . . . . . . . . . . 23 114 5.12. Negotiation of acceptable flags and extension fields . . . 24 115 6. Motivation for GUE . . . . . . . . . . . . . . . . . . . . . . 24 116 6.1. Benefits of GUE . . . . . . . . . . . . . . . . . . . . . . 24 117 6.2. Comparison of GUE to other encapsulations . . . . . . . . . 25 118 7. Security Considerations . . . . . . . . . . . . . . . . . . . . 26 119 8. IANA Consideration . . . . . . . . . . . . . . . . . . . . . . 27 120 8.1. UDP source port . . . . . . . . . . . . . . . . . . . . . . 27 121 8.2. GUE version number . . . . . . . . . . . . . . . . . . . . 27 122 8.3. Control types . . . . . . . . . . . . . . . . . . . . . . . 27 123 8.4. Flag-fields . . . . . . . . . . . . . . . . . . . . . . . . 28 124 9. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 29 125 10. References . . . . . . . . . . . . . . . . . . . . . . . . . . 29 126 10.1. Normative References . . . . . . . . . . . . . . . . . . . 29 127 10.2. Informative References . . . . . . . . . . . . . . . . . . 30 128 Appendix A: NIC processing for GUE . . . . . . . . . . . . . . . . 32 129 A.1. Receive multi-queue . . . . . . . . . . . . . . . . . . . . 32 130 A.2. Checksum offload . . . . . . . . . . . . . . . . . . . . . 33 131 A.2.1. Transmit checksum offload . . . . . . . . . . . . . . . 33 132 A.2.2. Receive checksum offload . . . . . . . . . . . . . . . 34 133 A.3. Transmit Segmentation Offload . . . . . . . . . . . . . . . 34 134 A.4. Large Receive Offload . . . . . . . . . . . . . . . . . . . 
35 135 Appendix B: Implementation considerations . . . . . . . . . . . . 36 136 B.1. Privileged ports . . . . . . . . . . . . . . . . . . . . . 36 137 B.2. Setting flow entropy as a route selector . . . . . . . . . 36 138 B.3. Hardware protocol implementation considerations . . . . . . 36 139 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 37 141 1. Introduction 143 This specification describes Generic UDP Encapsulation (GUE), which is 144 a general method for encapsulating packets of arbitrary IP protocols 145 within User Datagram Protocol (UDP) [RFC0768] packets. Encapsulating 146 packets in UDP facilitates efficient transport across networks. 147 Networking devices widely provide protocol specific processing and 148 optimizations for UDP (as well as TCP) packets. Packets for atypical 149 IP protocols (those not usually parsed by networking hardware) can be 150 encapsulated in UDP packets to maximize deliverability and to 151 leverage flow specific mechanisms for routing and packet steering. 153 GUE provides an extensible header format for including optional data 154 in the encapsulation header. This data potentially covers items such 155 as a virtual networking identifier, security data for validating or 156 authenticating the GUE header, congestion control data, etc. GUE also 157 allows private optional data in the encapsulation header. This 158 feature can be used by a site or implementation to define local 159 custom optional data, and allows experimentation with options that may 160 eventually become standard. 162 This document does not define any specific GUE extensions. 163 [GUEEXTENS] specifies a set of core extensions and [GUE4NVO3] defines 164 an extension for using GUE with network virtualization. 166 The motivation for the GUE protocol is described in section 6. 
168 1.1 Terminology 170 GUE Generic UDP Encapsulation 172 GUE Header A variable length protocol header that is composed 173 of a primary four byte header and zero or more four 174 byte words for optional header data 176 GUE packet A UDP/IP packet that contains a GUE header and GUE 177 payload within the UDP payload 179 Encapsulator A network node that encapsulates a packet in GUE 181 Decapsulator A network node that decapsulates and processes 182 packets encapsulated in GUE 184 Data message An encapsulated packet in the GUE payload that is 185 addressed to the protocol stack for an associated 186 protocol 188 Control message A formatted message in the GUE payload that is 189 implicitly addressed to a decapsulator to monitor or 190 control the state or behavior of a tunnel 192 Flags A set of bit flags in the primary GUE header 194 Extension field 195 An optional field in a GUE header whose presence is 196 indicated by corresponding flag(s) 198 C-bit A single bit flag in the primary GUE header that 199 indicates whether the GUE packet contains a control 200 message or not. 202 Hlen A field in the primary GUE header that gives the 203 length of the GUE header in 32-bit words, not counting the primary four byte header 205 Proto/ctype A field in the GUE header that holds either the IP 206 protocol number for a data message or a type for a 207 control message 209 Private data Optional data in the GUE header that may be used for 210 private purposes 212 Outer IP header Refers to the outermost IP header of a packet when 213 encapsulating a packet over IP 215 Inner IP header Refers to an encapsulated IP header when an IP 216 packet is encapsulated 218 Outer packet Refers to an encapsulating packet 220 Inner packet Refers to a packet that is encapsulated 222 Tunnel An abstraction of a path across a network that ships 223 packets or protocols across a network that normally 224 wouldn't support them. Tunnels provide communication 225 paths between two endpoints. 
Encapsulation is one 226 common technique used to actualize tunnels 228 Overlay network A computer network that is built on top of another 229 network 231 Underlay network 232 A network over which an overlay network is built 234 2. Base packet format 236 A GUE packet is comprised of a UDP packet whose payload is a GUE 237 header followed by a payload which is either an encapsulated packet 238 of some IP protocol or a control message (like an OAM message). A GUE 239 packet has the general format: 241 +-------------------------------+ 242 | | 243 | UDP/IP header | 244 | | 245 |-------------------------------| 246 | | 247 | GUE Header | 248 | | 249 |-------------------------------| 250 | | 251 | Encapsulated packet | 252 | or control message | 253 | | 254 +-------------------------------+ 256 The GUE header is variable length as determined by the presence of 257 optional extension fields. 259 2.1. GUE version 261 The first two bits of the GUE header contain the GUE protocol version 262 number. The rest of the fields after the GUE version number are 263 defined based on the version number. Versions 0 and 1 are described 264 in this specification; versions 2 and 3 are reserved. 266 3. Version 0 268 Version 0 of GUE defines a generic extensible format to encapsulate 269 packets by Internet protocol number. 271 3.1. 
Header format 273 The header format for version 0 of GUE in UDP is: 275 0 1 2 3 276 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 277 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+\ 278 | Source port | Destination port | | 279 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ UDP 280 | Length | Checksum | | 281 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+/ 282 | 0 |C| Hlen | Proto/ctype | Flags | 283 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 284 | | 285 ~ Extensions Fields (optional) ~ 286 | | 287 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 288 | | 289 ~ Private data (optional) ~ 290 | | 291 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 293 The contents of the UDP header are: 295 o Source port: If connection semantics (section 5.6.1) are applied 296 to an encapsulation, this is set to the source port in the local 297 tuple. When connection semantics are not applied this should be 298 set to a flow entropy value for use with ECMP; the properties of 299 flow entropy are described in section 5.11. 301 o Destination port: If connection semantics (section 5.6.1) are 302 applied to an encapsulation, this is set to the destination port 303 for the tuple. If connection semantics are not applied this is 304 set to the GUE assigned port number, 6080. 306 o Length: Canonical length of the UDP packet (length of UDP header 307 and payload). 309 o Checksum: Standard UDP checksum (handling is described in 310 section 5.7). 312 The GUE header consists of: 314 o Ver: GUE protocol version (0). 316 o C: C-bit. When set indicates a control message, not set 317 indicates a data message. 319 o Hlen: Length in 32-bit words of the GUE header, including 320 optional extension fields but not the first four bytes of the 321 header. Computed as (header_len - 4) / 4. All GUE headers are a 322 multiple of four bytes in length. 
Maximum header length is 128 323 bytes. 325 o Proto/ctype: When the C-bit is set this field contains a control 326 message type for the payload (section 3.2.2). When the C-bit is not 327 set, the field holds the Internet protocol number for the 328 encapsulated packet in the payload (section 3.2.1). The control 329 message or encapsulated packet begins at the offset provided by 330 Hlen. 332 o Flags: Header flags that may be allocated for various purposes 333 and may indicate presence of extension fields. Undefined header 334 flag bits MUST be set to zero on transmission. 336 o Extension Fields: Optional fields whose presence is indicated by 337 corresponding flags. 339 o Private data: Optional private data (see section 3.4). If 340 private data is present it immediately follows the last 341 extension field present in the header. The length of this data 342 is determined by subtracting the starting offset from the header 343 length. 345 3.2. Proto/ctype field 347 The proto/ctype field contains the type of the GUE payload. This can 348 be either an IP protocol number or a control message type number. 349 Intermediate devices may parse the GUE payload per the number in the 350 proto/ctype field, and header flags cannot affect the interpretation 351 of the proto/ctype field. 353 3.2.1 Proto field 355 When the C-bit is not set the proto/ctype field contains an IANA 356 Internet Protocol Number. The protocol number is interpreted relative 357 to the IP protocol that encapsulates the UDP packet (i.e. protocol of 358 the outer IP header). 360 When the outer IP protocol is IPv4 the proto field may be set to any 361 number except for those that refer to IPv6 extension headers or 362 ICMPv6 (number 58). An exception is that the destination 363 options extension header using the PadN option may be used with IPv4 364 as described in section 3.6. The "no next header" protocol number 365 (59) may be used with IPv4 as described below. 
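As an illustration of the field layout in section 3.1 and of how the C-bit selects the meaning of the proto/ctype field, the following Python sketch decodes the primary four byte header of a version 0 GUE packet. The names GueHeader, parse_gue_v0 and payload_kind are hypothetical and not part of this specification, and the sketch does not interpret flags or extension fields.

```python
import struct
from typing import NamedTuple


class GueHeader(NamedTuple):
    """Fields of the primary four byte GUE version 0 header (hypothetical type)."""
    version: int      # two bit GUE version
    control: bool     # C-bit: True for a control message, False for a data message
    hlen: int         # header length in 32-bit words, excluding the first four bytes
    proto_ctype: int  # IP protocol number (data) or control message type (control)
    flags: int        # sixteen flag bits, left uninterpreted in this sketch

    @property
    def header_len(self) -> int:
        # Inverse of Hlen = (header_len - 4) / 4 from section 3.1.
        return 4 + self.hlen * 4


def parse_gue_v0(udp_payload: bytes) -> GueHeader:
    """Decode the primary GUE header found at the start of a UDP payload."""
    if len(udp_payload) < 4:
        raise ValueError("UDP payload too short for a GUE header")
    first, proto_ctype, flags = struct.unpack("!BBH", udp_payload[:4])
    version = first >> 6                      # top two bits of the first byte
    if version != 0:
        raise ValueError(f"not a GUE version 0 header (version {version})")
    hdr = GueHeader(version, bool(first & 0x20), first & 0x1F, proto_ctype, flags)
    if len(udp_payload) < hdr.header_len:
        raise ValueError("Hlen points past the end of the UDP payload")
    return hdr


def payload_kind(hdr: GueHeader) -> str:
    """Describe how the proto/ctype field is to be read for this header."""
    if hdr.control:
        return f"control message type {hdr.proto_ctype}"
    return f"IP protocol {hdr.proto_ctype}"
```

Under these assumptions the encapsulated packet or control message starts at udp_payload[hdr.header_len:]. With a five bit Hlen the largest header is 4 + 31 * 4 = 128 bytes, which agrees with the maximum stated in section 3.1.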
367 When the outer IP protocol is IPv6 the proto field may be set to any 368 defined protocol number except Hop-by-hop options (number 0). If a 369 received GUE packet in IPv6 contains a protocol number that is an 370 extension header (e.g. Destination Options) then the extension header 371 is processed after the GUE header as though the GUE header itself 372 were an extension header. 374 IP protocol number 59 ("No next header") may be set to indicate that 375 the GUE payload does not begin with the header of an IP protocol. 376 This would be the case, for instance, if the GUE payload were a 377 fragment when performing GUE level fragmentation. The interpretation 378 of the payload is performed through other means (such as flags and 379 extension fields), and intermediate devices must not parse packets 380 based on the IP protocol number in this case. 382 3.2.2 Ctype field 384 When the C-bit is set, the proto/ctype field must be set to a valid 385 control message type. A value of zero indicates that the GUE payload 386 requires further interpretation to deduce the control type. This 387 might be the case when the payload is a fragment of a control 388 message, where only the reassembled packet can be interpreted as a 389 control message. 391 Control message types 1 through 127 may be defined in standards. 392 Types 128 through 255 are reserved to be user defined for 393 experimentation or private control messages. 395 This document does not specify any standard control message types 396 other than type 0. 398 3.3. Flags and extension fields 400 Flags and associated extension fields are the primary mechanism of 401 extensibility in GUE. As mentioned in section 3.1, GUE header flags 402 may indicate the presence of optional extension fields in the GUE 403 header. [GUEXTENS] defines a basic set of GUE extensions. 405 3.3.1. Requirements 407 There are sixteen flag bits in the GUE header. A flag may indicate 408 the presence of an extension field. 
The size of an extension field 409 indicated by a flag must be fixed. 411 Flags may be paired together to allow different lengths for an 412 extension field. For example, if two flag bits are paired, a field 413 may possibly be three different lengths. Regardless of how flag bits 414 may be paired, the lengths and offsets of optional fields 415 corresponding to a set of flags must be well defined. 417 Extension fields are placed in order of the flags. New flags are to 418 be allocated from high to low order bit contiguously without holes. 419 Flags allow random access, for instance to inspect the field 420 corresponding to the Nth flag bit, an implementation only considers 421 the previous N-1 flags to determine the offset. Flags after the Nth 422 flag are not pertinent in calculating the offset of an extension 423 field indicated by the Nth flag. Random access of flags and fields 424 permits processing of optional extensions in an order that is 425 independent of their position in the packet. The processing order of 426 extensions defined in [GUEEXTENS] demonstrates this property. 428 Flags (or paired flags) are idempotent such that new flags must not 429 cause reinterpretation of old flags. Also, new flags should not alter 430 interpretation of other elements in the GUE header nor how the 431 message is parsed (for instance, in a data message the proto/ctype 432 field always holds an IP protocol number as an invariant). 434 The set of available flags may be extended in the future by defining 435 a "flag extensions bit" that refers to a field containing a new set 436 of flags. 438 3.3.2. 
Example GUE header with extension fields 440 An example GUE header for a data message encapsulating an IPv4 packet 441 and containing the VNID and Security extension fields (both defined 442 in [GUEXTENS]) is shown below: 444 0 1 2 3 445 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 446 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 447 | 0 |0| 3 | 94 |1|0 0 1| 0 | 448 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 449 | VNID | 450 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 451 | | 452 + Security + 453 | | 454 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 456 In the above example, the first flag bit is set which indicates that 457 the VNID extension is present; this is a 32 bit field. The second 458 through fourth bits of the flags are paired flags that indicate the 459 presence of a security field with seven possible sizes. In this 460 example 001 indicates a sixty-four bit security field. 462 3.4. Private data 464 An implementation may use private data for its own use. The private 465 data immediately follows the last extension field in the GUE header 466 and is not a fixed length. This data is considered part of the GUE 467 header and must be accounted for in header length (Hlen). The length 468 of the private data must be a multiple of four and is determined by 469 subtracting the offset of private data in the GUE header from the 470 header length. Specifically: 472 Private_length = (Hlen * 4) - Length(flags) 474 Where "Length(flags)" returns the sum of lengths of all the extension 475 fields present in the GUE header. When there is no private data 476 present, the length of the private data is zero. 478 The semantics and interpretation of private data are implementation 479 specific. The private data may be structured as necessary, for 480 instance it might contain its own set of flags and extension fields. 
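The length computation above can be sketched in Python as follows. The function name private_data_length and its ext_fields_len parameter are assumptions for illustration; ext_fields_len stands for Length(flags), whose value depends on flag definitions that are outside this document.

```python
def private_data_length(hlen: int, ext_fields_len: int) -> int:
    """Return the private data length in bytes for a GUE version 0 header.

    hlen           -- the Hlen field (32-bit words after the first four bytes)
    ext_fields_len -- sum of the lengths, in bytes, of all extension fields
                      present in the header (Length(flags) in section 3.4)
    """
    if hlen < 0 or ext_fields_len < 0:
        raise ValueError("lengths must be non-negative")
    private_len = hlen * 4 - ext_fields_len
    if private_len < 0:
        # Hlen is too small to hold the extension fields that the flags imply.
        raise ValueError("bad header length: extension fields exceed Hlen")
    if private_len % 4 != 0:
        # Private data must be a multiple of four bytes.
        raise ValueError("private data length is not a multiple of four")
    return private_len
```

For the example header in section 3.3.2, Hlen is 3 and the extension fields occupy 4 + 8 = 12 bytes, so private_data_length(3, 12) yields 0 and no private data is present.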
482 An encapsulator and decapsulator MUST agree on the meaning of private 483 data before using it. The mechanism to achieve this agreement is 484 outside the scope of this document but could include implementation- 485 defined behavior, coordinated configuration, in-band communication 486 using GUE control messages, or out-of-band messages. 488 If a decapsulator receives a GUE packet with private data, it MUST 489 validate the private data appropriately. If a decapsulator does not 490 expect private data from an encapsulator the packet MUST be dropped. 491 If a decapsulator cannot validate the contents of private data per 492 the provided semantics the packet MUST also be dropped. An 493 implementation may place security data in GUE private data which must 494 be verified for packet acceptance. 496 3.5. Message types 498 3.5.1. Control messages 500 Control messages carry formatted messages that are implicitly 501 addressed to the decapsulator to monitor or control the state or 502 behavior of a tunnel (OAM). For instance, an echo request and 503 corresponding echo reply message may be defined to test for liveness. 505 Control messages are indicated in the GUE header when the C-bit is 506 set. The payload is interpreted as a control message with type 507 specified in the proto/ctype field. The format and contents of the 508 control message are indicated by the type and can be variable length. 510 Other than interpreting the proto/ctype field as a control message 511 type, the meaning and semantics of the rest of the elements in the 512 GUE header are the same as those of data messages. Forwarding and 513 routing of control messages should be the same as that of a data 514 message with the same outer IP and UDP header and GUE flags; this 515 ensures that control messages can be created that follow the same 516 path as data messages. 518 3.5.2. 
Data messages 520 Data messages carry encapsulated packets that are addressed to the 521 protocol stack for the associated protocol. Data messages are a 522 primary means of encapsulation and can be used to create tunnels for 523 overlay networks. 525 Data messages are indicated in the GUE header when the C-bit is not set. 526 The payload of a data message is interpreted as an encapsulated 527 packet of an Internet protocol indicated in the proto/ctype field. 528 The encapsulated packet immediately follows the GUE header. 530 3.6. Hiding the transport layer protocol number 532 The GUE header indicates the Internet protocol of the encapsulated 533 packet. This is either contained in the Proto/ctype field of the 534 primary GUE header, or is contained in the Payload Type field of a 535 GUE Transform Field (used to encrypt the payload with DTLS, 536 [GUESEC]). If the protocol number must be obfuscated, that is, if the 537 transport protocol in use must be hidden from the network, then a 538 trivial Destination Options extension header can be used at the beginning of the 539 payload. 541 The PadN destination option can be used to encode the transport 542 protocol as a next header of an extension header (and maintain 543 alignment of encapsulated transport headers). The Proto/ctype field 544 or Payload Type field of the GUE Transform field is set to 60 to 545 indicate that the first encapsulated header is a Destination Options 546 extension header. 548 The format of the extension header is below: 550 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 551 | Next Header | 2 | 1 | 0 | 552 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 554 For IPv4, it is permitted in GUE to use this precise destination 555 option to contain the obfuscated protocol number. In this case the Next 556 Header field must refer to a valid IP protocol for IPv4. No other extension 557 headers or destination options are permitted with IPv4. 559 4. 
Version 1 561 Version 1 of GUE allows direct encapsulation of IPv4 and IPv6 in UDP. 562 In this version there is no GUE header; a UDP packet encapsulates an 563 IP packet. The first two bits of the UDP payload for GUE are the GUE 564 version and coincide with the first two bits of the version number in 565 the IP header. The first two version bits of IPv4 and IPv6 are 01, so 566 GUE version 1 is used for direct IP encapsulation, which makes the two bits 567 of the GUE version also 01. 569 This technique is effectively a means to compress out the GUE header 570 when encapsulating IPv4 or IPv6 packets and there are no flags or 571 extension fields present. This method can be used on the 572 same port number as packets with the GUE header (GUE version 0 573 packets). This technique saves encapsulation overhead on costly links 574 for the common use of IP encapsulation, and also obviates the need to 575 allocate a separate port number for IP-over-UDP encapsulation. 577 4.1. Direct encapsulation of IPv4 579 The format for encapsulating IPv4 directly in UDP is demonstrated 580 below: 582 0 1 2 3 583 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 584 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+\ 585 | Source port | Destination port | | 586 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ UDP 587 | Length | Checksum | | 588 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+/ 589 |0|1|0|0| IHL |Type of Service| Total Length | 590 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 591 | Identification |Flags| Fragment Offset | 592 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 593 | Time to Live | Protocol | Header Checksum | 594 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 595 | Source IPv4 Address | 596 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 597 | Destination IPv4 Address | 598 
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 600 Note that the value 0100 in the IP version field expresses the GUE version as 1 601 (bits 01) and IP version as 4 (bits 0100). 603 4.2. Direct encapsulation of IPv6 605 The format for encapsulating IPv6 directly in UDP is demonstrated 606 below: 608 0 1 2 3 609 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 610 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+\ 611 | Source port | Destination port | | 612 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ UDP 613 | Length | Checksum | | 614 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+/ 615 |0|1|1|0| Traffic Class | Flow Label | 616 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 617 | Payload Length | NextHdr | Hop Limit | 618 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 619 | | 620 + + 621 | | 622 + Outer Source IPv6 Address + 623 | | 624 + + 625 | | 626 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 627 | | 628 + + 629 | | 630 + Outer Destination IPv6 Address + 631 | | 632 + + 633 | | 634 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 636 Note that the value 0110 in the IP version field expresses the GUE version as 1 637 (bits 01) and IP version as 6 (bits 0110). 639 5. Operation 641 The figure below illustrates the use of GUE encapsulation between two 642 hosts. Host 1 is sending packets to host 2. An encapsulator performs 643 encapsulation of packets from host 1. These encapsulated packets 644 traverse the network as UDP packets. At the decapsulator, packets are 645 decapsulated and sent on to host 2. Packet flow in the reverse 646 direction need not be symmetric; GUE encapsulation is not required in 647 the reverse path. 
649 +---------------+ +---------------+ 650 | | | | 651 | Host 1 | | Host 2 | 652 | | | | 653 +---------------+ +---------------+ 654 | ^ 655 V | 656 +---------------+ +---------------+ +---------------+ 657 | | | | | | 658 | Encapsulator |-->| Layer 3 |-->| Decapsulator | 659 | | | Network | | | 660 +---------------+ +---------------+ +---------------+ 662 The encapsulator and decapsulator may be co-resident with the 663 corresponding hosts, or may be on separate nodes in the network. 665 5.1. Network tunnel encapsulation 667 Network tunneling can be achieved by encapsulating layer 2 or layer 3 668 packets. In this case the encapsulator and decapsulator nodes are the 669 tunnel endpoints. These could be routers that provide network tunnels 670 on behalf of communicating hosts. 672 5.2. Transport layer encapsulation 674 When encapsulating layer 4 packets, the encapsulator and decapsulator 675 should be co-resident with the hosts. In this case, the encapsulation 676 headers are inserted between the IP header and the transport packet. 677 The addresses in the IP header refer to both the endpoints of the 678 encapsulation and the endpoints for terminating the transport 679 protocol. Note that the transport layer ports in the encapsulated 680 packet are independent of the UDP ports in the outer packet. 682 Details about performing transport layer encapsulation are discussed 683 in [TOU]. 685 5.3. Encapsulator operation 687 Encapsulators create GUE data messages, set the fields of the UDP 688 header, set flags and optional extension fields in the GUE header, 689 and forward packets to a decapsulator. 691 An encapsulator may be an end host originating the packets of a flow, 692 or may be a network device performing encapsulation on behalf of 693 hosts (routers implementing tunnels for instance). In either case, 694 the intended target (decapsulator) is indicated by the outer 695 destination IP address and destination port in the UDP header.
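As a rough illustration of how an encapsulator sets the fields of the UDP header, the sketch below builds the outer UDP header for a GUE version 1 packet (direct IP encapsulation, so there is no GUE header). The helper name gue_v1_encapsulate, its arguments, and the zero checksum default are assumptions made for this example and are not defined by GUE. Port 6080 is the GUE port listed in section 8.1, and a zero UDP checksum is only unconditionally permitted over IPv4 (section 5.7.1).

```python
import struct

GUE_PORT = 6080  # UDP port assigned to GUE (section 8.1)


def gue_v1_encapsulate(inner_ip_packet: bytes, source_port: int,
                       checksum: int = 0) -> bytes:
    """Hypothetical helper: wrap an IP packet in a UDP header (GUE version 1).

    Returns the UDP header followed by the unmodified inner packet. The
    outer IP header is left to the caller or to the host stack.
    """
    if not inner_ip_packet:
        raise ValueError("empty inner packet")
    # IPv4 (0100) and IPv6 (0110) both begin with bits 01, which the
    # decapsulator reads as GUE version 1.
    ip_version = inner_ip_packet[0] >> 4
    if ip_version not in (4, 6):
        raise ValueError("GUE version 1 carries only IPv4 or IPv6 packets")
    length = 8 + len(inner_ip_packet)  # the UDP header is 8 bytes
    if length > 0xFFFF:
        raise ValueError("packet too large for the UDP length field")
    # Network byte order: source port, destination port, length, checksum.
    udp_header = struct.pack("!HHHH", source_port, GUE_PORT, length, checksum)
    return udp_header + inner_ip_packet
```

The source port is taken as an argument because its value depends on the deployment: either a flow entropy value (section 5.11) or a fixed value when connection semantics apply (section 5.6.1).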
697 If an encapsulator is tunneling packets, that is, encapsulating 698 packets of layer 2 or layer 3 protocols (e.g. EtherIP, IPIP, ESP 699 tunnel mode), it should follow standard conventions for tunneling of 700 one protocol over another. For instance, if an IP packet is being 701 encapsulated in GUE, then the conventions for diffserv interaction [RFC2983] and ECN 702 propagation for tunnels [RFC6040] should be followed. 704 5.4. Decapsulator operation 706 A decapsulator performs decapsulation of GUE packets. A decapsulator 707 is addressed by the outer destination IP address of a GUE packet. 708 The decapsulator validates packets, including fields of the GUE 709 header. 711 If a decapsulator receives a GUE packet with an unsupported version, 712 unknown flag, bad header length (too small for included extension 713 fields), unknown control message type, bad protocol number, an 714 unsupported Proto/ctype, or an otherwise malformed header, it MUST 715 drop the packet. Such events may be logged subject to configuration 716 and rate limiting of logging messages. No error message is returned 717 back to the encapsulator. Note that set flags in GUE that are unknown 718 to a decapsulator MUST NOT be ignored. If a GUE packet is received by 719 a decapsulator with unknown flags, the packet MUST be dropped. 721 5.4.1. Processing a received data message 723 If a valid data message is received, the UDP and GUE headers are 724 removed from the packet. The outer IP header remains intact and the 725 next protocol in the header is set to the protocol from the proto 726 field in the GUE header. The resulting packet is then resubmitted 727 into the protocol stack to process that packet as though it had been 728 received with the protocol in the GUE header. 730 As an example, consider that a data message is received where GUE 731 encapsulates an IP packet.
In this case proto field in the GUE header 732 is set 94 for IPIP: 734 +-------------------------------------+ 735 | IP header (next proto = 17,UDP) | 736 |-------------------------------------| 737 | UDP | 738 |-------------------------------------| 739 | GUE (proto = 94,IPIP) | 740 |-------------------------------------| 741 | IP header and packet | 742 +-------------------------------------+ 744 The receiver removes the UDP and GUE headers and sets the next 745 protocol field in the IP packet to IPIP which is derived from the GUE 746 proto field. The resultant packet would have the format: 748 +-------------------------------------+ 749 | IP header (next proto = 94,IPIP) | 750 |-------------------------------------| 751 | IP header and packet | 752 +-------------------------------------+ 754 This packet is then resubmitted into the protocol stack to be 755 processed as an IPIP packet. 757 5.4.2. Processing a received control message 759 If a valid control message is received the packet must be processed 760 as a control message. The specific processing to be performed depends 761 on the ctype in the GUE header. 763 5.5. Router and switch operation 765 Routers and switches should forward GUE packets as standard UDP/IP 766 packets. The outer five-tuple should contain sufficient information 767 to perform flow classification corresponding to the flow of the inner 768 packet. A switch should not normally need to parse a GUE header, and 769 none of the flags or extension fields in the GUE header should affect 770 routing. 772 An intermediate node SHOULD NOT modify a GUE header or GUE payload 773 when forwarding packets since correctly identifying GUE packets in 774 the network based on port numbers is not robust (see [RFC7605]). An 775 intermediate node may encapsulate a GUE packet in another GUE packet, 776 for instance to implement a network tunnel (i.e. by encapsulating an 777 IP packet with a GUE payload in another IP packet as a GUE payload). 
778 In this case the router takes the role of an encapsulator, and the 779 corresponding decapsulator is the logical endpoint of the tunnel. 780 When encapsulating a GUE packet within another GUE packet, there are 781 no provisions to automatically copy flags or extension fields to the 782 outer GUE header. Each layer of encapsulation is considered 783 independent. 785 5.6. Middlebox interactions 787 A middle box may interpret some flags and extension fields of the GUE 788 header for classification purposes, but is not required to understand 789 any of the flags or extension fields in GUE packets. A middle box 790 must not drop a GUE packet because there are flags unknown to it. The 791 header length in the GUE header allows a middlebox to inspect the 792 payload packet without needing to parse the flags or extension 793 fields. 795 5.6.1. Connection semantics 797 A middlebox may infer bidirectional connection semantics for a UDP 798 flow. For instance a stateful firewall may create a five-tuple rule 799 to match flows on egress, and a corresponding five-tuple rule for 800 matching ingress packets where the roles of source and destination 801 are reversed for the IP addresses and UDP port numbers. To operate in 802 this environment, a GUE tunnel must assume connected semantics 803 defined by the UDP five tuple and the use of GUE encapsulation must 804 be symmetric between both endpoints. The source port set in the UDP 805 header must be the destination port the peer would set for replies. 806 In this case the UDP source port for a tunnel would be a fixed value 807 for a tunnel and not set to be flow entropy as described in section 808 5.11. 810 The selection of whether to make the UDP source port fixed or set to 811 a flow entropy value for each packet sent should be configurable for 812 a tunnel. 814 5.6.2. NAT 816 IP address and port translation can be performed on the UDP/IP 817 headers adhering to the requirements for NAT with UDP [RFC4787]. 
In 818 the case of stateful NAT, connection semantics must be applied to a 819 GUE tunnel as described in section 5.6.1. GUE endpoints may also 820 invoke STUN [RFC5389] or ICE [RFC5245] to manage NAT port mappings 821 for encapsulations. 823 5.7. Checksum Handling 825 The potential for mis-delivery of packets due to corruption of IP, 826 UDP, or GUE headers must be considered. Historically, the UDP 827 checksum would be considered sufficient as a check against corruption 828 of either the UDP header and payload or the IP addresses. 829 Encapsulation protocols, such as GUE, may be originated or terminated 830 on devices incapable of computing the UDP checksum for a packet. This 831 section discusses the requirements around checksums and alternatives 832 that might be used when an endpoint does not support the UDP checksum. 834 5.7.1. Requirements 836 One of the following requirements must be met: 838 o UDP checksums are enabled (for IPv4 or IPv6). 840 o The GUE header checksum is used (defined in [GUEEXTENS]). 842 o Zero UDP checksums are used. This is always permissible with IPv4; with 843 IPv6 they may only be used in accordance with applicable 844 requirements in [GREUDP], [RFC6935], and [RFC6936]. 846 5.7.2. UDP Checksum with IPv4 848 For UDP in IPv4, the UDP checksum MUST be processed as specified in 849 [RFC0768] and [RFC1122] for both transmit and receive. An encapsulator 850 MAY set the UDP checksum to zero for performance or implementation 851 considerations. The IPv4 header includes a checksum that protects 852 against mis-delivery of the packet due to corruption of IP addresses. 853 The UDP checksum potentially provides protection against corruption 854 of the UDP header, GUE header, and GUE payload.
Enabling or disabling 855 the use of checksums is a deployment consideration that should take 856 into account the risk and effects of packet corruption, and whether 857 the packets in the network are already adequately protected by other, 858 possibly stronger mechanisms such as the Ethernet CRC. If an 859 encapsulator sets a zero UDP checksum for IPv4 it SHOULD use the GUE 860 header checksum as described in [GUEEXTENS]. 862 When a decapsulator receives a packet, the UDP checksum field MUST be 863 processed. If the UDP checksum is non-zero, the decapsulator MUST 864 verify the checksum before accepting the packet. By default a 865 decapsulator SHOULD accept UDP packets with a zero checksum. A node 866 MAY be configured to disallow zero checksums per [RFC1122]; this may 867 be done selectively, for instance disallowing zero checksums from 868 certain hosts that are known to be sending over paths subject to 869 packet corruption. If verification of a non-zero checksum fails, a 870 decapsulator lacks the capability to verify a non-zero checksum, or a 871 packet with a zero-checksum was received and the decapsulator is 872 configured to disallow, the packet MUST be dropped. 874 5.7.3. UDP Checksum with IPv6 876 In IPv6 there is no checksum in the IPv6 header that protects against 877 mis-delivery due to address corruption. Therefore, when GUE is used 878 over IPv6, either the UDP checksum must be enabled, the GUE header 879 checksum must be used, or a zero UDP checksum is used if applicable 880 requirements are met. Setting a zero checksum may be desirable for 881 performance or implementation reasons, in which case the GUE header 882 checksum MUST be used or requirements for using zero UDP checksums in 883 [RFC6935] and [RFC6936] MUST be met. If the UDP checksum is enabled, 884 then the GUE header checksum should not be used since it is mostly 885 redundant. 887 When a decapsulator receives a packet, the UDP checksum field MUST be 888 processed. 
If the UDP checksum is non-zero, the decapsulator MUST 889 verify the checksum before accepting the packet. By default a 890 decapsulator MUST only accept UDP packets with a zero checksum if the 891 GUE header checksum is used and is verified. If verification of a 892 non-zero checksum fails, a decapsulator lacks the capability to 893 verify a non-zero checksum, or a packet with a zero-checksum and no 894 GUE header checksum was received, the packet MUST be dropped. 896 5.8. MTU and fragmentation 898 Standard conventions for handling of MTU (Maximum Transmission Unit) 899 and fragmentation in conjunction with networking tunnels 900 (encapsulation of layer 2 or layer 3 packets) should be followed. 901 Details are described in MTU and Fragmentation Issues with In-the-Network 902 Tunneling [RFC4459]. 904 If a packet is fragmented before encapsulation in GUE, all the 905 related fragments must be encapsulated using the same UDP source 906 port. An operator should set MTU to account for encapsulation 907 overhead and reduce the likelihood of fragmentation. 909 As an alternative to IP fragmentation, the GUE fragmentation extension can 910 be used. GUE fragmentation is described in [GUEEXTENS]. 912 5.9. Congestion control 914 Per requirements of [RFC5405], if the IP traffic encapsulated with 915 GUE implements proper congestion control, no additional mechanisms 916 should be required. 918 In the case that the encapsulated traffic does not implement any or 919 sufficient congestion control, or it is not known whether a transmitter will 920 consistently implement proper congestion control, then congestion 921 control at the encapsulation layer MUST be provided per [RFC5405]. Note 922 that this case applies to a significant use case in network virtualization 923 in which guests run third party networking stacks that cannot be 924 implicitly trusted to implement conformant congestion control.
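As one illustration of a coarse control applied at the encapsulation layer, an encapsulator could bound the rate at which it sends encapsulated traffic with a token bucket. The sketch below is not part of GUE and is not prescribed by [RFC5405]. The class name TokenBucket, its parameters, and the injectable clock are assumptions made for this example.

```python
import time
from typing import Optional


class TokenBucket:
    """Hypothetical byte-based token bucket for an encapsulator's egress."""

    def __init__(self, rate: float, burst: float,
                 now: Optional[float] = None) -> None:
        self.rate = rate      # bytes of credit added per second
        self.burst = burst    # maximum credit, in bytes
        self.tokens = burst   # the bucket starts full
        self.last = time.monotonic() if now is None else now

    def allow(self, size: int, now: Optional[float] = None) -> bool:
        """Return True and consume credit if a packet of `size` bytes may be sent."""
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.last)
        # Refill in proportion to elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if size <= self.tokens:
            self.tokens -= size
            return True
        return False
```

A packet for which allow() returns False would be dropped or queued by the encapsulator; which of the two is appropriate is a deployment decision outside the scope of this sketch.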
926 Out of band mechanisms such as rate limiting, Managed Circuit Breaker 927 [CIRCBRK], or traffic isolation may be used to provide rudimentary 928 congestion control. For finer grained congestion control that allows 929 alternate congestion control algorithms, reaction time within an RTT, 930 and interaction with ECN, in-band mechanisms may be warranted. 932 5.10. Multicast 934 GUE packets may be multicast to decapsulators using a multicast 935 destination address in the encapsulating IP headers. Each receiving 936 host will decapsulate the packet independently following normal 937 decapsulator operations. The receiving decapsulators should agree on 938 the same set of GUE parameters and properties; how such an agreement 939 is reached is outside the scope of this document. 941 GUE allows encapsulation of unicast, broadcast, or multicast traffic. 942 Flow entropy (the value in the UDP source port) may be generated from 943 the header of encapsulated unicast or broadcast/multicast packets at 944 an encapsulator. The mapping mechanism between the encapsulated 945 multicast traffic and the multicast capability in the IP network is 946 transparent and independent of the encapsulation and is otherwise 947 outside the scope of this document. 949 5.11. Flow entropy for ECMP 951 5.11.1. Flow classification 953 A major objective of using GUE is that a network device can perform 954 flow classification corresponding to the flow of the inner 955 encapsulated packet based on the contents in the outer headers. 957 Hardware devices commonly perform hash computations on packet headers 958 to classify packets into flows or flow buckets. Flow classification 959 is done to support load balancing of flows across a set of networking 960 resources. Examples of such load balancing techniques are Equal Cost 961 Multipath routing (ECMP), port selection in Link Aggregation, and NIC 962 device Receive Side Scaling (RSS). 
Hashes are usually either a 963 three-tuple hash of IP protocol, source address, and destination 964 address; or a five-tuple hash consisting of IP protocol, source 965 address, destination address, source port, and destination port. 966 Typically, networking hardware will compute five-tuple hashes for TCP 967 and UDP, but only three-tuple hashes for other IP protocols. Since 968 the five-tuple hash provides more granularity, load balancing can be 969 finer grained with better distribution. When a packet is encapsulated 970 with GUE and connection semantics are not applied, the source port in 971 the outer UDP packet is set to a flow entropy value that corresponds 972 to the flow of the inner packet. When a device computes a five-tuple 973 hash on the outer UDP/IP header of a GUE packet, the resultant value 974 classifies the packet per its inner flow. 976 Examples of deriving flow entropy for encapsulation are: 978 o If the encapsulated packet is a layer 4 packet, TCP/IPv4 for 979 instance, the flow entropy could be based on the canonical five-tuple 980 hash of the inner packet. 982 o If the encapsulated packet is an AH transport mode packet with 983 TCP as next header, the flow entropy could be a hash over a 984 three-tuple: TCP protocol and TCP ports of the encapsulated 985 packet. 987 o If a node is encrypting a packet using ESP tunnel mode and GUE 988 encapsulation, the flow entropy could be based on the contents 989 of the clear-text packet. For instance, a canonical five-tuple hash 990 for a TCP/IP packet could be used. 992 [RFC6438] discusses methods to compute a flow entropy value for 993 IPv6 flow labels; those methods can also be used to create flow 994 entropy values for GUE. 996 5.11.2. Flow entropy properties 998 The flow entropy is the value set in the UDP source port of a GUE 999 packet.
Flow entropy in the UDP source port should adhere to the 1000 following properties: 1002 o The value set in the source port should be within the ephemeral 1003 port range (49152 to 65535 [RFC6335]). Since the high order two 1004 bits of the port are set to one this provides fourteen bits of 1005 entropy for the value. 1007 o The flow entropy should have a uniform distribution across 1008 encapsulated flows. 1010 o An encapsulator may occasionally change the flow entropy used 1011 for an inner flow per its discretion (for security, route 1012 selection, etc.). To avoid thrashing or flapping the value, the 1013 flow entropy used for a flow should not change more than once 1014 every thirty seconds (or a configurable value). 1016 o Decapsulators, or any networking devices, should not attempt to 1017 interpret flow entropy as anything more than an opaque value. 1018 Neither should they attempt to reproduce the hash calculation 1019 used by an encapsulator in creating a flow entropy value. They 1020 may use the value to match further received packets for steering 1021 decisions, but cannot assume that the hash uniquely or 1022 permanently identifies a flow. 1024 o Input to the flow entropy calculation is not restricted to ports 1025 and addresses; input could include flow label from an IPv6 1026 packet, SPI from an ESP packet, or other flow related state in 1027 the encapsulator that is not necessarily conveyed in the packet. 1029 o The assignment function for flow entropy should be randomly 1030 seeded to mitigate denial of service attacks. The seed may be 1031 changed periodically. 1033 5.12. Negotiation of acceptable flags and extension fields 1035 An encapsulator and decapsulator must achieve agreement about GUE 1036 parameters that will be used in communications. Parameters include 1037 GUE versions, flags and optional extension fields that can be used, 1038 security algorithms and keys, supported protocols and control 1039 messages, etc.
This document proposes different general methods to 1040 accomplish this, the details of implementing these are considered out 1041 of scope. 1043 General methods for this are: 1045 o Configuration. The parameters used for a tunnel are configured 1046 at each endpoint. 1048 o Negotiation. A tunnel negotiation can be performed. This could 1049 be accomplished in-band of GUE using control messages or private 1050 data. 1052 o Via a control plane. Parameters for communicating with a tunnel 1053 endpoint can be set in a control plane protocol (such as that 1054 needed for nvo3). 1056 o Via security negotiation. If security is used that would 1057 typically imply a key exchange between endpoints. Other GUE 1058 parameters may be conveyed as part of that process. 1060 6. Motivation for GUE 1062 This section presents the motivation for GUE with respect to other 1063 encapsulation methods. 1065 6.1. Benefits of GUE 1067 * GUE is a generic encapsulation protocol. GUE can encapsulate 1068 protocols that are represented by an IP protocol number. This 1069 includes layer 2, layer 3, and layer 4 protocols. 1071 * GUE is an extensible encapsulation protocol. Standardized 1072 optional data such as security, virtual networking identifiers, 1073 fragmentation are being defined. 1075 * GUE allows private data to be sent as part of the encapsulation. 1076 This permits experimentation or customization in deployment. 1078 * GUE allows sending of control messages such as OAM using the 1079 same GUE header format (for routing purposes) as normal data 1080 messages. 1082 * GUE maximizes deliverability of non-UDP and non-TCP protocols. 1084 * GUE provides a means for exposing per flow entropy for ECMP for 1085 atypical protocols such as SCTP, DCCP, ESP, etc. 1087 6.2. Comparison of GUE to other encapsulations 1089 A number of different encapsulation techniques have been proposed for 1090 the encapsulation of one protocol over another. 
EtherIP [RFC3378] 1091 provides layer 2 tunneling of Ethernet frames over IP. GRE [RFC2784], 1092 MPLS [RFC4023], and L2TP [RFC2661] provide methods for tunneling 1093 layer 2 and layer 3 packets over IP. NVGRE [RFC7637] and VXLAN 1094 [RFC7348] are proposals for encapsulation of layer 2 packets for 1095 network virtualization. IPIP [RFC2003] and Generic packet tunneling 1096 in IPv6 [RFC2473] provide methods for tunneling IP packets over IP. 1098 Several proposals exist for encapsulating packets over UDP including 1099 ESP over UDP [RFC3948], TCP directly over UDP [TCPUDP], VXLAN 1100 [RFC7348], LISP [RFC6830] which encapsulates layer 3 packets, 1101 MPLS/UDP [RFC7510], and Generic UDP Encapsulation for IP Tunneling (GRE 1102 over UDP) [GREUDP]. Generic UDP tunneling [GUT] is a proposal similar 1103 to GUE in that it aims to tunnel packets of IP protocols over UDP. 1105 GUE has the following discriminating features: 1107 o UDP encapsulation leverages specialized network device 1108 processing for efficient transport. The semantics for using the 1109 UDP source port for flow entropy as input to ECMP are defined in 1110 section 5.11. 1112 o GUE permits encapsulation of arbitrary IP protocols, which 1113 includes layer 2, 3, and 4 protocols. 1115 o Multiple protocols can be multiplexed over a single UDP port 1116 number. This is in contrast to techniques to encapsulate 1117 protocols over UDP using a protocol specific port number (such 1118 as ESP/UDP, GRE/UDP, SCTP/UDP). GUE provides a uniform and 1119 extensible mechanism for encapsulating various IP protocols in 1120 UDP with minimal overhead (four bytes of additional header). 1122 o GUE is extensible. New flags and extension fields can be 1123 defined. 1125 o The GUE header includes a header length field. This allows a 1126 network node to inspect an encapsulated packet without needing 1127 to parse the full encapsulation header.
1129 o Private data in the encapsulation header allows local 1130 customization and experimentation while being compatible with 1131 processing in network nodes (routers and middleboxes). 1133 o GUE includes both data messages (encapsulation of packets) and 1134 control messages (such as OAM). 1136 o The flags-field model facilitates efficient implementation of 1137 extensibility in hardware. 1139 For instance a TCAM can be used to parse a known set of N flags 1140 where the number of entries in the TCAM is 2^N. 1142 By comparison, the number of TCAM entries needed to parse a set 1143 of N arbitrarily ordered TLVs is: 1145 N! + C(N,N-1)(N-1)! + C(N,N-2)(N-2)! + ... + C(N,2)2! + C(N,1)1! where C(N,k) denotes the binomial coefficient "N choose k". 1147 7. Security Considerations 1149 There are two important considerations of security with respect to 1150 GUE. 1152 o Authentication and integrity of the GUE header. 1154 o Authentication, integrity, and confidentiality of the GUE 1155 payload. 1157 Security is integrated into GUE by the use of GUE security related 1158 extensions; these are defined in [GUEEXTENS]. These extensions 1159 include methods to authenticate the GUE header and encrypt the GUE 1160 payload. 1162 IPsec in transport mode may be used to authenticate or encrypt GUE 1163 packets (GUE header and payload). Existing network security 1164 mechanisms, such as address spoofing detection, DDOS mitigation, and 1165 transparent encrypted tunnels can be applied to GUE packets. 1167 A hash function for computing flow entropy (section 5.11) should be 1168 randomly seeded to mitigate some possible denial of service attacks. 1170 8. IANA Considerations 1172 8.1.
UDP source port 1174 A user UDP port number assignment for GUE has been assigned: 1176 Service Name: gue 1177 Transport Protocol(s): UDP 1178 Assignee: Tom Herbert 1179 Contact: Tom Herbert 1180 Description: Generic UDP Encapsulation 1181 Reference: draft-herbert-gue 1182 Port Number: 6080 1183 Service Code: N/A 1184 Known Unauthorized Uses: N/A 1185 Assignment Notes: N/A 1187 8.2. GUE version number 1189 IANA is requested to set up a registry for the GUE version number. 1190 The GUE version number is 2 bits containing four possible values. 1191 This document defines version 0 and 1. New values are assigned via 1192 Standards Action [RFC5226]. 1194 +----------------+-------------+---------------+ 1195 | Version number | Description | Reference | 1196 +----------------+-------------+---------------+ 1197 | 0 | Version 0 | This document | 1198 | | | | 1199 | 1 | Version 1 | This document | 1200 | | | | 1201 | 2..3 | Unassigned | | 1202 +----------------+-------------+---------------+ 1204 8.3. Control types 1206 IANA is requested to set up a registry for the GUE control types. 1207 Control types are 8 bit values. New values for control types 1-127 1208 are assigned via Standards Action [RFC5226]. 1210 +----------------+------------------+---------------+ 1211 | Control type | Description | Reference | 1212 +----------------+------------------+---------------+ 1213 | 0 | Need further | This document | 1214 | | interpretation | | 1215 | | | | 1216 | 1..127 | Unassigned | | 1217 | | | | 1218 | 128..255 | User defined | This document | 1219 +----------------+------------------+---------------+ 1221 8.4. Flag-fields 1223 IANA is requested to create a "GUE flag-fields" registry to allocate 1224 flags and extension fields used with GUE. This shall be a registry of 1225 bit assignments for flags, length of extension fields for 1226 corresponding flags, and descriptive strings. There are sixteen bits 1227 for primary GUE header flags (bit number 0-15). 
New values are 1228 assigned via Standards Action [RFC5226]. 1230 +-------------+--------------+-------------+--------------------+ 1231 | Flags bits | Field size | Description | Reference | 1232 +-------------+--------------+-------------+--------------------+ 1233 | Bit 0 | 4 bytes | VNID | [GUE4NVO3] | 1234 | | | | | 1235 | Bit 1..3 | 001->8 bytes | Security | [GUEEXTENS] | 1236 | | 010->16 bytes| | | 1237 | | 011->32 bytes| | | 1238 | | | | | 1239 | Bit 4 | 8 bytes | Fragmen- | [GUEEXTENS] | 1240 | | | tation | | 1241 | | | | | 1242 | Bit 5 | 4 bytes | Payload | [GUEEXTENS] | 1243 | | | transform | | 1244 | | | | | 1245 | Bit 6 | 4 bytes | Remote | [GUEEXTENS] | 1246 | | | checksum | | 1247 | | | offload | | 1248 | | | | | 1249 | Bit 7 | 4 bytes | Checksum | [GUEEXTENS] | 1250 | | | | | 1251 | Bit 8..15 | | Unassigned | | 1252 +-------------+--------------+-------------+--------------------+ 1254 New flags are to be allocated from high to low order bit contiguously 1255 without holes. 1257 9. Acknowledgements 1259 The authors would like to thank David Liu, Erik Nordmark, Fred 1260 Templin, Adrian Farrel, and Bob Briscoe for valuable input on this 1261 draft. 1263 10. References 1265 10.1. Normative References 1267 [RFC0768] Postel, J., "User Datagram Protocol", STD 6, RFC 768, DOI 1268 10.17487/RFC0768, August 1980, . 1271 [RFC1122] Braden, R., Ed., "Requirements for Internet Hosts - 1272 Communication Layers", STD 3, RFC 1122, DOI 1273 10.17487/RFC1122, October 1989, . 1276 [RFC2434] Narten, T. and H. Alvestrand, "Guidelines for Writing an 1277 IANA Considerations Section in RFCs", RFC 2434, DOI 1278 10.17487/RFC2434, October 1998, . 1281 [RFC2983] Black, D., "Differentiated Services and Tunnels", RFC 1282 2983, DOI 10.17487/RFC2983, October 2000, . 1285 [RFC6040] Briscoe, B., "Tunnelling of Explicit Congestion 1286 Notification", RFC 6040, DOI 10.17487/RFC6040, November 1287 2010, . 1289 [RFC6935] Eubanks, M., Chimento, P., and M. 
Westerlund, "IPv6 and 1290 UDP Checksums for Tunneled Packets", RFC 6935, DOI 1291 10.17487/RFC6935, April 2013, . 1294 [RFC6936] Fairhurst, G. and M. Westerlund, "Applicability Statement 1295 for the Use of IPv6 UDP Datagrams with Zero Checksums", 1296 RFC 6936, DOI 10.17487/RFC6936, April 2013, 1297 . 1299 [RFC4459] Savola, P., "MTU and Fragmentation Issues with In-the- 1300 Network Tunneling", RFC 4459, DOI 10.17487/RFC4459, April 1301 2006, . 1303 10.2. Informative References 1305 [RFC3828] Larzon, L-A., Degermark, M., Pink, S., Jonsson, L-E., Ed., 1306 and G. Fairhurst, Ed., "The Lightweight User Datagram 1307 Protocol (UDP-Lite)", RFC 3828, July 2004, 1308 . 1310 [RFC7348] Mahalingam, M., Dutt, D., Duda, K., Agarwal, P., Kreeger, 1311 L., Sridhar, T., Bursell, M., and C. Wright, "Virtual 1312 eXtensible Local Area Network (VXLAN): A Framework for 1313 Overlaying Virtualized Layer 2 Networks over Layer 3 1314 Networks", RFC 7348, August 2014, . 1317 [RFC7605] Touch, J., "Recommendations on Using Assigned Transport 1318 Port Numbers", BCP 165, RFC 7605, DOI 10.17487/RFC7605, 1319 August 2015, . 1321 [RFC7637] Garg, P., Ed., and Y. Wang, Ed., "NVGRE: Network 1322 Virtualization Using Generic Routing Encapsulation", RFC 1323 7637, DOI 10.17487/RFC7637, September 2015, 1324 . 1326 [RFC7510] Xu, X., Sheth, N., Yong, L., Callon, R., and D. Black, 1327 "Encapsulating MPLS in UDP", RFC 7510, DOI 1328 10.17487/RFC7510, April 2015, . 1331 [RFC4340] Kohler, E., Handley, M., and S. Floyd, "Datagram 1332 Congestion Control Protocol (DCCP)", RFC 4340, DOI 1333 10.17487/RFC4340, March 2006, . 1336 [RFC4787] Audet, F., Ed., and C. Jennings, "Network Address 1337 Translation (NAT) Behavioral Requirements for Unicast 1338 UDP", BCP 127, RFC 4787, DOI 10.17487/RFC4787, January 1339 2007, . 1341 [RFC5389] Rosenberg, J., Mahy, R., Matthews, P., and D. Wing, 1342 "Session Traversal Utilities for NAT (STUN)", RFC 5389, 1343 DOI 10.17487/RFC5389, October 2008, . 
1346 [RFC5245] Rosenberg, J., "Interactive Connectivity Establishment 1347 (ICE): A Protocol for Network Address Translator (NAT) 1348 Traversal for Offer/Answer Protocols", RFC 5245, DOI 1349 10.17487/RFC5245, April 2010, . 1352 [RFC5405] Eggert, L. and G. Fairhurst, "Unicast UDP Usage Guidelines 1353 for Application Designers", BCP 145, RFC 5405, DOI 1354 10.17487/RFC5405, November 2008, . 1361 [RFC6438] Carpenter, B. and S. Amante, "Using the IPv6 Flow Label 1362 for Equal Cost Multipath Routing and Link Aggregation in 1363 Tunnels", RFC 6438, DOI 10.17487/RFC6438, November 2011, 1364 . 1366 [RFC2003] Perkins, C., "IP Encapsulation within IP", RFC 2003, DOI 1367 10.17487/RFC2003, October 1996, . 1370 [RFC3948] Huttunen, A., Swander, B., Volpe, V., DiBurro, L., and M. 1371 Stenberg, "UDP Encapsulation of IPsec ESP Packets", RFC 1372 3948, DOI 10.17487/RFC3948, January 2005, . 1375 [RFC6830] Farinacci, D., Fuller, V., Meyer, D., and D. Lewis, "The 1376 Locator/ID Separation Protocol (LISP)", RFC 6830, DOI 1377 10.17487/RFC6830, January 2013, . 1380 [RFC3378] Housley, R. and S. Hollenbeck, "EtherIP: Tunneling 1381 Ethernet Frames in IP Datagrams", RFC 3378, DOI 1382 10.17487/RFC3378, September 2002, . 1385 [RFC2784] Farinacci, D., Li, T., Hanks, S., Meyer, D., and P. 1386 Traina, "Generic Routing Encapsulation (GRE)", RFC 2784, 1387 DOI 10.17487/RFC2784, March 2000, . 1390 [RFC4023] Worster, T., Rekhter, Y., and E. Rosen, Ed., 1391 "Encapsulating MPLS in IP or Generic Routing Encapsulation 1392 (GRE)", RFC 4023, DOI 10.17487/RFC4023, March 2005, 1393 . 1395 [RFC2661] Townsley, W., Valencia, A., Rubens, A., Pall, G., Zorn, 1396 G., and B. Palter, "Layer Two Tunneling Protocol "L2TP"", 1397 RFC 2661, DOI 10.17487/RFC2661, August 1999, 1398 .
1400 [GUEEXTENS] Herbert, T., Yong, L., and Templin, F., "Extensions for 1401 Generic UDP Encapsulation" draft-herbert-gue-extensions-00 1403 [GUE4NVO3] Yong, L., Herbert, T., Zia, O., "Generic UDP 1404 Encapsulation (GUE) for Network Virtualization Overlay" 1405 draft-hy-nvo3-gue-4-nvo-03 1407 [GUESEC] Yong, L., Herbert, T., "Generic UDP Encapsulation (GUE) for 1408 Secure Transport" draft-hy-gue-4-secure-transport-03 1410 [TCPUDP] Cheshire, S., Graessley, J., and McGuire, R., 1411 "Encapsulation of TCP and other Transport Protocols over 1412 UDP" draft-cheshire-tcp-over-udp-00 1414 [TOU] Herbert, T., "Transport layer protocols over UDP" draft- 1415 herbert-transports-over-udp-00 1417 [GREUDP] Crabbe, E., Yong, L., Xu, X., and Herbert, T., "Generic 1418 UDP Encapsulation for IP Tunneling" draft-ietf-tsvwg-gre- 1419 in-udp-encap-19 1421 [GUT] Manner, J., Varia, N., and Briscoe, B., "Generic UDP 1422 Tunnelling (GUT)" draft-manner-tsvwg-gut-02.txt 1424 [CIRCBRK] Fairhurst, G., "Network Transport Circuit Breakers", 1425 draft-ietf-tsvwg-circuit-breaker-15 1427 [LCO] Cree, E., https://www.kernel.org/doc/Documentation/ 1428 networking/checksum-offloads.txt 1430 Appendix A: NIC processing for GUE 1432 This appendix provides some guidelines for Network Interface Cards 1433 (NICs) to implement common offloads and accelerations to support GUE. 1434 Note that most of this discussion is generally applicable to other 1435 methods of UDP based encapsulation. 1437 This appendix is informational and does not constitute a normative 1438 part of this document. 1440 A.1. Receive multi-queue 1442 Contemporary NICs support multiple receive descriptor queues (multi- 1443 queue). Multi-queue enables load balancing of network processing for 1444 a NIC across multiple CPUs. On packet reception, a NIC must select 1445 the appropriate queue for host processing. 
Receive Side Scaling (RSS) is a 1446 common method that uses the flow hash for a packet to index an 1447 indirection table where each entry stores a queue number. Flow 1448 Director and Accelerated Receive Flow Steering (aRFS) allow a host to 1449 program the queue that is used for a given flow, which is identified 1450 either by an explicit five-tuple or by the flow's hash. 1452 GUE encapsulation should be compatible with multi-queue NICs that 1453 support five-tuple hash calculation for UDP/IP packets as input to 1454 RSS. The flow entropy in the UDP source port ensures classification 1455 of the encapsulated flow even in the case that the outer source and 1456 destination addresses are the same for all flows (e.g., all flows are 1457 going over a single tunnel). 1459 By default, UDP RSS support is often disabled in NICs to avoid out of 1460 order reception that can occur when UDP packets are fragmented. As 1461 discussed above, fragmentation of GUE packets should be mitigated by 1462 fragmenting packets before entering a tunnel, GUE fragmentation, path 1463 MTU discovery in higher layer protocols, or operators adjusting MTUs. 1464 Other UDP traffic may not implement such procedures to avoid 1465 fragmentation, so enabling UDP RSS support in the NIC is a tradeoff 1466 that should be weighed during configuration. 1468 A.2. Checksum offload 1470 Many NICs provide capabilities to calculate the standard ones complement 1471 payload checksum for packets on transmit or receive. When using GUE 1472 encapsulation, there are at least two checksums that may be of 1473 interest: the encapsulated packet's transport checksum, and the UDP 1474 checksum in the outer header. 1476 A.2.1. Transmit checksum offload 1478 NICs may provide a protocol agnostic method to offload transmit 1479 checksum (NETIF_F_HW_CSUM in Linux parlance) that can be used with 1480 GUE. In this method the host provides checksum related parameters in 1481 a transmit descriptor for a packet. 
These parameters include the 1482 starting offset of data to checksum, the length of data to checksum, 1483 and the offset in the packet where the computed checksum is to be 1484 written. The host initializes the checksum field to the pseudo header 1485 checksum. 1487 In the case of GUE, the checksum for an encapsulated transport layer 1488 packet, a TCP packet for instance, can be offloaded by setting the 1489 appropriate checksum parameters. 1491 NICs typically can offload only one transmit checksum per packet, so 1492 simultaneously offloading both an inner transport packet's checksum 1493 and the outer UDP checksum is likely not possible. 1495 If an encapsulator is co-resident with a host, then checksum offload 1496 may be performed using remote checksum offload (described in 1497 [GUEEXTENS]). Remote checksum offload relies on NIC offload of the 1498 simple UDP/IP checksum, which is commonly supported even in legacy 1499 devices. In remote checksum offload, the outer UDP checksum is set and 1500 the GUE header includes an option indicating the start and offset of 1501 the inner "offloaded" checksum. The inner checksum is initialized to 1502 the pseudo header checksum. When a decapsulator receives a GUE packet 1503 with the remote checksum offload option, it completes the offload 1504 operation by determining the packet checksum from the indicated start 1505 point to the end of the packet, and then adds this into the checksum 1506 field at the offset given in the option. Computing the checksum from 1507 the start to end of packet is efficient if checksum-complete is 1508 provided on the receiver. 1510 Another alternative when an encapsulator is co-resident with a host 1511 is to perform Local Checksum Offload [LCO]. 
In this method the inner 1512 transport layer checksum is offloaded and the outer UDP checksum can 1513 be deduced based on the fact that the portion of the packet covered by 1514 the inner transport checksum will sum to zero (or at least the 1515 bitwise NOT of the inner pseudo header). 1517 A.2.2. Receive checksum offload 1519 GUE is compatible with NICs that perform a protocol agnostic receive 1520 checksum (CHECKSUM_COMPLETE in Linux parlance). In this technique, a 1521 NIC computes a ones complement checksum over all (or some predefined 1522 portion) of a packet. The computed value is provided to the host 1523 stack in the packet's receive descriptor. The host driver can use 1524 this checksum to "patch up" and validate any inner packet transport 1525 checksum, as well as the outer UDP checksum if it is non-zero. 1527 Many legacy NICs don't provide checksum-complete but instead provide 1528 an indication that a checksum has been verified (CHECKSUM_UNNECESSARY 1529 in Linux). Usually, such validation is only done for simple TCP/IP or 1530 UDP/IP packets. If a NIC indicates that a UDP checksum is valid, the 1531 checksum-complete value for the UDP packet is the "not" of the pseudo 1532 header checksum. In this way, checksum-unnecessary can be converted 1533 to checksum-complete. So if the NIC provides checksum-unnecessary for 1534 the outer UDP header in an encapsulation, checksum conversion can be 1535 done so that the checksum-complete value is derived and can be used 1536 by the stack to validate checksums in the encapsulated packet. 1538 A.3. Transmit Segmentation Offload 1540 Transmit Segmentation Offload (TSO) is a NIC feature where a host 1541 provides a large (>MTU size) TCP packet to the NIC, which in turn 1542 splits the packet into separate segments and transmits each one. This 1543 is useful to reduce CPU load on the host. 
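As a rough illustration of the splitting a TSO engine performs for a TCP packet encapsulated in GUE, the following Python sketch computes the per-segment TCP sequence number and outer UDP length. It is a minimal sketch and not an implementation from this document: the names Segment and tso_split are hypothetical, and the header sizes (4-byte GUE header with no optional fields, 40 bytes of inner IPv4 plus TCP headers without options) are assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import List

UDP_HDR_LEN = 8  # fixed size of a UDP header in bytes


@dataclass
class Segment:
    seq: int        # TCP sequence number of the first payload byte
    payload: bytes  # slice of the original TCP payload
    udp_len: int    # value for the outer UDP length field


def tso_split(payload: bytes, start_seq: int, mss: int,
              gue_hdr_len: int = 4,      # assumed: GUE header without optional fields
              inner_hdr_len: int = 40    # assumed: inner IPv4 (20) + TCP (20)
              ) -> List[Segment]:
    """Split a large TCP payload into segments of at most mss bytes."""
    segments = []
    for off in range(0, len(payload), mss):
        chunk = payload[off:off + mss]
        segments.append(Segment(
            # Sequence numbers wrap modulo 2^32.
            seq=(start_seq + off) % 2**32,
            payload=chunk,
            # Outer UDP length covers UDP, GUE, inner headers, and payload.
            udp_len=UDP_HDR_LEN + gue_hdr_len + inner_hdr_len + len(chunk),
        ))
    return segments


segs = tso_split(b"x" * 3000, start_seq=0xFFFFFF00, mss=1400)
```

The sketch only derives the fields that differ between segments. A real engine would also replicate the headers and recompute the checksums that cover the changed fields.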
1545 The process of TSO can be generalized as: 1547 - Split the TCP payload into segments which allow packets with 1548 size less than or equal to MTU. 1550 - For each created segment: 1552 1. Replicate the TCP header and all preceding headers of the 1553 original packet. 1555 2. Set payload length fields in any headers to reflect the 1556 length of the segment. 1558 3. Set TCP sequence number to correctly reflect the offset of 1559 the TCP data in the stream. 1561 4. Recompute and set any checksums that either cover the payload 1562 of the packet or cover a header that was changed by setting a 1563 payload length. 1565 Following this general process, TSO can be extended to support TCP 1566 encapsulation in GUE. For each segment the Ethernet, outer IP, UDP 1567 header, GUE header, inner IP header if tunneling, and TCP headers are 1568 replicated. Any packet length header fields need to be set properly 1569 (including the length in the outer UDP header), and checksums need to 1570 be set correctly (including the outer UDP checksum if being used). 1572 To facilitate TSO with GUE, it is recommended that extension fields 1573 not contain values that must be updated on a per segment 1574 basis; for example, extension fields should not include checksums, 1575 lengths, or sequence numbers that refer to the payload. If the GUE 1576 header does not contain such fields then the TSO engine only needs to 1577 copy the bits in the GUE header when creating each segment and does 1578 not need to parse the GUE header. 1580 A.4. Large Receive Offload 1582 Large Receive Offload (LRO) is a NIC feature where packets of a TCP 1583 connection are reassembled, or coalesced, in the NIC and delivered to 1584 the host as one large packet. This feature can reduce CPU utilization 1585 in the host. 1587 LRO requires significant protocol awareness to be implemented 1588 correctly and is difficult to generalize. Packets in the same flow 1589 need to be unambiguously identified. 
In the presence of tunnels or 1590 network virtualization, this may require more than a five-tuple match 1591 (for instance, packets for flows in two different virtual networks may 1592 have identical five-tuples). Additionally, a NIC needs to perform 1593 validation over packets that are being coalesced, and needs to 1594 fabricate a single meaningful header from all the coalesced packets. 1596 The conservative approach to supporting LRO for GUE would be to 1597 assign packets to the same flow only if they have identical five- 1598 tuples and were encapsulated the same way. That is, the outer IP 1599 addresses, the outer UDP ports, GUE protocol, GUE flags and fields, 1600 and inner five-tuple are all identical. 1602 Appendix B: Implementation considerations 1604 This appendix is informational and does not constitute a normative 1605 part of this document. 1607 B.1. Privileged ports 1609 Using the source port to contain a flow entropy value disallows the 1610 security method of a receiver enforcing that the source port be a 1611 privileged port. Privileged ports are defined by some operating 1612 systems to restrict source port binding. Unix, for instance, 1613 considers port numbers less than 1024 to be privileged. 1615 Enforcing that packets are sent from a privileged port is widely 1616 considered an inadequate security mechanism and has been mostly 1617 deprecated. To approximate this behavior, an implementation could 1618 restrict a user from sending a packet destined to the GUE port 1619 without proper credentials. 1621 B.2. Setting flow entropy as a route selector 1623 An encapsulator generating flow entropy in the UDP source port may 1624 modulate the value to perform a type of multipath source routing. 1625 Assuming that networking switches perform ECMP based on the flow 1626 hash, a sender can affect the path by altering the flow entropy. 
For 1627 instance, a host may store a flow hash in its PCB for an inner flow, 1628 and may alter the value upon detecting that packets are traversing a 1629 lossy path. Changing the flow entropy for a flow should be subject to 1630 hysteresis (at most once every thirty seconds) to limit the number of 1631 out of order packets. 1633 B.3. Hardware protocol implementation considerations 1635 A low level protocol, such as GUE, is likely to be of interest for 1636 support in high speed network devices. Variable length header (VLH) 1637 protocols like GUE are often considered difficult to efficiently 1638 implement in hardware. In order to retain the important 1639 characteristics of an extensible and robust protocol, hardware 1640 vendors may practice "constrained flexibility". In this model, only 1641 certain combinations or protocol header parameterizations are 1642 implemented in hardware fast path. Each such parameterization is 1643 fixed length so that the particular instance can be optimized as a 1644 fixed length protocol. In the case of GUE this constitutes specific 1645 combinations of GUE flags, fields, and next protocol. The selected 1646 combinations would naturally be the most common cases which form the 1647 "fast path", and other combinations are assumed to take the "slow 1648 path". 1650 In time, needs and requirements of the protocol may change, which may 1651 manifest themselves as new parameterizations to be supported in the 1652 fast path. To allow this extensibility, a device practicing 1653 constrained flexibility should allow the fast path parameterizations 1654 to be programmable. 1656 Authors' Addresses 1658 Tom Herbert 1659 Facebook 1660 1 Hacker Way 1661 Menlo Park, CA 94052 1662 US 1664 Email: tom@herbertland.com 1666 Lucy Yong 1667 Huawei USA 1668 5340 Legacy Dr. 1669 Plano, TX 75024 1670 US 1672 Email: lucy.yong@huawei.com 1674 Osama Zia 1675 Microsoft 1676 1 Microsoft Way 1677 Redmond, WA 98029 1678 US 1680 Email: osamaz@microsoft.com