idnits 2.17.1 draft-ietf-nvo3-gue-00.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack a both a reference to RFC 2119 and the recommended RFC 2119 boilerplate, even if it appears to use RFC 2119 keywords. RFC 2119 keyword, line 312: '...and decapsulator MUST agree on the mea...' RFC 2119 keyword, line 317: '... GUE packet with private data, it MUST...' RFC 2119 keyword, line 319: '...capsulator the packet MUST be dropped....' RFC 2119 keyword, line 321: '...ntics the packet MUST also be dropped....' RFC 2119 keyword, line 431: '...o a decapsulator MUST NOT be ignored. ...' (18 more instances...) Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year -- The document date (April 16, 2015) is 3291 days in the past. Is this intentional? 
Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Missing Reference: 'RFC478' is mentioned on line 472, but not defined == Missing Reference: 'RFC6935' is mentioned on line 507, but not defined == Missing Reference: 'GUECSUM' is mentioned on line 520, but not defined == Missing Reference: 'RFC768' is mentioned on line 555, but not defined == Missing Reference: 'RFC1122' is mentioned on line 543, but not defined == Missing Reference: 'RFC2460' is mentioned on line 555, but not defined ** Obsolete undefined reference: RFC 2460 (Obsoleted by RFC 8200) == Missing Reference: 'RFC5405' is mentioned on line 591, but not defined ** Obsolete undefined reference: RFC 5405 (Obsoleted by RFC 8085) == Missing Reference: 'RFC2473' is mentioned on line 712, but not defined == Unused Reference: 'RFC2434' is defined on line 833, but no explicit reference was found in the text == Unused Reference: 'RFC5925' is defined on line 875, but no explicit reference was found in the text ** Obsolete normative reference: RFC 2434 (Obsoleted by RFC 5226) ** Downref: Normative reference to an Informational RFC: RFC 2983 ** Downref: Normative reference to an Informational RFC: RFC 4459 -- Obsolete informational reference (is this intentional?): RFC 6830 (Obsoleted by RFC 9300, RFC 9301) == Outdated reference: A later version (-08) exists of draft-sridharan-virtualization-nvgre-03 == Outdated reference: A later version (-03) exists of draft-hy-gue-4-secure-transport-00 == Outdated reference: A later version (-02) exists of draft-herbert-remotecsumoffload-00 Summary: 6 errors (**), 0 flaws (~~), 14 warnings (==), 2 comments (--). Run idnits with the --verbose option for more detailed information about the items above. 
-------------------------------------------------------------------------------- 2 Network Virtualization Overlays (nvo3) T. Herbert 3 Internet-Draft Facebook 4 Intended status: Standard track L. Yong 5 Expires October 18, 2015 Huawei USA 6 O. Zia 7 Microsoft 8 April 16, 2015 10 Generic UDP Encapsulation 11 draft-ietf-nvo3-gue-00 13 Status of this Memo 15 This Internet-Draft is submitted in full conformance with the 16 provisions of BCP 78 and BCP 79. 18 Internet-Drafts are working documents of the Internet Engineering 19 Task Force (IETF), its areas, and its working groups. Note that 20 other groups may also distribute working documents as Internet- 21 Drafts. 23 Internet-Drafts are draft documents valid for a maximum of six months 24 and may be updated, replaced, or obsoleted by other documents at any 25 time. It is inappropriate to use Internet-Drafts as reference 26 material or to cite them other than as "work in progress." 28 The list of current Internet-Drafts can be accessed at 29 http://www.ietf.org/ietf/1id-abstracts.txt 31 The list of Internet-Draft Shadow Directories can be accessed at 32 http://www.ietf.org/shadow.html 34 This Internet-Draft will expire on September 7, 2015. 36 Copyright Notice 38 Copyright (c) 2014 IETF Trust and the persons identified as the 39 document authors. All rights reserved. 41 This document is subject to BCP 78 and the IETF Trust's Legal 42 Provisions Relating to IETF Documents 43 (http://trustee.ietf.org/license-info) in effect on the date of 44 publication of this document. Please review these documents 45 carefully, as they describe your rights and restrictions with respect 46 to this document. 
Code Components extracted from this document must 54 include Simplified BSD License text as described in Section 4.e of 55 the Trust Legal Provisions and are provided without warranty as 56 described in the Simplified BSD License. 58 Abstract 60 This specification describes Generic UDP Encapsulation (GUE), which 61 is a scheme for using UDP to encapsulate packets of arbitrary IP 62 protocols for transport across layer 3 networks. By encapsulating 63 packets in UDP, specialized capabilities in networking hardware for 64 efficient handling of UDP packets can be leveraged. GUE specifies 65 basic encapsulation methods upon which higher level constructs, such as 66 tunnels and overlay networks for network virtualization, can be 67 constructed. GUE is extensible by allowing optional data fields as 68 part of the encapsulation, and is generic in that it can encapsulate 69 packets of various IP protocols. 71 Table of Contents 73 1. Introduction 74 2. Packet formats 75 2.1. GUE header preamble 76 2.2. GUE header 77 2.3. Flags and optional fields 78 2.4 Private data 79 3. Message types 80 3.1. Control messages 81 3.2. Data messages 82 4. Operation 83 4.1. Network tunnel encapsulation 84 4.2. Transport layer encapsulation 85 4.3. Encapsulator operation 86 4.4. Decapsulator operation 87 4.5. Router and switch operation 88 4.6. 
Middlebox interactions 89 4.7. NAT 90 4.8. Checksum Handling 91 4.8.1. Checksum requirements 92 4.8.2. GUE header checksum 93 4.8.3. UDP Checksum with IPv4 94 4.8.4. UDP Checksum with IPv6 95 4.9. MTU and fragmentation issues 96 4.10 Congestion control 97 5. Inner flow identifier properties 98 5.1. Flow classification 99 5.2. Inner flow identifier properties 100 6. Motivation for GUE 101 7. Security Considerations 102 7.1. GUE security fields 103 7.2. GUE and IPsec 104 8. IANA Consideration 105 9. Acknowledgements 106 10. References 107 10.1. Normative References 108 10.2. Informative References 109 Appendix A: NIC processing for GUE 110 A.1. Receive multi-queue 111 A.2. Checksum offload 112 A.2.1. Transmit checksum offload 113 A.2.2. Receive checksum offload 114 A.3. Transmit Segmentation Offload 115 A.4. Large Receive Offload 
116 Appendix B: Privileged ports 117 Appendix C: Inner flow identifier as a route selector 118 Appendix D: Hardware protocol implementation considerations 119 Authors' Addresses 121 1. Introduction 123 This specification describes Generic UDP Encapsulation (GUE), which is 124 a general method for encapsulating packets of arbitrary IP protocols 125 within User Datagram Protocol (UDP) [RFC0768] packets. Encapsulating 126 packets in UDP facilitates efficient transport across networks. 127 Networking devices widely provide protocol specific processing and 128 optimizations for UDP (as well as TCP) packets. Packets for atypical 129 IP protocols (those not usually parsed by networking hardware) can be 130 encapsulated in UDP packets to maximize deliverability and to 131 leverage flow specific mechanisms for routing and packet steering. 133 GUE provides an extensible header format for including optional data 134 in the encapsulation header. This data potentially covers items such 135 as a virtual networking identifier, security data for validating or 136 authenticating the GUE header, congestion control data, etc. GUE also 137 allows private optional data in the encapsulation header. This 138 feature can be used by a site or implementation to define local 139 custom optional data, and allows experimentation with options that may 140 eventually become standard. 142 2. Packet formats 144 A GUE packet consists of a UDP packet whose payload is a GUE 145 header followed by a payload which is either an encapsulated packet 146 of some IP protocol or a control message (like an OAM message). 
A GUE 147 packet has the general format: 149 +-------------------------------+ 150 | | 151 | UDP/IP header | 152 | | 153 |-------------------------------| 154 | | 155 | GUE Header | 156 | | 157 |-------------------------------| 158 | | 159 | Encapsulated packet | 160 | or control message | 161 | | 162 +-------------------------------+ 164 The GUE header is variable length as determined by the presence of 165 optional fields. 167 2.1. GUE header preamble 169 The first byte of the GUE header provides the GUE protocol version 170 number, indicator of a control or data message, and header length: 172 0 173 0 1 2 3 4 5 6 7 174 +-+-+-+-+-+-+-+-+ 175 |Ver|C| Hlen | 176 +-+-+-+-+-+-+-+-+ 178 Contents are: 180 o Ver: GUE protocol version. The rest of the fields after the 181 preamble are defined based on the version. This field is two 182 bits allowing four possible values. 184 o Control flag: When set indicates a control message, not set 185 indicates a data message. 187 o Hlen: Length in 32-bit words of the GUE header, including 188 optional fields but not the first four bytes of the header. 189 Computed as (header_len - 4) / 4. All GUE headers are a multiple 190 of four bytes in length. Hlen is a five-bit field, so the maximum header length is 128 bytes (4 + 31 * 4). 192 2.2. 
GUE header 194 The header format for version 0x0 of GUE in UDP is: 196 0 1 2 3 197 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 198 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 199 | Source port | Destination port | 200 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 201 | Length | Checksum | 202 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 203 |0x0|C| Hlen | Proto/ctype | Flags |E| 204 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 205 | | 206 ~ Fields (optional) ~ 207 | | 208 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 209 | Extension flags (optional) | 210 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 211 | | 212 ~ Extension fields (optional) ~ 213 | | 214 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 215 | | 216 ~ Private data (optional) ~ 217 | | 218 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 220 The contents of the UDP header are: 222 o Source port (inner flow identifier): This should be set to a 223 value that represents the encapsulated flow. The properties of 224 the inner flow identifier are described below. 226 o Destination port: The GUE assigned port number, 6080. 228 o Length: Canonical length of the UDP packet (length of UDP header 229 and payload). 231 o Checksum: Standard UDP checksum. 233 The GUE header consists of: 235 o Preamble byte: Version number (0x0), C bit, and header length. 237 o Proto/ctype: When the C bit is set this field contains a control 238 message type for the payload. When C bit is not set, the field 239 holds the IP protocol number for the encapsulated packet in the 240 payload. The control message or encapsulated packet begins at 241 the offset provided by Hlen. 243 o Flags. Header flags that may be allocated for various purposes 244 and may indicate presence of optional fields. Undefined header 245 flag bits must be set to zero on transmission. 
247 o 'E' Extension flag. Indicates presence of extension flags option 248 in the optional fields. 250 o Fields: Optional fields whose presence is indicated by 251 corresponding flags. 253 o Extension flags: An optional field indicated by the E bit. This 254 field provides an additional set of 32 header bit flags for the 255 header. 257 o Extension fields: Optional fields whose presence is indicated by 258 corresponding extension flags. 260 o Private data: Optional private data. If private data is present 261 it immediately follows the last field present in the header. 262 The length of this data is determined by subtracting the 263 starting offset from the header length. 265 2.3. Flags and optional fields 267 Flags and associated optional fields are the primary mechanism of 268 extensibility in GUE. There are sixteen flag bits in the primary GUE 269 header with one being reserved to indicate that an optional extension 270 flags field is present. The extension flags field contains an 271 additional thirty-two flag bits. 273 A flag may indicate presence of optional fields. The size of an 274 optional field indicated by a flag must be fixed. 276 Flags may be paired together to allow different lengths for an 277 optional field. For example, if two flag bits are paired, a field may 278 have one of three different lengths. Regardless of how flag bits may 279 be paired, the lengths and offsets of optional fields corresponding 280 to a set of flags must be well defined. 282 Optional fields are placed in order of the flags. New flags should be 283 allocated from high to low order bit contiguously without holes. 284 Flags allow random access: for instance, to inspect the field 285 corresponding to the Nth flag bit, an implementation only considers 286 the previous N-1 flags to determine the offset. Flags after the Nth 287 flag are not pertinent in calculating the offset of the Nth flag. 
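The random-access property can be illustrated with a short, non-normative sketch in Python. It assumes version 0 of the header as laid out in section 2.2, with flag 0 taken to be the highest-order bit of the 16-bit flags field and the E bit the lowest-order bit. The table FIELD_SIZES and the function names are hypothetical: this document does not assign any flags, so the field sizes below are invented purely for illustration.

```python
import struct

# Hypothetical field sizes in bytes, indexed by flag number (0 = highest-order
# flag bit). These values are assumptions for illustration only.
FIELD_SIZES = [4, 8, 4]


def parse_base_header(gue: bytes):
    """Split the first four bytes of a GUE header into its elements."""
    if len(gue) < 4:
        raise ValueError("truncated GUE header")
    first, proto_ctype, flags = struct.unpack("!BBH", gue[:4])
    version = first >> 6            # two-bit Ver field
    control = bool(first & 0x20)    # C bit
    hlen = first & 0x1F             # five-bit Hlen, in 32-bit words
    return version, control, hlen, proto_ctype, flags


def field_offset(flags: int, n: int, sizes=FIELD_SIZES) -> int:
    """Byte offset, from the start of the GUE header, of the field for flag n.

    Only the flags before flag n are examined; later flags are ignored.
    """
    if not flags & (0x8000 >> n):
        raise ValueError("flag %d is not set" % n)
    offset = 4  # optional fields start after the first four bytes
    for i in range(n):
        if flags & (0x8000 >> i):
            offset += sizes[i]
    return offset
```

With a flags value of 0xA000 (flags 0 and 2 set), field_offset(0xA000, 2) skips the 4-byte field of flag 0, ignores the unset flag 1, and returns 8. A real implementation would also check that the computed offset plus the field size does not exceed 4 + Hlen * 4.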
289 Flags (or paired flags) are idempotent such that new flags should not 290 cause reinterpretation of old flags. Also, new flags should not alter 291 interpretation of other elements in the GUE header nor how the 292 message is parsed (for instance, in a data message the proto/ctype 293 field always holds an IP protocol number as an invariant). 295 2.4 Private data 297 An implementation may use private data for its own purposes. The private 298 data immediately follows the last field in the GUE header and is not 299 a fixed length. This data is considered part of the GUE header and 300 must be accounted for in header length (Hlen). The length of the 301 private data must be a multiple of four and is determined by 302 subtracting the offset of private data in the GUE header from the 303 header length. Specifically: 305 Private_length = (Hlen * 4) - Length(flags) 307 Where "Length(flags)" returns the sum of lengths of all the optional 308 fields present in the GUE header. When there is no private data 309 present, the length of the private data is zero. 311 The semantics and interpretation of private data are implementation 312 specific. An encapsulator and decapsulator MUST agree on the meaning 313 of private data before using it. The private data may be structured 314 as necessary, for instance it might contain its own set of flags and 315 optional fields. 317 If a decapsulator receives a GUE packet with private data, it MUST 318 validate the private data appropriately. If a decapsulator does not 319 expect private data from an encapsulator the packet MUST be dropped. 320 If a decapsulator cannot validate the contents of private data per 321 the provided semantics the packet MUST also be dropped. An 322 implementation may place security data in GUE private data which must 323 be verified for packet acceptance. 325 3. Message types 327 3.1. Control messages 329 Control messages are indicated in the GUE header when the C bit is 330 set. 
The payload is interpreted as a control message with type 331 specified in the proto/ctype field. The format and contents of the 332 control message are indicated by the type and can be variable length. 334 Other than interpreting the proto/ctype field as a control message 335 type, the meaning and semantics of the rest of the elements in the 336 GUE header are the same as those of data messages. Forwarding and 337 routing of control messages should be the same as that of a data 338 message with the same outer IP and UDP header and GUE flags; this 339 ensures that a control message can be created which follows the same 340 path as a data message. 342 Control messages can be defined for OAM type messages. For instance, 343 an echo request and corresponding echo reply message may be defined 344 to test for liveness. 346 3.2. Data messages 348 Data messages are indicated in the GUE header when the C bit is not set. The 349 payload of a data message is interpreted as an encapsulated packet of 350 an IP protocol indicated in the proto/ctype field. The packet 351 immediately follows the GUE header. 353 Data messages are a primary means of encapsulation and can be used to 354 create tunnels for overlay networks. 356 4. Operation 358 The figure below illustrates the use of GUE encapsulation between two 359 servers. Server 1 is sending packets to server 2. An encapsulator 360 performs encapsulation of packets from server 1. These encapsulated 361 packets traverse the network as UDP packets. At the decapsulator, 362 packets are decapsulated and sent on to server 2. Packet flow in the 363 reverse direction need not be symmetric; GUE encapsulation is not 364 required in the reverse path. 
366 +---------------+ +---------------+ 367 | | | | 368 | Server 1 | | Server 2 | 369 | | | | 370 +---------------+ +---------------+ 371 | ^ 372 V | 373 +---------------+ +---------------+ +---------------+ 374 | | | | | | 375 | Encapsulator |-->| Layer 3 |-->| Decapsulator | 376 | | | Network | | | 377 +---------------+ +---------------+ +---------------+ 379 The encapsulator and decapsulator may be co-resident with the 380 corresponding servers, or may be on separate nodes in the network. 382 4.1. Network tunnel encapsulation 384 Network tunneling can be achieved by encapsulating layer 2 or layer 3 385 packets. In this case the encapsulator and decapsulator nodes are the 386 tunnel endpoints. These could be routers that provide network tunnels 387 on behalf of communicating servers. 389 4.2. Transport layer encapsulation 391 When encapsulating layer 4 packets, the encapsulator and decapsulator 392 should be co-resident with the servers. In this case, the 393 encapsulation headers are inserted between the IP header and the 394 transport packet. The addresses in the IP header refer to both the 395 endpoints of the encapsulation and the endpoints for terminating 396 the transport protocol. 398 4.3. Encapsulator operation 400 Encapsulators create GUE data messages, set the source port to the 401 inner flow identifier, set flags and optional fields in the GUE 402 header, and forward packets to a decapsulator. 404 An encapsulator may be an end host originating the packets of a flow, 405 or may be a network device performing encapsulation on behalf of 406 servers (routers implementing tunnels for instance). In either case, 407 the intended target (decapsulator) is indicated by the outer 408 destination IP address. 410 If an encapsulator is tunneling packets, that is, encapsulating 411 packets of layer 2 or layer 3 protocols (e.g. EtherIP, IPIP, ESP 412 tunnel mode), it should follow standard conventions for tunneling of 413 one IP protocol over another. 
Diffserv interaction with tunnels is 414 described in [RFC2983], ECN propagation for tunnels is described in 415 [RFC6040]. 417 4.4. Decapsulator operation 419 A decapsulator performs decapsulation of GUE packets. A decapsulator 420 is addressed by the outer destination IP address of a GUE packet. 421 The decapsulator validates packets, including fields of the GUE 422 header. If a packet is acceptable, the UDP and GUE headers are 423 removed and the packet is resubmitted for IP protocol processing or 424 control message processing if it is a control message. 426 If a decapsulator receives a GUE packet with an unsupported version, 427 unknown flag, bad header length (too small for included optional 428 fields), unknown control message type, or an otherwise malformed 429 header, it must drop the packet and may log the event. No error 430 message is returned back to the encapsulator. Note that set flags in 431 GUE that are unknown to a decapsulator MUST NOT be ignored. If a GUE 432 packet is received by a decapsulator with unknown flags, the packet 433 MUST be dropped. 435 4.5. Router and switch operation 437 Routers and switches should forward GUE packets as standard UDP/IP 438 packets. The outer five-tuple should contain sufficient information 439 to perform flow classification corresponding to the flow of the inner 440 packet. A switch should not normally need to parse a GUE header, and 441 none of the flags or optional fields in the GUE header should affect 442 routing. 444 A router should not modify a GUE header when forwarding a packet. It 445 may encapsulate a GUE packet in another GUE packet, for instance to 446 implement a network tunnel. In this case the router takes the role of 447 an encapsulator, and the corresponding decapsulator is the logical 448 endpoint of the tunnel. 450 4.6. 
Middlebox interactions 452 A middlebox may interpret some flags and optional fields of the GUE 453 header for classification purposes, but is not required to understand 454 all flags and fields in GUE packets. A middlebox should not drop a 455 GUE packet because there are flags unknown to it. The header length 456 in the GUE header allows a middlebox to inspect the payload packet 457 without needing to parse the flags or optional fields. 459 A middlebox may infer bidirectional connection semantics for a UDP 460 flow. For instance a stateful firewall may create a five-tuple rule 461 to match flows on egress, and a corresponding five-tuple rule for 462 matching ingress packets where the roles of source and destination 463 are reversed for the IP addresses and UDP port numbers. To operate in 464 this environment, a GUE tunnel must assume connected semantics 465 defined by the UDP five-tuple and the use of GUE encapsulation must 466 be symmetric between both endpoints. The source port set in the UDP 467 header must be the destination port the peer would set for replies. 469 4.7. NAT 471 IP address and port translation can be performed on the UDP/IP 472 headers adhering to the requirements for NAT with UDP [RFC478]. In 473 the case of stateful NAT, connection semantics must be applied to a 474 GUE tunnel as described above. 476 When using transport mode encapsulation and traversing a NAT, the IP 477 addresses may be changed such that the pseudo header checksum used 478 for checksum calculation is modified and the checksum will be found 479 invalid at the receiver. To compensate for this, a GUE option can be 480 added which contains the checksum over the source and destination 481 addresses when the packet is transmitted. Upon receiving this option, 482 the delta of the pseudo header checksum is computed by subtracting 483 the checksum over the source and destination addresses from the 484 checksum value in the option. 
The resultant value is then added into 485 the checksum calculation when validating the inner transport checksum. 487 4.8. Checksum Handling 489 This section describes the requirements around the UDP checksum and 490 GUE header checksum. Checksums are an important consideration in that 491 they can provide end to end validation and protect against 492 packet mis-delivery. The latter is made possible by the inclusion of a 493 pseudo header that covers the IP addresses and UDP ports of the 494 encapsulating headers. 496 4.8.1. Checksum requirements 498 The potential for mis-delivery of packets due to corruption of IP, 499 UDP, or GUE headers must be considered. One of the following 500 requirements must be met: 502 o UDP checksums are enabled (for IPv4 or IPv6). 504 o The GUE header checksum is used. 506 o Zero UDP checksums are used in accordance with applicable 507 requirements in [GREUDP], [RFC6935], and [RFC6936]. 509 4.8.2. GUE header checksum 511 The GUE header checksum provides a UDP-lite [RFC3828] type of 512 checksum capability as an optional field of the GUE header. The GUE 513 header checksum minimally covers the GUE header and a GUE pseudo 514 header. The GUE pseudo header includes the corresponding IP 515 addresses as well as the UDP ports of the encapsulating headers. 516 This checksum should provide adequate protection against address 517 corruption in IPv6 when the UDP checksum is zero. Additionally, the 518 GUE checksum provides protection of the GUE header when the UDP 519 checksum is set to zero with either IPv4 or IPv6. The GUE header 520 checksum is defined in [GUECSUM]. 522 4.8.3. UDP Checksum with IPv4 524 For UDP in IPv4, the UDP checksum MUST be processed as specified in 525 [RFC768] and [RFC1122] for both transmit and receive. An 526 encapsulator MAY set the UDP checksum to zero for performance or 527 implementation considerations. 
The IPv4 header includes a checksum 528 which protects against mis-delivery of the packet due to corruption 529 of IP addresses. The UDP checksum potentially provides protection 530 against corruption of the UDP header, GUE header, and GUE payload. 531 Enabling or disabling the use of checksums is a deployment 532 consideration that should take into account the risk and effects of 533 packet corruption, and whether the packets in the network are 534 already adequately protected by other, possibly stronger mechanisms 535 such as the Ethernet CRC. If an encapsulator sets a zero UDP 536 checksum for IPv4 it SHOULD use the GUE header checksum as described 537 in section 4.8.2. 539 When a decapsulator receives a packet, the UDP checksum field MUST 540 be processed. If the UDP checksum is non-zero, the decapsulator MUST 541 verify the checksum before accepting the packet. By default a 542 decapsulator SHOULD accept UDP packets with a zero checksum. A node 543 MAY be configured to disallow zero checksums per [RFC1122]; this may 544 be done selectively, for instance disallowing zero checksums from 545 certain hosts that are known to be sending over paths subject to 546 packet corruption. If verification of a non-zero checksum fails, a 547 decapsulator lacks the capability to verify a non-zero checksum, or 548 a packet with a zero-checksum was received and the decapsulator is 549 configured to disallow, the packet MUST be dropped and an event MAY 550 be logged. 552 4.8.4. UDP Checksum with IPv6 554 For UDP in IPv6, the UDP checksum MUST be processed as specified in 555 [RFC768] and [RFC2460] for both transmit and receive. Unlike IPv4, 556 there is no header checksum in IPv6 that protects against mis- 557 delivery due to address corruption. Therefore, when GUE is used over 558 IPv6, either the UDP checksum must be enabled or the GUE header 559 checksum must be used. 
An encapsulator MAY set a zero UDP checksum 560 for performance or implementation reasons, in which case the GUE 561 header checksum MUST be used or applicable requirements for using 562 zero UDP checksums in [GREUDP] MUST be met. If the UDP checksum is 563 enabled, then the GUE header checksum should not be used since it is 564 mostly redundant. 566 When a decapsulator receives a packet, the UDP checksum field MUST 567 be processed. If the UDP checksum is non-zero, the decapsulator MUST 568 verify the checksum before accepting the packet. By default a 569 decapsulator MUST only accept UDP packets with a zero checksum if 570 the GUE header checksum is used and is verified. If verification of 571 a non-zero checksum fails, a decapsulator lacks the capability to 572 verify a non-zero checksum, or a packet with a zero-checksum and no 573 GUE header checksum was received, the packet MUST be dropped and an 574 event MAY be logged. 576 4.9. MTU and fragmentation issues 578 Standard conventions for handling of MTU (Maximum Transmission Unit) 579 and fragmentation in conjunction with networking tunnels 580 (encapsulation of layer 2 or layer 3 packets) should be followed. 581 Details are described in MTU and Fragmentation Issues with In-the- 582 Network Tunneling [RFC4459] 584 If a packet is fragmented before encapsulation in GUE, all the 585 related fragments must be encapsulated using the same source port 586 (inner flow identifier). An operator may set MTU to account for 587 encapsulation overhead and reduce the likelihood of fragmentation. 589 4.10 Congestion control 591 Per requirements of [RFC5405], if the IP traffic encapsulated with 592 GUE implements proper congestion control no additional mechanisms 593 should be required. 
595 In the case that the encapsulated traffic does not implement any or 596 sufficient congestion control, or it is not known whether a transmitter will 597 consistently implement proper congestion control, then congestion 598 control at the encapsulation layer must be provided. Note this case 599 applies to a significant use case in network virtualization in which 600 guests run third party networking stacks that cannot be implicitly 601 trusted to implement conformant congestion control. 603 Out of band mechanisms such as rate limiting, Managed Circuit 604 Breaker, or traffic isolation may be used to provide rudimentary 605 congestion control. For finer grained congestion control that allows 606 alternate congestion control algorithms, reaction time within an 607 RTT, and interaction with ECN, in band mechanisms may be warranted. 609 DCCP may be used to provide congestion control for encapsulated 610 flows. In this case, the protocol stack for an IP tunnel may be IP- 611 GUE-DCCP-IP. Alternatively, GUE can be extended to include 612 congestion control (related data carried in GUE optional fields). 613 Congestion control mechanisms for GUE will be elaborated in other 614 specifications. 616 5. Inner flow identifier properties 618 5.1. Flow classification 620 A major objective of using GUE is that a network device can perform 621 flow classification corresponding to the flow of the inner 622 encapsulated packet based on the contents in the outer headers. 624 Hardware devices commonly perform hash computations on packet 625 headers to classify packets into flows or flow buckets. Flow 626 classification is done to support load balancing (statistical 627 multiplexing) of flows across a set of networking resources. 628 Examples of such load balancing techniques are Equal Cost Multipath 629 routing (ECMP), port selection in Link Aggregation, and NIC device 630 Receive Side Scaling (RSS). 
Hashes are usually either a three-tuple
631  hash of IP protocol, source address, and destination address; or a
632  five-tuple hash consisting of IP protocol, source address,
633  destination address, source port, and destination port. Typically,
634  networking hardware will compute five-tuple hashes for TCP and UDP,
635  but only three-tuple hashes for other IP protocols. Since the five-
636  tuple hash provides more granularity, load balancing can be finer
637  grained with better distribution. When a packet is encapsulated with
638  GUE, the source port in the outer UDP packet is set to reflect the
639  flow of the inner packet. When a device computes a five-tuple hash
640  on the outer UDP/IP header of a GUE packet, the resultant value
641  classifies the packet per its inner flow.

643  To support flow classification, the source port of the UDP header in
644  GUE is set to a value that maps to the inner flow. This is referred
645  to as the inner flow identifier. The inner flow identifier is set by
646  the encapsulator; it can be computed on the fly based on packet
647  contents or retrieved from a state maintained for the inner flow.

649  Examples of deriving an inner flow identifier are:

651  o If the encapsulated packet is a layer 4 packet, TCP/IPv4 for
652    instance, the inner flow identifier could be based on the
653    canonical five-tuple hash of the inner packet.

655  o If the encapsulated packet is an AH transport mode packet with
656    TCP as next header, the inner flow identifier could be a hash
657    over a three-tuple: TCP protocol and TCP ports of the
658    encapsulated packet.

660  o If a node is encrypting a packet using ESP tunnel mode and GUE
661    encapsulation, the inner flow identifier could be based on the
662    contents of the clear-text packet. For instance, a canonical five-
663    tuple hash for a TCP/IP packet could be used.

665  5.2. Inner flow identifier properties

667  The inner flow identifier is the value set in the UDP source
668  port of a GUE packet.
The inner flow identifier should adhere to
669  the following properties:

671  o The value set in the source port should be within the ephemeral
672    port range. IANA suggests this range to be 49152 to 65535, where
673    the high order two bits of the port are set to one. This
674    provides fourteen bits of entropy for the inner flow identifier.

676  o The inner flow identifier should have a uniform distribution
677    across encapsulated flows.

679  o An encapsulator may occasionally change the inner flow
680    identifier used for an inner flow per its discretion (for
681    security, route selection, etc.). Changing the value should
682    happen no more than once every thirty seconds.

684  o Decapsulators, or any networking devices, should not attempt any
685    interpretation of the inner flow identifier, nor should they
686    attempt to reproduce any hash calculation. They may use the
687    value to match further received packets for steering decisions,
688    but cannot assume that the hash uniquely or permanently
689    identifies a flow.

691  o Input to the inner flow identifier is not restricted to ports
692    and addresses; input could include the flow label from an IPv6
693    packet, the SPI from an ESP packet, or other flow related state in
694    the encapsulator that is not necessarily conveyed in the packet.

696  o The assignment function for inner flow identifiers should be
697    randomly seeded to mitigate denial of service attacks. The seed
698    may be changed periodically.

700  6. Motivation for GUE

702  This section presents the motivation for GUE with respect to other
703  encapsulation methods.

705  A number of different encapsulation techniques have been proposed for
706  the encapsulation of one protocol over another. EtherIP [RFC3378]
707  provides layer 2 tunneling of Ethernet frames over IP. GRE [RFC2784],
708  MPLS [RFC4023], and L2TP [RFC2661] provide methods for tunneling
709  layer 2 and layer 3 packets over IP.
NVGRE [NVGRE] and VXLAN
710  [RFC7348] are proposals for encapsulation of layer 2 packets for
711  network virtualization. IPIP [RFC2003] and Generic packet tunneling
712  in IPv6 [RFC2473] provide methods for tunneling IP packets over IP.

714  Several proposals exist for encapsulating packets over UDP including
715  ESP over UDP [RFC3948], TCP directly over UDP [TCPUDP], VXLAN, LISP
716  [RFC6830] which encapsulates layer 3 packets, and Generic UDP
717  Encapsulation for IP Tunneling (GRE over UDP) [GREUDP]. Generic UDP
718  tunneling [GUT] is a proposal similar to GUE in that it aims to
719  tunnel packets of IP protocols over UDP.

721  GUE has the following distinguishing features:

723  o UDP encapsulation leverages specialized network device
724    processing for efficient transport. The semantics for using the
725    UDP source port as an identifier for an inner flow are defined.

727  o GUE permits encapsulation of arbitrary IP protocols, which
728    includes layer 2, 3, and 4 protocols. This potentially allows
729    nearly all traffic within a data center to be normalized to be
730    either TCP or UDP on the wire.

732  o Multiple protocols can be multiplexed over a single UDP port
733    number. This is in contrast to techniques to encapsulate
734    protocols over UDP using a protocol specific port number (such
735    as ESP/UDP, GRE/UDP, SCTP/UDP). GUE provides a uniform and
736    extensible mechanism for encapsulating all IP protocols in UDP
737    with minimal overhead (four bytes of additional header).

739  o GUE is extensible. New flags and optional fields can be defined.

741  o The GUE header includes a header length field. This allows a
742    network node to inspect an encapsulated packet without needing
743    to parse the full encapsulation header.

745  o Private data in the encapsulation header allows local
746    customization and experimentation while being compatible with
747    processing in network nodes (routers and middleboxes).
749  o GUE includes both data messages (encapsulation of packets) and
750    control messages (such as OAM).

752  7. Security Considerations

754  Encapsulation of IP protocols within GUE should not increase
755  security risk, nor provide additional security in itself. As
756  suggested in section 5, the source port of UDP packets in GUE
757  should be randomly seeded to mitigate some possible denial of
758  service attacks.

760  GUE is most useful when it is in the outermost header of a
761  packet, which allows for flow hash calculation as well as making
762  GUE header data (such as virtual network identifier) visible to
763  switches and middleboxes. GUE must be amenable to encapsulating
764  (and being encapsulated within) IPsec. Also, we allow provisions
765  to secure the GUE header itself without an external protocol.

767  Security for Generic UDP Encapsulation is described in more
768  detail in [GUESEC].

770  7.1. GUE security fields

772  Security fields should be used to provide integrity and
773  authentication of the GUE header. Security negotiation
774  (algorithms, interpretation of security field, key management,
775  etc.) is expected to be done out of band between hosts.

777  7.2. GUE and IPsec

779  GUE may be used to encapsulate IPsec packets. This allows the
780  benefits of deriving a flow hash for the inner, potentially
781  encrypted, packet. In this case the protocol stack may be:

783  +-------------------------------+
784  |                               |
785  |         UDP/IP header         |
786  |                               |
787  |-------------------------------|
788  |                               |
789  |          GUE Header           |
790  |                               |
791  |-------------------------------|
792  |                               |
793  |    ESP/AH/private security    |
794  |                               |
795  |-------------------------------|
796  |                               |
797  |      Encapsulated packet      |
798  |                               |
799  +-------------------------------+

801  Note that IPsec would not cover the GUE header in this case
802  (does not authenticate it for instance). GUE security optional
803  fields may be used to provide authentication or integrity of the
804  GUE header.

806  8.
IANA Considerations

808  A user UDP port number has been assigned for GUE:

810  Service Name: gue
811  Transport Protocol(s): UDP
812  Assignee: Tom Herbert
813  Contact: Tom Herbert
814  Description: Generic UDP Encapsulation
815  Reference: draft-herbert-gue
816  Port Number: 6080
817  Service Code: N/A
818  Known Unauthorized Uses: N/A
819  Assignment Notes: N/A

821  9. Acknowledgements

823  The authors would like to thank David Liu for valuable input on
824  this draft.

826  10. References

828  10.1. Normative References

830  [RFC0768] Postel, J., "User Datagram Protocol", STD 6, RFC 768,
831  August 1980.

833  [RFC2434] Narten, T. and H. Alvestrand, "Guidelines for Writing an
834  IANA Considerations Section in RFCs", RFC 2434, October 1998.

836  [RFC2983] Black, D., "Differentiated Services and Tunnels", RFC 2983,
837  October 2000.

839  [RFC6040] Briscoe, B., "Tunnelling of Explicit Congestion
840  Notification", RFC 6040, November 2010.

842  [RFC6936] Fairhurst, G. and M. Westerlund, "Applicability Statement
843  for the Use of IPv6 UDP Datagrams with Zero Checksums", RFC 6936,
844  April 2013.

846  [RFC4459] Savola, P., "MTU and Fragmentation Issues with In-the-
847  Network Tunneling", RFC 4459, April 2006.

849  10.2. Informative References

851  [RFC2003] Perkins, C., "IP Encapsulation within IP", RFC 2003,
852  October 1996.

854  [RFC3948] Huttunen, A., Swander, B., Volpe, V., DiBurro, L., and M.
855  Stenberg, "UDP Encapsulation of IPsec ESP Packets", RFC 3948, January
856  2005.

858  [RFC6830] Farinacci, D., Fuller, V., Meyer, D., and D. Lewis, "The
859  Locator/ID Separation Protocol (LISP)", RFC 6830, January 2013.

861  [RFC3378] Housley, R. and S. Hollenbeck, "EtherIP: Tunneling Ethernet
862  Frames in IP Datagrams", RFC 3378, September 2002.

864  [RFC2784] Farinacci, D., Li, T., Hanks, S., Meyer, D., and P. Traina,
865  "Generic Routing Encapsulation (GRE)", RFC 2784, March 2000.

867  [RFC4023] Worster, T., Rekhter, Y., and E.
Rosen, Ed., "Encapsulating
868  MPLS in IP or Generic Routing Encapsulation (GRE)", RFC 4023, March
869  2005.

871  [RFC2661] Townsley, W., Valencia, A., Rubens, A., Pall, G., Zorn, G.,
872  and B. Palter, "Layer Two Tunneling Protocol "L2TP"", RFC 2661,
873  August 1999.

875  [RFC5925] Touch, J., Mankin, A., and R. Bonica, "The TCP
876  Authentication Option", RFC 5925, June 2010.

878  [RFC3828] Larzon, L-A., Degermark, M., Pink, S., Jonsson, L-E., Ed.,
879  and G. Fairhurst, Ed., "The Lightweight User Datagram Protocol (UDP-
880  Lite)", RFC 3828, July 2004.

883  [RFC7348] Mahalingam, M., Dutt, D., Duda, K., Agarwal, P., Kreeger,
884  L., Sridhar, T., Bursell, M., and C. Wright, "Virtual eXtensible
885  Local Area Network (VXLAN): A Framework for Overlaying Virtualized
886  Layer 2 Networks over Layer 3 Networks", RFC 7348, August 2014.

889  [NVGRE] NVGRE: Network Virtualization using Generic Routing
890  Encapsulation draft-sridharan-virtualization-nvgre-03

892  [TCPUDP] Encapsulation of TCP and other Transport Protocols over UDP
893  draft-cheshire-tcp-over-udp-00

895  [GREUDP] Generic UDP Encapsulation for IP Tunneling draft-yong-tsvwg-
896  gre-in-udp-encap-02

898  [GUESEC] Yong, L., Herbert, T., "Generic UDP Encapsulation (GUE) for
899  Secure Transport", draft-hy-gue-4-secure-transport-00, work in
900  progress.

902  [GUT] Generic UDP Tunnelling (GUT) draft-manner-tsvwg-gut-02.txt

904  [REMCSUM] Remote Checksum Offload draft-herbert-remotecsumoffload-00

906  Appendix A: NIC processing for GUE

908  This appendix provides some guidelines for Network Interface Cards
909  (NICs) to implement common offloads and accelerations to support GUE.
910  Note that most of this discussion is generally applicable to other
911  methods of UDP based encapsulation.

913  A.1. Receive multi-queue

915  Contemporary NICs support multiple receive descriptor queues (multi-
916  queue). Multi-queue enables load balancing of network processing for
917  a NIC across multiple CPUs.
On packet reception, a NIC must select
918  the appropriate queue for host processing. Receive Side Scaling is a
919  common method which uses the flow hash for a packet to index an
920  indirection table where each entry stores a queue number. Flow
921  Director and Accelerated Receive Flow Steering (aRFS) allow a host to
922  program the queue that is used for a given flow which is identified
923  either by an explicit five-tuple or by the flow's hash.

925  GUE encapsulation should be compatible with multi-queue NICs that
926  support five-tuple hash calculation for UDP/IP packets as input to
927  RSS. The inner flow identifier (source port) ensures classification
928  of the encapsulated flow even in the case that the outer source and
929  destination addresses are the same for all flows (e.g. all flows are
930  going over a single tunnel).

932  By default, UDP RSS support is often disabled in NICs to avoid out of
933  order reception that can occur when UDP packets are fragmented. As
934  discussed above, fragmentation of GUE packets should be mitigated by
935  fragmenting packets before entering a tunnel, path MTU discovery in
936  higher layer protocols, or operator adjusting MTUs. Other UDP traffic
937  may not implement such procedures to avoid fragmentation, so enabling
938  UDP RSS support in the NIC should be a considered tradeoff during
939  configuration.

941  A.2. Checksum offload

943  Many NICs provide capabilities to calculate standard ones complement
944  payload checksum for packets in transmit or receive. When using GUE
945  encapsulation there are at least two checksums that may be of
946  interest: the encapsulated packet's transport checksum, and the UDP
947  checksum in the outer header.

949  A.2.1. Transmit checksum offload

951  NICs may provide a protocol agnostic method to offload transmit
952  checksum (NETIF_F_HW_CSUM in Linux parlance) that can be used with
953  GUE.
In this method the host provides checksum related parameters in
954  a transmit descriptor for a packet. These parameters include the
955  starting offset of data to checksum, the length of data to checksum,
956  and the offset in the packet where the computed checksum is to be
957  written. The host initializes the checksum field to the pseudo header
958  checksum.

960  In the case of GUE, the checksum for an encapsulated transport layer
961  packet, a TCP packet for instance, can be offloaded by setting the
962  appropriate checksum parameters.

964  NICs typically can offload only one transmit checksum per packet, so
965  simultaneously offloading both an inner transport packet's checksum
966  and the outer UDP checksum is likely not possible. In this case
967  setting UDP checksum to zero (per above discussion) and offloading
968  the inner transport packet checksum might be acceptable.

970  If an encapsulator is co-resident with a host, then checksum offload
971  may be performed using remote checksum offload [REMCSUM]. Remote
972  checksum offload relies on NIC offload of the simple UDP/IP checksum
973  which is commonly supported even in legacy devices. In remote
974  checksum offload the outer UDP checksum is set and the GUE header
975  includes an option indicating the start and offset of the inner
976  "offloaded" checksum. The inner checksum is initialized to the pseudo
977  header checksum. When a decapsulator receives a GUE packet with the
978  remote checksum offload option, it completes the offload operation by
979  determining the packet checksum from the indicated start point to the
980  end of the packet, and then adds this into the checksum field at the
981  offset given in the option. Computing the checksum from the start to
982  end of packet is efficient if checksum-complete is provided on the
983  receiver.

985  A.2.2. Receive checksum offload

987  GUE is compatible with NICs that perform a protocol agnostic receive
988  checksum (CHECKSUM_COMPLETE in Linux parlance).
In this technique, a
989  NIC computes a ones complement checksum over all (or some predefined
990  portion) of a packet. The computed value is provided to the host
991  stack in the packet's receive descriptor. The host driver can use
992  this checksum to "patch up" and validate any inner packet transport
993  checksum, as well as the outer UDP checksum if it is non-zero.

995  Many legacy NICs don't provide checksum-complete but instead provide
996  an indication that a checksum has been verified (CHECKSUM_UNNECESSARY
997  in Linux). Usually, such validation is only done for simple TCP/IP or
998  UDP/IP packets. If a NIC indicates that a UDP checksum is valid, the
999  checksum-complete value for the UDP packet is the "not" of the pseudo
1000  header checksum. In this way, checksum-unnecessary can be converted
1001  to checksum-complete. So if the NIC provides checksum-unnecessary for
1002  the outer UDP header in an encapsulation, checksum conversion can be
1003  done so that the checksum-complete value is derived and can be used
1004  by the stack to validate any checksums in the encapsulated packet.

1006  A.3. Transmit Segmentation Offload

1008  Transmit Segmentation Offload (TSO) is a NIC feature where a host
1009  provides a large (>MTU size) TCP packet to the NIC, which in turn
1010  splits the packet into separate segments and transmits each one. This
1011  is useful to reduce CPU load on the host.

1013  The process of TSO can be generalized as:

1015  - Split the TCP payload into segments which allow packets with
1016    size less than or equal to MTU.

1018  - For each created segment:

1020    1. Replicate the TCP header and all preceding headers of the
1021       original packet.

1023    2. Set payload length fields in any headers to reflect the
1024       length of the segment.

1026    3. Set TCP sequence number to correctly reflect the offset of
1027       the TCP data in the stream.

1029    4.
Recompute and set any checksums that either cover the payload
1030       of the packet or cover a header that was changed by setting a
1031       payload length.

1033  Following this general process, TSO can be extended to support TCP
1034  encapsulation in GUE. For each segment the Ethernet, outer IP, UDP
1035  header, GUE header, inner IP header if tunneling, and TCP headers are
1036  replicated. Any packet length header fields need to be set properly
1037  (including the length in the outer UDP header), and checksums need to
1038  be set correctly (including the outer UDP checksum if being used).

1040  To facilitate TSO with GUE it is recommended that optional fields
1041  should not contain values that must be updated on a per segment
1042  basis-- for example the GUE fields should not include checksums,
1043  lengths, or sequence numbers that refer to the payload. If the GUE
1044  header does not contain such fields then the TSO engine only needs to
1045  copy the bits in the GUE header when creating each segment and does
1046  not need to parse the GUE header.

1048  A.4. Large Receive Offload

1050  Large Receive Offload (LRO) is a NIC feature where packets of a TCP
1051  connection are reassembled, or coalesced, in the NIC and delivered to
1052  the host as one large packet. This feature can reduce CPU utilization
1053  in the host.

1055  LRO requires significant protocol awareness to be implemented
1056  correctly and is difficult to generalize. Packets in the same flow
1057  need to be unambiguously identified. In the presence of tunnels or
1058  network virtualization, this may require more than a five-tuple match
1059  (for instance packets for flows in two different virtual networks may
1060  have identical five-tuples). Additionally, a NIC needs to perform
1061  validation over packets that are being coalesced, and needs to
1062  fabricate a single meaningful header from all the coalesced packets.
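
The identification problem described above can be sketched as follows.
This is a minimal illustration under stated assumptions, not a
description of any NIC: the names FiveTuple, LroFlowKey, and
may_coalesce, the key layout, and all sample values are hypothetical
and do not come from this draft. It shows two packets whose inner
five-tuples are identical but which must not be coalesced because
their encapsulation context differs.

```python
from typing import NamedTuple


class FiveTuple(NamedTuple):
    """Inner packet five-tuple (illustrative representation)."""
    protocol: int
    src_addr: str
    dst_addr: str
    src_port: int
    dst_port: int


class LroFlowKey(NamedTuple):
    """Hypothetical LRO match key: encapsulation context plus inner five-tuple."""
    outer_src_addr: str
    outer_dst_addr: str
    outer_src_port: int          # carries the GUE inner flow identifier
    outer_dst_port: int          # GUE destination port
    gue_protocol: int
    gue_flags_and_fields: bytes  # raw flags and optional fields, compared bytewise
    inner: FiveTuple


def may_coalesce(a: LroFlowKey, b: LroFlowKey) -> bool:
    """Coalesce only when every component of the key is identical."""
    return a == b


# Two packets with the same inner five-tuple in (assumed) different
# virtual networks, distinguished here only by the GUE field bytes.
inner = FiveTuple(6, "10.0.0.1", "10.0.0.2", 40000, 80)
key_a = LroFlowKey("192.0.2.1", "192.0.2.2", 51000, 6080, 4,
                   b"\x00\x00\x00\x01", inner)
key_b = key_a._replace(gue_flags_and_fields=b"\x00\x00\x00\x02")

same_inner = key_a.inner == key_b.inner       # True: five-tuples collide
coalescable = may_coalesce(key_a, key_b)      # False: context differs
```

Comparing the GUE flags and fields as opaque bytes is a deliberate
simplification: it avoids parsing the optional fields at the cost of
treating any difference in them as a different flow.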
1064  The conservative approach to supporting LRO for GUE would be to
1065  assign packets to the same flow only if they have identical five-
1066  tuple and were encapsulated the same way. That is, the outer IP
1067  addresses, the outer UDP ports, GUE protocol, GUE flags and fields,
1068  and inner five tuple are all identical.

1070  Appendix B: Privileged ports

1072  Using the source port to contain an inner flow identifier value
1073  disallows the security method of a receiver enforcing that the source
1074  port be a privileged port. Privileged ports are defined by some
1075  operating systems to restrict source port binding. Unix, for
1076  instance, considers port numbers less than 1024 to be privileged.

1078  Enforcing that packets are sent from a privileged port is widely
1079  considered an inadequate security mechanism and has been mostly
1080  deprecated. To approximate this behavior, an implementation could
1081  restrict a user from sending a packet destined to the GUE port
1082  without proper credentials.

1084  Appendix C: Inner flow identifier as a route selector

1086  An encapsulator generating an inner flow identifier may modulate the
1087  value to perform a type of multipath source routing. Assuming that
1088  networking switches perform ECMP based on the flow hash, a sender can
1089  affect the path by altering the inner flow identifier. For instance,
1090  a host may store a flow hash in its PCB for an inner flow, and may
1091  alter the value upon detecting that packets are traversing a lossy
1092  path. Changing the inner flow identifier for a flow should be subject
1093  to hysteresis (at most once every thirty seconds) to limit the number
1094  of out of order packets.

1096  Appendix D: Hardware protocol implementation considerations

1098  A low level protocol, such as GUE, is likely to be of interest for
1099  support by high speed network devices.
Variable length header (VLH)
1100  protocols like GUE are often considered difficult to efficiently
1101  implement in hardware. In order to retain the important
1102  characteristics of an extensible and robust protocol, hardware
1103  vendors may practice "constrained flexibility". In this model, only
1104  certain combinations or protocol header parameterizations are
1105  implemented in hardware fast path. Each such parameterization is
1106  fixed length so that the particular instance can be optimized as a
1107  fixed length protocol. In the case of GUE this constitutes specific
1108  combinations of GUE flags, fields, and next protocol. The selected
1109  combinations would naturally be the most common cases which form the
1110  "fast path", and other combinations are assumed to take the "slow
1111  path".

1113  In time, needs and requirements of the protocol may change, which may
1114  manifest themselves as new parameterizations to be supported in the
1115  fast path. To allow this extensibility, a device practicing
1116  constrained flexibility should allow the fast path parameterizations
1117  to be programmable.

1119  Authors' Addresses

1121  Tom Herbert
1122  Facebook
1123  1 Hacker Way
1124  Menlo Park, CA 94052
1125  US

1127  Email: tom@herbertland.com

1129  Lucy Yong
1130  Huawei USA
1131  5340 Legacy Dr.
1132  Plano, TX 75024
1133  US

1135  Email: lucy.yong@huawei.com

1137  Osama Zia
1138  Microsoft
1139  1 Microsoft Way
1140  Redmond, WA 98029
1141  US

1143  Email: osamaz@microsoft.com