idnits 2.17.1 draft-herbert-gue-03.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack a both a reference to RFC 2119 and the recommended RFC 2119 boilerplate, even if it appears to use RFC 2119 keywords. RFC 2119 keyword, line 302: '...and decapsulator MUST agree on the mea...' RFC 2119 keyword, line 307: '... GUE packet with private data, it MUST...' RFC 2119 keyword, line 309: '...capsulator the packet MUST be dropped....' RFC 2119 keyword, line 311: '...ntics the packet MUST also be dropped....' RFC 2119 keyword, line 421: '...o a decapsulator MUST NOT be ignored. ...' (18 more instances...) Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year -- The document date (March 6, 2015) is 3338 days in the past. Is this intentional? 
Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Missing Reference: 'RFC478' is mentioned on line 462, but not defined == Missing Reference: 'RFC6935' is mentioned on line 497, but not defined == Missing Reference: 'GUECSUM' is mentioned on line 510, but not defined == Missing Reference: 'RFC768' is mentioned on line 545, but not defined == Missing Reference: 'RFC1122' is mentioned on line 533, but not defined == Missing Reference: 'RFC2460' is mentioned on line 545, but not defined ** Obsolete undefined reference: RFC 2460 (Obsoleted by RFC 8200) == Missing Reference: 'RFC5405' is mentioned on line 581, but not defined ** Obsolete undefined reference: RFC 5405 (Obsoleted by RFC 8085) == Missing Reference: 'RFC2473' is mentioned on line 702, but not defined == Unused Reference: 'RFC2434' is defined on line 823, but no explicit reference was found in the text == Unused Reference: 'RFC5925' is defined on line 865, but no explicit reference was found in the text ** Obsolete normative reference: RFC 2434 (Obsoleted by RFC 5226) ** Downref: Normative reference to an Informational RFC: RFC 2983 ** Downref: Normative reference to an Informational RFC: RFC 4459 -- Obsolete informational reference (is this intentional?): RFC 6830 (Obsoleted by RFC 9300, RFC 9301) == Outdated reference: A later version (-08) exists of draft-sridharan-virtualization-nvgre-03 == Outdated reference: A later version (-03) exists of draft-hy-gue-4-secure-transport-00 == Outdated reference: A later version (-02) exists of draft-herbert-remotecsumoffload-00 Summary: 6 errors (**), 0 flaws (~~), 14 warnings (==), 2 comments (--). Run idnits with the --verbose option for more detailed information about the items above. 
-------------------------------------------------------------------------------- 2 Internet Draft T. Herbert 3 Google 4 Category: Standard track L. Yong 5 Expires September 2015 Huawei USA 6 O. Zia 7 Microsoft 8 March 6, 2015 10 Generic UDP Encapsulation 11 13 Status of this Memo 15 This Internet-Draft is submitted in full conformance with the 16 provisions of BCP 78 and BCP 79. 18 Internet-Drafts are working documents of the Internet Engineering 19 Task Force (IETF), its areas, and its working groups. Note that 20 other groups may also distribute working documents as Internet- 21 Drafts. 23 Internet-Drafts are draft documents valid for a maximum of six months 24 and may be updated, replaced, or obsoleted by other documents at any 25 time. It is inappropriate to use Internet-Drafts as reference 26 material or to cite them other than as "work in progress." 28 The list of current Internet-Drafts can be accessed at 29 http://www.ietf.org/ietf/1id-abstracts.txt 31 The list of Internet-Draft Shadow Directories can be accessed at 32 http://www.ietf.org/shadow.html 34 This Internet-Draft will expire on September 7, 2015. 36 Copyright Notice 38 Copyright (c) 2014 IETF Trust and the persons identified as the 39 document authors. All rights reserved. 41 This document is subject to BCP 78 and the IETF Trust's Legal 42 Provisions Relating to IETF Documents 43 (http://trustee.ietf.org/license-info) in effect on the date of 44 publication of this document. Please review these documents 45 carefully, as they describe your rights and restrictions with respect 46 to this document. 48 Abstract 50 This specification describes Generic UDP Encapsulation (GUE), which 51 is a scheme for using UDP to encapsulate packets of arbitrary IP 52 protocols for transport across layer 3 networks. By encapsulating 53 packets in UDP, specialized capabilities in networking hardware for 54 efficient handling of UDP packets can be leveraged. 
GUE specifies 55 basic encapsulation methods upon which higher level constructs, such 56 as tunnels and overlay networks for network virtualization, can be 57 constructed. GUE is extensible by allowing optional data fields as 58 part of the encapsulation, and is generic in that it can encapsulate 59 packets of various IP protocols.
61 Table of Contents
63 1. Introduction
64 2. Packet formats
65    2.1. GUE header preamble
66    2.2. GUE header
67    2.3. Flags and optional fields
68    2.4 Private data
69 3. Message types
70    3.1. Control messages
71    3.2. Data messages
72 4. Operation
73    4.1. Network tunnel encapsulation
74    4.2. Transport layer encapsulation
75    4.3. Encapsulator operation
76    4.4. Decapsulator operation
77    4.5. Router and switch operation
78    4.6. Middlebox interactions
79    4.7. NAT
80    4.8. Checksum Handling
81       4.8.1. Checksum requirements
82       4.8.2. GUE header checksum
83       4.8.3. UDP Checksum with IPv4
84       4.8.4. UDP Checksum with IPv6
85    4.9. MTU and fragmentation issues
86    4.10 Congestion control
87 5. Inner flow identifier properties
88    5.1. Flow classification
89    5.2. Inner flow identifier properties
90 6. Motivation for GUE
91 7. Security Considerations
92    7.1. GUE security fields
93    7.2. GUE and IPsec
94 8. IANA Consideration
95 9. Acknowledgements
96 10. References
97    10.1. Normative References
98    10.2. Informative References
99 Appendix A: NIC processing for GUE
100    A.1. Receive multi-queue
101    A.2. Checksum offload
102       A.2.1. Transmit checksum offload
103       A.2.2. Receive checksum offload
104    A.3. Transmit Segmentation Offload
105    A.4. Large Receive Offload
106 Appendix B: Privileged ports
107 Appendix C: Inner flow identifier as a route selector
108 Appendix D: Hardware protocol implementation considerations
109 Authors' Addresses
111 1. Introduction 113 This specification describes Generic UDP Encapsulation (GUE) which is 114 a general method for encapsulating packets of arbitrary IP protocols 115 within User Datagram Protocol (UDP) [RFC0768] packets. Encapsulating 116 packets in UDP facilitates efficient transport across networks.
117 Networking devices widely provide protocol specific processing and 118 optimizations for UDP (as well as TCP) packets. Packets for atypical 119 IP protocols (those not usually parsed by networking hardware) can be 120 encapsulated in UDP packets to maximize deliverability and to 121 leverage flow specific mechanisms for routing and packet steering. 123 GUE provides an extensible header format for including optional data 124 in the encapsulation header. This data potentially covers items such 125 as a virtual networking identifier, security data for validating or 126 authenticating the GUE header, congestion control data, etc. GUE also 127 allows private optional data in the encapsulation header. This 128 feature can be used by a site or implementation to define local 129 custom optional data, and allows experimentation with options that may 130 eventually become standard. 132 2. Packet formats 134 A GUE packet consists of a UDP packet whose payload is a GUE 135 header followed by a payload which is either an encapsulated packet 136 of some IP protocol or a control message (like an OAM message). A GUE 137 packet has the general format:
139   +-------------------------------+
140   |                               |
141   |         UDP/IP header         |
142   |                               |
143   |-------------------------------|
144   |                               |
145   |          GUE Header           |
146   |                               |
147   |-------------------------------|
148   |                               |
149   |      Encapsulated packet      |
150   |      or control message       |
151   |                               |
152   +-------------------------------+
154 The GUE header is variable length as determined by the presence of 155 optional fields. 157 2.1. GUE header preamble 159 The first byte of the GUE header provides the GUE protocol version 160 number, indicator of a control or data message, and header length:
162    0
163    0 1 2 3 4 5 6 7
164   +-+-+-+-+-+-+-+-+
165   |Ver|C|  Hlen   |
166   +-+-+-+-+-+-+-+-+
168 Contents are: 170 o Ver: GUE protocol version. The rest of the fields after the 171 preamble are defined based on the version. This field is two 172 bits allowing four possible values.
174 o Control flag: When set, indicates a control message; when not set, 175 indicates a data message. 177 o Hlen: Length in 32-bit words of the GUE header, including 178 optional fields but not the first four bytes of the header. 179 Computed as (header_len - 4) / 4. All GUE headers are a multiple 180 of four bytes in length. Maximum header length is 128 bytes (Hlen is a five-bit field, so at most 4 + 31 * 4). 182 2.2. GUE header 184 The header format for version 0x0 of GUE in UDP is:
186    0                   1                   2                   3
187    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
188   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
189   |          Source port          |       Destination port        |
190   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
191   |            Length             |           Checksum            |
192   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
193   |0x0|C|  Hlen   |  Proto/ctype  |            Flags            |E|
194   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
195   |                                                               |
196   ~                       Fields (optional)                       ~
197   |                                                               |
198   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
199   |                  Extension flags (optional)                   |
200   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
201   |                                                               |
202   ~                  Extension fields (optional)                  ~
203   |                                                               |
204   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
205   |                                                               |
206   ~                    Private data (optional)                    ~
207   |                                                               |
208   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
210 The contents of the UDP header are: 212 o Source port (inner flow identifier): This should be set to a 213 value that represents the encapsulated flow. The properties of 214 the inner flow identifier are described below. 216 o Destination port: The GUE assigned port number, 6080. 218 o Length: Canonical length of the UDP packet (length of UDP header 219 and payload). 221 o Checksum: Standard UDP checksum. 223 The GUE header consists of: 225 o Preamble byte: Version number (0x0), C bit, and header length. 227 o Proto/ctype: When the C bit is set this field contains a control 228 message type for the payload.
When the C bit is not set, the field 229 holds the IP protocol number for the encapsulated packet in the 230 payload. The control message or encapsulated packet begins at 231 the offset provided by Hlen. 233 o Flags. Header flags that may be allocated for various purposes 234 and may indicate presence of optional fields. Undefined header 235 flag bits must be set to zero on transmission. 237 o 'E' Extension flag. Indicates presence of extension flags option 238 in the optional fields. 240 o Fields: Optional fields whose presence is indicated by 241 corresponding flags. 243 o Extension flags: An optional field indicated by the E bit. This 244 field provides an additional set of 32 header bit flags for the 245 header. 247 o Extension fields: Optional fields whose presence is indicated by 248 corresponding extension flags. 250 o Private data: Optional private data. If private data is present 251 it immediately follows the last field present in the header. 252 The length of this data is determined by subtracting the 253 starting offset from the header length. 255 2.3. Flags and optional fields 257 Flags and associated optional fields are the primary mechanism of 258 extensibility in GUE. There are sixteen flag bits in the primary GUE 259 header with one being reserved to indicate that an optional extension 260 flags field is present. The extension flags field contains an 261 additional thirty-two flag bits. 263 A flag may indicate presence of optional fields. The size of an 264 optional field indicated by a flag must be fixed. 266 Flags may be paired together to allow different lengths for an 267 optional field. For example, if two flag bits are paired, a field may 268 have one of three different lengths. Regardless of how flag bits may 269 be paired, the lengths and offsets of optional fields corresponding 270 to a set of flags must be well defined. 272 Optional fields are placed in order of the flags.
New flags should be 273 allocated from high to low order bit contiguously without holes. 274 Flags allow random access: for instance, to inspect the field 275 corresponding to the Nth flag bit, an implementation only considers 276 the previous N-1 flags to determine the offset. Flags after the Nth 277 flag are not pertinent in calculating the offset of the Nth field. 279 Flags (or paired flags) are idempotent such that new flags should not 280 cause reinterpretation of old flags. Also, new flags should not alter 281 interpretation of other elements in the GUE header nor how the 282 message is parsed (for instance, in a data message the proto/ctype 283 field always holds an IP protocol number as an invariant). 285 2.4 Private data 287 An implementation may use private data for its own use. The private 288 data immediately follows the last field in the GUE header and is not 289 a fixed length. This data is considered part of the GUE header and 290 must be accounted for in header length (Hlen). The length of the 291 private data must be a multiple of four and is determined by 292 subtracting the offset of private data in the GUE header from the 293 header length. Specifically: 295 Private_length = (Hlen * 4) - Length(flags) 297 Where "Length(flags)" returns the sum of lengths of all the optional 298 fields present in the GUE header. When there is no private data 299 present, length of the private data is zero. 301 The semantics and interpretation of private data are implementation 302 specific. An encapsulator and decapsulator MUST agree on the meaning 303 of private data before using it. The private data may be structured 304 as necessary, for instance it might contain its own set of flags and 305 optional fields. 307 If a decapsulator receives a GUE packet with private data, it MUST 308 validate the private data appropriately. If a decapsulator does not 309 expect private data from an encapsulator the packet MUST be dropped.
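As a non-normative illustration of the length rules in sections 2.1 and 2.4, the following minimal sketch decodes the first four bytes of a version 0x0 GUE header and derives the total header length and the private data length. The function names and the optional_fields_len parameter are hypothetical. This text does not define any specific flags, so the caller is assumed to supply the summed length of the optional fields it recognizes (including the extension flags field when the E bit is set).

```python
import struct


def parse_gue_base(header: bytes) -> dict:
    """Decode the first four bytes of a version 0x0 GUE header."""
    if len(header) < 4:
        raise ValueError("need at least four bytes of GUE header")
    preamble, proto_ctype, flags = struct.unpack("!BBH", header[:4])
    hlen = preamble & 0x1F                 # low five bits: Hlen in 32-bit words
    return {
        "version": preamble >> 6,          # top two bits: Ver
        "control": bool(preamble & 0x20),  # C bit
        "hlen": hlen,
        "header_len": 4 + hlen * 4,        # inverts Hlen = (header_len - 4) / 4
        "proto_ctype": proto_ctype,
        "flags": flags,                    # 16 bits; the lowest bit is E
        "extension": bool(flags & 0x0001),
    }


def private_length(hlen: int, optional_fields_len: int) -> int:
    """Private_length = (Hlen * 4) - Length(flags), in bytes."""
    length = hlen * 4 - optional_fields_len
    if length < 0 or length % 4 != 0:
        # Header too small for its optional fields, or not a multiple of four.
        raise ValueError("header length inconsistent with optional fields")
    return length


# Example: data message, Hlen = 2, IP protocol 4, no flags set.
info = parse_gue_base(bytes([0x02, 0x04, 0x00, 0x00]) + bytes(8))
private_bytes = private_length(info["hlen"], 0)
```

In this example no flags are set, so all eight bytes that follow the first four header bytes are private data. A decapsulator that does not expect private data would drop such a packet.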
310 If a decapsulator cannot validate the contents of private data per 311 the provided semantics the packet MUST also be dropped. An 312 implementation may place security data in GUE private data which must 313 be verified for packet acceptance. 315 3. Message types 317 3.1. Control messages 319 Control messages are indicated in the GUE header when the C bit is 320 set. The payload is interpreted as a control message with type 321 specified in the proto/ctype field. The format and contents of the 322 control message are indicated by the type and can be variable length. 324 Other than interpreting the proto/ctype field as a control message 325 type, the meaning and semantics of the rest of the elements in the 326 GUE header are the same as that of data messages. Forwarding and 327 routing of control messages should be the same as that of a data 328 message with the same outer IP and UDP header and GUE flags; this 329 ensures that a control message can be created which follows the same 330 path as a data message. 332 Control messages can be defined for OAM type messages. For instance, 333 an echo request and corresponding echo reply message may be defined 334 to test for liveness. 336 3.2. Data messages 338 Data messages are indicated in the GUE header when the C bit is not set. The 339 payload of a data message is interpreted as an encapsulated packet of 340 an IP protocol indicated in the proto/ctype field. The packet 341 immediately follows the GUE header. 343 Data messages are a primary means of encapsulation and can be used to 344 create tunnels for overlay networks. 346 4. Operation 348 The figure below illustrates the use of GUE encapsulation between two 349 servers. Server 1 is sending packets to server 2. An encapsulator 350 performs encapsulation of packets from server 1. These encapsulated 351 packets traverse the network as UDP packets. At the decapsulator, 352 packets are decapsulated and sent on to server 2.
Packet flow in the 353 reverse direction need not be symmetric; GUE encapsulation is not 354 required in the reverse path.
356   +---------------+                       +---------------+
357   |               |                       |               |
358   |   Server 1    |                       |   Server 2    |
359   |               |                       |               |
360   +---------------+                       +---------------+
361           |                                       ^
362           V                                       |
363   +---------------+   +---------------+   +---------------+
364   |               |   |               |   |               |
365   | Encapsulator  |-->|    Layer 3    |-->| Decapsulator  |
366   |               |   |    Network    |   |               |
367   +---------------+   +---------------+   +---------------+
369 The encapsulator and decapsulator may be co-resident with the 370 corresponding servers, or may be on separate nodes in the network. 372 4.1. Network tunnel encapsulation 374 Network tunneling can be achieved by encapsulating layer 2 or layer 3 375 packets. In this case the encapsulator and decapsulator nodes are the 376 tunnel endpoints. These could be routers that provide network tunnels 377 on behalf of communicating servers. 379 4.2. Transport layer encapsulation 381 When encapsulating layer 4 packets, the encapsulator and decapsulator 382 should be co-resident with the servers. In this case, the 383 encapsulation headers are inserted between the IP header and the 384 transport packet. The addresses in the IP header refer to both the 385 endpoints of the encapsulation and the endpoints for terminating 386 the transport protocol. 388 4.3. Encapsulator operation 390 Encapsulators create GUE data messages, set the source port to the 391 inner flow identifier, set flags and optional fields in the GUE 392 header, and forward packets to a decapsulator. 394 An encapsulator may be an end host originating the packets of a flow, 395 or may be a network device performing encapsulation on behalf of 396 servers (routers implementing tunnels for instance). In either case, 397 the intended target (decapsulator) is indicated by the outer 398 destination IP address. 400 If an encapsulator is tunneling packets, that is encapsulating 401 packets of layer 2 or layer 3 protocols (e.g.
EtherIP, IPIP, ESP 402 tunnel mode), it should follow standard conventions for tunneling of 403 one IP protocol over another. Diffserv interaction with tunnels is 404 described in [RFC2983], ECN propagation for tunnels is described in 405 [RFC6040]. 407 4.4. Decapsulator operation 409 A decapsulator performs decapsulation of GUE packets. A decapsulator 410 is addressed by the outer destination IP address of a GUE packet. 411 The decapsulator validates packets, including fields of the GUE 412 header. If a packet is acceptable, the UDP and GUE headers are 413 removed and the packet is resubmitted for IP protocol processing or 414 control message processing if it is a control message. 416 If a decapsulator receives a GUE packet with an unsupported version, 417 unknown flag, bad header length (too small for included optional 418 fields), unknown control message type, or an otherwise malformed 419 header, it must drop the packet and may log the event. No error 420 message is returned back to the encapsulator. Note that set flags in 421 GUE that are unknown to a decapsulator MUST NOT be ignored. If a GUE 422 packet is received by a decapsulator with unknown flags, the packet 423 MUST be dropped. 425 4.5. Router and switch operation 427 Routers and switches should forward GUE packets as standard UDP/IP 428 packets. The outer five-tuple should contain sufficient information 429 to perform flow classification corresponding to the flow of the inner 430 packet. A switch should not normally need to parse a GUE header, and 431 none of the flags or optional fields in the GUE header should affect 432 routing. 434 A router should not modify a GUE header when forwarding a packet. It 435 may encapsulate a GUE packet in another GUE packet, for instance to 436 implement a network tunnel. In this case the router takes the role of 437 an encapsulator, and the corresponding decapsulator is the logical 438 endpoint of the tunnel. 440 4.6. 
Middlebox interactions 442 A middlebox may interpret some flags and optional fields of the GUE 443 header for classification purposes, but is not required to understand 444 all flags and fields in GUE packets. A middlebox should not drop a 445 GUE packet because there are flags unknown to it. The header length 446 in the GUE header allows a middlebox to inspect the payload packet 447 without needing to parse the flags or optional fields. 449 A middlebox may infer bidirectional connection semantics for a UDP 450 flow. For instance a stateful firewall may create a five-tuple rule 451 to match flows on egress, and a corresponding five-tuple rule for 452 matching ingress packets where the roles of source and destination 453 are reversed for the IP addresses and UDP port numbers. To operate in 454 this environment, a GUE tunnel must assume connected semantics 455 defined by the UDP five tuple and the use of GUE encapsulation must 456 be symmetric between both endpoints. The source port set in the UDP 457 header must be the destination port the peer would set for replies. 459 4.7. NAT 461 IP address and port translation can be performed on the UDP/IP 462 headers adhering to the requirements for NAT with UDP [RFC478]. In 463 the case of stateful NAT, connection semantics must be applied to a 464 GUE tunnel as described above. 466 When using transport mode encapsulation and traversing a NAT, the IP 467 addresses may be changed such that the pseudo header checksum used 468 for checksum calculation is modified and the checksum will be found 469 invalid at the receiver. To compensate for this, a GUE option can be 470 added which contains the checksum over the source and destination 471 addresses when the packet is transmitted. Upon receiving this option, 472 the delta of the pseudo header checksum is computed by subtracting 473 the checksum over the source and destination addresses from the 474 checksum value in the option.
The resultant value is then added into the 475 checksum calculation when validating the inner transport checksum. 477 4.8. Checksum Handling 479 This section describes the requirements around the UDP checksum and 480 GUE header checksum. Checksums are an important consideration in that 481 they can provide end to end validation and protect against 482 packet mis-delivery. The latter is made possible by the inclusion of a 483 pseudo header that covers the IP addresses and UDP ports of the 484 encapsulating headers. 486 4.8.1. Checksum requirements 488 The potential for mis-delivery of packets due to corruption of IP, 489 UDP, or GUE headers must be considered. One of the following 490 requirements must be met: 492 o UDP checksums are enabled (for IPv4 or IPv6). 494 o The GUE header checksum is used. 496 o Zero UDP checksums are used in accordance with applicable 497 requirements in [GREUDP], [RFC6935], and [RFC6936]. 499 4.8.2. GUE header checksum 501 The GUE header checksum provides a UDP-lite [RFC3828] type of 502 checksum capability as an optional field of the GUE header. The GUE 503 header checksum minimally covers the GUE header and a GUE pseudo 504 header. The GUE pseudo header includes the corresponding IP 505 addresses as well as the UDP ports of the encapsulating headers. 506 This checksum should provide adequate protection against address 507 corruption in IPv6 when the UDP checksum is zero. Additionally, the 508 GUE checksum provides protection of the GUE header when the UDP 509 checksum is set to zero with either IPv4 or IPv6. The GUE header 510 checksum is defined in [GUECSUM]. 512 4.8.3. UDP Checksum with IPv4 514 For UDP in IPv4, the UDP checksum MUST be processed as specified in 515 [RFC768] and [RFC1122] for both transmit and receive. An 516 encapsulator MAY set the UDP checksum to zero for performance or 517 implementation considerations.
The IPv4 header includes a checksum 518 which protects against mis-delivery of the packet due to corruption 519 of IP addresses. The UDP checksum potentially provides protection 520 against corruption of the UDP header, GUE header, and GUE payload. 521 Enabling or disabling the use of checksums is a deployment 522 consideration that should take into account the risk and effects of 523 packet corruption, and whether the packets in the network are 524 already adequately protected by other, possibly stronger mechanisms 525 such as the Ethernet CRC. If an encapsulator sets a zero UDP 526 checksum for IPv4 it SHOULD use the GUE header checksum as described 527 in section 4.8.2. 529 When a decapsulator receives a packet, the UDP checksum field MUST 530 be processed. If the UDP checksum is non-zero, the decapsulator MUST 531 verify the checksum before accepting the packet. By default a 532 decapsulator SHOULD accept UDP packets with a zero checksum. A node 533 MAY be configured to disallow zero checksums per [RFC1122]; this may 534 be done selectively, for instance disallowing zero checksums from 535 certain hosts that are known to be sending over paths subject to 536 packet corruption. If verification of a non-zero checksum fails, a 537 decapsulator lacks the capability to verify a non-zero checksum, or 538 a packet with a zero-checksum was received and the decapsulator is 539 configured to disallow, the packet MUST be dropped and an event MAY 540 be logged. 542 4.8.4. UDP Checksum with IPv6 544 For UDP in IPv6, the UDP checksum MUST be processed as specified in 545 [RFC768] and [RFC2460] for both transmit and receive. Unlike IPv4, 546 there is no header checksum in IPv6 that protects against mis- 547 delivery due to address corruption. Therefore, when GUE is used over 548 IPv6, either the UDP checksum must be enabled or the GUE header 549 checksum must be used. 
An encapsulator MAY set a zero UDP checksum 550 for performance or implementation reasons, in which case the GUE 551 header checksum MUST be used or applicable requirements for using 552 zero UDP checksums in [GREUDP] MUST be met. If the UDP checksum is 553 enabled, then the GUE header checksum should not be used since it is 554 mostly redundant. 556 When a decapsulator receives a packet, the UDP checksum field MUST 557 be processed. If the UDP checksum is non-zero, the decapsulator MUST 558 verify the checksum before accepting the packet. By default a 559 decapsulator MUST only accept UDP packets with a zero checksum if 560 the GUE header checksum is used and is verified. If verification of 561 a non-zero checksum fails, a decapsulator lacks the capability to 562 verify a non-zero checksum, or a packet with a zero-checksum and no 563 GUE header checksum was received, the packet MUST be dropped and an 564 event MAY be logged. 566 4.9. MTU and fragmentation issues 568 Standard conventions for handling of MTU (Maximum Transmission Unit) 569 and fragmentation in conjunction with networking tunnels 570 (encapsulation of layer 2 or layer 3 packets) should be followed. 571 Details are described in MTU and Fragmentation Issues with In-the- 572 Network Tunneling [RFC4459] 574 If a packet is fragmented before encapsulation in GUE, all the 575 related fragments must be encapsulated using the same source port 576 (inner flow identifier). An operator may set MTU to account for 577 encapsulation overhead and reduce the likelihood of fragmentation. 579 4.10 Congestion control 581 Per requirements of [RFC5405], if the IP traffic encapsulated with 582 GUE implements proper congestion control no additional mechanisms 583 should be required. 
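As a non-normative aid, the receive-side checksum rules of sections 4.8.3 and 4.8.4 can be summarized as a small decision function. This is a sketch under stated assumptions: the function and parameter names are hypothetical, checksum verification itself is assumed to happen elsewhere, and only the default decapsulator behavior is modeled (not the [GREUDP] conditions that apply to encapsulators).

```python
from typing import Optional


def accept_by_checksum(ip_version: int,
                       udp_checksum: int,
                       udp_checksum_ok: Optional[bool],
                       gue_checksum_ok: bool = False,
                       allow_zero_ipv4: bool = True) -> bool:
    """Return True if a decapsulator may accept the packet, False to drop.

    udp_checksum_ok: True or False after verification, or None when the
    decapsulator lacks the capability to verify a non-zero checksum.
    gue_checksum_ok: True only if a GUE header checksum is present and
    was verified.
    allow_zero_ipv4: local configuration; False disallows zero UDP
    checksums over IPv4.
    """
    if ip_version not in (4, 6):
        raise ValueError("ip_version must be 4 or 6")
    if udp_checksum != 0:
        # A non-zero checksum must be verified before acceptance.
        return udp_checksum_ok is True
    if ip_version == 4:
        # Zero checksum over IPv4: accepted by default, unless disallowed.
        return allow_zero_ipv4
    # Zero checksum over IPv6: only with a verified GUE header checksum.
    return gue_checksum_ok
```

For example, accept_by_checksum(6, 0, None) evaluates to False, while the same call with gue_checksum_ok=True evaluates to True. In every case where the function returns False the packet is dropped and the event may be logged.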
585 In the case that the encapsulated traffic does not implement any or 586 sufficient congestion control, or it is not known whether a transmitter will 587 consistently implement proper congestion control, then congestion 588 control at the encapsulation layer must be provided. Note this case 589 applies to a significant use case in network virtualization in which 590 guests run third party networking stacks that cannot be implicitly 591 trusted to implement conformant congestion control. 593 Out of band mechanisms such as rate limiting, Managed Circuit 594 Breaker, or traffic isolation may be used to provide rudimentary 595 congestion control. For finer grained congestion control that allows 596 alternate congestion control algorithms, reaction time within an 597 RTT, and interaction with ECN, in band mechanisms may be warranted. 599 DCCP may be used to provide congestion control for encapsulated 600 flows. In this case, the protocol stack for an IP tunnel may be IP- 601 GUE-DCCP-IP. Alternatively, GUE can be extended to include 602 congestion control (related data carried in GUE optional fields). 603 Congestion control mechanisms for GUE will be elaborated in other 604 specifications. 606 5. Inner flow identifier properties 608 5.1. Flow classification 610 A major objective of using GUE is that a network device can perform 611 flow classification corresponding to the flow of the inner 612 encapsulated packet based on the contents in the outer headers. 614 Hardware devices commonly perform hash computations on packet 615 headers to classify packets into flows or flow buckets. Flow 616 classification is done to support load balancing (statistical 617 multiplexing) of flows across a set of networking resources. 618 Examples of such load balancing techniques are Equal Cost Multipath 619 routing (ECMP), port selection in Link Aggregation, and NIC device 620 Receive Side Scaling (RSS).
   Hashes are usually either a three-tuple hash of IP protocol, source
   address, and destination address; or a five-tuple hash consisting of
   IP protocol, source address, destination address, source port, and
   destination port. Typically, networking hardware will compute
   five-tuple hashes for TCP and UDP, but only three-tuple hashes for
   other IP protocols. Since the five-tuple hash provides more
   granularity, load balancing can be finer grained with better
   distribution. When a packet is encapsulated with GUE, the source
   port in the outer UDP packet is set to reflect the flow of the inner
   packet. When a device computes a five-tuple hash on the outer UDP/IP
   header of a GUE packet, the resultant value classifies the packet
   per its inner flow.

   To support flow classification, the source port of the UDP header in
   GUE is set to a value that maps to the inner flow. This is referred
   to as the inner flow identifier. The inner flow identifier is set by
   the encapsulator; it can be computed on the fly based on packet
   contents or retrieved from a state maintained for the inner flow.

   Examples of deriving an inner flow identifier are:

   o  If the encapsulated packet is a layer 4 packet, TCP/IPv4 for
      instance, the inner flow identifier could be based on the
      canonical five-tuple hash of the inner packet.

   o  If the encapsulated packet is an AH transport mode packet with
      TCP as next header, the inner flow identifier could be a hash
      over a three-tuple: TCP protocol and TCP ports of the
      encapsulated packet.

   o  If a node is encrypting a packet using ESP tunnel mode and GUE
      encapsulation, the inner flow identifier could be based on the
      contents of the clear-text packet. For instance, a canonical
      five-tuple hash for a TCP/IP packet could be used.

5.2. Inner flow identifier properties

   The inner flow identifier is the value set in the UDP source
   port of a GUE packet.
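
   As an illustration of deriving such a value from an inner TCP/IP
   five-tuple, the following is a minimal sketch and not a normative
   algorithm. The function name inner_flow_id, the choice of keyed
   BLAKE2b as the hash, and the 16-byte random seed are assumptions
   made for this example only. The result is folded into the IANA
   ephemeral port range (49152 to 65535) by forcing the two high order
   bits to one and keeping fourteen bits of the hash.

```python
import hashlib
import ipaddress
import os
import struct

EPHEMERAL_BASE = 0xC000  # 49152: two high order bits of the port set to one
ENTROPY_MASK = 0x3FFF    # low fourteen bits are taken from the hash


def inner_flow_id(proto, src_ip, dst_ip, src_port, dst_port, seed):
    """Map an inner five-tuple to a UDP source port (hypothetical helper).

    seed is a secret byte string (at most 64 bytes) so that an outside
    party cannot predict which port a given flow maps to.
    """
    data = (
        struct.pack("!B", proto)
        + ipaddress.ip_address(src_ip).packed
        + ipaddress.ip_address(dst_ip).packed
        + struct.pack("!HH", src_port, dst_port)
    )
    # Keyed hash: the seed acts as the key, two bytes of output suffice.
    digest = hashlib.blake2b(data, key=seed, digest_size=2).digest()
    return EPHEMERAL_BASE | (int.from_bytes(digest, "big") & ENTROPY_MASK)


# Example use: one random seed per encapsulator instance (an assumption).
seed = os.urandom(16)
port = inner_flow_id(6, "10.0.0.1", "10.0.0.2", 12345, 80, seed)
print(port)
```

   The same five-tuple and seed always yield the same port, so all
   packets of an inner flow share one outer source port; a different
   seed yields a different mapping.
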
   The inner flow identifier should adhere to the following
   properties:

   o  The value set in the source port should be within the ephemeral
      port range. IANA suggests this range to be 49152 to 65535, where
      the high order two bits of the port are set to one. This
      provides fourteen bits of entropy for the inner flow identifier.

   o  The inner flow identifier should have a uniform distribution
      across encapsulated flows.

   o  An encapsulator may occasionally change the inner flow
      identifier used for an inner flow per its discretion (for
      security, route selection, etc.). Changing the value should
      happen no more than once every thirty seconds.

   o  Decapsulators, or any networking devices, should not attempt any
      interpretation of the inner flow identifier, nor should they
      attempt to reproduce any hash calculation. They may use the
      value to match further received packets for steering decisions,
      but cannot assume that the hash uniquely or permanently
      identifies a flow.

   o  Input to the inner flow identifier is not restricted to ports
      and addresses; input could include flow label from an IPv6
      packet, SPI from an ESP packet, or other flow related state in
      the encapsulator that is not necessarily conveyed in the packet.

   o  The assignment function for inner flow identifiers should be
      randomly seeded to mitigate denial of service attacks. The seed
      may be changed periodically.

6. Motivation for GUE

   This section presents the motivation for GUE with respect to other
   encapsulation methods.

   A number of different encapsulation techniques have been proposed for
   the encapsulation of one protocol over another. EtherIP [RFC3378]
   provides layer 2 tunneling of Ethernet frames over IP. GRE [RFC2784],
   MPLS [RFC4023], and L2TP [RFC2661] provide methods for tunneling
   layer 2 and layer 3 packets over IP.
   NVGRE [NVGRE] and VXLAN [RFC7348] are proposals for encapsulation of
   layer 2 packets for network virtualization. IPIP [RFC2003] and
   Generic packet tunneling in IPv6 [RFC2473] provide methods for
   tunneling IP packets over IP.

   Several proposals exist for encapsulating packets over UDP including
   ESP over UDP [RFC3948], TCP directly over UDP [TCPUDP], VXLAN, LISP
   [RFC6830] which encapsulates layer 3 packets, and Generic UDP
   Encapsulation for IP Tunneling (GRE over UDP) [GREUDP]. Generic UDP
   tunneling [GUT] is a proposal similar to GUE in that it aims to
   tunnel packets of IP protocols over UDP.

   GUE has the following discriminating features:

   o  UDP encapsulation leverages specialized network device
      processing for efficient transport. The semantics for using the
      UDP source port as an identifier for an inner flow are defined.

   o  GUE permits encapsulation of arbitrary IP protocols, which
      includes layer 2, 3, and 4 protocols. This potentially allows
      nearly all traffic within a data center to be normalized to be
      either TCP or UDP on the wire.

   o  Multiple protocols can be multiplexed over a single UDP port
      number. This is in contrast to techniques to encapsulate
      protocols over UDP using a protocol specific port number (such
      as ESP/UDP, GRE/UDP, SCTP/UDP). GUE provides a uniform and
      extensible mechanism for encapsulating all IP protocols in UDP
      with minimal overhead (four bytes of additional header).

   o  GUE is extensible. New flags and optional fields can be defined.

   o  The GUE header includes a header length field. This allows a
      network node to inspect an encapsulated packet without needing
      to parse the full encapsulation header.

   o  Private data in the encapsulation header allows local
      customization and experimentation while being compatible with
      processing in network nodes (routers and middleboxes).

   o  GUE includes both data messages (encapsulation of packets) and
      control messages (such as OAM).

7. Security Considerations

   Encapsulation of IP protocols within GUE should not increase
   security risk, nor provide additional security in itself. As
   suggested in section 5, the source port of UDP packets in GUE
   should be randomly seeded to mitigate some possible denial of
   service attacks.

   GUE is most useful when it is in the outermost header of a
   packet which allows for flow hash calculation as well as making
   GUE header data (such as virtual network identifier) visible to
   switches and middleboxes. GUE must be amenable to encapsulating
   (and being encapsulated within) IPsec. Also, we allow provisions
   to secure the GUE header itself without an external protocol.

   Security for Generic UDP Encapsulation is described in more
   detail in [GUESEC].

7.1. GUE security fields

   Security fields should be used to provide integrity and
   authentication of the GUE header. Security negotiation
   (algorithms, interpretation of security field, key management,
   etc.) is expected to be done out of band between hosts.

7.2. GUE and IPsec

   GUE may be used to encapsulate IPsec packets. This allows the
   benefits of deriving a flow hash for the inner, potentially
   encrypted, packet. In this case the protocol stack may be:

      +-------------------------------+
      |                               |
      |        UDP/IP header          |
      |                               |
      |-------------------------------|
      |                               |
      |         GUE Header            |
      |                               |
      |-------------------------------|
      |                               |
      |    ESP/AH/private security    |
      |                               |
      |-------------------------------|
      |                               |
      |     Encapsulated packet       |
      |                               |
      +-------------------------------+

   Note that IPsec would not cover the GUE header in this case
   (does not authenticate it for instance). GUE security optional
   fields may be used to provide authentication or integrity of the
   GUE header.

8. IANA Considerations

   A user UDP port number has been assigned for GUE:

      Service Name: gue
      Transport Protocol(s): UDP
      Assignee: Tom Herbert
      Contact: Tom Herbert
      Description: Generic UDP Encapsulation
      Reference: draft-herbert-gue
      Port Number: 6080
      Service Code: N/A
      Known Unauthorized Uses: N/A
      Assignment Notes: N/A

9. Acknowledgements

   The authors would like to thank David Liu for valuable input on
   this draft.

10. References

10.1. Normative References

   [RFC0768] Postel, J., "User Datagram Protocol", STD 6, RFC 768,
             August 1980.

   [RFC2434] Narten, T. and H. Alvestrand, "Guidelines for Writing an
             IANA Considerations Section in RFCs", RFC 2434, October
             1998.

   [RFC2983] Black, D., "Differentiated Services and Tunnels", RFC
             2983, October 2000.

   [RFC6040] Briscoe, B., "Tunnelling of Explicit Congestion
             Notification", RFC 6040, November 2010.

   [RFC6936] Fairhurst, G. and M. Westerlund, "Applicability Statement
             for the Use of IPv6 UDP Datagrams with Zero Checksums",
             RFC 6936, April 2013.

   [RFC4459] Savola, P., "MTU and Fragmentation Issues with
             In-the-Network Tunneling", RFC 4459, April 2006.

10.2. Informative References

   [RFC2003] Perkins, C., "IP Encapsulation within IP", RFC 2003,
             October 1996.

   [RFC3948] Huttunen, A., Swander, B., Volpe, V., DiBurro, L., and M.
             Stenberg, "UDP Encapsulation of IPsec ESP Packets", RFC
             3948, January 2005.

   [RFC6830] Farinacci, D., Fuller, V., Meyer, D., and D. Lewis, "The
             Locator/ID Separation Protocol (LISP)", RFC 6830, January
             2013.

   [RFC3378] Housley, R. and S. Hollenbeck, "EtherIP: Tunneling
             Ethernet Frames in IP Datagrams", RFC 3378, September
             2002.

   [RFC2784] Farinacci, D., Li, T., Hanks, S., Meyer, D., and P.
             Traina, "Generic Routing Encapsulation (GRE)", RFC 2784,
             March 2000.

   [RFC4023] Worster, T., Rekhter, Y., and E. Rosen, Ed.,
             "Encapsulating MPLS in IP or Generic Routing
             Encapsulation (GRE)", RFC 4023, March 2005.

   [RFC2661] Townsley, W., Valencia, A., Rubens, A., Pall, G., Zorn,
             G., and B. Palter, "Layer Two Tunneling Protocol "L2TP"",
             RFC 2661, August 1999.

   [RFC5925] Touch, J., Mankin, A., and R. Bonica, "The TCP
             Authentication Option", RFC 5925, June 2010.

   [RFC3828] Larzon, L-A., Degermark, M., Pink, S., Jonsson, L-E.,
             Ed., and G. Fairhurst, Ed., "The Lightweight User
             Datagram Protocol (UDP-Lite)", RFC 3828, July 2004.

   [RFC7348] Mahalingam, M., Dutt, D., Duda, K., Agarwal, P., Kreeger,
             L., Sridhar, T., Bursell, M., and C. Wright, "Virtual
             eXtensible Local Area Network (VXLAN): A Framework for
             Overlaying Virtualized Layer 2 Networks over Layer 3
             Networks", RFC 7348, August 2014.

   [NVGRE]   NVGRE: Network Virtualization using Generic Routing
             Encapsulation, draft-sridharan-virtualization-nvgre-03

   [TCPUDP]  Encapsulation of TCP and other Transport Protocols over
             UDP, draft-cheshire-tcp-over-udp-00

   [GREUDP]  Generic UDP Encapsulation for IP Tunneling,
             draft-yong-tsvwg-gre-in-udp-encap-02

   [GUESEC]  Yong, L., Herbert, T., "Generic UDP Encapsulation (GUE)
             for Secure Transport",
             draft-hy-gue-4-secure-transport-00, work in progress.

   [GUT]     Generic UDP Tunnelling (GUT),
             draft-manner-tsvwg-gut-02.txt

   [REMCSUM] Remote Checksum Offload,
             draft-herbert-remotecsumoffload-00

Appendix A: NIC processing for GUE

   This appendix provides some guidelines for Network Interface Cards
   (NICs) to implement common offloads and accelerations to support GUE.
   Note that most of this discussion is generally applicable to other
   methods of UDP based encapsulation.

A.1. Receive multi-queue

   Contemporary NICs support multiple receive descriptor queues
   (multi-queue). Multi-queue enables load balancing of network
   processing for a NIC across multiple CPUs.
   On packet reception, a NIC must select the appropriate queue for
   host processing. Receive Side Scaling is a common method which uses
   the flow hash for a packet to index an indirection table where each
   entry stores a queue number. Flow Director and Accelerated Receive
   Flow Steering (aRFS) allow a host to program the queue that is used
   for a given flow which is identified either by an explicit
   five-tuple or by the flow's hash.

   GUE encapsulation should be compatible with multi-queue NICs that
   support five-tuple hash calculation for UDP/IP packets as input to
   RSS. The inner flow identifier (source port) ensures classification
   of the encapsulated flow even in the case that the outer source and
   destination addresses are the same for all flows (e.g. all flows are
   going over a single tunnel).

   By default, UDP RSS support is often disabled in NICs to avoid out of
   order reception that can occur when UDP packets are fragmented. As
   discussed above, fragmentation of GUE packets should be mitigated by
   fragmenting packets before entering a tunnel, path MTU discovery in
   higher layer protocols, or operator adjusting MTUs. Other UDP traffic
   may not implement such procedures to avoid fragmentation, so enabling
   UDP RSS support in the NIC should be a considered tradeoff during
   configuration.

A.2. Checksum offload

   Many NICs provide capabilities to calculate standard ones complement
   payload checksum for packets in transmit or receive. When using GUE
   encapsulation there are at least two checksums that may be of
   interest: the encapsulated packet's transport checksum, and the UDP
   checksum in the outer header.

A.2.1. Transmit checksum offload

   NICs may provide a protocol agnostic method to offload transmit
   checksum (NETIF_F_HW_CSUM in Linux parlance) that can be used with
   GUE.
   In this method the host provides checksum related parameters in
   a transmit descriptor for a packet. These parameters include the
   starting offset of data to checksum, the length of data to checksum,
   and the offset in the packet where the computed checksum is to be
   written. The host initializes the checksum field to the pseudo
   header checksum.

   In the case of GUE, the checksum for an encapsulated transport layer
   packet, a TCP packet for instance, can be offloaded by setting the
   appropriate checksum parameters.

   NICs typically can offload only one transmit checksum per packet, so
   simultaneously offloading both an inner transport packet's checksum
   and the outer UDP checksum is likely not possible. In this case
   setting UDP checksum to zero (per above discussion) and offloading
   the inner transport packet checksum might be acceptable.

   If an encapsulator is co-resident with a host, then checksum offload
   may be performed using remote checksum offload [REMCSUM]. Remote
   checksum offload relies on NIC offload of the simple UDP/IP checksum
   which is commonly supported even in legacy devices. In remote
   checksum offload the outer UDP checksum is set and the GUE header
   includes an option indicating the start and offset of the inner
   "offloaded" checksum. The inner checksum is initialized to the pseudo
   header checksum. When a decapsulator receives a GUE packet with the
   remote checksum offload option, it completes the offload operation by
   determining the packet checksum from the indicated start point to the
   end of the packet, and then adds this into the checksum field at the
   offset given in the option. Computing the checksum from the start to
   end of packet is efficient if checksum-complete is provided on the
   receiver.

A.2.2. Receive checksum offload

   GUE is compatible with NICs that perform a protocol agnostic receive
   checksum (CHECKSUM_COMPLETE in Linux parlance).
   In this technique, a NIC computes a ones complement checksum over
   all (or some predefined portion) of a packet. The computed value is
   provided to the host stack in the packet's receive descriptor. The
   host driver can use this checksum to "patch up" and validate any
   inner packet transport checksum, as well as the outer UDP checksum
   if it is non-zero.

   Many legacy NICs don't provide checksum-complete but instead provide
   an indication that a checksum has been verified (CHECKSUM_UNNECESSARY
   in Linux). Usually, such validation is only done for simple TCP/IP or
   UDP/IP packets. If a NIC indicates that a UDP checksum is valid, the
   checksum-complete value for the UDP packet is the "not" of the pseudo
   header checksum. In this way, checksum-unnecessary can be converted
   to checksum-complete. So if the NIC provides checksum-unnecessary for
   the outer UDP header in an encapsulation, checksum conversion can be
   done so that the checksum-complete value is derived and can be used
   by the stack to validate any checksums in the encapsulated packet.

A.3. Transmit Segmentation Offload

   Transmit Segmentation Offload (TSO) is a NIC feature where a host
   provides a large (>MTU size) TCP packet to the NIC, which in turn
   splits the packet into separate segments and transmits each one. This
   is useful to reduce CPU load on the host.

   The process of TSO can be generalized as:

   -  Split the TCP payload into segments which allow packets with
      size less than or equal to MTU.

   -  For each created segment:

      1. Replicate the TCP header and all preceding headers of the
         original packet.

      2. Set payload length fields in any headers to reflect the
         length of the segment.

      3. Set TCP sequence number to correctly reflect the offset of
         the TCP data in the stream.

      4. Recompute and set any checksums that either cover the payload
         of the packet or cover a header that was changed by setting a
         payload length.

   Following this general process, TSO can be extended to support TCP
   encapsulation in GUE. For each segment the Ethernet, outer IP, UDP
   header, GUE header, inner IP header if tunneling, and TCP headers are
   replicated. Any packet length header fields need to be set properly
   (including the length in the outer UDP header), and checksums need to
   be set correctly (including the outer UDP checksum if being used).

   To facilitate TSO with GUE it is recommended that optional fields
   should not contain values that must be updated on a per segment
   basis; for example, the GUE fields should not include checksums,
   lengths, or sequence numbers that refer to the payload. If the GUE
   header does not contain such fields then the TSO engine only needs to
   copy the bits in the GUE header when creating each segment and does
   not need to parse the GUE header.

A.4. Large Receive Offload

   Large Receive Offload (LRO) is a NIC feature where packets of a TCP
   connection are reassembled, or coalesced, in the NIC and delivered to
   the host as one large packet. This feature can reduce CPU utilization
   in the host.

   LRO requires significant protocol awareness to be implemented
   correctly and is difficult to generalize. Packets in the same flow
   need to be unambiguously identified. In the presence of tunnels or
   network virtualization, this may require more than a five-tuple match
   (for instance packets for flows in two different virtual networks may
   have identical five-tuples). Additionally, a NIC needs to perform
   validation over packets that are being coalesced, and needs to
   fabricate a single meaningful header from all the coalesced packets.
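
   To make the ambiguity concrete, the following is a minimal sketch of
   a coalescing match that looks at more than the inner five-tuple. It
   assumes the match key covers the outer addresses, the outer UDP
   ports, the GUE protocol, the GUE flags and fields, and the inner
   five-tuple. The type and function names (FiveTuple, GueFlowKey,
   may_coalesce) and the sample field values are hypothetical and
   chosen for illustration only.

```python
from typing import NamedTuple


class FiveTuple(NamedTuple):
    proto: int
    src: str
    dst: str
    sport: int
    dport: int


class GueFlowKey(NamedTuple):
    outer_src: str
    outer_dst: str
    outer_sport: int
    outer_dport: int
    gue_proto: int
    gue_flags: int
    gue_fields: bytes   # raw optional fields, compared byte for byte
    inner: FiveTuple


def may_coalesce(a: GueFlowKey, b: GueFlowKey) -> bool:
    """Two packets are candidates for coalescing only if every part of
    the key is identical, not just the inner five-tuple."""
    return a == b


# Same inner five-tuple carried in two tunnels whose GUE fields differ
# (for instance, fields identifying different virtual networks).
inner = FiveTuple(6, "192.168.0.1", "192.168.0.2", 40000, 80)
pkt_a = GueFlowKey("10.0.0.1", "10.0.0.2", 50000, 6080, 4, 0,
                   b"\x00\x00\x00\x01", inner)
pkt_b = pkt_a._replace(gue_fields=b"\x00\x00\x00\x02")

print(may_coalesce(pkt_a, pkt_a))
print(may_coalesce(pkt_a, pkt_b))
```

   Comparing the GUE fields as opaque bytes keeps the match independent
   of how any particular field is interpreted.
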

   The conservative approach to supporting LRO for GUE would be to
   assign packets to the same flow only if they have identical
   five-tuple and were encapsulated the same way. That is, the outer IP
   addresses, the outer UDP ports, GUE protocol, GUE flags and fields,
   and inner five-tuple are all identical.

Appendix B: Privileged ports

   Using the source port to contain an inner flow identifier value
   disallows the security method of a receiver enforcing that the source
   port be a privileged port. Privileged ports are defined by some
   operating systems to restrict source port binding. Unix, for
   instance, considers port numbers less than 1024 to be privileged.

   Enforcing that packets are sent from a privileged port is widely
   considered an inadequate security mechanism and has been mostly
   deprecated. To approximate this behavior, an implementation could
   restrict a user from sending a packet destined to the GUE port
   without proper credentials.

Appendix C: Inner flow identifier as a route selector

   An encapsulator generating an inner flow identifier may modulate the
   value to perform a type of multipath source routing. Assuming that
   networking switches perform ECMP based on the flow hash, a sender can
   affect the path by altering the inner flow identifier. For instance,
   a host may store a flow hash in its PCB for an inner flow, and may
   alter the value upon detecting that packets are traversing a lossy
   path. Changing the inner flow identifier for a flow should be subject
   to hysteresis (at most once every thirty seconds) to limit the number
   of out of order packets.

Appendix D: Hardware protocol implementation considerations

   A low level protocol, such as GUE, is likely to be of interest for
   support in high speed network devices.
   Variable length header (VLH) protocols like GUE are often considered
   difficult to efficiently implement in hardware. In order to retain
   the important characteristics of an extensible and robust protocol,
   hardware vendors may practice "constrained flexibility". In this
   model, only certain combinations or protocol header
   parameterizations are implemented in hardware fast path. Each such
   parameterization is fixed length so that the particular instance can
   be optimized as a fixed length protocol. In the case of GUE this
   constitutes specific combinations of GUE flags, fields, and next
   protocol. The selected combinations would naturally be the most
   common cases which form the "fast path", and other combinations are
   assumed to take the "slow path".

   In time, needs and requirements of the protocol may change which may
   manifest themselves as new parameterizations to be supported in the
   fast path. To allow this extensibility, a device practicing
   constrained flexibility should allow the fast path parameterizations
   to be programmable.

Authors' Addresses

   Tom Herbert
   Google
   1600 Amphitheatre Parkway
   Mountain View, CA
   US

   EMail: tom@herbertland.com

   Lucy Yong
   Huawei USA
   5340 Legacy Dr.
   Plano, TX 75024
   US

   Osama Zia
   Microsoft
   osamaz@microsoft.com