idnits 2.17.1 

draft-ietf-intarea-gue-09.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (October 26, 2019) is 1644 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Missing Reference: 'RFC2460' is mentioned on line 914, but not defined

  ** Obsolete undefined reference: RFC 2460 (Obsoleted by RFC 8200)

  == Missing Reference: 'RFC8200' is mentioned on line 1049, but not defined

  == Missing Reference: 'RFC768' is mentioned on line 1049, but not defined

  == Missing Reference: 'RFC2914' is mentioned on line 1063, but not defined

  == Missing Reference: 'MUTLIQ' is mentioned on line 1626, but not defined

  == Unused Reference: 'RFC8084' is defined on line 1514, but no explicit
     reference was found in the text

  == Unused Reference: 'MULTIQ' is defined on line 1600, but no explicit
     reference was found in the text

  ** Downref: Normative reference to an Informational RFC: RFC 2983

  ** Downref: Normative reference to an Informational RFC: RFC 4459

  ** Obsolete normative reference: RFC 5226 (Obsoleted by RFC 8126)

  -- Obsolete informational reference (is this intentional?): RFC 5389
     (Obsoleted by RFC 8489)

  -- Obsolete informational reference (is this intentional?): RFC 5245
     (Obsoleted by RFC 8445, RFC 8839)

  -- Obsolete informational reference (is this intentional?): RFC 6830
     (Obsoleted by RFC 9300, RFC 9301)

  == Outdated reference: A later version (-13) exists of
     draft-ietf-intarea-tunnels-10

  == Outdated reference: A later version (-16) exists of
     draft-ietf-nvo3-geneve-10


     Summary: 4 errors (**), 0 flaws (~~), 10 warnings (==), 4 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Internet Area WG                                              T. Herbert
3	Internet-Draft                                                Quantonium
4	Intended status: Standards Track                                 L. Yong
5	Expires April 28, 2020                                       Independent
6	                                                                  O. Zia
7	                                                               Microsoft
8	                                                        October 26, 2019

10	                       Generic UDP Encapsulation
11	                       draft-ietf-intarea-gue-09

13	Status of this Memo

15	   This Internet-Draft is submitted in full conformance with the
16	   provisions of BCP 78 and BCP 79.

18	   Internet-Drafts are working documents of the Internet Engineering
19	   Task Force (IETF), its areas, and its working groups.  Note that
20	   other groups may also distribute working documents as Internet-
21	   Drafts.

23	   Internet-Drafts are draft documents valid for a maximum of six months
24	   and may be updated, replaced, or obsoleted by other documents at any
25	   time.  It is inappropriate to use Internet-Drafts as reference
26	   material or to cite them other than as "work in progress."

28	   The list of current Internet-Drafts can be accessed at
29	   http://www.ietf.org/ietf/1id-abstracts.txt

31	   The list of Internet-Draft Shadow Directories can be accessed at
32	   http://www.ietf.org/shadow.html

34	   This Internet-Draft will expire on April 28, 2020.

36	Copyright Notice

38	   Copyright (c) 2019 IETF Trust and the persons identified as the
39	   document authors. All rights reserved.

48	   This document is subject to BCP 78 and the IETF Trust's Legal
49	   Provisions Relating to IETF Documents
50	   (http://trustee.ietf.org/license-info) in effect on the date of
51	   publication of this document.  Please review these documents
52	   carefully, as they describe your rights and restrictions with respect
53	   to this document.  Code Components extracted from this document must
54	   include Simplified BSD License text as described in Section 4.e of
55	   the Trust Legal Provisions and are provided without warranty as
56	   described in the Simplified BSD License.

58	Abstract

60	   This specification describes Generic UDP Encapsulation (GUE), which
61	   is a scheme for using UDP to encapsulate packets of different IP
62	   protocols for transport across layer 3 networks. By encapsulating
63	   packets in UDP, specialized capabilities in networking hardware for
64	   efficient handling of UDP packets can be leveraged. GUE specifies
65	   basic encapsulation methods upon which higher level constructs, such
66	   as tunnels and overlay networks for network virtualization, can be
67	   constructed. GUE is extensible by allowing optional data fields as
68	   part of the encapsulation, and is generic in that it can encapsulate
69	   packets of various IP protocols.

71	Table of Contents

73	   1. Introduction
74	     1.1. Applicability
75	     1.2. Terminology and acronyms
76	     1.3. Requirements Language
77	   2. Base packet format
78	     2.1. GUE variant
79	   3. Variant 0
80	     3.1. Header format
81	     3.2. Proto/ctype field
82	       3.2.1. Proto field
83	       3.2.2. Ctype field
84	     3.3. Flags and extension fields
85	       3.3.1. Requirements
86	       3.3.2. Example GUE header with extension fields
87	     3.4. Surplus space
88	     3.5. Message types
89	       3.5.1. Control messages
90	       3.5.2. Data messages
91	   4. Variant 1
92	     4.1. Direct encapsulation of IPv4
93	     4.2. Direct encapsulation of IPv6
94	   5. Operation
95	     5.1. Network tunnel encapsulation
96	     5.2. Transport layer encapsulation
97	     5.3. Encapsulator operation
98	     5.4. Decapsulator operation
99	       5.4.1. Processing a received data message
100	       5.4.2. Processing a received control message
101	     5.5. Middlebox inspection
102	     5.6. Router and switch operation
103	       5.6.1. Connection semantics
104	       5.6.2. NAT
105	     5.7. MTU and fragmentation
106	     5.8. UDP Checksum Handling
107	       5.8.1. UDP Checksum with IPv4
108	       5.8.2. UDP Checksum with IPv6
109	     5.9. Congestion Considerations
110	       5.9.1. GUE tunnels
111	       5.9.2. Transport layer encapsulation
112	     5.10. Multicast
113	     5.11. Flow entropy for ECMP
114	       5.11.1. Flow classification
115	       5.11.2. Flow entropy properties
116	     5.12. Negotiation of acceptable flags and extension fields
117	   6. Motivation for GUE
118	     6.1. Benefits of GUE
119	     6.2. Comparison of GUE to other encapsulations
120	   7. Security Considerations
121	   8. IANA Considerations
122	     8.1. UDP source port
123	     8.2. GUE variant number
124	     8.3. Control types
125	     8.4. Control Type Experimental Identifiers
126	   9. Acknowledgements
127	   10. References
128	     10.1. Normative References
129	     10.2. Informative References
130	   Appendix A: NIC processing for GUE
131	     A.1. Receive multi-queue
132	     A.2. Checksum offload
133	       A.2.1. Transmit checksum offload
134	       A.2.2. Receive checksum offload
135	     A.3. Transmit Segmentation Offload
136	     A.4. Large Receive Offload
137	   Appendix B: Implementation considerations
138	     B.1. Privileged ports
139	     B.2. Setting flow entropy as a route selector
140	     B.3. Hardware protocol implementation considerations
141	   Authors' Addresses

143	1. Introduction

145	   This specification describes Generic UDP Encapsulation (GUE), which is
146	   a general method for encapsulating packets of arbitrary IP protocols
147	   within User Datagram Protocol (UDP) [RFC0768] packets. Encapsulating
148	   packets in UDP facilitates efficient transport across networks.
149	   Networking devices widely provide protocol specific processing and
150	   optimizations for UDP (as well as TCP) packets. Packets for atypical
151	   IP protocols (those not usually parsed by networking hardware) can be
152	   encapsulated in UDP packets to maximize deliverability and to
153	   leverage flow specific mechanisms for routing and packet steering.

155	   GUE provides an extensible header format for including optional data
156	   in the encapsulation header. This data potentially covers items such
157	   as a virtual networking identifier, security data for validating or
158	   authenticating the GUE header, congestion control data, etc.

160	   This document does not define any specific GUE extensions. [GUEEXTEN]
161	   specifies a set of initial extensions.

163	1.1. Applicability

165	   GUE is a network encapsulation protocol that encapsulates packets for
166	   various IP protocols. Potential use cases include network tunneling,
167	   multi-tenant network virtualization, tunneling for mobility, and
168	   transport layer encapsulation. GUE is intended for deploying overlay
169	   networks in public or private data center environments, as well as
170	   providing a general tunneling mechanism usable in the Internet.

172	   GUE is a UDP based encapsulation protocol transported over existing
173	   IPv4 and IPv6 networks. Hence, as a UDP based protocol, GUE adheres
174	   to the UDP usage guidelines as specified in [RFC8085]. Applicability
175	   of these guidelines is dependent on the underlay IP network and the
176	   nature of GUE payload protocol (for example TCP/IP or IP/Ethernet).
177	   GUE may also be used to create IP tunnels, hence the guidelines in
178	   [IPTUN] are applicable.

180	   [RFC8085] outlines two applicability scenarios for UDP applications:
181	   (1) general Internet and (2) a traffic-managed controlled environment
182	   (TMCE). The requirements of [RFC8085] pertaining to deployment of a
183	   UDP encapsulation protocol in these environments are applicable.
184	   Section 5 provides the specifics for satisfying requirements of
185	   [RFC8085]. It is the responsibility of the operator deploying GUE to
186	   ensure that the necessary operational requirements are met for the
187	   environment in which GUE is being deployed.

189	   GUE shares much of the applicability and many of the benefits that
190	   UDP encapsulation protocols such as GRE-in-UDP [RFC8086] afford. GUE
191	   offers the possibility of good performance for load-balancing
192	   encapsulated IP traffic in transit networks using existing Equal-Cost
193	   Multipath (ECMP) mechanisms that use a hash of the five-tuple of
194	   source IP address, destination IP address, UDP/TCP source port,
195	   UDP/TCP destination port, and protocol number. Encapsulating packets
196	   in UDP enables use of the UDP source port to provide entropy to ECMP
197	   hashing. A material difference between GUE and GRE-in-UDP is that the
198	   payload of GUE is always an IP protocol whereas the payload in GRE-
199	   in-UDP may be a non-IP protocol; this distinction is pertinent in the
200	   discussion of congestion considerations (section 5.9) since IP
201	   protocols are generally assumed to be congestion controlled.

203	   In addition, GUE enables extending the use of atypical IP protocols
204	   (those other than TCP and UDP) across networks that might otherwise
205	   filter packets carrying those protocols. GUE may also be used with
206	   connection oriented UDP semantics in order to facilitate traversal
207	   through stateful firewalls and stateful NAT.

209	   Additional motivation for the GUE protocol is provided in section 6.

211	1.2. Terminology and acronyms

213	   GUE              Generic UDP Encapsulation

215	   GUE Header       A variable length protocol header that is composed
216	                    of a primary four byte header and zero or more four
217	                    byte words of optional header data

219	   GUE packet       A UDP/IP packet that contains a GUE header and GUE
220	                    payload within the UDP payload

222	   GUE variant      A version of the GUE protocol or an alternate form
223	                    of a version

225	   Encapsulator     A network node that encapsulates packets in GUE

227	   Decapsulator     A network node that decapsulates and processes
228	                    packets encapsulated in GUE

230	   Data message     An encapsulated packet in a GUE payload that is
231	                    addressed to the protocol stack for an associated
232	                    protocol

234	   Control message  A formatted message in the GUE payload that is
235	                    implicitly addressed to the decapsulator to monitor
236	                    or control the state or behavior of a tunnel

238	   Flags            A set of bit flags in the primary GUE header

239	   Extension field  An optional field in a GUE header whose presence is
240	                    indicated by corresponding flag(s)

242	   C-bit            A single bit flag in the primary GUE header that
243	                    indicates whether the GUE packet contains a control
244	                    message or data message

246	   Hlen             A field in the primary GUE header that gives the
247	                    length of the GUE header

249	   Proto/ctype      A field in the GUE header that holds either the IP
250	                    protocol number for a data message or a type for a
251	                    control message

253	   Outer IP header  Refers to the outermost IP header or packet when
254	                    encapsulating a packet over IP

256	   Inner IP header  Refers to an encapsulated IP header when an IP
257	                    packet is encapsulated

259	   Outer packet     Refers to an encapsulating packet

261	   Inner packet     Refers to a packet that is encapsulated

263	   TMCE             A traffic-managed controlled environment, i.e., an
264	                    IP network that is traffic-engineered and/or
265	                    otherwise managed

267	1.3. Requirements Language

269	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
270	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
271	   document are to be interpreted as described in [RFC2119].

273	2. Base packet format

275	   A GUE packet consists of a UDP packet whose payload is a GUE
276	   header followed by a payload which is either an encapsulated packet
277	   of some IP protocol or a control message such as an OAM (Operations,
278	   Administration, and Management) message. A GUE packet has the general
279	   format:

281	   +-------------------------------+
282	   |                               |
283	   |        UDP/IP header          |
284	   |                               |
285	   |-------------------------------|
286	   |                               |
287	   |         GUE Header            |
288	   |                               |
289	   |-------------------------------|
290	   |                               |
291	   |      Encapsulated packet      |
292	   |      or control message       |
293	   |                               |
294	   +-------------------------------+

296	   The GUE header is variable length as determined by the presence of
297	   optional extension fields.

299	2.1. GUE variant

301	   The first two bits of the GUE header contain the GUE protocol variant
302	   number. The variant number can indicate the version of the GUE
303	   protocol as well as alternate forms of a version.

305	   Variants 0 and 1 are described in this specification; variants 2 and
306	   3 are reserved.

308	3. Variant 0

310	   Variant 0 indicates version 0 of GUE. This variant defines a generic
311	   extensible format to encapsulate packets by Internet protocol number.

313	3.1. Header format

315	   The header format for variant 0 of GUE in UDP is:

317	    0                   1                   2                   3
318	    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
319	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+\
320	   |        Source port            |      Destination port         | |
321	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ UDP
322	   |           Length              |          Checksum             | |
323	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+/
324	   | 0 |C|   Hlen  |  Proto/ctype  |             Flags             |\
325	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
326	   |                                                               | GUE
327	   ~                  Extension Fields (optional)                  ~ |
328	   |                                                               | |
329	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+/

331	   The contents of the UDP header are:

333	      o Source port: If connection semantics (section 5.6.1) are applied
334	        to an encapsulation, this is set to the local source port for
335	        the connection. When connection semantics are not applied, the
336	        source port is either set to a flow entropy value, as described
337	        in section 5.11, or is set to the GUE assigned port number,
338	        6080.

340	      o Destination port: If connection semantics (section 5.6.1) are
341	        applied to an encapsulation, this is set to the destination port
342	        for the tuple. If connection semantics are not applied then the
343	        destination port is set to the GUE assigned port number, 6080.

345	      o Length: Canonical length of the UDP packet (length of UDP header
346	        and payload).

348	      o Checksum: Standard UDP checksum (handling is described in
349	        section 5.8).

351	   The GUE header consists of:

353	      o Variant: 0 indicates GUE protocol version 0 with a header.

355	      o C: C-bit. When set, indicates a control message. When not set,
356	        indicates a data message.

358	      o Hlen: Length in 32-bit words of the GUE header, including
359	        optional extension fields but not the first four bytes of the
360	        header. Computed as (header_len - 4) / 4, where header_len is
361	        the total header length in bytes. All GUE headers are a multiple
362	        of four bytes in length. Maximum header length is 128 bytes.

364	      o Proto/ctype: When the C-bit is set, this field contains a
365	        control message type for the payload (section 3.2.2). When the
366	        C-bit is not set, the field holds the Internet protocol number
367	        for the encapsulated packet in the payload (section 3.2.1). The
368	        control message or encapsulated packet begins at the offset
369	        provided by Hlen.

371	      o Flags: Header flags that may be allocated for various purposes
372	        and may indicate the presence of extension fields. Undefined
373	        header flag bits MUST be set to zero on transmission.

375	      o Extension Fields: Optional fields whose presence is indicated by
376	        corresponding flags.
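
   The field layout above can be sketched in code. The following is a
   minimal, non-normative Python sketch of decoding the four-byte
   primary header; the names GuePrimaryHeader and
   parse_gue_primary_header are illustrative assumptions and are not
   defined by this specification.

```python
import struct
from typing import NamedTuple


class GuePrimaryHeader(NamedTuple):
    """Hypothetical container for the decoded primary header fields."""
    variant: int      # 2 bits; 0 for this format
    c_bit: bool       # 1 bit; True means control message
    hlen: int         # 5 bits; 32-bit words following the first four bytes
    proto_ctype: int  # 8 bits; IP protocol number or control message type
    flags: int        # 16 bits
    header_len: int   # derived: total GUE header length in bytes


def parse_gue_primary_header(udp_payload: bytes) -> GuePrimaryHeader:
    """Decode the first four bytes of a variant 0 GUE header."""
    if len(udp_payload) < 4:
        raise ValueError("UDP payload is shorter than the primary GUE header")
    # Network byte order: one byte of variant/C/Hlen, one byte of
    # proto/ctype, then sixteen bits of flags.
    first, proto_ctype, flags = struct.unpack("!BBH", udp_payload[:4])
    variant = first >> 6
    if variant != 0:
        raise ValueError(f"expected GUE variant 0, got {variant}")
    c_bit = bool(first & 0x20)
    hlen = first & 0x1F
    header_len = 4 + 4 * hlen  # inverse of Hlen = (header_len - 4) / 4
    if len(udp_payload) < header_len:
        raise ValueError("UDP payload is shorter than the length given by Hlen")
    return GuePrimaryHeader(variant, c_bit, hlen, proto_ctype, flags, header_len)
```

   In this sketch the encapsulated packet or control message begins at
   udp_payload[header_len:]. With the five-bit Hlen at its largest
   value of 31, header_len evaluates to 128 bytes, which agrees with
   the maximum header length stated above.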

378	3.2. Proto/ctype field

380	   The proto/ctype field contains either an Internet protocol number
381	   (when the C-bit is not set) or a GUE control message type (when the
382	   C-bit is set).

384	3.2.1. Proto field

386	   When the C-bit is not set, the proto/ctype field MUST contain an IANA
387	   Internet Protocol Number [IANA-PN]. The protocol number is
388	   interpreted relative to the IP protocol that encapsulates the UDP
389	   packet (i.e. protocol of the outer IP header). The protocol number
390	   serves as an indication of the type of the next protocol header which
391	   is contained in the GUE payload at the offset indicated in Hlen.

393	   IP protocol number 59 ("No next header") can be set to indicate that
394	   the GUE payload does not begin with the header of an IP protocol.
395	   This would be the case, for instance, if the GUE payload were a
396	   fragment when performing GUE level fragmentation. The interpretation
397	   of the payload is performed through other means such as flags and
398	   extension fields, and nodes MUST NOT parse packets based on the IP
399	   protocol number in this case.

401	3.2.2. Ctype field

403	   When the C-bit is set, the proto/ctype field MUST be set to a valid
404	   control message type. Control messages will be defined in an IANA
405	   registry. Types 0 and 255 are specified in this document; types 1
406	   through 254 are reserved and may be defined in standards.

408	   Type 0 indicates that the GUE payload is a control message, or part
409	   of a control message that cannot be correctly parsed or interpreted
410	   without additional context. This might be the case when the payload
411	   is a fragment of a control message, where only the reassembled packet
412	   can be interpreted as a control message.

414	   Type 255 is reserved for experimentation. When this control type is
415	   set the first four bytes of the GUE payload (control message) are an
416	   experiment identifier (ExID). The ExID is used to differentiate
417	   experiments (similar to the experimental identifier defined for TCP
418	   options in [RFC6994]). A control message of type 255 MUST include an
419	   ExID.

421	   The format of a GUE control message with the experimental control
422	   message type is:

424	    0                   1                   2                   3
425	    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
426	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+\
427	   |        Source port            |      Destination port         | |
428	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ UDP
429	   |           Length              |          Checksum             | |
430	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+/
431	   | 0 |1|   Hlen  |      255      |             Flags             |\
432	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
433	   |                                                               | GUE
434	   ~                  Extension Fields (optional)                  ~ |
435	   |                                                               | |
436	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+/
437	   |                              ExID                             |
438	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
439	   |                                                               |
440	   ~                        Control message                        ~
441	   |                                                               |
442	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

444	   Note that the ExID is not part of the GUE header; it is in the
445	   payload. In particular, the ExID is not accounted for in the GUE
446	   Hlen.

448	   ExIDs are selected at design time, when the protocol designer first
449	   implements or specifies the experimental control message. An ExID is
450	   thirty-two bits. The value is stored in the payload in network-
451	   standard (big-endian) byte order.

453	   ExIDs are registered with IANA using "first come, first served"
454	   (FCFS) priority. ExIDs MUST be unique.
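
   As an illustration of where the ExID sits, the following
   non-normative Python sketch separates the ExID from the rest of an
   experimental control message. The function name
   split_experimental_control is a hypothetical helper, not something
   this specification defines.

```python
import struct
from typing import Tuple

CTYPE_EXPERIMENTAL = 255  # control message type reserved for experimentation


def split_experimental_control(udp_payload: bytes) -> Tuple[int, bytes]:
    """Return (ExID, remaining control message) for a type 255 message."""
    if len(udp_payload) < 4:
        raise ValueError("UDP payload is shorter than the primary GUE header")
    first, ctype = struct.unpack("!BB", udp_payload[:2])
    if first >> 6 != 0:
        raise ValueError("not a GUE variant 0 packet")
    if not first & 0x20:
        raise ValueError("C-bit is not set; this is a data message")
    if ctype != CTYPE_EXPERIMENTAL:
        raise ValueError("control message is not of the experimental type")
    # Hlen covers the GUE header only; the ExID is the first four bytes
    # of the payload that follows it.
    header_len = 4 + 4 * (first & 0x1F)
    if len(udp_payload) < header_len + 4:
        raise ValueError("experimental control message lacks an ExID")
    (exid,) = struct.unpack_from("!I", udp_payload, header_len)
    return exid, udp_payload[header_len + 4:]
```

   The "!I" format reads the ExID as an unsigned thirty-two bit value
   in big-endian order, as the text above requires.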

456	3.3. Flags and extension fields

458	   Flags and associated extension fields are the primary mechanism of
459	   extensibility in GUE. As mentioned in section 3.1, GUE header flags
460	   indicate the presence of optional extension fields in the GUE header.
461	   [GUEEXTEN] defines an initial set of GUE extensions.

463	3.3.1. Requirements

465	   There are sixteen flag bits in the GUE header. Flags may indicate
466	   presence of extension fields. The size of an extension field
467	   indicated by a flag MUST be fixed in the specification of the flag.

469	   Flags can be grouped together to allow different lengths for an
470	   extension field. For example, if two flag bits are grouped, a field
471	   can have one of three different lengths: a bit value of 00 indicates
472	   no field present, while 01, 10, and 11 indicate the three possible
473	   lengths for the field. Regardless of how flag bits are grouped, the
474	   lengths and offsets of extension fields corresponding to a set of
475	   flags MUST be well defined and deterministic.

477	   Extension fields are placed in order of the flags. New flags are to
478	   be allocated from high to low order bit contiguously without holes.
479	   Flags allow random access: for instance, to inspect the field
480	   corresponding to the Nth flag bit, an implementation considers only
481	   the previous N-1 flags to determine the offset. Flags after the Nth
482	   flag are not pertinent in calculating the offset of the field for the
483	   Nth flag. Random access of flags and fields permits processing of
484	   optional extensions in an order that is independent of their position
485	   in the packet.
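
   The offset calculation described above can be sketched as follows.
   This is a non-normative Python sketch; FlagGroup, FLAG_GROUPS, and
   extension_offset are hypothetical names. The table contains only
   the two flag assignments that section 3.3.2 uses as its example (a
   32-bit Group Identifier on the first flag bit, and a Security field
   on the next three bits where 001 means sixty-four bits); sizes for
   other Security values are deliberately left out rather than guessed.

```python
from typing import Dict, List, NamedTuple


class FlagGroup(NamedTuple):
    name: str
    mask: int               # bits of the 16-bit flags field in this group
    shift: int              # right shift that aligns the group value
    sizes: Dict[int, int]   # group value -> extension field length in bytes


# Listed from the high order flag bit downward, which is also the order
# in which the extension fields appear in the header.
FLAG_GROUPS: List[FlagGroup] = [
    FlagGroup("group_id", 0x8000, 15, {1: 4}),
    FlagGroup("security", 0x7000, 12, {1: 8}),
]


def extension_offset(flags: int, index: int) -> int:
    """Byte offset, from the start of the GUE header, of one field."""
    target = FLAG_GROUPS[index]
    if flags & target.mask == 0:
        raise KeyError(f"{target.name} field is not present")
    offset = 4  # extension fields start right after the primary header
    # Only the groups before the target contribute to its offset.
    for group in FLAG_GROUPS[:index]:
        value = (flags & group.mask) >> group.shift
        if value:
            offset += group.sizes[value]
    return offset
```

   With flags 0x9000 (both fields present) the Security field is found
   at offset 8; with flags 0x1000 (Security only) it is found at
   offset 4. Flags that come after the target group never enter the
   calculation.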

487	   Flags (or grouped flags) are idempotent such that new flags MUST NOT
488	   cause reinterpretation of old flags. Also, new flags MUST NOT alter
489	   interpretation of other elements in the GUE header nor how the
490	   message is parsed (for instance, in a data message the proto/ctype
491	   field always holds an IP protocol number as an invariant).

493	   The set of available flags can be extended in the future by defining
494	   a "flag extensions bit" that refers to a field containing a new set
495	   of flags.

497	3.3.2. Example GUE header with extension fields

499	   An example GUE header for a data message encapsulating an IPv4 packet
500	   and containing the Group Identifier and Security extension fields
501	   (both defined in [GUEEXTEN]) is shown below:

503	    0                   1                   2                   3
504	    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
505	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
506	   | 0 |0|    3    |       4       |1|0 0 1|          0            |
507	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
508	   |                        Group Identifier                       |
509	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
510	   |                                                               |
511	   +                           Security                            +
512	   |                                                               |
513	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

515	   In the above example, the first flag bit is set, which indicates that
516	   the Group Identifier extension, a 32-bit field, is present.
517	   The second through fourth bits of the flags are grouped flags that
518	   indicate the presence of a Security field with seven possible sizes.
519	   In this example 001 indicates a sixty-four bit security field.
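
   The first word of the example header can be reproduced with a short
   calculation. The following non-normative Python sketch builds the
   header from its parts; build_gue_header is a hypothetical helper,
   and the Group Identifier and Security values are placeholders
   chosen only to give the fields the right lengths.

```python
import struct


def build_gue_header(c_bit: bool, proto_ctype: int, flags: int,
                     extensions: bytes = b"") -> bytes:
    """Assemble a variant 0 GUE header followed by its extension fields."""
    if len(extensions) % 4:
        raise ValueError("extension fields must total a multiple of four bytes")
    hlen = len(extensions) // 4  # (header_len - 4) / 4
    if hlen > 31:
        raise ValueError("header would exceed the 128 byte maximum")
    first = (0 << 6) | (int(c_bit) << 5) | hlen  # variant 0, C-bit, Hlen
    return struct.pack("!BBH", first, proto_ctype, flags) + extensions


# Placeholder field contents: 4 bytes of Group Identifier, 8 of Security.
group_id = struct.pack("!I", 42)
security = bytes(8)
# Flags: first bit set (Group Identifier), next three bits 001 (Security).
flags = (1 << 15) | (0b001 << 12)
header = build_gue_header(False, 4, flags, group_id + security)
print(header[:4].hex())  # 03049000
print(len(header))       # 16
```

   The twelve bytes of extension fields give Hlen = 3, the proto/ctype
   value 4 identifies an encapsulated IPv4 packet, and the flags field
   is 0x9000, matching the diagram.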

521	3.4. Surplus space

523	   The length of a GUE header, as indicated in the GUE Hlen field, may
524	   exceed the space consumed by optional extensions in a packet. The
525	   space between the end of the last optional field and the end of the
526	   header is termed the "surplus space".

528	   Surplus space is reserved by this specification; uses for it may be
529	   defined in future specifications. If a node receives a GUE packet
530	   with non-zero length of surplus space then it MUST NOT attempt to
531	   interpret the data in the surplus space. For purposes of transforms
532	   across the header, such as optional integrity check over the header,
533	   the surplus space is considered to be part of the GUE header and
534	   would be included in computation.

536	3.5. Message types

538	   There are two message types in GUE variant 0: control messages and
539	   data messages.

541	3.5.1. Control messages

543	   Control messages carry formatted data that are implicitly addressed
544	   to the decapsulator to monitor or control the state or behavior of a
545	   tunnel (OAM). For instance, an echo request and corresponding echo
546	   reply message can be defined to test for liveness.

548	   Control messages are indicated in the GUE header when the C-bit is
549	   set. The payload is interpreted as a control message with type
550	   specified in the proto/ctype field. The format and contents of the
551	   control message are indicated by the type and can be variable length.

553	   Other than interpreting the proto/ctype field as a control message
554	   type, the meaning and semantics of the rest of the elements in the
555	   GUE header are the same as that of data messages. Forwarding and
556	   routing of control messages should be the same as that of a data
557	   message with the same outer IP and UDP header; this ensures that
558	   control messages can be created that follow the same path through the
559	   network as data messages.

561	3.5.2. Data messages

563	   Data messages carry encapsulated packets that are addressed to the
564	   protocol stack for the associated protocol. Data messages are a
565	   primary means of encapsulation and can be used to create tunnels for
566	   overlay networks.

568	   Data messages are indicated in the GUE header when the C-bit is not
569	   set. The payload of a data message is interpreted as an encapsulated
570	   packet of an Internet protocol indicated in the proto/ctype field.
571	   The encapsulated packet immediately follows the GUE header.
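
As a rough sketch, a receiver could distinguish the two message types while parsing the first four bytes of the header. The bit layout used here (2-bit variant, C-bit, 5-bit Hlen, 8-bit proto/ctype, 16-bit flags) is assumed from section 3.1, and the type and function names are hypothetical.

```python
import struct
from typing import NamedTuple


class GueBaseHeader(NamedTuple):
    variant: int
    is_control: bool   # True when the C-bit is set
    hlen: int
    proto_ctype: int   # IP protocol number (data) or control type (control)
    flags: int


def parse_gue_base_header(data: bytes) -> GueBaseHeader:
    """Parse the first four bytes of a GUE variant 0 header."""
    if len(data) < 4:
        raise ValueError("truncated GUE header")
    first, proto_ctype, flags = struct.unpack("!BBH", data[:4])
    return GueBaseHeader(
        variant=first >> 6,
        is_control=bool(first & 0x20),
        hlen=first & 0x1F,
        proto_ctype=proto_ctype,
        flags=flags,
    )
```

When is_control is False, proto_ctype is read as an IP protocol number. Otherwise it is read as a control message type.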

573	4. Variant 1

575	   Variant 1 of GUE allows direct encapsulation of IPv4 and IPv6 in UDP.
576	   In this variant there is no GUE header; a UDP packet carries an IP
577	   packet directly. The first two bits of the UDP payload are the GUE
578	   variant field and coincide with the first two bits of the version
579	   number in the IP header. The first two version bits of both IPv4
580	   (0100) and IPv6 (0110) are 01, so GUE variant 1 is used for direct IP
581	   encapsulation and the two GUE variant bits are also 01.

583	   This technique is effectively a means to compress out the GUE variant
584	   0 header when encapsulating IPv4 or IPv6 packets that need no flags
585	   or extension fields. This method can be used on the
586	   same port number as packets with the GUE header (GUE variant 0
587	   packets). This technique saves encapsulation overhead on costly links
588	   for the common use of IP encapsulation, and also obviates the need to
589	   allocate a separate UDP port number for IP-over-UDP encapsulation.

591	4.1. Direct encapsulation of IPv4

593	   The format for encapsulating IPv4 directly in UDP is:

595	    0                   1                   2                   3
596	    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
597	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+\
598	   |        Source port            |      Destination port         | |
599	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ UDP
600	   |           Length              |          Checksum             | |
601	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+/
602	   |0|1|0|0|  IHL  |Type of Service|          Total Length         |
603	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
604	   |         Identification        |Flags|      Fragment Offset    |
605	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
606	   |  Time to Live |   Protocol    |   Header Checksum             |
607	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
608	   |                       Source IPv4 Address                     |
609	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
610	   |                     Destination IPv4 Address                  |
611	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

613	   The UDP fields are set in a similar manner as described in section
614	   3.1.

616	   Note that the 0100 value in the first four bits of the UDP payload
617	   expresses both the GUE variant as 1 (bits 01) and IP version as 4
618	   (bits 0100).

620	4.2. Direct encapsulation of IPv6

622	   The format for encapsulating IPv6 directly in UDP is demonstrated
623	   below:

625	    0                   1                   2                   3
626	    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
627	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+\
628	   |        Source port            |      Destination port         | |
629	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ UDP
630	   |           Length              |          Checksum             | |
631	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+/
632	   |0|1|1|0| Traffic Class |           Flow Label                  |
633	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
634	   |         Payload Length        |     NextHdr   |   Hop Limit   |
635	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
636	   |                                                               |
637	   +                                                               +
638	   |                                                               |
639	   +                        Source IPv6 Address                    +
640	   |                                                               |
641	   +                                                               +
642	   |                                                               |
643	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
644	   |                                                               |
645	   +                                                               +
646	   |                                                               |
647	   +                      Destination IPv6 Address                 +
648	   |                                                               |
649	   +                                                               +
650	   |                                                               |
651	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

653	   The UDP fields are set in a similar manner as described in section
654	   3.1.

656	   Note that the 0110 value in the first four bits of the UDP
657	   payload expresses both the GUE variant as 1 (bits 01) and IP version
658	   as 6 (bits 0110).
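
The way the first bits of the UDP payload select between a GUE header and a directly encapsulated IP packet can be sketched as below. This is an illustrative demultiplexer only. The function name and the returned labels are hypothetical.

```python
def classify_gue_payload(udp_payload: bytes) -> str:
    """Classify a UDP payload received on the GUE port by its first bits."""
    if not udp_payload:
        raise ValueError("empty UDP payload")
    first = udp_payload[0]
    variant = first >> 6          # top two bits: GUE variant
    if variant == 0:
        return "gue-variant-0"    # a GUE header follows
    if variant == 1:
        ip_version = first >> 4   # top four bits: IP version
        if ip_version == 4:
            return "ipv4"         # bits 0100
        if ip_version == 6:
            return "ipv6"         # bits 0110
    # Unsupported variant, or variant 1 with a non-IP version nibble.
    raise ValueError("unsupported GUE variant or IP version")
```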

660	5. Operation

662	   The figure below illustrates the use of GUE encapsulation between two
663	   hosts. Host 1 is sending packets to Host 2. An encapsulator performs
664	   encapsulation of packets from Host 1. These encapsulated packets
665	   traverse the network as UDP packets. At the decapsulator, packets are
666	   decapsulated and sent on to Host 2. Packet flow in the reverse
667	   direction need not be symmetric; for example, the reverse path might
668	   not use GUE or any other form of encapsulation.

670	   +---------------+                       +---------------+
671	   |               |                       |               |
672	   |    Host 1     |                       |     Host 2    |
673	   |               |                       |               |
674	   +---------------+                       +---------------+
675	          |                                        ^
676	          V                                        |
677	   +---------------+   +---------------+   +---------------+
678	   |               |   |               |   |               |
679	   | Encapsulator  |-->|    Layer 3    |-->| Decapsulator  |
680	   |               |   |    Network    |   |               |
681	   +---------------+   +---------------+   +---------------+

683	   The encapsulator and decapsulator may be co-resident with the
684	   corresponding hosts, or may be on separate nodes in the network.

686	5.1. Network tunnel encapsulation

688	   Network tunneling can be achieved by encapsulating layer 2 or layer 3
689	   packets. In this case, the encapsulator and decapsulator nodes are
690	   the tunnel endpoints. These could be routers that provide network
691	   tunnels on behalf of communicating hosts.

693	5.2. Transport layer encapsulation

695	   When encapsulating layer 4 packets, the encapsulator and decapsulator
696	   should be co-resident with the hosts. In this case, the encapsulation
697	   headers are inserted between the IP header and the transport packet.
698	   The addresses in the IP header refer to both the endpoints of the
699	   encapsulation and the endpoints for terminating the encapsulated
700	   transport protocol. Note that the transport layer ports in the
701	   encapsulated packet are independent of the UDP ports in the outer
702	   packet.

704	5.3. Encapsulator operation

706	   Encapsulators create GUE data messages, set the fields of the UDP
707	   header, set flags and optional extension fields in the GUE header,
708	   and forward packets to a decapsulator.

710	   An encapsulator can be an end host originating the packets of a flow,
711	   or can be a network device performing encapsulation on behalf of
712	   hosts (routers implementing tunnels for instance). In either case,
713	   the intended target (decapsulator) is indicated by the outer
714	   destination IP address and destination port in the UDP header.

716	   If an encapsulator is tunneling packets, that is encapsulating
717	   packets of layer 2 or layer 3 protocols (e.g. EtherIP, IPIP, ESP
718	   tunnel mode), it SHOULD follow standard conventions for tunneling one
719	   protocol over another. For instance, if an IP packet is being
720	   encapsulated in GUE then diffserv interaction [RFC2983] and ECN
721	   propagation for tunnels [RFC6040] SHOULD be followed.

723	5.4. Decapsulator operation

725	   A decapsulator performs decapsulation of GUE packets. A decapsulator
726	   is addressed by the outer destination IP address and UDP destination
727	   port of a GUE packet. The decapsulator validates packets, including
728	   fields of the GUE header.

730	   If a decapsulator receives a GUE packet with an unsupported variant,
731	   unknown flag, bad header length (too small for included extension
732	   fields), unknown control message type, bad protocol number, an
733	   unsupported payload type, or an otherwise malformed header, it MUST
734	   drop the packet. Such events MAY be logged subject to configuration
735	   and rate limiting of logging messages. Note that set flags in a GUE
736	   header that are unknown to a decapsulator MUST NOT be ignored. If a
737	   GUE packet is received by a decapsulator with unknown flags, the
738	   packet MUST be dropped.
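
A minimal sketch of these drop rules for variant 0 packets follows. It covers only the checks that can be made from the first four header bytes (variant, unknown flags, header length against the UDP payload). The header layout is assumed from section 3.1, and the function name and the known_flags parameter are hypothetical. Variant 1 packets would be handled on a separate path.

```python
import struct


def must_drop(udp_payload: bytes, known_flags: int) -> bool:
    """Return True if a GUE variant 0 packet fails basic header checks.

    known_flags -- bit mask of the flag bits this decapsulator understands
    """
    if len(udp_payload) < 4:
        return True                      # malformed: no room for base header
    first, _proto_ctype, flags = struct.unpack("!BBH", udp_payload[:4])
    if first >> 6 != 0:
        return True                      # variant not handled by this sketch
    if flags & ~known_flags & 0xFFFF:
        return True                      # unknown flags MUST NOT be ignored
    header_len = 4 + (first & 0x1F) * 4
    if header_len > len(udp_payload):
        return True                      # header extends past the UDP payload
    return False
```

A complete decapsulator would also check that Hlen is large enough for the extension fields selected by the flags, and would validate the proto/ctype value.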

740	5.4.1. Processing a received data message

742	   If a valid data message is received, the UDP header and GUE header
743	   are (logically) removed from the packet. The outer IP header remains
744	   intact and the next protocol in the IP header is set to the protocol
745	   from the proto field in the GUE header. The resulting packet is then
746	   resubmitted into the protocol stack to process the packet as though
747	   it was received with the protocol indicated in the GUE header.

749	   As an example, consider that a data message is received where GUE
750	   encapsulates an IPv4 packet using GUE variant 0. In this case the
751	   proto field in the GUE header is set to 4 for IPv4 encapsulation:

753	   +-------------------------------------+
754	   |   IP header (next proto = 17,UDP)   |
755	   |-------------------------------------|
756	   |                  UDP                |
757	   |-------------------------------------|
758	   |  GUE (proto = 4,IPv4 encapsulation) |
759	   |-------------------------------------|
760	   |        IPv4 header and packet       |
761	   +-------------------------------------+

763	   The receiver removes the UDP and GUE headers and sets the next
764	   protocol field in the IP packet to 4, which is derived from the GUE
765	   proto field. The resultant packet would have the format:

767	   +-------------------------------------+
768	   |   IP header (next proto = 4,IPv4)   |
769	   |-------------------------------------|
770	   |        IPv4 header and packet       |
771	   +-------------------------------------+

773	   This packet is then resubmitted into the protocol stack to be
774	   processed as an IPv4 encapsulated packet.
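
The logical removal of the GUE header can be sketched as follows. This hedged example starts from the UDP payload (the UDP header is assumed to have been stripped already) and returns the protocol number together with the encapsulated packet, leaving resubmission to the protocol stack out of scope. The header layout is assumed from section 3.1 and the function name is hypothetical.

```python
import struct
from typing import Tuple


def decapsulate_data_message(udp_payload: bytes) -> Tuple[int, bytes]:
    """Return (proto, inner_packet) for a GUE variant 0 data message."""
    if len(udp_payload) < 4:
        raise ValueError("truncated GUE header")
    first, proto, _flags = struct.unpack("!BBH", udp_payload[:4])
    if first >> 6 != 0:
        raise ValueError("not a GUE variant 0 packet")
    if first & 0x20:
        raise ValueError("C-bit set: control message, not data")
    header_len = 4 + (first & 0x1F) * 4
    if header_len > len(udp_payload):
        raise ValueError("GUE header longer than UDP payload")
    # The encapsulated packet immediately follows the GUE header.
    return proto, udp_payload[header_len:]
```

For the example above, the function would return protocol number 4 and the bytes of the inner IPv4 packet.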

776	5.4.2. Processing a received control message

778	   If a valid control message is received, the packet MUST be processed
779	   as a control message. The specific processing to be performed depends
780	   on the value in the ctype field of the GUE header.

782	   If an experimental control message is received (ctype is 255) then
783	   the ExID MUST be processed. The ExID is used to identify the
784	   particular experimental control message.

786	   If a receiver does not recognize a control message type, or an
787	   experimental identifier in an experimental control message, then the
788	   packet MUST be dropped and an error message MAY be logged. If a GUE
789	   control message is received with control type 255 and the length of
790	   the GUE payload is less than four bytes, the size of the ExID, then
791	   the packet MUST be dropped and an error message MAY be logged.
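
These rules could be sketched as a small dispatcher. The example assumes that the ExID occupies the first four bytes of the GUE payload in network byte order, and that handlers are registered in plain dictionaries. The handler tables and the function name are hypothetical.

```python
import struct
from typing import Callable, Dict

EXPERIMENTAL_CTYPE = 255
EXID_LEN = 4  # size of the ExID in bytes

Handler = Callable[[bytes], None]


def process_control_message(ctype: int, payload: bytes,
                            handlers: Dict[int, Handler],
                            exp_handlers: Dict[int, Handler]) -> bool:
    """Return True if processed, False if the packet must be dropped."""
    if ctype == EXPERIMENTAL_CTYPE:
        if len(payload) < EXID_LEN:
            return False                     # too short to carry an ExID
        (exid,) = struct.unpack("!I", payload[:EXID_LEN])
        handler = exp_handlers.get(exid)     # unknown ExID -> None
        body = payload[EXID_LEN:]
    else:
        handler = handlers.get(ctype)        # unknown ctype -> None
        body = payload
    if handler is None:
        return False
    handler(body)
    return True
```

A caller would drop the packet, and optionally log an error, whenever the function returns False.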

793	5.5. Middlebox inspection

795	   A middlebox MAY inspect a GUE header. A middlebox MUST NOT modify a
796	   GUE header or UDP payload.

798	   To inspect a GUE header, a middlebox needs to identify GUE packets.
799	   The obvious method is to match the destination UDP port number to be
800	   the GUE port number (i.e. 6080). Per [RFC7605], transport port
801	   numbers only have meaning at the endpoints of communications, so
802	   inferring the type of a UDP payload based on port number may be
803	   incorrect. Middleboxes MUST NOT take any action that would have
804	   harmful side effects if a UDP packet were misinterpreted as being a
805	   GUE packet. In particular, a middlebox MUST NOT modify a UDP payload
806	   based on inferring the payload type from the port number, since
807	   doing so could cause silent data corruption.

809	   A middlebox MAY interpret some flags and extension fields of the GUE
810	   header for classification purposes, but is not required to understand
811	   any of the flags or extension fields in GUE packets. A middlebox MUST
812	   NOT drop a GUE packet merely because there are flags unknown to it.
813	   Similarly, a middlebox MUST NOT arbitrarily filter packets based on
814	   GUE flags or extension fields that are present or not present. The
815	   header length in the GUE header allows a middlebox to inspect the
816	   payload packet without needing to parse the flags or extension
817	   fields.

819	5.6. Router and switch operation

821	   Routers and switches SHOULD forward GUE packets as standard UDP/IP
822	   packets. The outer five-tuple should contain sufficient information
823	   to perform flow classification corresponding to the flow of the inner
824	   packet. A router does not normally need to parse a GUE header, and
825	   none of the flags or extension fields in the GUE header are expected
826	   to affect routing. In cases where the outer five-tuple does not
827	   provide sufficient entropy for flow classification, for instance UDP
828	   ports are fixed to provide connection semantics (section 5.6.1), then
829	   the encapsulated packet MAY be parsed to determine flow entropy.

831	   A router MUST NOT modify a GUE header or payload when forwarding a
832	   packet. It MAY encapsulate a GUE packet in another GUE packet, for
833	   instance to implement a network tunnel (i.e. by encapsulating an IP
834	   packet with a GUE payload in another IP packet as a GUE payload). In
835	   this case, the router takes the role of an encapsulator, and the
836	   corresponding decapsulator is the logical endpoint of the tunnel.
837	   When encapsulating a GUE packet within another GUE packet, there are
838	   no provisions to automatically copy flags or fields to the outer GUE
839	   header. Each layer of encapsulation is considered independent.

841	5.6.1. Connection semantics

843	   A middlebox might infer bidirectional connection semantics for a UDP
844	   flow. For instance, a stateful firewall might create a five-tuple
845	   rule to match flows on egress, and a corresponding five-tuple rule
846	   for matching ingress packets where the roles of source and
847	   destination are reversed for the IP addresses and UDP port numbers.
848	   To operate in this environment, a GUE tunnel should be configured to
849	   assume connected semantics defined by the UDP five-tuple and the use
850	   of GUE encapsulation needs to be symmetric between both endpoints.
851	   The source port set in the UDP header MUST be the destination port
852	   the peer would set for replies. In this case, the UDP source port for
853	   a tunnel would be a fixed value and not set to be flow entropy.

855	   The selection of whether to make the UDP source port fixed or set to
856	   a flow entropy value for each packet sent SHOULD be configurable for
857	   a tunnel. The default MUST be to set the flow entropy value in the
858	   UDP source port.

860	5.6.2. NAT

862	   IP address and port translation can be performed on the UDP/IP
863	   headers adhering to the requirements for NAT (Network Address
864	   Translation) with UDP [RFC4787]. In the case of stateful NAT,
865	   connection semantics MUST be applied to a GUE tunnel as described in
866	   section 5.6.1. GUE endpoints MAY also invoke STUN [RFC5389] or ICE
867	   [RFC5245] to manage NAT port mappings for encapsulations.

869	5.7. MTU and fragmentation

871	   Standard conventions for handling of MTU (Maximum Transmission Unit)
872	   and fragmentation in conjunction with networking tunnels
873	   (encapsulation of layer 2 or layer 3 packets) SHOULD be followed.
874	   Details are described in MTU and Fragmentation Issues with In-the-
875	   Network Tunneling [RFC4459].

877	   If a packet is fragmented before encapsulation in GUE, all the
878	   related fragments MUST be encapsulated using the same UDP source
879	   port. An operator SHOULD set MTU to account for encapsulation
880	   overhead and reduce the likelihood of fragmentation.
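
The encapsulation overhead can be estimated with simple arithmetic. The sketch below assumes an outer IPv4 header without options (20 bytes) or an outer IPv6 header without extension headers (40 bytes), an 8-byte UDP header, and a GUE variant 0 header of 4 + 4 * Hlen bytes (no GUE header for variant 1). The function and parameter names are hypothetical.

```python
UDP_HEADER_LEN = 8


def inner_mtu(link_mtu: int, outer_ipv6: bool = False,
              gue_hlen: int = 0, variant1: bool = False) -> int:
    """Largest inner packet that fits in link_mtu after GUE encapsulation."""
    # Assumes no IPv4 options and no IPv6 extension headers on the outer packet.
    outer_ip_len = 40 if outer_ipv6 else 20
    # Variant 1 carries the IP packet directly, so there is no GUE header.
    gue_len = 0 if variant1 else 4 + gue_hlen * 4
    return link_mtu - outer_ip_len - UDP_HEADER_LEN - gue_len
```

With a 1500-byte link MTU, an outer IPv4 header, and a GUE header without extension fields, the inner packet can be at most 1468 bytes.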

882	   As an alternative to IP fragmentation, the GUE fragmentation
883	   extension can be used. GUE fragmentation is described in [GUEEXTEN].

885	5.8. UDP Checksum Handling

887	5.8.1. UDP Checksum with IPv4

889	   For UDP in IPv4, when a non-zero UDP checksum is used, the UDP
890	   checksum MUST be processed as specified in [RFC0768] and [RFC1122]
891	   for both transmit and receive. The IPv4 header includes a checksum
892	   that protects against misdelivery of the packet due to corruption of
893	   IP addresses. The UDP checksum potentially provides protection
894	   against corruption of the UDP header, GUE header, and GUE payload.
895	   Disabling the use of checksums is a deployment consideration that
896	   should take into account the risk and effects of packet corruption.

898	   When a decapsulator receives a packet, the UDP checksum field MUST be
899	   processed. If the UDP checksum is non-zero, the decapsulator MUST
900	   verify the checksum before accepting the packet. By default, a
901	   decapsulator SHOULD accept UDP packets with a zero checksum.  A node
902	   MAY be configured to disallow zero checksums per [RFC1122]; this may
903	   be done selectively, for instance by disallowing zero checksums from
904	   certain hosts that are known to be sending over paths subject to
905	   packet corruption. If verification of a non-zero checksum fails, a
906	   decapsulator lacks the capability to verify a non-zero checksum, or a
907	   packet with a zero checksum is received and the decapsulator is
908	   configured to disallow zero checksums, the packet MUST be dropped and
909	   an event MAY be logged.
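
The receive-side decision can be sketched as below. This is an illustration under stated assumptions: the caller supplies the outer IPv4 source and destination addresses as 4-byte strings and the complete UDP datagram (header plus payload), and the checksum is verified over the standard IPv4 pseudo-header. The function names and the allow_zero parameter are hypothetical.

```python
import struct


def ones_complement_sum(data: bytes) -> int:
    """16-bit one's complement sum of data, padded to an even length."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def accept_udp_ipv4(src: bytes, dst: bytes, udp_datagram: bytes,
                    allow_zero: bool = True) -> bool:
    """Return True if the datagram passes UDP checksum policy for IPv4."""
    if len(udp_datagram) < 8:
        return False
    (checksum,) = struct.unpack("!H", udp_datagram[6:8])
    if checksum == 0:
        # Zero means "no checksum"; accepted by default, configurable.
        return allow_zero
    pseudo = src + dst + struct.pack("!BBH", 0, 17, len(udp_datagram))
    # A valid datagram sums to 0xFFFF when its checksum field is included.
    return ones_complement_sum(pseudo + udp_datagram) == 0xFFFF
```

A decapsulator would drop the packet, and optionally log the event, whenever the function returns False.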

911	5.8.2. UDP Checksum with IPv6

913	   For UDP in IPv6, the UDP checksum MUST be processed as specified in
914	   [RFC0768] and [RFC2460] for both transmit and receive.

916	   When UDP is used over IPv6, the UDP checksum is relied upon to
917	   protect both the IPv6 and UDP headers from corruption. As such, by
918	   default a GUE encapsulator MUST use UDP checksums.

920	   [GUEEXTEN] specifies a GUE checksum option that includes a pseudo
921	   header containing the IP addresses. An encapsulator MAY use zero UDP
922	   checksums if it uses the GUE checksum. A non-zero UDP checksum and
923	   the GUE checksum SHOULD NOT be used simultaneously in a packet since
924	   that would be redundant.

926	   When deployed in a TMCE, a GUE encapsulator MAY be configured to use
927	   UDP zero-checksum mode and no GUE checksum if the traffic-managed
928	   controlled environment or a set of closely cooperating traffic-
929	   managed controlled environments (such as by network operators who
930	   have agreed to work together in order to jointly provide specific
931	   services) meet at least one of the following conditions:

933	      a. It is known (perhaps through knowledge of equipment types and
934	         lower-layer checks) that packet corruption is exceptionally
935	         unlikely and where the operator is willing to take the risk of
936	         undetected packet corruption.

938	      b. It is judged through observational measurements (perhaps of
939	         historic or current traffic flows that use a non-zero checksum)
940	         that the level of packet corruption is tolerably low and where
941	         the operator is willing to take the risk of undetected packet
942	         corruption.

944	      c. Carrying applications that are tolerant of misdelivered or
945	         corrupted packets (perhaps through higher-layer checksum,
946	         validation, and retransmission or transmission redundancy)
947	         where the operator is willing to rely on the applications using
948	         GUE to survive any corrupt packets.

950	   The following requirements apply to encapsulators deployed in a TMCE
951	   that use UDP zero-checksum mode:

953	      a. Use of the UDP checksum with IPv6 MUST be the default
954	         configuration for all communications.

956	      b. The GUE implementation MUST comply with all requirements
957	         specified in Section 4 of [RFC6936] and with requirement 1
958	         specified in Section 5 of [RFC6936].

960	      c. A decapsulator SHOULD only allow the use of UDP zero-checksum
961	         mode for IPv6 on a single received UDP Destination Port,
962	         regardless of the encapsulator. The motivation for this
963	         requirement is possible corruption of the UDP Destination Port,
964	         which may cause packet delivery to the wrong UDP port. If that
965	         other UDP port requires the UDP checksum, the misdelivered
966	         packet will be discarded.

968	      d. It is RECOMMENDED that the UDP zero-checksum mode for IPv6 be
969	         enabled only for certain selected source addresses. The
970	         decapsulator MUST check that the source and destination IPv6
971	         addresses in a received packet are permitted by configuration
972	         to use UDP zero-checksum mode and discard any packet for which
973	         this check fails.

975	      e. The tunnel encapsulator SHOULD use different IPv6 addresses for
976	         each GUE communication (tunnel or transport flow) that uses UDP
977	         zero-checksum mode, regardless of the decapsulator, in order to
978	         strengthen the decapsulator's check of the IPv6 source address
979	         (i.e., the same IPv6 source address SHOULD NOT be used with
980	         more than one IPv6 destination address, independent of whether
981	         that destination address is a unicast or multicast address).
982	         When this is not possible, it is RECOMMENDED to use each source
983	         IPv6 address for as few GUE communications that use UDP zero-
984	         checksum mode as is feasible.

986	      f. When any middlebox exists on the path of GUE communication, it
987	         is RECOMMENDED to use the default mode, i.e., use UDP checksum,
988	         to reduce the chance that the encapsulated packets will be
989	         dropped.

991	      g. Any middlebox that allows the UDP zero-checksum mode for IPv6
992	         MUST comply with requirements 1 and 8-10 in Section 5 of
993	         [RFC6936].

995	      h. Measures SHOULD be taken to prevent IPv6 traffic with zero UDP
996	         checksums from "escaping" to the general Internet; see Section
997	         5.9 for examples of such measures.

999	      i. IPv6 traffic with zero UDP checksums MUST be actively monitored
1000	         for errors by the network operator. For example, the operator
1001	         may monitor Ethernet-layer packet error rates.

1003	      j. If a packet with a non-zero checksum is received, the checksum
1004	         MUST be verified before accepting the packet. This is
1005	         regardless of whether the tunnel encapsulator and decapsulator
1006	         have been configured with UDP zero-checksum mode.

1008	   The above requirements do not change either the requirements
1009	   specified in [RFC8200] as modified by [RFC6935] or the requirements
1010	   specified in [RFC6936].

1012	   The requirement to check the source IPv6 address in addition to the
1013	   destination IPv6 address and the strong recommendation against reuse
1014	   of source IPv6 addresses among GUE communications collectively
1015	   provide some mitigation for the absence of UDP checksum coverage of
1016	   the IPv6 header. A traffic-managed controlled environment that
1017	   satisfies at least one of the three conditions listed at the
1018	   beginning of this section provides additional assurance.

1020	   GUE packets are suitable for transmission over lower layers in the
1021	   traffic-managed controlled environments that are allowed by the
1022	   exceptions stated above, and the rate of corruption of the inner IP
1023	   packet on such networks is not expected to increase by comparison to
1024	   traffic that is not encapsulated in UDP. For these reasons, GUE does
1025	   not provide an additional integrity check, except when the GUE
1026	   checksum [GUEEXTEN] is used with UDP zero-checksum mode for IPv6.
1027	   This design is in accordance with requirements 2, 3, and 5 specified
1028	   in Section 5 of [RFC6936].

1030	   Generic UDP Encapsulation does not accumulate incorrect transport-
1031	   layer state as a consequence of GUE header corruption. A corrupt GUE
1032	   packet may result in either packet discard or packet forwarding
1033	   without accumulation of GUE state. Active monitoring of GUE traffic
1034	   for errors is REQUIRED, as the occurrence of errors will result in
1035	   some accumulation of error information outside the protocol for
1036	   operational and management purposes. This design is in accordance
1037	   with requirement 4 specified in Section 5 of [RFC6936].

1039	   The remaining requirements specified in Section 5 of [RFC6936] are
1040	   not applicable to GUE. Requirements 6 and 7 do not apply because GUE
1041	   does not include a control feedback mechanism. Requirements 8-10 are
1042	   middlebox requirements that do not apply to GUE tunnel endpoints.

1044	   (See Section 5.5 for further middlebox discussion.)

1046	   In summary, a TMCE GUE tunnel is allowed to use UDP zero-checksum
1047	   mode for IPv6 when the conditions and requirements stated above are
1048	   met. Otherwise, the UDP checksum needs to be used for IPv6 as
1049	   specified in [RFC768] and [RFC8200]. Use of GUE checksum is
1050	   RECOMMENDED when the UDP checksum is not used.
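
The decapsulator-side checks in requirements c and d can be sketched as a small policy function. This is only an illustration: the configuration is modeled as a set of permitted (source, destination) address pairs and a single destination port on which zero checksums are allowed. All names are hypothetical.

```python
import ipaddress
from typing import Set, Tuple

AddrPair = Tuple[ipaddress.IPv6Address, ipaddress.IPv6Address]


def accept_zero_checksum_ipv6(src: str, dst: str, dst_port: int,
                              allowed_pairs: Set[AddrPair],
                              zero_csum_port: int) -> bool:
    """Return True if a zero-checksum UDP/IPv6 GUE packet may be accepted."""
    # Requirement c: zero checksums only on one configured destination port.
    if dst_port != zero_csum_port:
        return False
    # Requirement d: both addresses must be permitted by configuration.
    pair = (ipaddress.IPv6Address(src), ipaddress.IPv6Address(dst))
    return pair in allowed_pairs
```

Packets with a non-zero checksum are outside this function. They are always verified, as requirement j states.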

1052	5.9. Congestion Considerations

1054	   This section describes congestion considerations for GUE tunnels
1055	   (Layer 2 and Layer 3 encapsulation) and transport layer encapsulation
1056	   (Layer 4 protocol over GUE).

1058	5.9.1. GUE tunnels

1060	   Section 3.1.9 of [RFC8085] discusses the congestion considerations
1061	   for design and use of UDP tunnels; this is important because other
1062	   flows could share the path with one or more UDP tunnels,
1063	   necessitating congestion control [RFC2914] to avoid destructive
1064	   interference.

1066	   Congestion has potential impacts both on the rest of the network
1067	   containing a UDP tunnel and on the traffic flows using the UDP
1068	   tunnels. These impacts depend upon what sort of traffic is carried
1069	   over the tunnel, as well as the path of the tunnel. The GUE protocol
1070	   does not provide any congestion control and GUE UDP packets are
1071	   regular UDP packets. Therefore, a GUE tunnel MUST NOT be deployed to
1072	   carry non-congestion-controlled traffic over the Internet [RFC8085].

1074	   Within a TMCE network, GUE tunnels are appropriate for carrying
1075	   traffic that is not known to be congestion controlled. For example, a
1076	   GUE tunnel may be used to carry Multiprotocol Label Switching (MPLS)
1077	   traffic such as pseudowires or VPNs where specific bandwidth
1078	   guarantees are provided to each pseudowire or VPN. In such cases,
1079	   operators of TMCE networks avoid congestion by careful provisioning
1080	   of their networks, rate-limiting of user data traffic, and traffic
1081	   engineering according to path capacity.

1083	   When a GUE tunnel carries traffic that is not known to be congestion
1084	   controlled in a TMCE network, the tunnel MUST be deployed entirely
1085	   within that network, and measures SHOULD be taken to prevent the GUE
1086	   traffic from "escaping" the network to the general Internet. Examples
1087	   of such measures are:

1089	      o physical or logical isolation of the links carrying GUE from the
1090	        general Internet,

1092	      o deployment of packet filters that block the UDP ports assigned
1093	        for GUE, and

1095	      o imposition of restrictions on GUE traffic by software tools used
1096	        to set up GUE tunnels between specific end systems (as might be
1097	        used within a single data center) or by tunnel ingress nodes for
1098	        tunnels that don't terminate at end systems.

1100	5.9.2. Transport layer encapsulation

1102	   If GUE encapsulates a transport layer protocol, such as TCP, it is
1103	   expected that the transport layer or application layer properly
1104	   implements congestion control or avoidance. In the case that UDP is
1105	   encapsulated, the application is expected to provide congestion
1106	   control as specified in [RFC8085].

1108	5.10. Multicast

1110	   GUE packets can be multicast to decapsulators using a multicast
1111	   destination address in the outer IP header. Each receiving host will
1112	   decapsulate the packet independently following normal decapsulator
1113	   operations. The receiving decapsulators need to agree on the same set
1114	   of GUE parameters and properties; how such an agreement is reached is
1115	   outside the scope of this document.

1117	   GUE allows encapsulation of unicast, broadcast, or multicast traffic.
1118	   Flow entropy (the value in the UDP source port) can be generated from
1119	   the header of encapsulated unicast or broadcast/multicast packets at
1120	   an encapsulator. The mapping mechanism between the encapsulated
1121	   multicast traffic and the multicast capability in the IP network is
1122	   transparent and independent of the encapsulation and is otherwise
1123	   outside the scope of this document.

1125	5.11. Flow entropy for ECMP

1127	   A major objective of using GUE is that a network device can perform
1128	   flow classification corresponding to the flow of the inner
1129	   encapsulated packet based on the contents of the outer headers.

1131	5.11.1. Flow classification

1133	   When a packet is encapsulated with GUE and connection semantics are
1134	   not applied, the source port in the outer UDP packet is set to a flow
1135	   entropy value that corresponds to the flow of the inner packet. When
1136	   a device computes a five-tuple hash on the outer UDP/IP header of a
1137	   GUE packet, the resultant value classifies the packet per its inner
1138	   flow.

1140	   Examples of deriving flow entropy for encapsulation are:

1142	      o If the encapsulated packet is a layer 4 packet, TCP/IPv4 for
1143	        instance, the flow entropy could be based on the canonical five-
1144	        tuple hash of the inner packet.

1146	      o If the encapsulated packet is an AH transport mode packet with
1147	        TCP as next header, the flow entropy could be a hash over a
1148	        three-tuple: TCP protocol and TCP ports of the encapsulated
1149	        packet.

1151	      o If a node is encrypting a packet using ESP tunnel mode and GUE
1152	        encapsulation, the flow entropy could be based on the contents
1153	        of the clear-text packet. For instance, a canonical five-tuple
1154	        hash for a TCP/IP packet could be used.

1156	   [RFC6438] discusses methods to compute and set flow entropy values
1157	   for IPv6 flow labels; such methods can also be used to create flow
1158	   entropy values for GUE.

1160	5.11.2. Flow entropy properties

1162	   The flow entropy is the value set in the UDP source port of a GUE
1163	   packet. Flow entropy in the UDP source port SHOULD adhere to the
1164	   following properties:

1166	      o The value set in the source port is within the ephemeral port
1167	        range (49152 to 65535 [RFC6335]). Since the high order two bits
1168	        of the port are set to one, this provides fourteen bits of
1169	        entropy for the value.

1171	      o The flow entropy has a uniform distribution across encapsulated
1172	        flows.

1174	      o An encapsulator MAY occasionally change the flow entropy used
1175	        for an inner flow per its discretion (for security, route
1176	        selection, etc.). To avoid thrashing or flapping the value, the
1177	        flow entropy used for a flow SHOULD NOT change more than once
1178	        every thirty seconds (or a configurable value).

1180	      o Decapsulators, or any networking devices, SHOULD NOT attempt to
1181	        interpret flow entropy as anything more than an opaque value.
1182	        Neither should they attempt to reproduce the hash calculation
1183	        used by an encapsulator in creating a flow entropy value. They
1184	        MAY use the value to match further received packets for steering
1185	        decisions, but MUST NOT assume that the hash uniquely or
1186	        permanently identifies a flow.

1188	      o Input to the flow entropy calculation is not restricted to ports
1189	        and addresses; input could include the flow label from an IPv6
1190	        packet, SPI from an ESP packet, or other flow related state in
1191	        the encapsulator that is not necessarily conveyed in the packet.

1193	      o The assignment function for flow entropy SHOULD be randomly
1194	        seeded to mitigate denial of service attacks. The seed SHOULD be
1195	        changed periodically.
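
	   As a non-normative illustration of the properties above, the
	   following Python sketch maps an inner five-tuple to a UDP source
	   port in the ephemeral range using a randomly seeded hash. The
	   function name, the choice of keyed BLAKE2b, and the byte layout of
	   the hash input are assumptions made for this example only; this
	   document does not mandate any particular hash.

```python
import hashlib
import os
import struct

EPHEMERAL_BASE = 49152   # 0xC000: the high-order two bits of the port are set
ENTROPY_MASK = 0x3FFF    # the remaining fourteen bits carry the flow entropy

# Hypothetical per-encapsulator seed. An implementation following the
# guidance above would replace it periodically.
SEED = os.urandom(16)


def flow_entropy_port(src_ip: bytes, dst_ip: bytes, proto: int,
                      src_port: int, dst_port: int,
                      seed: bytes = SEED) -> int:
    """Return a UDP source port in 49152..65535 for an inner five-tuple."""
    material = src_ip + dst_ip + struct.pack("!BHH", proto, src_port, dst_port)
    # Keyed hash: without the seed, an outside party cannot predict the
    # port that a given flow will be mapped to.
    digest = hashlib.blake2b(material, digest_size=2, key=seed).digest()
    value = int.from_bytes(digest, "big")
    return EPHEMERAL_BASE | (value & ENTROPY_MASK)
```

	   Packets of the same inner flow yield the same port for as long as
	   the seed is unchanged, so reseeding also changes the flow entropy
	   of existing flows and is subject to the rate guidance above.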

1197	5.12. Negotiation of acceptable flags and extension fields

1199	   An encapsulator and decapsulator need to achieve agreement about GUE
1200	   parameters that will be used in communications. Parameters include
1201	   supported GUE variants, flags and extension fields that can be used,
1202	   security algorithms and keys, supported protocols and control
1203	   messages, etc. This document proposes several general methods to
1204	   accomplish this; however, the details of implementing them are
1205	   considered out of scope.

1207	   General methods for this are:

1209	      o Configuration. The parameters used for a tunnel are configured
1210	        at each endpoint.

1212	      o Negotiation. A tunnel negotiation can be performed. This could
1213	        be accomplished in-band of GUE using control messages.

1215	      o Via a control plane. Parameters for communicating with a tunnel
1216	        endpoint can be set in a control plane protocol (such as that
1217	        needed for network virtualization).

1219	      o Via security negotiation. Use of security typically implies a
1220	        key exchange between endpoints. Other GUE parameters may be
1221	        conveyed as part of that process.

1223	6. Motivation for GUE

1225	   This section provides the motivation for GUE with respect to other
1226	   encapsulation methods.

1228	6.1. Benefits of GUE

1230	      * GUE is a generic encapsulation protocol. GUE can encapsulate
1231	        protocols that are represented by an IP protocol number. This
1232	        includes layer 2, layer 3, and layer 4 protocols.

1234	      * GUE is an extensible encapsulation protocol. Standardized
1235	        optional data such as security, virtual networking identifiers,
1236	        and fragmentation are defined.

1238	      * For extensibility, GUE uses flag fields as opposed to TLVs as
1239	        some other encapsulation protocols do. Flag fields are strictly
1240	        ordered, allow random access, and are efficient in use of header
1241	        space.

1243	      * GUE allows sending of control messages such as OAM using the
1244	        same GUE header format (for routing purposes) as normal data
1245	        messages.

1247	      * GUE maximizes deliverability of non-UDP and non-TCP protocols.

1249	      * GUE provides a means for exposing per-flow entropy for ECMP for
1250	        atypical IP protocols such as SCTP, DCCP, ESP, etc.

1252	6.2. Comparison of GUE to other encapsulations

1254	   A number of different encapsulation techniques have been proposed for
1255	   the encapsulation of one protocol over another. EtherIP [RFC3378]
1256	   provides layer 2 tunneling of Ethernet frames over IP. GRE [RFC2784],
1257	   MPLS [RFC4023], and L2TP [RFC2661] provide methods for tunneling
1258	   layer 2 and layer 3 packets over IP. NVGRE [RFC7637] and VXLAN
1259	   [RFC7348] are proposals for encapsulation of layer 2 packets for
1260	   network virtualization. IPIP [RFC2003] and Generic packet tunneling
1261	   in IPv6 [RFC2473] provide methods for tunneling IP packets over IP.

1263	   Several proposals exist for encapsulating packets over UDP including
1264	   ESP over UDP [RFC3948], TCP directly over UDP [TCPUDP], VXLAN
1265	   [RFC7348], LISP [RFC6830] which encapsulates layer 3 packets,
1266	   MPLS/UDP [RFC7510], GENEVE [GENEVE], and GRE-in-UDP Encapsulation
1267	   [RFC8086].

1269	   GUE has the following discriminating features:

1271	      o UDP encapsulation leverages specialized network device
1272	        processing for efficient transport. The semantics for using the
1273	        UDP source port for flow entropy as input to ECMP are defined in
1274	        section 5.11.

1276	      o GUE permits encapsulation of arbitrary IP protocols, which
1277	        includes layer 2, 3, and 4 protocols.

1279	      o Multiple protocols can be multiplexed over a single UDP port
1280	        number. This is in contrast to techniques to encapsulate
1281	        protocols over UDP using a protocol specific port number (such
1282	        as ESP/UDP, GRE/UDP, SCTP/UDP). GUE provides a uniform and
1283	        extensible mechanism for encapsulating all IP protocols in UDP
1284	        with minimal overhead (four bytes of additional header).

1286	      o GUE is extensible. New flags and extension fields can be
1287	        defined.

1289	      o The GUE header includes a header length field. This allows a
1290	        network node to inspect an encapsulated packet without needing
1291	        to parse the full encapsulation header.

1293	      o GUE includes both data messages (encapsulation of packets) and
1294	        control messages (such as OAM).

1296	      o The flags-field model facilitates efficient implementation of
1297	        extensibility in hardware. For instance, a TCAM can be used to
1298	        parse a known set of N flags where the number of entries in the
1299	        TCAM is 2^N. By comparison, the number of TCAM entries needed to
1300	        parse a set of N arbitrarily ordered TLVs is approximately e*N!.

1302	      o GUE includes a variant that encapsulates IPv4 and IPv6 packets
1303	        directly within UDP.

1305	7. Security Considerations

1307	   There are two important considerations of security with respect to
1308	   GUE.

1310	      o Authentication and integrity of the GUE header.

1312	      o Authentication, integrity, and confidentiality of the GUE
1313	        payload.

1315	   GUE security is provided by extensions for security defined in
1316	   [GUEEXTEN]. These extensions include methods to authenticate the GUE
1317	   header and encrypt the GUE payload.

1319	   The GUE header can be authenticated using a security extension for an
1320	   HMAC (Hashed Message Authentication Code). Securing the GUE payload
1321	   can be accomplished by use of the GUE Payload Transform extension.
1322	   This extension allows the use of DTLS (Datagram Transport Layer
1323	   Security) to encrypt and authenticate the GUE payload.

1325	   A hash function for computing flow entropy (section 5.11) SHOULD be
1326	   randomly seeded to mitigate some possible denial of service attacks.

1328	8. IANA Considerations

1330	8.1. UDP source port

1332	   A user UDP port number has been assigned for GUE:

1334	          Service Name: gue
1335	          Transport Protocol(s): UDP
1336	          Assignee: Tom Herbert <tom@herbertland.com>
1337	          Contact: Tom Herbert <tom@herbertland.com>
1338	          Description: Generic UDP Encapsulation
1339	          Reference: draft-herbert-gue
1340	          Port Number: 6080
1341	          Service Code: N/A
1342	          Known Unauthorized Uses: N/A
1343	          Assignment Notes: N/A

1345	8.2. GUE variant number

1347	   IANA is requested to set up a registry for the GUE variant number.
1348	   The GUE variant number is two bits containing four possible values.
1349	   This document defines variants 0 and 1. New values are assigned in
1350	   accordance with RFC Required policy [RFC5226].

1352	      +----------------+----------------+---------------+
1353	      | Variant number | Description    | Reference     |
1354	      +----------------+----------------+---------------+
1355	      | 0              | GUE Version 0  | This document |
1356	      |                | with header    |               |
1357	      |                |                |               |
1358	      | 1              | GUE Version 0  | This document |
1359	      |                | with direct IP |               |
1360	      |                | encapsulation  |               |
1361	      |                |                |               |
1362	      | 2..3           | Unassigned     |               |
1363	      +----------------+----------------+---------------+

1365	8.3. Control types

1367	   IANA is requested to set up a registry for the GUE control types.
1368	   Control types are 8 bit values.  New values for control types 1-127
1369	   are assigned in accordance with RFC Required policy [RFC5226].

1371	      +----------------+------------------+---------------+
1372	      |  Control type  | Description      | Reference     |
1373	      +----------------+------------------+---------------+
1374	      | 0              | Control payload  | This document |
1375	      |                | needs more       |               |
1376	      |                | context for      |               |
1377	      |                | interpretation   |               |
1378	      |                |                  |               |
1379	      | 1..254         | Unassigned       |               |
1380	      |                |                  |               |
1381	      | 255            | Experimental     | This document |
1382	      +----------------+------------------+---------------+

1384	8.4. Control Type Experimental Identifiers

1386	   IANA is requested to create a "GUE Control Type Experimental
1387	   Identifiers (GUE Control ExIDs)" registry. The registry records 32-
1388	   bit ExIDs, as well as a reference (description, document pointer,
1389	   assignee name, and e-mail contact) for each entry.

1391	   Entries are assigned on a First Come, First Served (FCFS) basis
1392	   [RFC5226]. The registry operates FCFS on the entire ExID (in network-
1393	   standard order).

1395	   IANA will advise applicants of duplicate entries to select an
1396	   alternate value, as per typical FCFS processing.

1398	   IANA will record known duplicate uses to assist the community in both
1399	   debugging assigned uses as well as correcting unauthorized duplicate
1400	   uses.

1402	   IANA should impose no requirements on making a registration other
1403	   than indicating the desired codepoint and providing a point of
1404	   contact. A short description or acronym for the use is desired but
1405	   should not be required.

1407	   Initial assignments are:

1409	     +----------------+----------------+---------------+
1410	     |      ExID      | Description    | Reference     |
1411	     +----------------+----------------+---------------+
1412	     | 1..0xffffffff  | Unassigned     |               |
1413	     +----------------+----------------+---------------+

1415	9. Acknowledgements

1417	   The authors would like to thank David Liu, Erik Nordmark, Fred
1418	   Templin, Adrian Farrel, Bob Briscoe, Murray Kucherawy, Mirja
1419	   Kuhlewind, David Black, Joe Touch, and Greg Mirsky for valuable input
1420	   on this draft. Special thanks to Fred Templin who is serving as
1421	   document shepherd.

1423	10. References

1425	10.1. Normative References

1427	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
1428	              Requirement Levels", BCP 14, RFC 2119, DOI
1429	              10.17487/RFC2119, March 1997, <https://www.rfc-
1430	              editor.org/info/rfc2119>.

1432	   [RFC0768]  Postel, J., "User Datagram Protocol", STD 6, RFC 768, DOI
1433	              10.17487/RFC0768, August 1980, <http://www.rfc-
1434	              editor.org/info/rfc768>.

1436	   [RFC8085]  Eggert, L., Fairhurst, G., and G. Shepherd, "UDP Usage
1437	              Guidelines", BCP 145, RFC 8085, DOI 10.17487/RFC8085,
1438	              March 2017, <https://www.rfc-editor.org/info/rfc8085>.

1445	   [RFC2983]  Black, D., "Differentiated Services and Tunnels", RFC
1446	              2983, DOI 10.17487/RFC2983, October 2000, <http://www.rfc-
1447	              editor.org/info/rfc2983>.

1449	   [RFC6040]  Briscoe, B., "Tunnelling of Explicit Congestion
1450	              Notification", RFC 6040, DOI 10.17487/RFC6040, November
1451	              2010, <http://www.rfc-editor.org/info/rfc6040>.

1453	   [RFC6935]  Eubanks, M., Chimento, P., and M. Westerlund, "IPv6 and
1454	              UDP Checksums for Tunneled Packets", RFC 6935, DOI
1455	              10.17487/RFC6935, April 2013, <http://www.rfc-
1456	              editor.org/info/rfc6935>.

1458	   [RFC6936]  Fairhurst, G. and M. Westerlund, "Applicability Statement
1459	              for the Use of IPv6 UDP Datagrams with Zero Checksums",
1460	              RFC 6936, DOI 10.17487/RFC6936, April 2013,
1461	              <http://www.rfc-editor.org/info/rfc6936>.

1463	   [RFC1122]  Braden, R., Ed., "Requirements for Internet Hosts -
1464	              Communication Layers", STD 3, RFC 1122, DOI
1465	              10.17487/RFC1122, October 1989, <http://www.rfc-
1466	              editor.org/info/rfc1122>.

1468	   [RFC4459]  Savola, P., "MTU and Fragmentation Issues with In-the-
1469	              Network Tunneling", RFC 4459, DOI 10.17487/RFC4459, April
1470	              2006, <http://www.rfc-editor.org/info/rfc4459>.

1472	   [RFC6335]  Cotton, M., Eggert, L., Touch, J., Westerlund, M., and S.
1473	              Cheshire, "Internet Assigned Numbers Authority (IANA)
1474	              Procedures for the Management of the Service Name and
1475	              Transport Protocol Port Number Registry", BCP 165, RFC
1476	              6335, DOI 10.17487/RFC6335, August 2011, <https://www.rfc-
1477	              editor.org/info/rfc6335>.

1479	   [RFC5226]  Narten, T. and H. Alvestrand, "Guidelines for Writing an
1480	              IANA Considerations Section in RFCs", RFC 5226, DOI
1481	              10.17487/RFC5226, May 2008, <https://www.rfc-
1482	              editor.org/info/rfc5226>.

1484	10.2. Informative References

1486	   [RFC6994]  Touch, J., "Shared Use of Experimental TCP Options", RFC
1487	              6994, DOI 10.17487/RFC6994, August 2013, <https://www.rfc-
1488	              editor.org/info/rfc6994>.

1490	   [RFC8086]  Yong, L., Ed., Crabbe, E., Xu, X., and T. Herbert, "GRE-
1491	              in-UDP Encapsulation", RFC 8086, DOI 10.17487/RFC8086,
1492	              March 2017, <http://www.rfc-editor.org/info/rfc8086>.

1494	   [RFC7605]  Touch, J., "Recommendations on Using Assigned Transport
1495	              Port Numbers", BCP 165, RFC 7605, DOI 10.17487/RFC7605,
1496	              August 2015, <https://www.rfc-editor.org/info/rfc7605>.

1498	   [RFC4787]  Audet, F., Ed., and C. Jennings, "Network Address
1499	              Translation (NAT) Behavioral Requirements for Unicast
1500	              UDP", BCP 127, RFC 4787, DOI 10.17487/RFC4787, January
1501	              2007, <http://www.rfc-editor.org/info/rfc4787>.

1503	   [RFC5389]  Rosenberg, J., Mahy, R., Matthews, P., and D. Wing,
1504	              "Session Traversal Utilities for NAT (STUN)", RFC 5389,
1505	              DOI 10.17487/RFC5389, October 2008, <http://www.rfc-
1506	              editor.org/info/rfc5389>.

1508	   [RFC5245]  Rosenberg, J., "Interactive Connectivity Establishment
1509	              (ICE): A Protocol for Network Address Translator (NAT)
1510	              Traversal for Offer/Answer Protocols", RFC 5245, DOI
1511	              10.17487/RFC5245, April 2010, <http://www.rfc-
1512	              editor.org/info/rfc5245>.

1514	   [RFC8084]  Fairhurst, G., "Network Transport Circuit Breakers", BCP
1515	              208, RFC 8084, DOI 10.17487/RFC8084, March 2017,
1516	              <https://www.rfc-editor.org/info/rfc8084>.

1518	   [RFC6438]  Carpenter, B. and S. Amante, "Using the IPv6 Flow Label
1519	              for Equal Cost Multipath Routing and Link Aggregation in
1520	              Tunnels", RFC 6438, DOI 10.17487/RFC6438, November 2011,
1521	              <http://www.rfc-editor.org/info/rfc6438>.

1523	   [RFC3378]  Housley, R. and S. Hollenbeck, "EtherIP: Tunneling
1524	              Ethernet Frames in IP Datagrams", RFC 3378, DOI
1525	              10.17487/RFC3378, September 2002, <http://www.rfc-
1526	              editor.org/info/rfc3378>.

1528	   [RFC2784]  Farinacci, D., Li, T., Hanks, S., Meyer, D., and P.
1529	              Traina, "Generic Routing Encapsulation (GRE)", RFC 2784,
1530	              DOI 10.17487/RFC2784, March 2000, <http://www.rfc-
1531	              editor.org/info/rfc2784>.

1533	   [RFC4023]  Worster, T., Rekhter, Y., and E. Rosen, Ed.,
1534	              "Encapsulating MPLS in IP or Generic Routing Encapsulation
1535	              (GRE)", RFC 4023, DOI 10.17487/RFC4023, March 2005,
1536	              <http://www.rfc-editor.org/info/rfc4023>.

1538	   [RFC2661]  Townsley, W., Valencia, A., Rubens, A., Pall, G., Zorn,
1539	              G., and B. Palter, "Layer Two Tunneling Protocol "L2TP"",
1540	              RFC 2661, DOI 10.17487/RFC2661, August 1999,
1541	              <http://www.rfc-editor.org/info/rfc2661>.

1543	   [RFC7637]  Garg, P., Ed., and Y. Wang, Ed., "NVGRE: Network
1544	              Virtualization Using Generic Routing Encapsulation", RFC
1545	              7637, DOI 10.17487/RFC7637, September 2015,
1546	              <https://www.rfc-editor.org/info/rfc7637>.

1548	   [RFC7348]  Mahalingam, M., Dutt, D., Duda, K., Agarwal, P., Kreeger,
1549	              L., Sridhar, T., Bursell, M., and C. Wright, "Virtual
1550	              eXtensible Local Area Network (VXLAN): A Framework for
1551	              Overlaying Virtualized Layer 2 Networks over Layer 3
1552	              Networks", RFC 7348, August 2014, <http://www.rfc-
1553	              editor.org/info/rfc7348>.

1555	   [RFC2003]  Perkins, C., "IP Encapsulation within IP", RFC 2003, DOI
1556	              10.17487/RFC2003, October 1996, <http://www.rfc-
1557	              editor.org/info/rfc2003>.

1559	   [RFC2473]  Conta, A. and S. Deering, "Generic Packet Tunneling in
1560	              IPv6 Specification", RFC 2473, DOI 10.17487/RFC2473,
1561	              December 1998, <https://www.rfc-editor.org/info/rfc2473>.

1563	   [RFC3948]  Huttunen, A., Swander, B., Volpe, V., DiBurro, L., and M.
1564	              Stenberg, "UDP Encapsulation of IPsec ESP Packets", RFC
1565	              3948, DOI 10.17487/RFC3948, January 2005, <http://www.rfc-
1566	              editor.org/info/rfc3948>.

1568	   [RFC6830]  Farinacci, D., Fuller, V., Meyer, D., and D. Lewis, "The
1569	              Locator/ID Separation Protocol (LISP)", RFC 6830, DOI
1570	              10.17487/RFC6830, January 2013, <http://www.rfc-
1571	              editor.org/info/rfc6830>.

1573	   [RFC7510]  Xu, X., Sheth, N., Yong, L., Callon, R., and D. Black,
1574	              "Encapsulating MPLS in UDP", RFC 7510, DOI
1575	              10.17487/RFC7510, April 2015, <http://www.rfc-
1576	              editor.org/info/rfc7510>.

1578	   [GUEEXTEN] Herbert, T., Yong, L., and Templin, F., "Extensions for
1579	              Generic UDP Encapsulation", draft-ietf-intarea-gue-
1580	              extensions-06

1582	   [IPTUN]    Touch, J. and Townsley, M., "IP Tunnels in the Internet
1583	              Architecture", draft-ietf-intarea-tunnels-10

1585	   [IANA-PN]  IANA, "Protocol Numbers",
1586	              <https://www.iana.org/assignments/protocol-numbers>.

1588	   [TCPUDP]   Cheshire, S., Graessley, J., and McGuire, R.,
1589	              "Encapsulation of TCP and other Transport Protocols over
1590	              UDP", draft-cheshire-tcp-over-udp-00

1592	   [GENEVE]   Gross, J., Ed., Ganga, I. Ed., and Sridhar, T., "Geneve:
1593	              Generic Network Virtualization Encapsulation", draft-ietf-
1594	              nvo3-geneve-10

1596	   [UDPENCAP] Herbert, T., "UDP Encapsulation in Linux",
1597	              <http://people.netfilter.org/pablo/netdev0.1/papers/UDP-
1598	              Encapsulation-in-Linux.pdf>

1600	   [MULTIQ]   Herbert, T. and de Bruijn, W., "Scaling in the Linux
1601	              Networking Stack", <https://www.kernel.org/doc/
1602	              Documentation/networking/scaling.txt>

1604	   [CSUMOFF]  Cree, E., "Checksum Offloads in the Linux Networking
1605	              Stack", <https://www.kernel.org/doc/Documentation/
1606	              networking/checksum-offloads.txt>

1608	   [SEGOFF]   Duyck, A., "Segmentation Offloads in the Linux Networking
1609	              Stack", <https://www.kernel.org/doc/
1610	              Documentation/networking/segmentation-offloads.txt>

1612	Appendix A: NIC processing for GUE

1614	   This appendix is informational and does not constitute a normative
1615	   part of this document.

1617	   This appendix provides some guidelines for Network Interface Cards
1618	   (NICs) to implement common offloads and accelerations to support GUE.
1619	   Note that most of this discussion is generally applicable to other
1620	   methods of UDP based encapsulation. An overview of UDP based
1621	   encapsulation and acceleration is in [UDPENCAP].

1623	A.1. Receive multi-queue

1625	   Contemporary NICs support multiple receive descriptor queues (multi-
1626	   queue) [MULTIQ]. Multi-queue enables load balancing of network
1627	   processing for a NIC across multiple CPUs. On packet reception, a NIC
1628	   selects an appropriate queue for host processing. Receive Side
1629	   Scaling (RSS) is a common method which uses the flow hash for a
1630	   packet to index an indirection table where each entry stores a queue
1631	   number. Flow Director and Accelerated Receive Flow Steering (aRFS)
1632	   allow a host to program the queue that is used for a given flow which
1633	   is identified either by an explicit five-tuple or by the flow's hash.

1635	   GUE encapsulation is compatible with multi-queue NICs that support
1636	   five-tuple hash calculation for UDP/IP packets as input to RSS. The
1637	   flow entropy in the UDP source port ensures classification of the
1638	   encapsulated flow even in the case that the outer source and
1639	   destination addresses are the same for all flows (e.g. all flows are
1640	   going over a single tunnel).
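
	   The following non-normative Python sketch illustrates RSS queue
	   selection for GUE packets. CRC-32 is used purely as a stand-in for
	   the NIC's hash function, and the function name and table layout are
	   assumptions made for the example.

```python
import struct
import zlib


def rss_queue(src_ip: bytes, dst_ip: bytes, src_port: int, dst_port: int,
              indirection_table: list, proto: int = 17) -> int:
    """Pick a receive queue from the outer five-tuple of a UDP/IP packet."""
    key = src_ip + dst_ip + struct.pack("!BHH", proto, src_port, dst_port)
    flow_hash = zlib.crc32(key)  # stand-in for the hash computed by the NIC
    # The flow hash indexes the indirection table; each entry is a queue.
    return indirection_table[flow_hash % len(indirection_table)]


# A 128-entry table spreading flows over four queues.
table = [i % 4 for i in range(128)]
tunnel_src, tunnel_dst = bytes([192, 0, 2, 1]), bytes([192, 0, 2, 2])
for entropy_port in (49152, 50001, 61234):
    print(entropy_port,
          rss_queue(tunnel_src, tunnel_dst, entropy_port, 6080, table))
```

	   Because the outer addresses and destination port are the same for
	   every packet of a tunnel, only the flow entropy in the source port
	   varies the hash input between encapsulated flows.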

1642	   By default, UDP RSS support is often disabled in NICs to avoid out-
1643	   of-order reception that can occur when UDP packets are fragmented. As
1644	   discussed in section 5.7, fragmentation of GUE packets is mostly
1645	   avoided by fragmenting packets before entering a tunnel, GUE
1646	   fragmentation, path MTU discovery in higher layer protocols, or
1647	   operators adjusting MTUs. Other UDP traffic might not implement such
1648	   procedures to avoid fragmentation, so enabling UDP RSS support in the
1649	   NIC might be a considered tradeoff during configuration.

1651	A.2. Checksum offload

1653	   Many NICs provide capabilities to calculate the standard ones'
1654	   complement checksum for packets in transmit or receive [CSUMOFF].
1655	   When using GUE encapsulation, there are at least two checksums that
1656	   are of interest: the encapsulated packet's transport checksum, and
1657	   the UDP checksum in the outer header.

1659	A.2.1. Transmit checksum offload

1661	   NICs can provide a protocol agnostic method to offload the transmit
1662	   checksum (NETIF_F_HW_CSUM in Linux parlance) that can be used with
1663	   GUE. In this method, the host provides checksum related parameters in
1664	   a transmit descriptor for a packet. These parameters include the
1665	   starting offset of data to checksum, the length of data to checksum,
1666	   and the offset in the packet where the computed checksum is to be
1667	   written. The host initializes the checksum field to a pseudo header
1668	   checksum.

1670	   In the case of GUE, the checksum for an encapsulated transport layer
1671	   packet, a TCP packet for instance, can be offloaded by setting the
1672	   appropriate checksum parameters.

1674	   NICs typically can offload only one transmit checksum per packet, so
1675	   simultaneously offloading both an inner transport packet's checksum
1676	   and the outer UDP checksum is likely not possible.

1678	   If an encapsulator is co-resident with a host, then checksum offload
1679	   may be performed using remote checksum offload (RCO) [GUEEXTEN].
1680	   Remote checksum offload relies on NIC offload of the simple UDP/IP
1681	   checksum which is commonly supported even in legacy devices. In
1682	   remote checksum offload, the outer UDP checksum is set and the GUE
1683	   header includes an option indicating the start and offset of the
1684	   inner "offloaded" checksum. The inner checksum is initialized to the
1685	   pseudo header checksum. When a decapsulator receives a GUE packet
1686	   with the remote checksum offload option, it completes the offload
1687	   operation by determining the packet checksum from the indicated start
1688	   point to the end of the packet, and then adds this into the checksum
1689	   field at the offset given in the option. Computing the checksum from
1690	   the start to end of packet is efficient if checksum-complete is
1691	   provided on the receiver.

1693	   Another alternative when an encapsulator is co-resident with a host
1694	   is to perform Local Checksum Offload (LCO) [CSUMOFF]. In this method,
1695	   the inner transport layer checksum is offloaded and the outer UDP
1696	   checksum can be deduced based on the fact that the portion of the
1697	   packet covered by the inner transport checksum will sum to zero or at
1698	   least the bitwise "not" of the inner pseudo header.

1700	A.2.2. Receive checksum offload

1702	   GUE is compatible with NICs that perform a protocol agnostic receive
1703	   checksum (CHECKSUM_COMPLETE in Linux parlance). In this technique, a
1704	   NIC computes a ones' complement checksum over all (or some predefined
1705	   portion) of a packet. The computed value is provided to the host
1706	   stack in the packet's receive descriptor. The host driver can use
1707	   this checksum to "patch up" and validate any inner packet transport
1708	   checksums, as well as the outer UDP checksum if it is non-zero.

1710	   Many legacy NICs don't provide checksum-complete but instead provide
1711	   an indication that a checksum has been verified (CHECKSUM_UNNECESSARY
1712	   in Linux). Usually, such validation is only done for simple TCP/IP or
1713	   UDP/IP packets. If a NIC indicates that a UDP checksum is valid, the
1714	   checksum-complete value for the UDP packet is the bitwise "not" of
1715	   the pseudo header checksum. In this way, checksum-unnecessary can be
1716	   converted to checksum-complete. So, if the NIC provides checksum-
1717	   unnecessary for the outer UDP header in an encapsulation, checksum
1718	   conversion can be done so that the checksum-complete value is derived
1719	   and can be used by the stack to validate checksums in the
1720	   encapsulated packet.
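
	   The conversion described above can be illustrated with a short,
	   non-normative Python sketch for UDP over IPv4. All function names
	   are hypothetical. The sketch builds a datagram with a valid UDP
	   checksum and shows that the ones' complement sum over the UDP
	   header and payload equals the bitwise "not" of the pseudo header
	   sum.

```python
import struct


def ones_complement_sum(data: bytes) -> int:
    """16-bit ones' complement sum of data, padded to an even length."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold the carries back in
    return total


def udp_pseudo_header_sum(src_ip: bytes, dst_ip: bytes, udp_len: int) -> int:
    """Sum of the IPv4 pseudo header for UDP (protocol number 17)."""
    return ones_complement_sum(
        src_ip + dst_ip + struct.pack("!BBH", 0, 17, udp_len))


def build_udp(src_ip: bytes, dst_ip: bytes, sport: int, dport: int,
              payload: bytes) -> bytes:
    """Return a UDP header plus payload carrying a valid checksum."""
    length = 8 + len(payload)
    header = struct.pack("!HHHH", sport, dport, length, 0)
    pseudo = struct.pack("!H", udp_pseudo_header_sum(src_ip, dst_ip, length))
    total = ones_complement_sum(pseudo + header + payload)
    csum = (~total & 0xFFFF) or 0xFFFF  # a computed zero is sent as 0xFFFF
    return struct.pack("!HHHH", sport, dport, length, csum) + payload


def complete_from_unnecessary(pseudo_sum: int) -> int:
    """Derive the checksum-complete value for a UDP packet known to be valid."""
    return ~pseudo_sum & 0xFFFF
```

	   A stack that receives only checksum-unnecessary for the outer UDP
	   header could use the derived value as its starting point when
	   validating checksums in the encapsulated packet. The sketch ignores
	   the corner case in which the pseudo header sum is 0xFFFF.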

1722	A.3. Transmit Segmentation Offload

1724	   Transmit Segmentation Offload (TSO) [SEGOFF] is a NIC feature where a
1725	   host provides a large (>MTU size) TCP packet to the NIC, which in
1726	   turn splits the packet into separate segments and transmits each one.
1727	   This is useful to reduce CPU load on the host.

1729	   The process of TSO can be generalized as:

1731	      - Split the TCP payload into segments of size less than or equal
1732	        to MTU.

1734	      - For each created segment:

1736	        1. Replicate the TCP header and all preceding headers of the
1737	           original packet.

1739	        2. Set payload length fields in any headers to reflect the
1740	           length of the segment.

1742	        3. Set TCP sequence number to correctly reflect the offset of
1743	           the TCP data in the stream.

1745	        4. Recompute and set any checksums that either cover the payload
1746	           of the packet or cover a header that was changed by setting a
1747	           payload length.

1749	   Following this general process, TSO can be extended to support TCP
1750	   encapsulation in GUE.  For each segment the Ethernet, outer IP, UDP
1751	   header, GUE header, inner IP header (if tunneling), and TCP headers
1752	   are replicated. Any packet length header fields need to be set
1753	   properly (including the length in the outer UDP header), and
1754	   checksums need to be set correctly (including the outer UDP checksum
1755	   if being used).
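
	   A minimal, non-normative Python sketch of the per-segment
	   bookkeeping follows. It assumes an inner IPv4 header and a TCP
	   header without options, and a four byte GUE header with no
	   extension fields; the names are hypothetical. Header replication
	   and checksum recomputation are omitted.

```python
from dataclasses import dataclass

# Assumed fixed header sizes in bytes (no IP or TCP options, and no GUE
# extension fields).
UDP_HDR, GUE_HDR, INNER_IPV4_HDR, TCP_HDR = 8, 4, 20, 20


@dataclass
class Segment:
    tcp_seq: int           # TCP sequence number of the first payload byte
    outer_udp_length: int  # value for the outer UDP length field
    payload: bytes


def gue_tso(payload: bytes, mss: int, first_seq: int) -> list:
    """Split a large TCP payload into per-segment descriptions."""
    segments = []
    for offset in range(0, len(payload), mss):
        chunk = payload[offset:offset + mss]
        segments.append(Segment(
            tcp_seq=(first_seq + offset) % 2**32,  # sequence numbers wrap
            outer_udp_length=(UDP_HDR + GUE_HDR + INNER_IPV4_HDR
                              + TCP_HDR + len(chunk)),
            payload=chunk,
        ))
    return segments
```

	   For example, gue_tso(bytes(3000), 1400, 1000) describes three
	   segments whose payloads are 1400, 1400, and 200 bytes long.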

1757	   To facilitate TSO with GUE, it is recommended that extension fields
1758	   do not contain values that need to be updated on a per segment basis.
1759	   For example, extension fields should not include checksums, lengths,
1760	   or sequence numbers that refer to the payload. If the GUE header does
1761	   not contain such fields then the TSO engine only needs to copy the
1762	   bits in the GUE header when creating each segment and does not need
1763	   to parse the GUE header.

1765	A.4. Large Receive Offload

1767	   Large Receive Offload (LRO) [SEGOFF] is a NIC feature where received
1768	   packets of a TCP connection are reassembled, or coalesced, in the NIC
1769	   and delivered to the host as one large packet. This feature can
1770	   reduce CPU utilization in the host.

1772	   LRO requires significant protocol awareness to be implemented
1773	   correctly and is difficult to generalize. Packets in the same flow
1774	   need to be unambiguously identified. In the presence of tunnels or
1775	   network virtualization, this may require more than a five-tuple match
1776	   (for instance packets for flows in two different virtual networks may
1777	   have identical five-tuples). Additionally, a NIC needs to perform
1778	   validation over packets that are being coalesced, and needs to
1779	   fabricate a single meaningful header from all the coalesced packets.

1781	   The conservative approach to supporting LRO for GUE would be to
1782	   assign packets to the same flow only if they have an identical
1783	   five-tuple and were encapsulated the same way. That is, the outer IP
1784	   addresses, the outer UDP ports, GUE protocol, GUE flags and fields,
1785	   and inner five-tuple are all identical.
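
	   The conservative match can be expressed as equality over a
	   composite key. The following Python sketch is illustrative only;
	   the type and field names (FiveTuple, GueLroKey, same_lro_flow)
	   are hypothetical, and the GUE flags and fields are carried as
	   raw bytes so that any difference in encapsulation prevents
	   coalescing.

```python
from typing import NamedTuple


class FiveTuple(NamedTuple):
    src_addr: str
    dst_addr: str
    protocol: int
    src_port: int
    dst_port: int


class GueLroKey(NamedTuple):
    outer_src_addr: str
    outer_dst_addr: str
    outer_src_port: int
    outer_dst_port: int
    gue_protocol: int
    gue_flags_and_fields: bytes  # raw bytes, compared without parsing
    inner: FiveTuple


def same_lro_flow(a: GueLroKey, b: GueLroKey) -> bool:
    """Packets may be coalesced only if every key component matches."""
    return a == b
```

	   Because the key is hashable, it could also index a table of
	   flows that currently have packets being coalesced.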

1787	Appendix B: Implementation considerations

1789	   This appendix is informational and does not constitute a normative
1790	   part of this document.

1792	B.1. Privileged ports

1794	   Using the source port to contain a flow entropy value precludes the
1795	   security method of a receiver enforcing that the source port be a
1796	   privileged port. Privileged ports are defined by some operating
1797	   systems to restrict source port binding. Unix, for instance,
1798	   considers port numbers less than 1024 to be privileged.

1800	   Enforcing that packets are sent from a privileged port is widely
1801	   considered an inadequate security mechanism and has been mostly
1802	   deprecated. To approximate this behavior, an implementation could
1803	   restrict a user from sending a packet destined to the GUE port
1804	   without proper credentials.

1806	B.2. Setting flow entropy as a route selector

1808	   An encapsulator generating flow entropy in the UDP source port could
1809	   modulate the value to perform a type of multipath source routing.
1810	   Assuming that networking switches perform ECMP based on the flow
1811	   hash, a sender can affect the path by altering the flow entropy.  For
1812	   instance, a host can store a flow hash in its protocol control block
1813	   (PCB) for an inner flow, and might alter the value upon detecting
1814	   that packets are traversing a lossy path. Changing the flow entropy
1815	   for a flow SHOULD be subject to hysteresis (at most once every thirty
1816	   seconds) to limit the number of out-of-order packets.
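
	   One way to apply such hysteresis is sketched below in Python.
	   The FlowEntropy class, its on_path_trouble() method, and the
	   choice of source port range (49152-65535) are assumptions made
	   for this sketch. How a lossy path is detected is out of scope
	   here, and the clock is injectable so the behavior can be
	   checked without waiting.

```python
import random
import time
from typing import Callable, Optional

MIN_CHANGE_INTERVAL = 30.0  # seconds between changes, per the text


class FlowEntropy:
    """Per-flow entropy value, as might be kept in a PCB."""

    def __init__(self, clock: Callable[[], float] = time.monotonic,
                 rng: Optional[random.Random] = None):
        self._clock = clock
        self._rng = rng if rng is not None else random.Random()
        self.value = self._pick()
        self._last_change: Optional[float] = None  # no change yet

    def _pick(self) -> int:
        # Assumption: entropy is drawn from ports 49152-65535.
        return self._rng.randrange(49152, 65536)

    def on_path_trouble(self) -> bool:
        """Try to move to a new value; return True if it changed."""
        now = self._clock()
        if (self._last_change is not None
                and now - self._last_change < MIN_CHANGE_INTERVAL):
            return False  # changed too recently, keep the current path
        new_value = self._pick()
        while new_value == self.value:
            new_value = self._pick()
        self.value = new_value
        self._last_change = now
        return True
```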

1818	B.3. Hardware protocol implementation considerations

1820	   Low level data path protocols, such as GUE, are often supported in
1821	   high speed network device hardware. Variable length header (VLH)
1822	   protocols like GUE are sometimes considered difficult to efficiently
1823	   implement in hardware. In order to retain the important
1824	   characteristics of an extensible and robust protocol, hardware
1825	   vendors may practice "constrained flexibility". In this model, only
1826	   certain combinations or protocol header parameterizations are
1827	   implemented in the hardware fast path. Each such parameterization is
1828	   fixed length so that the particular instance can be optimized as a
1829	   fixed length protocol. In the case of GUE, this corresponds to
1830	   specific combinations of GUE flags, fields, and next protocol. The
1831	   selected combinations would naturally be the most common cases,
1832	   which form the "fast path"; other combinations are assumed to take
1833	   the "slow path".

1835	   In time, the needs and requirements of a protocol may change, which
1836	   may manifest as new parameterizations to be supported in
1837	   the fast path. To allow this extensibility, a device practicing
1838	   constrained flexibility should allow fast path parameterizations to
1839	   be programmable.
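
	   A software analogy of a programmable fast path is sketched
	   below in Python. It is not a description of any particular
	   device. Parameterization, FastPathTable, and the handler
	   signature are hypothetical names, and the key is assumed to
	   have been extracted from the packet by an earlier stage.

```python
from typing import Callable, Dict, NamedTuple


class Parameterization(NamedTuple):
    flags: int       # GUE flags for this combination
    hdr_len: int     # fixed header length of this combination, in bytes
    next_proto: int  # next protocol


Handler = Callable[[bytes], str]


class FastPathTable:
    """Maps programmed parameterizations to fixed-length handlers."""

    def __init__(self, slow_path: Handler):
        self._fast: Dict[Parameterization, Handler] = {}
        self._slow = slow_path

    def program(self, key: Parameterization, handler: Handler) -> None:
        """Add or replace a fast path entry."""
        self._fast[key] = handler

    def remove(self, key: Parameterization) -> None:
        """Drop an entry; matching packets revert to the slow path."""
        self._fast.pop(key, None)

    def dispatch(self, key: Parameterization, packet: bytes) -> str:
        """Use the fast path if programmed, otherwise the slow path."""
        handler = self._fast.get(key, self._slow)
        return handler(packet)
```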

1841	Authors' Addresses

1843	   Tom Herbert
1844	   Quantonium
1845	   4701 Patrick Henry
1846	   Santa Clara, CA 95054
1847	   US

1849	   Email: tom@herbertland.com

1851	   Lucy Yong
1852	   Independent
1853	   Austin, TX
1854	   US

1856	   Email: lucy_yong@yahoo.com

1858	   Osama Zia
1859	   Microsoft
1860	   1 Microsoft Way
1861	   Redmond, WA 98029
1862	   US

1864	   Email: osamaz@microsoft.com