idnits 2.17.1 

draft-ietf-intarea-gue-08.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (October 4, 2019) is 1666 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Missing Reference: 'RFC2460' is mentioned on line 866, but not defined

  ** Obsolete undefined reference: RFC 2460 (Obsoleted by RFC 8200)

  == Missing Reference: 'RFC8200' is mentioned on line 1000, but not defined

  == Missing Reference: 'RFC768' is mentioned on line 1000, but not defined

  == Missing Reference: 'RFC2914' is mentioned on line 1014, but not defined

  == Missing Reference: 'MUTLIQ' is mentioned on line 1536, but not defined

  == Unused Reference: 'RFC8084' is defined on line 1424, but no explicit
     reference was found in the text

  == Unused Reference: 'MULTIQ' is defined on line 1510, but no explicit
     reference was found in the text

  ** Downref: Normative reference to an Informational RFC: RFC 2983

  ** Downref: Normative reference to an Informational RFC: RFC 4459

  ** Obsolete normative reference: RFC 5226 (Obsoleted by RFC 8126)

  -- Obsolete informational reference (is this intentional?): RFC 5389
     (Obsoleted by RFC 8489)

  -- Obsolete informational reference (is this intentional?): RFC 5245
     (Obsoleted by RFC 8445, RFC 8839)

  -- Obsolete informational reference (is this intentional?): RFC 6830
     (Obsoleted by RFC 9300, RFC 9301)

  == Outdated reference: A later version (-13) exists of
     draft-ietf-intarea-tunnels-10

  == Outdated reference: A later version (-16) exists of
     draft-ietf-nvo3-geneve-10


     Summary: 4 errors (**), 0 flaws (~~), 10 warnings (==), 4 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Internet Area WG                                              T. Herbert
3	Internet-Draft                                                Quantonium
4	Intended status: Standards Track                                 L. Yong
5	Expires April 6, 2020                                        Independent
6	                                                                  O. Zia
7	                                                               Microsoft
8	                                                         October 4, 2019

10	                       Generic UDP Encapsulation
11	                       draft-ietf-intarea-gue-08

13	Status of this Memo

15	   This Internet-Draft is submitted in full conformance with the
16	   provisions of BCP 78 and BCP 79.

18	   Internet-Drafts are working documents of the Internet Engineering
19	   Task Force (IETF), its areas, and its working groups.  Note that
20	   other groups may also distribute working documents as Internet-
21	   Drafts.

23	   Internet-Drafts are draft documents valid for a maximum of six months
24	   and may be updated, replaced, or obsoleted by other documents at any
25	   time.  It is inappropriate to use Internet-Drafts as reference
26	   material or to cite them other than as "work in progress."

28	   The list of current Internet-Drafts can be accessed at
29	   http://www.ietf.org/ietf/1id-abstracts.txt

31	   The list of Internet-Draft Shadow Directories can be accessed at
32	   http://www.ietf.org/shadow.html

34	   This Internet-Draft will expire on April 6, 2020.

36	Copyright Notice

38	   Copyright (c) 2019 IETF Trust and the persons identified as the
39	   document authors. All rights reserved.

48	   This document is subject to BCP 78 and the IETF Trust's Legal
49	   Provisions Relating to IETF Documents
50	   (http://trustee.ietf.org/license-info) in effect on the date of
51	   publication of this document.  Please review these documents
52	   carefully, as they describe your rights and restrictions with respect
53	   to this document.  Code Components extracted from this document must
54	   include Simplified BSD License text as described in Section 4.e of
55	   the Trust Legal Provisions and are provided without warranty as
56	   described in the Simplified BSD License.

58	Abstract

60	   This specification describes Generic UDP Encapsulation (GUE), which
61	   is a scheme for using UDP to encapsulate packets of different IP
62	   protocols for transport across layer 3 networks. By encapsulating
63	   packets in UDP, specialized capabilities in networking hardware for
64	   efficient handling of UDP packets can be leveraged. GUE specifies
65	   basic encapsulation methods upon which higher level constructs, such
66	   as tunnels and overlay networks for network virtualization, can be
67	   constructed. GUE is extensible by allowing optional data fields as
68	   part of the encapsulation, and is generic in that it can encapsulate
69	   packets of various IP protocols.

71	Table of Contents

73	   1. Introduction
74	     1.1. Applicability
75	     1.2. Terminology and acronyms
76	     1.3. Requirements Language
77	   2. Base packet format
78	     2.1. GUE variant
79	   3. Variant 0
80	     3.1. Header format
81	     3.2. Proto/ctype field
82	       3.2.1. Proto field
83	       3.2.2. Ctype field
84	     3.3. Flags and extension fields
85	       3.3.1. Requirements
86	       3.3.2. Example GUE header with extension fields
87	     3.4. Surplus space
88	     3.5. Message types
89	       3.5.1. Control messages
90	       3.5.2. Data messages
91	   4. Variant 1
92	     4.1. Direct encapsulation of IPv4
93	     4.2. Direct encapsulation of IPv6
94	   5. Operation
95	     5.1. Network tunnel encapsulation
96	     5.2. Transport layer encapsulation
97	     5.3. Encapsulator operation
98	     5.4. Decapsulator operation
99	       5.4.1. Processing a received data message
100	       5.4.2. Processing a received control message
101	     5.5. Middlebox inspection
102	     5.6. Router and switch operation
103	       5.6.1. Connection semantics
104	       5.6.2. NAT
105	     5.7. MTU and fragmentation
106	     5.8. UDP Checksum Handling
107	       5.8.1. UDP Checksum with IPv4
108	       5.8.2. UDP Checksum with IPv6
109	     5.9. Congestion Considerations
110	       5.9.1. GUE tunnels
111	       5.9.2. Transport layer encapsulation
112	     5.10. Multicast
113	     5.11. Flow entropy for ECMP
114	       5.11.1. Flow classification
115	       5.11.2. Flow entropy properties
116	     5.12. Negotiation of acceptable flags and extension fields
117	   6. Motivation for GUE
118	     6.1. Benefits of GUE
119	     6.2. Comparison of GUE to other encapsulations
120	   7. Security Considerations
121	   8. IANA Considerations
122	     8.1. UDP source port
123	     8.2. GUE variant number
124	     8.3. Control types
125	   9. Acknowledgements
126	   10. References
127	     10.1. Normative References
128	     10.2. Informative References
129	   Appendix A: NIC processing for GUE
130	     A.1. Receive multi-queue
131	     A.2. Checksum offload
132	       A.2.1. Transmit checksum offload
133	       A.2.2. Receive checksum offload
134	     A.3. Transmit Segmentation Offload
135	     A.4. Large Receive Offload
136	   Appendix B: Implementation considerations
137	     B.1. Privileged ports
138	     B.2. Setting flow entropy as a route selector
139	     B.3. Hardware protocol implementation considerations
140	   Authors' Addresses

142	1. Introduction

144	   This specification describes Generic UDP Encapsulation (GUE), which is
145	   a general method for encapsulating packets of arbitrary IP protocols
146	   within User Datagram Protocol (UDP) [RFC0768] packets. Encapsulating
147	   packets in UDP facilitates efficient transport across networks.
148	   Networking devices widely provide protocol-specific processing and
149	   optimizations for UDP (as well as TCP) packets. Packets for atypical
150	   IP protocols (those not usually parsed by networking hardware) can be
151	   encapsulated in UDP packets to maximize deliverability and to
152	   leverage flow-specific mechanisms for routing and packet steering.

154	   GUE provides an extensible header format for including optional data
155	   in the encapsulation header. This data potentially covers items such
156	   as a virtual networking identifier, security data for validating or
157	   authenticating the GUE header, congestion control data, etc.

159	   This document does not define any specific GUE extensions. [GUEEXTEN]
160	   specifies a set of initial extensions.

162	1.1. Applicability

164	   GUE is a network encapsulation protocol that encapsulates packets for
165	   various IP protocols. Potential use cases include network tunneling,
166	   multi-tenant network virtualization, tunneling for mobility, and
167	   transport layer encapsulation. GUE is intended for deploying overlay
168	   networks in public or private data center environments, as well as
169	   providing a general tunneling mechanism usable in the Internet.

171	   GUE is a UDP-based encapsulation protocol transported over existing
172	   IPv4 and IPv6 networks. Hence, as a UDP-based protocol, GUE adheres
173	   to the UDP usage guidelines as specified in [RFC8085]. Applicability
174	   of these guidelines depends on the underlay IP network and the
175	   nature of the GUE payload protocol (for example TCP/IP or IP/Ethernet).
176	   GUE may also be used to create IP tunnels, hence the guidelines in
177	   [IPTUN] are applicable.

179	   [RFC8085] outlines two applicability scenarios for UDP applications:
180	   (1) general Internet and (2) a traffic-managed controlled environment
181	   (TMCE). The requirements of [RFC8085] pertaining to deployment of a
182	   UDP encapsulation protocol in these environments are applicable.
183	   Section 5 provides the specifics for satisfying requirements of
184	   [RFC8085]. It is the responsibility of the operator deploying GUE to
185	   ensure that the necessary operational requirements are met for the
186	   environment in which GUE is being deployed.

188	   GUE has much of the same applicability and benefits as GRE-in-UDP
189	   [RFC8086] that are afforded by UDP encapsulation protocols. GUE
190	   offers the possibility of good performance for load-balancing
191	   encapsulated IP traffic in transit networks using existing Equal-Cost
192	   Multipath (ECMP) mechanisms that use a hash of the five-tuple of
193	   source IP address, destination IP address, UDP/TCP source port,
194	   UDP/TCP destination port, and protocol number. Encapsulating packets
195	   in UDP enables use of the UDP source port to provide entropy to ECMP
196	   hashing. A material difference between GUE and GRE-in-UDP is that the
197	   payload of GUE is always an IP protocol whereas the payload in GRE-
198	   in-UDP may be a non-IP protocol; this distinction is pertinent in the
199	   discussion of congestion considerations (section 5.9) since IP
200	   protocols are generally assumed to be congestion controlled.
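
	   As a non-normative illustration of how an encapsulator might derive
	   a UDP source port that gives ECMP hashing some entropy, the sketch
	   below hashes the five-tuple of the inner flow into a port number.
	   The function name, the use of CRC-32, and the port range 49152-65535
	   are assumptions made for this example only; they are not taken from
	   this specification.

```python
import ipaddress
import struct
import zlib


def entropy_source_port(src_ip: str, dst_ip: str,
                        src_port: int, dst_port: int, protocol: int) -> int:
    """Map an inner five-tuple to a UDP source port (hypothetical helper).

    Assumption: CRC-32 is used only because it is deterministic and in the
    standard library; any stable hash would serve for the illustration.
    """
    key = (ipaddress.ip_address(src_ip).packed
           + ipaddress.ip_address(dst_ip).packed
           + struct.pack("!HHB", src_port, dst_port, protocol))
    # Assumed range: 49152..65535 (16384 values).
    return 49152 + zlib.crc32(key) % 16384
```

	   Packets of the same inner flow yield the same source port, so a
	   transit node hashing the outer five-tuple keeps them on one path.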

202	   In addition, GUE enables extending the use of atypical IP protocols
203	   (those other than TCP and UDP) across networks that might otherwise
204	   filter packets carrying those protocols. GUE may also be used with
205	   connection-oriented UDP semantics in order to facilitate traversal
206	   through stateful firewalls and stateful NAT.

208	   Additional motivation for the GUE protocol is provided in section 6.

210	1.2. Terminology and acronyms

212	   GUE              Generic UDP Encapsulation

214	   GUE Header       A variable-length protocol header that is composed
215	                    of a primary four-byte header and zero or more
216	                    four-byte words of optional header data

218	   GUE packet       A UDP/IP packet that contains a GUE header and GUE
219	                    payload within the UDP payload

221	   GUE variant      A version of the GUE protocol or an alternate form
222	                    of a version

224	   Encapsulator     A network node that encapsulates packets in GUE

226	   Decapsulator     A network node that decapsulates and processes
227	                    packets encapsulated in GUE

229	   Data message     An encapsulated packet in a GUE payload that is
230	                    addressed to the protocol stack for an associated
231	                    protocol

233	   Control message  A formatted message in the GUE payload that is
234	                    implicitly addressed to the decapsulator to monitor
235	                    or control the state or behavior of a tunnel

237	   Flags            A set of bit flags in the primary GUE header

238	   Extension field  An optional field in a GUE header whose presence is
239	                    indicated by corresponding flag(s)

241	   C-bit            A single bit flag in the primary GUE header that
242	                    indicates whether the GUE packet contains a control
243	                    message or data message

245	   Hlen             A field in the primary GUE header that gives the
246	                    length of the GUE header

248	   Proto/ctype      A field in the GUE header that holds either the IP
249	                    protocol number for a data message or a type for a
250	                    control message

252	   Outer IP header  Refers to the outermost IP header or packet when
253	                    encapsulating a packet over IP

255	   Inner IP header  Refers to an encapsulated IP header when an IP
256	                    packet is encapsulated

258	   Outer packet     Refers to an encapsulating packet

260	   Inner packet     Refers to a packet that is encapsulated

262	   TMCE             A traffic-managed controlled environment, i.e., an
263	                    IP network that is traffic-engineered and/or
264	                    otherwise managed

266	1.3. Requirements Language

268	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
269	   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
270	   document are to be interpreted as described in [RFC2119].

272	2. Base packet format

274	   A GUE packet consists of a UDP packet whose payload is a GUE
275	   header followed by a payload which is either an encapsulated packet
276	   of some IP protocol or a control message such as an OAM (Operations,
277	   Administration, and Management) message. A GUE packet has the general
278	   format:

280	   +-------------------------------+
281	   |                               |
282	   |        UDP/IP header          |
283	   |                               |
284	   |-------------------------------|
285	   |                               |
286	   |         GUE Header            |
287	   |                               |
288	   |-------------------------------|
289	   |                               |
290	   |      Encapsulated packet      |
291	   |      or control message       |
292	   |                               |
293	   +-------------------------------+

295	   The GUE header is variable length as determined by the presence of
296	   optional extension fields.

298	2.1. GUE variant

300	   The first two bits of the GUE header contain the GUE protocol variant
301	   number. The variant number can indicate the version of the GUE
302	   protocol as well as alternate forms of a version.

304	   Variants 0 and 1 are described in this specification; variants 2 and
305	   3 are reserved.

307	3. Variant 0

309	   Variant 0 indicates version 0 of GUE. This variant defines a generic
310	   extensible format to encapsulate packets by Internet protocol number.

312	3.1. Header format

314	   The header format for variant 0 of GUE in UDP is:

316	    0                   1                   2                   3
317	    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
318	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+\
319	   |        Source port            |      Destination port         | |
320	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ UDP
321	   |           Length              |          Checksum             | |
322	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+/
323	   | 0 |C|   Hlen  |  Proto/ctype  |             Flags             |\
324	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |
325	   |                                                               | GUE
326	   ~                  Extension Fields (optional)                  ~ |
327	   |                                                               | |
328	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+/

330	   The contents of the UDP header are:

332	      o Source port: If connection semantics (section 5.6.1) are applied
333	        to an encapsulation, this is set to the local source port for
334	        the connection. When connection semantics are not applied, the
335	        source port is either set to a flow entropy value, as described
336	        in section 5.11, or is set to the GUE assigned port number,
337	        6080.

339	      o Destination port: If connection semantics (section 5.6.1) are
340	        applied to an encapsulation, this is set to the destination port
341	        for the tuple. If connection semantics are not applied then the
342	        destination port is set to the GUE assigned port number, 6080.

344	      o Length: Canonical length of the UDP packet (length of UDP header
345	        and payload).

347	      o Checksum: Standard UDP checksum (handling is described in
348	        section 5.8).

350	   The GUE header consists of:

352	      o Variant: 0 indicates GUE protocol version 0 with a header.

354	      o C: C-bit. When set, indicates a control message. When not set,
355	        indicates a data message.

357	      o Hlen: Length in 32-bit words of the GUE header, including
358	        optional extension fields but not the first four bytes of the
359	        header. Computed as (header_len - 4) / 4, where header_len is
360	        the total header length in bytes. All GUE headers are a multiple
361	        of four bytes in length. Maximum header length is 128 bytes.

363	      o Proto/ctype: When the C-bit is set, this field contains a
364	        control message type for the payload (section 3.2.2). When the
365	        C-bit is not set, the field holds the Internet protocol number
366	        for the encapsulated packet in the payload (section 3.2.1). The
367	        control message or encapsulated packet begins at the offset
368	        provided by Hlen.

370	      o Flags: Header flags that may be allocated for various purposes
371	        and may indicate the presence of extension fields. Undefined
372	        header flag bits MUST be set to zero on transmission.

374	      o Extension Fields: Optional fields whose presence is indicated by
375	        corresponding flags.
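
	   The field layout above can be sketched as a small parser. This is a
	   non-normative illustration; the names GueHeader and parse_gue_v0 are
	   hypothetical, and the choice to raise an exception on malformed
	   input is an assumption of the example, not a requirement stated
	   here.

```python
import struct
from typing import NamedTuple


class GueHeader(NamedTuple):
    variant: int      # 2-bit GUE variant number
    control: bool     # C-bit: True for a control message
    hlen: int         # header length in 32-bit words, excluding first 4 bytes
    proto_ctype: int  # IP protocol number (data) or control type (control)
    flags: int        # the sixteen flag bits
    header_len: int   # total GUE header length in bytes


def parse_gue_v0(udp_payload: bytes) -> GueHeader:
    """Parse the primary four bytes of a variant 0 GUE header."""
    if len(udp_payload) < 4:
        raise ValueError("UDP payload too short for a GUE header")
    first, proto_ctype, flags = struct.unpack("!BBH", udp_payload[:4])
    variant = first >> 6            # top two bits
    if variant != 0:
        raise ValueError(f"not a variant 0 header (variant={variant})")
    control = bool(first & 0x20)    # third bit is the C-bit
    hlen = first & 0x1F             # low five bits
    header_len = 4 + 4 * hlen       # inverse of (header_len - 4) / 4
    if header_len > len(udp_payload):
        raise ValueError("Hlen exceeds the UDP payload length")
    return GueHeader(variant, control, hlen, proto_ctype, flags, header_len)
```

	   Because Hlen is five bits wide, the largest value is 31, which gives
	   4 + 4 * 31 = 128 bytes, matching the maximum header length above.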

377	3.2. Proto/ctype field

379	   The proto/ctype field contains either an Internet protocol number
380	   (when the C-bit is not set) or a GUE control message type (when the
381	   C-bit is set).

383	3.2.1. Proto field

385	   When the C-bit is not set, the proto/ctype field MUST contain an IANA
386	   Internet Protocol Number [IANA-PN]. The protocol number is
387	   interpreted relative to the IP protocol that encapsulates the UDP
388	   packet (i.e. protocol of the outer IP header). The protocol number
389	   serves as an indication of the type of the next protocol header which
390	   is contained in the GUE payload at the offset indicated in Hlen.

392	   IP protocol number 59 ("No next header") can be set to indicate that
393	   the GUE payload does not begin with the header of an IP protocol.
394	   This would be the case, for instance, if the GUE payload were a
395	   fragment when performing GUE level fragmentation. The interpretation
396	   of the payload is performed through other means such as flags and
397	   extension fields, and nodes MUST NOT parse packets based on the IP
398	   protocol number in this case.

400	3.2.2. Ctype field

402	   When the C-bit is set, the proto/ctype field MUST be set to a valid
403	   control message type. A value of zero indicates that the GUE payload
404	   requires further interpretation to deduce the control type. This
405	   might be the case when the payload is a fragment of a control
406	   message, where only the reassembled packet can be interpreted as a
407	   control message.

409	   Control messages will be defined in an IANA registry. Control message
410	   types 1 through 127 may be defined in standards. Types 128 through
411	   255 are reserved to be user defined for experimentation.

413	   This document does not specify any standard control message types
414	   other than type 0. Type 0 indicates that the GUE payload is a control
415	   message, or part of a control message (as might be the case in GUE
416	   fragmentation) that cannot be correctly parsed or interpreted without
417	   additional context.
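
	   The control type ranges described in this section can be summarized
	   in a short, non-normative sketch. The function name and the returned
	   labels are hypothetical and chosen only for illustration.

```python
def classify_ctype(ctype: int) -> str:
    """Classify a control message type by the ranges in this section."""
    if not 0 <= ctype <= 255:
        raise ValueError("ctype is an 8-bit field")
    if ctype == 0:
        # Payload needs further interpretation (e.g., it is a fragment).
        return "needs-context"
    if ctype <= 127:
        # Types 1 through 127 may be defined in standards.
        return "standard"
    # Types 128 through 255 are user defined for experimentation.
    return "experimental"
```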

419	3.3. Flags and extension fields

421	   Flags and associated extension fields are the primary mechanism of
422	   extensibility in GUE. As mentioned in section 3.1, GUE header flags
423	   indicate the presence of optional extension fields in the GUE header.
424	   [GUEEXTEN] defines an initial set of GUE extensions.

426	3.3.1. Requirements

428	   There are sixteen flag bits in the GUE header. Flags may indicate
429	   presence of extension fields. The size of an extension field
430	   indicated by a flag MUST be fixed in the specification of the flag.

432	   Flags can be grouped together to allow different lengths for an
433	   extension field. For example, if two flag bits are grouped, a field
434	   can have three possible lengths: a bit value of 00 indicates that no
435	   field is present, and 01, 10, and 11 indicate three possible
436	   lengths for the field. Regardless of how flag bits are grouped, the
437	   lengths and offsets of extension fields corresponding to a set of
438	   flags MUST be well defined and deterministic.

440	   Extension fields are placed in order of the flags. New flags are to
441	   be allocated from high to low order bit contiguously without holes.
442	   Flags allow random access: for instance, to inspect the field
443	   corresponding to the Nth flag bit, an implementation only considers
444	   the previous N-1 flags to determine the offset. Flags after the Nth
445	   flag are not pertinent in calculating the offset of the field for the
446	   Nth flag. Random access of flags and fields permits processing of
447	   optional extensions in an order that is independent of their position
448	   in the packet.
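
	   The random access property can be sketched as follows. This is a
	   non-normative illustration: the LAYOUT table is hypothetical, with
	   one single-bit flag selecting a 4-byte field and one three-bit flag
	   group whose value 001 selects an 8-byte field. Sizes for the other
	   group values are deliberately left undefined in the sketch.

```python
# Hypothetical flag layout, highest-order flag first. Each entry is
# (name, bit offset from the high-order flag bit, width in bits, sizes),
# where sizes maps the flag (or group) value to a field length in bytes.
LAYOUT = [
    ("group_id", 0, 1, {0: 0, 1: 4}),
    ("security", 1, 3, {0: 0, 1: 8}),  # other values not defined here
]


def group_value(flags: int, bit_offset: int, width: int) -> int:
    """Extract a flag or flag group from the 16-bit flags field."""
    shift = 16 - bit_offset - width
    return (flags >> shift) & ((1 << width) - 1)


def field_offset(flags: int, name: str):
    """Return (offset, size) in bytes of the named extension field,
    measured from the end of the primary four-byte header, or None if
    the field is absent. Only flags preceding the field are consulted."""
    offset = 0
    for entry_name, bit_offset, width, sizes in LAYOUT:
        size = sizes[group_value(flags, bit_offset, width)]
        if entry_name == name:
            return (offset, size) if size else None
        offset += size
    raise KeyError(name)
```

	   The loop stops as soon as the requested field is reached, so flags
	   after it never influence the computed offset.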

450	   Flags (or grouped flags) are idempotent such that new flags MUST NOT
451	   cause reinterpretation of old flags. Also, new flags MUST NOT alter
452	   interpretation of other elements in the GUE header nor how the
453	   message is parsed (for instance, in a data message the proto/ctype
454	   field always holds an IP protocol number as an invariant).

456	   The set of available flags can be extended in the future by defining
457	   a "flag extensions bit" that refers to a field containing a new set
458	   of flags.

460	3.3.2. Example GUE header with extension fields

462	   An example GUE header for a data message encapsulating an IPv4 packet
463	   and containing the Group Identifier and Security extension fields
464	   (both defined in [GUEEXTEN]) is shown below:

466	    0                   1                   2                   3
467	    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
468	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
469	   | 0 |0|    3    |       4       |1|0 0 1|          0            |
470	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
471	   |                        Group Identifier                       |
472	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
473	   |                                                               |
474	   +                           Security                            +
475	   |                                                               |
476	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

478	   In the above example, the first flag bit is set, which indicates that
479	   the Group Identifier extension, a 32-bit field, is present.
480	   The second through fourth bits of the flags are grouped flags that
481	   indicate the presence of a Security field with seven possible sizes.
482	   In this example, 001 indicates a 64-bit Security field.
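
	   As a non-normative check of the example above, the sketch below
	   builds the same header and shows how the values fit together. The
	   helper name build_gue_v0 and the placeholder field contents are
	   assumptions of the example.

```python
import struct


def build_gue_v0(proto_ctype: int, flags: int,
                 extensions: bytes = b"", control: bool = False) -> bytes:
    """Build a variant 0 GUE header (hypothetical helper)."""
    if len(extensions) % 4:
        raise ValueError("extension fields must total a multiple of 4 bytes")
    hlen = len(extensions) // 4
    if hlen > 31:
        raise ValueError("GUE header would exceed 128 bytes")
    first = (0 << 6) | (int(control) << 5) | hlen  # variant 0, C-bit, Hlen
    return struct.pack("!BBH", first, proto_ctype, flags) + extensions


group_identifier = bytes(4)   # placeholder 32-bit Group Identifier
security = bytes(8)           # placeholder 64-bit Security field
flags = 0b1001_0000_0000_0000  # first flag set, grouped flags = 001

header = build_gue_v0(4, flags, group_identifier + security)
```

	   The 12 bytes of extension fields give Hlen = 3, and the primary
	   four bytes of the header are 0x03, 0x04, 0x90, 0x00.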

484	3.4. Surplus space

486	   The length of a GUE header, as indicated in the GUE Hlen field, may
487	   exceed the space consumed by optional extensions in a packet. The
488	   space between the end of the last optional field and the end of the
489	   header is termed the "surplus space".

491	   Surplus space is reserved per this specification and uses may be
492	   defined in future specifications. If a node receives a GUE packet
493	   with non-zero length of surplus space then it MUST NOT attempt to
494	   interpret the data in the surplus space. For purposes of transforms
495	   across the header, such as an optional integrity check over the
496	   header, the surplus space is considered to be part of the GUE header
497	   and is included in the computation.
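
	   The relationship between Hlen, the extension fields, and the surplus
	   space can be sketched as below. This is a non-normative
	   illustration; split_gue_header is a hypothetical helper, and the
	   caller is assumed to already know how many bytes the extension
	   fields indicated by the flags occupy.

```python
def split_gue_header(udp_payload: bytes, extension_bytes: int):
    """Split a variant 0 GUE packet into (extensions, surplus, payload)."""
    if len(udp_payload) < 4:
        raise ValueError("UDP payload too short for a GUE header")
    hlen = udp_payload[0] & 0x1F
    header_end = 4 + 4 * hlen          # end of the whole GUE header
    ext_end = 4 + extension_bytes      # end of the last extension field
    if ext_end > header_end or header_end > len(udp_payload):
        raise ValueError("inconsistent header lengths")
    extensions = udp_payload[4:ext_end]
    surplus = udp_payload[ext_end:header_end]  # not to be interpreted
    payload = udp_payload[header_end:]
    return extensions, surplus, payload
```

	   The surplus bytes are returned only so that they can be covered by
	   a transform over the whole header; the sketch does not inspect them.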

499	3.5. Message types

501	   There are two message types in GUE variant 0: control messages and
502	   data messages.

504	3.5.1. Control messages

506	   Control messages carry formatted data that are implicitly addressed
507	   to the decapsulator to monitor or control the state or behavior of a
508	   tunnel (OAM). For instance, an echo request and corresponding echo
509	   reply message can be defined to test for liveness.

511	   Control messages are indicated in the GUE header when the C-bit is
512	   set. The payload is interpreted as a control message with type
513	   specified in the proto/ctype field. The format and contents of the
514	   control message are indicated by the type and can be variable length.

516	   Other than interpreting the proto/ctype field as a control message
517	   type, the meaning and semantics of the rest of the elements in the
518	   GUE header are the same as that of data messages. Forwarding and
519	   routing of control messages should be the same as that of a data
520	   message with the same outer IP and UDP header; this ensures that
521	   control messages can be created that follow the same path through the
522	   network as data messages.

524	3.5.2. Data messages

526	   Data messages carry encapsulated packets that are addressed to the
527	   protocol stack for the associated protocol. Data messages are a
528	   primary means of encapsulation and can be used to create tunnels for
529	   overlay networks.

531	   Data messages are indicated in the GUE header when the C-bit is not
532	   set. The payload of a data message is interpreted as an encapsulated
533	   packet of an Internet protocol indicated in the proto/ctype field.
534	   The encapsulated packet immediately follows the GUE header.

536	4. Variant 1

538	   Variant 1 of GUE allows direct encapsulation of IPv4 and IPv6 in UDP.
539	   In this variant there is no GUE header; a UDP packet carries an IP
540	   packet. The first two bits of the UDP payload are the GUE variant
541	   field and coincide with the first two bits of the version number in
542	   the IP header. The first two version bits of both IPv4 (0100) and
543	   IPv6 (0110) are 01, so GUE variant 1 is used for direct IP
544	   encapsulation, which makes the two GUE variant bits 01 as well.

546	   This technique is effectively a means to compress out the GUE variant
547	   0 header when encapsulating IPv4 or IPv6 packets and there are no
548	   flags or extension fields. This method can be used on the
549	   same port number as packets with the GUE header (GUE variant 0
550	   packets). This technique saves encapsulation overhead on costly links
551	   for the common use of IP encapsulation, and also obviates the need to
552	   allocate a separate UDP port number for IP-over-UDP encapsulation.

554	4.1. Direct encapsulation of IPv4

556	   The format for encapsulating IPv4 directly in UDP is:

558	    0                   1                   2                   3
559	    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
560	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+\
561	   |        Source port            |      Destination port         | |
562	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ UDP
563	   |           Length              |          Checksum             | |
564	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+/
565	   |0|1|0|0|  IHL  |Type of Service|          Total Length         |
566	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
567	   |         Identification        |Flags|      Fragment Offset    |
568	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
569	   |  Time to Live |   Protocol    |   Header Checksum             |
570	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
571	   |                       Source IPv4 Address                     |
572	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
573	   |                     Destination IPv4 Address                  |
574	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

576	   The UDP fields are set in a similar manner as described in section
577	   3.1.

579	   Note that the 0100 value in the first four bits of the UDP payload
580	   expresses both the GUE variant as 1 (bits 01) and IP version as 4
581	   (bits 0100).

583	4.2. Direct encapsulation of IPv6

585	   The format for encapsulating IPv6 directly in UDP is demonstrated
586	   below:

588	    0                   1                   2                   3
589	    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
590	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+\
591	   |        Source port            |      Destination port         | |
592	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ UDP
593	   |           Length              |          Checksum             | |
594	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+/
595	   |0|1|1|0| Traffic Class |           Flow Label                  |
596	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
597	   |         Payload Length        |     NextHdr   |   Hop Limit   |
598	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
599	   |                                                               |
600	   +                                                               +
601	   |                                                               |
602	   +                        Source IPv6 Address                    +
603	   |                                                               |
604	   +                                                               +
605	   |                                                               |
606	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
607	   |                                                               |
608	   +                                                               +
609	   |                                                               |
610	   +                      Destination IPv6 Address                 +
611	   |                                                               |
612	   +                                                               +
613	   |                                                               |
614	   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

616	   The UDP fields are set in a similar manner as described in section
617	   3.1.

619	   Note that the 0110 value in the first four bits of the UDP
620	   payload expresses both the GUE variant as 1 (bits 01) and IP version
621	   as 6 (bits 0110).
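
	   As an illustration of sections 4.1 and 4.2, a receiver could
	   derive both values from the first octet of the UDP payload. The
	   following is a minimal sketch and not part of the protocol; the
	   function name classify_udp_payload is hypothetical.

```python
def classify_udp_payload(payload: bytes):
    """Return (gue_variant, ip_version) for the payload of a GUE UDP packet.

    ip_version is None unless the payload is a variant 1 packet.
    """
    if not payload:
        raise ValueError("empty UDP payload")
    first = payload[0]
    variant = first >> 6              # top two bits: GUE variant
    if variant != 1:
        return variant, None          # e.g. variant 0, which has a GUE header
    ip_version = first >> 4           # top four bits: IP version nibble
    if ip_version not in (4, 6):
        raise ValueError("variant 1 payload is neither IPv4 nor IPv6")
    return variant, ip_version
```

	   For example, a first octet of 0x45 yields (1, 4) and a first
	   octet of 0x60 yields (1, 6).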

623	5. Operation

625	   The figure below illustrates the use of GUE encapsulation between two
626	   hosts. Host 1 is sending packets to Host 2. An encapsulator performs
627	   encapsulation of packets from Host 1. These encapsulated packets
628	   traverse the network as UDP packets. At the decapsulator, packets are
629	   decapsulated and sent on to Host 2. Packet flow in the reverse
630	   direction need not be symmetric; for example, the reverse path might
631	   not use GUE or any other form of encapsulation.

633	   +---------------+                       +---------------+
634	   |               |                       |               |
635	   |    Host 1     |                       |     Host 2    |
636	   |               |                       |               |
637	   +---------------+                       +---------------+
638	          |                                        ^
639	          V                                        |
640	   +---------------+   +---------------+   +---------------+
641	   |               |   |               |   |               |
642	   | Encapsulator  |-->|    Layer 3    |-->| Decapsulator  |
643	   |               |   |    Network    |   |               |
644	   +---------------+   +---------------+   +---------------+

646	   The encapsulator and decapsulator may be co-resident with the
647	   corresponding hosts, or may be on separate nodes in the network.

649	5.1. Network tunnel encapsulation

651	   Network tunneling can be achieved by encapsulating layer 2 or layer 3
652	   packets. In this case, the encapsulator and decapsulator nodes are
653	   the tunnel endpoints. These could be routers that provide network
654	   tunnels on behalf of communicating hosts.

656	5.2. Transport layer encapsulation

658	   When encapsulating layer 4 packets, the encapsulator and decapsulator
659	   should be co-resident with the hosts. In this case, the encapsulation
660	   headers are inserted between the IP header and the transport packet.
661	   The addresses in the IP header refer to both the endpoints of the
662	   encapsulation and the endpoints for terminating the encapsulated
663	   transport protocol. Note that the transport layer ports in the
664	   encapsulated packet are independent of the UDP ports in the outer
665	   packet.

667	5.3. Encapsulator operation

669	   Encapsulators create GUE data messages, set the fields of the UDP
670	   header, set flags and optional extension fields in the GUE header,
671	   and forward packets to a decapsulator.

673	   An encapsulator can be an end host originating the packets of a flow,
674	   or can be a network device performing encapsulation on behalf of
675	   hosts (routers implementing tunnels for instance). In either case,
676	   the intended target (decapsulator) is indicated by the outer
677	   destination IP address and destination port in the UDP header.

679	   If an encapsulator is tunneling packets, that is encapsulating
680	   packets of layer 2 or layer 3 protocols (e.g. EtherIP, IPIP, ESP
681	   tunnel mode), it SHOULD follow standard conventions for tunneling one
682	   protocol over another. For instance, if an IP packet is being
683	   encapsulated in GUE then diffserv interaction [RFC2983] and ECN
684	   propagation for tunnels [RFC6040] SHOULD be followed.

686	5.4. Decapsulator operation

688	   A decapsulator performs decapsulation of GUE packets. A decapsulator
689	   is addressed by the outer destination IP address and UDP destination
690	   port of a GUE packet. The decapsulator validates packets, including
691	   fields of the GUE header.

693	   If a decapsulator receives a GUE packet with an unsupported variant,
694	   unknown flag, bad header length (too small for included extension
695	   fields), unknown control message type, bad protocol number, an
696	   unsupported payload type, or an otherwise malformed header, it MUST
697	   drop the packet. Such events MAY be logged subject to configuration
698	   and rate limiting of logging messages. Note that set flags in a GUE
699	   header that are unknown to a decapsulator MUST NOT be ignored. If a
700	   GUE packet is received by a decapsulator with unknown flags, the
701	   packet MUST be dropped.
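
	   The drop rules above can be sketched as follows. This is an
	   illustration only. It assumes the variant 0 header layout of
	   section 3.1 (2-bit variant, C-bit, 5-bit header length counted
	   in 32-bit words beyond the first four octets, 8-bit proto/ctype,
	   16-bit flags). The known_flags table and the function name are
	   hypothetical, and checks on the protocol number and control
	   message type are left out.

```python
import struct
from typing import Dict, Optional

GUE_BASE_LEN = 4  # assumed size of the variant 0 base header, in bytes


def validate_gue_header(payload: bytes,
                        known_flags: Dict[int, int]) -> Optional[dict]:
    """Parse a variant 0 GUE header; return None if the packet is dropped.

    known_flags maps each flag bit the decapsulator understands to the
    length in bytes of the extension field it implies (hypothetical table).
    """
    if len(payload) < GUE_BASE_LEN:
        return None                          # malformed: truncated header
    first, proto_ctype, flags = struct.unpack("!BBH", payload[:GUE_BASE_LEN])
    if first >> 6 != 0:
        return None                          # variant not handled by this sketch
    hlen = (first & 0x1F) * 4                # extension length in bytes
    if len(payload) < GUE_BASE_LEN + hlen:
        return None                          # header runs past the payload
    needed = 0
    for bit in range(16):
        mask = 1 << bit
        if flags & mask:
            if mask not in known_flags:
                return None                  # unknown set flag: drop
            needed += known_flags[mask]
    if hlen < needed:
        return None                          # too small for extension fields
    return {
        "control": bool(first & 0x20),       # C-bit
        "proto_ctype": proto_ctype,
        "flags": flags,
        "payload_offset": GUE_BASE_LEN + hlen,
    }
```

	   A caller would drop the packet whenever None is returned, and
	   could log the event subject to rate limiting.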

703	5.4.1. Processing a received data message

705	   If a valid data message is received, the UDP header and GUE header
706	   are (logically) removed from the packet. The outer IP header remains
707	   intact and the next protocol in the IP header is set to the protocol
708	   from the proto field in the GUE header. The resulting packet is then
709	   resubmitted into the protocol stack to process the packet as though
710	   it was received with the protocol indicated in the GUE header.

712	   As an example, consider that a data message is received where GUE
713	   encapsulates an IPv4 packet using GUE variant 0. In this case the
714	   proto field in the GUE header is set to 4 for IPv4 encapsulation:

716	   +-------------------------------------+
717	   |   IP header (next proto = 17,UDP)   |
718	   |-------------------------------------|
719	   |                  UDP                |
720	   |-------------------------------------|
721	   |  GUE (proto = 4,IPv4 encapsulation) |
722	   |-------------------------------------|
723	   |        IPv4 header and packet       |
724	   +-------------------------------------+

726	   The receiver removes the UDP and GUE headers and sets the next
727	   protocol field in the IP packet to 4, which is derived from the GUE
728	   proto field. The resultant packet would have the format:

730	   +-------------------------------------+
731	   |   IP header (next proto = 4,IPv4)   |
732	   |-------------------------------------|
733	   |        IPv4 header and packet       |
734	   +-------------------------------------+

736	   This packet is then resubmitted into the protocol stack to be
737	   processed as an IPv4 encapsulated packet.
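
	   The example above can be modelled in a few lines. This is a
	   sketch under simplifying assumptions: the outer IP header is
	   represented as a dictionary with a hypothetical "next_proto"
	   key, the variant 0 header layout of section 3.1 is assumed, and
	   the length and checksum updates a real stack would perform are
	   omitted.

```python
UDP_PROTO = 17


def decapsulate_data_message(outer_ip: dict, udp_payload: bytes):
    """Return (new outer IP header, inner packet) for a GUE data message.

    outer_ip is a simplified stand-in for the outer IP header. udp_payload
    is the UDP payload: the GUE header followed by the encapsulated packet.
    """
    if outer_ip.get("next_proto") != UDP_PROTO:
        raise ValueError("outer packet is not UDP")
    if len(udp_payload) < 4:
        raise ValueError("truncated GUE header")
    first, proto = udp_payload[0], udp_payload[1]
    if first >> 6 != 0 or first & 0x20:
        raise ValueError("not a variant 0 data message")
    gue_len = 4 + (first & 0x1F) * 4           # base header plus extensions
    if len(udp_payload) < gue_len:
        raise ValueError("GUE header longer than payload")
    new_ip = dict(outer_ip, next_proto=proto)  # outer header otherwise intact
    return new_ip, udp_payload[gue_len:]
```

	   With a proto field of 4, the returned header has next_proto set
	   to 4 and the returned bytes start at the inner IPv4 header.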

739	5.4.2. Processing a received control message

741	   If a valid control message is received, the packet MUST be processed
742	   as a control message. The specific processing to be performed depends
743	   on the value in the ctype field of the GUE header.

745	5.5. Middlebox inspection

747	   A middlebox MAY inspect a GUE header. A middlebox MUST NOT modify a
748	   GUE header or UDP payload.

750	   To inspect a GUE header, a middlebox needs to identify GUE packets.
751	   The obvious method is to match the destination UDP port number to be
752	   the GUE port number (i.e. 6080). Per [RFC7605], transport port
753	   numbers only have meaning at the endpoints of communications, so
754	   inferring the type of a UDP payload based on port number may be
755	   incorrect. Middleboxes MUST NOT take any action that would have
756	   harmful side effects if a UDP packet were misinterpreted as being a
757	   GUE packet. In particular, a middlebox MUST NOT modify a UDP payload
758	   based on inferring the payload type from the port number, since
759	   doing so could cause silent data corruption.

761	   A middlebox MAY interpret some flags and extension fields of the GUE
762	   header for classification purposes, but is not required to understand
763	   any of the flags or extension fields in GUE packets. A middlebox MUST
764	   NOT drop a GUE packet merely because there are flags unknown to it.
765	   Similarly, a middlebox MUST NOT arbitrarily filter packets based on
766	   GUE flags or extension fields that are present or not present. The
767	   header length in the GUE header allows a middlebox to inspect the
768	   payload packet without needing to parse the flags or extension
769	   fields.

771	5.6. Router and switch operation

773	   Routers and switches SHOULD forward GUE packets as standard UDP/IP
774	   packets. The outer five-tuple should contain sufficient information
775	   to perform flow classification corresponding to the flow of the inner
776	   packet. A router does not normally need to parse a GUE header, and
777	   none of the flags or extension fields in the GUE header are expected
778	   to affect routing. In cases where the outer five-tuple does not
779	   provide sufficient entropy for flow classification, for instance when
780	   UDP ports are fixed to provide connection semantics (section 5.6.1),
781	   the encapsulated packet MAY be parsed to determine flow entropy.

783	   A router MUST NOT modify a GUE header or payload when forwarding a
784	   packet. It MAY encapsulate a GUE packet in another GUE packet, for
785	   instance to implement a network tunnel (i.e. by encapsulating an IP
786	   packet with a GUE payload in another IP packet as a GUE payload). In
787	   this case, the router takes the role of an encapsulator, and the
788	   corresponding decapsulator is the logical endpoint of the tunnel.
789	   When encapsulating a GUE packet within another GUE packet, there are
790	   no provisions to automatically copy flags or fields to the outer GUE
791	   header. Each layer of encapsulation is considered independent.

793	5.6.1. Connection semantics

795	   A middlebox might infer bidirectional connection semantics for a UDP
796	   flow. For instance, a stateful firewall might create a five-tuple
797	   rule to match flows on egress, and a corresponding five-tuple rule
798	   for matching ingress packets where the roles of source and
799	   destination are reversed for the IP addresses and UDP port numbers.
800	   To operate in this environment, a GUE tunnel should be configured to
801	   assume connected semantics defined by the UDP five-tuple, and the use
802	   of GUE encapsulation needs to be symmetric between both endpoints.
803	   The source port set in the UDP header MUST be the destination port
804	   the peer would set for replies. In this case, the UDP source port for
805	   a tunnel would be a fixed value and not set to be flow entropy.

807	   The selection of whether to make the UDP source port fixed or set to
808	   a flow entropy value for each packet sent SHOULD be configurable for
809	   a tunnel. The default MUST be to set the flow entropy value in the
810	   UDP source port.
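
	   A minimal sketch of this configuration choice follows. The
	   TunnelConfig class and both function names are hypothetical;
	   the only behaviour taken from the text is that flow entropy is
	   the default and that a connected tunnel uses a fixed source
	   port.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TunnelConfig:
    connected: bool = False                  # default: flow entropy in source port
    fixed_source_port: Optional[int] = None  # used only when connected is True


def choose_source_port(cfg: TunnelConfig, entropy_port: int) -> int:
    """Pick the outer UDP source port for one packet of a tunnel."""
    if cfg.connected:
        if cfg.fixed_source_port is None:
            raise ValueError("connected tunnel needs a fixed source port")
        return cfg.fixed_source_port
    return entropy_port


def reverse_five_tuple(t):
    """Return the five-tuple a stateful firewall would expect on ingress."""
    src_ip, dst_ip, src_port, dst_port, proto = t
    return (dst_ip, src_ip, dst_port, src_port, proto)
```

	   With connected semantics, the peer's replies match the reversed
	   five-tuple only if the source port stays the same for every
	   packet of the tunnel.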

812	5.6.2. NAT

814	   IP address and port translation can be performed on the UDP/IP
815	   headers adhering to the requirements for NAT (Network Address
816	   Translation) with UDP [RFC4787]. In the case of stateful NAT,
817	   connection semantics MUST be applied to a GUE tunnel as described in
818	   section 5.6.1. GUE endpoints MAY also invoke STUN [RFC5389] or ICE
819	   [RFC5245] to manage NAT port mappings for encapsulations.

821	5.7. MTU and fragmentation

823	   Standard conventions for handling of MTU (Maximum Transmission Unit)
824	   and fragmentation in conjunction with networking tunnels
825	   (encapsulation of layer 2 or layer 3 packets) SHOULD be followed.
826	   Details are described in MTU and Fragmentation Issues with In-the-
827	   Network Tunneling [RFC4459].

829	   If a packet is fragmented before encapsulation in GUE, all the
830	   related fragments MUST be encapsulated using the same UDP source
831	   port. An operator SHOULD set MTU to account for encapsulation
832	   overhead and reduce the likelihood of fragmentation.
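
	   The overhead an operator needs to account for can be estimated
	   with simple arithmetic. The sketch below assumes an outer IPv4
	   header without options (20 octets), an outer IPv6 header without
	   extension headers (40 octets), the 8-octet UDP header, and a
	   4-octet base GUE header for variant 0 as laid out in section
	   3.1. The function and constant names are hypothetical.

```python
IPV4_HDR = 20      # outer IPv4 header without options
IPV6_HDR = 40      # outer IPv6 header without extension headers
UDP_HDR = 8
GUE_BASE_HDR = 4   # assumed variant 0 base header; variant 1 has none


def inner_mtu(path_mtu: int, outer_ipv6: bool = False,
              variant: int = 0, gue_ext_len: int = 0) -> int:
    """Largest encapsulated packet that fits in path_mtu unfragmented."""
    outer_ip = IPV6_HDR if outer_ipv6 else IPV4_HDR
    gue = 0 if variant == 1 else GUE_BASE_HDR + gue_ext_len
    return path_mtu - outer_ip - UDP_HDR - gue
```

	   For a 1500-octet path MTU this gives 1468 octets for variant 0
	   over IPv4 with no extension fields, and 1472 octets for variant
	   1 over IPv4.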

834	   As an alternative to IP fragmentation, the GUE fragmentation
835	   extension can be used. GUE fragmentation is described in [GUEEXTEN].

837	5.8. UDP Checksum Handling

839	5.8.1. UDP Checksum with IPv4

841	   For UDP in IPv4, when a non-zero UDP checksum is used, the UDP
842	   checksum MUST be processed as specified in [RFC0768] and [RFC1122]
843	   for both transmit and receive. The IPv4 header includes a checksum
844	   that protects against misdelivery of the packet due to corruption of
845	   IP addresses. The UDP checksum potentially provides protection
846	   against corruption of the UDP header, GUE header, and GUE payload.
847	   Disabling the use of checksums is a deployment consideration that
848	   should take into account the risk and effects of packet corruption.

850	   When a decapsulator receives a packet, the UDP checksum field MUST be
851	   processed. If the UDP checksum is non-zero, the decapsulator MUST
852	   verify the checksum before accepting the packet. By default, a
853	   decapsulator SHOULD accept UDP packets with a zero checksum.  A node
854	   MAY be configured to disallow zero checksums per [RFC1122]; this may
855	   be done selectively, for instance by disallowing zero checksums from
856	   certain hosts that are known to be sending over paths subject to
857	   packet corruption. If verification of a non-zero checksum fails, a
858	   decapsulator lacks the capability to verify a non-zero checksum, or a
859	   packet with a zero checksum was received and the decapsulator is
860	   configured to disallow, the packet MUST be dropped and an event MAY
861	   be logged.
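
	   The receive-side rules for IPv4 can be sketched as below. The
	   checksum computation follows the pseudo header defined in
	   [RFC0768]; the function names and the allow_zero_checksum
	   parameter are hypothetical, and udp_segment is assumed to hold
	   exactly the UDP header and payload.

```python
import socket
import struct


def ones_complement_sum(data: bytes) -> int:
    """16-bit one's complement sum of data, zero-padded to an even length."""
    if len(data) % 2:
        data += b"\x00"
    total = 0
    for (word,) in struct.iter_unpack("!H", data):
        total += word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def udp4_checksum_ok(src_ip: str, dst_ip: str, udp_segment: bytes) -> bool:
    """Verify a non-zero UDP checksum over the IPv4 pseudo header."""
    pseudo = (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
              + struct.pack("!BBH", 0, 17, len(udp_segment)))
    return ones_complement_sum(pseudo + udp_segment) == 0xFFFF


def accept_udp4(src_ip: str, dst_ip: str, udp_segment: bytes,
                allow_zero_checksum: bool = True) -> bool:
    """Return True if the packet passes checksum processing, else drop."""
    if len(udp_segment) < 8:
        return False
    (checksum,) = struct.unpack("!H", udp_segment[6:8])
    if checksum == 0:
        return allow_zero_checksum       # zero checksum: local policy decides
    return udp4_checksum_ok(src_ip, dst_ip, udp_segment)
```

	   Setting allow_zero_checksum to False models a node configured
	   to disallow zero checksums, which could also be applied per
	   source host.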

863	5.8.2. UDP Checksum with IPv6

865	   For UDP in IPv6, the UDP checksum MUST be processed as specified in
866	   [RFC0768] and [RFC2460] for both transmit and receive.

868	   When UDP is used over IPv6, the UDP checksum is relied upon to
869	   protect both the IPv6 and UDP headers from corruption. As such, by
870	   default a GUE encapsulator MUST use UDP checksums.

872	   [GUEEXTEN] specifies a GUE checksum option that includes a pseudo
873	   header containing the IP addresses. An encapsulator MAY use zero-UDP
874	   checksums if it uses the GUE checksum. A non-zero UDP checksum and
875	   the GUE checksum SHOULD NOT be used simultaneously in a packet since
876	   that would be redundant.

878	   When deployed in a TMCE, a GUE encapsulator MAY be configured to use
879	   UDP zero-checksum mode and no GUE checksum if the traffic-managed
880	   controlled environment or a set of closely cooperating traffic-
881	   managed controlled environments (such as by network operators who
882	   have agreed to work together in order to jointly provide specific
883	   services) meet at least one of the following conditions:

885	      a. It is known (perhaps through knowledge of equipment types and
886	         lower-layer checks) that packet corruption is exceptionally
887	         unlikely and where the operator is willing to take the risk of
888	         undetected packet corruption.

890	      b. It is judged through observational measurements (perhaps of
891	         historic or current traffic flows that use a non-zero checksum)
892	         that the level of packet corruption is tolerably low and where
893	         the operator is willing to take the risk of undetected packet
894	         corruption.

896	      c. Carrying applications that are tolerant of misdelivered or
897	         corrupted packets (perhaps through higher-layer checksum,
898	         validation, and retransmission or transmission redundancy)
899	         where the operator is willing to rely on the applications using
900	         GUE to survive any corrupt packets.

902	   The following requirements apply to encapsulators deployed in a TMCE
903	   that use UDP zero-checksum mode:

905	      a. Use of the UDP checksum with IPv6 MUST be the default
906	         configuration for all communications.

908	      b. The GUE implementation MUST comply with all requirements
909	         specified in Section 4 of [RFC6936] and with requirement 1
910	         specified in Section 5 of [RFC6936].

912	      c. A decapsulator SHOULD only allow the use of UDP zero-checksum
913	         mode for IPv6 on a single received UDP Destination Port,
914	         regardless of the encapsulator. The motivation for this
915	         requirement is possible corruption of the UDP Destination Port,
916	         which may cause packet delivery to the wrong UDP port. If that
917	         other UDP port requires the UDP checksum, the misdelivered
918	         packet will be discarded.

920	      d. It is RECOMMENDED that the UDP zero-checksum mode for IPv6 is
921	         only enabled for certain selected source addresses. The
922	         decapsulator MUST check that the source and destination IPv6
923	         addresses in a received packet are permitted by configuration
924	         to use UDP zero-checksum mode and discard any packet for which
925	         this check fails.

927	      e. The tunnel encapsulator SHOULD use different IPv6 addresses for
928	         each GUE communication (tunnel or transport flow) that uses UDP
929	         zero-checksum mode, regardless of the decapsulator, in order to
930	         strengthen the decapsulator's check of the IPv6 source address
931	         (i.e., the same IPv6 source address SHOULD NOT be used with
932	         more than one IPv6 destination address, independent of whether
933	         that destination address is a unicast or multicast address).
934	         When this is not possible, it is RECOMMENDED to use each source
935	         IPv6 address for as few GUE communications that use UDP zero-
936	         checksum mode as is feasible.

938	      f. When any middlebox exists on the path of GUE communication, it
939	         is RECOMMENDED to use the default mode, i.e., use UDP checksum,
940	         to reduce the chance that the encapsulated packets will be
941	         dropped.

943	      g. Any middlebox that allows the UDP zero-checksum mode for IPv6
944	         MUST comply with requirements 1 and 8-10 in Section 5 of
945	         [RFC6936].

947	      h. Measures SHOULD be taken to prevent IPv6 traffic with zero UDP
948	         checksums from "escaping" to the general Internet; see Section
949	         5.9 for examples of such measures.

951	      i. IPv6 traffic with zero UDP checksums MUST be actively monitored
952	         for errors by the network operator. For example, the operator
953	         may monitor Ethernet-layer packet error rates.

955	      j. If a packet with a non-zero checksum is received, the checksum
956	         MUST be verified before accepting the packet. This is
957	         regardless of whether the tunnel encapsulator and decapsulator
958	         have been configured with UDP zero-checksum mode.

960	   The above requirements do not change either the requirements
961	   specified in [RFC8200] as modified by [RFC6935] or the requirements
962	   specified in [RFC6936].
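
	   The decapsulator-side checks in requirements c, d, and j can be
	   sketched as a small policy object. This is an illustration only:
	   the ZeroChecksumPolicy class and accept_udp6 function are
	   hypothetical, addresses are compared as already-normalized
	   strings, and checksum_valid stands for the result of a checksum
	   verification performed elsewhere.

```python
from dataclasses import dataclass, field
from typing import Optional, Set, Tuple


@dataclass
class ZeroChecksumPolicy:
    enabled: bool = False                    # requirement a: checksum by default
    allowed_port: Optional[int] = None       # requirement c: one destination port
    allowed_pairs: Set[Tuple[str, str]] = field(default_factory=set)  # req. d


def accept_udp6(policy: ZeroChecksumPolicy, src: str, dst: str,
                dst_port: int, checksum: int, checksum_valid: bool) -> bool:
    """Return True if a received GUE packet over IPv6 may be accepted."""
    if checksum != 0:
        return checksum_valid                # requirement j: always verify
    if not policy.enabled:
        return False                         # zero checksum not permitted
    if dst_port != policy.allowed_port:
        return False                         # requirement c
    return (src, dst) in policy.allowed_pairs  # requirement d
```

	   A packet with a zero checksum is accepted only when the mode is
	   enabled, the destination port matches, and the address pair is
	   configured; every other zero-checksum packet is discarded.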

964	   The requirement to check the source IPv6 address in addition to the
965	   destination IPv6 address and the strong recommendation against reuse
966	   of source IPv6 addresses among GUE communications collectively
967	   provide some mitigation for the absence of UDP checksum coverage of
968	   the IPv6 header. A traffic-managed controlled environment that
969	   satisfies at least one of the three conditions listed at the
970	   beginning of this section provides additional assurance.

972	   GUE packets are suitable for transmission over lower layers in the
973	   traffic-managed controlled environments that are allowed by the
974	   exceptions stated above, and the rate of corruption of the inner IP
975	   packet on such networks is not expected to increase by comparison to
976	   traffic that is not encapsulated in UDP. For these reasons, GUE does
977	   not provide an additional integrity check when UDP zero-checksum mode
978	   is used with IPv6, except when the GUE checksum [GUEEXTEN] is used;
979	   this design is in accordance with requirements 2, 3, and 5 specified
980	   in Section 5 of [RFC6936].

982	   Generic UDP Encapsulation does not accumulate incorrect transport-
983	   layer state as a consequence of GUE header corruption. A corrupt GUE
984	   packet may result in either packet discard or packet forwarding
985	   without accumulation of GUE state. Active monitoring of GUE traffic
986	   for errors is REQUIRED, as the occurrence of errors will result in
987	   some accumulation of error information outside the protocol for
988	   operational and management purposes. This design is in accordance
989	   with requirement 4 specified in Section 5 of [RFC6936].

991	   The remaining requirements specified in Section 5 of [RFC6936] are
992	   not applicable to GUE. Requirements 6 and 7 do not apply because GUE
993	   does not include a control feedback mechanism. Requirements 8-10 are
994	   middlebox requirements that do not apply to GUE tunnel endpoints.
995	   (See Section 5.5 for further middlebox discussion.)

997	   In summary, a TMCE GUE tunnel is allowed to use UDP zero-checksum
998	   mode for IPv6 when the conditions and requirements stated above are
999	   met. Otherwise, the UDP checksum needs to be used for IPv6 as
1000	   specified in [RFC768] and [RFC8200]. Use of GUE checksum is
1001	   RECOMMENDED when the UDP checksum is not used.

1003	5.9. Congestion Considerations

1005	   This section describes congestion considerations for GUE tunnels
1006	   (Layer 2 and Layer 3 encapsulation) and transport layer encapsulation
1007	   (Layer 4 protocol over GUE).

1009	5.9.1. GUE tunnels

1011	   Section 3.1.9 of [RFC8085] discusses the congestion considerations
1012	   for design and use of UDP tunnels; this is important because other
1013	   flows could share the path with one or more UDP tunnels,
1014	   necessitating congestion control [RFC2914] to avoid destructive
1015	   interference.

1017	   Congestion has potential impacts both on the rest of the network
1018	   containing a UDP tunnel and on the traffic flows using the UDP
1019	   tunnels. These impacts depend upon what sort of traffic is carried
1020	   over the tunnel, as well as the path of the tunnel. The GUE protocol
1021	   does not provide any congestion control and GUE UDP packets are
1022	   regular UDP packets. Therefore, a GUE tunnel MUST NOT be deployed to
1023	   carry non-congestion-controlled traffic over the Internet [RFC8085].

1025	   Within a TMCE network, GUE tunnels are appropriate for carrying
1026	   traffic that is not known to be congestion controlled. For example, a
1027	   GUE tunnel may be used to carry Multiprotocol Label Switching (MPLS)
1028	   traffic such as pseudowires or VPNs where specific bandwidth
1029	   guarantees are provided to each pseudowire or VPN. In such cases,
1030	   operators of TMCE networks avoid congestion by careful provisioning
1031	   of their networks, rate-limiting of user data traffic, and traffic
1032	   engineering according to path capacity.

1034	   When a GUE tunnel carries traffic that is not known to be congestion
1035	   controlled in a TMCE network, the tunnel MUST be deployed entirely
1036	   within that network, and measures SHOULD be taken to prevent the GUE
1037	   traffic from "escaping" the network to the general Internet. Examples
1038	   of such measures are:

1040	      o physical or logical isolation of the links carrying GUE from the
1041	        general Internet,

1043	      o deployment of packet filters that block the UDP ports assigned
1044	        for GUE, and

1046	      o imposition of restrictions on GUE traffic by software tools used
1047	        to set up GUE tunnels between specific end systems (as might be
1048	        used within a single data center) or by tunnel ingress nodes for
1049	        tunnels that don't terminate at end systems.

1051	5.9.2. Transport layer encapsulation

1053	   If GUE encapsulates a transport layer protocol, such as TCP, it is
1054	   expected that the transport layer or application layer properly
1055	   implements congestion control or avoidance. In the case that UDP is
1056	   encapsulated, the application is expected to provide congestion
1057	   control as specified in [RFC8085].

1059	5.10. Multicast

1061	   GUE packets can be multicast to decapsulators using a multicast
1062	   destination address in the outer IP header. Each receiving host will
1063	   decapsulate the packet independently following normal decapsulator
1064	   operations. The receiving decapsulators need to agree on the same set
1065	   of GUE parameters and properties; how such an agreement is reached is
1066	   outside the scope of this document.

1068	   GUE allows encapsulation of unicast, broadcast, or multicast traffic.
1069	   Flow entropy (the value in the UDP source port) can be generated from
1070	   the header of encapsulated unicast or broadcast/multicast packets at
1071	   an encapsulator. The mapping mechanism between the encapsulated
1072	   multicast traffic and the multicast capability in the IP network is
1073	   transparent and independent of the encapsulation and is otherwise
1074	   outside the scope of this document.

1076	5.11. Flow entropy for ECMP

1078	   A major objective of using GUE is that a network device can perform
1079	   flow classification corresponding to the flow of the inner
1080	   encapsulated packet based on the contents of the outer headers.

1082	5.11.1. Flow classification

1084	   When a packet is encapsulated with GUE and connection semantics are
1085	   not applied, the source port in the outer UDP packet is set to a flow
1086	   entropy value that corresponds to the flow of the inner packet. When
1087	   a device computes a five-tuple hash on the outer UDP/IP header of a
1088	   GUE packet, the resultant value classifies the packet per its inner
1089	   flow.

1091	   Examples of deriving flow entropy for encapsulation are:

1093	      o If the encapsulated packet is a layer 4 packet, TCP/IPv4 for
1094	        instance, the flow entropy could be based on the canonical five-
1095	        tuple hash of the inner packet.

1097	      o If the encapsulated packet is an AH transport mode packet with
1098	        TCP as next header, the flow entropy could be a hash over a
1099	        three-tuple: TCP protocol and TCP ports of the encapsulated
1100	        packet.

1102	      o If a node is encrypting a packet using ESP tunnel mode and GUE
1103	        encapsulation, the flow entropy could be based on the contents
1104	        of the clear-text packet. For instance, a canonical five-tuple
1105	        hash for a TCP/IP packet could be used.

1107	   [RFC6438] discusses methods to compute and set flow entropy values
1108	   for IPv6 flow labels; such methods can also be used to create flow
1109	   entropy values for GUE.
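
	   As a sketch of the first two examples above, the functions
	   below hash the inner five-tuple or three-tuple into a 16-bit
	   value. The choice of BLAKE2b, the field ordering, and the
	   function names are assumptions made for illustration; this
	   document does not mandate a particular hash.

```python
import hashlib
import ipaddress
import struct


def five_tuple_entropy(src_ip: str, dst_ip: str, proto: int,
                       src_port: int, dst_port: int,
                       seed: bytes = b"") -> int:
    """Hash the inner five-tuple into a 16-bit flow entropy value."""
    key = (ipaddress.ip_address(src_ip).packed
           + ipaddress.ip_address(dst_ip).packed
           + struct.pack("!BHH", proto, src_port, dst_port))
    digest = hashlib.blake2b(key, digest_size=2, key=seed).digest()
    return int.from_bytes(digest, "big")


def three_tuple_entropy(proto: int, src_port: int, dst_port: int,
                        seed: bytes = b"") -> int:
    """Hash protocol and ports only, as in the AH transport mode example."""
    key = struct.pack("!BHH", proto, src_port, dst_port)
    digest = hashlib.blake2b(key, digest_size=2, key=seed).digest()
    return int.from_bytes(digest, "big")
```

	   Every packet of the same inner flow produces the same value, so
	   a device hashing the outer headers keeps the flow on one path.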

1111	5.11.2. Flow entropy properties

1113	   The flow entropy is the value set in the UDP source port of a GUE
1114	   packet. Flow entropy in the UDP source port SHOULD adhere to the
1115	   following properties:

1117	      o The value set in the source port is within the ephemeral port
1118	        range (49152 to 65535 [RFC6335]). Since the high order two bits
1119	        of the port are set to one, this provides fourteen bits of
1120	        entropy for the value.

1122	      o The flow entropy has a uniform distribution across encapsulated
1123	        flows.

1125	      o An encapsulator MAY occasionally change the flow entropy used
1126	        for an inner flow per its discretion (for security, route
1127	        selection, etc). To avoid thrashing or flapping the value, the
1128	        flow entropy used for a flow SHOULD NOT change more than once
1129	        every thirty seconds (or a configurable value).

1131	      o Decapsulators, or any networking devices, SHOULD NOT attempt to
1132	        interpret flow entropy as anything more than an opaque value.
1133	        Neither should they attempt to reproduce the hash calculation
1134	        used by an encapsulator in creating a flow entropy value. They
1135	        MAY use the value to match further received packets for steering
1136	        decisions, but MUST NOT assume that the hash uniquely or
1137	        permanently identifies a flow.

1139	      o Input to the flow entropy calculation is not restricted to ports
1140	        and addresses; input could include the flow label from an IPv6
1141	        packet, SPI from an ESP packet, or other flow related state in
1142	        the encapsulator that is not necessarily conveyed in the packet.

1144	      o The assignment function for flow entropy SHOULD be randomly
1145	        seeded to mitigate denial of service attacks. The seed SHOULD be
1146	        changed periodically.
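
   The properties above can be illustrated with a minimal sketch. It
   assumes a keyed BLAKE2b hash over the inner five-tuple; the hash
   choice, the function name flow_entropy_port, and the 16-byte seed
   are illustrative assumptions, not something this document mandates.

```python
import hashlib
import os
import struct

EPHEMERAL_BASE = 0xC000  # 49152; the high-order two bits of the port are one
ENTROPY_MASK = 0x3FFF    # the remaining fourteen bits carry the entropy


def flow_entropy_port(seed: bytes, src: bytes, dst: bytes,
                      sport: int, dport: int, proto: int) -> int:
    """Map an inner five-tuple to a UDP source port in 49152..65535."""
    flow = src + dst + struct.pack("!HHB", sport, dport, proto)
    # Keyed hash: the random seed keeps the mapping unpredictable to a
    # sender that might otherwise craft flows colliding on one value.
    digest = hashlib.blake2b(flow, key=seed, digest_size=2).digest()
    return EPHEMERAL_BASE | (int.from_bytes(digest, "big") & ENTROPY_MASK)


seed = os.urandom(16)  # assumed to be replaced periodically
inner = (bytes([192, 0, 2, 1]), bytes([198, 51, 100, 7]), 40000, 443, 6)
port = flow_entropy_port(seed, *inner)
```

   Any input could be appended to the hashed bytes (an IPv6 flow label
   or an ESP SPI, for example) without changing the structure of the
   sketch.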

1148	5.12. Negotiation of acceptable flags and extension fields

1150	   An encapsulator and decapsulator need to achieve agreement about GUE
1151	   parameters that will be used in communications. Parameters include
1152	   supported GUE variants, flags and extension fields that can be used,
1153	   security algorithms and keys, supported protocols and control
1154	   messages, etc. This document proposes different general methods to
1155	   accomplish this; however, the details of implementing these are
1156	   considered out of scope.

1158	   General methods for this are:

1160	      o Configuration. The parameters used for a tunnel are configured
1161	        at each endpoint.

1163	      o Negotiation. A tunnel negotiation can be performed. This could
1164	        be accomplished in-band of GUE using control messages.

1166	      o Via a control plane. Parameters for communicating with a tunnel
1167	        endpoint can be set in a control plane protocol (such as that
1168	        needed for network virtualization).

1170	      o Via security negotiation. Use of security typically implies a
1171	        key exchange between endpoints. Other GUE parameters may be
1172	        conveyed as part of that process.

1174	6. Motivation for GUE

1176	   This section provides the motivation for GUE with respect to other
1177	   encapsulation methods.

1179	6.1. Benefits of GUE

1181	      * GUE is a generic encapsulation protocol. GUE can encapsulate
1182	        protocols that are represented by an IP protocol number. This
1183	        includes layer 2, layer 3, and layer 4 protocols.

1185	      * GUE is an extensible encapsulation protocol. Standardized
1186	        optional data such as security, virtual networking identifiers,
1187	        and fragmentation are defined.

1189	      * For extensibility, GUE uses flag fields as opposed to TLVs as
1190	        some other encapsulation protocols do. Flag fields are strictly
1191	        ordered, allow random access, and are efficient in use of header
1192	        space.

1194	      * GUE allows sending of control messages such as OAM using the
1195	        same GUE header format (for routing purposes) as normal data
1196	        messages.

1198	      * GUE maximizes deliverability of non-UDP and non-TCP protocols.

1200	      * GUE provides a means for exposing per-flow entropy for ECMP for
1201	        atypical IP protocols such as SCTP, DCCP, ESP, etc.

1203	6.2. Comparison of GUE to other encapsulations

1205	   A number of different encapsulation techniques have been proposed for
1206	   the encapsulation of one protocol over another. EtherIP [RFC3378]
1207	   provides layer 2 tunneling of Ethernet frames over IP. GRE [RFC2784],
1208	   MPLS [RFC4023], and L2TP [RFC2661] provide methods for tunneling
1209	   layer 2 and layer 3 packets over IP. NVGRE [RFC7637] and VXLAN
1210	   [RFC7348] are proposals for encapsulation of layer 2 packets for
1211	   network virtualization. IPIP [RFC2003] and Generic packet tunneling
1212	   in IPv6 [RFC2473] provide methods for tunneling IP packets over IP.

1214	   Several proposals exist for encapsulating packets over UDP including
1215	   ESP over UDP [RFC3948], TCP directly over UDP [TCPUDP], VXLAN
1216	   [RFC7348], LISP [RFC6830] which encapsulates layer 3 packets,
1217	   MPLS/UDP [RFC7510], GENEVE [GENEVE], and GRE-in-UDP Encapsulation
1218	   [RFC8086].

1220	   GUE has the following discriminating features:

1222	      o UDP encapsulation leverages specialized network device
1223	        processing for efficient transport. The semantics for using the
1224	        UDP source port for flow entropy as input to ECMP are defined in
1225	        section 5.11.

1227	      o GUE permits encapsulation of arbitrary IP protocols, which
1228	        includes layer 2, 3, and 4 protocols.

1230	      o Multiple protocols can be multiplexed over a single UDP port
1231	        number. This is in contrast to techniques to encapsulate
1232	        protocols over UDP using a protocol specific port number (such
1233	        as ESP/UDP, GRE/UDP, SCTP/UDP). GUE provides a uniform and
1234	        extensible mechanism for encapsulating all IP protocols in UDP
1235	        with minimal overhead (four bytes of additional header).

1237	      o GUE is extensible. New flags and extension fields can be
1238	        defined.

1240	      o The GUE header includes a header length field. This allows a
1241	        network node to inspect an encapsulated packet without needing
1242	        to parse the full encapsulation header.

1244	      o GUE includes both data messages (encapsulation of packets) and
1245	        control messages (such as OAM).

1247	      o The flags-field model facilitates efficient implementation of
1248	        extensibility in hardware. For instance, a TCAM can be used to
1249	        parse a known set of N flags where the number of entries in the
1250	        TCAM is 2^N. By comparison, the number of TCAM entries needed to
1251	        parse a set of N arbitrarily ordered TLVs is approximately e*N!.

1253	      o GUE includes a variant that encapsulates IPv4 and IPv6 packets
1254	        directly within UDP.
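
   The TCAM comparison in the list above can be checked numerically.
   The sketch below assumes one TCAM entry per distinct header layout:
   for flags, one entry per present/absent combination; for TLVs, one
   entry per ordered sequence of distinct TLVs drawn from the N known
   types. That counting model is an assumption made for illustration.

```python
from math import e, factorial


def flag_entries(n: int) -> int:
    """One entry per combination of N present/absent flags."""
    return 2 ** n


def tlv_entries(n: int) -> int:
    """One entry per ordered sequence of k distinct TLVs, k = 0..N."""
    return sum(factorial(n) // factorial(n - k) for k in range(n + 1))


for n in (2, 4, 8):
    # The last column is the e*N! approximation for comparison.
    print(n, flag_entries(n), tlv_entries(n), round(e * factorial(n), 1))
```

   Under this model, N = 4 needs 16 entries for flags and 65 for TLVs,
   and N = 8 needs 256 against 109601.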

1256	7. Security Considerations

1258	   There are two important considerations of security with respect to
1259	   GUE.

1261	      o Authentication and integrity of the GUE header.

1263	      o Authentication, integrity, and confidentiality of the GUE
1264	        payload.

1266	   GUE security is provided by extensions for security defined in
1267	   [GUEEXTEN]. These extensions include methods to authenticate the GUE
1268	   header and encrypt the GUE payload.

1270	   The GUE header can be authenticated using a security extension for an
1271	   HMAC (Hashed Message Authentication Code). Securing the GUE payload
1272	   can be accomplished by use of the GUE Payload Transform extension.
1273	   This extension allows the use of DTLS (Datagram Transport Layer
1274	   Security) to encrypt and authenticate the GUE payload.

1276	   A hash function for computing flow entropy (section 5.11) SHOULD be
1277	   randomly seeded to mitigate some possible denial of service attacks.

1279	8. IANA Considerations

1281	8.1. UDP destination port

1283	   A user UDP port number has been assigned for GUE:

1285	          Service Name: gue
1286	          Transport Protocol(s): UDP
1287	          Assignee: Tom Herbert <tom@herbertland.com>
1288	          Contact: Tom Herbert <tom@herbertland.com>
1289	          Description: Generic UDP Encapsulation
1290	          Reference: draft-herbert-gue
1291	          Port Number: 6080
1292	          Service Code: N/A
1293	          Known Unauthorized Uses: N/A
1294	          Assignment Notes: N/A

1296	8.2. GUE variant number

1298	   IANA is requested to set up a registry for the GUE variant number.
1299	   The GUE variant number is two bits containing four possible values.
1300	   This document defines variants 0 and 1. New values are assigned in
1301	   accordance with RFC Required policy [RFC5226].

1303	      +----------------+----------------+---------------+
1304	      | Variant number | Description    | Reference     |
1305	      +----------------+----------------+---------------+
1306	      | 0              | GUE Version 0  | This document |
1307	      |                | with header    |               |
1308	      |                |                |               |
1309	      | 1              | GUE Version 0  | This document |
1310	      |                | with direct IP |               |
1311	      |                | encapsulation  |               |
1312	      |                |                |               |
1313	      | 2..3           | Unassigned     |               |
1314	      +----------------+----------------+---------------+

1316	8.3. Control types

1318	   IANA is requested to set up a registry for the GUE control types.
1319	   Control types are 8-bit values. New values for control types 1-127
1320	   are assigned in accordance with RFC Required policy [RFC5226].

1322	      +----------------+------------------+---------------+
1323	      |  Control type  | Description      | Reference     |
1324	      +----------------+------------------+---------------+
1325	      | 0              | Control payload  | This document |
1326	      |                | needs more       |               |
1327	      |                | context for      |               |
1328	      |                | interpretation   |               |
1329	      |                |                  |               |
1330	      | 1..127         | Unassigned       |               |
1331	      |                |                  |               |
1332	      | 128..255       | Experimental     | This document |
1333	      +----------------+------------------+---------------+

1335	9. Acknowledgements

1337	   The authors would like to thank David Liu, Erik Nordmark, Fred
1338	   Templin, Adrian Farrel, Bob Briscoe, Murray Kucherawy, Mirja
1339	   Kuhlewind, and David Black for valuable input on this draft. Special
1340	   thanks to Fred Templin who is serving as document shepherd.

1342	10. References

1344	10.1. Normative References

1346	   [RFC0768]  Postel, J., "User Datagram Protocol", STD 6, RFC 768, DOI
1347	              10.17487/RFC0768, August 1980, <http://www.rfc-
1348	              editor.org/info/rfc768>.

1350	   [RFC8085]  Eggert, L., Fairhurst, G., and G. Shepherd, "UDP Usage
1351	              Guidelines", BCP 145, RFC 8085, DOI 10.17487/RFC8085,
1352	              March 2017, <https://www.rfc-editor.org/info/rfc8085>.

1354	   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
1355	              Requirement Levels", BCP 14, RFC 2119, DOI
1356	              10.17487/RFC2119, March 1997, <https://www.rfc-
1357	              editor.org/info/rfc2119>.

1359	   [RFC2983]  Black, D., "Differentiated Services and Tunnels", RFC
1360	              2983, DOI 10.17487/RFC2983, October 2000, <http://www.rfc-
1361	              editor.org/info/rfc2983>.

1363	   [RFC6040]  Briscoe, B., "Tunnelling of Explicit Congestion
1364	              Notification", RFC 6040, DOI 10.17487/RFC6040, November
1365	              2010, <http://www.rfc-editor.org/info/rfc6040>.

1367	   [RFC6935]  Eubanks, M., Chimento, P., and M. Westerlund, "IPv6 and
1368	              UDP Checksums for Tunneled Packets", RFC 6935, DOI
1369	              10.17487/RFC6935, April 2013, <http://www.rfc-
1370	              editor.org/info/rfc6935>.

1372	   [RFC6936]  Fairhurst, G. and M. Westerlund, "Applicability Statement
1373	              for the Use of IPv6 UDP Datagrams with Zero Checksums",
1374	              RFC 6936, DOI 10.17487/RFC6936, April 2013,
1375	              <http://www.rfc-editor.org/info/rfc6936>.

1377	   [RFC1122]  Braden, R., Ed., "Requirements for Internet Hosts -
1378	              Communication Layers", STD 3, RFC 1122, DOI
1379	              10.17487/RFC1122, October 1989, <http://www.rfc-
1380	              editor.org/info/rfc1122>.

1382	   [RFC4459]  Savola, P., "MTU and Fragmentation Issues with In-the-
1383	              Network Tunneling", RFC 4459, DOI 10.17487/RFC4459, April
1384	              2006, <http://www.rfc-editor.org/info/rfc4459>.

1386	   [RFC6335]  Cotton, M., Eggert, L., Touch, J., Westerlund, M., and S.
1387	              Cheshire, "Internet Assigned Numbers Authority (IANA)
1388	              Procedures for the Management of the Service Name and
1389	              Transport Protocol Port Number Registry", BCP 165, RFC
1390	              6335, DOI 10.17487/RFC6335, August 2011, <https://www.rfc-
1391	              editor.org/info/rfc6335>.

1393	   [RFC5226]  Narten, T. and H. Alvestrand, "Guidelines for Writing an
1394	              IANA Considerations Section in RFCs", RFC 5226, DOI
1395	              10.17487/RFC5226, May 2008, <https://www.rfc-
1396	              editor.org/info/rfc5226>.

1398	10.2. Informative References

1400	   [RFC8086]  Yong, L., Ed., Crabbe, E., Xu, X., and T. Herbert, "GRE-
1401	              in-UDP Encapsulation", RFC 8086, DOI 10.17487/RFC8086,
1402	              March 2017, <http://www.rfc-editor.org/info/rfc8086>.

1404	   [RFC7605]  Touch, J., "Recommendations on Using Assigned Transport
1405	              Port Numbers", BCP 165, RFC 7605, DOI 10.17487/RFC7605,
1406	              August 2015, <https://www.rfc-editor.org/info/rfc7605>.

1408	   [RFC4787]  Audet, F., Ed., and C. Jennings, "Network Address
1409	              Translation (NAT) Behavioral Requirements for Unicast
1410	              UDP", BCP 127, RFC 4787, DOI 10.17487/RFC4787, January
1411	              2007, <http://www.rfc-editor.org/info/rfc4787>.

1413	   [RFC5389]  Rosenberg, J., Mahy, R., Matthews, P., and D. Wing,
1414	              "Session Traversal Utilities for NAT (STUN)", RFC 5389,
1415	              DOI 10.17487/RFC5389, October 2008, <http://www.rfc-
1416	              editor.org/info/rfc5389>.

1418	   [RFC5245]  Rosenberg, J., "Interactive Connectivity Establishment
1419	              (ICE): A Protocol for Network Address Translator (NAT)
1420	              Traversal for Offer/Answer Protocols", RFC 5245, DOI
1421	              10.17487/RFC5245, April 2010, <http://www.rfc-
1422	              editor.org/info/rfc5245>.

1424	   [RFC8084]  Fairhurst, G., "Network Transport Circuit Breakers", BCP
1425	              208, RFC 8084, DOI 10.17487/RFC8084, March 2017,
1426	              <https://www.rfc-editor.org/info/rfc8084>.

1428	   [RFC6438]  Carpenter, B. and S. Amante, "Using the IPv6 Flow Label
1429	              for Equal Cost Multipath Routing and Link Aggregation in
1430	              Tunnels", RFC 6438, DOI 10.17487/RFC6438, November 2011,
1431	              <http://www.rfc-editor.org/info/rfc6438>.

1433	   [RFC3378]  Housley, R. and S. Hollenbeck, "EtherIP: Tunneling
1434	              Ethernet Frames in IP Datagrams", RFC 3378, DOI
1435	              10.17487/RFC3378, September 2002, <http://www.rfc-
1436	              editor.org/info/rfc3378>.

1438	   [RFC2784]  Farinacci, D., Li, T., Hanks, S., Meyer, D., and P.
1439	              Traina, "Generic Routing Encapsulation (GRE)", RFC 2784,
1440	              DOI 10.17487/RFC2784, March 2000, <http://www.rfc-
1441	              editor.org/info/rfc2784>.

1443	   [RFC4023]  Worster, T., Rekhter, Y., and E. Rosen, Ed.,
1444	              "Encapsulating MPLS in IP or Generic Routing Encapsulation
1445	              (GRE)", RFC 4023, DOI 10.17487/RFC4023, March 2005,
1446	              <http://www.rfc-editor.org/info/rfc4023>.

1448	   [RFC2661]  Townsley, W., Valencia, A., Rubens, A., Pall, G., Zorn,
1449	              G., and B. Palter, "Layer Two Tunneling Protocol "L2TP"",
1450	              RFC 2661, DOI 10.17487/RFC2661, August 1999,
1451	              <http://www.rfc-editor.org/info/rfc2661>.

1453	   [RFC7637]  Garg, P., Ed., and Y. Wang, Ed., "NVGRE: Network
1454	              Virtualization Using Generic Routing Encapsulation", RFC
1455	              7637, DOI 10.17487/RFC7637, September 2015,
1456	              <https://www.rfc-editor.org/info/rfc7637>.

1458	   [RFC7348]  Mahalingam, M., Dutt, D., Duda, K., Agarwal, P., Kreeger,
1459	              L., Sridhar, T., Bursell, M., and C. Wright, "Virtual
1460	              eXtensible Local Area Network (VXLAN): A Framework for
1461	              Overlaying Virtualized Layer 2 Networks over Layer 3
1462	              Networks", RFC 7348, August 2014, <http://www.rfc-
1463	              editor.org/info/rfc7348>.

1465	   [RFC2003]  Perkins, C., "IP Encapsulation within IP", RFC 2003, DOI
1466	              10.17487/RFC2003, October 1996, <http://www.rfc-
1467	              editor.org/info/rfc2003>.

1469	   [RFC2473]  Conta, A. and S. Deering, "Generic Packet Tunneling in
1470	              IPv6 Specification", RFC 2473, DOI 10.17487/RFC2473,
1471	              December 1998, <https://www.rfc-editor.org/info/rfc2473>.

1473	   [RFC3948]  Huttunen, A., Swander, B., Volpe, V., DiBurro, L., and M.
1474	              Stenberg, "UDP Encapsulation of IPsec ESP Packets", RFC
1475	              3948, DOI 10.17487/RFC3948, January 2005, <http://www.rfc-
1476	              editor.org/info/rfc3948>.

1478	   [RFC6830]  Farinacci, D., Fuller, V., Meyer, D., and D. Lewis, "The
1479	              Locator/ID Separation Protocol (LISP)", RFC 6830, DOI
1480	              10.17487/RFC6830, January 2013, <http://www.rfc-
1481	              editor.org/info/rfc6830>.

1483	   [RFC7510]  Xu, X., Sheth, N., Yong, L., Callon, R., and D. Black,
1484	              "Encapsulating MPLS in UDP", RFC 7510, DOI
1485	              10.17487/RFC7510, April 2015, <http://www.rfc-
1486	              editor.org/info/rfc7510>.

1488	   [GUEEXTEN] Herbert, T., Yong, L., and Templin, F., "Extensions for
1489	              Generic UDP Encapsulation", draft-ietf-intarea-gue-
1490	              extensions-06

1492	   [IPTUN]    Touch, J. and Townsley, M., "IP Tunnels in the Internet
1493	              Architecture", draft-ietf-intarea-tunnels-10

1495	   [IANA-PN]  IANA, "Protocol Numbers",
1496	              <https://www.iana.org/assignments/protocol-numbers>.

1498	   [TCPUDP]   Cheshire, S., Graessley, J., and McGuire, R.,
1499	              "Encapsulation of TCP and other Transport Protocols over
1500	              UDP", draft-cheshire-tcp-over-udp-00

1502	   [GENEVE]   Gross, J., Ed., Ganga, I. Ed., and Sridhar, T., "Geneve:
1503	              Generic Network Virtualization Encapsulation", draft-ietf-
1504	              nvo3-geneve-10

1506	   [UDPENCAP] Herbert, T., "UDP Encapsulation in Linux",
1507	              <http://people.netfilter.org/pablo/netdev0.1/papers/UDP-
1508	              Encapsulation-in-Linux.pdf>

1510	   [MULTIQ]   Herbert, T. and de Bruijn, W., "Scaling in the Linux
1511	              Networking Stack", <https://www.kernel.org/doc/
1512	              Documentation/networking/scaling.txt>

1514	   [CSUMOFF]  Cree, E., "Checksum Offloads in the Linux Networking
1515	              Stack", <https://www.kernel.org/doc/Documentation/
1516	              networking/checksum-offloads.txt>

1518	   [SEGOFF]   Duyck, A., "Segmentation Offloads in the Linux Networking
1519	              Stack", <https://www.kernel.org/doc/
1520	              Documentation/networking/segmentation-offloads.txt>

1522	Appendix A: NIC processing for GUE

1524	   This appendix is informational and does not constitute a normative
1525	   part of this document.

1527	   This appendix provides some guidelines for Network Interface Cards
1528	   (NICs) to implement common offloads and accelerations to support GUE.
1529	   Note that most of this discussion is generally applicable to other
1530	   methods of UDP based encapsulation. An overview of UDP based
1531	   encapsulation and acceleration is in [UDPENCAP].

1533	A.1. Receive multi-queue

1535	   Contemporary NICs support multiple receive descriptor queues (multi-
1536	   queue) [MULTIQ]. Multi-queue enables load balancing of network
1537	   processing for a NIC across multiple CPUs. On packet reception, a NIC
1538	   selects an appropriate queue for host processing. Receive Side
1539	   Scaling (RSS) is a common method which uses the flow hash for a
1540	   packet to index an indirection table where each entry stores a queue
1541	   number. Flow Director and Accelerated Receive Flow Steering (aRFS)
1542	   allow a host to program the queue that is used for a given flow which
1543	   is identified either by an explicit five-tuple or by the flow's hash.

1545	   GUE encapsulation is compatible with multi-queue NICs that support
1546	   five-tuple hash calculation for UDP/IP packets as input to RSS. The
1547	   flow entropy in the UDP source port ensures classification of the
1548	   encapsulated flow even in the case that the outer source and
1549	   destination addresses are the same for all flows (e.g. all flows are
1550	   going over a single tunnel).
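
   A minimal sketch of RSS queue selection for a single GUE tunnel
   follows. Hardware commonly uses a Toeplitz hash; the keyed BLAKE2b
   hash, the 128-entry table, the four queues, and the fixed key used
   here are stand-in assumptions chosen to keep the example short.

```python
import hashlib
import struct

NUM_QUEUES = 4
# Indirection table: each entry stores a queue number.
INDIRECTION_TABLE = [i % NUM_QUEUES for i in range(128)]
HASH_KEY = bytes(range(16))  # hypothetical fixed key for the sketch


def flow_hash(src: bytes, dst: bytes, sport: int, dport: int,
              proto: int = 17) -> int:
    """Stand-in for the NIC's five-tuple hash over the outer UDP/IP."""
    data = src + dst + struct.pack("!HHB", sport, dport, proto)
    digest = hashlib.blake2b(data, key=HASH_KEY, digest_size=4).digest()
    return int.from_bytes(digest, "big")


def select_queue(src: bytes, dst: bytes, sport: int, dport: int) -> int:
    """Index the indirection table with the flow hash."""
    index = flow_hash(src, dst, sport, dport) % len(INDIRECTION_TABLE)
    return INDIRECTION_TABLE[index]


# One tunnel: outer addresses and destination port are constant, so only
# the flow entropy in the source port distinguishes encapsulated flows.
tunnel_src, tunnel_dst = bytes([192, 0, 2, 1]), bytes([192, 0, 2, 2])
GUE_PORT = 6080
queues = [select_queue(tunnel_src, tunnel_dst, sport, GUE_PORT)
          for sport in range(49152, 49152 + 64)]
```

   Because the source port is part of the hashed tuple, the 64 flows
   in the example can land on different queues even though every other
   outer field is identical.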

1552	   By default, UDP RSS support is often disabled in NICs to avoid out-
1553	   of-order reception that can occur when UDP packets are fragmented. As
1554	   discussed in section 5.7, fragmentation of GUE packets is mostly
1555	   avoided by fragmenting packets before entering a tunnel, GUE
1556	   fragmentation, path MTU discovery in higher layer protocols, or
1557	   operators adjusting MTUs. Other UDP traffic might not implement such
1558	   procedures to avoid fragmentation, so enabling UDP RSS support in the
1559	   NIC might be a considered tradeoff during configuration.

1561	A.2. Checksum offload

1563	   Many NICs provide capabilities to calculate the standard ones
1564	   complement checksum for packets in transmit or receive [CSUMOFF].
1565	   When using GUE encapsulation, there are at least two checksums that
1566	   are of interest: the encapsulated packet's transport checksum, and
1567	   the UDP checksum in the outer header.

1569	A.2.1. Transmit checksum offload

1571	   NICs can provide a protocol agnostic method to offload the transmit
1572	   checksum (NETIF_F_HW_CSUM in Linux parlance) that can be used with
1573	   GUE. In this method, the host provides checksum related parameters in
1574	   a transmit descriptor for a packet. These parameters include the
1575	   starting offset of data to checksum, the length of data to checksum,
1576	   and the offset in the packet where the computed checksum is to be
1577	   written. The host initializes the checksum field to a pseudo header
1578	   checksum.
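
   The descriptor-driven method can be sketched as follows. The names
   csum_start and csum_offset mirror the parameters described above;
   the sketch assumes the checksum always covers the packet from
   csum_start to its end, so the length parameter is omitted, and it
   uses 32 zero bytes as a stand-in for the headers that precede the
   inner transport header.

```python
import struct


def ones_sum(data: bytes) -> int:
    """Folded 16-bit ones complement sum of data (zero-padded if odd)."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def device_fill_checksum(packet: bytearray, csum_start: int,
                         csum_offset: int) -> None:
    """NIC side: sum from csum_start to the end of the packet and write
    the complement at csum_start + csum_offset."""
    csum = ~ones_sum(bytes(packet[csum_start:])) & 0xFFFF
    struct.pack_into("!H", packet, csum_start + csum_offset, csum)


# Host side: an inner UDP datagram behind 32 placeholder header bytes.
src, dst = bytes([10, 0, 0, 1]), bytes([10, 0, 0, 2])
payload = b"hello, gue"
udp_len = 8 + len(payload)
pseudo = src + dst + struct.pack("!BBH", 0, 17, udp_len)
# The checksum field is initialized to the pseudo header checksum.
udp = struct.pack("!HHHH", 12345, 53, udp_len, ones_sum(pseudo)) + payload
packet = bytearray(32) + udp

device_fill_checksum(packet, csum_start=32, csum_offset=6)
```

   The device never needs to know that the checksummed bytes are UDP
   inside GUE; it only follows the two offsets.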

1580	   In the case of GUE, the checksum for an encapsulated transport layer
1581	   packet, a TCP packet for instance, can be offloaded by setting the
1582	   appropriate checksum parameters.

1584	   NICs typically can offload only one transmit checksum per packet, so
1585	   simultaneously offloading both an inner transport packet's checksum
1586	   and the outer UDP checksum is likely not possible.

1588	   If an encapsulator is co-resident with a host, then checksum offload
1589	   may be performed using remote checksum offload (RCO) [GUEEXTEN].
1590	   Remote checksum offload relies on NIC offload of the simple UDP/IP
1591	   checksum which is commonly supported even in legacy devices. In
1592	   remote checksum offload, the outer UDP checksum is set and the GUE
1593	   header includes an option indicating the start and offset of the
1594	   inner "offloaded" checksum. The inner checksum is initialized to the
1595	   pseudo header checksum. When a decapsulator receives a GUE packet
1596	   with the remote checksum offload option, it completes the offload
1597	   operation by determining the packet checksum from the indicated start
1598	   point to the end of the packet, and then adds this into the checksum
1599	   field at the offset given in the option. Computing the checksum from
1600	   the start to end of packet is efficient if checksum-complete is
1601	   provided on the receiver.

1603	   Another alternative when an encapsulator is co-resident with a host
1604	   is to perform Local Checksum Offload (LCO) [CSUMOFF]. In this method,
1605	   the inner transport layer checksum is offloaded and the outer UDP
1606	   checksum can be deduced based on the fact that the portion of the
1607	   packet covered by the inner transport checksum will sum to zero or at
1608	   least the bitwise "not" of the inner pseudo header.
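
   A minimal sketch of the LCO deduction follows, assuming an outer
   UDP header, a four-byte GUE header, a 20-byte inner IPv4 header,
   and an inner UDP datagram. The helper names and the placeholder
   header bytes are illustrative assumptions; outer IP headers are
   left out because they do not enter the computation except through
   the outer pseudo header.

```python
import struct


def ones_sum(data: bytes) -> int:
    """Folded 16-bit ones complement sum of data (zero-padded if odd)."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def lco_outer_checksum(outer_pseudo: bytes, packet: bytes,
                       csum_start: int, csum_offset: int) -> int:
    """Outer UDP checksum deduced without summing the inner payload.
    packet starts at the outer UDP header, checksum field still zero."""
    (field,) = struct.unpack_from("!H", packet, csum_start + csum_offset)
    # Once the device fills the inner checksum, the bytes from csum_start
    # to the end sum to the bitwise "not" of the value now in the field.
    partial = struct.pack("!H", ~field & 0xFFFF)
    total = ones_sum(outer_pseudo + packet[:csum_start] + partial)
    return (~total & 0xFFFF) or 0xFFFF  # UDP sends a computed 0 as all ones


payload = b"local checksum offload"
in_src, in_dst = bytes([10, 0, 0, 1]), bytes([10, 0, 0, 2])
in_len = 8 + len(payload)
inner_pseudo = in_src + in_dst + struct.pack("!BBH", 0, 17, in_len)
inner_udp = struct.pack("!HHHH", 12345, 53, in_len,
                        ones_sum(inner_pseudo)) + payload
inner_ip = bytes(12) + in_src + in_dst   # placeholder IPv4 header
gue = bytes([0x00, 4, 0, 0])             # placeholder GUE header
out_len = 8 + len(gue) + len(inner_ip) + len(inner_udp)
outer_pseudo = (bytes([192, 0, 2, 1]) + bytes([192, 0, 2, 2])
                + struct.pack("!BBH", 0, 17, out_len))
packet = bytearray(struct.pack("!HHHH", 50000, 6080, out_len, 0)
                   + gue + inner_ip + inner_udp)
csum_start = 8 + len(gue) + len(inner_ip)

# Host: deduce and store the outer checksum before the NIC runs.
outer = lco_outer_checksum(outer_pseudo, bytes(packet), csum_start, 6)
struct.pack_into("!H", packet, 6, outer)
# NIC: offload of the single inner checksum, as in A.2.1.
inner = ~ones_sum(bytes(packet[csum_start:])) & 0xFFFF
struct.pack_into("!H", packet, csum_start + 6, inner)
```

   Both checksums verify on the finished packet even though the host
   summed only the headers and the NIC computed a single checksum.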

1610	A.2.2. Receive checksum offload

1612	   GUE is compatible with NICs that perform a protocol agnostic receive
1613	   checksum (CHECKSUM_COMPLETE in Linux parlance). In this technique, a
1614	   NIC computes a ones complement checksum over all (or some predefined
1615	   portion) of a packet. The computed value is provided to the host
1616	   stack in the packet's receive descriptor. The host driver can use
1617	   this checksum to "patch up" and validate any inner packet transport
1618	   checksums, as well as the outer UDP checksum if it is non-zero.

1620	   Many legacy NICs don't provide checksum-complete but instead provide
1621	   an indication that a checksum has been verified (CHECKSUM_UNNECESSARY
1622	   in Linux). Usually, such validation is only done for simple TCP/IP or
1623	   UDP/IP packets. If a NIC indicates that a UDP checksum is valid, the
1624	   checksum-complete value for the UDP packet is the bitwise "not" of
1625	   the pseudo header checksum. In this way, checksum-unnecessary can be
1626	   converted to checksum-complete. So, if the NIC provides checksum-
1627	   unnecessary for the outer UDP header in an encapsulation, checksum
1628	   conversion can be done so that the checksum-complete value is derived
1629	   and can be used by the stack to validate checksums in the
1630	   encapsulated packet.
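
   The conversion can be sketched as below. The helper names are
   hypothetical, and the packet is built in software with valid inner
   and outer UDP checksums to stand in for what a NIC would report as
   verified.

```python
import struct


def ones_sum(data: bytes) -> int:
    """Folded 16-bit ones complement sum of data (zero-padded if odd)."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def csum_sub(a: int, b: int) -> int:
    """Ones complement subtraction a - b, folded to 16 bits."""
    total = a + (~b & 0xFFFF)
    return (total & 0xFFFF) + (total >> 16)


def complete_from_unnecessary(outer_pseudo: bytes) -> int:
    """Checksum-complete value over the outer UDP datagram, derived from
    the NIC's indication that the outer UDP checksum is valid."""
    return ~ones_sum(outer_pseudo) & 0xFFFF


def inner_checksum_ok(complete: int, headers_before_inner: bytes,
                      inner_pseudo: bytes) -> bool:
    """Validate the inner transport checksum without touching payload."""
    inner_sum = csum_sub(complete, ones_sum(headers_before_inner))
    total = inner_sum + ones_sum(inner_pseudo)
    return (total & 0xFFFF) + (total >> 16) == 0xFFFF


def udp_datagram(src, dst, sport, dport, body):
    """Build (pseudo header, UDP datagram with a valid checksum)."""
    length = 8 + len(body)
    pseudo = src + dst + struct.pack("!BBH", 0, 17, length)
    header = struct.pack("!HHHH", sport, dport, length, 0)
    csum = (~ones_sum(pseudo + header + body) & 0xFFFF) or 0xFFFF
    return pseudo, struct.pack("!HHHH", sport, dport, length, csum) + body


in_src, in_dst = bytes([10, 0, 0, 1]), bytes([10, 0, 0, 2])
inner_pseudo, inner_udp = udp_datagram(in_src, in_dst, 12345, 53,
                                       b"checksum conversion!")
inner_ip = bytes(12) + in_src + in_dst   # placeholder IPv4 header
gue = bytes([0x00, 4, 0, 0])             # placeholder GUE header
outer_pseudo, outer_udp = udp_datagram(bytes([192, 0, 2, 1]),
                                       bytes([192, 0, 2, 2]), 50000, 6080,
                                       gue + inner_ip + inner_udp)

complete = complete_from_unnecessary(outer_pseudo)
ok = inner_checksum_ok(complete, outer_udp[:8 + 4 + 20], inner_pseudo)
```

   The derived value matches what a checksum-complete NIC would have
   reported for the outer UDP datagram, so the same validation path
   serves both kinds of device.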

1632	A.3. Transmit Segmentation Offload

1634	   Transmit Segmentation Offload (TSO) [SEGOFF] is a NIC feature where a
1635	   host provides a large (>MTU size) TCP packet to the NIC, which in
1636	   turn splits the packet into separate segments and transmits each one.
1637	   This is useful to reduce CPU load on the host.

1639	   The process of TSO can be generalized as:

1641	      - Split the TCP payload into segments of size less than or equal
1642	        to MTU.

1644	      - For each created segment:

1646	        1. Replicate the TCP header and all preceding headers of the
1647	           original packet.

1649	        2. Set payload length fields in any headers to reflect the
1650	           length of the segment.

1652	        3. Set TCP sequence number to correctly reflect the offset of
1653	           the TCP data in the stream.

1655	        4. Recompute and set any checksums that either cover the payload
1656	           of the packet or cover a header that was changed by setting
1657	           a payload length.

1659	   Following this general process, TSO can be extended to support TCP
1660	   encapsulation in GUE.  For each segment the Ethernet, outer IP, UDP
1661	   header, GUE header, inner IP header (if tunneling), and TCP headers
1662	   are replicated. Any packet length header fields need to be set
1663	   properly (including the length in the outer UDP header), and
1664	   checksums need to be set correctly (including the outer UDP checksum
1665	   if being used).
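
   Steps 1 through 3 of the process can be sketched for a TCP segment
   carried directly in GUE. The sketch assumes fixed header sizes (no
   GUE extension fields, no TCP options, no inner IP header), starts
   at the outer UDP header, and leaves step 4 to checksum offload; it
   also ignores per-segment TCP flag handling. The function name
   gue_tso and the mss parameter are illustrative.

```python
import struct

UDP_LEN, GUE_LEN, TCP_LEN = 8, 4, 20  # fixed header sizes assumed here


def gue_tso(packet: bytes, mss: int) -> list:
    """Split one large UDP/GUE/TCP packet into segments carrying at most
    mss bytes of TCP payload. Checksums are left untouched."""
    hdr_len = UDP_LEN + GUE_LEN + TCP_LEN
    headers, payload = packet[:hdr_len], packet[hdr_len:]
    tcp_off = UDP_LEN + GUE_LEN
    (seq,) = struct.unpack_from("!I", headers, tcp_off + 4)
    segments = []
    for off in range(0, len(payload), mss):
        chunk = payload[off:off + mss]
        # 1. Replicate all headers; the GUE header is copied unparsed.
        seg = bytearray(headers)
        # 2. Outer UDP length covers the three headers plus this chunk.
        struct.pack_into("!H", seg, 4, hdr_len + len(chunk))
        # 3. Advance the TCP sequence number by the chunk's offset.
        struct.pack_into("!I", seg, tcp_off + 4, (seq + off) & 0xFFFFFFFF)
        segments.append(bytes(seg) + chunk)
    return segments


udp = struct.pack("!HHHH", 50000, 6080, 0, 0)
gue = bytes([0x00, 6, 0, 0])  # placeholder GUE header, next protocol TCP
tcp = struct.pack("!HHIIHHHH", 40000, 443, 1000, 0, 0x5010, 65535, 0, 0)
big = udp + gue + tcp + bytes(3000)
segs = gue_tso(big, mss=1400)
```

   The 3000-byte payload yields three segments of 1400, 1400, and 200
   bytes, each with its own UDP length and TCP sequence number and an
   identical GUE header.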

1667	   To facilitate TSO with GUE, it is recommended that extension fields
1668	   do not contain values that need to be updated on a per segment basis.
1669	   For example, extension fields should not include checksums, lengths,
1670	   or sequence numbers that refer to the payload. If the GUE header does
1671	   not contain such fields then the TSO engine only needs to copy the
1672	   bits in the GUE header when creating each segment and does not need
1673	   to parse the GUE header.

1675	A.4. Large Receive Offload

1677	   Large Receive Offload (LRO) [SEGOFF] is a NIC feature where received
1678	   packets of a TCP connection are reassembled, or coalesced, in the NIC
1679	   and delivered to the host as one large packet. This feature can
1680	   reduce CPU utilization in the host.

1682	   LRO requires significant protocol awareness to be implemented
1683	   correctly and is difficult to generalize. Packets in the same flow
1684	   need to be unambiguously identified. In the presence of tunnels or
1685	   network virtualization, this may require more than a five-tuple match
1686	   (for instance packets for flows in two different virtual networks may
1687	   have identical five-tuples). Additionally, a NIC needs to perform
1688	   validation over packets that are being coalesced, and needs to
1689	   fabricate a single meaningful header from all the coalesced packets.

1691	   The conservative approach to supporting LRO for GUE would be to
1692	   assign packets to the same flow only if they have identical five-
1693	   tuples and were encapsulated the same way. That is, the outer IP
1694	   addresses, the outer UDP ports, GUE protocol, GUE flags and fields,
1695	   and inner five-tuple are all identical.
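
   The conservative match can be expressed as a single composite key.
   The field names, the flag value 0x8000, and the four-byte extension
   field standing in for a virtual network identifier are hypothetical
   choices for this sketch.

```python
from typing import NamedTuple, Tuple


class LroKey(NamedTuple):
    """Everything that must match before two packets are coalesced."""
    outer_src: str
    outer_dst: str
    outer_sport: int
    outer_dport: int
    gue_proto: int
    gue_flags: int
    gue_fields: bytes
    inner_five_tuple: Tuple[str, str, int, int, int]


def may_coalesce(a: LroKey, b: LroKey) -> bool:
    """Coalesce only when the outer, GUE, and inner parts all agree."""
    return a == b


inner = ("10.0.0.1", "10.0.0.2", 40000, 443, 6)
pkt_a = LroKey("192.0.2.1", "192.0.2.2", 51000, 6080, 4, 0x8000,
               bytes([0, 0, 0, 1]), inner)
# Same inner five-tuple, but a different extension field value, as could
# happen for flows in two different virtual networks.
pkt_b = pkt_a._replace(gue_fields=bytes([0, 0, 0, 2]))
```

   Here pkt_a and pkt_b share an inner five-tuple yet are not
   coalesced, because their GUE extension fields differ.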

1697	Appendix B: Implementation considerations

1699	   This appendix is informational and does not constitute a normative
1700	   part of this document.

1702	B.1. Privileged ports

1704	   Using the source port to contain a flow entropy value disallows the
1705	   security method of a receiver enforcing that the source port be a
1706	   privileged port. Privileged ports are defined by some operating
1707	   systems to restrict source port binding. Unix, for instance,
1708	   considers port numbers less than 1024 to be privileged.

1710	   Enforcing that packets are sent from a privileged port is widely
1711	   considered an inadequate security mechanism and has been mostly
1712	   deprecated. To approximate this behavior, an implementation could
1713	   restrict a user from sending a packet destined to the GUE port
1714	   without proper credentials.

1716	B.2. Setting flow entropy as a route selector

1718	   An encapsulator generating flow entropy in the UDP source port could
1719	   modulate the value to perform a type of multipath source routing.
1720	   Assuming that networking switches perform ECMP based on the flow
1721	   hash, a sender can affect the path by altering the flow entropy.  For
1722	   instance, a host can store a flow hash in its protocol control block
1723	   (PCB) for an inner flow, and might alter the value upon detecting
1724	   that packets are traversing a lossy path. Changing the flow entropy
1725	   for a flow SHOULD be subject to hysteresis (at most once every thirty
1726	   seconds) to limit the number of out-of-order packets.
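
   A minimal sketch of this hysteresis follows. The class name, the
   on_path_lossy method, and the use of a random replacement value are
   assumptions for illustration; time is passed in explicitly so the
   behavior is easy to follow.

```python
import secrets

MIN_CHANGE_INTERVAL = 30.0  # seconds between changes, per the text above


class FlowEntropy:
    """Per-flow state, as might be stored in a protocol control block."""

    def __init__(self, now: float) -> None:
        self.value = self._new_value()
        self.last_change = now

    @staticmethod
    def _new_value() -> int:
        # Stay within the ephemeral range: high-order two bits set.
        return 0xC000 | secrets.randbits(14)

    def on_path_lossy(self, now: float) -> bool:
        """Pick a new value unless one was chosen under 30 seconds ago.
        Returns True when the value (and so the likely path) changed."""
        if now - self.last_change < MIN_CHANGE_INTERVAL:
            return False
        old = self.value
        while self.value == old:
            self.value = self._new_value()
        self.last_change = now
        return True


fe = FlowEntropy(now=0.0)
changed_early = fe.on_path_lossy(now=10.0)  # suppressed by hysteresis
before = fe.value
changed_late = fe.on_path_lossy(now=31.0)   # allowed: 30 seconds passed
```

   A new value does not guarantee a new path; it only gives ECMP
   switches a different hash input to select from.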

1728	B.3. Hardware protocol implementation considerations

1730	   Low level data path protocols, such as GUE, are often supported in
1731	   high speed network device hardware. Variable length header (VLH)
1732	   protocols like GUE are sometimes considered difficult to efficiently
1733	   implement in hardware. In order to retain the important
1734	   characteristics of an extensible and robust protocol, hardware
1735	   vendors may practice "constrained flexibility". In this model, only
1736	   certain combinations or protocol header parameterizations are
1737	   implemented in the hardware fast path. Each such parameterization is
1738	   fixed length so that the particular instance can be optimized as a
1739	   fixed length protocol. In the case of GUE, this constitutes specific
1740	   combinations of GUE flags, fields, and next protocol. The selected
1741	   combinations would naturally be the most common cases which form the
1742	   "fast path", and other combinations are assumed to take the "slow
1743	   path".

1745	   In time, the needs and requirements of a protocol may change, which
1746	   may manifest themselves as new parameterizations to be supported in
1747	   the fast path. To allow this extensibility, a device practicing
1748	   constrained flexibility should allow fast path parameterizations to
1749	   be programmable.
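
   Constrained flexibility with a programmable fast path might look
   like the sketch below. It assumes the four-byte base header layout
   defined earlier in this document (two-bit variant, C bit, five-bit
   Hlen, protocol/ctype byte, 16-bit flags); the table contents, the
   flag value 0x8000, and the function name classify are hypothetical.

```python
import struct

# Programmable set of fast-path parameterizations: (Hlen, proto, flags).
# Each entry describes one fixed-length header shape.
FAST_PATH = {
    (0, 4, 0x0000),   # hypothetical: no extensions, inner IPv4
    (0, 41, 0x0000),  # hypothetical: no extensions, inner IPv6
}


def classify(gue_header: bytes) -> str:
    """Return "fast" for a supported parameterization, else "slow"."""
    first, proto, flags = struct.unpack_from("!BBH", gue_header)
    variant, control, hlen = first >> 6, (first >> 5) & 1, first & 0x1F
    if variant == 0 and not control and (hlen, proto, flags) in FAST_PATH:
        return "fast"
    return "slow"


plain = bytes([0x00, 4, 0x00, 0x00])
extended = bytes([0x01, 4, 0x80, 0x00, 0, 0, 0, 1])
before = classify(extended)    # not yet in the table
FAST_PATH.add((1, 4, 0x8000))  # program a new parameterization
after = classify(extended)
```

   Adding a tuple to the table models reprogramming the device when a
   new combination becomes common enough to deserve the fast path.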

1751	Authors' Addresses

1753	   Tom Herbert
1754	   Quantonium
1755	   4701 Patrick Henry
1756	   Santa Clara, CA 95054
1757	   US

1759	   Email: tom@herbertland.com

1761	   Lucy Yong
1762	   Independent
1763	   Austin, TX
1764	   US

1766	   Email: lucy_yong@yahoo.com

1768	   Osama Zia
1769	   Microsoft
1770	   1 Microsoft Way
1771	   Redmond, WA 98029
1772	   US

1774	   Email: osamaz@microsoft.com