idnits 2.17.1 

draft-ietf-quic-manageability-09.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack a both a reference to RFC 2119 and the
     recommended RFC 2119 boilerplate, even if it appears to use RFC 2119
     keywords. 

     RFC 2119 keyword, line 578: '...   RECOMMENDED.  First, it only provid...'
     RFC 2119 keyword, line 969: '...   NOT RECOMMENDED....'


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (22 January 2021) is 1191 days in the past.  Is this
     intentional?


  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  -- Looks like a reference, but probably isn't: '0' on line 1205

  == Unused Reference: 'Ding2015' is defined on line 1272, but no explicit
     reference was found in the text

  == Unused Reference: 'IPIM' is defined on line 1293, but no explicit
     reference was found in the text

  == Outdated reference: A later version (-18) exists of
     draft-ietf-quic-applicability-08

  == Outdated reference: A later version (-18) exists of
     draft-ietf-quic-applicability-08

  -- Duplicate reference: draft-ietf-quic-applicability, mentioned in
     'QUIC-APPLICABILITY', was also mentioned in 'I-D.ietf-quic-applicability'.

  == Outdated reference: A later version (-34) exists of
     draft-ietf-quic-http-33

  == Outdated reference: A later version (-18) exists of
     draft-ietf-tls-esni-09


     Summary: 1 error (**), 0 flaws (~~), 7 warnings (==), 3 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Network Working Group                                      M. Kuehlewind
3	Internet-Draft                                                  Ericsson
4	Intended status: Informational                               B. Trammell
5	Expires: 26 July 2021                                             Google
6	                                                         22 January 2021

8	              Manageability of the QUIC Transport Protocol
9	                    draft-ietf-quic-manageability-09

11	Abstract

13	   This document discusses manageability of the QUIC transport protocol,
14	   focusing on caveats impacting network operations involving QUIC
15	   traffic.  Its intended audience is network operators, as well as
16	   content providers that rely on the use of QUIC-aware middleboxes,
17	   e.g. for load balancing.

19	Status of This Memo

21	   This Internet-Draft is submitted in full conformance with the
22	   provisions of BCP 78 and BCP 79.

24	   Internet-Drafts are working documents of the Internet Engineering
25	   Task Force (IETF).  Note that other groups may also distribute
26	   working documents as Internet-Drafts.  The list of current Internet-
27	   Drafts is at https://datatracker.ietf.org/drafts/current/.

29	   Internet-Drafts are draft documents valid for a maximum of six months
30	   and may be updated, replaced, or obsoleted by other documents at any
31	   time.  It is inappropriate to use Internet-Drafts as reference
32	   material or to cite them other than as "work in progress."

34	   This Internet-Draft will expire on 26 July 2021.

36	Copyright Notice

38	   Copyright (c) 2021 IETF Trust and the persons identified as the
39	   document authors.  All rights reserved.

41	   This document is subject to BCP 78 and the IETF Trust's Legal
42	   Provisions Relating to IETF Documents (https://trustee.ietf.org/
43	   license-info) in effect on the date of publication of this document.
44	   Please review these documents carefully, as they describe your rights
45	   and restrictions with respect to this document.  Code Components
46	   extracted from this document must include Simplified BSD License text
47	   as described in Section 4.e of the Trust Legal Provisions and are
48	   provided without warranty as described in the Simplified BSD License.

50	Table of Contents

52	   1.  Introduction
53	   2.  Features of the QUIC Wire Image
54	     2.1.  QUIC Packet Header Structure
55	     2.2.  Coalesced Packets
56	     2.3.  Use of Port Numbers
57	     2.4.  The QUIC Handshake
58	     2.5.  Integrity Protection of the Wire Image
59	     2.6.  Connection ID and Rebinding
60	     2.7.  Packet Numbers
61	     2.8.  Version Negotiation and Greasing
62	   3.  Network-visible Information about QUIC Flows
63	     3.1.  Identifying QUIC Traffic
64	       3.1.1.  Identifying Negotiated Version
65	       3.1.2.  Rejection of Garbage Traffic
66	     3.2.  Connection Confirmation
67	     3.3.  Application Identification
68	       3.3.1.  Extracting Server Name Indication (SNI)
69	               Information
70	     3.4.  Flow Association
71	     3.5.  Flow teardown
72	     3.6.  Flow Symmetry Measurement
73	     3.7.  Round-Trip Time (RTT) Measurement
74	       3.7.1.  Measuring Initial RTT
75	       3.7.2.  Using the Spin Bit for Passive RTT Measurement
76	   4.  Specific Network Management Tasks
77	     4.1.  Stateful Treatment of QUIC Traffic
78	     4.2.  Passive Network Performance Measurement and
79	           Troubleshooting
80	     4.3.  Server Cooperation with Load Balancers
81	     4.4.  DDoS Detection and Mitigation
82	     4.5.  UDP Policing
83	     4.6.  Distinguishing Acknowledgment traffic
84	     4.7.  Quality of Service handling and ECMP
85	     4.8.  QUIC and Network Address Translation (NAT)
86	       4.8.1.  Resource Conservation
87	       4.8.2.  "Helping" with routing infrastructure issues
88	     4.9.  Filtering behavior
89	   5.  IANA Considerations
90	   6.  Security Considerations
91	   7.  Contributors
92	   8.  Acknowledgments
93	   9.  Appendix
94	     9.1.  Distinguishing IETF QUIC and Google QUIC Versions
95	     9.2.  Extracting the CRYPTO frame
96	   10. References
97	     10.1.  Normative References
98	     10.2.  Informative References
99	   Authors' Addresses

101	1.  Introduction

103	   QUIC [QUIC-TRANSPORT] is a new transport protocol encapsulated in UDP
104	   and encrypted by default.  QUIC integrates TLS [QUIC-TLS] to encrypt
105	   all payload data and most control information.  The design focused on
106	   support of semantics for HTTP, which required changes to HTTP known
107	   as HTTP/3 [QUIC-HTTP].

109	   Given that QUIC is an end-to-end transport protocol, all information
110	   in the protocol header, even that which can be inspected, is not
111	   meant to be mutable by the network, and is therefore integrity-
112	   protected.  While less information is visible to the network than for
113	   TCP, integrity protection can also simplify troubleshooting, because
114	   none of the nodes on the network path can modify the transport layer
115	   information.

117	   This document provides guidance for network operations that manage
118	   QUIC traffic.  This includes guidance on how to interpret and utilize
119	   information that is exposed by QUIC to the network, requirements and
120	   assumptions that the QUIC design makes with respect to network
121	   treatment, and a description of how common network management
122	   practices will be impacted by QUIC.

124	   Since QUIC's wire image [WIRE-IMAGE] is integrity protected and not
125	   modifiable on path, in-network operations are not possible without
126	   terminating the QUIC connection, for instance using a back-to-back
127	   proxy.  Proxy operations are not in scope for this document.  A proxy
128	   can either explicitly identify itself as providing a proxy service,
129	   or may share the TLS credentials to authenticate as the server and
130	   (in some cases) client, acting as a front-facing instance for the
131	   endpoint itself.

133	   Network management is not a one-size-fits-all endeavour: practices
134	   considered necessary or even mandatory within enterprise networks
135	   with certain compliance requirements, for example, would be
136	   impermissible on other networks without those requirements.  This
137	   document therefore does not make any specific recommendations as to
138	   which practices should or should not be applied; for each practice,
139	   it describes what is and is not possible with the QUIC transport
140	   protocol as defined.

142	2.  Features of the QUIC Wire Image

144	   In this section, we discuss those aspects of the QUIC transport
145	   protocol that have an impact on the design and operation of devices
146	   that forward QUIC packets.  Here, we are concerned primarily with the
147	   unencrypted part of QUIC's wire image [WIRE-IMAGE], which we define
148	   as the information available in the packet header in each QUIC
149	   packet, and the dynamics of that information.  Since QUIC is a
150	   versioned protocol, the wire image of the header format can also
151	   change from version to version.  However, the field that identifies
152	   the QUIC version in some packets, and the format of the Version
153	   Negotiation Packet, are both inspectable and invariant
154	   [QUIC-INVARIANTS].

156	   This document describes version 1 of the QUIC protocol, whose wire
157	   image is fully defined in [QUIC-TRANSPORT] and [QUIC-TLS].  Features
158	   of the wire image described herein may change in future versions of
159	   the protocol, except when specified as an invariant
160	   [QUIC-INVARIANTS], and cannot be used to identify QUIC as a protocol
161	   or to infer the behavior of future versions of QUIC.

163	   Section 9.1 provides non-normative guidance on the identification of
164	   QUIC version 1 packets compared to some pre-standard versions.

166	2.1.  QUIC Packet Header Structure

168	   QUIC packets may have either a long header, or a short header.  The
169	   first bit of the QUIC header is the Header Form bit, and indicates
170	   which type of header is present.  The purpose of this bit is
171	   invariant across QUIC versions.

173	   The long header exposes more information.  It is used during
174	   connection establishment, including version negotiation, retry, and
175	   0-RTT data.  It contains a version number, as well as source and
176	   destination connection IDs for grouping packets belonging to the same
177	   flow.  The definition and location of these fields in the QUIC long
178	   header are invariant for future versions of QUIC, although future
179	   versions of QUIC may provide additional fields in the long header
180	   [QUIC-INVARIANTS].

182	   Short headers are used after connection establishment, and contain
183	   only an optional destination connection ID and the spin bit for RTT
184	   measurement.

186	   The following information is exposed in QUIC packet headers:

188	   *  "fixed bit": the second most significant bit of the first octet of
189	      most QUIC packets of the current version is set to 1, so that
190	      endpoints can demultiplex QUIC from other UDP-encapsulated
191	      protocols.  Even though this bit is fixed in the QUICv1
192	      specification, endpoints may use a version or extension that
193	      varies the bit.  Therefore, observers cannot reliably use it as an
194	      identifier for QUIC.

196	   *  latency spin bit: the third most significant bit of the first
197	      octet in the short packet header.  The spin bit is set by endpoints
198	      such that tracking edge transitions can be used to passively
199	      observe end-to-end RTT.  See Section 3.7.2 for further details.

201	   *  header type: the long header has a 2 bit packet type field
202	      following the Header Form and fixed bits.  Header types correspond
203	      to stages of the handshake; see Section 17.2 of [QUIC-TRANSPORT]
204	      for details.

206	   *  version number: the version number is present in the long header,
207	      and identifies the version used for that packet.  During Version
208	      Negotiation (see Section 2.8 and Section 17.2.1 of
209	      [QUIC-TRANSPORT]), the version number field has a special value
210	      (0x00000000) that identifies the packet as a Version Negotiation
211	      packet.  Many QUIC versions that start with 0xff implement IETF
212	      drafts.  QUIC versions that start with 0x0000 are reserved for
213	      IETF consensus documents.  For example, QUIC version 1 uses
214	      version 0x00000001.  Operators should expect to observe packets
215	      with other version numbers as a result of various internet
216	      experiments and future standards.

218	   *  source and destination connection ID: short and long packet
219	      headers carry a destination connection ID, a variable-length field
220	      that can be used to identify the connection associated with a QUIC
221	      packet, for load-balancing and NAT rebinding purposes; see
222	      Section 4.3 and Section 2.6.  Long packet headers additionally
223	      carry a source connection ID.  The source connection ID
224	      corresponds to the destination connection ID the source would like
225	      to have on packets sent to it, and is only present on long packet
226	      headers.  On long header packets, the length of the connection IDs
227	      is also present; on short header packets, the length of the
228	      destination connection ID is implicit.

230	   *  length: the length of the remaining QUIC packet after the length
231	      field, present on long headers.  This field is used to implement
232	      coalesced packets during the handshake (see Section 2.2).

234	   *  token: Initial packets may contain a token, a variable-length
235	      opaque value optionally sent from client to server, used for
236	      validating the client's address.  Retry packets also contain a
237	      token, which can be used by the client in an Initial packet on a
238	      subsequent connection attempt.  The length of the token is
239	      explicit in both cases.
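
   As a non-normative illustration, the cleartext fields listed above
   could be read from a packet roughly as follows.  This Python sketch
   is ours and not part of any QUIC specification; the names
   VisibleHeader and parse_visible_header are hypothetical.  It relies
   on the invariant long header layout for the version and connection
   IDs, and interprets the packet type and spin bit with their QUIC
   version 1 meaning.  Token and Length are left out because locating
   them requires version-specific parsing.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VisibleHeader:
    long_header: bool
    fixed_bit: bool
    version: Optional[int] = None      # long header only
    packet_type: Optional[int] = None  # long header only, QUIC v1 meaning
    dcid: Optional[bytes] = None       # long header only in this sketch
    scid: Optional[bytes] = None       # long header only
    spin_bit: Optional[bool] = None    # short header only, QUIC v1 meaning


def parse_visible_header(packet: bytes) -> VisibleHeader:
    """Extract the cleartext header fields an on-path observer can read."""
    if not packet:
        raise ValueError("empty packet")
    first = packet[0]
    long_header = bool(first & 0x80)   # Header Form bit
    fixed_bit = bool(first & 0x40)     # not a reliable QUIC identifier
    if not long_header:
        # The DCID length is implicit in short headers, so the DCID
        # cannot be delimited without out-of-band knowledge.
        return VisibleHeader(False, fixed_bit, spin_bit=bool(first & 0x20))
    if len(packet) < 7:
        raise ValueError("truncated long header")
    version = int.from_bytes(packet[1:5], "big")
    pos = 5
    dcid_len = packet[pos]
    dcid = packet[pos + 1:pos + 1 + dcid_len]
    pos += 1 + dcid_len
    if len(dcid) != dcid_len or pos >= len(packet):
        raise ValueError("truncated destination connection ID")
    scid_len = packet[pos]
    scid = packet[pos + 1:pos + 1 + scid_len]
    if len(scid) != scid_len:
        raise ValueError("truncated source connection ID")
    # The type bits are meaningless in Version Negotiation (version 0).
    packet_type = (first & 0x30) >> 4
    return VisibleHeader(True, fixed_bit, version, packet_type, dcid, scid)
```

   For a short header packet, the sketch reports only the fixed bit and
   the spin bit; an observer that wants the destination connection ID
   has to know its length by other means.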

241	   Retry (Section 17.2.5 of [QUIC-TRANSPORT]) and Version Negotiation
242	   (Section 17.2.1 of [QUIC-TRANSPORT]) packets are not encrypted or
243	   obfuscated in any way.  For other kinds of packets, other information
244	   in the packet headers is cryptographically obfuscated:

246	   *  packet number: All packets except Version Negotiation and Retry
247	      packets have an associated packet number; however, this packet
248	      number is encrypted, and therefore not of use to on-path
249	      observers.  The offset of the packet number is encoded in long
250	      headers, while it is implicit (depending on destination connection
251	      ID length) in short headers.  The length of the packet number is
252	      cryptographically obfuscated.

254	   *  key phase: The Key Phase bit, present in short headers, specifies
255	      the keys used to encrypt the packet to support key rotation.  The
256	      Key Phase bit is cryptographically obfuscated.

258	2.2.  Coalesced Packets

260	   Multiple QUIC packets may be coalesced into a UDP datagram, with a
261	   datagram carrying one or more long header packets followed by zero or
262	   one short header packets.  When packets are coalesced, the Length
263	   fields in the long headers are used to separate QUIC packets; see
264	   Section 12.2 of [QUIC-TRANSPORT].  The length header field is
265	   variable length, and its position in the header is also variable
266	   depending on the length of the source and destination connection ID;
267	   see Section 17.2 of [QUIC-TRANSPORT].
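
   The use of the Length field to delimit coalesced packets can be
   sketched as follows.  The code is illustrative only: it assumes the
   QUIC version 1 long header layout and the variable-length integer
   encoding of Section 16 of [QUIC-TRANSPORT], and the function names
   read_varint and split_coalesced are hypothetical.

```python
from typing import List, Tuple

QUIC_V1 = 0x00000001


def read_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Decode a QUIC variable-length integer; return (value, next_pos)."""
    if pos >= len(buf):
        raise ValueError("truncated varint")
    size = 1 << (buf[pos] >> 6)      # top two bits select 1, 2, 4 or 8 octets
    if pos + size > len(buf):
        raise ValueError("truncated varint")
    value = buf[pos] & 0x3F
    for octet in buf[pos + 1:pos + size]:
        value = (value << 8) | octet
    return value, pos + size


def split_coalesced(datagram: bytes) -> List[bytes]:
    """Split a UDP payload into the QUIC packets coalesced inside it."""
    packets = []
    pos = 0
    try:
        while pos < len(datagram):
            start = pos
            first = datagram[pos]
            if not first & 0x80:
                # A short header packet has no Length field and always
                # extends to the end of the datagram.
                packets.append(datagram[start:])
                break
            if int.from_bytes(datagram[pos + 1:pos + 5], "big") != QUIC_V1:
                raise ValueError("layout only known for QUIC version 1")
            packet_type = (first & 0x30) >> 4
            if packet_type == 3:
                # Retry has no Length field, so nothing can follow it.
                packets.append(datagram[start:])
                break
            pos += 5
            pos += 1 + datagram[pos]          # DCID length octet and DCID
            pos += 1 + datagram[pos]          # SCID length octet and SCID
            if packet_type == 0:              # Initial: token precedes Length
                token_len, pos = read_varint(datagram, pos)
                pos += token_len
            length, pos = read_varint(datagram, pos)
            end = pos + length
            if end > len(datagram):
                raise ValueError("Length field exceeds datagram")
            packets.append(datagram[start:end])
            pos = end
    except IndexError:
        raise ValueError("truncated long header") from None
    return packets
```

   Note that the position of the Length field depends on both
   connection ID lengths and, for Initial packets, on the token length,
   which is why the sketch has to walk the header field by field.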

269	2.3.  Use of Port Numbers

271	   Applications that have a mapping for TCP as well as QUIC are expected
272	   to use the same port number for both services.  However, as with TCP-
273	   based services, especially when application layer information is
274	   encrypted, there is no guarantee that a specific application will use
275	   the registered port, or that the port in use is carrying traffic
276	   belonging to the respective registered service.  For example,
277	   [QUIC-HTTP] specifies the use of Alt-Svc for discovery of HTTP/3
278	   services on other ports.

280	   Further, as QUIC has a connection ID, it is also possible to maintain
281	   multiple QUIC connections over one 5-tuple.  However, if the
282	   connection ID is not present in the packet header, all packets of the
283	   5-tuple belong to the same QUIC connection.

285	2.4.  The QUIC Handshake

287	   New QUIC connections are established using a handshake, which is
288	   distinguishable on the wire and contains some information that can be
289	   passively observed.

291	   To illustrate the information visible in the QUIC wire image during
292	   the handshake, we first show the general communication pattern
293	   visible in the UDP datagrams containing the QUIC handshake, then
294	   examine each of the datagrams in detail.

296	   In the nominal case, the QUIC handshake can be recognized on the wire
297	   through at least four datagrams we'll call "QUIC Client Hello", "QUIC
298	   Server Hello", "Initial Completion", and "Handshake Completion", for
299	   purposes of this illustration, as shown in Figure 1.

301	   Packets in the handshake belong to three separate cryptographic and
302	   transport contexts ("Initial", which contains observable payload, and
303	   "Handshake" and "1-RTT", which do not).  QUIC packets in separate
304	   contexts during the handshake are generally coalesced (see
305	   Section 2.2) in order to reduce the number of UDP datagrams sent
306	   during the handshake.

308	   As shown here, the client can send 0-RTT data as soon as it has sent
309	   its Client Hello, and the server can send 1-RTT data as soon as it
310	   has sent its Server Hello.

312	   Client                                    Server
313	     |                                          |
314	     +----QUIC Client Hello-------------------->|
315	     +----(zero or more 0RTT)------------------>|
316	     |                                          |
317	     |<--------------------QUIC Server Hello----+
318	     |<---------(1RTT encrypted data starts)----+
319	     |                                          |
320	     +----Initial Completion------------------->|
321	     +----(1RTT encrypted data starts)--------->|
322	     |                                          |
323	     |<-----------------Handshake Completion----+
324	     |                                          |

326	   Figure 1: General communication pattern visible in the QUIC handshake

327	   A typical handshake starts with the client sending a QUIC Client
328	   Hello datagram as shown in Figure 2, which elicits a QUIC Server
329	   Hello datagram as shown in Figure 3, typically containing three
330	   packets: an Initial packet with the Server Hello, a Handshake packet
331	   with the rest of the server's side of the TLS handshake, and initial
332	   1-RTT data, if present.

334	   The Initial Completion datagram contains at least one Handshake
335	   packet and may also include an Initial packet.

337	   Datagrams that contain a QUIC Initial Packet (Client Hello, Server
338	   Hello, and some Initial Completion) must be at least 1200 octets
339	   long.  This protects against amplification attacks and verifies that
340	   the network path meets minimum Maximum Transmission Unit (MTU)
341	   requirements.  This is usually accomplished with either the addition
342	   of PADDING frames to the Initial packet, or coalescing of the Initial
343	   Packet with packets from other encryption contexts.
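
   A minimal sketch of a check based on this size requirement is shown
   below.  It is an illustration under stated assumptions rather than a
   recommended classifier: it only looks at the first packet in the
   datagram, it is specific to QUIC version 1, and the function names
   are hypothetical.  As noted in Section 2.1, the fixed bit is not a
   reliable identifier for QUIC in general.

```python
QUIC_V1 = 0x00000001
MIN_INITIAL_DATAGRAM = 1200  # octets, for datagrams carrying an Initial


def starts_with_v1_initial(udp_payload: bytes) -> bool:
    """True if the first packet looks like a QUIC version 1 Initial."""
    if len(udp_payload) < 5:
        return False
    first = udp_payload[0]
    # Long header (0x80), fixed bit (0x40), packet type bits (0x30) == 0.
    if (first & 0xF0) != 0xC0:
        return False
    return int.from_bytes(udp_payload[1:5], "big") == QUIC_V1


def plausible_initial_datagram(udp_payload: bytes) -> bool:
    """True if the datagram starts with an Initial and meets the minimum."""
    return (starts_with_v1_initial(udp_payload)
            and len(udp_payload) >= MIN_INITIAL_DATAGRAM)
```

   A datagram that starts with an Initial packet but is shorter than
   1200 octets does not match the handshake pattern described here.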

345	   The content of QUIC Initial packets is encrypted using Initial
346	   Secrets, which are derived from a per-version constant and the
347	   client's destination connection ID; they are therefore observable by
348	   any on-path device that knows the per-version constant.  We therefore
349	   consider these as visible in our illustration.  The content of QUIC
350	   Handshake packets is encrypted using keys established during the
351	   initial handshake exchange, and are therefore not visible.
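
   To make the derivation concrete, the following sketch computes the
   client's Initial keys from the destination connection ID of the
   client's first Initial packet.  It is an illustration under
   assumptions, not text from [QUIC-TLS]: the salt shown is the value
   we understand to be published for QUIC version 1 and should be
   checked against the specification for the version being observed,
   the labels follow the TLS 1.3 HKDF-Expand-Label construction with
   SHA-256, and all function names are hypothetical.

```python
import hashlib
import hmac

# Assumption: per-version constant (salt) for QUIC version 1.
INITIAL_SALT_V1 = bytes.fromhex("38762cf7f55934b34d179ae6a4c80cadccbb7f0a")


def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    """HKDF-Extract with SHA-256."""
    return hmac.new(salt, ikm, hashlib.sha256).digest()


def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """HKDF-Expand with SHA-256."""
    out = b""
    block = b""
    counter = 1
    while len(out) < length:
        block = hmac.new(prk, block + info + bytes([counter]),
                         hashlib.sha256).digest()
        out += block
        counter += 1
    return out[:length]


def hkdf_expand_label(secret: bytes, label: bytes, length: int) -> bytes:
    """TLS 1.3 HKDF-Expand-Label with an empty context."""
    full_label = b"tls13 " + label
    info = (length.to_bytes(2, "big")
            + bytes([len(full_label)]) + full_label
            + b"\x00")                     # zero-length context
    return hkdf_expand(secret, info, length)


def client_initial_keys(client_dcid: bytes) -> dict:
    """Derive the client-side Initial secret, key, IV and HP key."""
    initial_secret = hkdf_extract(INITIAL_SALT_V1, client_dcid)
    client_secret = hkdf_expand_label(initial_secret, b"client in", 32)
    return {
        "secret": client_secret,
        "key": hkdf_expand_label(client_secret, b"quic key", 16),
        "iv": hkdf_expand_label(client_secret, b"quic iv", 12),
        "hp": hkdf_expand_label(client_secret, b"quic hp", 16),
    }
```

   The sketch stops at key derivation.  Reading the payload
   additionally requires removing header protection and decrypting with
   the negotiated AEAD, which needs a cryptographic library outside the
   Python standard library.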

353	   Initial, Handshake, and the Short Header packets transmitted after
354	   the handshake belong to separate cryptographic and transport
355	   contexts.  The Initial Completion (Figure 4) and Handshake Completion
356	   (Figure 5) datagrams finish the first two contexts by sending the
357	   final acknowledgment and finishing the transmission of CRYPTO frames.

359	   +----------------------------------------------------------+
360	   | UDP header (source and destination UDP ports)            |
361	   +----------------------------------------------------------+
362	   | QUIC long header (type = Initial, Version, DCID, SCID) (Length)
363	   +----------------------------------------------------------+  |
364	   | QUIC CRYPTO frame header                                 |  |
365	   +----------------------------------------------------------+  |
366	   | TLS Client Hello (incl. TLS SNI)                         |  |
367	   +----------------------------------------------------------+  |
368	   | QUIC PADDING frames                                      |  |
369	   +----------------------------------------------------------+<-+

371	     Figure 2: Typical QUIC Client Hello datagram pattern with no 0-RTT

373	   The Client Hello datagram exposes version number, source and
374	   destination connection IDs in the clear.  Information in the TLS
375	   Client Hello frame, including any TLS Server Name Indication (SNI)
376	   present, is obfuscated using the Initial secret.  Note that the
377	   location of PADDING is implementation-dependent, and PADDING frames
378	   may not appear in a coalesced Initial packet.

380	   +------------------------------------------------------------+
381	   | UDP header (source and destination UDP ports)              |
382	   +------------------------------------------------------------+
383	   | QUIC long header (type = Initial, Version, DCID, SCID)   (Length)
384	   +------------------------------------------------------------+  |
385	   | QUIC CRYPTO frame header                                   |  |
386	   +------------------------------------------------------------+  |
387	   | TLS Server Hello                                           |  |
388	   +------------------------------------------------------------+  |
389	   | QUIC ACK frame (acknowledging client hello)                |  |
390	   +------------------------------------------------------------+<-+
391	   | QUIC long header (type = Handshake, Version, DCID, SCID) (Length)
392	   +------------------------------------------------------------+  |
393	   | encrypted payload (presumably CRYPTO frames)               |  |
394	   +------------------------------------------------------------+<-+
395	   | QUIC short header                                          |
396	   +------------------------------------------------------------+
397	   | 1-RTT encrypted payload                                    |
398	   +------------------------------------------------------------+

400	            Figure 3: Typical QUIC Server Hello datagram pattern

402	   The Server Hello datagram also exposes version number, source and
403	   destination connection IDs, and information in the TLS Server Hello
404	   message, which is obfuscated using the Initial secret.

406	   +------------------------------------------------------------+
407	   | UDP header (source and destination UDP ports)              |
408	   +------------------------------------------------------------+
409	   | QUIC long header (type = Initial, Version, DCID, SCID)   (Length)
410	   +------------------------------------------------------------+  |
411	   | QUIC ACK frame (acknowledging Server Hello Initial)        |  |
412	   +------------------------------------------------------------+<-+
413	   | QUIC long header (type = Handshake, Version, DCID, SCID) (Length)
414	   +------------------------------------------------------------+  |
415	   | encrypted payload (presumably CRYPTO/ACK frames)           |  |
416	   +------------------------------------------------------------+<-+
417	   | QUIC short header                                          |
418	   +------------------------------------------------------------+
419	   | 1-RTT encrypted payload                                    |
420	   +------------------------------------------------------------+

421	         Figure 4: Typical QUIC Initial Completion datagram pattern

423	   The Initial Completion datagram does not expose any additional
424	   information; however, recognizing it can be used to determine that a
425	   handshake has completed (see Section 3.2), and for three-way
426	   handshake RTT estimation as in Section 3.7.

428	   +------------------------------------------------------------+
429	   | UDP header (source and destination UDP ports)              |
430	   +------------------------------------------------------------+
431	   | QUIC long header (type = Handshake, Version, DCID, SCID) (Length)
432	   +------------------------------------------------------------+  |
433	   | encrypted payload (presumably ACK frame)                   |  |
434	   +------------------------------------------------------------+<-+
435	   | QUIC short header                                          |
436	   +------------------------------------------------------------+
437	   | 1-RTT encrypted payload                                    |
438	   +------------------------------------------------------------+

440	        Figure 5: Typical QUIC Handshake Completion datagram pattern

442	   Similar to Initial Completion, Handshake Completion also exposes no
443	   additional information; observing it serves only to determine that
444	   the handshake has completed.

446	   When the client uses 0-RTT connection resumption, 0-RTT data may also
447	   be seen in the QUIC Client Hello datagram, as shown in Figure 6.

449	   +----------------------------------------------------------+
450	   | UDP header (source and destination UDP ports)            |
451	   +----------------------------------------------------------+
452	   | QUIC long header (type = Initial, Version, DCID, SCID) (Length)
453	   +----------------------------------------------------------+  |
454	   | QUIC CRYPTO frame header                                 |  |
455	   +----------------------------------------------------------+  |
456	   | TLS Client Hello (incl. TLS SNI)                         |  |
457	   +----------------------------------------------------------+<-+
458	   | QUIC long header (type = 0RTT, Version, DCID, SCID)    (Length)
459	   +----------------------------------------------------------+  |
460	   | 0-rtt encrypted payload                                  |  |
461	   +----------------------------------------------------------+<-+

463	         Figure 6: Typical 0-RTT QUIC Client Hello datagram pattern

465	   In a 0-RTT QUIC Client Hello datagram, the PADDING frame is only
466	   present if necessary to increase the size of the datagram with 0RTT
467	   data to at least 1200 bytes.  Additional datagrams containing only
468	   0-RTT protected long header packets may be sent from the client to
469	   the server after the Client Hello datagram, containing the rest of
470	   the 0-RTT data.  The amount of 0-RTT protected data is limited by the
471	   initial congestion window, typically around 10 packets [RFC6928].

473	2.5.  Integrity Protection of the Wire Image

475	   As soon as the cryptographic context is established, all information
476	   in the QUIC header, including exposed information, is integrity
477	   protected.  Further, information that was sent and exposed in
478	   handshake packets sent before the cryptographic context was
479	   established are validated later during the cryptographic handshake.
480	   Therefore, devices on path cannot alter any information or bits in
481	   QUIC packet headers, except specific parts of Initial packets, since
482	   alteration of header information will lead to a failed integrity
483	   check at the receiver, and can even lead to connection termination.

485	2.6.  Connection ID and Rebinding

487	   The connection ID in the QUIC packet headers allows routing of QUIC
488	   packets at load balancers on other than five-tuple information,
489	   ensuring that related flows are appropriately balanced together; it
490	   also allows rebinding of a connection after one of the endpoints'
491	   addresses changes, usually the client's.  Client and server
492	   negotiate connection IDs during the handshake; typically, however,
493	   only the server will request a connection ID for the lifetime of the
494	   connection.  Connection IDs for either endpoint may change during the
495	   lifetime of a connection, with the new connection ID being negotiated
496	   via encrypted frames.  See Section 5.1 of [QUIC-TRANSPORT].
497	   Therefore, observing a new connection ID does not necessarily
498	   indicate a new connection.

500	   Server-generated connection IDs should seek to obscure any encoding
501	   of routing identities or any other information.  Exposing the server
502	   mapping would allow linkage of multiple IP addresses to the same host
503	   if the server also supports migration.  Furthermore, this opens an
504	   attack vector on specific servers or pools.

506	   The best way to obscure an encoding is to appear random to observers,
507	   which is most rigorously achieved with encryption.  Even when
508	   encrypted, a scheme could embed the unencrypted length of the
509	   connection ID in the connection ID itself, instead of remembering it.
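
   One way to embed the length, shown purely as an illustration, is to
   reserve the first octet of the connection ID for it.  The layout
   below is hypothetical and is not taken from [QUIC_LB]; random octets
   stand in for the encrypted routing information, and the function
   names are ours.

```python
import os


def make_connection_id(total_len: int = 8) -> bytes:
    """Build a connection ID whose first octet carries its own length."""
    if not 2 <= total_len <= 20:       # QUIC v1 limits CIDs to 20 octets
        raise ValueError("unsupported connection ID length")
    # os.urandom stands in for encrypted routing information here.
    return bytes([total_len]) + os.urandom(total_len - 1)


def dcid_from_short_header(packet: bytes) -> bytes:
    """Recover the DCID from a short header packet using the length octet."""
    if len(packet) < 2 or packet[0] & 0x80:
        raise ValueError("not a short header packet")
    cid_len = packet[1]                # first octet of the DCID itself
    cid = packet[1:1 + cid_len]
    if cid_len < 2 or len(cid) != cid_len:
        raise ValueError("malformed or truncated connection ID")
    return cid
```

   With such a layout, a device that knows the scheme can delimit the
   destination connection ID in short header packets without keeping
   per-connection state, at the cost of one octet that does not appear
   random to observers.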

511	   [QUIC_LB] further specifies possible algorithms to generate
512	   connection IDs at load balancers.

514	2.7.  Packet Numbers

516	   The packet number field is always present in the QUIC packet header;
517	   however, it is always encrypted.  The encryption key for packet
518	   number protection on handshake packets sent before cryptographic
519	   context establishment is specific to the QUIC version, while packet
520	   number protection on subsequent packets uses secrets derived from the
521	   end-to-end cryptographic context.  Packet numbers are therefore not
522	   part of the wire image that is visible to on-path observers.

524	2.8.  Version Negotiation and Greasing

526	   Version Negotiation packets are not intrinsically protected, but QUIC
527	   versions can use later encrypted messages to verify that they were
528	   authentic.  Therefore any manipulation of this list will be detected
529	   and may cause the endpoints to terminate the connection attempt.

531	   Also note that the list of versions in the Version Negotiation packet
532	   may contain reserved versions.  This mechanism is used to avoid
533	   ossification in the implementation of the selection mechanism.
534	   Further, a client may send a Client Initial packet with a reserved
535	   version number to trigger version negotiation.  In the Version
536	   Negotiation packet the connection ID and packet number of the Client
537	   Initial packet are reflected to provide a proof of return-
538	   routability.  Therefore changing this information will also cause the
539	   connection to fail.

541	   QUIC is expected to evolve rapidly, so new versions, both
542	   experimental and IETF standard versions, will be deployed in the
543	   Internet more often than with traditional Internet- and transport-
544	   layer protocols.  Using a particular version number to recognize
545	   valid QUIC traffic is likely to persistently miss a fraction of QUIC
546	   flows and completely fail in the near future, and is therefore not
547	   recommended.  In addition, due to the speed of evolution of the
548	   protocol, devices that attempt to distinguish QUIC traffic from non-
549	   QUIC traffic for purposes of network admission control should admit
550	   all QUIC traffic regardless of version.

552	3.  Network-visible Information about QUIC Flows

554	   This section addresses the different kinds of observations and
555	   inferences that can be made about QUIC flows by a passive observer in
556	   the network based on the wire image in Section 2.  Here we assume a
557	   bidirectional observer (one that can see packets in both directions
558	   in the sequence in which they are carried on the wire) unless noted.

560	3.1.  Identifying QUIC Traffic

562	   The QUIC wire image is not specifically designed to be
563	   distinguishable from other UDP traffic.

565	   The only application binding defined by the IETF QUIC WG is HTTP/3
566	   [QUIC-HTTP] at the time of this writing; however, many other
567	   applications are currently being defined and deployed over QUIC, so
568	   an assumption that all QUIC traffic is HTTP/3 is not valid.  HTTP
569	   over QUIC uses UDP port 443 by default, although URLs referring to
570	   resources available over HTTP/3 may specify alternate port numbers.
571	   Simple assumptions about whether a given flow is using QUIC based
572	   upon a UDP port number may therefore not hold; see also [RFC7605]
573	   section 5.

575	   While the second most significant bit (0x40) of the first octet is
576	   set to 1 in most QUIC packets of the current version (see
577	   Section 2.1), this method of recognizing QUIC traffic is NOT
578	   RECOMMENDED.  First, it only provides one bit of information and is
579	   quite prone to collide with UDP-based protocols other than those that
580	   this static bit is meant to allow multiplexing with.  Second, this
581	   feature of the wire image is not invariant [QUIC-INVARIANTS] and may
582	   change in future versions of the protocol, or even be negotiated
583	   during the handshake via the use of transport parameters.

585	   Even though transport parameters transmitted in the client initial
586	   are observable by the network, they cannot be modified by the network
587	   without risking connection failure.  Further, the negotiated reply
588	   from the server cannot be observed, so observers on the network
589	   cannot know which parameters are actually in use.

591	3.1.1.  Identifying Negotiated Version

593	   An in-network observer assuming that a set of packets belongs to a
594	   QUIC flow can infer the version number in use by observing the
595	   handshake: an Initial packet with a given version from a client to
596	   which a server responds with an Initial packet with the same version
597	   implies acceptance of that version.

599	   Negotiated version cannot be identified for flows for which a
600	   handshake is not observed, such as in the case of connection
601	   migration; however, these flows can be associated with flows for
602	   which a version has been identified; see Section 3.4.

604	   This document focuses on QUIC Version 1, and this section applies
605	   only to packets belonging to Version 1 QUIC flows; for purposes of
606	   on-path observation, it assumes that these packets have been
607	   identified as such through the observation of a version number
608	   exchange as described above.

610	3.1.2.  Rejection of Garbage Traffic

612	   A related question is whether a first packet of a given flow on a
613	   known QUIC-associated port is a valid QUIC packet, to support in-
614	   network filtering of garbage UDP packets (reflection attacks, random
615	   backscatter).  While heuristics based on the first byte of the packet
616	   (packet type) could be used to separate valid from invalid first
617	   packet types, the deployment of such heuristics is not recommended,
618	   as packet types may have different meanings in future versions of the
619	   protocol.

621	3.2.  Connection Confirmation

623	   Connection establishment uses Initial and Handshake packets
624	   containing a TLS handshake, and Retry packets that do not contain
625	   parts of the handshake.  Connection establishment can therefore be
626	   detected using heuristics similar to those used to detect TLS over
627	   TCP.  A client initiating a 0-RTT connection may also send data
628	   packets in 0-RTT Protected packets directly after the Initial packet
629	   containing the TLS Client Hello.  Since these packets may be
630	   reordered in the network, 0-RTT Protected data packets could be seen
631	   before the Initial packet.

633	   Note that clients send Initial packets before servers do, servers
634	   send Handshake packets before clients do, and only clients send
635	   Initial packets with tokens.  Therefore, the role as a client or
636	   server can generally be confirmed by an on-path observer.  An
637	   attempted connection after Retry can be detected by correlating the
638	   token on the Retry with the token on the subsequent Initial packet
639	   and the destination connection ID of the new Initial packet.

641	3.3.  Application Identification

643	   The cleartext TLS handshake may contain Server Name Indication (SNI)
644	   [RFC6066], by which the client reveals the name of the server it
645	   intends to connect to, in order to allow the server to present a
646	   certificate based on that name.  It may also contain information from
647	   Application-Layer Protocol Negotiation (ALPN) [RFC7301], by which the
648	   client exposes the names of application-layer protocols it supports;
649	   an observer can deduce that one of those protocols will be used if
650	   the connection continues.

652	   Work is currently underway in the TLS working group to encrypt the
653	   SNI in TLS 1.3 [TLS-ESNI].  This would make SNI-based application
654	   identification impossible through passive measurement for QUIC and
655	   other protocols that use TLS.

657	3.3.1.  Extracting Server Name Indication (SNI) Information

659	   If the SNI is not encrypted it can be derived from the QUIC Initial
660	   packet by calculating the Initial Secret to decrypt the packet
661	   payload and parse the QUIC CRYPTO Frame containing the TLS
662	   ClientHello.

664	   As both the initial salt for the Initial Secret as well as CRYPTO
665	   frame itself are version-specific, the first step is always to parse
666	   the version number (second to fifth byte of the long header).  Note
667	   that only long header packets carry the version number, so it is
668	   necessary to also check if the first bit of the QUIC packet is set to
669	   1, indicating a long header.

671	   Note that proprietary QUIC versions that were deployed before
672	   standardization might not set the first bit in QUIC long header
673	   packets to 1.  To parse these versions, example code is provided in
674	   the appendix (see Section 9.1); however, it is expected that these
675	   versions will gradually disappear over time.

677	   When the version has been identified as QUIC version 1, the packet
678	   type needs to be verified as an Initial packet by checking that the
679	   third and fourth bit of the header are both set to 0.  Then the
680	   client destination connection ID needs to be extracted to calculate
681	   the Initial Secret together with the version specific initial salt,
682	   as described in [QUIC-TLS].  The length of the connection ID is
683	   indicated in the 6th byte of the header followed by the connection ID
684	   itself.

686	   To determine the end of the header and find the start of the payload,
687	   the packet number length, the source connection ID length, and the
688	   token length need to be extracted.  The packet number length is
689	   defined by the seventh and eighth bits of the header as described in
690	   Section 17.2 of [QUIC-TRANSPORT], but is obfuscated as described in
691	   [QUIC-TLS].  The source connection ID length is specified in the byte
692	   after the destination connection ID.  The token length, which
693	   follows the source connection ID, is a variable length integer as
694	   specified in Section 16 of [QUIC-TRANSPORT].
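
   As an illustration only, the header walk described above can be
   sketched in Python.  The names read_varint, InitialHeader, and
   parse_v1_initial_header are hypothetical and are not taken from any
   QUIC implementation.  The sketch assumes a QUIC version 1 long
   header, also reads the Length field that follows the token in
   Initial packets (see [QUIC-TRANSPORT]), and stops at the still-
   protected packet number; removing header protection and decrypting
   the payload as described in [QUIC-TLS] is out of scope here.

```python
from typing import NamedTuple, Tuple


class InitialHeader(NamedTuple):
    version: int
    dcid: bytes
    scid: bytes
    token: bytes
    length: int      # Length field: protected packet number plus payload
    pn_offset: int   # index of the first (still protected) packet number byte


def read_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Decode a QUIC variable-length integer; return (value, next_pos)."""
    if pos >= len(buf):
        raise ValueError("truncated varint")
    size = 1 << (buf[pos] >> 6)  # top two bits select 1, 2, 4, or 8 bytes
    if pos + size > len(buf):
        raise ValueError("truncated varint")
    value = buf[pos] & 0x3F
    for octet in buf[pos + 1:pos + size]:
        value = (value << 8) | octet
    return value, pos + size


def parse_v1_initial_header(pkt: bytes) -> InitialHeader:
    """Parse the unprotected fields of a QUIC version 1 Initial packet."""
    def take(pos: int, n: int) -> Tuple[bytes, int]:
        if pos + n > len(pkt):
            raise ValueError("truncated header")
        return pkt[pos:pos + n], pos + n

    head, pos = take(0, 5)
    if not head[0] & 0x80:          # first bit: long header
        raise ValueError("not a long header packet")
    version = int.from_bytes(head[1:5], "big")
    if version != 1:
        raise ValueError("not QUIC version 1")
    if head[0] & 0x30:              # third and fourth bits: 0 for Initial
        raise ValueError("not an Initial packet")
    raw, pos = take(pos, 1)         # sixth byte: destination CID length
    dcid, pos = take(pos, raw[0])
    raw, pos = take(pos, 1)         # source CID length
    scid, pos = take(pos, raw[0])
    token_len, pos = read_varint(pkt, pos)
    token, pos = take(pos, token_len)
    length, pos = read_varint(pkt, pos)
    return InitialHeader(version, dcid, scid, token, length, pos)
```

   The returned dcid is the input, together with the version-specific
   salt, to the Initial Secret calculation in [QUIC-TLS].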

696	   After decryption, the Initial Client packet can be parsed to detect
697	   the CRYPTO frame that contains the TLS Client Hello, which then can
698	   be parsed similarly to TLS over TCP connections.  The Initial client
699	   packet may contain other frames, so the first bytes of each frame
700	   need to be checked to identify the frame type, and if needed skip
701	   over it.  Note that the length of the frames is dependent on the
702	   frame type.  In QUIC version 1, the packet is expected to carry only
703	   the CRYPTO frame and, optionally, PADDING frames.  PADDING frames,
704	   which are each one byte of zeros, may occur before or after the
705	   CRYPTO frame.
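
   The frame scan over an already decrypted Initial payload might look
   roughly like the following Python sketch.  The function names are
   hypothetical.  The sketch assumes that only PADDING (type 0x00),
   PING (0x01), and CRYPTO (0x06) frames are present and raises an
   error on anything else, so a production parser would need to handle
   further frame types such as ACK.

```python
from typing import Dict, Tuple


def read_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Decode a QUIC variable-length integer; return (value, next_pos)."""
    if pos >= len(buf):
        raise ValueError("truncated varint")
    size = 1 << (buf[pos] >> 6)
    if pos + size > len(buf):
        raise ValueError("truncated varint")
    value = buf[pos] & 0x3F
    for octet in buf[pos + 1:pos + size]:
        value = (value << 8) | octet
    return value, pos + size


def crypto_stream_from_initial(payload: bytes) -> bytes:
    """Return the contiguous CRYPTO data starting at offset 0."""
    chunks: Dict[int, bytes] = {}
    pos = 0
    while pos < len(payload):
        frame_type, pos = read_varint(payload, pos)
        if frame_type in (0x00, 0x01):      # PADDING, PING: type byte only
            continue
        if frame_type == 0x06:              # CRYPTO: offset, length, data
            offset, pos = read_varint(payload, pos)
            length, pos = read_varint(payload, pos)
            if pos + length > len(payload):
                raise ValueError("truncated CRYPTO frame")
            chunks[offset] = payload[pos:pos + length]
            pos += length
            continue
        raise ValueError(f"unhandled frame type {frame_type:#x}")

    # Reassemble in offset order; stop at the first gap.
    data = bytearray()
    while len(data) in chunks:
        data += chunks.pop(len(data))
    return bytes(data)
```

   The returned bytes begin with the TLS handshake message header, so
   the ClientHello can then be handed to an ordinary TLS parser.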

707	   Note that client Initial packets after the first do not always use
708	   the destination connection ID that was used to generate the Initial
709	   keys.  Therefore, attempts to decrypt these packets using the
710	   procedure above might fail.

712	3.4.  Flow Association

714	   The QUIC connection ID (see Section 2.6) is designed to allow an on-
715	   path device such as a load-balancer to associate two flows as
716	   identified by five-tuple when the address and port of one of the
717	   endpoints changes; e.g. due to NAT rebinding or server IP address
718	   migration.  An observer keeping flow state can associate a connection
719	   ID with a given flow, and can associate a known flow with a new flow
720	   when observing a packet sharing a connection ID and one endpoint
721	   address (IP address and port) with the known flow.
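
   A minimal Python sketch of this association rule follows.  The
   class name FlowAssociator and the representation of a packet as
   (source endpoint, destination endpoint, connection ID) are
   assumptions made for illustration; in particular, the sketch
   assumes the observer already knows where the connection ID sits in
   the packet.

```python
from typing import Dict, FrozenSet, Optional, Tuple

Endpoint = Tuple[str, int]            # (IP address, port)
Flow = FrozenSet[Endpoint]            # unordered pair of endpoints


class FlowAssociator:
    def __init__(self) -> None:
        self._by_cid: Dict[bytes, Flow] = {}

    def observe(self, src: Endpoint, dst: Endpoint,
                cid: bytes) -> Optional[Flow]:
        """Return the known flow that this packet continues, if any."""
        flow = frozenset((src, dst))
        known = self._by_cid.get(cid)
        if known is None:
            self._by_cid[cid] = flow  # first sighting of this connection ID
            return None
        if known == flow:
            return None               # same flow, nothing to associate
        if known & flow:              # shares exactly one endpoint address
            self._by_cid[cid] = flow
            return known
        return None                   # same CID, unrelated endpoints: ignore
```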

723	   However, since the connection ID may change multiple times during the
724	   lifetime of a flow, and the negotiation of connection ID changes is
725	   encrypted, packets with the same 5-tuple but different connection IDs
726	   may or may not belong to the same connection.

728	   The connection ID value should be treated as opaque; see Section 4.3
729	   for caveats regarding connection ID selection at servers.

731	3.5.  Flow teardown

733	   QUIC does not expose the end of a connection; the only indication to
734	   on-path devices that a flow has ended is that packets are no longer
735	   observed.  Stateful devices on path such as NATs and firewalls must
736	   therefore use idle timeouts to determine when to drop state for QUIC
737	   flows; see Section 4.1.

739	3.6.  Flow Symmetry Measurement

741	   QUIC explicitly exposes which side of a connection is a client and
742	   which side is a server during the handshake.  In addition, the
743	   symmetry of a flow (whether primarily client-to-server, primarily
744	   server-to-client, or roughly bidirectional, as input to basic traffic
745	   classification techniques) can be inferred through the measurement of
746	   data rate in each direction.  While QUIC traffic is protected and
747	   ACKs may be padded, padding is not required.

749	3.7.  Round-Trip Time (RTT) Measurement

751	   Round-trip time of QUIC flows can be inferred by observation once per
752	   flow, during the handshake, as in passive TCP measurement; this
753	   requires parsing of the QUIC packet header and recognition of the
754	   handshake, as illustrated in Section 2.4.  It can also be inferred
755	   during the flow's lifetime, if the endpoints use the spin bit
756	   facility described below and in [QUIC-TRANSPORT], section 17.3.1.

758	3.7.1.  Measuring Initial RTT

760	   In the common case, the delay between the Initial packet containing
761	   the TLS Client Hello and the Handshake packet containing the TLS
762	   Server Hello represents the RTT component on the path between the
763	   observer and the server.  The delay between the TLS Server Hello and
764	   the Handshake packet containing the TLS Finished message sent by the
765	   client represents the RTT component on the path between the observer
766	   and the client.  While the client may send 0-RTT Protected packets
767	   after the Initial packet during 0-RTT connection re-establishment,
768	   these can be ignored for RTT measurement purposes.

770	   Handshake RTT can be measured by adding the client-to-observer and
771	   observer-to-server RTT components together.  This measurement
772	   necessarily includes any transport and application layer delay (the
773	   latter mainly caused by the asymmetric crypto operations associated
774	   with the TLS handshake) at both sides.

776	3.7.2.  Using the Spin Bit for Passive RTT Measurement

778	   The spin bit provides an additional method to measure per-flow RTT
779	   from observation points on the network path throughout the duration
780	   of a connection.  Endpoint participation in spin bit signaling is
781	   optional in QUIC.  That is, while its location is fixed in this
782	   version of QUIC, an endpoint can unilaterally choose to not support
783	   "spinning" the bit.  Use of the spin bit for RTT measurement by
784	   devices on path is only possible when both endpoints enable it.  Some
785	   endpoints may disable use of the spin bit by default, others only in
786	   specific deployment scenarios, e.g. for servers and clients where the
787	   RTT would reveal the presence of a VPN or proxy.  To avoid making
788	   these connections identifiable based on the usage of the spin bit,
789	   all endpoints randomly disable "spinning" for at least one eighth of
790	   connections, even if otherwise enabled by default.  An endpoint not
791	   participating in spin bit signaling for a given connection can use a
792	   fixed spin value for the duration of the connection, or can set the
793	   bit randomly on each packet sent.

795	   When in use and a QUIC flow sends data continuously, the latency spin
796	   bit in each direction changes value once per round-trip time (RTT).
797	   An on-path observer can observe the time difference between edges
798	   (changes from 1 to 0 or 0 to 1) in the spin bit signal in a single
799	   direction to measure one sample of end-to-end RTT.
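
   Edge detection of this kind can be sketched in a few lines of
   Python.  The function name spin_rtt_samples and the input format, a
   sequence of (timestamp, spin bit value) pairs for packets seen in
   one direction, are hypothetical; the sketch performs no filtering
   of reordered or lost edges.

```python
from typing import Iterable, Iterator, Tuple


def spin_rtt_samples(
        observations: Iterable[Tuple[float, int]]) -> Iterator[float]:
    """Yield the time between consecutive spin bit edges in one direction."""
    last_spin = None
    last_edge_time = None
    for timestamp, spin in observations:
        if last_spin is not None and spin != last_spin:
            # An edge: the spin value changed since the previous packet.
            if last_edge_time is not None:
                yield timestamp - last_edge_time
            last_edge_time = timestamp
        last_spin = spin
```

   The first edge only starts the clock; each later edge produces one
   RTT sample in the same unit as the timestamps.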

801	   Note that this measurement, as with passive RTT measurement for TCP,
802	   includes any transport protocol delay (e.g., delayed sending of
803	   acknowledgements) and/or application layer delay (e.g., waiting for a
804	   response to be generated).  It therefore provides devices on path a
805	   good instantaneous estimate of the RTT as experienced by the
806	   application.  A simple linear smoothing or moving minimum filter can
807	   be applied to the stream of RTT information to get a more stable
808	   estimate.
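
   Both kinds of filter are easy to sketch in Python.  The function
   names, the window size, and the reading of "linear smoothing" as an
   exponentially weighted moving average with gain alpha are
   assumptions made for illustration, not recommendations.

```python
from collections import deque
from typing import Iterable, List


def moving_minimum(samples: Iterable[float], window: int = 8) -> List[float]:
    """Minimum over the most recent `window` RTT samples."""
    recent = deque(maxlen=window)   # old samples fall out automatically
    out = []
    for sample in samples:
        recent.append(sample)
        out.append(min(recent))
    return out


def linear_smoothing(samples: Iterable[float],
                     alpha: float = 0.125) -> List[float]:
    """Exponentially weighted moving average of the RTT samples."""
    out = []
    estimate = None
    for sample in samples:
        if estimate is None:
            estimate = sample       # first sample seeds the estimate
        else:
            estimate = estimate + alpha * (sample - estimate)
        out.append(estimate)
    return out
```

   The moving minimum tends toward the path RTT without endpoint
   delay, while the smoothed average tracks the RTT as experienced by
   the application.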

810	   However, application-limited and flow-control-limited senders can
811	   have application and transport layer delay, respectively, that are
812	   much greater than network RTT.  When the sender is application-
813	   limited and e.g. only sends a small amount of periodic application
814	   traffic, where that period is longer than the RTT, measuring the spin
815	   bit provides information about the application period, not the
816	   network RTT.

818	   Since the spin bit logic at each endpoint considers only samples from
819	   packets that advance the largest packet number, signal generation
820	   itself is resistant to reordering.  However, reordering can cause
821	   problems at an observer by causing spurious edge detection and
822	   therefore inaccurate (i.e., lower) RTT estimates, if reordering
823	   occurs across a spin-bit flip in the stream.

825	   Simple heuristics based on the observed data rate per flow or changes
826	   in the RTT series can be used to reject bad RTT samples due to lost
827	   or reordered edges in the spin signal, as well as application or flow
828	   control limitation; for example, QoF [TMA-QOF] rejects component RTTs
829	   significantly higher than RTTs over the history of the flow.  These
830	   heuristics may use the handshake RTT as an initial RTT estimate for a
831	   given flow.  Usually such heuristics would also detect if the spin is
832	   either constant or randomly set for a connection.

834	   An on-path observer that can see traffic in both directions (from
835	   client to server and from server to client) can also use the spin bit
836	   to measure "upstream" and "downstream" component RTT; i.e., the
837	   component of the end-to-end RTT attributable to the paths between the
838	   observer and the server and the observer and the client,
839	   respectively.  It does this by measuring the delay between a spin
840	   edge observed in the upstream direction and that observed in the
841	   downstream direction, and vice versa.

843	4.  Specific Network Management Tasks

845	   In this section, we review specific network management and
846	   measurement techniques and how QUIC's design impacts them.

848	4.1.  Stateful Treatment of QUIC Traffic

850	   Stateful treatment of QUIC traffic (e.g., at a firewall or NAT
851	   middlebox) is possible through QUIC traffic and version
852	   identification (Section 3.1) and observation of the handshake for
853	   connection confirmation (Section 3.2).  The lack of any visible end-
854	   of-flow signal (Section 3.5) means that this state must be purged
855	   either through timers or through least-recently-used eviction,
856	   depending on application requirements.

858	   [RFC4787] recommends a 2 minute timeout interval for UDP.  However,
859	   timers can be lower, in the range of 15 to 30 seconds.  In contrast,
860	   [RFC5382] recommends a timeout of more than 2 hours for TCP, given
861	   that TCP is a connection-oriented protocol with well-defined closure
862	   semantics.  For network devices that are QUIC-aware, it is
863	   recommended to also use longer timeouts for QUIC traffic, as QUIC is
864	   connection-oriented.  As such, a handshake packet from the server
865	   indicates the willingness of the server to communicate with the
866	   client.
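
   A state table that combines an idle timer with least-recently-used
   eviction might be sketched in Python as follows.  The class name
   FlowTable and the max_flows limit are hypothetical; the default
   idle timeout of 120 seconds merely mirrors the UDP value cited
   above and is not a recommendation for QUIC.  The sketch assumes
   that timestamps passed to touch() never decrease.

```python
from collections import OrderedDict
from typing import Hashable


class FlowTable:
    def __init__(self, idle_timeout: float = 120.0,
                 max_flows: int = 100_000) -> None:
        self.idle_timeout = idle_timeout
        self.max_flows = max_flows
        # Flow key -> last-seen time, ordered from oldest to newest.
        self._flows: "OrderedDict[Hashable, float]" = OrderedDict()

    def touch(self, flow: Hashable, now: float) -> None:
        """Record a packet for `flow` observed at time `now`."""
        self.expire(now)
        self._flows[flow] = now
        self._flows.move_to_end(flow)
        while len(self._flows) > self.max_flows:
            self._flows.popitem(last=False)   # evict least recently used

    def expire(self, now: float) -> None:
        """Drop flows that have been idle for at least idle_timeout."""
        while self._flows:
            flow, seen = next(iter(self._flows.items()))
            if now - seen < self.idle_timeout:
                break                         # all later entries are newer
            del self._flows[flow]

    def __contains__(self, flow: Hashable) -> bool:
        return flow in self._flows
```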

868	   The QUIC header optionally contains a connection ID which can be used
869	   as additional entropy beyond the 5-tuple, if needed.  The QUIC
870	   handshake needs to be observed in order to understand whether the
871	   connection ID is present and what length it has.  However, connection
872	   IDs may be renegotiated during a connection, and this renegotiation
873	   is not visible to the path.  Keying state off the connection ID may
874	   therefore cause undetectable and unrecoverable loss of state in the
875	   middle of a connection.  Use of the connection ID is specifically
876	   discouraged for NAT applications.

878	4.2.  Passive Network Performance Measurement and Troubleshooting

880	   Limited RTT measurement is possible by passive observation of QUIC
881	   traffic; see Section 3.7.  No passive measurement of loss is possible
882	   with the present wire image.  Extremely limited observation of
883	   upstream congestion may be possible via the observation of CE
884	   markings on ECN-enabled QUIC traffic.

886	4.3.  Server Cooperation with Load Balancers

888	   In the case of content distribution networking architectures
889	   including load balancers, the connection ID provides a way for the
890	   server to signal information about the desired treatment of a flow to
891	   the load balancers.  Guidance on assigning connection IDs is given in
892	   [QUIC-APPLICABILITY].

894	4.4.  DDoS Detection and Mitigation

896	   Current practices in detection and mitigation of Distributed Denial
897	   of Service (DDoS) attacks generally involve classification of
898	   incoming traffic (as packets, flows, or some other aggregate) into
899	   "good" (productive) and "bad" (DDoS) traffic, and then differential
900	   treatment of this traffic to forward only good traffic.  This
901	   operation is often done in a separate specialized mitigation
902	   environment through which all traffic is filtered; a generalized
903	   architecture for separation of concerns in mitigation is given in
904	   [DOTS-ARCH].

906	   Key to successful DDoS mitigation is efficient classification of this
907	   traffic in the mitigation environment.  Limited first-packet garbage
908	   detection as in Section 3.1.2 and stateful tracking of QUIC traffic
909	   as in Section 4.1 above may be useful during classification.

911	   Note that the use of a connection ID to support connection migration
912	   renders 5-tuple based filtering insufficient and requires more state
913	   to be maintained by DDoS defense systems.  For the common case of NAT
914	   rebinding, DDoS defense systems can detect a change in the client's
915	   endpoint address by linking flows based on the server's connection
916	   IDs.  QUIC's linkability resistance ensures that a deliberate
917	   connection migration is accompanied by a change in the connection ID.

919	   It is questionable whether connection migrations must be supported
920	   during a DDoS attack.  If the connection migration is not visible to
921	   the network that performs the DDoS detection, an active, migrated
922	   QUIC connection may be blocked by such a system under attack.  As
923	   soon as the connection blocking is detected by the client, the client
924	   may rely on the fast resumption mechanism provided by QUIC.  When
925	   clients migrate to a new path, they should be prepared for the
926	   migration to fail and attempt to reconnect quickly.

928	   TCP syncookies [RFC4937] are a well-established method of mitigating
929	   some kinds of TCP DDoS attacks.  QUIC Retry packets are the
930	   functional analogue to syncookies, forcing clients to prove
931	   possession of their IP address before committing server state.
932	   However, there are safeguards in QUIC against unsolicited injection
933	   of these packets by intermediaries who do not have consent of the end
934	   server.  See [QUIC_LB] for standard ways for intermediaries to send
935	   Retry packets on behalf of consenting servers.

937	4.5.  UDP Policing

939	   Today, UDP is the most prevalent DDoS vector, since it is easy for
940	   compromised non-admin applications to send a flood of large UDP
941	   packets (while with TCP the attacker gets throttled by the congestion
942	   controller) or to craft reflection and amplification attacks.
943	   Networks should therefore be prepared for UDP flood attacks on ports
944	   used for QUIC traffic.  One possible response to this threat is to
945	   police UDP traffic on the network, allocating a fixed portion of the
946	   network capacity to UDP and blocking UDP datagrams over that cap.

948	   The recommended way to police QUIC packets is to either drop them all
949	   or to throttle them based on the hash of the UDP datagram's source
950	   and destination addresses, blocking a portion of the hash space that
951	   corresponds to the fraction of UDP traffic one wishes to drop.  When
952	   the handshake is blocked, QUIC-capable applications may fail over to
953	   TCP (at least applications using well-known UDP ports).  However,
954	   blindly blocking a significant fraction of QUIC packets will allow
955	   many QUIC handshakes to complete, preventing a TCP failover, but the
956	   connections will suffer from severe packet loss.
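
   The hash-based variant can be illustrated with a short Python
   sketch.  The function name should_drop, the use of SHA-256, and the
   textual address representation are assumptions chosen for clarity;
   a forwarding device would more likely use a cheap keyed hash over
   the binary addresses.

```python
import hashlib


def should_drop(src_addr: str, dst_addr: str, drop_fraction: float) -> bool:
    """Decide per address pair, not per packet, whether to drop."""
    digest = hashlib.sha256(f"{src_addr}|{dst_addr}".encode()).digest()
    # Map the first 32 bits of the hash to a position in [0, 1).
    position = int.from_bytes(digest[:4], "big") / 2**32
    return position < drop_fraction
```

   Because the decision depends only on the address pair, a given pair
   is either blocked entirely, so that its handshake fails and the
   application can fail over to TCP, or not policed at all.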

958	4.6.  Distinguishing Acknowledgment traffic

960	   Some deployed in-network functions distinguish pure-acknowledgment
961	   (ACK) packets from packets carrying upper-layer data in order to
962	   attempt to enhance performance, for example by queueing ACKs
963	   differently or manipulating ACK signaling.  Distinguishing ACK
964	   packets is trivial in TCP, but not supported by QUIC, since
965	   acknowledgment signaling is carried inside QUIC's encrypted payload,
966	   and ACK manipulation is impossible.  Specifically, heuristics
967	   attempting to distinguish ACK-only packets from payload-carrying
968	   packets based on packet size are likely to fail, and are emphatically
969	   NOT RECOMMENDED.

971	4.7.  Quality of Service handling and ECMP

973	   It is expected that any QoS handling in the network, e.g. based on
974	   use of DiffServ Code Points (DSCPs) [RFC2475] as well as Equal-Cost
975	   Multi-Path (ECMP) routing, is applied on a per-flow basis (and not
976	   per-packet), such that all packets belonging to the same QUIC
977	   connection get uniform treatment.  Using ECMP to distribute packets
978	   from a single flow across multiple network paths or any other non-
979	   uniform treatment of packets belonging to the same connection could
980	   result in variations in order, delivery rate, and drop rate.  As
981	   feedback about loss or delay of each packet is used as input to the
982	   congestion controller, these variations could adversely affect
983	   performance.

985	   Depending on the loss recovery mechanism implemented, QUIC may be
986	   more tolerant of packet re-ordering than traditional TCP traffic (see
987	   Section 2.7).  However, it cannot be known by the network which exact
988	   recovery mechanism is used and therefore reordering tolerance should
989	   be considered as unknown.

991	4.8.  QUIC and Network Address Translation (NAT)

993	   QUIC Connection IDs are opaque byte fields that are expressed
994	   consistently across all QUIC versions [QUIC-INVARIANTS], see
995	   Section 2.6.  This feature may appear to present opportunities to
996	   optimize NAT port usage and simplify the work of the QUIC server.  In
997	   fact, NAT behavior that relies on CID may instead cause connection
998	   failure when endpoints change Connection ID, and disable important
999	   protocol security features.  NATs should retain their existing 4-
1000	   tuple-based operation and refrain from parsing or otherwise using
1001	   QUIC connection IDs.

1003	   This section uses the colloquial term NAT to mean NAPT (section 2.2
1004	   of [RFC3022]), which overloads several IP addresses to one IP address
1005	   or to an IP address pool, as commonly deployed in carrier-grade NATs
1006	   or residential NATs.

1008	   The remainder of this section explains how QUIC supports NATs better
1009	   than other connection-oriented protocols, why NAT use of Connection
1010	   ID might appear attractive, and how NAT use of CID can create serious
1011	   problems for the endpoints.

1013	   [RFC4787] contains some guidance on building NATs to interact
1014	   constructively with a wide range of applications.  This section
1015	   extends the discussion to QUIC.

1017	   By using the CID, QUIC connections can survive NAT rebindings as long
1018	   as no routing function in the path is dependent on client IP address
1019	   and port to deliver packets between server and NAT.  Reducing the
1020	   timeout on UDP NATs might be tempting in light of this property, but
1021	   not all QUIC server deployments will be robust to rebinding.

1023	4.8.1.  Resource Conservation

1025	   NATs sometimes hit an operational limit where they exhaust available
1026	   public IP addresses and ports, and must evict flows from their
1027	   address/port mapping.  CIDs might appear to offer a way to multiplex
1028	   many connections over a single address and port.

1030	   However, QUIC endpoints may negotiate new connection IDs inside
1031	   cryptographically protected packets, and begin using them at will.
1032	   Imagine two clients behind a NAT that are sharing the same public IP
1033	   address and port.  The NAT is differentiating them using the incoming
1034	   Connection ID.  If one client secretly changes its connection ID,
1035	   there will be no mapping for the NAT, and the connection will
1036	   suddenly break.

1038	   QUIC is deliberately designed to fail rather than persist when the
1039	   network cannot support its operation.  For HTTP/3, this extends to
1040	   recommending a fallback to TCP-based versions of HTTP rather than
1041	   persisting with a QUIC connection that might be unstable.  And
1042	   [I-D.ietf-quic-applicability] recommends TCP fallback for other
1043	   protocols on the basis that this is preferable to sudden connection
1044	   errors and timeouts.  Furthermore, wide deployment of NATs with this
1045	   behavior hinders the use of QUIC's migration function, which relies
1046	   on the ability to change the connection ID any time during the
1047	   lifetime of a QUIC connection.

1049	   It is possible, in principle, to encode the client's identity in a
1050	   connection ID using the techniques described in [QUIC_LB] and
1051	   explicit coordination with the NAT.  However, this implies that the
1052	   client shares configuration with the NAT, which might be logistically
1053	   difficult.  This adds administrative overhead while not resolving the
1054	   case where a client migrates to a point behind the NAT.

1056	   Note that multiplexing connection IDs over a single port anyway
1057	   violates the best common practice to avoid "port overloading" as
1058	   described in [RFC4787].

1060	4.8.2.  "Helping" with routing infrastructure issues

1062	   Concealing client address changes in order to simplify operational
1063	   routing issues will mask important signals that drive security
1064	   mechanisms, and therefore opens QUIC up to various attacks.

1066	   One challenge in QUIC deployments that want to benefit from QUIC's
1067	   migration capability is server infrastructures with routers and
1068	   switches that direct traffic based on address-port 4-tuple rather
1069	   than connection ID.  The use of source IP address means that a NAT
1070	   rebinding or address migration will deliver packets to the wrong
1071	   server.  As all QUIC payloads are encrypted, routers and switches
1072	   will not have access to negotiated but not-yet-in-use CIDs.  This is
1073	   a particular problem for low-state load balancers.  [QUIC_LB]
1074	   addresses this problem by proposing a QUIC extension to allow some
1075	   server-load balancer coordination for routable CIDs.

1077	   It might seem that a NAT placed in front of such an infrastructure
1078	   setup could save the effort of converting all these devices by
1079	   decoding routable connection IDs and rewriting the packet IP
1080	   addresses to allow consistent routing by legacy devices.

1082	   Unfortunately, the change of IP address or port is an important
1083	   signal to QUIC endpoints.  It requires a review of path-dependent
1084	   variables like congestion control parameters.  It can also signify
1085	   various attacks that mislead one endpoint about the best peer address
1086	   for the connection (see section 9 of [QUIC-TRANSPORT]).  The QUIC
1087	   PATH_CHALLENGE and PATH_RESPONSE frames are intended to detect and
1088	   mitigate these attacks and verify connectivity to the new address.
1089	   This mechanism cannot work if the NAT is bleaching peer address
1090	   changes.

1092	   For example, an attacker might copy a legitimate QUIC packet and
1093	   change the source address to match its own.  In the absence of a
1094	   bleaching NAT, the receiving endpoint would interpret this as a
1095	   potential NAT rebinding and use a PATH_CHALLENGE frame to prove that
1096	   the peer endpoint is not truly at the new address, thus thwarting the
1097	   attack.  A bleaching NAT has no means of sending an encrypted
1098	   PATH_CHALLENGE frame, so it might start redirecting all QUIC traffic
1099	   to the attacker address and thus allow an observer to break the
1100	   connection.

1102	4.9.  Filtering behavior

1104	   [RFC4787] describes possible packet filtering behaviors that relate
1105	   to NATs.  Though the guidance there holds, a particularly unwise
1106	   behavior is to admit a handful of UDP packets and then make a
1107	   decision as to whether or not to filter them.  QUIC applications are
1108	   encouraged to fail over to TCP if early packets do not arrive at
1109	   their destination.  Admitting a few packets allows the QUIC endpoint
1110	   to determine that the path accepts QUIC.  Sudden drops afterwards
1111	   will result in slow and costly timeouts before abandoning the
1112	   connection.

1114	5.  IANA Considerations

1116	   This document has no actions for IANA.

1118	6.  Security Considerations

1120	   QUIC is an encrypted and authenticated transport.  That means, once
1121	   the cryptographic handshake is complete, QUIC endpoints discard most
1122	   packets that are not authenticated, greatly limiting the ability of
1123	   an attacker to interfere with existing connections.

1125	   However, some information is still observable, as supporting
1126	   manageability of QUIC traffic inherently involves tradeoffs with the
1127	   confidentiality of QUIC's control information; this entire document
1128	   is therefore security-relevant.

1130	   More security considerations for QUIC are discussed in
1131	   [QUIC-TRANSPORT] and [QUIC-TLS], generally considering active or
1132	   passive attackers in the network as well as attacks on specific QUIC
1133	   mechanisms.

1135	   Version Negotiation packets do not contain any mechanism to prevent
1136	   version downgrade attacks.  However, future versions of QUIC that use
1137	   Version Negotiation packets are required to define a mechanism that
1138	   is robust against version downgrade attacks.  Therefore, a network
1139	   node should not attempt to impact version selection, as version
1140	   downgrade may result in connection failure.

1142	7.  Contributors

1144	   The following people have contributed text to sections of this
1145	   document:

1147	   *  Dan Druta

1149	   *  Martin Duke

1151	   *  Marcus Ihlar

1153	   *  Igor Lubashev

1155	   *  David Schinazi

1157	8.  Acknowledgments

1159	   Special thanks to Martin Thomson and Martin Duke for the detailed
1160	   reviews and feedback.

1162	   This work is partially supported by the European Commission under
1163	   Horizon 2020 grant agreement no. 688421 Measurement and Architecture
1164	   for a Middleboxed Internet (MAMI), and by the Swiss State Secretariat
1165	   for Education, Research, and Innovation under contract no. 15.0268.
1166	   This support does not imply endorsement.

1168	9.  Appendix

1170	   This appendix uses the following conventions: array[i] is the byte at
1171	   index i of array; array[i:j] is the subset of array from index i
1172	   (inclusive) up to j-1 (inclusive); array[i:] is the subset of array
1173	   starting with index i (inclusive) up to the end of the array.

1175	9.1.  Distinguishing IETF QUIC and Google QUIC Versions

1177	   This section contains algorithms that allow parsing versions from
1178	   both Google QUIC and IETF QUIC.  These mechanisms will become
1179	   irrelevant when IETF QUIC is fully deployed and Google QUIC is
1180	   deprecated.

1182	   Note that other than this appendix, nothing in this document applies
1183	   to Google QUIC.  The purpose of this appendix is merely to
1184	   distinguish IETF QUIC from any versions of Google QUIC.

1186	   Conceptually, a Google QUIC version is an opaque 32-bit field.  When
1187	   we refer to a version with four printable characters, we use its
1188	   ASCII representation: for example, Q050 refers to {'Q', '0', '5',
1189	   '0'} which is equal to {0x51, 0x30, 0x35, 0x30}. Otherwise, we use
1190	   its hexadecimal representation: for example, 0xff00001d refers to
1191	   {0xff, 0x00, 0x00, 0x1d}.

1193	   QUIC versions that start with 'Q' or 'T' followed by three digits are
1194	   Google QUIC versions.  Versions up to and including 43 are documented
1195	   by <https://docs.google.com/document/d/
1196	   1WJvyZflAO2pq77yOLbp9NsGjC1CHetAXV8I0fQe-B_U/preview>.  Versions
1197	   Q046, Q050, T050, and T051 are not fully documented, but this
1198	   appendix should contain enough information to allow parsing Client
1199	   Hellos for those versions.
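
   The version notation and the 'Q'/'T' rule described above can be
   sketched in Python as follows.  This is an illustrative sketch only:
   the function names format_version and is_google_quic are
   hypothetical, and treating bytes 0x21 through 0x7e as "printable" is
   an assumption made here, not something this document defines.

```python
def format_version(version: bytes) -> str:
    """Render a 4-byte version field in the notation used by this appendix."""
    if len(version) != 4:
        raise ValueError("a version field is exactly 4 bytes")
    # Assumption: "printable" means visible ASCII, 0x21 through 0x7e.
    if all(0x21 <= b <= 0x7E for b in version):
        return version.decode("ascii")  # e.g. Q050
    # Otherwise fall back to the hexadecimal representation.
    return "0x" + version.hex()  # e.g. 0xff00001d


def is_google_quic(version: bytes) -> bool:
    """True when the version is 'Q' or 'T' followed by three ASCII digits."""
    return (
        len(version) == 4
        and version[0:1] in (b"Q", b"T")
        and all(0x30 <= b <= 0x39 for b in version[1:4])
    )
```

   For example, the bytes {0x51, 0x30, 0x35, 0x30} are rendered as Q050
   and classified as Google QUIC, while {0xff, 0x00, 0x00, 0x1d} is
   rendered as 0xff00001d and is not.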

1201	   To extract the version number itself, one needs to look at the first
1202	   byte of the QUIC packet, in other words the first byte of the UDP
1203	   payload.

1205	     first_byte = packet[0]
1206	     first_byte_bit1 = ((first_byte & 0x80) != 0)
1207	     first_byte_bit2 = ((first_byte & 0x40) != 0)
1208	     first_byte_bit3 = ((first_byte & 0x20) != 0)
1209	     first_byte_bit4 = ((first_byte & 0x10) != 0)
1210	     first_byte_bit5 = ((first_byte & 0x08) != 0)
1211	     first_byte_bit6 = ((first_byte & 0x04) != 0)
1212	     first_byte_bit7 = ((first_byte & 0x02) != 0)
1213	     first_byte_bit8 = ((first_byte & 0x01) != 0)
1214	     if (first_byte_bit1) {
1215	       version = packet[1:5]
1216	     } else if (first_byte_bit5 && !first_byte_bit2) {
1217	       if (!first_byte_bit8) {
1218	         abort("Packet without version")
1219	       }
1220	       if (first_byte_bit5) {
1221	         version = packet[9:13]
1222	       } else {
1223	         version = packet[5:9]
1224	       }
1225	     } else {
1226	       abort("Packet without version")
1227	     }
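
   For readers who prefer runnable code, the pseudocode above could be
   transcribed into Python roughly as shown here.  The names
   extract_version, PacketWithoutVersion, and _field are hypothetical,
   and the length checks are an addition that the pseudocode does not
   contain.  The branch structure follows the pseudocode; because bit 5
   is already known to be set on the second branch, the packet[5:9]
   alternative is never reached and is therefore omitted.

```python
class PacketWithoutVersion(Exception):
    """Raised where the pseudocode calls abort()."""


def _field(packet: bytes, start: int, end: int) -> bytes:
    # Added bounds check (not in the pseudocode): refuse short packets.
    if len(packet) < end:
        raise PacketWithoutVersion("Packet too short")
    return packet[start:end]


def extract_version(packet: bytes) -> bytes:
    """Return the 4-byte version field of a UDP payload carrying QUIC."""
    if not packet:
        raise PacketWithoutVersion("Empty packet")
    first_byte = packet[0]
    bit1 = (first_byte & 0x80) != 0  # most significant bit
    bit2 = (first_byte & 0x40) != 0
    bit5 = (first_byte & 0x08) != 0
    bit8 = (first_byte & 0x01) != 0  # least significant bit

    if bit1:
        # The version immediately follows the first byte.
        return _field(packet, 1, 5)
    if bit5 and not bit2:
        if not bit8:
            raise PacketWithoutVersion("Packet without version")
        # bit5 is always set here, so the version is at offset 9.
        return _field(packet, 9, 13)
    raise PacketWithoutVersion("Packet without version")
```

   The returned bytes can then be compared against known versions, for
   example b"Q050" or the four bytes of 0xff00001d.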

1229	9.2.  Extracting the CRYPTO frame
1230	     counter = 0
1231	     while (payload[counter] == 0) {
1232	       counter += 1
1233	     }
1234	     first_nonzero_payload_byte = payload[counter]
1235	     fnz_payload_byte_bit3 = ((first_nonzero_payload_byte & 0x20) != 0)

1237	     if (first_nonzero_payload_byte != 0x06) {
1238	       abort("Unexpected frame")
1239	     }
1240	     if (payload[counter+1] != 0x00) {
1241	       abort("Unexpected crypto stream offset")
1242	     }
1243	     counter += 2
1244	     if ((payload[counter] & 0xc0) == 0) {
1245	       crypto_data_length = payload[counter]
1246	       counter += 1
1247	     } else {
1248	       crypto_data_length = payload[counter:counter+2]
1249	       counter += 2
1250	     }
1251	     crypto_data = payload[counter:counter+crypto_data_length]
1252	     ParseTLS(crypto_data)
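
   A possible Python rendering of the steps above is sketched below.
   It rests on several assumptions that go beyond the pseudocode: the
   names extract_crypto_data and UnexpectedPayload are hypothetical,
   bounds checks have been added, and the two-byte length is decoded as
   a QUIC variable-length integer (two prefix bits 01, remaining 14
   bits carrying the value) as defined in [QUIC-TRANSPORT].  Longer
   length encodings are rejected rather than guessed at, and the
   ParseTLS step is left to the caller.

```python
class UnexpectedPayload(Exception):
    """Raised where the pseudocode calls abort()."""


def extract_crypto_data(payload: bytes) -> bytes:
    """Return the data of the first CRYPTO frame in a decrypted payload."""
    counter = 0
    # Skip leading zero bytes (PADDING frames have type 0x00).
    while counter < len(payload) and payload[counter] == 0:
        counter += 1
    # Added check: frame type, offset, and first length byte must exist.
    if counter + 3 > len(payload):
        raise UnexpectedPayload("Truncated payload")
    if payload[counter] != 0x06:
        raise UnexpectedPayload("Unexpected frame")
    if payload[counter + 1] != 0x00:
        raise UnexpectedPayload("Unexpected crypto stream offset")
    counter += 2

    prefix = payload[counter] & 0xC0
    if prefix == 0x00:
        # One-byte length.
        length = payload[counter]
        counter += 1
    elif prefix == 0x40:
        # Two-byte length; assumption: mask off the two prefix bits.
        if counter + 2 > len(payload):
            raise UnexpectedPayload("Truncated length")
        length = int.from_bytes(payload[counter:counter + 2], "big") & 0x3FFF
        counter += 2
    else:
        raise UnexpectedPayload("Unsupported length encoding")

    crypto_data = payload[counter:counter + length]
    if len(crypto_data) != length:
        raise UnexpectedPayload("Truncated crypto data")
    return crypto_data
```

   The result would then be handed to a TLS parser, which corresponds
   to the ParseTLS(crypto_data) call in the pseudocode.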

1254	10.  References

1256	10.1.  Normative References

1258	   [QUIC-TLS] Thomson, M. and S. Turner, "Using TLS to Secure QUIC",
1259	              Work in Progress, Internet-Draft, draft-ietf-quic-tls-34,
1260	              14 January 2021, <http://www.ietf.org/internet-drafts/
1261	              draft-ietf-quic-tls-34.txt>.

1263	   [QUIC-TRANSPORT]
1264	              Iyengar, J. and M. Thomson, "QUIC: A UDP-Based Multiplexed
1265	              and Secure Transport", Work in Progress, Internet-Draft,
1266	              draft-ietf-quic-transport-34, 14 January 2021,
1267	              <http://www.ietf.org/internet-drafts/draft-ietf-quic-
1268	              transport-34.txt>.

1270	10.2.  Informative References

1272	   [Ding2015] Ding, H. and M. Rabinovich, "TCP Stretch Acknowledgments
1273	              and Timestamps - Findings and Implications for Passive
1274	              RTT Measurement (ACM Computer Communication Review)", July
1275	              2015, <http://www.sigcomm.org/sites/default/files/ccr/
1276	              papers/2015/July/0000000-0000002.pdf>.

1278	   [DOTS-ARCH]
1279	              Mortensen, A., Reddy.K, T., Andreasen, F., Teague, N., and
1280	              R. Compton, "Distributed-Denial-of-Service Open Threat
1281	              Signaling (DOTS) Architecture", Work in Progress,
1282	              Internet-Draft, draft-ietf-dots-architecture-18, 6 March
1283	              2020, <http://www.ietf.org/internet-drafts/draft-ietf-
1284	              dots-architecture-18.txt>.

1286	   [I-D.ietf-quic-applicability]
1287	              Kuehlewind, M. and B. Trammell, "Applicability of the QUIC
1288	              Transport Protocol", Work in Progress, Internet-Draft,
1289	              draft-ietf-quic-applicability-08, 2 November 2020,
1290	              <http://www.ietf.org/internet-drafts/draft-ietf-quic-
1291	              applicability-08.txt>.

1293	   [IPIM]     Allman, M., Beverly, R., and B. Trammell, "In-Protocol
1294	              Internet Measurement (arXiv preprint 1612.02902)", 9
1295	              December 2016, <https://arxiv.org/abs/1612.02902>.

1297	   [QUIC-APPLICABILITY]
1298	              Kuehlewind, M. and B. Trammell, "Applicability of the QUIC
1299	              Transport Protocol", Work in Progress, Internet-Draft,
1300	              draft-ietf-quic-applicability-08, 2 November 2020,
1301	              <http://www.ietf.org/internet-drafts/draft-ietf-quic-
1302	              applicability-08.txt>.

1304	   [QUIC-HTTP]
1305	              Bishop, M., "Hypertext Transfer Protocol Version 3
1306	              (HTTP/3)", Work in Progress, Internet-Draft, draft-ietf-
1307	              quic-http-33, 15 December 2020, <http://www.ietf.org/
1308	              internet-drafts/draft-ietf-quic-http-33.txt>.

1310	   [QUIC-INVARIANTS]
1311	              Thomson, M., "Version-Independent Properties of QUIC",
1312	              Work in Progress, Internet-Draft, draft-ietf-quic-
1313	              invariants-13, 14 January 2021, <http://www.ietf.org/
1314	              internet-drafts/draft-ietf-quic-invariants-13.txt>.

1316	   [QUIC_LB]  Duke, M. and N. Banks, "QUIC-LB: Generating Routable QUIC
1317	              Connection IDs", Work in Progress, Internet-Draft, draft-
1318	              ietf-quic-load-balancers-05, 30 October 2020,
1319	              <http://www.ietf.org/internet-drafts/draft-ietf-quic-load-
1320	              balancers-05.txt>.

1322	   [RFC2475]  Blake, S., Black, D., Carlson, M., Davies, E., Wang, Z.,
1323	              and W. Weiss, "An Architecture for Differentiated
1324	              Services", RFC 2475, DOI 10.17487/RFC2475, December 1998,
1325	              <https://www.rfc-editor.org/info/rfc2475>.

1327	   [RFC3022]  Srisuresh, P. and K. Egevang, "Traditional IP Network
1328	              Address Translator (Traditional NAT)", RFC 3022,
1329	              DOI 10.17487/RFC3022, January 2001,
1330	              <https://www.rfc-editor.org/info/rfc3022>.

1332	   [RFC4787]  Audet, F., Ed. and C. Jennings, "Network Address
1333	              Translation (NAT) Behavioral Requirements for Unicast
1334	              UDP", BCP 127, RFC 4787, DOI 10.17487/RFC4787, January
1335	              2007, <https://www.rfc-editor.org/info/rfc4787>.

1337	   [RFC4937]  Arberg, P. and V. Mammoliti, "IANA Considerations for PPP
1338	              over Ethernet (PPPoE)", RFC 4937, DOI 10.17487/RFC4937,
1339	              June 2007, <https://www.rfc-editor.org/info/rfc4937>.

1341	   [RFC5382]  Guha, S., Ed., Biswas, K., Ford, B., Sivakumar, S., and P.
1342	              Srisuresh, "NAT Behavioral Requirements for TCP", BCP 142,
1343	              RFC 5382, DOI 10.17487/RFC5382, October 2008,
1344	              <https://www.rfc-editor.org/info/rfc5382>.

1346	   [RFC6066]  Eastlake 3rd, D., "Transport Layer Security (TLS)
1347	              Extensions: Extension Definitions", RFC 6066,
1348	              DOI 10.17487/RFC6066, January 2011,
1349	              <https://www.rfc-editor.org/info/rfc6066>.

1351	   [RFC6928]  Chu, J., Dukkipati, N., Cheng, Y., and M. Mathis,
1352	              "Increasing TCP's Initial Window", RFC 6928,
1353	              DOI 10.17487/RFC6928, April 2013,
1354	              <https://www.rfc-editor.org/info/rfc6928>.

1356	   [RFC7301]  Friedl, S., Popov, A., Langley, A., and E. Stephan,
1357	              "Transport Layer Security (TLS) Application-Layer Protocol
1358	              Negotiation Extension", RFC 7301, DOI 10.17487/RFC7301,
1359	              July 2014, <https://www.rfc-editor.org/info/rfc7301>.

1361	   [RFC7605]  Touch, J., "Recommendations on Using Assigned Transport
1362	              Port Numbers", BCP 165, RFC 7605, DOI 10.17487/RFC7605,
1363	              August 2015, <https://www.rfc-editor.org/info/rfc7605>.

1365	   [TLS-ESNI] Rescorla, E., Oku, K., Sullivan, N., and C. Wood, "TLS
1366	              Encrypted Client Hello", Work in Progress, Internet-Draft,
1367	              draft-ietf-tls-esni-09, 16 December 2020,
1368	              <http://www.ietf.org/internet-drafts/draft-ietf-tls-esni-
1369	              09.txt>.

1371	   [TMA-QOF]  Trammell, B., Gugelmann, D., and N. Brownlee, "Inline Data
1372	              Integrity Signals for Passive Measurement (in Proc. TMA
1373	              2014)", April 2014.

1375	   [WIRE-IMAGE]
1376	              Trammell, B. and M. Kuehlewind, "The Wire Image of a
1377	              Network Protocol", RFC 8546, DOI 10.17487/RFC8546, April
1378	              2019, <https://www.rfc-editor.org/info/rfc8546>.

1380	Authors' Addresses

1382	   Mirja Kuehlewind
1383	   Ericsson

1385	   Email: mirja.kuehlewind@ericsson.com

1387	   Brian Trammell
1388	   Google
1389	   Gustav-Gull-Platz 1
1390	   CH- 8004 Zurich
1391	   Switzerland

1393	   Email: ietf@trammell.ch