idnits 2.17.1

draft-ietf-quic-transport-07.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack a Security Considerations section.

  ** The abstract seems to contain references ([2], [3], [1]), which it
     shouldn't. Please replace those with straight textual mentions of the
     documents in question.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == The document seems to lack the recommended RFC 2119 boilerplate, even if
     it appears to use RFC 2119 keywords.

     (The document does seem to have the reference to RFC 2119 which the
     ID-Checklist requires).

  -- The document date (October 13, 2017) is 2380 days in the past. Is this
     intentional?

  -- Found something which looks like a code comment -- if you have code
     sections in the document, please surround them with '<CODE BEGINS>' and
     '<CODE ENDS>' lines.

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  -- Looks like a reference, but probably isn't: '1' on line 3580

  -- Looks like a reference, but probably isn't: '2' on line 3582

  -- Looks like a reference, but probably isn't: '3' on line 3584

  -- Looks like a reference, but probably isn't: '4' on line 3586

  == Outdated reference: A later version (-28) exists of
     draft-ietf-tls-tls13-21

  == Outdated reference: A later version (-34) exists of
     draft-ietf-quic-recovery-07

  == Outdated reference: A later version (-34) exists of
     draft-ietf-quic-tls-07

  -- Duplicate reference: RFC1191, mentioned in 'RFC1191', was also mentioned
     in 'PMTUDv4'.

  -- Obsolete informational reference (is this intentional?): RFC 6824
     (Obsoleted by RFC 8684)

  -- Obsolete informational reference (is this intentional?): RFC 7540
     (Obsoleted by RFC 9113)

     Summary: 2 errors (**), 0 flaws (~~), 5 warnings (==), 9 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

QUIC                                                     J. Iyengar, Ed.
Internet-Draft                                                    Google
Intended status: Standards Track                         M. Thomson, Ed.
Expires: April 16, 2018                                          Mozilla
                                                        October 13, 2017

           QUIC: A UDP-Based Multiplexed and Secure Transport
                      draft-ietf-quic-transport-07

Abstract

   This document defines the core of the QUIC transport protocol. This
   document describes connection establishment, packet format,
   multiplexing and reliability. Accompanying documents describe the
   cryptographic handshake and loss detection.

Note to Readers

   Discussion of this draft takes place on the QUIC working group
   mailing list (quic@ietf.org), which is archived at
   https://mailarchive.ietf.org/arch/search/?email_list=quic [1].

   Working Group information can be found at https://github.com/quicwg
   [2]; source code and issues list for this draft can be found at
   https://github.com/quicwg/base-drafts/labels/transport [3].

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on April 16, 2018.

Copyright Notice

   Copyright (c) 2017 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1. Introduction
   2. Conventions and Definitions
     2.1. Notational Conventions
   3. A QUIC Overview
     3.1. Low-Latency Connection Establishment
     3.2. Stream Multiplexing
     3.3. Rich Signaling for Congestion Control and Loss Recovery
     3.4. Stream and Connection Flow Control
     3.5. Authenticated and Encrypted Header and Payload
     3.6. Connection Migration and Resilience to NAT Rebinding
     3.7. Version Negotiation
   4. Versions
   5. Packet Types and Formats
     5.1. Long Header
     5.2. Short Header
     5.3. Version Negotiation Packet
     5.4. Cleartext Packets
       5.4.1. Client Initial Packet
       5.4.2. Server Stateless Retry Packet
       5.4.3. Server Cleartext Packet
       5.4.4. Client Cleartext Packet
     5.5. Protected Packets
     5.6. Connection ID
     5.7. Packet Numbers
       5.7.1. Initial Packet Number
     5.8. Handling Packets from Different Versions
   6. Frames and Frame Types
   7. Life of a Connection
     7.1. Matching Packets to Connections
     7.2. Version Negotiation
       7.2.1. Sending Version Negotiation Packets
       7.2.2. Handling Version Negotiation Packets
       7.2.3. Using Reserved Versions
     7.3. Cryptographic and Transport Handshake
     7.4. Transport Parameters
       7.4.1. Transport Parameter Definitions
       7.4.2. Values of Transport Parameters for 0-RTT
       7.4.3. New Transport Parameters
       7.4.4. Version Negotiation Validation
     7.5. Stateless Retries
     7.6. Proof of Source Address Ownership
       7.6.1. Client Address Validation Procedure
       7.6.2. Address Validation on Session Resumption
       7.6.3. Address Validation Token Integrity
     7.7. Connection Migration
       7.7.1. Privacy Implications of Connection Migration
       7.7.2. Address Validation for Migrated Connections
     7.8. Connection Termination
       7.8.1. Draining Period
       7.8.2. Idle Timeout
       7.8.3. Immediate Close
       7.8.4. Stateless Reset
   8. Frame Types and Formats
     8.1. PADDING Frame
     8.2. RST_STREAM Frame
     8.3. CONNECTION_CLOSE frame
     8.4. APPLICATION_CLOSE frame
     8.5. MAX_DATA Frame
     8.6. MAX_STREAM_DATA Frame
     8.7. MAX_STREAM_ID Frame
     8.8. PING frame
     8.9. BLOCKED Frame
     8.10. STREAM_BLOCKED Frame
     8.11. STREAM_ID_BLOCKED Frame
     8.12. NEW_CONNECTION_ID Frame
     8.13. STOP_SENDING Frame
     8.14. ACK Frame
       8.14.1. ACK Block Section
       8.14.2. ACK Frames and Packet Protection
     8.15. STREAM Frame
   9. Packetization and Reliability
     9.1. Special Considerations for PMTU Discovery
   10. Streams: QUIC's Data Structuring Abstraction
     10.1. Stream Identifiers
     10.2. Life of a Stream
       10.2.1. idle
       10.2.2. open
       10.2.3. half-closed (local)
       10.2.4. half-closed (remote)
       10.2.5. closed
     10.3. Solicited State Transitions
     10.4. Stream Concurrency
     10.5. Sending and Receiving Data
     10.6. Stream Prioritization
   11. Flow Control
     11.1. Edge Cases and Other Considerations
       11.1.1. Response to a RST_STREAM
       11.1.2. Data Limit Increments
     11.2. Stream Limit Increment
       11.2.1. Blocking on Flow Control
     11.3. Stream Final Offset
   12. Error Handling
     12.1. Connection Errors
     12.2. Stream Errors
     12.3. Transport Error Codes
     12.4. Application Protocol Error Codes
   13. Security and Privacy Considerations
     13.1. Spoofed ACK Attack
     13.2. Slowloris Attacks
     13.3. Stream Fragmentation and Reassembly Attacks
     13.4. Stream Commitment Attack
   14. IANA Considerations
     14.1. QUIC Transport Parameter Registry
     14.2. QUIC Transport Error Codes Registry
   15. References
     15.1. Normative References
     15.2. Informative References
     15.3. URIs
   Appendix A. Contributors
   Appendix B. Acknowledgments
   Appendix C. Change Log
     C.1. Since draft-ietf-quic-transport-06
     C.2. Since draft-ietf-quic-transport-05
     C.3. Since draft-ietf-quic-transport-04
     C.4. Since draft-ietf-quic-transport-03
     C.5. Since draft-ietf-quic-transport-02
     C.6. Since draft-ietf-quic-transport-01
     C.7. Since draft-ietf-quic-transport-00
     C.8. Since draft-hamilton-quic-transport-protocol-01
   Authors' Addresses

1. Introduction

   QUIC is a multiplexed and secure transport protocol that runs on top
   of UDP. QUIC aims to provide a flexible set of features that allow
   it to be a general-purpose transport for multiple applications.

   QUIC implements techniques learned from experience with TCP, SCTP and
   other transport protocols. QUIC uses UDP as substrate so as to not
   require changes to legacy client operating systems and middleboxes to
   be deployable. QUIC authenticates all of its headers and encrypts
   most of the data it exchanges, including its signaling. This allows
   the protocol to evolve without incurring a dependency on upgrades to
   middleboxes. This document describes the core QUIC protocol,
   including the conceptual design, wire format, and mechanisms of the
   QUIC protocol for connection establishment, stream multiplexing,
   stream and connection-level flow control, and data reliability.

   Accompanying documents describe QUIC's loss detection and congestion
   control [QUIC-RECOVERY], and the use of TLS 1.3 for key negotiation
   [QUIC-TLS].

2. Conventions and Definitions

   The words "MUST", "MUST NOT", "SHOULD", and "MAY" are used in this
   document. It's not shouting; when they are capitalized, they have
   the special meaning defined in [RFC2119].

   Definitions of terms that are used in this document:

   Client: The endpoint initiating a QUIC connection.

   Server: The endpoint accepting incoming QUIC connections.

   Endpoint: The client or server end of a connection.

   Stream: A logical, bi-directional channel of ordered bytes within a
      QUIC connection.

   Connection: A conversation between two QUIC endpoints with a single
      encryption context that multiplexes streams within it.

   Connection ID: The 64-bit unsigned number used as an identifier for
      a QUIC connection.

   QUIC packet: A well-formed UDP payload that can be parsed by a QUIC
      receiver.
      QUIC packet size in this document refers to the UDP
      payload size.

2.1. Notational Conventions

   Packet and frame diagrams use the format described in Section 3.1 of
   [RFC2360], with the following additional conventions:

   [x] Indicates that x is optional
   {x} Indicates that x is encrypted

   x (A) Indicates that x is A bits long

   x (A/B/C) ... Indicates that x is one of A, B, or C bits long

   x (*) ... Indicates that x is variable-length

3. A QUIC Overview

   This section briefly describes QUIC's key mechanisms and benefits.
   Key strengths of QUIC include:

   o Low-latency connection establishment

   o Multiplexing without head-of-line blocking

   o Authenticated and encrypted header and payload

   o Rich signaling for congestion control and loss recovery

   o Stream and connection flow control

   o Connection migration and resilience to NAT rebinding

   o Version negotiation

3.1. Low-Latency Connection Establishment

   QUIC relies on a combined cryptographic and transport handshake for
   setting up a secure transport connection. QUIC connections are
   expected to commonly use 0-RTT handshakes, meaning that for most QUIC
   connections, data can be sent immediately following the client
   handshake packet, without waiting for a reply from the server. QUIC
   provides a dedicated stream (Stream ID 0) to be used for performing
   the cryptographic handshake and QUIC options negotiation. The format
   of the QUIC options and parameters used during negotiation is
   described in this document, but the handshake protocol that runs on
   Stream ID 0 is described in the accompanying cryptographic handshake
   draft [QUIC-TLS].

3.2. Stream Multiplexing

   When application messages are transported over TCP, independent
   application messages can suffer from head-of-line blocking.
   When an
   application multiplexes many streams atop TCP's single-bytestream
   abstraction, a loss of a TCP segment results in blocking of all
   subsequent segments until a retransmission arrives, irrespective of
   the application streams that are encapsulated in subsequent segments.
   QUIC ensures that lost packets carrying data for an individual stream
   only impact that specific stream. Data received on other streams can
   continue to be reassembled and delivered to the application.

3.3. Rich Signaling for Congestion Control and Loss Recovery

   QUIC's packet framing and acknowledgments carry rich information that
   helps both congestion control and loss recovery in fundamental ways.
   Each QUIC packet carries a new packet number, including those
   carrying retransmitted data. This obviates the need for a separate
   mechanism to distinguish acknowledgments for retransmissions from
   those for original transmissions, avoiding TCP's retransmission
   ambiguity problem. QUIC acknowledgments also explicitly encode the
   delay between the receipt of a packet and its acknowledgment being
   sent, and together with the monotonically-increasing packet numbers,
   this allows for precise network roundtrip-time (RTT) calculation.
   QUIC's ACK frames support up to 256 ACK blocks, so QUIC is more
   resilient to reordering than TCP with SACK support, as well as able
   to keep more bytes on the wire when there is reordering or loss.

3.4. Stream and Connection Flow Control

   QUIC implements stream- and connection-level flow control. At a high
   level, a QUIC receiver advertises the maximum amount of data that it
   is willing to receive on each stream. As data is sent, received, and
   delivered on a particular stream, the receiver sends MAX_STREAM_DATA
   frames that increase the advertised limit for that stream, allowing
   the peer to send more data on that stream.

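   The stream-level mechanism just described can be illustrated with a
   small Python sketch of the receiver side. This is not taken from the
   draft: the class name, the fixed window size, and the policy of
   advertising "delivered + window" are assumptions made for the
   example. The draft only states that the receiver raises the limit by
   sending MAX_STREAM_DATA frames.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StreamReceiveWindow:
    """Receiver-side flow control for one stream (hypothetical helper)."""

    window: int            # bytes the receiver is willing to buffer (assumed policy)
    max_stream_data: int   # limit currently advertised to the peer
    received: int = 0      # bytes received so far (in-order arrival assumed)
    delivered: int = 0     # bytes already handed to the application

    def on_data(self, length: int) -> None:
        """Account for received bytes; reject data beyond the advertised limit."""
        if self.received + length > self.max_stream_data:
            raise ValueError("peer exceeded the advertised stream limit")
        self.received += length

    def on_delivered(self, length: int) -> Optional[int]:
        """Record delivery to the application.

        Returns the new limit to carry in a MAX_STREAM_DATA frame, or None
        when the advertised limit does not need to grow.
        """
        if self.delivered + length > self.received:
            raise ValueError("cannot deliver more than was received")
        self.delivered += length
        new_limit = self.delivered + self.window
        if new_limit > self.max_stream_data:
            self.max_stream_data = new_limit
            return new_limit
        return None
```

   A connection-level limit could be modeled the same way by summing the
   counters across all streams, which mirrors how the draft describes
   connection-level flow control.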
   In addition to this stream-level flow control, QUIC implements
   connection-level flow control to limit the aggregate buffer that a
   QUIC receiver is willing to allocate to all streams on a connection.
   Connection-level flow control works in the same way as stream-level
   flow control, but the bytes delivered and the limits are aggregated
   across all streams.

3.5. Authenticated and Encrypted Header and Payload

   TCP headers appear in plaintext on the wire and are not
   authenticated, causing a plethora of injection and header
   manipulation issues for TCP, such as receive-window manipulation and
   sequence-number overwriting. While some of these are mechanisms used
   by middleboxes to improve TCP performance, others are active attacks.
   Even "performance-enhancing" middleboxes that routinely interpose on
   the transport state machine end up limiting the evolvability of the
   transport protocol, as has been observed in the design of MPTCP
   [RFC6824] and in its subsequent deployability issues.

   Generally, QUIC packets are always authenticated and the payload is
   typically fully encrypted. The parts of the packet header which are
   not encrypted are still authenticated by the receiver, so as to
   thwart any packet injection or manipulation by third parties. Some
   early handshake packets, such as the Version Negotiation packet, are
   not encrypted, but information sent in these unencrypted handshake
   packets is later verified as part of cryptographic processing.

3.6. Connection Migration and Resilience to NAT Rebinding

   QUIC connections are identified by a Connection ID, a 64-bit unsigned
   number randomly generated by the server. QUIC's consistent
   connection ID allows connections to survive changes to the client's
   IP and port, such as those caused by NAT rebindings or by the client
   changing network connectivity to a new address.
   QUIC provides
   automatic cryptographic verification of a rebound client, since the
   client continues to use the same session key for encrypting and
   decrypting packets. The consistent connection ID can be used to
   allow migration of the connection to a new server IP address as well,
   since the Connection ID remains consistent across changes in the
   client's and the server's network addresses.

3.7. Version Negotiation

   QUIC version negotiation allows for multiple versions of the protocol
   to be deployed and used concurrently. Version negotiation is
   described in Section 7.2.

4. Versions

   QUIC versions are identified using a 32-bit unsigned number.

   The version 0x00000000 is reserved to represent an invalid version.
   This version of the specification is identified by the number
   0x00000001.

   Version 0x00000001 of QUIC uses TLS as a cryptographic handshake
   protocol, as described in [QUIC-TLS].

   Versions with the most significant 16 bits of the version number
   cleared are reserved for use in future IETF consensus documents.

   Versions that follow the pattern 0x?a?a?a?a are reserved for use in
   forcing version negotiation to be exercised. That is, any version
   number in which the low four bits of every octet are 1010 (in binary)
   is reserved. A client or server MAY advertise support for any of
   these reserved versions.

   Reserved version numbers will probably never represent a real
   protocol; a client MAY use one of these version numbers with the
   expectation that the server will initiate version negotiation; a
   server MAY advertise support for one of these versions and can expect
   that clients ignore the value.

   [[RFC editor: please remove the remainder of this section before
   publication.]]

   The version number for the final version of this specification
   (0x00000001) is reserved for the version of the protocol that is
   published as an RFC.

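   The reserved 0x?a?a?a?a pattern and the range set aside for IETF
   consensus documents, both described earlier in this section, can be
   tested with simple bit operations. The following Python sketch is
   illustrative only; the function and constant names are assumptions
   and do not appear in the draft.

```python
RESERVED_MASK = 0x0F0F0F0F      # selects the low four bits of every octet
RESERVED_PATTERN = 0x0A0A0A0A   # 1010 (binary) in each of those positions


def _check_range(version: int) -> None:
    """QUIC versions are 32-bit unsigned numbers."""
    if not 0 <= version <= 0xFFFFFFFF:
        raise ValueError("version must fit in 32 bits")


def is_reserved_version(version: int) -> bool:
    """True if the version follows the 0x?a?a?a?a pattern."""
    _check_range(version)
    return (version & RESERVED_MASK) == RESERVED_PATTERN


def is_ietf_consensus_range(version: int) -> bool:
    """True if the most significant 16 bits of the version are cleared."""
    _check_range(version)
    return (version >> 16) == 0
```

   For example, is_reserved_version(0x1a2a3a4a) holds because every
   octet ends in the nibble 0xa, while 0x00000001 is not reserved for
   negotiation testing but does fall in the IETF consensus range.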
   Version numbers used to identify IETF drafts are created by adding
   the draft number to 0xff000000. For example, draft-ietf-quic-
   transport-13 would be identified as 0xff00000D.

   Implementors are encouraged to register version numbers of QUIC that
   they are using for private experimentation on the GitHub wiki [4].

5. Packet Types and Formats

   We first describe QUIC's packet types and their formats, since some
   are referenced in subsequent mechanisms.

   All numeric values are encoded in network byte order (that is, big-
   endian) and all field sizes are in bits. When discussing individual
   bits of fields, the least significant bit is referred to as bit 0.
   Hexadecimal notation is used for describing the value of fields.

   Any QUIC packet has either a long or a short header, as indicated by
   the Header Form bit. Long headers are expected to be used early in
   the connection before version negotiation and establishment of 1-RTT
   keys. Short headers are minimal version-specific headers, which are
   used after version negotiation and 1-RTT keys are established.

5.1. Long Header

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+
   |1|   Type (7)  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   +                       Connection ID (64)                      +
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                       Packet Number (32)                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         Version (32)                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          Payload (*)                        ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                     Figure 1: Long Header Format

   Long headers are used for packets that are sent prior to the
   completion of version negotiation and establishment of 1-RTT keys.
   Once both conditions are met, a sender switches to sending packets
   using the short header (Section 5.2). The long form allows for
   special packets - such as the Version Negotiation packet - to be
   represented in this uniform fixed-length packet format. A long
   header contains the following fields:

   Header Form: The most significant bit (0x80) of octet 0 (the first
      octet) is set to 1 for long headers.

   Long Packet Type: The remaining seven bits of octet 0 contain the
      packet type. This field can indicate one of 128 packet types.
      The types specified for this version are listed in Table 1.

   Connection ID: Octets 1 through 8 contain the connection ID.
      Section 5.6 describes the use of this field in more detail.

   Packet Number: Octets 9 to 12 contain the packet number.
      Section 5.7 describes the use of packet numbers.

   Version: Octets 13 to 16 contain the selected protocol version.
      This field indicates which version of QUIC is in use and
      determines how the rest of the protocol fields are interpreted.

   Payload: Octets from 17 onwards (the rest of the QUIC packet) are
      the payload of the packet.

   The following packet types are defined:

            +------+------------------------+---------------+
            | Type | Name                   | Section       |
            +------+------------------------+---------------+
            | 0x01 | Version Negotiation    | Section 5.3   |
            |      |                        |               |
            | 0x02 | Client Initial         | Section 5.4.1 |
            |      |                        |               |
            | 0x03 | Server Stateless Retry | Section 5.4.2 |
            |      |                        |               |
            | 0x04 | Server Cleartext       | Section 5.4.3 |
            |      |                        |               |
            | 0x05 | Client Cleartext       | Section 5.4.4 |
            |      |                        |               |
            | 0x06 | 0-RTT Protected        | Section 5.5   |
            +------+------------------------+---------------+

                   Table 1: Long Header Packet Types

   The header form, packet type, connection ID, packet number and
   version fields of a long header packet are version-independent. The
   types of packets defined in Table 1 are version-specific.
   See
   Section 5.8 for details on how packets from different versions of
   QUIC are interpreted.

   The interpretation of the fields and the payload are specific to a
   version and packet type. Type-specific semantics for this version
   are described in the following sections.

5.2. Short Header

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+
   |0|C|K| Type (5)|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   +                     [Connection ID (64)]                      +
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      Packet Number (8/16/32)                ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     Protected Payload (*)                   ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                     Figure 2: Short Header Format

   The short header can be used after the version and 1-RTT keys are
   negotiated. This header form has the following fields:

   Header Form: The most significant bit (0x80) of octet 0 is set to 0
      for the short header.

   Connection ID Flag: The second bit (0x40) of octet 0 indicates
      whether the Connection ID field is present. If set to 1, then the
      Connection ID field is present; if set to 0, the Connection ID
      field is omitted. The Connection ID field can only be omitted if
      the omit_connection_id transport parameter (Section 7.4.1) is
      specified by the intended recipient of the packet.

   Key Phase Bit: The third bit (0x20) of octet 0 indicates the key
      phase, which allows a recipient of a packet to identify the packet
      protection keys that are used to protect the packet. See
      [QUIC-TLS] for details.

   Short Packet Type: The remaining 5 bits of octet 0 include one of 32
      packet types. Table 2 lists the types that are defined for short
      packets.

   Connection ID: If the Connection ID Flag is set, a connection ID
      occupies octets 1 through 8 of the packet.
      See Section 5.6 for
      more details.

   Packet Number: The length of the packet number field depends on the
      packet type. This field can be 1, 2 or 4 octets long depending on
      the short packet type.

   Protected Payload: Packets with a short header always include a
      1-RTT protected payload.

   The packet type in a short header currently determines only the size
   of the packet number field. Additional types can be used to signal
   the presence of other fields.

                      +------+--------------------+
                      | Type | Packet Number Size |
                      +------+--------------------+
                      | 0x01 | 1 octet            |
                      |      |                    |
                      | 0x02 | 2 octets           |
                      |      |                    |
                      | 0x03 | 4 octets           |
                      +------+--------------------+

                  Table 2: Short Header Packet Types

   The header form, connection ID flag and connection ID of a short
   header packet are version-independent. The remaining fields are
   specific to the selected QUIC version. See Section 5.8 for details
   on how packets from different versions of QUIC are interpreted.

5.3. Version Negotiation Packet

   A Version Negotiation packet has long headers with a type value of
   0x01 and is sent only by servers. The Version Negotiation packet is
   a response to a client packet that contains a version that is not
   supported by the server.

   The packet number, connection ID and version fields echo
   corresponding values from the triggering client packet. This allows
   clients some assurance that the server received the packet and that
   the Version Negotiation packet was not carried in a packet with a
   spoofed source address.

   A Version Negotiation packet is never explicitly acknowledged in an
   ACK frame by a client. Receiving another Client Initial packet
   implicitly acknowledges a Version Negotiation packet.

   The payload of the Version Negotiation packet is a list of 32-bit
   versions which the server supports, as shown below.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Supported Version 1 (32)                 ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                   [Supported Version 2 (32)]                ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                                  ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                   [Supported Version N (32)]                ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                 Figure 3: Version Negotiation Packet

   See Section 7.2 for a description of the version negotiation process.

5.4. Cleartext Packets

   Cleartext packets are sent during the handshake prior to key
   negotiation.

   All cleartext packets contain the current QUIC version in the version
   field.

   In order to prevent tampering by version-unaware middleboxes,
   Cleartext packets are protected with a connection- and version-
   specific key, as described in [QUIC-TLS]. This protection does not
   provide confidentiality or integrity against on-path attackers, but
   provides some level of protection against off-path attackers.

5.4.1. Client Initial Packet

   The Client Initial packet uses long headers with a type value of
   0x02. It carries the first cryptographic handshake message sent by
   the client.

   The client populates the connection ID field with randomly selected
   values, unless it has received a packet from the server. If the
   client has received a packet from the server, the connection ID field
   uses the value provided by the server.

   The first Client Initial packet that is sent by a client carries a
   random 31-bit value as its packet number. All subsequent packets
   contain a packet number that is incremented by one (see Section 5.7).

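   As a rough illustration of the two paragraphs above, the following
   Python sketch chooses the connection ID and packet number for a
   Client Initial packet. The function names are hypothetical, and the
   use of the "secrets" module as the randomness source is an
   assumption; the draft only says the values are randomly selected.

```python
import secrets
from typing import Optional


def choose_connection_id(server_connection_id: Optional[int] = None) -> int:
    """Return the connection ID to place in a Client Initial packet."""
    if server_connection_id is not None:
        # A packet was already received from the server: use its value.
        return server_connection_id
    # Otherwise populate the 64-bit field with randomly selected bits.
    return secrets.randbits(64)


def first_packet_number() -> int:
    """Random 31-bit packet number for the first Client Initial packet."""
    return secrets.randbits(31)


def next_packet_number(previous: int) -> int:
    """Every subsequent packet uses the previous packet number plus one."""
    return previous + 1
```

   Starting from a 31-bit value leaves headroom in the 32-bit long
   header packet number field for the increments that follow.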
632 The payload of a Client Initial packet consists of a STREAM frame (or 633 frames) for stream 0 containing a cryptographic handshake message, 634 with enough PADDING frames that the packet is at least 1200 octets 635 (see Section 9). The stream in this packet always starts at an 636 offset of 0 (see Section 7.5) and the complete cryptographic 637 handshake message MUST fit in a single packet (see Section 7.3). 639 The client uses the Client Initial packet type for any packet that 640 contains an initial cryptographic handshake message. This includes 641 all cases where a new packet containing the initial cryptographic 642 message needs to be created, such as the packets sent after 643 receiving a Version Negotiation (Section 5.3) or Server Stateless 644 Retry packet (Section 5.4.2). 646 5.4.2. Server Stateless Retry Packet 648 A Server Stateless Retry packet uses long headers with a type value 649 of 0x03. It carries cryptographic handshake messages and 650 acknowledgments. It is used by a server that wishes to perform a 651 stateless retry (see Section 7.5). 653 The packet number and connection ID fields echo the corresponding 654 fields from the triggering client packet. This allows a client to 655 verify that the server received its packet. 657 A Server Stateless Retry packet is never explicitly acknowledged in 658 an ACK frame by a client. Receiving another Client Initial packet 659 implicitly acknowledges a Server Stateless Retry packet. 661 After receiving a Server Stateless Retry packet, the client uses a 662 new Client Initial packet containing the next cryptographic handshake 663 message. The client retains the state of its cryptographic 664 handshake, but discards all transport state. The Client Initial 665 packet that is generated in response to a Server Stateless Retry 666 packet includes STREAM frames on stream 0 that start again at an 667 offset of 0.
669 Continuing the cryptographic handshake is necessary to ensure that an 670 attacker cannot force a downgrade of any cryptographic parameters. 671 In addition to continuing the cryptographic handshake, the client 672 MUST remember the results of any version negotiation that occurred 673 (see Section 7.2). The client MAY also retain any observed RTT or 674 congestion state that it has accumulated for the flow, but other 675 transport state MUST be discarded. 677 The payload of the Server Stateless Retry packet contains a single 678 STREAM frame on stream 0 with offset 0 containing the server's 679 cryptographic stateless retry material. It MUST NOT contain any 680 other frames. The next STREAM frame sent by the server will also 681 start at stream offset 0. 683 5.4.3. Server Cleartext Packet 685 A Server Cleartext packet uses long headers with a type value of 686 0x04. It is used to carry acknowledgments and cryptographic 687 handshake messages from the server. 689 The connection ID field in a Server Cleartext packet contains a 690 connection ID that is chosen by the server (see Section 5.6). 692 The first Server Cleartext packet contains a randomized packet 693 number. This value is increased for each subsequent packet sent by 694 the server as described in Section 5.7. 696 The payload of this packet contains STREAM frames and could contain 697 PADDING and ACK frames. 699 5.4.4. Client Cleartext Packet 701 A Client Cleartext packet uses long headers with a type value of 702 0x05, and is sent when the client has received a Server Cleartext 703 packet from the server. 705 The connection ID field in a Client Cleartext packet contains a 706 server-selected connection ID, see Section 5.6. 708 The Client Cleartext packet includes a packet number that is one 709 higher than the last Client Initial, 0-RTT Protected or Client 710 Cleartext packet that was sent. The packet number is incremented for 711 each subsequent packet, see Section 5.7. 
713 The payload of this packet contains STREAM frames and could contain 714 PADDING and ACK frames. 716 5.5. Protected Packets 718 Packets that are protected with 0-RTT keys are sent with long 719 headers; all packets protected with 1-RTT keys are sent with short 720 headers. The different packet types explicitly indicate the 721 encryption level and therefore the keys that are used to remove 722 packet protection. 724 Packets protected with 0-RTT keys use a type value of 0x06. The 725 connection ID field for a 0-RTT packet is selected by the client. 727 The client can send 0-RTT packets after receiving a Server Cleartext 728 packet (Section 5.4.3), if that packet does not complete the 729 handshake. Even if the client receives a different connection ID in 730 the Server Cleartext packet, it MUST continue to use the connection 731 ID selected by the client for 0-RTT packets, see Section 5.6. 733 The version field for protected packets is the current QUIC version. 735 The packet number field contains a packet number, which increases 736 with each packet sent, see Section 5.7 for details. 738 The payload is protected using authenticated encryption. [QUIC-TLS] 739 describes packet protection in detail. After decryption, the 740 plaintext consists of a sequence of frames, as described in 741 Section 6. 743 5.6. Connection ID 745 QUIC connections are identified by their 64-bit Connection ID. All 746 long headers contain a Connection ID. Short headers indicate the 747 presence of a Connection ID using the CONNECTION_ID flag. When 748 present, the Connection ID is in the same location in all packet 749 headers, making it straightforward for middleboxes, such as load 750 balancers, to locate and use it. 752 The client MUST choose a random connection ID and use it in Client 753 Initial packets (Section 5.4.1) and 0-RTT packets (Section 5.5). 
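As a minimal sketch of the client-side choice described above, the following Python fragment draws a 64-bit connection ID from a cryptographically secure source. The function names are illustrative and are not defined by this document, and serializing the field in network (big-endian) byte order is an assumption made here.

```python
import secrets

CONNECTION_ID_BITS = 64  # Section 5.6: connection IDs are 64 bits long


def new_client_connection_id() -> int:
    """Return a random 64-bit connection ID (hypothetical helper)."""
    return secrets.randbits(CONNECTION_ID_BITS)


def connection_id_to_wire(connection_id: int) -> bytes:
    """Serialize a connection ID as 8 octets.

    Assumption: the field is written in network (big-endian) byte order.
    """
    return connection_id.to_bytes(CONNECTION_ID_BITS // 8, "big")
```

The same value would then be placed in the Client Initial packet and in any 0-RTT packets.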
755 When the server receives a Client Initial packet and decides to 756 proceed with the handshake, it chooses a new value for the connection 757 ID and sends that in a Server Cleartext packet (Section 5.4.3). The 758 server MAY choose to use the value that the client initially selects. 760 Once the client receives the connection ID that the server has 761 chosen, it MUST use it for all subsequent Client Cleartext 762 (Section 5.4.4) and 1-RTT (Section 5.5) packets but not for 0-RTT 763 packets (Section 5.5). 765 A server's Version Negotiation (Section 5.3) and Stateless Retry 766 (Section 5.4.2) packets MUST use the connection ID selected by the 767 client. 769 5.7. Packet Numbers 771 The packet number is a 64-bit unsigned number and is used as part of 772 a cryptographic nonce for packet encryption. Each endpoint maintains 773 a separate packet number for sending and receiving. The packet 774 number for sending MUST increase by at least one after sending any 775 packet, unless otherwise specified (see Section 5.7.1). 777 A QUIC endpoint MUST NOT reuse a packet number within the same 778 connection (that is, under the same cryptographic keys). If the 779 packet number for sending reaches 2^64 - 1, the sender MUST close the 780 connection without sending a CONNECTION_CLOSE frame or any further 781 packets; a server MAY send a Stateless Reset (Section 7.8.4) in 782 response to further packets that it receives. 784 To reduce the number of bits required to represent the packet number 785 over the wire, only the least significant bits of the packet number 786 are transmitted. The actual packet number for each packet is 787 reconstructed at the receiver based on the largest packet number 788 received on a successfully authenticated packet. 790 A packet number is decoded by finding the packet number value that is 791 closest to the next expected packet. The next expected packet is the 792 highest received packet number plus one.
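The decoding rule in the preceding paragraph can be sketched in Python as follows. This is an illustration under stated assumptions rather than a normative algorithm: the function name and its parameters are hypothetical, and "bits" is the size of the truncated packet number field on the wire (8, 16 or 32).

```python
def decode_packet_number(largest_received: int, truncated: int, bits: int) -> int:
    """Recover a full packet number from its truncated wire encoding.

    largest_received: highest packet number seen on an authenticated packet.
    truncated: the least significant `bits` bits carried in the packet.
    """
    expected = largest_received + 1
    window = 1 << bits
    half_window = window // 2

    # Start from the candidate that shares the upper bits of `expected`.
    candidate = (expected & ~(window - 1)) | truncated

    # Move one window up or down if that lands closer to `expected`,
    # staying inside the 64-bit packet number space.
    if candidate <= expected - half_window and candidate + window < (1 << 64):
        return candidate + window
    if candidate > expected + half_window and candidate >= window:
        return candidate - window
    return candidate
```

The sketch picks among the candidates that share the transmitted low bits, which is one way to realize "closest to the next expected packet".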
For example, if the highest 793 successfully authenticated packet had a packet number of 0xaa82f30e, 794 then a packet containing a 16-bit value of 0x1f94 will be decoded as 795 0xaa831f94. 797 The sender MUST use a packet number size able to represent a range 798 more than twice as large as the difference between the largest 799 acknowledged packet and the packet number being sent. A peer receiving 800 the packet will then correctly decode the packet number, unless the 801 packet is delayed in transit such that it arrives after many higher- 802 numbered packets have been received. An endpoint SHOULD use a large 803 enough packet number encoding to allow the packet number to be 804 recovered even if the packet arrives after packets that are sent 805 afterwards. 807 As a result, the size of the packet number encoding is at least one 808 more than the base 2 logarithm of the number of contiguous 809 unacknowledged packet numbers, including the new packet. 811 For example, if an endpoint has received an acknowledgment for packet 812 0x6afa2f, sending a packet with a number of 0x6b4264 requires a 813 16-bit or larger packet number encoding; whereas a 32-bit packet 814 number is needed to send a packet with a number of 0x6bc107. 816 Version Negotiation (Section 5.3) and Server Stateless Retry 817 (Section 5.4.2) packets have special rules for populating the packet 818 number field. 820 5.7.1. Initial Packet Number 822 The initial value for the packet number MUST be selected from a uniform 823 random distribution between 0 and 2^31-1. That is, the lower 31 bits 824 of the packet number are randomized. [RFC4086] provides guidance on 825 the generation of random values. 827 The first set of packets sent by an endpoint MUST include the low 828 32 bits of the packet number. Once any packet has been acknowledged, 829 subsequent packets can use a shorter packet number encoding. 831 5.8.
Handling Packets from Different Versions 833 Between different versions the following things are guaranteed to 834 remain constant: 836 o the location of the header form flag, 838 o the location of the Connection ID flag in short headers, 840 o the location and size of the Connection ID field in both header 841 forms, 843 o the location and size of the Version field in long headers, 845 o the location and size of the Packet Number field in long headers, 846 and 848 o the type, format and semantics of the Version Negotiation packet. 850 Implementations MUST assume that an unsupported version uses an 851 unknown packet format. All other fields MUST be ignored when 852 processing a packet that contains an unsupported version. 854 6. Frames and Frame Types 856 The payload of cleartext packets and the plaintext after decryption 857 of protected payloads consists of a sequence of frames, as shown in 858 Figure 4. 860 0 1 2 3 861 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 862 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 863 | Frame 1 (*) ... 864 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 865 | Frame 2 (*) ... 866 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 867 ... 868 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 869 | Frame N (*) ... 870 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 872 Figure 4: Contents of Protected Payload 874 Protected payloads MUST contain at least one frame, and MAY contain 875 multiple frames and multiple frame types. 877 Frames MUST fit within a single QUIC packet and MUST NOT span a QUIC 878 packet boundary. Each frame begins with a Frame Type byte, 879 indicating its type, followed by additional type-dependent fields: 881 0 1 2 3 882 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 883 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 884 | Type (8) | Type-Dependent Fields (*) ... 
885 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 887 Figure 5: Generic Frame Layout 889 Frame types are listed in Table 3. Note that the Frame Type byte in 890 STREAM and ACK frames is used to carry other frame-specific flags. 891 For all other frames, the Frame Type byte simply identifies the 892 frame. These frames are explained in more detail as they are 893 referenced later in the document. 895 +-------------+-------------------+--------------+ 896 | Type Value | Frame Type Name | Definition | 897 +-------------+-------------------+--------------+ 898 | 0x00 | PADDING | Section 8.1 | 899 | | | | 900 | 0x01 | RST_STREAM | Section 8.2 | 901 | | | | 902 | 0x02 | CONNECTION_CLOSE | Section 8.3 | 903 | | | | 904 | 0x03 | APPLICATION_CLOSE | Section 8.4 | 905 | | | | 906 | 0x04 | MAX_DATA | Section 8.5 | 907 | | | | 908 | 0x05 | MAX_STREAM_DATA | Section 8.6 | 909 | | | | 910 | 0x06 | MAX_STREAM_ID | Section 8.7 | 911 | | | | 912 | 0x07 | PING | Section 8.8 | 913 | | | | 914 | 0x08 | BLOCKED | Section 8.9 | 915 | | | | 916 | 0x09 | STREAM_BLOCKED | Section 8.10 | 917 | | | | 918 | 0x0a | STREAM_ID_BLOCKED | Section 8.11 | 919 | | | | 920 | 0x0b | NEW_CONNECTION_ID | Section 8.12 | 921 | | | | 922 | 0x0c | STOP_SENDING | Section 8.13 | 923 | | | | 924 | 0xa0 - 0xbf | ACK | Section 8.14 | 925 | | | | 926 | 0xc0 - 0xff | STREAM | Section 8.15 | 927 +-------------+-------------------+--------------+ 929 Table 3: Frame Types 931 7. Life of a Connection 933 A QUIC connection is a single conversation between two QUIC 934 endpoints. QUIC's connection establishment intertwines version 935 negotiation with the cryptographic and transport handshakes to reduce 936 connection establishment latency, as described in Section 7.3. Once 937 established, a connection may migrate to a different IP or port at 938 either endpoint, due to NAT rebinding or mobility, as described in 939 Section 7.7. 
Finally a connection may be terminated by either 940 endpoint, as described in Section 7.8. 942 7.1. Matching Packets to Connections 944 Incoming packets are classified on receipt. Packets can either be 945 associated with an existing connection, be discarded, or - for 946 servers - potentially create a new connection. 948 Packets that can be associated with an existing connection are 949 handled according to the current state of that connection. Packets 950 are associated with existing connections using connection ID if it is 951 present; this might include connection IDs that were advertised using 952 NEW_CONNECTION_ID (Section 8.12). Packets without connection IDs and 953 long-form packets for connections that have incomplete cryptographic 954 handshakes are associated with an existing connection using the tuple 955 of source and destination IP addresses and ports. 957 A packet that uses the short header could be associated with an 958 existing connection with an incomplete cryptographic handshake. Such 959 a packet could be a valid packet that has been reordered with respect 960 to the long-form packets that will complete the cryptographic 961 handshake. This might happen after the final set of cryptographic 962 handshake messages from either peer. These packets are expected to 963 be correlated with a connection using the tuple of IP addresses and 964 ports. Packets that might be reordered in this fashion SHOULD be 965 buffered in anticipation of the handshake completing. 967 0-RTT packets might be received prior to a Client Initial packet at a 968 server. If the version of these packets is acceptable to the server, 969 it MAY buffer these packets in anticipation of receiving a reordered 970 Client Initial packet. 972 Buffering ensures that data is not lost, which improves performance; 973 conversely, discarding these packets could create false loss signals 974 for the congestion controllers. 
However, limiting the number and 975 size of buffered packets might be needed to prevent exposure to 976 denial of service. 978 For clients, any packet that cannot be associated with an existing 979 connection SHOULD be discarded if it is not buffered. Discarded 980 packets MAY be logged for diagnostic or security purposes. 982 For servers, packets that aren't associated with a connection 983 potentially create a new connection. However, only packets that use 984 the long packet header and that are at least the minimum size defined 985 for the protocol version can be initial packets. A server MAY 986 discard packets with a short header or packets that are smaller than 987 the smallest minimum size for any version that the server supports. 988 A server that discards a packet that cannot be associated with a 989 connection MAY also generate a stateless reset (Section 7.8.4). 991 This version of QUIC defines a minimum size for initial packets of 992 1200 octets (see Section 9). Versions of QUIC that define smaller 993 minimum initial packet sizes need to be aware that initial packets 994 will be discarded without action by servers that only support 995 versions with larger minimums. Clients that support multiple QUIC 996 versions can avoid this problem by ensuring that they increase the 997 size of their initial packets to the largest minimum size across all 998 of the QUIC versions they support. Servers need to recognize initial 999 packets that are the minimum size of all QUIC versions they support. 1001 7.2. Version Negotiation 1003 QUIC's connection establishment begins with version negotiation, 1004 since all communication between the endpoints, including packet and 1005 frame formats, relies on the two endpoints agreeing on a version. 1007 A QUIC connection begins with a client sending a Client Initial 1008 packet (Section 5.4.1). 
The details of the handshake mechanisms are 1009 described in Section 7.3, but all of the initial packets sent from 1010 the client to the server MUST use the long header format - which 1011 includes the version of the protocol being used - and they MUST be 1012 padded to at least 1200 octets. 1014 The server receives this packet and determines whether it potentially 1015 creates a new connection (see Section 7.1). If the packet might 1016 generate a new connection, the server then checks whether it 1017 understands the version that the client has selected. 1019 If the packet contains a version that is acceptable to the server, 1020 the server proceeds with the handshake (Section 7.3). This commits 1021 the server to the version that the client selected. 1023 7.2.1. Sending Version Negotiation Packets 1025 If the version selected by the client is not acceptable to the 1026 server, the server responds with a Version Negotiation packet 1027 (Section 5.3). This includes a list of versions that the server will 1028 accept. 1030 A server sends a Version Negotiation packet for any packet with an 1031 unacceptable version if that packet could create a new connection. 1032 This allows a server to process packets with unsupported versions 1033 without retaining state. Though either the Client Initial packet or 1034 the version negotiation packet that is sent in response could be 1035 lost, the client will send new packets until it successfully receives 1036 a response or it abandons the connection attempt. 1038 7.2.2. Handling Version Negotiation Packets 1040 When the client receives a Version Negotiation packet, it first 1041 checks that the packet number and connection ID match the values the 1042 client sent in a previous packet on the same connection. If this 1043 check fails, the packet MUST be discarded. 
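A minimal Python sketch of this check, together with parsing of the payload layout shown in Figure 3, might look as follows. The "SentPacket" record and both function names are hypothetical, and rejecting an empty or misaligned payload is an assumption of this sketch rather than a rule stated in this document.

```python
import struct
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class SentPacket:
    """Hypothetical record of a packet the client has already sent."""
    packet_number: int
    connection_id: int


def parse_supported_versions(payload: bytes) -> List[int]:
    """Split a Version Negotiation payload into 32-bit version numbers."""
    # Assumption: an empty or misaligned payload is treated as malformed.
    if len(payload) == 0 or len(payload) % 4 != 0:
        raise ValueError("payload is not a list of 32-bit versions")
    return [version for (version,) in struct.iter_unpack("!I", payload)]


def matches_sent_packet(packet_number: int, connection_id: int,
                        sent: Iterable[SentPacket]) -> bool:
    """Return True if the echoed fields match a previously sent packet.

    A Version Negotiation packet that fails this check is discarded.
    """
    return any(p.packet_number == packet_number and
               p.connection_id == connection_id for p in sent)
```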
1045 Once the Version Negotiation packet is determined to be valid, the 1046 client then selects an acceptable protocol version from the list 1047 provided by the server. The client then attempts to create a 1048 connection using that version. Though the contents of the Client 1049 Initial packet the client sends might not change in response to 1050 version negotiation, a client MUST increase the packet number it uses 1051 on every packet it sends. Packets MUST continue to use long headers 1052 and MUST include the new negotiated protocol version. 1054 The client MUST use the long header format and include its selected 1055 version on all packets until it has 1-RTT keys and it has received a 1056 packet from the server which is not a Version Negotiation packet. 1058 A client MUST NOT change the version it uses unless it is in response 1059 to a Version Negotiation packet from the server. Once a client 1060 receives a packet from the server which is not a Version Negotiation 1061 packet, it MUST discard other Version Negotiation packets on the same 1062 connection. Similarly, a client MUST ignore a Version Negotiation 1063 packet if it has already received and acted on a Version Negotiation 1064 packet. 1066 A client MUST ignore a Version Negotiation packet that lists the 1067 client's chosen version. 1069 Version negotiation packets have no cryptographic protection. The 1070 result of the negotiation MUST be revalidated as part of the 1071 cryptographic handshake (see Section 7.4.4). 1073 7.2.3. Using Reserved Versions 1075 For a server to use a new version in the future, clients must 1076 correctly handle unsupported versions. To help ensure this, a server 1077 SHOULD include a reserved version (see Section 4) while generating a 1078 Version Negotiation packet. 1080 The design of version negotiation permits a server to avoid 1081 maintaining state for packets that it rejects in this fashion. 
1082 However, when the server generates a Version Negotiation packet, it 1083 cannot randomly generate a reserved version number. This is because 1084 the server is required to include the same value in its transport 1085 parameters (see Section 7.4.4). To avoid the selected version number 1086 changing during connection establishment, the reserved version SHOULD 1087 be generated as a function of values that will be available to the 1088 server when later generating its handshake packets. 1090 A pseudorandom function that takes client address information (IP and 1091 port) and the client selected version as input would ensure that 1092 there is sufficient variability in the values that a server uses. 1094 A client MAY send a packet using a reserved version number. This can 1095 be used to solicit a list of supported versions from a server. 1097 7.3. Cryptographic and Transport Handshake 1099 QUIC relies on a combined cryptographic and transport handshake to 1100 minimize connection establishment latency. QUIC allocates stream 0 1101 for the cryptographic handshake. Version 0x00000001 of QUIC uses TLS 1102 1.3 as described in [QUIC-TLS]; a different QUIC version number could 1103 indicate that a different cryptographic handshake protocol is in use. 1105 QUIC provides this stream with reliable, ordered delivery of data. 
1106 In return, the cryptographic handshake provides QUIC with: 1108 o authenticated key exchange, where 1110 * a server is always authenticated, 1112 * a client is optionally authenticated, 1114 * every connection produces distinct and unrelated keys, 1116 * keying material is usable for packet protection for both 0-RTT 1117 and 1-RTT packets, and 1119 * 1-RTT keys have forward secrecy 1121 o authenticated values for the transport parameters of the peer (see 1122 Section 7.4) 1124 o authenticated confirmation of version negotiation (see 1125 Section 7.4.4) 1127 o authenticated negotiation of an application protocol (TLS uses 1128 ALPN [RFC7301] for this purpose) 1130 o for the server, the ability to carry data that provides assurance 1131 that the client can receive packets that are addressed with the 1132 transport address that is claimed by the client (see Section 7.6) 1134 The initial cryptographic handshake message MUST be sent in a single 1135 packet. Any second attempt that is triggered by address validation 1136 MUST also be sent within a single packet. This avoids having to 1137 reassemble a message from multiple packets. Reassembling messages 1138 requires that a server maintain state prior to establishing a 1139 connection, exposing the server to a denial of service risk. 1141 The first client packet of the cryptographic handshake protocol MUST 1142 fit within a 1232 octet QUIC packet payload. This includes overheads 1143 that reduce the space available to the cryptographic handshake 1144 protocol. 1146 Details of how TLS is integrated with QUIC are provided in 1147 [QUIC-TLS]. 1149 7.4. Transport Parameters 1151 During connection establishment, both endpoints make authenticated 1152 declarations of their transport parameters. These declarations are 1153 made unilaterally by each endpoint.
Endpoints are required to comply 1154 with the restrictions implied by these parameters; the description of 1155 each parameter includes rules for its handling. 1157 The format of the transport parameters is the TransportParameters 1158 struct from Figure 6. This is described using the presentation 1159 language from Section 3 of [I-D.ietf-tls-tls13]. 1161 uint32 QuicVersion; 1163 enum { 1164 initial_max_stream_data(0), 1165 initial_max_data(1), 1166 initial_max_stream_id(2), 1167 idle_timeout(3), 1168 omit_connection_id(4), 1169 max_packet_size(5), 1170 stateless_reset_token(6), 1171 (65535) 1172 } TransportParameterId; 1174 struct { 1175 TransportParameterId parameter; 1176 opaque value<0..2^16-1>; 1177 } TransportParameter; 1179 struct { 1180 select (Handshake.msg_type) { 1181 case client_hello: 1182 QuicVersion negotiated_version; 1183 QuicVersion initial_version; 1185 case encrypted_extensions: 1186 QuicVersion supported_versions<4..2^8-4>; 1188 case new_session_ticket: 1189 struct {}; 1190 }; 1191 TransportParameter parameters<30..2^16-1>; 1192 } TransportParameters; 1194 Figure 6: Definition of TransportParameters 1196 The "extension_data" field of the quic_transport_parameters extension 1197 defined in [QUIC-TLS] contains a TransportParameters value. TLS 1198 encoding rules are therefore used to encode the transport parameters. 1200 QUIC encodes transport parameters into a sequence of octets, which 1201 are then included in the cryptographic handshake. Once the handshake 1202 completes, the transport parameters declared by the peer are 1203 available. Each endpoint validates the value provided by its peer. 1204 In particular, version negotiation MUST be validated (see 1205 Section 7.4.4) before the connection establishment is considered 1206 properly complete. 1208 Definitions for each of the defined transport parameters are included 1209 in Section 7.4.1. Any given parameter MUST appear at most once in a 1210 given transport parameters extension. 
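As an illustration of the "parameters" vector in Figure 6, the following Python sketch encodes and decodes a list of TransportParameter entries and rejects duplicates. It assumes the usual TLS presentation-language encoding (a 16-bit length prefix for the vector, then a 16-bit identifier and a 16-bit value length per entry). The function and exception names are hypothetical, the version fields that precede the vector are not handled, and the minimum vector length of 30 octets is not enforced.

```python
import struct
from typing import Dict, Iterable, Tuple


class TransportParameterError(Exception):
    """Stands in for a connection error of type TRANSPORT_PARAMETER_ERROR."""


def encode_parameters(params: Iterable[Tuple[int, bytes]]) -> bytes:
    """Encode (id, value) pairs as a length-prefixed vector."""
    body = b"".join(struct.pack("!HH", pid, len(value)) + value
                    for pid, value in params)
    return struct.pack("!H", len(body)) + body


def decode_parameters(data: bytes) -> Dict[int, bytes]:
    """Decode the vector, raising on truncation or duplicate identifiers."""
    if len(data) < 2:
        raise ValueError("missing vector length")
    (total,) = struct.unpack_from("!H", data, 0)
    if total != len(data) - 2:
        raise ValueError("vector length does not match input")
    decoded: Dict[int, bytes] = {}
    offset = 2
    while offset < len(data):
        if len(data) - offset < 4:
            raise ValueError("truncated parameter header")
        pid, length = struct.unpack_from("!HH", data, offset)
        offset += 4
        if len(data) - offset < length:
            raise ValueError("truncated parameter value")
        if pid in decoded:
            # A parameter may appear at most once.
            raise TransportParameterError("duplicate parameter 0x%04x" % pid)
        decoded[pid] = data[offset:offset + length]
        offset += length
    return decoded
```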
An endpoint MUST treat receipt 1211 of duplicate transport parameters as a connection error of type 1212 TRANSPORT_PARAMETER_ERROR. 1214 7.4.1. Transport Parameter Definitions 1216 An endpoint MUST include the following parameters in its encoded 1217 TransportParameters: 1219 initial_max_stream_data (0x0000): The initial stream maximum data 1220 parameter contains the initial value for the maximum data that can 1221 be sent on any newly created stream. This parameter is encoded as 1222 an unsigned 32-bit integer in units of octets. This is equivalent 1223 to an implicit MAX_STREAM_DATA frame (Section 8.6) being sent on 1224 all streams immediately after opening. 1226 initial_max_data (0x0001): The initial maximum data parameter 1227 contains the initial value for the maximum amount of data that can 1228 be sent on the connection. This parameter is encoded as an 1229 unsigned 32-bit integer in units of 1024 octets. That is, the 1230 value here is multiplied by 1024 to determine the actual maximum 1231 value. This is equivalent to sending a MAX_DATA (Section 8.5) for 1232 the connection immediately after completing the handshake. 1234 initial_max_stream_id (0x0002): The initial maximum stream ID 1235 parameter contains the initial maximum stream number the peer may 1236 initiate, encoded as an unsigned 32-bit integer. This is 1237 equivalent to sending a MAX_STREAM_ID (Section 8.7) immediately 1238 after completing the handshake. 1240 idle_timeout (0x0003): The idle timeout is a value in seconds that 1241 is encoded as an unsigned 16-bit integer. The maximum value is 1242 600 seconds (10 minutes). 1244 A server MUST include the following transport parameters: 1246 stateless_reset_token (0x0006): The Stateless Reset Token is used in 1247 verifying a stateless reset, see Section 7.8.4. This parameter is 1248 a sequence of 16 octets. 1250 A client MUST NOT include a stateless reset token. 
A server MUST 1251 treat receipt of a stateless_reset_token transport parameter as a 1252 connection error of type TRANSPORT_PARAMETER_ERROR. 1254 An endpoint MAY use the following transport parameters: 1256 omit_connection_id (0x0004): The omit connection identifier 1257 parameter indicates that packets sent to the endpoint that 1258 advertises this parameter can omit the connection ID. This can be 1259 used by an endpoint that knows that the source and destination IP 1260 address and port are sufficient for it to identify a connection. 1261 This parameter is zero length. Absence of this parameter indicates 1262 that the endpoint relies on the connection ID being present in 1263 every packet. 1265 max_packet_size (0x0005): The maximum packet size parameter places a 1266 limit on the size of packets that the endpoint is willing to 1267 receive, encoded as an unsigned 16-bit integer. This indicates 1268 that packets larger than this limit will be dropped. The default 1269 for this parameter is the maximum permitted UDP payload of 65527. 1270 Values below 1200 are invalid. This limit only applies to 1271 protected packets (Section 5.5). 1273 7.4.2. Values of Transport Parameters for 0-RTT 1275 Transport parameters from the server MUST be remembered by the client 1276 for use with 0-RTT data. If the TLS NewSessionTicket message 1277 includes the quic_transport_parameters extension, then those values 1278 are used for the server values when establishing a new connection 1279 using that ticket. Otherwise, the transport parameters that the 1280 server advertises during connection establishment are used. 1282 A server can remember the transport parameters that it advertised, or 1283 store an integrity-protected copy of the values in the ticket and 1284 recover the information when accepting 0-RTT data. A server uses the 1285 transport parameters in determining whether to accept 0-RTT data.
1287 A server MAY accept 0-RTT and subsequently provide different values 1288 for transport parameters for use in the new connection. If 0-RTT 1289 data is accepted by the server, the server MUST NOT reduce any limits 1290 or alter any values that might be violated by the client with its 1291 0-RTT data. In particular, a server that accepts 0-RTT data MUST NOT 1292 set values for initial_max_data or initial_max_stream_data that are 1293 smaller than the remembered value of those parameters. Similarly, a 1294 server MUST NOT reduce the value of initial_max_stream_id. 1296 A server MUST reject 0-RTT data or even abort a handshake if the 1297 implied values for transport parameters cannot be supported. 1299 7.4.3. New Transport Parameters 1301 New transport parameters can be used to negotiate new protocol 1302 behavior. An endpoint MUST ignore transport parameters that it does 1303 not support. Absence of a transport parameter therefore disables any 1304 optional protocol feature that is negotiated using the parameter. 1306 New transport parameters can be registered according to the rules in 1307 Section 14.1. 1309 7.4.4. Version Negotiation Validation 1311 The transport parameters include three fields that encode version 1312 information. These retroactively authenticate the version 1313 negotiation (see Section 7.2) that is performed prior to the 1314 cryptographic handshake. 1316 The cryptographic handshake provides integrity protection for the 1317 negotiated version as part of the transport parameters (see 1318 Section 7.4). As a result, modification of version negotiation 1319 packets by an attacker can be detected. 1321 The client includes two fields in the transport parameters: 1323 o The negotiated_version is the version that was finally selected 1324 for use. This MUST be identical to the value that is on the 1325 packet that carries the ClientHello. 
A server that receives a 1326 negotiated_version that does not match the version of QUIC that is 1327 in use MUST terminate the connection with a 1328 VERSION_NEGOTIATION_ERROR error code. 1330 o The initial_version is the version that the client initially 1331 attempted to use. If the server did not send a version 1332 negotiation packet (Section 5.3), this will be identical to the 1333 negotiated_version. 1335 A server that processes all packets in a stateful fashion can 1336 remember how version negotiation was performed and validate the 1337 initial_version value. 1339 A server that does not maintain state for every packet it receives 1340 (i.e., a stateless server) uses a different process. If the initial 1341 and negotiated versions are the same, a stateless server can accept 1342 the value. 1344 If the initial_version is different from the negotiated_version, a 1345 stateless server MUST check that it would have sent a version 1346 negotiation packet if it had received a packet with the indicated 1347 initial_version. If a server would have accepted the version 1348 included in the initial_version and the value differs from the value 1349 of negotiated_version, the server MUST terminate the connection with 1350 a VERSION_NEGOTIATION_ERROR error. 1352 The server includes a list of versions that it would send in any 1353 version negotiation packet (Section 5.3) in supported_versions. The 1354 server populates this field even if it did not send a version 1355 negotiation packet. This field is absent if the parameters are 1356 included in a NewSessionTicket message. 1358 The client can validate that the negotiated_version is included in 1359 the supported_versions list and - if version negotiation was 1360 performed - that it would have selected the negotiated version. A 1361 client MUST terminate the connection with a VERSION_NEGOTIATION_ERROR 1362 error code if the negotiated_version value is not included in the 1363 supported_versions list.
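The stateless server check and the first client check described in this section can be sketched in Python as follows. The function names, the "acceptable_versions" set and the exception class are hypothetical; the exception merely stands in for terminating the connection with VERSION_NEGOTIATION_ERROR.

```python
from typing import Collection


class VersionNegotiationError(Exception):
    """Stands in for closing with VERSION_NEGOTIATION_ERROR."""


def server_check_versions(negotiated_version: int, initial_version: int,
                          version_in_use: int,
                          acceptable_versions: Collection[int]) -> None:
    """Stateless server validation of the client's version fields."""
    if negotiated_version != version_in_use:
        raise VersionNegotiationError("negotiated_version mismatch")
    if (initial_version != negotiated_version
            and initial_version in acceptable_versions):
        # The server would have accepted initial_version, so it would never
        # have sent a Version Negotiation packet for it.
        raise VersionNegotiationError("version negotiation was not needed")


def client_check_versions(negotiated_version: int,
                          supported_versions: Collection[int]) -> None:
    """Client check that the negotiated version is one the server lists."""
    if negotiated_version not in supported_versions:
        raise VersionNegotiationError("negotiated_version not supported")
```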
A client MUST terminate the connection with a 1364 VERSION_NEGOTIATION_ERROR error code if version negotiation occurred 1365 but it would have selected a different version based on the value of 1366 the supported_versions list. 1368 When an endpoint accepts multiple QUIC versions, it can potentially 1369 interpret transport parameters as they are defined by any of the QUIC 1370 versions it supports. The version field in the QUIC packet header is 1371 authenticated using transport parameters. The position and the 1372 format of the version fields in transport parameters MUST either be 1373 identical across different QUIC versions, or be unambiguously 1374 different to ensure no confusion about their interpretation. One way 1375 that a new format could be introduced is to define a TLS extension 1376 with a different codepoint. 1378 7.5. Stateless Retries 1380 A server can process an initial cryptographic handshake message from 1381 a client without committing any state. This allows a server to 1382 perform address validation (Section 7.6), or to defer connection 1383 establishment costs. 1385 A server that generates a response to an initial packet without 1386 retaining connection state MUST use the Server Stateless Retry packet 1387 (Section 5.4.2). This packet causes a client to reset its transport 1388 state and to continue the connection attempt with new connection 1389 state while maintaining the state of the cryptographic handshake. 1391 A server MUST NOT send multiple Server Stateless Retry packets in 1392 response to a client handshake packet. Thus, any cryptographic 1393 handshake message that is sent MUST fit within a single packet. 1395 In TLS, the Server Stateless Retry packet type is used to carry the 1396 HelloRetryRequest message. 1398 7.6. Proof of Source Address Ownership 1400 Transport protocols commonly spend a round trip checking that a 1401 client owns the transport address (IP and port) that it claims.
1402 Verifying that a client can receive packets sent to its claimed 1403 transport address protects against spoofing of this information by 1404 malicious clients. 1406 This technique is used primarily to prevent QUIC from being used for 1407 traffic amplification attacks. In such an attack, a packet is sent to 1408 a server with spoofed source address information that identifies a 1409 victim. If a server generates more or larger packets in response to 1410 that packet, the attacker can use the server to send more data toward 1411 the victim than it would be able to send on its own. 1413 Several methods are used in QUIC to mitigate this attack. Firstly, 1414 the initial handshake packet is padded to at least 1200 octets. This 1415 allows a server to send a similar amount of data without risking 1416 an amplification attack toward an unproven remote address. 1418 A server eventually confirms that a client has received its messages 1419 when the cryptographic handshake successfully completes. This might 1420 be insufficient, either because the server wishes to avoid the 1421 computational cost of completing the handshake, or because 1422 the size of the packets that are sent during the handshake is too 1423 large. This is especially important for 0-RTT, where the server 1424 might wish to provide application data traffic - such as a response 1425 to a request - in response to the data carried in the early data from 1426 the client. 1428 To send additional data prior to completing the cryptographic 1429 handshake, the server then needs to validate that the client owns the 1430 address that it claims. 1432 Source address validation is therefore performed during the 1433 establishment of a connection. TLS provides the tools that support 1434 the feature, but basic validation is performed by the core transport 1435 protocol. 1437 7.6.1. Client Address Validation Procedure 1439 QUIC uses token-based address validation.
Any time the server wishes 1440 to validate a client address, it provides the client with a token. 1441 As long as the token cannot be easily guessed (see Section 7.6.3), if 1442 the client is able to return that token, it proves to the server that 1443 it received the token. 1445 During the processing of the cryptographic handshake messages from a 1446 client, TLS will request that QUIC make a decision about whether to 1447 proceed based on the information it has. TLS will provide QUIC with 1448 any token that was provided by the client. For an initial packet, 1449 QUIC can decide to abort the connection, allow it to proceed, or 1450 request address validation. 1452 If QUIC decides to request address validation, it provides the 1453 cryptographic handshake with a token. The contents of this token are 1454 consumed by the server that generates the token, so there is no need 1455 for a single well-defined format. A token could include information 1456 about the claimed client address (IP and port), a timestamp, and any 1457 other supplementary information the server will need to validate the 1458 token in the future. 1460 The cryptographic handshake is responsible for enacting validation by 1461 sending the address validation token to the client. A legitimate 1462 client will include a copy of the token when it attempts to continue 1463 the handshake. The cryptographic handshake extracts the token then 1464 asks QUIC a second time whether the token is acceptable. In 1465 response, QUIC can either abort the connection or permit it to 1466 proceed. 1468 A connection MAY be accepted without address validation - or with 1469 only limited validation - but a server SHOULD limit the data it sends 1470 toward an unvalidated address. Successful completion of the 1471 cryptographic handshake implicitly provides proof that the client has 1472 received packets from the server. 1474 7.6.2. 
Address Validation on Session Resumption 1476 A server MAY provide clients with an address validation token during 1477 one connection that can be used on a subsequent connection. Address 1478 validation is especially important with 0-RTT because a server 1479 potentially sends a significant amount of data to a client in 1480 response to 0-RTT data. 1482 A different type of token is needed when resuming. Unlike the token 1483 that is created during a handshake, there might be some time between 1484 when the token is created and when the token is subsequently used. 1485 Thus, a resumption token SHOULD include an expiration time. It is 1486 also unlikely that the client port number is the same on two 1487 different connections; validating the port is therefore unlikely to 1488 be successful. 1490 This token can be provided to the cryptographic handshake immediately 1491 after establishing a connection. QUIC might also generate an updated 1492 token if significant time passes or the client address changes for 1493 any reason (see Section 7.7). The cryptographic handshake is 1494 responsible for providing the client with the token. In TLS the 1495 token is included in the ticket that is used for resumption and 1496 0-RTT, which is carried in a NewSessionTicket message. 1498 7.6.3. Address Validation Token Integrity 1500 An address validation token MUST be difficult to guess. Including a 1501 large enough random value in the token would be sufficient, but this 1502 depends on the server remembering the value it sends to clients. 1504 A token-based scheme allows the server to offload any state 1505 associated with validation to the client. For this design to work, 1506 the token MUST be covered by integrity protection against 1507 modification or falsification by clients. Without integrity 1508 protection, malicious clients could generate or guess values for 1509 tokens that would be accepted by the server. 
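The integrity protection described above can be sketched as follows. This is a minimal illustration, not a format defined by this document: the token layout (timestamp, then client IP, then tag), the use of HMAC-SHA256, the 60-second lifetime, and the names make_token and validate_token are all assumptions made for the example. The port is left out and an expiration check is included, in line with the guidance for resumption tokens in Section 7.6.2.

```python
import hashlib
import hmac
import struct
import time
from typing import Optional

TAG_LEN = 32  # HMAC-SHA256 output size; the choice of hash is an assumption


def make_token(key: bytes, client_ip: str, now: Optional[float] = None) -> bytes:
    """Return timestamp || client_ip || HMAC(key, timestamp || client_ip)."""
    issued = int(time.time() if now is None else now)
    body = struct.pack("!Q", issued) + client_ip.encode("ascii")
    tag = hmac.new(key, body, hashlib.sha256).digest()
    return body + tag


def validate_token(key: bytes, token: bytes, client_ip: str,
                   max_age: int = 60, now: Optional[float] = None) -> bool:
    """Check integrity first, then freshness, then the claimed address."""
    if len(token) < 8 + TAG_LEN:
        return False
    body, tag = token[:-TAG_LEN], token[-TAG_LEN:]
    expected = hmac.new(key, body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return False  # modified or forged token
    (issued,) = struct.unpack("!Q", body[:8])
    current = int(time.time() if now is None else now)
    if not 0 <= current - issued <= max_age:
        return False  # expired, or issued in the future
    return body[8:] == client_ip.encode("ascii")
```

Because the tag is computed with a key that never leaves the server, the server needs no per-client state to recognize a token it issued.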
Only the server 1510 requires access to the integrity protection key for tokens. 1512 In TLS the address validation token is often bundled with the 1513 information that TLS requires, such as the resumption secret. In 1514 this case, adding integrity protection can be delegated to the 1515 cryptographic handshake protocol, avoiding redundant protection. If 1516 integrity protection is delegated to the cryptographic handshake, an 1517 integrity failure will result in immediate cryptographic handshake 1518 failure. If integrity protection is performed by QUIC, QUIC MUST 1519 abort the connection if the integrity check fails with a 1520 PROTOCOL_VIOLATION error code. 1522 7.7. Connection Migration 1524 QUIC connections are identified by their 64-bit Connection ID. 1525 QUIC's consistent connection ID allows connections to survive changes 1526 to the client's IP and/or port, such as those caused by client or 1527 server migrating to a new network. Connection migration allows a 1528 client to retain any shared state with a connection when they move 1529 networks. This includes state that can be hard to recover such as 1530 outstanding requests, which might otherwise be lost with no easy way 1531 to retry them. 1533 7.7.1. Privacy Implications of Connection Migration 1535 Using a stable connection ID on multiple network paths allows a 1536 passive observer to correlate activity between those paths. A client 1537 that moves between networks might not wish to have their activity 1538 correlated by any entity other than a server. The NEW_CONNECTION_ID 1539 message can be sent by a server to provide an unlinkable connection 1540 ID for use in case the client wishes to explicitly break linkability 1541 between two points of network attachment. 1543 A client might need to send packets on multiple networks without 1544 receiving any response from the server. 
To ensure that the client is 1545 not linkable across each of these changes, a new connection ID and 1546 packet number gap are needed for each network. To support this, a 1547 server sends multiple NEW_CONNECTION_ID messages. Each 1548 NEW_CONNECTION_ID is marked with a sequence number. Connection IDs 1549 MUST be used in the order in which they are numbered. 1551 A client which wishes to break linkability upon changing networks 1552 MUST use the connection ID provided by the server as well as 1553 incrementing the packet sequence number by an externally 1554 unpredictable value computed as described in Section 7.7.1.1. Packet 1555 number gaps are cumulative. A client might skip connection IDs, but 1556 it MUST ensure that it applies the associated packet number gaps for 1557 connection IDs that it skips in addition to the packet number gap 1558 associated with the connection ID that it does use. 1560 A server that receives a packet that is marked with a new connection 1561 ID recovers the packet number by adding the cumulative packet number 1562 gap to its expected packet number. A server SHOULD discard packets 1563 that contain a smaller gap than it advertised. 1565 For instance, a server might provide a packet number gap of 7 1566 associated with a new connection ID. If the server received packet 1567 10 using the previous connection ID, it should expect packets on the 1568 new connection ID to start at 18. A packet with the new connection 1569 ID and a packet number of 17 is discarded as being in error. 1571 7.7.1.1. Packet Number Gap 1573 In order to avoid linkage, the packet number gap MUST be externally 1574 indistinguishable from random. 
The packet number gap for a 1575 connection ID with sequence number is computed by encoding the 1576 sequence number as a 32-bit integer in big-endian format, and then 1577 computing: 1579 Gap = HKDF-Expand-Label(packet_number_secret, 1580 "QUIC packet sequence gap", sequence, 4) 1582 The output of HKDF-Expand-Label is interpreted as a big-endian 1583 number. "packet_number_secret" is derived from the TLS key exchange, 1584 as described in Section 5.6 of [QUIC-TLS]. 1586 7.7.2. Address Validation for Migrated Connections 1588 TODO: see issue #161 1590 7.8. Connection Termination 1592 Connections should remain open until they become idle for a pre- 1593 negotiated period of time. A QUIC connection, once established, can 1594 be terminated in one of three ways: 1596 o idle timeout (Section 7.8.2) 1598 o immediate close (Section 7.8.3) 1600 o stateless reset (Section 7.8.4) 1602 7.8.1. Draining Period 1604 After a connection is closed for any reason, an endpoint might 1605 receive packets from its peer. These packets might have been sent 1606 prior to receiving any close signal, or they might be retransmissions 1607 of packets for which acknowledgments were lost. 1609 The draining period persists for three times the current 1610 Retransmission Timeout (RTO) interval as defined in [QUIC-RECOVERY]. 1611 During this period, new packets can be acknowledged, but no new 1612 application data can be sent on the connection. 1614 Different treatment is given to packets that are received while a 1615 connection is in the draining period depending on how the connection 1616 was closed. 1618 An endpoint that is in a draining period MUST NOT send packets unless 1619 they contain a CONNECTION_CLOSE or APPLICATION_CLOSE frame. 1621 Once the draining period has ended, an endpoint SHOULD discard per- 1622 connection state. This results in new packets on the connection 1623 being discarded. An endpoint MAY send a stateless reset in response 1624 to any further incoming packets. 
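The draining period rules above can be sketched as a small timer object. This is an illustration under stated assumptions: the class name DrainingState, the representation of frames as strings, and the use of seconds as the time unit are hypothetical and not part of this document.

```python
from dataclasses import dataclass
from typing import Iterable

CLOSE_FRAMES = {"CONNECTION_CLOSE", "APPLICATION_CLOSE"}


@dataclass
class DrainingState:
    closed_at: float  # time at which the connection was closed, in seconds
    rto: float        # current retransmission timeout, in seconds

    @property
    def deadline(self) -> float:
        # The draining period lasts three times the current RTO.
        return self.closed_at + 3 * self.rto

    def is_draining(self, now: float) -> bool:
        return now < self.deadline

    def may_send(self, frame_types: Iterable[str], now: float) -> bool:
        # While draining, only packets that carry a close frame may be sent.
        # After the deadline, per-connection state is expected to be gone.
        return self.is_draining(now) and any(
            f in CLOSE_FRAMES for f in frame_types)
```

Once is_draining returns False, an implementation following this sketch would discard the object along with the rest of the connection state.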
1626 The draining period does not apply when a stateless reset 1627 (Section 7.8.4) is sent. 1629 7.8.2. Idle Timeout 1631 A connection that remains idle for longer than the idle timeout (see 1632 Section 7.4.1) becomes closed. Either peer removes connection state 1633 if they have neither sent nor received a packet for this time. 1635 The time at which an idle timeout takes effect won't be perfectly 1636 synchronized on peers. A connection enters the draining period when 1637 the idle timeout expires. During this time, an endpoint that 1638 receives new packets MAY choose to restore the connection. 1639 Alternatively, an endpoint that receives packets MAY signal the 1640 timeout using an immediate close. 1642 7.8.3. Immediate Close 1644 An endpoint sends a CONNECTION_CLOSE or APPLICATION_CLOSE frame to 1645 terminate the connection immediately. Either frame causes all open 1646 streams to immediately become closed; open streams can be assumed to 1647 be implicitly reset. After sending or receiving a CONNECTION_CLOSE 1648 frame, endpoints immediately enter a draining period. 1650 During the draining period, an endpoint that sends a CONNECTION_CLOSE 1651 or APPLICATION_CLOSE frame SHOULD respond to any subsequent packet 1652 that it receives with another packet containing either close frame. 1653 To reduce the state that an endpoint maintains in this case, it MAY 1654 send the exact same packet. However, endpoints SHOULD limit the 1655 number of packets they generate containing either close frame. For 1656 instance, an endpoint could progressively increase the number of 1657 packets that it receives before sending additional packets. 1659 Note: Allowing retransmission of a packet contradicts other advice 1660 in this document that recommends the creation of new packet 1661 numbers for every packet. Sending new packet numbers is primarily 1662 of advantage to loss recovery and congestion control, which are 1663 not expected to be relevant for a closed connection. 
1664 Retransmitting the final packet requires less state. 1666 An immediate close can be used after an application protocol has 1667 arranged to close a connection. This might be after the application 1668 protocol negotiates a graceful shutdown. The application protocol 1669 exchanges whatever messages are needed to cause both endpoints 1670 to agree to close the connection, after which the application 1671 requests that the connection be closed. The application protocol can 1672 use an APPLICATION_CLOSE message with an appropriate error code to 1673 signal closure. 1675 7.8.4. Stateless Reset 1677 A stateless reset is provided as an option of last resort for a 1678 server that does not have access to the state of a connection. A 1679 server crash or outage might result in clients continuing to send 1680 data to a server that is unable to properly continue the connection. 1681 A server that wishes to communicate a fatal connection error MUST use 1682 a CONNECTION_CLOSE or APPLICATION_CLOSE frame if it has sufficient 1683 state to do so. 1685 To support this process, the server sends a stateless_reset_token 1686 value during the handshake in the transport parameters. This value 1687 is protected by encryption, so only client and server know this 1688 value. 1690 A server that receives packets that it cannot process sends a packet 1691 in the following layout: 1693 0 1 2 3 1694 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1695 +-+-+-+-+-+-+-+-+ 1696 |0|C|K| 00001 | 1697 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1698 | | 1699 + [Connection ID (64)] + 1700 | | 1701 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1702 | Packet Number (8/16/32) | 1703 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1704 | Random Octets (*) ...
1705 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1706 | | 1707 + + 1708 | | 1709 + Stateless Reset Token (128) + 1710 | | 1711 + + 1712 | | 1713 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1715 A server copies the connection ID field from the packet that triggers 1716 the stateless reset. A server omits the connection ID if explicitly 1717 configured to do so, or if the client packet did not include a 1718 connection ID. 1720 The Packet Number field is set to a randomized value. The server 1721 SHOULD send a packet with a short header and a type of 0x01. This 1722 produces the shortest possible packet number encoding, which 1723 minimizes the perceived gap between the last packet that the server 1724 sent and this packet. A server MAY use a different short header 1725 type, indicating a different packet number length, but a longer 1726 packet number encoding might allow this message to be identified as a 1727 stateless reset more easily using heuristics. 1729 After the first short header octet and optional connection ID, the 1730 server includes the value of the Stateless Reset Token that it 1731 included in its transport parameters. 1733 After the Packet Number, the server pads the message with an 1734 arbitrary number of octets containing random values. 1736 Finally, the last 16 octets of the packet are set to the value of the 1737 Stateless Reset Token. 1739 This design ensures that a stateless reset packet is - to the extent 1740 possible - indistinguishable from a regular packet. 1742 A stateless reset is not appropriate for signaling error conditions. 1743 An endpoint that wishes to communicate a fatal connection error MUST 1744 use a CONNECTION_CLOSE or APPLICATION_CLOSE frame if it has 1745 sufficient state to do so. 1747 7.8.4.1. Detecting a Stateless Reset 1749 A client detects a potential stateless reset when a packet with a 1750 short header either cannot be decrypted or is marked as a duplicate 1751 packet. 
The client then compares the last 16 octets of the packet 1752 with the Stateless Reset Token provided by the server in its 1753 transport parameters. If these values are identical, the client MUST 1754 enter the draining period and not send any further packets on this 1755 connection. If the comparison fails, the packet can be discarded. 1757 7.8.4.2. Calculating a Stateless Reset Token 1759 The stateless reset token MUST be difficult to guess. In order to 1760 create a Stateless Reset Token, a server could randomly generate 1761 [RFC4086] a secret for every connection that it creates. However, 1762 this presents a coordination problem when there are multiple servers 1763 in a cluster or a storage problem for a server that might lose state. 1764 Stateless reset specifically exists to handle the case where state is 1765 lost, so this approach is suboptimal. 1767 A single static key can be used across all connections to the same 1768 endpoint by generating the proof using a second iteration of a 1769 preimage-resistant function that takes three inputs: the static key, 1770 the connection ID for the connection (see Section 5.6), and an 1771 identifier for the server instance. A server could use HMAC 1772 [RFC2104] (for example, HMAC(static_key, server_id || connection_id)) 1773 or HKDF [RFC5869] (for example, using the static key as input keying 1774 material, with server and connection identifiers as salt). The 1775 output of this function is truncated to 16 octets to produce the 1776 Stateless Reset Token for that connection. 1778 A server that loses state can use the same method to generate a valid 1779 Stateless Reset Token. The connection ID comes from the packet that 1780 the server receives. 1782 This design relies on the client always sending a connection ID in 1783 its packets so that the server can use the connection ID from a 1784 packet to reset the connection.
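As a minimal sketch of the HMAC-based construction and of the client-side comparison from Section 7.8.4.1, the following assumes HMAC-SHA256 as the preimage-resistant function and an 8-octet big-endian encoding of the connection ID. Both choices, and the function names, are assumptions for illustration only.

```python
import hashlib
import hmac

TOKEN_LEN = 16  # a Stateless Reset Token is 128 bits


def stateless_reset_token(static_key: bytes, server_id: bytes,
                          connection_id: int) -> bytes:
    """HMAC(static_key, server_id || connection_id), truncated to 16 octets."""
    cid = connection_id.to_bytes(8, "big")  # 64-bit connection ID
    mac = hmac.new(static_key, server_id + cid, hashlib.sha256).digest()
    return mac[:TOKEN_LEN]


def is_stateless_reset(packet: bytes, expected_token: bytes) -> bool:
    """Client check: compare the last 16 octets with the stored token."""
    if len(packet) < TOKEN_LEN:
        return False
    return hmac.compare_digest(packet[-TOKEN_LEN:], expected_token)
```

A server that has lost all connection state can call stateless_reset_token again with the connection ID taken from the incoming packet and obtain the same value it advertised during the handshake.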
A server that uses this design 1785 cannot allow clients to omit a connection ID (that is, it cannot use 1786 the truncate_connection_id transport parameter; see Section 7.4.1). 1788 Revealing the Stateless Reset Token allows any entity to terminate 1789 the connection, so a value can only be used once. This method for 1790 choosing the Stateless Reset Token means that the combination of 1791 server instance, connection ID, and static key cannot occur for 1792 another connection. A connection ID from a connection that is reset 1793 by revealing the Stateless Reset Token cannot be reused for new 1794 connections at the same server without first changing to use a 1795 different static key or server identifier. 1797 Note that Stateless Reset messages do not have any cryptographic 1798 protection. 1800 8. Frame Types and Formats 1802 As described in Section 6, Regular packets contain one or more 1803 frames. We now describe the various QUIC frame types that can be 1804 present in a Regular packet. The use of these frames and of various 1805 frame header bits is described in subsequent sections. 1807 8.1. PADDING Frame 1809 The PADDING frame (type=0x00) has no semantic value. PADDING frames 1810 can be used to increase the size of a packet. Padding can be used to 1811 increase an initial client packet to the minimum required size, or to 1812 provide protection against traffic analysis for protected packets. 1814 A PADDING frame has no content. That is, a PADDING frame consists of 1815 the single octet that identifies the frame as a PADDING frame. 1817 8.2. RST_STREAM Frame 1819 An endpoint may use a RST_STREAM frame (type=0x01) to abruptly 1820 terminate a stream. 1822 After sending a RST_STREAM, an endpoint ceases transmission and 1823 retransmission of STREAM frames on the identified stream. A receiver 1824 of RST_STREAM can discard any data that it already received on that 1825 stream.
1827 The RST_STREAM frame is as follows: 1829 0 1 2 3 1830 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1831 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1832 | Stream ID (32) | 1833 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1834 | Application Error Code (16) | 1835 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1836 | | 1837 + Final Offset (64) + 1838 | | 1839 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1841 The fields are: 1843 Stream ID: The 32-bit Stream ID of the stream being terminated. 1845 Application Protocol Error Code: A 16-bit application protocol error 1846 code (see Section 12.4) which indicates why the stream is being 1847 closed. 1849 Final Offset: A 64-bit unsigned integer indicating the absolute byte 1850 offset of the end of data written on this stream by the RST_STREAM 1851 sender. 1853 8.3. CONNECTION_CLOSE frame 1855 An endpoint sends a CONNECTION_CLOSE frame (type=0x02) to notify its 1856 peer that the connection is being closed. CONNECTION_CLOSE is used 1857 to signal errors at the QUIC layer, or the absence of errors (with 1858 the NO_ERROR code). 1860 If there are open streams that haven't been explicitly closed, they 1861 are implicitly closed when the connection is closed. 1863 The CONNECTION_CLOSE frame is as follows: 1865 0 1 2 3 1866 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1867 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1868 | Error Code (16) | Reason Phrase Length (16) | 1869 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1870 | Reason Phrase (*) ... 1871 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1873 The fields of a CONNECTION_CLOSE frame are as follows: 1875 Error Code: A 16-bit error code which indicates the reason for 1876 closing this connection. 
CONNECTION_CLOSE uses codes from the 1877 space defined in Section 12.3 (APPLICATION_CLOSE uses codes from 1878 the application protocol error code space, see Section 12.4). 1880 Reason Phrase Length: A 16-bit unsigned number specifying the length 1881 of the reason phrase in bytes. Note that a CONNECTION_CLOSE frame 1882 cannot be split between packets, so in practice any limits on 1883 packet size will also limit the space available for a reason 1884 phrase. 1886 Reason Phrase: A human-readable explanation for why the connection 1887 was closed. This can be zero length if the sender chooses to not 1888 give details beyond the Error Code. This SHOULD be a UTF-8 1889 encoded string [RFC3629]. 1891 8.4. APPLICATION_CLOSE frame 1893 An APPLICATION_CLOSE frame (type=0x03) uses the same format as the 1894 CONNECTION_CLOSE frame (Section 8.3), except that it uses error codes 1895 from the application protocol error code space (Section 12.4) instead 1896 of the transport error code space. 1898 Other than the error code space, the format and semantics of the 1899 APPLICATION_CLOSE frame are identical to the CONNECTION_CLOSE frame. 1901 8.5. MAX_DATA Frame 1903 The MAX_DATA frame (type=0x04) is used in flow control to inform the 1904 peer of the maximum amount of data that can be sent on the connection 1905 as a whole. 1907 The frame is as follows: 1909 0 1 2 3 1910 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1911 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1912 | | 1913 + Maximum Data (64) + 1914 | | 1915 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1917 The fields in the MAX_DATA frame are as follows: 1919 Maximum Data: A 64-bit unsigned integer indicating the maximum 1920 amount of data that can be sent on the entire connection, in units 1921 of 1024 octets. That is, the updated connection-level data limit 1922 is determined by multiplying the encoded value by 1024. 
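The scaling by 1024 can be illustrated with a short encoder and decoder. This sketch assumes that the frame is preceded by its one-octet type (0x04) and that the Maximum Data field is in network byte order; the function names are hypothetical.

```python
import struct

MAX_DATA_TYPE = 0x04
UNIT = 1024  # Maximum Data is expressed in units of 1024 octets


def encode_max_data(limit_octets: int) -> bytes:
    """Encode a connection-level limit, rounding down to a whole unit.

    Rounding down ensures the advertised limit never exceeds the
    limit the receiver actually intends to grant.
    """
    return struct.pack("!BQ", MAX_DATA_TYPE, limit_octets // UNIT)


def decode_max_data(frame: bytes) -> int:
    """Return the connection-level limit in octets."""
    frame_type, units = struct.unpack("!BQ", frame)
    if frame_type != MAX_DATA_TYPE:
        raise ValueError("not a MAX_DATA frame")
    return units * UNIT
```

For example, a limit of 1048576 octets is carried as the value 1024, and a requested limit of 1500 octets is advertised as 1024 octets after rounding down.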
1924 All data sent in STREAM frames counts toward this limit, with the 1925 exception of data on stream 0. The sum of the largest received 1926 offsets on all streams - including closed streams, but excluding 1927 stream 0 - MUST NOT exceed the value advertised by a receiver. An 1928 endpoint MUST terminate a connection with a 1929 QUIC_FLOW_CONTROL_RECEIVED_TOO_MUCH_DATA error if it receives more 1930 data than the maximum data value that it has sent, unless this is a 1931 result of a change in the initial limits (see Section 7.4.2). 1933 8.6. MAX_STREAM_DATA Frame 1935 The MAX_STREAM_DATA frame (type=0x05) is used in flow control to 1936 inform a peer of the maximum amount of data that can be sent on a 1937 stream. 1939 The frame is as follows: 1941 0 1 2 3 1942 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1943 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1944 | Stream ID (32) | 1945 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1946 | | 1947 + Maximum Stream Data (64) + 1948 | | 1949 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1951 The fields in the MAX_STREAM_DATA frame are as follows: 1953 Stream ID: The stream ID of the stream that is affected. 1955 Maximum Stream Data: A 64-bit unsigned integer indicating the 1956 maximum amount of data that can be sent on the identified stream, 1957 in units of octets. 1959 When counting data toward this limit, an endpoint accounts for the 1960 largest received offset of data that is sent or received on the 1961 stream. Loss or reordering can mean that the largest received offset 1962 on a stream can be greater than the total size of data received on 1963 that stream. Receiving STREAM frames might not increase the largest 1964 received offset. 1966 The data sent on a stream MUST NOT exceed the largest maximum stream 1967 data value advertised by the receiver. 
An endpoint MUST terminate a 1968 connection with a FLOW_CONTROL_ERROR error if it receives more data 1969 than the largest maximum stream data that it has sent for the 1970 affected stream, unless this is a result of a change in the initial 1971 limits (see Section 7.4.2). 1973 8.7. MAX_STREAM_ID Frame 1975 The MAX_STREAM_ID frame (type=0x06) informs the peer of the maximum 1976 stream ID that they are permitted to open. 1978 The frame is as follows: 1980 0 1 2 3 1981 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1982 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1983 | Maximum Stream ID (32) | 1984 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1986 The fields in the MAX_STREAM_ID frame are as follows: 1988 Maximum Stream ID: ID of the maximum peer-initiated stream ID for 1989 the connection. 1991 Loss or reordering can mean that a MAX_STREAM_ID frame can be 1992 received which states a lower stream limit than the client has 1993 previously received. MAX_STREAM_ID frames which do not increase the 1994 maximum stream ID MUST be ignored. 1996 A peer MUST NOT initiate a stream with a higher stream ID than the 1997 greatest maximum stream ID it has received. An endpoint MUST 1998 terminate a connection with a STREAM_ID_ERROR error if a peer 1999 initiates a stream with a higher stream ID than it has sent, unless 2000 this is a result of a change in the initial limits (see 2001 Section 7.4.2). 2003 8.8. PING frame 2005 Endpoints can use PING frames (type=0x07) to verify that their peers 2006 are still alive or to check reachability to the peer. The PING frame 2007 contains no additional fields. The receiver of a PING frame simply 2008 needs to acknowledge the packet containing this frame. 2010 A PING frame has no additional fields. 2012 The PING frame can be used to keep a connection alive when an 2013 application or application protocol wishes to prevent the connection 2014 from timing out. 
An application protocol SHOULD provide guidance 2015 about the conditions under which generating a PING is recommended. 2016 This guidance SHOULD indicate whether it is the client or the server 2017 that is expected to send the PING. Having both endpoints send PING 2018 frames without coordination can produce an excessive number of 2019 packets and poor performance. 2021 A connection will time out if no packets are sent or received for a 2022 period longer than the time specified in the idle_timeout transport 2023 parameter (see Section 7.8). However, state in middleboxes might 2024 time out earlier than that. Though REQ-5 in [RFC4787] recommends a 2 2025 minute timeout interval, experience shows that sending packets every 2026 15 to 30 seconds is necessary to prevent the majority of middleboxes 2027 from losing state for UDP flows. 2029 8.9. BLOCKED Frame 2031 A sender sends a BLOCKED frame (type=0x08) when it wishes to send 2032 data, but is unable to due to connection-level flow control (see 2033 Section 11.2.1). BLOCKED frames can be used as input to tuning of 2034 flow control algorithms (see Section 11.1.2). 2036 The BLOCKED frame does not contain a payload. 2038 8.10. STREAM_BLOCKED Frame 2040 A sender sends a STREAM_BLOCKED frame (type=0x09) when it wishes to 2041 send data, but is unable to due to stream-level flow control. This 2042 frame is analogous to BLOCKED (Section 8.9). 2044 The STREAM_BLOCKED frame is as follows: 2046 0 1 2 3 2047 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2048 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2049 | Stream ID (32) | 2050 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2052 The STREAM_BLOCKED frame contains a single field: 2054 Stream ID: A 32-bit unsigned number indicating the stream which is 2055 flow control blocked. 2057 8.11. 
STREAM_ID_BLOCKED Frame 2059 A sender MAY send a STREAM_ID_BLOCKED frame (type=0x0a) when it 2060 wishes to open a stream, but is unable to do so because of the maximum stream 2061 ID limit set by its peer (see Section 8.7). This does not open the 2062 stream, but informs the peer that a new stream was needed and that the 2063 stream limit prevented its creation. 2065 The STREAM_ID_BLOCKED frame does not contain a payload. 2067 8.12. NEW_CONNECTION_ID Frame 2069 A server sends a NEW_CONNECTION_ID frame (type=0x0b) to provide the 2070 client with alternative connection IDs that can be used to break 2071 linkability when migrating connections (see Section 7.7.1). 2073 The NEW_CONNECTION_ID frame is as follows: 2075 0 1 2 3 2076 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2077 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2078 | Sequence (16) | 2079 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2080 | | 2081 + Connection ID (64) + 2082 | | 2083 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2084 | | 2085 + + 2086 | | 2087 + Stateless Reset Token (128) + 2088 | | 2089 + + 2090 | | 2091 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2093 The fields are: 2095 Sequence: A 16-bit sequence number. This value starts at 0 and 2096 increases by 1 for each connection ID that is provided by the 2097 server. The sequence value can wrap; the value 65535 is followed 2098 by 0. When wrapping the sequence field, the server MUST ensure 2099 that a value with the same sequence has been received and 2100 acknowledged by the client. The connection ID that is assigned 2101 during the handshake is assumed to have a sequence of 65535. 2103 Connection ID: A 64-bit connection ID. 2105 Stateless Reset Token: A 128-bit value that will be used for a 2106 stateless reset when the associated connection ID is used (see 2107 Section 7.8.4). 2109 8.13.
STOP_SENDING Frame 2111 An endpoint may use a STOP_SENDING frame (type=0x0c) to communicate 2112 that incoming data is being discarded on receipt at application 2113 request. This signals a peer to abruptly terminate transmission on a 2114 stream. 2116 The STOP_SENDING frame is as follows: 2118 0 1 2 3 2119 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2120 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2121 | Stream ID (32) | 2122 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2123 | Application Error Code (16) | 2124 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2126 The fields are: 2128 Stream ID: The 32-bit Stream ID of the stream being ignored. 2130 Application Error Code: A 16-bit, application-specified reason the 2131 sender is ignoring the stream (see Section 12.4). 2133 8.14. ACK Frame 2135 Receivers send ACK frames to inform senders which packets they have 2136 received and processed, as well as which packets are considered 2137 missing. The ACK frame contains between 1 and 256 ACK blocks. ACK 2138 blocks are ranges of acknowledged packets. Implementations MUST NOT 2139 generate packets that only contain ACK frames in response to packets 2140 which only contain ACK frames. However, they SHOULD acknowledge 2141 packets containing only ACK frames when sending ACK frames in 2142 response to other packets. 2144 To limit ACK blocks to those that have not yet been received by the 2145 sender, the receiver SHOULD track which ACK frames have been 2146 acknowledged by its peer. Once an ACK frame has been acknowledged, 2147 the packets it acknowledges SHOULD NOT be acknowledged again. 2149 A receiver that is only sending ACK frames will not receive 2150 acknowledgments for its packets. Sending an occasional MAX_DATA or 2151 MAX_STREAM_DATA frame as data is received will ensure that 2152 acknowledgements are generated by a peer. Otherwise, an endpoint MAY 2153 send a PING frame once per RTT to solicit an acknowledgment. 
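As an illustration of the ACK-of-ACK tracking described above, the following minimal Python sketch records which received packet numbers each outgoing ACK frame covered, and stops reporting them once the packet that carried that ACK frame has itself been acknowledged. The names AckTracker, build_ack, and on_carrier_acknowledged are hypothetical and are not defined by this document; encoding of ACK blocks is omitted.

```python
class AckTracker:
    """Sketch: stop re-acknowledging packets once our ACK frame was acknowledged.

    All names here are illustrative assumptions, not part of the draft.
    """

    def __init__(self):
        # Received packet numbers that still need to be reported to the peer.
        self.pending = set()
        # Our packet number -> packet numbers covered by the ACK frame it carried.
        self.in_flight = {}

    def on_packet_received(self, packet_number):
        self.pending.add(packet_number)

    def build_ack(self, carrier_packet_number):
        """Return the packet numbers to acknowledge, largest first."""
        acked = sorted(self.pending, reverse=True)
        self.in_flight[carrier_packet_number] = set(acked)
        return acked

    def on_carrier_acknowledged(self, carrier_packet_number):
        """The peer acknowledged one of our packets that carried an ACK frame."""
        covered = self.in_flight.pop(carrier_packet_number, set())
        # These packets SHOULD NOT be acknowledged again.
        self.pending -= covered
```

In this sketch a packet number keeps appearing in ACK frames until some ACK frame that covered it is known to have arrived, which matches the SHOULD-level guidance without relying on any single ACK frame being delivered.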
2155 To limit receiver state or the size of ACK frames, a receiver MAY 2156 limit the number of ACK blocks it sends. A receiver can do this even 2157 without receiving acknowledgment of its ACK frames, with the 2158 knowledge that this could cause the sender to unnecessarily retransmit 2159 some data. When this is necessary, the receiver SHOULD acknowledge 2160 newly received packets and stop acknowledging packets received in the 2161 past. 2163 Unlike TCP SACKs, QUIC ACK blocks are irrevocable. Once a packet has 2164 been acknowledged, even if it does not appear in a future ACK frame, 2165 it remains acknowledged. 2167 A client MUST NOT acknowledge Version Negotiation or Server Stateless 2168 Retry packets. These packet types contain packet numbers selected by 2169 the client, not the server. 2171 A sender MAY intentionally skip packet numbers to introduce entropy 2172 into the connection, to avoid opportunistic acknowledgement attacks. 2173 The sender SHOULD close the connection if an unsent packet number is 2174 acknowledged. The format of the ACK frame is efficient at expressing 2175 blocks of missing packets; skipping packet numbers between 1 and 255 2176 effectively provides up to 8 bits of efficient entropy on demand, 2177 which should be adequate protection against most opportunistic 2178 acknowledgement attacks. 2180 The type byte for an ACK frame contains embedded flags, and is 2181 formatted as "101NLLMM". These bits are parsed as follows: 2183 o The first three bits must be set to 101, indicating that this is an 2184 ACK frame. 2186 o The "N" bit indicates whether the frame contains a Num Blocks 2187 field. 2189 o The two "LL" bits encode the length of the Largest Acknowledged 2190 field. The values 00, 01, 02, and 03 indicate lengths of 8, 16, 2191 32, and 64 bits respectively. 2193 o The two "MM" bits encode the length of the ACK Block Length 2194 fields. The values 00, 01, 02, and 03 indicate lengths of 8, 16, 2195 32, and 64 bits respectively.
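The bit layout listed above can be decoded with a few shifts and masks. The following Python sketch is illustrative only; the function name parse_ack_type and the AckTypeFlags tuple are assumptions made for this example and do not appear in the draft.

```python
from typing import NamedTuple

# Field widths in bits, indexed by the two-bit LL or MM value (0 through 3).
_LENGTH_BITS = (8, 16, 32, 64)


class AckTypeFlags(NamedTuple):
    has_num_blocks: bool         # "N" bit
    largest_acked_bits: int      # decoded from "LL"
    ack_block_length_bits: int   # decoded from "MM"


def parse_ack_type(type_byte: int) -> AckTypeFlags:
    """Decode an ACK frame type byte formatted as 101NLLMM."""
    if not 0 <= type_byte <= 0xFF:
        raise ValueError("type byte must fit in 8 bits")
    if type_byte >> 5 != 0b101:
        raise ValueError("not an ACK frame type byte")
    n = (type_byte >> 4) & 0b1
    ll = (type_byte >> 2) & 0b11
    mm = type_byte & 0b11
    return AckTypeFlags(bool(n), _LENGTH_BITS[ll], _LENGTH_BITS[mm])
```

For example, the byte 0xB6 (binary 10110110) has N=1, LL=01, and MM=10, so a Num Blocks field is present, Largest Acknowledged is 16 bits, and each ACK Block Length is 32 bits.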
2197 An ACK frame is shown below. 2199 0 1 2 3 2200 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2201 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2202 |[Num Blocks(8)]| 2203 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2204 | Largest Acknowledged (8/16/32/64) ... 2205 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2206 | ACK Delay (16) | 2207 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2208 | ACK Block Section (*) ... 2209 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2211 Figure 7: ACK Frame Format 2213 The fields in the ACK frame are as follows: 2215 Num Blocks (opt): An optional 8-bit unsigned value specifying the 2216 number of additional ACK blocks (besides the required First ACK 2217 Block) in this ACK frame. Only present if the 'N' flag bit is 1. 2219 Largest Acknowledged: A variable-sized unsigned value representing 2220 the largest packet number the peer is acknowledging in this packet 2221 (typically the largest that the peer has seen thus far.) 2223 ACK Delay: The time from when the largest acknowledged packet, as 2224 indicated in the Largest Acknowledged field, was received by this 2225 peer to when this ACK was sent. 2227 ACK Block Section: Contains one or more blocks of packet numbers 2228 which have been successfully received, see Section 8.14.1. 2230 8.14.1. ACK Block Section 2232 The ACK Block Section contains between one and 256 blocks of packet 2233 numbers which have been successfully received. If the Num Blocks 2234 field is absent, only the First ACK Block length is present in this 2235 section. Otherwise, the Num Blocks field indicates how many 2236 additional blocks follow the First ACK Block Length field. 2238 0 1 2 3 2239 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2240 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2241 | First ACK Block Length (8/16/32/64) ... 
2242 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2243 | [Gap 1 (8)] | [ACK Block 1 Length (8/16/32/64)] ... 2244 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2245 | [Gap 2 (8)] | [ACK Block 2 Length (8/16/32/64)] ... 2246 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2247 ... 2248 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2249 | [Gap N (8)] | [ACK Block N Length (8/16/32/64)] ... 2250 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2252 Figure 8: ACK Block Section 2254 The fields in the ACK Block Section are: 2256 First ACK Block Length: An unsigned packet number delta that 2257 indicates the number of contiguous additional packets being 2258 acknowledged starting at the Largest Acknowledged. 2260 Gap To Next Block (opt, repeated): An unsigned number specifying the 2261 number of contiguous missing packets from the end of the previous 2262 ACK block to the start of the next. Repeated "Num Blocks" times. 2264 ACK Block Length (opt, repeated): An unsigned packet number delta 2265 that indicates the number of contiguous packets being acknowledged 2266 starting after the end of the previous gap. Repeated "Num Blocks" 2267 times. 2269 8.14.1.1. Time Format 2271 DISCUSS_AND_REPLACE: Perhaps make this format simpler. 2273 The time format used in the ACK frame above is a 16-bit unsigned 2274 float with 11 explicit bits of mantissa and 5 bits of explicit 2275 exponent, specifying time in microseconds. The bit format is loosely 2276 modeled after IEEE 754. For example, 1 microsecond is represented as 2277 0x1, which has an exponent of zero, presented in the 5 high order 2278 bits, and mantissa of 1, presented in the 11 low order bits. When 2279 the explicit exponent is greater than zero, an implicit high-order 2280 12th bit of 1 is assumed in the mantissa. 
For example, a floating 2281 value of 0x800 has an explicit exponent of 1, as well as an explicit 2282 mantissa of 0, but then has an effective mantissa of 4096 (12th bit 2283 is assumed to be 1). Additionally, the actual exponent is one-less 2284 than the explicit exponent, and the value represents 4096 2285 microseconds. Any values larger than the representable range are 2286 clamped to 0xFFFF. 2288 8.14.2. ACK Frames and Packet Protection 2290 ACK frames that acknowledge protected packets MUST be carried in a 2291 packet that has an equivalent or greater level of packet protection. 2293 Packets that are protected with 1-RTT keys MUST be acknowledged in 2294 packets that are also protected with 1-RTT keys. 2296 A packet that is not protected and claims to acknowledge a packet 2297 number that was sent with packet protection is not valid. An 2298 unprotected packet that carries acknowledgments for protected packets 2299 MUST be discarded in its entirety. 2301 Packets that a client sends with 0-RTT packet protection MUST be 2302 acknowledged by the server in packets protected by 1-RTT keys. This 2303 can mean that the client is unable to use these acknowledgments if 2304 the server cryptographic handshake messages are delayed or lost. 2305 Note that the same limitation applies to other data sent by the 2306 server protected by the 1-RTT keys. 2308 Unprotected packets, such as those that carry the initial 2309 cryptographic handshake messages, MAY be acknowledged in unprotected 2310 packets. Unprotected packets are vulnerable to falsification or 2311 modification. Unprotected packets can be acknowledged along with 2312 protected packets in a protected packet. 2314 An endpoint SHOULD acknowledge packets containing cryptographic 2315 handshake messages in the next unprotected packet that it sends, 2316 unless it is able to acknowledge those packets in later packets 2317 protected by 1-RTT keys. 
At the completion of the cryptographic 2318 handshake, both peers send unprotected packets containing 2319 cryptographic handshake messages followed by packets protected by 2320 1-RTT keys. An endpoint SHOULD acknowledge the unprotected packets 2321 that complete the cryptographic handshake in a protected packet, 2322 because its peer is guaranteed to have access to 1-RTT packet 2323 protection keys. 2325 For instance, a server acknowledges a TLS ClientHello in the packet 2326 that carries the TLS ServerHello; similarly, a client can acknowledge 2327 a TLS HelloRetryRequest in the packet containing a second TLS 2328 ClientHello. The complete set of server handshake messages (TLS 2329 ServerHello through to Finished) might be acknowledged by a client in 2330 protected packets, because it is certain that the server is able to 2331 decipher the packet. 2333 8.15. STREAM Frame 2335 STREAM frames implicitly create a stream and carry stream data. The 2336 type byte for a STREAM frame contains embedded flags, and is 2337 formatted as "11FSSOOD". These bits are parsed as follows: 2339 o The first two bits must be set to 11, indicating that this is a 2340 STREAM frame. 2342 o "F" is the FIN bit, which is used for stream termination. 2344 o The "SS" bits encode the length of the Stream ID header field. 2345 The values 00, 01, 02, and 03 indicate lengths of 8, 16, 24, and 2346 32 bits long respectively. 2348 o The "OO" bits encode the length of the Offset header field. The 2349 values 00, 01, 02, and 03 indicate lengths of 0, 16, 32, and 64 2350 bits long respectively. 2352 o The "D" bit indicates whether a Data Length field is present in 2353 the STREAM header. When set to 0, this field indicates that the 2354 Stream Data field extends to the end of the packet. When set to 2355 1, this field indicates that Data Length field contains the length 2356 (in bytes) of the Stream Data field. 
The option to omit the 2357 length should only be used when the packet is a "full-sized" 2358 packet, to avoid the risk of corruption via padding. 2360 A STREAM frame is shown below. 2362 0 1 2 3 2363 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2364 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2365 | Stream ID (8/16/24/32) ... 2366 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2367 | Offset (0/16/32/64) ... 2368 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2369 | [Data Length (16)] | Stream Data (*) ... 2370 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2372 Figure 9: STREAM Frame Format 2374 The STREAM frame contains the following fields: 2376 Stream ID: The stream ID of the stream (see Section 10.1). 2378 Offset: A variable-sized unsigned number specifying the byte offset 2379 in the stream for the data in this STREAM frame. When the offset 2380 length is 0, the offset is 0. The first byte in the stream has an 2381 offset of 0. The largest offset delivered on a stream - the sum 2382 of the re-constructed offset and data length - MUST be less than 2383 2^64. 2385 Data Length: An optional 16-bit unsigned number specifying the 2386 length of the Stream Data field in this STREAM frame. This field 2387 is present when the "D" bit is set to 1. 2389 Stream Data: The bytes from the designated stream to be delivered. 2391 A stream frame's Stream Data MUST NOT be empty, unless the FIN bit is 2392 set. When the FIN flag is sent on an empty STREAM frame, the offset 2393 in the STREAM frame is the offset of the next byte that would be 2394 sent. 2396 Stream multiplexing is achieved by interleaving STREAM frames from 2397 multiple streams into one or more QUIC packets. A single QUIC packet 2398 can include multiple STREAM frames from one or more streams. 2400 Implementation note: One of the benefits of QUIC is avoidance of 2401 head-of-line blocking across multiple streams. 
When a packet loss 2402 occurs, only streams with data in that packet are blocked waiting for 2403 a retransmission to be received, while other streams can continue 2404 making progress. Note that when data from multiple streams is 2405 bundled into a single QUIC packet, loss of that packet blocks all 2406 those streams from making progress. An implementation is therefore 2407 advised to bundle as few streams as necessary in outgoing packets 2408 without losing transmission efficiency to underfilled packets. 2410 9. Packetization and Reliability 2412 The Path Maximum Transmission Unit (PMTU) is the maximum size of the 2413 entire IP header, UDP header, and UDP payload. The UDP payload 2414 includes the QUIC packet header, protected payload, and any 2415 authentication fields. 2417 All QUIC packets SHOULD be sized to fit within the estimated PMTU to 2418 avoid IP fragmentation or packet drops. To optimize bandwidth 2419 efficiency, endpoints SHOULD use Packetization Layer PMTU Discovery 2420 ([PLPMTUD]) and MAY use PMTU Discovery ([PMTUDv4], [PMTUDv6]) for 2421 detecting the PMTU, setting the PMTU appropriately, and storing the 2422 result of previous PMTU determinations. 2424 In the absence of these mechanisms, QUIC endpoints SHOULD NOT send IP 2425 packets larger than 1280 octets. Assuming the minimum IP header 2426 size, this results in a QUIC packet size of 1232 octets for IPv6 and 2427 1252 octets for IPv4. 2429 QUIC endpoints that implement any kind of PMTU discovery SHOULD 2430 maintain an estimate for each combination of local and remote IP 2431 addresses (as each pairing could have a different maximum MTU in the 2432 path). 2434 QUIC depends on the network path supporting a MTU of at least 1280 2435 octets. This is the IPv6 minimum and therefore also supported by 2436 most modern IPv4 networks. An endpoint MUST NOT reduce their MTU 2437 below this number, even if it receives signals that indicate a 2438 smaller limit might exist. 
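The 1232 and 1252 octet figures given above follow from subtracting the minimum IP header and the UDP header from the 1280 octet IP packet size. The short Python sketch below reproduces that arithmetic; the constant and function names are chosen for this example only.

```python
# Header sizes in octets. The IPv4 value assumes a header without options,
# and the IPv6 value assumes no extension headers.
IPV4_MIN_HEADER = 20
IPV6_HEADER = 40
UDP_HEADER = 8

# Largest IP packet to send when no PMTU discovery mechanism is in use.
DEFAULT_MAX_IP_PACKET = 1280


def max_quic_packet_size(ip_version: int,
                         ip_packet_size: int = DEFAULT_MAX_IP_PACKET) -> int:
    """Return the largest QUIC packet (UDP payload) that fits in the IP packet."""
    if ip_version == 4:
        ip_header = IPV4_MIN_HEADER
    elif ip_version == 6:
        ip_header = IPV6_HEADER
    else:
        raise ValueError("ip_version must be 4 or 6")
    return ip_packet_size - ip_header - UDP_HEADER
```

With the default of 1280 octets, this yields 1252 octets for IPv4 and 1232 octets for IPv6.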
2440 Clients MUST ensure that the first packet in a connection, and any 2441 retransmissions of those octets, has a QUIC packet size of at least 1200 2442 octets. The packet size for a QUIC packet includes the QUIC header 2443 and integrity check, but not the UDP or IP header. 2445 The initial client packet SHOULD be padded to exactly 1200 octets 2446 unless the client has reasonable assurance that the PMTU is larger. 2447 Sending a packet of this size ensures that the network path supports 2448 an MTU of this size and helps reduce the amplitude of amplification 2449 attacks caused by server responses toward an unverified client 2450 address. 2452 Servers MUST ignore an initial plaintext packet from a client if its 2453 total size is less than 1200 octets. 2455 If a QUIC endpoint determines that the PMTU between any pair of local 2456 and remote IP addresses has fallen below 1280 octets, it MUST 2457 immediately cease sending QUIC packets on the affected path. This 2458 could result in termination of the connection if an alternative path 2459 cannot be found. 2461 A sender bundles one or more frames in a Regular QUIC packet (see 2462 Section 6). 2464 A sender SHOULD minimize per-packet bandwidth and computational costs 2465 by bundling as many frames as possible within a QUIC packet. A 2466 sender MAY wait for a short period of time to bundle multiple frames 2467 before sending a packet that is not maximally packed, to avoid 2468 sending out large numbers of small packets. An implementation may 2469 use heuristics about expected application sending behavior to 2470 determine whether and for how long to wait. This waiting period is 2471 an implementation decision, and an implementation should be careful 2472 to delay conservatively, since any delay is likely to increase 2473 application-visible latency. 2475 Regular QUIC packets are "containers" of frames; a packet is never 2476 retransmitted whole.
How an endpoint handles the loss of the frame 2477 depends on the type of the frame. Some frames are simply 2478 retransmitted, some have their contents moved to new frames, and 2479 others are never retransmitted. 2481 When a packet is detected as lost, the sender re-sends any frames as 2482 necessary: 2484 o All application data sent in STREAM frames MUST be retransmitted, 2485 unless the endpoint has sent a RST_STREAM for that stream. When 2486 an endpoint sends a RST_STREAM frame, data outstanding on that 2487 stream SHOULD NOT be retransmitted, since subsequent data on this 2488 stream is expected to not be delivered by the receiver. 2490 o ACK and PADDING frames MUST NOT be retransmitted. ACK frames 2491 containing updated information will be sent as described in 2492 Section 8.14. 2494 o STOP_SENDING frames MUST be retransmitted, unless the stream has 2495 become closed in the appropriate direction. See Section 10.3. 2497 o The most recent MAX_STREAM_DATA frame for a stream MUST be 2498 retransmitted. Any previous unacknowledged MAX_STREAM_DATA frame 2499 for the same stream SHOULD NOT be retransmitted since a newer 2500 MAX_STREAM_DATA frame for a stream obviates the need for 2501 delivering older ones. Similarly, the most recent MAX_DATA frame 2502 MUST be retransmitted; previous unacknowledged ones SHOULD NOT be 2503 retransmitted. 2505 o All other frames MUST be retransmitted. 2507 Upon detecting losses, a sender MUST take appropriate congestion 2508 control action. The details of loss detection and congestion control 2509 are described in [QUIC-RECOVERY]. 2511 A packet MUST NOT be acknowledged until packet protection has been 2512 successfully removed and all frames contained in the packet have been 2513 processed. For STREAM frames, this means the data has been queued 2514 (but not necessarily delivered to the application). This also means 2515 that any stream state transitions triggered by STREAM or RST_STREAM 2516 frames have occurred. 
Once the packet has been fully processed, a 2517 receiver acknowledges receipt by sending one or more ACK frames 2518 containing the packet number of the received packet. 2520 To avoid creating an indefinite feedback loop, an endpoint MUST NOT 2521 send an ACK frame in response to a packet containing only ACK or 2522 PADDING frames, even if there are packet gaps which precede the 2523 received packet. The endpoint MUST acknowledge packets containing 2524 only ACK or PADDING frames in the next ACK frame that it sends. 2526 Strategies and implications of the frequency of generating 2527 acknowledgments are discussed in more detail in [QUIC-RECOVERY]. 2529 9.1. Special Considerations for PMTU Discovery 2531 Traditional ICMP-based path MTU discovery in IPv4 [RFC1191] is 2532 potentially vulnerable to off-path attacks that successfully guess 2533 the IP/port 4-tuple and reduce the MTU to a bandwidth-inefficient 2534 value. TCP connections mitigate this risk by using the (at minimum) 2535 8 bytes of transport header echoed in the ICMP message to validate 2536 the TCP sequence number as valid for the current connection. 2537 However, as QUIC operates over UDP, in IPv4 the echoed information 2538 could consist only of the IP and UDP headers, which usually has 2539 insufficient entropy to mitigate off-path attacks. 2541 As a result, endpoints that implement PMTUD in IPv4 SHOULD take steps 2542 to mitigate this risk. For instance, an application could: 2544 o Set the IPv4 Don't Fragment (DF) bit on a small proportion of 2545 packets, so that most invalid ICMP messages arrive when there are 2546 no DF packets outstanding, and can therefore be identified as 2547 spurious. 2549 o Store additional information from the IP or UDP headers from DF 2550 packets (for example, the IP ID or UDP checksum) to further 2551 authenticate incoming Datagram Too Big messages. 
2553 o Any reduction in PMTU due to a report contained in an ICMP packet 2554 is provisional until QUIC's loss detection algorithm determines 2555 that the packet is actually lost. 2557 10. Streams: QUIC's Data Structuring Abstraction 2559 Streams in QUIC provide a lightweight, ordered, and bidirectional 2560 byte-stream abstraction modeled closely on HTTP/2 streams [RFC7540]. 2562 Streams can be created either by the client or the server, can 2563 concurrently send data interleaved with other streams, and can be 2564 cancelled. 2566 Data that is received on a stream is delivered in order within that 2567 stream, but there is no particular delivery order across streams. 2568 Transmit ordering among streams is left to the implementation. 2570 The creation and destruction of streams are expected to have minimal 2571 bandwidth and computational cost. A single STREAM frame may create, 2572 carry data for, and terminate a stream, or a stream may last the 2573 entire duration of a connection. 2575 Streams are individually flow controlled, allowing an endpoint to 2576 limit memory commitment and to apply back pressure. The creation of 2577 streams is also flow controlled, with each peer declaring the maximum 2578 stream ID it is willing to accept at a given time. 2580 An alternative view of QUIC streams is as an elastic "message" 2581 abstraction, similar to the way ephemeral streams are used in SST 2582 [SST], which may be a more appealing description for some 2583 applications. 2585 10.1. Stream Identifiers 2587 Streams are identified by an unsigned 32-bit integer, referred to as 2588 the Stream ID. To avoid Stream ID collision, clients MUST initiate 2589 streams using odd-numbered Stream IDs; servers MUST initiate streams 2590 using even-numbered Stream IDs. 
If an endpoint receives a frame 2591 which corresponds to a stream which is allocated to it (i.e., odd- 2592 numbered for the client or even-numbered for the server) but which it 2593 has not yet created, it MUST close the connection with error code 2594 STREAM_STATE_ERROR. 2596 Stream ID 0 (0x0) is reserved for the cryptographic handshake. 2597 Stream 0 MUST NOT be used for application data, and is the first 2598 client-initiated stream. 2600 A QUIC endpoint MUST NOT reuse a Stream ID. Streams MUST be created 2601 in sequential order. Open streams can be used in any order. Streams 2602 that are used out of order result in lower-numbered streams in the 2603 same direction being counted as open. 2605 Stream IDs are usually encoded as a 32-bit integer, though the STREAM 2606 frame (Section 8.15) permits a shorter encoding when the leading bits 2607 of the stream ID are zero. 2609 10.2. Life of a Stream 2611 The semantics of QUIC streams is based on HTTP/2 streams, and the 2612 lifecycle of a QUIC stream therefore closely follows that of an 2613 HTTP/2 stream [RFC7540], with some differences to accommodate the 2614 possibility of out-of-order delivery due to the use of multiple 2615 streams in QUIC. The lifecycle of a QUIC stream is shown in the 2616 following figure and described below. 2618 +--------+ 2619 | | 2620 | idle | 2621 | | 2622 +--------+ 2623 | 2624 send/recv STREAM/RST 2625 recv MSD/SB 2626 | 2627 v 2628 recv FIN/ +--------+ send FIN/ 2629 recv RST | | send RST 2630 ,---------| open |-----------. 
2631 / | | \ 2632 v +--------+ v 2633 +----------+ +----------+ 2634 | half | | half | 2635 | closed | | closed | 2636 | (remote) | | (local) | 2637 +----------+ +----------+ 2638 | | 2639 | send FIN/ +--------+ recv FIN/ | 2640 \ send RST | | recv RST / 2641 `----------->| closed |<-------------' 2642 | | 2643 +--------+ 2645 send: endpoint sends this frame 2646 recv: endpoint receives this frame 2648 STREAM: a STREAM frame 2649 FIN: FIN flag in a STREAM frame 2650 RST: RST_STREAM frame 2651 MSD: MAX_STREAM_DATA frame 2652 SB: STREAM_BLOCKED frame 2654 Figure 10: Lifecycle of a stream 2656 Note that this diagram shows stream state transitions and the frames 2657 and flags that affect those transitions only. It is possible for a 2658 single frame to cause two transitions: receiving a RST_STREAM frame, 2659 or a STREAM frame with the FIN flag cause the stream state to move 2660 from "idle" to "open" and then immediately to one of the "half- 2661 closed" states. 2663 The recipient of a frame that changes stream state will have a 2664 delayed view of the state of a stream while the frame is in transit. 2665 Endpoints do not coordinate the creation of streams; they are created 2666 unilaterally by either endpoint. Endpoints can use acknowledgments 2667 to understand the peer's subjective view of stream state at any given 2668 time. 2670 In the absence of more specific guidance elsewhere in this document, 2671 implementations SHOULD treat the receipt of a frame that is not 2672 expressly permitted in the description of a state as a connection 2673 error (see Section 12). 2675 10.2.1. idle 2677 All streams start in the "idle" state. 2679 The following transitions are valid from this state: 2681 Sending or receiving a STREAM or RST_STREAM frame causes the 2682 identified stream to become "open". The stream identifier for a new 2683 stream is selected as described in Section 10.1. 
A RST_STREAM frame, 2684 or a STREAM frame with the FIN flag set, also causes a stream to 2685 become "half-closed". 2687 An endpoint might receive MAX_STREAM_DATA or STREAM_BLOCKED frames on 2688 peer-initiated streams that are "idle" if there is loss or reordering 2689 of packets. Receiving these frames also causes the stream to become 2690 "open". 2692 An endpoint MUST NOT send a STREAM or RST_STREAM frame for a stream 2693 ID that is higher than the peer's advertised maximum stream ID (see 2694 Section 8.7). 2696 10.2.2. open 2698 A stream in the "open" state may be used by both peers to send frames 2699 of any type. In this state, an endpoint can send MAX_STREAM_DATA and 2700 MUST observe the value advertised by its receiving peer (see 2701 Section 11). 2703 Opening a stream causes all lower-numbered streams in the same 2704 direction to become open. Thus, opening an odd-numbered stream 2705 causes all "idle", odd-numbered streams with a lower identifier to 2706 become open, and the same applies to even-numbered streams. Endpoints 2707 open streams in increasing numeric order, but loss or reordering can 2708 cause packets that open streams to arrive out of order. 2710 From the "open" state, either endpoint can send a frame with the FIN 2711 flag set, which causes the stream to transition into one of the 2712 "half-closed" states. This flag can be set on the frame that opens 2713 the stream, which causes the stream to immediately become "half- 2714 closed". Once an endpoint has completed sending all stream data and 2715 a STREAM frame with a FIN flag, the stream state becomes "half-closed 2716 (local)". When an endpoint receives all stream data and a FIN flag, 2717 the stream state becomes "half-closed (remote)". An endpoint MUST 2718 NOT consider the stream state to have changed until all data has been 2719 sent or received. 2721 A RST_STREAM frame on an "open" stream also causes the stream to 2722 become "half-closed".
A stream that becomes "open" as a result of 2723 sending or receiving RST_STREAM immediately becomes "half-closed". 2724 Sending a RST_STREAM frame causes the stream to become "half-closed 2725 (local)"; receiving RST_STREAM causes the stream to become "half- 2726 closed (remote)". 2728 Any frame type that mentions a stream ID can be sent in this state. 2730 10.2.3. half-closed (local) 2732 A stream that is in the "half-closed (local)" state MUST NOT be used 2733 for sending on new STREAM frames. Retransmission of data that has 2734 already been sent on STREAM frames is permitted. An endpoint MAY 2735 also send MAX_STREAM_DATA and STOP_SENDING in this state. 2737 An application can decide to abandon a stream in this state. An 2738 endpoint can send RST_STREAM for a stream that was closed with the 2739 FIN flag. The final offset carried in this RST_STREAM frame MUST be 2740 the same as the previously established final offset. 2742 An endpoint that closes a stream MUST NOT send data beyond the final 2743 offset that it has chosen, see Section 10.2.5 for details. 2745 A stream transitions from this state to "closed" when a STREAM frame 2746 that contains a FIN flag is received and all prior data has arrived, 2747 or when a RST_STREAM frame is received. 2749 An endpoint can receive any frame that mentions a stream ID in this 2750 state. Providing flow-control credit using MAX_STREAM_DATA frames is 2751 necessary to continue receiving flow-controlled frames. In this 2752 state, a receiver MAY ignore MAX_STREAM_DATA frames for this stream, 2753 which might arrive for a short period after a frame bearing the FIN 2754 flag is sent. 2756 10.2.4. half-closed (remote) 2758 A stream is "half-closed (remote)" when the stream is no longer being 2759 used by the peer to send any data. An endpoint will have either 2760 received all data that a peer has sent or will have received a 2761 RST_STREAM frame and discarded any received data. 
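The transitions shown in Figure 10 can be summarized as a small state machine. The following Python sketch is a simplified illustration under stated assumptions: the names StreamState and next_state are hypothetical, a "FIN" event is assumed to be reported only once all stream data has been sent or received, and events that Figure 10 does not show leave the state unchanged here rather than raising a connection error.

```python
from enum import Enum


class StreamState(Enum):
    IDLE = "idle"
    OPEN = "open"
    HALF_CLOSED_LOCAL = "half-closed (local)"
    HALF_CLOSED_REMOTE = "half-closed (remote)"
    CLOSED = "closed"


# Frame labels follow the legend of Figure 10.
OPENS_WHEN_SENT_OR_RECEIVED = {"STREAM", "FIN", "RST"}
OPENS_WHEN_RECEIVED = {"MSD", "SB"}
CLOSES_ONE_DIRECTION = {"FIN", "RST"}


def next_state(state: StreamState, direction: str, frame: str) -> StreamState:
    """Apply one send/recv event to a stream state, following Figure 10."""
    if direction not in ("send", "recv"):
        raise ValueError("direction must be 'send' or 'recv'")

    # idle -> open; a FIN or RST then continues to a half-closed state below.
    if state is StreamState.IDLE:
        if frame in OPENS_WHEN_SENT_OR_RECEIVED or (
            direction == "recv" and frame in OPENS_WHEN_RECEIVED
        ):
            state = StreamState.OPEN

    if frame in CLOSES_ONE_DIRECTION:
        if state is StreamState.OPEN:
            if direction == "send":
                return StreamState.HALF_CLOSED_LOCAL
            return StreamState.HALF_CLOSED_REMOTE
        if state is StreamState.HALF_CLOSED_LOCAL and direction == "recv":
            return StreamState.CLOSED
        if state is StreamState.HALF_CLOSED_REMOTE and direction == "send":
            return StreamState.CLOSED

    return state
```

For instance, receiving a STREAM frame with FIN on an "idle" stream moves it through "open" directly to "half-closed (remote)", which is the double transition noted in the text following Figure 10.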
2763 Once all data has been either received or discarded, a sender is no 2764 longer obligated to update the maximum received data for the 2765 connection. 2767 Due to reordering, an endpoint could continue receiving frames for 2768 the stream even after the stream is closed for sending. Frames 2769 received after a peer closes a stream SHOULD be discarded. An 2770 endpoint MAY choose to limit the period over which it ignores frames 2771 and treat frames that arrive after this time as being in error. 2773 An endpoint may receive a RST_STREAM in this state, such as when the 2774 peer resets the stream after sending a FIN on it. In this case, the 2775 endpoint MAY discard any data that it already received on that 2776 stream. The endpoint SHOULD close the connection with a 2777 FINAL_OFFSET_ERROR if the received RST_STREAM carries a different 2778 offset from the one already established. 2780 An endpoint will know the final offset of the data it receives on a 2781 stream when it reaches the "half-closed (remote)" state, see 2782 Section 11.3 for details. 2784 A stream in this state can be used by the endpoint to send any frame 2785 that mentions a stream ID. In this state, the endpoint MUST observe 2786 advertised stream and connection data limits (see Section 11). 2788 A stream transitions from this state to "closed" by completing 2789 transmission of all data. This includes sending all data carried in 2790 STREAM frames including the terminal STREAM frame that contains a FIN 2791 flag. 2793 A stream also becomes "closed" when the endpoint sends a RST_STREAM 2794 frame. 2796 10.2.5. closed 2798 The "closed" state is the terminal state for a stream. Reordering 2799 might cause frames to be received after closing, see Section 10.2.4. 2801 If the application resets a stream that is already in the "closed" 2802 state, a RST_STREAM frame MAY still be sent in order to cancel 2803 retransmissions of previously-sent STREAM frames. 2805 10.3. 
Solicited State Transitions 2807 If an endpoint is no longer interested in the data it is receiving on 2808 a stream, it MAY send a STOP_SENDING frame identifying that stream to 2809 prompt closure of the stream in the opposite direction. This 2810 typically indicates that the receiving application is no longer 2811 reading data it receives from the stream, but is not a guarantee that 2812 incoming data will be ignored. 2814 STREAM frames received after sending STOP_SENDING are still counted 2815 toward the connection and stream flow-control windows, even though 2816 these frames will be discarded upon receipt. This avoids potential 2817 ambiguity about which STREAM frames count toward flow control. 2819 STOP_SENDING can be sent only for a stream that is not "idle"; 2820 however, it is mostly useful for streams in the "open" or "half-closed 2821 (local)" states. A STOP_SENDING frame requests that the receiving 2822 endpoint send a RST_STREAM frame. An endpoint that receives a 2823 STOP_SENDING frame MUST send a RST_STREAM frame for that stream with 2824 an error code of STOPPING. If the STOP_SENDING frame is received on 2825 a stream that is already in the "half-closed (local)" or "closed" 2826 state, a RST_STREAM frame MAY still be sent in order to cancel 2827 retransmission of previously-sent STREAM frames. 2829 While STOP_SENDING frames are retransmittable, an implementation MAY 2830 choose not to retransmit a lost STOP_SENDING frame if the stream has 2831 already been closed in the appropriate direction since the frame was 2832 first generated. See Section 9. 2834 10.4. Stream Concurrency 2836 An endpoint limits the number of concurrently active incoming streams 2837 by adjusting the maximum stream ID. An initial value is set in the 2838 transport parameters (see Section 7.4.1) and is subsequently 2839 increased by MAX_STREAM_ID frames (see Section 8.7).
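One way an endpoint might track the limit its peer has advertised is sketched below. This is a minimal sketch under stated assumptions, not an interface defined by this document: the class PeerStreamLimit and its method names are hypothetical. It relies only on the rules in this section that the limit starts at the value from the transport parameters, that it is raised by MAX_STREAM_ID frames, and that a frame that does not increase the maximum is ignored.

```python
class PeerStreamLimit:
    """Tracks the maximum stream ID the peer lets this endpoint initiate.

    Hypothetical helper; the draft does not define this interface.
    """

    def __init__(self, initial_max_stream_id):
        # Initial value taken from the peer's transport parameters.
        self.max_stream_id = initial_max_stream_id

    def on_max_stream_id(self, advertised):
        """Apply a MAX_STREAM_ID frame; return True if the limit grew."""
        if advertised > self.max_stream_id:
            self.max_stream_id = advertised
            return True
        # Reordered or duplicate frame: it does not raise the limit,
        # so it is ignored.
        return False

    def may_send_on(self, stream_id):
        """STREAM and RST_STREAM frames must not exceed the peer's limit."""
        return stream_id <= self.max_stream_id
```

For instance, after PeerStreamLimit(10) processes on_max_stream_id(20) and then a late on_max_stream_id(14), the limit stays at 20.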
2841 The maximum stream ID is specific to each endpoint and applies only 2842 to the peer that receives the setting. That is, clients specify the 2843 maximum stream ID the server can initiate, and servers specify the 2844 maximum stream ID the client can initiate. Each endpoint may respond 2845 on streams initiated by the other peer, regardless of whether it is 2846 permitted to initiate new streams. 2848 Endpoints MUST NOT exceed the limit set by their peer. An endpoint 2849 that receives a STREAM frame with an ID greater than the limit it has 2850 sent MUST treat this as a stream error of type STREAM_ID_ERROR 2851 (Section 12), unless this is a result of a change in the initial 2852 offsets (see Section 7.4.2). 2854 A receiver MUST NOT renege on an advertisement; that is, once a 2855 receiver advertises a stream ID via a MAX_STREAM_ID frame, it MUST 2856 NOT subsequently advertise a smaller maximum ID. A sender may 2857 receive MAX_STREAM_ID frames out of order; a sender MUST therefore 2858 ignore any MAX_STREAM_ID frame that does not increase the maximum. 2860 10.5. Sending and Receiving Data 2862 Once a stream is created, endpoints may use the stream to send and 2863 receive data. Each endpoint may send a series of STREAM frames 2864 encapsulating data on a stream until the stream is terminated in that 2865 direction. Streams are an ordered byte-stream abstraction, and they 2866 have no other structure within them. STREAM frame boundaries are not 2867 expected to be preserved in retransmissions from the sender or during 2868 delivery to the application at the receiver. 2870 When new data is to be sent on a stream, a sender MUST set the 2871 encapsulating STREAM frame's offset field to the stream offset of the 2872 first byte of this new data. The first byte of data that is sent on 2873 a stream has the stream offset 0. The largest offset delivered on a 2874 stream MUST be less than 2^64.
A receiver MUST ensure that received 2875 stream data is delivered to the application as an ordered byte- 2876 stream. Data received out of order MUST be buffered for later 2877 delivery, as long as it is not in violation of the receiver's flow 2878 control limits. 2880 An endpoint MUST NOT send data on any stream without ensuring that it 2881 is within the data limits set by its peer. The cryptographic 2882 handshake stream, Stream 0, is exempt from the connection-level data 2883 limits established by MAX_DATA. Data on stream 0 other than the 2884 initial cryptographic handshake message is still subject to stream- 2885 level data limits and MAX_STREAM_DATA. This message is exempt from 2886 flow control because it needs to be sent in a single packet 2887 regardless of the server's flow control state. This rule applies 2888 even for 0-RTT handshakes where the remembered value of 2889 MAX_STREAM_DATA would not permit sending a full initial cryptographic 2890 handshake message. 2892 Flow control is described in detail in Section 11, and congestion 2893 control is described in the companion document [QUIC-RECOVERY]. 2895 10.6. Stream Prioritization 2897 Stream multiplexing has a significant effect on application 2898 performance if resources allocated to streams are correctly 2899 prioritized. Experience with other multiplexed protocols, such as 2900 HTTP/2 [RFC7540], shows that effective prioritization strategies have 2901 a significant positive impact on performance. 2903 QUIC does not provide frames for exchanging prioritization 2904 information. Instead it relies on receiving priority information 2905 from the application that uses QUIC. Protocols that use QUIC are 2906 able to define any prioritization scheme that suits their application 2907 semantics. 
A protocol might define explicit messages for signaling 2908 priority, such as those defined in HTTP/2; it could define rules that 2909 allow an endpoint to determine priority based on context; or it could 2910 leave the determination to the application. 2912 A QUIC implementation SHOULD provide ways in which an application can 2913 indicate the relative priority of streams. When deciding which 2914 streams to dedicate resources to, QUIC SHOULD use the information 2915 provided by the application. Failure to account for priority of 2916 streams can result in suboptimal performance. 2918 Stream priority is most relevant when deciding which stream data will 2919 be transmitted. Often, there will be limits on what can be 2920 transmitted as a result of connection flow control or the current 2921 congestion controller state. 2923 Giving preference to the transmission of its own management frames 2924 ensures that the protocol functions efficiently. That is, 2925 prioritizing frames other than STREAM frames ensures that loss 2926 recovery, congestion control, and flow control operate effectively. 2928 Stream 0 MUST be prioritized over other streams prior to the 2929 completion of the cryptographic handshake. This includes the 2930 retransmission of the second flight of client handshake messages, 2931 that is, the TLS Finished and any client authentication messages. 2933 STREAM frames that are determined to be lost SHOULD be retransmitted 2934 before sending new data, unless application priorities indicate 2935 otherwise. Retransmitting lost stream data can fill in gaps, which 2936 allows the peer to consume already received data and free up flow 2937 control window. 2939 11. Flow Control 2941 It is necessary to limit the amount of data that a sender may have 2942 outstanding at any time, so as to prevent a fast sender from 2943 overwhelming a slow receiver, or to prevent a malicious sender from 2944 consuming significant resources at a receiver. 
This section 2945 describes QUIC's flow-control mechanisms. 2947 QUIC employs a credit-based flow-control scheme similar to HTTP/2's 2948 flow control [RFC7540]. A receiver advertises the number of octets 2949 it is prepared to receive on a given stream and for the entire 2950 connection. This leads to two levels of flow control in QUIC: (i) 2951 connection flow control, which prevents senders from exceeding a 2952 receiver's buffer capacity for the connection, and (ii) stream flow 2953 control, which prevents a single stream from consuming the entire 2954 receive buffer for a connection. 2956 A data receiver sends MAX_STREAM_DATA or MAX_DATA frames to the 2957 sender to advertise additional credit. MAX_STREAM_DATA frames carry 2958 the maximum absolute byte offset of a stream, while MAX_DATA frames 2959 carry the maximum sum of the absolute byte offsets of all streams 2960 other than stream 0. 2962 A receiver MAY advertise a larger offset at any point by sending 2963 MAX_DATA or MAX_STREAM_DATA frames. A receiver MUST NOT renege on an 2964 advertisement; that is, once a receiver advertises an offset, it MUST 2965 NOT subsequently advertise a smaller offset. A sender could receive 2966 MAX_DATA or MAX_STREAM_DATA frames out of order; a sender MUST 2967 therefore ignore any flow control offset that does not move the 2968 window forward. 2970 A receiver MUST close the connection with a FLOW_CONTROL_ERROR error 2971 (Section 12) if the peer violates the advertised connection or stream 2972 data limits. 2974 A sender MUST send BLOCKED frames to indicate it has data to write 2975 but is blocked by lack of connection or stream flow control credit. 2976 BLOCKED frames are expected to be sent infrequently in common cases, 2977 but they are considered useful for debugging and monitoring purposes. 2979 A receiver advertises credit for a stream by sending a 2980 MAX_STREAM_DATA frame with the Stream ID set appropriately.
A 2981 receiver could use the current offset of data consumed to determine 2982 the flow control offset to be advertised. A receiver MAY send 2983 MAX_STREAM_DATA frames in multiple packets in order to make sure that 2984 the sender receives an update before running out of flow control 2985 credit, even if one of the packets is lost. 2987 Connection flow control is a limit to the total bytes of stream data 2988 sent in STREAM frames on all streams. A receiver advertises credit 2989 for a connection by sending a MAX_DATA frame. A receiver maintains a 2990 cumulative sum of bytes received on all streams, which is used to 2991 check for flow control violations. A receiver might use a sum of 2992 bytes consumed on all contributing streams to determine the maximum 2993 data limit to be advertised. 2995 11.1. Edge Cases and Other Considerations 2997 There are some edge cases that must be considered when dealing with 2998 stream and connection level flow control. Given enough time, both 2999 endpoints must agree on flow control state. If one end believes it 3000 can send more than the other end is willing to receive, the 3001 connection will be torn down when too much data arrives. 3003 Conversely, if a sender believes it is blocked while the receiver 3004 expects to receive more data, then the connection can be in a 3005 deadlock, with the sender waiting for a MAX_DATA or MAX_STREAM_DATA 3006 frame that will never come. 3008 On receipt of a RST_STREAM frame, an endpoint will tear down state 3009 for the matching stream and ignore further data arriving on that 3010 stream. This could result in the endpoints getting out of sync, 3011 since the RST_STREAM frame may have arrived out of order and there 3012 may be further bytes in flight. The data sender would have counted 3013 the data against its connection level flow control budget, but a 3014 receiver that has not received these bytes would not know to include 3015 them as well.
The receiver must learn the number of bytes that were 3016 sent on the stream to make the same adjustment in its connection flow 3017 controller. 3019 To avoid this de-synchronization, a RST_STREAM sender MUST include 3020 the final byte offset sent on the stream in the RST_STREAM frame. On 3021 receiving a RST_STREAM frame, a receiver definitively knows how many 3022 bytes were sent on that stream before the RST_STREAM frame, and the 3023 receiver MUST use the final offset to account for all bytes sent on 3024 the stream in its connection level flow controller. 3026 11.1.1. Response to a RST_STREAM 3028 RST_STREAM terminates one direction of a stream abruptly. Whether 3029 any action or response can or should be taken on the data already 3030 received is an application-specific issue, but it will often be the 3031 case that upon receipt of a RST_STREAM an endpoint will choose to 3032 stop sending data in its own direction. If the sender of a 3033 RST_STREAM wishes to explicitly state that no future data will be 3034 processed, that endpoint MAY send a STOP_SENDING frame at the same 3035 time. 3037 11.1.2. Data Limit Increments 3039 This document leaves to implementations the decision of when and how 3040 many bytes to advertise in a MAX_DATA or MAX_STREAM_DATA frame, but offers a few 3041 considerations. These frames contribute to connection overhead. 3042 Therefore, frequently sending frames with small changes is 3043 undesirable. At the same time, infrequent updates require larger 3044 increments to limits if blocking is to be avoided. Larger 3045 updates, in turn, require a receiver to commit more resources. 3046 There is thus a tradeoff between resource commitment and overhead 3047 when determining how large a limit is advertised.
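The connection-level accounting described in Section 11.1 can be illustrated with the following sketch. It is a simplified model under stated assumptions, not an implementation defined by this document: the names ConnectionFlowReceiver and FlowControlError are hypothetical, stream 0 is treated as exempt from the connection limit as Section 10.5 describes, and final-offset consistency checks (Section 11.3) are left out.

```python
class FlowControlError(Exception):
    """Stands in for closing the connection with FLOW_CONTROL_ERROR."""


class ConnectionFlowReceiver:
    """Hypothetical receiver-side connection flow-control accounting."""

    def __init__(self, max_data):
        self.max_data = max_data  # limit advertised to the peer via MAX_DATA
        self.highest = {}         # stream ID -> highest byte offset seen

    def total_received(self):
        # Sum of the highest offsets on all streams other than stream 0.
        return sum(self.highest.values())

    def _account(self, stream_id, end_offset):
        if stream_id == 0:
            return  # stream 0 does not count toward the connection limit
        if end_offset > self.highest.get(stream_id, 0):
            self.highest[stream_id] = end_offset
        if self.total_received() > self.max_data:
            raise FlowControlError("peer exceeded the advertised MAX_DATA")

    def on_stream_frame(self, stream_id, offset, length):
        self._account(stream_id, offset + length)

    def on_rst_stream(self, stream_id, final_offset):
        # The final offset accounts for bytes that were sent but may
        # never arrive, keeping both endpoints' totals in agreement.
        self._account(stream_id, final_offset)
```

With a limit of 100, receiving 10 bytes on stream 1 followed by a RST_STREAM carrying a final offset of 40 leaves 60 bytes of connection credit, even though only 10 bytes were delivered.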
3049 A receiver MAY use an autotuning mechanism to tune the frequency and 3050 amount by which it increases data limits based on a roundtrip time 3051 estimate and the rate at which the receiving application consumes 3052 data, similar to common TCP implementations. 3054 11.2. Stream Limit Increment 3056 As with flow control, this document leaves to implementations the decision of when and how many streams 3057 to make available to a peer via MAX_STREAM_ID, but 3058 offers a few considerations. MAX_STREAM_ID frames constitute minimal 3059 overhead, while withholding MAX_STREAM_ID frames can prevent the peer 3060 from using the available parallelism. 3062 Implementations will likely want to increase the maximum stream ID as 3063 peer-initiated streams close. A receiver MAY also advance the 3064 maximum stream ID based on current activity, system conditions, and 3065 other environmental factors. 3067 11.2.1. Blocking on Flow Control 3069 If a sender does not receive a MAX_DATA or MAX_STREAM_DATA frame when 3070 it has run out of flow control credit, the sender will be blocked and 3071 MUST send a BLOCKED or STREAM_BLOCKED frame. These frames are 3072 expected to be useful for debugging at the receiver; they do not 3073 require any other action. A receiver SHOULD NOT wait for a BLOCKED 3074 or STREAM_BLOCKED frame before sending MAX_DATA or MAX_STREAM_DATA, 3075 since doing so will mean that a sender is unable to send for an 3076 entire round trip. 3078 For smooth operation of the congestion controller, it is generally 3079 considered best to not let the sender go into quiescence if 3080 avoidable. To avoid blocking a sender, and to reasonably account for 3081 the possibility of loss, a receiver should send a MAX_DATA or 3082 MAX_STREAM_DATA frame at least two roundtrips before it expects the 3083 sender to get blocked. 3085 A sender sends a BLOCKED or STREAM_BLOCKED frame only once 3086 when it reaches a data limit.
A sender MUST NOT send multiple 3087 BLOCKED or STREAM_BLOCKED frames for the same data limit, unless the 3088 original frame is determined to be lost. Another BLOCKED or 3089 STREAM_BLOCKED frame can be sent after the data limit is increased. 3091 11.3. Stream Final Offset 3093 The final offset is the count of the number of octets that are 3094 transmitted on a stream. For a stream that is reset, the final 3095 offset is carried explicitly in the RST_STREAM frame. Otherwise, the 3096 final offset is the offset of the end of the data carried in STREAM 3097 frame marked with a FIN flag. 3099 An endpoint will know the final offset for a stream when the stream 3100 enters the "half-closed (remote)" state. However, if there is 3101 reordering or loss, an endpoint might learn the final offset prior to 3102 entering this state if it is carried on a STREAM frame. 3104 An endpoint MUST NOT send data on a stream at or beyond the final 3105 offset. 3107 Once a final offset for a stream is known, it cannot change. If a 3108 RST_STREAM or STREAM frame causes the final offset to change for a 3109 stream, an endpoint SHOULD respond with a FINAL_OFFSET_ERROR error 3110 (see Section 12). A receiver SHOULD treat receipt of data at or 3111 beyond the final offset as a FINAL_OFFSET_ERROR error, even after a 3112 stream is closed. Generating these errors is not mandatory, but only 3113 because requiring that an endpoint generate these errors also means 3114 that the endpoint needs to maintain the final offset state for closed 3115 streams, which could mean a significant state commitment. 3117 12. Error Handling 3119 An endpoint that detects an error SHOULD signal the existence of that 3120 error to its peer. Errors can affect an entire connection (see 3121 Section 12.1), or a single stream (see Section 12.2). 3123 The most appropriate error code (Section 12.3) SHOULD be included in 3124 the frame that signals the error. 
Where this specification 3125 identifies error conditions, it also identifies the error code that 3126 is used. 3128 A stateless reset (Section 7.8.4) is not suitable for any error that 3129 can be signaled with a CONNECTION_CLOSE, APPLICATION_CLOSE, or 3130 RST_STREAM frame. A stateless reset MUST NOT be used by an endpoint 3131 that has the state necessary to send a frame on the connection. 3133 12.1. Connection Errors 3135 Errors that result in the connection being unusable, such as an 3136 obvious violation of protocol semantics or corruption of state that 3137 affects an entire connection, MUST be signaled using a 3138 CONNECTION_CLOSE or APPLICATION_CLOSE frame (Section 8.3, 3139 Section 8.4). An endpoint MAY close the connection in this manner 3140 even if the error only affects a single stream. 3142 Application protocols can signal application-specific protocol errors 3143 using the APPLICATION_CLOSE frame. Errors that are specific to the 3144 transport, including all those described in this document, are 3145 carried in a CONNECTION_CLOSE frame. Other than the type of error 3146 code they carry, these frames are identical in format and semantics. 3148 A CONNECTION_CLOSE or APPLICATION_CLOSE frame could be sent in a 3149 packet that is lost. An endpoint SHOULD be prepared to retransmit a 3150 packet containing either frame type if it receives more packets on a 3151 terminated connection. Limiting the number of retransmissions and 3152 the time over which this final packet is sent limits the effort 3153 expended on terminated connections. 3155 An endpoint that chooses not to retransmit packets containing 3156 CONNECTION_CLOSE or APPLICATION_CLOSE risks a peer missing the first 3157 such packet. The only mechanism available to an endpoint that 3158 continues to receive data for a terminated connection is to use the 3159 stateless reset process (Section 7.8.4). 
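The choice of closing frame and the bounded retransmission described above might look like the following sketch. The names close_frame_type and TerminatedConnection, and the default budget of three retransmissions, are assumptions made for illustration; this document sets no specific limit.

```python
def close_frame_type(is_application_error):
    """Transport errors use CONNECTION_CLOSE; application errors use
    APPLICATION_CLOSE (Section 12.1)."""
    return "APPLICATION_CLOSE" if is_application_error else "CONNECTION_CLOSE"


class TerminatedConnection:
    """Hypothetical holder for the final packet of a closed connection."""

    def __init__(self, close_packet, max_retransmits=3):
        self.close_packet = close_packet
        # Assumed budget; bounding it limits the effort expended on
        # terminated connections.
        self.remaining = max_retransmits

    def on_packet_received(self):
        """Return the close packet to resend, or None once the budget
        is exhausted."""
        if self.remaining > 0:
            self.remaining -= 1
            return self.close_packet
        return None
```

Once the budget is spent, an endpoint that still receives packets for the connection is left with the stateless reset process as its only remaining mechanism.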
3161 An endpoint that receives an invalid CONNECTION_CLOSE or 3162 APPLICATION_CLOSE frame MUST NOT signal the existence of the error to 3163 its peer. 3165 12.2. Stream Errors 3167 If the error affects a single stream, but otherwise leaves the 3168 connection in a recoverable state, the endpoint can send a RST_STREAM 3169 frame (Section 8.2) with an appropriate error code to terminate just 3170 the affected stream. 3172 Stream 0 is critical to the functioning of the entire connection. If 3173 stream 0 is closed with either a RST_STREAM or STREAM frame bearing 3174 the FIN flag, an endpoint MUST generate a connection error of type 3175 PROTOCOL_VIOLATION. 3177 RST_STREAM MUST be instigated by the application and MUST carry an 3178 application error code. Resetting a stream without knowledge of the 3179 application protocol could cause the protocol to enter an 3180 unrecoverable state. Application protocols might require certain 3181 streams to be reliably delivered in order to guarantee consistent 3182 state between endpoints. 3184 12.3. Transport Error Codes 3186 QUIC error codes are 16-bit unsigned integers. 3188 This section lists the defined QUIC transport error codes that may be 3189 used in a CONNECTION_CLOSE frame. These errors apply to the entire 3190 connection. 3192 NO_ERROR (0x0): An endpoint uses this with CONNECTION_CLOSE to 3193 signal that the connection is being closed abruptly in the absence 3194 of any error. 3196 INTERNAL_ERROR (0x1): The endpoint encountered an internal error and 3197 cannot continue with the connection. 3199 FLOW_CONTROL_ERROR (0x3): An endpoint received more data than it 3200 permitted in its advertised data limits (see Section 11). 3202 STREAM_ID_ERROR (0x4): An endpoint received a frame for a stream 3203 identifier that exceeded its advertised maximum stream ID. 3205 STREAM_STATE_ERROR (0x5): An endpoint received a frame for a stream 3206 that was not in a state that permitted that frame (see 3207 Section 10.2). 
3209 FINAL_OFFSET_ERROR (0x6): An endpoint received a STREAM frame 3210 containing data that exceeded the previously established final 3211 offset, or an endpoint received a RST_STREAM frame containing a 3212 final offset that was lower than the maximum offset of data that 3213 was already received, or an endpoint received a RST_STREAM frame 3214 containing a final offset different from the one already 3215 established. 3217 FRAME_FORMAT_ERROR (0x7): An endpoint received a frame that was 3218 badly formatted. For instance, an empty STREAM frame that omitted 3219 the FIN flag, or an ACK frame that has more acknowledgment ranges 3220 than the remainder of the packet could carry. This is a generic 3221 error code; an endpoint SHOULD use the more specific frame format 3222 error codes (0x1XX) if possible. 3224 TRANSPORT_PARAMETER_ERROR (0x8): An endpoint received transport 3225 parameters that were badly formatted, included an invalid value, 3226 omitted a mandatory parameter, included a forbidden parameter, 3227 or were otherwise in error. 3229 VERSION_NEGOTIATION_ERROR (0x9): An endpoint received transport 3230 parameters that contained version negotiation parameters that 3231 disagreed with the version negotiation that it performed. This 3232 error code indicates a potential version downgrade attack. 3234 PROTOCOL_VIOLATION (0xA): An endpoint detected an error with 3235 protocol compliance that was not covered by more specific error 3236 codes. 3238 FRAME_ERROR (0x1XX): An endpoint detected an error in a specific 3239 frame type. The frame type is included as the last octet of the 3240 error code. For example, an error in a MAX_STREAM_ID frame would 3241 be indicated with the code (0x106). 3243 See Section 14.2 for details of registering new error codes. 3245 12.4. Application Protocol Error Codes 3247 Application protocol error codes are 16-bit unsigned integers, but 3248 the management of application error codes is left to application 3249 protocols.
Application protocol error codes are used for the 3250 RST_STREAM (Section 8.2) and APPLICATION_CLOSE (Section 8.4) frames. 3252 There is no restriction on the use of the 16-bit error code space for 3253 application protocols. However, QUIC reserves the error code with a 3254 value of 0 to mean STOPPING. The application error code of STOPPING 3255 (0) is used by the transport to cancel a stream in response to 3256 receipt of a STOP_SENDING frame. 3258 13. Security and Privacy Considerations 3260 13.1. Spoofed ACK Attack 3262 An attacker receives an STK from the server and then releases the IP 3263 address on which it received the STK. The attacker may, in the 3264 future, spoof this same address (which now presumably addresses a 3265 different endpoint), and initiate a 0-RTT connection with a server on 3266 the victim's behalf. The attacker then spoofs ACK frames to the 3267 server which cause the server to potentially drown the victim in 3268 data. 3270 There are two possible mitigations to this attack. The simplest one 3271 is that a server can unilaterally create a gap in packet-number 3272 space. In the non-attack scenario, the client will send an ACK frame 3273 with the larger value for largest acknowledged. In the attack 3274 scenario, the attacker could acknowledge a packet in the gap. If the 3275 server sees an acknowledgment for a packet that was never sent, the 3276 connection can be aborted. 3278 The second mitigation is that the server can require that 3279 acknowledgments for sent packets match the encryption level of the 3280 sent packet. This mitigation is useful if the connection has an 3281 ephemeral forward-secure key that is generated and used for every new 3282 connection. If a packet sent is protected with a forward-secure key, 3283 then any acknowledgments that are received for them MUST also be 3284 forward-secure protected. 
Since the attacker will not have the 3285 forward secure key, the attacker will not be able to generate 3286 forward-secure protected packets with ACK frames. 3288 13.2. Slowloris Attacks 3290 The attacks commonly known as Slowloris [SLOWLORIS] try to keep many 3291 connections to the target endpoint open and hold them open as long as 3292 possible. These attacks can be executed against a QUIC endpoint by 3293 generating the minimum amount of activity necessary to avoid being 3294 closed for inactivity. This might involve sending small amounts of 3295 data, gradually opening flow control windows in order to control the 3296 sender rate, or manufacturing ACK frames that simulate a high loss 3297 rate. 3299 QUIC deployments SHOULD provide mitigations for the Slowloris 3300 attacks, such as increasing the maximum number of clients the server 3301 will allow, limiting the number of connections a single IP address is 3302 allowed to make, imposing restrictions on the minimum transfer speed 3303 a connection is allowed to have, and restricting the length of time 3304 an endpoint is allowed to stay connected. 3306 13.3. Stream Fragmentation and Reassembly Attacks 3308 An adversarial endpoint might intentionally fragment the data on 3309 stream buffers in order to cause disproportionate memory commitment. 3310 An adversarial endpoint could open a stream and send some STREAM 3311 frames containing arbitrary fragments of the stream content. 3313 The attack is mitigated if flow control windows correspond to 3314 available memory. However, some receivers will over-commit memory 3315 and advertise flow control offsets in the aggregate that exceed 3316 actual available memory. The over-commitment strategy can lead to 3317 better performance when endpoints are well behaved, but renders 3318 endpoints vulnerable to the stream fragmentation attack. 3320 QUIC deployments SHOULD provide mitigations against the stream 3321 fragmentation attack. 
Mitigations could consist of avoiding 3322 over-committing memory, delaying reassembly of STREAM frames, implementing 3323 heuristics based on the age and duration of reassembly holes, or some 3324 combination. 3326 13.4. Stream Commitment Attack 3328 An adversarial endpoint can open lots of streams, exhausting state on 3329 an endpoint. The adversarial endpoint could repeat the process on a 3330 large number of connections, in a manner similar to SYN flooding 3331 attacks in TCP. 3333 Normally, clients will open streams sequentially, as explained in 3334 Section 10.1. However, when several streams are initiated at short 3335 intervals, transmission errors may cause STREAM frames opening 3336 streams to be received out of sequence. A receiver is obligated to 3337 open intervening streams if a higher-numbered stream ID is received. 3338 Thus, on a new connection, opening stream 2000001 opens 1 million 3339 streams, as required by the specification. 3341 The number of active streams is limited by the concurrent stream 3342 limit transport parameter, as explained in Section 10.4. If chosen 3343 judiciously, this limit mitigates the effect of the stream 3344 commitment attack. However, setting the limit too low could affect 3345 performance when applications expect to open a large number of streams. 3347 14. IANA Considerations 3349 14.1. QUIC Transport Parameter Registry 3351 IANA [SHALL add/has added] a registry for "QUIC Transport Parameters" 3352 under a "QUIC Protocol" heading. 3354 The "QUIC Transport Parameters" registry governs a 16-bit space. 3355 This space is split into two spaces that are governed by different 3356 policies. Values with the first byte in the range 0x00 to 0xfe (in 3357 hexadecimal) are assigned via the Specification Required policy 3358 [RFC8126]. Values with the first byte 0xff are reserved for Private 3359 Use [RFC8126].
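The split of the 16-bit space can be expressed as a check on the first byte of a value. The function name registry_policy below is hypothetical and is offered only as a sketch of the rule stated above.

```python
def registry_policy(value):
    """Return the registration policy that governs a 16-bit registry value."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("registry values occupy a 16-bit space")
    first_byte = value >> 8
    if first_byte == 0xFF:
        return "Private Use"
    return "Specification Required"  # first byte 0x00 through 0xfe
```

Under this rule, 0xfeff is the largest value subject to Specification Required, and 0xff00 through 0xffff are reserved for Private Use.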
3361 Registrations MUST include the following fields: 3363 Value: The numeric value of the assignment (registrations will be 3364 between 0x0000 and 0xfeff). 3366 Parameter Name: A short mnemonic for the parameter. 3368 Specification: A reference to a publicly available specification for 3369 the value. 3371 The nominated expert(s) verify that a specification exists and is 3372 readily accessible. The expert(s) are encouraged to be biased 3373 towards approving registrations unless they are abusive, frivolous, 3374 or actively harmful (not merely aesthetically displeasing, or 3375 architecturally dubious). 3377 The initial contents of this registry are shown in Table 4. 3379 +--------+-------------------------+---------------+ 3380 | Value | Parameter Name | Specification | 3381 +--------+-------------------------+---------------+ 3382 | 0x0000 | initial_max_stream_data | Section 7.4.1 | 3383 | | | | 3384 | 0x0001 | initial_max_data | Section 7.4.1 | 3385 | | | | 3386 | 0x0002 | initial_max_stream_id | Section 7.4.1 | 3387 | | | | 3388 | 0x0003 | idle_timeout | Section 7.4.1 | 3389 | | | | 3390 | 0x0004 | omit_connection_id | Section 7.4.1 | 3391 | | | | 3392 | 0x0005 | max_packet_size | Section 7.4.1 | 3393 | | | | 3394 | 0x0006 | stateless_reset_token | Section 7.4.1 | 3395 +--------+-------------------------+---------------+ 3397 Table 4: Initial QUIC Transport Parameters Entries 3399 14.2. QUIC Transport Error Codes Registry 3401 IANA [SHALL add/has added] a registry for "QUIC Transport Error 3402 Codes" under a "QUIC Protocol" heading. 3404 The "QUIC Transport Error Codes" registry governs a 16-bit space. 3405 This space is split into two spaces that are governed by different 3406 policies. Values with the first byte in the range 0x00 to 0xfe (in 3407 hexadecimal) are assigned via the Specification Required policy 3408 [RFC8126]. Values with the first byte 0xff are reserved for Private 3409 Use [RFC8126]. 
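Section 12.3 places the frame type in the last octet of a FRAME_ERROR code, which can be sketched as follows. The helper names are hypothetical, and the frame type value 0x06 used for MAX_STREAM_ID in the usage note is inferred from the 0x106 example in Section 12.3 rather than taken from a frame type table.

```python
FRAME_ERROR_BASE = 0x100  # FRAME_ERROR occupies 0x100 through 0x1FF


def frame_error_code(frame_type):
    """Build the FRAME_ERROR code for a one-octet frame type."""
    if not 0 <= frame_type <= 0xFF:
        raise ValueError("frame type must fit in one octet")
    return FRAME_ERROR_BASE | frame_type


def is_frame_error(code):
    """True if a transport error code falls in the FRAME_ERROR range."""
    return 0x100 <= code <= 0x1FF
```

For example, frame_error_code(0x06) yields 0x106, matching the MAX_STREAM_ID example in Section 12.3.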

   Registrations MUST include the following fields:

   Value:  The numeric value of the assignment (registrations will be
      between 0x0000 and 0xfeff).

   Code:  A short mnemonic for the parameter.

   Description:  A brief description of the error code semantics, which
      MAY be a summary if a specification reference is provided.

   Specification:  A reference to a publicly available specification for
      the value.

   The initial contents of this registry are shown in Table 5.  Note
   that FRAME_ERROR takes the range from 0x100 to 0x1FF and private use
   occupies the range from 0xFF00 to 0xFFFF.

   +-------------+---------------------------+-------------------------------+---------------+
   | Value       | Error                     | Description                   | Specification |
   +-------------+---------------------------+-------------------------------+---------------+
   | 0x0         | NO_ERROR                  | No error                      | Section 12.3  |
   | 0x1         | INTERNAL_ERROR            | Implementation error          | Section 12.3  |
   | 0x3         | FLOW_CONTROL_ERROR        | Flow control error            | Section 12.3  |
   | 0x4         | STREAM_ID_ERROR           | Invalid stream ID             | Section 12.3  |
   | 0x5         | STREAM_STATE_ERROR        | Frame received in invalid     | Section 12.3  |
   |             |                           | stream state                  |               |
   | 0x6         | FINAL_OFFSET_ERROR        | Change to final stream offset | Section 12.3  |
   | 0x7         | FRAME_FORMAT_ERROR        | Generic frame format error    | Section 12.3  |
   | 0x8         | TRANSPORT_PARAMETER_ERROR | Error in transport parameters | Section 12.3  |
   | 0x9         | VERSION_NEGOTIATION_ERROR | Version negotiation failure   | Section 12.3  |
   | 0xA         | PROTOCOL_VIOLATION        | Generic protocol violation    | Section 12.3  |
   | 0x100-0x1FF | FRAME_ERROR               | Specific frame format error   | Section 12.3  |
   +-------------+---------------------------+-------------------------------+---------------+

            Table 5: Initial QUIC Transport Error Codes Entries

15.  References

15.1.  Normative References

   [I-D.ietf-tls-tls13]
              Rescorla, E., "The Transport Layer Security (TLS) Protocol
              Version 1.3", draft-ietf-tls-tls13-21 (work in progress),
              July 2017.

   [PLPMTUD]  Mathis, M. and J. Heffner, "Packetization Layer Path MTU
              Discovery", RFC 4821, DOI 10.17487/RFC4821, March 2007.

   [PMTUDv4]  Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191,
              DOI 10.17487/RFC1191, November 1990.

   [PMTUDv6]  McCann, J., Deering, S., Mogul, J., and R. Hinden, Ed.,
              "Path MTU Discovery for IP version 6", STD 87, RFC 8201,
              DOI 10.17487/RFC8201, July 2017.

   [QUIC-RECOVERY]
              Iyengar, J., Ed. and I. Swett, Ed., "QUIC Loss Detection
              and Congestion Control", draft-ietf-quic-recovery-07 (work
              in progress), October 2017.

   [QUIC-TLS]
              Thomson, M., Ed. and S. Turner, Ed., "Using Transport
              Layer Security (TLS) to Secure QUIC",
              draft-ietf-quic-tls-07 (work in progress), October 2017.

   [RFC1191]  Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191,
              DOI 10.17487/RFC1191, November 1990.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC3629]  Yergeau, F., "UTF-8, a transformation format of ISO
              10646", STD 63, RFC 3629, DOI 10.17487/RFC3629, November
              2003.

   [RFC4086]  Eastlake 3rd, D., Schiller, J., and S. Crocker,
              "Randomness Requirements for Security", BCP 106, RFC 4086,
              DOI 10.17487/RFC4086, June 2005.

   [RFC8126]  Cotton, M., Leiba, B., and T.
              Narten, "Guidelines for Writing an IANA Considerations
              Section in RFCs", BCP 26, RFC 8126,
              DOI 10.17487/RFC8126, June 2017.

15.2.  Informative References

   [EARLY-DESIGN]
              Roskind, J., "QUIC: Multiplexed Transport Over UDP",
              December 2013.

   [RFC2104]  Krawczyk, H., Bellare, M., and R. Canetti, "HMAC:
              Keyed-Hashing for Message Authentication", RFC 2104,
              DOI 10.17487/RFC2104, February 1997.

   [RFC2360]  Scott, G., "Guide for Internet Standards Writers", BCP 22,
              RFC 2360, DOI 10.17487/RFC2360, June 1998.

   [RFC4787]  Audet, F., Ed. and C. Jennings, "Network Address
              Translation (NAT) Behavioral Requirements for Unicast
              UDP", BCP 127, RFC 4787, DOI 10.17487/RFC4787, January
              2007.

   [RFC5869]  Krawczyk, H. and P. Eronen, "HMAC-based Extract-and-Expand
              Key Derivation Function (HKDF)", RFC 5869,
              DOI 10.17487/RFC5869, May 2010.

   [RFC6824]  Ford, A., Raiciu, C., Handley, M., and O. Bonaventure,
              "TCP Extensions for Multipath Operation with Multiple
              Addresses", RFC 6824, DOI 10.17487/RFC6824, January 2013.

   [RFC7301]  Friedl, S., Popov, A., Langley, A., and E. Stephan,
              "Transport Layer Security (TLS) Application-Layer Protocol
              Negotiation Extension", RFC 7301, DOI 10.17487/RFC7301,
              July 2014.

   [RFC7540]  Belshe, M., Peon, R., and M. Thomson, Ed., "Hypertext
              Transfer Protocol Version 2 (HTTP/2)", RFC 7540,
              DOI 10.17487/RFC7540, May 2015.

   [SLOWLORIS]
              RSnake Hansen, R., "Welcome to Slowloris...", June 2009.

   [SST]      Ford, B., "Structured streams", ACM SIGCOMM Computer
              Communication Review Vol. 37, pp. 361,
              DOI 10.1145/1282427.1282421, October 2007.

15.3.  URIs

   [1] https://mailarchive.ietf.org/arch/search/?email_list=quic

   [2] https://github.com/quicwg

   [3] https://github.com/quicwg/base-drafts/labels/transport

   [4] https://github.com/quicwg/base-drafts/wiki/QUIC-Versions

Appendix A.  Contributors

   The original authors of this specification were Ryan Hamilton, Jana
   Iyengar, Ian Swett, and Alyssa Wilk.

   The original design and rationale behind this protocol draw
   significantly from work by Jim Roskind [EARLY-DESIGN].  In
   alphabetical order, the contributors to the pre-IETF QUIC project at
   Google are: Britt Cyr, Jeremy Dorfman, Ryan Hamilton, Jana Iyengar,
   Fedor Kouranov, Charles Krasic, Jo Kulik, Adam Langley, Jim Roskind,
   Robbie Shade, Satyam Shekhar, Cherie Shi, Ian Swett, Raman Tenneti,
   Victor Vasiliev, Antonio Vicente, Patrik Westin, Alyssa Wilk, Dale
   Worley, Fan Yang, Dan Zhang, Daniel Ziegler.

Appendix B.  Acknowledgments

   Special thanks are due to the following for helping shape pre-IETF
   QUIC and its deployment: Chris Bentzel, Misha Efimov, Roberto Peon,
   Alistair Riddoch, Siddharth Vijayakrishnan, and Assar Westerlund.

   This document has benefited immensely from various private
   discussions and public ones on the quic@ietf.org and
   proto-quic@chromium.org mailing lists.  Our thanks to all.

Appendix C.  Change Log

      *RFC Editor's Note:* Please remove this section prior to
      publication of a final version of this document.

   Issue and pull request numbers are listed with a leading octothorp.

C.1.  Since draft-ietf-quic-transport-06

   o  Replaced FNV-1a with AES-GCM for all "Cleartext" packets.

C.2.  Since draft-ietf-quic-transport-05

   o  Stateless token is server-only (#726)

   o  Refactor section on connection termination (#733, #748, #328,
      #177)

   o  Limit size of Version Negotiation packet (#585)

   o  Clarify when and what to ack (#736)

   o  Renamed STREAM_ID_NEEDED to STREAM_ID_BLOCKED

   o  Clarify Keep-alive requirements (#729)

C.3.  Since draft-ietf-quic-transport-04

   o  Introduce STOP_SENDING frame, RST_STREAM only resets in one
      direction (#165)

   o  Removed GOAWAY; application protocols are responsible for graceful
      shutdown (#696)

   o  Reduced the number of error codes (#96, #177, #184, #211)

   o  Version validation fields can't move or change (#121)

   o  Removed versions from the transport parameters in a
      NewSessionTicket message (#547)

   o  Clarify the meaning of "bytes in flight" (#550)

   o  Public reset is now stateless reset and not visible to the path
      (#215)

   o  Reordered bits and fields in STREAM frame (#620)

   o  Clarifications to the stream state machine (#572, #571)

   o  Increased the maximum length of the Largest Acknowledged field in
      ACK frames to 64 bits (#629)

   o  truncate_connection_id is renamed to omit_connection_id (#659)

   o  CONNECTION_CLOSE terminates the connection like TCP RST (#330,
      #328)

   o  Update labels used in HKDF-Expand-Label to match TLS 1.3 (#642)

C.4.  Since draft-ietf-quic-transport-03

   o  Change STREAM and RST_STREAM layout

   o  Add MAX_STREAM_ID settings

C.5.  Since draft-ietf-quic-transport-02

   o  The size of the initial packet payload has a fixed minimum (#267,
      #472)

   o  Define when Version Negotiation packets are ignored (#284, #294,
      #241, #143, #474)

   o  The 64-bit FNV-1a algorithm is used for integrity protection of
      unprotected packets (#167, #480, #481, #517)

   o  Rework initial packet types to change how the connection ID is
      chosen (#482, #442, #493)

   o  Timestamps are forbidden in unprotected packets (#542, #429)

   o  Cryptographic handshake is now on stream 0 (#456)

   o  Remove congestion control exemption for cryptographic handshake
      (#248, #476)

   o  Version 1 of QUIC uses TLS; a new version is needed to use a
      different handshake protocol (#516)

   o  STREAM frames have a reduced number of offset lengths (#543, #430)

   o  Split some frames into separate connection- and stream-level
      frames (#443)

      *  WINDOW_UPDATE split into MAX_DATA and MAX_STREAM_DATA (#450)

      *  BLOCKED split to match WINDOW_UPDATE split (#454)

      *  Define STREAM_ID_NEEDED frame (#455)

   o  A NEW_CONNECTION_ID frame supports connection migration without
      linkability (#232, #491, #496)

   o  Transport parameters for 0-RTT are retained from a previous
      connection (#405, #513, #512)

      *  A client in 0-RTT no longer required to reset excess streams
         (#425, #479)

   o  Expanded security considerations (#440, #444, #445, #448)

C.6.  Since draft-ietf-quic-transport-01

   o  Defined short and long packet headers (#40, #148, #361)

   o  Defined a versioning scheme and stable fields (#51, #361)

   o  Define reserved version values for "greasing" negotiation (#112,
      #278)

   o  The initial packet number is randomized (#35, #283)

   o  Narrow the packet number encoding range requirement (#67, #286,
      #299, #323, #356)

   o  Defined client address validation (#52, #118, #120, #275)

   o  Define transport parameters as a TLS extension (#49, #122)

   o  SCUP and COPT parameters are no longer valid (#116, #117)

   o  Transport parameters for 0-RTT are either remembered from before,
      or assume default values (#126)

   o  The server chooses connection IDs in its final flight (#119, #349,
      #361)

   o  The server echoes the Connection ID and packet number fields when
      sending a Version Negotiation packet (#133, #295, #244)

   o  Defined a minimum packet size for the initial handshake packet
      from the client (#69, #136, #139, #164)

   o  Path MTU Discovery (#64, #106)

   o  The initial handshake packet from the client needs to fit in a
      single packet (#338)

   o  Forbid acknowledgment of packets containing only ACK and PADDING
      (#291)

   o  Require that frames are processed when packets are acknowledged
      (#381, #341)

   o  Removed the STOP_WAITING frame (#66)

   o  Don't require retransmission of old timestamps for lost ACK frames
      (#308)

   o  Clarified that frames are not retransmitted, but the information
      in them can be (#157, #298)

   o  Error handling definitions (#335)

   o  Split error codes into four sections (#74)

   o  Forbid the use of Public Reset where CONNECTION_CLOSE is possible
      (#289)

   o  Define packet protection rules (#336)

   o  Require that stream be entirely delivered or reset, including
      acknowledgment of all STREAM frames or the RST_STREAM, before it
      closes (#381)

   o  Remove stream reservation from
      state machine (#174, #280)

   o  Only stream 1 does not contribute to connection-level flow control
      (#204)

   o  Stream 1 counts towards the maximum concurrent stream limit (#201,
      #282)

   o  Remove connection-level flow control exclusion for some streams
      (except 1) (#246)

   o  RST_STREAM affects connection-level flow control (#162, #163)

   o  Flow control accounting uses the maximum data offset on each
      stream, rather than bytes received (#378)

   o  Moved length-determining fields to the start of STREAM and ACK
      (#168, #277)

   o  Added the ability to pad between frames (#158, #276)

   o  Remove error code and reason phrase from GOAWAY (#352, #355)

   o  GOAWAY includes a final stream number for both directions (#347)

   o  Error codes for RST_STREAM and CONNECTION_CLOSE are now at a
      consistent offset (#249)

   o  Defined priority as the responsibility of the application protocol
      (#104, #303)

C.7.  Since draft-ietf-quic-transport-00

   o  Replaced DIVERSIFICATION_NONCE flag with KEY_PHASE flag

   o  Defined versioning

   o  Reworked description of packet and frame layout

   o  Error code space is divided into regions for each component

   o  Use big endian for all numeric values

C.8.  Since draft-hamilton-quic-transport-protocol-01

   o  Adopted as base for draft-ietf-quic-tls

   o  Updated authors/editors list

   o  Added IANA Considerations section

   o  Moved Contributors and Acknowledgments to appendices

Authors' Addresses

   Jana Iyengar (editor)
   Google

   Email: jri@google.com


   Martin Thomson (editor)
   Mozilla

   Email: martin.thomson@gmail.com