idnits 2.17.1

draft-ietf-quic-transport-11.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack a Security Considerations section.

  ** The abstract seems to contain references ([2], [3], [1]), which it
     shouldn't. Please replace those with straight textual mentions of the
     documents in question.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (April 17, 2018) is 2206 days in the past. Is this
     intentional?

  -- Found something which looks like a code comment -- if you have code
     sections in the document, please surround them with '<CODE BEGINS>' and
     '<CODE ENDS>' lines.

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  -- Looks like a reference, but probably isn't: '1' on line 4478

  -- Looks like a reference, but probably isn't: '2' on line 4480

  -- Looks like a reference, but probably isn't: '3' on line 4482

  -- Looks like a reference, but probably isn't: '4' on line 4484

  == Outdated reference: A later version (-28) exists of
     draft-ietf-tls-tls13-21

  == Outdated reference: A later version (-34) exists of
     draft-ietf-quic-recovery-10

  == Outdated reference: A later version (-34) exists of
     draft-ietf-quic-tls-10

  -- Duplicate reference: RFC1191, mentioned in 'RFC1191', was also mentioned
     in 'PMTUDv4'.

  -- Duplicate reference: RFC4821, mentioned in 'RFC4821', was also mentioned
     in 'PLPMTUD'.

  -- Obsolete informational reference (is this intentional?): RFC 7540 (ref.
     'HTTP2') (Obsoleted by RFC 9113)

  == Outdated reference: A later version (-13) exists of
     draft-ietf-quic-invariants-01

     Summary: 2 errors (**), 0 flaws (~~), 5 warnings (==), 9 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

QUIC                                                     J. Iyengar, Ed.
Internet-Draft                                                    Fastly
Intended status: Standards Track                         M. Thomson, Ed.
Expires: October 19, 2018                                        Mozilla
                                                          April 17, 2018

           QUIC: A UDP-Based Multiplexed and Secure Transport
                      draft-ietf-quic-transport-11

Abstract

   This document defines the core of the QUIC transport protocol. This
   document describes connection establishment, packet format,
   multiplexing and reliability. Accompanying documents describe the
   cryptographic handshake and loss detection.

Note to Readers

   Discussion of this draft takes place on the QUIC working group
   mailing list (quic@ietf.org), which is archived at
   https://mailarchive.ietf.org/arch/search/?email_list=quic [1].

   Working Group information can be found at https://github.com/quicwg
   [2]; source code and issues list for this draft can be found at
   https://github.com/quicwg/base-drafts/labels/-transport [3].

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on October 19, 2018.

Copyright Notice

   Copyright (c) 2018 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Conventions and Definitions
     2.1.  Notational Conventions
   3.  Versions
   4.  Packet Types and Formats
     4.1.  Long Header
     4.2.  Short Header
     4.3.  Version Negotiation Packet
     4.4.  Cryptographic Handshake Packets
       4.4.1.  Initial Packet
       4.4.2.  Retry Packet
       4.4.3.  Handshake Packet
     4.5.  Protected Packets
     4.6.  Coalescing Packets
     4.7.  Connection ID
     4.8.  Packet Numbers
       4.8.1.  Initial Packet Number
   5.  Frames and Frame Types
   6.  Life of a Connection
     6.1.  Matching Packets to Connections
       6.1.1.  Client Packet Handling
       6.1.2.  Server Packet Handling
     6.2.  Version Negotiation
       6.2.1.  Sending Version Negotiation Packets
       6.2.2.  Handling Version Negotiation Packets
       6.2.3.  Using Reserved Versions
     6.3.  Cryptographic and Transport Handshake
     6.4.  Transport Parameters
       6.4.1.  Transport Parameter Definitions
       6.4.2.  Values of Transport Parameters for 0-RTT
       6.4.3.  New Transport Parameters
       6.4.4.  Version Negotiation Validation
     6.5.  Stateless Retries
     6.6.  Proof of Source Address Ownership
       6.6.1.  Client Address Validation Procedure
       6.6.2.  Address Validation on Session Resumption
       6.6.3.  Address Validation Token Integrity
     6.7.  Path Validation
       6.7.1.  Initiation
       6.7.2.  Response
       6.7.3.  Completion
       6.7.4.  Abandonment
     6.8.  Connection Migration
       6.8.1.  Probing a New Path
       6.8.2.  Initiating Connection Migration
       6.8.3.  Responding to Connection Migration
       6.8.4.  Loss Detection and Congestion Control
       6.8.5.  Privacy Implications of Connection Migration
     6.9.  Connection Termination
       6.9.1.  Closing and Draining Connection States
       6.9.2.  Idle Timeout
       6.9.3.  Immediate Close
       6.9.4.  Stateless Reset
   7.  Frame Types and Formats
     7.1.  Variable-Length Integer Encoding
     7.2.  PADDING Frame
     7.3.  RST_STREAM Frame
     7.4.  CONNECTION_CLOSE frame
     7.5.  APPLICATION_CLOSE frame
     7.6.  MAX_DATA Frame
     7.7.  MAX_STREAM_DATA Frame
     7.8.  MAX_STREAM_ID Frame
     7.9.  PING Frame
     7.10. BLOCKED Frame
     7.11. STREAM_BLOCKED Frame
     7.12. STREAM_ID_BLOCKED Frame
     7.13. NEW_CONNECTION_ID Frame
     7.14. STOP_SENDING Frame
     7.15. ACK Frame
       7.15.1.  ACK Block Section
       7.15.2.  Sending ACK Frames
       7.15.3.  ACK Frames and Packet Protection
     7.16. PATH_CHALLENGE Frame
     7.17. PATH_RESPONSE Frame
     7.18. STREAM Frames
   8.  Packetization and Reliability
     8.1.  Packet Processing and Acknowledgment
     8.2.  Retransmission of Information
     8.3.  Packet Size
     8.4.  Path Maximum Transmission Unit
       8.4.1.  Special Considerations for PMTU Discovery
       8.4.2.  Special Considerations for Packetization Layer PMTU
               Discovery
   9.  Streams: QUIC's Data Structuring Abstraction
     9.1.  Stream Identifiers
     9.2.  Stream States
       9.2.1.  Send Stream States
       9.2.2.  Receive Stream States
       9.2.3.  Permitted Frame Types
       9.2.4.  Bidirectional Stream States
     9.3.  Solicited State Transitions
     9.4.  Stream Concurrency
     9.5.  Sending and Receiving Data
     9.6.  Stream Prioritization
   10. Flow Control
     10.1.  Edge Cases and Other Considerations
       10.1.1.  Response to a RST_STREAM
       10.1.2.  Data Limit Increments
       10.1.3.  Handshake Exemption
     10.2.  Stream Limit Increment
       10.2.1.  Blocking on Flow Control
     10.3.  Stream Final Offset
   11. Error Handling
     11.1.  Connection Errors
     11.2.  Stream Errors
     11.3.  Transport Error Codes
     11.4.  Application Protocol Error Codes
   12. Security and Privacy Considerations
     12.1.  Spoofed ACK Attack
     12.2.  Optimistic ACK Attack
     12.3.  Slowloris Attacks
     12.4.  Stream Fragmentation and Reassembly Attacks
     12.5.  Stream Commitment Attack
   13. IANA Considerations
     13.1.  QUIC Transport Parameter Registry
     13.2.  QUIC Transport Error Codes Registry
   14. References
     14.1.  Normative References
     14.2.  Informative References
     14.3.  URIs
   Appendix A.  Contributors
   Appendix B.  Acknowledgments
   Appendix C.  Change Log
     C.1.  Since draft-ietf-quic-transport-10
     C.2.  Since draft-ietf-quic-transport-09
     C.3.  Since draft-ietf-quic-transport-08
     C.4.  Since draft-ietf-quic-transport-07
     C.5.  Since draft-ietf-quic-transport-06
     C.6.  Since draft-ietf-quic-transport-05
     C.7.  Since draft-ietf-quic-transport-04
     C.8.  Since draft-ietf-quic-transport-03
     C.9.  Since draft-ietf-quic-transport-02
     C.10. Since draft-ietf-quic-transport-01
     C.11. Since draft-ietf-quic-transport-00
     C.12. Since draft-hamilton-quic-transport-protocol-01
   Authors' Addresses

1.  Introduction

   QUIC is a multiplexed and secure transport protocol that runs on top
   of UDP. QUIC aims to provide a flexible set of features that allow
   it to be a general-purpose secure transport for multiple
   applications.

   o  Version negotiation

   o  Low-latency connection establishment

   o  Authenticated and encrypted header and payload

   o  Stream multiplexing

   o  Stream and connection-level flow control

   o  Connection migration and resilience to NAT rebinding

   QUIC implements techniques learned from experience with TCP, SCTP and
   other transport protocols. QUIC uses UDP as a substrate so as to not
   require changes to legacy client operating systems and middleboxes to
   be deployable.
   QUIC authenticates all of its headers and encrypts
   most of the data it exchanges, including its signaling. This allows
   the protocol to evolve without incurring a dependency on upgrades to
   middleboxes. This document describes the core QUIC protocol,
   including the conceptual design, wire format, and mechanisms of the
   QUIC protocol for connection establishment, stream multiplexing,
   stream and connection-level flow control, connection migration, and
   data reliability.

   Accompanying documents describe QUIC's loss detection and congestion
   control [QUIC-RECOVERY], and the use of TLS 1.3 for key negotiation
   [QUIC-TLS].

   QUIC version 1 conforms to the protocol invariants in
   [QUIC-INVARIANTS].

2.  Conventions and Definitions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

   Definitions of terms that are used in this document:

   Client:  The endpoint initiating a QUIC connection.

   Server:  The endpoint accepting incoming QUIC connections.

   Endpoint:  The client or server end of a connection.

   Stream:  A logical, bi-directional channel of ordered bytes within a
      QUIC connection.

   Connection:  A conversation between two QUIC endpoints with a single
      encryption context that multiplexes streams within it.

   Connection ID:  An opaque identifier that is used to identify a QUIC
      connection at an endpoint. Each endpoint sets a value that its
      peer includes in packets.

   QUIC packet:  A well-formed UDP payload that can be parsed by a QUIC
      receiver.

   QUIC is a name, not an acronym.

2.1.  Notational Conventions

   Packet and frame diagrams use the format described in Section 3.1 of
   [RFC2360], with the following additional conventions:

   [x]  Indicates that x is optional

   x (A)  Indicates that x is A bits long

   x (A/B/C) ...  Indicates that x is one of A, B, or C bits long

   x (i) ...  Indicates that x uses the variable-length encoding in
      Section 7.1

   x (*) ...  Indicates that x is variable-length

3.  Versions

   QUIC versions are identified using a 32-bit unsigned number.

   The version 0x00000000 is reserved to represent version negotiation.
   This version of the specification is identified by the number
   0x00000001.

   Other versions of QUIC might have different properties to this
   version. The properties of QUIC that are guaranteed to be consistent
   across all versions of the protocol are described in
   [QUIC-INVARIANTS].

   Version 0x00000001 of QUIC uses TLS as a cryptographic handshake
   protocol, as described in [QUIC-TLS].

   Versions with the most significant 16 bits of the version number
   cleared are reserved for use in future IETF consensus documents.

   Versions that follow the pattern 0x?a?a?a?a are reserved for use in
   forcing version negotiation to be exercised. These are the version
   numbers in which the low four bits of every octet are 1010 (in
   binary). A client or server MAY advertise support for any of these
   reserved versions.

   Reserved version numbers will probably never represent a real
   protocol; a client MAY use one of these version numbers with the
   expectation that the server will initiate version negotiation; a
   server MAY advertise support for one of these versions and can expect
   that clients ignore the value.

   [[RFC editor: please remove the remainder of this section before
   publication.]]

   The version number for the final version of this specification
   (0x00000001) is reserved for the version of the protocol that is
   published as an RFC.
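
   As a rough illustration of the two reserved ranges described earlier
   in this section, the checks can be written as simple bit masks. This
   is a minimal sketch, not part of the specification; the function
   names are hypothetical.

```python
def _check_version(version: int) -> None:
    # QUIC versions are 32-bit unsigned numbers.
    if not 0 <= version <= 0xFFFFFFFF:
        raise ValueError("QUIC versions are 32-bit unsigned numbers")


def is_negotiation_reserved(version: int) -> bool:
    """True if the version follows the pattern 0x?a?a?a?a.

    The low four bits of every octet must be 1010 (binary).
    """
    _check_version(version)
    return (version & 0x0F0F0F0F) == 0x0A0A0A0A


def is_ietf_reserved(version: int) -> bool:
    """True if the most significant 16 bits of the version are cleared."""
    _check_version(version)
    return (version >> 16) == 0
```

   For example, is_negotiation_reserved(0x1a2a3a4a) holds, while the
   version of this specification, 0x00000001, only satisfies
   is_ietf_reserved.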

   Version numbers used to identify IETF drafts are created by adding
   the draft number to 0xff000000. For example, draft-ietf-quic-
   transport-13 would be identified as 0xff00000D.

   Implementors are encouraged to register version numbers of QUIC that
   they are using for private experimentation on the GitHub wiki [4].

4.  Packet Types and Formats

   We first describe QUIC's packet types and their formats, since some
   are referenced in subsequent mechanisms.

   All numeric values are encoded in network byte order (that is, big-
   endian) and all field sizes are in bits. When discussing individual
   bits of fields, the least significant bit is referred to as bit 0.
   Hexadecimal notation is used for describing the value of fields.

   Any QUIC packet has either a long or a short header, as indicated by
   the Header Form bit. Long headers are expected to be used early in
   the connection before version negotiation and establishment of 1-RTT
   keys. Short headers are minimal version-specific headers, which are
   used after version negotiation and 1-RTT keys are established.

4.1.  Long Header

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+
   |1|   Type (7)  |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         Version (32)                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |DCIL(4)|SCIL(4)|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |               Destination Connection ID (0/32..144)         ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                 Source Connection ID (0/32..144)            ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                       Payload Length (i)                    ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                       Packet Number (32)                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                          Payload (*)                        ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                     Figure 1: Long Header Format

   Long headers are used for packets that are sent prior to the
   completion of version negotiation and establishment of 1-RTT keys.
   Once both conditions are met, a sender switches to sending packets
   using the short header (Section 4.2). The long form allows for
   special packets, such as the Version Negotiation packet, to be
   represented in this uniform fixed-length packet format. A long
   header contains the following fields:

   Header Form:  The most significant bit (0x80) of octet 0 (the first
      octet) is set to 1 for long headers.

   Long Packet Type:  The remaining seven bits of octet 0 contain the
      packet type. This field can indicate one of 128 packet types.
      The types specified for this version are listed in Table 1.

   Version:  The QUIC Version is a 32-bit field that follows the Type.
      This field indicates which version of QUIC is in use and
      determines how the rest of the protocol fields are interpreted.

   DCIL and SCIL:  The octet following the Version field contains the
      lengths of the two connection ID fields that follow it. These
      lengths are encoded as two 4-bit unsigned integers. The
      Destination Connection ID Length (DCIL) field occupies the 4 high
      bits of the octet and the Source Connection ID Length (SCIL)
      field occupies the 4 low bits of the octet. An encoded length of
      0 indicates that the connection ID is also 0 octets in length.
      Non-zero encoded lengths are increased by 3 to get the full
      length of the connection ID, producing a length between 4 and 18
      octets inclusive. For example, an octet with the value 0x50
      describes an 8-octet Destination Connection ID and a zero-length
      Source Connection ID.

   Destination Connection ID:  The Destination Connection ID field
      follows the connection ID lengths and is either 0 octets in length
      or between 4 and 18 octets.
      Section 4.7 describes the use of this
      field in more detail.

   Source Connection ID:  The Source Connection ID field follows the
      Destination Connection ID and is either 0 octets in length or
      between 4 and 18 octets. Section 4.7 describes the use of this
      field in more detail.

   Payload Length:  The length of the Payload field in octets, encoded
      as a variable-length integer (Section 7.1).

   Packet Number:  The Packet Number is a 32-bit field that follows the
      Payload Length field. Section 4.8 describes the use of packet
      numbers.

   Payload:  The payload of the packet.

   The following packet types are defined:

               +------+-----------------+---------------+
               | Type | Name            | Section       |
               +------+-----------------+---------------+
               | 0x7F | Initial         | Section 4.4.1 |
               |      |                 |               |
               | 0x7E | Retry           | Section 4.4.2 |
               |      |                 |               |
               | 0x7D | Handshake       | Section 4.4.3 |
               |      |                 |               |
               | 0x7C | 0-RTT Protected | Section 4.5   |
               +------+-----------------+---------------+

                   Table 1: Long Header Packet Types

   The header form, type, connection ID lengths octet, destination and
   source connection IDs, and version fields of a long header packet are
   version-independent. The packet number and values for packet types
   defined in Table 1 are version-specific. See [QUIC-INVARIANTS] for
   details on how packets from different versions of QUIC are
   interpreted.

   The interpretation of the fields and the payload is specific to a
   version and packet type. Type-specific semantics for this version
   are described in the following sections.

   The end of the Payload field (which is also the end of the long
   header packet) is determined by the value of the Payload Length
   field. Senders can coalesce multiple long header packets into one
   UDP datagram. See Section 4.6 for more details.

4.2.  Short Header

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+
   |0|K|1|1|0|R|T T|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |              Destination Connection ID (0..144)             ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                   Packet Number (8/16/32)                   ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Protected Payload (*)                    ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                     Figure 2: Short Header Format

   The short header can be used after the version and 1-RTT keys are
   negotiated. This header form has the following fields:

   Header Form:  The most significant bit (0x80) of octet 0 is set to 0
      for the short header.

   Key Phase Bit:  The second bit (0x40) of octet 0 indicates the key
      phase, which allows a recipient of a packet to identify the packet
      protection keys that are used to protect the packet. See
      [QUIC-TLS] for details.

   [[Editor's Note: this section should be removed and the bit
   definitions changed before this draft goes to the IESG.]]

   Third Bit:  The third bit (0x20) of octet 0 is set to 1.

   [[Editor's Note: this section should be removed and the bit
   definitions changed before this draft goes to the IESG.]]

   Fourth Bit:  The fourth bit (0x10) of octet 0 is set to 1.

   [[Editor's Note: this section should be removed and the bit
   definitions changed before this draft goes to the IESG.]]

   Google QUIC Demultiplexing Bit:  The fifth bit (0x8) of octet 0 is
      set to 0. This allows implementations of Google QUIC to
      distinguish Google QUIC packets from short header packets sent by
      a client because Google QUIC servers expect the connection ID to
      always be present. The special interpretation of this bit SHOULD
      be removed from this specification when Google QUIC has finished
      transitioning to the new header format.

   Reserved:  The sixth bit (0x4) of octet 0 is reserved for
      experimentation.

   Short Packet Type:  The remaining 2 bits of octet 0 encode one of 4
      possible packet types. Table 2 lists the types that are defined
      for short packets.

   Destination Connection ID:  The Destination Connection ID is a
      connection ID that is chosen by the intended recipient of the
      packet. See Section 4.7 for more details.

   Packet Number:  The Packet Number field is 1, 2 or 4 octets long,
      depending on the short packet type (see Table 2).

   Protected Payload:  Packets with a short header always include a
      1-RTT protected payload.

   The packet type in a short header currently determines only the size
   of the packet number field. Additional types can be used to signal
   the presence of other fields.

                     +------+--------------------+
                     | Type | Packet Number Size |
                     +------+--------------------+
                     | 0x0  | 1 octet            |
                     |      |                    |
                     | 0x1  | 2 octets           |
                     |      |                    |
                     | 0x2  | 4 octets           |
                     +------+--------------------+

                   Table 2: Short Header Packet Types

   The header form and connection ID field of a short header packet are
   version-independent. The remaining fields are specific to the
   selected QUIC version. See [QUIC-INVARIANTS] for details on how
   packets from different versions of QUIC are interpreted.

4.3.  Version Negotiation Packet

   A Version Negotiation packet is inherently not version-specific, and
   does not use the long packet header (see Section 4.1). Upon receipt
   by a client, it will appear to be a packet using the long header, but
   will be identified as a Version Negotiation packet based on the
   Version field having a value of 0.

   The Version Negotiation packet is a response to a client packet that
   contains a version that is not supported by the server, and is only
   sent by servers.
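
   The identification rule at the start of this section can be sketched
   as follows. This is an illustrative check only; the function name
   and the choice to return False for datagrams that are too short are
   assumptions of this sketch, not requirements of this document.

```python
import struct


def is_version_negotiation(datagram: bytes) -> bool:
    """Return True if a received datagram looks like Version Negotiation.

    The packet appears to use the long header (most significant bit of
    octet 0 set) and carries a Version field with the value 0.
    """
    if len(datagram) < 5:
        # Too short to hold octet 0 plus the 32-bit Version field.
        return False
    # "!BI": network byte order, one octet followed by a 32-bit integer.
    first_octet, version = struct.unpack_from("!BI", datagram, 0)
    return bool(first_octet & 0x80) and version == 0
```

   A client would run a check like this before attempting to interpret
   the remaining octets as a version-specific long header.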

   The layout of a Version Negotiation packet is:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+
   |1|  Unused (7) |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         Version (32)                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |DCIL(4)|SCIL(4)|
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |               Destination Connection ID (0/32..144)         ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                 Source Connection ID (0/32..144)            ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Supported Version 1 (32)                 ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                   [Supported Version 2 (32)]                ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                                  ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                   [Supported Version N (32)]                ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                  Figure 3: Version Negotiation Packet

   The value in the Unused field is selected randomly by the server.

   The Version field of a Version Negotiation packet MUST be set to
   0x00000000.

   The server MUST include the value from the Source Connection ID field
   of the packet it receives in the Destination Connection ID field.
   The value for Source Connection ID MUST be copied from the
   Destination Connection ID of the received packet, which is initially
   randomly selected by a client. Echoing both connection IDs gives
   clients some assurance that the server received the packet and that
   the Version Negotiation packet was not generated by an off-path
   attacker.

   The remainder of the Version Negotiation packet is a list of 32-bit
   versions which the server supports.

   A Version Negotiation packet cannot be explicitly acknowledged in an
   ACK frame by a client.
   Receiving another Initial packet implicitly
   acknowledges a Version Negotiation packet.

   The Version Negotiation packet does not include the Packet Number and
   Length fields present in other packets that use the long header form.

   Consequently, a Version Negotiation packet consumes an entire UDP
   datagram.

   See Section 6.2 for a description of the version negotiation process.

4.4.  Cryptographic Handshake Packets

   Once version negotiation is complete, the cryptographic handshake is
   used to agree on cryptographic keys. The cryptographic handshake is
   carried in Initial (Section 4.4.1), Retry (Section 4.4.2) and
   Handshake (Section 4.4.3) packets.

   All these packets use the long header and contain the current QUIC
   version in the version field.

   In order to prevent tampering by version-unaware middleboxes,
   handshake packets are protected with a connection- and version-
   specific key, as described in [QUIC-TLS]. This protection does not
   provide confidentiality or integrity against on-path attackers, but
   provides some level of protection against off-path attackers.

4.4.1.  Initial Packet

   The Initial packet uses long headers with a type value of 0x7F. It
   carries the first cryptographic handshake message sent by the client.

   If the client has not previously received a Retry packet from the
   server, it populates the Destination Connection ID field with a
   randomly selected value. This MUST be at least 8 octets in length.
   Until a packet is received from the server, the client MUST use the
   same random value unless it also changes the Source Connection ID
   (which effectively starts a new connection attempt). The randomized
   Destination Connection ID is used to determine packet protection
   keys, but is not included in server packets.
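
   A minimal sketch of how a client might pick the initial Destination
   Connection ID and fill in the connection ID lengths octet from
   Section 4.1. The function names are hypothetical, and the use of
   os.urandom as the source of randomness is an assumption of this
   sketch rather than something this document mandates.

```python
import os


def encode_cid_length(length: int) -> int:
    """Encode a connection ID length as a 4-bit DCIL/SCIL value."""
    if length == 0:
        return 0
    if not 4 <= length <= 18:
        raise ValueError("connection IDs are 0 or 4 to 18 octets long")
    # Non-zero encoded lengths are increased by 3 on decode.
    return length - 3


def decode_cid_length(encoded: int) -> int:
    """Decode a 4-bit DCIL/SCIL value into a length in octets."""
    return 0 if encoded == 0 else encoded + 3


def cid_lengths_octet(dcid: bytes, scid: bytes) -> int:
    """Build the lengths octet: DCIL in the high 4 bits, SCIL in the low 4."""
    return (encode_cid_length(len(dcid)) << 4) | encode_cid_length(len(scid))


def new_initial_dcid(length: int = 8) -> bytes:
    """Pick a random Destination Connection ID for a first Initial packet."""
    if not 8 <= length <= 18:
        raise ValueError("the initial value must be 8 to 18 octets long")
    return os.urandom(length)
```

   With an 8-octet Destination Connection ID and a zero-length Source
   Connection ID, cid_lengths_octet yields 0x50, matching the example
   given in Section 4.1.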
635 If the client received a Retry packet and is sending a second Initial 636 packet, then it sets the Destination Connection ID to the value from 637 the Source Connection ID in the Retry packet. Changing Destination 638 Connection ID also results in a change to the keys used to protect 639 the Initial packet. 641 The client populates the Source Connection ID field with a value of 642 its choosing and sets the low bits of the ConnID Len field to match. 644 The first Initial packet that is sent by a client contains a 645 randomized packet number. All subsequent packets contain a packet 646 number that is incremented by one (see Section 4.8). 648 The payload of an Initial packet conveys a STREAM frame (or frames) 649 for stream 0 containing a cryptographic handshake message. The 650 stream in this packet always starts at an offset of 0 (see 651 Section 6.5) and the complete cryptographic handshake message MUST 652 fit in a single packet (see Section 6.3). 654 The payload of a UDP datagram carrying the Initial packet MUST be 655 expanded to at least 1200 octets (see Section 8), by adding PADDING 656 frames to the Initial packet and/or by combining the Initial packet 657 with a 0-RTT packet (see Section 4.6). 659 The client uses the Initial packet type for any packet that contains 660 an initial cryptographic handshake message. This includes all cases 661 where a new packet containing the initial cryptographic message needs 662 to be created, such as the packets sent after receiving a 663 Version Negotiation (Section 4.3) or Retry packet (Section 4.4.2). 665 4.4.2. Retry Packet 667 A Retry packet uses long headers with a type value of 0x7E. It 668 carries cryptographic handshake messages and acknowledgments. It is 669 used by a server that wishes to perform a stateless retry (see 670 Section 6.5). 672 The server populates the Destination Connection ID with the 673 connection ID that the client included in the Source Connection ID of 674 the Initial packet.
This might be a zero-length value. 676 The server includes a connection ID of its choice in the Source 677 Connection ID field. The client MUST use this connection ID in the 678 Destination Connection ID of subsequent packets that it sends. 680 The packet number field echoes the packet number field from the 681 triggering client packet. 683 A Retry packet is never explicitly acknowledged in an ACK frame by a 684 client. Receiving another Initial packet implicitly acknowledges a 685 Retry packet. 687 After receiving a Retry packet, the client uses a new Initial packet 688 containing the next cryptographic handshake message. The client 689 retains the state of its cryptographic handshake, but discards all 690 transport state. The Initial packet that is generated in response to 691 a Retry packet includes STREAM frames on stream 0 that start again at 692 an offset of 0. 694 Continuing the cryptographic handshake is necessary to ensure that an 695 attacker cannot force a downgrade of any cryptographic parameters. 697 In addition to continuing the cryptographic handshake, the client 698 MUST remember the results of any version negotiation that occurred 699 (see Section 6.2). The client MAY also retain any observed RTT or 700 congestion state that it has accumulated for the flow, but other 701 transport state MUST be discarded. 703 The payload of the Retry packet contains at least two frames. It 704 MUST include a STREAM frame on stream 0 with offset 0 containing the 705 server's cryptographic stateless retry material. It MUST also 706 include an ACK frame to acknowledge the client's Initial packet. It 707 MAY additionally include PADDING frames. The next STREAM frame sent 708 by the server will also start at stream offset 0. 710 4.4.3. Handshake Packet 712 A Handshake packet uses long headers with a type value of 0x7D. It 713 is used to carry acknowledgments and cryptographic handshake messages 714 from the server and client. 
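The long header type values given in Sections 4.4.1 to 4.4.3 and 4.5 can be gathered into a small classifier. The sketch below is illustrative only; the function name and the returned labels are hypothetical. It identifies a Version Negotiation packet by its all-zero Version field, because the seven bits after the header form bit are random in that packet.

```python
# Long header type values taken from Sections 4.4.1, 4.4.2, 4.4.3 and 4.5.
LONG_HEADER_TYPES = {
    0x7F: "Initial",
    0x7E: "Retry",
    0x7D: "Handshake",
    0x7C: "0-RTT Protected",
}


def classify_packet(packet):
    """Return a label for the packet based on its first octets."""
    if len(packet) == 0:
        raise ValueError("empty packet")
    first_octet = packet[0]
    if not first_octet & 0x80:
        return "short header"          # header form bit is clear
    if len(packet) < 5:
        raise ValueError("long header packet is truncated")
    version = int.from_bytes(packet[1:5], "big")
    if version == 0:
        return "Version Negotiation"   # type bits are unused and random
    return LONG_HEADER_TYPES.get(first_octet & 0x7F, "unknown")
```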
716 The Destination Connection ID field in a Handshake packet contains a 717 connection ID that is chosen by the recipient of the packet; the 718 Source Connection ID includes the connection ID that the sender of 719 the packet wishes to use (see Section 4.7). 721 The first Handshake packet sent by a server contains a randomized 722 packet number. This value is increased for each subsequent packet 723 sent by the server as described in Section 4.8. The client 724 increments the packet number from its previous packet by one for each 725 Handshake packet that it sends (which might be an Initial, 0-RTT 726 Protected, or Handshake packet). 728 Servers MUST NOT send more than three Handshake packets without 729 receiving a packet from a verified source address. Source addresses 730 can be verified through an address validation token, receipt of the 731 final cryptographic message from the client, or by receiving a valid 732 PATH_RESPONSE frame from the client. 734 If the server expects to generate more than three Handshake packets 735 in response to an Initial packet, it SHOULD include a PATH_CHALLENGE 736 frame in each Handshake packet that it sends. After receiving at 737 least one valid PATH_RESPONSE frame, the server can send its 738 remaining Handshake packets. Servers can instead perform address 739 validation using a Retry packet; this requires less state on the 740 server, but could involve additional computational effort depending 741 on implementation choices. 743 The payload of this packet contains STREAM frames and could contain 744 PADDING, ACK, PATH_CHALLENGE, or PATH_RESPONSE frames. Handshake 745 packets MAY contain CONNECTION_CLOSE frames if the handshake is 746 unsuccessful. 748 4.5. Protected Packets 750 Packets that are protected with 0-RTT keys are sent with long 751 headers; all packets protected with 1-RTT keys are sent with short 752 headers. 
The different packet types explicitly indicate the 753 encryption level and therefore the keys that are used to remove 754 packet protection. 756 Packets protected with 0-RTT keys use a type value of 0x7C. The 757 connection ID fields for a 0-RTT packet MUST match the values used in 758 the Initial packet (Section 4.4.1). 760 The client can send 0-RTT packets after receiving a Handshake packet 761 (Section 4.4.3), if that packet does not complete the handshake. 762 Even if the client receives a different connection ID in the 763 Handshake packet, it MUST continue to use the same Destination 764 Connection ID for 0-RTT packets (see Section 4.7). 766 The version field for protected packets is the current QUIC version. 768 The packet number field contains a packet number, which increases 769 with each packet sent; see Section 4.8 for details. 771 The payload is protected using authenticated encryption. [QUIC-TLS] 772 describes packet protection in detail. After decryption, the 773 plaintext consists of a sequence of frames, as described in 774 Section 5. 776 4.6. Coalescing Packets 778 A sender can coalesce multiple QUIC packets (typically a 779 Cryptographic Handshake packet and a Protected packet) into one UDP 780 datagram. This can reduce the number of UDP datagrams needed to send 781 application data during the handshake and immediately afterwards. A 782 packet with a short header does not include a length, so it has to be 783 the last packet included in a UDP datagram. 785 The sender MUST NOT coalesce QUIC packets belonging to different QUIC 786 connections into a single UDP datagram. 788 Every QUIC packet that is coalesced into a single UDP datagram is 789 separate and complete. Though the values of some fields in the 790 packet header might be redundant, no fields are omitted.
The 791 receiver of coalesced QUIC packets MUST individually process each 792 QUIC packet and separately acknowledge them, as if they were received 793 as the payload of different UDP datagrams. 795 4.7. Connection ID 797 A connection ID is used to ensure consistent routing of packets. The 798 long header contains two connection IDs: the Destination Connection 799 ID is chosen by the recipient of the packet and is used to provide 800 consistent routing; the Source Connection ID is used to set the 801 Destination Connection ID used by the peer. 803 During the handshake, packets with the long header are used to 804 establish the connection ID that each endpoint uses. Each endpoint 805 uses the Source Connection ID field to specify the connection ID that 806 is used in the Destination Connection ID field of packets being sent 807 to them. Upon receiving a packet, each endpoint sets the Destination 808 Connection ID it sends to match the value of the Source Connection ID 809 that they receive. 811 During the handshake, an endpoint might receive multiple packets with 812 the long header, and thus be given multiple opportunities to update 813 the Destination Connection ID it sends. A client MUST only change 814 the value it sends in the Destination Connection ID in response to 815 the first packet of each type it receives from the server (Retry or 816 Handshake); a server MUST set its value based on the Initial packet. 817 Any additional changes are not permitted; if subsequent packets of 818 those types include a different Source Connection ID, they MUST be 819 discarded. This avoids problems that might arise from stateless 820 processing of multiple Initial packets producing different connection 821 IDs. 823 Short headers only include the Destination Connection ID and omit the 824 explicit length. The length of the Destination Connection ID field 825 is expected to be known to endpoints. 
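The rule above that limits when a client may change the Destination Connection ID it sends can be sketched as a small state tracker. This is an illustration under stated assumptions: the class and method names are hypothetical, packet types are passed in as plain strings, and packet authentication is assumed to have happened elsewhere.

```python
class ClientDestinationCid:
    """Tracks the Destination Connection ID a client sends in the handshake."""

    UPDATING_TYPES = ("Retry", "Handshake")

    def __init__(self, initial_dcid):
        self.current = initial_dcid
        # Source Connection ID seen in the first packet of each type.
        self._first_scid = {}

    def on_server_packet(self, packet_type, scid):
        """Return True to process the packet, False to discard it."""
        if packet_type not in self.UPDATING_TYPES:
            raise ValueError("only Retry and Handshake packets update the ID")
        first = self._first_scid.get(packet_type)
        if first is None:
            # First packet of this type: adopt its Source Connection ID.
            self._first_scid[packet_type] = scid
            self.current = scid
            return True
        # Later packets of this type must carry the same Source Connection ID.
        return scid == first
```

For example, a client that receives a Retry packet and then two Handshake packets adopts the Source Connection ID of the Retry, then that of the first Handshake packet, and discards the second Handshake packet if its Source Connection ID differs.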
827 Endpoints using a connection-ID based load balancer could agree with 828 the load balancer on a fixed or minimum length and on an encoding for 829 connection IDs. This fixed portion could encode an explicit length, 830 which allows the entire connection ID to vary in length and still be 831 used by the load balancer. 833 The very first packet sent by a client includes a random value for 834 Destination Connection ID. The same value MUST be used for all 0-RTT 835 packets sent on that connection (Section 4.5). This randomized value 836 is used to determine the handshake packet protection keys (see 837 Section 5.3.2 of [QUIC-TLS]). 839 A Version Negotiation (Section 4.3) packet MUST use both connection 840 IDs selected by the client, swapped to ensure correct routing toward 841 the client. 843 The connection ID can change over the lifetime of a connection, 844 especially in response to connection migration (Section 6.8). 845 NEW_CONNECTION_ID frames (Section 7.13) are used to provide new 846 connection ID values. 848 4.8. Packet Numbers 850 The packet number is an integer in the range 0 to 2^62-1. The value 851 is used in determining the cryptographic nonce for packet encryption. 852 Each endpoint maintains a separate packet number for sending and 853 receiving. The packet number for sending MUST increase by at least 854 one after sending any packet, unless otherwise specified (see 855 Section 4.8.1). 857 A QUIC endpoint MUST NOT reuse a packet number within the same 858 connection (that is, under the same cryptographic keys). If the 859 packet number for sending reaches 2^62 - 1, the sender MUST close the 860 connection without sending a CONNECTION_CLOSE frame or any further 861 packets; a server MAY send a Stateless Reset (Section 6.9.4) in 862 response to further packets that it receives. 864 For the packet header, the number of bits required to represent the 865 packet number is reduced by including only the least significant 866 bits of the packet number.
The actual packet number for each packet 867 is reconstructed at the receiver based on the largest packet number 868 received on a successfully authenticated packet. 870 A packet number is decoded by finding the packet number value that is 871 closest to the next expected packet. The next expected packet is the 872 highest received packet number plus one. For example, if the highest 873 successfully authenticated packet had a packet number of 0xaa82f30e, 874 then a packet containing a 16-bit value of 0x1f94 will be decoded as 875 0xaa831f94. 877 The sender MUST use a packet number size able to represent a range 878 more than twice as large as the difference between the largest 879 acknowledged packet and the packet number being sent. A peer receiving 880 the packet will then correctly decode the packet number, unless the 881 packet is delayed in transit such that it arrives after many higher- 882 numbered packets have been received. An endpoint SHOULD use a large 883 enough packet number encoding to allow the packet number to be 884 recovered even if the packet arrives after packets that are sent 885 afterwards. 887 As a result, the size of the packet number encoding is at least one 888 more than the base 2 logarithm of the number of contiguous 889 unacknowledged packet numbers, including the new packet. 891 For example, if an endpoint has received an acknowledgment for packet 892 0x6afa2f, sending a packet with a number of 0x6b4264 requires a 893 16-bit or larger packet number encoding, whereas a 32-bit packet 894 number is needed to send a packet with a number of 0x6bc107. 896 A Version Negotiation packet (Section 4.3) does not include a packet 897 number. The Retry packet (Section 4.4.2) has special rules for 898 populating the packet number field. 900 4.8.1. Initial Packet Number 902 The initial value for packet number MUST be selected randomly from a 903 range between 0 and 2^32 - 1025 (inclusive).
This value is selected 904 so that Initial and Handshake packets exercise as many possible 905 values for the Packet Number field as possible. 907 Limiting the range allows both for loss of packets and for any 908 stateless exchanges. Packet numbers are incremented for subsequent 909 packets, but packet loss and stateless handling can both mean that 910 the first packet sent by an endpoint isn't necessarily the first 911 packet received by its peer. The first packet received by a peer 912 cannot be 2^32 or greater or the recipient will incorrectly assume a 913 packet number that is 2^32 values lower and discard the packet. 915 Use of a secure random number generator [RFC4086] is not necessary 916 for generating the initial packet number, nor is it necessary that 917 the value be uniformly distributed. 919 5. Frames and Frame Types 921 The payload of all packets, after removing packet protection, 922 consists of a sequence of frames, as shown in Figure 4. Version 923 Negotiation and Stateless Reset do not contain frames. 925 0 1 2 3 926 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 927 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 928 | Frame 1 (*) ... 929 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 930 | Frame 2 (*) ... 931 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 932 ... 933 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 934 | Frame N (*) ... 935 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 937 Figure 4: Contents of Protected Payload 939 Protected payloads MUST contain at least one frame, and MAY contain 940 multiple frames and multiple frame types. 942 Frames MUST fit within a single QUIC packet and MUST NOT span a QUIC 943 packet boundary. 
Each frame begins with a Frame Type byte,
944   indicating its type, followed by additional type-dependent fields:

946    0                   1                   2                   3
947    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
948   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
949   |   Type (8)    |           Type-Dependent Fields (*)         ...
950   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

952                   Figure 5: Generic Frame Layout

954   Frame types are listed in Table 3. Note that the Frame Type byte in
955   STREAM frames is used to carry other frame-specific flags. For all
956   other frames, the Frame Type byte simply identifies the frame. These
957   frames are explained in more detail as they are referenced later in
958   the document.

960   +-------------+-------------------+--------------+
961   | Type Value  | Frame Type Name   | Definition   |
962   +-------------+-------------------+--------------+
963   | 0x00        | PADDING           | Section 7.2  |
964   |             |                   |              |
965   | 0x01        | RST_STREAM        | Section 7.3  |
966   |             |                   |              |
967   | 0x02        | CONNECTION_CLOSE  | Section 7.4  |
968   |             |                   |              |
969   | 0x03        | APPLICATION_CLOSE | Section 7.5  |
970   |             |                   |              |
971   | 0x04        | MAX_DATA          | Section 7.6  |
972   |             |                   |              |
973   | 0x05        | MAX_STREAM_DATA   | Section 7.7  |
974   |             |                   |              |
975   | 0x06        | MAX_STREAM_ID     | Section 7.8  |
976   |             |                   |              |
977   | 0x07        | PING              | Section 7.9  |
978   |             |                   |              |
979   | 0x08        | BLOCKED           | Section 7.10 |
980   |             |                   |              |
981   | 0x09        | STREAM_BLOCKED    | Section 7.11 |
982   |             |                   |              |
983   | 0x0a        | STREAM_ID_BLOCKED | Section 7.12 |
984   |             |                   |              |
985   | 0x0b        | NEW_CONNECTION_ID | Section 7.13 |
986   |             |                   |              |
987   | 0x0c        | STOP_SENDING      | Section 7.14 |
988   |             |                   |              |
989   | 0x0d        | ACK               | Section 7.15 |
990   |             |                   |              |
991   | 0x0e        | PATH_CHALLENGE    | Section 7.16 |
992   |             |                   |              |
993   | 0x0f        | PATH_RESPONSE     | Section 7.17 |
994   |             |                   |              |
995   | 0x10 - 0x17 | STREAM            | Section 7.18 |
996   +-------------+-------------------+--------------+

998                        Table 3: Frame Types

1000  6. Life of a Connection

1002  A QUIC connection is a single conversation between two QUIC
1003  endpoints.
QUIC's connection establishment intertwines version 1004 negotiation with the cryptographic and transport handshakes to reduce 1005 connection establishment latency, as described in Section 6.3. Once 1006 established, a connection may migrate to a different IP or port at 1007 either endpoint, due to NAT rebinding or mobility, as described in 1008 Section 6.8. Finally a connection may be terminated by either 1009 endpoint, as described in Section 6.9. 1011 6.1. Matching Packets to Connections 1013 Incoming packets are classified on receipt. Packets can either be 1014 associated with an existing connection, or - for servers - 1015 potentially create a new connection. 1017 Hosts try to associate a packet with an existing connection. If the 1018 packet has a Destination Connection ID corresponding to an existing 1019 connection, QUIC processes that packet accordingly. Note that a 1020 NEW_CONNECTION_ID frame (Section 7.13) would associate more than one 1021 connection ID with a connection. 1023 If the Destination Connection ID is zero length and the packet 1024 matches the address/port tuple of a connection where the host did not 1025 require connection IDs, QUIC processes the packet as part of that 1026 connection. Endpoints MUST drop packets with zero-length Destination 1027 Connection ID fields if they do not correspond to a single 1028 connection. 1030 6.1.1. Client Packet Handling 1032 Valid packets sent to clients always include a Destination Connection 1033 ID that matches the value the client selects. Clients that choose to 1034 receive zero-length connection IDs can use the address/port tuple to 1035 identify a connection. Packets that don't match an existing 1036 connection MAY be discarded. 1038 Due to packet reordering or loss, clients might receive packets for a 1039 connection that are encrypted with a key it has not yet computed. 
1040 Clients MAY drop these packets, or MAY buffer them in anticipation of 1041 later packets that allow it to compute the key. 1043 If a client receives a packet that has an unsupported version, it 1044 MUST discard that packet. 1046 6.1.2. Server Packet Handling 1048 If a server receives a packet that has an unsupported version and 1049 sufficient length to be an Initial packet for some version supported 1050 by the server, it SHOULD send a Version Negotiation packet as 1051 described in Section 6.2.1. Servers MAY rate control these packets 1052 to avoid storms of Version Negotiation packets. 1054 The first packet for an unsupported version can use different 1055 semantics and encodings for any version-specific field. In 1056 particular, different packet protection keys might be used for 1057 different versions. Servers that do not support a particular version 1058 are unlikely to be able to decrypt the content of the packet. 1059 Servers SHOULD NOT attempt to decode or decrypt a packet from an 1060 unknown version, but instead send a Version Negotiation packet, 1061 provided that the packet is sufficiently long. 1063 Servers MUST drop other packets that contain unsupported versions. 1065 Packets with a supported version, or no version field, are matched to 1066 a connection as described in Section 6.1. If not matched, the server 1067 continues below. 1069 If the packet is an Initial packet fully conforming with the 1070 specification, the server proceeds with the handshake (Section 6.3). 1071 This commits the server to the version that the client selected. 1073 If a server isn't currently accepting any new connections, it SHOULD 1074 send a Handshake packet containing a CONNECTION_CLOSE frame with 1075 error code SERVER_BUSY. 1077 If the packet is a 0-RTT packet, the server MAY buffer a limited 1078 number of these packets in anticipation of a late-arriving Initial 1079 Packet. 
Clients are forbidden from sending Handshake packets prior 1080 to receiving a server response, so servers SHOULD ignore any such 1081 packets. 1083 Servers MUST drop incoming packets under all other circumstances. 1084 They SHOULD send a Stateless Reset (Section 6.9.4) if a connection ID 1085 is present in the header. 1087 6.2. Version Negotiation 1089 Version negotiation ensures that client and server agree to a QUIC 1090 version that is mutually supported. A server sends a Version 1091 Negotiation packet in response to each packet that might initiate a 1092 new connection; see Section 6.1 for details. 1094 The size of the first packet sent by a client will determine whether 1095 a server sends a Version Negotiation packet. Clients that support 1096 multiple QUIC versions SHOULD pad their Initial packets to reflect 1097 the largest minimum Initial packet size of all their versions. This 1098 ensures that the server responds if there are any mutually 1099 supported versions. 1101 6.2.1. Sending Version Negotiation Packets 1103 If the version selected by the client is not acceptable to the 1104 server, the server responds with a Version Negotiation packet (see 1105 Section 4.3). This includes a list of versions that the server will 1106 accept. 1108 This system allows a server to process packets with unsupported 1109 versions without retaining state. Though either the Initial packet 1110 or the Version Negotiation packet that is sent in response could be 1111 lost, the client will send new packets until it successfully receives 1112 a response or it abandons the connection attempt. 1114 6.2.2. Handling Version Negotiation Packets 1116 When the client receives a Version Negotiation packet, it first 1117 checks that the Destination and Source Connection ID fields match the 1118 Source and Destination Connection ID fields in a packet that the 1119 client sent. If this check fails, the packet MUST be discarded.
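The connection ID check described above can be sketched together with a parser for the layout in Figure 3. This is a minimal illustration rather than a definitive implementation: the function names are hypothetical, and the decoding of the 4-bit DCIL and SCIL fields (0 for an absent ID, otherwise the field value plus 3) is assumed from the long header definition in Section 4.1.

```python
def decode_cid_len(field):
    """Decode a 4-bit connection ID length (assumed rule: 0, or field + 3)."""
    return 0 if field == 0 else field + 3


def parse_version_negotiation(datagram):
    """Split a Version Negotiation packet into (dcid, scid, versions)."""
    if len(datagram) < 6 or not datagram[0] & 0x80:
        raise ValueError("not a long header packet")
    if int.from_bytes(datagram[1:5], "big") != 0:
        raise ValueError("Version field is not 0x00000000")
    dcil = decode_cid_len(datagram[5] >> 4)
    scil = decode_cid_len(datagram[5] & 0x0F)
    pos = 6
    dcid = datagram[pos:pos + dcil]
    pos += dcil
    scid = datagram[pos:pos + scil]
    pos += scil
    rest = datagram[pos:]
    # The packet fills the datagram, so the rest is a list of 32-bit versions.
    if len(dcid) != dcil or len(scid) != scil or not rest or len(rest) % 4:
        raise ValueError("truncated or malformed packet")
    versions = [int.from_bytes(rest[i:i + 4], "big")
                for i in range(0, len(rest), 4)]
    return dcid, scid, versions


def accept_version_negotiation(datagram, sent_dcid, sent_scid):
    """Return the server's version list, or None if the packet is discarded."""
    try:
        dcid, scid, versions = parse_version_negotiation(datagram)
    except ValueError:
        return None
    # The server echoes the client's connection IDs, swapped.
    if dcid != sent_scid or scid != sent_dcid:
        return None
    return versions
```

The further checks in this section, such as ignoring a packet that lists the client's chosen version, would be applied to the returned list by the caller.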
1121 Once the Version Negotiation packet is determined to be valid, the 1122 client then selects an acceptable protocol version from the list 1123 provided by the server. The client then attempts to create a 1124 connection using that version. Though the contents of the Initial 1125 packet the client sends might not change in response to version 1126 negotiation, a client MUST increase the packet number it uses on 1127 every packet it sends. Packets MUST continue to use long headers and 1128 MUST include the new negotiated protocol version. 1130 The client MUST use the long header format and include its selected 1131 version on all packets until it has 1-RTT keys and it has received a 1132 packet from the server which is not a Version Negotiation packet. 1134 A client MUST NOT change the version it uses unless it is in response 1135 to a Version Negotiation packet from the server. Once a client 1136 receives a packet from the server which is not a Version Negotiation 1137 packet, it MUST discard other Version Negotiation packets on the same 1138 connection. Similarly, a client MUST ignore a Version Negotiation 1139 packet if it has already received and acted on a Version Negotiation 1140 packet. 1142 A client MUST ignore a Version Negotiation packet that lists the 1143 client's chosen version. 1145 Version negotiation packets have no cryptographic protection. The 1146 result of the negotiation MUST be revalidated as part of the 1147 cryptographic handshake (see Section 6.4.4). 1149 6.2.3. Using Reserved Versions 1151 For a server to use a new version in the future, clients must 1152 correctly handle unsupported versions. To help ensure this, a server 1153 SHOULD include a reserved version (see Section 3) while generating a 1154 Version Negotiation packet. 1156 The design of version negotiation permits a server to avoid 1157 maintaining state for packets that it rejects in this fashion. 
The 1158 validation of version negotiation (see Section 6.4.4) only validates 1159 the result of version negotiation, which is the same no matter which 1160 reserved version was sent. A server MAY therefore send different 1161 reserved version numbers in the Version Negotiation Packet and in its 1162 transport parameters. 1164 A client MAY send a packet using a reserved version number. This can 1165 be used to solicit a list of supported versions from a server. 1167 6.3. Cryptographic and Transport Handshake 1169 QUIC relies on a combined cryptographic and transport handshake to 1170 minimize connection establishment latency. QUIC allocates stream 0 1171 for the cryptographic handshake. Version 0x00000001 of QUIC uses TLS 1172 1.3 as described in [QUIC-TLS]; a different QUIC version number could 1173 indicate that a different cryptographic handshake protocol is in use. 1175 QUIC provides this stream with reliable, ordered delivery of data. 1176 In return, the cryptographic handshake provides QUIC with: 1178 o authenticated key exchange, where 1180 * a server is always authenticated, 1182 * a client is optionally authenticated, 1184 * every connection produces distinct and unrelated keys, 1186 * keying material is usable for packet protection for both 0-RTT 1187 and 1-RTT packets, and 1189 * 1-RTT keys have forward secrecy 1191 o authenticated values for the transport parameters of the peer (see 1192 Section 6.4) 1194 o authenticated confirmation of version negotiation (see 1195 Section 6.4.4) 1197 o authenticated negotiation of an application protocol (TLS uses 1198 ALPN [RFC7301] for this purpose) 1200 o for the server, the ability to carry data that provides assurance 1201 that the client can receive packets that are addressed with the 1202 transport address that is claimed by the client (see Section 6.6) 1204 The initial cryptographic handshake message MUST be sent in a single 1205 packet. 
Any second attempt that is triggered by address validation 1206 MUST also be sent within a single packet. This avoids having to 1207 reassemble a message from multiple packets. Reassembling messages 1208 requires that a server maintain state prior to establishing a 1209 connection, exposing the server to a denial-of-service risk. 1211 The first client packet of the cryptographic handshake protocol MUST 1212 fit within a 1232-octet QUIC packet payload. This includes overheads 1213 that reduce the space available to the cryptographic handshake 1214 protocol. 1216 Details of how TLS is integrated with QUIC are provided 1217 in [QUIC-TLS]. 1219 6.4. Transport Parameters 1221 During connection establishment, both endpoints make authenticated 1222 declarations of their transport parameters. These declarations are 1223 made unilaterally by each endpoint. Endpoints are required to comply 1224 with the restrictions implied by these parameters; the description of 1225 each parameter includes rules for its handling. 1227 The format of the transport parameters is the TransportParameters 1228 struct from Figure 6. This is described using the presentation 1229 language from Section 3 of [I-D.ietf-tls-tls13].
1231     uint32 QuicVersion;

1233     enum {
1234        initial_max_stream_data(0),
1235        initial_max_data(1),
1236        initial_max_stream_id_bidi(2),
1237        idle_timeout(3),
1238        max_packet_size(5),
1239        stateless_reset_token(6),
1240        ack_delay_exponent(7),
1241        initial_max_stream_id_uni(8),
1242        (65535)
1243     } TransportParameterId;

1245     struct {
1246        TransportParameterId parameter;
1247        opaque value<0..2^16-1>;
1248     } TransportParameter;

1250     struct {
1251        select (Handshake.msg_type) {
1252           case client_hello:
1253              QuicVersion initial_version;

1255           case encrypted_extensions:
1256              QuicVersion negotiated_version;
1257              QuicVersion supported_versions<4..2^8-4>;
1258        };
1259        TransportParameter parameters<22..2^16-1>;
1260     } TransportParameters;

1262            Figure 6: Definition of TransportParameters

1264  The "extension_data" field of the quic_transport_parameters extension
1265  defined in [QUIC-TLS] contains a TransportParameters value. TLS
1266  encoding rules are therefore used to encode the transport parameters.

1268  QUIC encodes transport parameters into a sequence of octets, which
1269  are then included in the cryptographic handshake. Once the handshake
1270  completes, the transport parameters declared by the peer are
1271  available. Each endpoint validates the value provided by its peer.
1272  In particular, version negotiation MUST be validated (see
1273  Section 6.4.4) before the connection establishment is considered
1274  properly complete.

1276  Definitions for each of the defined transport parameters are included
1277  in Section 6.4.1. Any given parameter MUST appear at most once in a
1278  given transport parameters extension. An endpoint MUST treat receipt
1279  of duplicate transport parameters as a connection error of type
1280  TRANSPORT_PARAMETER_ERROR.

1282  6.4.1.
Transport Parameter Definitions 1284 An endpoint MUST include the following parameters in its encoded 1285 TransportParameters: 1287 initial_max_stream_data (0x0000): The initial stream maximum data 1288 parameter contains the initial value for the maximum data that can 1289 be sent on any newly created stream. This parameter is encoded as 1290 an unsigned 32-bit integer in units of octets. This is equivalent 1291 to an implicit MAX_STREAM_DATA frame (Section 7.7) being sent on 1292 all streams immediately after opening. 1294 initial_max_data (0x0001): The initial maximum data parameter 1295 contains the initial value for the maximum amount of data that can 1296 be sent on the connection. This parameter is encoded as an 1297 unsigned 32-bit integer in units of octets. This is equivalent to 1298 sending a MAX_DATA frame (Section 7.6) for the connection immediately 1299 after completing the handshake. 1301 idle_timeout (0x0003): The idle timeout is a value in seconds that 1302 is encoded as an unsigned 16-bit integer. The maximum value is 1303 600 seconds (10 minutes). 1305 An endpoint MAY use the following transport parameters: 1307 initial_max_stream_id_bidi (0x0002): The initial maximum bidirectional 1308 streams parameter contains the initial maximum number of 1309 application-owned bidirectional streams the peer may initiate, 1310 encoded as an unsigned 16-bit integer. If this parameter is 1311 absent or zero, application-owned bidirectional streams cannot be 1312 created until a MAX_STREAM_ID frame is sent. Note that a value of 1313 0 does not prevent the cryptographic handshake stream (that is, 1314 stream 0) from being used. Setting this parameter is equivalent 1315 to sending a MAX_STREAM_ID frame (Section 7.8) immediately after 1316 completing the handshake containing the corresponding Stream ID. 1317 For example, a value of 0x05 would be equivalent to receiving a 1318 MAX_STREAM_ID containing 20 when received by a client or 17 when 1319 received by a server.
1321 initial_max_stream_id_uni (0x0008): The initial maximum 1322 unidirectional streams parameter contains the initial maximum 1323 number of application-owned unidirectional streams the peer may 1324 initiate, encoded as an unsigned 16-bit integer. If this 1325 parameter is absent or zero, unidirectional streams cannot be 1326 created until a MAX_STREAM_ID frame is sent. Setting this 1327 parameter is equivalent to sending a MAX_STREAM_ID (Section 7.8) 1328 immediately after completing the handshake containing the 1329 corresponding Stream ID. For example, a value of 0x05 would be 1330 equivalent to receiving a MAX_STREAM_ID containing 18 when 1331 received by a client or 19 when received by a server. 1333 max_packet_size (0x0005): The maximum packet size parameter places a 1334 limit on the size of packets that the endpoint is willing to 1335 receive, encoded as an unsigned 16-bit integer. This indicates 1336 that packets larger than this limit will be dropped. The default 1337 for this parameter is the maximum permitted UDP payload of 65527. 1338 Values below 1200 are invalid. This limit only applies to 1339 protected packets (Section 4.5). 1341 ack_delay_exponent (0x0007): An 8-bit unsigned integer value 1342 indicating an exponent used to decode the ACK Delay field in the 1343 ACK frame, see Section 7.15. If this value is absent, a default 1344 value of 3 is assumed (indicating a multiplier of 8). The default 1345 value is also used for ACK frames that are sent in Initial, 1346 Handshake, and Retry packets. Values above 20 are invalid. 1348 A server MAY include the following transport parameters: 1350 stateless_reset_token (0x0006): The Stateless Reset Token is used in 1351 verifying a stateless reset, see Section 6.9.4. This parameter is 1352 a sequence of 16 octets. 1354 A client MUST NOT include a stateless reset token. 
A server MUST 1355 treat receipt of a stateless_reset_token transport parameter as a 1356 connection error of type TRANSPORT_PARAMETER_ERROR. 1358 6.4.2. Values of Transport Parameters for 0-RTT 1360 A client that attempts to send 0-RTT data MUST remember the transport 1361 parameters used by the server. The transport parameters that the 1362 server advertises during connection establishment apply to all 1363 connections that are resumed using the keying material established 1364 during that handshake. Remembered transport parameters apply to the 1365 new connection until the handshake completes and new transport 1366 parameters from the server can be provided. 1368 A server can remember the transport parameters that it advertised, or 1369 store an integrity-protected copy of the values in the ticket and 1370 recover the information when accepting 0-RTT data. A server uses the 1371 transport parameters in determining whether to accept 0-RTT data. 1373 A server MAY accept 0-RTT and subsequently provide different values 1374 for transport parameters for use in the new connection. If 0-RTT 1375 data is accepted by the server, the server MUST NOT reduce any limits 1376 or alter any values that might be violated by the client with its 1377 0-RTT data. In particular, a server that accepts 0-RTT data MUST NOT 1378 set values for initial_max_data or initial_max_stream_data that are 1379 smaller than the remembered value of those parameters. Similarly, a 1380 server MUST NOT reduce the value of initial_max_stream_id_bidi or 1381 initial_max_stream_id_uni. 1383 Omitting or setting a zero value for certain transport parameters can 1384 result in 0-RTT data being enabled, but not usable. The following 1385 transport parameters SHOULD be set to non-zero values for 0-RTT: 1386 initial_max_stream_id_bidi, initial_max_stream_id_uni, 1387 initial_max_data, initial_max_stream_data. 
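The rule that a server accepting 0-RTT must not reduce remembered limits can be illustrated with a small comparison. This is a minimal sketch, not part of the protocol definition: the function name reduced_limits and the dictionary representation of transport parameters are assumptions made for illustration, and an absent parameter is treated as zero.

```python
# Parameters whose values a server MUST NOT reduce after accepting 0-RTT.
# The names follow the text of this section; the dict layout is hypothetical.
NON_REDUCIBLE = (
    "initial_max_data",
    "initial_max_stream_data",
    "initial_max_stream_id_bidi",
    "initial_max_stream_id_uni",
)


def reduced_limits(remembered, new):
    """Return the names of parameters that `new` sets below `remembered`.

    Both arguments map parameter names to integers. A missing entry is
    treated as 0 (an assumption of this sketch).
    """
    return [
        name
        for name in NON_REDUCIBLE
        if new.get(name, 0) < remembered.get(name, 0)
    ]
```

A server following this sketch would accept 0-RTT only when reduced_limits returns an empty list for the values it intends to send.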
1389 A server MUST reject 0-RTT data or even abort a handshake if the 1390 implied values for transport parameters cannot be supported. 1392 6.4.3. New Transport Parameters 1394 New transport parameters can be used to negotiate new protocol 1395 behavior. An endpoint MUST ignore transport parameters that it does 1396 not support. Absence of a transport parameter therefore disables any 1397 optional protocol feature that is negotiated using the parameter. 1399 New transport parameters can be registered according to the rules in 1400 Section 13.1. 1402 6.4.4. Version Negotiation Validation 1404 Though the cryptographic handshake has integrity protection, two 1405 forms of QUIC version downgrade are possible. In the first, an 1406 attacker replaces the QUIC version in the Initial packet. In the 1407 second, a fake Version Negotiation packet is sent by an attacker. To 1408 protect against these attacks, the transport parameters include three 1409 fields that encode version information. These parameters are used to 1410 retroactively authenticate the choice of version (see Section 6.2). 1412 The cryptographic handshake provides integrity protection for the 1413 negotiated version as part of the transport parameters (see 1414 Section 6.4). As a result, attacks on version negotiation by an 1415 attacker can be detected. 1417 The client includes the initial_version field in its transport 1418 parameters. The initial_version is the version that the client 1419 initially attempted to use. If the server did not send a Version 1420 Negotiation packet Section 4.3, this will be identical to the 1421 negotiated_version field in the server transport parameters. 1423 A server that processes all packets in a stateful fashion can 1424 remember how version negotiation was performed and validate the 1425 initial_version value. 1427 A server that does not maintain state for every packet it receives 1428 (i.e., a stateless server) uses a different process. 
If the 1429 initial_version matches the version of QUIC that is in use, a 1430 stateless server can accept the value. 1432 If the initial_version is different from the version of QUIC that is 1433 in use, a stateless server MUST check that it would have sent a 1434 Version Negotiation packet if it had received a packet with the 1435 indicated initial_version. If a server would have accepted the 1436 version included in the initial_version and the value differs from 1437 the QUIC version that is in use, the server MUST terminate the 1438 connection with a VERSION_NEGOTIATION_ERROR error. 1440 The server includes both the version of QUIC that is in use and a 1441 list of the QUIC versions that the server supports. 1443 The negotiated_version field is the version that is in use. This 1444 MUST be set by the server to the value that is on the Initial packet 1445 that it accepts (not an Initial packet that triggers a Retry or 1446 Version Negotiation packet). A client that receives a 1447 negotiated_version that does not match the version of QUIC that is in 1448 use MUST terminate the connection with a VERSION_NEGOTIATION_ERROR 1449 error code. 1451 The server includes a list of versions that it would send in any 1452 version negotiation packet (Section 4.3) in the supported_versions 1453 field. The server populates this field even if it did not send a 1454 version negotiation packet. 1456 The client validates that the negotiated_version is included in the 1457 supported_versions list and - if version negotiation was performed - 1458 that it would have selected the negotiated version. A client MUST 1459 terminate the connection with a VERSION_NEGOTIATION_ERROR error code 1460 if the current QUIC version is not listed in the supported_versions 1461 list. A client MUST terminate with a VERSION_NEGOTIATION_ERROR error 1462 code if version negotiation occurred but it would have selected a 1463 different version based on the value of the supported_versions list. 
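The client-side checks described above can be sketched as follows. This is an illustration under stated assumptions rather than a normative procedure: VersionNegotiationError stands in for terminating the connection with VERSION_NEGOTIATION_ERROR, and select_version is a hypothetical callback that returns the version this client would choose from a list of server-supported versions.

```python
class VersionNegotiationError(Exception):
    """Stands in for closing with VERSION_NEGOTIATION_ERROR."""


def validate_server_versions(version_in_use, negotiated_version,
                             supported_versions, negotiation_occurred,
                             select_version):
    """Apply the client's checks to the server's version fields.

    select_version is a hypothetical callback modelling the client's
    own version preference; it is not defined by the protocol.
    """
    # negotiated_version must match the version actually in use.
    if negotiated_version != version_in_use:
        raise VersionNegotiationError("negotiated_version mismatch")
    # The version in use must appear in supported_versions.
    if version_in_use not in supported_versions:
        raise VersionNegotiationError("version in use is not listed")
    # If version negotiation happened, the client must have ended up
    # with the version it would have picked from the authentic list.
    if negotiation_occurred and select_version(supported_versions) != version_in_use:
        raise VersionNegotiationError("a different version would be selected")
```

For example, a client that prefers the numerically highest version, and that was steered to a lower version by a forged Version Negotiation packet, fails the last check once it sees the authenticated supported_versions list.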
1465 When an endpoint accepts multiple QUIC versions, it can potentially 1466 interpret transport parameters as they are defined by any of the QUIC 1467 versions it supports. The version field in the QUIC packet header is 1468 authenticated using transport parameters. The position and the 1469 format of the version fields in transport parameters MUST either be 1470 identical across different QUIC versions, or be unambiguously 1471 different to ensure no confusion about their interpretation. One way 1472 that a new format could be introduced is to define a TLS extension 1473 with a different codepoint. 1475 6.5. Stateless Retries 1477 A server can process an initial cryptographic handshake message from 1478 a client without committing any state. This allows a server to 1479 perform address validation (Section 6.6), or to defer connection 1480 establishment costs. 1482 A server that generates a response to an initial packet without 1483 retaining connection state MUST use the Retry packet (Section 4.4.2). 1484 This packet causes a client to reset its transport state and to 1485 continue the connection attempt with new connection state while 1486 maintaining the state of the cryptographic handshake. 1488 A server MUST NOT send multiple Retry packets in response to a client 1489 handshake packet. Thus, any cryptographic handshake message that is 1490 sent MUST fit within a single packet. 1492 In TLS, the Retry packet type is used to carry the HelloRetryRequest 1493 message. 1495 6.6. Proof of Source Address Ownership 1497 Transport protocols commonly spend a round trip checking that a 1498 client owns the transport address (IP and port) that it claims. 1499 Verifying that a client can receive packets sent to its claimed 1500 transport address protects against spoofing of this information by 1501 malicious clients. 1503 This technique is used primarily to prevent QUIC from being used for 1504 traffic amplification attacks.
In such an attack, a packet is sent to 1505 a server with spoofed source address information that identifies a 1506 victim. If a server generates more or larger packets in response to 1507 that packet, the attacker can use the server to send more data toward 1508 the victim than it would be able to send on its own. 1510 Several methods are used in QUIC to mitigate this attack. Firstly, 1511 the initial handshake packet is padded to at least 1200 octets. This 1512 allows a server to send a similar amount of data without risking 1513 causing an amplification attack toward an unproven remote address. 1515 A server eventually confirms that a client has received its messages 1516 when the cryptographic handshake successfully completes. This might 1517 be insufficient, either because the server wishes to avoid the 1518 computational cost of completing the handshake, or it might be that 1519 the size of the packets that are sent during the handshake is too 1520 large. This is especially important for 0-RTT, where the server 1521 might wish to provide application data traffic - such as a response 1522 to a request - in response to the data carried in the early data from 1523 the client. 1525 To send additional data prior to completing the cryptographic 1526 handshake, the server then needs to validate that the client owns the 1527 address that it claims. 1529 Source address validation is therefore performed during the 1530 establishment of a connection. TLS provides the tools that support 1531 the feature, but basic validation is performed by the core transport 1532 protocol. 1534 A different type of source address validation is performed after a 1535 connection migration, see Section 6.7. 1537 6.6.1. Client Address Validation Procedure 1539 QUIC uses token-based address validation. Any time the server wishes 1540 to validate a client address, it provides the client with a token. 
1541 As long as the token cannot be easily guessed (see Section 6.6.3), if 1542 the client is able to return that token, it proves to the server that 1543 it received the token. 1545 During the processing of the cryptographic handshake messages from a 1546 client, TLS will request that QUIC make a decision about whether to 1547 proceed based on the information it has. TLS will provide QUIC with 1548 any token that was provided by the client. For an initial packet, 1549 QUIC can decide to abort the connection, allow it to proceed, or 1550 request address validation. 1552 If QUIC decides to request address validation, it provides the 1553 cryptographic handshake with a token. The contents of this token are 1554 consumed by the server that generates the token, so there is no need 1555 for a single well-defined format. A token could include information 1556 about the claimed client address (IP and port), a timestamp, and any 1557 other supplementary information the server will need to validate the 1558 token in the future. 1560 The cryptographic handshake is responsible for enacting validation by 1561 sending the address validation token to the client. A legitimate 1562 client will include a copy of the token when it attempts to continue 1563 the handshake. The cryptographic handshake extracts the token then 1564 asks QUIC a second time whether the token is acceptable. In 1565 response, QUIC can either abort the connection or permit it to 1566 proceed. 1568 A connection MAY be accepted without address validation - or with 1569 only limited validation - but a server SHOULD limit the data it sends 1570 toward an unvalidated address. Successful completion of the 1571 cryptographic handshake implicitly provides proof that the client has 1572 received packets from the server. 1574 6.6.2. Address Validation on Session Resumption 1576 A server MAY provide clients with an address validation token during 1577 one connection that can be used on a subsequent connection. 
Address 1578 validation is especially important with 0-RTT because a server 1579 potentially sends a significant amount of data to a client in 1580 response to 0-RTT data. 1582 A different type of token is needed when resuming. Unlike the token 1583 that is created during a handshake, there might be some time between 1584 when the token is created and when the token is subsequently used. 1585 Thus, a resumption token SHOULD include an expiration time. It is 1586 also unlikely that the client port number is the same on two 1587 different connections; validating the port is therefore unlikely to 1588 be successful. 1590 This token can be provided to the cryptographic handshake immediately 1591 after establishing a connection. QUIC might also generate an updated 1592 token if significant time passes or the client address changes for 1593 any reason (see Section 6.8). The cryptographic handshake is 1594 responsible for providing the client with the token. In TLS the 1595 token is included in the ticket that is used for resumption and 1596 0-RTT, which is carried in a NewSessionTicket message. 1598 6.6.3. Address Validation Token Integrity 1600 An address validation token MUST be difficult to guess. Including a 1601 large enough random value in the token would be sufficient, but this 1602 depends on the server remembering the value it sends to clients. 1604 A token-based scheme allows the server to offload any state 1605 associated with validation to the client. For this design to work, 1606 the token MUST be covered by integrity protection against 1607 modification or falsification by clients. Without integrity 1608 protection, malicious clients could generate or guess values for 1609 tokens that would be accepted by the server. Only the server 1610 requires access to the integrity protection key for tokens. 1612 In TLS the address validation token is often bundled with the 1613 information that TLS requires, such as the resumption secret. 
In 1614 this case, adding integrity protection can be delegated to the 1615 cryptographic handshake protocol, avoiding redundant protection. If 1616 integrity protection is delegated to the cryptographic handshake, an 1617 integrity failure will result in immediate cryptographic handshake 1618 failure. If integrity protection is performed by QUIC, QUIC MUST 1619 abort the connection if the integrity check fails with a 1620 PROTOCOL_VIOLATION error code. 1622 6.7. Path Validation 1624 Path validation is used by an endpoint to verify reachability of a 1625 peer over a specific path. That is, it tests reachability between a 1626 specific local address and a specific peer address, where an address 1627 is the two-tuple of IP address and port. Path validation tests that 1628 packets can be both sent to and received from a peer. 1630 Path validation is used during connection migration (see Section 6.8) 1631 by the migrating endpoint to verify reachability of a peer from a new 1632 local address. Path validation is also used by the peer to verify 1633 that the migrating endpoint is able to receive packets sent to its 1634 new address. That is, that the packets received from the migrating 1635 endpoint do not carry a spoofed source address. 1637 Path validation can be used at any time by either endpoint. For 1638 instance, an endpoint might check that a peer is still in possession 1639 of its address after a period of quiescence. 1641 Path validation is not designed as a NAT traversal mechanism. Though 1642 the mechanism described here might be effective for the creation of 1643 NAT bindings that support NAT traversal, the expectation is that one 1644 or other peer is able to receive packets without first having sent a 1645 packet on that path. Effective NAT traversal needs additional 1646 synchronization mechanisms that are not provided here. 1648 An endpoint MAY bundle PATH_CHALLENGE and PATH_RESPONSE frames that 1649 are used for path validation with other frames. 
For instance, an 1650 endpoint may pad a packet carrying a PATH_CHALLENGE for PMTU 1651 discovery, or an endpoint may bundle a PATH_RESPONSE with its own 1652 PATH_CHALLENGE. 1654 6.7.1. Initiation 1656 To initiate path validation, an endpoint sends a PATH_CHALLENGE frame 1657 containing a random payload on the path to be validated. 1659 An endpoint MAY send additional PATH_CHALLENGE frames to handle 1660 packet loss. An endpoint SHOULD NOT send a PATH_CHALLENGE more 1661 frequently than it would an Initial packet, ensuring that connection 1662 migration is no more load on a new path than establishing a new 1663 connection. 1665 The endpoint MUST use fresh random data in every PATH_CHALLENGE frame 1666 so that it can associate the peer's response with the causative 1667 PATH_CHALLENGE. 1669 6.7.2. Response 1671 On receiving a PATH_CHALLENGE frame, an endpoint MUST respond 1672 immediately by echoing the data contained in the PATH_CHALLENGE frame 1673 in a PATH_RESPONSE frame, with the following stipulation. Since a 1674 PATH_CHALLENGE might be sent from a spoofed address, an endpoint MAY 1675 limit the rate at which it sends PATH_RESPONSE frames and MAY 1676 silently discard PATH_CHALLENGE frames that would cause it to respond 1677 at a higher rate. 1679 To ensure that packets can be both sent to and received from the 1680 peer, the PATH_RESPONSE MUST be sent on the same path as the 1681 triggering PATH_CHALLENGE: from the same local address on which the 1682 PATH_CHALLENGE was received, to the same remote address from which 1683 the PATH_CHALLENGE was received. 1685 6.7.3. Completion 1687 A new address is considered valid when a PATH_RESPONSE frame is 1688 received containing data that was sent in a previous PATH_CHALLENGE. 1689 Receipt of an acknowledgment for a packet containing a PATH_CHALLENGE 1690 frame is not adequate validation, since the acknowledgment can be 1691 spoofed by a malicious peer. 
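The challenge and response exchange can be sketched as below. This is a minimal illustration, not a complete implementation: the class and method names are hypothetical, addresses are modelled as (IP, port) tuples, and a payload length of 8 octets is assumed here (the frame definition governs the actual length). The sketch records the path on which each challenge was sent so that a response can be checked against it.

```python
import os


class PathValidator:
    """Tracks outstanding PATH_CHALLENGE payloads (illustrative only)."""

    PAYLOAD_LEN = 8  # assumed length; see the PATH_CHALLENGE frame definition

    def __init__(self):
        # Maps challenge payload -> (local_addr, remote_addr) it was sent on.
        self._outstanding = {}

    def new_challenge(self, local_addr, remote_addr):
        """Create fresh random data for one PATH_CHALLENGE frame."""
        payload = os.urandom(self.PAYLOAD_LEN)
        self._outstanding[payload] = (local_addr, remote_addr)
        return payload

    def on_path_response(self, payload, local_addr, remote_addr):
        """Return True if this PATH_RESPONSE validates the path.

        The payload must echo a challenge that was sent on exactly the
        path on which the response arrived.
        """
        return self._outstanding.get(payload) == (local_addr, remote_addr)


def make_path_response(challenge_payload):
    """A responder echoes the challenge data unchanged."""
    return bytes(challenge_payload)
```

Because every challenge carries fresh random data, a retransmitted PATH_CHALLENGE gets its own entry, and an acknowledgment alone never marks a path as valid in this sketch.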
1693 For path validation to be successful, a PATH_RESPONSE frame MUST be 1694 received from the same remote address to which the corresponding 1695 PATH_CHALLENGE was sent. If a PATH_RESPONSE frame is received from a 1696 different remote address than the one to which the PATH_CHALLENGE was 1697 sent, path validation is considered to have failed, even if the data 1698 matches that sent in the PATH_CHALLENGE. 1700 Additionally, the PATH_RESPONSE frame MUST be received on the same 1701 local address from which the corresponding PATH_CHALLENGE was sent. 1702 If a PATH_RESPONSE frame is received on a different local address 1703 than the one from which the PATH_CHALLENGE was sent, path validation 1704 is considered to have failed, even if the data matches that sent in 1705 the PATH_CHALLENGE. Thus, the endpoint considers the path to be 1706 valid when a PATH_RESPONSE frame is received on the same path with 1707 the same payload as the PATH_CHALLENGE frame. 1709 6.7.4. Abandonment 1711 An endpoint SHOULD abandon path validation after sending some number 1712 of PATH_CHALLENGE frames or after some time has passed. When setting 1713 this timer, implementations are cautioned that the new path could 1714 have a longer round-trip time than the original. 1716 Note that the endpoint might receive packets containing other frames 1717 on the new path, but a PATH_RESPONSE frame with appropriate data is 1718 required for path validation to succeed. 1720 If path validation fails, the path is deemed unusable. This does not 1721 necessarily imply a failure of the connection - endpoints can 1722 continue sending packets over other paths as appropriate. If no 1723 paths are available, an endpoint can wait for a new path to become 1724 available or close the connection. 1726 A path validation might be abandoned for other reasons besides 1727 failure. 
Primarily, this happens if a connection migration to a new 1728 path is initiated while a path validation on the old path is in 1729 progress. 1731 6.8. Connection Migration 1733 QUIC allows connections to survive changes to endpoint addresses 1734 (that is, IP address and/or port), such as those caused by an endpoint 1735 migrating to a new network. This section describes the process by 1736 which an endpoint migrates to a new address. 1738 An endpoint MUST NOT initiate connection migration before the 1739 handshake is finished and the endpoint has 1-RTT keys. 1741 This document limits migration of connections to new client 1742 addresses. Clients are responsible for initiating all migrations. 1743 A server does not send non-probing packets (see Section 6.8.1) toward a 1744 client address until it sees a non-probing packet from that address. 1745 If a client receives packets from an unknown server address, the 1746 client MAY discard these packets. Migrating a connection to a new 1747 server address is left for future work. 1749 6.8.1. Probing a New Path 1751 An endpoint MAY probe for peer reachability from a new local address 1752 using path validation (Section 6.7) prior to migrating the connection 1753 to the new local address. Failure of path validation simply means 1754 that the new path is not usable for this connection. Failure to 1755 validate a path does not cause the connection to end unless there are 1756 no valid alternative paths available. 1758 An endpoint uses a new connection ID for probes sent from a new local 1759 address; see Section 6.8.5 for further discussion. 1761 Receiving a PATH_CHALLENGE frame from a peer indicates that the peer 1762 is probing for reachability on a path. An endpoint sends a 1763 PATH_RESPONSE in response as per Section 6.7. 1765 PATH_CHALLENGE, PATH_RESPONSE, and PADDING frames are "probing 1766 frames", and all other frames are "non-probing frames".
A packet 1767 containing only probing frames is a "probing packet", and a packet 1768 containing any other frame is a "non-probing packet". 1770 6.8.2. Initiating Connection Migration 1772 An endpoint can migrate a connection to a new local address by sending 1773 packets containing frames other than probing frames from that 1774 address. 1776 Each endpoint validates its peer's address during connection 1777 establishment. Therefore, a migrating endpoint can send to its peer 1778 knowing that the peer is willing to receive at the peer's current 1779 address. Thus an endpoint can migrate to a new local address without 1780 first validating the peer's address. 1782 When migrating, the new path might not support the endpoint's current 1783 sending rate. Therefore, the endpoint resets its congestion 1784 controller, as described in Section 6.8.4. 1786 Receiving acknowledgments for data sent on the new path serves as 1787 proof of the peer's reachability from the new address. Note that 1788 since acknowledgments may be received on any path, return 1789 reachability on the new path is not established. To establish return 1790 reachability on the new path, an endpoint MAY concurrently initiate 1791 path validation (Section 6.7) on the new path. 1793 6.8.3. Responding to Connection Migration 1795 Receiving a packet from a new peer address containing a non-probing 1796 frame indicates that the peer has migrated to that address. 1798 In response to such a packet, an endpoint MUST start sending 1799 subsequent packets to the new peer address and MUST initiate path 1800 validation (Section 6.7) to verify the peer's ownership of the 1801 unvalidated address. 1803 An endpoint MAY send data to an unvalidated peer address, but it MUST 1804 protect against potential attacks as described in Section 6.8.3.1 and 1805 Section 6.8.3.2. An endpoint MAY skip validation of a peer address 1806 if that address has been seen recently.
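The distinction between probing and non-probing packets, and its use in detecting a peer's migration, can be sketched as follows. The function names are hypothetical, and frames are represented simply by their type names; a real implementation would work on parsed frames.

```python
# Frame types that the text classifies as "probing frames".
PROBING_FRAMES = frozenset({"PATH_CHALLENGE", "PATH_RESPONSE", "PADDING"})


def is_probing_packet(frame_types):
    """True if the packet contains only probing frames."""
    return all(frame in PROBING_FRAMES for frame in frame_types)


def indicates_migration(frame_types, source_addr, current_peer_addr):
    """True if a packet from a new peer address carries a non-probing frame."""
    return source_addr != current_peer_addr and not is_probing_packet(frame_types)
```

In this sketch a packet holding PATH_CHALLENGE and PADDING from a new address is only a probe, while the same packet with a STREAM frame added signals that the peer has migrated.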
1808 An endpoint only changes the address that it sends packets to in 1809 response to the highest-numbered non-probing packet. This ensures 1810 that an endpoint does not send packets to an old peer address in the 1811 case that it receives reordered packets. 1813 After changing the address to which it sends non-probing packets, an 1814 endpoint could abandon any path validation for other addresses. 1816 Receiving a packet from a new peer address might be the result of a 1817 NAT rebinding at the peer. 1819 After verifying a new client address, the server SHOULD send new 1820 address validation tokens (Section 6.6) to the client. 1822 6.8.3.1. Handling Address Spoofing by a Peer 1824 It is possible that a peer is spoofing its source address to cause an 1825 endpoint to send excessive amounts of data to an unwilling host. If 1826 the endpoint sends significantly more data than the spoofing peer, 1827 connection migration might be used to amplify the volume of data that 1828 an attacker can generate toward a victim. 1830 As described in Section 6.8.3, an endpoint is required to validate a 1831 peer's new address to confirm the peer's possession of the new 1832 address. Until a peer's address is deemed valid, an endpoint MUST 1833 limit the rate at which it sends data to this address. The endpoint 1834 MUST NOT send more than a minimum congestion window's worth of data 1835 per estimated round-trip time (kMinimumWindow, as defined in 1836 [QUIC-RECOVERY]). In the absence of this limit, an endpoint risks 1837 being used for a denial of service attack against an unsuspecting 1838 victim. Note that since the endpoint will not have any round-trip 1839 time measurements to this address, the estimate SHOULD be the default 1840 initial value (see [QUIC-RECOVERY]). 1842 If an endpoint skips validation of a peer address as described in 1843 Section 6.8.3, it does not need to limit its sending rate. 1845 6.8.3.2. 
Handling Address Spoofing by an On-path Attacker 1847 An on-path attacker could cause a spurious connection migration by 1848 copying and forwarding a packet with a spoofed address such that it 1849 arrives before the original packet. The packet with the spoofed 1850 address will be seen to come from a migrating connection, and the 1851 original packet will be seen as a duplicate and dropped. After a 1852 spurious migration, validation of the source address will fail 1853 because the entity at the source address does not have the necessary 1854 cryptographic keys to read or respond to the PATH_CHALLENGE frame 1855 that is sent to it even if it wanted to. 1857 To protect the connection from failing due to such a spurious 1858 migration, an endpoint MUST revert to using the last validated peer 1859 address when validation of a new peer address fails. 1861 If an endpoint has no state about the last validated peer address, it 1862 MUST close the connection silently by discarding all connection 1863 state. This results in new packets on the connection being handled 1864 generically. For instance, an endpoint MAY send a stateless reset in 1865 response to any further incoming packets. 1867 Note that receipt of packets with higher packet numbers from the 1868 legitimate peer address will trigger another connection migration. 1869 This will cause the validation of the address of the spurious 1870 migration to be abandoned. 1872 6.8.4. Loss Detection and Congestion Control 1874 The capacity available on the new path might not be the same as the 1875 old path. Packets sent on the old path SHOULD NOT contribute to 1876 congestion control or RTT estimation for the new path. 1878 On confirming a peer's ownership of its new address, an endpoint 1879 SHOULD immediately reset the congestion controller and round-trip 1880 time estimator for the new path. 
1882 An endpoint MUST NOT return to the send rate used for the previous 1883 path unless it is reasonably sure that the previous send rate is 1884 valid for the new path. For instance, a change in the client's port 1885 number is likely indicative of a rebinding in a middlebox and not a 1886 complete change in path. This determination likely depends on 1887 heuristics, which could be imperfect; if the new path capacity is 1888 significantly reduced, ultimately this relies on the congestion 1889 controller responding to congestion signals and reducing send rates 1890 appropriately. 1892 There may be apparent reordering at the receiver when an endpoint 1893 sends data and probes from/to multiple addresses during the migration 1894 period, since the two resulting paths may have different round-trip 1895 times. A receiver of packets on multiple paths will still send ACK 1896 frames covering all received packets. 1898 While multiple paths might be used during connection migration, a 1899 single congestion control context and a single loss recovery context 1900 (as described in [QUIC-RECOVERY]) may be adequate. A sender can make 1901 exceptions for probe packets so that their loss detection is 1902 independent and does not unduly cause the congestion controller to 1903 reduce its sending rate. An endpoint might arm a separate alarm when 1904 a PATH_CHALLENGE is sent, which is disarmed when the corresponding 1905 PATH_RESPONSE is received. If the alarm fires before the 1906 PATH_RESPONSE is received, the endpoint might send a new 1907 PATH_CHALLENGE, and restart the alarm for a longer period of time. 1909 6.8.5. Privacy Implications of Connection Migration 1911 Using a stable connection ID on multiple network paths allows a 1912 passive observer to correlate activity between those paths. An 1913 endpoint that moves between networks might not wish to have their 1914 activity correlated by any entity other than a server. 
The 1915 NEW_CONNECTION_ID message can be sent by both endpoints to provide an 1916 unlinkable connection ID for use in case a peer wishes to explicitly 1917 break linkability between two points of network attachment. 1919 An endpoint might need to send packets on multiple networks without 1920 receiving any response from its peer. To ensure that the endpoint is 1921 not linkable across each of these changes, a new connection ID and 1922 packet number gap are needed for each network. To support this, each 1923 endpoint sends multiple NEW_CONNECTION_ID messages. Each 1924 NEW_CONNECTION_ID is marked with a sequence number. Connection IDs 1925 MUST be used in the order in which they are numbered. 1927 An endpoint that does not require the use of a connection ID should 1928 not request that its peer use a connection ID. Such an endpoint does 1929 not need to provide new connection IDs using the NEW_CONNECTION_ID 1930 frame. 1932 An endpoint which wishes to break linkability upon changing networks 1933 MUST use the connection ID provided by its peer as well as 1934 incrementing the packet sequence number by an externally 1935 unpredictable value computed as described in Section 6.8.5.1. Packet 1936 number gaps are cumulative. An endpoint might skip connection IDs, 1937 but it MUST ensure that it applies the associated packet number gaps 1938 for connection IDs that it skips in addition to the packet number gap 1939 associated with the connection ID that it does use. 1941 An endpoint that receives a packet that is marked with a new 1942 connection ID recovers the packet number by adding the cumulative 1943 packet number gap to its expected packet number. An endpoint MUST 1944 discard packets that contain a smaller gap than it advertised. 1946 Clients MAY change connection ID at any time based on implementation- 1947 specific concerns. For example, after a period of network inactivity 1948 NAT rebinding might occur when the client begins sending data again. 
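The cumulative nature of packet number gaps can be shown with a short calculation. This is a sketch under stated assumptions: the function name is hypothetical, connection ID sequence numbers are taken to be consecutive integers, and gaps is an illustrative mapping from a connection ID's sequence number to its packet number gap.

```python
def first_packet_number(next_expected, gaps, current_seq, new_seq):
    """Packet number of the first packet sent with connection ID new_seq.

    next_expected: packet number that would follow on the current
                   connection ID if no change were made.
    gaps:          maps connection ID sequence number -> packet number gap.
    current_seq:   sequence number of the connection ID in use.
    new_seq:       sequence number of the connection ID being adopted.

    The gaps of any skipped connection IDs are added to the gap of the
    connection ID that is actually used.
    """
    if new_seq <= current_seq:
        raise ValueError("connection IDs are used in increasing order")
    cumulative = sum(gaps[seq] for seq in range(current_seq + 1, new_seq + 1))
    return next_expected + cumulative
```

With a single gap of 7 and a next expected packet number of 11, the first packet on the new connection ID carries packet number 18. If a further connection ID with a gap of 3 were used instead, skipping the first, the result would be 21.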
1950 A client might wish to reduce linkability by employing a new 1951 connection ID and source UDP port when sending traffic after a period 1952 of inactivity. Changing the UDP port from which it sends packets at 1953 the same time might cause the packet to appear as a connection 1954 migration. This ensures that the mechanisms that support migration 1955 are exercised even for clients that don't experience NAT rebindings 1956 or genuine migrations. Changing port number can cause a peer to 1957 reset its congestion state (see Section 6.8.4), so the port SHOULD 1958 only be changed infrequently. 1960 An endpoint that receives a successfully authenticated packet with a 1961 previously unused connection ID MUST use the next available 1962 connection ID for any packets it sends to that address. To avoid 1963 changing connection IDs multiple times when packets arrive out of 1964 order, endpoints MUST change only in response to a packet that 1965 increases the largest received packet number. Failing to do this 1966 could allow for use of that connection ID to link activity on new 1967 paths. There is no need to move to a new connection ID if the 1968 address of a peer changes without also changing the connection ID. 1970 For instance, a server might provide a packet number gap of 7 1971 associated with a new connection ID. If the server received packet 1972 10 using the previous connection ID, it should expect packets on the 1973 new connection ID to start at 18. A packet with the new connection 1974 ID and a packet number of 17 is discarded as being in error. 1976 6.8.5.1. Packet Number Gap 1978 In order to avoid linkage, the packet number gap MUST be externally 1979 indistinguishable from random. 
The packet number gap for a
1980    connection ID with a given sequence number is computed by encoding the
1981    sequence number as a 32-bit integer in big-endian format, and then
1982    computing:

1984      Gap = HKDF-Expand-Label(packet_number_secret,
1985                              "QUIC packet sequence gap", sequence, 4)

1987    The output of HKDF-Expand-Label is interpreted as a big-endian
1988    number. "packet_number_secret" is derived from the TLS key exchange,
1989    as described in Section 5.6 of [QUIC-TLS].

1991 6.9. Connection Termination

1993    Connections should remain open until they become idle for a
1994    pre-negotiated period of time. A QUIC connection, once established, can
1995    be terminated in one of three ways:

1997    o  idle timeout (Section 6.9.2)

1998    o  immediate close (Section 6.9.3)

2000    o  stateless reset (Section 6.9.4)

2002 6.9.1. Closing and Draining Connection States

2004    The closing and draining connection states exist to ensure that
2005    connections close cleanly and that delayed or reordered packets are
2006    properly discarded. These states SHOULD persist for three times the
2007    current Retransmission Timeout (RTO) interval as defined in
2008    [QUIC-RECOVERY].

2010    An endpoint enters a closing period after initiating an immediate
2011    close (Section 6.9.3). While closing, an endpoint MUST NOT send
2012    packets unless they contain a CONNECTION_CLOSE or APPLICATION_CLOSE
2013    frame (see Section 6.9.3 for details).

2015    In the closing state, only a packet containing a closing frame can be
2016    sent. An endpoint retains only enough information to generate a
2017    packet containing a closing frame and to identify packets as
2018    belonging to the connection. The connection ID and QUIC version are
2019    sufficient information to identify packets for a closing connection;
2020    an endpoint can discard all other connection state. An endpoint MAY
2021    retain packet protection keys for incoming packets to allow it to
2022    read and process a closing frame.
2024    The draining state is entered once an endpoint receives a signal that
2025    its peer is closing or draining. While otherwise identical to the
2026    closing state, an endpoint in the draining state MUST NOT send any
2027    packets. Retaining packet protection keys is unnecessary once a
2028    connection is in the draining state.

2030    An endpoint MAY transition from the closing period to the draining
2031    period if it can confirm that its peer is also closing or draining.
2032    Receiving a closing frame is sufficient confirmation, as is receiving
2033    a stateless reset. The draining period SHOULD end when the closing
2034    period would have ended. In other words, the endpoint can use the
2035    same end time, but cease retransmission of the closing packet.

2037    Disposing of connection state prior to the end of the closing or
2038    draining period could cause delayed or reordered packets to be
2039    handled poorly. Endpoints that have some alternative means to ensure
2040    that late-arriving packets on the connection do not create QUIC
2041    state, such as those that are able to close the UDP socket, MAY use
2042    an abbreviated draining period which can allow for faster resource
2043    recovery. Servers that retain an open socket for accepting new
2044    connections SHOULD NOT exit the closing or draining period early.

2046    Once the closing or draining period has ended, an endpoint SHOULD
2047    discard all connection state. This results in new packets on the
2048    connection being handled generically. For instance, an endpoint MAY
2049    send a stateless reset in response to any further incoming packets.

2051    The draining and closing periods do not apply when a stateless reset
2052    (Section 6.9.4) is sent.

2054    An endpoint is not expected to handle key updates when it is closing
2055    or draining. A key update might prevent the endpoint from moving
2056    from the closing state to draining, but it otherwise has no impact.
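The closing and draining rules above can be summarized as a small state machine. The following Python sketch is illustrative only: the class, method, and state names are hypothetical, time is passed in explicitly, and the optional single closing packet that Section 6.9.3 allows before draining is not modeled.

```python
import enum


class State(enum.Enum):
    OPEN = "open"
    CLOSING = "closing"
    DRAINING = "draining"
    CLOSED = "closed"


class ConnectionCloseState:
    """Tracks the closing/draining period of one connection (sketch)."""

    def __init__(self, rto):
        self.rto = rto          # current RTO, in the caller's time unit
        self.state = State.OPEN
        self.deadline = None    # end of the closing or draining period

    def send_closing_frame(self, now):
        """Local immediate close: enter the closing state."""
        if self.state is State.OPEN:
            self.state = State.CLOSING
            self.deadline = now + 3 * self.rto

    def on_peer_closing(self, now):
        """A closing frame or stateless reset was received."""
        if self.state is State.OPEN:
            self.deadline = now + 3 * self.rto
        if self.state in (State.OPEN, State.CLOSING):
            # Draining reuses the end time of the closing period.
            self.state = State.DRAINING

    def may_send(self, is_closing_frame):
        if self.state is State.OPEN:
            return True
        if self.state is State.CLOSING:
            return is_closing_frame
        return False            # draining or closed: send nothing

    def tick(self, now):
        """Discard state once the period has ended."""
        if self.deadline is not None and now >= self.deadline:
            self.state = State.CLOSED
```

An endpoint that sends a closing frame at time 10 with an RTO of 2 stays in closing or draining until time 16, whichever of the two states it is in when the deadline arrives.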
2058    An endpoint could receive packets from a new source address,
2059    indicating a client connection migration (Section 6.8), while in the
2060    closing period. An endpoint in the closing state MUST strictly limit
2061    the number of packets it sends to this new address until the address
2062    is validated (see Section 6.7). A server in the closing state MAY
2063    instead choose to discard packets received from a new source address.

2065 6.9.2. Idle Timeout

2067    A connection that remains idle for longer than the idle timeout (see
2068    Section 6.4.1) is closed. A connection enters the draining state
2069    when the idle timeout expires.

2071    The time at which an idle timeout takes effect won't be perfectly
2072    synchronized on both endpoints. An endpoint that sends packets near
2073    the end of an idle period could have those packets discarded if its
2074    peer enters the draining state before the packet is received.

2076 6.9.3. Immediate Close

2078    An endpoint sends a closing frame, either CONNECTION_CLOSE or
2079    APPLICATION_CLOSE, to terminate the connection immediately. Either
2080    closing frame causes all streams to immediately become closed; open
2081    streams can be assumed to be implicitly reset.

2083    After sending a closing frame, endpoints immediately enter the
2084    closing state. During the closing period, an endpoint that sends a
2085    closing frame SHOULD respond to any packet that it receives with
2086    another packet containing a closing frame. To minimize the state
2087    that an endpoint maintains for a closing connection, endpoints MAY
2088    send the exact same packet. However, endpoints SHOULD limit the
2089    number of packets they generate containing a closing frame. For
2090    instance, an endpoint could progressively increase the number of
2091    packets that it receives before sending additional packets or
2092    increase the time between packets.
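One way to limit closing packets, as suggested above, is to double the number of received packets required before each further reply. This Python sketch is only one possible policy; the doubling schedule and the class name are assumptions, not something the draft mandates.

```python
class ClosingResponder:
    """Decides when a closing endpoint re-sends its closing packet.

    Replies to the 1st, 2nd, 4th, 8th, ... packet received while in
    the closing state, so the reply rate decays as packets keep coming.
    """

    def __init__(self):
        self.received = 0       # packets seen since entering closing
        self.next_reply_at = 1  # received count that triggers a reply

    def on_packet_received(self):
        """Return True if the closing packet should be sent again."""
        self.received += 1
        if self.received >= self.next_reply_at:
            self.next_reply_at *= 2
            return True
        return False
```

Because the endpoint MAY send the exact same packet each time, the only per-connection state this policy needs beyond the stored packet is two counters.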
2094    Note: Allowing retransmission of a packet contradicts other advice
2095       in this document that recommends the creation of new packet
2096       numbers for every packet. Sending new packet numbers is primarily
2097       of advantage to loss recovery and congestion control, which are
2098       not expected to be relevant for a closed connection.
2099       Retransmitting the final packet requires less state.

2101    After receiving a closing frame, endpoints enter the draining state.
2102    An endpoint that receives a closing frame MAY send a single packet
2103    containing a closing frame before entering the draining state, using
2104    a CONNECTION_CLOSE frame and a NO_ERROR code if appropriate. An
2105    endpoint MUST NOT send further packets, which could result in a
2106    constant exchange of closing frames until the closing period on
2107    either peer ended.

2109    An immediate close can be used after an application protocol has
2110    arranged to close a connection. This might be after the application
2111    protocol negotiates a graceful shutdown. The application protocol
2112    exchanges whatever messages are needed to cause both endpoints
2113    to agree to close the connection, after which the application
2114    requests that the connection be closed. The application protocol can
2115    use an APPLICATION_CLOSE message with an appropriate error code to
2116    signal closure.

2118 6.9.4. Stateless Reset

2120    A stateless reset is provided as an option of last resort for a
2121    server that does not have access to the state of a connection. A
2122    server crash or outage might result in clients continuing to send
2123    data to a server that is unable to properly continue the connection.
2124    A server that wishes to communicate a fatal connection error MUST use
2125    a closing frame if it has sufficient state to do so.

2127    To support this process, the server sends a stateless_reset_token
2128    value during the handshake in the transport parameters.
This value
2129    is protected by encryption, so only client and server know this
2130    value.

2132    A server that receives packets that it cannot process sends a packet
2133    in the following layout:

2135     0                   1                   2                   3
2136     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
2137    +-+-+-+-+-+-+-+-+
2138    |0|K| Type (6)  |
2139    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2140    |               Destination Connection ID (144)               ...
2141    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2142    |                    Packet Number (8/16/32)                    |
2143    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2144    |                      Random Octets (*)                      ...
2145    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2146    |                                                               |
2147    +                                                               +
2148    |                                                               |
2149    +                  Stateless Reset Token (128)                  +
2150    |                                                               |
2151    +                                                               +
2152    |                                                               |
2153    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

2155    This design ensures that a stateless reset packet is - to the extent
2156    possible - indistinguishable from a regular packet with a short
2157    header.

2159    A server generates a random 18-octet Destination Connection ID field.
2160    For a client that depends on the server including a connection ID,
2161    this will mean that this value differs from previous packets. This
2162    results in two problems:

2164    o  The packet might not reach the client. If the Destination
2165       Connection ID is critical for routing toward the client, then this
2166       packet could be incorrectly routed. This causes the stateless
2167       reset to be ineffective in causing errors to be quickly detected
2168       and recovered. In this case, clients will need to rely on other
2169       methods - such as timers - to detect that the connection has
2170       failed.

2172    o  The randomly generated connection ID can be used by entities other
2173       than the client to identify this as a potential stateless reset.
2174       A server that occasionally uses different connection IDs might
2175       introduce some uncertainty about this.
2177    The Packet Number field is set to a randomized value. The server
2178    SHOULD send a packet with a short header and a type of 0x1F. This
2179    produces the shortest possible packet number encoding, which
2180    minimizes the perceived gap between the last packet that the server
2181    sent and this packet. A server MAY use a different short header
2182    type, indicating a different packet number length, but a longer
2183    packet number encoding might allow this message to be identified as a
2184    stateless reset more easily using heuristics.

2186    After the Packet Number, the server pads the message with an
2187    arbitrary number of octets containing random values.

2189    Finally, the last 16 octets of the packet are set to the value of the
2190    Stateless Reset Token.

2192    A stateless reset is not appropriate for signaling error conditions.
2193    An endpoint that wishes to communicate a fatal connection error MUST
2194    use a CONNECTION_CLOSE or APPLICATION_CLOSE frame if it has
2195    sufficient state to do so.

2197    This stateless reset design is specific to QUIC version 1. A server
2198    that supports multiple versions of QUIC needs to generate a stateless
2199    reset that will be accepted by clients that support any version that
2200    the server might support (or might have supported prior to losing
2201    state). Designers of new versions of QUIC need to be aware of this
2202    and either reuse this design, or use a portion of the packet other
2203    than the last 16 octets for carrying data.

2205 6.9.4.1. Detecting a Stateless Reset

2207    A client detects a potential stateless reset when a packet with a
2208    short header either cannot be decrypted or is marked as a duplicate
2209    packet. The client then compares the last 16 octets of the packet
2210    with the Stateless Reset Token provided by the server in its
2211    transport parameters. If these values are identical, the client MUST
2212    enter the draining period and not send any further packets on this
2213    connection.
If the comparison fails, the packet can be discarded.

2215 6.9.4.2. Calculating a Stateless Reset Token

2217    The stateless reset token MUST be difficult to guess. In order to
2218    create a Stateless Reset Token, a server could randomly generate
2219    [RFC4086] a secret for every connection that it creates. However,
2220    this presents a coordination problem when there are multiple servers
2221    in a cluster or a storage problem for a server that might lose state.
2222    Stateless reset specifically exists to handle the case where state is
2223    lost, so this approach is suboptimal.

2225    A single static key can be used across all connections to the same
2226    endpoint by generating the proof using a second iteration of a
2227    preimage-resistant function that takes three inputs: the static key,
2228    the server's connection ID (see Section 4.7), and an identifier for
2229    the server instance. A server could use HMAC [RFC2104] (for example,
2230    HMAC(static_key, server_id || connection_id)) or HKDF [RFC5869] (for
2231    example, using the static key as input keying material, with server
2232    and connection identifiers as salt). The output of this function is
2233    truncated to 16 octets to produce the Stateless Reset Token for that
2234    connection.

2236    A server that loses state can use the same method to generate a valid
2237    Stateless Reset Token. The connection ID comes from the packet that
2238    the server receives.

2240    This design relies on the client always sending a connection ID in
2241    its packets so that the server can use the connection ID from a
2242    packet to reset the connection. A server that uses this design
2243    cannot allow clients to use a zero-length connection ID.

2245    Revealing the Stateless Reset Token allows any entity to terminate
2246    the connection, so a value can only be used once.
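The HMAC option described in this section, together with the comparison from Section 6.9.4.1, might look like the following Python sketch. The choice of SHA-256 and a fixed-length server identifier are assumptions made here so that the concatenation is unambiguous; the draft names neither, and the function names are hypothetical.

```python
import hashlib
import hmac

TOKEN_LEN = 16  # a Stateless Reset Token is 128 bits


def stateless_reset_token(static_key, server_id, connection_id):
    """HMAC(static_key, server_id || connection_id), truncated to 16 octets.

    SHA-256 is an assumption; server_id is assumed to be fixed-length.
    """
    mac = hmac.new(static_key, server_id + connection_id, hashlib.sha256)
    return mac.digest()[:TOKEN_LEN]


def is_stateless_reset(packet, expected_token):
    """Client side: compare the last 16 octets with the known token."""
    if len(packet) < TOKEN_LEN:
        return False
    # Constant-time comparison avoids leaking how many octets matched.
    return hmac.compare_digest(packet[-TOKEN_LEN:], expected_token)
```

A server that has lost state can recompute the same token from the connection ID of an incoming packet, because the only other inputs are the static key and its own identifier.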
This method for
2247    choosing the Stateless Reset Token means that the combination of
2248    server instance, connection ID, and static key cannot occur for
2249    another connection. A connection ID from a connection that is reset
2250    by revealing the Stateless Reset Token cannot be reused for new
2251    connections at the same server without first changing to use a
2252    different static key or server identifier.

2254    Note that Stateless Reset messages do not have any cryptographic
2255    protection.

2257 7. Frame Types and Formats

2259    As described in Section 5, packets contain one or more frames. This
2260    section describes the format and semantics of the core QUIC frame
2261    types.

2263 7.1. Variable-Length Integer Encoding

2265    QUIC frames use a common variable-length encoding for all
2266    non-negative integer values. This encoding ensures that smaller integer
2267    values need fewer octets to encode.

2269    The QUIC variable-length integer encoding reserves the two most
2270    significant bits of the first octet to encode the base 2 logarithm of
2271    the integer encoding length in octets. The integer value is encoded
2272    on the remaining bits, in network byte order.

2274    This means that integers are encoded on 1, 2, 4, or 8 octets and can
2275    encode 6, 14, 30, or 62 bit values respectively. Table 4 summarizes
2276    the encoding properties.
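The encoding described above is compact enough to sketch directly. The Python functions below are an illustration, not normative text; the names `decode_varint` and `encode_varint` are hypothetical.

```python
def decode_varint(data):
    """Decode one variable-length integer from the start of data.

    Returns (value, octets_consumed).
    """
    if not data:
        raise ValueError("no data")
    first = data[0]
    length = 1 << (first >> 6)        # 2-bit prefix is log2 of the length
    if len(data) < length:
        raise ValueError("truncated integer")
    value = first & 0x3F              # remaining 6 bits of the first octet
    for octet in data[1:length]:
        value = (value << 8) | octet  # network byte order
    return value, length


def encode_varint(value):
    """Encode value using the shortest of the 1, 2, 4, or 8 octet forms."""
    if value < 0:
        raise ValueError("negative value")
    for prefix, length in enumerate((1, 2, 4, 8)):
        usable_bits = 8 * length - 2
        if value < (1 << usable_bits):
            return ((prefix << usable_bits) | value).to_bytes(length, "big")
    raise ValueError("value does not fit in 62 bits")
```

The decoder accepts non-minimal encodings, which matches the draft's example of the two-octet sequence 40 25 decoding to the same value as the single octet 25; the encoder always picks the shortest form.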
2278       +------+--------+-------------+-----------------------+
2279       | 2Bit | Length | Usable Bits | Range                 |
2280       +------+--------+-------------+-----------------------+
2281       | 00   | 1      | 6           | 0-63                  |
2282       |      |        |             |                       |
2283       | 01   | 2      | 14          | 0-16383               |
2284       |      |        |             |                       |
2285       | 10   | 4      | 30          | 0-1073741823          |
2286       |      |        |             |                       |
2287       | 11   | 8      | 62          | 0-4611686018427387903 |
2288       +------+--------+-------------+-----------------------+

2290                 Table 4: Summary of Integer Encodings

2292    For example, the eight octet sequence c2 19 7c 5e ff 14 e8 8c (in
2293    hexadecimal) decodes to the decimal value 151288809941952652; the
2294    four octet sequence 9d 7f 3e 7d decodes to 494878333; the two octet
2295    sequence 7b bd decodes to 15293; and the single octet 25 decodes to
2296    37 (as does the two octet sequence 40 25).

2298    Error codes (Section 11.3) are described using integers, but do not
2299    use this encoding.

2301 7.2. PADDING Frame

2303    The PADDING frame (type=0x00) has no semantic value. PADDING frames
2304    can be used to increase the size of a packet. Padding can be used to
2305    increase an initial client packet to the minimum required size, or to
2306    provide protection against traffic analysis for protected packets.

2308    A PADDING frame has no content. That is, a PADDING frame consists of
2309    the single octet that identifies the frame as a PADDING frame.

2311 7.3. RST_STREAM Frame

2313    An endpoint may use a RST_STREAM frame (type=0x01) to abruptly
2314    terminate a stream.

2316    After sending a RST_STREAM, an endpoint ceases transmission and
2317    retransmission of STREAM frames on the identified stream. A receiver
2318    of RST_STREAM can discard any data that it already received on that
2319    stream.

2321    An endpoint that receives a RST_STREAM frame for a send-only stream
2322    MUST terminate the connection with error PROTOCOL_VIOLATION.
2324    The RST_STREAM frame is as follows:

2326     0                   1                   2                   3
2327     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
2328    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2329    |                        Stream ID (i)                        ...
2330    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2331    |  Application Error Code (16)  |
2332    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2333    |                      Final Offset (i)                       ...
2334    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

2336    The fields are:

2338    Stream ID: A variable-length integer encoding of the Stream ID of
2339       the stream being terminated.

2341    Application Error Code: A 16-bit application protocol error
2342       code (see Section 11.4) which indicates why the stream is being
2343       closed.

2345    Final Offset: A variable-length integer indicating the absolute byte
2346       offset of the end of data written on this stream by the RST_STREAM
2347       sender.

2349 7.4. CONNECTION_CLOSE frame

2351    An endpoint sends a CONNECTION_CLOSE frame (type=0x02) to notify its
2352    peer that the connection is being closed. CONNECTION_CLOSE is used
2353    to signal errors at the QUIC layer, or the absence of errors (with
2354    the NO_ERROR code).

2356    If there are open streams that haven't been explicitly closed, they
2357    are implicitly closed when the connection is closed.

2359    The CONNECTION_CLOSE frame is as follows:

2361     0                   1                   2                   3
2362     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
2363    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2364    |        Error Code (16)        |  Reason Phrase Length (i)   ...
2365    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2366    |                      Reason Phrase (*)                      ...
2367    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

2369    The fields of a CONNECTION_CLOSE frame are as follows:

2371    Error Code: A 16-bit error code which indicates the reason for
2372       closing this connection.
CONNECTION_CLOSE uses codes from the
2373       space defined in Section 11.3 (APPLICATION_CLOSE uses codes from
2374       the application protocol error code space, see Section 11.4).

2376    Reason Phrase Length: A variable-length integer specifying the
2377       length of the reason phrase in bytes. Note that a
2378       CONNECTION_CLOSE frame cannot be split between packets, so in
2379       practice any limits on packet size will also limit the space
2380       available for a reason phrase.

2382    Reason Phrase: A human-readable explanation for why the connection
2383       was closed. This can be zero length if the sender chooses to not
2384       give details beyond the Error Code. This SHOULD be a UTF-8
2385       encoded string [RFC3629].

2387 7.5. APPLICATION_CLOSE frame

2389    An APPLICATION_CLOSE frame (type=0x03) uses the same format as the
2390    CONNECTION_CLOSE frame (Section 7.4), except that it uses error codes
2391    from the application protocol error code space (Section 11.4) instead
2392    of the transport error code space.

2394    Other than the error code space, the format and semantics of the
2395    APPLICATION_CLOSE frame are identical to the CONNECTION_CLOSE frame.

2397 7.6. MAX_DATA Frame

2399    The MAX_DATA frame (type=0x04) is used in flow control to inform the
2400    peer of the maximum amount of data that can be sent on the connection
2401    as a whole.

2403    The frame is as follows:

2405     0                   1                   2                   3
2406     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
2407    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2408    |                      Maximum Data (i)                       ...
2409    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

2411    The fields in the MAX_DATA frame are as follows:

2413    Maximum Data: A variable-length integer indicating the maximum
2414       amount of data that can be sent on the entire connection, in units
2415       of octets.

2417    All data sent in STREAM frames counts toward this limit, with the
2418    exception of data on stream 0.
The sum of the largest received
2419    offsets on all streams - including streams in terminal states, but
2420    excluding stream 0 - MUST NOT exceed the value advertised by a
2421    receiver. An endpoint MUST terminate a connection with a
2422    QUIC_FLOW_CONTROL_RECEIVED_TOO_MUCH_DATA error if it receives more
2423    data than the maximum data value that it has sent, unless this is a
2424    result of a change in the initial limits (see Section 6.4.2).

2426 7.7. MAX_STREAM_DATA Frame

2428    The MAX_STREAM_DATA frame (type=0x05) is used in flow control to
2429    inform a peer of the maximum amount of data that can be sent on a
2430    stream.

2432    An endpoint that receives a MAX_STREAM_DATA frame for a receive-only
2433    stream MUST terminate the connection with error PROTOCOL_VIOLATION.

2435    An endpoint that receives a MAX_STREAM_DATA frame for a send-only
2436    stream it has not opened MUST terminate the connection with error
2437    PROTOCOL_VIOLATION.

2439    Note that an endpoint may legally receive a MAX_STREAM_DATA frame on
2440    a bidirectional stream it has not opened.

2442    The frame is as follows:

2444     0                   1                   2                   3
2445     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
2446    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2447    |                        Stream ID (i)                        ...
2448    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2449    |                   Maximum Stream Data (i)                   ...
2450    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

2452    The fields in the MAX_STREAM_DATA frame are as follows:

2454    Stream ID: The stream ID of the stream that is affected encoded as a
2455       variable-length integer.

2457    Maximum Stream Data: A variable-length integer indicating the
2458       maximum amount of data that can be sent on the identified stream,
2459       in units of octets.

2461    When counting data toward this limit, an endpoint accounts for the
2462    largest received offset of data that is sent or received on the
2463    stream.
Loss or reordering can mean that the largest received offset
2464    on a stream can be greater than the total size of data received on
2465    that stream. Receiving STREAM frames might not increase the largest
2466    received offset.

2468    The data sent on a stream MUST NOT exceed the largest maximum stream
2469    data value advertised by the receiver. An endpoint MUST terminate a
2470    connection with a FLOW_CONTROL_ERROR error if it receives more data
2471    than the largest maximum stream data that it has sent for the
2472    affected stream, unless this is a result of a change in the initial
2473    limits (see Section 6.4.2).

2475 7.8. MAX_STREAM_ID Frame

2477    The MAX_STREAM_ID frame (type=0x06) informs the peer of the maximum
2478    stream ID that they are permitted to open.

2480    The frame is as follows:

2482     0                   1                   2                   3
2483     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
2484    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2485    |                    Maximum Stream ID (i)                    ...
2486    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

2488    The fields in the MAX_STREAM_ID frame are as follows:

2490    Maximum Stream ID: ID of the maximum unidirectional or bidirectional
2491       peer-initiated stream ID for the connection encoded as a
2492       variable-length integer. The limit applies to unidirectional streams
2493       if the second least significant bit of the stream ID is 1, and applies
2494       to bidirectional streams if it is 0.

2496    Loss or reordering can mean that a MAX_STREAM_ID frame can be
2497    received which states a lower stream limit than the client has
2498    previously received. MAX_STREAM_ID frames which do not increase the
2499    maximum stream ID MUST be ignored.

2501    A peer MUST NOT initiate a stream with a higher stream ID than the
2502    greatest maximum stream ID it has received.
An endpoint MUST
2503    terminate a connection with a STREAM_ID_ERROR error if a peer
2504    initiates a stream with a higher stream ID than it has sent, unless
2505    this is a result of a change in the initial limits (see
2506    Section 6.4.2).

2508 7.9. PING Frame

2510    Endpoints can use PING frames (type=0x07) to verify that their peers
2511    are still alive or to check reachability to the peer. The PING frame
2512    contains no additional fields.

2514    The receiver of a PING frame simply needs to acknowledge the packet
2515    containing this frame.

2517    The PING frame can be used to keep a connection alive when an
2518    application or application protocol wishes to prevent the connection
2519    from timing out. An application protocol SHOULD provide guidance
2520    about the conditions under which generating a PING is recommended.
2521    This guidance SHOULD indicate whether it is the client or the server
2522    that is expected to send the PING. Having both endpoints send PING
2523    frames without coordination can produce an excessive number of
2524    packets and poor performance.

2526    A connection will time out if no packets are sent or received for a
2527    period longer than the time specified in the idle_timeout transport
2528    parameter (see Section 6.9). However, state in middleboxes might
2529    time out earlier than that. Though REQ-5 in [RFC4787] recommends a 2
2530    minute timeout interval, experience shows that sending packets every
2531    15 to 30 seconds is necessary to prevent the majority of middleboxes
2532    from losing state for UDP flows.

2534 7.10. BLOCKED Frame

2536    A sender SHOULD send a BLOCKED frame (type=0x08) when it wishes to
2537    send data, but is unable to due to connection-level flow control (see
2538    Section 10.2.1). BLOCKED frames can be used as input to tuning of
2539    flow control algorithms (see Section 10.1.2).
2541    The BLOCKED frame is as follows:

2543     0                   1                   2                   3
2544     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
2545    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2546    |                         Offset (i)                          ...
2547    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

2549    The BLOCKED frame contains a single field.

2551    Offset: A variable-length integer indicating the connection-level
2552       offset at which the blocking occurred.

2554 7.11. STREAM_BLOCKED Frame

2556    A sender SHOULD send a STREAM_BLOCKED frame (type=0x09) when it
2557    wishes to send data, but is unable to due to stream-level flow
2558    control. This frame is analogous to BLOCKED (Section 7.10).

2560    An endpoint that receives a STREAM_BLOCKED frame for a send-only
2561    stream MUST terminate the connection with error PROTOCOL_VIOLATION.

2563    The STREAM_BLOCKED frame is as follows:

2565     0                   1                   2                   3
2566     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
2567    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2568    |                        Stream ID (i)                        ...
2569    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2570    |                         Offset (i)                          ...
2571    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

2573    The STREAM_BLOCKED frame contains two fields:

2575    Stream ID: A variable-length integer indicating the stream which is
2576       flow control blocked.

2578    Offset: A variable-length integer indicating the offset of the
2579       stream at which the blocking occurred.

2581 7.12. STREAM_ID_BLOCKED Frame

2583    A sender MAY send a STREAM_ID_BLOCKED frame (type=0x0a) when it
2584    wishes to open a stream, but is unable to due to the maximum stream
2585    ID limit set by its peer (see Section 7.8). This does not open the
2586    stream, but informs the peer that a new stream was needed, but the
2587    stream limit prevented the creation of the stream.
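The interaction between the stream ID limit of Section 7.8 and STREAM_ID_BLOCKED can be sketched as follows. This Python example is a hypothetical illustration: the class name, the dictionary standing in for a frame, and the initial limits (which would come from transport parameters) are assumptions.

```python
class StreamIdLimits:
    """Tracks the peer's MAX_STREAM_ID limits for streams we open."""

    def __init__(self, initial_max_bidi, initial_max_uni):
        # Keyed by "is unidirectional".
        self.max_id = {False: initial_max_bidi, True: initial_max_uni}

    @staticmethod
    def is_unidirectional(stream_id):
        # The second least significant bit selects the stream type.
        return bool(stream_id & 0x2)

    def on_max_stream_id(self, max_stream_id):
        """Apply a received MAX_STREAM_ID; non-increasing values are ignored."""
        uni = self.is_unidirectional(max_stream_id)
        if max_stream_id > self.max_id[uni]:
            self.max_id[uni] = max_stream_id

    def blocked_frame_for(self, stream_id):
        """Return None if stream_id may be opened, else a
        STREAM_ID_BLOCKED description carrying the current limit."""
        uni = self.is_unidirectional(stream_id)
        if stream_id <= self.max_id[uni]:
            return None
        return {"frame": "STREAM_ID_BLOCKED", "stream_id": self.max_id[uni]}
```

Reordered MAX_STREAM_ID frames are harmless under this scheme because a lower value never replaces a higher one, and the two stream types are tracked independently.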
2589    The STREAM_ID_BLOCKED frame is as follows:

2591     0                   1                   2                   3
2592     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
2593    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2594    |                        Stream ID (i)                        ...
2595    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

2597    The STREAM_ID_BLOCKED frame contains a single field.

2599    Stream ID: A variable-length integer indicating the highest stream
2600       ID that the sender was permitted to open.

2602 7.13. NEW_CONNECTION_ID Frame

2604    An endpoint sends a NEW_CONNECTION_ID frame (type=0x0b) to provide
2605    its peer with alternative connection IDs that can be used to break
2606    linkability when migrating connections (see Section 6.8.5).

2608    The NEW_CONNECTION_ID frame is as follows:

2610     0                   1                   2                   3
2611     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
2612    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2613    |                        Sequence (i)                         ...
2614    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2615    |  Length (8)   |           Connection ID (32..144)           ...
2616    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2617    |                                                               |
2618    +                                                               +
2619    |                                                               |
2620    +                  Stateless Reset Token (128)                  +
2621    |                                                               |
2622    +                                                               +
2623    |                                                               |
2624    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

2626    The fields are:

2628    Sequence: A variable-length integer. This value starts at 0 and
2629       increases by 1 for each connection ID that is provided by the
2630       server. The connection ID that is assigned during the handshake
2631       is assumed to have a sequence of -1. That is, the value selected
2632       during the handshake comes immediately before the first value that
2633       a server can send.

2635    Length: An 8-bit unsigned integer containing the length of the
2636       connection ID. Values less than 4 and greater than 18 are invalid
2637       and MUST be treated as a connection error of type
2638       PROTOCOL_VIOLATION.

2640    Connection ID: A connection ID of the specified length.
2642    Stateless Reset Token: A 128-bit value that will be used for a
2643       stateless reset when the associated connection ID is used (see
2644       Section 6.9.4).

2646    An endpoint MUST NOT send this frame if it currently requires that
2647    its peer send packets with a zero-length Destination Connection ID.
2648    Changing the length of a connection ID to or from zero-length makes
2649    it difficult to identify when the value of the connection ID changed.
2650    An endpoint that is sending packets with a zero-length Destination
2651    Connection ID MUST treat receipt of a NEW_CONNECTION_ID frame as a
2652    connection error of type PROTOCOL_VIOLATION.

2654 7.14. STOP_SENDING Frame

2656    An endpoint may use a STOP_SENDING frame (type=0x0c) to communicate
2657    that incoming data is being discarded on receipt at application
2658    request. This signals a peer to abruptly terminate transmission on a
2659    stream.

2661    Receipt of a STOP_SENDING frame is only valid for a send stream that
2662    exists and is not in the "Ready" state (see Section 9.2.1).
2663    Receiving a STOP_SENDING frame for a send stream that is "Ready" or
2664    non-existent MUST be treated as a connection error of type
2665    PROTOCOL_VIOLATION. An endpoint that receives a STOP_SENDING frame
2666    for a receive-only stream MUST terminate the connection with error
2667    PROTOCOL_VIOLATION.

2669    The STOP_SENDING frame is as follows:

2671     0                   1                   2                   3
2672     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
2673    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2674    |                        Stream ID (i)                        ...
2675    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2676    |  Application Error Code (16)  |
2677    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

2679    The fields are:

2681    Stream ID: A variable-length integer carrying the Stream ID of the
2682       stream being ignored.

2684    Application Error Code: A 16-bit, application-specified reason the
2685       sender is ignoring the stream (see Section 11.4).

2687 7.15. ACK Frame

2689    Receivers send ACK frames (type=0x0d) to inform senders which packets
2690    they have received and processed. A sent packet that has never been
2691    acknowledged is missing. The ACK frame contains any number of ACK
2692    blocks. ACK blocks are ranges of acknowledged packets.

2694    Unlike TCP SACKs, QUIC acknowledgements are irrevocable. Once a
2695    packet has been acknowledged, even if it does not appear in a future
2696    ACK frame, it remains acknowledged.

2698    A client MUST NOT acknowledge Retry packets. Retry packets include
2699    the packet number from the Initial packet it responds to. Version
2700    Negotiation packets cannot be acknowledged because they do not
2701    contain a packet number. Rather than relying on ACK frames, these
2702    packets are implicitly acknowledged by the next Initial packet sent
2703    by the client.

2705    An ACK frame is shown below.

2707     0                   1                   2                   3
2708     0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
2709    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2710    |                  Largest Acknowledged (i)                   ...
2711    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2712    |                        ACK Delay (i)                        ...
2713    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2714    |                     ACK Block Count (i)                     ...
2715    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
2716    |                       ACK Blocks (*)                        ...
2717    +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

2719                        Figure 7: ACK Frame Format

2721    The fields in the ACK frame are as follows:

2723    Largest Acknowledged: A variable-length integer representing the
2724       largest packet number the peer is acknowledging; this is usually
2725       the largest packet number that the peer has received prior to
2726       generating the ACK frame. Unlike the packet number in the QUIC
2727       long or short header, the value in an ACK frame is not truncated.
   ACK Delay: A variable-length integer encoding the time in
      microseconds from when the largest acknowledged packet, as
      indicated in the Largest Acknowledged field, was received by this
      peer to when this ACK was sent. The value of the ACK Delay field
      is scaled by multiplying the encoded value by 2 to the power of
      the value of the "ack_delay_exponent" transport parameter set by
      the sender of the ACK frame. The "ack_delay_exponent" defaults to
      3, or a multiplier of 8 (see Section 6.4.1). Scaling in this
      fashion allows for a larger range of values with a shorter
      encoding at the cost of lower resolution.

   ACK Block Count: The number of Additional ACK Block (and Gap) fields
      after the First ACK Block.

   ACK Blocks: Contains one or more blocks of packet numbers which have
      been successfully received (see Section 7.15.1).

7.15.1. ACK Block Section

   The ACK Block Section consists of alternating Gap and ACK Block
   fields in descending packet number order. A First ACK Block field is
   followed by a variable number of alternating Gap and Additional ACK
   Blocks. The number of Gap and Additional ACK Block fields is
   determined by the ACK Block Count field.

   Gap and ACK Block fields use a relative integer encoding for
   efficiency. Though each encoded value is non-negative, the values
   are subtracted, so that each ACK Block describes progressively
   lower-numbered packets. As long as contiguous ranges of packets are
   small, the variable-length integer encoding ensures that each range
   can be expressed in a small number of octets.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                     First ACK Block (i)                     ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           Gap (i)                           ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                  Additional ACK Block (i)                   ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           Gap (i)                           ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                  Additional ACK Block (i)                   ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
                                  ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                           Gap (i)                           ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                  Additional ACK Block (i)                   ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                       Figure 8: ACK Block Section

   Each ACK Block acknowledges a contiguous range of packets by
   indicating the number of acknowledged packets that precede the
   largest packet number in that block. A value of zero indicates that
   only the largest packet number is acknowledged. Larger ACK Block
   values indicate a larger range, with corresponding lower values for
   the smallest packet number in the range. Thus, given a largest
   packet number for the ACK, the smallest value is determined by the
   formula:

      smallest = largest - ack_block

   The range of packets acknowledged by the ACK Block extends from the
   smallest packet number to the largest, inclusive.

   The largest value for the First ACK Block is determined by the
   Largest Acknowledged field; the largest for Additional ACK Blocks is
   determined by cumulatively subtracting the size of all preceding ACK
   Blocks and Gaps.

   Each Gap indicates a range of packets that are not being
   acknowledged. The number of packets in the gap is one higher than
   the encoded value of the Gap field.

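   As a worked illustration of the arithmetic in this section, the
   sketch below expands an ACK Block Section into inclusive packet
   number ranges. Because a Gap with encoded value g covers g + 1
   unacknowledged packets, the block after it starts at
   previous_smallest - g - 2, which is the formula given next. The
   function name, its arguments (already-decoded integers), and the use
   of ValueError to stand in for the FRAME_ERROR connection error are
   assumptions made for this example only.

```python
def ack_ranges(largest_acked, first_ack_block, gap_block_pairs):
    """Expand decoded ACK fields into (smallest, largest) ranges.

    gap_block_pairs is a list of (gap, additional_ack_block) tuples in
    the order they appear on the wire. Ranges are returned highest first.
    """
    largest = largest_acked
    smallest = largest - first_ack_block
    if smallest < 0:
        # Stand-in for a connection error of type FRAME_ERROR.
        raise ValueError("negative packet number in First ACK Block")
    ranges = [(smallest, largest)]

    for gap, ack_block in gap_block_pairs:
        # The gap skips gap + 1 packets below the previous smallest.
        largest = smallest - gap - 2
        smallest = largest - ack_block
        if largest < 0 or smallest < 0:
            raise ValueError("negative packet number in ACK Block")
        ranges.append((smallest, largest))
    return ranges
```

   For example, Largest Acknowledged 100 with a First ACK Block of 2,
   then (Gap 0, Block 3) and (Gap 4, Block 0), acknowledges packets
   98-100, 93-96 and 87, leaving 97 and 88-92 unacknowledged.
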
   The value of the Gap field establishes the largest packet number
   value for the ACK block that follows the gap using the following
   formula:

      largest = previous_smallest - gap - 2

   If the calculated value for largest or smallest packet number for any
   ACK Block is negative, an endpoint MUST generate a connection error
   of type FRAME_ERROR indicating an error in an ACK frame (that is,
   0x10d).

   The fields in the ACK Block Section are:

   First ACK Block: A variable-length integer indicating the number of
      contiguous packets preceding the Largest Acknowledged that are
      being acknowledged.

   Gap (repeated): A variable-length integer indicating the number of
      contiguous unacknowledged packets preceding the packet number one
      lower than the smallest in the preceding ACK Block.

   ACK Block (repeated): A variable-length integer indicating the
      number of contiguous acknowledged packets preceding the largest
      packet number, as determined by the preceding Gap.

7.15.2. Sending ACK Frames

   Implementations MUST NOT generate packets that only contain ACK
   frames in response to packets which only contain ACK frames.
   However, they MUST acknowledge packets containing only ACK frames
   when sending ACK frames in response to other packets.
   Implementations MUST NOT send more than one packet containing only
   ACK frames per received packet that contains frames other than ACK
   frames. Packets containing non-ACK frames MUST be acknowledged
   immediately or when a delayed ack timer expires.

   To limit ACK blocks to those that have not yet been received by the
   sender, the receiver SHOULD track which ACK frames have been
   acknowledged by its peer. Once an ACK frame has been acknowledged,
   the packets it acknowledges SHOULD NOT be acknowledged again.

   Because ACK frames are not sent in response to ACK-only packets, a
   receiver that is only sending ACK frames will only receive
   acknowledgements for its packets if the sender includes them in
   packets with non-ACK frames. A sender SHOULD bundle ACK frames with
   other frames when possible.

   To limit receiver state or the size of ACK frames, a receiver MAY
   limit the number of ACK blocks it sends. A receiver can do this even
   without receiving acknowledgment of its ACK frames, with the
   knowledge this could cause the sender to unnecessarily retransmit
   some data. Standard QUIC [QUIC-RECOVERY] algorithms declare packets
   lost after sufficiently newer packets are acknowledged. Therefore,
   the receiver SHOULD repeatedly acknowledge newly received packets in
   preference to packets received in the past.

7.15.3. ACK Frames and Packet Protection

   ACK frames that acknowledge protected packets MUST be carried in a
   packet that has an equivalent or greater level of packet protection.

   Packets that are protected with 1-RTT keys MUST be acknowledged in
   packets that are also protected with 1-RTT keys.

   A packet that is not protected and claims to acknowledge a packet
   number that was sent with packet protection is not valid. An
   unprotected packet that carries acknowledgments for protected packets
   MUST be discarded in its entirety.

   Packets that a client sends with 0-RTT packet protection MUST be
   acknowledged by the server in packets protected by 1-RTT keys. This
   can mean that the client is unable to use these acknowledgments if
   the server cryptographic handshake messages are delayed or lost.
   Note that the same limitation applies to other data sent by the
   server protected by the 1-RTT keys.

   Unprotected packets, such as those that carry the initial
   cryptographic handshake messages, MAY be acknowledged in unprotected
   packets.
   Unprotected packets are vulnerable to falsification or modification.
   Unprotected packets can be acknowledged along with protected packets
   in a protected packet.

   An endpoint SHOULD acknowledge packets containing cryptographic
   handshake messages in the next unprotected packet that it sends,
   unless it is able to acknowledge those packets in later packets
   protected by 1-RTT keys. At the completion of the cryptographic
   handshake, both peers send unprotected packets containing
   cryptographic handshake messages followed by packets protected by
   1-RTT keys. An endpoint SHOULD acknowledge the unprotected packets
   that complete the cryptographic handshake in a protected packet,
   because its peer is guaranteed to have access to 1-RTT packet
   protection keys.

   For instance, a server acknowledges a TLS ClientHello in the packet
   that carries the TLS ServerHello; similarly, a client can acknowledge
   a TLS HelloRetryRequest in the packet containing a second TLS
   ClientHello. The complete set of server handshake messages (TLS
   ServerHello through to Finished) might be acknowledged by a client in
   protected packets, because it is certain that the server is able to
   decipher the packet.

7.16. PATH_CHALLENGE Frame

   Endpoints can use PATH_CHALLENGE frames (type=0x0e) to check
   reachability to the peer and for path validation during connection
   establishment and connection migration.

   PATH_CHALLENGE frames contain an 8-byte payload.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   +                           Data (8)                            +
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Data: This 8-byte field contains arbitrary data.

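   A minimal sketch of the challenge and echo exchange follows. It
   assumes the frame type is carried in a single leading byte and uses
   Python's "secrets" module as one way to obtain unpredictable Data;
   the helper names are hypothetical and not part of this
   specification.

```python
import hmac
import secrets

PATH_CHALLENGE = 0x0E  # frame type values as given in the text
PATH_RESPONSE = 0x0F


def new_path_challenge():
    """Return (data, frame) for a fresh PATH_CHALLENGE."""
    data = secrets.token_bytes(8)  # 8 unpredictable bytes
    return data, bytes([PATH_CHALLENGE]) + data


def make_path_response(challenge_data):
    """Build the PATH_RESPONSE that echoes the received Data."""
    if len(challenge_data) != 8:
        raise ValueError("Data must be exactly 8 bytes")
    return bytes([PATH_RESPONSE]) + challenge_data


def response_matches(outstanding, response_data):
    """True if response_data echoes any challenge still outstanding."""
    return any(hmac.compare_digest(sent, response_data)
               for sent in outstanding)
```

   The sender keeps each Data value it has sent until a matching
   response arrives, so that a PATH_RESPONSE can be checked against
   challenges it actually issued.
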
   A PATH_CHALLENGE frame containing 8 octets that are hard to guess is
   sufficient to ensure that it is easier to receive the packet than it
   is to guess the value correctly.

   The recipient of this frame MUST generate a PATH_RESPONSE frame
   (Section 7.17) containing the same Data.

7.17. PATH_RESPONSE Frame

   The PATH_RESPONSE frame (type=0x0f) is sent in response to a
   PATH_CHALLENGE frame. Its format is identical to the PATH_CHALLENGE
   frame (Section 7.16).

   If the content of a PATH_RESPONSE frame does not match the content of
   a PATH_CHALLENGE frame previously sent by the endpoint, the endpoint
   MAY generate a connection error of type UNSOLICITED_PATH_RESPONSE.

7.18. STREAM Frames

   STREAM frames implicitly create a stream and carry stream data. The
   STREAM frame takes the form 0b00010XXX (or the set of values from
   0x10 to 0x17). The value of the three low-order bits of the frame
   type determines the fields that are present in the frame.

   o  The OFF bit (0x04) in the frame type is set to indicate that there
      is an Offset field present. When set to 1, the Offset field is
      present; when set to 0, the Offset field is absent and the Stream
      Data starts at an offset of 0 (that is, the frame contains the
      first octets of the stream, or the end of a stream that includes
      no data).

   o  The LEN bit (0x02) in the frame type is set to indicate that there
      is a Length field present. If this bit is set to 0, the Length
      field is absent and the Stream Data field extends to the end of
      the packet. If this bit is set to 1, the Length field is present.

   o  The FIN bit (0x01) of the frame type is set only on frames that
      contain the final offset of the stream. Setting this bit
      indicates that the frame marks the end of the stream.

   An endpoint that receives a STREAM frame for a send-only stream MUST
   terminate the connection with error PROTOCOL_VIOLATION.

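   The three flag bits described above can be read off the frame type
   with simple masks. The following is an illustrative sketch; the
   function name and the returned dictionary keys are hypothetical.

```python
STREAM_TYPE_BASE = 0x10   # 0b00010XXX
OFF_BIT = 0x04
LEN_BIT = 0x02
FIN_BIT = 0x01


def parse_stream_type(frame_type):
    """Decode the OFF, LEN and FIN bits of a STREAM frame type byte."""
    if frame_type & 0xF8 != STREAM_TYPE_BASE:
        raise ValueError("not a STREAM frame type (expected 0x10-0x17)")
    return {
        "off": bool(frame_type & OFF_BIT),  # Offset field present
        "len": bool(frame_type & LEN_BIT),  # Length field present
        "fin": bool(frame_type & FIN_BIT),  # frame ends the stream
    }
```

   A type of 0x10 therefore describes a frame that starts at offset 0
   and runs to the end of the packet, while 0x17 carries explicit Offset
   and Length fields and marks the end of the stream.
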
   A STREAM frame is shown below.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        Stream ID (i)                        ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        [Offset (i)]                         ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        [Length (i)]                         ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                       Stream Data (*)                       ...
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

                      Figure 9: STREAM Frame Format

   The STREAM frame contains the following fields:

   Stream ID: A variable-length integer indicating the stream ID of the
      stream (see Section 9.1).

   Offset: A variable-length integer specifying the byte offset in the
      stream for the data in this STREAM frame. This field is present
      when the OFF bit is set to 1. When the Offset field is absent,
      the offset is 0.

   Length: A variable-length integer specifying the length of the
      Stream Data field in this STREAM frame. This field is present
      when the LEN bit is set to 1. When the LEN bit is set to 0, the
      Stream Data field consumes all the remaining octets in the packet.

   Stream Data: The bytes from the designated stream to be delivered.

   A STREAM frame's Stream Data MUST NOT be empty, unless the offset is
   0 or the FIN bit is set. When the FIN flag is sent on an empty
   STREAM frame, the offset in the STREAM frame is the offset of the
   next byte that would be sent.

   The first byte in the stream has an offset of 0. The largest offset
   delivered on a stream - the sum of the re-constructed offset and data
   length - MUST be less than 2^62.

   Stream multiplexing is achieved by interleaving STREAM frames from
   multiple streams into one or more QUIC packets. A single QUIC packet
   can include multiple STREAM frames from one or more streams.

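   The two constraints on Stream Data stated above (when an empty frame
   is allowed, and the 2^62 bound on offsets) can be checked as in the
   sketch below. The function name is hypothetical, and raising
   ValueError is a placeholder for whatever error handling an
   implementation chooses.

```python
MAX_STREAM_OFFSET = 2 ** 62  # exclusive upper bound from the text


def check_stream_frame(offset, data_len, fin):
    """Validate a STREAM frame and return the offset just past its data."""
    if offset < 0 or data_len < 0:
        raise ValueError("offset and length cannot be negative")
    # Empty Stream Data is only allowed at offset 0 or with FIN set.
    if data_len == 0 and offset != 0 and not fin:
        raise ValueError("empty STREAM frame without FIN at non-zero offset")
    end = offset + data_len
    if end >= MAX_STREAM_OFFSET:
        raise ValueError("offset plus length must be less than 2^62")
    return end
```

   For an empty frame with the FIN bit set, the returned value equals
   the frame's offset, which matches the rule that such a frame carries
   the offset of the next byte that would have been sent.
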
   Implementation note: One of the benefits of QUIC is avoidance of
   head-of-line blocking across multiple streams. When a packet loss
   occurs, only streams with data in that packet are blocked waiting for
   a retransmission to be received, while other streams can continue
   making progress. Note that when data from multiple streams is
   bundled into a single QUIC packet, loss of that packet blocks all
   those streams from making progress. An implementation is therefore
   advised to bundle as few streams as necessary in outgoing packets
   without losing transmission efficiency to underfilled packets.

8. Packetization and Reliability

   A sender bundles one or more frames in a QUIC packet (see Section 5).

   A sender SHOULD minimize per-packet bandwidth and computational costs
   by bundling as many frames as possible within a QUIC packet. A
   sender MAY wait for a short period of time to bundle multiple frames
   before sending a packet that is not maximally packed, to avoid
   sending out large numbers of small packets. An implementation may
   use knowledge about application sending behavior or heuristics to
   determine whether and for how long to wait. This waiting period is
   an implementation decision, and an implementation should be careful
   to delay conservatively, since any delay is likely to increase
   application-visible latency.

8.1. Packet Processing and Acknowledgment

   A packet MUST NOT be acknowledged until packet protection has been
   successfully removed and all frames contained in the packet have been
   processed. Any stream state transitions triggered by the frame MUST
   have occurred. For STREAM frames, this means the data has been
   enqueued in preparation to be received by the application protocol,
   but it does not require that data is delivered and consumed.

   Once the packet has been fully processed, a receiver acknowledges
   receipt by sending one or more ACK frames containing the packet
   number of the received packet. To avoid creating an indefinite
   feedback loop, an endpoint MUST NOT send an ACK frame in response to
   a packet containing only ACK or PADDING frames, even if there are
   packet gaps which precede the received packet. The endpoint MUST
   acknowledge packets containing only ACK or PADDING frames in the next
   ACK frame that it sends.

   Strategies and implications of the frequency of generating
   acknowledgments are discussed in more detail in [QUIC-RECOVERY].

8.2. Retransmission of Information

   QUIC packets that are determined to be lost are not retransmitted
   whole. The same applies to the frames that are contained within lost
   packets. Instead, the information that might be carried in frames is
   sent again in new frames as needed.

   New frames and packets are used to carry information that is
   determined to have been lost. In general, information is sent again
   when a packet containing that information is determined to be lost
   and sending ceases when a packet containing that information is
   acknowledged.

   o  Application data sent in STREAM frames is retransmitted in new
      STREAM frames unless the endpoint has sent a RST_STREAM for that
      stream. Once an endpoint sends a RST_STREAM frame, no further
      STREAM frames are needed.

   o  The most recent set of acknowledgments is sent in ACK frames. An
      ACK frame SHOULD contain all unacknowledged acknowledgments, as
      described in Section 7.15.2.

   o  Cancellation of stream transmission, as carried in a RST_STREAM
      frame, is sent until acknowledged or until all stream data is
      acknowledged by the peer (that is, either the "Reset Recvd" or
      "Data Recvd" state is reached on the send stream).
      The content of a RST_STREAM frame MUST NOT change when it is sent
      again.

   o  Similarly, a request to cancel stream transmission, as encoded in
      a STOP_SENDING frame, is sent until the receive stream enters
      either a "Data Recvd" or "Reset Recvd" state (see Section 9.3).

   o  Connection close signals, including those that use
      CONNECTION_CLOSE and APPLICATION_CLOSE frames, are not sent again
      when packet loss is detected, but as described in Section 6.9.

   o  The current connection maximum data is sent in MAX_DATA frames.
      An updated value is sent in a MAX_DATA frame if the packet
      containing the most recently sent MAX_DATA frame is declared lost,
      or when the endpoint decides to update the limit. Care is
      necessary to avoid sending this frame too often as the limit can
      increase frequently and cause an unnecessarily large number of
      MAX_DATA frames to be sent.

   o  The current maximum stream data offset is sent in MAX_STREAM_DATA
      frames. Like MAX_DATA, an updated value is sent when the packet
      containing the most recent MAX_STREAM_DATA frame for a stream is
      lost or when the limit is updated, with care taken to prevent the
      frame from being sent too often. An endpoint SHOULD stop sending
      MAX_STREAM_DATA frames when the receive stream enters a "Size
      Known" state.

   o  The maximum stream ID for a stream of a given type is sent in
      MAX_STREAM_ID frames. Like MAX_DATA, an updated value is sent
      when a packet containing the most recent MAX_STREAM_ID frame for a
      stream type is declared lost or when the limit is updated, with
      care taken to prevent the frame from being sent too often.

   o  Blocked signals are carried in BLOCKED, STREAM_BLOCKED, and
      STREAM_ID_BLOCKED frames. BLOCKED frames have connection scope,
      STREAM_BLOCKED frames have stream scope, and STREAM_ID_BLOCKED
      frames are scoped to a specific stream type.
      New frames are sent if the packet containing the most recent frame
      for a scope is lost, but only while the endpoint is blocked on the
      corresponding limit. These frames always include the limit that
      is causing blocking at the time that they are transmitted.

   o  A liveness or path validation check using PATH_CHALLENGE frames is
      sent periodically until a matching PATH_RESPONSE frame is received
      or until there is no remaining need for liveness or path
      validation checking. PATH_CHALLENGE frames include a different
      payload each time they are sent.

   o  Responses to path validation using PATH_RESPONSE frames are sent
      just once. A new PATH_CHALLENGE frame will be sent if another
      PATH_RESPONSE frame is needed.

   o  New connection IDs are sent in NEW_CONNECTION_ID frames and
      retransmitted if the packet containing them is lost.

   o  PADDING frames contain no information, so lost PADDING frames do
      not require repair.

   Upon detecting losses, a sender MUST take appropriate congestion
   control action. The details of loss detection and congestion control
   are described in [QUIC-RECOVERY].

8.3. Packet Size

   The QUIC packet size includes the QUIC header and integrity check,
   but not the UDP or IP header.

   A client MUST pad any Initial packet it sends to have a QUIC packet
   size of at least 1200 octets. Sending an Initial packet of this size
   ensures that the network path supports a reasonably sized packet, and
   helps reduce the amplitude of amplification attacks caused by server
   responses toward an unverified client address.

   An Initial packet MAY exceed 1200 octets if the client knows that the
   Path Maximum Transmission Unit (PMTU) supports the size that it
   chooses.

   A server MAY send a CONNECTION_CLOSE frame with error code
   PROTOCOL_VIOLATION in response to an Initial packet smaller than 1200
   octets.
   It MUST NOT send any other frame type in response, or otherwise
   behave as if any part of the offending packet was processed as valid.

8.4. Path Maximum Transmission Unit

   The Path Maximum Transmission Unit (PMTU) is the maximum size of the
   entire IP header, UDP header, and UDP payload. The UDP payload
   includes the QUIC packet header, protected payload, and any
   authentication fields.

   All QUIC packets SHOULD be sized to fit within the estimated PMTU to
   avoid IP fragmentation or packet drops. To optimize bandwidth
   efficiency, endpoints SHOULD use Packetization Layer PMTU Discovery
   ([PLPMTUD]). Endpoints MAY use PMTU Discovery ([PMTUDv4], [PMTUDv6])
   for detecting the PMTU, setting the PMTU appropriately, and storing
   the result of previous PMTU determinations.

   In the absence of these mechanisms, QUIC endpoints SHOULD NOT send IP
   packets larger than 1280 octets. Assuming the minimum IP header
   size, this results in a QUIC packet size of 1232 octets for IPv6 and
   1252 octets for IPv4. Some QUIC implementations MAY wish to be more
   conservative in computing allowed QUIC packet size given unknown
   tunneling overheads or IP header options.

   QUIC endpoints that implement any kind of PMTU discovery SHOULD
   maintain an estimate for each combination of local and remote IP
   addresses. Each pairing of local and remote addresses could have a
   different MTU in the path.

   QUIC depends on the network path supporting an MTU of at least 1280
   octets. This is the IPv6 minimum MTU and therefore also supported by
   most modern IPv4 networks. An endpoint MUST NOT reduce its MTU below
   this number, even if it receives signals that indicate a smaller
   limit might exist.

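   The 1232 and 1252 octet figures quoted in this section follow from
   subtracting the minimum IP header (40 octets for IPv6, 20 for IPv4)
   and the 8-octet UDP header from a 1280-octet IP packet. The sketch
   below reproduces that arithmetic; the function name and the
   "extra_overhead" parameter, which models tunneling or IP option
   overhead, are hypothetical.

```python
UDP_HEADER_SIZE = 8
MIN_IP_HEADER_SIZE = {4: 20, 6: 40}  # minimum header size per IP version


def max_quic_packet_size(ip_version, ip_packet_size=1280, extra_overhead=0):
    """Largest QUIC packet that fits in an IP packet of the given size."""
    ip_header = MIN_IP_HEADER_SIZE[ip_version]
    size = ip_packet_size - ip_header - UDP_HEADER_SIZE - extra_overhead
    if size <= 0:
        raise ValueError("no room left for a QUIC packet")
    return size
```

   Both results leave room for the 1200-octet minimum size of a client
   Initial packet required by Section 8.3.
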
   If a QUIC endpoint determines that the PMTU between any pair of local
   and remote IP addresses has fallen below 1280 octets, it MUST
   immediately cease sending QUIC packets on the affected path. This
   could result in termination of the connection if an alternative path
   cannot be found.

8.4.1. Special Considerations for PMTU Discovery

   Traditional ICMP-based path MTU discovery in IPv4 [RFC1191] is
   potentially vulnerable to off-path attacks that successfully guess
   the IP/port 4-tuple and reduce the MTU to a bandwidth-inefficient
   value. TCP connections mitigate this risk by using the (at minimum)
   8 bytes of transport header echoed in the ICMP message to validate
   the TCP sequence number as valid for the current connection.
   However, as QUIC operates over UDP, in IPv4 the echoed information
   could consist only of the IP and UDP headers, which usually have
   insufficient entropy to mitigate off-path attacks.

   As a result, endpoints that implement PMTUD in IPv4 SHOULD take steps
   to mitigate this risk. For instance, an application could:

   o  Set the IPv4 Don't Fragment (DF) bit on a small proportion of
      packets, so that most invalid ICMP messages arrive when there are
      no DF packets outstanding, and can therefore be identified as
      spurious.

   o  Store additional information from the IP or UDP headers from DF
      packets (for example, the IP ID or UDP checksum) to further
      authenticate incoming Datagram Too Big messages.

   o  Treat any reduction in PMTU due to a report contained in an ICMP
      packet as provisional until QUIC's loss detection algorithm
      determines that the packet is actually lost.

8.4.2. Special Considerations for Packetization Layer PMTU Discovery

   The PADDING frame provides a useful option for PMTU probe packets
   that does not exist in other transports.
   PADDING frames generate acknowledgements, but their content need not
   be delivered reliably. PADDING frames may delay the delivery of
   application data, as they consume the congestion window. However, by
   definition their likely loss in a probe packet does not require
   delay-inducing retransmission of application data.

   When implementing the algorithm in Section 7.2 of [RFC4821], the
   initial value of search_low SHOULD be consistent with the IPv6
   minimum packet size. Paths that do not support this size cannot
   deliver Initial packets, and therefore are not QUIC-compliant.

   Section 7.3 of [RFC4821] discusses tradeoffs between small and large
   increases in the size of probe packets. As QUIC probe packets need
   not contain application data, aggressive increases in probe size
   carry fewer consequences.

9. Streams: QUIC's Data Structuring Abstraction

   Streams in QUIC provide a lightweight, ordered byte-stream
   abstraction.

   There are two basic types of stream in QUIC. Unidirectional streams
   carry data in one direction only; bidirectional streams allow for
   data to be sent in both directions. Different stream identifiers are
   used to distinguish between unidirectional and bidirectional streams,
   as well as to create a separation between streams that are initiated
   by the client and server (see Section 9.1).

   Either type of stream can be created by either endpoint, can
   concurrently send data interleaved with other streams, and can be
   cancelled.

   Stream offsets allow for the octets on a stream to be placed in
   order. An endpoint MUST be capable of delivering data received on a
   stream in order. Implementations MAY choose to offer the ability to
   deliver data out of order. There is no means of ensuring ordering
   between octets on different streams.

   The creation and destruction of streams are expected to have minimal
   bandwidth and computational cost. A single STREAM frame may create,
   carry data for, and terminate a stream, or a stream may last the
   entire duration of a connection.

   Streams are individually flow controlled, allowing an endpoint to
   limit memory commitment and to apply back pressure. The creation of
   streams is also flow controlled, with each peer declaring the maximum
   stream ID it is willing to accept at a given time.

   An alternative view of QUIC streams is as an elastic "message"
   abstraction, similar to the way ephemeral streams are used in SST
   [SST], which may be a more appealing description for some
   applications.

9.1. Stream Identifiers

   Streams are identified by an unsigned 62-bit integer, referred to as
   the Stream ID. The least significant two bits of the Stream ID are
   used to identify the type of stream (unidirectional or bidirectional)
   and the initiator of the stream.

   The least significant bit (0x1) of the Stream ID identifies the
   initiator of the stream. Clients initiate even-numbered streams
   (those with the least significant bit set to 0); servers initiate
   odd-numbered streams (with the bit set to 1). Separation of the
   stream identifiers ensures that client and server are able to open
   streams without the latency imposed by negotiating for an identifier.

   If an endpoint receives a frame for a stream that it expects to
   initiate (i.e., odd-numbered for the client or even-numbered for the
   server), but which it has not yet opened, it MUST close the
   connection with error code STREAM_STATE_ERROR.

   The second least significant bit (0x2) of the Stream ID
   differentiates between unidirectional streams and bidirectional
   streams. Unidirectional streams always have this bit set to 1 and
   bidirectional streams have this bit set to 0.

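   The meaning of the two low-order bits can be expressed directly in
   code. The following sketch classifies a Stream ID; the function name
   and the string labels it returns are hypothetical.

```python
def stream_type(stream_id):
    """Return (initiator, direction) for a 62-bit Stream ID."""
    if not 0 <= stream_id < 2 ** 62:
        raise ValueError("Stream ID must fit in 62 bits")
    # Bit 0x1: 0 for client-initiated, 1 for server-initiated.
    initiator = "server" if stream_id & 0x1 else "client"
    # Bit 0x2: 0 for bidirectional, 1 for unidirectional.
    direction = "unidirectional" if stream_id & 0x2 else "bidirectional"
    return initiator, direction
```

   Because only the two low-order bits carry type information,
   consecutive streams of the same type have Stream IDs that differ by
   4.
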
   The two type bits from a Stream ID therefore identify streams as
   summarized in Table 5.

             +----------+----------------------------------+
             | Low Bits | Stream Type                      |
             +----------+----------------------------------+
             | 0x0      | Client-Initiated, Bidirectional  |
             |          |                                  |
             | 0x1      | Server-Initiated, Bidirectional  |
             |          |                                  |
             | 0x2      | Client-Initiated, Unidirectional |
             |          |                                  |
             | 0x3      | Server-Initiated, Unidirectional |
             +----------+----------------------------------+

                         Table 5: Stream ID Types

   Stream ID 0 (0x0) is a client-initiated, bidirectional stream that is
   used for the cryptographic handshake. Stream 0 MUST NOT be used for
   application data.

   A QUIC endpoint MUST NOT reuse a Stream ID. Streams can be used in
   any order. Streams that are used out of order result in opening all
   lower-numbered streams of the same type in the same direction.

   Stream IDs are encoded as a variable-length integer (see
   Section 7.1).

9.2. Stream States

   This section describes the two types of QUIC stream in terms of the
   states of their send or receive components. Two state machines are
   described: one for streams on which an endpoint transmits data
   (Section 9.2.1); another for streams from which an endpoint receives
   data (Section 9.2.2).

   Unidirectional streams use the applicable state machine directly.
   Bidirectional streams use both state machines. For the most part,
   the use of these state machines is the same whether the stream is
   unidirectional or bidirectional. The conditions for opening a stream
   are slightly more complex for a bidirectional stream because the
   opening of either send or receive sides causes the stream to open in
   both directions.

   An endpoint can open streams up to its maximum stream limit in any
   order; however, endpoints SHOULD open the send side of streams for
   each type in order.

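   The rule in Section 9.1 that using a stream out of order opens all
   lower-numbered streams of the same type can be illustrated as
   follows. This is a sketch under the assumption that an endpoint
   tracks the highest Stream ID already opened for each of the four
   types; the function and parameter names are hypothetical.

```python
def implicitly_opened(stream_id, highest_open=None):
    """List the Stream IDs opened when stream_id is first used.

    highest_open is the highest Stream ID of the same type that is
    already open, or None if no stream of that type has been opened.
    """
    type_bits = stream_id & 0x3
    if highest_open is None:
        first = type_bits            # lowest Stream ID of this type
    else:
        if highest_open & 0x3 != type_bits:
            raise ValueError("highest_open is a different stream type")
        first = highest_open + 4     # next Stream ID of this type
    # Streams of one type are spaced 4 apart; the result includes
    # stream_id itself and is empty if stream_id is already open.
    return list(range(first, stream_id + 1, 4))
```

   For instance, if the first client-initiated unidirectional stream an
   endpoint sees is stream 10, then streams 2, 6 and 10 are all opened.
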
   Note: These states are largely informative. This document uses
      stream states to describe rules for when and how different types
      of frames can be sent and the reactions that are expected when
      different types of frames are received. Though these state
      machines are intended to be useful in implementing QUIC, these
      states aren't intended to constrain implementations. An
      implementation can define a different state machine as long as its
      behavior is consistent with an implementation that implements
      these states.

9.2.1. Send Stream States

   Figure 10 shows the states for the part of a stream that sends data
   to a peer.

          o
          | Create Stream (Sending)
          | Create Bidirectional Stream (Receiving)
          v
      +-------+
      | Ready | Send RST_STREAM
      |       |-----------------------.
      +-------+                       |
          |                           |
          | Send STREAM /             |
          |      STREAM_BLOCKED       |
          v                           |
      +-------+                       |
      | Send  | Send RST_STREAM       |
      |       |---------------------->|
      +-------+                       |
          |                           |
          | Send STREAM + FIN         |
          v                           v
      +-------+                   +-------+
      | Data  | Send RST_STREAM   | Reset |
      | Sent  +------------------>| Sent  |
      +-------+                   +-------+
          |                           |
          | Recv All ACKs             | Recv ACK
          v                           v
      +-------+                   +-------+
      | Data  |                   | Reset |
      | Recvd |                   | Recvd |
      +-------+                   +-------+

                   Figure 10: States for Send Streams

   The sending part of a stream that the endpoint initiates (types 0 and
   2 for clients, 1 and 3 for servers) is opened by the application or
   application protocol. The "Ready" state represents a newly created
   stream that is able to accept data from the application. Stream data
   might be buffered in this state in preparation for sending.

   The sending part of a bidirectional stream initiated by a peer (type
   0 for a server, type 1 for a client) enters the "Ready" state if the
   receiving part enters the "Recv" state.

3421 Sending the first STREAM or STREAM_BLOCKED frame causes a send stream 3422 to enter the "Send" state. An implementation might choose to defer 3423 allocating a Stream ID to a send stream until it sends the first 3424 frame and enters this state, which can allow for better stream 3425 prioritization. 3427 In the "Send" state, an endpoint transmits - and retransmits as 3428 necessary - data in STREAM frames. The endpoint respects the flow 3429 control limits of its peer, accepting MAX_STREAM_DATA frames. An 3430 endpoint in the "Send" state generates STREAM_BLOCKED frames if it 3431 encounters flow control limits. 3433 After the application indicates that stream data is complete and a 3434 STREAM frame containing the FIN bit is sent, the send stream enters 3435 the "Data Sent" state. From this state, the endpoint only 3436 retransmits stream data as necessary. The endpoint no longer needs 3437 to track flow control limits or send STREAM_BLOCKED frames for a send 3438 stream in this state. The endpoint can ignore any MAX_STREAM_DATA 3439 frames it receives from its peer in this state; MAX_STREAM_DATA 3440 frames might be received until the peer receives the final stream 3441 offset. 3443 Once all stream data has been successfully acknowledged, the send 3444 stream enters the "Data Recvd" state, which is a terminal state. 3446 From any of the "Ready", "Send", or "Data Sent" states, an 3447 application can signal that it wishes to abandon transmission of 3448 stream data. Similarly, the endpoint might receive a STOP_SENDING 3449 frame from its peer. In either case, the endpoint sends a RST_STREAM 3450 frame, which causes the stream to enter the "Reset Sent" state. 3452 An endpoint MAY send a RST_STREAM as the first frame on a send 3453 stream; this causes the send stream to open and then immediately 3454 transition to the "Reset Sent" state. 
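The send-side transitions shown in Figure 10 can be sketched as a lookup table. This is a minimal sketch, not a required implementation: the event names (such as "send_stream" and "all_data_acked"), the SEND_TRANSITIONS table, and the next_send_state function are assumptions made for this example. A first STREAM frame that also carries the FIN bit would be modeled here as two events in sequence.

```python
from enum import Enum, auto


class SendState(Enum):
    READY = auto()
    SEND = auto()
    DATA_SENT = auto()
    DATA_RECVD = auto()   # terminal
    RESET_SENT = auto()
    RESET_RECVD = auto()  # terminal


# (current state, event) -> next state, following Figure 10.
# Event names are illustrative only.
SEND_TRANSITIONS = {
    (SendState.READY, "send_stream"): SendState.SEND,
    (SendState.READY, "send_stream_blocked"): SendState.SEND,
    (SendState.READY, "send_rst_stream"): SendState.RESET_SENT,
    (SendState.SEND, "send_stream"): SendState.SEND,
    (SendState.SEND, "send_stream_blocked"): SendState.SEND,
    (SendState.SEND, "send_stream_fin"): SendState.DATA_SENT,
    (SendState.SEND, "send_rst_stream"): SendState.RESET_SENT,
    (SendState.DATA_SENT, "all_data_acked"): SendState.DATA_RECVD,
    (SendState.DATA_SENT, "send_rst_stream"): SendState.RESET_SENT,
    (SendState.RESET_SENT, "rst_stream_acked"): SendState.RESET_RECVD,
}


def next_send_state(state: SendState, event: str) -> SendState:
    """Return the state that follows `event`, or raise if not permitted."""
    try:
        return SEND_TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"{event!r} is not permitted in {state.name}")
```

Encoding the diagram as data keeps the permitted transitions in one place; any pair that is absent from the table is rejected, which includes sending from either terminal state.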
3456 Once a packet containing a RST_STREAM has been acknowledged, the send 3457 stream enters the "Reset Recvd" state, which is a terminal state. 3459 9.2.2. Receive Stream States 3461 Figure 11 shows the states for the part of a stream that receives 3462 data from a peer. The states for a receive stream mirror only some 3463 of the states of the send stream at the peer. A receive stream 3464 doesn't track states on the send stream that cannot be observed, such 3465 as the "Ready" state; instead, receive streams track the delivery of 3466 data to the application or application protocol, some of which cannot 3467 be observed by the sender.

3469           o
3470           | Recv STREAM / STREAM_BLOCKED / RST_STREAM
3471           | Create Bidirectional Stream (Sending)
3472           | Recv MAX_STREAM_DATA
3473           v
3474       +-------+
3475       | Recv  | Recv RST_STREAM
3476       |       |-----------------------.
3477       +-------+                       |
3478           |                           |
3479           | Recv STREAM + FIN         |
3480           v                           |
3481       +-------+                       |
3482       | Size  | Recv RST_STREAM       |
3483       | Known +---------------------->|
3484       +-------+                       |
3485           |                           |
3486           | Recv All Data             |
3487           v                           v
3488       +-------+                   +-------+
3489       | Data  | Recv RST_STREAM   | Reset |
3490       | Recvd +<-- (optional) --->| Recvd |
3491       +-------+                   +-------+
3492           |                           |
3493           | App Read All Data         | App Read RST
3494           v                           v
3495       +-------+                   +-------+
3496       | Data  |                   | Reset |
3497       | Read  |                   | Read  |
3498       +-------+                   +-------+

3500 Figure 11: States for Receive Streams 3502 The receiving part of a stream initiated by a peer (types 1 and 3 for 3503 a client, or 0 and 2 for a server) is created when the first STREAM, 3504 STREAM_BLOCKED, RST_STREAM, or MAX_STREAM_DATA (bidirectional only, 3505 see below) is received for that stream. The initial state for a 3506 receive stream is "Recv". Receiving a RST_STREAM frame causes the 3507 receive stream to immediately transition to the "Reset Recvd" state.
3509 The receive stream enters the "Recv" state when the sending part of a 3510 bidirectional stream initiated by the endpoint (type 0 for a client, 3511 type 1 for a server) enters the "Ready" state. 3513 A bidirectional stream also opens when a MAX_STREAM_DATA frame is 3514 received. Receiving a MAX_STREAM_DATA frame implies that the remote 3515 peer has opened the stream and is providing flow control credit. A 3516 MAX_STREAM_DATA frame might arrive before a STREAM or STREAM_BLOCKED 3517 frame if packets are lost or reordered. 3519 In the "Recv" state, the endpoint receives STREAM and STREAM_BLOCKED 3520 frames. Incoming data is buffered and can be reassembled into the 3521 correct order for delivery to the application. As data is consumed 3522 by the application and buffer space becomes available, the endpoint 3523 sends MAX_STREAM_DATA frames to allow the peer to send more data. 3525 When a STREAM frame with a FIN bit is received, the final offset (see 3526 Section 10.3) is known. The receive stream enters the "Size Known" 3527 state. In this state, the endpoint no longer needs to send 3528 MAX_STREAM_DATA frames, it only receives any retransmissions of 3529 stream data. 3531 Once all data for the stream has been received, the receive stream 3532 enters the "Data Recvd" state. This might happen as a result of 3533 receiving the same STREAM frame that causes the transition to "Size 3534 Known". In this state, the endpoint has all stream data. Any STREAM 3535 or STREAM_BLOCKED frames it receives for the stream can be discarded. 3537 The "Data Recvd" state persists until stream data has been delivered 3538 to the application or application protocol. Once stream data has 3539 been delivered, the stream enters the "Data Read" state, which is a 3540 terminal state. 3542 Receiving a RST_STREAM frame in the "Recv" or "Size Known" states 3543 causes the stream to enter the "Reset Recvd" state. 
This might cause 3544 the delivery of stream data to the application to be interrupted. 3546 It is possible that all stream data is received when a RST_STREAM is 3547 received (that is, from the "Data Recvd" state). Similarly, it is 3548 possible for remaining stream data to arrive after receiving a 3549 RST_STREAM frame (the "Reset Recvd" state). An implementation can 3550 manage this situation as it chooses. Sending RST_STREAM 3551 means that an endpoint cannot guarantee delivery of stream data; 3552 however, there is no requirement that stream data not be delivered if 3553 a RST_STREAM is received. An implementation MAY interrupt delivery 3554 of stream data, discard any data that was not consumed, and signal 3555 the existence of the RST_STREAM immediately. Alternatively, the 3556 RST_STREAM signal might be suppressed or withheld if stream data is 3557 completely received. In the latter case, the receive stream 3558 effectively transitions to "Data Recvd" from "Reset Recvd". 3560 Once the signal indicating that the receive stream was reset has 3561 been delivered to the application, the receive stream transitions to the 3562 "Reset Read" state, which is a terminal state. 3564 9.2.3. Permitted Frame Types 3566 The sender of a stream sends just three frame types that affect the 3567 state of a stream at either sender or receiver: STREAM 3568 (Section 7.18), STREAM_BLOCKED (Section 7.11), and RST_STREAM 3569 (Section 7.3). 3571 A sender MUST NOT send any of these frames from a terminal state 3572 ("Data Recvd" or "Reset Recvd"). A sender MUST NOT send STREAM or 3573 STREAM_BLOCKED after sending a RST_STREAM; that is, in the "Reset 3574 Sent" state in addition to the terminal states. A receiver could 3575 receive any of these frames in any state, but only due to the 3576 possibility of delayed delivery of packets carrying them. 3578 The receiver of a stream sends MAX_STREAM_DATA (Section 7.7) and 3579 STOP_SENDING frames (Section 7.14).
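The receive-side transitions of Section 9.2.2 (Figure 11) can be sketched in the same style, including the implementation choice between delivering completely received data and signaling a reset. This is a sketch under stated assumptions: the event names, the next_recv_state function, and the prefer_data flag are hypothetical and exist only to make the optional transitions between "Data Recvd" and "Reset Recvd" explicit. Frames that arrive late in other states are simply ignored here.

```python
from enum import Enum, auto


class RecvState(Enum):
    RECV = auto()
    SIZE_KNOWN = auto()
    DATA_RECVD = auto()
    DATA_READ = auto()    # terminal
    RESET_RECVD = auto()
    RESET_READ = auto()   # terminal


def next_recv_state(state: RecvState, event: str,
                    prefer_data: bool = True) -> RecvState:
    """Apply one event from Figure 11; unlisted pairs leave the state as is.

    prefer_data is a hypothetical policy flag: when True, completely
    received data wins over a RST_STREAM (the reset signal is suppressed).
    """
    if event == "recv_rst_stream":
        if state in (RecvState.RECV, RecvState.SIZE_KNOWN):
            return RecvState.RESET_RECVD
        if state == RecvState.DATA_RECVD and not prefer_data:
            return RecvState.RESET_RECVD   # optional transition
    elif event == "recv_stream_fin" and state == RecvState.RECV:
        return RecvState.SIZE_KNOWN
    elif event == "recv_all_data":
        if state == RecvState.SIZE_KNOWN:
            return RecvState.DATA_RECVD
        if state == RecvState.RESET_RECVD and prefer_data:
            return RecvState.DATA_RECVD    # optional transition
    elif event == "app_read_all_data" and state == RecvState.DATA_RECVD:
        return RecvState.DATA_READ
    elif event == "app_read_rst" and state == RecvState.RESET_RECVD:
        return RecvState.RESET_READ
    return state
```

Returning the unchanged state for unlisted pairs reflects that a receiver could receive these frames in any state because of delayed delivery.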
3581 The receiver only sends MAX_STREAM_DATA in the "Recv" state. A 3582 receiver can send STOP_SENDING in any state where it has not received 3583 a RST_STREAM frame; that is states other than "Reset Recvd" or "Reset 3584 Read". However there is little value in sending a STOP_SENDING frame 3585 after all stream data has been received in the "Data Recvd" state. A 3586 sender could receive these frames in any state as a result of delayed 3587 delivery of packets. 3589 9.2.4. Bidirectional Stream States 3591 A bidirectional stream is composed of a send stream and a receive 3592 stream. Implementations may represent states of the bidirectional 3593 stream as composites of send and receive stream states. The simplest 3594 model presents the stream as "open" when either send or receive 3595 stream is in a non-terminal state and "closed" when both send and 3596 receive streams are in a terminal state. 3598 Table 6 shows a more complex mapping of bidirectional stream states 3599 that loosely correspond to the stream states in HTTP/2 [HTTP2]. This 3600 shows that multiple states on send or receive streams are mapped to 3601 the same composite state. Note that this is just one possibility for 3602 such a mapping; this mapping requires that data is acknowledged 3603 before the transition to a "closed" or "half-closed" state. 
3605   +-----------------------+---------------------+---------------------+
3606   | Send Stream           | Receive Stream      | Composite State     |
3607   +-----------------------+---------------------+---------------------+
3608   | No Stream/Ready       | No Stream/Recv *1   | idle                |
3609   |                       |                     |                     |
3610   | Ready/Send/Data Sent  | Recv/Size Known     | open                |
3611   |                       |                     |                     |
3612   | Ready/Send/Data Sent  | Data Recvd/Data     | half-closed         |
3613   |                       | Read                | (remote)            |
3614   |                       |                     |                     |
3615   | Ready/Send/Data Sent  | Reset Recvd/Reset   | half-closed         |
3616   |                       | Read                | (remote)            |
3617   |                       |                     |                     |
3618   | Data Recvd            | Recv/Size Known     | half-closed (local) |
3619   |                       |                     |                     |
3620   | Reset Sent/Reset      | Recv/Size Known     | half-closed (local) |
3621   | Recvd                 |                     |                     |
3622   |                       |                     |                     |
3625   | Reset Sent/Reset      | Data Recvd/Data     | closed              |
3626   | Recvd                 | Read                |                     |
3627   |                       |                     |                     |
3628   | Reset Sent/Reset      | Reset Recvd/Reset   | closed              |
3629   | Recvd                 | Read                |                     |
3630   |                       |                     |                     |
3631   | Data Recvd            | Data Recvd/Data     | closed              |
3632   |                       | Read                |                     |
3633   |                       |                     |                     |
3634   | Data Recvd            | Reset Recvd/Reset   | closed              |
3635   |                       | Read                |                     |
3636   +-----------------------+---------------------+---------------------+

3638 Table 6: Possible Mapping of Stream States to HTTP/2 3640 Note (*1): A stream is considered "idle" if it has not yet been 3641 created, or if the receive stream is in the "Recv" state without 3642 yet having received any frames. 3644 9.3. Solicited State Transitions 3646 If an endpoint is no longer interested in the data it is receiving on 3647 a stream, it MAY send a STOP_SENDING frame identifying that stream to 3648 prompt closure of the stream in the opposite direction. This 3649 typically indicates that the receiving application is no longer 3650 reading data it receives from the stream, but is not a guarantee that 3651 incoming data will be ignored.
3653 STREAM frames received after sending STOP_SENDING are still counted 3654 toward the connection and stream flow-control windows, even though 3655 these frames will be discarded upon receipt. This avoids potential 3656 ambiguity about which STREAM frames count toward flow control. 3658 A STOP_SENDING frame requests that the receiving endpoint send a 3659 RST_STREAM frame. An endpoint that receives a STOP_SENDING frame 3660 MUST send a RST_STREAM frame for that stream, and can use an error 3661 code of STOPPING. If the STOP_SENDING frame is received on a send 3662 stream that is already in the "Data Sent" state, a RST_STREAM frame 3663 MAY still be sent in order to cancel retransmission of previously- 3664 sent STREAM frames. 3666 STOP_SENDING SHOULD only be sent for a receive stream that has not 3667 been reset. STOP_SENDING is most useful for streams in the "Recv" or 3668 "Size Known" states. 3670 An endpoint is expected to send another STOP_SENDING frame if a 3671 packet containing a previous STOP_SENDING is lost. However, once 3672 either all stream data or a RST_STREAM frame has been received for 3673 the stream - that is, the stream is in any state other than "Recv" or 3674 "Size Known" - sending a STOP_SENDING frame is unnecessary. 3676 9.4. Stream Concurrency 3678 An endpoint limits the number of concurrently active incoming streams 3679 by adjusting the maximum stream ID. An initial value is set in the 3680 transport parameters (see Section 6.4.1) and is subsequently 3681 increased by MAX_STREAM_ID frames (see Section 7.8). 3683 The maximum stream ID is specific to each endpoint and applies only 3684 to the peer that receives the setting. That is, clients specify the 3685 maximum stream ID the server can initiate, and servers specify the 3686 maximum stream ID the client can initiate. Each endpoint may respond 3687 on streams initiated by the other peer, regardless of whether it is 3688 permitted to initiate new streams.
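The stream concurrency limit described in this section can be tracked on the initiating side with very little state. The following is a minimal sketch, assuming a single limit for one stream type; the class StreamIdLimit and its method names are hypothetical and are not part of this document.

```python
class StreamIdLimit:
    """Limit on the streams of one type that this endpoint may initiate."""

    def __init__(self, initial_max_stream_id: int):
        # Initial value taken from the peer's transport parameters.
        self.max_stream_id = initial_max_stream_id

    def on_max_stream_id(self, advertised: int) -> bool:
        # MAX_STREAM_ID frames can arrive out of order, so a value that
        # does not raise the limit is ignored rather than applied.
        if advertised > self.max_stream_id:
            self.max_stream_id = advertised
            return True
        return False

    def may_initiate(self, stream_id: int) -> bool:
        # The limit only restricts streams this endpoint opens; responding
        # on peer-initiated streams is not affected by it.
        return stream_id <= self.max_stream_id
```

Because the limit only ever moves upward, a reordered or duplicated MAX_STREAM_ID frame cannot shrink the set of streams an endpoint is allowed to open.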
3690 Endpoints MUST NOT exceed the limit set by their peer. An endpoint 3691 that receives a STREAM frame with an ID greater than the limit it has 3692 sent MUST treat this as a stream error of type STREAM_ID_ERROR 3693 (Section 11), unless this is a result of a change in the initial 3694 offsets (see Section 6.4.2). 3696 A receiver MUST NOT renege on an advertisement; that is, once a 3697 receiver advertises a stream ID via a MAX_STREAM_ID frame, it MUST 3698 NOT subsequently advertise a smaller maximum ID. A sender may 3699 receive MAX_STREAM_ID frames out of order; a sender MUST therefore 3700 ignore any MAX_STREAM_ID that does not increase the maximum. 3702 9.5. Sending and Receiving Data 3704 Once a stream is created, endpoints may use the stream to send and 3705 receive data. Each endpoint may send a series of STREAM frames 3706 encapsulating data on a stream until the stream is terminated in that 3707 direction. Streams are an ordered byte-stream abstraction, and they 3708 have no other structure within them. STREAM frame boundaries are not 3709 expected to be preserved in retransmissions from the sender or during 3710 delivery to the application at the receiver. 3712 When new data is to be sent on a stream, a sender MUST set the 3713 encapsulating STREAM frame's offset field to the stream offset of the 3714 first byte of this new data. The first octet of data on a stream has 3715 an offset of 0. An endpoint is expected to send every stream octet. 3716 The largest offset delivered on a stream MUST be less than 2^62. A 3717 receiver MUST ensure that received stream data is delivered to the 3718 application as an ordered byte-stream. Data received out of order 3719 MUST be buffered for later delivery, as long as it is not in 3720 violation of the receiver's flow control limits. 3722 An endpoint MUST NOT send data on any stream without ensuring that it 3723 is within the data limits set by its peer. 
The cryptographic 3724 handshake stream, Stream 0, is exempt from the connection-level data 3725 limits established by MAX_DATA. Data on stream 0 other than the 3726 initial cryptographic handshake message is still subject to stream- 3727 level data limits and MAX_STREAM_DATA. This message is exempt from 3728 flow control because it needs to be sent in a single packet 3729 regardless of the server's flow control state. This rule applies 3730 even for 0-RTT handshakes where the remembered value of 3731 MAX_STREAM_DATA would not permit sending a full initial cryptographic 3732 handshake message. 3734 Flow control is described in detail in Section 10, and congestion 3735 control is described in the companion document [QUIC-RECOVERY]. 3737 9.6. Stream Prioritization 3739 Stream multiplexing has a significant effect on application 3740 performance if resources allocated to streams are correctly 3741 prioritized. Experience with other multiplexed protocols, such as 3742 HTTP/2 [HTTP2], shows that effective prioritization strategies have a 3743 significant positive impact on performance. 3745 QUIC does not provide frames for exchanging prioritization 3746 information. Instead it relies on receiving priority information 3747 from the application that uses QUIC. Protocols that use QUIC are 3748 able to define any prioritization scheme that suits their application 3749 semantics. A protocol might define explicit messages for signaling 3750 priority, such as those defined in HTTP/2; it could define rules that 3751 allow an endpoint to determine priority based on context; or it could 3752 leave the determination to the application. 3754 A QUIC implementation SHOULD provide ways in which an application can 3755 indicate the relative priority of streams. When deciding which 3756 streams to dedicate resources to, QUIC SHOULD use the information 3757 provided by the application. Failure to account for priority of 3758 streams can result in suboptimal performance. 
3760 Stream priority is most relevant when deciding which stream data will 3761 be transmitted. Often, there will be limits on what can be 3762 transmitted as a result of connection flow control or the current 3763 congestion controller state. 3765 Giving preference to the transmission of its own management frames 3766 ensures that the protocol functions efficiently. That is, 3767 prioritizing frames other than STREAM frames ensures that loss 3768 recovery, congestion control, and flow control operate effectively. 3770 Stream 0 MUST be prioritized over other streams prior to the 3771 completion of the cryptographic handshake. This includes the 3772 retransmission of the second flight of client handshake messages, 3773 that is, the TLS Finished and any client authentication messages. 3775 STREAM data in frames determined to be lost SHOULD be retransmitted 3776 before sending new data, unless application priorities indicate 3777 otherwise. Retransmitting lost stream data can fill in gaps, which 3778 allows the peer to consume already received data and free up flow 3779 control window. 3781 10. Flow Control 3783 It is necessary to limit the amount of data that a sender may have 3784 outstanding at any time, so as to prevent a fast sender from 3785 overwhelming a slow receiver, or to prevent a malicious sender from 3786 consuming significant resources at a receiver. This section 3787 describes QUIC's flow-control mechanisms. 3789 QUIC employs a credit-based flow-control scheme similar to HTTP/2's 3790 flow control [HTTP2]. A receiver advertises the number of octets it 3791 is prepared to receive on a given stream and for the entire 3792 connection. This leads to two levels of flow control in QUIC: (i) 3793 Connection flow control, which prevents senders from exceeding a 3794 receiver's buffer capacity for the connection, and (ii) Stream flow 3795 control, which prevents a single stream from consuming the entire 3796 receive buffer for a connection. 
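The two levels of flow control can be sketched from the sender's point of view as follows. This is an illustrative sketch only: the class SenderFlowControl and its methods are hypothetical names, it counts only new stream data (not retransmissions), and it does not model the exemption for the initial cryptographic handshake message. Stream 0 is excluded from the connection-level limit, as described in Section 9.5.

```python
class SenderFlowControl:
    """Sender-side view of stream-level and connection-level credit."""

    def __init__(self, max_data: int, initial_max_stream_data: int):
        self.max_data = max_data                      # connection limit
        self.initial_max_stream_data = initial_max_stream_data
        self.max_stream_data = {}                     # per-stream limits
        self.sent_on_stream = {}                      # new bytes per stream
        self.sent_on_connection = 0                   # excludes stream 0

    def _stream_limit(self, stream_id: int) -> int:
        return self.max_stream_data.get(stream_id,
                                        self.initial_max_stream_data)

    def sendable(self, stream_id: int) -> int:
        """Bytes of new data that may be sent on this stream right now."""
        sent = self.sent_on_stream.get(stream_id, 0)
        stream_credit = self._stream_limit(stream_id) - sent
        if stream_id == 0:
            # Stream 0 is exempt from the connection-level limit.
            return max(0, stream_credit)
        conn_credit = self.max_data - self.sent_on_connection
        return max(0, min(stream_credit, conn_credit))

    def on_sent(self, stream_id: int, length: int) -> None:
        if length > self.sendable(stream_id):
            raise ValueError("would exceed the peer's flow control limits")
        self.sent_on_stream[stream_id] = (
            self.sent_on_stream.get(stream_id, 0) + length)
        if stream_id != 0:
            self.sent_on_connection += length

    def on_max_data(self, limit: int) -> None:
        # Ignore updates that do not move the window forward.
        self.max_data = max(self.max_data, limit)

    def on_max_stream_data(self, stream_id: int, limit: int) -> None:
        self.max_stream_data[stream_id] = max(
            self._stream_limit(stream_id), limit)
```

Taking the minimum of the two credits is what prevents a single stream from consuming the entire connection allowance.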
3798 A data receiver sends MAX_STREAM_DATA or MAX_DATA frames to the 3799 sender to advertise additional credit. MAX_STREAM_DATA frames send 3800 the maximum absolute byte offset of a stream, while MAX_DATA 3801 sends the maximum sum of the absolute byte offsets of all streams 3802 other than stream 0. 3804 A receiver MAY advertise a larger offset at any point by sending 3805 MAX_DATA or MAX_STREAM_DATA frames. A receiver MUST NOT renege on an 3806 advertisement; that is, once a receiver advertises an offset, it MUST 3807 NOT subsequently advertise a smaller offset. A sender could receive 3808 MAX_DATA or MAX_STREAM_DATA frames out of order; a sender MUST 3809 therefore ignore any flow control offset that does not move the 3810 window forward. 3812 A receiver MUST close the connection with a FLOW_CONTROL_ERROR error 3813 (Section 11) if the peer violates the advertised connection or stream 3814 data limits. 3816 A sender SHOULD send BLOCKED or STREAM_BLOCKED frames to indicate it 3817 has data to write but is blocked by flow control limits. These 3818 frames are expected to be sent infrequently in common cases, but they 3819 are considered useful for debugging and monitoring purposes. 3821 A receiver advertises credit for a stream by sending a 3822 MAX_STREAM_DATA frame with the Stream ID set appropriately. A 3823 receiver could use the current offset of data consumed to determine 3824 the flow control offset to be advertised. A receiver MAY send 3825 MAX_STREAM_DATA frames in multiple packets in order to make sure that 3826 the sender receives an update before running out of flow control 3827 credit, even if one of the packets is lost. 3829 Connection flow control is a limit to the total bytes of stream data 3830 sent in STREAM frames on all streams except stream 0. A receiver 3831 advertises credit for a connection by sending a MAX_DATA frame.
A 3832 receiver maintains a cumulative sum of bytes received on all 3833 contributing streams, which is used to check for flow control 3834 violations. A receiver might use a sum of bytes consumed on all 3835 contributing streams to determine the maximum data limit to be 3836 advertised. 3838 10.1. Edge Cases and Other Considerations 3840 There are some edge cases that must be considered when dealing with 3841 stream and connection level flow control. Given enough time, both 3842 endpoints must agree on flow control state. If one end believes it 3843 can send more than the other end is willing to receive, the 3844 connection will be torn down when too much data arrives. 3846 Conversely, if a sender believes it is blocked while the receiver 3847 expects that more data can be received, then the connection can be in a 3848 deadlock, with the sender waiting for a MAX_DATA or MAX_STREAM_DATA 3849 frame which will never come. 3851 On receipt of a RST_STREAM frame, an endpoint will tear down state 3852 for the matching stream and ignore further data arriving on that 3853 stream. This could result in the endpoints getting out of sync, 3854 since the RST_STREAM frame may have arrived out of order and there 3855 may be further bytes in flight. The data sender would have counted 3856 the data against its connection level flow control budget, but a 3857 receiver that has not received these bytes would not know to include 3858 them as well. The receiver must learn the number of bytes that were 3859 sent on the stream to make the same adjustment in its connection flow 3860 controller. 3862 To avoid this de-synchronization, a RST_STREAM sender MUST include 3863 the final byte offset sent on the stream in the RST_STREAM frame.
On 3864 receiving a RST_STREAM frame, a receiver definitively knows how many 3865 bytes were sent on that stream before the RST_STREAM frame, and the 3866 receiver MUST use the final offset to account for all bytes sent on 3867 the stream in its connection level flow controller. 3869 10.1.1. Response to a RST_STREAM 3871 RST_STREAM terminates one direction of a stream abruptly. Whether 3872 any action or response can or should be taken on the data already 3873 received is an application-specific issue, but it will often be the 3874 case that upon receipt of a RST_STREAM an endpoint will choose to 3875 stop sending data in its own direction. If the sender of a 3876 RST_STREAM wishes to explicitly state that no future data will be 3877 processed, that endpoint MAY send a STOP_SENDING frame at the same 3878 time. 3880 10.1.2. Data Limit Increments 3882 This document leaves when and how many bytes to advertise in a 3883 MAX_DATA or MAX_STREAM_DATA to implementations, but offers a few 3884 considerations. These frames contribute to connection overhead. 3885 Therefore, frequently sending frames with small changes is 3886 undesirable. At the same time, infrequent updates require larger 3887 increments to limits if blocking is to be avoided. Larger 3888 updates, in turn, require a receiver to commit more resources. 3889 Thus there is a tradeoff between resource commitment and overhead 3890 when determining how large a limit is advertised. 3892 A receiver MAY use an autotuning mechanism to tune the frequency and 3893 amount that it increases data limits based on a round-trip time 3894 estimate and the rate at which the receiving application consumes 3895 data, similar to common TCP implementations. 3897 10.1.3. Handshake Exemption 3899 During the initial handshake, an endpoint could need to send a larger 3900 message on stream 0 than would ordinarily be permitted by the peer's 3901 initial stream flow control window.
Since MAX_STREAM_DATA frames are 3902 not permitted in these early packets, the peer cannot provide 3903 additional flow control window in order to complete the handshake. 3905 Endpoints MAY exceed the flow control limits on stream 0 prior to the 3906 completion of the cryptographic handshake. (That is, in Initial, 3907 Retry, and Handshake packets.) However, once the handshake is 3908 complete, endpoints MUST NOT send additional data beyond the peer's 3909 permitted offset. If the amount of data sent during the handshake 3910 exceeds the peer's maximum offset, the endpoint cannot send 3911 additional data on stream 0 until the peer has sent a MAX_STREAM_DATA 3912 frame indicating a larger maximum offset. 3914 10.2. Stream Limit Increment 3916 As with flow control, this document leaves when and how many streams 3917 to make available to a peer via MAX_STREAM_ID to implementations, but 3918 offers a few considerations. MAX_STREAM_ID frames constitute minimal 3919 overhead, while withholding MAX_STREAM_ID frames can prevent the peer 3920 from using the available parallelism. 3922 Implementations will likely want to increase the maximum stream ID as 3923 peer-initiated streams close. A receiver MAY also advance the 3924 maximum stream ID based on current activity, system conditions, and 3925 other environmental factors. 3927 10.2.1. Blocking on Flow Control 3929 If a sender does not receive a MAX_DATA or MAX_STREAM_DATA frame when 3930 it has run out of flow control credit, the sender will be blocked and 3931 SHOULD send a BLOCKED or STREAM_BLOCKED frame. These frames are 3932 expected to be useful for debugging at the receiver; they do not 3933 require any other action. A receiver SHOULD NOT wait for a BLOCKED 3934 or STREAM_BLOCKED frame before sending MAX_DATA or MAX_STREAM_DATA, 3935 since doing so will mean that a sender is unable to send for an 3936 entire round trip. 
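One way for a receiver to avoid waiting for a STREAM_BLOCKED frame is to advertise more credit as soon as the remaining credit drops below some threshold. The sketch below assumes a fixed window and a threshold of half that window; both are arbitrary illustrative choices, not recommendations of this document, and the class StreamReceiveWindow and its method are hypothetical. It bases the advertised offset on the amount of data consumed by the application.

```python
from typing import Optional


class StreamReceiveWindow:
    """Receiver-side policy for when to send MAX_STREAM_DATA."""

    def __init__(self, window: int):
        self.window = window          # credit kept ahead of consumption
        self.consumed = 0             # bytes read by the application
        self.advertised = window      # last maximum offset advertised

    def on_consumed(self, length: int) -> Optional[int]:
        """Record consumed bytes; return a new limit to advertise, if any."""
        self.consumed += length
        remaining = self.advertised - self.consumed
        # Assumed threshold: refresh once less than half a window remains.
        if remaining < self.window // 2:
            # The advertised offset only ever increases.
            self.advertised = self.consumed + self.window
            return self.advertised
        return None
```

A larger window or an earlier threshold trades memory commitment for a lower chance that the sender stalls; an autotuning receiver would adjust both values over time.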
3938 For smooth operation of the congestion controller, it is generally 3939 considered best to not let the sender go into quiescence if 3940 avoidable. To avoid blocking a sender, and to reasonably account for 3941 the possibility of loss, a receiver should send a MAX_DATA or 3942 MAX_STREAM_DATA frame at least two round trips before it expects the 3943 sender to get blocked. 3945 A sender sends a BLOCKED or STREAM_BLOCKED frame only once 3946 when it reaches a data limit. A sender SHOULD NOT send multiple 3947 BLOCKED or STREAM_BLOCKED frames for the same data limit, unless the 3948 original frame is determined to be lost. Another BLOCKED or 3949 STREAM_BLOCKED frame can be sent after the data limit is increased. 3951 10.3. Stream Final Offset 3953 The final offset is the count of the number of octets that are 3954 transmitted on a stream. For a stream that is reset, the final 3955 offset is carried explicitly in a RST_STREAM frame. Otherwise, the 3956 final offset is the offset of the end of the data carried in a STREAM 3957 frame marked with a FIN flag, or 0 in the case of incoming 3958 unidirectional streams. 3960 An endpoint will know the final offset for a stream when the receive 3961 stream enters the "Size Known" or "Reset Recvd" state. 3963 An endpoint MUST NOT send data on a stream at or beyond the final 3964 offset. 3966 Once a final offset for a stream is known, it cannot change. If a 3967 RST_STREAM or STREAM frame causes the final offset to change for a 3968 stream, an endpoint SHOULD respond with a FINAL_OFFSET_ERROR error 3969 (see Section 11). A receiver SHOULD treat receipt of data at or 3970 beyond the final offset as a FINAL_OFFSET_ERROR error, even after a 3971 stream is closed. Generating these errors is not mandatory, but only 3972 because requiring that an endpoint generate these errors also means 3973 that the endpoint needs to maintain the final offset state for closed 3974 streams, which could mean a significant state commitment.
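The final offset rules above can be sketched as a small receiver-side check. This is a minimal sketch for a receiver that chooses to generate these errors, which this document recommends but does not require. The names FinalOffsetError and FinalOffsetTracker are hypothetical, and the exception stands in for signaling a FINAL_OFFSET_ERROR.

```python
class FinalOffsetError(Exception):
    """Stands in for signaling FINAL_OFFSET_ERROR to the peer."""


class FinalOffsetTracker:
    def __init__(self):
        self.final_offset = None      # unknown until FIN or RST_STREAM
        self.highest_received = 0     # end of the furthest data seen

    def on_stream(self, offset: int, length: int, fin: bool) -> None:
        end = offset + length
        if self.final_offset is not None:
            if end > self.final_offset:
                raise FinalOffsetError("data beyond the final offset")
            if fin and end != self.final_offset:
                raise FinalOffsetError("FIN changes the final offset")
        elif fin:
            if end < self.highest_received:
                raise FinalOffsetError("final offset below received data")
            self.final_offset = end
        self.highest_received = max(self.highest_received, end)

    def on_rst_stream(self, final_offset: int) -> None:
        if self.final_offset is not None and final_offset != self.final_offset:
            raise FinalOffsetError("RST_STREAM changes the final offset")
        if final_offset < self.highest_received:
            raise FinalOffsetError("final offset below received data")
        self.final_offset = final_offset
```

Retransmitted data that lies entirely within the final offset passes the check unchanged; only frames that would move or exceed the established final offset are rejected.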
3976 11. Error Handling 3978 An endpoint that detects an error SHOULD signal the existence of that 3979 error to its peer. Both transport-level and application-level errors 3980 can affect an entire connection (see Section 11.1), while only 3981 application-level errors can be isolated to a single stream (see 3982 Section 11.2). 3984 The most appropriate error code (Section 11.3) SHOULD be included in 3985 the frame that signals the error. Where this specification 3986 identifies error conditions, it also identifies the error code that 3987 is used. 3989 A stateless reset (Section 6.9.4) is not suitable for any error that 3990 can be signaled with a CONNECTION_CLOSE, APPLICATION_CLOSE, or 3991 RST_STREAM frame. A stateless reset MUST NOT be used by an endpoint 3992 that has the state necessary to send a frame on the connection. 3994 11.1. Connection Errors 3996 Errors that result in the connection being unusable, such as an 3997 obvious violation of protocol semantics or corruption of state that 3998 affects an entire connection, MUST be signaled using a 3999 CONNECTION_CLOSE or APPLICATION_CLOSE frame (Section 7.4, 4000 Section 7.5). An endpoint MAY close the connection in this manner 4001 even if the error only affects a single stream. 4003 Application protocols can signal application-specific protocol errors 4004 using the APPLICATION_CLOSE frame. Errors that are specific to the 4005 transport, including all those described in this document, are 4006 carried in a CONNECTION_CLOSE frame. Other than the type of error 4007 code they carry, these frames are identical in format and semantics. 4009 A CONNECTION_CLOSE or APPLICATION_CLOSE frame could be sent in a 4010 packet that is lost. An endpoint SHOULD be prepared to retransmit a 4011 packet containing either frame type if it receives more packets on a 4012 terminated connection. 
Limiting the number of retransmissions and 4013 the time over which this final packet is sent limits the effort 4014 expended on terminated connections. 4016 An endpoint that chooses not to retransmit packets containing 4017 CONNECTION_CLOSE or APPLICATION_CLOSE risks a peer missing the first 4018 such packet. The only mechanism available to an endpoint that 4019 continues to receive data for a terminated connection is to use the 4020 stateless reset process (Section 6.9.4). 4022 An endpoint that receives an invalid CONNECTION_CLOSE or 4023 APPLICATION_CLOSE frame MUST NOT signal the existence of the error to 4024 its peer. 4026 11.2. Stream Errors 4028 If an application-level error affects a single stream, but otherwise 4029 leaves the connection in a recoverable state, the endpoint can send a 4030 RST_STREAM frame (Section 7.3) with an appropriate error code to 4031 terminate just the affected stream. 4033 Stream 0 is critical to the functioning of the entire connection. If 4034 stream 0 is closed with either a RST_STREAM or STREAM frame bearing 4035 the FIN flag, an endpoint MUST generate a connection error of type 4036 PROTOCOL_VIOLATION. 4038 Other than STOPPING (Section 9.3), RST_STREAM MUST be instigated by 4039 the application and MUST carry an application error code. Resetting 4040 a stream without knowledge of the application protocol could cause 4041 the protocol to enter an unrecoverable state. Application protocols 4042 might require certain streams to be reliably delivered in order to 4043 guarantee consistent state between endpoints. 4045 11.3. Transport Error Codes 4047 QUIC error codes are 16-bit unsigned integers. 4049 This section lists the defined QUIC transport error codes that may be 4050 used in a CONNECTION_CLOSE frame. These errors apply to the entire 4051 connection. 4053 NO_ERROR (0x0): An endpoint uses this with CONNECTION_CLOSE to 4054 signal that the connection is being closed abruptly in the absence 4055 of any error. 
4057 INTERNAL_ERROR (0x1): The endpoint encountered an internal error and 4058 cannot continue with the connection. 4060 SERVER_BUSY (0x2): The server is currently busy and does not accept 4061 any new connections. 4063 FLOW_CONTROL_ERROR (0x3): An endpoint received more data than it 4064 permitted in its advertised data limits (see Section 10). 4066 STREAM_ID_ERROR (0x4): An endpoint received a frame for a stream 4067 identifier that exceeded its advertised maximum stream ID. 4069 STREAM_STATE_ERROR (0x5): An endpoint received a frame for a stream 4070 that was not in a state that permitted that frame (see 4071 Section 9.2). 4073 FINAL_OFFSET_ERROR (0x6): An endpoint received a STREAM frame 4074 containing data that exceeded the previously established final 4075 offset. Or an endpoint received a RST_STREAM frame containing a 4076 final offset that was lower than the maximum offset of data that 4077 was already received. Or an endpoint received a RST_STREAM frame 4078 containing a different final offset to the one already 4079 established. 4081 FRAME_FORMAT_ERROR (0x7): An endpoint received a frame that was 4082 badly formatted. For instance, an empty STREAM frame that omitted 4083 the FIN flag, or an ACK frame that has more acknowledgment ranges 4084 than the remainder of the packet could carry. This is a generic 4085 error code; an endpoint SHOULD use the more specific frame format 4086 error codes (0x1XX) if possible. 4088 TRANSPORT_PARAMETER_ERROR (0x8): An endpoint received transport 4089 parameters that were badly formatted, included an invalid value, 4090 omitted a mandatory parameter, included a forbidden parameter, 4091 or were otherwise in error. 4093 VERSION_NEGOTIATION_ERROR (0x9): An endpoint received transport 4094 parameters that contained version negotiation parameters that 4095 disagreed with the version negotiation that it performed. This 4096 error code indicates a potential version downgrade attack.
   PROTOCOL_VIOLATION (0xA): An endpoint detected an error with
      protocol compliance that was not covered by more specific error
      codes.

   UNSOLICITED_PATH_RESPONSE (0xB): An endpoint received a
      PATH_RESPONSE frame that did not correspond to any PATH_CHALLENGE
      frame that it previously sent.

   FRAME_ERROR (0x1XX): An endpoint detected an error in a specific
      frame type. The frame type is included as the last octet of the
      error code. For example, an error in a MAX_STREAM_ID frame would
      be indicated with the code (0x106).

   Codes for errors occurring when TLS is used for the crypto handshake
   are defined in Section 11 of [QUIC-TLS]. See Section 13.2 for
   details of registering new error codes.

11.4. Application Protocol Error Codes

   Application protocol error codes are 16-bit unsigned integers, but
   the management of application error codes is left to application
   protocols. Application protocol error codes are used for the
   RST_STREAM (Section 7.3) and APPLICATION_CLOSE (Section 7.5) frames.

   There is no restriction on the use of the 16-bit error code space for
   application protocols. However, QUIC reserves the error code with a
   value of 0 to mean STOPPING. The application error code of STOPPING
   (0) is used by the transport to cancel a stream in response to
   receipt of a STOP_SENDING frame.

12. Security and Privacy Considerations

12.1. Spoofed ACK Attack

   An attacker might be able to receive an address validation token
   (Section 6.6) from the server and then release the IP address it used
   to acquire that token. The attacker may, in the future, spoof this
   same address (which now presumably addresses a different endpoint),
   and initiate a 0-RTT connection with a server on the victim's behalf.
   The attacker can then spoof ACK frames to the server, causing the
   server to send excessive amounts of data toward the new owner of the
   IP address.

   There are two possible mitigations to this attack. The simplest one
   is that a server can unilaterally create a gap in packet-number
   space. In the non-attack scenario, the client will send an ACK frame
   with the larger value for largest acknowledged. In the attack
   scenario, the attacker could acknowledge a packet in the gap. If the
   server sees an acknowledgment for a packet that was never sent, the
   connection can be aborted.

   The second mitigation is that the server can require that
   acknowledgments for sent packets match the encryption level of the
   sent packet. This mitigation is useful if the connection has an
   ephemeral forward-secure key that is generated and used for every new
   connection. If a packet sent is protected with a forward-secure key,
   then any acknowledgments that are received for it MUST also be
   forward-secure protected. Since the attacker will not have the
   forward-secure key, the attacker will not be able to generate
   forward-secure protected packets with ACK frames.

12.2. Optimistic ACK Attack

   An endpoint that acknowledges packets it has not received might cause
   a congestion controller to permit sending at rates beyond what the
   network supports. An endpoint MAY skip packet numbers when sending
   packets to detect this behavior. An endpoint can then immediately
   close the connection with a connection error of type
   PROTOCOL_VIOLATION (see Section 6.9.3).

12.3. Slowloris Attacks

   The attacks commonly known as Slowloris [SLOWLORIS] try to keep many
   connections to the target endpoint open and hold them open as long as
   possible.
These attacks can be executed against a QUIC endpoint by
   generating the minimum amount of activity necessary to avoid being
   closed for inactivity. This might involve sending small amounts of
   data, gradually opening flow control windows in order to control the
   sender rate, or manufacturing ACK frames that simulate a high loss
   rate.

   QUIC deployments SHOULD provide mitigations for the Slowloris
   attacks, such as increasing the maximum number of clients the server
   will allow, limiting the number of connections a single IP address is
   allowed to make, imposing restrictions on the minimum transfer speed
   a connection is allowed to have, and restricting the length of time
   an endpoint is allowed to stay connected.

12.4. Stream Fragmentation and Reassembly Attacks

   An adversarial endpoint might intentionally fragment the data on
   stream buffers in order to cause disproportionate memory commitment.
   An adversarial endpoint could open a stream and send some STREAM
   frames containing arbitrary fragments of the stream content.

   The attack is mitigated if flow control windows correspond to
   available memory. However, some receivers will over-commit memory
   and advertise flow control offsets in the aggregate that exceed
   actual available memory. The over-commitment strategy can lead to
   better performance when endpoints are well behaved, but renders
   endpoints vulnerable to the stream fragmentation attack.

   QUIC deployments SHOULD provide mitigations against the stream
   fragmentation attack. Mitigations could consist of avoiding
   over-committing memory, delaying reassembly of STREAM frames,
   implementing heuristics based on the age and duration of reassembly
   holes, or some combination.

12.5. Stream Commitment Attack

   An adversarial endpoint can open lots of streams, exhausting state on
   an endpoint.
The adversarial endpoint could repeat the process on a
   large number of connections, in a manner similar to SYN flooding
   attacks in TCP.

   Normally, clients will open streams sequentially, as explained in
   Section 9.1. However, when several streams are initiated at short
   intervals, transmission errors may cause the STREAM frames opening
   those streams to be received out of sequence. A receiver is
   obligated to open intervening streams if a higher-numbered stream ID
   is received. Thus, on a new connection, opening stream 2000001 opens
   1 million streams, as required by the specification.

   The number of active streams is limited by the concurrent stream
   limit transport parameter, as explained in Section 9.4. If chosen
   judiciously, this limit mitigates the effect of the stream
   commitment attack. However, setting the limit too low could affect
   performance when applications expect to open a large number of
   streams.

13. IANA Considerations

13.1. QUIC Transport Parameter Registry

   IANA [SHALL add/has added] a registry for "QUIC Transport Parameters"
   under a "QUIC Protocol" heading.

   The "QUIC Transport Parameters" registry governs a 16-bit space.
   This space is split into two spaces that are governed by different
   policies. Values with the first byte in the range 0x00 to 0xfe (in
   hexadecimal) are assigned via the Specification Required policy
   [RFC8126]. Values with the first byte 0xff are reserved for Private
   Use [RFC8126].

   Registrations MUST include the following fields:

   Value: The numeric value of the assignment (registrations will be
      between 0x0000 and 0xfeff).

   Parameter Name: A short mnemonic for the parameter.

   Specification: A reference to a publicly available specification for
      the value.

   The nominated expert(s) verify that a specification exists and is
   readily accessible.
The expert(s) are encouraged to be biased
   towards approving registrations unless they are abusive, frivolous,
   or actively harmful (not merely aesthetically displeasing, or
   architecturally dubious).

   The initial contents of this registry are shown in Table 7.

   +--------+----------------------------+---------------+
   | Value  | Parameter Name             | Specification |
   +--------+----------------------------+---------------+
   | 0x0000 | initial_max_stream_data    | Section 6.4.1 |
   | 0x0001 | initial_max_data           | Section 6.4.1 |
   | 0x0002 | initial_max_stream_id_bidi | Section 6.4.1 |
   | 0x0003 | idle_timeout               | Section 6.4.1 |
   | 0x0005 | max_packet_size            | Section 6.4.1 |
   | 0x0006 | stateless_reset_token      | Section 6.4.1 |
   | 0x0007 | ack_delay_exponent         | Section 6.4.1 |
   | 0x0008 | initial_max_stream_id_uni  | Section 6.4.1 |
   +--------+----------------------------+---------------+

          Table 7: Initial QUIC Transport Parameters Entries

13.2. QUIC Transport Error Codes Registry

   IANA [SHALL add/has added] a registry for "QUIC Transport Error
   Codes" under a "QUIC Protocol" heading.

   The "QUIC Transport Error Codes" registry governs a 16-bit space.
   This space is split into two spaces that are governed by different
   policies. Values with the first byte in the range 0x00 to 0xfe (in
   hexadecimal) are assigned via the Specification Required policy
   [RFC8126]. Values with the first byte 0xff are reserved for Private
   Use [RFC8126].

   Registrations MUST include the following fields:

   Value: The numeric value of the assignment (registrations will be
      between 0x0000 and 0xfeff).

   Code: A short mnemonic for the error code.

   Description: A brief description of the error code semantics, which
      MAY be a summary if a specification reference is provided.
   Specification: A reference to a publicly available specification for
      the value.

   The initial contents of this registry are shown in Table 8. Note
   that FRAME_ERROR takes the range from 0x100 to 0x1FF and private use
   occupies the range from 0xFF00 to 0xFFFF.

   +-------------+---------------------------+----------------------------------------+---------------+
   | Value       | Error                     | Description                            | Specification |
   +-------------+---------------------------+----------------------------------------+---------------+
   | 0x0         | NO_ERROR                  | No error                               | Section 11.3  |
   | 0x1         | INTERNAL_ERROR            | Implementation error                   | Section 11.3  |
   | 0x2         | SERVER_BUSY               | Server currently busy                  | Section 11.3  |
   | 0x3         | FLOW_CONTROL_ERROR        | Flow control error                     | Section 11.3  |
   | 0x4         | STREAM_ID_ERROR           | Invalid stream ID                      | Section 11.3  |
   | 0x5         | STREAM_STATE_ERROR        | Frame received in invalid stream state | Section 11.3  |
   | 0x6         | FINAL_OFFSET_ERROR        | Change to final stream offset          | Section 11.3  |
   | 0x7         | FRAME_FORMAT_ERROR        | Generic frame format error             | Section 11.3  |
   | 0x8         | TRANSPORT_PARAMETER_ERROR | Error in transport parameters          | Section 11.3  |
   | 0x9         | VERSION_NEGOTIATION_ERROR | Version negotiation failure            | Section 11.3  |
   | 0xA         | PROTOCOL_VIOLATION        | Generic protocol violation             | Section 11.3  |
   | 0xB         | UNSOLICITED_PATH_RESPONSE | Unsolicited PATH_RESPONSE frame        | Section 11.3  |
   | 0x100-0x1FF | FRAME_ERROR               | Specific frame format error            | Section 11.3  |
   +-------------+---------------------------+----------------------------------------+---------------+

          Table 8: Initial QUIC Transport Error Codes Entries

14. References

14.1. Normative References

   [I-D.ietf-tls-tls13]
              Rescorla, E., "The Transport Layer Security (TLS) Protocol
              Version 1.3", draft-ietf-tls-tls13-21 (work in progress),
              July 2017.

   [PLPMTUD]  Mathis, M. and J. Heffner, "Packetization Layer Path MTU
              Discovery", RFC 4821, DOI 10.17487/RFC4821, March 2007.

   [PMTUDv4]  Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191,
              DOI 10.17487/RFC1191, November 1990.

   [PMTUDv6]  McCann, J., Deering, S., Mogul, J., and R. Hinden, Ed.,
              "Path MTU Discovery for IP version 6", STD 87, RFC 8201,
              DOI 10.17487/RFC8201, July 2017.

   [QUIC-RECOVERY]
              Iyengar, J., Ed. and I. Swett, Ed., "QUIC Loss Detection
              and Congestion Control", draft-ietf-quic-recovery-10 (work
              in progress), April 2018.

   [QUIC-TLS]
              Thomson, M., Ed. and S. Turner, Ed., "Using Transport
              Layer Security (TLS) to Secure QUIC",
              draft-ietf-quic-tls-10 (work in progress), April 2018.

   [RFC1191]  Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191,
              DOI 10.17487/RFC1191, November 1990.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC3629]  Yergeau, F., "UTF-8, a transformation format of ISO
              10646", STD 63, RFC 3629, DOI 10.17487/RFC3629, November
              2003.

   [RFC4086]  Eastlake 3rd, D., Schiller, J., and S. Crocker,
              "Randomness Requirements for Security", BCP 106, RFC 4086,
              DOI 10.17487/RFC4086, June 2005.

   [RFC4821]  Mathis, M. and J. Heffner, "Packetization Layer Path MTU
              Discovery", RFC 4821, DOI 10.17487/RFC4821, March 2007.

   [RFC8126]  Cotton, M., Leiba, B., and T.
Narten, "Guidelines for
              Writing an IANA Considerations Section in RFCs", BCP 26,
              RFC 8126, DOI 10.17487/RFC8126, June 2017.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

14.2. Informative References

   [EARLY-DESIGN]
              Roskind, J., "QUIC: Multiplexed Transport Over UDP",
              December 2013.

   [HTTP2]    Belshe, M., Peon, R., and M. Thomson, Ed., "Hypertext
              Transfer Protocol Version 2 (HTTP/2)", RFC 7540,
              DOI 10.17487/RFC7540, May 2015.

   [QUIC-INVARIANTS]
              Thomson, M., "Version-Independent Properties of QUIC",
              draft-ietf-quic-invariants-01 (work in progress), April
              2018.

   [RFC2104]  Krawczyk, H., Bellare, M., and R. Canetti, "HMAC:
              Keyed-Hashing for Message Authentication", RFC 2104,
              DOI 10.17487/RFC2104, February 1997.

   [RFC2360]  Scott, G., "Guide for Internet Standards Writers", BCP 22,
              RFC 2360, DOI 10.17487/RFC2360, June 1998.

   [RFC4787]  Audet, F., Ed. and C. Jennings, "Network Address
              Translation (NAT) Behavioral Requirements for Unicast
              UDP", BCP 127, RFC 4787, DOI 10.17487/RFC4787, January
              2007.

   [RFC5869]  Krawczyk, H. and P. Eronen, "HMAC-based Extract-and-Expand
              Key Derivation Function (HKDF)", RFC 5869,
              DOI 10.17487/RFC5869, May 2010.

   [RFC7301]  Friedl, S., Popov, A., Langley, A., and E. Stephan,
              "Transport Layer Security (TLS) Application-Layer Protocol
              Negotiation Extension", RFC 7301, DOI 10.17487/RFC7301,
              July 2014.

   [SLOWLORIS]
              RSnake Hansen, R., "Welcome to Slowloris...", June 2009.

   [SST]      Ford, B., "Structured streams", ACM SIGCOMM Computer
              Communication Review Vol. 37, pp. 361,
              DOI 10.1145/1282427.1282421, October 2007.

14.3. URIs

   [1] https://mailarchive.ietf.org/arch/search/?email_list=quic

   [2] https://github.com/quicwg

   [3] https://github.com/quicwg/base-drafts/labels/-transport

   [4] https://github.com/quicwg/base-drafts/wiki/QUIC-Versions

Appendix A. Contributors

   The original authors of this specification were Ryan Hamilton, Jana
   Iyengar, Ian Swett, and Alyssa Wilk.

   The original design and rationale behind this protocol draw
   significantly from work by Jim Roskind [EARLY-DESIGN]. In
   alphabetical order, the contributors to the pre-IETF QUIC project at
   Google are: Britt Cyr, Jeremy Dorfman, Ryan Hamilton, Jana Iyengar,
   Fedor Kouranov, Charles Krasic, Jo Kulik, Adam Langley, Jim Roskind,
   Robbie Shade, Satyam Shekhar, Cherie Shi, Ian Swett, Raman Tenneti,
   Victor Vasiliev, Antonio Vicente, Patrik Westin, Alyssa Wilk, Dale
   Worley, Fan Yang, Dan Zhang, Daniel Ziegler.

Appendix B. Acknowledgments

   Special thanks are due to the following for helping shape pre-IETF
   QUIC and its deployment: Chris Bentzel, Misha Efimov, Roberto Peon,
   Alistair Riddoch, Siddharth Vijayakrishnan, and Assar Westerlund.

   This document has benefited immensely from various private
   discussions and public ones on the quic@ietf.org and
   proto-quic@chromium.org mailing lists. Our thanks to all.

Appendix C. Change Log

      *RFC Editor's Note:* Please remove this section prior to
      publication of a final version of this document.

   Issue and pull request numbers are listed with a leading octothorp.

C.1. Since draft-ietf-quic-transport-10

   o Swap payload length and packet number fields in long header
     (#1294)

   o Clarified that CONNECTION_CLOSE is allowed in Handshake packet
     (#1274)

   o Spin bit reserved (#1283)

   o Coalescing multiple QUIC packets in a UDP datagram (#1262, #1285)

   o A more complete connection migration (#1249)

   o Refine opportunistic ACK defense text (#305, #1030, #1185)

   o A Stateless Reset Token isn't mandatory (#818, #1191)

   o Removed implicit stream opening (#896, #1193)

   o An empty STREAM frame can be used to open a stream without sending
     data (#901, #1194)

   o Define stream counts in transport parameters rather than a maximum
     stream ID (#1023, #1065)

   o STOP_SENDING is now prohibited before streams are used (#1050)

   o Recommend including ACK in Retry packets and allow PADDING (#1067,
     #882)

   o Endpoints now become closing after an idle timeout (#1178, #1179)

   o Remove implication that Version Negotiation is sent when a packet
     of the wrong version is received (#1197)

C.2. Since draft-ietf-quic-transport-09

   o Added PATH_CHALLENGE and PATH_RESPONSE frames to replace PING with
     Data and PONG frame. Changed ACK frame type from 0x0e to 0x0d.
     (#1091, #725, #1086)

   o A server can now only send 3 packets without validating the client
     address (#38, #1090)

   o Delivery order of stream data is no longer strongly specified
     (#252, #1070)

   o Rework of packet handling and version negotiation (#1038)

   o Stream 0 is now exempt from flow control until the handshake
     completes (#1074, #725, #825, #1082)

   o Improved retransmission rules for all frame types: information is
     retransmitted, not packets or frames (#463, #765, #1095, #1053)

   o Added an error code for server busy signals (#1137)

   o Endpoints now set the connection ID that their peer uses.
     Connection IDs are variable length.
Removed the
     omit_connection_id transport parameter and the corresponding short
     header flag. (#1089, #1052, #1146, #821, #745, #1166, #1151)

C.3. Since draft-ietf-quic-transport-08

   o Clarified requirements for BLOCKED usage (#65, #924)

   o BLOCKED frame now includes reason for blocking (#452, #924, #927,
     #928)

   o GAP limitation in ACK Frame (#613)

   o Improved PMTUD description (#614, #1036)

   o Clarified stream state machine (#634, #662, #743, #894)

   o Reserved versions don't need to be generated deterministically
     (#831, #931)

   o You don't always need the draining period (#871)

   o Stateless reset clarified as version-specific (#930, #986)

   o initial_max_stream_id_x transport parameters are optional (#970,
     #971)

   o Ack Delay assumes a default value during the handshake (#1007,
     #1009)

   o Removed transport parameters from NewSessionTicket (#1015)

C.4. Since draft-ietf-quic-transport-07

   o The long header now has version before packet number (#926, #939)

   o Rename and consolidate packet types (#846, #822, #847)

   o Packet types are assigned new codepoints and the Connection ID
     Flag is inverted (#426, #956)

   o Removed type for Version Negotiation and use Version 0 (#963,
     #968)

   o Streams are split into unidirectional and bidirectional (#643,
     #656, #720, #872, #175, #885)

      * Stream limits now have separate unidirectional and
        bidirectional transport parameters (#909, #958)

      * Stream limit transport parameters are now optional and default
        to 0 (#970, #971)

   o The stream state machine has been split into read and write (#634,
     #894)

   o Employ variable-length integer encodings throughout (#595)

   o Improvements to connection close

      * Added distinct closing and draining states (#899, #871)

      * Draining period can terminate early (#869, #870)

      * Clarifications about stateless reset (#889, #890)

   o Address validation for
     connection migration (#161, #732, #878)

   o Clearly defined retransmission rules for BLOCKED (#452, #65, #924)

   o negotiated_version is sent in server transport parameters (#710,
     #959)

   o Increased the range over which packet numbers are randomized
     (#864, #850, #964)

C.5. Since draft-ietf-quic-transport-06

   o Replaced FNV-1a with AES-GCM for all "Cleartext" packets (#554)

   o Split error code space between application and transport (#485)

   o Stateless reset token moved to end (#820)

   o 1-RTT-protected long header types removed (#848)

   o No acknowledgments during draining period (#852)

   o Remove "application close" as a separate close type (#854)

   o Remove timestamps from the ACK frame (#841)

   o Require transport parameters to only appear once (#792)

C.6. Since draft-ietf-quic-transport-05

   o Stateless token is server-only (#726)

   o Refactor section on connection termination (#733, #748, #328,
     #177)

   o Limit size of Version Negotiation packet (#585)

   o Clarify when and what to ack (#736)

   o Renamed STREAM_ID_NEEDED to STREAM_ID_BLOCKED

   o Clarify Keep-alive requirements (#729)

C.7. Since draft-ietf-quic-transport-04

   o Introduce STOP_SENDING frame, RST_STREAM only resets in one
     direction (#165)

   o Removed GOAWAY; application protocols are responsible for graceful
     shutdown (#696)

   o Reduced the number of error codes (#96, #177, #184, #211)

   o Version validation fields can't move or change (#121)

   o Removed versions from the transport parameters in a
     NewSessionTicket message (#547)

   o Clarify the meaning of "bytes in flight" (#550)

   o Public reset is now stateless reset and not visible to the path
     (#215)

   o Reordered bits and fields in STREAM frame (#620)

   o Clarifications to the stream state machine (#572, #571)

   o Increased the maximum length of the Largest Acknowledged field in
     ACK frames to 64 bits (#629)

   o truncate_connection_id is renamed to omit_connection_id (#659)

   o CONNECTION_CLOSE terminates the connection like TCP RST (#330,
     #328)

   o Update labels used in HKDF-Expand-Label to match TLS 1.3 (#642)

C.8. Since draft-ietf-quic-transport-03

   o Change STREAM and RST_STREAM layout

   o Add MAX_STREAM_ID settings

C.9. Since draft-ietf-quic-transport-02

   o The size of the initial packet payload has a fixed minimum (#267,
     #472)

   o Define when Version Negotiation packets are ignored (#284, #294,
     #241, #143, #474)

   o The 64-bit FNV-1a algorithm is used for integrity protection of
     unprotected packets (#167, #480, #481, #517)

   o Rework initial packet types to change how the connection ID is
     chosen (#482, #442, #493)

   o No timestamps are forbidden in unprotected packets (#542, #429)

   o Cryptographic handshake is now on stream 0 (#456)

   o Remove congestion control exemption for cryptographic handshake
     (#248, #476)

   o Version 1 of QUIC uses TLS; a new version is needed to use a
     different handshake protocol (#516)

   o STREAM frames have a reduced number of offset lengths (#543, #430)

   o Split some frames into separate connection- and stream-level
     frames (#443)

      * WINDOW_UPDATE split into MAX_DATA and MAX_STREAM_DATA (#450)

      * BLOCKED split to match WINDOW_UPDATE split (#454)

      * Define STREAM_ID_NEEDED frame (#455)

   o A NEW_CONNECTION_ID frame supports connection migration without
     linkability (#232, #491, #496)

   o Transport parameters for 0-RTT are retained from a previous
     connection (#405, #513, #512)

      * A client in 0-RTT is no longer required to reset excess streams
        (#425, #479)

   o Expanded security considerations (#440, #444, #445, #448)

C.10. Since draft-ietf-quic-transport-01

   o Defined short and long packet headers (#40, #148, #361)

   o Defined a versioning scheme and stable fields (#51, #361)

   o Define reserved version values for "greasing" negotiation (#112,
     #278)

   o The initial packet number is randomized (#35, #283)

   o Narrow the packet number encoding range requirement (#67, #286,
     #299, #323, #356)

   o Defined client address validation (#52, #118, #120, #275)

   o Define transport parameters as a TLS extension (#49, #122)

   o SCUP and COPT parameters are no longer valid (#116, #117)

   o Transport parameters for 0-RTT are either remembered from before,
     or assume default values (#126)

   o The server chooses connection IDs in its final flight (#119, #349,
     #361)

   o The server echoes the Connection ID and packet number fields when
     sending a Version Negotiation packet (#133, #295, #244)

   o Defined a minimum packet size for the initial handshake packet
     from the client (#69, #136, #139, #164)

   o Path MTU Discovery (#64, #106)

   o The initial handshake packet from the client needs to fit in a
     single packet (#338)

   o Forbid acknowledgment of packets containing only ACK and PADDING
     (#291)

   o Require that frames are processed when packets are acknowledged
     (#381, #341)

   o Removed the STOP_WAITING frame (#66)

   o Don't require retransmission of old timestamps for lost ACK frames
     (#308)

   o Clarified that frames are not retransmitted, but the information
     in them can be (#157, #298)

   o Error handling definitions (#335)

   o Split error codes into four sections (#74)

   o Forbid the use of Public Reset where CONNECTION_CLOSE is possible
     (#289)

   o Define packet protection rules (#336)

   o Require that a stream be entirely delivered or reset, including
     acknowledgment of all STREAM frames or the RST_STREAM, before it
     closes (#381)

   o Remove stream reservation from
     state machine (#174, #280)

   o Only stream 1 does not contribute to connection-level flow control
     (#204)

   o Stream 1 counts towards the maximum concurrent stream limit (#201,
     #282)

   o Remove connection-level flow control exclusion for some streams
     (except 1) (#246)

   o RST_STREAM affects connection-level flow control (#162, #163)

   o Flow control accounting uses the maximum data offset on each
     stream, rather than bytes received (#378)

   o Moved length-determining fields to the start of STREAM and ACK
     (#168, #277)

   o Added the ability to pad between frames (#158, #276)

   o Remove error code and reason phrase from GOAWAY (#352, #355)

   o GOAWAY includes a final stream number for both directions (#347)

   o Error codes for RST_STREAM and CONNECTION_CLOSE are now at a
     consistent offset (#249)

   o Defined priority as the responsibility of the application protocol
     (#104, #303)

C.11. Since draft-ietf-quic-transport-00

   o Replaced DIVERSIFICATION_NONCE flag with KEY_PHASE flag

   o Defined versioning

   o Reworked description of packet and frame layout

   o Error code space is divided into regions for each component

   o Use big endian for all numeric values

C.12. Since draft-hamilton-quic-transport-protocol-01

   o Adopted as base for draft-ietf-quic-tls

   o Updated authors/editors list

   o Added IANA Considerations section

   o Moved Contributors and Acknowledgments to appendices

Authors' Addresses

   Jana Iyengar (editor)
   Fastly

   Email: jri.ietf@gmail.com


   Martin Thomson (editor)
   Mozilla

   Email: martin.thomson@gmail.com
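
Illustrative example (non-normative)

   The layout of the 16-bit transport error code space described in
   Sections 11.3 and 13.2 can be sketched in Python as follows. This is
   only an illustration under stated assumptions: the names
   TRANSPORT_ERRORS, frame_error, and describe are hypothetical and are
   not defined by this document, and Private Use is taken to be the
   values whose first byte is 0xff, as stated in the registry policy of
   Section 13.2.

```python
# Non-normative sketch of the 16-bit transport error code space.
# All names below are hypothetical and chosen for illustration only.

# Named codes listed in Section 11.3.
TRANSPORT_ERRORS = {
    0x0: "NO_ERROR",
    0x1: "INTERNAL_ERROR",
    0x2: "SERVER_BUSY",
    0x3: "FLOW_CONTROL_ERROR",
    0x4: "STREAM_ID_ERROR",
    0x5: "STREAM_STATE_ERROR",
    0x6: "FINAL_OFFSET_ERROR",
    0x7: "FRAME_FORMAT_ERROR",
    0x8: "TRANSPORT_PARAMETER_ERROR",
    0x9: "VERSION_NEGOTIATION_ERROR",
    0xA: "PROTOCOL_VIOLATION",
    0xB: "UNSOLICITED_PATH_RESPONSE",
}


def frame_error(frame_type: int) -> int:
    """Build a FRAME_ERROR code (0x1XX): the frame type is the last octet."""
    if not 0 <= frame_type <= 0xFF:
        raise ValueError("frame type must fit in one octet")
    return 0x100 | frame_type


def describe(code: int) -> str:
    """Classify a transport error code into the regions of Section 13.2."""
    if not 0 <= code <= 0xFFFF:
        raise ValueError("error codes are 16-bit unsigned integers")
    if code in TRANSPORT_ERRORS:
        return TRANSPORT_ERRORS[code]
    if 0x100 <= code <= 0x1FF:
        return "FRAME_ERROR (frame type 0x%02x)" % (code & 0xFF)
    if code >> 8 == 0xFF:  # assumption: first byte 0xff is Private Use
        return "Private Use"
    return "unassigned"


if __name__ == "__main__":
    # The MAX_STREAM_ID example from Section 11.3 (frame type 0x06).
    print(hex(frame_error(0x06)))
    print(describe(frame_error(0x06)))
    print(describe(0xFF01))
```

   Keeping the named codes in a table and deriving FRAME_ERROR values
   from the frame type mirrors how the text defines them; a real
   implementation would also need the TLS-related codes from
   [QUIC-TLS], which this sketch leaves out.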