idnits 2.17.1

draft-ietf-quic-transport-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack a Security Considerations section.

  ** There are 19 instances of too long lines in the document, the longest
     one being 9 characters in excess of 72.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == The document seems to lack the recommended RFC 2119 boilerplate, even if
     it appears to use RFC 2119 keywords.

     (The document does seem to have the reference to RFC 2119 which the
     ID-Checklist requires).

  -- The document date (November 28, 2016) is 2705 days in the past. Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Unused Reference: 'QUIC-HTTP' is defined on line 2060, but no explicit
     reference was found in the text

  -- Possible downref: Non-RFC (?) normative reference: ref. 'QUIC-RECOVERY'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'QUIC-TLS'

  ** Obsolete normative reference: RFC 7540 (Obsoleted by RFC 9113)

     Summary: 3 errors (**), 0 flaws (~~), 3 warnings (==), 3 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.
--------------------------------------------------------------------------------

QUIC                                                     J. Iyengar, Ed.
Internet-Draft                                                    Google
Intended status: Standards Track                         M. Thomson, Ed.
Expires: June 1, 2017                                            Mozilla
                                                       November 28, 2016

           QUIC: A UDP-Based Multiplexed and Secure Transport
                      draft-ietf-quic-transport-00

Abstract

   QUIC is a multiplexed and secure transport protocol that runs on top
   of UDP. QUIC builds on past transport experience, and implements
   mechanisms that make it useful as a modern general-purpose transport
   protocol. Using UDP as the basis of QUIC is intended to address
   compatibility issues with legacy clients and middleboxes. QUIC
   authenticates all of its headers, preventing third parties from
   changing them. QUIC encrypts most of its headers, thereby limiting
   protocol evolution to QUIC endpoints only. Therefore, middleboxes,
   in large part, are not required to be updated as new protocol
   versions are deployed. This document describes the core QUIC
   protocol, including the conceptual design, wire format, and
   mechanisms of the QUIC protocol for connection establishment, stream
   multiplexing, stream and connection-level flow control, and data
   reliability. Accompanying documents describe QUIC's loss recovery
   and congestion control, and the use of TLS 1.3 for key negotiation.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on June 1, 2017.

Copyright Notice

   Copyright (c) 2016 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1. Introduction
   2. Conventions and Definitions
   3. A QUIC Overview
     3.1. Low-Latency Version Negotiation
     3.2. Low-Latency Connection Establishment
     3.3. Stream Multiplexing
     3.4. Rich Signaling for Congestion Control and Loss Recovery
     3.5. Stream and Connection Flow Control
     3.6. Authenticated and Encrypted Header and Payload
     3.7. Connection Migration and Resilience to NAT Rebinding
   4. Packet Types and Formats
     4.1. Common Header
     4.2. Regular Packets
       4.2.1. Packet Number Compression and Reconstruction
       4.2.2. Frames and Frame Types
     4.3. Version Negotiation Packet
     4.4. Public Reset Packet
   5. Life of a Connection
     5.1. Version Negotiation
     5.2. Crypto and Transport Handshake
       5.2.1. Transport Parameters and Options
       5.2.2. Proof of Source Address Ownership
       5.2.3. Crypto Handshake Protocol Features
     5.3. Connection Migration
     5.4. Connection Termination
   6. Frame Types and Formats
     6.1. STREAM Frame
     6.2. ACK Frame
       6.2.1. Time Format
     6.3. STOP_WAITING Frame
     6.4. WINDOW_UPDATE Frame
     6.5. BLOCKED Frame
     6.6. RST_STREAM Frame
     6.7. PADDING Frame
     6.8. PING frame
     6.9. CONNECTION_CLOSE frame
     6.10. GOAWAY Frame
   7. Packetization and Reliability
   8. Streams: QUIC's Data Structuring Abstraction
     8.1. Life of a Stream
       8.1.1. idle
       8.1.2. reserved
       8.1.3. open
       8.1.4. half-closed (local)
       8.1.5. half-closed (remote)
       8.1.6. closed
     8.2. Stream Identifiers
     8.3. Stream Concurrency
     8.4. Sending and Receiving Data
   9. Flow Control
     9.1. Edge Cases and Other Considerations
       9.1.1. Mid-stream RST_STREAM
       9.1.2. Response to a RST_STREAM
       9.1.3. Offset Increment
       9.1.4. BLOCKED frames
   10. Error Codes
   11. Security and Privacy Considerations
     11.1. Spoofed Ack Attack
   12. IANA Considerations
   13. References
     13.1. Normative References
     13.2. Informative References
   Appendix A. Contributors
   Appendix B. Acknowledgments
   Authors' Addresses

1. Introduction

   QUIC is a multiplexed and secure transport protocol that runs on top
   of UDP. QUIC builds on past transport experience and implements
   mechanisms that make it useful as a modern general-purpose transport
   protocol. Using UDP as the substrate, QUIC seeks to be compatible
   with legacy clients and middleboxes. QUIC authenticates all of its
   headers, preventing middleboxes and other third parties from changing
   them, and encrypts most of its headers, limiting protocol evolution
   largely to QUIC endpoints only.

   This document describes the core QUIC protocol, including the
   conceptual design, wire format, and mechanisms of the QUIC protocol
   for connection establishment, stream multiplexing, stream and
   connection-level flow control, and data reliability. Accompanying
   documents describe QUIC's loss detection and congestion control
   [QUIC-RECOVERY], and the use of TLS 1.3 for key negotiation
   [QUIC-TLS].

2. Conventions and Definitions

   The words "MUST", "MUST NOT", "SHOULD", and "MAY" are used in this
   document. It's not shouting; when they are capitalized, they have
   the special meaning defined in [RFC2119].

   Definitions of terms that are used in this document:

   o Client: The endpoint initiating a QUIC connection.

   o Server: The endpoint accepting incoming QUIC connections.

   o Endpoint: The client or server end of a connection.

   o Stream: A logical, bi-directional channel of ordered bytes within
     a QUIC connection.

   o Connection: A conversation between two QUIC endpoints with a
     single encryption context that multiplexes streams within it.

   o Connection ID: The identifier for a QUIC connection.

   o QUIC packet: A well-formed UDP payload that can be parsed by a
     QUIC receiver. QUIC packet size in this document refers to the
     UDP payload size.

3. A QUIC Overview

   This section briefly describes QUIC's key mechanisms and benefits.
   Key strengths of QUIC include:

   o Low-latency Version Negotiation

   o Low-latency connection establishment

   o Multiplexing without head-of-line blocking

   o Authenticated and encrypted header and payload

   o Rich signaling for congestion control and loss recovery

   o Stream and connection flow control

   o Connection Migration and Resilience to NAT rebinding

3.1. Low-Latency Version Negotiation

   QUIC combines version negotiation with the rest of connection
   establishment to avoid unnecessary roundtrip delays. A QUIC client
   proposes a version to use for the connection, and encodes the rest of
   the handshake using the proposed version. If the server does not
   speak the client-chosen version, it forces version negotiation by
   sending back a Version Negotiation packet to the client, causing a
   roundtrip of delay before connection establishment.

   This mechanism eliminates roundtrip latency when the client's
   optimistically-chosen version is spoken by the server, and
   incentivizes servers to not lag behind clients in deployment of newer
   versions. Additionally, an application may negotiate QUIC versions
   out-of-band to increase chances of success in the first roundtrip and
   to obviate the additional roundtrip in the case of version mismatch.

3.2. Low-Latency Connection Establishment

   QUIC relies on a combined crypto and transport handshake for setting
   up a secure transport connection. QUIC connections are expected to
   commonly use 0-RTT handshakes, meaning that for most QUIC
   connections, data can be sent immediately following the client
   handshake packet, without waiting for a reply from the server. QUIC
   provides a dedicated stream (Stream ID 1) to be used for performing
   the crypto handshake and QUIC options negotiation. The format of the
   QUIC options and parameters used during negotiation is described in
   this document, but the handshake protocol that runs on Stream ID 1 is
   described in the accompanying crypto handshake draft [QUIC-TLS].

3.3. Stream Multiplexing

   When application messages are transported over TCP, independent
   application messages can suffer from head-of-line blocking. When an
   application multiplexes many streams atop TCP's single-bytestream
   abstraction, a loss of a TCP segment results in blocking of all
   subsequent segments until a retransmission arrives, irrespective of
   the application streams that are encapsulated in subsequent segments.
   QUIC ensures that lost packets carrying data for an individual stream
   only impact that specific stream. Data received on other streams can
   continue to be reassembled and delivered to the application.

3.4. Rich Signaling for Congestion Control and Loss Recovery

   QUIC's packet framing and acknowledgments carry rich information that
   helps both congestion control and loss recovery in fundamental ways.
   Each QUIC packet carries a new packet number, including those
   carrying retransmitted data. This obviates the need for a separate
   mechanism to distinguish acks for retransmissions from those for
   original transmissions, avoiding TCP's retransmission ambiguity
   problem. QUIC acknowledgments also explicitly encode the delay
   between the receipt of a packet and its acknowledgment being sent,
   and together with the monotonically-increasing packet numbers, this
   allows for precise network roundtrip-time (RTT) calculation. QUIC's
   ACK frames support up to 256 ack blocks, so QUIC is more resilient to
   reordering than TCP with SACK support, as well as able to keep more
   bytes on the wire when there is reordering or loss.

3.5. Stream and Connection Flow Control

   QUIC implements stream- and connection-level flow control, closely
   following HTTP/2's flow control mechanisms. At a high level, a QUIC
   receiver advertises the absolute byte offset within each stream up to
   which the receiver is willing to receive data. As data is sent,
   received, and delivered on a particular stream, the receiver sends
   WINDOW_UPDATE frames that increase the advertised offset limit for
   that stream, allowing the peer to send more data on that stream. In
   addition to this stream-level flow control, QUIC implements
   connection-level flow control to limit the aggregate buffer that a
   QUIC receiver is willing to allocate to all streams on a connection.
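
   As an illustration of the stream-level mechanism just described, the
   following Python sketch tracks the advertised offset limit for one
   stream on the receiving side. It is a minimal sketch and not part of
   the protocol definition: the class name, the fixed "window"
   parameter, and the policy of advertising a new limit as soon as
   delivery allows it are all assumptions made for this example.

```python
from typing import Optional


class StreamReceiveWindow:
    """Receiver-side flow control state for one stream (illustrative only)."""

    def __init__(self, initial_limit: int, window: int) -> None:
        self.limit = initial_limit   # absolute byte offset currently advertised
        self.window = window         # assumed allowance beyond delivered data
        self.highest_received = 0    # highest byte offset received so far
        self.delivered = 0           # bytes handed to the application

    def on_stream_data(self, offset: int, length: int) -> None:
        """Record received data; reject data past the advertised limit."""
        end = offset + length
        if end > self.limit:
            raise ValueError("peer sent data beyond the advertised offset")
        self.highest_received = max(self.highest_received, end)

    def on_delivered(self, nbytes: int) -> Optional[int]:
        """Return a new limit to carry in a WINDOW_UPDATE frame, or None."""
        self.delivered += nbytes
        new_limit = self.delivered + self.window
        if new_limit > self.limit:
            self.limit = new_limit
            return new_limit
        return None
```

   In this sketch the advertised limit only ever grows, and a sender
   that respects it never has more than "window" undelivered bytes
   buffered at the receiver for the stream.
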
   Connection-level flow control works in the same way as stream-level
   flow control, but the bytes delivered and highest received offset are
   all aggregates across all streams.

3.6. Authenticated and Encrypted Header and Payload

   TCP headers appear in plaintext on the wire and are not
   authenticated, causing a plethora of injection and header
   manipulation issues for TCP, such as receive-window manipulation and
   sequence-number overwriting. While some of these are mechanisms used
   by middleboxes to improve TCP performance, others are active attacks.
   Even "performance-enhancing" middleboxes that routinely interpose on
   the transport state machine end up limiting the evolvability of the
   transport protocol, as has been observed in the design of MPTCP and
   in its subsequent deployability issues.

   Generally, QUIC packets are always authenticated and the payload is
   typically fully encrypted. The parts of the packet header which are
   not encrypted are still authenticated by the receiver, so as to
   thwart any packet injection or manipulation by third parties. Some
   early handshake packets, such as the Version Negotiation packet, are
   not encrypted, but information sent in these unencrypted handshake
   packets is later verified under crypto cover.

   PUBLIC_RESET packets that reset a connection are currently not
   authenticated.

3.7. Connection Migration and Resilience to NAT Rebinding

   QUIC connections are identified by a 64-bit Connection ID, randomly
   generated by the client. QUIC's consistent connection ID allows
   connections to survive changes to the client's IP and port, such as
   those caused by NAT rebindings or by the client changing network
   connectivity to a new address. QUIC provides automatic cryptographic
   verification of a rebound client, since the client continues to use
   the same session key for encrypting and decrypting packets. The
   consistent connection ID can be used to allow migration of the
   connection to a new server IP address as well, since the Connection
   ID remains consistent across changes in the client's and the server's
   network addresses.

4. Packet Types and Formats

   We first describe QUIC's packet types and their formats, since some
   are referenced in subsequent mechanisms. Note that unless otherwise
   noted, all values specified in this document are in little-endian
   format and all field sizes are in bits.

4.1. Common Header

   All QUIC packets begin with a QUIC Common header, as shown below.

   +------------+---------------------------------+
   |  Flags(8)  |  Connection ID (64) (optional)  |
   +------------+---------------------------------+

   The fields in the Common Header are the following:

   o Flags:

      * 0x01 = VERSION. The semantics of this flag depend on whether
        the packet is sent by the server or the client. A client MAY
        set this flag and include exactly one proposed version. A
        server may set this flag when the client-proposed version was
        unsupported, and may then provide a list (0 or more) of
        acceptable versions as a part of version negotiation (described
        in Section XXX.)

      * 0x02 = PUBLIC_RESET. Set to indicate that the packet is a
        Public Reset packet.

      * 0x04 = DIVERSIFICATION_NONCE. Set to indicate the presence of
        a 32-byte diversification nonce in the header.
        (DISCUSS_AND_MODIFY: This flag should be removed along with the
        Diversification Nonce bits, as discussed further below.)

      * 0x08 = CONNECTION_ID. Indicates the Connection ID is present
        in the packet. This must be set in all packets until
        negotiated to a different value for a given direction. For
        instance, if a client indicates that the 5-tuple fully
        identifies the connection at the client, the connection ID is
        optional in the server-to-client direction.

      * 0x30 = PACKET_NUMBER_SIZE.
        These two bits indicate the number of low-order bytes of the
        packet number that are present in each packet.

         + 11 indicates that 6 bytes of the packet number are present

         + 10 indicates that 4 bytes of the packet number are present

         + 01 indicates that 2 bytes of the packet number are present

         + 00 indicates that 1 byte of the packet number is present

      * 0x40 = MULTIPATH. This bit is reserved for multipath use.

      * 0x80 is currently unused, and must be set to 0.

   o Connection ID: An unsigned 64-bit random number chosen by the
     client, used as the identifier of the connection. Connection ID
     is tied to a QUIC connection, and remains consistent across client
     and/or server IP and port changes.

   While all QUIC packets have the same common header, there are three
   types of packets: Regular packets, Version Negotiation packets, and
   Public Reset packets. The flowchart below shows how a packet is
   classified into one of these three packet types:

   Check the flags in the common header
                 |
                 |
                 V
          +--------------+
          | PUBLIC_RESET |  YES
          | flag set?    |-------> Public Reset packet
          +--------------+
                 |
                 | NO
                 V
          +------------+         +-------------+
          | VERSION    |   YES   | Packet sent |  YES
          | flag set?  |-------->| by server?  |--------> Version Negotiation
          +------------+         +-------------+               packet
                 |                      |
                 | NO                   | NO
                 V                      V
       Regular packet with       Regular packet with
   no QUIC Version in header    QUIC Version in header

                    Figure 1: Types of QUIC Packets

4.2. Regular Packets

   Each Regular packet's header consists of a Common Header followed by
   fields specific to Regular packets, as shown below:

   +------------+---------------------------------+
   |  Flags(8)  |  Connection ID (64) (optional)  | ->
   +------------+---------------------------------+
   +---------------------------------------+-------------------------------+
   |  Version (32) (client-only, optional) |  Diversification Nonce (256)  | ->
   +---------------------------------------+-------------------------------+
   +------------------------------------+
   |  Packet Number (8, 16, 32, or 48)  | ->
   +------------------------------------+
   +------------+
   |  AEAD Data |
   +------------+

   Decrypted AEAD Data:
   +------------+-----------+     +-----------+
   |  Frame 1   |  Frame 2  | ... |  Frame N  |
   +------------+-----------+     +-----------+

                        Figure 2: Regular Packet

   The fields in a Regular packet past the Common Header are the
   following:

   o QUIC Version: A 32-bit opaque tag that represents the version of
     the QUIC protocol. Only present in the client-to-server
     direction, and if the VERSION flag is set. Version Negotiation is
     described in Section XXX.

   o DISCUSS_AND_REPLACE: Diversification Nonce: A 32-byte nonce
     generated by the server and used only in the Server->Client
     direction to ensure that the server is able to generate unique
     keys per connection. Specifically, when using QUIC's 0-RTT crypto
     handshake, a repeated CHLO with the exact same connection ID and
     CHLO can lead to the same (intermediate) initial-encryption keys
     being derived for the connection. A server-generated nonce
     disallows a client from causing the same keys to be derived for
     two distinct connections. Once the connection is forward-secure,
     this nonce is no longer present in packets.
     This nonce can be removed from the packet header if a requirement
     can be added for the crypto handshake to ensure key uniqueness.
     The expectation is that TLS 1.3 meets this requirement. Upon
     working group adoption of this document, this requirement should
     be added to the crypto handshake requirements, and the nonce
     should be removed from the packet format.

   o Packet Number: The lower 8, 16, 32, or 48 bits of the packet
     number, based on the PACKET_NUMBER_SIZE flag. Each Regular packet
     is assigned a packet number by the sender. The first packet sent
     by an endpoint MUST have a packet number of 1.

   o AEAD Data: A Regular packet's header, which includes the Common
     Header, and the Version, Diversification Nonce, and Packet Number
     fields, is authenticated but not encrypted. The rest of a Regular
     packet, starting with the first frame, is both authenticated and
     encrypted. Immediately following the header, Regular packets
     contain AEAD (Authenticated Encryption with Associated Data) data.
     This data must be decrypted in order for the contents to be
     interpreted. After decryption, the plaintext consists of a
     sequence of frames, as shown (frames are described in
     Section XXX).

4.2.1. Packet Number Compression and Reconstruction

   The complete packet number is a 64-bit unsigned number and is used as
   part of a cryptographic nonce for packet encryption. To reduce the
   number of bits required to represent the packet number over the wire,
   at most 48 bits of the packet number are transmitted over the wire.
   A QUIC endpoint MUST NOT reuse a complete packet number within the
   same connection (that is, under the same cryptographic keys). If the
   total number of packets transmitted in this connection reaches 2^64 -
   1, the sender MUST close the connection by sending a CONNECTION_CLOSE
   frame with the error code QUIC_SEQUENCE_NUMBER_LIMIT_REACHED
   (connection termination is described in Section XXX.) For
   unambiguous reconstruction of the complete packet number by a
   receiver from the lower-order bits, a QUIC sender MUST NOT have more
   than 2^(packet_number_size - 2) packets in flight at any point in the
   connection, where packet_number_size is the number of packet number
   bits transmitted. In other words,

   o If a sender sets PACKET_NUMBER_SIZE bits to 11, it MUST NOT have
     more than (2^46) packets in flight.

   o If a sender sets PACKET_NUMBER_SIZE bits to 10, it MUST NOT have
     more than (2^30) packets in flight.

   o If a sender sets PACKET_NUMBER_SIZE bits to 01, it MUST NOT have
     more than (2^14) packets in flight.

   o If a sender sets PACKET_NUMBER_SIZE bits to 00, it MUST NOT have
     more than (2^6) packets in flight.

      DISCUSS: Should the receiver be required to enforce this rule that
      the sender MUST NOT exceed the inflight limit? Specifically,
      should the receiver drop packets that are received outside this
      window?

   Any truncated packet number received from a peer MUST be
   reconstructed as the value closest to the next expected packet
   number from that peer.

   (TODO: Clarify how packet number size can change mid-connection.)

4.2.2. Frames and Frame Types

   A Regular packet MUST contain at least one frame, and MAY contain
   multiple frames and multiple frame types. Frames MUST fit within a
   single QUIC packet and MUST NOT span a QUIC packet boundary.
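
   To make the packet number rules of Section 4.2.1 concrete, the
   following Python sketch decodes the PACKET_NUMBER_SIZE bits of the
   flags byte and reconstructs a full packet number as the value closest
   to the next expected one. It is a hedged illustration only: the
   function names are invented for this example, and the way a tie
   between two equally close candidates is broken (the lower value wins)
   is an assumption, since the text does not specify it.

```python
PACKET_NUMBER_SIZE_MASK = 0x30

# Mapping from the two PACKET_NUMBER_SIZE bits to bytes on the wire.
_PN_BYTES = {0b00: 1, 0b01: 2, 0b10: 4, 0b11: 6}


def packet_number_length(flags: int) -> int:
    """Number of packet number bytes present, per the flags byte."""
    return _PN_BYTES[(flags & PACKET_NUMBER_SIZE_MASK) >> 4]


def reconstruct_packet_number(truncated: int, nbytes: int,
                              next_expected: int) -> int:
    """Pick the full packet number closest to next_expected whose low
    8 * nbytes bits equal the truncated value received on the wire."""
    epoch = 1 << (8 * nbytes)
    # Candidate sharing the high-order bits of next_expected.
    base = (next_expected & ~(epoch - 1)) | truncated
    # The closest value is this candidate or one epoch below or above it.
    candidates = [c for c in (base - epoch, base, base + epoch)
                  if 0 <= c < 2 ** 64]
    return min(candidates, key=lambda c: abs(c - next_expected))
```

   For example, with one byte on the wire and packet 0x100 expected
   next, a received value of 0x01 maps to 0x101 while 0xFF maps to
   0x0FF, because each is the closest match to the expected number.
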
   Each frame begins with a Frame Type byte, indicating its type,
   followed by type-dependent headers, and variable-length data, as
   follows:

   +-----------+---------------------------+-------------------------+
   |  Type (8) |  Headers (type-dependent) |  Data (type-dependent)  |
   +-----------+---------------------------+-------------------------+

   The following table lists currently defined frame types. Note that
   the Frame Type byte in STREAM and ACK frames is used to carry other
   frame-specific flags. For all other frames, the Frame Type byte
   simply identifies the frame. These frames are explained in more
   detail as they are referenced later in the document.

   +------------------+--------------------+
   | Type-field value |     Frame type     |
   +------------------+--------------------+
   | 1FDOOOSS         |  STREAM            |
   | 01NTLLMM         |  ACK               |
   | 00000000 (0x00)  |  PADDING           |
   | 00000001 (0x01)  |  RST_STREAM        |
   | 00000010 (0x02)  |  CONNECTION_CLOSE  |
   | 00000011 (0x03)  |  GOAWAY            |
   | 00000100 (0x04)  |  WINDOW_UPDATE     |
   | 00000101 (0x05)  |  BLOCKED           |
   | 00000110 (0x06)  |  STOP_WAITING      |
   | 00000111 (0x07)  |  PING              |
   +------------------+--------------------+

                     Figure 3: Types of QUIC Frames

4.3. Version Negotiation Packet

   A Version Negotiation packet is only sent by the server, MUST have
   the VERSION flag set, and MUST include the full 64-bit Connection ID.
   The rest of the Version Negotiation packet is a list of 4-byte
   versions which the server supports, as shown below.

   +-----------------------------------+
   |  Flags(8)  |  Connection ID (64)  | ->
   +-----------------------------------+
   +------------------------------+------------------------------+
   |  1st Supported Version (32)  |  2nd Supported Version (32)  | ...
   +------------------------------+------------------------------+

                  Figure 4: Version Negotiation Packet

4.4. Public Reset Packet

   A Public Reset packet MUST have the PUBLIC_RESET flag set, and MUST
   include the full 64-bit connection ID. The rest of the Public Reset
   packet is encoded as if it were a crypto handshake message of the tag
   PRST, as shown below.

   +-----------------------------------+
   |  Flags(8)  |  Connection ID (64)  | ->
   +-----------------------------------+
   +-------------------------------------+
   |  Quic Tag (PRST) and tag value map  |
   +-------------------------------------+

                     Figure 5: Public Reset Packet

   The tag value map contains the following tag-values:

   o RNON (public reset nonce proof) - a 64-bit unsigned integer.

   o RSEQ (rejected packet number) - a 64-bit packet number.

   o CADR (client address) - the observed client IP address and port
     number. This is currently for debugging purposes only and hence
     is optional.

   DISCUSS_AND_REPLACE: The crypto handshake message format is described
   in the QUIC crypto document, and should be replaced with something
   simpler when this document is adopted. The purpose of the tag-value
   map following the PRST tag is to enable the receiver of the Public
   Reset packet to reasonably authenticate the packet. This map is an
   extensible map format that allows specification of various tags,
   which should again be replaced by something simpler.

5. Life of a Connection

   A QUIC connection is a single conversation between two QUIC
   endpoints. QUIC's connection establishment intertwines version
   negotiation with the crypto and transport handshakes to reduce
   connection establishment latency, as described in Section XXX. Once
   established, a connection may migrate to a different IP or port at
   either endpoint, due to NAT rebinding or mobility, as described in
   Section XXX. Finally, a connection may be terminated by either
   endpoint, as described in Section XXX.

5.1. Version Negotiation

   QUIC's connection establishment begins with version negotiation,
   since all communication between the endpoints, including packet and
   frame formats, relies on the two endpoints agreeing on a version.

   A QUIC connection begins with a client sending a handshake packet.
   The details of the handshake mechanisms are described in Section XX,
   but all of the initial packets sent from the client to the server
   MUST have the VERSION flag set, and MUST specify the version of the
   protocol being used.

   When the server receives a packet from a client with the VERSION flag
   set for a connection that has not yet been established, it compares
   the client's version to the versions it supports.

   o If the client's version is acceptable to the server, the server
     MUST use this protocol version for the lifetime of the connection.
     All subsequent packets sent by the server MUST have the VERSION
     flag off.

   o If the client's version is not acceptable to the server, the
     server MUST send a Version Negotiation packet to the client. This
     packet will have the VERSION flag set and will include the
     server's set of supported versions. On subsequently received
     packets for the same connection ID with the unacceptable version,
     the server MUST continue responding with a Version Negotiation
     packet.

   When the client receives a Version Negotiation packet from the
   server, it should select an acceptable protocol version. If such a
   version is found, the client MUST resend all packets using the new
   version, and the resent packets MUST use new packet numbers. These
   packets MUST continue to have the VERSION flag set and MUST include
   the new negotiated protocol version.

   The client MUST send its version on all packets until it receives a
   packet from the server with the VERSION flag off.
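
   The server-side decision described above, together with the Version
   Negotiation packet layout of Section 4.3, can be sketched in Python
   as follows. This is a rough illustration under stated assumptions:
   the function names and the version tags used in the usage note are
   hypothetical, the CONNECTION_ID flag is set on the assumption that a
   packet carrying the Connection ID marks it as present, and the
   Connection ID is written little-endian following the general note in
   Section 4. Versions are treated as opaque 4-byte tags and copied
   unchanged.

```python
import struct
from typing import Sequence, Tuple

FLAG_VERSION = 0x01
FLAG_CONNECTION_ID = 0x08


def build_version_negotiation(connection_id: int,
                              supported: Sequence[bytes]) -> bytes:
    """Flags byte, 64-bit Connection ID, then the 4-byte version tags."""
    flags = FLAG_VERSION | FLAG_CONNECTION_ID
    header = struct.pack("<BQ", flags, connection_id)
    return header + b"".join(supported)


def handle_initial_packet(connection_id: int, client_version: bytes,
                          supported: Sequence[bytes]) -> Tuple[str, bytes]:
    """Accept the client's version, or answer with Version Negotiation."""
    if client_version in supported:
        # The server uses this version for the lifetime of the connection.
        return ("accept", client_version)
    return ("negotiate",
            build_version_negotiation(connection_id, supported))
```

   With the made-up tags b"VER1" and b"VER2" as the supported set, a
   client proposing b"VER2" is accepted, while a client proposing any
   other tag receives a 17-byte packet: one flags byte, eight bytes of
   Connection ID, and the two supported tags.
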
   If version negotiation is successful, the client should receive a
   packet from the server with the VERSION flag off indicating the end
   of version negotiation. All subsequent packets the client sends MUST
   have the VERSION flag off.

   Once the server receives a packet from the client with the VERSION
   flag off, it MUST ignore the VERSION flag in subsequently received
   packets.

   The Version Negotiation packet is unencrypted and exchanged without
   authentication. To avoid a downgrade attack, the client needs to
   verify its record of the server's version list in the Version
   Negotiation packet and the server needs to verify its record of the
   client's originally proposed version. Therefore, the client and
   server MUST include this information later in their corresponding
   crypto handshake data.

5.2. Crypto and Transport Handshake

   QUIC relies on a combined crypto and transport handshake to minimize
   connection establishment latency. QUIC provides a dedicated stream
   (Stream ID 1) to be used for performing a combined connection and
   security handshake (streams are described in detail in Section XXX).
   The crypto handshake protocol encapsulates and delivers QUIC's
   transport handshake to the peer on the crypto stream. The first QUIC
   packet from the client to the server MUST carry handshake information
   as data on Stream ID 1.

5.2.1. Transport Parameters and Options

   During connection establishment, the handshake must negotiate various
   transport parameters. The currently defined transport parameters are
   described later in the document.

   The transport component of the handshake is responsible for
   exchanging and negotiating the following parameters for a QUIC
   connection. Not all parameters are negotiated; some are sent in just
   one direction.
   These parameters and options are encoded and handed off to the crypto
   handshake protocol to be transmitted to the peer.

5.2.1.1. Encoding

   (TODO: Describe format with example)

   QUIC encodes the transport parameters and options as tag-value pairs,
   all as 7-bit ASCII strings. QUIC parameter tags are listed below.

5.2.1.2. Required Transport Parameters

   o  SFCW: Stream Flow Control Window. The stream level flow control
      byte offset advertised by the sender of this parameter.

   o  CFCW: Connection Flow Control Window. The connection level flow
      control byte offset advertised by the sender of this parameter.

   o  MSPC: Maximum number of incoming streams per connection.

5.2.1.3. Optional Transport Parameters

   o  TCID: Indicates support for truncated Connection IDs. If sent by
      a peer, indicates that connection IDs sent to the peer should be
      truncated to 0 bytes. This is expected to commonly be used by an
      endpoint where the 5-tuple is sufficient to identify a connection.
      For instance, if the 5-tuple is unique at the client, the client
      MAY send a TCID parameter to the server. When a TCID parameter is
      received, an endpoint MAY choose to not send the connection ID on
      subsequent packets.

   o  COPT: Connection Options are a repeated tag field. The field
      contains any connection options being requested by the client or
      server. These are typically used for experimentation and will
      evolve over time. Example use cases include changing congestion
      control algorithms and parameters such as initial window. (TODO:
      List connection options.)

5.2.2. Proof of Source Address Ownership

   Transport protocols commonly use a roundtrip time to verify a
   client's address ownership for protection from malicious clients that
   spoof their source address. QUIC uses a cookie, called the Source
   Address Token (STK), to mostly eliminate this roundtrip of delay.
   This technique is similar to TCP Fast Open's use of a cookie to avoid
   a roundtrip of delay in TCP connection establishment.

   On a new connection, a QUIC server sends an STK, which is opaque to
   and stored by the client. On a subsequent connection, the client
   echoes it in the transport handshake as proof of IP ownership.

   A QUIC server also uses the STK to store server-designated connection
   IDs for Stateless Rejects, to verify that an incoming connection
   contains the correct connection ID.

   A QUIC server MAY additionally store other data in the STK, such as
   measured bandwidth and measured minimum RTT to the client that may
   help the server better bootstrap a subsequent connection from the
   same client. A server MAY send an updated STK message mid-connection
   to update server state that is stored at the client in the STK.

   (TODO: Describe server and client actions on STK, encoding,
   recommendations for what to put in an STK. Describe SCUP messages.)

5.2.3. Crypto Handshake Protocol Features

   QUIC's current crypto handshake mechanism is documented in
   [QUICCrypto]. QUIC does not restrict itself to using a specific
   handshake protocol, so the details of a specific handshake protocol
   are out of this document's scope. If not explicitly specified in the
   application mapping, TLS is assumed to be the default crypto
   handshake protocol, as described in [QUIC-TLS]. An application that
   maps to QUIC MAY, however, specify an alternative crypto handshake
   protocol to be used.

   The following list of requirements and recommendations documents
   properties of the current prototype handshake which should be
   provided by any handshake protocol.

   o  The crypto handshake MUST ensure that the final negotiated key is
      distinct for every connection between two endpoints.

   o  Transport Negotiation: The crypto handshake MUST provide a
      mechanism for the transport component to exchange transport
      parameters and Source Address Tokens. To avoid downgrade attacks,
      the transport parameters sent and received MUST be verified before
      the handshake completes successfully.

   o  Connection Establishment in 0-RTT: Since low-latency connection
      establishment is a critical feature of QUIC, the QUIC handshake
      protocol SHOULD attempt to achieve 0-RTT connection establishment
      latency for repeated connections between the same endpoints.

   o  Source Address Spoofing Defense: Since QUIC handles source address
      verification, the crypto protocol SHOULD NOT impose a separate
      source address verification mechanism.

   o  Server Config Update: A QUIC server may refresh the source-address
      token (STK) mid-connection, to update the information stored in
      the STK at the client and to extend the period over which 0-RTT
      connections can be established by the client.

   o  Certificate Compression: Early QUIC experience demonstrated that
      compressing certificates exchanged during a handshake is valuable
      in reducing latency. This additionally helps to reduce the
      amplification attack footprint when a server sends a large set of
      certificates, which is not uncommon with TLS. The crypto protocol
      SHOULD compress certificates and any other information to minimize
      the number of packets sent during a handshake.

   The following information used during the QUIC handshake MUST be
   cryptographically verified by the crypto handshake protocol:

   o  Client's originally proposed version in its first packet.

   o  Server's version list in its Version Negotiation packet, if one
      was sent.

5.3. Connection Migration

   QUIC connections are identified by their 64-bit Connection ID.
   QUIC's consistent connection ID allows connections to survive changes
   to the client's IP and/or port, such as those caused by a client or
   server migrating to a new network. QUIC also provides automatic
   cryptographic verification of a rebound client, since the client
   continues to use the same session key for encrypting and decrypting
   packets.

   DISCUSS: Simultaneous migration. Is this reasonable?

   TODO: Perhaps move mitigation techniques from Security Considerations
   here.

5.4. Connection Termination

   Connections should remain open until they become idle for a pre-
   negotiated period of time. A QUIC connection, once established, can
   be terminated in one of three ways:

   1.  Explicit Shutdown: An endpoint sends a CONNECTION_CLOSE frame to
       the peer initiating a connection termination. An endpoint may
       send a GOAWAY frame to the peer prior to a CONNECTION_CLOSE to
       indicate that the connection will soon be terminated. A GOAWAY
       frame signals to the peer that any active streams will continue
       to be processed, but the sender of the GOAWAY will not initiate
       any additional streams and will not accept any new incoming
       streams. On termination of the active streams, a
       CONNECTION_CLOSE may be sent. If an endpoint sends a
       CONNECTION_CLOSE frame while unterminated streams are active (no
       FIN bit or RST_STREAM frames have been sent or received for one
       or more streams), then the peer must assume that the streams were
       incomplete and were abnormally terminated.

   2.  Implicit Shutdown: The default idle timeout for a QUIC connection
       is 30 seconds, and is a required parameter (ICSL) in connection
       negotiation. The maximum is 10 minutes. If there is no network
       activity for the duration of the idle timeout, the connection is
       closed. By default a CONNECTION_CLOSE frame will be sent.
       A silent close option can be enabled when it is expensive to send
       an explicit close, such as on mobile networks that must wake up
       the radio.

   3.  Abrupt Shutdown: An endpoint may send a Public Reset packet at
       any time during the connection to abruptly terminate an active
       connection. A Public Reset packet SHOULD only be used as a final
       recourse. Commonly, a public reset is expected to be sent when a
       packet on an established connection is received by an endpoint
       that is unable to decrypt the packet. For instance, if a server
       reboots mid-connection and loses any cryptographic state
       associated with open connections, and then receives a packet on
       an open connection, it should send a Public Reset packet in
       return. (TODO: articulate rules around when a public reset
       should be sent.)

   TODO: Connections that are terminated are added to a TIME_WAIT list
   at the server, so as to absorb any straggler packets in the network.
   Discuss TIME_WAIT list.

6. Frame Types and Formats

   As described in Section XXX, Regular packets contain one or more
   frames. We now describe the various QUIC frame types that can be
   present in a Regular packet. The use of these frames and various
   frame header bits are described in subsequent sections.

6.1. STREAM Frame

   STREAM frames implicitly create a stream and carry stream data. A
   STREAM frame is shown below.

   +------------+--------------------------------+
   |  Type (8)  |  Stream ID (8, 16, 24, or 32)  |
   +------------+--------------------------------+
   +---------------------------------------------+
   |  Offset (0, 16, 24, 32, 40, 48, 56, or 64)  |
   +---------------------------------------------+
   +-------------------------+---------------------------------+
   |  Data length (0 or 16)  |  Stream Data (per data length)  |
   +-------------------------+---------------------------------+

   The STREAM frame header fields are as follows:

   o  Frame Type: The Frame Type byte is an 8-bit value containing
      various flags, and is formatted as the following 8 bits: 1FDOOOSS.

      *  The leftmost bit must be set to 1 indicating that this is a
         STREAM frame.

      *  'F' is the FIN bit, which is used for stream termination.

      *  The 'D' bit indicates whether a Data Length field is present in
         the STREAM header. When set to 0, this field indicates that
         the Stream Data field extends to the end of the packet. When
         set to 1, this field indicates that the Data Length field
         contains the length (in bytes) of the Stream Data field. The
         option to omit the length should only be used when the packet
         is a "full-sized" packet, to avoid the risk of corruption via
         padding.

      *  The 'OOO' bits encode the length of the Offset header field as
         0, 16, 24, 32, 40, 48, 56, or 64 bits long.

      *  The 'SS' bits encode the length of the Stream ID header field
         as 8, 16, 24, or 32 bits. (DISCUSS: Consider making this 8,
         16, 32, 64.)

   o  Stream ID: A variable-sized unsigned ID unique to this stream.

   o  Offset: A variable-sized unsigned number specifying the byte
      offset in the stream for the data in this STREAM frame. The first
      byte in the stream has an offset of 0.

   o  Data Length: An optional 16-bit unsigned number specifying the
      length of the Stream Data field in this STREAM frame.
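
   As an illustration of the Frame Type layout above, the following
   Python sketch splits a STREAM type byte into its flags and field
   lengths. The function and tuple names are hypothetical, and the
   sketch assumes that the 'OOO' and 'SS' values select the listed
   lengths in ascending order, which this section does not state
   explicitly.

```python
from typing import NamedTuple

# Assumed mapping: encoded value i selects the i-th listed length (in bits).
OFFSET_BITS = (0, 16, 24, 32, 40, 48, 56, 64)
STREAM_ID_BITS = (8, 16, 24, 32)


class StreamFrameType(NamedTuple):
    fin: bool
    has_data_length: bool
    offset_bits: int
    stream_id_bits: int


def parse_stream_frame_type(type_byte: int) -> StreamFrameType:
    """Decode a STREAM Frame Type byte laid out as 1FDOOOSS."""
    if not 0 <= type_byte <= 0xFF:
        raise ValueError("frame type must fit in one byte")
    if not type_byte & 0x80:
        raise ValueError("leftmost bit is 0: not a STREAM frame")
    return StreamFrameType(
        fin=bool(type_byte & 0x40),               # 'F' bit
        has_data_length=bool(type_byte & 0x20),   # 'D' bit
        offset_bits=OFFSET_BITS[(type_byte >> 2) & 0x07],  # 'OOO' bits
        stream_id_bits=STREAM_ID_BITS[type_byte & 0x03],   # 'SS' bits
    )
```

   Under these assumptions, the byte 0xE5 (binary 11100101) decodes to
   FIN set, Data Length present, a 16-bit Offset, and a 16-bit Stream
   ID.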

   A STREAM frame MUST have either non-zero data length or the FIN bit
   set.

   Stream multiplexing is achieved by interleaving STREAM frames from
   multiple streams into one or more QUIC packets. A single QUIC packet
   MAY bundle STREAM frames from multiple streams.

   Implementation note: One of the benefits of QUIC is avoidance of
   head-of-line blocking across multiple streams. When a packet loss
   occurs, only streams with data in that packet are blocked waiting for
   a retransmission to be received, while other streams can continue
   making progress. Note that when data from multiple streams is
   bundled into a single QUIC packet, loss of that packet blocks all
   those streams from making progress. An implementation is therefore
   advised to bundle as few streams as necessary in outgoing packets
   without losing transmission efficiency to underfilled packets.

6.2. ACK Frame

   Receivers send ACK frames to inform senders which packets they have
   received, as well as which packets are considered missing. The ACK
   frame contains between 1 and 256 ack blocks. Ack blocks are ranges
   of acknowledged packets.

   To limit the ACK blocks to the ones that haven't yet been received by
   the sender, the sender periodically sends STOP_WAITING frames that
   signal the receiver to stop acking packets below a specified sequence
   number, raising the "least unacked" packet number at the receiver. A
   sender of an ACK frame thus reports only those ACK blocks between the
   received least unacked and the reported largest observed packet
   numbers. It is recommended for the sender to send the most recent
   largest acked packet it has received in an ack as the STOP_WAITING
   frame's least unacked value.

   Unlike TCP SACKs, QUIC ACK blocks are irrevocable. Once a packet is
   acked, even if it does not appear in a future ack frame, it is
   assumed to be acked.
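
   The reporting rule above can be sketched in Python as follows. This
   is an illustration, not the wire encoding: the function name, the
   (smallest, largest) tuple representation, and the choice to drop the
   lowest ranges when more than 256 blocks exist are assumptions of the
   sketch.

```python
from typing import Iterable, List, Tuple

MAX_ACK_BLOCKS = 256  # an ACK frame carries between 1 and 256 ack blocks


def ack_blocks(received: Iterable[int], least_unacked: int) -> List[Tuple[int, int]]:
    """Return inclusive (smallest, largest) ranges of received packet
    numbers at or above least_unacked, highest range first."""
    candidates = sorted({n for n in received if n >= least_unacked}, reverse=True)
    blocks: List[Tuple[int, int]] = []
    for n in candidates:
        if blocks and blocks[-1][0] == n + 1:
            # n extends the current block downwards by one packet.
            blocks[-1] = (n, blocks[-1][1])
        else:
            # A gap was found, so start a new block.
            blocks.append((n, n))
    # Assumption: when over the limit, keep the ranges nearest the largest packet.
    return blocks[:MAX_ACK_BLOCKS]
```

   For example, with packets 1, 2, 3, 5, 6, and 9 received and a least
   unacked value of 2, the sketch yields the ranges 9, 5-6, and 2-3;
   packet 1 is no longer reported.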

   A sender MAY intentionally skip packet numbers to introduce entropy
   into the connection, to avoid opportunistic ack attacks. The sender
   MUST close the connection if an unsent packet number is acked. The
   format of the ACK frame is efficient at expressing blocks of missing
   packets; skipping packet numbers between 1 and 255 effectively
   provides up to 8 bits of efficient entropy on demand, which should be
   adequate protection against most opportunistic ack attacks.

   +--------------------------------------------------------------+
   | Type (8) | Largest Acked (8, 16, 32, or 48) | Ack Delay (16) |
   +--------------------------------------------------------------+

   Ack Block Section:
   +-------------------------------------------------------------------------+
   | Number Blocks (8) (opt) | First Ack Block Length (8, 16, 32 or 48 bits) |
   +-------------------------------------------------------------------------+
   +------------------------------------------------------------------+
   | Gap To Next Block (8) | Ack Block Length (8, 16, 32, or 48 bits) | <-- optional,
   +------------------------------------------------------------------+     repeats

   Timestamp Section:
   +--------------------+
   | Num Timestamps (8) |
   +--------------------+
   +---------------------------------------------------------+
   | Delta Largest Acked (8) | Time Since Largest Acked (32) | <-- optional
   +---------------------------------------------------------+
   +---------------------------------------------------------------+
   | Delta Largest Acked (8) | Time Since Previous Timestamp (16)  | <-- optional,
   +---------------------------------------------------------------+     repeats

   The fields in the ACK frame are as follows:

   o  Frame Type: The Frame Type byte is an 8-bit value containing
      various flags. This byte is formatted as the following 8 bits:
      01NULLMM.

      *  The first two bits must be set to 01 indicating that this is an
         ACK frame.

      *  The 'N' bit indicates whether the frame has more than 1 ack
         range.

      *  The 'U' bit is unused.

      *  The two 'LL' bits encode the length of the Largest Acked field
         as 1, 2, 4, or 6 bytes long.

      *  The two 'MM' bits encode the length of the Ack Block Length
         fields as 1, 2, 4, or 6 bytes long.

   o  Largest Acked: A variable-sized unsigned value representing the
      largest packet number the peer is acking in this packet (typically
      the largest that the peer has seen thus far.)

   o  Ack Delay: Time from when the largest acked, as indicated in the
      Largest Acked field, was received by this peer to when this ack
      was sent.

   o  Ack Block Section:

      *  Num Blocks (opt): An optional 8-bit unsigned value specifying
         the number of additional ack blocks (besides the required First
         Ack Block) in this ACK frame. Only present if the 'N' flag bit
         is 1.

      *  First Ack Block Length: An unsigned packet number delta that
         indicates the number of contiguous additional packets being
         acked starting at the Largest Acked.

      *  Gap To Next Block (opt, repeated): An unsigned number
         specifying the number of contiguous missing packets from the
         end of the previous ack block to the start of the next.

      *  Ack Block Length (opt, repeated): An unsigned packet number
         delta that indicates the number of contiguous packets being
         acked starting after the end of the previous gap. Along with
         the previous field, this field is repeated "Num Blocks" times.

   o  Timestamp Section:

      *  Num Timestamps: An unsigned 8-bit number specifying the total
         number of (packet number delta, timestamp) pairs following,
         including the First Timestamp.

      *  Delta Largest Acked (opt): An optional 8-bit unsigned packet
         number delta specifying the delta between the largest acked and
         the first packet whose timestamp is being reported.
         In other words, this first packet number may be computed as
         (Largest Acked - Delta Largest Acked.)

      *  First Timestamp (opt): An optional 32-bit unsigned value
         specifying the time delta in microseconds, from the beginning
         of the connection to the arrival of this packet.

      *  Delta Largest Acked (opt, repeated): (Same as above.)

      *  Time Since Previous Timestamp (opt, repeated): An optional
         16-bit unsigned value specifying time delta from the previous
         reported timestamp. It is encoded in the same format as the
         Ack Delay. Along with the previous field, this field is
         repeated "Num Timestamps" times.

6.2.1. Time Format

   DISCUSS_AND_REPLACE: Perhaps make this format simpler.

   The time format used in the ACK frame above is a 16-bit unsigned
   float with 11 explicit bits of mantissa and 5 bits of explicit
   exponent, specifying time in microseconds. The bit format is loosely
   modeled after IEEE 754. For example, 1 microsecond is represented as
   0x1, which has an exponent of zero, presented in the 5 high order
   bits, and mantissa of 1, presented in the 11 low order bits. When
   the explicit exponent is greater than zero, an implicit high-order
   12th bit of 1 is assumed in the mantissa. For example, a floating
   value of 0x800 has an explicit exponent of 1, as well as an explicit
   mantissa of 0, but then has an effective mantissa of 2048 (12th bit
   is assumed to be 1). Additionally, the actual exponent is one less
   than the explicit exponent, and the value represents 2048
   microseconds. Any values larger than the representable range are
   clamped to 0xFFFF.

6.3. STOP_WAITING Frame

   The STOP_WAITING frame is sent to inform the peer that it should not
   continue to wait for packets with packet numbers lower than a
   specified value.
   The packet number is encoded in 1, 2, 4 or 6 bytes, using the same
   coding length as is specified for the packet number for the enclosing
   packet's header (specified in the QUIC Frame packet's Flags field.)
   The frame is as follows:

   +---------------------------------------------------+
   | Type (8) | Least unacked delta (8, 16, 32, or 48) |
   +---------------------------------------------------+

   The fields in the STOP_WAITING frame are as follows:

   o  Frame Type: The Frame Type byte is an 8-bit value that must be set
      to 0x06 indicating that this is a STOP_WAITING frame.

   o  Least Unacked Delta: A variable-length packet number delta with
      the same length as the packet header's packet number. Subtract it
      from the complete packet number of the enclosing packet to
      determine the least unacked packet number. The resulting least
      unacked packet number is the earliest packet for which the sender
      is still awaiting an ack. If the receiver is missing any packets
      earlier than this packet, the receiver SHOULD consider those
      packets to be irrecoverably lost and MUST NOT report those packets
      as missing in subsequent acks.

6.4. WINDOW_UPDATE Frame

   The WINDOW_UPDATE frame informs the peer of an increase in an
   endpoint's flow control receive window. The StreamID can be zero,
   indicating this WINDOW_UPDATE applies to the connection level flow
   control window, or non-zero, indicating that the specified stream
   should increase its flow control window. The frame is as follows:

   +---------------------------------------------------+
   |  Type(8)  |  Stream ID (32)  |  Byte offset (64)  |
   +---------------------------------------------------+

   The fields in the WINDOW_UPDATE frame are as follows:

   o  Frame Type: The Frame Type byte is an 8-bit value that must be set
      to 0x04 indicating that this is a WINDOW_UPDATE frame.

   o  Stream ID: ID of the stream whose flow control window is being
      updated, or 0 to specify the connection-level flow control window.

   o  Byte offset: A 64-bit unsigned integer indicating the absolute
      byte offset of data which can be sent on the given stream. In the
      case of connection level flow control, the cumulative number of
      bytes which can be sent on all currently open streams.

6.5. BLOCKED Frame

   A sender sends a BLOCKED frame when it is ready to send data (and has
   data to send), but is currently flow control blocked. BLOCKED frames
   are purely informational frames, but extremely useful for debugging
   purposes. A receiver of a BLOCKED frame should simply discard it
   (after possibly printing a helpful log message). The frame is as
   follows:

   +------------------------------+
   |  Type(8)  |  Stream ID (32)  |
   +------------------------------+

   The fields in the BLOCKED frame are as follows:

   o  Frame Type: The Frame Type byte is an 8-bit value that must be set
      to 0x05 indicating that this is a BLOCKED frame.

   o  Stream ID: A 32-bit unsigned number indicating the stream which is
      flow control blocked. A non-zero Stream ID field specifies the
      stream that is flow control blocked. When zero, the Stream ID
      field indicates that the connection is flow control blocked.

6.6. RST_STREAM Frame

   An endpoint may use a RST_STREAM frame to abruptly terminate a
   stream. The frame is as follows:

   +----------------------------------------------------------------------+
   |  Type(8)  |  StreamID (32)  |  Byte offset (64)  |  Error code (32)  |
   +----------------------------------------------------------------------+

   The fields are:

   o  Frame type: The Frame Type is an 8-bit value that must be set to
      0x01 specifying that this is a RST_STREAM frame.

   o  Stream ID: The 32-bit Stream ID of the stream being terminated.

   o  Byte offset: A 64-bit unsigned integer indicating the absolute
      byte offset of the end of data written on this stream by the
      RST_STREAM sender.

   o  Error code: A 32-bit error code which indicates why the stream is
      being closed.

6.7. PADDING Frame

   The PADDING frame pads a packet with 0x00 bytes. When this frame is
   encountered, the rest of the packet is expected to be padding bytes.
   The frame contains 0x00 bytes and extends to the end of the QUIC
   packet. A PADDING frame only has a Frame Type field, and must have
   the 8-bit Frame Type field set to 0x00. The PADDING frame is as
   follows:

   +--------+
   |  0x00  |
   +--------+

6.8. PING Frame

   Endpoints can use PING frames to verify that their peers are still
   alive or to check reachability to the peer. The PING frame contains
   no payload. The receiver of a PING frame simply needs to ACK the
   packet containing this frame. The PING frame SHOULD be used to keep
   a connection alive when a stream is open. The default is to send a
   PING frame after 15 seconds of quiescence. A PING frame only has a
   Frame Type field, and must have the 8-bit Frame Type field set to
   0x07. The PING frame is as follows:

   +--------+
   |  0x07  |
   +--------+

6.9. CONNECTION_CLOSE Frame

   An endpoint sends a CONNECTION_CLOSE frame to notify its peer that
   the connection is being closed. If there are open streams that
   haven't been explicitly closed, they are implicitly closed when the
   connection is closed. (Ideally, a GOAWAY frame would be sent with
   enough time that all streams are torn down.)
   The frame is as follows:

   +-----------------------------------------------------------------------+
   | Type(8) | Error code (32) | Reason phrase length (16) | Reason phrase |
   +-----------------------------------------------------------------------+

   The fields of a CONNECTION_CLOSE frame are as follows:

   o  Frame Type: An 8-bit value that must be set to 0x02 specifying
      that this is a CONNECTION_CLOSE frame.

   o  Error Code: A 32-bit error code which indicates the reason for
      closing this connection.

   o  Reason Phrase Length: A 16-bit unsigned number specifying the
      length of the reason phrase. This may be zero if the sender
      chooses to not give details beyond the QuicErrorCode.

   o  Reason Phrase: An optional human-readable explanation for why the
      connection was closed.

6.10. GOAWAY Frame

   An endpoint may use a GOAWAY frame to notify its peer that the
   connection should stop being used, and will likely be aborted in the
   future. The endpoints will continue using any active streams, but
   the sender of the GOAWAY will not initiate any additional streams,
   and will not accept any new streams. The frame is as follows:

   +-----------------------------------------------------------+
   |  Type (8)  |  Error code (32)  | Last Good Stream ID (32) |
   +-----------------------------------------------------------+
   +----------------------------------------------+
   |  Reason phrase length (16)  | Reason phrase  |
   +----------------------------------------------+

   The fields of a GOAWAY frame are as follows:

   o  Frame type: An 8-bit value that must be set to 0x03 specifying
      that this is a GOAWAY frame.

   o  Error Code: A 32-bit error code which indicates the reason for
      closing this connection.

   o  Last Good Stream ID: The last Stream ID which was accepted by the
      sender of the GOAWAY message. If no streams were replied to, this
      value must be set to 0.

   o  Reason Phrase Length: A 16-bit unsigned number specifying the
      length of the reason phrase. This may be zero if the sender
      chooses to not give details beyond the error code.

   o  Reason Phrase: An optional human-readable explanation for why the
      connection was closed.

7. Packetization and Reliability

   The maximum packet size for QUIC is the maximum size of the encrypted
   payload of the resulting UDP datagram. All QUIC packets SHOULD be
   sized to fit within the path's MTU to avoid IP fragmentation. The
   recommended default maximum packet size is 1350 bytes for IPv6 and
   1370 bytes for IPv4. To optimize better, endpoints MAY use PLPMTUD
   [RFC4821] for detecting the path's MTU and setting the maximum packet
   size appropriately.

   A sender bundles one or more frames in a Regular QUIC packet. A
   sender MAY bundle any set of frames in a packet. All QUIC packets
   MUST contain a packet number and MAY contain one or more frames
   (Section XX). Packet numbers MUST be unique within a connection and
   MUST NOT be reused within the same connection. Packet numbers MUST
   be assigned to packets in a strictly monotonically increasing order.
   The initial packet number used, at both the client and the server,
   MUST be 0. That is, the first packet in both directions of the
   connection MUST have a packet number of 0.

   A sender SHOULD minimize per-packet bandwidth and computational costs
   by bundling as many frames as possible within a QUIC packet. A
   sender MAY wait for a short period of time to bundle multiple frames
   before sending a packet that is not maximally packed, to avoid
   sending out large numbers of small packets. An implementation may
   use heuristics about expected application sending behavior to
   determine whether and for how long to wait.
   This waiting period is an implementation decision, and an
   implementation should be careful to delay conservatively, since any
   delay is likely to increase application-visible latency.

   Regular QUIC packets are "containers" of frames; a packet is never
   retransmitted whole, but frames in a lost packet may be rebundled and
   transmitted in a subsequent packet as necessary.

   A packet may contain frames and/or application data, only some of
   which may require reliability. When a packet is detected as lost,
   the sender SHOULD only resend frames that require retransmission.

   o  All application data sent in STREAM frames MUST be retransmitted,
      with one exception. When an endpoint sends a RST_STREAM frame,
      data outstanding on that stream SHOULD NOT be retransmitted, since
      subsequent data on this stream is expected to not be delivered by
      the receiver.

   o  ACK, STOP_WAITING, and PADDING frames MUST NOT be retransmitted.
      New frames of these types may however be bundled with any outgoing
      packet.

   o  All other frames MUST be retransmitted.

   Upon detecting losses, a sender MUST take appropriate congestion
   control action. The details of loss detection and congestion control
   are described in [QUIC-RECOVERY].

   A receiver acknowledges receipt of a received packet by sending one
   or more ACK frames containing the packet number of the received
   packet. To avoid perpetual acking between endpoints, a receiver MUST
   NOT generate an ack in response to every packet containing only ACK
   frames. However, since it is possible that an endpoint sends only
   packets containing ACK frames (or other non-retransmittable frames),
   the receiving peer MAY send an ACK frame after a reasonable number
   (currently 20) of such packets have been received.

   Strategies and implications of the frequency of generating
   acknowledgments are discussed in more detail in [QUIC-RECOVERY].
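
   The retransmission rules above can be summarized in a short Python
   sketch. The Frame record, the string frame names, and the
   reset_streams argument (the IDs of streams on which this endpoint has
   sent RST_STREAM) are hypothetical conveniences, not structures
   defined by this document.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set

# Frame types that are never retransmitted; fresh ones may be bundled instead.
NON_RETRANSMITTABLE = frozenset({"ACK", "STOP_WAITING", "PADDING"})


@dataclass(frozen=True)
class Frame:
    kind: str                        # e.g. "STREAM", "ACK", "WINDOW_UPDATE"
    stream_id: Optional[int] = None  # only meaningful for stream-scoped frames


def frames_to_retransmit(lost_frames: Iterable[Frame],
                         reset_streams: Set[int]) -> List[Frame]:
    """Select the frames of a lost packet that should be sent again."""
    selected = []
    for frame in lost_frames:
        if frame.kind in NON_RETRANSMITTABLE:
            continue  # ACK, STOP_WAITING, and PADDING are not resent
        if frame.kind == "STREAM" and frame.stream_id in reset_streams:
            continue  # data on a stream this endpoint has reset is not resent
        selected.append(frame)  # everything else is retransmitted
    return selected
```

   The selected frames would then be rebundled into new packets with new
   packet numbers; the lost packet itself is never resent whole.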
1322 8. Streams: QUIC's Data Structuring Abstraction 1324 Streams in QUIC provide a lightweight, ordered, and bidirectional 1325 byte-stream abstraction. Streams can be created either by the client 1326 or the server, can concurrently send data interleaved with other 1327 streams, and can be cancelled. QUIC's stream lifetime is modeled 1328 closely after HTTP/2's [RFC7540]. Streams are independent of each 1329 other in delivery order. That is, data that is received on a stream 1330 is delivered in order within that stream, but there is no particular 1331 delivery order across streams. Transmit ordering among streams is 1332 left to the implementation. QUIC streams are considered lightweight 1333 in that the creation and destruction of streams are expected to have 1334 minimal bandwidth and computational cost. A single STREAM frame may 1335 create, carry data for, and terminate a stream, or a stream may last 1336 the entire duration of a connection. Implementations are therefore 1337 advised to keep these extremes in mind and to implement stream 1338 creation and destruction to be as lightweight as possible. 1340 An alternative view of QUIC streams is as an elastic "message" 1341 abstraction, similar to the way ephemeral streams are used in SST 1342 [SST], which may be a more appealing description for some 1343 applications. 1345 8.1. Life of a Stream 1347 The semantics of QUIC streams is based on HTTP/2 streams, and the 1348 lifecycle of a QUIC stream therefore closely follows that of an 1349 HTTP/2 stream [RFC7540], with some differences to accommodate the 1350 possibility of out-of-order delivery due to the use of multiple 1351 streams in QUIC. The lifecycle of a QUIC stream is shown in the 1352 following figure and described below.
1354                         app      +--------+
1355                 reserve_stream   |        |
1356                   ,--------------|  idle  |
1357                  /               |        |
1358                 /                +--------+
1359                V                      |
1360          +----------+ send data/      |
1361          |          | recv data       | send data/
1362      ,---| reserved |------------.    | recv data
1363      |   |          |             \   |
1364      |   +----------+              v  v
1365      |               recv FIN/   +--------+ send FIN/
1366      |        app read_close     |        | app write_close
1367      |                 ,---------|  open  |-----------.
1368      |                /          |        |            \
1369      |               v           +--------+             v
1370      |        +----------+            |             +----------+
1371      |        |   half   |            |             |   half   |
1372      |        |  closed  |            | send RST/   |  closed  |
1373      |        | (remote) |            | recv RST    | (local)  |
1374      |        +----------+            |             +----------+
1375      |            |                   |                    |
1376      |            | recv FIN/         |  send FIN/         |
1377      |            | app write_close/  |  app read_close/   |
1378      |            | send RST/         v  send RST/         |
1379      |            | recv RST     +--------+  recv RST      |
1380      | send RST/  `------------->|        |<---------------'
1381      | recv RST                  | closed |
1382      `-------------------------->|        |
1383                                  +--------+
1385   send: endpoint sends this frame
1386   recv: endpoint receives this frame
1388   data: application data in a STREAM frame
1389   FIN: FIN flag in a STREAM frame
1390   RST: RST_STREAM frame
1392   app: application API signals to QUIC
1393   reserve_stream: causes a StreamID to be reserved for later use
1394   read_close: causes stream to be half-closed without receiving a FIN
1395   write_close: causes stream to be half-closed without sending a FIN
1397   Figure 6: Lifecycle of a stream
1399 Note that this diagram shows stream state transitions and the frames 1400 and flags that affect those transitions only. For the purpose of 1401 state transitions, the FIN flag is processed as a separate event to 1402 the frame that bears it; a STREAM frame with the FIN flag set can 1403 cause two state transitions. When the FIN bit is sent on an empty 1404 STREAM frame, the offset in the STREAM frame MUST be one greater than 1405 the last data byte sent on this stream. 1407 Both endpoints have a subjective view of the state of a stream that 1408 could be different when frames are in transit. Endpoints do not 1409 coordinate the creation of streams; they are created unilaterally by 1410 either endpoint.
The negative consequences of a mismatch in states 1411 are limited to the "closed" state after sending RST_STREAM, where 1412 frames might be received for some time after closing. 1414 Streams have the following states: 1416 8.1.1. idle 1418 All streams start in the "idle" state. 1420 The following transitions are valid from this state: 1422 Sending or receiving a STREAM frame causes the stream to become 1423 "open". The stream identifier is selected as described in 1424 Section XX. The same STREAM frame can also cause a stream to 1425 immediately become "half-closed". 1427 An application can reserve an idle stream for later use. The stream 1428 state for the reserved stream transitions to "reserved". 1430 Receiving any frame other than STREAM or RST_STREAM on a stream in 1431 this state MUST be treated as a connection error (Section XX) of type 1432 YYYY. 1434 8.1.2. reserved 1436 A stream in this state has been reserved for later use by the 1437 application. In this state only the following transitions are 1438 possible: 1440 o Sending or receiving a STREAM frame causes the stream to become 1441 "open". 1443 o Sending or receiving a RST_STREAM frame causes the stream to 1444 become "closed". 1446 8.1.3. open 1448 A stream in the "open" state may be used by both peers to send frames 1449 of any type. In this state, a sending peer must observe the flow- 1450 control limit advertised by its receiving peer (Section XX). 1452 From this state, either endpoint can send a frame with the FIN flag 1453 set, which causes the stream to transition into one of the "half- 1454 closed" states. An endpoint sending an FIN flag causes the stream 1455 state to become "half-closed (local)". An endpoint receiving a FIN 1456 flag causes the stream state to become "half-closed (remote)"; the 1457 receiving endpoint MUST NOT process the FIN flag until all preceding 1458 data on the stream has been received. 
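As an illustration of the rule that a FIN flag is not processed until all preceding stream data has been received, the following sketch buffers out-of-order data and reports the FIN only once the stream is contiguous up to its final offset. The StreamReceiveBuffer class and its method names are hypothetical, the byte-at-a-time bookkeeping favors clarity over efficiency, and flow-control checks are omitted.

```python
class StreamReceiveBuffer:
    """Hypothetical per-stream receive buffer (byte-granular, unoptimized)."""

    def __init__(self):
        self.pending = {}         # stream offset -> byte value not yet delivered
        self.delivered = 0        # next offset to hand to the application
        self.final_offset = None  # known once a frame bearing FIN is seen

    def on_stream_frame(self, offset, data, fin=False):
        """Record one STREAM frame; return (in-order bytes, fin_ready)."""
        for i, byte in enumerate(data):
            if offset + i >= self.delivered:   # skip already delivered bytes
                self.pending[offset + i] = byte
        if fin:
            # For an empty frame this equals the offset carried in the frame.
            self.final_offset = offset + len(data)
        out = bytearray()
        while self.delivered in self.pending:
            out.append(self.pending.pop(self.delivered))
            self.delivered += 1
        fin_ready = (self.final_offset is not None
                     and self.delivered == self.final_offset)
        return bytes(out), fin_ready
```

An endpoint using such a buffer would move the stream to "half-closed (remote)" only when fin_ready becomes true, not when the frame bearing the FIN flag first arrives.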
1460 Either endpoint can send a RST_STREAM frame from this state, causing 1461 it to transition immediately to "closed". 1463 8.1.4. half-closed (local) 1465 A stream that is in the "half-closed (local)" state MUST NOT be used 1466 for sending STREAM frames; WINDOW_UPDATE and RST_STREAM MAY be sent 1467 in this state. 1469 A stream transitions from this state to "closed" when a frame that 1470 contains an FIN flag is received or when either peer sends a 1471 RST_STREAM frame. 1473 An endpoint can receive any type of frame in this state. Providing 1474 flow-control credit using WINDOW_UPDATE frames is necessary to 1475 continue receiving flow-controlled frames. In this state, a receiver 1476 MAY ignore WINDOW_UPDATE frames for this stream, which might arrive 1477 for a short period after a frame bearing the FIN flag is sent. 1479 8.1.5. half-closed (remote) 1481 A stream that is "half-closed (remote)" is no longer being used by 1482 the peer to send any data. In this state, a sender is no longer 1483 obligated to maintain a receiver stream-level flow-control window. 1485 If an endpoint receives any STREAM frames for a stream that is in 1486 this state, it MUST close the connection with a 1487 QUIC_STREAM_DATA_AFTER_TERMINATION error (Section XX). 1489 A stream in this state can be used by the endpoint to send frames of 1490 any type. In this state, the endpoint continues to observe 1491 advertised stream-level and connection-level flow-control limits 1492 (Section XX). 1494 A stream can transition from this state to "closed" by sending a 1495 frame that contains a FIN flag or when either peer sends a RST_STREAM 1496 frame. 1498 8.1.6. closed 1500 The "closed" state is the terminal state. 1502 A final offset is present in both a frame bearing a FIN flag and in a 1503 RST_STREAM frame. Upon sending either of these frames for a stream, 1504 the endpoint MUST NOT send a STREAM frame carrying data beyond the 1505 final offset. 
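The transitions described for the states above can be sketched as a lookup table keyed by state and event. This is a minimal sketch under stated assumptions: the event names ("send_data", "recv_fin", and so on) and the helper functions are hypothetical, the application read_close and write_close signals are left out, and the handling of frames that arrive on a "closed" stream is not modeled. Any pair missing from the table simply raises an error here, whereas the actual response depends on the rules for each state.

```python
from enum import Enum


class State(Enum):
    IDLE = "idle"
    RESERVED = "reserved"
    OPEN = "open"
    HALF_CLOSED_LOCAL = "half-closed (local)"
    HALF_CLOSED_REMOTE = "half-closed (remote)"
    CLOSED = "closed"


# (current state, event) -> next state. Event names are illustrative.
TRANSITIONS = {
    (State.IDLE, "app_reserve"): State.RESERVED,
    (State.OPEN, "send_fin"): State.HALF_CLOSED_LOCAL,
    (State.OPEN, "recv_fin"): State.HALF_CLOSED_REMOTE,
    (State.HALF_CLOSED_LOCAL, "recv_fin"): State.CLOSED,
    (State.HALF_CLOSED_REMOTE, "send_fin"): State.CLOSED,
    # Data in a direction that is still usable leaves the state unchanged.
    (State.OPEN, "send_data"): State.OPEN,
    (State.OPEN, "recv_data"): State.OPEN,
    (State.HALF_CLOSED_LOCAL, "recv_data"): State.HALF_CLOSED_LOCAL,
    (State.HALF_CLOSED_REMOTE, "send_data"): State.HALF_CLOSED_REMOTE,
}
for _state in (State.IDLE, State.RESERVED):
    for _event in ("send_data", "recv_data"):
        TRANSITIONS[(_state, _event)] = State.OPEN
for _state in (State.RESERVED, State.OPEN,
               State.HALF_CLOSED_LOCAL, State.HALF_CLOSED_REMOTE):
    for _event in ("send_rst", "recv_rst"):
        TRANSITIONS[(_state, _event)] = State.CLOSED


def next_state(state, event):
    """Look up a transition; unlisted pairs are rejected in this sketch."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"{event} not permitted in state {state.value}")


def on_stream_frame(state, direction, fin=False):
    """Apply a STREAM frame; the FIN flag is a separate, second event."""
    state = next_state(state, direction + "_data")
    if fin:
        state = next_state(state, direction + "_fin")
    return state
```

For example, on_stream_frame(State.IDLE, "send", fin=True) passes through "open" and ends in "half-closed (local)", matching the note that one STREAM frame can cause two transitions. A receiver would raise the "recv_fin" event only after all preceding data on the stream has arrived.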
1507 An endpoint that receives any frame for this stream after receiving 1508 either a FIN flag and all stream data preceding it, or a RST_STREAM 1509 frame, MUST quietly discard the frame, with one exception. If a 1510 STREAM frame carrying data beyond the received final offset is 1511 received, the endpoint MUST close the connection with a 1512 QUIC_STREAM_DATA_AFTER_TERMINATION error (Section XX). 1514 An endpoint that receives a RST_STREAM frame (and which has not sent 1515 a FIN or a RST_STREAM) MUST immediately respond with a RST_STREAM 1516 frame, and MUST NOT send any more data on the stream. This endpoint 1517 may continue receiving frames for the stream on which a RST_STREAM is 1518 received. 1520 If this state is reached as a result of sending a RST_STREAM frame, 1521 the peer that receives the RST_STREAM might have already sent - or 1522 enqueued for sending - frames on the stream that cannot be withdrawn. 1523 An endpoint MUST ignore frames that it receives on closed streams 1524 after it has sent a RST_STREAM frame. An endpoint MAY choose to 1525 limit the period over which it ignores frames and treat frames that 1526 arrive after this time as being in error. 1528 STREAM frames received after sending RST_STREAM are counted toward 1529 the connection and stream flow-control windows. Even though these 1530 frames might be ignored, because they are sent before their sender 1531 receives the RST_STREAM, the sender will consider the frames to count 1532 against its flow-control windows. 1534 In the absence of more specific guidance elsewhere in this document, 1535 implementations SHOULD treat the receipt of a frame that is not 1536 expressly permitted in the description of a state as a connection 1537 error (Section XX). Frames of unknown types are ignored. 1539 (TODO: QUIC_STREAM_NO_ERROR is a special case. Write it up.) 1541 8.2. Stream Identifiers 1543 Streams are identified by an unsigned 32-bit integer, referred to as 1544 the StreamID. 
To avoid StreamID collision, clients MUST initiate 1545 streams using odd-numbered StreamIDs; streams initiated by the 1546 server MUST use even-numbered StreamIDs. 1548 A StreamID of zero (0x0) is reserved and used for connection-level 1549 flow control frames (Section XX); the StreamID of zero cannot be used 1550 to establish a new stream. 1552 StreamID 1 (0x1) is reserved for the crypto handshake. StreamID 1 1553 MUST NOT be used for application data, and MUST be the first client- 1554 initiated stream. 1556 Streams MUST be created or reserved in sequential order, but MAY be 1557 used in arbitrary order. A QUIC endpoint MUST NOT reuse a StreamID 1558 on a given connection. 1560 8.3. Stream Concurrency 1562 An endpoint can limit the number of concurrently active incoming 1563 streams by setting the MSPC parameter (see Section XX) in the 1564 transport parameters. The maximum concurrent streams setting is 1565 specific to each endpoint and applies only to the peer that receives 1566 the setting. That is, clients specify the maximum number of 1567 concurrent streams the server can initiate, and servers specify the 1568 maximum number of concurrent streams the client can initiate. 1570 Streams that are in the "open" state or in either of the "half- 1571 closed" states count toward the maximum number of streams that an 1572 endpoint is permitted to open. Streams in any of these three states 1573 count toward the limit advertised in the MSPC setting. 1575 Endpoints MUST NOT exceed the limit set by their peer. An endpoint 1576 that receives a STREAM frame that causes its advertised concurrent 1577 stream limit to be exceeded MUST treat this as a stream error of type 1578 QUIC_TOO_MANY_OPEN_STREAMS (Section XX). 1580 8.4. Sending and Receiving Data 1582 Once a stream is created, endpoints may use the stream to send and 1583 receive data.
Each endpoint may send a series of STREAM frames 1584 encapsulating data on a stream until the stream is terminated in that 1585 direction. Streams are an ordered byte-stream abstraction, and they 1586 have no other structure within them. STREAM frame boundaries are not 1587 expected to be preserved in retransmissions from the sender or during 1588 delivery to the application at the receiver. 1590 When new data is to be sent on a stream, a sender MUST set the 1591 encapsulating STREAM frame's offset field to the stream offset of the 1592 first byte of this new data. The first byte of data that is sent on 1593 a stream has the stream offset 0. A receiver MUST ensure that 1594 received stream data is delivered to the application as an ordered 1595 byte-stream. Data received out of order MUST be buffered for later 1596 delivery, as long as it is not in violation of the receiver's flow 1597 control limits. 1599 An endpoint MUST NOT send any stream data without consulting the 1600 congestion controller and the flow controller, with the following two 1601 exceptions. 1603 o The crypto handshake stream, Stream 1, MUST NOT be subject to 1604 congestion control or connection-level flow control, but MUST be 1605 subject to stream-level flow control. 1607 o An application MAY exclude specific stream IDs from connection- 1608 level flow control. If so, these streams MUST NOT be subject to 1609 connection-level flow control. 1611 Flow control is described in detail in Section XX, and congestion 1612 control is described in the companion document [QUIC-RECOVERY]. 1614 9. Flow Control 1616 It is necessary to limit the amount of data that a sender may have 1617 outstanding at any time, so as to prevent a fast sender from 1618 overwhelming a slow receiver, or to prevent a malicious sender from 1619 consuming significant resources at a receiver. This section 1620 describes QUIC's flow-control mechanisms. 
1622 QUIC employs a credit-based flow-control scheme similar to HTTP/2's 1623 flow control [RFC7540]. A receiver advertises the number of octets 1624 it is prepared to receive on a given stream and for the entire 1625 connection. This leads to two levels of flow control in QUIC: (i) 1626 Connection flow control, which prevents senders from exceeding a 1627 receiver's buffer capacity for the connection, and (ii) Stream flow 1628 control, which prevents a single stream from consuming the entire 1629 receive buffer for a connection. 1631 A receiver sends WINDOW_UPDATE frames to the sender to advertise 1632 additional credit, for both connection and stream flow control. A 1633 receiver advertises the maximum absolute byte offset in the stream or 1634 in the connection which the receiver is willing to receive. 1636 The initial flow control credit is 65536 bytes for both the stream 1637 and connection flow controllers. 1639 A receiver MAY advertise a larger offset at any point in the 1640 connection by sending a WINDOW_UPDATE frame. A receiver MUST NOT 1641 renege on an advertisement; that is, once a receiver advertises an 1642 offset via a WINDOW_UPDATE frame, it MUST NOT subsequently advertise 1643 a smaller offset. A sender may receive WINDOW_UPDATE frames out of 1644 order; a sender MUST therefore ignore any reductions in flow control 1645 credit. 1647 A sender MUST send BLOCKED frames to indicate it has data to write 1648 but is blocked by lack of connection or stream flow control credit. 1649 BLOCKED frames are expected to be sent infrequently in common cases, 1650 but they are considered useful for debugging and monitoring purposes. 1652 A receiver advertises credit for a stream by sending a WINDOW_UPDATE 1653 frame with the StreamID set appropriately. A receiver may simply use 1654 the current received offset to determine the flow control offset to 1655 be advertised. 
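The sender's side of this credit scheme can be sketched as follows. The SendFlowController class and its method names are hypothetical; one instance would track a single limit, either one stream's or the connection's. The sketch assumes the limit is an absolute byte offset, as described above, and shows why an out-of-order WINDOW_UPDATE carrying a smaller offset is harmless.

```python
INITIAL_CREDIT = 65536  # initial flow control credit stated above, in bytes


class SendFlowController:
    """Tracks one limit (a stream's or the connection's) on the sending side."""

    def __init__(self, limit=INITIAL_CREDIT):
        self.limit = limit  # highest absolute offset the peer has advertised
        self.sent = 0       # bytes sent so far against this limit

    def on_window_update(self, advertised_offset):
        # Updates can arrive out of order, so never move the limit backwards.
        self.limit = max(self.limit, advertised_offset)

    def available(self):
        return self.limit - self.sent

    def try_send(self, nbytes):
        """Return how many of nbytes may be sent now; 0 means blocked."""
        allowed = min(nbytes, self.available())
        self.sent += allowed
        return allowed
```

A caller that still has data to write when available() reaches zero is blocked, which is the condition under which a BLOCKED frame is sent.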
1657 Connection flow control is a limit to the total bytes of stream data 1658 sent in STREAM frames. A receiver advertises credit for a connection 1659 by sending a WINDOW_UPDATE frame with the StreamID set to zero 1660 (0x00). A receiver may maintain a cumulative sum of bytes received 1661 on all streams to determine the value of the connection 1662 flow control offset to be advertised in WINDOW_UPDATE frames. A 1663 sender may maintain a cumulative sum of stream data bytes sent to 1664 impose the connection flow control limit. 1666 9.1. Edge Cases and Other Considerations 1668 There are some edge cases which must be considered when dealing with 1669 stream and connection level flow control. Given enough time, both 1670 endpoints must agree on flow control state. If one end believes it 1671 can send more than the other end is willing to receive, the 1672 connection will be torn down when too much data arrives. Conversely, 1673 if a sender believes it is blocked, while the receiver believes more 1674 data can be received, then the connection can be in a deadlock, with 1675 the sender waiting for a WINDOW_UPDATE which will never come. 1677 9.1.1. Mid-stream RST_STREAM 1679 On receipt of a RST_STREAM frame, an endpoint will tear down state 1680 for the matching stream and ignore further data arriving on that 1681 stream. This could result in the endpoints getting out of sync, 1682 since the RST_STREAM frame may have arrived out of order and there 1683 may be further bytes in flight. The data sender would have counted 1684 the data against its connection level flow control budget, but a 1685 receiver that has not received these bytes would not know to include 1686 them as well. The receiver must learn the number of bytes that 1687 were sent on the stream to make the same adjustment in its connection 1688 flow controller.
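One way a receiver might make that adjustment is sketched below, assuming the RST_STREAM frame carries the stream's final offset, as noted in the description of the "closed" state. The ConnectionReceiveAccounting class is hypothetical, and counting by the highest offset seen per stream is a simplifying assumption of this sketch rather than something this document prescribes.

```python
class ConnectionReceiveAccounting:
    """Hypothetical receiver-side tally for connection flow control."""

    def __init__(self):
        self.highest_offset = {}  # StreamID -> highest byte offset seen so far
        self.total = 0            # bytes counted against the connection limit

    def _raise_to(self, stream_id, offset):
        previous = self.highest_offset.get(stream_id, 0)
        if offset > previous:
            self.total += offset - previous
            self.highest_offset[stream_id] = offset

    def on_stream_data(self, stream_id, offset, length):
        self._raise_to(stream_id, offset + length)

    def on_rst_stream(self, stream_id, final_offset):
        # Count bytes the peer sent that may never arrive, so both ends
        # agree on how much connection-level credit the stream consumed.
        self._raise_to(stream_id, final_offset)
```

Data for the reset stream that arrives later does not change the total, since it falls at or below the final offset already counted.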
1690 To avoid this de-synchronization, a RST_STREAM sender MUST include 1691 the final byte offset sent on the stream in the RST_STREAM frame. On 1692 receiving a RST_STREAM frame, a receiver definitively knows how many 1693 bytes were sent on that stream before the RST_STREAM frame, and the 1694 receiver MUST use the final offset to account for all bytes sent on 1695 the stream in its connection level flow controller. 1697 9.1.2. Response to a RST_STREAM 1699 Since streams are bidirectional, a sender of a RST_STREAM needs to 1700 know how many bytes the peer has sent on the stream. If an endpoint 1701 receives a RST_STREAM frame and has sent neither a FIN nor a 1702 RST_STREAM, it MUST send a RST_STREAM in response, bearing the offset 1703 of the last byte sent on this stream as the final offset. 1705 9.1.3. Offset Increment 1707 This document leaves when and how many bytes to advertise in a 1708 WINDOW_UPDATE to the implementation, but offers a few considerations. 1709 WINDOW_UPDATE frames constitute overhead, and therefore, sending a 1710 WINDOW_UPDATE with small offset increments is undesirable. At the 1711 same time, sending WINDOW_UPDATES with large offset increments 1712 requires the sender to commit to that amount of buffer. 1713 Implementations must find the correct tradeoff between these sides to 1714 determine how large an offset increment to send in a WINDOW_UPDATE. 1716 A receiver MAY use an autotuning mechanism to tune the size of the 1717 offset increment to advertise based on a roundtrip time estimate and 1718 the rate at which the receiving application consumes data, similar to 1719 common TCP implementations. 1721 9.1.4. BLOCKED frames 1723 If a sender does not receive a WINDOW_UPDATE frame when it has run 1724 out of flow control credit, the sender will be blocked and MUST send 1725 a BLOCKED frame. A BLOCKED frame is expected to be useful for 1726 debugging at the receiver. 
A receiver SHOULD NOT wait for a BLOCKED 1727 frame before sending a WINDOW_UPDATE, since doing so will cause 1728 at least one roundtrip of quiescence. For smooth operation of the 1729 congestion controller, it is generally considered best to not let the 1730 sender go into quiescence if avoidable. To avoid blocking a sender, 1731 and to reasonably account for the possibility of loss, a receiver 1732 should send a WINDOW_UPDATE frame at least two roundtrips before it 1733 expects the sender to get blocked. 1735 10. Error Codes 1737 This section lists all the QUIC error codes that may be used in a 1738 CONNECTION_CLOSE frame. TODO: Trim list and group errors for 1739 readability. 1741 o 0x01: QUIC_INTERNAL_ERROR. (Connection has reached an invalid 1742 state.) 1744 o 0x02: QUIC_STREAM_DATA_AFTER_TERMINATION. (There were data frames 1745 after a FIN or reset.) 1747 o 0x03: QUIC_INVALID_PACKET_HEADER. (Control frame is malformed.) 1749 o 0x04: QUIC_INVALID_FRAME_DATA. (Frame data is malformed.) 1751 o 0x30: QUIC_MISSING_PAYLOAD. (The packet contained no payload.) 1753 o 0x2e: QUIC_INVALID_STREAM_DATA. (STREAM frame data is malformed.) 1755 o 0x57: QUIC_OVERLAPPING_STREAM_DATA. (STREAM frame data overlaps 1756 with buffered data.) 1758 o 0x3d: QUIC_UNENCRYPTED_STREAM_DATA. (Received STREAM frame data 1759 is not encrypted.) 1761 o 0x58: QUIC_ATTEMPT_TO_SEND_UNENCRYPTED_STREAM_DATA. (Attempt to 1762 send unencrypted STREAM frame. Not sent on the wire, used for 1763 local logging.) 1765 o 0x59: QUIC_MAYBE_CORRUPTED_MEMORY. (Received a frame which is 1766 likely the result of memory corruption.) 1768 o 0x06: QUIC_INVALID_RST_STREAM_DATA. (RST_STREAM frame data is 1769 malformed.) 1771 o 0x07: QUIC_INVALID_CONNECTION_CLOSE_DATA. (CONNECTION_CLOSE frame 1772 data is malformed.) 1774 o 0x08: QUIC_INVALID_GOAWAY_DATA. (GOAWAY frame data is malformed.) 1776 o 0x39: QUIC_INVALID_WINDOW_UPDATE_DATA. (WINDOW_UPDATE frame data 1777 is malformed.)
1779 o 0x3a: QUIC_INVALID_BLOCKED_DATA. (BLOCKED frame data is 1780 malformed.) 1782 o 0x3c: QUIC_INVALID_STOP_WAITING_DATA. (STOP_WAITING frame data is 1783 malformed.) 1785 o 0x4e: QUIC_INVALID_PATH_CLOSE_DATA. (PATH_CLOSE frame data is 1786 malformed.) 1788 o 0x09: QUIC_INVALID_ACK_DATA. (ACK frame data is malformed.) 1790 o 0x0a: QUIC_INVALID_VERSION_NEGOTIATION_PACKET. (Version 1791 negotiation packet is malformed.) 1793 o 0x0b: QUIC_INVALID_PUBLIC_RST_PACKET. (Public RST packet is 1794 malformed.) 1796 o 0x0c: QUIC_DECRYPTION_FAILURE. (There was an error decrypting.) 1798 o 0x0d: QUIC_ENCRYPTION_FAILURE. (There was an error encrypting.) 1800 o 0x0e: QUIC_PACKET_TOO_LARGE. (The packet exceeded 1801 kMaxPacketSize.) 1803 o 0x10: QUIC_PEER_GOING_AWAY. (The peer is going away. May be a 1804 client or server.) 1806 o 0x11: QUIC_INVALID_STREAM_ID. (A stream ID was invalid.) 1808 o 0x31: QUIC_INVALID_PRIORITY. (A priority was invalid.) 1810 o 0x12: QUIC_TOO_MANY_OPEN_STREAMS. (Too many streams already 1811 open.) 1813 o 0x4c: QUIC_TOO_MANY_AVAILABLE_STREAMS. (The peer created too many 1814 available streams.) 1816 o 0x13: QUIC_PUBLIC_RESET. (Received public reset for this 1817 connection.) 1819 o 0x14: QUIC_INVALID_VERSION. (Invalid protocol version.) 1821 o 0x16: QUIC_INVALID_HEADER_ID. (The Header ID for a stream was too 1822 far from the previous.) 1824 o 0x17: QUIC_INVALID_NEGOTIATED_VALUE. (Negotiable parameter 1825 received during handshake had invalid value.) 1827 o 0x18: QUIC_DECOMPRESSION_FAILURE. (There was an error 1828 decompressing data.) 1830 o 0x19: QUIC_NETWORK_IDLE_TIMEOUT. (The connection timed out due to 1831 no network activity.) 1833 o 0x43: QUIC_HANDSHAKE_TIMEOUT. (The connection timed out waiting 1834 for the handshake to complete.) 1836 o 0x1a: QUIC_ERROR_MIGRATING_ADDRESS. (There was an error 1837 encountered migrating addresses.) 1839 o 0x56: QUIC_ERROR_MIGRATING_PORT. (There was an error encountered 1840 migrating port only.) 
1842 o 0x1b: QUIC_PACKET_WRITE_ERROR. (There was an error while writing 1843 to the socket.) 1845 o 0x33: QUIC_PACKET_READ_ERROR. (There was an error while reading 1846 from the socket.) 1848 o 0x32: QUIC_EMPTY_STREAM_FRAME_NO_FIN. (We received a STREAM_FRAME 1849 with no data and no FIN flag set.) 1851 o 0x38: QUIC_INVALID_HEADERS_STREAM_DATA. (We received invalid data 1852 on the headers stream.) 1854 o 0x3b: QUIC_FLOW_CONTROL_RECEIVED_TOO_MUCH_DATA. (The peer 1855 received too much data, violating flow control.) 1857 o 0x3f: QUIC_FLOW_CONTROL_SENT_TOO_MUCH_DATA. (The peer sent too 1858 much data, violating flow control.) 1860 o 0x40: QUIC_FLOW_CONTROL_INVALID_WINDOW. (The peer received an 1861 invalid flow control window.) 1863 o 0x3e: QUIC_CONNECTION_IP_POOLED. (The connection has been IP 1864 pooled into an existing connection.) 1866 o 0x44: QUIC_TOO_MANY_OUTSTANDING_SENT_PACKETS. (The connection has 1867 too many outstanding sent packets.) 1869 o 0x45: QUIC_TOO_MANY_OUTSTANDING_RECEIVED_PACKETS. (The connection 1870 has too many outstanding received packets.) 1872 o 0x46: QUIC_CONNECTION_CANCELLED. (The QUIC connection has been 1873 cancelled.) 1875 o 0x47: QUIC_BAD_PACKET_LOSS_RATE. (Disabled QUIC because of high 1876 packet loss rate.) 1878 o 0x49: QUIC_PUBLIC_RESETS_POST_HANDSHAKE. (Disabled QUIC because 1879 of too many PUBLIC_RESETs post handshake.) 1881 o 0x4a: QUIC_TIMEOUTS_WITH_OPEN_STREAMS. (Disabled QUIC because of 1882 too many timeouts with streams open.) 1884 o 0x4b: QUIC_FAILED_TO_SERIALIZE_PACKET. (Closed because we failed 1885 to serialize a packet.) 1887 o 0x55: QUIC_TOO_MANY_RTOS. (QUIC timed out after too many RTOs.) 1888 o 0x1c: QUIC_HANDSHAKE_FAILED. (Crypto errors. Handshake failed.) 1890 o 0x1d: QUIC_CRYPTO_TAGS_OUT_OF_ORDER. (Handshake message contained 1891 out of order tags.) 1893 o 0x1e: QUIC_CRYPTO_TOO_MANY_ENTRIES. (Handshake message contained 1894 too many entries.) 1896 o 0x1f: QUIC_CRYPTO_INVALID_VALUE_LENGTH.
(Handshake message 1897 contained an invalid value length.) 1899 o 0x20: QUIC_CRYPTO_MESSAGE_AFTER_HANDSHAKE_COMPLETE. (A crypto 1900 message was received after the handshake was complete.) 1902 o 0x21: QUIC_INVALID_CRYPTO_MESSAGE_TYPE. (A crypto message was 1903 received with an illegal message tag.) 1905 o 0x22: QUIC_INVALID_CRYPTO_MESSAGE_PARAMETER. (A crypto message 1906 was received with an illegal parameter.) 1908 o 0x34: QUIC_INVALID_CHANNEL_ID_SIGNATURE. (An invalid channel id 1909 signature was supplied.) 1911 o 0x23: QUIC_CRYPTO_MESSAGE_PARAMETER_NOT_FOUND. (A crypto message 1912 was received with a mandatory parameter missing.) 1914 o 0x24: QUIC_CRYPTO_MESSAGE_PARAMETER_NO_OVERLAP. (A crypto message 1915 was received with a parameter that has no overlap with the local 1916 parameter.) 1918 o 0x25: QUIC_CRYPTO_MESSAGE_INDEX_NOT_FOUND. (A crypto message was 1919 received that contained a parameter with too few values.) 1921 o 0x5e: QUIC_UNSUPPORTED_PROOF_DEMAND. (A demand for an unsupported 1922 proof type was received.) 1924 o 0x26: QUIC_CRYPTO_INTERNAL_ERROR. (An internal error occurred in 1925 crypto processing.) 1927 o 0x27: QUIC_CRYPTO_VERSION_NOT_SUPPORTED. (A crypto handshake 1928 message specified an unsupported version.) 1930 o 0x48: QUIC_CRYPTO_HANDSHAKE_STATELESS_REJECT. (A crypto handshake 1931 message resulted in a stateless reject.) 1933 o 0x28: QUIC_CRYPTO_NO_SUPPORT. (There was no intersection between 1934 the crypto primitives supported by the peer and ourselves.) 1936 o 0x29: QUIC_CRYPTO_TOO_MANY_REJECTS. (The server rejected our 1937 client hello messages too many times.) 1939 o 0x2a: QUIC_PROOF_INVALID. (The client rejected the server's 1940 certificate chain or signature.) 1942 o 0x2b: QUIC_CRYPTO_DUPLICATE_TAG. (A crypto message was received 1943 with a duplicate tag.) 1945 o 0x2c: QUIC_CRYPTO_ENCRYPTION_LEVEL_INCORRECT. (A crypto message 1946 was received with the wrong encryption level (i.e.
it should have 1947 been encrypted but was not.)) 1949 o 0x2d: QUIC_CRYPTO_SERVER_CONFIG_EXPIRED. (The server config for a 1950 server has expired.) 1952 o 0x35: QUIC_CRYPTO_SYMMETRIC_KEY_SETUP_FAILED. (We failed to set up 1953 the symmetric keys for a connection.) 1955 o 0x36: QUIC_CRYPTO_MESSAGE_WHILE_VALIDATING_CLIENT_HELLO. (A 1956 handshake message arrived, but we are still validating the previous 1957 handshake message.) 1959 o 0x41: QUIC_CRYPTO_UPDATE_BEFORE_HANDSHAKE_COMPLETE. (A server 1960 config update arrived before the handshake is complete.) 1962 o 0x5a: QUIC_CRYPTO_CHLO_TOO_LARGE. (CHLO cannot fit in one 1963 packet.) 1965 o 0x37: QUIC_VERSION_NEGOTIATION_MISMATCH. (This connection 1966 involved a version negotiation which appears to have been tampered 1967 with.) 1969 o 0x50: QUIC_IP_ADDRESS_CHANGED. (IP address changed causing 1970 connection close.) 1972 o 0x51: QUIC_CONNECTION_MIGRATION_NO_MIGRATABLE_STREAMS. 1973 (Connection migration errors. Network changed, but connection had 1974 no migratable streams.) 1976 o 0x52: QUIC_CONNECTION_MIGRATION_TOO_MANY_CHANGES. (Connection 1977 changed networks too many times.) 1979 o 0x53: QUIC_CONNECTION_MIGRATION_NO_NEW_NETWORK. (Connection 1980 migration was attempted, but there was no new network to migrate 1981 to.) 1983 o 0x54: QUIC_CONNECTION_MIGRATION_NON_MIGRATABLE_STREAM. (Network 1984 changed, but connection had one or more non-migratable streams.) 1986 o 0x5d: QUIC_TOO_MANY_FRAME_GAPS. (Stream frames arrived too 1987 discontiguously so that stream sequencer buffer maintains too many 1988 gaps.) 1990 o 0x5f: QUIC_STREAM_SEQUENCER_INVALID_STATE. (Sequencer buffer got 1991 into a weird state where continuing to read or write will lead to a crash.) 1993 o 0x60: QUIC_TOO_MANY_SESSIONS_ON_SERVER. (Connection closed 1994 because the server hit the maximum number of sessions allowed.) 1996 11. Security and Privacy Considerations 1998 11.1.
Spoofed Ack Attack 2000 An attacker receives an STK from the server and then releases the IP 2001 address on which it received the STK. The attacker may, in the 2002 future, spoof this same address (which now presumably addresses a 2003 different endpoint), and initiate a 0-RTT connection with a server 2004 on the victim's behalf. The attacker then spoofs ack packets to the 2005 server, which cause the server to potentially drown the victim in 2006 data. 2008 There are two possible mitigations to this attack. The simplest one 2009 is that a server can unilaterally create a gap in packet-number 2010 space. In the non-attack scenario, the client will send an ack with 2011 a larger largest acked. In the attack scenario, the attacker may ack 2012 a packet in the gap. If the server sees an ack for a packet that was 2013 never sent, the connection can be aborted. 2015 The second mitigation is that the server can require that acks for 2016 sent packets match the encryption level of the sent packet. This 2017 mitigation is useful if the connection has an ephemeral forward- 2018 secure key that is generated and used for every new connection. If a 2019 packet sent is encrypted with a forward-secure key, then any acks 2020 that are received for it must also be forward-secure encrypted. 2021 Since the attacker will not have the forward-secure key, the attacker 2022 will not be able to generate forward-secure encrypted ack packets. 2024 12. IANA Considerations 2026 This document has no IANA actions yet. 2028 13. References 2030 13.1. Normative References 2032 [QUIC-RECOVERY] 2033 Iyengar, J., Ed. and I. Swett, Ed., "QUIC Loss Detection 2034 and Congestion Control", November 2016. 2036 [QUIC-TLS] 2037 Thomson, M., Ed. and S. Turner, Ed., "Using Transport 2038 Layer Security (TLS) to Secure QUIC", November 2016. 2040 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 2041 Requirement Levels", BCP 14, RFC 2119, 2042 DOI 10.17487/RFC2119, March 1997, 2043 .
2045 [RFC4821] Mathis, M. and J. Heffner, "Packetization Layer Path MTU 2046 Discovery", RFC 4821, DOI 10.17487/RFC4821, March 2007, 2047 . 2049 [RFC7540] Belshe, M., Peon, R., and M. Thomson, Ed., "Hypertext 2050 Transfer Protocol Version 2 (HTTP/2)", RFC 7540, 2051 DOI 10.17487/RFC7540, May 2015, 2052 . 2054 13.2. Informative References 2056 [EARLY-DESIGN] 2057 Roskind, J., "QUIC: Multiplexed Transport Over UDP", 2058 December 2013, . 2060 [QUIC-HTTP] 2061 Bishop, M., Ed., "Hypertext Transfer Protocol (HTTP) over 2062 QUIC", November 2016. 2064 [QUICCrypto] 2065 Langley, A. and W. Chang, "QUIC Crypto", May 2016, 2066 . 2068 [SST] Ford, B., "Structured Streams: A New Transport 2069 Abstraction", ACM SIGCOMM 2007 , August 2007. 2071 Appendix A. Contributors 2073 The original authors of this specification were Ryan Hamilton, Jana 2074 Iyengar, Ian Swett, and Alyssa Wilk. 2076 The original design and rationale behind this protocol draw 2077 significantly from work by Jim Roskind [EARLY-DESIGN]. In 2078 alphabetical order, the contributors to the pre-IETF QUIC project at 2079 Google are: Britt Cyr, Jeremy Dorfman, Ryan Hamilton, Jana Iyengar, 2080 Fedor Kouranov, Charles Krasic, Jo Kulik, Adam Langley, Jim Roskind, 2081 Robbie Shade, Satyam Shekhar, Cherie Shi, Ian Swett, Raman Tenneti, 2082 Victor Vasiliev, Antonio Vicente, Patrik Westin, Alyssa Wilk, Dale 2083 Worley, Fan Yang, Dan Zhang, Daniel Ziegler. 2085 Appendix B. Acknowledgments 2087 Special thanks are due to the following for helping shape pre-IETF 2088 QUIC and its deployment: Chris Bentzel, Misha Efimov, Roberto Peon, 2089 Alistair Riddoch, Siddharth Vijayakrishnan, and Assar Westerlund. 2091 This document has benefited immensely from various private 2092 discussions and public ones on the quic@ietf.org and proto- 2093 quic@chromium.org mailing lists. Our thanks to all. 
2095 Authors' Addresses 2097 Jana Iyengar (editor) 2098 Google 2100 Email: jri@google.com 2102 Martin Thomson (editor) 2103 Mozilla 2105 Email: martin.thomson@gmail.com