idnits 2.17.1

draft-ietf-quic-transport-28.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  == There are 3 instances of lines with non-ascii characters in the document.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD',
     or 'RECOMMENDED' is not an accepted usage according to RFC 2119. Please
     use uppercase 'NOT' together with RFC 2119 keywords (if that is what you
     mean).

     Found 'MUST not' in this paragraph:

     The server includes a connection ID of its choice in the Source
     Connection ID field. This value MUST not be equal to the Destination
     Connection ID field of the packet sent by the client. A client MUST
     discard a Retry packet that contains a Source Connection ID field that
     is identical to the Destination Connection ID field of its Initial
     packet. The client MUST use the value from the Source Connection ID
     field of the Retry packet in the Destination Connection ID field of
     subsequent packets that it sends.

  -- The document date (20 May 2020) is 1437 days in the past. Is this
     intentional?

  -- Found something which looks like a code comment -- if you have code
     sections in the document, please surround them with '<CODE BEGINS>' and
     '<CODE ENDS>' lines.

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  -- Looks like a reference, but probably isn't: '0' on line 2179

  == Missing Reference: 'CH' is mentioned on line 2175, but not defined

  == Missing Reference: 'SH' is mentioned on line 2177, but not defined

  == Missing Reference: 'EE' is mentioned on line 2178, but not defined

  == Missing Reference: 'CERT' is mentioned on line 2178, but not defined

  == Missing Reference: 'CV' is mentioned on line 2178, but not defined

  == Missing Reference: 'FIN' is mentioned on line 2178, but not defined

  -- Looks like a reference, but probably isn't: '1' on line 2177

  == Outdated reference: A later version (-22) exists of
     draft-ietf-tsvwg-datagram-plpmtud-21

  == Outdated reference: A later version (-34) exists of
     draft-ietf-quic-recovery-28

  == Outdated reference: A later version (-34) exists of
     draft-ietf-quic-tls-28

  -- Obsolete informational reference (is this intentional?): RFC 7540 (ref.
     'HTTP2') (Obsoleted by RFC 9113)

  == Outdated reference: A later version (-13) exists of
     draft-ietf-quic-invariants-08

  == Outdated reference: A later version (-18) exists of
     draft-ietf-quic-manageability-06


     Summary: 0 errors (**), 0 flaws (~~), 14 warnings (==), 5 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

QUIC                                                     J. Iyengar, Ed.
Internet-Draft                                                    Fastly
Intended status: Standards Track                         M. Thomson, Ed.
Expires: 21 November 2020                                        Mozilla
                                                             20 May 2020

           QUIC: A UDP-Based Multiplexed and Secure Transport
                      draft-ietf-quic-transport-28

Abstract

   This document defines the core of the QUIC transport protocol.
   Accompanying documents describe QUIC's loss detection and congestion
   control and the use of TLS for key negotiation.

Note to Readers

   Discussion of this draft takes place on the QUIC working group
   mailing list (quic@ietf.org), which is archived at
   https://mailarchive.ietf.org/arch/search/?email_list=quic

   Working Group information can be found at https://github.com/quicwg;
   source code and issues list for this draft can be found at
   https://github.com/quicwg/base-drafts/labels/-transport.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 21 November 2020.

Copyright Notice

   Copyright (c) 2020 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents (https://trustee.ietf.org/
   license-info) in effect on the date of publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document. Code Components
   extracted from this document must include Simplified BSD License text
   as described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Document Structure
     1.2.  Terms and Definitions
     1.3.  Notational Conventions
   2.  Streams
     2.1.  Stream Types and Identifiers
     2.2.  Sending and Receiving Data
     2.3.  Stream Prioritization
     2.4.  Required Operations on Streams
   3.  Stream States
     3.1.  Sending Stream States
     3.2.  Receiving Stream States
     3.3.  Permitted Frame Types
     3.4.  Bidirectional Stream States
     3.5.  Solicited State Transitions
   4.  Flow Control
     4.1.  Data Flow Control
     4.2.  Flow Credit Increments
     4.3.  Handling Stream Cancellation
     4.4.  Stream Final Size
     4.5.  Controlling Concurrency
   5.  Connections
     5.1.  Connection ID
       5.1.1.  Issuing Connection IDs
       5.1.2.  Consuming and Retiring Connection IDs
     5.2.  Matching Packets to Connections
       5.2.1.  Client Packet Handling
       5.2.2.  Server Packet Handling
       5.2.3.  Considerations for Simple Load Balancers
     5.3.  Life of a QUIC Connection
     5.4.  Required Operations on Connections
   6.  Version Negotiation
     6.1.  Sending Version Negotiation Packets
     6.2.  Handling Version Negotiation Packets
       6.2.1.  Version Negotiation Between Draft Versions
     6.3.  Using Reserved Versions
   7.  Cryptographic and Transport Handshake
     7.1.  Example Handshake Flows
     7.2.  Negotiating Connection IDs
     7.3.  Authenticating Connection IDs
     7.4.  Transport Parameters
       7.4.1.  Values of Transport Parameters for 0-RTT
       7.4.2.  New Transport Parameters
     7.5.  Cryptographic Message Buffering
   8.  Address Validation
     8.1.  Address Validation During Connection Establishment
       8.1.1.  Token Construction
       8.1.2.  Address Validation using Retry Packets
       8.1.3.  Address Validation for Future Connections
       8.1.4.  Address Validation Token Integrity
     8.2.  Path Validation
     8.3.  Initiating Path Validation
     8.4.  Path Validation Responses
     8.5.  Successful Path Validation
     8.6.  Failed Path Validation
   9.  Connection Migration
     9.1.  Probing a New Path
     9.2.  Initiating Connection Migration
     9.3.  Responding to Connection Migration
       9.3.1.  Peer Address Spoofing
       9.3.2.  On-Path Address Spoofing
       9.3.3.  Off-Path Packet Forwarding
     9.4.  Loss Detection and Congestion Control
     9.5.  Privacy Implications of Connection Migration
     9.6.  Server's Preferred Address
       9.6.1.  Communicating a Preferred Address
       9.6.2.  Responding to Connection Migration
       9.6.3.  Interaction of Client Migration and Preferred Address
     9.7.  Use of IPv6 Flow-Label and Migration
   10. Connection Termination
     10.1.  Closing and Draining Connection States
     10.2.  Idle Timeout
     10.3.  Immediate Close
       10.3.1.  Immediate Close During the Handshake
     10.4.  Stateless Reset
       10.4.1.  Detecting a Stateless Reset
       10.4.2.  Calculating a Stateless Reset Token
       10.4.3.  Looping
   11. Error Handling
     11.1.  Connection Errors
     11.2.  Stream Errors
   12. Packets and Frames
     12.1.  Protected Packets
     12.2.  Coalescing Packets
     12.3.  Packet Numbers
     12.4.  Frames and Frame Types
   13. Packetization and Reliability
     13.1.  Packet Processing
     13.2.  Generating Acknowledgements
       13.2.1.  Sending ACK Frames
       13.2.2.  Managing ACK Ranges
       13.2.3.  Receiver Tracking of ACK Frames
       13.2.4.  Limiting ACK Ranges
       13.2.5.  Measuring and Reporting Host Delay
       13.2.6.  ACK Frames and Packet Protection
     13.3.  Retransmission of Information
     13.4.  Explicit Congestion Notification
       13.4.1.  ECN Counts
       13.4.2.  ECN Validation
   14. Packet Size
     14.1.  Path Maximum Transmission Unit (PMTU)
     14.2.  ICMP Packet Too Big Messages
     14.3.  Datagram Packetization Layer PMTU Discovery
       14.3.1.  PMTU Probes Containing Source Connection ID
   15. Versions
   16. Variable-Length Integer Encoding
   17. Packet Formats
     17.1.  Packet Number Encoding and Decoding
     17.2.  Long Header Packets
       17.2.1.  Version Negotiation Packet
       17.2.2.  Initial Packet
       17.2.3.  0-RTT
       17.2.4.  Handshake Packet
       17.2.5.  Retry Packet
     17.3.  Short Header Packets
       17.3.1.  Latency Spin Bit
   18. Transport Parameter Encoding
     18.1.  Reserved Transport Parameters
     18.2.  Transport Parameter Definitions
   19. Frame Types and Formats
     19.1.  PADDING Frame
     19.2.  PING Frame
     19.3.  ACK Frames
       19.3.1.  ACK Ranges
       19.3.2.  ECN Counts
     19.4.  RESET_STREAM Frame
     19.5.  STOP_SENDING Frame
     19.6.  CRYPTO Frame
     19.7.  NEW_TOKEN Frame
     19.8.  STREAM Frames
     19.9.  MAX_DATA Frame
     19.10. MAX_STREAM_DATA Frame
     19.11. MAX_STREAMS Frames
     19.12. DATA_BLOCKED Frame
     19.13. STREAM_DATA_BLOCKED Frame
     19.14. STREAMS_BLOCKED Frames
     19.15. NEW_CONNECTION_ID Frame
     19.16. RETIRE_CONNECTION_ID Frame
     19.17. PATH_CHALLENGE Frame
     19.18. PATH_RESPONSE Frame
     19.19. CONNECTION_CLOSE Frames
     19.20. HANDSHAKE_DONE frame
     19.21. Extension Frames
   20. Transport Error Codes
     20.1.  Application Protocol Error Codes
   21. Security Considerations
     21.1.  Handshake Denial of Service
     21.2.  Amplification Attack
     21.3.  Optimistic ACK Attack
     21.4.  Slowloris Attacks
     21.5.  Stream Fragmentation and Reassembly Attacks
     21.6.  Stream Commitment Attack
     21.7.  Peer Denial of Service
     21.8.  Explicit Congestion Notification Attacks
     21.9.  Stateless Reset Oracle
     21.10. Version Downgrade
     21.11. Targeted Attacks by Routing
     21.12. Overview of Security Properties
       21.12.1.  Handshake
       21.12.2.  Protected Packets
       21.12.3.  Connection Migration
   22. IANA Considerations
     22.1.  Registration Policies for QUIC Registries
       22.1.1.  Provisional Registrations
       22.1.2.  Selecting Codepoints
       22.1.3.  Reclaiming Provisional Codepoints
       22.1.4.  Permanent Registrations
     22.2.  QUIC Transport Parameter Registry
     22.3.  QUIC Frame Type Registry
     22.4.  QUIC Transport Error Codes Registry
   23. References
     23.1.  Normative References
     23.2.  Informative References
   Appendix A.  Sample Packet Number Decoding Algorithm
   Appendix B.  Sample ECN Validation Algorithm
   Appendix C.  Change Log
     C.1.  Since draft-ietf-quic-transport-27
     C.2.  Since draft-ietf-quic-transport-26
     C.3.  Since draft-ietf-quic-transport-25
     C.4.  Since draft-ietf-quic-transport-24
     C.5.  Since draft-ietf-quic-transport-23
     C.6.  Since draft-ietf-quic-transport-22
     C.7.  Since draft-ietf-quic-transport-21
     C.8.  Since draft-ietf-quic-transport-20
     C.9.  Since draft-ietf-quic-transport-19
     C.10. Since draft-ietf-quic-transport-18
     C.11. Since draft-ietf-quic-transport-17
     C.12. Since draft-ietf-quic-transport-16
     C.13. Since draft-ietf-quic-transport-15
     C.14. Since draft-ietf-quic-transport-14
     C.15. Since draft-ietf-quic-transport-13
     C.16. Since draft-ietf-quic-transport-12
     C.17. Since draft-ietf-quic-transport-11
     C.18. Since draft-ietf-quic-transport-10
     C.19. Since draft-ietf-quic-transport-09
     C.20. Since draft-ietf-quic-transport-08
     C.21. Since draft-ietf-quic-transport-07
     C.22. Since draft-ietf-quic-transport-06
     C.23. Since draft-ietf-quic-transport-05
     C.24. Since draft-ietf-quic-transport-04
     C.25. Since draft-ietf-quic-transport-03
     C.26. Since draft-ietf-quic-transport-02
     C.27. Since draft-ietf-quic-transport-01
     C.28. Since draft-ietf-quic-transport-00
     C.29. Since draft-hamilton-quic-transport-protocol-01
   Contributors
   Authors' Addresses

1.  Introduction

   QUIC is a multiplexed and secure general-purpose transport protocol
   that provides:

   *  Stream multiplexing

   *  Stream and connection-level flow control

   *  Low-latency connection establishment

   *  Connection migration and resilience to NAT rebinding

   *  Authenticated and encrypted header and payload

   QUIC uses UDP as a substrate to avoid requiring changes to legacy
   client operating systems and middleboxes. QUIC authenticates all of
   its headers and encrypts most of the data it exchanges, including its
   signaling, to avoid incurring a dependency on middleboxes.

1.1.  Document Structure

   This document describes the core QUIC protocol and is structured as
   follows:

   *  Streams are the basic service abstraction that QUIC provides.

      -  Section 2 describes core concepts related to streams,

      -  Section 3 provides a reference model for stream states, and

      -  Section 4 outlines the operation of flow control.

   *  Connections are the context in which QUIC endpoints communicate.

      -  Section 5 describes core concepts related to connections,

      -  Section 6 describes version negotiation,

      -  Section 7 details the process for establishing connections,

      -  Section 8 specifies critical denial of service mitigation
         mechanisms,

      -  Section 9 describes how endpoints migrate a connection to a new
         network path,

      -  Section 10 lists the options for terminating an open
         connection, and

      -  Section 11 provides general guidance for error handling.

   *  Packets and frames are the basic unit used by QUIC to communicate.

      -  Section 12 describes concepts related to packets and frames,

      -  Section 13 defines models for the transmission, retransmission,
         and acknowledgement of data, and

      -  Section 14 specifies rules for managing the size of packets.

   *  Finally, encoding details of QUIC protocol elements are described
      in:

      -  Section 15 (Versions),

      -  Section 16 (Integer Encoding),

      -  Section 17 (Packet Headers),

      -  Section 18 (Transport Parameters),

      -  Section 19 (Frames), and

      -  Section 20 (Errors).

   Accompanying documents describe QUIC's loss detection and congestion
   control [QUIC-RECOVERY], and the use of TLS for key negotiation
   [QUIC-TLS].

   This document defines QUIC version 1, which conforms to the protocol
   invariants in [QUIC-INVARIANTS].

1.2.  Terms and Definitions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in
   BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

   Commonly used terms in the document are described below.

   QUIC:  The transport protocol described by this document. QUIC is a
      name, not an acronym.

   QUIC packet:  A complete processable unit of QUIC that can be
      encapsulated in a UDP datagram.
      Multiple QUIC packets can be encapsulated in a single UDP
      datagram.

   Ack-eliciting packet:  A QUIC packet that contains frames other than
      ACK, PADDING, and CONNECTION_CLOSE. These cause a recipient to
      send an acknowledgment; see Section 13.2.1.

   Out-of-order packet:  A packet that does not increase the largest
      received packet number for its packet number space (Section 12.3)
      by exactly one. A packet can arrive out of order if it is
      delayed, if earlier packets are lost or delayed, or if the sender
      intentionally skips a packet number.

   Endpoint:  An entity that can participate in a QUIC connection by
      generating, receiving, and processing QUIC packets. There are
      only two types of endpoint in QUIC: client and server.

   Client:  The endpoint initiating a QUIC connection.

   Server:  The endpoint accepting incoming QUIC connections.

   Address:  When used without qualification, the tuple of IP version,
      IP address, UDP protocol, and UDP port number that represents one
      end of a network path.

   Connection ID:  An opaque identifier that is used to identify a QUIC
      connection at an endpoint. Each endpoint sets a value for its
      peer to include in packets sent towards the endpoint.

   Stream:  A unidirectional or bidirectional channel of ordered bytes
      within a QUIC connection. A QUIC connection can carry multiple
      simultaneous streams.

   Application:  An entity that uses QUIC to send and receive data.

1.3.  Notational Conventions

   Packet and frame diagrams in this document use a bespoke format. The
   purpose of this format is to summarize, not define, protocol
   elements. Prose defines the complete semantics and details of
   structures.

   Complex fields are named and then followed by a list of fields
   surrounded by a pair of matching braces. Each field in this list is
   separated by commas.

   Individual fields include length information, plus indications about
   fixed value, optionality, or repetitions. Individual fields use the
   following notational conventions, with all lengths in bits:

   x (A):  Indicates that x is A bits long

   x (i):  Indicates that x uses the variable-length encoding in
      Section 16

   x (A..B):  Indicates that x can be any length from A to B; A can be
      omitted to indicate a minimum of zero bits and B can be omitted to
      indicate no set upper limit; values in this format always end on
      an octet boundary

   x (?) = C:  Indicates that x has a fixed value of C

   x (?) = C..D:  Indicates that x has a value in the range from C to D,
      inclusive

   [x (E)]:  Indicates that x is optional (and has length of E)

   x (E) ...:  Indicates that x is repeated zero or more times (and that
      each instance is length E)

   By convention, individual fields reference a complex field by using
   the name of the complex field.

   For example:

   Example Structure {
     One-bit Field (1),
     7-bit Field with Fixed Value (7) = 61,
     Arbitrary-Length Field (..),
     Variable-Length Field (8..24),
     Field With Minimum Length (16..),
     Field With Maximum Length (..128),
     [Optional Field (64)],
     Repeated Field (8) ...,
   }

                      Figure 1: Example Format

2.  Streams

   Streams in QUIC provide a lightweight, ordered byte-stream
   abstraction to an application. Streams can be unidirectional or
   bidirectional. An alternative view of QUIC unidirectional streams is
   a "message" abstraction of practically unlimited length.

   Streams can be created by sending data. Other processes associated
   with stream management - ending, cancelling, and managing flow
   control - are all designed to impose minimal overheads. For
   instance, a single STREAM frame (Section 19.8) can open, carry data
   for, and close a stream. Streams can also be long-lived and can last
   the entire duration of a connection.

   Streams can be created by either endpoint, can concurrently send data
   interleaved with other streams, and can be cancelled. QUIC does not
   provide any means of ensuring ordering between bytes on different
   streams.

   QUIC allows for an arbitrary number of streams to operate
   concurrently and for an arbitrary amount of data to be sent on any
   stream, subject to flow control constraints and stream limits; see
   Section 4.

2.1.  Stream Types and Identifiers

   Streams can be unidirectional or bidirectional. Unidirectional
   streams carry data in one direction: from the initiator of the stream
   to its peer. Bidirectional streams allow for data to be sent in both
   directions.

   Streams are identified within a connection by a numeric value,
   referred to as the stream ID. A stream ID is a 62-bit integer (0 to
   2^62-1) that is unique for all streams on a connection. Stream IDs
   are encoded as variable-length integers; see Section 16. A QUIC
   endpoint MUST NOT reuse a stream ID within a connection.

   The least significant bit (0x1) of the stream ID identifies the
   initiator of the stream. Client-initiated streams have even-numbered
   stream IDs (with the bit set to 0), and server-initiated streams have
   odd-numbered stream IDs (with the bit set to 1).

   The second least significant bit (0x2) of the stream ID distinguishes
   between bidirectional streams (with the bit set to 0) and
   unidirectional streams (with the bit set to 1).

   The least significant two bits from a stream ID therefore identify a
   stream as one of four types, as summarized in Table 1.
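
   As an illustration only, the two low-order bits described above can
   be decoded as in the following minimal Python sketch. The names
   Initiator, Direction, and classify_stream_id are hypothetical and are
   not defined by this document.

```python
from enum import Enum
from typing import Tuple

# Hypothetical names for illustration; the draft defines no API.
MAX_STREAM_ID = 2**62 - 1  # a stream ID is a 62-bit integer


class Initiator(Enum):
    CLIENT = 0  # least significant bit (0x1) set to 0
    SERVER = 1  # least significant bit (0x1) set to 1


class Direction(Enum):
    BIDIRECTIONAL = 0   # second least significant bit (0x2) set to 0
    UNIDIRECTIONAL = 1  # second least significant bit (0x2) set to 1


def classify_stream_id(stream_id: int) -> Tuple[Initiator, Direction]:
    """Decode the two least significant bits of a stream ID."""
    if not 0 <= stream_id <= MAX_STREAM_ID:
        raise ValueError("stream ID must be in the range 0 to 2^62-1")
    initiator = Initiator(stream_id & 0x1)
    direction = Direction((stream_id & 0x2) >> 1)
    return initiator, direction
```

   With this sketch, classify_stream_id(4) reports a client-initiated,
   bidirectional stream: the second stream of that type after stream 0.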

             +------+----------------------------------+
             | Bits | Stream Type                      |
             +======+==================================+
             | 0x0  | Client-Initiated, Bidirectional  |
             +------+----------------------------------+
             | 0x1  | Server-Initiated, Bidirectional  |
             +------+----------------------------------+
             | 0x2  | Client-Initiated, Unidirectional |
             +------+----------------------------------+
             | 0x3  | Server-Initiated, Unidirectional |
             +------+----------------------------------+

                        Table 1: Stream ID Types

   Within each type, streams are created with numerically increasing
   stream IDs. A stream ID that is used out of order results in all
   streams of that type with lower-numbered stream IDs also being
   opened.

   The first bidirectional stream opened by the client has a stream ID
   of 0.

2.2.  Sending and Receiving Data

   STREAM frames (Section 19.8) encapsulate data sent by an application.
   An endpoint uses the Stream ID and Offset fields in STREAM frames to
   place data in order.

   Endpoints MUST be able to deliver stream data to an application as an
   ordered byte-stream. Delivering an ordered byte-stream requires that
   an endpoint buffer any data that is received out of order, up to the
   advertised flow control limit.

   QUIC makes no specific allowances for delivery of stream data out of
   order. However, implementations MAY choose to offer the ability to
   deliver data out of order to a receiving application.

   An endpoint could receive data for a stream at the same stream offset
   multiple times. Data that has already been received can be
   discarded. The data at a given offset MUST NOT change if it is sent
   multiple times; an endpoint MAY treat receipt of different data at
   the same offset within a stream as a connection error of type
   PROTOCOL_VIOLATION.

   Streams are an ordered byte-stream abstraction with no other
   structure visible to QUIC.
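
   The receive-side buffering and duplicate handling described in this
   section can be sketched as follows. This is a minimal illustration
   under stated assumptions, not a definitive implementation: the class
   StreamReassembler and its methods are hypothetical names, flow
   control is reduced to a single maximum offset, data is tracked one
   byte at a time for clarity rather than efficiency, and the sketch
   chooses to reject conflicting data, which an endpoint is permitted
   but not required to do.

```python
class StreamReassembler:
    """Buffers out-of-order stream data and releases it in offset order.

    Hypothetical helper for illustration; not part of any QUIC API.
    """

    def __init__(self, flow_control_limit: int) -> None:
        self.flow_control_limit = flow_control_limit  # highest offset the peer may use
        self.pending = {}     # stream offset -> buffered byte value
        self.read_offset = 0  # next offset to deliver to the application

    def receive(self, offset: int, data: bytes) -> None:
        """Record the payload of one STREAM frame at the given offset."""
        if offset + len(data) > self.flow_control_limit:
            raise ValueError("data exceeds the advertised flow control limit")
        for i, byte in enumerate(data):
            pos = offset + i
            if pos < self.read_offset:
                continue  # already delivered; duplicate data can be discarded
            known = self.pending.get(pos)
            if known is not None and known != byte:
                # Assumption: this sketch treats changed data as an error.
                raise ValueError("PROTOCOL_VIOLATION: data changed at offset %d" % pos)
            self.pending[pos] = byte

    def read(self) -> bytes:
        """Return all contiguous bytes available from the read offset."""
        out = bytearray()
        while self.read_offset in self.pending:
            out.append(self.pending.pop(self.read_offset))
            self.read_offset += 1
        return bytes(out)
```

   In this sketch, data received at offset 5 before data at offset 0 is
   held back by read() until the gap is filled, and the bytes returned
   carry no trace of the frames in which they arrived.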
   STREAM frame boundaries are not expected to be preserved when data is
   transmitted, retransmitted after packet loss, or delivered to the
   application at a receiver.

   An endpoint MUST NOT send data on any stream without ensuring that it
   is within the flow control limits set by its peer. Flow control is
   described in detail in Section 4.

2.3.  Stream Prioritization

   Stream multiplexing can have a significant effect on application
   performance if resources allocated to streams are correctly
   prioritized.

   QUIC does not provide a mechanism for exchanging prioritization
   information. Instead, it relies on receiving priority information
   from the application that uses QUIC.

   A QUIC implementation SHOULD provide ways in which an application can
   indicate the relative priority of streams. When deciding which
   streams to dedicate resources to, the implementation SHOULD use the
   information provided by the application.

2.4.  Required Operations on Streams

   There are certain operations which an application MUST be able to
   perform when interacting with QUIC streams. This document does not
   specify an API, but any implementation of this version of QUIC MUST
   expose the ability to perform the operations described in this
   section on a QUIC stream.

   On the sending part of a stream, application protocols need to be
   able to:

   *  write data, understanding when stream flow control credit
      (Section 4.1) has successfully been reserved to send the written
      data;

   *  end the stream (clean termination), resulting in a STREAM frame
      (Section 19.8) with the FIN bit set; and

   *  reset the stream (abrupt termination), resulting in a RESET_STREAM
      frame (Section 19.4), if the stream was not already in a terminal
      state.
591    On the receiving part of a stream, application protocols need to be
592    able to:

594    * read data; and

596    * abort reading of the stream and request closure, possibly
597      resulting in a STOP_SENDING frame (Section 19.5).

599    Applications also need to be informed of state changes on streams,
600    including when the peer has opened or reset a stream, when a peer
601    aborts reading on a stream, when new data is available, and when data
602    can or cannot be written to the stream due to flow control.

604 3.  Stream States

606    This section describes streams in terms of their send or receive
607    components. Two state machines are described: one for the streams on
608    which an endpoint transmits data (Section 3.1), and another for
609    streams on which an endpoint receives data (Section 3.2).

611    Unidirectional streams use the applicable state machine directly.
612    Bidirectional streams use both state machines. For the most part,
613    the use of these state machines is the same whether the stream is
614    unidirectional or bidirectional. The conditions for opening a stream
615    are slightly more complex for a bidirectional stream because the
616    opening of either send or receive sides causes the stream to open in
617    both directions.

619    An endpoint MUST open streams of the same type in increasing order of
620    stream ID.

622    Note: These states are largely informative. This document uses
623       stream states to describe rules for when and how different types
624       of frames can be sent and the reactions that are expected when
625       different types of frames are received. Though these state
626       machines are intended to be useful in implementing QUIC, these
627       states aren't intended to constrain implementations. An
628       implementation can define a different state machine as long as its
629       behavior is consistent with an implementation that implements
630       these states.

632 3.1.  Sending Stream States

634    Figure 2 shows the states for the part of a stream that sends data to
635    a peer.
637           o
638           | Create Stream (Sending)
639           | Peer Creates Bidirectional Stream
640           v
641       +-------+
642       | Ready | Send RESET_STREAM
643       |       |-----------------------.
644       +-------+                       |
645           |                           |
646           | Send STREAM /             |
647           |      STREAM_DATA_BLOCKED  |
648           |                           |
649           | Peer Creates              |
650           |      Bidirectional Stream |
651           v                           |
652       +-------+                       |
653       | Send  | Send RESET_STREAM     |
654       |       |---------------------->|
655       +-------+                       |
656           |                           |
657           | Send STREAM + FIN         |
658           v                           v
659       +-------+                   +-------+
660       | Data  | Send RESET_STREAM | Reset |
661       | Sent  |------------------>| Sent  |
662       +-------+                   +-------+
663           |                           |
664           | Recv All ACKs             | Recv ACK
665           v                           v
666       +-------+                   +-------+
667       | Data  |                   | Reset |
668       | Recvd |                   | Recvd |
669       +-------+                   +-------+

671    Figure 2: States for Sending Parts of Streams

673    The sending part of a stream that the endpoint initiates (types 0 and 2
674    for clients, 1 and 3 for servers) is opened by the application. The
675    "Ready" state represents a newly created stream that is able to
676    accept data from the application. Stream data might be buffered in
677    this state in preparation for sending.

679    Sending the first STREAM or STREAM_DATA_BLOCKED frame causes a
680    sending part of a stream to enter the "Send" state. An
681    implementation might choose to defer allocating a stream ID to a
682    stream until it sends the first STREAM frame and enters this state,
683    which can allow for better stream prioritization.

685    The sending part of a bidirectional stream initiated by a peer (type
686    0 for a server, type 1 for a client) starts in the "Ready" state when
687    the receiving part is created.

689    In the "Send" state, an endpoint transmits - and retransmits as
690    necessary - stream data in STREAM frames. The endpoint respects the
691    flow control limits set by its peer, and continues to accept and
692    process MAX_STREAM_DATA frames. An endpoint in the "Send" state
693    generates STREAM_DATA_BLOCKED frames if it is blocked from sending by
694    stream or connection flow control limits (Section 4.1).
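As an illustration, the edges of Figure 2 can be written down as a lookup table. The following is a minimal sketch, not a required implementation structure: the event names are hypothetical labels for the figure's arrows, and the two self-loops in the "Send" state are an assumption added so that repeated STREAM and STREAM_DATA_BLOCKED frames are accepted there.

```python
from enum import Enum


class SendState(Enum):
    READY = "Ready"
    SEND = "Send"
    DATA_SENT = "Data Sent"
    DATA_RECVD = "Data Recvd"    # terminal
    RESET_SENT = "Reset Sent"
    RESET_RECVD = "Reset Recvd"  # terminal


# (current state, event) -> next state. Event names are illustrative only.
TRANSITIONS = {
    (SendState.READY, "send_stream"): SendState.SEND,
    (SendState.READY, "send_stream_data_blocked"): SendState.SEND,
    (SendState.READY, "send_reset_stream"): SendState.RESET_SENT,
    (SendState.SEND, "send_stream"): SendState.SEND,
    (SendState.SEND, "send_stream_data_blocked"): SendState.SEND,
    (SendState.SEND, "send_stream_fin"): SendState.DATA_SENT,
    (SendState.SEND, "send_reset_stream"): SendState.RESET_SENT,
    (SendState.DATA_SENT, "recv_all_acks"): SendState.DATA_RECVD,
    (SendState.DATA_SENT, "send_reset_stream"): SendState.RESET_SENT,
    (SendState.RESET_SENT, "recv_reset_ack"): SendState.RESET_RECVD,
}


def step(state, event):
    """Return the next state, or raise if the figure has no such edge."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(
            f"{event!r} is not valid in state {state.value!r}") from None


def run(events, state=SendState.READY):
    """Apply a sequence of events starting from the given state."""
    for event in events:
        state = step(state, event)
    return state


clean = run(["send_stream", "send_stream_fin", "recv_all_acks"])
abrupt = run(["send_reset_stream", "recv_reset_ack"])
```

Because the terminal states have no outgoing entries, any attempt to send from "Data Recvd" or "Reset Recvd" is rejected by the table itself.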
696    After the application indicates that all stream data has been sent
697    and a STREAM frame containing the FIN bit is sent, the sending part
698    of the stream enters the "Data Sent" state. From this state, the
699    endpoint only retransmits stream data as necessary. The endpoint
700    does not need to check flow control limits or send
701    STREAM_DATA_BLOCKED frames for a stream in this state.
702    MAX_STREAM_DATA frames might be received until the peer receives the
703    final stream offset. The endpoint can safely ignore any
704    MAX_STREAM_DATA frames it receives from its peer for a stream in this
705    state.

707    Once all stream data has been successfully acknowledged, the sending
708    part of the stream enters the "Data Recvd" state, which is a terminal
709    state.

711    From any of the "Ready", "Send", or "Data Sent" states, an
712    application can signal that it wishes to abandon transmission of
713    stream data. Alternatively, an endpoint might receive a STOP_SENDING
714    frame from its peer. In either case, the endpoint sends a
715    RESET_STREAM frame, which causes the stream to enter the "Reset Sent"
716    state.

718    An endpoint MAY send a RESET_STREAM as the first frame that mentions
719    a stream; this causes the sending part of that stream to open and
720    then immediately transition to the "Reset Sent" state.

722    Once a packet containing a RESET_STREAM has been acknowledged, the
723    sending part of the stream enters the "Reset Recvd" state, which is a
724    terminal state.

726 3.2.  Receiving Stream States

728    Figure 3 shows the states for the part of a stream that receives data
729    from a peer. The states for a receiving part of a stream mirror only
730    some of the states of the sending part of the stream at the peer.
731    The receiving part of a stream does not track states on the sending
732    part that cannot be observed, such as the "Ready" state.
       Instead,
733    the receiving part of a stream tracks the delivery of data to the
734    application, some of which cannot be observed by the sender.

736           o
737           | Recv STREAM / STREAM_DATA_BLOCKED / RESET_STREAM
738           | Create Bidirectional Stream (Sending)
739           | Recv MAX_STREAM_DATA / STOP_SENDING (Bidirectional)
740           | Create Higher-Numbered Stream
741           v
742       +-------+
743       | Recv  | Recv RESET_STREAM
744       |       |-----------------------.
745       +-------+                       |
746           |                           |
747           | Recv STREAM + FIN         |
748           v                           |
749       +-------+                       |
750       | Size  | Recv RESET_STREAM     |
751       | Known |---------------------->|
752       +-------+                       |
753           |                           |
754           | Recv All Data             |
755           v                           v
756       +-------+ Recv RESET_STREAM +-------+
757       | Data  |--- (optional) --->| Reset |
758       | Recvd |  Recv All Data    | Recvd |
759       +-------+<-- (optional) ----+-------+
760           |                           |
761           | App Read All Data         | App Read RST
762           v                           v
763       +-------+                   +-------+
764       | Data  |                   | Reset |
765       | Read  |                   | Read  |
766       +-------+                   +-------+

768    Figure 3: States for Receiving Parts of Streams

770    The receiving part of a stream initiated by a peer (types 1 and 3 for
771    a client, or 0 and 2 for a server) is created when the first STREAM,
772    STREAM_DATA_BLOCKED, or RESET_STREAM is received for that stream.
773    For bidirectional streams initiated by a peer, receipt of a
774    MAX_STREAM_DATA or STOP_SENDING frame for the sending part of the
775    stream also creates the receiving part. The initial state for the
776    receiving part of a stream is "Recv".

778    The receiving part of a stream enters the "Recv" state when the
779    sending part of a bidirectional stream initiated by the endpoint
780    (type 0 for a client, type 1 for a server) enters the "Ready" state.

782    An endpoint opens a bidirectional stream when a MAX_STREAM_DATA or
783    STOP_SENDING frame is received from the peer for that stream.
784    Receiving a MAX_STREAM_DATA frame for an unopened stream indicates
785    that the remote peer has opened the stream and is providing flow
786    control credit.
       Receiving a STOP_SENDING frame for an unopened
787    stream indicates that the remote peer no longer wishes to receive
788    data on this stream. Either frame might arrive before a STREAM or
789    STREAM_DATA_BLOCKED frame if packets are lost or reordered.

791    Before a stream is created, all streams of the same type with lower-
792    numbered stream IDs MUST be created. This ensures that the creation
793    order for streams is consistent on both endpoints.

795    In the "Recv" state, the endpoint receives STREAM and
796    STREAM_DATA_BLOCKED frames. Incoming data is buffered and can be
797    reassembled into the correct order for delivery to the application.
798    As data is consumed by the application and buffer space becomes
799    available, the endpoint sends MAX_STREAM_DATA frames to allow the
800    peer to send more data.

802    When a STREAM frame with a FIN bit is received, the final size of the
803    stream is known; see Section 4.4. The receiving part of the stream
804    then enters the "Size Known" state. In this state, the endpoint no
805    longer needs to send MAX_STREAM_DATA frames; it only receives any
806    retransmissions of stream data.

808    Once all data for the stream has been received, the receiving part
809    enters the "Data Recvd" state. This might happen as a result of
810    receiving the same STREAM frame that causes the transition to "Size
811    Known". After all data has been received, any STREAM or
812    STREAM_DATA_BLOCKED frames for the stream can be discarded.

814    The "Data Recvd" state persists until stream data has been delivered
815    to the application. Once stream data has been delivered, the stream
816    enters the "Data Read" state, which is a terminal state.

818    Receiving a RESET_STREAM frame in the "Recv" or "Size Known" states
819    causes the stream to enter the "Reset Recvd" state. This might cause
820    the delivery of stream data to the application to be interrupted.
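The receive-side transitions described so far can be sketched as follows. This is an illustration under stated assumptions, not a prescribed design: the class and method names are hypothetical, received bytes are tracked as a plain set of offsets for clarity rather than efficiency, data beyond the final size is not checked, and the optional transitions between "Data Recvd" and "Reset Recvd" in Figure 3 are not taken.

```python
class RecvStream:
    """Hypothetical tracker for the receiving part of one stream."""

    def __init__(self):
        self.state = "Recv"
        self.final_size = None
        self.received = set()  # byte offsets seen so far (simple, not fast)

    def on_stream_frame(self, offset, length, fin=False):
        if self.state not in ("Recv", "Size Known"):
            return  # all data or a reset was already received; discard
        self.received.update(range(offset, offset + length))
        if fin:
            # The FIN bit fixes the final size of the stream.
            self.final_size = offset + length
            self.state = "Size Known"
        if self.final_size is not None and len(self.received) == self.final_size:
            # May be reached by the same frame that carried the FIN bit.
            self.state = "Data Recvd"

    def on_reset_stream(self):
        if self.state in ("Recv", "Size Known"):
            self.state = "Reset Recvd"

    def on_app_read_all(self):
        if self.state == "Data Recvd":
            self.state = "Data Read"  # terminal

    def on_app_read_reset(self):
        if self.state == "Reset Recvd":
            self.state = "Reset Read"  # terminal


reordered = RecvStream()
reordered.on_stream_frame(offset=5, length=5, fin=True)  # tail arrives first
state_after_fin = reordered.state
reordered.on_stream_frame(offset=0, length=5)            # gap is filled
state_after_fill = reordered.state
```

With reordering, the FIN-bearing frame moves the stream to "Size Known" and the later frame completes it; a single frame carrying all data and the FIN bit passes through both transitions at once.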
822    It is possible that all stream data is received when a RESET_STREAM
823    is received (that is, from the "Data Recvd" state). Similarly, it is
824    possible for remaining stream data to arrive after receiving a
825    RESET_STREAM frame (the "Reset Recvd" state). An implementation is
826    free to manage this situation as it chooses.

828    Sending RESET_STREAM means that an endpoint cannot guarantee delivery
829    of stream data; however, there is no requirement that stream data not
830    be delivered if a RESET_STREAM is received. An implementation MAY
831    interrupt delivery of stream data, discard any data that was not
832    consumed, and signal the receipt of the RESET_STREAM. A RESET_STREAM
833    signal might be suppressed or withheld if stream data is completely
834    received and is buffered to be read by the application. If the
835    RESET_STREAM is suppressed, the receiving part of the stream remains
836    in "Data Recvd".

838    Once the application receives the signal indicating that the stream
839    was reset, the receiving part of the stream transitions to the "Reset
840    Read" state, which is a terminal state.

842 3.3.  Permitted Frame Types

844    The sender of a stream sends just three frame types that affect the
845    state of a stream at either sender or receiver: STREAM
846    (Section 19.8), STREAM_DATA_BLOCKED (Section 19.13), and RESET_STREAM
847    (Section 19.4).

849    A sender MUST NOT send any of these frames from a terminal state
850    ("Data Recvd" or "Reset Recvd"). A sender MUST NOT send STREAM or
851    STREAM_DATA_BLOCKED after sending a RESET_STREAM; that is, in the
852    terminal states and in the "Reset Sent" state. A receiver could
853    receive any of these three frames in any state, due to the
854    possibility of delayed delivery of packets carrying them.

856    The receiver of a stream sends MAX_STREAM_DATA (Section 19.10) and
857    STOP_SENDING frames (Section 19.5).

859    The receiver only sends MAX_STREAM_DATA in the "Recv" state.
       A
860    receiver can send STOP_SENDING in any state where it has not received
861    a RESET_STREAM frame; that is, states other than "Reset Recvd" or
862    "Reset Read". However, there is little value in sending a
863    STOP_SENDING frame in the "Data Recvd" state, since all stream data
864    has been received. A sender could receive either of these two frames
865    in any state as a result of delayed delivery of packets.

867 3.4.  Bidirectional Stream States

869    A bidirectional stream is composed of sending and receiving parts.
870    Implementations may represent states of the bidirectional stream as
871    composites of sending and receiving stream states. The simplest
872    model presents the stream as "open" when either sending or receiving
873    parts are in a non-terminal state and "closed" when both sending and
874    receiving streams are in terminal states.

876    Table 2 shows a more complex mapping of bidirectional stream states
877    that loosely correspond to the stream states in HTTP/2 [HTTP2]. This
878    shows that multiple states on sending or receiving parts of streams
879    are mapped to the same composite state. Note that this is just one
880    possibility for such a mapping; this mapping requires that data is
881    acknowledged before the transition to a "closed" or "half-closed"
882    state.
884    +----------------------+----------------------+-----------------+
885    | Sending Part         | Receiving Part       | Composite State |
886    +======================+======================+=================+
887    | No Stream/Ready      | No Stream/Recv *1    | idle            |
888    +----------------------+----------------------+-----------------+
889    | Ready/Send/Data Sent | Recv/Size Known      | open            |
890    +----------------------+----------------------+-----------------+
891    | Ready/Send/Data Sent | Data Recvd/Data Read | half-closed     |
892    |                      |                      | (remote)        |
893    +----------------------+----------------------+-----------------+
894    | Ready/Send/Data Sent | Reset Recvd/Reset    | half-closed     |
895    |                      | Read                 | (remote)        |
896    +----------------------+----------------------+-----------------+
897    | Data Recvd           | Recv/Size Known      | half-closed     |
898    |                      |                      | (local)         |
899    +----------------------+----------------------+-----------------+
900    | Reset Sent/Reset     | Recv/Size Known      | half-closed     |
901    | Recvd                |                      | (local)         |
902    +----------------------+----------------------+-----------------+
903    | Reset Sent/Reset     | Data Recvd/Data Read | closed          |
904    | Recvd                |                      |                 |
905    +----------------------+----------------------+-----------------+
906    | Reset Sent/Reset     | Reset Recvd/Reset    | closed          |
907    | Recvd                | Read                 |                 |
908    +----------------------+----------------------+-----------------+
909    | Data Recvd           | Data Recvd/Data Read | closed          |
910    +----------------------+----------------------+-----------------+
911    | Data Recvd           | Reset Recvd/Reset    | closed          |
912    |                      | Read                 |                 |
913    +----------------------+----------------------+-----------------+

915    Table 2: Possible Mapping of Stream States to HTTP/2

917    Note (*1): A stream is considered "idle" if it has not yet been
918    created, or if the receiving part of the stream is in the "Recv"
919    state without yet having received any frames.

921 3.5.  Solicited State Transitions

923    If an application is no longer interested in the data it is receiving
924    on a stream, it can abort reading the stream and specify an
925    application error code.

927    If the stream is in the "Recv" or "Size Known" states, the transport
928    SHOULD signal this by sending a STOP_SENDING frame to prompt closure
929    of the stream in the opposite direction. This typically indicates
930    that the receiving application is no longer reading data it receives
931    from the stream, but it is not a guarantee that incoming data will be
932    ignored.

934    STREAM frames received after sending STOP_SENDING are still counted
935    toward connection and stream flow control, even though these frames
936    can be discarded upon receipt.

938    A STOP_SENDING frame requests that the receiving endpoint send a
939    RESET_STREAM frame. An endpoint that receives a STOP_SENDING frame
940    MUST send a RESET_STREAM frame if the stream is in the Ready or Send
941    state. If the stream is in the Data Sent state and any outstanding
942    data is declared lost, an endpoint SHOULD send a RESET_STREAM frame
943    in lieu of a retransmission.

945    An endpoint SHOULD copy the error code from the STOP_SENDING frame to
946    the RESET_STREAM frame it sends, but MAY use any application error
947    code. The endpoint that sends a STOP_SENDING frame MAY ignore the
948    error code carried in any RESET_STREAM frame it receives.

950    If the STOP_SENDING frame is received on a stream that is already in
951    the "Data Sent" state, an endpoint that wishes to cease
952    retransmission of previously-sent STREAM frames on that stream MUST
953    first send a RESET_STREAM frame.

955    STOP_SENDING SHOULD only be sent for a stream that has not been reset
956    by the peer. STOP_SENDING is most useful for streams in the "Recv"
957    or "Size Known" states.

959    An endpoint is expected to send another STOP_SENDING frame if a
960    packet containing a previous STOP_SENDING is lost.
       However, once
961    either all stream data or a RESET_STREAM frame has been received for
962    the stream - that is, the stream is in any state other than "Recv" or
963    "Size Known" - sending a STOP_SENDING frame is unnecessary.

965    An endpoint that wishes to terminate both directions of a
966    bidirectional stream can terminate one direction by sending a
967    RESET_STREAM, and it can encourage prompt termination in the opposite
968    direction by sending a STOP_SENDING frame.

970 4.  Flow Control

972    It is necessary to limit the amount of data that a receiver could
973    buffer, to prevent a fast sender from overwhelming a slow receiver,
974    or to prevent a malicious sender from consuming a large amount of
975    memory at a receiver. To enable a receiver to limit memory
976    commitment to a connection and to apply back pressure on the sender,
977    streams are flow controlled both individually and as an aggregate. A
978    QUIC receiver controls the maximum amount of data the sender can send
979    on a stream at any time, as described in Section 4.1 and Section 4.2.

981    Similarly, to limit concurrency within a connection, a QUIC endpoint
982    controls the maximum cumulative number of streams that its peer can
983    initiate, as described in Section 4.5.

985    Data sent in CRYPTO frames is not flow controlled in the same way as
986    stream data. QUIC relies on the cryptographic protocol
987    implementation to avoid excessive buffering of data; see [QUIC-TLS].
988    The implementation SHOULD provide an interface to QUIC to tell it
989    about its buffering limits so that there is not excessive buffering
990    at multiple layers.

992 4.1.  Data Flow Control

994    QUIC employs a credit-based flow-control scheme similar to that in
995    HTTP/2 [HTTP2], where a receiver advertises the number of bytes it is
996    prepared to receive on a given stream and for the entire connection.
997    This leads to two levels of data flow control in QUIC:

999    * Stream flow control, which prevents a single stream from consuming
1000      the entire receive buffer for a connection by limiting the amount
1001      of data that can be sent on any stream.

1003    * Connection flow control, which prevents senders from exceeding a
1004      receiver's buffer capacity for the connection, by limiting the
1005      total bytes of stream data sent in STREAM frames on all streams.

1007    A receiver sets initial credits for all streams by sending transport
1008    parameters during the handshake (Section 7.4). A receiver sends
1009    MAX_STREAM_DATA (Section 19.10) or MAX_DATA (Section 19.9) frames to
1010    the sender to advertise additional credit.

1012    A receiver advertises credit for a stream by sending a
1013    MAX_STREAM_DATA frame with the Stream ID field set appropriately. A
1014    MAX_STREAM_DATA frame indicates the maximum absolute byte offset of a
1015    stream. A receiver could use the current offset of data consumed to
1016    determine the flow control offset to be advertised. A receiver MAY
1017    send MAX_STREAM_DATA frames in multiple packets in order to make sure
1018    that the sender receives an update before running out of flow control
1019    credit, even if one of the packets is lost.

1021    A receiver advertises credit for a connection by sending a MAX_DATA
1022    frame, which indicates the maximum of the sum of the absolute byte
1023    offsets of all streams. A receiver maintains a cumulative sum of
1024    bytes received on all streams, which is used to check for flow
1025    control violations. A receiver might use a sum of bytes consumed on
1026    all streams to determine the maximum data limit to be advertised.

1028    A receiver can advertise a larger offset by sending MAX_STREAM_DATA
1029    or MAX_DATA frames. Once a receiver advertises an offset, it MAY
1030    advertise a smaller offset, but this has no effect.
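The interaction of the two levels of credit can be illustrated from the sender's side. The following is a minimal sketch with hypothetical class and method names; it counts only new stream bytes (not retransmissions), and it treats limits as absolute offsets that can only grow, so a smaller advertised value has no effect.

```python
class SenderFlowControl:
    """Hypothetical sender-side view of stream and connection credit."""

    def __init__(self, initial_max_data):
        self.max_data = initial_max_data  # connection limit from the peer
        self.data_sent = 0                # new stream bytes sent, all streams
        self.max_stream_data = {}         # stream ID -> per-stream limit
        self.stream_sent = {}             # stream ID -> bytes sent so far

    def open_stream(self, stream_id, initial_max_stream_data):
        self.max_stream_data[stream_id] = initial_max_stream_data
        self.stream_sent[stream_id] = 0

    def on_max_stream_data(self, stream_id, limit):
        # A limit that does not increase the current one is ignored.
        current = self.max_stream_data[stream_id]
        self.max_stream_data[stream_id] = max(current, limit)

    def on_max_data(self, limit):
        self.max_data = max(self.max_data, limit)

    def credit(self, stream_id):
        """Bytes that may be sent now: the tighter of the two limits."""
        stream_credit = (self.max_stream_data[stream_id]
                         - self.stream_sent[stream_id])
        connection_credit = self.max_data - self.data_sent
        return min(stream_credit, connection_credit)

    def record_sent(self, stream_id, length):
        if length > self.credit(stream_id):
            raise ValueError("would exceed flow control limits")
        self.stream_sent[stream_id] += length
        self.data_sent += length


fc = SenderFlowControl(initial_max_data=10)
fc.open_stream(0, initial_max_stream_data=8)
fc.open_stream(4, initial_max_stream_data=8)
fc.record_sent(0, 8)          # stream 0 uses all of its stream credit
blocked_by_connection = fc.credit(4)  # only 2 bytes of connection credit left
```

In this example stream 4 still has 8 bytes of stream credit, but the connection limit of 10 leaves room for only 2 more bytes until a larger MAX_DATA value arrives.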
1032    A receiver MUST close the connection with a FLOW_CONTROL_ERROR error
1033    (Section 11) if the sender violates the advertised connection or
1034    stream data limits.

1036    A sender MUST ignore any MAX_STREAM_DATA or MAX_DATA frames that do
1037    not increase flow control limits.

1039    If a sender runs out of flow control credit, it will be unable to
1040    send new data and is considered blocked. A sender SHOULD send a
1041    STREAM_DATA_BLOCKED or DATA_BLOCKED frame to indicate it has data to
1042    write but is blocked by flow control limits. If a sender is blocked
1043    for a period longer than the idle timeout (Section 10.2), the
1044    connection might be closed even when data is available for
1045    transmission. To keep the connection from closing, a sender that is
1046    flow control limited SHOULD periodically send a STREAM_DATA_BLOCKED
1047    or DATA_BLOCKED frame when it has no ack-eliciting packets in flight.

1049 4.2.  Flow Credit Increments

1051    Implementations decide when and how much credit to advertise in
1052    MAX_STREAM_DATA and MAX_DATA frames, but this section offers a few
1053    considerations.

1055    To avoid blocking a sender, a receiver can send a MAX_STREAM_DATA or
1056    MAX_DATA frame multiple times within a round trip or send it early
1057    enough to allow for recovery from loss of the frame.

1059    Control frames contribute to connection overhead. Therefore,
1060    frequently sending MAX_STREAM_DATA and MAX_DATA frames with small
1061    changes is undesirable. On the other hand, if updates are less
1062    frequent, larger increments to limits are necessary to avoid blocking
1063    a sender, requiring larger resource commitments at the receiver.
1064    There is a trade-off between resource commitment and overhead when
1065    determining how large a limit is advertised.
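One possible way to balance this trade-off is to advertise new credit only once the application has consumed a sizeable fraction of the window. The sketch below is purely illustrative: the half-window threshold is an assumption, not something this document specifies, and the class and method names are hypothetical.

```python
class ReceiveWindow:
    """Hypothetical policy for when to send MAX_STREAM_DATA."""

    def __init__(self, window):
        self.window = window      # bytes of buffer the receiver commits
        self.consumed = 0         # bytes read by the application
        self.advertised = window  # last limit sent (absolute offset)

    def on_consumed(self, length):
        """Return a new limit to advertise, or None if no update is due."""
        self.consumed += length
        new_limit = self.consumed + self.window
        # Assumed threshold: update once at least half a window has been
        # freed, so small reads do not each trigger a control frame.
        if new_limit - self.advertised >= self.window // 2:
            self.advertised = new_limit
            return new_limit
        return None


rw = ReceiveWindow(window=100)
first = rw.on_consumed(30)   # 30 bytes freed: below the threshold
second = rw.on_consumed(30)  # 60 bytes freed in total: update is sent
third = rw.on_consumed(10)   # 10 bytes freed since the last update
```

A larger threshold means fewer control frames but requires a larger window to keep the sender from blocking; a smaller one has the opposite effect.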
1067    A receiver can use an autotuning mechanism to tune the frequency and
1068    amount of advertised additional credit based on a round-trip time
1069    estimate and the rate at which the receiving application consumes
1070    data, similar to common TCP implementations. As an optimization, an
1071    endpoint could send frames related to flow control only when there
1072    are other frames to send or when a peer is blocked, ensuring that
1073    flow control does not cause extra packets to be sent.

1075    A blocked sender is not required to send STREAM_DATA_BLOCKED or
1076    DATA_BLOCKED frames. Therefore, a receiver MUST NOT wait for a
1077    STREAM_DATA_BLOCKED or DATA_BLOCKED frame before sending a
1078    MAX_STREAM_DATA or MAX_DATA frame; doing so could result in the
1079    sender being blocked for the rest of the connection. Even if the
1080    sender sends these frames, waiting for them will result in the sender
1081    being blocked for at least an entire round trip.

1083    When a sender receives credit after being blocked, it might be able
1084    to send a large amount of data in response, resulting in short-term
1085    congestion; see Section 6.9 in [QUIC-RECOVERY] for a discussion of
1086    how a sender can avoid this congestion.

1088 4.3.  Handling Stream Cancellation

1090    Endpoints need to eventually agree on the amount of flow control
1091    credit that has been consumed, to avoid either exceeding flow control
1092    limits or deadlocking.

1094    On receipt of a RESET_STREAM frame, an endpoint will tear down state
1095    for the matching stream and ignore further data arriving on that
1096    stream. Without the offset included in RESET_STREAM, the two
1097    endpoints could disagree on the number of bytes that count towards
1098    connection flow control.

1100    To remedy this issue, a RESET_STREAM frame (Section 19.4) includes
1101    the final size of data sent on the stream.
        On receiving a
1102    RESET_STREAM frame, a receiver definitively knows how many bytes were
1103    sent on that stream before the RESET_STREAM frame, and the receiver
1104    MUST use the final size of the stream to account for all bytes sent
1105    on the stream in its connection level flow controller.

1107    RESET_STREAM terminates one direction of a stream abruptly. For a
1108    bidirectional stream, RESET_STREAM has no effect on data flow in the
1109    opposite direction. Both endpoints MUST maintain flow control state
1110    for the stream in the unterminated direction until that direction
1111    enters a terminal state, or until one of the endpoints sends
1112    CONNECTION_CLOSE.

1114 4.4.  Stream Final Size

1116    The final size is the amount of flow control credit that is consumed
1117    by a stream. Assuming that every contiguous byte on the stream was
1118    sent once, the final size is the number of bytes sent. More
1119    generally, this is one higher than the offset of the byte with the
1120    largest offset sent on the stream, or zero if no bytes were sent.

1122    For a stream that is reset, the final size is carried explicitly in a
1123    RESET_STREAM frame. Otherwise, the final size is the offset plus the
1124    length of a STREAM frame marked with a FIN flag, or 0 in the case of
1125    incoming unidirectional streams.

1127    An endpoint will know the final size for a stream when the receiving
1128    part of the stream enters the "Size Known" or "Reset Recvd" state
1129    (Section 3).

1131    An endpoint MUST NOT send data on a stream at or beyond the final
1132    size.

1134    Once a final size for a stream is known, it cannot change. If a
1135    RESET_STREAM or STREAM frame is received indicating a change in the
1136    final size for the stream, an endpoint SHOULD respond with a
1137    FINAL_SIZE_ERROR error; see Section 11. A receiver SHOULD treat
1138    receipt of data at or beyond the final size as a FINAL_SIZE_ERROR
1139    error, even after a stream is closed.
        Generating these errors is not
1140    mandatory, but only because requiring that an endpoint generate these
1141    errors also means that the endpoint needs to maintain the final size
1142    state for closed streams, which could mean a significant state
1143    commitment.

1145 4.5.  Controlling Concurrency

1147    An endpoint limits the cumulative number of incoming streams a peer
1148    can open. Only streams with a stream ID less than (max_stream * 4 +
1149    initial_stream_id_for_type) can be opened; see Table 5. Initial
1150    limits are set in the transport parameters (see Section 18.2) and
1151    subsequently limits are advertised using MAX_STREAMS frames
1152    (Section 19.11). Separate limits apply to unidirectional and
1153    bidirectional streams.

1155    If a max_streams transport parameter or MAX_STREAMS frame is received
1156    with a value greater than 2^60, this would allow a maximum stream ID
1157    that cannot be expressed as a variable-length integer; see
1158    Section 16. If either is received, the connection MUST be closed
1159    immediately with a connection error of type STREAM_LIMIT_ERROR; see
1160    Section 10.3.

1162    Endpoints MUST NOT exceed the limit set by their peer. An endpoint
1163    that receives a frame with a stream ID exceeding the limit it has
1164    sent MUST treat this as a connection error of type STREAM_LIMIT_ERROR
1165    (Section 11).

1167    Once a receiver advertises a stream limit using the MAX_STREAMS
1168    frame, advertising a smaller limit has no effect. A receiver MUST
1169    ignore any MAX_STREAMS frame that does not increase the stream limit.

1171    As with stream and connection flow control, this document leaves when
1172    and how many streams to advertise to a peer via MAX_STREAMS to
1173    implementations. Implementations might choose to increase limits as
1174    streams close to keep the number of streams available to peers
1175    roughly consistent.
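As a worked example of the stream ID bound, the check can be sketched as below. The function and exception names are hypothetical. The sketch relies on the fact that the first stream ID of each type equals the two type bits from Table 1 (0, 1, 2, or 3), and it assumes the caller passes the limit that applies to that stream's type.

```python
MAX_STREAMS_BOUND = 2 ** 60


class StreamLimitError(Exception):
    """Stands in for a connection error of type STREAM_LIMIT_ERROR."""


def validate_max_streams(value):
    """Reject limits that would allow an unencodable stream ID."""
    if value > MAX_STREAMS_BOUND:
        raise StreamLimitError(f"max_streams {value} exceeds 2^60")
    return value


def stream_id_allowed(stream_id, max_streams):
    """True if stream_id is below max_streams * 4 + first ID of its type."""
    initial_stream_id_for_type = stream_id & 0x3  # type bits from Table 1
    return stream_id < max_streams * 4 + initial_stream_id_for_type


# Client-initiated bidirectional streams use IDs 0, 4, 8, ...
# With a limit of 2 streams, IDs 0 and 4 are allowed but 8 is not.
allowed = [sid for sid in (0, 4, 8) if stream_id_allowed(sid, max_streams=2)]
```

The same arithmetic applies to the other three types; for example, server-initiated unidirectional streams use IDs 3, 7, 11, and so on.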
1177    An endpoint that is unable to open a new stream due to the peer's
1178    limits SHOULD send a STREAMS_BLOCKED frame (Section 19.14). This
1179    signal is considered useful for debugging. An endpoint MUST NOT wait
1180    to receive this signal before advertising additional credit, since
1181    doing so will mean that the peer will be blocked for at least an
1182    entire round trip, and potentially for longer if the peer chooses to
1183    not send STREAMS_BLOCKED frames.

1185 5.  Connections

1187    QUIC's connection establishment combines version negotiation with the
1188    cryptographic and transport handshakes to reduce connection
1189    establishment latency, as described in Section 7. Once established,
1190    a connection may migrate to a different IP or port at either endpoint
1191    as described in Section 9. Finally, a connection may be terminated
1192    by either endpoint, as described in Section 10.

1194 5.1.  Connection ID

1196    Each connection possesses a set of connection identifiers, or
1197    connection IDs, each of which can identify the connection.
1198    Connection IDs are independently selected by endpoints; each endpoint
1199    selects the connection IDs that its peer uses.

1201    The primary function of a connection ID is to ensure that changes in
1202    addressing at lower protocol layers (UDP, IP) don't cause packets for
1203    a QUIC connection to be delivered to the wrong endpoint. Each
1204    endpoint selects connection IDs using an implementation-specific (and
1205    perhaps deployment-specific) method which will allow packets with
1206    that connection ID to be routed back to the endpoint and identified
1207    by the endpoint upon receipt.

1209    Connection IDs MUST NOT contain any information that can be used by
1210    an external observer (that is, one that does not cooperate with the
1211    issuer) to correlate them with other connection IDs for the same
1212    connection.
        As a trivial example, this means the same connection ID
1213    MUST NOT be issued more than once on the same connection.

1215    Packets with long headers include Source Connection ID and
1216    Destination Connection ID fields. These fields are used to set the
1217    connection IDs for new connections; see Section 7.2 for details.

1219    Packets with short headers (Section 17.3) only include the
1220    Destination Connection ID and omit the explicit length. The length
1221    of the Destination Connection ID field is expected to be known to
1222    endpoints. Endpoints using a load balancer that routes based on
1223    connection ID could agree with the load balancer on a fixed length
1224    for connection IDs, or agree on an encoding scheme. A fixed portion
1225    could encode an explicit length, which allows the entire connection
1226    ID to vary in length and still be used by the load balancer.

1228    A Version Negotiation (Section 17.2.1) packet echoes the connection
1229    IDs selected by the client, both to ensure correct routing toward the
1230    client and to allow the client to validate that the packet is in
1231    response to an Initial packet.

1233    A zero-length connection ID can be used when a connection ID is not
1234    needed to route to the correct endpoint. However, multiplexing
1235    connections on the same local IP address and port while using zero-
1236    length connection IDs will cause failures in the presence of peer
1237    connection migration, NAT rebinding, and client port reuse; and
1238    therefore MUST NOT be done unless an endpoint is certain that those
1239    protocol features are not in use.

1241    When an endpoint uses a non-zero-length connection ID, it needs to
1242    ensure that the peer has a supply of connection IDs from which to
1243    choose for packets sent to the endpoint. These connection IDs are
1244    supplied by the endpoint using the NEW_CONNECTION_ID frame
1245    (Section 19.15).

1247 5.1.1.  Issuing Connection IDs

1249    Each Connection ID has an associated sequence number to assist in
1250    deduplicating messages. The initial connection ID issued by an
1251    endpoint is sent in the Source Connection ID field of the long packet
1252    header (Section 17.2) during the handshake. The sequence number of
1253    the initial connection ID is 0. If the preferred_address transport
1254    parameter is sent, the sequence number of the supplied connection ID
1255    is 1.

1257    Additional connection IDs are communicated to the peer using
1258    NEW_CONNECTION_ID frames (Section 19.15). The sequence number on
1259    each newly-issued connection ID MUST increase by 1. The connection
1260    ID randomly selected by the client in the Initial packet and any
1261    connection ID provided by a Retry packet are not assigned sequence
1262    numbers unless a server opts to retain them as its initial connection
1263    ID.

1265    When an endpoint issues a connection ID, it MUST accept packets that
1266    carry this connection ID for the duration of the connection or until
1267    its peer invalidates the connection ID via a RETIRE_CONNECTION_ID
1268    frame (Section 19.16). Connection IDs that are issued and not
1269    retired are considered active; any active connection ID is valid for
1270    use with the current connection at any time, in any packet type.
1271    This includes the connection ID issued by the server via the
1272    preferred_address transport parameter.

1274    An endpoint SHOULD ensure that its peer has a sufficient number of
1275    available and unused connection IDs. Endpoints advertise the number
1276    of active connection IDs they are willing to maintain using the
1277    active_connection_id_limit transport parameter. An endpoint MUST NOT
1278    provide more connection IDs than the peer's limit.
An endpoint MAY send connection IDs that temporarily exceed a peer's
limit if the NEW_CONNECTION_ID frame also requires the retirement of
any excess, by including a sufficiently large value in the Retire
Prior To field.

A NEW_CONNECTION_ID frame might cause an endpoint to add some active
connection IDs and retire others based on the value of the Retire
Prior To field. After processing a NEW_CONNECTION_ID frame and
adding and retiring active connection IDs, if the number of active
connection IDs exceeds the value advertised in its
active_connection_id_limit transport parameter, an endpoint MUST
close the connection with an error of type CONNECTION_ID_LIMIT_ERROR.

An endpoint SHOULD supply a new connection ID when the peer retires a
connection ID. If an endpoint provided fewer connection IDs than the
peer's active_connection_id_limit, it MAY supply a new connection ID
when it receives a packet with a previously unused connection ID. An
endpoint MAY limit the frequency or the total number of connection
IDs issued for each connection to avoid the risk of running out of
connection IDs; see Section 10.4.2. An endpoint MAY also limit the
issuance of connection IDs to reduce the amount of per-path state it
maintains, such as path validation status, as its peer might interact
with it over as many paths as there are issued connection IDs.

An endpoint that initiates migration and requires non-zero-length
connection IDs SHOULD ensure that the pool of connection IDs
available to its peer allows the peer to use a new connection ID on
migration, as the peer will close the connection if the pool is
exhausted.

5.1.2. Consuming and Retiring Connection IDs

An endpoint can change the connection ID it uses for a peer to
another available one at any time during the connection.
An endpoint consumes connection IDs in response to a migrating peer;
see Section 9.5 for more.

An endpoint maintains a set of connection IDs received from its peer,
any of which it can use when sending packets. When the endpoint
wishes to remove a connection ID from use, it sends a
RETIRE_CONNECTION_ID frame to its peer. Sending a
RETIRE_CONNECTION_ID frame indicates that the connection ID will not
be used again and requests that the peer replace it with a new
connection ID using a NEW_CONNECTION_ID frame.

As discussed in Section 9.5, endpoints limit the use of a connection
ID to packets sent from a single local address to a single
destination address. Endpoints SHOULD retire connection IDs when
they are no longer actively using either the local or destination
address for which the connection ID was used.

An endpoint might need to stop accepting previously issued connection
IDs in certain circumstances. Such an endpoint can cause its peer to
retire connection IDs by sending a NEW_CONNECTION_ID frame with an
increased Retire Prior To field. The endpoint SHOULD continue to
accept the previously issued connection IDs until they are retired by
the peer. If the endpoint can no longer process the indicated
connection IDs, it MAY close the connection.

Upon receipt of an increased Retire Prior To field, the peer MUST
stop using the corresponding connection IDs and retire them with
RETIRE_CONNECTION_ID frames before adding the newly provided
connection ID to the set of active connection IDs. This ordering
allows an endpoint to replace all active connection IDs without the
possibility of a peer having no available connection IDs and without
exceeding the limit the peer sets in the active_connection_id_limit
transport parameter; see Section 18.2.
Failure to cease using the
connection IDs when requested can result in connection failures, as
the issuing endpoint might be unable to continue using the connection
IDs with the active connection.

An endpoint SHOULD limit the number of connection IDs it has retired
locally for which RETIRE_CONNECTION_ID frames have not yet been
acknowledged. An endpoint SHOULD allow for sending and tracking a
number of RETIRE_CONNECTION_ID frames of at least twice the value of
the active_connection_id_limit transport parameter. An endpoint MUST
NOT forget a connection ID without retiring it, though it MAY choose
to treat having connection IDs in need of retirement that exceed this
limit as a connection error of type CONNECTION_ID_LIMIT_ERROR.

Endpoints SHOULD NOT issue updates of the Retire Prior To field
before receiving RETIRE_CONNECTION_ID frames that retire all
connection IDs indicated by the previous Retire Prior To value.

5.2. Matching Packets to Connections

Incoming packets are classified on receipt. Packets can either be
associated with an existing connection, or - for servers -
potentially create a new connection.

Endpoints try to associate a packet with an existing connection. If
the packet has a non-zero-length Destination Connection ID
corresponding to an existing connection, QUIC processes that packet
accordingly. Note that more than one connection ID can be associated
with a connection; see Section 5.1.

If the Destination Connection ID is zero length and the addressing
information in the packet matches the addressing information the
endpoint uses to identify a connection with a zero-length connection
ID, QUIC processes the packet as part of that connection. An
endpoint can use just destination IP and port or both source and
destination addresses for identification, though this makes
connections fragile as described in Section 5.1.
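
The lookup described above can be sketched as follows. This is a
minimal, non-normative illustration: the ConnectionTable class, its
field names, and the use of plain strings as connection handles are
all hypothetical, and the sketch assumes an endpoint that identifies
connections with zero-length connection IDs by destination (local)
IP address and port only.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

Address = Tuple[str, int]  # (IP address, UDP port); hypothetical representation


@dataclass
class ConnectionTable:
    """Hypothetical lookup table; connections are named by plain strings here."""

    # Several connection IDs can map to the same connection (Section 5.1).
    by_connection_id: Dict[bytes, str] = field(default_factory=dict)
    # Used only for connections that chose a zero-length connection ID.
    by_local_address: Dict[Address, str] = field(default_factory=dict)

    def match(self, dcid: bytes, local_address: Address) -> Optional[str]:
        """Return the connection for a packet, or None if nothing matches."""
        if dcid:
            # Non-zero-length Destination Connection ID: match on the ID alone.
            return self.by_connection_id.get(dcid)
        # Zero-length Destination Connection ID: fall back to addressing
        # information (this sketch keys on destination IP and port only).
        return self.by_local_address.get(local_address)


table = ConnectionTable()
table.by_connection_id[b"\x01\x02\x03\x04"] = "conn-a"
table.by_connection_id[b"\x0a\x0b\x0c\x0d"] = "conn-a"  # second ID, same connection
table.by_local_address[("192.0.2.1", 4433)] = "conn-b"
```

A result of None corresponds to a packet that cannot be attributed to
an existing connection; what the endpoint does next is outside the
scope of this sketch.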

Endpoints can send a Stateless Reset (Section 10.4) for any packets
that cannot be attributed to an existing connection. A stateless
reset allows a peer to more quickly identify when a connection
becomes unusable.

Packets that are matched to an existing connection are discarded if
the packets are inconsistent with the state of that connection. For
example, packets are discarded if they indicate a different protocol
version than that of the connection, or if the removal of packet
protection is unsuccessful once the expected keys are available.

Invalid packets without packet protection, such as Initial, Retry, or
Version Negotiation, MAY be discarded. An endpoint MUST generate a
connection error if it commits changes to state before discovering an
error.

5.2.1. Client Packet Handling

Valid packets sent to clients always include a Destination Connection
ID that matches a value the client selects. Clients that choose to
receive zero-length connection IDs can use the local address and port
to identify a connection. Packets that don't match an existing
connection are discarded.

Due to packet reordering or loss, a client might receive packets for
a connection that are encrypted with a key it has not yet computed.
The client MAY drop these packets, or MAY buffer them in anticipation
of later packets that allow it to compute the key.

If a client receives a packet that has an unsupported version, it
MUST discard that packet.

5.2.2. Server Packet Handling

If a server receives a packet that has an unsupported version, but
the packet is sufficiently large to initiate a new connection for any
version supported by the server, it SHOULD send a Version Negotiation
packet as described in Section 6.1. Servers MAY rate control these
packets to avoid storms of Version Negotiation packets.
Otherwise, servers MUST drop packets that specify unsupported
versions.

The first packet for an unsupported version can use different
semantics and encodings for any version-specific field. In
particular, different packet protection keys might be used for
different versions. Servers that do not support a particular version
are unlikely to be able to decrypt the payload of the packet.
Servers SHOULD NOT attempt to decode or decrypt a packet from an
unknown version, but instead send a Version Negotiation packet,
provided that the packet is sufficiently long.

Packets with a supported version, or no version field, are matched to
a connection using the connection ID or - for packets with
zero-length connection IDs - the local address and port. If the packet
doesn't match an existing connection, the server continues below.

If the packet is an Initial packet fully conforming with the
specification, the server proceeds with the handshake (Section 7).
This commits the server to the version that the client selected.

If a server isn't currently accepting any new connections, it SHOULD
send an Initial packet containing a CONNECTION_CLOSE frame with error
code SERVER_BUSY.

If the packet is a 0-RTT packet, the server MAY buffer a limited
number of these packets in anticipation of a late-arriving Initial
packet. Clients are not able to send Handshake packets prior to
receiving a server response, so servers SHOULD ignore any such
packets.

Servers MUST drop incoming packets under all other circumstances.

5.2.3. Considerations for Simple Load Balancers

A server deployment could load balance among servers using only
source and destination IP addresses and ports. Changes to the
client's IP address or port could result in packets being forwarded
to the wrong server.
Such a server deployment could use one of the
following methods for connection continuity when a client's address
changes.

*  Servers could use an out-of-band mechanism to forward packets to
   the correct server based on Connection ID.

*  If servers can use a dedicated server IP address or port, other
   than the one that the client initially connects to, they could use
   the preferred_address transport parameter to request that clients
   move connections to that dedicated address. Note that clients
   could choose not to use the preferred address.

A server in a deployment that does not implement a solution to
maintain connection continuity during connection migration SHOULD
disallow migration using the disable_active_migration transport
parameter.

Server deployments that use this simple form of load balancing MUST
avoid the creation of a stateless reset oracle; see Section 21.9.

5.3. Life of a QUIC Connection

A QUIC connection is a stateful interaction between a client and
server, the primary purpose of which is to support the exchange of
data by an application protocol. Streams (Section 2) are the primary
means by which an application protocol exchanges information.

Each connection starts with a handshake phase, during which client
and server establish a shared secret using the cryptographic
handshake protocol [QUIC-TLS] and negotiate the application protocol.

The handshake (Section 7) confirms that both endpoints are willing to
communicate (Section 8.1) and establishes parameters for the
connection (Section 7.4).

An application protocol can also operate in a limited fashion during
the handshake phase. 0-RTT allows application messages to be sent by
a client before receiving any messages from the server. However,
0-RTT lacks certain key security guarantees.
In particular, there is
no protection against replay attacks in 0-RTT; see [QUIC-TLS].
Separately, a server can also send application data to a client
before it receives the final cryptographic handshake messages that
allow it to confirm the identity and liveness of the client. These
capabilities allow an application protocol to offer the option to
trade some security guarantees for reduced latency.

The use of connection IDs (Section 5.1) allows connections to migrate
to a new network path, both as a direct choice of an endpoint and
when forced by a change in a middlebox. Section 9 describes
mitigations for the security and privacy issues associated with
migration.

For connections that are no longer needed or desired, there are
several ways for a client and server to terminate a connection
(Section 10).

5.4. Required Operations on Connections

There are certain operations which an application MUST be able to
perform when interacting with the QUIC transport. This document does
not specify an API, but any implementation of this version of QUIC
MUST expose the ability to perform the operations described in this
section on a QUIC connection.

When implementing the client role, applications need to be able to:

*  open a connection, which begins the exchange described in
   Section 7;

*  enable 0-RTT when available; and

*  be informed when 0-RTT has been accepted or rejected by a server.

When implementing the server role, applications need to be able to:

*  listen for incoming connections, which prepares for the exchange
   described in Section 7;

*  if Early Data is supported, embed application-controlled data in
   the TLS resumption ticket sent to the client; and

*  if Early Data is supported, retrieve application-controlled data
   from the client's resumption ticket and enable rejecting Early
   Data based on that information.

In either role, applications need to be able to:

*  configure minimum values for the initial number of permitted
   streams of each type, as communicated in the transport parameters
   (Section 7.4);

*  control resource allocation of various types, including flow
   control and the number of permitted streams of each type;

*  identify whether the handshake has completed successfully or is
   still ongoing;

*  keep a connection from silently closing, either by generating PING
   frames (Section 19.2) or by requesting that the transport send
   additional frames before the idle timeout expires (Section 10.2);
   and

*  immediately close (Section 10.3) the connection.

6. Version Negotiation

Version negotiation ensures that client and server agree to a QUIC
version that is mutually supported. A server sends a Version
Negotiation packet in response to each packet that might initiate a
new connection; see Section 5.2 for details.

The size of the first packet sent by a client will determine whether
a server sends a Version Negotiation packet. Clients that support
multiple QUIC versions SHOULD pad the first packet they send to the
largest of the minimum packet sizes across all versions they support.
This ensures that the server responds if there is a mutually
supported version.

6.1. Sending Version Negotiation Packets

If the version selected by the client is not acceptable to the
server, the server responds with a Version Negotiation packet; see
Section 17.2.1. This includes a list of versions that the server
will accept. An endpoint MUST NOT send a Version Negotiation packet
in response to receiving a Version Negotiation packet.

This system allows a server to process packets with unsupported
versions without retaining state.
Though either the Initial packet
or the Version Negotiation packet that is sent in response could be
lost, the client will send new packets until it successfully receives
a response or it abandons the connection attempt. A client that
abandons the attempt discards all state for the connection and does
not send any more packets on the connection.

A server MAY limit the number of Version Negotiation packets it
sends. For instance, a server that is able to recognize packets as
0-RTT might choose not to send Version Negotiation packets in
response to 0-RTT packets with the expectation that it will
eventually receive an Initial packet.

6.2. Handling Version Negotiation Packets

Version Negotiation packets are designed to allow future versions of
QUIC to negotiate the version in use between endpoints. Future
versions of QUIC might change how implementations that support
multiple versions of QUIC react to Version Negotiation packets when
attempting to establish a connection using this version.

A client that supports only this version of QUIC MUST abandon the
current connection attempt if it receives a Version Negotiation
packet, with the following two exceptions. A client MUST discard any
Version Negotiation packet if it has received and successfully
processed any other packet, including an earlier Version Negotiation
packet. A client MUST discard a Version Negotiation packet that
lists the QUIC version selected by the client.

How to perform version negotiation is left as future work defined by
future versions of QUIC. In particular, that future work will ensure
robustness against version downgrade attacks; see Section 21.10.

6.2.1. Version Negotiation Between Draft Versions

[[RFC editor: please remove this section before publication.]]

When a draft implementation receives a Version Negotiation packet, it
MAY use it to attempt a new connection with one of the versions
listed in the packet, instead of abandoning the current connection
attempt; see Section 6.2.

The client MUST check that the Destination and Source Connection ID
fields match the Source and Destination Connection ID fields in a
packet that the client sent. If this check fails, the packet MUST be
discarded.

Once the Version Negotiation packet is determined to be valid, the
client then selects an acceptable protocol version from the list
provided by the server. The client then attempts to create a new
connection using that version. The new connection MUST use a new
random Destination Connection ID different from the one it had
previously sent.

Note that this mechanism does not protect against downgrade attacks
and MUST NOT be used outside of draft implementations.

6.3. Using Reserved Versions

For a server to use a new version in the future, clients need to
correctly handle unsupported versions. Some version numbers
(0x?a?a?a?a as defined in Section 15) are reserved for inclusion in
fields that contain version numbers.

Endpoints MAY add reserved versions to any field where unknown or
unsupported versions are ignored to test that a peer correctly
ignores the value. For instance, an endpoint could include a
reserved version in a Version Negotiation packet; see Section 17.2.1.
Endpoints MAY send packets with a reserved version to test that a
peer correctly discards the packet.

7. Cryptographic and Transport Handshake

QUIC relies on a combined cryptographic and transport handshake to
minimize connection establishment latency.
QUIC uses the CRYPTO
frame (Section 19.6) to transmit the cryptographic handshake. Version
0x00000001 of QUIC uses TLS as described in [QUIC-TLS]; a different
QUIC version number could indicate that a different cryptographic
handshake protocol is in use.

QUIC provides reliable, ordered delivery of the cryptographic
handshake data. QUIC packet protection is used to encrypt as much of
the handshake protocol as possible. The cryptographic handshake MUST
provide the following properties:

*  authenticated key exchange, where

   -  a server is always authenticated,

   -  a client is optionally authenticated,

   -  every connection produces distinct and unrelated keys,

   -  keying material is usable for packet protection for both 0-RTT
      and 1-RTT packets, and

   -  1-RTT keys have forward secrecy

*  authenticated values for transport parameters of both endpoints,
   and confidentiality protection for server transport parameters
   (see Section 7.4)

*  authenticated negotiation of an application protocol (TLS uses
   ALPN [RFC7301] for this purpose)

An endpoint can verify support for Explicit Congestion Notification
(ECN) in the first packets it sends, as described in Section 13.4.2.

The CRYPTO frame can be sent in different packet number spaces
(Section 12.3). The sequence numbers used by CRYPTO frames to ensure
ordered delivery of cryptographic handshake data start from zero in
each packet number space.

Endpoints MUST explicitly negotiate an application protocol. This
avoids situations where there is a disagreement about the protocol
that is in use.

7.1. Example Handshake Flows

Details of how TLS is integrated with QUIC are provided in
[QUIC-TLS], but some examples are provided here. An extension of
this exchange to support client address validation is shown in
Section 8.1.2.

Once any address validation exchanges are complete, the cryptographic
handshake is used to agree on cryptographic keys. The cryptographic
handshake is carried in Initial (Section 17.2.2) and Handshake
(Section 17.2.4) packets.

Figure 4 provides an overview of the 1-RTT handshake. Each line
shows a QUIC packet with the packet type and packet number shown
first, followed by the frames that are typically contained in those
packets. So, for instance, the first packet is of type Initial, with
packet number 0, and contains a CRYPTO frame carrying the
ClientHello.

Note that multiple QUIC packets - even of different packet types -
can be coalesced into a single UDP datagram; see Section 12.2. As a
result, this handshake may consist of as few as 4 UDP datagrams, or
any number more. For instance, the server's first flight contains
Initial packets, Handshake packets, and "0.5-RTT data" in 1-RTT
packets with a short header.

Client                                                  Server

Initial[0]: CRYPTO[CH] ->

                                 Initial[0]: CRYPTO[SH] ACK[0]
                       Handshake[0]: CRYPTO[EE, CERT, CV, FIN]
                                 <- 1-RTT[0]: STREAM[1, "..."]

Initial[1]: ACK[0]
Handshake[0]: CRYPTO[FIN], ACK[0]
1-RTT[0]: STREAM[0, "..."], ACK[0] ->

                                          Handshake[1]: ACK[0]
                         <- 1-RTT[1]: STREAM[3, "..."], ACK[0]

                Figure 4: Example 1-RTT Handshake

Figure 5 shows an example of a connection with a 0-RTT handshake and
a single packet of 0-RTT data. Note that as described in
Section 12.3, the server acknowledges 0-RTT data in 1-RTT packets,
and the client sends 1-RTT packets in the same packet number space.

Client                                                  Server

Initial[0]: CRYPTO[CH]
0-RTT[0]: STREAM[0, "..."] ->

                                 Initial[0]: CRYPTO[SH] ACK[0]
                                 Handshake[0]: CRYPTO[EE, FIN]
                          <- 1-RTT[0]: STREAM[1, "..."] ACK[0]

Initial[1]: ACK[0]
Handshake[0]: CRYPTO[FIN], ACK[0]
1-RTT[1]: STREAM[0, "..."] ACK[0] ->

                                          Handshake[1]: ACK[0]
                         <- 1-RTT[1]: STREAM[3, "..."], ACK[1]

                Figure 5: Example 0-RTT Handshake

7.2. Negotiating Connection IDs

A connection ID is used to ensure consistent routing of packets, as
described in Section 5.1. The long header contains two connection
IDs: the Destination Connection ID is chosen by the recipient of the
packet and is used to provide consistent routing; the Source
Connection ID is used to set the Destination Connection ID used by
the peer.

During the handshake, packets with the long header (Section 17.2) are
used to establish the connection IDs in each direction. Each
endpoint uses the Source Connection ID field to specify the
connection ID that is used in the Destination Connection ID field of
packets being sent to it. Upon receiving a packet, each endpoint
sets the Destination Connection ID it sends to match the value of the
Source Connection ID that it receives.

When an Initial packet is sent by a client that has not previously
received an Initial or Retry packet from the server, the client
populates the Destination Connection ID field with an unpredictable
value. This Destination Connection ID MUST be at least 8 bytes in
length. Until a packet is received from the server, the client MUST
use the same Destination Connection ID value on all packets in this
connection. This Destination Connection ID is used to determine
packet protection keys for Initial packets.

The client populates the Source Connection ID field with a value of
its choosing and sets the SCID Length field to indicate the length.
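
As an illustration of the client-side rules above, the following
sketch chooses the connection IDs for a client's first packets. It is
only a sketch under stated assumptions: the function and class names
are hypothetical, Python's secrets module stands in for whatever
source of unpredictable bytes an implementation uses, and the 20-byte
upper bound is an assumption taken from the long header definition
for this version (Section 17.2).

```python
import secrets

MIN_INITIAL_DCID_LEN = 8  # "MUST be at least 8 bytes in length"
MAX_CID_LEN = 20          # assumption: limit from the long header format


def new_initial_dcid(length: int = MIN_INITIAL_DCID_LEN) -> bytes:
    """Return an unpredictable Destination Connection ID for a first Initial."""
    if not MIN_INITIAL_DCID_LEN <= length <= MAX_CID_LEN:
        raise ValueError("initial DCID length must be between 8 and 20 bytes")
    return secrets.token_bytes(length)


class ClientConnectionIds:
    """Hypothetical holder for the IDs a client puts in its first packets."""

    def __init__(self, scid: bytes, dcid_length: int = MIN_INITIAL_DCID_LEN):
        if len(scid) > MAX_CID_LEN:
            raise ValueError("SCID too long")
        self.scid = scid              # value of the client's choosing
        self.scid_length = len(scid)  # carried in the SCID Length field
        # Chosen once; reused on every packet until the server responds.
        self.dcid = new_initial_dcid(dcid_length)


ids = ClientConnectionIds(scid=b"\xc1\xc1\xc1\xc1")
```

Keeping the Destination Connection ID as an attribute that is set once
reflects the requirement to reuse the same value until a packet is
received from the server.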

The first flight of 0-RTT packets uses the same Destination Connection
ID and Source Connection ID values as the client's first Initial
packet.

Upon first receiving an Initial or Retry packet from the server, the
client uses the Source Connection ID supplied by the server as the
Destination Connection ID for subsequent packets, including any 0-RTT
packets. This means that a client might have to change the
connection ID it sets in the Destination Connection ID field twice
during connection establishment: once in response to a Retry, and
once in response to an Initial packet from the server. Once a client
has received a valid Initial packet from the server, it MUST discard
any subsequent packet it receives with a different Source Connection
ID.

A client MUST change the Destination Connection ID it uses for
sending packets in response to only the first received Initial or
Retry packet. A server MUST set the Destination Connection ID it
uses for sending packets based on the first received Initial packet.
Any further changes to the Destination Connection ID are only
permitted if the values are taken from any received NEW_CONNECTION_ID
frames; if subsequent Initial packets include a different Source
Connection ID, they MUST be discarded. This avoids unpredictable
outcomes that might otherwise result from stateless processing of
multiple Initial packets with different Source Connection IDs.

The Destination Connection ID that an endpoint sends can change over
the lifetime of a connection, especially in response to connection
migration (Section 9); see Section 5.1.1 for details.

7.3. Authenticating Connection IDs

The choice each endpoint makes about connection IDs during the
handshake is authenticated by including all values in transport
parameters; see Section 7.4.
This ensures that all connection IDs
used for the handshake are also authenticated by the cryptographic
handshake.

Each endpoint includes the value of the Source Connection ID field
from the first Initial packet it sent in the
initial_source_connection_id transport parameter; see Section 18.2.
A server includes the Destination Connection ID field from the first
Initial packet it received from the client in the
original_destination_connection_id transport parameter; if the server
sent a Retry packet, this refers to the first Initial packet received
before sending the Retry packet. If it sends a Retry packet, a
server also includes the Source Connection ID field from the Retry
packet in the retry_source_connection_id transport parameter.

The values provided by a peer for these transport parameters MUST
match the values that an endpoint used in the Destination and Source
Connection ID fields of Initial packets that it sent. Including
connection ID values in transport parameters and verifying them
ensures that an attacker cannot influence the choice of
connection ID for a successful connection by injecting packets
carrying attacker-chosen connection IDs during the handshake.
An endpoint MUST treat any of the following as a connection error of
type PROTOCOL_VIOLATION:

*  absence of the initial_source_connection_id transport parameter
   from either endpoint,

*  absence of the original_destination_connection_id transport
   parameter from the server,

*  absence of the retry_source_connection_id transport parameter from
   the server after receiving a Retry packet,

*  presence of the retry_source_connection_id transport parameter
   when no Retry packet was received, or

*  a mismatch between values received from a peer in these transport
   parameters and the value sent in the corresponding Destination or
   Source Connection ID fields of Initial packets.

If a zero-length connection ID is selected, the corresponding
transport parameter is included with a zero-length value.

Figure 6 shows the connection IDs that are used in a complete
handshake. The exchange of Initial packets is shown, plus the later
exchange of 1-RTT packets that includes the connection ID established
during the handshake.

Client                                                  Server

Initial: DCID=S1, SCID=C1 ->
                                  <- Initial: DCID=C1, SCID=S3
                             ...
1-RTT: DCID=S3 ->
                                             <- 1-RTT: DCID=C1

          Figure 6: Use of Connection IDs in a Handshake

Figure 7 shows a similar handshake that includes a Retry packet.

Client                                                  Server

Initial: DCID=S1, SCID=C1 ->
                                    <- Retry: DCID=C1, SCID=S2
Initial: DCID=S2, SCID=C1 ->
                                  <- Initial: DCID=C1, SCID=S3
                             ...
1-RTT: DCID=S3 ->
                                             <- 1-RTT: DCID=C1

     Figure 7: Use of Connection IDs in a Handshake with Retry

For the handshakes in Figure 6 and Figure 7 the client sets the value
of the initial_source_connection_id transport parameter to "C1". In
Figure 7, the server sets original_destination_connection_id to "S1",
retry_source_connection_id to "S2", and initial_source_connection_id
to "S3".
In Figure 6, the server sets
original_destination_connection_id to "S1",
initial_source_connection_id to "S3", and does not include
retry_source_connection_id. Each endpoint validates the transport
parameters set by its peer, including the client confirming that
retry_source_connection_id is absent if no Retry packet was
processed.

7.4. Transport Parameters

During connection establishment, both endpoints make authenticated
declarations of their transport parameters. Endpoints are required
to comply with the restrictions implied by these parameters; the
description of each parameter includes rules for its handling.

Transport parameters are declarations that are made unilaterally by
each endpoint. Each endpoint can choose values for transport
parameters independent of the values chosen by its peer.

The encoding of the transport parameters is detailed in Section 18.

QUIC includes the encoded transport parameters in the cryptographic
handshake. Once the handshake completes, the transport parameters
declared by the peer are available. Each endpoint validates the
values provided by its peer.

Definitions for each of the defined transport parameters are included
in Section 18.2.

An endpoint MUST treat receipt of a transport parameter with an
invalid value as a connection error of type
TRANSPORT_PARAMETER_ERROR.

An endpoint MUST NOT send a parameter more than once in a given
transport parameters extension. An endpoint SHOULD treat receipt of
duplicate transport parameters as a connection error of type
TRANSPORT_PARAMETER_ERROR.

Endpoints use transport parameters to authenticate the negotiation of
connection IDs during the handshake; see Section 7.3.

7.4.1. Values of Transport Parameters for 0-RTT

Both endpoints store the value of the server transport parameters
from a connection and apply them to any 0-RTT packets that are sent
in subsequent connections to that peer, except for transport
parameters that are explicitly excluded. Remembered transport
parameters apply to the new connection until the handshake completes
and the client starts sending 1-RTT packets. Once the handshake
completes, the client uses the transport parameters established in
the handshake.

The definition of new transport parameters (Section 7.4.2) MUST
specify whether they MUST, MAY, or MUST NOT be stored for 0-RTT. A
client need not store a transport parameter it cannot process.

A client MUST NOT use remembered values for the following parameters:
ack_delay_exponent, max_ack_delay, initial_source_connection_id,
original_destination_connection_id, preferred_address,
retry_source_connection_id, and stateless_reset_token. The client
MUST use the server's new values in the handshake instead, and absent
new values from the server, the default value.

A client that attempts to send 0-RTT data MUST remember all other
transport parameters used by the server. The server can remember
these transport parameters, or store an integrity-protected copy of
the values in the ticket and recover the information when accepting
0-RTT data. A server uses the transport parameters in determining
whether to accept 0-RTT data.

If 0-RTT data is accepted by the server, the server MUST NOT reduce
any limits or alter any values that might be violated by the client
with its 0-RTT data. In particular, a server that accepts 0-RTT data
MUST NOT set values for the following parameters (Section 18.2) that
are smaller than the remembered value of the parameters.
1984 * active_connection_id_limit 1986 * initial_max_data 1988 * initial_max_stream_data_bidi_local 1990 * initial_max_stream_data_bidi_remote 1992 * initial_max_stream_data_uni 1994 * initial_max_streams_bidi 1996 * initial_max_streams_uni 1998 Omitting or setting a zero value for certain transport parameters can 1999 result in 0-RTT data being enabled, but not usable. The applicable 2000 subset of transport parameters that permit sending of application 2001 data SHOULD be set to non-zero values for 0-RTT. This includes 2002 initial_max_data and either initial_max_streams_bidi and 2003 initial_max_stream_data_bidi_remote, or initial_max_streams_uni and 2004 initial_max_stream_data_uni. 2006 A server MUST either reject 0-RTT data or abort a handshake if the 2007 implied values for transport parameters cannot be supported. 2009 When sending frames in 0-RTT packets, a client MUST only use 2010 remembered transport parameters; importantly, it MUST NOT use updated 2011 values that it learns from the server's updated transport parameters 2012 or from frames received in 1-RTT packets. Updated values of 2013 transport parameters from the handshake apply only to 1-RTT packets. 2014 For instance, flow control limits from remembered transport 2015 parameters apply to all 0-RTT packets even if those values are 2016 increased by the handshake or by frames sent in 1-RTT packets. A 2017 server MAY treat use of updated transport parameters in 0-RTT as a 2018 connection error of type PROTOCOL_VIOLATION. 2020 7.4.2. New Transport Parameters 2022 New transport parameters can be used to negotiate new protocol 2023 behavior. An endpoint MUST ignore transport parameters that it does 2024 not support. Absence of a transport parameter therefore disables any 2025 optional protocol feature that is negotiated using the parameter. As 2026 described in Section 18.1, some identifiers are reserved in order to 2027 exercise this requirement. 
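The handling of received parameters described here can be illustrated with a minimal sketch. It assumes the parameters have already been decoded into (name, value) pairs; the SUPPORTED set, the "example_extension" name, and the TransportParameterError class are hypothetical names chosen for illustration, not part of this document.

```python
class TransportParameterError(Exception):
    """Stands in for a connection error of type TRANSPORT_PARAMETER_ERROR."""


# Hypothetical subset of parameters that this endpoint implements.
SUPPORTED = {
    "initial_max_data",
    "initial_max_streams_bidi",
    "active_connection_id_limit",
}


def process_peer_parameters(received):
    """Filter decoded (name, value) pairs received from the peer.

    Duplicates are treated as an error (the SHOULD in Section 7.4);
    parameters this endpoint does not support are ignored (Section 7.4.2).
    """
    accepted = {}
    seen = set()
    for name, value in received:
        if name in seen:
            raise TransportParameterError("duplicate parameter: " + name)
        seen.add(name)
        if name not in SUPPORTED:
            continue  # unsupported: ignore, leaving the feature disabled
        accepted[name] = value
    return accepted
```

Because an unsupported parameter is dropped rather than rejected, a peer that advertises a new extension still completes the handshake with an endpoint that predates it.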
2029 New transport parameters can be registered according to the rules in 2030 Section 22.2. 2032 7.5. Cryptographic Message Buffering 2034 Implementations need to maintain a buffer of CRYPTO data received out 2035 of order. Because there is no flow control of CRYPTO frames, an 2036 endpoint could potentially force its peer to buffer an unbounded 2037 amount of data. 2039 Implementations MUST support buffering at least 4096 bytes of data 2040 received in CRYPTO frames out of order. Endpoints MAY choose to 2041 allow more data to be buffered during the handshake. A larger limit 2042 during the handshake could allow for larger keys or credentials to be 2043 exchanged. An endpoint's buffer size does not need to remain 2044 constant during the life of the connection. 2046 Being unable to buffer CRYPTO frames during the handshake can lead to 2047 a connection failure. If an endpoint's buffer is exceeded during the 2048 handshake, it can expand its buffer temporarily to complete the 2049 handshake. If an endpoint does not expand its buffer, it MUST close 2050 the connection with a CRYPTO_BUFFER_EXCEEDED error code. 2052 Once the handshake completes, if an endpoint is unable to buffer all 2053 data in a CRYPTO frame, it MAY discard that CRYPTO frame and all 2054 CRYPTO frames received in the future, or it MAY close the connection 2055 with a CRYPTO_BUFFER_EXCEEDED error code. Packets containing 2056 discarded CRYPTO frames MUST be acknowledged because the packet has 2057 been received and processed by the transport even though the CRYPTO 2058 frame was discarded. 2060 8. Address Validation 2062 Address validation is used by QUIC to avoid being used for a traffic 2063 amplification attack. In such an attack, a packet is sent to a 2064 server with spoofed source address information that identifies a 2065 victim. 
If a server generates more or larger packets in response to 2066 that packet, the attacker can use the server to send more data toward 2067 the victim than it would be able to send on its own. 2069 The primary defense against amplification attacks is verifying that an 2070 endpoint is able to receive packets at the transport address that it 2071 claims. Address validation is performed both during connection 2072 establishment (see Section 8.1) and during connection migration (see 2073 Section 8.2). 2075 8.1. Address Validation During Connection Establishment 2077 Connection establishment implicitly provides address validation for 2078 both endpoints. In particular, receipt of a packet protected with 2079 Handshake keys confirms that the client received the Initial packet 2080 from the server. Once the server has successfully processed a 2081 Handshake packet from the client, it can consider the client address 2082 to have been validated. 2084 Prior to validating the client address, servers MUST NOT send more 2085 than three times as many bytes as the number of bytes they have 2086 received. This limits the magnitude of any amplification attack that 2087 can be mounted using spoofed source addresses. For the purposes of 2088 avoiding amplification prior to address validation, servers MUST 2089 count all of the payload bytes received in datagrams that are 2090 uniquely attributed to a single connection. This includes datagrams 2091 that contain packets that are successfully processed and datagrams 2092 that contain packets that are all discarded. 2094 Clients MUST ensure that UDP datagrams containing Initial packets 2095 have UDP payloads of at least 1200 bytes, adding padding to packets 2096 in the datagram as necessary. Sending padded datagrams ensures that 2097 the server is not overly constrained by the amplification 2098 restriction.
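The three-times limit lends itself to a small byte-counting sketch. The class and method names below are assumptions made for illustration; only the factor of three and the rule that all attributed datagrams count come from the text above.

```python
class AmplificationLimiter:
    """Tracks the sending limit a server observes before address validation."""

    FACTOR = 3  # "three times as many bytes" as the server has received

    def __init__(self):
        self.bytes_received = 0
        self.bytes_sent = 0
        self.address_validated = False

    def on_datagram_received(self, payload_len):
        # Every datagram uniquely attributed to this connection counts,
        # including datagrams whose packets were all discarded.
        self.bytes_received += payload_len

    def on_datagram_sent(self, payload_len):
        self.bytes_sent += payload_len

    def on_address_validated(self):
        # For example, after a Handshake packet from the client is processed.
        self.address_validated = True

    def can_send(self, payload_len):
        if self.address_validated:
            return True  # only the congestion controller applies from here on
        budget = self.FACTOR * self.bytes_received
        return self.bytes_sent + payload_len <= budget
```

With a single 1200-byte client Initial received, this server may send up to 3600 bytes before validating the address, which is why padding the client's first datagram matters.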
2100 Loss of an Initial or Handshake packet from the server can cause a 2101 deadlock if the client does not send additional Initial or Handshake 2102 packets. A deadlock could occur when the server reaches its anti-amplification 2103 limit and the client has received acknowledgments for 2104 all the data it has sent. In this case, when the client has no 2105 reason to send additional packets, the server will be unable to send 2106 more data because it has not validated the client's address. To 2107 prevent this deadlock, clients MUST send a packet on a probe timeout 2108 (PTO, see Section 5.3 of [QUIC-RECOVERY]). Specifically, the client 2109 MUST send an Initial packet in a UDP datagram of at least 1200 bytes 2110 if it does not have Handshake keys, and otherwise send a Handshake 2111 packet. 2113 A server might wish to validate the client address before starting 2114 the cryptographic handshake. QUIC uses a token in the Initial packet 2115 to provide address validation prior to completing the handshake. 2116 This token is delivered to the client during connection establishment 2117 with a Retry packet (see Section 8.1.2) or in a previous connection 2118 using the NEW_TOKEN frame (see Section 8.1.3). 2120 In addition to sending limits imposed prior to address validation, 2121 servers are also constrained in what they can send by the limits set 2122 by the congestion controller. Clients are only constrained by the 2123 congestion controller. 2125 8.1.1. Token Construction 2127 A token sent in a NEW_TOKEN frame or a Retry packet MUST be 2128 constructed in a way that allows the server to identify how it was 2129 provided to a client. These tokens are carried in the same field, 2130 but require different handling from servers. 2132 8.1.2. Address Validation using Retry Packets 2134 Upon receiving the client's Initial packet, the server can request 2135 address validation by sending a Retry packet (Section 17.2.5) 2136 containing a token.
This token MUST be repeated by the client in all 2137 Initial packets it sends for that connection after it receives the 2138 Retry packet. In response to processing an Initial containing a 2139 token, a server can either abort the connection or permit it to 2140 proceed. 2142 As long as it is not possible for an attacker to generate a valid 2143 token for its own address (see Section 8.1.4) and the client is able 2144 to return that token, it proves to the server that it received the 2145 token. 2147 A server can also use a Retry packet to defer the state and 2148 processing costs of connection establishment. Requiring the server 2149 to provide a different connection ID, along with the 2150 original_destination_connection_id transport parameter defined in 2151 Section 18.2, forces the server to demonstrate that it, or an entity 2152 it cooperates with, received the original Initial packet from the 2153 client. Providing a different connection ID also grants a server 2154 some control over how subsequent packets are routed. This can be 2155 used to direct connections to a different server instance. 2157 If a server receives a client Initial that can be unprotected but 2158 contains an invalid Retry token, it knows the client will not accept 2159 another Retry token. The server can discard such a packet and allow 2160 the client to time out to detect handshake failure, but that could 2161 impose a significant latency penalty on the client. Instead, the 2162 server SHOULD immediately close (Section 10.3) the connection with an 2163 INVALID_TOKEN error. Note that a server has not established any 2164 state for the connection at this point and so does not enter the 2165 closing period. 2167 A flow showing the use of a Retry packet is shown in Figure 8. 
2169 Client                                                  Server
2171 Initial[0]: CRYPTO[CH] ->
2173                                                  <- Retry+Token
2175 Initial+Token[1]: CRYPTO[CH] ->
2177                                   Initial[0]: CRYPTO[SH] ACK[1]
2178                         Handshake[0]: CRYPTO[EE, CERT, CV, FIN]
2179                                   <- 1-RTT[0]: STREAM[1, "..."]
2181 Figure 8: Example Handshake with Retry
2183 8.1.3. Address Validation for Future Connections 2185 A server MAY provide clients with an address validation token during 2186 one connection that can be used on a subsequent connection. Address 2187 validation is especially important with 0-RTT because a server 2188 potentially sends a significant amount of data to a client in 2189 response to 0-RTT data. 2191 The server uses the NEW_TOKEN frame (Section 19.7) to provide the 2192 client with an address validation token that can be used to validate 2193 future connections. The client includes this token in Initial 2194 packets to provide address validation in a future connection. The 2195 client MUST include the token in all Initial packets it sends, unless 2196 a Retry replaces the token with a newer one. The client MUST NOT use 2197 the token provided in a Retry for future connections. Servers MAY 2198 discard any Initial packet that does not carry the expected token. 2200 Unlike the token that is created for a Retry packet, which is used 2201 immediately, the token sent in the NEW_TOKEN frame might be used 2202 after some period of time has passed. Thus, a token SHOULD have an 2203 expiration time, which could be either an explicit expiration time or 2204 an issued timestamp that can be used to dynamically calculate the 2205 expiration time. A server can store the expiration time or include 2206 it in an encrypted form in the token. 2208 A token issued with NEW_TOKEN MUST NOT include information that would 2209 allow values to be linked by an observer to the connection on which 2210 it was issued, unless the values are encrypted. For example, it 2211 cannot include the previous connection ID or addressing information.
2212 A server MUST ensure that every NEW_TOKEN frame it sends is unique 2213 across all clients, with the exception of those sent to repair losses 2214 of previously sent NEW_TOKEN frames. Information that allows the 2215 server to distinguish between tokens from Retry and NEW_TOKEN MAY be 2216 accessible to entities other than the server. 2218 It is unlikely that the client port number is the same on two 2219 different connections; validating the port is therefore unlikely to 2220 be successful. 2222 A token received in a NEW_TOKEN frame is applicable to any server 2223 that the connection is considered authoritative for (e.g., server 2224 names included in the certificate). When connecting to a server for 2225 which the client retains an applicable and unused token, it SHOULD 2226 include that token in the Token field of its Initial packet. 2227 Including a token might allow the server to validate the client 2228 address without an additional round trip. A client MUST NOT include 2229 a token that is not applicable to the server that it is connecting 2230 to, unless the client has the knowledge that the server that issued 2231 the token and the server the client is connecting to are jointly 2232 managing the tokens. A client MAY use a token from any previous 2233 connection to that server. 2235 A token allows a server to correlate activity between the connection 2236 where the token was issued and any connection where it is used. 2237 Clients that want to break continuity of identity with a server MAY 2238 discard tokens provided using the NEW_TOKEN frame. In comparison, a 2239 token obtained in a Retry packet MUST be used immediately during the 2240 connection attempt and cannot be used in subsequent connection 2241 attempts. 2243 A client SHOULD NOT reuse a NEW_TOKEN token for different connection 2244 attempts. Reusing a token allows connections to be linked by 2245 entities on the network path; see Section 9.5. 
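On the client side, the rules above suggest a per-server cache from which each NEW_TOKEN token is handed out at most once. The sketch below is one possible shape, not a prescribed design: the TokenCache name, keying by server name, and the choice to hand out the most recently stored token first are all assumptions made for illustration. Tokens received in Retry packets are deliberately never stored, since they cannot be used on later connections.

```python
from collections import defaultdict


class TokenCache:
    """Client-side store for tokens received in NEW_TOKEN frames."""

    def __init__(self):
        # Maps a server name (hypothetical key) to tokens, oldest first.
        self._tokens = defaultdict(list)

    def store_new_token(self, server_name, token):
        self._tokens[server_name].append(token)

    def take_token(self, server_name):
        """Return an unused token for this server, or None if there is none.

        The token is removed so that it is not reused on another
        connection attempt, which would let observers link connections.
        """
        tokens = self._tokens.get(server_name)
        if not tokens:
            return None
        return tokens.pop()  # assumption: newest token first

    def forget(self, server_name):
        # A client that wants to break continuity of identity with a
        # server can simply discard its tokens.
        self._tokens.pop(server_name, None)
```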
2247 Clients might receive multiple tokens on a single connection. Aside 2248 from preventing linkability, any token can be used in any connection 2249 attempt. Servers can send additional tokens either to enable address 2250 validation for multiple connection attempts or to replace older 2251 tokens that might become invalid. For a client, this ambiguity means 2252 that sending the most recent unused token is most likely to be 2253 effective. Though saving and using older tokens has no negative 2254 consequences, clients can regard older tokens as being less likely to be 2255 useful to the server for address validation. 2257 When a server receives an Initial packet with an address validation 2258 token, it MUST attempt to validate the token, unless it has already 2259 completed address validation. If the token is invalid, then the 2260 server SHOULD proceed as if the client did not have a validated 2261 address, including potentially sending a Retry. If the validation 2262 succeeds, the server SHOULD then allow the handshake to proceed. 2264 Note: The rationale for treating the client as unvalidated rather 2265 than discarding the packet is that the client might have received 2266 the token in a previous connection using the NEW_TOKEN frame, and 2267 if the server has lost state, it might be unable to validate the 2268 token at all, leading to connection failure if the packet is 2269 discarded. A server SHOULD encode tokens provided with NEW_TOKEN 2270 frames and Retry packets differently, and validate the latter more 2271 strictly. 2273 In a stateless design, a server can use encrypted and authenticated 2274 tokens to pass information to clients that the server can later 2275 recover and use to validate a client address. Tokens are not 2276 integrated into the cryptographic handshake and so they are not 2277 authenticated. For instance, a client might be able to reuse a 2278 token.
To avoid attacks that exploit this property, a server can 2279 limit its use of tokens to only the information needed to validate 2280 client addresses. 2282 Clients MAY use tokens obtained on one connection for any connection 2283 attempt using the same version. When selecting a token to use, 2284 clients do not need to consider other properties of the connection 2285 that is being attempted, including the choice of possible application 2286 protocols, session tickets, or other connection properties. 2288 Attackers could replay tokens to use servers as amplifiers in DDoS 2289 attacks. To protect against such attacks, servers SHOULD ensure that 2290 tokens sent in Retry packets are only accepted for a short time. 2291 Tokens that are provided in NEW_TOKEN frames (Section 19.7) need to 2292 be valid for longer, but SHOULD NOT be accepted multiple times in a 2293 short period. Servers are encouraged to allow tokens to be used only 2294 once, if possible. 2296 8.1.4. Address Validation Token Integrity 2298 An address validation token MUST be difficult to guess. Including a 2299 large enough random value in the token would be sufficient, but this 2300 depends on the server remembering the value it sends to clients. 2302 A token-based scheme allows the server to offload any state 2303 associated with validation to the client. For this design to work, 2304 the token MUST be covered by integrity protection against 2305 modification or falsification by clients. Without integrity 2306 protection, malicious clients could generate or guess values for 2307 tokens that would be accepted by the server. Only the server 2308 requires access to the integrity protection key for tokens. 2310 There is no need for a single well-defined format for the token 2311 because the server that generates the token also consumes it. 
Tokens 2312 sent in Retry packets SHOULD include information that allows the 2313 server to verify that the source IP address and port in client 2314 packets remain constant. 2316 Tokens sent in NEW_TOKEN frames MUST include information that allows 2317 the server to verify that the client IP address has not changed from 2318 when the token was issued. Servers can use tokens from NEW_TOKEN in 2319 deciding not to send a Retry packet, even if the client address has 2320 changed. If the client IP address has changed, the server MUST 2321 adhere to the anti-amplification limits found in Section 8.1. Note 2322 that in the presence of NAT, this requirement might be insufficient 2323 to protect other hosts that share the NAT from amplification attacks. 2325 Servers MUST ensure that replay of tokens is prevented or limited. 2326 For instance, servers might limit the time over which a token is 2327 accepted. Tokens provided in NEW_TOKEN frames might need to allow 2328 longer validity periods. Tokens MAY include additional information 2329 about clients to further narrow applicability or reuse. 2331 8.2. Path Validation 2333 Path validation is used during connection migration (see Section 9 2334 and Section 9.6) by the migrating endpoint to verify reachability of 2335 a peer from a new local address. In path validation, endpoints test 2336 reachability between a specific local address and a specific peer 2337 address, where an address is the two-tuple of IP address and port. 2339 Path validation tests that packets (PATH_CHALLENGE) can be both sent 2340 to and received (PATH_RESPONSE) from a peer on the path. 2341 Importantly, it validates that the packets received from the 2342 migrating endpoint do not carry a spoofed source address. 2344 Path validation can be used at any time by either endpoint. For 2345 instance, an endpoint might check that a peer is still in possession 2346 of its address after a period of quiescence.
2348 Path validation is not designed as a NAT traversal mechanism. Though 2349 the mechanism described here might be effective for the creation of 2350 NAT bindings that support NAT traversal, the expectation is that one 2351 or the other peer is able to receive packets without first having sent a 2352 packet on that path. Effective NAT traversal needs additional 2353 synchronization mechanisms that are not provided here. 2355 An endpoint MAY bundle PATH_CHALLENGE and PATH_RESPONSE frames that 2356 are used for path validation with other frames. In particular, an 2357 endpoint can pad a packet carrying a PATH_CHALLENGE for PMTU 2358 discovery, or an endpoint can bundle a PATH_RESPONSE with its own 2359 PATH_CHALLENGE. 2361 When probing a new path, an endpoint might want to ensure that its 2362 peer has an unused connection ID available for responses. The 2363 endpoint can send NEW_CONNECTION_ID and PATH_CHALLENGE frames in the 2364 same packet. This ensures that an unused connection ID will be 2365 available to the peer when sending a response. 2367 8.3. Initiating Path Validation 2369 To initiate path validation, an endpoint sends a PATH_CHALLENGE frame 2370 containing a random payload on the path to be validated. 2372 An endpoint MAY send multiple PATH_CHALLENGE frames to guard against 2373 packet loss. However, an endpoint SHOULD NOT send multiple 2374 PATH_CHALLENGE frames in a single packet. An endpoint SHOULD NOT 2375 send a PATH_CHALLENGE more frequently than it would an Initial 2376 packet, ensuring that connection migration places no more load on a new 2377 path than establishing a new connection. 2379 The endpoint MUST use unpredictable data in every PATH_CHALLENGE 2380 frame so that it can associate the peer's response with the 2381 corresponding PATH_CHALLENGE. 2383 8.4.
Path Validation Responses 2385 On receiving a PATH_CHALLENGE frame, an endpoint MUST respond 2386 immediately by echoing the data contained in the PATH_CHALLENGE frame 2387 in a PATH_RESPONSE frame. 2389 An endpoint MUST NOT send more than one PATH_RESPONSE frame in 2390 response to one PATH_CHALLENGE frame; see Section 13.3. The peer is 2391 expected to send more PATH_CHALLENGE frames as necessary to evoke 2392 additional PATH_RESPONSE frames. 2394 8.5. Successful Path Validation 2396 A new address is considered valid when a PATH_RESPONSE frame is 2397 received that contains the data that was sent in a previous 2398 PATH_CHALLENGE. Receipt of an acknowledgment for a packet containing 2399 a PATH_CHALLENGE frame is not adequate validation, since the 2400 acknowledgment can be spoofed by a malicious peer. 2402 Note that receipt on a different local address does not result in 2403 path validation failure, as it might be a result of a forwarded 2404 packet (see Section 9.3.3) or misrouting. It is possible that a 2405 valid PATH_RESPONSE might be received in the future. 2407 8.6. Failed Path Validation 2409 Path validation only fails when the endpoint attempting to validate 2410 the path abandons its attempt to validate the path. 2412 Endpoints SHOULD abandon path validation based on a timer. When 2413 setting this timer, implementations are cautioned that the new path 2414 could have a longer round-trip time than the original. A value of 2415 three times the larger of the current Probe Timeout (PTO) or the 2416 initial timeout (that is, 2*kInitialRtt) as defined in 2417 [QUIC-RECOVERY] is RECOMMENDED. That is: 2419 validation_timeout = max(3*PTO, 6*kInitialRtt) 2421 Note that the endpoint might receive packets containing other frames 2422 on the new path, but a PATH_RESPONSE frame with appropriate data is 2423 required for path validation to succeed. 2425 When an endpoint abandons path validation, it determines that the 2426 path is unusable. 
This does not necessarily imply a failure of the 2427 connection; endpoints can continue sending packets over other paths 2428 as appropriate. If no paths are available, an endpoint can wait for 2429 a new path to become available or close the connection. 2431 A path validation might be abandoned for other reasons besides 2432 failure. Primarily, this happens if a connection migration to a new 2433 path is initiated while a path validation on the old path is in 2434 progress. 2436 9. Connection Migration 2438 The use of a connection ID allows connections to survive changes to 2439 endpoint addresses (IP address and port), such as those caused by an 2440 endpoint migrating to a new network. This section describes the 2441 process by which an endpoint migrates to a new address. 2443 The design of QUIC relies on endpoints retaining a stable address for 2444 the duration of the handshake. An endpoint MUST NOT initiate 2445 connection migration before the handshake is confirmed, as defined in 2446 Section 4.1.2 of [QUIC-TLS]. 2448 An endpoint also MUST NOT send packets from a different local 2449 address, actively initiating migration, if the peer sent the 2450 disable_active_migration transport parameter during the handshake. 2451 An endpoint that has sent this transport parameter but detects that 2452 a peer has nonetheless migrated to a different network MUST either 2453 drop the incoming packets on that path without generating a stateless 2454 reset or proceed with path validation and allow the peer to migrate. 2455 Generating a stateless reset or closing the connection would allow 2456 third parties in the network to cause connections to close by 2457 spoofing or otherwise manipulating observed traffic. 2459 Not all changes of peer address are intentional, or active, 2460 migrations.
The peer could experience NAT rebinding: a change of 2461 address due to a middlebox, usually a NAT, allocating a new outgoing 2462 port or even a new outgoing IP address for a flow. An endpoint MUST 2463 perform path validation (Section 8.2) if it detects any change to a 2464 peer's address, unless it has previously validated that address. 2466 When an endpoint has no validated path on which to send packets, it 2467 MAY discard connection state. An endpoint capable of connection 2468 migration MAY wait for a new path to become available before 2469 discarding connection state. 2471 This document limits migration of connections to new client 2472 addresses, except as described in Section 9.6. Clients are 2473 responsible for initiating all migrations. Servers do not send non-probing 2474 packets (see Section 9.1) toward a client address until they 2475 see a non-probing packet from that address. If a client receives 2476 packets from an unknown server address, the client MUST discard these 2477 packets. 2479 9.1. Probing a New Path 2481 An endpoint MAY probe for peer reachability from a new local address 2482 using path validation (Section 8.2) prior to migrating the connection 2483 to the new local address. Failure of path validation simply means 2484 that the new path is not usable for this connection. Failure to 2485 validate a path does not cause the connection to end unless there are 2486 no valid alternative paths available. 2488 An endpoint uses a new connection ID for probes sent from a new local 2489 address; see Section 9.5 for further discussion. An endpoint that 2490 uses a new local address needs to ensure that at least one new 2491 connection ID is available at the peer. That can be achieved by 2492 including a NEW_CONNECTION_ID frame in the probe. 2494 Receiving a PATH_CHALLENGE frame from a peer indicates that the peer 2495 is probing for reachability on a path. An endpoint sends a 2496 PATH_RESPONSE in response as per Section 8.2.
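The challenge and response exchange that probing relies on (Sections 8.3 through 8.6) can be sketched as a small state machine. This is an illustrative outline only: the PathValidator class, its state names, and the use of seconds for time are assumptions, and the 8-byte challenge size is taken from the PATH_CHALLENGE frame definition rather than from this section. The PTO and kInitialRtt values are supplied by the caller, as they are defined in [QUIC-RECOVERY].

```python
import os


def validation_timeout(pto, k_initial_rtt):
    # validation_timeout = max(3*PTO, 6*kInitialRtt)
    return max(3 * pto, 6 * k_initial_rtt)


def respond_to_challenge(challenge_data):
    # A PATH_RESPONSE echoes the data carried in the PATH_CHALLENGE.
    return challenge_data


class PathValidator:
    """State for validating one path; times are in seconds."""

    def __init__(self, pto, k_initial_rtt, now):
        self.deadline = now + validation_timeout(pto, k_initial_rtt)
        self.outstanding = set()
        self.state = "validating"

    def new_challenge(self):
        # Unpredictable data lets a response be matched to its challenge.
        data = os.urandom(8)
        self.outstanding.add(data)
        return data

    def on_path_response(self, data):
        # Only an echo of previously sent data validates the path; an
        # acknowledgment of the PATH_CHALLENGE packet is not enough.
        if self.state == "validating" and data in self.outstanding:
            self.state = "validated"
        return self.state == "validated"

    def on_timer(self, now):
        # Validation fails only when the endpoint abandons the attempt.
        if self.state == "validating" and now >= self.deadline:
            self.state = "abandoned"
        return self.state
```

Sending several challenges over time is modeled by calling new_challenge() repeatedly; a response to any outstanding challenge completes validation.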
2498 PATH_CHALLENGE, PATH_RESPONSE, NEW_CONNECTION_ID, and PADDING frames 2499 are "probing frames", and all other frames are "non-probing frames". 2500 A packet containing only probing frames is a "probing packet", and a 2501 packet containing any other frame is a "non-probing packet". 2503 9.2. Initiating Connection Migration 2505 An endpoint can migrate a connection to a new local address by 2506 sending packets containing non-probing frames from that address. 2508 Each endpoint validates its peer's address during connection 2509 establishment. Therefore, a migrating endpoint can send to its peer 2510 knowing that the peer is willing to receive at the peer's current 2511 address. Thus an endpoint can migrate to a new local address without 2512 first validating the peer's address. 2514 When migrating, the new path might not support the endpoint's current 2515 sending rate. Therefore, the endpoint resets its congestion 2516 controller, as described in Section 9.4. 2518 The new path might not have the same ECN capability. Therefore, the 2519 endpoint verifies ECN capability as described in Section 13.4. 2521 Receiving acknowledgments for data sent on the new path serves as 2522 proof of the peer's reachability from the new address. Note that 2523 since acknowledgments can be received on any path, return 2524 reachability on the new path is not established. To establish return 2525 reachability on the new path, an endpoint MAY concurrently initiate 2526 path validation (Section 8.2) on the new path or it MAY choose to wait 2527 for the peer to send the next non-probing frame to its new address. 2529 9.3. Responding to Connection Migration 2531 Receiving a packet from a new peer address containing a non-probing 2532 frame indicates that the peer has migrated to that address.
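The distinction between probing and non-probing packets reduces to a membership test over the frames in a packet. In the sketch below, frames are represented by their type names as strings and addresses as (IP, port) tuples; both representations, and the function names, are assumptions made for illustration. The frame list is assumed to be non-empty.

```python
PROBING_FRAMES = frozenset(
    {"PATH_CHALLENGE", "PATH_RESPONSE", "NEW_CONNECTION_ID", "PADDING"}
)


def is_probing_packet(frame_types):
    """A probing packet contains only probing frames."""
    return all(frame in PROBING_FRAMES for frame in frame_types)


def indicates_migration(frame_types, source_addr, current_peer_addr):
    """A non-probing packet from a new peer address signals migration."""
    from_new_address = source_addr != current_peer_addr
    return from_new_address and not is_probing_packet(frame_types)
```

For example, a packet carrying PATH_CHALLENGE and PADDING from a new address is only a probe, whereas adding a STREAM frame to the same packet makes it a migration signal.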
2534 In response to such a packet, an endpoint MUST start sending 2535 subsequent packets to the new peer address and MUST initiate path 2536 validation (Section 8.2) to verify the peer's ownership of the 2537 unvalidated address. 2539 An endpoint MAY send data to an unvalidated peer address, but it MUST 2540 protect against potential attacks as described in Section 9.3.1 and 2541 Section 9.3.2. An endpoint MAY skip validation of a peer address if 2542 that address has been seen recently. In particular, if an endpoint 2543 returns to a previously-validated path after detecting some form of 2544 spurious migration, skipping address validation and restoring loss 2545 detection and congestion state can reduce the performance impact of 2546 the attack. 2548 An endpoint only changes the address that it sends packets to in 2549 response to the highest-numbered non-probing packet. This ensures 2550 that an endpoint does not send packets to an old peer address in the 2551 case that it receives reordered packets. 2553 After changing the address to which it sends non-probing packets, an 2554 endpoint could abandon any path validation for other addresses. 2556 Receiving a packet from a new peer address might be the result of a 2557 NAT rebinding at the peer. 2559 After verifying a new client address, the server SHOULD send new 2560 address validation tokens (Section 8) to the client. 2562 9.3.1. Peer Address Spoofing 2564 It is possible that a peer is spoofing its source address to cause an 2565 endpoint to send excessive amounts of data to an unwilling host. If 2566 the endpoint sends significantly more data than the spoofing peer, 2567 connection migration might be used to amplify the volume of data that 2568 an attacker can generate toward a victim. 2570 As described in Section 9.3, an endpoint is required to validate a 2571 peer's new address to confirm the peer's possession of the new 2572 address. 
Until a peer's address is deemed valid, an endpoint MUST 2573 limit the rate at which it sends data to this address. The endpoint 2574 MUST NOT send more than a minimum congestion window's worth of data 2575 per estimated round-trip time (kMinimumWindow, as defined in 2576 [QUIC-RECOVERY]). In the absence of this limit, an endpoint risks 2577 being used for a denial of service attack against an unsuspecting 2578 victim. Note that since the endpoint will not have any round-trip 2579 time measurements to this address, the estimate SHOULD be the default 2580 initial value; see [QUIC-RECOVERY]. 2582 If an endpoint skips validation of a peer address as described in 2583 Section 9.3, it does not need to limit its sending rate. 2585 9.3.2. On-Path Address Spoofing 2587 An on-path attacker could cause a spurious connection migration by 2588 copying and forwarding a packet with a spoofed address such that it 2589 arrives before the original packet. The packet with the spoofed 2590 address will be seen to come from a migrating connection, and the 2591 original packet will be seen as a duplicate and dropped. After a 2592 spurious migration, validation of the source address will fail 2593 because the entity at the source address does not have the necessary 2594 cryptographic keys to read or respond to the PATH_CHALLENGE frame 2595 that is sent to it even if it wanted to. 2597 To protect the connection from failing due to such a spurious 2598 migration, an endpoint MUST revert to using the last validated peer 2599 address when validation of a new peer address fails. 2601 If an endpoint has no state about the last validated peer address, it 2602 MUST close the connection silently by discarding all connection 2603 state. This results in new packets on the connection being handled 2604 generically. For instance, an endpoint MAY send a stateless reset in 2605 response to any further incoming packets. 
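The recovery rule for a failed validation can be written as a short decision function. This is a sketch under stated assumptions: the PeerPath record and its field names are hypothetical, and "closing silently" is modeled as setting a flag rather than tearing down real connection state.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Address = Tuple[str, int]  # (IP address, port)


@dataclass
class PeerPath:
    current_addr: Address
    last_validated_addr: Optional[Address]
    closed: bool = False


def on_validation_failed(path: PeerPath) -> None:
    """Apply the Section 9.3.2 rule after validation of a new address fails."""
    if path.last_validated_addr is None:
        # No address to fall back to: discard all connection state
        # without notifying the peer.
        path.closed = True
        return
    # Otherwise revert to the last validated peer address.
    path.current_addr = path.last_validated_addr
```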
2607 Note that receipt of packets with higher packet numbers from the 2608 legitimate peer address will trigger another connection migration. 2609 This will cause the validation of the address of the spurious 2610 migration to be abandoned. 2612 9.3.3. Off-Path Packet Forwarding 2614 An off-path attacker that can observe packets might forward copies of 2615 genuine packets to endpoints. If the copied packet arrives before 2616 the genuine packet, this will appear as a NAT rebinding. Any genuine 2617 packet will be discarded as a duplicate. If the attacker is able to 2618 continue forwarding packets, it might be able to cause migration to a 2619 path via the attacker. This places the attacker on path, giving it 2620 the ability to observe or drop all subsequent packets. 2622 Unlike the attack described in Section 9.3.2, the attacker can ensure 2623 that the new path is successfully validated. 2625 This style of attack relies on the attacker using a path that is 2626 approximately as fast as the direct path between endpoints. The 2627 attack is more reliable if relatively few packets are sent or if 2628 packet loss coincides with the attempted attack. 2630 A non-probing packet received on the original path that increases the 2631 maximum received packet number will cause the endpoint to move back 2632 to that path. Eliciting packets on this path increases the 2633 likelihood that the attack is unsuccessful. Therefore, mitigation of 2634 this attack relies on triggering the exchange of packets. 2636 In response to an apparent migration, endpoints MUST validate the 2637 previously active path using a PATH_CHALLENGE frame. This induces 2638 the sending of new packets on that path. If the path is no longer 2639 viable, the validation attempt will time out and fail; if the path is 2640 viable, but no longer desired, the validation will succeed, but only 2641 results in probing packets being sent on the path. 
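The rule that only the highest-numbered non-probing packet changes the address to which an endpoint sends, together with the requirement to validate the previously active path after an apparent migration, can be sketched as follows. All names here are hypothetical, and the sketch simplifies matters by always queuing both addresses for validation, even though an endpoint is allowed to skip validation of a recently seen address.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Addr = Tuple[str, int]  # (IP address, UDP port)


@dataclass
class PeerAddressTracker:
    """Tracks the peer address used for sending non-probing packets."""
    active: Addr
    largest_non_probing: int = -1
    pending_validation: List[Addr] = field(default_factory=list)

    def on_packet(self, src: Addr, packet_number: int, probing: bool) -> bool:
        """Process a received packet; return True if the active address changed."""
        # Probing packets and reordered (lower-numbered) packets never
        # change the address that the endpoint sends to.
        if probing or packet_number <= self.largest_non_probing:
            return False
        self.largest_non_probing = packet_number
        if src == self.active:
            return False
        previous, self.active = self.active, src
        # Validate the new address and also challenge the previous path,
        # which elicits packets on it and helps counter off-path forwarding.
        self.pending_validation = [src, previous]
        return True
```

In this sketch, a non-probing packet that arrives on the original path with a new largest packet number moves the endpoint back to that path, which is the behavior the mitigation in Section 9.3.3 relies on.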
2643 An endpoint that receives a PATH_CHALLENGE on an active path SHOULD 2644 send a non-probing packet in response. If the non-probing packet 2645 arrives before any copy made by an attacker, this results in the 2646 connection being migrated back to the original path. Any subsequent 2647 migration to another path restarts this entire process. 2649 This defense is imperfect, but this is not considered a serious 2650 problem. If the path via the attacker is reliably faster than the 2651 original path despite multiple attempts to use that original path, it 2652 is not possible to distinguish between an attack and an improvement in 2653 routing. 2655 An endpoint could also use heuristics to improve detection of this 2656 style of attack. For instance, NAT rebinding is improbable if 2657 packets were recently received on the old path; similarly, rebinding 2658 is rare on IPv6 paths. Endpoints can also look for duplicated 2659 packets. Conversely, a change in connection ID is more likely to 2660 indicate an intentional migration rather than an attack. 2662 9.4. Loss Detection and Congestion Control 2664 The capacity available on the new path might not be the same as the 2665 old path. Packets sent on the old path MUST NOT contribute to 2666 congestion control or RTT estimation for the new path. 2668 On confirming a peer's ownership of its new address, an endpoint MUST 2669 immediately reset the congestion controller and round-trip time 2670 estimator for the new path to initial values (see Sections A.3 and 2671 B.3 in [QUIC-RECOVERY]) unless it has knowledge that a previous send 2672 rate or round-trip time estimate is valid for the new path. For 2673 instance, an endpoint might infer that a change in only the client's 2674 port number is indicative of a NAT rebinding, meaning that the new 2675 path is likely to have similar bandwidth and round-trip time. 2676 However, this determination will be imperfect.
If the determination 2677 is incorrect, the congestion controller and the RTT estimator are 2678 expected to adapt to the new path. Generally, implementations are 2679 advised to be cautious when using previous values on a new path. 2681 There may be apparent reordering at the receiver when an endpoint 2682 sends data and probes from/to multiple addresses during the migration 2683 period, since the two resulting paths may have different round-trip 2684 times. A receiver of packets on multiple paths will still send ACK 2685 frames covering all received packets. 2687 While multiple paths might be used during connection migration, a 2688 single congestion control context and a single loss recovery context 2689 (as described in [QUIC-RECOVERY]) may be adequate. For instance, an 2690 endpoint might delay switching to a new congestion control context 2691 until it is confirmed that an old path is no longer needed (such as 2692 the case in Section 9.3.3). 2694 A sender can make exceptions for probe packets so that their loss 2695 detection is independent and does not unduly cause the congestion 2696 controller to reduce its sending rate. An endpoint might set a 2697 separate timer when a PATH_CHALLENGE is sent, which is cancelled if 2698 the corresponding PATH_RESPONSE is received. If the timer fires 2699 before the PATH_RESPONSE is received, the endpoint might send a new 2700 PATH_CHALLENGE, and restart the timer for a longer period of time. 2701 This timer SHOULD be set as described in Section 5.3 of 2702 [QUIC-RECOVERY] and MUST NOT be more aggressive. 2704 9.5. Privacy Implications of Connection Migration 2706 Using a stable connection ID on multiple network paths allows a 2707 passive observer to correlate activity between those paths. 
An 2708 endpoint that moves between networks might not wish to have their 2709 activity correlated by any entity other than their peer, so different 2710 connection IDs are used when sending from different local addresses, 2711 as discussed in Section 5.1. For this to be effective endpoints need 2712 to ensure that connection IDs they provide cannot be linked by any 2713 other entity. 2715 At any time, endpoints MAY change the Destination Connection ID they 2716 send to a value that has not been used on another path. 2718 An endpoint MUST NOT reuse a connection ID when sending from more 2719 than one local address, for example when initiating connection 2720 migration as described in Section 9.2 or when probing a new network 2721 path as described in Section 9.1. 2723 Similarly, an endpoint MUST NOT reuse a connection ID when sending to 2724 more than one destination address. Due to network changes outside 2725 the control of its peer, an endpoint might receive packets from a new 2726 source address with the same destination connection ID, in which case 2727 it MAY continue to use the current connection ID with the new remote 2728 address while still sending from the same local address. 2730 These requirements regarding connection ID reuse apply only to the 2731 sending of packets, as unintentional changes in path without a change 2732 in connection ID are possible. For example, after a period of 2733 network inactivity, NAT rebinding might cause packets to be sent on a 2734 new path when the client resumes sending. An endpoint responds to 2735 such an event as described in Section 9.3. 2737 Using different connection IDs for packets sent in both directions on 2738 each new network path eliminates the use of the connection ID for 2739 linking packets from the same connection across different network 2740 paths. Header protection ensures that packet numbers cannot be used 2741 to correlate activity. 
This does not prevent other properties of 2742 packets, such as timing and size, from being used to correlate 2743 activity. 2745 An endpoint SHOULD NOT initiate migration with a peer that has 2746 requested a zero-length connection ID, because traffic over the new 2747 path might be trivially linkable to traffic over the old one. If the 2748 server is able to route packets with a zero-length connection ID to 2749 the right connection, it means that the server is using other 2750 information to demultiplex packets. For example, a server might 2751 provide a unique address to every client, for instance using HTTP 2752 alternative services [ALTSVC]. Information that might allow correct 2753 routing of packets across multiple network paths will also allow 2754 activity on those paths to be linked by entities other than the peer. 2756 A client might wish to reduce linkability by employing a new 2757 connection ID and source UDP port when sending traffic after a period 2758 of inactivity. Changing the UDP port from which it sends packets at 2759 the same time might cause the packet to appear as a connection 2760 migration. This ensures that the mechanisms that support migration 2761 are exercised even for clients that don't experience NAT rebindings 2762 or genuine migrations. Changing port number can cause a peer to 2763 reset its congestion state (see Section 9.4), so the port SHOULD only 2764 be changed infrequently. 2766 An endpoint that exhausts available connection IDs cannot probe new 2767 paths or initiate migration, nor can it respond to probes or attempts 2768 by its peer to migrate. To ensure that migration is possible and 2769 packets sent on different paths cannot be correlated, endpoints 2770 SHOULD provide new connection IDs before peers migrate; see 2771 Section 5.1.1. If a peer might have exhausted available connection 2772 IDs, a migrating endpoint could include a NEW_CONNECTION_ID frame in 2773 all packets sent on a new network path. 2775 9.6. 
Server's Preferred Address 2777 QUIC allows servers to accept connections on one IP address and 2778 attempt to transfer these connections to a more preferred address 2779 shortly after the handshake. This is particularly useful when 2780 clients initially connect to an address shared by multiple servers 2781 but would prefer to use a unicast address to ensure connection 2782 stability. This section describes the protocol for migrating a 2783 connection to a preferred server address. 2785 Migrating a connection to a new server address mid-connection is left 2786 for future work. If a client receives packets from a new server 2787 address not indicated by the preferred_address transport parameter, 2788 the client SHOULD discard these packets. 2790 9.6.1. Communicating a Preferred Address 2792 A server conveys a preferred address by including the 2793 preferred_address transport parameter in the TLS handshake. 2795 Servers MAY communicate a preferred address of each address family 2796 (IPv4 and IPv6) to allow clients to pick the one most suited to their 2797 network attachment. 2799 Once the handshake is confirmed, the client SHOULD select one of the 2800 server's two preferred addresses and initiate path validation (see 2801 Section 8.2) of that address using any previously unused active 2802 connection ID, taken from either the preferred_address transport 2803 parameter or a NEW_CONNECTION_ID frame. 2805 If path validation succeeds, the client SHOULD immediately begin 2806 sending all future packets to the new server address using the new 2807 connection ID and discontinue use of the old server address. If path 2808 validation fails, the client MUST continue sending all future packets 2809 to the server's original IP address. 2811 9.6.2. Responding to Connection Migration 2813 A server might receive a packet addressed to its preferred IP address 2814 at any time after it accepts a connection.
If this packet contains a 2815 PATH_CHALLENGE frame, the server sends a PATH_RESPONSE frame as per 2816 Section 8.2. The server MUST send other non-probing frames from its 2817 original address until it receives a non-probing packet from the 2818 client at its preferred address and until the server has validated 2819 the new path. 2821 The server MUST probe on the path toward the client from its 2822 preferred address. This helps to guard against spurious migration 2823 initiated by an attacker. 2825 Once the server has completed its path validation and has received a 2826 non-probing packet with a new largest packet number on its preferred 2827 address, the server begins sending non-probing packets to the client 2828 exclusively from its preferred IP address. It SHOULD drop packets 2829 for this connection received on the old IP address, but MAY continue 2830 to process delayed packets. 2832 The addresses that a server provides in the preferred_address 2833 transport parameter are only valid for the connection in which they 2834 are provided. A client MUST NOT use these for other connections, 2835 including connections that are resumed from the current connection. 2837 9.6.3. Interaction of Client Migration and Preferred Address 2839 A client might need to perform a connection migration before it has 2840 migrated to the server's preferred address. In this case, the client 2841 SHOULD perform path validation to both the original and preferred 2842 server address from the client's new address concurrently. 2844 If path validation of the server's preferred address succeeds, the 2845 client MUST abandon validation of the original address and migrate to 2846 using the server's preferred address. If path validation of the 2847 server's preferred address fails but validation of the server's 2848 original address succeeds, the client MAY migrate to its new address 2849 and continue sending to the server's original address. 
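The outcome of the concurrent path validations described in Section 9.6.3 can be summarized as a small decision function. This is an illustrative sketch only; the enum, the function name, and the use of None to mean "validation still in progress" are assumptions made for the example.

```python
from enum import Enum
from typing import Optional


class Target(Enum):
    PREFERRED = "preferred"   # migrate to the server's preferred address
    ORIGINAL = "original"     # keep sending to the server's original address
    UNDECIDED = "undecided"   # no usable result yet


def choose_server_address(preferred_ok: Optional[bool],
                          original_ok: Optional[bool]) -> Target:
    """Pick the server address to use from the client's new address.

    Each argument is True (validation succeeded), False (validation
    failed), or None (validation has not completed).
    """
    if preferred_ok is True:
        # Success on the preferred address wins; validation of the
        # original address is abandoned.
        return Target.PREFERRED
    if preferred_ok is False and original_ok is True:
        # The client is permitted, but not required, to continue with
        # the original address from its new local address.
        return Target.ORIGINAL
    return Target.UNDECIDED
```

The sketch deliberately returns UNDECIDED while validation of the preferred address is still pending, since a later success on that path takes precedence over the original address.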
2851 If the connection to the server's preferred address is not from the 2852 same client address, the server MUST protect against potential 2853 attacks as described in Section 9.3.1 and Section 9.3.2. In addition 2854 to intentional simultaneous migration, this might also occur because 2855 the client's access network used a different NAT binding for the 2856 server's preferred address. 2858 Servers SHOULD initiate path validation to the client's new address 2859 upon receiving a probe packet from a different address. Servers MUST 2860 NOT send more than a minimum congestion window's worth of non-probing 2861 packets to the new address before path validation is complete. 2863 A client that migrates to a new address SHOULD use a preferred 2864 address from the same address family for the server. 2866 The connection ID provided in the preferred_address transport 2867 parameter is not specific to the addresses that are provided. This 2868 connection ID is provided to ensure that the client has a connection 2869 ID available for migration, but the client MAY use this connection ID 2870 on any path. 2872 9.7. Use of IPv6 Flow-Label and Migration 2874 Endpoints that send data using IPv6 SHOULD apply an IPv6 flow label 2875 in compliance with [RFC6437], unless the local API does not allow 2876 setting IPv6 flow labels. 2878 The IPv6 flow label SHOULD be a pseudo-random function of the source 2879 and destination addresses, source and destination UDP ports, and the 2880 destination CID. The flow label generation MUST be designed to 2881 minimize the chances of linkability with a previously used flow 2882 label, as this would enable correlating activity on multiple paths; 2883 see Section 9.5. 2885 A possible implementation is to compute the flow label as a 2886 cryptographic hash function of the source and destination addresses, 2887 source and destination UDP ports, destination CID, and a local 2888 secret. 2890 10. 
Connection Termination 2892 An established QUIC connection can be terminated in one of three 2893 ways: 2895 * idle timeout (Section 10.2) 2897 * immediate close (Section 10.3) 2899 * stateless reset (Section 10.4) 2901 An endpoint MAY discard connection state if it does not have a 2902 validated path on which it can send packets; see Section 8.2. 2904 10.1. Closing and Draining Connection States 2906 The closing and draining connection states exist to ensure that 2907 connections close cleanly and that delayed or reordered packets are 2908 properly discarded. These states SHOULD persist for at least three 2909 times the current Probe Timeout (PTO) interval as defined in 2910 [QUIC-RECOVERY]. 2912 An endpoint enters a closing period after initiating an immediate 2913 close; Section 10.3. While closing, an endpoint MUST NOT send 2914 packets unless they contain a CONNECTION_CLOSE frame; see 2915 Section 10.3 for details. An endpoint retains only enough 2916 information to generate a packet containing a CONNECTION_CLOSE frame 2917 and to identify packets as belonging to the connection. The 2918 endpoint's selected connection ID and the QUIC version are sufficient 2919 information to identify packets for a closing connection; an endpoint 2920 can discard all other connection state. An endpoint MAY retain 2921 packet protection keys for incoming packets to allow it to read and 2922 process a CONNECTION_CLOSE frame. 2924 The draining state is entered once an endpoint receives a signal that 2925 its peer is closing or draining. While otherwise identical to the 2926 closing state, an endpoint in the draining state MUST NOT send any 2927 packets. Retaining packet protection keys is unnecessary once a 2928 connection is in the draining state. 2930 An endpoint MAY transition from the closing period to the draining 2931 period if it receives a CONNECTION_CLOSE frame or stateless reset, 2932 both of which indicate that the peer is also closing or draining. 
2933 The draining period SHOULD end when the closing period would have 2934 ended. In other words, the endpoint can use the same end time, but 2935 cease retransmission of the closing packet. 2937 Disposing of connection state prior to the end of the closing or 2938 draining period could cause delayed or reordered packets to generate 2939 an unnecessary stateless reset. Endpoints that have some alternative 2940 means to ensure that late-arriving packets on the connection do not 2941 induce a response, such as those that are able to close the UDP 2942 socket, MAY use an abbreviated draining period which can allow for 2943 faster resource recovery. Servers that retain an open socket for 2944 accepting new connections SHOULD NOT exit the closing or draining 2945 period early. 2947 Once the closing or draining period has ended, an endpoint SHOULD 2948 discard all connection state. This results in new packets on the 2949 connection being handled generically. For instance, an endpoint MAY 2950 send a stateless reset in response to any further incoming packets. 2952 The draining and closing periods do not apply when a stateless reset 2953 (Section 10.4) is sent. 2955 An endpoint is not expected to handle key updates when it is closing 2956 or draining. A key update might prevent the endpoint from moving 2957 from the closing state to draining, but it otherwise has no impact. 2959 While in the closing period, an endpoint could receive packets from a 2960 new source address, indicating a connection migration; Section 9. An 2961 endpoint in the closing state MUST strictly limit the number of 2962 packets it sends to this new address until the address is validated; 2963 see Section 8.2. A server in the closing state MAY instead choose to 2964 discard packets received from a new source address. 2966 10.2. 
Idle Timeout 2968 If a max_idle_timeout is specified by either peer in its transport 2969 parameters (Section 18.2), the connection is silently closed and its 2970 state is discarded when it remains idle for longer than the minimum 2971 of both peers' max_idle_timeout values and three times the current 2972 Probe Timeout (PTO). 2974 Each endpoint advertises a max_idle_timeout, but the effective value 2975 at an endpoint is computed as the minimum of the two advertised 2976 values. By announcing a max_idle_timeout, an endpoint commits to 2977 initiating an immediate close (Section 10.3) if it abandons the 2978 connection prior to the effective value. 2980 An endpoint restarts its idle timer when a packet from its peer is 2981 received and processed successfully. An endpoint also restarts its 2982 idle timer when sending an ack-eliciting packet if no other 2983 ack-eliciting packets have been sent since last receiving and processing 2984 a packet. Restarting this timer when sending a packet ensures that 2985 connections are not closed after new activity is initiated. 2987 An endpoint might need to send ack-eliciting packets to avoid an idle 2988 timeout if it is expecting response data, but does not have or is 2989 unable to send application data. 2991 An endpoint that sends packets close to the effective timeout risks 2992 having them discarded at the peer, since the peer might enter its 2993 draining state before these packets arrive. An endpoint can send a 2994 PING or another ack-eliciting frame to test the connection for 2995 liveness if the peer could time out soon, such as within a PTO; see 2996 Section 6.6 of [QUIC-RECOVERY]. This is especially useful if any 2997 available application data cannot be safely retried. Note that the 2998 application determines what data is safe to retry. 3000 10.3. Immediate Close 3002 An endpoint sends a CONNECTION_CLOSE frame (Section 19.19) to 3003 terminate the connection immediately.
A CONNECTION_CLOSE frame 3004 causes all streams to immediately become closed; open streams can be 3005 assumed to be implicitly reset. 3007 After sending a CONNECTION_CLOSE frame, an endpoint immediately 3008 enters the closing state. 3010 During the closing period, an endpoint that sends a CONNECTION_CLOSE 3011 frame SHOULD respond to any incoming packet that can be decrypted 3012 with another packet containing a CONNECTION_CLOSE frame. Such an 3013 endpoint SHOULD limit the number of packets it generates containing a 3014 CONNECTION_CLOSE frame. For instance, an endpoint could wait for a 3015 progressively increasing number of received packets or amount of time 3016 before responding to a received packet. 3018 An endpoint is allowed to drop the packet protection keys when 3019 entering the closing period (Section 10.1) and send a packet 3020 containing a CONNECTION_CLOSE in response to any UDP datagram that is 3021 received. However, an endpoint without the packet protection keys 3022 cannot identify and discard invalid packets. To avoid creating an 3023 unwitting amplification attack, such endpoints MUST reduce the 3024 frequency with which it sends packets containing a CONNECTION_CLOSE 3025 frame. To minimize the state that an endpoint maintains for a 3026 closing connection, endpoints MAY send the exact same packet. 3028 Note: Allowing retransmission of a closing packet contradicts other 3029 advice in this document that recommends the creation of new packet 3030 numbers for every packet. Sending new packet numbers is primarily 3031 of advantage to loss recovery and congestion control, which are 3032 not expected to be relevant for a closed connection. 3033 Retransmitting the final packet requires less state. 3035 New packets from unverified addresses could be used to create an 3036 amplification attack; see Section 8. 
To avoid this, endpoints MUST 3037 either limit transmission of CONNECTION_CLOSE frames to validated 3038 addresses or drop packets without response if the response would be 3039 more than three times larger than the received packet. 3041 After receiving a CONNECTION_CLOSE frame, endpoints enter the 3042 draining state. An endpoint that receives a CONNECTION_CLOSE frame 3043 MAY send a single packet containing a CONNECTION_CLOSE frame before 3044 entering the draining state, using a CONNECTION_CLOSE frame and a 3045 NO_ERROR code if appropriate. An endpoint MUST NOT send further 3046 packets; doing so could result in a constant exchange of 3047 CONNECTION_CLOSE frames until the closing period on either peer 3048 ends. 3050 An immediate close can be used after an application protocol has 3051 arranged to close a connection. This might be after the application 3052 protocol negotiates a graceful shutdown. The application protocol 3053 exchanges whatever messages are needed to cause both endpoints 3054 to agree to close the connection, after which the application 3055 requests that the connection be closed. The application protocol can 3056 use a CONNECTION_CLOSE frame with an appropriate error code to signal 3057 closure. 3059 10.3.1. Immediate Close During the Handshake 3061 When sending CONNECTION_CLOSE, the goal is to ensure that the peer 3062 will process the frame. Generally, this means sending the frame in a 3063 packet with the highest level of packet protection to avoid the 3064 packet being discarded. After the handshake is confirmed (see 3065 Section 4.1.2 of [QUIC-TLS]), an endpoint MUST send any 3066 CONNECTION_CLOSE frames in a 1-RTT packet. However, prior to 3067 confirming the handshake, it is possible that more advanced packet 3068 protection keys are not available to the peer, so another 3069 CONNECTION_CLOSE frame MAY be sent in a packet that uses a lower 3070 packet protection level.
More specifically: 3072 * A client will always know whether the server has Handshake keys 3073 (see Section 17.2.2.1), but it is possible that a server does not 3074 know whether the client has Handshake keys. Under these 3075 circumstances, a server SHOULD send a CONNECTION_CLOSE frame in 3076 both Handshake and Initial packets to ensure that at least one of 3077 them is processable by the client. 3079 * A client that sends CONNECTION_CLOSE in a 0-RTT packet cannot be 3080 assured that the server has accepted 0-RTT, so sending a 3081 CONNECTION_CLOSE frame in an Initial packet makes it more likely 3082 that the server can receive the close signal, even if the 3083 application error code might not be received. 3085 * Prior to confirming the handshake, a peer might be unable to 3086 process 1-RTT packets, so an endpoint SHOULD send CONNECTION_CLOSE 3087 in both Handshake and 1-RTT packets. A server SHOULD also send 3088 CONNECTION_CLOSE in an Initial packet. 3090 Sending a CONNECTION_CLOSE of type 0x1d in an Initial or Handshake 3091 packet could expose application state or be used to alter application 3092 state. A CONNECTION_CLOSE of type 0x1d MUST be replaced by a 3093 CONNECTION_CLOSE of type 0x1c when sending the frame in Initial or 3094 Handshake packets. Otherwise, information about the application 3095 state might be revealed. Endpoints MUST clear the value of the 3096 Reason Phrase field and SHOULD use the APPLICATION_ERROR code when 3097 converting to a CONNECTION_CLOSE of type 0x1c. 3099 CONNECTION_CLOSE frames sent in multiple packet types can be 3100 coalesced into a single UDP datagram; see Section 12.2. 3102 An endpoint might send a CONNECTION_CLOSE frame in an Initial packet 3103 or in response to unauthenticated information received in Initial or 3104 Handshake packets. Such an immediate close might expose legitimate 3105 connections to a denial of service.
QUIC does not include defensive 3106 measures for on-path attacks during the handshake; see Section 21.1. 3107 However, at the cost of reducing feedback about errors for legitimate 3108 peers, some forms of denial of service can be made more difficult for 3109 an attacker if endpoints discard illegal packets rather than 3110 terminating a connection with CONNECTION_CLOSE. For this reason, 3111 endpoints MAY discard packets rather than immediately close if errors 3112 are detected in packets that lack authentication. 3114 An endpoint that has not established state, such as a server that 3115 detects an error in an Initial packet, does not enter the closing 3116 state. An endpoint that has no state for the connection does not 3117 enter a closing or draining period on sending a CONNECTION_CLOSE 3118 frame. 3120 10.4. Stateless Reset 3122 A stateless reset is provided as an option of last resort for an 3123 endpoint that does not have access to the state of a connection. A 3124 crash or outage might result in peers continuing to send data to an 3125 endpoint that is unable to properly continue the connection. An 3126 endpoint MAY send a stateless reset in response to receiving a packet 3127 that it cannot associate with an active connection. 3129 A stateless reset is not appropriate for signaling error conditions. 3130 An endpoint that wishes to communicate a fatal connection error MUST 3131 use a CONNECTION_CLOSE frame if it has sufficient state to do so. 3133 To support this process, a token is sent by endpoints. The token is 3134 carried in the Stateless Reset Token field of a NEW_CONNECTION_ID 3135 frame. Servers can also specify a stateless_reset_token transport 3136 parameter during the handshake that applies to the connection ID that 3137 it selected during the handshake; clients cannot use this transport 3138 parameter because their transport parameters don't have 3139 confidentiality protection. 
These tokens are protected by 3140 encryption, so only the client and server know their value. Tokens are 3141 invalidated when their associated connection ID is retired via a 3142 RETIRE_CONNECTION_ID frame (Section 19.16). 3144 An endpoint that receives packets that it cannot process sends a 3145 packet in the following layout: 3147 Stateless Reset { 3148 Fixed Bits (2) = 1, 3149 Unpredictable Bits (38..), 3150 Stateless Reset Token (128), 3151 } 3153 Figure 9: Stateless Reset Packet 3155 This design ensures that a stateless reset packet is - to the extent 3156 possible - indistinguishable from a regular packet with a short 3157 header. 3159 A stateless reset uses an entire UDP datagram, starting with the 3160 first two bits of the packet header. The remainder of the first byte 3161 and an arbitrary number of bytes following it are set to 3162 unpredictable values. The last 16 bytes of the datagram contain a 3163 Stateless Reset Token. 3165 To entities other than its intended recipient, a stateless reset will 3166 appear to be a packet with a short header. For the stateless reset 3167 to appear as a valid QUIC packet, the Unpredictable Bits field needs 3168 to include at least 38 bits of data (or 5 bytes, less the two fixed 3169 bits). 3171 A minimum size of 21 bytes does not guarantee that a stateless reset 3172 is difficult to distinguish from other packets if the recipient 3173 requires the use of a connection ID. To prevent a resulting 3174 stateless reset from being trivially distinguishable from a valid 3175 packet, all packets sent by an endpoint SHOULD be padded to at least 3176 22 bytes longer than the minimum connection ID that the endpoint 3177 might use. An endpoint that sends a stateless reset in response to 3178 a packet that is 43 bytes or less in length SHOULD send a stateless 3179 reset that is one byte shorter than the packet it responds to.
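The layout in Figure 9 and the sizing guidance above can be illustrated with a short sketch that assembles a stateless reset datagram. The function name and the preferred_len parameter are hypothetical, and two choices are assumptions of this example rather than requirements stated here: the reset is always made shorter than the packet that triggered it, and nothing is sent when the result would fall below the 21-byte minimum.

```python
import os
from typing import Optional

TOKEN_LEN = 16       # Stateless Reset Token (128 bits)
MIN_RESET_LEN = 21   # 5 bytes carrying the fixed and unpredictable bits, plus the token


def build_stateless_reset(token: bytes, trigger_len: int,
                          preferred_len: int = 48) -> Optional[bytes]:
    """Build a stateless reset datagram, or return None if none should be sent.

    trigger_len is the size of the packet being responded to;
    preferred_len is an illustrative default size for larger triggers.
    """
    if len(token) != TOKEN_LEN:
        raise ValueError("stateless reset token must be 16 bytes")
    if trigger_len <= 43:
        # Small triggers: respond with a datagram one byte shorter.
        length = trigger_len - 1
    else:
        length = min(preferred_len, trigger_len - 1)
    if length < MIN_RESET_LEN:
        return None  # too short to resemble a valid short header packet
    prefix = bytearray(os.urandom(length - TOKEN_LEN))
    # Fixed Bits (2) = 1: the two most significant bits are 0b01, so the
    # datagram looks like a short header packet; the other bits stay random.
    prefix[0] = 0x40 | (prefix[0] & 0x3F)
    return bytes(prefix) + token
```

Because the result is never larger than the triggering packet, the sketch also stays well within the limit that a stateless reset not be three times or more larger than the packet it answers.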
3181 These values assume that the Stateless Reset Token is the same as the 3182 minimum expansion of the packet protection AEAD. Additional 3183 unpredictable bytes are necessary if the endpoint could have 3184 negotiated a packet protection scheme with a larger minimum 3185 expansion. 3187 An endpoint MUST NOT send a stateless reset that is three times or 3188 more larger than the packet it receives to avoid being used for 3189 amplification. Section 10.4.3 describes additional limits on 3190 stateless reset size. 3192 Endpoints MUST discard packets that are too small to be valid QUIC 3193 packets. With the set of AEAD functions defined in [QUIC-TLS], 3194 packets that are smaller than 21 bytes are never valid. 3196 Endpoints MUST send stateless reset packets formatted as a packet 3197 with a short header. However, endpoints MUST treat any packet ending 3198 in a valid stateless reset token as a stateless reset, as other QUIC 3199 versions might allow the use of a long header. 3201 An endpoint MAY send a stateless reset in response to a packet with a 3202 long header. Sending a stateless reset is not effective prior to the 3203 stateless reset token being available to a peer. In this QUIC 3204 version, packets with a long header are only used during connection 3205 establishment. Because the stateless reset token is not available 3206 until connection establishment is complete or near completion, 3207 ignoring an unknown packet with a long header might be as effective 3208 as sending a stateless reset. 3210 An endpoint cannot determine the Source Connection ID from a packet 3211 with a short header, therefore it cannot set the Destination 3212 Connection ID in the stateless reset packet. The Destination 3213 Connection ID will therefore differ from the value used in previous 3214 packets. 
A random Destination Connection ID makes the connection ID
appear to be the result of moving to a new connection ID that was
provided using a NEW_CONNECTION_ID frame (Section 19.15).

Using a randomized connection ID results in two problems:

* The packet might not reach the peer. If the Destination
  Connection ID is critical for routing toward the peer, then this
  packet could be incorrectly routed. This might also trigger
  another Stateless Reset in response; see Section 10.4.3. A
  Stateless Reset that is not correctly routed is an ineffective
  error detection and recovery mechanism. In this case, endpoints
  will need to rely on other methods - such as timers - to detect
  that the connection has failed.

* The randomly generated connection ID can be used by entities other
  than the peer to identify this as a potential stateless reset. An
  endpoint that occasionally uses different connection IDs might
  introduce some uncertainty about this.

This stateless reset design is specific to QUIC version 1. An
endpoint that supports multiple versions of QUIC needs to generate a
stateless reset that will be accepted by peers that support any
version that the endpoint might support (or might have supported
prior to losing state). Designers of new versions of QUIC need to be
aware of this and either reuse this design, or use a portion of the
packet other than the last 16 bytes for carrying data.

10.4.1. Detecting a Stateless Reset

An endpoint detects a potential stateless reset using the trailing 16
bytes of the UDP datagram. An endpoint remembers all Stateless Reset
Tokens associated with the connection IDs and remote addresses for
datagrams it has recently sent.
This includes Stateless Reset Tokens
from NEW_CONNECTION_ID frames and the server's transport parameters
but excludes Stateless Reset Tokens associated with connection IDs
that are either unused or retired. The endpoint identifies a
received datagram as a stateless reset by comparing the last 16 bytes
of the datagram with all Stateless Reset Tokens associated with the
remote address on which the datagram was received.

This comparison can be performed for every inbound datagram.
Endpoints MAY skip this check if any packet from a datagram is
successfully processed. However, the comparison MUST be performed
when the first packet in an incoming datagram either cannot be
associated with a connection, or cannot be decrypted.

An endpoint MUST NOT check for any Stateless Reset Tokens associated
with connection IDs it has not used or for connection IDs that have
been retired.

When comparing a datagram to Stateless Reset Token values, endpoints
MUST perform the comparison without leaking information about the
value of the token. For example, performing this comparison in
constant time protects the value of individual Stateless Reset Tokens
from information leakage through timing side channels. Another
approach would be to store and compare the transformed values of
Stateless Reset Tokens instead of the raw token values, where the
transformation is defined as a cryptographically-secure pseudo-random
function using a secret key (e.g., block cipher, HMAC [RFC2104]). An
endpoint is not expected to protect information about whether a
packet was successfully decrypted, or the number of valid Stateless
Reset Tokens.

If the last 16 bytes of the datagram are identical in value to a
Stateless Reset Token, the endpoint MUST enter the draining period
and not send any further packets on this connection.

10.4.2. Calculating a Stateless Reset Token

The stateless reset token MUST be difficult to guess. In order to
create a Stateless Reset Token, an endpoint could randomly generate
[RFC4086] a secret for every connection that it creates. However,
this presents a coordination problem when there are multiple
instances in a cluster or a storage problem for an endpoint that
might lose state. Stateless reset specifically exists to handle the
case where state is lost, so this approach is suboptimal.

A single static key can be used across all connections to the same
endpoint by generating the proof using a second iteration of a
preimage-resistant function that takes a static key and the
connection ID chosen by the endpoint (see Section 5.1) as input. An
endpoint could use HMAC [RFC2104] (for example, HMAC(static_key,
connection_id)) or HKDF [RFC5869] (for example, using the static key
as input keying material, with the connection ID as salt). The
output of this function is truncated to 16 bytes to produce the
Stateless Reset Token for that connection.

An endpoint that loses state can use the same method to generate a
valid Stateless Reset Token. The connection ID comes from the packet
that the endpoint receives.

This design relies on the peer always sending a connection ID in its
packets so that the endpoint can use the connection ID from a packet
to reset the connection. An endpoint that uses this design MUST
either use the same connection ID length for all connections or
encode the length of the connection ID such that it can be recovered
without state. In addition, it cannot provide a zero-length
connection ID.

Revealing the Stateless Reset Token allows any entity to terminate
the connection, so a value can only be used once.
This method for
choosing the Stateless Reset Token means that the combination of
connection ID and static key MUST NOT be used for another connection.
A denial of service attack is possible if the same connection ID is
used by instances that share a static key, or if an attacker can
cause a packet to be routed to an instance that has no state but the
same static key; see Section 21.9. A connection ID from a connection
that is reset by revealing the Stateless Reset Token MUST NOT be
reused for new connections at nodes that share a static key.

The same Stateless Reset Token MUST NOT be used for multiple
connection IDs. Endpoints are not required to compare new values
against all previous values, but a duplicate value MAY be treated as
a connection error of type PROTOCOL_VIOLATION.

Note that Stateless Reset packets do not have any cryptographic
protection.

10.4.3. Looping

The design of a Stateless Reset is such that without knowing the
stateless reset token it is indistinguishable from a valid packet.
For instance, if a server sends a Stateless Reset to another server
it might receive another Stateless Reset in response, which could
lead to an infinite exchange.

An endpoint MUST ensure that every Stateless Reset that it sends is
smaller than the packet which triggered it, unless it maintains state
sufficient to prevent looping. In the event of a loop, this results
in packets eventually being too small to trigger a response.

An endpoint can remember the number of Stateless Reset packets that
it has sent and stop generating new Stateless Reset packets once a
limit is reached. Using separate limits for different remote
addresses will ensure that Stateless Reset packets can be used to
close connections when other peers or connections have exhausted
limits.
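
The following Python sketch shows one way a responder without
connection state might combine the token derivation of Section 10.4.2
with the loop-prevention rules above. It is illustrative only. The use
of SHA-256 inside HMAC, the per-address limit, and all names
(derive_reset_token, reset_length_for, ResetLimiter) are assumptions
made for this example, not requirements of this document.

```python
import hashlib
import hmac
from typing import Dict, Optional

TOKEN_LEN = 16      # tokens are truncated to 16 bytes
MIN_RESET_LEN = 21  # smaller datagrams are never valid QUIC packets


def derive_reset_token(static_key: bytes, connection_id: bytes) -> bytes:
    """HMAC(static_key, connection_id), truncated to 16 bytes.

    SHA-256 is an assumption; the text only asks for HMAC or HKDF.
    """
    mac = hmac.new(static_key, connection_id, hashlib.sha256)
    return mac.digest()[:TOKEN_LEN]


def reset_length_for(trigger_len: int) -> Optional[int]:
    """Pick a reset size strictly smaller than the triggering packet.

    Returns None when the reset would be too small to be valid, which
    is what makes a reset loop die out.
    """
    length = trigger_len - 1
    return length if length >= MIN_RESET_LEN else None


class ResetLimiter:
    """Count resets per remote address and stop at a limit."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.sent: Dict[str, int] = {}

    def allow(self, remote_addr: str) -> bool:
        count = self.sent.get(remote_addr, 0)
        if count >= self.limit:
            return False
        self.sent[remote_addr] = count + 1
        return True
```

Because each reset is one byte shorter than its trigger, two such
responders exchanging resets stop once the size would drop below 21
bytes.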

Reducing the size of a Stateless Reset below 41 bytes means that the
packet could reveal to an observer that it is a Stateless Reset,
depending upon the length of the peer's connection IDs. Conversely,
refusing to send a Stateless Reset in response to a small packet
might result in Stateless Reset not being useful in detecting cases
of broken connections where only very small packets are sent; such
failures might only be detected by other means, such as timers.

11. Error Handling

An endpoint that detects an error SHOULD signal the existence of that
error to its peer. Both transport-level and application-level errors
can affect an entire connection; see Section 11.1. Only
application-level errors can be isolated to a single stream; see
Section 11.2.

The most appropriate error code (Section 20) SHOULD be included in
the frame that signals the error. Where this specification
identifies error conditions, it also identifies the error code that
is used; though these are worded as requirements, different
implementation strategies might lead to different errors being
reported. In particular, an endpoint MAY use any applicable error
code when it detects an error condition; a generic error code (such
as PROTOCOL_VIOLATION or INTERNAL_ERROR) can always be used in place
of specific error codes.

A stateless reset (Section 10.4) is not suitable for any error that
can be signaled with a CONNECTION_CLOSE or RESET_STREAM frame. A
stateless reset MUST NOT be used by an endpoint that has the state
necessary to send a frame on the connection.

11.1. Connection Errors

Errors that result in the connection being unusable, such as an
obvious violation of protocol semantics or corruption of state that
affects an entire connection, MUST be signaled using a
CONNECTION_CLOSE frame (Section 19.19).
An endpoint MAY close the
connection in this manner even if the error only affects a single
stream.

Application protocols can signal application-specific protocol errors
using the application-specific variant of the CONNECTION_CLOSE frame.
Errors that are specific to the transport, including all those
described in this document, are carried in the QUIC-specific variant
of the CONNECTION_CLOSE frame.

A CONNECTION_CLOSE frame could be sent in a packet that is lost. An
endpoint SHOULD be prepared to retransmit a packet containing a
CONNECTION_CLOSE frame if it receives more packets on a terminated
connection. Limiting the number of retransmissions and the time over
which this final packet is sent limits the effort expended on
terminated connections.

An endpoint that chooses not to retransmit packets containing a
CONNECTION_CLOSE frame risks a peer missing the first such packet.
The only mechanism available to an endpoint that continues to receive
data for a terminated connection is to use the stateless reset
process (Section 10.4).

11.2. Stream Errors

If an application-level error affects a single stream, but otherwise
leaves the connection in a recoverable state, the endpoint can send a
RESET_STREAM frame (Section 19.4) with an appropriate error code to
terminate just the affected stream.

Resetting a stream without the involvement of the application
protocol could cause the application protocol to enter an
unrecoverable state. RESET_STREAM MUST only be instigated by the
application protocol that uses QUIC.

The semantics of the application error code carried in RESET_STREAM
are defined by the application protocol. Only the application
protocol is able to cause a stream to be terminated.
A local
instance of the application protocol uses a direct API call and a
remote instance uses the STOP_SENDING frame, which triggers an
automatic RESET_STREAM.

Application protocols SHOULD define rules for handling streams that
are prematurely cancelled by either endpoint.

12. Packets and Frames

QUIC endpoints communicate by exchanging packets. Packets have
confidentiality and integrity protection; see Section 12.1. Packets
are carried in UDP datagrams; see Section 12.2.

This version of QUIC uses the long packet header during connection
establishment; see Section 17.2. Packets with the long header are
Initial (Section 17.2.2), 0-RTT (Section 17.2.3), Handshake
(Section 17.2.4), and Retry (Section 17.2.5). Version negotiation
uses a version-independent packet with a long header; see
Section 17.2.1.

Packets with the short header are designed for minimal overhead and
are used after a connection is established and 1-RTT keys are
available; see Section 17.3.

12.1. Protected Packets

All QUIC packets except Version Negotiation packets use authenticated
encryption with additional data (AEAD) [RFC5116] to provide
confidentiality and integrity protection. Retry packets use AEAD to
provide integrity protection. Details of packet protection are found
in [QUIC-TLS]; this section includes an overview of the process.

Initial packets are protected using keys that are statically derived.
This packet protection is not effective confidentiality protection.
Initial protection only exists to ensure that the sender of the
packet is on the network path. Any entity that receives the Initial
packet from a client can recover the keys necessary to remove packet
protection or to generate packets that will be successfully
authenticated.

All other packets are protected with keys derived from the
cryptographic handshake.
The type of the packet from the long header
or the key phase from the short header is used to identify which
encryption keys are used. Packets protected with 0-RTT and 1-RTT
keys are expected to have confidentiality and data origin
authentication; the cryptographic handshake ensures that only the
communicating endpoints receive the corresponding keys.

The packet number field contains a packet number, which has
additional confidentiality protection that is applied after packet
protection is applied; see [QUIC-TLS] for details. The underlying
packet number increases with each packet sent in a given packet
number space; see Section 12.3 for details.

12.2. Coalescing Packets

Initial (Section 17.2.2), 0-RTT (Section 17.2.3), and Handshake
(Section 17.2.4) packets contain a Length field, which determines the
end of the packet. The length includes both the Packet Number and
Payload fields, both of which are confidentiality protected and
initially of unknown length. The length of the Payload field is
learned once header protection is removed.

Using the Length field, a sender can coalesce multiple QUIC packets
into one UDP datagram. This can reduce the number of UDP datagrams
needed to complete the cryptographic handshake and start sending
data. This can also be used to construct PMTU probes; see
Section 14.3.1. Receivers MUST be able to process coalesced packets.

Coalescing packets in order of increasing encryption levels (Initial,
0-RTT, Handshake, 1-RTT; see Section 4.1.4 of [QUIC-TLS]) makes it
more likely the receiver will be able to process all the packets in a
single pass. A packet with a short header does not include a length,
so it can only be the last packet included in a UDP datagram. An
endpoint SHOULD NOT coalesce multiple packets at the same encryption
level.
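
The ordering guidance above can be sketched in a few lines of Python.
This is an illustration under stated assumptions: the Packet class,
the level names, and the coalesce function are hypothetical, packets
are assumed to be already protected and serialized, and datagram size
limits are ignored.

```python
from dataclasses import dataclass
from typing import List

# Encryption levels in the order recommended for coalescing.
LEVEL_ORDER = {"initial": 0, "0rtt": 1, "handshake": 2, "1rtt": 3}


@dataclass
class Packet:
    level: str   # one of the keys of LEVEL_ORDER
    data: bytes  # serialized, protected packet (assumed given)


def coalesce(packets: List[Packet]) -> bytes:
    """Concatenate packets in order of increasing encryption level."""
    ordered = sorted(packets, key=lambda p: LEVEL_ORDER[p.level])
    levels = [p.level for p in ordered]
    # This sketch treats the SHOULD NOT on repeated levels as an error.
    if len(set(levels)) != len(levels):
        raise ValueError("more than one packet at the same level")
    # A 1-RTT (short header) packet has no Length field; sorting puts
    # it last, which is the only position where it can appear.
    return b"".join(p.data for p in ordered)
```

Sorting by level also guarantees that any short header packet ends up
at the end of the datagram.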

Senders MUST NOT coalesce QUIC packets for different connections into
a single UDP datagram. Receivers SHOULD ignore any subsequent
packets with a different Destination Connection ID than the first
packet in the datagram.

Every QUIC packet that is coalesced into a single UDP datagram is
separate and complete. The receiver of coalesced QUIC packets MUST
individually process each QUIC packet and separately acknowledge
them, as if they were received as the payload of different UDP
datagrams. For example, if decryption fails (because the keys are
not available or any other reason), the receiver MAY either discard
or buffer the packet for later processing and MUST attempt to process
the remaining packets.

Retry packets (Section 17.2.5), Version Negotiation packets
(Section 17.2.1), and packets with a short header (Section 17.3) do
not contain a Length field and so cannot be followed by other packets
in the same UDP datagram. Note also that there is no situation where
a Retry or Version Negotiation packet is coalesced with another
packet.

12.3. Packet Numbers

The packet number is an integer in the range 0 to 2^62-1. This
number is used in determining the cryptographic nonce for packet
protection. Each endpoint maintains a separate packet number for
sending and receiving.

Packet numbers are limited to this range because they need to be
representable in whole in the Largest Acknowledged field of an ACK
frame (Section 19.3). When present in a long or short header
however, packet numbers are reduced and encoded in 1 to 4 bytes; see
Section 17.1.

Version Negotiation (Section 17.2.1) and Retry (Section 17.2.5)
packets do not include a packet number.

Packet numbers are divided into 3 spaces in QUIC:

* Initial space: All Initial packets (Section 17.2.2) are in this
  space.

* Handshake space: All Handshake packets (Section 17.2.4) are in
  this space.

* Application data space: All 0-RTT and 1-RTT encrypted packets
  (Section 12.1) are in this space.

As described in [QUIC-TLS], each packet type uses different
protection keys.

Conceptually, a packet number space is the context in which a packet
can be processed and acknowledged. Initial packets can only be sent
with Initial packet protection keys and acknowledged in packets which
are also Initial packets. Similarly, Handshake packets are sent at
the Handshake encryption level and can only be acknowledged in
Handshake packets.

This enforces cryptographic separation between the data sent in the
different packet number spaces. Packet numbers in each
space start at packet number 0. Subsequent packets sent in the same
packet number space MUST increase the packet number by at least one.

0-RTT and 1-RTT data exist in the same packet number space to make
loss recovery algorithms easier to implement between the two packet
types.

A QUIC endpoint MUST NOT reuse a packet number within the same packet
number space in one connection. If the packet number for sending
reaches 2^62 - 1, the sender MUST close the connection without
sending a CONNECTION_CLOSE frame or any further packets; an endpoint
MAY send a Stateless Reset (Section 10.4) in response to further
packets that it receives.

A receiver MUST discard a newly unprotected packet unless it is
certain that it has not processed another packet with the same packet
number from the same packet number space. Duplicate suppression MUST
happen after removing packet protection for the reasons described in
Section 9.3 of [QUIC-TLS]. An efficient algorithm for duplicate
suppression can be found in Section 3.4.3 of [RFC4303].
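
As a rough illustration of duplicate suppression, the Python sketch
below keeps a sliding bitmap of recently seen packet numbers, in the
spirit of the anti-replay window in [RFC4303]. It is not taken from
that document verbatim. The class name, the window size of 64, and the
choice to discard packets older than the window are assumptions of
this example. One instance would be kept per packet number space and
consulted only after packet protection has been removed.

```python
class DuplicateFilter:
    """Sliding-window record of processed packet numbers."""

    def __init__(self, window: int = 64) -> None:
        self.window = window
        self.largest = -1  # largest packet number processed so far
        self.bitmap = 0    # bit i set => (largest - i) was processed

    def accept(self, pn: int) -> bool:
        """Return True and record pn if it is certainly new."""
        if pn > self.largest:
            shift = pn - self.largest
            mask = (1 << self.window) - 1
            self.bitmap = ((self.bitmap << shift) | 1) & mask
            self.largest = pn
            return True
        offset = self.largest - pn
        if offset >= self.window:
            # Too old to be certain it is new, so it is discarded.
            return False
        if self.bitmap & (1 << offset):
            return False  # already processed
        self.bitmap |= 1 << offset
        return True
```

A larger window tolerates more reordering at the cost of a wider
bitmap.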

Packet number encoding at a sender and decoding at a receiver are
described in Section 17.1.

12.4. Frames and Frame Types

The payload of QUIC packets, after removing packet protection,
consists of a sequence of complete frames, as shown in Figure 10.
Version Negotiation, Stateless Reset, and Retry packets do not
contain frames.

Packet Payload {
  Frame (..) ...,
}

Figure 10: QUIC Payload

The payload of a packet that contains frames MUST contain at least
one frame, and MAY contain multiple frames and multiple frame types.
Frames always fit within a single QUIC packet and cannot span
multiple packets.

Each frame begins with a Frame Type, indicating its type, followed by
additional type-dependent fields:

Frame {
  Frame Type (i),
  Type-Dependent Fields (..),
}

Figure 11: Generic Frame Layout

The frame types defined in this specification are listed in Table 3.
The Frame Type in ACK, STREAM, MAX_STREAMS, STREAMS_BLOCKED, and
CONNECTION_CLOSE frames is used to carry other frame-specific flags.
For all other frames, the Frame Type field simply identifies the
frame. These frames are explained in more detail in Section 19.

+-------------+----------------------+---------------+---------+
| Type Value  | Frame Type Name      | Definition    | Packets |
+=============+======================+===============+=========+
| 0x00        | PADDING              | Section 19.1  | IH01    |
+-------------+----------------------+---------------+---------+
| 0x01        | PING                 | Section 19.2  | IH01    |
+-------------+----------------------+---------------+---------+
| 0x02 - 0x03 | ACK                  | Section 19.3  | IH_1    |
+-------------+----------------------+---------------+---------+
| 0x04        | RESET_STREAM         | Section 19.4  | __01    |
+-------------+----------------------+---------------+---------+
| 0x05        | STOP_SENDING         | Section 19.5  | __01    |
+-------------+----------------------+---------------+---------+
| 0x06        | CRYPTO               | Section 19.6  | IH_1    |
+-------------+----------------------+---------------+---------+
| 0x07        | NEW_TOKEN            | Section 19.7  | ___1    |
+-------------+----------------------+---------------+---------+
| 0x08 - 0x0f | STREAM               | Section 19.8  | __01    |
+-------------+----------------------+---------------+---------+
| 0x10        | MAX_DATA             | Section 19.9  | __01    |
+-------------+----------------------+---------------+---------+
| 0x11        | MAX_STREAM_DATA      | Section 19.10 | __01    |
+-------------+----------------------+---------------+---------+
| 0x12 - 0x13 | MAX_STREAMS          | Section 19.11 | __01    |
+-------------+----------------------+---------------+---------+
| 0x14        | DATA_BLOCKED         | Section 19.12 | __01    |
+-------------+----------------------+---------------+---------+
| 0x15        | STREAM_DATA_BLOCKED  | Section 19.13 | __01    |
+-------------+----------------------+---------------+---------+
| 0x16 - 0x17 | STREAMS_BLOCKED      | Section 19.14 | __01    |
+-------------+----------------------+---------------+---------+
| 0x18        | NEW_CONNECTION_ID    | Section 19.15 | __01    |
+-------------+----------------------+---------------+---------+
| 0x19        | RETIRE_CONNECTION_ID | Section 19.16 | __01    |
+-------------+----------------------+---------------+---------+
| 0x1a        | PATH_CHALLENGE       | Section 19.17 | __01    |
+-------------+----------------------+---------------+---------+
| 0x1b        | PATH_RESPONSE        | Section 19.18 | __01    |
+-------------+----------------------+---------------+---------+
| 0x1c - 0x1d | CONNECTION_CLOSE     | Section 19.19 | ih01    |
+-------------+----------------------+---------------+---------+
| 0x1e        | HANDSHAKE_DONE       | Section 19.20 | ___1    |
+-------------+----------------------+---------------+---------+

Table 3: Frame Types

The "Packets" column in Table 3 does not form part of the IANA
registry; see Section 22.3. This column lists the types of packets
that each frame type could appear in, indicated by the following
characters:

I:  Initial (Section 17.2.2)

H:  Handshake (Section 17.2.4)

0:  0-RTT (Section 17.2.3)

1:  1-RTT (Section 17.3)

ih: A CONNECTION_CLOSE frame of type 0x1d cannot appear in Initial
    or Handshake packets.

Section 4 of [QUIC-TLS] provides more detail about these
restrictions. Note that all frames can appear in 1-RTT packets.

An endpoint MUST treat the receipt of a frame of unknown type as a
connection error of type FRAME_ENCODING_ERROR.

All QUIC frames are idempotent in this version of QUIC. That is, a
valid frame does not cause undesirable side effects or errors when
received more than once.

The Frame Type field uses a variable length integer encoding (see
Section 16) with one exception. To ensure simple and efficient
implementations of frame parsing, a frame type MUST use the shortest
possible encoding.
For frame types defined in this document, this
means a single-byte encoding, even though it is possible to encode
these values as a two-, four- or eight-byte variable length integer.
For instance, though 0x4001 is a legitimate two-byte encoding for a
variable-length integer with a value of 1, PING frames are always
encoded as a single byte with the value 0x01. This rule applies to
all current and future QUIC frame types. An endpoint MAY treat the
receipt of a frame type that uses a longer encoding than necessary as
a connection error of type PROTOCOL_VIOLATION.

13. Packetization and Reliability

A sender bundles one or more frames in a QUIC packet; see
Section 12.4.

A sender can minimize per-packet bandwidth and computational costs by
bundling as many frames as possible within a QUIC packet. A sender
MAY wait for a short period of time to bundle multiple frames before
sending a packet that is not maximally packed, to avoid sending out
large numbers of small packets. An implementation MAY use knowledge
about application sending behavior or heuristics to determine whether
and for how long to wait. This waiting period is an implementation
decision, and an implementation should be careful to delay
conservatively, since any delay is likely to increase
application-visible latency.

Stream multiplexing is achieved by interleaving STREAM frames from
multiple streams into one or more QUIC packets. A single QUIC packet
can include multiple STREAM frames from one or more streams.

One of the benefits of QUIC is avoidance of head-of-line blocking
across multiple streams. When a packet loss occurs, only streams
with data in that packet are blocked waiting for a retransmission to
be received, while other streams can continue making progress.
Note
that when data from multiple streams is bundled into a single QUIC
packet, loss of that packet blocks all those streams from making
progress. Implementations are advised to bundle as few streams as
necessary in outgoing packets without losing transmission efficiency
to underfilled packets.

13.1. Packet Processing

A packet MUST NOT be acknowledged until packet protection has been
successfully removed and all frames contained in the packet have been
processed. For STREAM frames, this means the data has been enqueued
in preparation to be received by the application protocol, but it
does not require that data is delivered and consumed.

Once the packet has been fully processed, a receiver acknowledges
receipt by sending one or more ACK frames containing the packet
number of the received packet.

13.2. Generating Acknowledgements

Endpoints acknowledge all packets they receive and process. However,
only ack-eliciting packets cause an ACK frame to be sent within the
maximum ack delay. Packets that are not ack-eliciting are only
acknowledged when an ACK frame is sent for other reasons.

When sending a packet for any reason, an endpoint SHOULD attempt to
bundle an ACK frame if one has not been sent recently. Doing so
helps with timely loss detection at the peer.

In general, frequent feedback from a receiver improves loss and
congestion response, but this has to be balanced against excessive
load generated by a receiver that sends an ACK frame in response to
every ack-eliciting packet. The guidance offered below seeks to
strike this balance.

13.2.1. Sending ACK Frames

Every packet SHOULD be acknowledged at least once, and ack-eliciting
packets MUST be acknowledged at least once within the maximum ack
delay. An endpoint communicates its maximum delay using the
max_ack_delay transport parameter; see Section 18.2.
max_ack_delay
declares an explicit contract: an endpoint promises to never
intentionally delay acknowledgments of an ack-eliciting packet by
more than the indicated value. If it does, any excess accrues to the
RTT estimate and could result in spurious or delayed retransmissions
from the peer. For Initial and Handshake packets, a max_ack_delay of
0 is used. The sender uses the receiver's max_ack_delay value in
determining timeouts for timer-based retransmission, as detailed in
Section 5.2.1 of [QUIC-RECOVERY].

An ACK frame SHOULD be generated for at least every second
ack-eliciting packet. This recommendation is in keeping with standard
practice for TCP [RFC5681]. A receiver could decide to send an ACK
frame less frequently if it has information about how frequently the
sender's congestion controller needs feedback, or if the receiver is
CPU or bandwidth constrained.

In order to assist loss detection at the sender, an endpoint SHOULD
send an ACK frame immediately on receiving an ack-eliciting packet
that is out of order. The endpoint SHOULD NOT continue sending ACK
frames immediately unless more ack-eliciting packets are received out
of order. If every subsequent ack-eliciting packet arrives out of
order, then an ACK frame SHOULD be sent immediately for every
received ack-eliciting packet.

Similarly, packets marked with the ECN Congestion Experienced (CE)
codepoint in the IP header SHOULD be acknowledged immediately, to
reduce the peer's response time to congestion events.

As an optimization, a receiver MAY process multiple packets before
sending any ACK frames in response. In this case the receiver can
determine whether an immediate or delayed acknowledgement should be
generated after processing incoming packets.
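
The recommendations above can be condensed into a small decision
function, sketched here in Python. It is a simplified illustration,
not a complete acknowledgment scheduler. The names AckState and
on_packet are hypothetical, and "out of order" is interpreted here as
"not exactly one greater than the largest packet number received so
far", which is an assumption of this sketch.

```python
from dataclasses import dataclass


@dataclass
class AckState:
    largest_received: int = -1  # largest packet number seen so far
    unacked_eliciting: int = 0  # ack-eliciting packets not yet ACKed


def on_packet(state: AckState, pn: int, ack_eliciting: bool,
              ecn_ce: bool = False) -> str:
    """Classify the acknowledgment a received packet calls for.

    Returns "immediate", "delayed" (send within max_ack_delay), or
    "none" (acknowledge only when an ACK is sent for other reasons).
    """
    out_of_order = pn != state.largest_received + 1
    state.largest_received = max(state.largest_received, pn)
    if not ack_eliciting:
        return "none"
    state.unacked_eliciting += 1
    # Immediate ACK: reordering or gap, ECN-CE mark, or every second
    # ack-eliciting packet.
    if out_of_order or ecn_ce or state.unacked_eliciting >= 2:
        state.unacked_eliciting = 0
        return "immediate"
    return "delayed"
```

A receiver that processes several packets before responding could call
this once per packet and act on the most urgent result.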

Packets containing PADDING frames are considered to be in flight for
congestion control purposes [QUIC-RECOVERY]. Sending only PADDING
frames might cause the sender to become limited by the congestion
controller with no acknowledgments forthcoming from the receiver.
Therefore, a sender SHOULD ensure that other frames are sent in
addition to PADDING frames to elicit acknowledgments from the
receiver.

An endpoint that is only sending ACK frames will not receive
acknowledgments from its peer unless those acknowledgements are
included in packets with ack-eliciting frames. An endpoint SHOULD
bundle ACK frames with other frames when there are new ack-eliciting
packets to acknowledge. When only non-ack-eliciting packets need to
be acknowledged, an endpoint MAY wait until an ack-eliciting packet
has been received to bundle an ACK frame with outgoing frames.

The algorithms in [QUIC-RECOVERY] are resilient to receivers that do
not follow the guidance offered above. However, an implementor should
only deviate from these requirements after careful consideration of
the performance implications of doing so.

Packets containing only ACK frames are not congestion controlled, so
there are limits on how frequently they can be sent. An endpoint
MUST NOT send more than one ACK-frame-only packet in response to
receiving an ack-eliciting packet. An endpoint MUST NOT send a
non-ack-eliciting packet in response to a non-ack-eliciting packet,
even if there are packet gaps which precede the received packet.
Limiting ACK frames avoids an infinite feedback loop of
acknowledgements, which could prevent the connection from ever
becoming idle. However, the endpoint acknowledges non-ack-eliciting
packets when it sends an ACK frame.

An endpoint SHOULD treat receipt of an acknowledgment for a packet it
did not send as a connection error of type PROTOCOL_VIOLATION, if it
is able to detect the condition.

13.2.2. Managing ACK Ranges

When an ACK frame is sent, one or more ranges of acknowledged packets
are included. Including older packets reduces the chance of spurious
retransmits caused by losing previously sent ACK frames, at the cost
of larger ACK frames.

ACK frames SHOULD always acknowledge the most recently received
packets, and the more out-of-order the packets are, the more
important it is to send an updated ACK frame quickly, to prevent the
peer from declaring a packet as lost and spuriously retransmitting
the frames it contains. An ACK frame is expected to fit within a
single QUIC packet. If it does not, then older ranges (those with
the smallest packet numbers) are omitted.

Section 13.2.3 and Section 13.2.4 describe an exemplary approach for
determining what packets to acknowledge in each ACK frame. Though
the goal of these algorithms is to generate an acknowledgment for
every packet that is processed, it is still possible for
acknowledgments to be lost. A sender cannot expect to receive an
acknowledgment for every packet that the receiver processes.

13.2.3. Receiver Tracking of ACK Frames

When a packet containing an ACK frame is sent, the largest
acknowledged in that frame may be saved. When a packet containing an
ACK frame is acknowledged, the receiver can stop acknowledging
packets less than or equal to the largest acknowledged in the sent
ACK frame.

In cases without ACK frame loss, this algorithm allows for a minimum
of 1 RTT of reordering. In cases with ACK frame loss and reordering,
this approach does not guarantee that every acknowledgement is seen
by the sender before it is no longer included in the ACK frame.
3874 Packets could be received out of order and all subsequent ACK frames 3875 containing them could be lost. In this case, the loss recovery 3876 algorithm could cause spurious retransmits, but the sender will 3877 continue making forward progress. 3879 13.2.4. Limiting ACK Ranges 3881 A receiver limits the number of ACK Ranges (Section 19.3.1) it 3882 remembers and sends in ACK frames, both to limit the size of ACK 3883 frames and to avoid resource exhaustion. After receiving 3884 acknowledgments for an ACK frame, the receiver SHOULD stop tracking 3885 those acknowledged ACK Ranges. 3887 Retaining many ACK Ranges could cause an ACK 3888 frame to become too large. A receiver can discard unacknowledged ACK 3889 Ranges to limit ACK frame size, at the cost of increased 3890 retransmissions from the sender. This is necessary if an ACK frame 3891 would be too large to fit in a packet; however, receivers MAY also 3892 limit ACK frame size further to preserve space for other frames. 3894 A receiver MUST retain an ACK Range unless it can ensure that it will 3895 not subsequently accept packets with numbers in that range. 3896 Maintaining a minimum packet number that increases as ranges are 3897 discarded is one way to achieve this with minimal state. 3899 Receivers can discard all ACK Ranges, but they MUST retain the 3900 largest packet number that has been successfully processed, as that is 3901 used to recover packet numbers from subsequent packets; see 3902 Section 17.1. 3904 A receiver SHOULD include an ACK Range containing the largest 3905 received packet number in every ACK frame. The Largest Acknowledged 3906 field is used in ECN validation at a sender, and including a lower 3907 value than what was included in a previous ACK frame could cause ECN 3908 to be unnecessarily disabled; see Section 13.4.2.
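One way to combine a bounded number of ACK Ranges with a rising minimum packet number is sketched below. This is a minimal illustration, not a normative algorithm: the class name, the max_ranges parameter, and the policy of always discarding the oldest range are assumptions made for the example.

```python
from typing import List, Optional


class AckRangeTracker:
    """Bounded ACK Range state with a rising minimum packet number (sketch)."""

    def __init__(self, max_ranges: int = 32) -> None:
        self.max_ranges = max_ranges          # hypothetical local limit
        self.ranges: List[List[int]] = []     # ascending, inclusive [low, high]
        self.min_pn = 0                       # packets below this are rejected
        self.largest: Optional[int] = None    # always retained (Section 17.1)

    def on_packet(self, pn: int) -> bool:
        """Record packet number pn; return False if it must not be accepted."""
        if pn < self.min_pn:
            return False  # falls in a discarded range, or below one
        if any(low <= pn <= high for low, high in self.ranges):
            return False  # duplicate
        self.largest = pn if self.largest is None else max(self.largest, pn)

        # Insert the new packet number and merge adjacent ranges.
        self.ranges.append([pn, pn])
        self.ranges.sort()
        merged = [self.ranges[0]]
        for low, high in self.ranges[1:]:
            if low <= merged[-1][1] + 1:
                merged[-1][1] = max(merged[-1][1], high)
            else:
                merged.append([low, high])

        # Discard the oldest ranges first, raising the minimum so that
        # packets in a discarded range can never be accepted later.
        while len(merged) > self.max_ranges:
            _, high = merged.pop(0)
            self.min_pn = high + 1
        self.ranges = merged
        return True
```

Because the oldest range is discarded first, the range containing the largest received packet number always survives and can be included in every ACK frame.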
3910 A receiver that sends only non-ack-eliciting packets, such as ACK 3911 frames, might not receive an acknowledgement for a long period of 3912 time. This could cause the receiver to maintain state for a large 3913 number of ACK frames for a long period of time, and ACK frames it 3914 sends could be unnecessarily large. In such a case, a receiver could 3915 bundle a PING or other small ack-eliciting frame occasionally, such 3916 as once per round trip, to elicit an ACK from the peer. 3918 A receiver MUST NOT bundle an ack-eliciting frame with all packets 3919 that would otherwise be non-ack-eliciting, to avoid an infinite 3920 feedback loop of acknowledgements. 3922 13.2.5. Measuring and Reporting Host Delay 3924 An endpoint measures the delays intentionally introduced between the 3925 time the packet with the largest packet number is received and the 3926 time an acknowledgment is sent. The endpoint encodes this delay in 3927 the Ack Delay field of an ACK frame; see Section 19.3. This allows 3928 the receiver of the ACK to adjust for any intentional delays, which 3929 is important for getting a better estimate of the path RTT when 3930 acknowledgments are delayed. A packet might be held in the OS kernel 3931 or elsewhere on the host before being processed. An endpoint MUST 3932 NOT include delays that it does not control when populating the Ack 3933 Delay field in an ACK frame. 3935 13.2.6. ACK Frames and Packet Protection 3937 ACK frames MUST only be carried in a packet that has the same packet 3938 number space as the packet being ACKed; see Section 12.1. For 3939 instance, packets that are protected with 1-RTT keys MUST be 3940 acknowledged in packets that are also protected with 1-RTT keys. 3942 Packets that a client sends with 0-RTT packet protection MUST be 3943 acknowledged by the server in packets protected by 1-RTT keys. 
This 3944 can mean that the client is unable to use these acknowledgments if 3945 the server cryptographic handshake messages are delayed or lost. 3946 Note that the same limitation applies to other data sent by the 3947 server protected by the 1-RTT keys. 3949 13.3. Retransmission of Information 3951 QUIC packets that are determined to be lost are not retransmitted 3952 whole. The same applies to the frames that are contained within lost 3953 packets. Instead, the information that might be carried in frames is 3954 sent again in new frames as needed. 3956 New frames and packets are used to carry information that is 3957 determined to have been lost. In general, information is sent again 3958 when a packet containing that information is determined to be lost 3959 and sending ceases when a packet containing that information is 3960 acknowledged. 3962 * Data sent in CRYPTO frames is retransmitted according to the rules 3963 in [QUIC-RECOVERY], until all data has been acknowledged. Data in 3964 CRYPTO frames for Initial and Handshake packets is discarded when 3965 keys for the corresponding packet number space are discarded. 3967 * Application data sent in STREAM frames is retransmitted in new 3968 STREAM frames unless the endpoint has sent a RESET_STREAM for that 3969 stream. Once an endpoint sends a RESET_STREAM frame, no further 3970 STREAM frames are needed. 3972 * ACK frames carry the most recent set of acknowledgements and the 3973 Ack Delay from the largest acknowledged packet, as described in 3974 Section 13.2.1. Delaying the transmission of packets containing 3975 ACK frames or sending old ACK frames can cause the peer to 3976 generate an inflated RTT sample or unnecessarily disable ECN. 
3978 * Cancellation of stream transmission, as carried in a RESET_STREAM 3979 frame, is sent until acknowledged or until all stream data is 3980 acknowledged by the peer (that is, either the "Reset Recvd" or 3981 "Data Recvd" state is reached on the sending part of the stream). 3982 The content of a RESET_STREAM frame MUST NOT change when it is 3983 sent again. 3985 * Similarly, a request to cancel stream transmission, as encoded in 3986 a STOP_SENDING frame, is sent until the receiving part of the 3987 stream enters either a "Data Recvd" or "Reset Recvd" state; see 3988 Section 3.5. 3990 * Connection close signals, including packets that contain 3991 CONNECTION_CLOSE frames, are not sent again when packet loss is 3992 detected; they are instead resent as described in Section 10. 3994 * The current connection maximum data is sent in MAX_DATA frames. 3995 An updated value is sent in a MAX_DATA frame if the packet 3996 containing the most recently sent MAX_DATA frame is declared lost, 3997 or when the endpoint decides to update the limit. Care is 3998 necessary to avoid sending this frame too often as the limit can 3999 increase frequently and cause an unnecessarily large number of 4000 MAX_DATA frames to be sent. 4002 * The current maximum stream data offset is sent in MAX_STREAM_DATA 4003 frames. Like MAX_DATA, an updated value is sent when the packet 4004 containing the most recent MAX_STREAM_DATA frame for a stream is 4005 lost or when the limit is updated, with care taken to prevent the 4006 frame from being sent too often. An endpoint SHOULD stop sending 4007 MAX_STREAM_DATA frames when the receiving part of the stream 4008 enters a "Size Known" state. 4010 * The limit on streams of a given type is sent in MAX_STREAMS 4011 frames. Like MAX_DATA, an updated value is sent when a packet 4012 containing the most recent MAX_STREAMS frame for a stream type is 4013 declared lost or when the limit is updated, with care taken to 4014 prevent the frame from being sent too often.
4016 * Blocked signals are carried in DATA_BLOCKED, STREAM_DATA_BLOCKED, 4017 and STREAMS_BLOCKED frames. DATA_BLOCKED frames have connection 4018 scope, STREAM_DATA_BLOCKED frames have stream scope, and 4019 STREAMS_BLOCKED frames are scoped to a specific stream type. New 4020 frames are sent if the packet containing the most recent frame for a 4021 scope is lost, but only while the endpoint is blocked on the 4022 corresponding limit. These frames always include the limit that 4023 is causing blocking at the time that they are transmitted. 4025 * A liveness or path validation check using PATH_CHALLENGE frames is 4026 sent periodically until a matching PATH_RESPONSE frame is received 4027 or until there is no remaining need for liveness or path 4028 validation checking. PATH_CHALLENGE frames include a different 4029 payload each time they are sent. 4031 * Responses to path validation using PATH_RESPONSE frames are sent 4032 just once. The peer is expected to send more PATH_CHALLENGE 4033 frames as necessary to evoke additional PATH_RESPONSE frames. 4035 * New connection IDs are sent in NEW_CONNECTION_ID frames and 4036 retransmitted if the packet containing them is lost. 4037 Retransmissions of this frame carry the same sequence number 4038 value. Likewise, retired connection IDs are sent in 4039 RETIRE_CONNECTION_ID frames and retransmitted if the packet 4040 containing them is lost. 4042 * NEW_TOKEN frames are retransmitted if the packet containing them 4043 is lost. No special support is provided for detecting reordered and 4044 duplicated NEW_TOKEN frames other than a direct comparison of the 4045 frame contents. 4047 * PING and PADDING frames contain no information, so lost PING or 4048 PADDING frames do not require repair. 4050 * The HANDSHAKE_DONE frame MUST be retransmitted until it is 4051 acknowledged.
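The per-frame rules in the list above can be summarized as a lookup table that a sender consults when a packet is declared lost. The sketch below is only a rough summary under simplifying assumptions: the OnLoss categories, the table, and the action_on_loss function are hypothetical names, and conditions such as discarded keys or stream states are reduced to comments or flags.

```python
from enum import Enum


class OnLoss(Enum):
    RESEND = "send the same information again in a new frame"
    REFRESH = "send a frame built from the current state"
    NEW_PAYLOAD = "send again with a different payload"
    NONE = "take no action in response to the loss"


LOSS_POLICY = {
    "CRYPTO": OnLoss.RESEND,                # until keys for the space are discarded
    "STREAM": OnLoss.RESEND,                # unless RESET_STREAM was sent
    "RESET_STREAM": OnLoss.RESEND,          # content does not change
    "STOP_SENDING": OnLoss.RESEND,          # until Data Recvd or Reset Recvd
    "ACK": OnLoss.REFRESH,                  # always the most recent acknowledgements
    "MAX_DATA": OnLoss.REFRESH,             # current limit, not the lost value
    "MAX_STREAM_DATA": OnLoss.REFRESH,
    "MAX_STREAMS": OnLoss.REFRESH,
    "DATA_BLOCKED": OnLoss.REFRESH,         # only while still blocked
    "STREAM_DATA_BLOCKED": OnLoss.REFRESH,  # only while still blocked
    "STREAMS_BLOCKED": OnLoss.REFRESH,      # only while still blocked
    "PATH_CHALLENGE": OnLoss.NEW_PAYLOAD,   # while validation is still needed
    "PATH_RESPONSE": OnLoss.NONE,           # sent just once
    "NEW_CONNECTION_ID": OnLoss.RESEND,     # same sequence number
    "RETIRE_CONNECTION_ID": OnLoss.RESEND,
    "NEW_TOKEN": OnLoss.RESEND,
    "HANDSHAKE_DONE": OnLoss.RESEND,        # until acknowledged
    "CONNECTION_CLOSE": OnLoss.NONE,        # governed by Section 10 instead
    "PING": OnLoss.NONE,
    "PADDING": OnLoss.NONE,
}


def action_on_loss(frame_type: str, stream_was_reset: bool = False,
                   still_blocked: bool = True) -> OnLoss:
    """Return the action for a lost frame of the given type (sketch)."""
    if frame_type == "STREAM" and stream_was_reset:
        return OnLoss.NONE
    if frame_type.endswith("_BLOCKED") and not still_blocked:
        return OnLoss.NONE
    return LOSS_POLICY[frame_type]
```

Keeping the policy in one table makes it easier to check an implementation against the list when frame types are added.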
4053 Endpoints SHOULD prioritize retransmission of data over sending new 4054 data, unless priorities specified by the application indicate 4055 otherwise; see Section 2.3. 4057 Even though a sender is encouraged to assemble frames containing up- 4058 to-date information every time it sends a packet, it is not forbidden 4059 to retransmit copies of frames from lost packets. A sender that 4060 retransmits copies of frames needs to handle decreases in available 4061 payload size due to change in packet number length, connection ID 4062 length, and path MTU. A receiver MUST accept packets containing an 4063 outdated frame, such as a MAX_DATA frame carrying a smaller maximum 4064 data than one found in an older packet. 4066 Upon detecting losses, a sender MUST take appropriate congestion 4067 control action. The details of loss detection and congestion control 4068 are described in [QUIC-RECOVERY]. 4070 13.4. Explicit Congestion Notification 4072 QUIC endpoints can use Explicit Congestion Notification (ECN) 4073 [RFC3168] to detect and respond to network congestion. ECN allows a 4074 network node to indicate congestion in the network by setting a 4075 codepoint in the IP header of a packet instead of dropping it. 4076 Endpoints react to congestion by reducing their sending rate in 4077 response, as described in [QUIC-RECOVERY]. 4079 To use ECN, QUIC endpoints first determine whether a path supports 4080 ECN marking and the peer is able to access the ECN codepoint in the 4081 IP header. A network path does not support ECN if ECN marked packets 4082 get dropped or ECN markings are rewritten on the path. An endpoint 4083 validates the use of ECN on the path, both during connection 4084 establishment and when migrating to a new path (Section 9). 4086 13.4.1. 
ECN Counts 4088 On receiving a QUIC packet with an ECT or CE codepoint, an ECN- 4089 enabled endpoint that can access the ECN codepoints from the 4090 enclosing IP packet increases the corresponding ECT(0), ECT(1), or CE 4091 count, and includes these counts in subsequent ACK frames; see 4092 Section 13.2 and Section 19.3. Note that this requires being able to 4093 read the ECN codepoints from the enclosing IP packet, which is not 4094 possible on all platforms. 4096 A packet detected by a receiver as a duplicate does not affect the 4097 receiver's local ECN codepoint counts; see Section 21.8 for 4098 relevant security concerns. 4100 If an endpoint receives a QUIC packet without an ECT or CE codepoint 4101 in the IP packet header, it responds per Section 13.2 with an ACK 4102 frame without increasing any ECN counts. If an endpoint does not 4103 implement ECN support or does not have access to received ECN 4104 codepoints, it does not increase ECN counts. 4106 Coalesced packets (see Section 12.2) mean that several packets can 4107 share the same IP header. The ECN counter for the ECN codepoint 4108 received in the associated IP header is incremented once for each 4109 QUIC packet, not per enclosing IP packet or UDP datagram. 4111 Each packet number space maintains separate acknowledgement state and 4112 separate ECN counts. For example, if one each of an Initial, 0-RTT, 4113 Handshake, and 1-RTT QUIC packet are coalesced, the corresponding 4114 counts for the Initial and Handshake packet number spaces will each be 4115 incremented by one and the counts for the 1-RTT packet number space, which the 0-RTT and 1-RTT packets share, 4116 will be increased by two. 4118 13.4.2. ECN Validation 4120 It is possible for faulty network devices to corrupt or erroneously 4121 drop packets with ECN markings. To provide robust connectivity in 4122 the presence of such devices, each endpoint independently validates 4123 ECN counts and disables ECN if errors are detected.
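The per-space counting rules of Section 13.4.1, on which this validation relies, can be illustrated with the coalesced-datagram example given there. The sketch below is illustrative only; the class name, the string labels for codepoints and packet types, and the on_datagram interface are assumptions, and duplicate detection is left to the caller.

```python
from collections import Counter
from typing import Iterable

# 0-RTT and 1-RTT packets share one packet number space.
SPACE_OF = {
    "Initial": "initial",
    "Handshake": "handshake",
    "0-RTT": "application",
    "1-RTT": "application",
}


class EcnCounts:
    """Receiver-side ECN counters, one set per packet number space (sketch)."""

    def __init__(self) -> None:
        self.counts = {space: Counter() for space in set(SPACE_OF.values())}

    def on_datagram(self, codepoint: str, packet_types: Iterable[str]) -> None:
        # codepoint is the marking read from the enclosing IP header:
        # "ECT(0)", "ECT(1)", "CE", or "Not-ECT".
        if codepoint == "Not-ECT":
            return  # acknowledged as usual, but no count is increased
        # One increment per QUIC packet, not per UDP datagram. Packets
        # detected as duplicates should be filtered out before this call.
        for packet_type in packet_types:
            self.counts[SPACE_OF[packet_type]][codepoint] += 1
```

For a datagram marked ECT(0) that coalesces one packet of each type, the Initial and Handshake spaces each record one ECT(0) packet and the application data space records two.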
4125 Endpoints validate ECN for packets sent on each network path 4126 independently. An endpoint thus validates ECN on new connection 4127 establishment, when switching to a new server preferred address, and 4128 on active connection migration to a new path. Appendix B describes 4129 one possible algorithm for testing paths for ECN support. 4131 Even if an endpoint does not use ECN markings on packets it 4132 transmits, the endpoint MUST provide feedback about ECN markings 4133 received from the peer if they are accessible. Failing to report ECN 4134 counts will cause the peer to disable ECN marking. 4136 13.4.2.1. Sending ECN Markings 4138 To start ECN validation, an endpoint SHOULD do the following when 4139 sending packets on a new path to a peer: 4141 * Set the ECT(0) codepoint in the IP header of early outgoing 4142 packets sent on a new path to the peer [RFC8311]. 4144 * If all packets that were sent with the ECT(0) codepoint are 4145 eventually deemed lost [QUIC-RECOVERY], validation is deemed to 4146 have failed. 4148 To reduce the chances of misinterpreting congestive loss as packets 4149 dropped by a faulty network element, an endpoint could set the ECT(0) 4150 codepoint in the first ten outgoing packets on a path, or for a 4151 period of three RTTs, whichever occurs first. 4153 Implementations MAY experiment with and use other strategies for use 4154 of ECN. Other methods of probing paths for ECN support are possible, 4155 as are different marking strategies. Implementations can also use 4156 the ECT(1) codepoint, as specified in [RFC8311]. 4158 13.4.2.2. Receiving ACK Frames 4160 An endpoint that sets ECT(0) or ECT(1) codepoints on packets it 4161 transmits MUST use the following steps on receiving an ACK frame to 4162 validate ECN. 4164 * If this ACK frame newly acknowledges a packet that the endpoint 4165 sent with either ECT(0) or ECT(1) codepoints set, and if no ECN 4166 feedback is present in the ACK frame, validation fails. 
This step 4167 protects against both a network element that zeroes out ECN bits 4168 and a peer that is unable to access ECN markings, since the peer 4169 could respond without ECN feedback in these cases. 4171 * For validation to succeed, the total increase in ECT(0), ECT(1), 4172 and CE counts MUST be no smaller than the total number of QUIC 4173 packets sent with an ECT codepoint that are newly acknowledged in 4174 this ACK frame. This step detects any network remarking from 4175 ECT(0), ECT(1), or CE codepoints to Not-ECT. 4177 * Any increase in either ECT(0) or ECT(1) counts, plus any increase 4178 in the CE count, MUST be no smaller than the number of packets 4179 sent with the corresponding ECT codepoint that are newly 4180 acknowledged in this ACK frame. This step detects any erroneous 4181 network remarking from ECT(0) to ECT(1) (or vice versa). 4183 Processing ECN counts out of order can result in validation failure. 4184 An endpoint SHOULD NOT perform this validation if this ACK frame does 4185 not advance the largest packet number acknowledged in this 4186 connection. 4188 An endpoint could miss acknowledgements for a packet when ACK frames 4189 are lost. It is therefore possible for the total increase in ECT(0), 4190 ECT(1), and CE counts to be greater than the number of packets 4191 acknowledged in an ACK frame. When this happens, and if validation 4192 succeeds, the local reference counts MUST be increased to match the 4193 counts in the ACK frame. 4195 13.4.2.3. Validation Outcomes 4197 If validation fails, then the endpoint stops sending ECN markings in 4198 subsequent IP packets with the expectation that either the network 4199 path or the peer does not support ECN. 4201 Upon successful validation, an endpoint can continue to set ECT 4202 codepoints in subsequent packets with the expectation that the path 4203 is ECN-capable. 
Network routing and path elements can change mid- 4204 connection however; an endpoint MUST disable ECN if validation fails 4205 at any point in the connection. 4207 Even if validation fails, an endpoint MAY revalidate ECN on the same 4208 path at any later time in the connection. 4210 14. Packet Size 4212 The QUIC packet size includes the QUIC header and protected payload, 4213 but not the UDP or IP header. 4215 A client MUST expand the payload of all UDP datagrams carrying 4216 Initial packets to at least 1200 bytes, by adding PADDING frames to 4217 the Initial packet or by coalescing the Initial packet; see 4218 Section 12.2. Sending a UDP datagram of this size ensures that the 4219 network path from the client to the server supports a reasonable 4220 Maximum Transmission Unit (MTU). Padding datagrams also helps reduce 4221 the amplitude of amplification attacks caused by server responses 4222 toward an unverified client address; see Section 8. 4224 Enforcement of the max_udp_payload_size transport parameter 4225 (Section 18.2) might act as an additional limit on packet size. 4226 Exceeding this limit can be avoided once the value is known. 4227 However, prior to learning the value of the transport parameter, 4228 endpoints risk datagrams being lost if they send packets larger than 4229 1200 bytes. 4231 Datagrams containing Initial packets MAY exceed 1200 bytes if the 4232 client believes that the network path and peer both support the size 4233 that it chooses. 4235 UDP datagrams MUST NOT be fragmented at the IP layer. In IPv4 4236 [IPv4], the DF bit MUST be set to prevent fragmentation on the path. 4238 A server MUST discard an Initial packet that is carried in a UDP 4239 datagram with a payload that is less than 1200 bytes. A server MAY 4240 also immediately close the connection by sending a CONNECTION_CLOSE 4241 frame with an error code of PROTOCOL_VIOLATION; see Section 10.3.1. 
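The 1200-byte rule for datagrams carrying Initial packets can be expressed as two small checks, one for each side. This is a simplified sketch: the function names are hypothetical, and the client-side helper ignores the possibility that adding padding changes the size of the packet's Length field.

```python
MIN_INITIAL_DATAGRAM_SIZE = 1200  # bytes of UDP payload


def client_padding_needed(udp_payload_len: int) -> int:
    """Bytes the client still has to add, by padding or coalescing."""
    return max(0, MIN_INITIAL_DATAGRAM_SIZE - udp_payload_len)


def server_accepts_initial(udp_payload_len: int) -> bool:
    """A server discards Initial packets carried in smaller datagrams."""
    return udp_payload_len >= MIN_INITIAL_DATAGRAM_SIZE
```

A client would apply the first check to every datagram that carries an Initial packet, including retransmissions.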
4243 The server MUST also limit the number of bytes it sends before 4244 validating the address of the client; see Section 8. 4246 14.1. Path Maximum Transmission Unit (PMTU) 4248 The Path Maximum Transmission Unit (PMTU) is the maximum size of the 4249 entire IP packet including the IP header, UDP header, and UDP 4250 payload. The UDP payload includes the QUIC packet header, protected 4251 payload, and any authentication fields. The PMTU can depend on path 4252 characteristics, and can therefore change over time. The largest UDP 4253 payload an endpoint sends at any given time is referred to as the 4254 endpoint's maximum packet size. 4256 QUIC depends on a PMTU of at least 1280 bytes. This is the IPv6 4257 minimum size [RFC8200] and is also supported by most modern IPv4 4258 networks. All QUIC packets (except for PMTU probe packets) SHOULD be 4259 sized to fit within the maximum packet size to avoid the packet being 4260 fragmented or dropped [RFC8085]. 4262 An endpoint SHOULD use Datagram Packetization Layer PMTU Discovery 4263 ([DPLPMTUD]) or implement Path MTU Discovery (PMTUD) [RFC1191] 4264 [RFC8201] to determine whether the path to a destination will support 4265 a desired message size without fragmentation. 4267 In the absence of these mechanisms, QUIC endpoints SHOULD NOT send IP 4268 packets larger than 1280 bytes. Assuming the minimum IP header size, 4269 this results in a QUIC maximum packet size of 1232 bytes for IPv6 and 4270 1252 bytes for IPv4. A QUIC implementation MAY be more conservative 4271 in computing the QUIC maximum packet size to allow for unknown tunnel 4272 overheads or IP header options/extensions. 4274 Each pair of local and remote addresses could have a different PMTU. 4275 QUIC implementations that implement any kind of PMTU discovery 4276 therefore SHOULD maintain a maximum packet size for each combination 4277 of local and remote IP addresses. 
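The maximum packet sizes of 1232 and 1252 bytes quoted above follow from subtracting the minimum IP header and the UDP header from a 1280-byte PMTU. The sketch below shows the arithmetic; the function name and the extra_overhead parameter (for tunnels or IP options) are assumptions made for the example.

```python
IPV4_MIN_HEADER = 20  # bytes, without options
IPV6_HEADER = 40      # bytes, without extension headers
UDP_HEADER = 8        # bytes


def max_quic_packet_size(pmtu: int = 1280, ipv6: bool = True,
                         extra_overhead: int = 0) -> int:
    """Largest UDP payload that fits in an IP packet of size pmtu."""
    ip_header = IPV6_HEADER if ipv6 else IPV4_MIN_HEADER
    return pmtu - ip_header - UDP_HEADER - extra_overhead
```

An implementation that tracks a maximum packet size for each pair of local and remote addresses could store the result of such a calculation per path.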
4279 If a QUIC endpoint determines that the PMTU between any pair of local 4280 and remote IP addresses has fallen below the size needed to support 4281 the smallest allowed maximum packet size, it MUST immediately cease 4282 sending QUIC packets, except for PMTU probe packets, on the affected 4283 path. An endpoint MAY terminate the connection if an alternative 4284 path cannot be found. 4286 14.2. ICMP Packet Too Big Messages 4288 PMTU discovery [RFC1191] [RFC8201] relies on reception of ICMP 4289 messages (e.g., IPv6 Packet Too Big messages) that indicate when a 4290 packet is dropped because it is larger than the local router MTU. 4291 DPLPMTUD can also optionally use these messages. This use of ICMP 4292 messages is potentially vulnerable to off-path attacks that 4293 successfully guess the addresses used on the path and reduce the PMTU 4294 to a bandwidth-inefficient value. 4296 An endpoint MUST ignore an ICMP message that claims the PMTU has 4297 decreased below 1280 bytes. 4299 The requirements for generating ICMP ([RFC1812], [RFC4443]) state 4300 that the quoted packet should contain as much of the original packet 4301 as possible without exceeding the minimum MTU for the IP version. 4302 The size of the quoted packet can actually be smaller, or the 4303 information unintelligible, as described in Section 1.1 of 4304 [DPLPMTUD]. 4306 QUIC endpoints SHOULD validate ICMP messages to protect from off-path 4307 injection as specified in [RFC8201] and Section 5.2 of [RFC8085]. 4308 This validation SHOULD use the quoted packet supplied in the payload 4309 of an ICMP message to associate the message with a corresponding 4310 transport connection [DPLPMTUD]. 4312 ICMP message validation MUST include matching IP addresses and UDP 4313 ports [RFC8085] and, when possible, connection IDs to an active QUIC 4314 session. 4316 The endpoint SHOULD ignore all ICMP messages that fail validation. 
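The checks in this section can be combined into a single predicate applied to each ICMP Packet Too Big message. The sketch below is illustrative: the Path and QuotedPacket types, their field names, and the peer_issued_cids parameter are hypothetical, and parsing of the quoted packet from the ICMP payload is assumed to have happened already.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass(frozen=True)
class Path:
    local_ip: str
    local_port: int
    remote_ip: str
    remote_port: int


@dataclass(frozen=True)
class QuotedPacket:
    """Fields recovered from the packet quoted in the ICMP payload."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    dcid: Optional[bytes]  # None if the quote is too short to contain it


def icmp_ptb_is_usable(claimed_mtu: int, quoted: QuotedPacket, path: Path,
                       peer_issued_cids: Set[bytes]) -> bool:
    """Return True if the message passes the checks in this section."""
    if claimed_mtu < 1280:
        return False  # claims below 1280 bytes are ignored
    # The quoted packet is one this endpoint sent, so its source is the
    # local address and its destination is the remote address.
    sent_from = (quoted.src_ip, quoted.src_port)
    sent_to = (quoted.dst_ip, quoted.dst_port)
    if sent_from != (path.local_ip, path.local_port):
        return False
    if sent_to != (path.remote_ip, path.remote_port):
        return False
    # Match the connection ID when the quote includes it.
    if quoted.dcid is not None and quoted.dcid not in peer_issued_cids:
        return False
    return True
```

A message that passes these checks is only a hint that the maximum packet size might need to be reduced; a message that fails any of them is ignored.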
4318 An endpoint MUST NOT increase PMTU based on ICMP messages; see 4319 Section 3, clause 6 of [DPLPMTUD]. Any reduction in the QUIC maximum 4320 packet size in response to ICMP messages MAY be provisional until 4321 QUIC's loss detection algorithm determines that the quoted packet has 4322 actually been lost. 4324 14.3. Datagram Packetization Layer PMTU Discovery 4326 Section 6.3 of [DPLPMTUD] provides considerations for implementing 4327 Datagram Packetization Layer PMTUD (DPLPMTUD) with QUIC. 4329 When implementing the algorithm in Section 5 of [DPLPMTUD], the 4330 initial value of BASE_PMTU SHOULD be consistent with the minimum QUIC 4331 packet size (1232 bytes for IPv6 and 1252 bytes for IPv4). 4333 PING and PADDING frames can be used to generate PMTU probe packets. 4334 These frames might not be retransmitted if a probe packet containing 4335 them is lost. However, these frames do consume congestion window, 4336 which could delay the transmission of subsequent application data. 4338 A PING frame can be included in a PMTU probe to ensure that a valid 4339 probe is acknowledged. 4341 The considerations for processing ICMP messages in the previous 4342 section also apply if these messages are used by DPLPMTUD. 4344 14.3.1. PMTU Probes Containing Source Connection ID 4346 Endpoints that rely on the destination connection ID for routing 4347 incoming QUIC packets are likely to require that the connection ID be 4348 included in PMTU probe packets to route any resulting ICMP messages 4349 (Section 14.2) back to the correct endpoint. However, only long 4350 header packets (Section 17.2) contain source connection IDs, and long 4351 header packets are not decrypted or acknowledged by the peer once the 4352 handshake is complete. 4354 One way to construct a probe for the path MTU is to coalesce (see 4355 Section 12.2) a Handshake packet (Section 17.2.4) with a short header 4356 packet in a single UDP datagram. 
If the UDP datagram reaches the 4357 endpoint, the Handshake packet will be ignored, but the short header 4358 packet will be acknowledged. If the UDP datagram causes an ICMP 4359 message to be sent, the first part of the datagram will be quoted in 4360 that message. If the source connection ID is within the quoted 4361 portion of the UDP datagram, that could be used for routing. 4363 15. Versions 4365 QUIC versions are identified using a 32-bit unsigned number. 4367 The version 0x00000000 is reserved to represent version negotiation. 4368 This version of the specification is identified by the number 4369 0x00000001. 4371 Other versions of QUIC might have different properties to this 4372 version. The properties of QUIC that are guaranteed to be consistent 4373 across all versions of the protocol are described in 4374 [QUIC-INVARIANTS]. 4376 Version 0x00000001 of QUIC uses TLS as a cryptographic handshake 4377 protocol, as described in [QUIC-TLS]. 4379 Versions with the most significant 16 bits of the version number 4380 cleared are reserved for use in future IETF consensus documents. 4382 Versions that follow the pattern 0x?a?a?a?a are reserved for use in 4383 forcing version negotiation to be exercised. That is, any version 4384 number where the low four bits of all bytes is 1010 (in binary). A 4385 client or server MAY advertise support for any of these reserved 4386 versions. 4388 Reserved version numbers will never represent a real protocol; a 4389 client MAY use one of these version numbers with the expectation that 4390 the server will initiate version negotiation; a server MAY advertise 4391 support for one of these versions and can expect that clients ignore 4392 the value. 4394 [[RFC editor: please remove the remainder of this section before 4395 publication.]] 4397 The version number for the final version of this specification 4398 (0x00000001), is reserved for the version of the protocol that is 4399 published as an RFC. 
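The reserved version patterns described in this section can be tested with simple bit masks. The following sketch is illustrative and the function names are hypothetical.

```python
def is_version_negotiation(version: int) -> bool:
    """0x00000000 is reserved to represent version negotiation."""
    return version == 0x00000000


def is_reserved_for_forcing_negotiation(version: int) -> bool:
    """True for 0x?a?a?a?a: the low four bits of every byte are 1010."""
    return (version & 0x0F0F0F0F) == 0x0A0A0A0A


def is_reserved_for_ietf(version: int) -> bool:
    """True when the most significant 16 bits are cleared."""
    return (version >> 16) == 0
```

For example, 0x1a2a3a4a and 0x0a0a0a0a both match the 0x?a?a?a?a pattern, while 0x00000001 does not.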
4401 Version numbers used to identify IETF drafts are created by adding 4402 the draft number to 0xff000000. For example, draft-ietf-quic- 4403 transport-13 would be identified as 0xff00000D. 4405 Implementors are encouraged to register version numbers of QUIC that 4406 they are using for private experimentation on the GitHub wiki at 4407 https://github.com/quicwg/base-drafts/wiki/QUIC-Versions. 4409 16. Variable-Length Integer Encoding 4411 QUIC packets and frames commonly use a variable-length encoding for 4412 non-negative integer values. This encoding ensures that smaller 4413 integer values need fewer bytes to encode. 4415 The QUIC variable-length integer encoding reserves the two most 4416 significant bits of the first byte to encode the base 2 logarithm of 4417 the integer encoding length in bytes. The integer value is encoded 4418 on the remaining bits, in network byte order. 4420 This means that integers are encoded on 1, 2, 4, or 8 bytes and can 4421 encode 6, 14, 30, or 62 bit values respectively. Table 4 summarizes 4422 the encoding properties. 4424 +------+--------+-------------+-----------------------+ 4425 | 2Bit | Length | Usable Bits | Range | 4426 +======+========+=============+=======================+ 4427 | 00 | 1 | 6 | 0-63 | 4428 +------+--------+-------------+-----------------------+ 4429 | 01 | 2 | 14 | 0-16383 | 4430 +------+--------+-------------+-----------------------+ 4431 | 10 | 4 | 30 | 0-1073741823 | 4432 +------+--------+-------------+-----------------------+ 4433 | 11 | 8 | 62 | 0-4611686018427387903 | 4434 +------+--------+-------------+-----------------------+ 4436 Table 4: Summary of Integer Encodings 4438 For example, the eight byte sequence c2 19 7c 5e ff 14 e8 8c (in 4439 hexadecimal) decodes to the decimal value 151288809941952652; the 4440 four byte sequence 9d 7f 3e 7d decodes to 494878333; the two byte 4441 sequence 7b bd decodes to 15293; and the single byte 25 decodes to 37 4442 (as does the two byte sequence 40 25). 
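The encoding in Table 4 can be sketched in a few lines. The functions below are an illustration rather than a reference implementation; the names encode_varint and decode_varint are hypothetical, and the encoder always chooses the shortest of the four lengths even though longer encodings of the same value are also valid.

```python
from typing import Tuple

MAX_VARINT = (1 << 62) - 1


def encode_varint(value: int) -> bytes:
    """Encode value using the shortest of the 1, 2, 4, or 8 byte forms."""
    if not 0 <= value <= MAX_VARINT:
        raise ValueError("value out of range for a variable-length integer")
    for prefix, length in ((0b00, 1), (0b01, 2), (0b10, 4), (0b11, 8)):
        usable_bits = 8 * length - 2
        if value < (1 << usable_bits):
            # The two most significant bits carry log2 of the length.
            return (value | (prefix << usable_bits)).to_bytes(length, "big")
    raise AssertionError("unreachable: range was checked above")


def decode_varint(data: bytes) -> Tuple[int, int]:
    """Return (value, number of bytes consumed) from the start of data."""
    if not data:
        raise ValueError("no data")
    length = 1 << (data[0] >> 6)
    if len(data) < length:
        raise ValueError("truncated variable-length integer")
    mask = (1 << (8 * length - 2)) - 1
    return int.from_bytes(data[:length], "big") & mask, length
```

Decoding the byte sequences from the example above with decode_varint yields the stated values; both 25 and 40 25 decode to 37, with one and two bytes consumed respectively.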
4444 Error codes (Section 20) and versions (Section 15) are described 4445 using integers, but do not use this encoding. 4447 17. Packet Formats 4449 All numeric values are encoded in network byte order (that is, big- 4450 endian) and all field sizes are in bits. Hexadecimal notation is 4451 used for describing the value of fields. 4453 17.1. Packet Number Encoding and Decoding 4455 Packet numbers are integers in the range 0 to 2^62-1 (Section 12.3). 4456 When present in long or short packet headers, they are encoded in 1 4457 to 4 bytes. The number of bits required to represent the packet 4458 number is reduced by including only the least significant bits of the 4459 packet number. 4461 The encoded packet number is protected as described in Section 5.4 of 4462 [QUIC-TLS]. 4464 The sender MUST use a packet number size able to represent more than 4465 twice as large a range as the difference between the largest 4466 acknowledged packet and the packet number being sent. A peer receiving 4467 the packet will then correctly decode the packet number, unless the 4468 packet is delayed in transit such that it arrives after many higher- 4469 numbered packets have been received. An endpoint SHOULD use a large 4470 enough packet number encoding to allow the packet number to be 4471 recovered even if the packet arrives after packets that are sent 4472 afterwards. 4474 As a result, the size of the packet number encoding is at least one 4475 bit more than the base-2 logarithm of the number of contiguous 4476 unacknowledged packet numbers, including the new packet. 4478 For example, if an endpoint has received an acknowledgment for packet 4479 0xabe8bc, sending a packet with a number of 0xac5c02 requires a 4480 packet number encoding with 16 bits or more, since twice the difference (0x7346) is less than 2^16; whereas the 24-bit 4481 packet number encoding is needed to send a packet with a number of 4482 0xace8fe, since twice the difference (0x10042) exceeds 2^16. 4484 At a receiver, protection of the packet number is removed prior to 4485 recovering the full packet number.
The full packet number is then 4486 reconstructed based on the number of significant bits present, the 4487 value of those bits, and the largest packet number received on a 4488 successfully authenticated packet. Recovering the full packet number 4489 is necessary to successfully remove packet protection. 4491 Once header protection is removed, the packet number is decoded by 4492 finding the packet number value that is closest to the next expected 4493 packet. The next expected packet is the highest received packet 4494 number plus one. For example, if the highest successfully 4495 authenticated packet had a packet number of 0xa82f30ea, then a packet 4496 containing a 16-bit value of 0x9b32 will be decoded as 0xa82f9b32. 4497 Example pseudo-code for packet number decoding can be found in 4498 Appendix A. 4500 17.2. Long Header Packets 4502 Long Header Packet { 4503 Header Form (1) = 1, 4504 Fixed Bit (1) = 1, 4505 Long Packet Type (2), 4506 Type-Specific Bits (4), 4507 Version (32), 4508 DCID Length (8), 4509 Destination Connection ID (0..160), 4510 SCID Length (8), 4511 Source Connection ID (0..160), 4512 } 4514 Figure 12: Long Header Packet Format 4516 Long headers are used for packets that are sent prior to the 4517 establishment of 1-RTT keys. Once 1-RTT keys are available, a sender 4518 switches to sending packets using the short header (Section 17.3). 4519 The long form allows for special packets - such as the Version 4520 Negotiation packet - to be represented in this uniform fixed-length 4521 packet format. Packets that use the long header contain the 4522 following fields: 4524 Header Form: The most significant bit (0x80) of byte 0 (the first 4525 byte) is set to 1 for long headers. 4527 Fixed Bit: The next bit (0x40) of byte 0 is set to 1. Packets 4528 containing a zero value for this bit are not valid packets in this 4529 version and MUST be discarded. 
4531 Long Packet Type: The next two bits (those with a mask of 0x30) of 4532 byte 0 contain a packet type. Packet types are listed in Table 5. 4534 Type-Specific Bits: The lower four bits (those with a mask of 0x0f) 4535 of byte 0 are type-specific. 4537 Version: The QUIC Version is a 32-bit field that follows the first 4538 byte. This field indicates which version of QUIC is in use and 4539 determines how the rest of the protocol fields are interpreted. 4541 DCID Length: The byte following the version contains the length in 4542 bytes of the Destination Connection ID field that follows it. 4543 This length is encoded as an 8-bit unsigned integer. In QUIC 4544 version 1, this value MUST NOT exceed 20 bytes. Endpoints that receive 4545 a version 1 long header with a value larger than 20 MUST drop the 4546 packet. Servers SHOULD be able to read longer connection IDs from 4547 other QUIC versions in order to properly form a Version 4548 Negotiation packet. 4550 Destination Connection ID: The Destination Connection ID field 4551 follows the DCID Length field and is between 0 and 20 bytes in 4552 length. Section 7.2 describes the use of this field in more 4553 detail. 4555 SCID Length: The byte following the Destination Connection ID 4556 contains the length in bytes of the Source Connection ID field 4557 that follows it. This length is encoded as an 8-bit unsigned 4558 integer. In QUIC version 1, this value MUST NOT exceed 20 bytes. 4559 Endpoints that receive a version 1 long header with a value larger 4560 than 20 MUST drop the packet. Servers SHOULD be able to read 4561 longer connection IDs from other QUIC versions in order to 4562 properly form a Version Negotiation packet. 4564 Source Connection ID: The Source Connection ID field follows the 4565 SCID Length field and is between 0 and 20 bytes in length. 4566 Section 7.2 describes the use of this field in more detail.
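The field list above can be sketched as a small parser. This is a minimal illustration, not part of the specification: the names LongHeader and parse_long_header are hypothetical, a ValueError stands in for "drop the packet", and the sketch assumes the packet is already known to be a version 1 packet (a real server would inspect the Version field first so that it can still read longer connection IDs from other versions).

```python
from dataclasses import dataclass

QUIC_V1_MAX_CID_LEN = 20  # version 1 limit on connection ID length, in bytes


@dataclass
class LongHeader:
    packet_type: int         # Long Packet Type, bits 0x30 of byte 0
    type_specific_bits: int  # low four bits (0x0f) of byte 0
    version: int
    dcid: bytes
    scid: bytes
    rest: bytes              # type-specific remainder, left unparsed


def parse_long_header(datagram: bytes) -> LongHeader:
    """Parse the common long header fields; raise ValueError to signal a drop."""
    if len(datagram) < 7:  # byte 0 + Version + two length bytes
        raise ValueError("too short for a long header")
    first = datagram[0]
    if not first & 0x80:
        raise ValueError("Header Form bit is 0: not a long header")
    if not first & 0x40:
        raise ValueError("Fixed Bit is 0: packet must be discarded")
    version = int.from_bytes(datagram[1:5], "big")  # network byte order
    pos = 5
    cids = []
    for name in ("DCID", "SCID"):
        if pos >= len(datagram):
            raise ValueError(f"truncated before {name} Length")
        length = datagram[pos]
        pos += 1
        if length > QUIC_V1_MAX_CID_LEN:
            raise ValueError(f"{name} Length {length} exceeds 20")
        if pos + length > len(datagram):
            raise ValueError(f"truncated {name}")
        cids.append(datagram[pos:pos + length])
        pos += length
    return LongHeader((first & 0x30) >> 4, first & 0x0F, version,
                      cids[0], cids[1], datagram[pos:])
```

The type-specific bits are returned unmodified because, for most packet types, they are still covered by header protection at this point.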
4568 In this version of QUIC, the following packet types with the long 4569 header are defined: 4571 +------+-----------+----------------+ 4572 | Type | Name | Section | 4573 +======+===========+================+ 4574 | 0x0 | Initial | Section 17.2.2 | 4575 +------+-----------+----------------+ 4576 | 0x1 | 0-RTT | Section 17.2.3 | 4577 +------+-----------+----------------+ 4578 | 0x2 | Handshake | Section 17.2.4 | 4579 +------+-----------+----------------+ 4580 | 0x3 | Retry | Section 17.2.5 | 4581 +------+-----------+----------------+ 4583 Table 5: Long Header Packet Types 4585 The header form bit, connection ID lengths byte, Destination and 4586 Source Connection ID fields, and Version fields of a long header 4587 packet are version-independent. The other fields in the first byte 4588 are version-specific. See [QUIC-INVARIANTS] for details on how 4589 packets from different versions of QUIC are interpreted. 4591 The interpretation of the fields and the payload are specific to a 4592 version and packet type. While type-specific semantics for this 4593 version are described in the following sections, several long-header 4594 packets in this version of QUIC contain these additional fields: 4596 Reserved Bits: Two bits (those with a mask of 0x0c) of byte 0 are 4597 reserved across multiple packet types. These bits are protected 4598 using header protection; see Section 5.4 of [QUIC-TLS]. The value 4599 included prior to protection MUST be set to 0. An endpoint MUST 4600 treat receipt of a packet that has a non-zero value for these 4601 bits, after removing both packet and header protection, as a 4602 connection error of type PROTOCOL_VIOLATION. Discarding such a 4603 packet after only removing header protection can expose the 4604 endpoint to attacks; see Section 9.3 of [QUIC-TLS]. 
4606 Packet Number Length: In packet types which contain a Packet Number 4607 field, the least significant two bits (those with a mask of 0x03) 4608 of byte 0 contain the length of the packet number, encoded as an 4609 unsigned, two-bit integer that is one less than the length of the 4610 packet number field in bytes. That is, the length of the packet 4611 number field is the value of this field, plus one. These bits are 4612 protected using header protection; see Section 5.4 of [QUIC-TLS]. 4614 Length: The length of the remainder of the packet (that is, the 4615 Packet Number and Payload fields) in bytes, encoded as a variable- 4616 length integer (Section 16). 4618 Packet Number: The packet number field is 1 to 4 bytes long. The 4619 packet number has confidentiality protection separate from packet 4620 protection, as described in Section 5.4 of [QUIC-TLS]. The length 4621 of the packet number field is encoded in the Packet Number Length 4622 bits of byte 0; see above. 4624 17.2.1. Version Negotiation Packet 4626 A Version Negotiation packet is inherently not version-specific. 4627 Upon receipt by a client, it will be identified as a Version 4628 Negotiation packet based on the Version field having a value of 0. 4630 The Version Negotiation packet is a response to a client packet that 4631 contains a version that is not supported by the server, and is only 4632 sent by servers. 4634 The layout of a Version Negotiation packet is: 4636 Version Negotiation Packet { 4637 Header Form (1) = 1, 4638 Unused (7), 4639 Version (32) = 0, 4640 DCID Length (8), 4641 Destination Connection ID (0..2040), 4642 SCID Length (8), 4643 Source Connection ID (0..2040), 4644 Supported Version (32) ..., 4645 } 4647 Figure 13: Version Negotiation Packet 4649 The value in the Unused field is selected randomly by the server. 4650 Clients MUST ignore the value of this field. 
Servers SHOULD set the 4651 most significant bit of this field (0x40) to 1 so that Version 4652 Negotiation packets appear to have the Fixed Bit field. 4654 The Version field of a Version Negotiation packet MUST be set to 4655 0x00000000. 4657 The server MUST include the value from the Source Connection ID field 4658 of the packet it receives in the Destination Connection ID field. 4659 The value for Source Connection ID MUST be copied from the 4660 Destination Connection ID of the received packet, which is initially 4661 randomly selected by a client. Echoing both connection IDs gives 4662 clients some assurance that the server received the packet and that 4663 the Version Negotiation packet was not generated by an off-path 4664 attacker. 4666 As future versions of QUIC may support Connection IDs larger than the 4667 version 1 limit, Version Negotiation packets could carry Connection 4668 IDs that are longer than 20 bytes. 4670 The remainder of the Version Negotiation packet is a list of 32-bit 4671 versions which the server supports. 4673 A Version Negotiation packet cannot be explicitly acknowledged in an 4674 ACK frame by a client. Receiving another Initial packet implicitly 4675 acknowledges a Version Negotiation packet. 4677 The Version Negotiation packet does not include the Packet Number and 4678 Length fields present in other packets that use the long header form. 4679 Consequently, a Version Negotiation packet consumes an entire UDP 4680 datagram. 4682 A server MUST NOT send more than one Version Negotiation packet in 4683 response to a single UDP datagram. 4685 See Section 6 for a description of the version negotiation process. 4687 17.2.2. Initial Packet 4689 An Initial packet uses long headers with a type value of 0x0. It 4690 carries the first CRYPTO frames sent by the client and server to 4691 perform key exchange, and carries ACKs in either direction. 
4693 Initial Packet { 4694 Header Form (1) = 1, 4695 Fixed Bit (1) = 1, 4696 Long Packet Type (2) = 0, 4697 Reserved Bits (2), 4698 Packet Number Length (2), 4699 Version (32), 4700 DCID Length (8), 4701 Destination Connection ID (0..160), 4702 SCID Length (8), 4703 Source Connection ID (0..160), 4704 Token Length (i), 4705 Token (..), 4706 Length (i), 4707 Packet Number (8..32), 4708 Packet Payload (..), 4709 } 4711 Figure 14: Initial Packet 4713 The Initial packet contains a long header as well as the Length and 4714 Packet Number fields. The first byte contains the Reserved and 4715 Packet Number Length bits. Between the SCID and Length fields, there 4716 are two additional fields specific to the Initial packet. 4718 Token Length: A variable-length integer specifying the length of the 4719 Token field, in bytes. This value is zero if no token is present. 4720 Initial packets sent by the server MUST set the Token Length field 4721 to zero; clients that receive an Initial packet with a non-zero 4722 Token Length field MUST either discard the packet or generate a 4723 connection error of type PROTOCOL_VIOLATION. 4725 Token: The value of the token that was previously provided in a 4726 Retry packet or NEW_TOKEN frame. 4728 Packet Payload: The payload of the packet. 4730 In order to prevent tampering by version-unaware middleboxes, Initial 4731 packets are protected with connection- and version-specific keys 4732 (Initial keys) as described in [QUIC-TLS]. This protection does not 4733 provide confidentiality or integrity against on-path attackers, but 4734 provides some level of protection against off-path attackers. 4736 The client and server use the Initial packet type for any packet that 4737 contains an initial cryptographic handshake message. This includes 4738 all cases where a new packet containing the initial cryptographic 4739 message needs to be created, such as the packets sent after receiving 4740 a Retry packet (Section 17.2.5). 
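As a rough sketch of the Initial-specific fields described above, the following parses Token Length, Token, and Length from the bytes that follow the Source Connection ID. The function names are hypothetical. decode_varint follows the variable-length integer encoding of Section 16 (the two most significant bits of the first byte select a length of 1, 2, 4, or 8 bytes), and a ValueError stands in for discarding the packet or raising a connection error.

```python
from typing import Tuple


def decode_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Decode a variable-length integer (Section 16); return (value, new_pos)."""
    if pos >= len(buf):
        raise ValueError("truncated variable-length integer")
    length = 1 << (buf[pos] >> 6)  # 1, 2, 4, or 8 bytes
    if pos + length > len(buf):
        raise ValueError("truncated variable-length integer")
    raw = int.from_bytes(buf[pos:pos + length], "big")
    value = raw & ((1 << (8 * length - 2)) - 1)  # clear the two length bits
    return value, pos + length


def parse_initial_fields(rest: bytes, from_server: bool) -> Tuple[bytes, bytes]:
    """Parse the fields that follow the SCID in an Initial packet.

    Returns (token, protected), where protected holds the Packet Number and
    Packet Payload bytes covered by the Length field.
    """
    token_len, pos = decode_varint(rest, 0)
    if from_server and token_len != 0:
        # Clients discard such a packet or treat it as PROTOCOL_VIOLATION.
        raise ValueError("server Initial with non-zero Token Length")
    if pos + token_len > len(rest):
        raise ValueError("truncated Token")
    token = rest[pos:pos + token_len]
    pos += token_len
    length, pos = decode_varint(rest, pos)
    if pos + length > len(rest):
        raise ValueError("Length exceeds the received datagram")
    return token, rest[pos:pos + length]
```

Any bytes after the returned span belong to whatever follows this packet in the same datagram; this sketch does not process them.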
4742 A server sends its first Initial packet in response to a client 4743 Initial. A server may send multiple Initial packets. The 4744 cryptographic key exchange could require multiple round trips or 4745 retransmissions of this data. 4747 The payload of an Initial packet includes a CRYPTO frame (or frames) 4748 containing a cryptographic handshake message, ACK frames, or both. 4749 PING, PADDING, and CONNECTION_CLOSE frames are also permitted. An 4750 endpoint that receives an Initial packet containing other frames can 4751 either discard the packet as spurious or treat it as a connection 4752 error. 4754 The first packet sent by a client always includes a CRYPTO frame that 4755 contains the start or all of the first cryptographic handshake 4756 message. The first CRYPTO frame sent always begins at an offset of 4757 0; see Section 7. 4759 Note that if the server sends a HelloRetryRequest, the client will 4760 send another series of Initial packets. These Initial packets will 4761 continue the cryptographic handshake and will contain CRYPTO frames 4762 starting at an offset matching the size of the CRYPTO frames sent in 4763 the first flight of Initial packets. 4765 17.2.2.1. Abandoning Initial Packets 4767 A client stops both sending and processing Initial packets when it 4768 sends its first Handshake packet. A server stops sending and 4769 processing Initial packets when it receives its first Handshake 4770 packet. Though packets might still be in flight or awaiting 4771 acknowledgment, no further Initial packets need to be exchanged 4772 beyond this point. Initial packet protection keys are discarded (see 4773 Section 4.10.1 of [QUIC-TLS]) along with any loss recovery and 4774 congestion control state; see Section 6.5 of [QUIC-RECOVERY]. 4776 Any data in CRYPTO frames is discarded - and no longer retransmitted 4777 - when Initial keys are discarded. 4779 17.2.3. 
0-RTT 4781 A 0-RTT packet uses long headers with a type value of 0x1, followed 4782 by the Length and Packet Number fields. The first byte contains the 4783 Reserved and Packet Number Length bits. It is used to carry "early" 4784 data from the client to the server as part of the first flight, prior 4785 to handshake completion. As part of the TLS handshake, the server 4786 can accept or reject this early data. 4788 See Section 2.3 of [TLS13] for a discussion of 0-RTT data and its 4789 limitations. 4791 0-RTT Packet { 4792 Header Form (1) = 1, 4793 Fixed Bit (1) = 1, 4794 Long Packet Type (2) = 1, 4795 Reserved Bits (2), 4796 Packet Number Length (2), 4797 Version (32), 4798 DCID Length (8), 4799 Destination Connection ID (0..160), 4800 SCID Length (8), 4801 Source Connection ID (0..160), 4802 Length (i), 4803 Packet Number (8..32), 4804 Packet Payload (..), 4805 } 4807 Figure 15: 0-RTT Packet 4809 Packet numbers for 0-RTT protected packets use the same space as 4810 1-RTT protected packets. 4812 After a client receives a Retry packet, 0-RTT packets are likely to 4813 have been lost or discarded by the server. A client SHOULD attempt 4814 to resend data in 0-RTT packets after it sends a new Initial packet. 4816 A client MUST NOT reset the packet number it uses for 0-RTT packets, 4817 since the keys used to protect 0-RTT packets will not change as a 4818 result of responding to a Retry packet. Sending packets with the 4819 same packet number in that case is likely to compromise the packet 4820 protection for all 0-RTT packets because the same key and nonce could 4821 be used to protect different content. 4823 A client only receives acknowledgments for its 0-RTT packets once the 4824 handshake is complete. Consequently, a server might expect 0-RTT 4825 packets to start with a packet number of 0. 
Therefore, in 4826 determining the length of the packet number encoding for 0-RTT 4827 packets, a client MUST assume that all packets up to the current 4828 packet number are in flight, starting from a packet number of 0. 4829 Thus, 0-RTT packets could need to use a longer packet number 4830 encoding. 4832 A client MUST NOT send 0-RTT packets once it starts processing 1-RTT 4833 packets from the server. This means that 0-RTT packets cannot 4834 contain any response to frames from 1-RTT packets. For instance, a 4835 client cannot send an ACK frame in a 0-RTT packet, because that can 4836 only acknowledge a 1-RTT packet. An acknowledgment for a 1-RTT 4837 packet MUST be carried in a 1-RTT packet. 4839 A server SHOULD treat a violation of remembered limits as a 4840 connection error of an appropriate type (for instance, a 4841 FLOW_CONTROL_ERROR for exceeding stream data limits). 4843 17.2.4. Handshake Packet 4845 A Handshake packet uses long headers with a type value of 0x2, 4846 followed by the Length and Packet Number fields. The first byte 4847 contains the Reserved and Packet Number Length bits. It is used to 4848 carry acknowledgments and cryptographic handshake messages from the 4849 server and client. 4851 Handshake Packet { 4852 Header Form (1) = 1, 4853 Fixed Bit (1) = 1, 4854 Long Packet Type (2) = 2, 4855 Reserved Bits (2), 4856 Packet Number Length (2), 4857 Version (32), 4858 DCID Length (8), 4859 Destination Connection ID (0..160), 4860 SCID Length (8), 4861 Source Connection ID (0..160), 4862 Length (i), 4863 Packet Number (8..32), 4864 Packet Payload (..), 4865 } 4867 Figure 16: Handshake Protected Packet 4869 Once a client has received a Handshake packet from a server, it uses 4870 Handshake packets to send subsequent cryptographic handshake messages 4871 and acknowledgments to the server. 
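Initial, 0-RTT, and Handshake packets all carry the Packet Number Length bits in byte 0. The sketch below shows one way a sender could choose that length under the rule in Section 17.1, including the 0-RTT rule above that every packet number from 0 is assumed to be in flight, and then build the unprotected first byte. All function names are hypothetical. Treating a missing acknowledgment as "everything from 0 is unacknowledged" for the other packet types as well is an assumption of this sketch.

```python
from typing import Optional


def packet_number_length(full_pn: int, largest_acked: Optional[int]) -> int:
    """Return the smallest packet number length in bytes (1 to 4) whose range
    is more than twice the number of unacknowledged packet numbers."""
    if largest_acked is None:
        # Nothing acknowledged yet (always the case for 0-RTT packets):
        # treat every packet number from 0 to full_pn as in flight.
        unacked = full_pn + 1
    else:
        unacked = full_pn - largest_acked
    for num_bytes in (1, 2, 3, 4):
        if (1 << (8 * num_bytes)) > 2 * unacked:
            return num_bytes
    raise ValueError("too many unacknowledged packets for a 4-byte encoding")


def encode_packet_number(full_pn: int, num_bytes: int) -> bytes:
    """Keep only the least significant num_bytes bytes, in network byte order."""
    return (full_pn & ((1 << (8 * num_bytes)) - 1)).to_bytes(num_bytes, "big")


def long_header_first_byte(packet_type: int, pn_len: int) -> int:
    """Build byte 0 before header protection: Header Form = 1, Fixed Bit = 1,
    Long Packet Type, Reserved Bits = 0, Packet Number Length = pn_len - 1."""
    return 0x80 | 0x40 | (packet_type << 4) | (pn_len - 1)
```

With the values from the example in Section 17.1 (largest acknowledged 0xabe8bc), packet_number_length selects 2 bytes for 0xac5c02 and 3 bytes for 0xace8fe.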
4873 The Destination Connection ID field in a Handshake packet contains a 4874 connection ID that is chosen by the recipient of the packet; the 4875 Source Connection ID includes the connection ID that the sender of 4876 the packet wishes to use; see Section 7.2. 4878 Handshake packets have their own packet number space, and thus the 4879 first Handshake packet sent by a server contains a packet number of 4880 0. 4882 The payload of this packet contains CRYPTO frames and could contain 4883 PING, PADDING, or ACK frames. Handshake packets MAY contain 4884 CONNECTION_CLOSE frames. Endpoints MUST treat receipt of Handshake 4885 packets with other frames as a connection error. 4887 Like Initial packets (see Section 17.2.2.1), data in CRYPTO frames 4888 for Handshake packets is discarded - and no longer retransmitted - 4889 when Handshake protection keys are discarded. 4891 17.2.5. Retry Packet 4893 A Retry packet uses a long packet header with a type value of 0x3. 4894 It carries an address validation token created by the server. It is 4895 used by a server that wishes to perform a retry; see Section 8.1. 4897 Retry Packet { 4898 Header Form (1) = 1, 4899 Fixed Bit (1) = 1, 4900 Long Packet Type (2) = 3, 4901 Unused (4), 4902 Version (32), 4903 DCID Length (8), 4904 Destination Connection ID (0..160), 4905 SCID Length (8), 4906 Source Connection ID (0..160), 4907 Retry Token (..), 4908 Retry Integrity Tag (128), 4909 } 4911 Figure 17: Retry Packet 4913 A Retry packet (shown in Figure 17) does not contain any protected 4914 fields. The value in the Unused field is selected randomly by the 4915 server. In addition to the fields from the long header, it contains 4916 these additional fields: 4918 Retry Token: An opaque token that the server can use to validate the 4919 client's address. 4921 Retry Integrity Tag: See the Retry Packet Integrity section of 4922 [QUIC-TLS]. 4924 17.2.5.1.
Sending a Retry Packet 4926 The server populates the Destination Connection ID with the 4927 connection ID that the client included in the Source Connection ID of 4928 the Initial packet. 4930 The server includes a connection ID of its choice in the Source 4931 Connection ID field. This value MUST NOT be equal to the Destination 4932 Connection ID field of the packet sent by the client. A client MUST 4933 discard a Retry packet that contains a Source Connection ID field 4934 that is identical to the Destination Connection ID field of its 4935 Initial packet. The client MUST use the value from the Source 4936 Connection ID field of the Retry packet in the Destination Connection 4937 ID field of subsequent packets that it sends. 4939 A server MAY send Retry packets in response to Initial and 0-RTT 4940 packets. A server can either discard or buffer 0-RTT packets that it 4941 receives. A server can send multiple Retry packets as it receives 4942 Initial or 0-RTT packets. A server MUST NOT send more than one Retry 4943 packet in response to a single UDP datagram. 4945 17.2.5.2. Handling a Retry Packet 4947 A client MUST accept and process at most one Retry packet for each 4948 connection attempt. After the client has received and processed an 4949 Initial or Retry packet from the server, it MUST discard any 4950 subsequent Retry packets that it receives. 4952 Clients MUST discard Retry packets that have a Retry Integrity Tag 4953 that cannot be validated; see the Retry Packet Integrity section of 4954 [QUIC-TLS]. This diminishes an off-path attacker's ability to inject 4955 a Retry packet and protects against accidental corruption of Retry 4956 packets. A client MUST discard a Retry packet with a zero-length 4957 Retry Token field. 4959 The client responds to a Retry packet with an Initial packet that 4960 includes the provided Retry Token to continue connection 4961 establishment.
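The client-side discard rules above can be collected into a single check. This is an illustrative sketch only: RetryPacket, ClientAttempt, and accept_retry are hypothetical names, and the Retry Integrity Tag check defined in [QUIC-TLS] is assumed to have been performed elsewhere and is passed in as a boolean.

```python
from dataclasses import dataclass


@dataclass
class RetryPacket:
    scid: bytes                # Source Connection ID chosen by the server
    token: bytes               # Retry Token
    integrity_tag_valid: bool  # result of the [QUIC-TLS] check (assumed input)


@dataclass
class ClientAttempt:
    initial_dcid: bytes        # DCID of the client's first Initial packet
    seen_server_initial_or_retry: bool = False


def accept_retry(attempt: ClientAttempt, retry: RetryPacket) -> bool:
    """Return True if the client should process this Retry, False to discard."""
    if attempt.seen_server_initial_or_retry:
        return False  # at most one Retry per connection attempt
    if not retry.integrity_tag_valid:
        return False  # possibly injected or corrupted
    if len(retry.token) == 0:
        return False  # zero-length Retry Token
    if retry.scid == attempt.initial_dcid:
        return False  # server must choose a different connection ID
    attempt.seen_server_initial_or_retry = True
    return True
```

In a fuller implementation the same flag would also be set when the client processes an Initial packet from the server.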
4963 A client sets the Destination Connection ID field of this Initial 4964 packet to the value from the Source Connection ID in the Retry 4965 packet. Changing Destination Connection ID also results in a change 4966 to the keys used to protect the Initial packet. It also sets the 4967 Token field to the token provided in the Retry. The client MUST NOT 4968 change the Source Connection ID because the server could include the 4969 connection ID as part of its token validation logic; see 4970 Section 8.1.4. 4972 A Retry packet does not include a packet number and cannot be 4973 explicitly acknowledged by a client. 4975 17.2.5.3. Continuing a Handshake After Retry 4977 The next Initial packet from the client uses the connection ID and 4978 token values from the Retry packet; see Section 7.2. Aside from 4979 this, the Initial packet sent by the client is subject to the same 4980 restrictions as the first Initial packet. A client MUST use the same 4981 cryptographic handshake message it includes in this packet. A server 4982 MAY treat a packet that contains a different cryptographic handshake 4983 message as a connection error or discard it. 4985 A client MAY attempt 0-RTT after receiving a Retry packet by sending 4986 0-RTT packets to the connection ID provided by the server. A client 4987 MUST NOT change the cryptographic handshake message it sends in 4988 response to receiving a Retry. 4990 A client MUST NOT reset the packet number for any packet number space 4991 after processing a Retry packet; Section 17.2.3 contains more 4992 information on this. 4994 A server acknowledges the use of a Retry packet for a connection 4995 using the retry_source_connection_id transport parameter; see 4996 Section 18.2. If the server sends a Retry packet, it also 4997 subsequently includes the value of the Source Connection ID field 4998 from the Retry packet in its retry_source_connection_id transport 4999 parameter. 
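To make the constraints on the post-Retry Initial packet concrete, the sketch below derives its parameters from those of the first Initial packet. InitialParams and initial_after_retry are hypothetical names. Only the Destination Connection ID and the Token change. The Source Connection ID, the cryptographic handshake message, and the packet number sequence are carried over unchanged.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class InitialParams:
    dcid: bytes               # Destination Connection ID
    scid: bytes               # Source Connection ID
    token: bytes              # empty when no token is present
    next_packet_number: int   # never reset after a Retry
    crypto_message: bytes     # first cryptographic handshake message


def initial_after_retry(sent: InitialParams, retry_scid: bytes,
                        retry_token: bytes) -> InitialParams:
    """Derive the parameters of the Initial packet sent in response to a Retry.

    The new DCID also changes the Initial keys, so the packet has to be
    protected again rather than retransmitted byte for byte.
    """
    return replace(sent, dcid=retry_scid, token=retry_token)
```

For example, a client that sent its first Initial with a randomly chosen DCID would call initial_after_retry with the Source Connection ID and Retry Token taken from the Retry packet, then continue numbering packets from next_packet_number.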
5001 If the client received and processed a Retry packet, it MUST validate 5002 that the retry_source_connection_id transport parameter is present 5003 and correct; otherwise, it MUST validate that the transport parameter 5004 is absent. A client MUST treat a failed validation as a connection 5005 error of type PROTOCOL_VIOLATION. 5007 17.3. Short Header Packets 5009 This version of QUIC defines a single packet type which uses the 5010 short packet header. 5012 Short Header Packet { 5013 Header Form (1) = 0, 5014 Fixed Bit (1) = 1, 5015 Spin Bit (1), 5016 Reserved Bits (2), 5017 Key Phase (1), 5018 Packet Number Length (2), 5019 Destination Connection ID (0..160), 5020 Packet Number (8..32), 5021 Packet Payload (..), 5022 } 5024 Figure 18: Short Header Packet Format 5026 The short header can be used after the version and 1-RTT keys are 5027 negotiated. Packets that use the short header contain the following 5028 fields: 5030 Header Form: The most significant bit (0x80) of byte 0 is set to 0 5031 for the short header. 5033 Fixed Bit: The next bit (0x40) of byte 0 is set to 1. Packets 5034 containing a zero value for this bit are not valid packets in this 5035 version and MUST be discarded. 5037 Spin Bit: The third most significant bit (0x20) of byte 0 is the 5038 latency spin bit, set as described in Section 17.3.1. 5040 Reserved Bits: The next two bits (those with a mask of 0x18) of byte 5041 0 are reserved. These bits are protected using header protection; 5042 see Section 5.4 of [QUIC-TLS]. The value included prior to 5043 protection MUST be set to 0. An endpoint MUST treat receipt of a 5044 packet that has a non-zero value for these bits, after removing 5045 both packet and header protection, as a connection error of type 5046 PROTOCOL_VIOLATION. Discarding such a packet after only removing 5047 header protection can expose the endpoint to attacks; see 5048 Section 9.3 of [QUIC-TLS]. 
5050 Key Phase: The next bit (0x04) of byte 0 indicates the key phase, 5051 which allows a recipient of a packet to identify the packet 5052 protection keys that are used to protect the packet. See 5053 [QUIC-TLS] for details. This bit is protected using header 5054 protection; see Section 5.4 of [QUIC-TLS]. 5056 Packet Number Length: The least significant two bits (those with a 5057 mask of 0x03) of byte 0 contain the length of the packet number, 5058 encoded as an unsigned, two-bit integer that is one less than the 5059 length of the packet number field in bytes. That is, the length 5060 of the packet number field is the value of this field, plus one. 5061 These bits are protected using header protection; see Section 5.4 5062 of [QUIC-TLS]. 5064 Destination Connection ID: The Destination Connection ID is a 5065 connection ID that is chosen by the intended recipient of the 5066 packet. See Section 5.1 for more details. 5068 Packet Number: The packet number field is 1 to 4 bytes long. The 5069 packet number has confidentiality protection separate from packet 5070 protection, as described in Section 5.4 of [QUIC-TLS]. The length 5071 of the packet number field is encoded in Packet Number Length 5072 field. See Section 17.1 for details. 5074 Packet Payload: Packets with a short header always include a 1-RTT 5075 protected payload. 5077 The header form bit and the connection ID field of a short header 5078 packet are version-independent. The remaining fields are specific to 5079 the selected QUIC version. See [QUIC-INVARIANTS] for details on how 5080 packets from different versions of QUIC are interpreted. 5082 17.3.1. Latency Spin Bit 5084 The latency spin bit enables passive latency monitoring from 5085 observation points on the network path throughout the duration of a 5086 connection. The spin bit is only present in the short packet header, 5087 since it is possible to measure the initial RTT of a connection by 5088 observing the handshake. 
Therefore, the spin bit is available after 5089 version negotiation and connection establishment are completed. On- 5090 path measurement and use of the latency spin bit is further discussed 5091 in [QUIC-MANAGEABILITY]. 5093 The spin bit is an OPTIONAL feature of QUIC. A QUIC stack that 5094 chooses to support the spin bit MUST implement it as specified in 5095 this section. 5097 Each endpoint unilaterally decides if the spin bit is enabled or 5098 disabled for a connection. Implementations MUST allow administrators 5099 of clients and servers to disable the spin bit either globally or on 5100 a per-connection basis. Even when the spin bit is not disabled by 5101 the administrator, endpoints MUST disable their use of the spin bit 5102 for a random selection of at least one in every 16 network paths, or 5103 for one in every 16 connection IDs. As each endpoint disables the 5104 spin bit independently, this ensures that the spin bit signal is 5105 disabled on approximately one in eight network paths. 5107 When the spin bit is disabled, endpoints MAY set the spin bit to any 5108 value, and MUST ignore any incoming value. It is RECOMMENDED that 5109 endpoints set the spin bit to a random value either chosen 5110 independently for each packet or chosen independently for each 5111 connection ID. 5113 If the spin bit is enabled for the connection, the endpoint maintains 5114 a spin value and sets the spin bit in the short header to the 5115 currently stored value when a packet with a short header is sent out. 5116 The spin value is initialized to 0 in the endpoint at connection 5117 start. Each endpoint also remembers the highest packet number seen 5118 from its peer on the connection. 5120 When a server receives a short header packet that increments the 5121 highest packet number seen by the server from the client, it sets the 5122 spin value to be equal to the spin bit in the received packet. 
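The spin value bookkeeping for an endpoint that has the spin bit enabled can be sketched as follows. SpinState is a hypothetical name. The server branch implements the rule above. The client branch implements the mirror-image rule given in the next paragraph, under which the client stores the inverse of the received bit.

```python
class SpinState:
    """Spin bit state for one endpoint with the spin bit enabled."""

    def __init__(self, is_client: bool) -> None:
        self.is_client = is_client
        self.spin_value = 0        # initialized to 0 at connection start
        self.highest_pn_seen = -1  # no short header packet received yet

    def on_short_header_received(self, packet_number: int, spin_bit: int) -> None:
        """Update the spin value only if the packet raises the highest
        packet number seen from the peer."""
        if packet_number <= self.highest_pn_seen:
            return  # reordered or duplicate packet: leave the state alone
        self.highest_pn_seen = packet_number
        # A server reflects the received bit; a client inverts it.
        self.spin_value = (spin_bit ^ 1) if self.is_client else spin_bit

    def spin_bit_to_send(self) -> int:
        """Value to place in the Spin Bit (0x20) of outgoing short headers."""
        return self.spin_value
```

Ignoring packets that do not advance the highest packet number keeps reordered packets from producing spurious toggles.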
5124 When a client receives a short header packet that increments the 5125 highest packet number seen by the client from the server, it sets the 5126 spin value to the inverse of the spin bit in the received packet. 5128 An endpoint resets its spin value to zero when sending the first 5129 packet of a given connection with a new connection ID. This reduces 5130 the risk that transient spin bit state can be used to link flows 5131 across connection migration or ID change. 5133 With this mechanism, the server reflects the spin value received, 5134 while the client 'spins' it after one RTT. On-path observers can 5135 measure the time between two spin bit toggle events to estimate the 5136 end-to-end RTT of a connection. 5138 18. Transport Parameter Encoding 5140 The extension_data field of the quic_transport_parameters extension 5141 defined in [QUIC-TLS] contains the QUIC transport parameters. They 5142 are encoded as a sequence of transport parameters, as shown in 5143 Figure 19: 5145 Transport Parameters { 5146 Transport Parameter (..) ..., 5147 } 5149 Figure 19: Sequence of Transport Parameters 5151 Each transport parameter is encoded as an (identifier, length, value) 5152 tuple, as shown in Figure 20: 5154 Transport Parameter { 5155 Transport Parameter ID (i), 5156 Transport Parameter Length (i), 5157 Transport Parameter Value (..), 5158 } 5160 Figure 20: Transport Parameter Encoding 5162 The Transport Parameter Length field contains the length of the 5163 Transport Parameter Value field. 5165 QUIC encodes transport parameters into a sequence of bytes, which are 5166 then included in the cryptographic handshake. 5168 18.1. Reserved Transport Parameters 5170 Transport parameters with an identifier of the form "31 * N + 27" for 5171 integer values of N are reserved to exercise the requirement that 5172 unknown transport parameters be ignored. These transport parameters 5173 have no semantics, and may carry arbitrary values. 5175 18.2. 
Transport Parameter Definitions 5177 This section details the transport parameters defined in this 5178 document. 5180 Many transport parameters listed here have integer values. Those 5181 transport parameters that are identified as integers use a variable- 5182 length integer encoding; see Section 16. Transport parameters have a 5183 default value of 0 if the transport parameter is absent unless 5184 otherwise stated. 5186 The following transport parameters are defined: 5188 original_destination_connection_id (0x00): The value of the 5189 Destination Connection ID field from the first Initial packet sent 5190 by the client; see Section 7.3. This transport parameter is only 5191 sent by a server. 5193 max_idle_timeout (0x01): The max idle timeout is a value in 5194 milliseconds that is encoded as an integer; see Section 10.2. 5195 Idle timeout is disabled when both endpoints omit this transport 5196 parameter or specify a value of 0. 5198 stateless_reset_token (0x02): A stateless reset token is used in 5199 verifying a stateless reset; see Section 10.4. This parameter is 5200 a sequence of 16 bytes. This transport parameter MUST NOT be sent 5201 by a client, but MAY be sent by a server. A server that does not 5202 send this transport parameter cannot use stateless reset 5203 (Section 10.4) for the connection ID negotiated during the 5204 handshake. 5206 max_udp_payload_size (0x03): The maximum UDP payload size parameter 5207 is an integer value that limits the size of UDP payloads that the 5208 endpoint is willing to receive. UDP packets with payloads larger 5209 than this limit are not likely to be processed by the receiver. 5211 The default for this parameter is the maximum permitted UDP 5212 payload of 65527. Values below 1200 are invalid. 5214 This limit acts as an additional constraint on datagram size 5215 in the same way as the path MTU, but it is a property of the 5216 endpoint and not the path.
It is expected that this is the space 5217 an endpoint dedicates to holding incoming packets. 5219 initial_max_data (0x04): The initial maximum data parameter is an 5220 integer value that contains the initial value for the maximum 5221 amount of data that can be sent on the connection. This is 5222 equivalent to sending a MAX_DATA (Section 19.9) for the connection 5223 immediately after completing the handshake. 5225 initial_max_stream_data_bidi_local (0x05): This parameter is an 5226 integer value specifying the initial flow control limit for 5227 locally-initiated bidirectional streams. This limit applies to 5228 newly created bidirectional streams opened by the endpoint that 5229 sends the transport parameter. In client transport parameters, 5230 this applies to streams with an identifier with the least 5231 significant two bits set to 0x0; in server transport parameters, 5232 this applies to streams with the least significant two bits set to 5233 0x1. 5235 initial_max_stream_data_bidi_remote (0x06): This parameter is an 5236 integer value specifying the initial flow control limit for peer- 5237 initiated bidirectional streams. This limit applies to newly 5238 created bidirectional streams opened by the endpoint that receives 5239 the transport parameter. In client transport parameters, this 5240 applies to streams with an identifier with the least significant 5241 two bits set to 0x1; in server transport parameters, this applies 5242 to streams with the least significant two bits set to 0x0. 5244 initial_max_stream_data_uni (0x07): This parameter is an integer 5245 value specifying the initial flow control limit for unidirectional 5246 streams. This limit applies to newly created unidirectional 5247 streams opened by the endpoint that receives the transport 5248 parameter. 
In client transport parameters, this applies to 5249 streams with an identifier with the least significant two bits set 5250 to 0x3; in server transport parameters, this applies to streams 5251 with the least significant two bits set to 0x2. 5253 initial_max_streams_bidi (0x08): The initial maximum bidirectional 5254 streams parameter is an integer value that contains the initial 5255 maximum number of bidirectional streams the peer may initiate. If 5256 this parameter is absent or zero, the peer cannot open 5257 bidirectional streams until a MAX_STREAMS frame is sent. Setting 5258 this parameter is equivalent to sending a MAX_STREAMS 5259 (Section 19.11) of the corresponding type with the same value. 5261 initial_max_streams_uni (0x09): The initial maximum unidirectional 5262 streams parameter is an integer value that contains the initial 5263 maximum number of unidirectional streams the peer may initiate. 5264 If this parameter is absent or zero, the peer cannot open 5265 unidirectional streams until a MAX_STREAMS frame is sent. Setting 5266 this parameter is equivalent to sending a MAX_STREAMS 5267 (Section 19.11) of the corresponding type with the same value. 5269 ack_delay_exponent (0x0a): The ACK delay exponent is an integer 5270 value indicating an exponent used to decode the ACK Delay field in 5271 the ACK frame (Section 19.3). If this value is absent, a default 5272 value of 3 is assumed (indicating a multiplier of 8). Values 5273 above 20 are invalid. 5275 max_ack_delay (0x0b): The maximum ACK delay is an integer value 5276 indicating the maximum amount of time in milliseconds by which the 5277 endpoint will delay sending acknowledgments. This value SHOULD 5278 include the receiver's expected delays in alarms firing. For 5279 example, if a receiver sets a timer for 5ms and alarms commonly 5280 fire up to 1ms late, then it should send a max_ack_delay of 6ms. 5281 If this value is absent, a default of 25 milliseconds is assumed. 
5282 Values of 2^14 or greater are invalid. 5284 disable_active_migration (0x0c): The disable active migration 5285 transport parameter is included if the endpoint does not support 5286 active connection migration (Section 9). Peers of an endpoint 5287 that sets this transport parameter MUST NOT send any packets, 5288 including probing packets (Section 9.1), from a local address or 5289 port other than that used to perform the handshake. This 5290 parameter is a zero-length value. 5292 preferred_address (0x0d): The server's preferred address is used to 5293 effect a change in server address at the end of the handshake, as 5294 described in Section 9.6. The format of this transport parameter 5295 is shown in Figure 21. This transport parameter is only sent by a 5296 server. Servers MAY choose to only send a preferred address of 5297 one address family by sending an all-zero address and port 5298 (0.0.0.0:0 or [::]:0) for the other family. IP addresses are 5299 encoded in network byte order. 5301 The Connection ID field and the Stateless Reset Token field 5302 contain an alternative connection ID that has a sequence number of 5303 1; see Section 5.1.1. Having these values bundled with the 5304 preferred address ensures that there will be at least one unused 5305 active connection ID when the client initiates migration to the 5306 preferred address. 5308 The Connection ID and Stateless Reset Token fields of a preferred 5309 address are identical in syntax and semantics to the corresponding 5310 fields of a NEW_CONNECTION_ID frame (Section 19.15). A server 5311 that chooses a zero-length connection ID MUST NOT provide a 5312 preferred address. Similarly, a server MUST NOT include a zero- 5313 length connection ID in this transport parameter. A client MUST 5314 treat violation of these requirements as a connection error of 5315 type TRANSPORT_PARAMETER_ERROR.
5317 Preferred Address { 5318 IPv4 Address (32), 5319 IPv4 Port (16), 5320 IPv6 Address (128), 5321 IPv6 Port (16), 5322 CID Length (8), 5323 Connection ID (..), 5324 Stateless Reset Token (128), 5325 } 5327 Figure 21: Preferred Address format 5329 active_connection_id_limit (0x0e): The active connection ID limit is 5330 an integer value specifying the maximum number of connection IDs 5331 from the peer that an endpoint is willing to store. This value 5332 includes the connection ID received during the handshake, that 5333 received in the preferred_address transport parameter, and those 5334 received in NEW_CONNECTION_ID frames. The value of the 5335 active_connection_id_limit parameter MUST be at least 2. An 5336 endpoint that receives a value less than 2 MUST close the 5337 connection with an error of type TRANSPORT_PARAMETER_ERROR. If 5338 this transport parameter is absent, a default of 2 is assumed. If 5339 an endpoint issues a zero-length connection ID, it will never send 5340 a NEW_CONNECTION_ID frame and therefore ignores the 5341 active_connection_id_limit value received from its peer. 5343 initial_source_connection_id (0x0f): The value that the endpoint 5344 included in the Source Connection ID field of the first Initial 5345 packet it sends for the connection; see Section 7.3. 5347 retry_source_connection_id (0x10): The value that the server 5348 included in the Source Connection ID field of a Retry packet; see 5349 Section 7.3. This transport parameter is only sent by a server. 5351 If present, transport parameters that set initial flow control limits 5352 (initial_max_stream_data_bidi_local, 5353 initial_max_stream_data_bidi_remote, and initial_max_stream_data_uni) 5354 are equivalent to sending a MAX_STREAM_DATA frame (Section 19.10) on 5355 every stream of the corresponding type immediately after opening. If 5356 the transport parameter is absent, streams of that type start with a 5357 flow control limit of 0.
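The mapping from a stream ID to the initial flow control parameter that governs it can be sketched as follows. This is an illustrative sketch only, not part of the protocol definition: the function name initial_receive_limit, the dictionary representation of transport parameters, and the sender_is_client flag are assumptions made for the example. It returns the limit that the endpoint which sent the parameters grants to its peer on a given stream, using the two least significant bits of the stream ID as described in the parameter definitions above.

```python
from typing import Optional


def initial_receive_limit(params: dict, sender_is_client: bool,
                          stream_id: int) -> Optional[int]:
    """Initial flow control limit granted by the endpoint that sent `params`.

    `params` is a hypothetical dict of transport parameter names to integer
    values. Returns None when the sending endpoint never receives data on
    the stream (a unidirectional stream that it opened itself).
    """
    initiated_by_client = (stream_id & 0x1) == 0   # bit 0: initiator
    unidirectional = (stream_id & 0x2) != 0        # bit 1: directionality
    locally_initiated = initiated_by_client == sender_is_client

    if unidirectional:
        if locally_initiated:
            return None  # send-only from the parameter sender's side
        return params.get("initial_max_stream_data_uni", 0)

    key = ("initial_max_stream_data_bidi_local" if locally_initiated
           else "initial_max_stream_data_bidi_remote")
    # An absent parameter means the stream starts with a limit of 0.
    return params.get(key, 0)


client_params = {
    "initial_max_stream_data_bidi_local": 100,
    "initial_max_stream_data_bidi_remote": 200,
    "initial_max_stream_data_uni": 300,
}
# In client parameters: 0x0 -> bidi_local, 0x1 -> bidi_remote, 0x3 -> uni.
limits = [initial_receive_limit(client_params, True, sid) for sid in range(4)]
```

For the client parameters above, stream 2 (a client-initiated unidirectional stream) has no receive limit on the client side, because the client only sends on it.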
5359 A client MUST NOT include any server-only transport parameter: 5360 original_destination_connection_id, preferred_address, 5361 retry_source_connection_id, or stateless_reset_token. A server MUST 5362 treat receipt of any of these transport parameters as a connection 5363 error of type TRANSPORT_PARAMETER_ERROR. 5365 19. Frame Types and Formats 5367 As described in Section 12.4, packets contain one or more frames. 5368 This section describes the format and semantics of the core QUIC 5369 frame types. 5371 19.1. PADDING Frame 5373 The PADDING frame (type=0x00) has no semantic value. PADDING frames 5374 can be used to increase the size of a packet. Padding can be used to 5375 increase an initial client packet to the minimum required size, or to 5376 provide protection against traffic analysis for protected packets. 5378 A PADDING frame has no content. That is, a PADDING frame consists of 5379 the single byte that identifies the frame as a PADDING frame. 5381 19.2. PING Frame 5383 Endpoints can use PING frames (type=0x01) to verify that their peers 5384 are still alive or to check reachability to the peer. The PING frame 5385 contains no additional fields. 5387 The receiver of a PING frame simply needs to acknowledge the packet 5388 containing this frame. 5390 The PING frame can be used to keep a connection alive when an 5391 application or application protocol wishes to prevent the connection 5392 from timing out. An application protocol SHOULD provide guidance 5393 about the conditions under which generating a PING is recommended. 5394 This guidance SHOULD indicate whether it is the client or the server 5395 that is expected to send the PING. Having both endpoints send PING 5396 frames without coordination can produce an excessive number of 5397 packets and poor performance. 5399 A connection will time out if no packets are sent or received for a 5400 period longer than the time negotiated using the max_idle_timeout 5401 transport parameter; see Section 10. 
However, state in middleboxes 5402 might time out earlier than that. Though REQ-5 in [RFC4787] 5403 recommends a 2 minute timeout interval, experience shows that sending 5404 packets every 15 to 30 seconds is necessary to prevent the majority 5405 of middleboxes from losing state for UDP flows. 5407 19.3. ACK Frames 5409 Receivers send ACK frames (types 0x02 and 0x03) to inform senders of 5410 packets they have received and processed. The ACK frame contains one 5411 or more ACK Ranges. ACK Ranges identify acknowledged packets. If 5412 the frame type is 0x03, ACK frames also contain the sum of QUIC 5413 packets with associated ECN marks received on the connection up until 5414 this point. QUIC implementations MUST properly handle both types 5415 and, if they have enabled ECN for packets they send, they SHOULD use 5416 the information in the ECN section to manage their congestion state. 5418 QUIC acknowledgements are irrevocable. Once acknowledged, a packet 5419 remains acknowledged, even if it does not appear in a future ACK 5420 frame. This is unlike TCP SACKs ([RFC2018]). 5422 Packets from different packet number spaces can be identified using 5423 the same numeric value. An acknowledgment for a packet needs to 5424 indicate both a packet number and a packet number space. This is 5425 accomplished by having each ACK frame only acknowledge packet numbers 5426 in the same space as the packet in which the ACK frame is contained. 5428 Version Negotiation and Retry packets cannot be acknowledged because 5429 they do not contain a packet number. Rather than relying on ACK 5430 frames, these packets are implicitly acknowledged by the next Initial 5431 packet sent by the client. 5433 An ACK frame is shown in Figure 22. 5435 ACK Frame { 5436 Type (i) = 0x02..0x03, 5437 Largest Acknowledged (i), 5438 ACK Delay (i), 5439 ACK Range Count (i), 5440 First ACK Range (i), 5441 ACK Range (..) 
..., 5442 [ECN Counts (..)], 5443 } 5445 Figure 22: ACK Frame Format 5447 ACK frames contain the following fields: 5449 Largest Acknowledged: A variable-length integer representing the 5450 largest packet number the peer is acknowledging; this is usually 5451 the largest packet number that the peer has received prior to 5452 generating the ACK frame. Unlike the packet number in the QUIC 5453 long or short header, the value in an ACK frame is not truncated. 5455 ACK Delay: A variable-length integer representing the time delta in 5456 microseconds between when this ACK was sent and when the largest 5457 acknowledged packet, as indicated in the Largest Acknowledged 5458 field, was received by this peer. The value of the ACK Delay 5459 field is scaled by multiplying the encoded value by 2 to the power 5460 of the value of the ack_delay_exponent transport parameter set by 5461 the sender of the ACK frame; see Section 18.2. Scaling in this 5462 fashion allows for a larger range of values with a shorter 5463 encoding at the cost of lower resolution. Because the receiver 5464 doesn't use the ACK Delay for Initial and Handshake packets, a 5465 sender SHOULD send a value of 0. 5467 ACK Range Count: A variable-length integer specifying the number of 5468 Gap and ACK Range fields in the frame. 5470 First ACK Range: A variable-length integer indicating the number of 5471 contiguous packets preceding the Largest Acknowledged that are 5472 being acknowledged. The First ACK Range is encoded as an ACK 5473 Range; see Section 19.3.1 starting from the Largest Acknowledged. 5474 That is, the smallest packet acknowledged in the range is 5475 determined by subtracting the First ACK Range value from the 5476 Largest Acknowledged. 5478 ACK Ranges: Contains additional ranges of packets which are 5479 alternately not acknowledged (Gap) and acknowledged (ACK Range); 5480 see Section 19.3.1. 5482 ECN Counts: The three ECN Counts; see Section 19.3.2. 5484 19.3.1. 
ACK Ranges 5486 Each ACK Range consists of alternating Gap and ACK Range values in 5487 descending packet number order. ACK Ranges can be repeated. The 5488 number of Gap and ACK Range values is determined by the ACK Range 5489 Count field; one of each value is present for each value in the ACK 5490 Range Count field. 5492 ACK Ranges are structured as shown in Figure 23. 5494 ACK Range { 5495 Gap (i), 5496 ACK Range Length (i), 5497 } 5499 Figure 23: ACK Ranges 5501 The fields that form each ACK Range are: 5503 Gap: A variable-length integer indicating the number of contiguous 5504 unacknowledged packets preceding the packet number one lower than 5505 the smallest in the preceding ACK Range. 5507 ACK Range Length: A variable-length integer indicating the number of 5508 contiguous acknowledged packets preceding the largest packet 5509 number, as determined by the preceding Gap. 5511 Gap and ACK Range values use a relative integer encoding for 5512 efficiency. Though each encoded value is positive, the values are 5513 subtracted, so that each ACK Range describes progressively lower- 5514 numbered packets. 5516 Each ACK Range acknowledges a contiguous range of packets by 5517 indicating the number of acknowledged packets that precede the 5518 largest packet number in that range. A value of zero indicates that 5519 only the largest packet number is acknowledged. Larger ACK Range 5520 values indicate a larger range, with corresponding lower values for 5521 the smallest packet number in the range. Thus, given a largest 5522 packet number for the range, the smallest value is determined by the 5523 formula: 5525 smallest = largest - ack_range 5527 An ACK Range acknowledges all packets between the smallest packet 5528 number and the largest, inclusive. 5530 The largest value for an ACK Range is determined by cumulatively 5531 subtracting the size of all preceding ACK Ranges and Gaps. 5533 Each Gap indicates a range of packets that are not being 5534 acknowledged.
The number of packets in the gap is one higher than 5535 the encoded value of the Gap field. 5537 The value of the Gap field establishes the largest packet number 5538 value for the subsequent ACK Range using the following formula: 5540 largest = previous_smallest - gap - 2 5542 If any computed packet number is negative, an endpoint MUST generate 5543 a connection error of type FRAME_ENCODING_ERROR. 5545 19.3.2. ECN Counts 5547 The ACK frame uses the least significant bit (that is, type 0x03) to 5548 indicate ECN feedback and report receipt of QUIC packets with 5549 associated ECN codepoints of ECT(0), ECT(1), or CE in the packet's IP 5550 header. ECN Counts are only present when the ACK frame type is 0x03. 5552 ECN Counts are only parsed when the ACK frame type is 0x03. There 5553 are 3 ECN counts, as shown in Figure 24. 5555 ECN Counts { 5556 ECT0 Count (i), 5557 ECT1 Count (i), 5558 ECN-CE Count (i), 5559 } 5561 Figure 24: ECN Count Format 5563 The three ECN Counts are: 5565 ECT0 Count: A variable-length integer representing the total number 5566 of packets received with the ECT(0) codepoint in the packet number 5567 space of the ACK frame. 5569 ECT1 Count: A variable-length integer representing the total number 5570 of packets received with the ECT(1) codepoint in the packet number 5571 space of the ACK frame. 5573 ECN-CE Count: A variable-length integer representing the total number of 5574 packets received with the CE codepoint in the packet number space 5575 of the ACK frame. 5577 ECN counts are maintained separately for each packet number space. 5579 19.4. RESET_STREAM Frame 5581 An endpoint uses a RESET_STREAM frame (type=0x04) to abruptly 5582 terminate the sending part of a stream. 5584 After sending a RESET_STREAM, an endpoint ceases transmission and 5585 retransmission of STREAM frames on the identified stream. A receiver 5586 of RESET_STREAM can discard any data that it already received on that 5587 stream.
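As an illustration of the ACK Range arithmetic in Section 19.3.1, the following sketch expands the decoded fields of an ACK frame into inclusive packet number ranges. It assumes the variable-length integers have already been decoded; the names decode_ack_ranges, decode_ack_delay, and FrameEncodingError are hypothetical and chosen only for this example. It applies smallest = largest - ack_range to each range and largest = previous_smallest - gap - 2 to each Gap, and it treats any negative packet number as a FRAME_ENCODING_ERROR.

```python
class FrameEncodingError(Exception):
    """Stand-in for a connection error of type FRAME_ENCODING_ERROR."""


def decode_ack_ranges(largest_acknowledged, first_ack_range, ranges):
    """Expand ACK frame fields into (smallest, largest) inclusive ranges.

    `ranges` is a list of (gap, ack_range_length) pairs in the order they
    appear in the frame, that is, in descending packet number order.
    """
    largest = largest_acknowledged
    smallest = largest - first_ack_range
    if smallest < 0:
        raise FrameEncodingError("negative packet number")
    acked = [(smallest, largest)]
    for gap, ack_range_length in ranges:
        # The gap covers gap + 1 unacknowledged packets below `smallest`.
        largest = smallest - gap - 2
        if largest < 0:
            raise FrameEncodingError("negative packet number")
        smallest = largest - ack_range_length
        if smallest < 0:
            raise FrameEncodingError("negative packet number")
        acked.append((smallest, largest))
    return acked


def decode_ack_delay(encoded_delay, ack_delay_exponent=3):
    """Scale the ACK Delay field to microseconds (default exponent is 3)."""
    return encoded_delay * (2 ** ack_delay_exponent)


# Largest Acknowledged 20, First ACK Range 2, then Gap 1 and length 3:
# packets 18..20 are acknowledged, 16..17 are not, 12..15 are acknowledged.
result = decode_ack_ranges(20, 2, [(1, 3)])
```

With the default ack_delay_exponent of 3, an encoded ACK Delay of 10 corresponds to 80 microseconds.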
5589 An endpoint that receives a RESET_STREAM frame for a send-only stream 5590 MUST terminate the connection with error STREAM_STATE_ERROR. 5592 The RESET_STREAM frame is shown in Figure 25. 5594 RESET_STREAM Frame { 5595 Type (i) = 0x04, 5596 Stream ID (i), 5597 Application Protocol Error Code (i), 5598 Final Size (i), 5599 } 5601 Figure 25: RESET_STREAM Frame Format 5603 RESET_STREAM frames contain the following fields: 5605 Stream ID: A variable-length integer encoding of the Stream ID of 5606 the stream being terminated. 5608 Application Protocol Error Code: A variable-length integer 5609 containing the application protocol error code (see Section 20.1) 5610 that indicates why the stream is being closed. 5612 Final Size: A variable-length integer indicating the final size of 5613 the stream by the RESET_STREAM sender, in units of bytes. 5615 19.5. STOP_SENDING Frame 5617 An endpoint uses a STOP_SENDING frame (type=0x05) to communicate that 5618 incoming data is being discarded on receipt at application request. 5619 STOP_SENDING requests that a peer cease transmission on a stream. 5621 A STOP_SENDING frame can be sent for streams in the Recv or Size 5622 Known states; see Section 3.1. Receiving a STOP_SENDING frame for a 5623 locally-initiated stream that has not yet been created MUST be 5624 treated as a connection error of type STREAM_STATE_ERROR. An 5625 endpoint that receives a STOP_SENDING frame for a receive-only stream 5626 MUST terminate the connection with error STREAM_STATE_ERROR. 5628 The STOP_SENDING frame is shown in Figure 26. 5630 STOP_SENDING Frame { 5631 Type (i) = 0x05, 5632 Stream ID (i), 5633 Application Protocol Error Code (i), 5634 } 5636 Figure 26: STOP_SENDING Frame Format 5638 STOP_SENDING frames contain the following fields: 5640 Stream ID: A variable-length integer carrying the Stream ID of the 5641 stream being ignored.
5643 Application Protocol Error Code: A variable-length integer 5644 containing the application-specified reason the sender is ignoring 5645 the stream; see Section 20.1. 5647 19.6. CRYPTO Frame 5649 The CRYPTO frame (type=0x06) is used to transmit cryptographic 5650 handshake messages. It can be sent in all packet types except 0-RTT. 5651 The CRYPTO frame offers the cryptographic protocol an in-order stream 5652 of bytes. CRYPTO frames are functionally identical to STREAM frames, 5653 except that they do not bear a stream identifier; they are not flow 5654 controlled; and they do not carry markers for optional offset, 5655 optional length, and the end of the stream. 5657 The CRYPTO frame is shown in Figure 27. 5659 CRYPTO Frame { 5660 Type (i) = 0x06, 5661 Offset (i), 5662 Length (i), 5663 Crypto Data (..), 5664 } 5666 Figure 27: CRYPTO Frame Format 5668 CRYPTO frames contain the following fields: 5670 Offset: A variable-length integer specifying the byte offset in the 5671 stream for the data in this CRYPTO frame. 5673 Length: A variable-length integer specifying the length of the 5674 Crypto Data field in this CRYPTO frame. 5676 Crypto Data: The cryptographic message data. 5678 There is a separate flow of cryptographic handshake data in each 5679 encryption level, each of which starts at an offset of 0. This 5680 implies that each encryption level is treated as a separate CRYPTO 5681 stream of data. 5683 The largest offset delivered on a stream - the sum of the offset and 5684 data length - cannot exceed 2^62-1. Receipt of a frame that exceeds 5685 this limit MUST be treated as a connection error of type 5686 FRAME_ENCODING_ERROR or CRYPTO_BUFFER_EXCEEDED. 5688 Unlike STREAM frames, which include a Stream ID indicating to which 5689 stream the data belongs, the CRYPTO frame carries data for a single 5690 stream per encryption level. The stream does not have an explicit 5691 end, so CRYPTO frames do not have a FIN bit. 5693 19.7. 
NEW_TOKEN Frame 5695 A server sends a NEW_TOKEN frame (type=0x07) to provide the client 5696 with a token to send in the header of an Initial packet for a future 5697 connection. 5699 The NEW_TOKEN frame is shown in Figure 28. 5701 NEW_TOKEN Frame { 5702 Type (i) = 0x07, 5703 Token Length (i), 5704 Token (..), 5705 } 5706 Figure 28: NEW_TOKEN Frame Format 5708 NEW_TOKEN frames contain the following fields: 5710 Token Length: A variable-length integer specifying the length of the 5711 token in bytes. 5713 Token: An opaque blob that the client may use with a future Initial 5714 packet. The token MUST NOT be empty. An endpoint MUST treat 5715 receipt of a NEW_TOKEN frame with an empty Token field as a 5716 connection error of type FRAME_ENCODING_ERROR. 5718 An endpoint might receive multiple NEW_TOKEN frames that contain the 5719 same token value if packets containing the frame are incorrectly 5720 determined to be lost. Endpoints are responsible for discarding 5721 duplicate values, which might be used to link connection attempts; 5722 see Section 8.1.3. 5724 Clients MUST NOT send NEW_TOKEN frames. Servers MUST treat receipt 5725 of a NEW_TOKEN frame as a connection error of type 5726 PROTOCOL_VIOLATION. 5728 19.8. STREAM Frames 5730 STREAM frames implicitly create a stream and carry stream data. The 5731 STREAM frame takes the form 0b00001XXX (or the set of values from 5732 0x08 to 0x0f). The value of the three low-order bits of the frame 5733 type determines the fields that are present in the frame. 5735 * The OFF bit (0x04) in the frame type is set to indicate that there 5736 is an Offset field present. When set to 1, the Offset field is 5737 present. When set to 0, the Offset field is absent and the Stream 5738 Data starts at an offset of 0 (that is, the frame contains the 5739 first bytes of the stream, or the end of a stream that includes no 5740 data). 5742 * The LEN bit (0x02) in the frame type is set to indicate that there 5743 is a Length field present. 
If this bit is set to 0, the Length 5744 field is absent and the Stream Data field extends to the end of 5745 the packet. If this bit is set to 1, the Length field is present. 5747 * The FIN bit (0x01) of the frame type is set only on frames that 5748 contain the final size of the stream. Setting this bit indicates 5749 that the frame marks the end of the stream. 5751 An endpoint MUST terminate the connection with error 5752 STREAM_STATE_ERROR if it receives a STREAM frame for a locally- 5753 initiated stream that has not yet been created, or for a send-only 5754 stream. 5756 The STREAM frames are shown in Figure 29. 5758 STREAM Frame { 5759 Type (i) = 0x08..0x0f, 5760 Stream ID (i), 5761 [Offset (i)], 5762 [Length (i)], 5763 Stream Data (..), 5764 } 5766 Figure 29: STREAM Frame Format 5768 STREAM frames contain the following fields: 5770 Stream ID: A variable-length integer indicating the stream ID of the 5771 stream; see Section 2.1. 5773 Offset: A variable-length integer specifying the byte offset in the 5774 stream for the data in this STREAM frame. This field is present 5775 when the OFF bit is set to 1. When the Offset field is absent, 5776 the offset is 0. 5778 Length: A variable-length integer specifying the length of the 5779 Stream Data field in this STREAM frame. This field is present 5780 when the LEN bit is set to 1. When the LEN bit is set to 0, the 5781 Stream Data field consumes all the remaining bytes in the packet. 5783 Stream Data: The bytes from the designated stream to be delivered. 5785 When a Stream Data field has a length of 0, the offset in the STREAM 5786 frame is the offset of the next byte that would be sent. 5788 The first byte in the stream has an offset of 0. The largest offset 5789 delivered on a stream - the sum of the offset and data length - 5790 cannot exceed 2^62-1, as it is not possible to provide flow control 5791 credit for that data. 
Receipt of a frame that exceeds this limit 5792 MUST be treated as a connection error of type FRAME_ENCODING_ERROR or 5793 FLOW_CONTROL_ERROR. 5795 19.9. MAX_DATA Frame 5797 The MAX_DATA frame (type=0x10) is used in flow control to inform the 5798 peer of the maximum amount of data that can be sent on the connection 5799 as a whole. 5801 The MAX_DATA frame is shown in Figure 30. 5803 MAX_DATA Frame { 5804 Type (i) = 0x10, 5805 Maximum Data (i), 5806 } 5808 Figure 30: MAX_DATA Frame Format 5810 MAX_DATA frames contain the following fields: 5812 Maximum Data: A variable-length integer indicating the maximum 5813 amount of data that can be sent on the entire connection, in units 5814 of bytes. 5816 All data sent in STREAM frames counts toward this limit. The sum of 5817 the largest received offsets on all streams - including streams in 5818 terminal states - MUST NOT exceed the value advertised by a receiver. 5819 An endpoint MUST terminate a connection with a FLOW_CONTROL_ERROR 5820 error if it receives more data than the maximum data value that it 5821 has sent, unless this is a result of a change in the initial limits; 5822 see Section 7.4.1. 5824 19.10. MAX_STREAM_DATA Frame 5826 The MAX_STREAM_DATA frame (type=0x11) is used in flow control to 5827 inform a peer of the maximum amount of data that can be sent on a 5828 stream. 5830 A MAX_STREAM_DATA frame can be sent for streams in the Recv state; 5831 see Section 3.1. Receiving a MAX_STREAM_DATA frame for a locally- 5832 initiated stream that has not yet been created MUST be treated as a 5833 connection error of type STREAM_STATE_ERROR. An endpoint that 5834 receives a MAX_STREAM_DATA frame for a receive-only stream MUST 5835 terminate the connection with error STREAM_STATE_ERROR. 5837 The MAX_STREAM_DATA frame is shown in Figure 31. 
5839 MAX_STREAM_DATA Frame { 5840 Type (i) = 0x11, 5841 Stream ID (i), 5842 Maximum Stream Data (i), 5843 } 5845 Figure 31: MAX_STREAM_DATA Frame Format 5847 MAX_STREAM_DATA frames contain the following fields: 5849 Stream ID: The stream ID of the affected stream, encoded as a 5850 variable-length integer. 5852 Maximum Stream Data: A variable-length integer indicating the 5853 maximum amount of data that can be sent on the identified stream, 5854 in units of bytes. 5856 When counting data toward this limit, an endpoint accounts for the 5857 largest received offset of data that is sent or received on the 5858 stream. Loss or reordering can mean that the largest received offset 5859 on a stream can be greater than the total size of data received on 5860 that stream. Receiving STREAM frames might not increase the largest 5861 received offset. 5863 The data sent on a stream MUST NOT exceed the largest maximum stream 5864 data value advertised by the receiver. An endpoint MUST terminate a 5865 connection with a FLOW_CONTROL_ERROR error if it receives more data 5866 than the largest maximum stream data that it has sent for the 5867 affected stream, unless this is a result of a change in the initial 5868 limits; see Section 7.4.1. 5870 19.11. MAX_STREAMS Frames 5872 The MAX_STREAMS frames (type=0x12 and 0x13) inform the peer of the 5873 cumulative number of streams of a given type it is permitted to open. 5874 A MAX_STREAMS frame with a type of 0x12 applies to bidirectional 5875 streams, and a MAX_STREAMS frame with a type of 0x13 applies to 5876 unidirectional streams. 5878 The MAX_STREAMS frames are shown in Figure 32. 5880 MAX_STREAMS Frame { 5881 Type (i) = 0x12..0x13, 5882 Maximum Streams (i), 5883 } 5885 Figure 32: MAX_STREAMS Frame Format 5887 MAX_STREAMS frames contain the following fields: 5889 Maximum Streams: A count of the cumulative number of streams of the 5890 corresponding type that can be opened over the lifetime of the 5891 connection.
This value cannot exceed 2^60, as it is not possible 5892 to encode stream IDs larger than 2^62-1. Receipt of a frame that 5893 permits opening of a stream larger than this limit MUST be treated 5894 as a FRAME_ENCODING_ERROR. 5896 Loss or reordering can cause a MAX_STREAMS frame to be received which 5897 states a lower stream limit than an endpoint has previously received. 5898 MAX_STREAMS frames which do not increase the stream limit MUST be 5899 ignored. 5901 An endpoint MUST NOT open more streams than permitted by the current 5902 stream limit set by its peer. For instance, a server that receives a 5903 unidirectional stream limit of 3 is permitted to open stream 3, 7, 5904 and 11, but not stream 15. An endpoint MUST terminate a connection 5905 with a STREAM_LIMIT_ERROR error if a peer opens more streams than was 5906 permitted. 5908 Note that these frames (and the corresponding transport parameters) 5909 do not describe the number of streams that can be opened 5910 concurrently. The limit includes streams that have been closed as 5911 well as those that are open. 5913 19.12. DATA_BLOCKED Frame 5915 A sender SHOULD send a DATA_BLOCKED frame (type=0x14) when it wishes 5916 to send data, but is unable to due to connection-level flow control; 5917 see Section 4. DATA_BLOCKED frames can be used as input to tuning of 5918 flow control algorithms; see Section 4.2. 5920 The DATA_BLOCKED frame is shown in Figure 33. 5922 DATA_BLOCKED Frame { 5923 Type (i) = 0x14, 5924 Maximum Data (i), 5925 } 5927 Figure 33: DATA_BLOCKED Frame Format 5929 DATA_BLOCKED frames contain the following fields: 5931 Maximum Data: A variable-length integer indicating the connection- 5932 level limit at which blocking occurred. 5934 19.13. STREAM_DATA_BLOCKED Frame 5936 A sender SHOULD send a STREAM_DATA_BLOCKED frame (type=0x15) when it 5937 wishes to send data, but is unable to due to stream-level flow 5938 control. This frame is analogous to DATA_BLOCKED (Section 19.12). 
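The relationship between a cumulative stream limit (Section 19.11) and the stream IDs it permits can be sketched as follows. This is a minimal sketch that relies on the stream ID layout from Section 2.1, in which the two least significant bits carry the stream type and the remaining bits count streams of that type from zero; the function names may_open and largest_permitted_stream_id are hypothetical.

```python
MAX_STREAMS_LIMIT = 2 ** 60  # larger Maximum Streams values are invalid


def may_open(stream_id: int, limit: int) -> bool:
    """True if `stream_id` falls within a cumulative limit for its type."""
    # The n-th stream of a type (counting from 0) has ID 4 * n + type bits,
    # so a limit of L permits n = 0 .. L - 1.
    return (stream_id >> 2) < limit


def largest_permitted_stream_id(limit: int, server_initiated: bool,
                                unidirectional: bool):
    """Largest stream ID of the given type permitted by `limit`, or None."""
    if limit > MAX_STREAMS_LIMIT:
        raise ValueError("Maximum Streams cannot exceed 2^60")
    if limit == 0:
        return None
    type_bits = (0x2 if unidirectional else 0x0) | (0x1 if server_initiated else 0x0)
    return 4 * (limit - 1) + type_bits


# A server with a unidirectional stream limit of 3 can open 3, 7, and 11.
allowed = [sid for sid in (3, 7, 11, 15) if may_open(sid, 3)]
```

Because the limit is cumulative, closing stream 3 does not make stream 15 available; only a MAX_STREAMS frame carrying a larger value does.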
5940 An endpoint that receives a STREAM_DATA_BLOCKED frame for a send-only 5941 stream MUST terminate the connection with error STREAM_STATE_ERROR. 5943 The STREAM_DATA_BLOCKED frame is shown in Figure 34. 5945 STREAM_DATA_BLOCKED Frame { 5946 Type (i) = 0x15, 5947 Stream ID (i), 5948 Maximum Stream Data (i), 5949 } 5951 Figure 34: STREAM_DATA_BLOCKED Frame Format 5953 STREAM_DATA_BLOCKED frames contain the following fields: 5955 Stream ID: A variable-length integer indicating the stream which is 5956 flow control blocked. 5958 Maximum Stream Data: A variable-length integer indicating the offset 5959 of the stream at which the blocking occurred. 5961 19.14. STREAMS_BLOCKED Frames 5963 A sender SHOULD send a STREAMS_BLOCKED frame (type=0x16 or 0x17) when 5964 it wishes to open a stream, but is unable to due to the maximum 5965 stream limit set by its peer; see Section 19.11. A STREAMS_BLOCKED 5966 frame of type 0x16 is used to indicate reaching the bidirectional 5967 stream limit, and a STREAMS_BLOCKED frame of type 0x17 indicates 5968 reaching the unidirectional stream limit. 5970 A STREAMS_BLOCKED frame does not open the stream, but informs the 5971 peer that a new stream was needed and the stream limit prevented the 5972 creation of the stream. 5974 The STREAMS_BLOCKED frames are shown in Figure 35. 5976 STREAMS_BLOCKED Frame { 5977 Type (i) = 0x16..0x17, 5978 Maximum Streams (i), 5979 } 5981 Figure 35: STREAMS_BLOCKED Frame Format 5983 STREAMS_BLOCKED frames contain the following fields: 5985 Maximum Streams: A variable-length integer indicating the maximum 5986 number of streams allowed at the time the frame was sent. This 5987 value cannot exceed 2^60, as it is not possible to encode stream 5988 IDs larger than 2^62-1. Receipt of a frame that encodes a larger 5989 stream ID MUST be treated as a STREAM_LIMIT_ERROR or a 5990 FRAME_ENCODING_ERROR. 5992 19.15. 
NEW_CONNECTION_ID Frame 5994 An endpoint sends a NEW_CONNECTION_ID frame (type=0x18) to provide 5995 its peer with alternative connection IDs that can be used to break 5996 linkability when migrating connections; see Section 9.5. 5998 The NEW_CONNECTION_ID frame is shown in Figure 36. 6000 NEW_CONNECTION_ID Frame { 6001 Type (i) = 0x18, 6002 Sequence Number (i), 6003 Retire Prior To (i), 6004 Length (8), 6005 Connection ID (8..160), 6006 Stateless Reset Token (128), 6007 } 6009 Figure 36: NEW_CONNECTION_ID Frame Format 6011 NEW_CONNECTION_ID frames contain the following fields: 6013 Sequence Number: The sequence number assigned to the connection ID 6014 by the sender. See Section 5.1.1. 6016 Retire Prior To: A variable-length integer indicating which 6017 connection IDs should be retired; see Section 5.1.2. 6019 Length: An 8-bit unsigned integer containing the length of the 6020 connection ID. Values less than 1 and greater than 20 are invalid 6021 and MUST be treated as a connection error of type 6022 FRAME_ENCODING_ERROR. 6024 Connection ID: A connection ID of the specified length. 6026 Stateless Reset Token: A 128-bit value that will be used for a 6027 stateless reset when the associated connection ID is used; see 6028 Section 10.4. 6030 An endpoint MUST NOT send this frame if it currently requires that 6031 its peer send packets with a zero-length Destination Connection ID. 6032 Changing the length of a connection ID to or from zero-length makes 6033 it difficult to identify when the value of the connection ID changed. 6034 An endpoint that is sending packets with a zero-length Destination 6035 Connection ID MUST treat receipt of a NEW_CONNECTION_ID frame as a 6036 connection error of type PROTOCOL_VIOLATION. 6038 Transmission errors, timeouts and retransmissions might cause the 6039 same NEW_CONNECTION_ID frame to be received multiple times. Receipt 6040 of the same frame multiple times MUST NOT be treated as a connection 6041 error. 
A receiver can use the sequence number supplied in the 6042 NEW_CONNECTION_ID frame to distinguish new connection IDs from old ones. 6044 If an endpoint receives a NEW_CONNECTION_ID frame that repeats a 6045 previously issued connection ID with a different Stateless Reset 6046 Token or a different sequence number, or if a sequence number is used 6047 for different connection IDs, the endpoint MAY treat that receipt as 6048 a connection error of type PROTOCOL_VIOLATION. 6050 The Retire Prior To field counts connection IDs established during 6051 connection setup and the preferred_address transport parameter; see 6052 Section 5.1.2. The Retire Prior To field MUST be less than or equal 6053 to the Sequence Number field. Receiving a value greater than the 6054 Sequence Number MUST be treated as a connection error of type 6055 FRAME_ENCODING_ERROR. 6057 Once a sender indicates a Retire Prior To value, smaller values sent 6058 in subsequent NEW_CONNECTION_ID frames have no effect. A receiver 6059 MUST ignore any Retire Prior To fields that do not increase the 6060 largest received Retire Prior To value. 6062 An endpoint that receives a NEW_CONNECTION_ID frame with a sequence 6063 number smaller than the Retire Prior To field of a previously 6064 received NEW_CONNECTION_ID frame MUST send a corresponding 6065 RETIRE_CONNECTION_ID frame that retires the newly received connection 6066 ID, unless it has already done so for that sequence number. 6068 19.16. RETIRE_CONNECTION_ID Frame 6070 An endpoint sends a RETIRE_CONNECTION_ID frame (type=0x19) to 6071 indicate that it will no longer use a connection ID that was issued 6072 by its peer. This may include the connection ID provided during the 6073 handshake. Sending a RETIRE_CONNECTION_ID frame also serves as a 6074 request to the peer to send additional connection IDs for future use; 6075 see Section 5.1. New connection IDs can be delivered to a peer using 6076 the NEW_CONNECTION_ID frame (Section 19.15).
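The receiver-side rules for NEW_CONNECTION_ID frames in Section 19.15 (length and Retire Prior To validation, ignoring Retire Prior To values that do not increase, and retiring late-arriving connection IDs) can be sketched as follows. This is a minimal illustration and not part of the specification: the names QuicError, PeerConnectionIds, on_new_connection_id, and to_retire are hypothetical, and the sketch chooses to raise PROTOCOL_VIOLATION for inconsistent reuse, which the text only permits (MAY).

```python
from dataclasses import dataclass, field


class QuicError(Exception):
    """Hypothetical exception carrying a transport error code name."""

    def __init__(self, code: str):
        super().__init__(code)
        self.code = code


@dataclass
class PeerConnectionIds:
    """Hypothetical record of the connection IDs issued by the peer."""

    active: dict = field(default_factory=dict)     # sequence number -> (cid, token)
    retire_prior_to: int = 0                       # largest Retire Prior To received
    retired: set = field(default_factory=set)      # sequence numbers already retired
    to_retire: list = field(default_factory=list)  # RETIRE_CONNECTION_ID frames to send

    def on_new_connection_id(self, seq, retire_prior_to, cid, token):
        # Length values less than 1 and greater than 20 are invalid.
        if not 1 <= len(cid) <= 20:
            raise QuicError("FRAME_ENCODING_ERROR")
        # Retire Prior To must not exceed the Sequence Number.
        if retire_prior_to > seq:
            raise QuicError("FRAME_ENCODING_ERROR")
        # Inconsistent reuse of a sequence number, connection ID, or token.
        # The draft says an endpoint MAY treat this as an error; this sketch does.
        for s, (c, t) in self.active.items():
            if (s == seq) != (c == cid) or (c == cid and t != token):
                raise QuicError("PROTOCOL_VIOLATION")
        # Only a value that increases the largest Retire Prior To has any effect.
        if retire_prior_to > self.retire_prior_to:
            self.retire_prior_to = retire_prior_to
            for old in sorted(s for s in self.active if s < retire_prior_to):
                del self.active[old]
                self._retire(old)
        if seq < self.retire_prior_to:
            # Arrived after it was already superseded: retire it immediately.
            self._retire(seq)
        elif seq not in self.retired:
            # An exact duplicate simply overwrites the identical entry.
            self.active[seq] = (cid, token)

    def _retire(self, seq):
        # Queue at most one RETIRE_CONNECTION_ID per sequence number.
        if seq not in self.retired:
            self.retired.add(seq)
            self.to_retire.append(seq)
```

In this sketch the connection ID from the handshake would be registered under sequence number 0 before any frames are processed, and a real implementation would drain to_retire by sending one RETIRE_CONNECTION_ID frame per entry.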
6078 Retiring a connection ID invalidates the stateless reset token 6079 associated with that connection ID. 6081 The RETIRE_CONNECTION_ID frame is shown in Figure 37. 6083 RETIRE_CONNECTION_ID Frame { 6084 Type (i) = 0x19, 6085 Sequence Number (i), 6086 } 6088 Figure 37: RETIRE_CONNECTION_ID Frame Format 6090 RETIRE_CONNECTION_ID frames contain the following fields: 6092 Sequence Number: The sequence number of the connection ID being 6093 retired; see Section 5.1.2. 6095 Receipt of a RETIRE_CONNECTION_ID frame containing a sequence number 6096 greater than any previously sent to the peer MUST be treated as a 6097 connection error of type PROTOCOL_VIOLATION. 6099 The sequence number specified in a RETIRE_CONNECTION_ID frame MUST 6100 NOT refer to the Destination Connection ID field of the packet in 6101 which the frame is contained. The peer MAY treat this as a 6102 connection error of type FRAME_ENCODING_ERROR. 6104 An endpoint cannot send this frame if it was provided with a zero- 6105 length connection ID by its peer. An endpoint that provides a zero- 6106 length connection ID MUST treat receipt of a RETIRE_CONNECTION_ID 6107 frame as a connection error of type PROTOCOL_VIOLATION. 6109 19.17. PATH_CHALLENGE Frame 6111 Endpoints can use PATH_CHALLENGE frames (type=0x1a) to check 6112 reachability to the peer and for path validation during connection 6113 migration. 6115 The PATH_CHALLENGE frame is shown in Figure 38. 6117 PATH_CHALLENGE Frame { 6118 Type (i) = 0x1a, 6119 Data (64), 6120 } 6122 Figure 38: PATH_CHALLENGE Frame Format 6124 PATH_CHALLENGE frames contain the following fields: 6126 Data: This 8-byte field contains arbitrary data. 6128 A PATH_CHALLENGE frame containing 8 bytes that are hard to guess is 6129 sufficient to ensure that it is easier to receive the packet than it 6130 is to guess the value correctly. 6132 The recipient of this frame MUST generate a PATH_RESPONSE frame 6133 (Section 19.18) containing the same Data. 6135 19.18. 
PATH_RESPONSE Frame 6137 The PATH_RESPONSE frame (type=0x1b) is sent in response to a 6138 PATH_CHALLENGE frame. Its format, shown in Figure 39, is identical to 6139 that of the PATH_CHALLENGE frame (Section 19.17). 6141 PATH_RESPONSE Frame { 6142 Type (i) = 0x1b, 6143 Data (64), 6144 } 6146 Figure 39: PATH_RESPONSE Frame Format 6148 If the content of a PATH_RESPONSE frame does not match the content of 6149 a PATH_CHALLENGE frame previously sent by the endpoint, the endpoint 6150 MAY generate a connection error of type PROTOCOL_VIOLATION. 6152 19.19. CONNECTION_CLOSE Frames 6154 An endpoint sends a CONNECTION_CLOSE frame (type=0x1c or 0x1d) to 6155 notify its peer that the connection is being closed. The 6156 CONNECTION_CLOSE frame with a type of 0x1c is used to signal errors 6157 at only the QUIC layer, or the absence of errors (with the NO_ERROR 6158 code). The CONNECTION_CLOSE frame with a type of 0x1d is used to 6159 signal an error with the application that uses QUIC. 6161 If there are open streams that have not been explicitly closed, they 6162 are implicitly closed when the connection is closed. 6164 The CONNECTION_CLOSE frames are shown in Figure 40. 6166 CONNECTION_CLOSE Frame { 6167 Type (i) = 0x1c..0x1d, 6168 Error Code (i), 6169 [Frame Type (i)], 6170 Reason Phrase Length (i), 6171 Reason Phrase (..), 6172 } 6173 Figure 40: CONNECTION_CLOSE Frame Format 6175 CONNECTION_CLOSE frames contain the following fields: 6177 Error Code: A variable-length integer error code that indicates the 6178 reason for closing this connection. A CONNECTION_CLOSE frame of 6179 type 0x1c uses codes from the space defined in Section 20. A 6180 CONNECTION_CLOSE frame of type 0x1d uses codes from the 6181 application protocol error code space; see Section 20.1. 6183 Frame Type: A variable-length integer encoding the type of frame 6184 that triggered the error. A value of 0 (equivalent to the mention 6185 of the PADDING frame) is used when the frame type is unknown.
The 6186 application-specific variant of CONNECTION_CLOSE (type 0x1d) does 6187 not include this field. 6189 Reason Phrase Length: A variable-length integer specifying the 6190 length of the reason phrase in bytes. Because a CONNECTION_CLOSE 6191 frame cannot be split between packets, any limits on packet size 6192 will also limit the space available for a reason phrase. 6194 Reason Phrase: A human-readable explanation for why the connection 6195 was closed. This can be zero length if the sender chooses to not 6196 give details beyond the Error Code. This SHOULD be a UTF-8 6197 encoded string [RFC3629]. 6199 The application-specific variant of CONNECTION_CLOSE (type 0x1d) can 6200 only be sent using 0-RTT or 1-RTT packets ([QUIC-TLS], Section 4). 6201 When an application wishes to abandon a connection during the 6202 handshake, an endpoint can send a CONNECTION_CLOSE frame (type 0x1c) 6203 with an error code of APPLICATION_ERROR in an Initial or a Handshake 6204 packet. 6206 19.20. HANDSHAKE_DONE frame 6208 The server uses the HANDSHAKE_DONE frame (type=0x1e) to signal 6209 confirmation of the handshake to the client. The HANDSHAKE_DONE 6210 frame contains no additional fields. 6212 This frame can only be sent by the server. Servers MUST NOT send a 6213 HANDSHAKE_DONE frame before completing the handshake. A server MUST 6214 treat receipt of a HANDSHAKE_DONE frame as a connection error of type 6215 PROTOCOL_VIOLATION. 6217 19.21. Extension Frames 6219 QUIC frames do not use a self-describing encoding. An endpoint 6220 therefore needs to understand the syntax of all frames before it can 6221 successfully process a packet. This allows for efficient encoding of 6222 frames, but it means that an endpoint cannot send a frame of a type 6223 that is unknown to its peer. 6225 An extension to QUIC that wishes to use a new type of frame MUST 6226 first ensure that a peer is able to understand the frame. 
An 6227 endpoint can use a transport parameter to signal its willingness to 6228 receive extension frame types. One transport parameter can indicate 6229 support for one or more extension frame types. 6231 Extensions that modify or replace core protocol functionality 6232 (including frame types) will be difficult to combine with other 6233 extensions that modify or replace the same functionality unless the 6234 behavior of the combination is explicitly defined. Such extensions 6235 SHOULD define their interaction with previously defined extensions 6236 modifying the same protocol components. 6238 Extension frames MUST be congestion controlled and MUST cause an ACK 6239 frame to be sent. The exception is extension frames that replace or 6240 supplement the ACK frame. Extension frames are not included in flow 6241 control unless specified in the extension. 6243 An IANA registry is used to manage the assignment of frame types; see 6244 Section 22.3. 6246 20. Transport Error Codes 6248 QUIC error codes are 62-bit unsigned integers. 6250 This section lists the defined QUIC transport error codes that may be 6251 used in a CONNECTION_CLOSE frame. These errors apply to the entire 6252 connection. 6254 NO_ERROR (0x0): An endpoint uses this with CONNECTION_CLOSE to 6255 signal that the connection is being closed abruptly in the absence 6256 of any error. 6258 INTERNAL_ERROR (0x1): The endpoint encountered an internal error and 6259 cannot continue with the connection. 6261 SERVER_BUSY (0x2): The server is currently busy and does not accept 6262 any new connections. 6264 FLOW_CONTROL_ERROR (0x3): An endpoint received more data than it 6265 permitted in its advertised data limits; see Section 4. 6267 STREAM_LIMIT_ERROR (0x4): An endpoint received a frame for a stream 6268 identifier that exceeded its advertised stream limit for the 6269 corresponding stream type. 6271 STREAM_STATE_ERROR (0x5): An endpoint received a frame for a stream 6272 that was not in a state that permitted that frame; see Section 3.
6274 FINAL_SIZE_ERROR (0x6): An endpoint received (1) a STREAM frame 6275 containing data that exceeded the previously established final 6276 size, (2) a STREAM frame or a RESET_STREAM frame containing a 6277 final size that was lower than the size of stream data that was 6278 already received, or (3) a STREAM frame or a RESET_STREAM frame 6279 containing a final size that differed from the one already 6280 established. 6282 FRAME_ENCODING_ERROR (0x7): An endpoint received a frame that was 6283 badly formatted, for instance, a frame of an unknown type or an 6284 ACK frame that has more acknowledgment ranges than the remainder 6285 of the packet could carry. 6287 TRANSPORT_PARAMETER_ERROR (0x8): An endpoint received transport 6288 parameters that were badly formatted, included an invalid value, 6289 omitted a mandatory transport parameter, included a forbidden 6290 transport parameter, or were otherwise in error. 6292 CONNECTION_ID_LIMIT_ERROR (0x9): The number of connection IDs 6293 provided by the peer exceeds the advertised 6294 active_connection_id_limit. 6296 PROTOCOL_VIOLATION (0xA): An endpoint detected an error with 6297 protocol compliance that was not covered by more specific error 6298 codes. 6300 INVALID_TOKEN (0xB): A server received a Retry Token in a client 6301 Initial that is invalid. 6303 APPLICATION_ERROR (0xC): The application or application protocol 6304 caused the connection to be closed. 6306 CRYPTO_BUFFER_EXCEEDED (0xD): An endpoint has received more data in 6307 CRYPTO frames than it can buffer. 6309 CRYPTO_ERROR (0x1XX): The cryptographic handshake failed. A range 6310 of 256 values is reserved for carrying error codes specific to the 6311 cryptographic handshake that is used. Codes for errors occurring 6312 when TLS is used for the crypto handshake are described in 6313 Section 4.8 of [QUIC-TLS]. 6315 See Section 22.4 for details of registering new error codes.
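The error codes listed in this section, including the 256-value CRYPTO_ERROR range at 0x1XX, could be represented as in the following sketch. The code values and names are taken from the list in this section; the identifiers TransportError, CRYPTO_ERROR_FIRST, CRYPTO_ERROR_LAST, and describe_error are hypothetical and exist only for illustration.

```python
from enum import IntEnum


class TransportError(IntEnum):
    """Transport error codes as listed in Section 20 of this draft."""

    NO_ERROR = 0x0
    INTERNAL_ERROR = 0x1
    SERVER_BUSY = 0x2
    FLOW_CONTROL_ERROR = 0x3
    STREAM_LIMIT_ERROR = 0x4
    STREAM_STATE_ERROR = 0x5
    FINAL_SIZE_ERROR = 0x6
    FRAME_ENCODING_ERROR = 0x7
    TRANSPORT_PARAMETER_ERROR = 0x8
    CONNECTION_ID_LIMIT_ERROR = 0x9
    PROTOCOL_VIOLATION = 0xA
    INVALID_TOKEN = 0xB
    APPLICATION_ERROR = 0xC
    CRYPTO_BUFFER_EXCEEDED = 0xD


# CRYPTO_ERROR (0x1XX) reserves a range of 256 values.
CRYPTO_ERROR_FIRST = 0x100
CRYPTO_ERROR_LAST = 0x1FF


def describe_error(code: int) -> str:
    """Return a printable name for a transport error code (illustrative helper)."""
    # QUIC error codes are 62-bit unsigned integers.
    if not 0 <= code < 2**62:
        raise ValueError("error code does not fit in 62 bits")
    if CRYPTO_ERROR_FIRST <= code <= CRYPTO_ERROR_LAST:
        # Report the handshake-specific value carried inside the reserved range.
        return f"CRYPTO_ERROR(0x{code - CRYPTO_ERROR_FIRST:02x})"
    try:
        return TransportError(code).name
    except ValueError:
        return f"UNKNOWN(0x{code:x})"
```

For example, describe_error(0xA) returns "PROTOCOL_VIOLATION", and a code of 0x128 is reported as a CRYPTO_ERROR carrying the value 0x28.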
6317 In defining these error codes, several principles are applied. Error 6318 conditions that might require specific action on the part of a 6319 recipient are given unique codes. Errors that represent common 6320 conditions are given specific codes. Absent either of these 6321 conditions, error codes are used to identify a general function of 6322 the stack, like flow control or transport parameter handling. 6323 Finally, generic errors are provided for conditions where 6324 implementations are unable or unwilling to use more specific codes. 6326 20.1. Application Protocol Error Codes 6328 Application protocol error codes are 62-bit unsigned integers, but 6329 the management of application error codes is left to application 6330 protocols. Application protocol error codes are used for the 6331 RESET_STREAM frame (Section 19.4), the STOP_SENDING frame 6332 (Section 19.5), and the CONNECTION_CLOSE frame with a type of 0x1d 6333 (Section 19.19). 6335 21. Security Considerations 6337 21.1. Handshake Denial of Service 6339 As an encrypted and authenticated transport, QUIC provides a range of 6340 protections against denial of service. Once the cryptographic 6341 handshake is complete, QUIC endpoints discard most packets that are 6342 not authenticated, greatly limiting the ability of an attacker to 6343 interfere with existing connections. 6345 Once a connection is established, QUIC endpoints might accept some 6346 unauthenticated ICMP packets (see Section 14.2), but the use of these 6347 packets is extremely limited. The only other type of packet that an 6348 endpoint might accept is a stateless reset (Section 10.4), which 6349 relies on the token being kept secret until it is used. 6351 During the creation of a connection, QUIC only provides protection 6352 against attacks from off the network path. All QUIC packets contain 6353 proof that the recipient saw a preceding packet from its peer.
6355 Addresses cannot change during the handshake, so endpoints can 6356 discard packets that are received on a different network path. 6358 The Source and Destination Connection ID fields are the primary means 6359 of protection against off-path attack during the handshake. These 6360 are required to match those set by a peer. Except for Initial packets 6361 and stateless resets, an endpoint only accepts packets that 6362 include a Destination Connection ID field that matches a value the 6363 endpoint previously chose. This is the only protection offered for 6364 Version Negotiation packets. 6366 The Destination Connection ID field in an Initial packet is selected 6367 by a client to be unpredictable, which serves an additional purpose. 6368 The packets that carry the cryptographic handshake are protected with 6369 a key that is derived from this connection ID and a salt specific to 6370 the QUIC version. This allows endpoints to use the same process for 6371 authenticating packets that they receive as they use after the 6372 cryptographic handshake completes. Packets that cannot be 6373 authenticated are discarded. Protecting packets in this fashion 6374 provides a strong assurance that the sender of the packet saw the 6375 Initial packet and understood it. 6377 These protections are not intended to be effective against an 6378 attacker that is able to receive QUIC packets prior to the connection 6379 being established. Such an attacker can potentially send packets 6380 that will be accepted by QUIC endpoints. This version of QUIC 6381 attempts to detect this sort of attack, but it expects that endpoints 6382 will fail to establish a connection rather than recovering. For the 6383 most part, the cryptographic handshake protocol [QUIC-TLS] is 6384 responsible for detecting tampering during the handshake. 6386 Endpoints are permitted to use other methods to detect and attempt to 6387 recover from interference with the handshake.
Invalid packets may be 6388 identified and discarded using other methods, but no specific method 6389 is mandated in this document. 6391 21.2. Amplification Attack 6393 An attacker might be able to receive an address validation token 6394 (Section 8) from a server and then release the IP address it used to 6395 acquire that token. At a later time, the attacker may initiate a 6396 0-RTT connection with a server by spoofing this same address, which 6397 might now address a different (victim) endpoint. The attacker can 6398 thus potentially cause the server to send an initial congestion 6399 window's worth of data towards the victim. 6401 Servers SHOULD provide mitigations for this attack by limiting the 6402 usage and lifetime of address validation tokens; see Section 8.1.3. 6404 21.3. Optimistic ACK Attack 6406 An endpoint that acknowledges packets it has not received might cause 6407 a congestion controller to permit sending at rates beyond what the 6408 network supports. An endpoint MAY skip packet numbers when sending 6409 packets to detect this behavior. An endpoint can then immediately 6410 close the connection with a connection error of type 6411 PROTOCOL_VIOLATION; see Section 10.3. 6413 21.4. Slowloris Attacks 6415 The attacks commonly known as Slowloris [SLOWLORIS] try to keep many 6416 connections to the target endpoint open and hold them open as long as 6417 possible. These attacks can be executed against a QUIC endpoint by 6418 generating the minimum amount of activity necessary to avoid being 6419 closed for inactivity. This might involve sending small amounts of 6420 data, gradually opening flow control windows in order to control the 6421 sender rate, or manufacturing ACK frames that simulate a high loss 6422 rate. 
6424 QUIC deployments SHOULD provide mitigations for the Slowloris 6425 attacks, such as increasing the maximum number of clients the server 6426 will allow, limiting the number of connections a single IP address is 6427 allowed to make, imposing restrictions on the minimum transfer speed 6428 a connection is allowed to have, and restricting the length of time 6429 an endpoint is allowed to stay connected. 6431 21.5. Stream Fragmentation and Reassembly Attacks 6433 An adversarial sender might intentionally send fragments of stream 6434 data in order to cause disproportionate receive buffer memory 6435 commitment and/or creation of a large and inefficient data structure. 6437 An adversarial receiver might intentionally not acknowledge packets 6438 containing stream data in order to force the sender to store the 6439 unacknowledged stream data for retransmission. 6441 The attack on receivers is mitigated if flow control windows 6442 correspond to available memory. However, some receivers will over- 6443 commit memory and advertise flow control offsets in the aggregate 6444 that exceed actual available memory. The over-commitment strategy 6445 can lead to better performance when endpoints are well behaved, but 6446 renders endpoints vulnerable to the stream fragmentation attack. 6448 QUIC deployments SHOULD provide mitigations against stream 6449 fragmentation attacks. Mitigations could consist of avoiding over- 6450 committing memory, limiting the size of tracking data structures, 6451 delaying reassembly of STREAM frames, implementing heuristics based 6452 on the age and duration of reassembly holes, or some combination. 6454 21.6. Stream Commitment Attack 6456 An adversarial endpoint can open lots of streams, exhausting state on 6457 an endpoint. The adversarial endpoint could repeat the process on a 6458 large number of connections, in a manner similar to SYN flooding 6459 attacks in TCP. 
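As a rough illustration of how little an attacker needs to send to commit state, the following sketch assumes the stream ID layout of Section 2.1, in which the two least significant bits of a stream ID identify the stream type, and the rule that opening a stream opens all lower-numbered streams of the same type. The function names are hypothetical, and max_streams stands in for whatever stream limit the endpoint has advertised.

```python
def stream_type(stream_id: int) -> int:
    """Return the two least significant bits, which identify the stream type."""
    return stream_id & 0x3


def streams_committed(stream_id: int) -> int:
    """Count the streams of this type that exist once stream_id is opened.

    Streams of one type are numbered 0, 1, 2, ... in the upper bits, so
    opening stream_id on a new connection implies all lower ones as well.
    """
    return (stream_id >> 2) + 1


def exceeds_limit(stream_id: int, max_streams: int) -> bool:
    """True if opening stream_id would exceed the advertised stream limit."""
    return streams_committed(stream_id) > max_streams
```

On a new connection, streams_committed(4000000) evaluates to 1000001, and with a limit of 100 streams the highest client-initiated bidirectional stream ID that passes exceeds_limit is 396.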
6461 Normally, clients will open streams sequentially, as explained in 6462 Section 2.1. However, when several streams are initiated at short 6463 intervals, loss or reordering may cause STREAM frames that open 6464 streams to be received out of sequence. On receiving a higher- 6465 numbered stream ID, a receiver is required to open all intervening 6466 streams of the same type; see Section 3.2. Thus, on a new 6467 connection, opening stream 4000000 opens 1 million and 1 client- 6468 initiated bidirectional streams. 6470 The number of active streams is limited by the 6471 initial_max_streams_bidi and initial_max_streams_uni transport 6472 parameters, as explained in Section 4.5. If chosen judiciously, 6473 these limits mitigate the effect of the stream commitment attack. 6474 However, setting the limit too low could affect performance when 6475 applications expect to open a large number of streams. 6477 21.7. Peer Denial of Service 6479 QUIC and TLS both contain messages that have legitimate uses in some 6480 contexts, but that can be abused to cause a peer to expend processing 6481 resources without having any observable impact on the state of the 6482 connection. 6484 Messages can also be used to change and revert state in small or 6485 inconsequential ways, such as by sending small increments to flow 6486 control limits. 6488 If processing costs are disproportionately large in comparison to 6489 bandwidth consumption or effect on state, then this could allow a 6490 malicious peer to exhaust processing capacity. 6492 While there are legitimate uses for all messages, implementations 6493 SHOULD track cost of processing relative to progress and treat 6494 excessive quantities of any non-productive packets as indicative of 6495 an attack. Endpoints MAY respond to this condition with a connection 6496 error, or by dropping packets. 6498 21.8.
Explicit Congestion Notification Attacks 6500 An on-path attacker could manipulate the value of ECN codepoints in 6501 the IP header to influence the sender's rate. [RFC3168] discusses 6502 manipulations and their effects in more detail. 6504 An off-path attacker can duplicate and send packets with modified 6505 ECN codepoints to affect the sender's rate. If duplicate packets are 6506 discarded by a receiver, an off-path attacker will need to race the 6507 duplicate packet against the original to be successful in this 6508 attack. Therefore, QUIC endpoints ignore the ECN codepoint field on 6509 an IP packet unless at least one QUIC packet in that IP packet is 6510 successfully processed; see Section 13.4. 6512 21.9. Stateless Reset Oracle 6514 Stateless resets create a possible denial of service attack analogous 6515 to a TCP reset injection. This attack is possible if an attacker is 6516 able to cause a stateless reset token to be generated for a 6517 connection with a selected connection ID. An attacker that can cause 6518 this token to be generated can reset an active connection with the 6519 same connection ID. 6521 If a packet can be routed to different instances that share a static 6522 key, for example by changing an IP address or port, then an attacker 6523 can cause the server to send a stateless reset. To defend against 6524 this style of denial of service, endpoints that share a static key for 6525 stateless reset (see Section 10.4.2) MUST be arranged so that packets 6526 with a given connection ID always arrive at an instance that has 6527 connection state, unless that connection is no longer active. 6529 In the case of a cluster that uses dynamic load balancing, it is 6530 possible that a change in load balancer configuration could happen 6531 while an active instance retains connection state; even if an 6532 instance retains connection state, the change in routing and 6533 resulting stateless reset will result in the connection being 6534 terminated.
If there is no chance of the packet being routed to the 6535 correct instance, it is better to send a stateless reset than wait 6536 for connections to time out. However, this is acceptable only if the 6537 routing cannot be influenced by an attacker. 6539 21.10. Version Downgrade 6541 This document defines QUIC Version Negotiation packets in Section 6, 6542 which can be used to negotiate the QUIC version used between two 6543 endpoints. However, this document does not specify how this 6544 negotiation will be performed between this version and subsequent 6545 future versions. In particular, Version Negotiation packets do not 6546 contain any mechanism to prevent version downgrade attacks. Future 6547 versions of QUIC that use Version Negotiation packets MUST define a 6548 mechanism that is robust against version downgrade attacks. 6550 21.11. Targeted Attacks by Routing 6552 Deployments should limit the ability of an attacker to target a new 6553 connection to a particular server instance. This means that client- 6554 controlled fields, such as the initial Destination Connection ID used 6555 on Initial and 0-RTT packets, SHOULD NOT be used by themselves to make 6556 routing decisions. Ideally, routing decisions are made independently 6557 of client-selected values; a Source Connection ID can be selected to 6558 route later packets to the same server. 6560 21.12. Overview of Security Properties 6562 A complete security analysis of QUIC is outside the scope of this 6563 document. This section provides an informal description of the 6564 desired security properties as an aid to implementors and to help 6565 guide protocol analysis. 6567 QUIC assumes the threat model described in [SEC-CONS] and provides 6568 protections against many of the attacks that arise from that model. 6570 For this purpose, attacks are divided into passive and active 6571 attacks.
Passive attackers have the capability to read packets from 6572 the network, while active attackers also have the capability to write 6573 packets into the network. However, a passive attack may involve an 6574 attacker with the ability to cause a routing change or other 6575 modification in the path taken by packets that comprise a connection. 6577 Attackers are additionally categorized as either on-path attackers or 6578 off-path attackers; see Section 3.5 of [SEC-CONS]. An on-path 6579 attacker can read, modify, or remove any packet it observes such that 6580 it no longer reaches its destination, while an off-path attacker 6581 observes the packets, but cannot prevent the original packet from 6582 reaching its intended destination. An off-path attacker can also 6583 transmit arbitrary packets. 6585 Properties of the handshake, protected packets, and connection 6586 migration are considered separately. 6588 21.12.1. Handshake 6590 The QUIC handshake incorporates the TLS 1.3 handshake and inherits 6591 the cryptographic properties described in Appendix E.1 of [TLS13]. 6592 Many of the security properties of QUIC depend on the TLS handshake 6593 providing these properties. Any attack on the TLS handshake could 6594 affect QUIC. 6596 Any attack on the TLS handshake that compromises the secrecy or 6597 uniqueness of session keys affects other security guarantees provided 6598 by QUIC that depend on these keys. For instance, migration 6599 (Section 9) depends on the efficacy of confidentiality protections, 6600 both for the negotiation of keys using the TLS handshake and for QUIC 6601 packet protection, to avoid linkability across network paths. 6603 An attack on the integrity of the TLS handshake might allow an 6604 attacker to affect the selection of application protocol or QUIC 6605 version. 6607 In addition to the properties provided by TLS, the QUIC handshake 6608 provides some defense against DoS attacks on the handshake. 6610 21.12.1.1.
Anti-Amplification 6612 Address validation (Section 8) is used to verify that an entity that 6613 claims a given address is able to receive packets at that address. 6614 Address validation limits amplification attack targets to addresses 6615 for which an attacker is either on-path or off-path. 6617 Prior to validation, endpoints are limited in what they are able to 6618 send. During the handshake, a server cannot send more than three 6619 times the data it receives; clients that initiate new connections or 6620 migrate to a new network path are limited. 6622 21.12.1.2. Server-Side DoS 6624 Computing the server's first flight for a full handshake is 6625 potentially expensive, requiring both a signature and a key exchange 6626 computation. In order to prevent computational DoS attacks, the 6627 Retry packet provides a cheap token exchange mechanism which allows 6628 servers to validate a client's IP address prior to doing any 6629 expensive computations at the cost of a single round trip. After a 6630 successful handshake, servers can issue new tokens to a client which 6631 will allow new connection establishment without incurring this cost. 6633 21.12.1.3. On-Path Handshake Termination 6635 An on-path or off-path attacker can force a handshake to fail by 6636 replacing or racing Initial packets. Once valid Initial packets have 6637 been exchanged, subsequent Handshake packets are protected with the 6638 handshake keys and an on-path attacker cannot force handshake failure 6639 other than by dropping packets to cause endpoints to abandon the 6640 attempt. 6642 An on-path attacker can also replace the addresses of packets on 6643 either side and therefore cause the client or server to have an 6644 incorrect view of the remote addresses. Such an attack is 6645 indistinguishable from the functions performed by a NAT. 6647 21.12.1.4. 
Parameter Negotiation 6649 The entire handshake is cryptographically protected, with the Initial 6650 packets being encrypted with per-version keys and the Handshake and 6651 later packets being encrypted with keys derived from the TLS key 6652 exchange. Further, parameter negotiation is folded into the TLS 6653 transcript and thus provides the same integrity guarantees as 6654 ordinary TLS negotiation. An attacker can observe the client's 6655 transport parameters (as long as it knows the version-specific keys) 6656 but cannot observe the server's transport parameters and cannot 6657 influence parameter negotiation. 6659 Connection IDs are unencrypted but integrity protected in all 6660 packets. 6662 This version of QUIC does not incorporate a version negotiation 6663 mechanism; implementations of incompatible versions will simply fail 6664 to establish a connection. 6666 21.12.2. Protected Packets 6668 Packet protection (Section 12.1) provides authentication and 6669 encryption of all packets except Version Negotiation packets, though 6670 Initial and Retry packets have limited encryption and authentication 6671 based on version-specific keys; see [QUIC-TLS] for more details. 6672 This section considers passive and active attacks against protected 6673 packets. 6675 Both on-path and off-path attackers can mount a passive attack in 6676 which they save observed packets for an offline attack against packet 6677 protection at a future time; this is true for any observer of any 6678 packet on any network. 6680 A blind attacker, one who injects packets without being able to 6681 observe valid packets for a connection, is unlikely to be successful, 6682 since packet protection ensures that valid packets are only generated 6683 by endpoints which possess the key material established during the 6684 handshake; see Section 7 and Section 21.12.1. 
Similarly, any active 6685 attacker that observes packets and attempts to insert new data or 6686 modify existing data in those packets should not be able to generate 6687 packets deemed valid by the receiving endpoint. 6689 A spoofing attack, in which an active attacker rewrites unprotected 6690 parts of a packet that it forwards or injects, such as the source or 6691 destination address, is only effective if the attacker can forward 6692 packets to the original endpoint. Packet protection ensures that the 6693 packet payloads can only be processed by the endpoints that completed 6694 the handshake, and invalid packets are ignored by those endpoints. 6696 An attacker can also modify the boundaries between packets and UDP 6697 datagrams, causing multiple packets to be coalesced into a single 6698 datagram, or splitting coalesced packets into multiple datagrams. 6699 Aside from datagrams containing Initial packets, which require 6700 padding, modification of how packets are arranged in datagrams has no 6701 functional effect on a connection, although it might change some 6702 performance characteristics. 6704 21.12.3. Connection Migration 6706 Connection Migration (Section 9) provides endpoints with the ability 6707 to transition between IP addresses and ports on multiple paths, using 6708 one path at a time for transmission and receipt of non-probing 6709 frames. Path validation (Section 8.2) establishes that a peer is 6710 both willing and able to receive packets sent on a particular path. 6711 This helps reduce the effects of address spoofing by limiting the 6712 number of packets sent to a spoofed address. 6714 This section describes the intended security properties of connection 6715 migration when under various types of DoS attacks. 6717 21.12.3.1. On-Path Active Attacks 6719 An attacker that can cause a packet it observes to no longer reach 6720 its intended destination is considered an on-path attacker. 
   When an attacker is present between a client and server, endpoints
   are required to send packets through the attacker to establish
   connectivity on a given path.

   An on-path attacker can:

   *  Inspect packets

   *  Modify IP and UDP packet headers

   *  Inject new packets

   *  Delay packets

   *  Reorder packets

   *  Drop packets

   *  Split and merge datagrams along packet boundaries

   An on-path attacker cannot:

   *  Modify an authenticated portion of a packet and cause the
      recipient to accept that packet

   An on-path attacker has the opportunity to modify the packets that it
   observes; however, any modification to an authenticated portion of a
   packet will cause it to be dropped by the receiving endpoint as
   invalid, because packet payloads are both authenticated and
   encrypted.

   In the presence of an on-path attacker, QUIC aims to provide the
   following properties:

   1.  An on-path attacker can prevent use of a path for a connection,
       causing it to fail if it cannot use a different path that does
       not contain the attacker. This can be achieved by dropping all
       packets, modifying them so that they fail to decrypt, or other
       methods.

   2.  An on-path attacker can prevent migration to a new path for which
       the attacker is also on-path by causing path validation to fail
       on the new path.

   3.  An on-path attacker cannot prevent a client from migrating to a
       path for which the attacker is not on-path.

   4.  An on-path attacker can reduce the throughput of a connection by
       delaying packets or dropping them.

   5.  An on-path attacker cannot cause an endpoint to accept a packet
       for which it has modified an authenticated portion of that
       packet.

21.12.3.2.  Off-Path Active Attacks

   An off-path attacker is not directly on the path between a client and
   server, but could be able to obtain copies of some or all packets
   sent between the client and the server. It is also able to send
   copies of those packets to either endpoint.

   An off-path attacker can:

   *  Inspect packets

   *  Inject new packets

   *  Reorder injected packets

   An off-path attacker cannot:

   *  Modify packets sent by endpoints

   *  Delay packets

   *  Drop packets

   *  Reorder original packets

   An off-path attacker can create modified copies of packets that it
   has observed and inject those copies into the network, potentially
   with spoofed source and destination addresses. The packets sent by
   the endpoints themselves are not altered.

   For the purposes of this discussion, it is assumed that an off-path
   attacker has the ability to inject a modified copy of a packet into
   the network that will reach the destination endpoint prior to the
   arrival of the original packet observed by the attacker. In other
   words, an attacker has the ability to consistently "win" a race with
   the legitimate packets between the endpoints, potentially causing the
   original packet to be ignored by the recipient.

   It is also assumed that an attacker has the resources necessary to
   affect NAT state, potentially both causing an endpoint to lose its
   NAT binding, and an attacker to obtain the same port for use with its
   traffic.

   In the presence of an off-path attacker, QUIC aims to provide the
   following properties:

   1.  An off-path attacker can race packets and attempt to become a
       "limited" on-path attacker.

   2.  An off-path attacker can cause path validation to succeed for
       forwarded packets with the source address listed as the off-path
       attacker as long as it can provide improved connectivity between
       the client and the server.

   3.  An off-path attacker cannot cause a connection to close once the
       handshake has completed.

   4.  An off-path attacker cannot cause migration to a new path to fail
       if it cannot observe the new path.

   5.  An off-path attacker can become a limited on-path attacker during
       migration to a new path for which it is also an off-path
       attacker.

   6.  An off-path attacker can become a limited on-path attacker by
       affecting shared NAT state such that it sends packets to the
       server from the same IP address and port that the client
       originally used.

21.12.3.3.  Limited On-Path Active Attacks

   A limited on-path attacker is an off-path attacker that has offered
   improved routing of packets by duplicating and forwarding original
   packets between the server and the client, causing those packets to
   arrive before the original copies such that the original packets are
   dropped by the destination endpoint.

   A limited on-path attacker differs from an on-path attacker in that
   it is not on the original path between endpoints, and therefore the
   original packets sent by an endpoint are still reaching their
   destination. This means that a future failure to route copied
   packets to the destination faster than their original path will not
   prevent the original packets from reaching the destination.

   A limited on-path attacker can:

   *  Inspect packets

   *  Inject new packets

   *  Modify unencrypted packet headers

   *  Reorder packets

   A limited on-path attacker cannot:

   *  Delay packets so that they arrive later than packets sent on the
      original path

   *  Drop packets

   *  Modify the authenticated and encrypted portion of a packet and
      cause the recipient to accept that packet

   A limited on-path attacker can only delay packets up to the point
   that the original packets arrive before the duplicate packets,
   meaning that it cannot offer routing with worse latency than the
   original path.
   If a limited on-path attacker drops packets, the original copy will
   still arrive at the destination endpoint.

   In the presence of a limited on-path attacker, QUIC aims to provide
   the following properties:

   1.  A limited on-path attacker cannot cause a connection to close
       once the handshake has completed.

   2.  A limited on-path attacker cannot cause an idle connection to
       close if the client is first to resume activity.

   3.  A limited on-path attacker can cause an idle connection to be
       deemed lost if the server is the first to resume activity.

   Note that these guarantees are the same guarantees provided for any
   NAT, for the same reasons.

22.  IANA Considerations

   This document establishes several registries for the management of
   codepoints in QUIC. These registries operate on a common set of
   policies as defined in Section 22.1.

22.1.  Registration Policies for QUIC Registries

   All QUIC registries allow for both provisional and permanent
   registration of codepoints. This section documents policies that are
   common to these registries.

22.1.1.  Provisional Registrations

   Provisional registrations of codepoints are intended to allow for
   private use and experimentation with extensions to QUIC. Provisional
   registrations only require the inclusion of the codepoint value and
   contact information. However, provisional registrations could be
   reclaimed and reassigned for another purpose.

   Provisional registrations require Expert Review, as defined in
   Section 4.5 of [RFC8126]. Designated expert(s) are advised that only
   registrations for an excessive proportion of remaining codepoint
   space or the very first unassigned value (see Section 22.1.2) can be
   rejected.

   Provisional registrations will include a date field that indicates
   when the registration was last updated.
   A request to update the date on any provisional registration can be
   made without review from the designated expert(s).

   All QUIC registries include the following fields to support
   provisional registration:

   Value:  The assigned codepoint.

   Status:  "Permanent" or "Provisional".

   Specification:  A reference to a publicly available specification for
      the value.

   Date:  The date of last update to the registration.

   Contact:  Contact details for the registrant.

   Notes:  Supplementary notes about the registration.

   Provisional registrations MAY omit the Specification and Notes
   fields, plus any additional fields that might be required for a
   permanent registration. The Date field is not required as part of
   requesting a registration as it is set to the date the registration
   is created or updated.

22.1.2.  Selecting Codepoints

   New uses of codepoints from QUIC registries SHOULD use a randomly
   selected codepoint that excludes both existing allocations and the
   first unallocated codepoint in the selected space. Requests for
   multiple codepoints MAY use a contiguous range. This minimizes the
   risk that differing semantics are attributed to the same codepoint by
   different implementations. Use of the first codepoint in a range is
   intended for use by specifications that are developed through the
   standards process [STD] and its allocation MUST be negotiated with
   IANA before use.

   For codepoints that are encoded in variable-length integers
   (Section 16), such as frame types, codepoints that encode to four or
   eight bytes (that is, values 2^14 and above) SHOULD be used unless
   the usage is especially sensitive to having a longer encoding.

   Applications to register codepoints in QUIC registries MAY include a
   codepoint as part of the registration.
   IANA MUST allocate the selected codepoint unless that codepoint is
   already assigned or the codepoint is the first unallocated codepoint
   in the registry.

22.1.3.  Reclaiming Provisional Codepoints

   A request might be made to remove an unused provisional registration
   from the registry to reclaim space in a registry, or portion of the
   registry (such as the 64-16383 range for codepoints that use
   variable-length encodings). This SHOULD be done only for the
   codepoints with the earliest recorded date, and entries that have
   been updated less than a year prior SHOULD NOT be reclaimed.

   A request to remove a codepoint MUST be reviewed by the designated
   expert(s). The expert(s) MUST attempt to determine whether the
   codepoint is still in use. Experts are advised to contact the listed
   contacts for the registration, plus as wide a set of protocol
   implementers as possible in order to determine whether any use of the
   codepoint is known. The expert(s) are advised to allow at least four
   weeks for responses.

   If any use of the codepoint is identified by this search or a
   request to update the registration is made, the codepoint MUST NOT be
   reclaimed. Instead, the date on the registration is updated. A note
   might be added for the registration recording relevant information
   that was learned.

   If no use of the codepoint was identified and no request was made to
   update the registration, the codepoint MAY be removed from the
   registry.

   This process also applies to requests to change a provisional
   registration into a permanent registration, except that the goal is
   not to determine whether there is no use of the codepoint, but to
   determine that the registration is an accurate representation of any
   deployed usage.

22.1.4.  Permanent Registrations

   Permanent registrations in QUIC registries use the Specification
   Required policy [RFC8126], unless otherwise specified. The
   designated expert(s) verify that a specification exists and is
   readily accessible. Expert(s) are encouraged to be biased towards
   approving registrations unless they are abusive, frivolous, or
   actively harmful (not merely aesthetically displeasing, or
   architecturally dubious). The creation of a registry MAY specify
   additional constraints on permanent registrations.

   The creation of a registry MAY identify a range of codepoints where
   registrations are governed by a different registration policy. For
   instance, the registries for 62-bit codepoints in this document have
   stricter policies for codepoints in the range from 0 to 63.

   Any stricter requirements for permanent registrations do not prevent
   provisional registrations for affected codepoints. For instance, a
   provisional registration for a frame type (Section 22.3) of 61 could
   be requested.

   All registrations made by Standards Track publications MUST be
   permanent.

   All registrations in this document are assigned a permanent status
   and list as contact both the IESG (ietf@ietf.org) and the QUIC
   working group (quic@ietf.org).

22.2.  QUIC Transport Parameter Registry

   IANA [SHALL add/has added] a registry for "QUIC Transport Parameters"
   under a "QUIC" heading.

   The "QUIC Transport Parameters" registry governs a 62-bit space.
   This registry follows the registration policy from Section 22.1.
   Permanent registrations in this registry are assigned using the
   Specification Required policy [RFC8126].

   In addition to the fields in Section 22.1.1, permanent registrations
   in this registry MUST include the following fields:

   Parameter Name:  A short mnemonic for the parameter.
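
   As an illustration of how the selection advice in Section 22.1.2
   might be applied to a 62-bit registry such as this one, the sketch
   below draws a random codepoint that encodes to four or eight bytes
   and skips existing allocations, the first unallocated value, and the
   values of the form 31 * N + 27 that this registry reserves. The
   function names and the treatment of "first unallocated" as the
   smallest value absent from the allocation set are assumptions made
   for the example, not something this document defines.

```python
import random

MAX_CODEPOINT = (1 << 62) - 1   # the registries govern a 62-bit space
MIN_LONG_ENCODING = 1 << 14     # values from 2^14 up encode to 4 or 8 bytes


def varint_length(value: int) -> int:
    """Bytes used by the variable-length integer encoding (Section 16)."""
    if value < 0 or value > MAX_CODEPOINT:
        raise ValueError("value does not fit in 62 bits")
    for limit, length in ((1 << 6, 1), (1 << 14, 2), (1 << 30, 4)):
        if value < limit:
            return length
    return 8


def is_reserved(value: int) -> bool:
    """True for values of the form 31 * N + 27 (27, 58, 89, ...)."""
    return value >= 27 and (value - 27) % 31 == 0


def first_unallocated(allocated: set) -> int:
    """Smallest codepoint not in `allocated` (assumed reading of the text)."""
    value = 0
    while value in allocated:
        value += 1
    return value


def pick_codepoint(allocated: set, rng=None) -> int:
    """Hypothetical helper: choose a random, unassigned, long-encoding value."""
    if rng is None:
        rng = random.SystemRandom()
    skip = first_unallocated(allocated)
    while True:
        candidate = rng.randrange(MIN_LONG_ENCODING, MAX_CODEPOINT + 1)
        if candidate in allocated or candidate == skip:
            continue
        if is_reserved(candidate):
            continue
        return candidate
```

   A registrant could call pick_codepoint() with the set of values
   currently listed in the registry and include the result in the
   registration request; IANA still makes the final allocation.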

   The initial contents of this registry are shown in Table 6.

   +-------+-------------------------------------+---------------+
   | Value | Parameter Name                      | Specification |
   +=======+=====================================+===============+
   | 0x00  | original_destination_connection_id  | Section 18.2  |
   +-------+-------------------------------------+---------------+
   | 0x01  | max_idle_timeout                    | Section 18.2  |
   +-------+-------------------------------------+---------------+
   | 0x02  | stateless_reset_token               | Section 18.2  |
   +-------+-------------------------------------+---------------+
   | 0x03  | max_udp_payload_size                | Section 18.2  |
   +-------+-------------------------------------+---------------+
   | 0x04  | initial_max_data                    | Section 18.2  |
   +-------+-------------------------------------+---------------+
   | 0x05  | initial_max_stream_data_bidi_local  | Section 18.2  |
   +-------+-------------------------------------+---------------+
   | 0x06  | initial_max_stream_data_bidi_remote | Section 18.2  |
   +-------+-------------------------------------+---------------+
   | 0x07  | initial_max_stream_data_uni         | Section 18.2  |
   +-------+-------------------------------------+---------------+
   | 0x08  | initial_max_streams_bidi            | Section 18.2  |
   +-------+-------------------------------------+---------------+
   | 0x09  | initial_max_streams_uni             | Section 18.2  |
   +-------+-------------------------------------+---------------+
   | 0x0a  | ack_delay_exponent                  | Section 18.2  |
   +-------+-------------------------------------+---------------+
   | 0x0b  | max_ack_delay                       | Section 18.2  |
   +-------+-------------------------------------+---------------+
   | 0x0c  | disable_active_migration            | Section 18.2  |
   +-------+-------------------------------------+---------------+
   | 0x0d  | preferred_address                   | Section 18.2  |
   +-------+-------------------------------------+---------------+
   | 0x0e  | active_connection_id_limit          | Section 18.2  |
   +-------+-------------------------------------+---------------+
   | 0x0f  | initial_source_connection_id        | Section 18.2  |
   +-------+-------------------------------------+---------------+
   | 0x10  | retry_source_connection_id          | Section 18.2  |
   +-------+-------------------------------------+---------------+

         Table 6: Initial QUIC Transport Parameters Entries

   Additionally, each value of the format "31 * N + 27" for integer
   values of N (that is, 27, 58, 89, ...) is reserved and MUST NOT be
   assigned by IANA.

22.3.  QUIC Frame Type Registry

   IANA [SHALL add/has added] a registry for "QUIC Frame Types" under a
   "QUIC" heading.

   The "QUIC Frame Types" registry governs a 62-bit space. This
   registry follows the registration policy from Section 22.1.
   Permanent registrations in this registry are assigned using the
   Specification Required policy [RFC8126], except for values between
   0x00 and 0x3f (in hexadecimal; inclusive), which are assigned using
   Standards Action or IESG Approval as defined in Sections 4.9 and
   4.10 of [RFC8126].

   In addition to the fields in Section 22.1.1, permanent registrations
   in this registry MUST include the following fields:

   Frame Name:  A short mnemonic for the frame type.

   In addition to the advice in Section 22.1, specifications for new
   permanent registrations SHOULD describe the means by which an
   endpoint might determine that it can send the identified type of
   frame. An accompanying transport parameter registration is expected
   for most registrations; see Section 22.2. Specifications for
   permanent registrations also need to describe the format and
   assigned semantics of any fields in the frame.

   The initial contents of this registry are tabulated in Table 3.

22.4.  QUIC Transport Error Codes Registry

   IANA [SHALL add/has added] a registry for "QUIC Transport Error
   Codes" under a "QUIC" heading.

   The "QUIC Transport Error Codes" registry governs a 62-bit space.
   This space is split into three ranges that are governed by different
   policies. Permanent registrations in this registry are assigned
   using the Specification Required policy [RFC8126], except for values
   between 0x00 and 0x3f (in hexadecimal; inclusive), which are assigned
   using Standards Action or IESG Approval as defined in Sections 4.9
   and 4.10 of [RFC8126].

   In addition to the fields in Section 22.1.1, permanent registrations
   in this registry MUST include the following fields:

   Code:  A short mnemonic for the error code.

   Description:  A brief description of the error code semantics, which
      MAY be a summary if a specification reference is provided.

   The initial contents of this registry are shown in Table 7.

   +------+---------------------------+----------------+---------------+
   |Value | Error                     | Description    | Specification |
   +======+===========================+================+===============+
   | 0x0  | NO_ERROR                  | No error       | Section 20    |
   +------+---------------------------+----------------+---------------+
   | 0x1  | INTERNAL_ERROR            | Implementation | Section 20    |
   |      |                           | error          |               |
   +------+---------------------------+----------------+---------------+
   | 0x2  | SERVER_BUSY               |Server currently| Section 20    |
   |      |                           | busy           |               |
   +------+---------------------------+----------------+---------------+
   | 0x3  | FLOW_CONTROL_ERROR        | Flow control   | Section 20    |
   |      |                           | error          |               |
   +------+---------------------------+----------------+---------------+
   | 0x4  | STREAM_LIMIT_ERROR        |Too many streams| Section 20    |
   |      |                           | opened         |               |
   +------+---------------------------+----------------+---------------+
   | 0x5  | STREAM_STATE_ERROR        | Frame received | Section 20    |
   |      |                           | in invalid     |               |
   |      |                           | stream state   |               |
   +------+---------------------------+----------------+---------------+
   | 0x6  | FINAL_SIZE_ERROR          |Change to final | Section 20    |
   |      |                           | size           |               |
   +------+---------------------------+----------------+---------------+
   | 0x7  | FRAME_ENCODING_ERROR      | Frame encoding | Section 20    |
   |      |                           | error          |               |
   +------+---------------------------+----------------+---------------+
   | 0x8  | TRANSPORT_PARAMETER_ERROR | Error in       | Section 20    |
   |      |                           | transport      |               |
   |      |                           | parameters     |               |
   +------+---------------------------+----------------+---------------+
   | 0x9  | CONNECTION_ID_LIMIT_ERROR | Too many       | Section 20    |
   |      |                           | connection IDs |               |
   |      |                           | received       |               |
   +------+---------------------------+----------------+---------------+
   | 0xA  | PROTOCOL_VIOLATION        |Generic protocol| Section 20    |
   |      |                           | violation      |               |
   +------+---------------------------+----------------+---------------+
   | 0xB  | INVALID_TOKEN             | Invalid Token  | Section 20    |
   |      |                           | Received       |               |
   +------+---------------------------+----------------+---------------+
   | 0xC  | APPLICATION_ERROR         | Application    | Section 20    |
   |      |                           | error          |               |
   +------+---------------------------+----------------+---------------+
   | 0xD  | CRYPTO_BUFFER_EXCEEDED    | CRYPTO data    | Section 20    |
   |      |                           | buffer         |               |
   |      |                           | overflowed     |               |
   +------+---------------------------+----------------+---------------+

          Table 7: Initial QUIC Transport Error Codes Entries

23.  References

23.1.  Normative References

   [DPLPMTUD] Fairhurst, G., Jones, T., Tuexen, M., Ruengeler, I., and
              T. Voelker, "Packetization Layer Path MTU Discovery for
              Datagram Transports", Work in Progress, Internet-Draft,
              draft-ietf-tsvwg-datagram-plpmtud-21, 12 May 2020.

   [IPv4]     Postel, J., "Internet Protocol", STD 5, RFC 791,
              DOI 10.17487/RFC0791, September 1981.

   [QUIC-RECOVERY]
              Iyengar, J., Ed. and I. Swett, Ed., "QUIC Loss Detection
              and Congestion Control", Work in Progress, Internet-Draft,
              draft-ietf-quic-recovery-28, 20 May 2020.

   [QUIC-TLS] Thomson, M., Ed. and S. Turner, Ed., "Using Transport
              Layer Security (TLS) to Secure QUIC", Work in Progress,
              Internet-Draft, draft-ietf-quic-tls-28, 20 May 2020.

   [RFC1191]  Mogul, J.C. and S.E. Deering, "Path MTU discovery",
              RFC 1191, DOI 10.17487/RFC1191, November 1990.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC3168]  Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
              of Explicit Congestion Notification (ECN) to IP",
              RFC 3168, DOI 10.17487/RFC3168, September 2001.

   [RFC3629]  Yergeau, F., "UTF-8, a transformation format of ISO
              10646", STD 63, RFC 3629, DOI 10.17487/RFC3629, November
              2003.

   [RFC4086]  Eastlake 3rd, D., Schiller, J., and S. Crocker,
              "Randomness Requirements for Security", BCP 106, RFC 4086,
              DOI 10.17487/RFC4086, June 2005.

   [RFC5116]  McGrew, D., "An Interface and Algorithms for Authenticated
              Encryption", RFC 5116, DOI 10.17487/RFC5116, January 2008.

   [RFC6437]  Amante, S., Carpenter, B., Jiang, S., and J. Rajahalme,
              "IPv6 Flow Label Specification", RFC 6437,
              DOI 10.17487/RFC6437, November 2011.

   [RFC8085]  Eggert, L., Fairhurst, G., and G. Shepherd, "UDP Usage
              Guidelines", BCP 145, RFC 8085, DOI 10.17487/RFC8085,
              March 2017.

   [RFC8126]  Cotton, M., Leiba, B., and T. Narten, "Guidelines for
              Writing an IANA Considerations Section in RFCs", BCP 26,
              RFC 8126, DOI 10.17487/RFC8126, June 2017.

   [RFC8174]  Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC
              2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174,
              May 2017.

   [RFC8201]  McCann, J., Deering, S., Mogul, J., and R. Hinden, Ed.,
              "Path MTU Discovery for IP version 6", STD 87, RFC 8201,
              DOI 10.17487/RFC8201, July 2017.

   [RFC8311]  Black, D., "Relaxing Restrictions on Explicit Congestion
              Notification (ECN) Experimentation", RFC 8311,
              DOI 10.17487/RFC8311, January 2018.

   [TLS13]    Rescorla, E., "The Transport Layer Security (TLS) Protocol
              Version 1.3", RFC 8446, DOI 10.17487/RFC8446, August 2018.

23.2.  Informative References

   [ALTSVC]   Nottingham, M., McManus, P., and J. Reschke, "HTTP
              Alternative Services", RFC 7838, DOI 10.17487/RFC7838,
              April 2016.

   [EARLY-DESIGN]
              Roskind, J., "QUIC: Multiplexed Transport Over UDP", 2
              December 2013.

   [HTTP2]    Belshe, M., Peon, R., and M. Thomson, Ed., "Hypertext
              Transfer Protocol Version 2 (HTTP/2)", RFC 7540,
              DOI 10.17487/RFC7540, May 2015.

   [QUIC-INVARIANTS]
              Thomson, M., "Version-Independent Properties of QUIC",
              Work in Progress, Internet-Draft,
              draft-ietf-quic-invariants-08, 20 May 2020.

   [QUIC-MANAGEABILITY]
              Kuehlewind, M. and B. Trammell, "Manageability of the QUIC
              Transport Protocol", Work in Progress, Internet-Draft,
              draft-ietf-quic-manageability-06, 6 January 2020.

   [RFC1812]  Baker, F., Ed., "Requirements for IP Version 4 Routers",
              RFC 1812, DOI 10.17487/RFC1812, June 1995.

   [RFC2018]  Mathis, M., Mahdavi, J., Floyd, S., and A. Romanow, "TCP
              Selective Acknowledgment Options", RFC 2018,
              DOI 10.17487/RFC2018, October 1996.

   [RFC2104]  Krawczyk, H., Bellare, M., and R. Canetti, "HMAC:
              Keyed-Hashing for Message Authentication", RFC 2104,
              DOI 10.17487/RFC2104, February 1997.

   [RFC4303]  Kent, S., "IP Encapsulating Security Payload (ESP)",
              RFC 4303, DOI 10.17487/RFC4303, December 2005.

   [RFC4443]  Conta, A., Deering, S., and M. Gupta, Ed., "Internet
              Control Message Protocol (ICMPv6) for the Internet
              Protocol Version 6 (IPv6) Specification", STD 89,
              RFC 4443, DOI 10.17487/RFC4443, March 2006.

   [RFC4787]  Audet, F., Ed. and C. Jennings, "Network Address
              Translation (NAT) Behavioral Requirements for Unicast
              UDP", BCP 127, RFC 4787, DOI 10.17487/RFC4787, January
              2007.

   [RFC5681]  Allman, M., Paxson, V., and E. Blanton, "TCP Congestion
              Control", RFC 5681, DOI 10.17487/RFC5681, September 2009.

   [RFC5869]  Krawczyk, H. and P. Eronen, "HMAC-based Extract-and-Expand
              Key Derivation Function (HKDF)", RFC 5869,
              DOI 10.17487/RFC5869, May 2010.

   [RFC7301]  Friedl, S., Popov, A., Langley, A., and E. Stephan,
              "Transport Layer Security (TLS) Application-Layer Protocol
              Negotiation Extension", RFC 7301, DOI 10.17487/RFC7301,
              July 2014.

   [RFC8200]  Deering, S. and R. Hinden, "Internet Protocol, Version 6
              (IPv6) Specification", STD 86, RFC 8200,
              DOI 10.17487/RFC8200, July 2017.

   [SEC-CONS] Rescorla, E. and B. Korver, "Guidelines for Writing RFC
              Text on Security Considerations", BCP 72, RFC 3552,
              DOI 10.17487/RFC3552, July 2003.

   [SLOWLORIS]
              RSnake Hansen, R., "Welcome to Slowloris...", June 2009.

   [STD]      Bradner, S., "The Internet Standards Process -- Revision
              3", BCP 9, RFC 2026, DOI 10.17487/RFC2026, October 1996.

Appendix A.  Sample Packet Number Decoding Algorithm

   The pseudo-code in Figure 41 shows how an implementation can decode
   packet numbers after header protection has been removed.
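
   For readers who want to experiment with the algorithm, the Figure 41
   pseudo-code can be transcribed into runnable form. The Python sketch
   below is one such transcription; the function name and the sample
   inputs are illustrative choices rather than part of this document,
   and integer division replaces the pseudo-code's "/".

```python
def decode_packet_number(largest_pn: int, truncated_pn: int, pn_nbits: int) -> int:
    """Recover a full packet number from its truncated encoding.

    largest_pn   -- largest packet number processed so far in this space
    truncated_pn -- value carried in the packet header
    pn_nbits     -- number of bits in the truncated encoding (8, 16, 24, 32)
    """
    expected_pn = largest_pn + 1
    pn_win = 1 << pn_nbits
    pn_hwin = pn_win // 2
    pn_mask = pn_win - 1

    # Replace the low bits of the expected value with the received bits.
    candidate_pn = (expected_pn & ~pn_mask) | truncated_pn

    # Candidate is too far below the expected value: move up one window,
    # unless that would exceed the 62-bit packet number space.
    if candidate_pn <= expected_pn - pn_hwin and candidate_pn < (1 << 62) - pn_win:
        return candidate_pn + pn_win

    # Candidate is too far above the expected value: move down one window,
    # unless that would go below zero.
    if candidate_pn > expected_pn + pn_hwin and candidate_pn >= pn_win:
        return candidate_pn - pn_win

    return candidate_pn
```

   For instance, decode_packet_number(0xa82f30ea, 0x9b32, 16) keeps the
   candidate inside the current window, while inputs near a window edge
   exercise the two adjustment branches.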

   DecodePacketNumber(largest_pn, truncated_pn, pn_nbits):
      expected_pn  = largest_pn + 1
      pn_win       = 1 << pn_nbits
      pn_hwin      = pn_win / 2
      pn_mask      = pn_win - 1
      // The incoming packet number should be greater than
      // expected_pn - pn_hwin and less than or equal to
      // expected_pn + pn_hwin
      //
      // This means we can't just strip the trailing bits from
      // expected_pn and add the truncated_pn because that might
      // yield a value outside the window.
      //
      // The following code calculates a candidate value and
      // makes sure it's within the packet number window.
      // Note the extra checks to prevent overflow and underflow.
      candidate_pn = (expected_pn & ~pn_mask) | truncated_pn
      if candidate_pn <= expected_pn - pn_hwin and
         candidate_pn < (1 << 62) - pn_win:
         return candidate_pn + pn_win
      if candidate_pn > expected_pn + pn_hwin and
         candidate_pn >= pn_win:
         return candidate_pn - pn_win
      return candidate_pn

          Figure 41: Sample Packet Number Decoding Algorithm

Appendix B.  Sample ECN Validation Algorithm

   Each time an endpoint commences sending on a new network path, it
   determines whether the path supports ECN; see Section 13.4. If the
   path supports ECN, the goal is to use ECN. Endpoints might also
   periodically reassess a path that was determined to not support ECN.

   This section describes one method for testing new paths. This
   algorithm is intended to show how a path might be tested for ECN
   support. Endpoints can implement different methods.

   The path is assigned an ECN state that is one of "testing",
   "unknown", "failed", or "capable". On paths with a "testing" or
   "capable" state the endpoint sends packets with an ECT marking, by
   default ECT(0); otherwise, the endpoint sends unmarked packets.

   To start testing a path, the ECN state is set to "testing" and
   existing ECN counts are remembered as a baseline.
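
   The per-path state handling described in this appendix can be
   sketched as a small state machine. The Python below is a minimal
   illustration under stated assumptions: the class and method names are
   hypothetical, the testing period is counted in packets only, the
   check of ECN counts from Section 13.4.2.2 is reduced to a boolean
   supplied by the caller, and a path stays "failed" until a new tester
   is created for it.

```python
from enum import Enum


class EcnState(Enum):
    TESTING = "testing"
    UNKNOWN = "unknown"
    FAILED = "failed"
    CAPABLE = "capable"


class EcnPathTester:
    """Tracks the ECN state of one network path (illustrative only)."""

    def __init__(self, baseline_counts=(0, 0, 0), test_packets=10):
        # Starting a test: state is "testing" and the ECN counts seen
        # so far are remembered as a baseline.
        self.state = EcnState.TESTING
        self.baseline_counts = baseline_counts
        self.test_packets = test_packets
        self.marked_sent = 0

    def marking(self) -> str:
        """Marking to apply to the next packet sent on this path."""
        if self.state in (EcnState.TESTING, EcnState.CAPABLE):
            return "ECT(0)"
        return "Not-ECT"

    def on_packet_sent(self) -> str:
        """Record a sent packet and return the marking it carried."""
        mark = self.marking()
        if self.state is EcnState.TESTING:
            self.marked_sent += 1
            if self.marked_sent >= self.test_packets:
                # Testing period over; wait for ACK frames to decide.
                self.state = EcnState.UNKNOWN
        return mark

    def on_ack(self, counts_valid: bool, marked_packet_acked: bool) -> None:
        """Apply the outcome of validating the ECN counts in an ACK frame."""
        if self.state is EcnState.FAILED:
            return
        if not counts_valid:
            # Validation failure at any time disables ECN on the path.
            self.state = EcnState.FAILED
        elif self.state is EcnState.UNKNOWN and marked_packet_acked:
            self.state = EcnState.CAPABLE
```

   An endpoint using this sketch would create one tester per path, call
   on_packet_sent() to learn the marking for each outgoing packet, and
   feed the result of its own ECN count validation into on_ack().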

   The testing period runs for a number of packets or round-trip times,
   as determined by the endpoint. The goal is not to limit the duration
   of the testing period, but to ensure that enough marked packets are
   sent for received ECN counts to provide a clear indication of how the
   path treats marked packets. Section 13.4.2.2 suggests limiting this
   to 10 packets or 3 round-trip times.

   After the testing period ends, the ECN state for the path becomes
   "unknown". From the "unknown" state, successful validation of the
   ECN counts in an ACK frame (see Section 13.4.2.2) causes the ECN
   state for the path to become "capable", unless no marked packet has
   been acknowledged.

   If validation of ECN counts fails at any time, the ECN state for the
   affected path becomes "failed". An endpoint can also mark the ECN
   state for a path as "failed" if marked packets are all declared lost
   or if they are all CE marked.

   Following this algorithm ensures that ECN is rarely disabled for
   paths that properly support ECN. Any path that incorrectly modifies
   markings will cause ECN to be disabled. For those rare cases where
   marked packets are discarded by the path, the short duration of the
   testing period limits the number of losses incurred.

Appendix C.  Change Log

      *RFC Editor's Note:* Please remove this section prior to
      publication of a final version of this document.

   Issue and pull request numbers are listed with a leading octothorpe.

C.1.  Since draft-ietf-quic-transport-27

   *  Allowed CONNECTION_CLOSE in any packet number space, with a
      requirement to use a new transport-level error for
      application-specific errors in Initial and Handshake packets
      (#3430, #3435, #3440)

   *  Clearer requirements for address validation (#2125, #3327)

   *  Security analysis of handshake and migration (#2143, #2387, #2925)

   *  The entire payload of a datagram is used when counting bytes for
      mitigating amplification attacks (#3333, #3470)

   *  Connection IDs can be used at any time, including in the handshake
      (#3348, #3560, #3438, #3565)

   *  Only one ACK should be sent for each instance of reordering
      (#3357, #3361)

   *  Remove text allowing a server to proceed with a bad Retry token
      (#3396, #3398)

   *  Ignore active_connection_id_limit with a zero-length connection ID
      (#3427, #3426)

   *  Require active_connection_id_limit be remembered for 0-RTT (#3423,
      #3425)

   *  Require ack_delay not be remembered for 0-RTT (#3433, #3545)

   *  Redefined max_packet_size to max_udp_datagram_size (#3471, #3473)

   *  Guidance on limiting outstanding attempts to retire connection IDs
      (#3489, #3509, #3557, #3547)

   *  Restored text on dropping bogus Version Negotiation packets
      (#3532, #3533)

   *  Clarified that largest acknowledged needs to be saved, but not
      necessarily signaled in all cases (#3541, #3581)

   *  Addressed linkability risk with the use of preferred_address
      (#3559, #3563)

C.2.  Since draft-ietf-quic-transport-26

   *  Change format of transport parameters to use varints (#3294,
      #3169)

C.3.  Since draft-ietf-quic-transport-25

   *  Define the use of CONNECTION_CLOSE prior to establishing
      connection state (#3269, #3297, #3292)

   *  Allow use of address validation tokens after client address
      changes (#3307, #3308)

   *  Define the timer for address validation (#2910, #3339)

C.4.  Since draft-ietf-quic-transport-24

   *  Added HANDSHAKE_DONE to signal handshake confirmation (#2863,
      #3142, #3145)

   *  Add integrity check to Retry packets (#3014, #3274, #3120)

   *  Specify handling of reordered NEW_CONNECTION_ID frames (#3194,
      #3202)

   *  Require checking of sequence numbers in RETIRE_CONNECTION_ID
      (#3037, #3036)

   *  active_connection_id_limit is enforced (#3193, #3197, #3200,
      #3201)

   *  Correct overflow in packet number decode algorithm (#3187, #3188)

   *  Allow use of CRYPTO_BUFFER_EXCEEDED for CRYPTO frame errors
      (#3258, #3186)

   *  Define applicability and scope of NEW_TOKEN (#3150, #3152, #3155,
      #3156)

   *  Tokens from Retry and NEW_TOKEN must be differentiated (#3127,
      #3128)

   *  Allow CONNECTION_CLOSE in response to invalid token (#3168, #3107)

   *  Treat an invalid CONNECTION_CLOSE as an invalid frame (#2475,
      #3230, #3231)

   *  Throttle when sending CONNECTION_CLOSE after discarding state
      (#3095, #3157)

   *  Application-variant of CONNECTION_CLOSE can only be sent in 0-RTT
      or 1-RTT packets (#3158, #3164)

   *  Advise sending while blocked to avoid idle timeout (#2744, #3266)

   *  Define error codes for invalid frames (#3027, #3042)

   *  Idle timeout is symmetric (#2602, #3099)

   *  Prohibit IP fragmentation (#3243, #3280)

   *  Define the use of provisional registration for all registries
      (#3109, #3020, #3102, #3170)

   *  Packets on one path must not adjust values for a different path
      (#2909, #3139)

C.5.  Since draft-ietf-quic-transport-23

   *  Allow ClientHello to span multiple packets (#2928, #3045)

   *  Client Initial size constraints apply to UDP datagram payload
      (#3053, #3051)

   *  Stateless reset changes (#2152, #2993)

      -  tokens need to be compared in constant time

      -  detection uses UDP datagrams, not packets

      -  tokens cannot be reused (#2785, #2968)

   *  Clearer rules for sharing of UDP ports and use of connection IDs
      when doing so (#2844, #2851)

   *  A new connection ID is necessary when responding to migration
      (#2778, #2969)

   *  Stronger requirements for connection ID retirement (#3046, #3096)

   *  NEW_TOKEN cannot be empty (#2978, #2977)

   *  PING can be sent at any encryption level (#3034, #3035)

   *  CONNECTION_CLOSE is not ack-eliciting (#3097, #3098)

   *  Frame encoding error conditions updated (#3027, #3042)

   *  Non-ack-eliciting packets cannot be sent in response to
      non-ack-eliciting packets (#3100, #3104)

   *  Servers have to change connection IDs in Retry (#2837, #3147)

C.6.
Since draft-ietf-quic-transport-22 7602 * Rules for preventing correlation by connection ID tightened 7603 (#2084, #2929) 7605 * Clarified use of CONNECTION_CLOSE in Handshake packets (#2151, 7606 #2541, #2688) 7608 * Discourage regressions of largest acknowledged in ACK (#2205, 7609 #2752) 7611 * Improved robustness of validation process for ECN counts (#2534, 7612 #2752) 7614 * Require endpoints to ignore spurious migration attempts (#2342, 7615 #2893) 7617 * Transport parameter for disabling migration clarified to allow NAT 7618 rebinding (#2389, #2893) 7620 * Document principles for defining new error codes (#2388, #2880) 7622 * Reserve transport parameters for greasing (#2550, #2873) 7624 * A maximum ACK delay of 0 is used for handshake packet number 7625 spaces (#2646, #2638) 7627 * Improved rules for use of congestion control state on new paths 7628 (#2685, #2918) 7630 * Removed recommendation to coordinate spin for multiple connections 7631 that share a path (#2763, #2882) 7633 * Allow smaller stateless resets and recommend a smaller minimum on 7634 packets that might trigger a stateless reset (#2770, #2869, #2927, 7635 #3007). 7637 * Provide guidance around the interface to QUIC as used by 7638 application protocols (#2805, #2857) 7640 * Frames other than STREAM can cause STREAM_LIMIT_ERROR (#2825, 7641 #2826) 7643 * Tighter rules about processing of rejected 0-RTT packets (#2829, 7644 #2840, #2841) 7646 * Explanation of the effect of Retry on 0-RTT packets (#2842, #2852) 7648 * Cryptographic handshake needs to provide server transport 7649 parameter encryption (#2920, #2921) 7651 * Moved ACK generation guidance from recovery draft to transport 7652 draft (#1860, #2916). 7654 C.7. Since draft-ietf-quic-transport-21 7656 * Connection ID lengths are now one octet, but limited in version 1 7657 to 20 octets of length (#2736, #2749) 7659 C.8. 
Since draft-ietf-quic-transport-20 7661 * Error codes are encoded as variable-length integers (#2672, #2680) 7663 * NEW_CONNECTION_ID includes a request to retire old connection IDs 7664 (#2645, #2769) 7666 * Tighter rules for generating and explicitly eliciting ACK frames 7667 (#2546, #2794) 7669 * Recommend having only one packet per encryption level in a 7670 datagram (#2308, #2747) 7672 * More normative language about use of stateless reset (#2471, 7673 #2574) 7675 * Allow reuse of stateless reset tokens (#2732, #2733) 7677 * Allow, but not require, enforcing non-duplicate transport 7678 parameters (#2689, #2691) 7680 * Added an active_connection_id_limit transport parameter (#1994, 7681 #1998) 7683 * max_ack_delay transport parameter defaults to 0 (#2638, #2646) 7685 * When sending 0-RTT, only remembered transport parameters apply 7686 (#2458, #2360, #2466, #2461) 7688 * Define handshake completion and confirmation; define clearer rules 7689 for when encryption keys should be discarded (#2214, #2267, #2673) 7691 * Prohibit path migration prior to handshake confirmation (#2309, 7692 #2370) 7694 * PATH_RESPONSE no longer needs to be received on the validated path 7695 (#2582, #2580, #2579, #2637) 7697 * PATH_RESPONSE frames are not stored and retransmitted (#2724, 7698 #2729) 7700 * Document hack for enabling routing of ICMP when doing PMTU probing 7701 (#1243, #2402) 7703 C.9.
Since draft-ietf-quic-transport-19 7705 * Refine discussion of 0-RTT transport parameters (#2467, #2464) 7706 * Fewer transport parameters need to be remembered for 0-RTT (#2624, 7707 #2467) 7709 * Spin bit text incorporated (#2564) 7711 * Close the connection when maximum stream ID in MAX_STREAMS exceeds 7712 2^62 - 1 (#2499, #2487) 7714 * New connection ID required for intentional migration (#2414, 7715 #2413) 7717 * Connection ID issuance can be rate-limited (#2436, #2428) 7719 * The "QUIC bit" is ignored in Version Negotiation (#2400, #2561) 7721 * Initial packets from clients need to be padded to 1200 unless a 7722 Handshake packet is sent as well (#2522, #2523) 7724 * CRYPTO frames can be discarded if too much data is buffered 7725 (#1834, #2524) 7727 * Stateless reset uses a short header packet (#2599, #2600) 7729 C.10. Since draft-ietf-quic-transport-18 7731 * Removed version negotiation; version negotiation, including 7732 authentication of the result, will be addressed in the next 7733 version of QUIC (#1773, #2313) 7735 * Added discussion of the use of IPv6 flow labels (#2348, #2399) 7737 * A connection ID can't be retired in a packet that uses that 7738 connection ID (#2101, #2420) 7740 * Idle timeout transport parameter is in milliseconds (from seconds) 7741 (#2453, #2454) 7743 * Endpoints are required to use new connection IDs when they use new 7744 network paths (#2413, #2414) 7746 * Increased the set of permissible frames in 0-RTT (#2344, #2355) 7748 C.11. 
Since draft-ietf-quic-transport-17 7750 * Stream-related errors now use STREAM_STATE_ERROR (#2305) 7752 * Endpoints discard initial keys as soon as handshake keys are 7753 available (#1951, #2045) 7755 * Expanded conditions for ignoring ICMP packet too big messages 7756 (#2108, #2161) 7758 * Remove rate control from PATH_CHALLENGE/PATH_RESPONSE (#2129, 7759 #2241) 7761 * Endpoints are permitted to discard malformed initial packets 7762 (#2141) 7764 * Clarified ECN implementation and usage requirements (#2156, #2201) 7766 * Disable ECN count verification for packets that arrive out of 7767 order (#2198, #2215) 7769 * Use Probe Timeout (PTO) instead of RTO (#2206, #2238) 7771 * Loosen constraints on retransmission of ACK ranges (#2199, #2245) 7773 * Limit Retry and Version Negotiation to once per datagram (#2259, 7774 #2303) 7776 * Set a maximum value for max_ack_delay transport parameter (#2282, 7777 #2301) 7779 * Allow server preferred address for both IPv4 and IPv6 (#2122, 7780 #2296) 7782 * Corrected requirements for migration to a preferred address 7783 (#2146, #2349) 7785 * ACK of non-existent packet is illegal (#2298, #2302) 7787 C.12. 
Since draft-ietf-quic-transport-16 7789 * Stream limits are defined as counts, not maximums (#1850, #1906) 7791 * Require amplification attack defense after closing (#1905, #1911) 7793 * Remove reservation of application error code 0 for STOPPING 7794 (#1804, #1922) 7796 * Renumbered frames (#1945) 7798 * Renumbered transport parameters (#1946) 7800 * Numeric transport parameters are expressed as varints (#1608, 7801 #1947, #1955) 7803 * Reorder the NEW_CONNECTION_ID frame (#1952, #1963) 7805 * Rework the first byte (#2006) 7807 - Fix the 0x40 bit 7809 - Change type values for long header 7811 - Add spin bit to short header (#631, #1988) 7813 - Encrypt the remainder of the first byte (#1322) 7815 - Move packet number length to first byte 7817 - Move ODCIL to first byte of retry packets 7819 - Simplify packet number protection (#1575) 7821 * Allow STOP_SENDING to open a remote bidirectional stream (#1797, 7822 #2013) 7824 * Added mitigation for off-path migration attacks (#1278, #1749, 7825 #2033) 7827 * Don't let the PMTU drop below 1280 (#2063, #2069) 7829 * Require peers to replace retired connection IDs (#2085) 7831 * Servers are required to ignore Version Negotiation packets (#2088) 7833 * Tokens are repeated in all Initial packets (#2089) 7835 * Clarified how PING frames are sent after loss (#2094) 7837 * Initial keys are discarded once Handshake keys are available (#1951, 7838 #2045) 7840 * ICMP PTB validation clarifications (#2161, #2109, #2108) 7842 C.13. Since draft-ietf-quic-transport-15 7844 Substantial editorial reorganization; no technical changes. 7846 C.14.
Since draft-ietf-quic-transport-14 7848 * Merge ACK and ACK_ECN (#1778, #1801) 7850 * Explicitly communicate max_ack_delay (#981, #1781) 7851 * Validate original connection ID after Retry packets (#1710, #1486, 7852 #1793) 7854 * Idle timeout is optional and has no specified maximum (#1765) 7856 * Update connection ID handling; add RETIRE_CONNECTION_ID type 7857 (#1464, #1468, #1483, #1484, #1486, #1495, #1729, #1742, #1799, 7858 #1821) 7860 * Include a Token in all Initial packets (#1649, #1794) 7862 * Prevent handshake deadlock (#1764, #1824) 7864 C.15. Since draft-ietf-quic-transport-13 7866 * Streams open when higher-numbered streams of the same type open 7867 (#1342, #1549) 7869 * Split initial stream flow control limit into 3 transport 7870 parameters (#1016, #1542) 7872 * All flow control transport parameters are optional (#1610) 7874 * Removed UNSOLICITED_PATH_RESPONSE error code (#1265, #1539) 7876 * Permit stateless reset in response to any packet (#1348, #1553) 7878 * Recommended defense against stateless reset spoofing (#1386, 7879 #1554) 7881 * Prevent infinite stateless reset exchanges (#1443, #1627) 7883 * Forbid processing of the same packet number twice (#1405, #1624) 7885 * Added a packet number decoding example (#1493) 7887 * More precisely define idle timeout (#1429, #1614, #1652) 7889 * Corrected format of Retry packet and prevented looping (#1492, 7890 #1451, #1448, #1498) 7892 * Permit 0-RTT after receiving Version Negotiation or Retry (#1507, 7893 #1514, #1621) 7895 * Permit Retry in response to 0-RTT (#1547, #1552) 7897 * Looser verification of ECN counters to account for ACK loss 7898 (#1555, #1481, #1565) 7900 * Remove frame type field from APPLICATION_CLOSE (#1508, #1528) 7902 C.16. 
Since draft-ietf-quic-transport-12 7904 * Changes to integration of the TLS handshake (#829, #1018, #1094, 7905 #1165, #1190, #1233, #1242, #1252, #1450, #1458) 7907 - The cryptographic handshake uses CRYPTO frames, not stream 0 7909 - QUIC packet protection is used in place of TLS record 7910 protection 7912 - Separate QUIC packet number spaces are used for the handshake 7914 - Changed Retry to be independent of the cryptographic handshake 7916 - Added NEW_TOKEN frame and Token fields to Initial packet 7918 - Limit the use of HelloRetryRequest to address TLS needs (like 7919 key shares) 7921 * Enable server to transition connections to a preferred address 7922 (#560, #1251, #1373) 7924 * Added ECN feedback mechanisms and handling; new ACK_ECN frame 7925 (#804, #805, #1372) 7927 * Changed rules and recommendations for use of new connection IDs 7928 (#1258, #1264, #1276, #1280, #1419, #1452, #1453, #1465) 7930 * Added a transport parameter to disable intentional connection 7931 migration (#1271, #1447) 7933 * Packets with different connection IDs can't be coalesced (#1287, 7934 #1423) 7936 * Fixed sampling method for packet number encryption; the length 7937 field in long headers includes the packet number field in addition 7938 to the packet payload (#1387, #1389) 7940 * Stateless Reset is now symmetric and subject to size constraints 7941 (#466, #1346) 7943 * Added frame type extension mechanism (#58, #1473) 7945 C.17. Since draft-ietf-quic-transport-11 7946 * Enable server to transition connections to a preferred address 7947 (#560, #1251) 7949 * Packet numbers are encrypted (#1174, #1043, #1048, #1034, #850, 7950 #990, #734, #1317, #1267, #1079) 7952 * Packet numbers use a variable-length encoding (#989, #1334) 7954 * STREAM frames can now be empty (#1350) 7956 C.18.
Since draft-ietf-quic-transport-10 7958 * Swap payload length and packet number fields in long header 7959 (#1294) 7961 * Clarified that CONNECTION_CLOSE is allowed in Handshake packet 7962 (#1274) 7964 * Spin bit reserved (#1283) 7966 * Coalescing multiple QUIC packets in a UDP datagram (#1262, #1285) 7968 * A more complete connection migration (#1249) 7970 * Refine opportunistic ACK defense text (#305, #1030, #1185) 7972 * A Stateless Reset Token isn't mandatory (#818, #1191) 7974 * Removed implicit stream opening (#896, #1193) 7976 * An empty STREAM frame can be used to open a stream without sending 7977 data (#901, #1194) 7979 * Define stream counts in transport parameters rather than a maximum 7980 stream ID (#1023, #1065) 7982 * STOP_SENDING is now prohibited before streams are used (#1050) 7984 * Recommend including ACK in Retry packets and allow PADDING (#1067, 7985 #882) 7987 * Endpoints now become closing after an idle timeout (#1178, #1179) 7989 * Remove implication that Version Negotiation is sent when a packet 7990 of the wrong version is received (#1197) 7992 C.19. Since draft-ietf-quic-transport-09 7993 * Added PATH_CHALLENGE and PATH_RESPONSE frames to replace PING with 7994 Data and PONG frame. Changed ACK frame type from 0x0e to 0x0d. 7995 (#1091, #725, #1086) 7997 * A server can now only send 3 packets without validating the client 7998 address (#38, #1090) 8000 * Delivery order of stream data is no longer strongly specified 8001 (#252, #1070) 8003 * Rework of packet handling and version negotiation (#1038) 8005 * Stream 0 is now exempt from flow control until the handshake 8006 completes (#1074, #725, #825, #1082) 8008 * Improved retransmission rules for all frame types: information is 8009 retransmitted, not packets or frames (#463, #765, #1095, #1053) 8011 * Added an error code for server busy signals (#1137) 8013 * Endpoints now set the connection ID that their peer uses. 8014 Connection IDs are variable length.
Removed the 8015 omit_connection_id transport parameter and the corresponding short 8016 header flag. (#1089, #1052, #1146, #821, #745, #821, #1166, #1151) 8018 C.20. Since draft-ietf-quic-transport-08 8020 * Clarified requirements for BLOCKED usage (#65, #924) 8022 * BLOCKED frame now includes reason for blocking (#452, #924, #927, 8023 #928) 8025 * GAP limitation in ACK Frame (#613) 8027 * Improved PMTUD description (#614, #1036) 8029 * Clarified stream state machine (#634, #662, #743, #894) 8031 * Reserved versions don't need to be generated deterministically 8032 (#831, #931) 8034 * You don't always need the draining period (#871) 8036 * Stateless reset clarified as version-specific (#930, #986) 8038 * initial_max_stream_id_x transport parameters are optional (#970, 8039 #971) 8041 * Ack Delay assumes a default value during the handshake (#1007, 8042 #1009) 8044 * Removed transport parameters from NewSessionTicket (#1015) 8046 C.21. Since draft-ietf-quic-transport-07 8048 * The long header now has version before packet number (#926, #939) 8050 * Rename and consolidate packet types (#846, #822, #847) 8052 * Packet types are assigned new codepoints and the Connection ID 8053 Flag is inverted (#426, #956) 8055 * Removed type for Version Negotiation and use Version 0 (#963, 8056 #968) 8058 * Streams are split into unidirectional and bidirectional (#643, 8059 #656, #720, #872, #175, #885) 8061 - Stream limits now have separate uni- and bi-directional 8062 transport parameters (#909, #958) 8064 - Stream limit transport parameters are now optional and default 8065 to 0 (#970, #971) 8067 * The stream state machine has been split into read and write (#634, 8068 #894) 8070 * Employ variable-length integer encodings throughout (#595) 8072 * Improvements to connection close 8074 - Added distinct closing and draining states (#899, #871) 8076 - Draining period can terminate early (#869, #870) 8078 - Clarifications about stateless reset (#889, #890) 8080 * Address validation 
for connection migration (#161, #732, #878) 8082 * Clearly defined retransmission rules for BLOCKED (#452, #65, #924) 8084 * negotiated_version is sent in server transport parameters (#710, 8085 #959) 8087 * Increased the range over which packet numbers are randomized 8088 (#864, #850, #964) 8090 C.22. Since draft-ietf-quic-transport-06 8092 * Replaced FNV-1a with AES-GCM for all "Cleartext" packets (#554) 8094 * Split error code space between application and transport (#485) 8096 * Stateless reset token moved to end (#820) 8098 * 1-RTT-protected long header types removed (#848) 8100 * No acknowledgments during draining period (#852) 8102 * Remove "application close" as a separate close type (#854) 8104 * Remove timestamps from the ACK frame (#841) 8106 * Require transport parameters to only appear once (#792) 8108 C.23. Since draft-ietf-quic-transport-05 8110 * Stateless token is server-only (#726) 8112 * Refactor section on connection termination (#733, #748, #328, 8113 #177) 8115 * Limit size of Version Negotiation packet (#585) 8117 * Clarify when and what to ack (#736) 8119 * Renamed STREAM_ID_NEEDED to STREAM_ID_BLOCKED 8121 * Clarify Keep-alive requirements (#729) 8123 C.24. 
Since draft-ietf-quic-transport-04 8125 * Introduce STOP_SENDING frame, RESET_STREAM only resets in one 8126 direction (#165) 8128 * Removed GOAWAY; application protocols are responsible for graceful 8129 shutdown (#696) 8131 * Reduced the number of error codes (#96, #177, #184, #211) 8133 * Version validation fields can't move or change (#121) 8135 * Removed versions from the transport parameters in a 8136 NewSessionTicket message (#547) 8138 * Clarify the meaning of "bytes in flight" (#550) 8140 * Public reset is now stateless reset and not visible to the path 8141 (#215) 8143 * Reordered bits and fields in STREAM frame (#620) 8145 * Clarifications to the stream state machine (#572, #571) 8147 * Increased the maximum length of the Largest Acknowledged field in 8148 ACK frames to 64 bits (#629) 8150 * truncate_connection_id is renamed to omit_connection_id (#659) 8152 * CONNECTION_CLOSE terminates the connection like TCP RST (#330, 8153 #328) 8155 * Update labels used in HKDF-Expand-Label to match TLS 1.3 (#642) 8157 C.25. Since draft-ietf-quic-transport-03 8159 * Change STREAM and RESET_STREAM layout 8161 * Add MAX_STREAM_ID settings 8163 C.26. 
Since draft-ietf-quic-transport-02 8165 * The size of the initial packet payload has a fixed minimum (#267, 8166 #472) 8168 * Define when Version Negotiation packets are ignored (#284, #294, 8169 #241, #143, #474) 8171 * The 64-bit FNV-1a algorithm is used for integrity protection of 8172 unprotected packets (#167, #480, #481, #517) 8174 * Rework initial packet types to change how the connection ID is 8175 chosen (#482, #442, #493) 8177 * Timestamps are forbidden in unprotected packets (#542, #429) 8179 * Cryptographic handshake is now on stream 0 (#456) 8181 * Remove congestion control exemption for cryptographic handshake 8182 (#248, #476) 8184 * Version 1 of QUIC uses TLS; a new version is needed to use a 8185 different handshake protocol (#516) 8187 * STREAM frames have a reduced number of offset lengths (#543, #430) 8189 * Split some frames into separate connection- and stream-level 8190 frames (#443) 8192 - WINDOW_UPDATE split into MAX_DATA and MAX_STREAM_DATA (#450) 8194 - BLOCKED split to match WINDOW_UPDATE split (#454) 8196 - Define STREAM_ID_NEEDED frame (#455) 8198 * A NEW_CONNECTION_ID frame supports connection migration without 8199 linkability (#232, #491, #496) 8201 * Transport parameters for 0-RTT are retained from a previous 8202 connection (#405, #513, #512) 8204 - A client in 0-RTT is no longer required to reset excess streams 8205 (#425, #479) 8207 * Expanded security considerations (#440, #444, #445, #448) 8209 C.27.
Since draft-ietf-quic-transport-01 8211 * Defined short and long packet headers (#40, #148, #361) 8213 * Defined a versioning scheme and stable fields (#51, #361) 8215 * Define reserved version values for "greasing" negotiation (#112, 8216 #278) 8218 * The initial packet number is randomized (#35, #283) 8220 * Narrow the packet number encoding range requirement (#67, #286, 8221 #299, #323, #356) 8223 * Defined client address validation (#52, #118, #120, #275) 8225 * Define transport parameters as a TLS extension (#49, #122) 8227 * SCUP and COPT parameters are no longer valid (#116, #117) 8229 * Transport parameters for 0-RTT are either remembered from before, 8230 or assume default values (#126) 8232 * The server chooses connection IDs in its final flight (#119, #349, 8233 #361) 8235 * The server echoes the Connection ID and packet number fields when 8236 sending a Version Negotiation packet (#133, #295, #244) 8238 * Defined a minimum packet size for the initial handshake packet 8239 from the client (#69, #136, #139, #164) 8241 * Path MTU Discovery (#64, #106) 8243 * The initial handshake packet from the client needs to fit in a 8244 single packet (#338) 8246 * Forbid acknowledgment of packets containing only ACK and PADDING 8247 (#291) 8249 * Require that frames are processed when packets are acknowledged 8250 (#381, #341) 8252 * Removed the STOP_WAITING frame (#66) 8254 * Don't require retransmission of old timestamps for lost ACK frames 8255 (#308) 8257 * Clarified that frames are not retransmitted, but the information 8258 in them can be (#157, #298) 8260 * Error handling definitions (#335) 8262 * Split error codes into four sections (#74) 8264 * Forbid the use of Public Reset where CONNECTION_CLOSE is possible 8265 (#289) 8267 * Define packet protection rules (#336) 8269 * Require that stream be entirely delivered or reset, including 8270 acknowledgment of all STREAM frames or the RESET_STREAM, before it 8271 closes (#381) 8273 * Remove stream reservation from 
state machine (#174, #280) 8275 * Only stream 1 does not contribute to connection-level flow control 8276 (#204) 8278 * Stream 1 counts towards the maximum concurrent stream limit (#201, 8279 #282) 8281 * Remove connection-level flow control exclusion for some streams 8282 (except 1) (#246) 8284 * RESET_STREAM affects connection-level flow control (#162, #163) 8286 * Flow control accounting uses the maximum data offset on each 8287 stream, rather than bytes received (#378) 8289 * Moved length-determining fields to the start of STREAM and ACK 8290 (#168, #277) 8292 * Added the ability to pad between frames (#158, #276) 8294 * Remove error code and reason phrase from GOAWAY (#352, #355) 8296 * GOAWAY includes a final stream number for both directions (#347) 8298 * Error codes for RESET_STREAM and CONNECTION_CLOSE are now at a 8299 consistent offset (#249) 8301 * Defined priority as the responsibility of the application protocol 8302 (#104, #303) 8304 C.28. Since draft-ietf-quic-transport-00 8306 * Replaced DIVERSIFICATION_NONCE flag with KEY_PHASE flag 8308 * Defined versioning 8310 * Reworked description of packet and frame layout 8312 * Error code space is divided into regions for each component 8314 * Use big endian for all numeric values 8316 C.29. Since draft-hamilton-quic-transport-protocol-01 8318 * Adopted as base for draft-ietf-quic-tls 8320 * Updated authors/editors list 8322 * Added IANA Considerations section 8324 * Moved Contributors and Acknowledgments to appendices 8326 Contributors 8328 The original design and rationale behind this protocol draw 8329 significantly from work by Jim Roskind [EARLY-DESIGN]. 8331 The IETF QUIC Working Group received an enormous amount of support 8332 from many people. 
The following people provided substantive 8333 contributions to this document: 8335 * Alessandro Ghedini 8337 * Alyssa Wilk 8339 * Antoine Delignat-Lavaud 8341 * Brian Trammell 8343 * Christian Huitema 8345 * Colin Perkins 8347 * David Schinazi 8349 * Dmitri Tikhonov 8351 * Eric Kinnear 8353 * Eric Rescorla 8355 * Gorry Fairhurst 8357 * Ian Swett 8359 * Igor Lubashev 8361 * 奥 一穂 (Kazuho Oku) 8363 * Lucas Pardue 8365 * Magnus Westerlund 8367 * Marten Seemann 8369 * Martin Duke 8371 * Mike Bishop 8373 * Mikkel Fahnøe Jørgensen 8375 * Mirja Kühlewind 8377 * Nick Banks 8378 * Nick Harper 8380 * Patrick McManus 8382 * Roberto Peon 8384 * Ryan Hamilton 8386 * Subodh Iyengar 8388 * Tatsuhiro Tsujikawa 8390 * Ted Hardie 8392 * Tom Jones 8394 * Victor Vasiliev 8396 Authors' Addresses 8398 Jana Iyengar (editor) 8399 Fastly 8401 Email: jri.ietf@gmail.com 8403 Martin Thomson (editor) 8404 Mozilla 8406 Email: mt@lowentropy.net