idnits 2.17.1

draft-ietf-quic-transport-30.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  == There are 4 instances of lines with non-ascii characters in the document.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  == There are 1 instance of lines with non-RFC6890-compliant IPv4 addresses
     in the document. If these are example addresses, they should be changed.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (September 10, 2020) is 1296 days in the past. Is this
     intentional?

  -- Found something which looks like a code comment -- if you have code
     sections in the document, please surround them with '<CODE BEGINS>' and
     '<CODE ENDS>' lines.

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  -- Looks like a reference, but probably isn't: '0' on line 2300

  == Missing Reference: 'CH' is mentioned on line 2296, but not defined

  == Missing Reference: 'SH' is mentioned on line 2298, but not defined

  == Missing Reference: 'EE' is mentioned on line 2299, but not defined

  == Missing Reference: 'CERT' is mentioned on line 2299, but not defined

  == Missing Reference: 'CV' is mentioned on line 2299, but not defined

  == Missing Reference: 'FIN' is mentioned on line 2299, but not defined

  -- Looks like a reference, but probably isn't: '1' on line 2298

  == Outdated reference: A later version (-34) exists of
     draft-ietf-quic-recovery-30

  == Outdated reference: A later version (-34) exists of
     draft-ietf-quic-tls-30

  -- Obsolete informational reference (is this intentional?): RFC 2267 (ref.
     'BCP38') (Obsoleted by RFC 2827)

  -- Obsolete informational reference (is this intentional?): RFC 7540 (ref.
     'HTTP2') (Obsoleted by RFC 9113)

  == Outdated reference: A later version (-13) exists of
     draft-ietf-quic-invariants-10

  == Outdated reference: A later version (-18) exists of
     draft-ietf-quic-manageability-07

  -- Duplicate reference: RFC7301, mentioned in 'RFC7301', was also mentioned
     in 'ALPN'.

     Summary: 0 errors (**), 0 flaws (~~), 13 warnings (==), 7 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

2    QUIC                                                     J. Iyengar, Ed.
3    Internet-Draft                                                    Fastly
4    Intended status: Standards Track                         M. Thomson, Ed.
5    Expires: March 14, 2021                                          Mozilla
6                                                          September 10, 2020

8               QUIC: A UDP-Based Multiplexed and Secure Transport
9                          draft-ietf-quic-transport-30

11   Abstract

13      This document defines the core of the QUIC transport protocol.
14      Accompanying documents describe QUIC's loss detection and congestion
15      control and the use of TLS for key negotiation.

17   Note to Readers

19      Discussion of this draft takes place on the QUIC working group
20      mailing list (quic@ietf.org (mailto:quic@ietf.org)), which is
21      archived at https://mailarchive.ietf.org/arch/search/?email_list=quic

23      Working Group information can be found at https://github.com/quicwg;
24      source code and issues list for this draft can be found at
25      https://github.com/quicwg/base-drafts/labels/-transport.

27   Status of This Memo

29      This Internet-Draft is submitted in full conformance with the
30      provisions of BCP 78 and BCP 79.

32      Internet-Drafts are working documents of the Internet Engineering
33      Task Force (IETF). Note that other groups may also distribute
34      working documents as Internet-Drafts. The list of current Internet-
35      Drafts is at https://datatracker.ietf.org/drafts/current/.

37      Internet-Drafts are draft documents valid for a maximum of six months
38      and may be updated, replaced, or obsoleted by other documents at any
39      time. It is inappropriate to use Internet-Drafts as reference
40      material or to cite them other than as "work in progress."

42      This Internet-Draft will expire on March 14, 2021.

44   Copyright Notice

46      Copyright (c) 2020 IETF Trust and the persons identified as the
47      document authors. All rights reserved.

49      This document is subject to BCP 78 and the IETF Trust's Legal
50      Provisions Relating to IETF Documents (https://trustee.ietf.org/
51      license-info) in effect on the date of publication of this document.
52      Please review these documents carefully, as they describe your rights
53      and restrictions with respect to this document. Code Components
54      extracted from this document must include Simplified BSD License text
55      as described in Section 4.e of the Trust Legal Provisions and are
56      provided without warranty as described in the Simplified BSD License.

58   Table of Contents

60      1.  Overview
61        1.1.  Document Structure
62        1.2.  Terms and Definitions
63        1.3.  Notational Conventions
64      2.  Streams
65        2.1.  Stream Types and Identifiers
66        2.2.  Sending and Receiving Data
67        2.3.  Stream Prioritization
68        2.4.  Operations on Streams
69      3.  Stream States
70        3.1.  Sending Stream States
71        3.2.  Receiving Stream States
72        3.3.  Permitted Frame Types
73        3.4.  Bidirectional Stream States
74        3.5.  Solicited State Transitions
75      4.  Flow Control
76        4.1.  Data Flow Control
77        4.2.  Increasing Flow Control Limits
78        4.3.  Handling Stream Cancellation
79        4.4.  Stream Final Size
80        4.5.  Controlling Concurrency
81        4.6.  Flow Control Performance
82      5.  Connections
83        5.1.  Connection ID
84          5.1.1.  Issuing Connection IDs
85          5.1.2.  Consuming and Retiring Connection IDs
86        5.2.  Matching Packets to Connections
87          5.2.1.  Client Packet Handling
88          5.2.2.  Server Packet Handling
89          5.2.3.  Considerations for Simple Load Balancers
90        5.3.  Operations on Connections
91      6.  Version Negotiation
92        6.1.  Sending Version Negotiation Packets
93        6.2.  Handling Version Negotiation Packets
94          6.2.1.  Version Negotiation Between Draft Versions
95        6.3.  Using Reserved Versions
96      7.  Cryptographic and Transport Handshake
97        7.1.  Example Handshake Flows
98        7.2.  Negotiating Connection IDs
99        7.3.  Authenticating Connection IDs
100       7.4.  Transport Parameters
101         7.4.1.  Values of Transport Parameters for 0-RTT
102         7.4.2.  New Transport Parameters
103       7.5.  Cryptographic Message Buffering
104     8.  Address Validation
105       8.1.  Address Validation During Connection Establishment
106         8.1.1.  Token Construction
107         8.1.2.  Address Validation using Retry Packets
108         8.1.3.  Address Validation for Future Connections
109         8.1.4.  Address Validation Token Integrity
110       8.2.  Path Validation
111       8.3.  Initiating Path Validation
112       8.4.  Path Validation Responses
113       8.5.  Successful Path Validation
114       8.6.  Failed Path Validation
115     9.  Connection Migration
116       9.1.  Probing a New Path
117       9.2.  Initiating Connection Migration
118       9.3.  Responding to Connection Migration
119         9.3.1.  Peer Address Spoofing
120         9.3.2.  On-Path Address Spoofing
121         9.3.3.  Off-Path Packet Forwarding
122       9.4.  Loss Detection and Congestion Control
123       9.5.  Privacy Implications of Connection Migration
124       9.6.  Server's Preferred Address
125         9.6.1.  Communicating a Preferred Address
126         9.6.2.  Migration to a Preferred Address
127         9.6.3.  Interaction of Client Migration and Preferred
128                 Address
129       9.7.  Use of IPv6 Flow-Label and Migration
130     10. Connection Termination
131       10.1.  Idle Timeout
132         10.1.1.  Liveness Testing
133         10.1.2.  Deferring Idle Timeout
134       10.2.  Immediate Close
135         10.2.1.  Closing Connection State
136         10.2.2.  Draining Connection State
137         10.2.3.  Immediate Close During the Handshake
138       10.3.  Stateless Reset
139         10.3.1.  Detecting a Stateless Reset
140         10.3.2.  Calculating a Stateless Reset Token
141         10.3.3.  Looping
142     11. Error Handling
143       11.1.  Connection Errors
144       11.2.  Stream Errors

146     12. Packets and Frames
147       12.1.  Protected Packets
148       12.2.  Coalescing Packets
149       12.3.  Packet Numbers
150       12.4.  Frames and Frame Types
151     13. Packetization and Reliability
152       13.1.  Packet Processing
153       13.2.  Generating Acknowledgements
154         13.2.1.  Sending ACK Frames
155         13.2.2.  Acknowledgement Frequency
156         13.2.3.  Managing ACK Ranges
157         13.2.4.  Limiting Ranges by Tracking ACK Frames
158         13.2.5.  Measuring and Reporting Host Delay
159         13.2.6.  ACK Frames and Packet Protection
160         13.2.7.  PADDING Frames Consume Congestion Window
161       13.3.  Retransmission of Information
162       13.4.  Explicit Congestion Notification
163         13.4.1.  ECN Counts
164         13.4.2.  ECN Validation
165     14. Packet Size
166       14.1.  Initial Packet Size
167       14.2.  Path Maximum Transmission Unit
168         14.2.1.  Handling of ICMP Messages by PMTUD
169       14.3.  Datagram Packetization Layer PMTU Discovery
170         14.3.1.  DPLPMTUD and Initial Connectivity
171         14.3.2.  Validating the QUIC Path with DPLPMTUD
172         14.3.3.  Handling of ICMP Messages by DPLPMTUD
173       14.4.  Sending QUIC PMTU Probes
174         14.4.1.  PMTU Probes Containing Source Connection ID
175     15. Versions
176     16. Variable-Length Integer Encoding
177     17. Packet Formats
178       17.1.  Packet Number Encoding and Decoding
179       17.2.  Long Header Packets
180         17.2.1.  Version Negotiation Packet
181         17.2.2.  Initial Packet
182         17.2.3.  0-RTT
183         17.2.4.  Handshake Packet
184         17.2.5.  Retry Packet
185       17.3.  Short Header Packets
186         17.3.1.  Latency Spin Bit
187     18. Transport Parameter Encoding
188       18.1.  Reserved Transport Parameters
189       18.2.  Transport Parameter Definitions
190     19. Frame Types and Formats
191       19.1.  PADDING Frames
192       19.2.  PING Frames
193       19.3.  ACK Frames
194         19.3.1.  ACK Ranges
195         19.3.2.  ECN Counts
196       19.4.  RESET_STREAM Frames
197       19.5.  STOP_SENDING Frames
198       19.6.  CRYPTO Frames
199       19.7.  NEW_TOKEN Frames
200       19.8.  STREAM Frames
201       19.9.  MAX_DATA Frames
202       19.10. MAX_STREAM_DATA Frames
203       19.11. MAX_STREAMS Frames
204       19.12. DATA_BLOCKED Frames
205       19.13. STREAM_DATA_BLOCKED Frames
206       19.14. STREAMS_BLOCKED Frames
207       19.15. NEW_CONNECTION_ID Frames
208       19.16. RETIRE_CONNECTION_ID Frames
209       19.17. PATH_CHALLENGE Frames
210       19.18. PATH_RESPONSE Frames
211       19.19. CONNECTION_CLOSE Frames
212       19.20. HANDSHAKE_DONE Frames
213       19.21. Extension Frames
214     20. Error Codes
215       20.1.  Transport Error Codes
216       20.2.  Application Protocol Error Codes
217     21. Security Considerations
218       21.1.  Handshake Denial of Service
219       21.2.  Amplification Attack
220       21.3.  Optimistic ACK Attack
221       21.4.  Request Forgery Attacks
222         21.4.1.  Control Options for Endpoints
223         21.4.2.  Request Forgery with Client Initial Packets
224         21.4.3.  Request Forgery with Preferred Addresses
225         21.4.4.  Request Forgery with Spoofed Migration
226         21.4.5.  Generic Request Forgery Countermeasures
227       21.5.  Slowloris Attacks
228       21.6.  Stream Fragmentation and Reassembly Attacks
229       21.7.  Stream Commitment Attack
230       21.8.  Peer Denial of Service
231       21.9.  Explicit Congestion Notification Attacks
232       21.10. Stateless Reset Oracle
233       21.11. Version Downgrade
234       21.12. Targeted Attacks by Routing
235       21.13. Overview of Security Properties
236         21.13.1.  Handshake
237         21.13.2.  Protected Packets
238         21.13.3.  Connection Migration
239     22. IANA Considerations
240       22.1.  Registration Policies for QUIC Registries
241         22.1.1.  Provisional Registrations
242         22.1.2.  Selecting Codepoints
243         22.1.3.  Reclaiming Provisional Codepoints
244         22.1.4.  Permanent Registrations
245       22.2.  QUIC Transport Parameter Registry
246       22.3.  QUIC Frame Types Registry
247       22.4.  QUIC Transport Error Codes Registry
248     23. References
249       23.1.  Normative References
250       23.2.  Informative References
251     Appendix A.  Sample Packet Number Decoding Algorithm
252     Appendix B.  Sample ECN Validation Algorithm
253     Appendix C.  Change Log
254       C.1.  Since draft-ietf-quic-transport-29
255       C.2.  Since draft-ietf-quic-transport-28
256       C.3.  Since draft-ietf-quic-transport-27
257       C.4.  Since draft-ietf-quic-transport-26
258       C.5.  Since draft-ietf-quic-transport-25
259       C.6.  Since draft-ietf-quic-transport-24
260       C.7.  Since draft-ietf-quic-transport-23
261       C.8.  Since draft-ietf-quic-transport-22
262       C.9.  Since draft-ietf-quic-transport-21
263       C.10. Since draft-ietf-quic-transport-20
264       C.11. Since draft-ietf-quic-transport-19
265       C.12. Since draft-ietf-quic-transport-18
266       C.13. Since draft-ietf-quic-transport-17
267       C.14. Since draft-ietf-quic-transport-16
268       C.15. Since draft-ietf-quic-transport-15
269       C.16. Since draft-ietf-quic-transport-14
270       C.17. Since draft-ietf-quic-transport-13
271       C.18. Since draft-ietf-quic-transport-12
272       C.19. Since draft-ietf-quic-transport-11
273       C.20. Since draft-ietf-quic-transport-10
274       C.21. Since draft-ietf-quic-transport-09
275       C.22. Since draft-ietf-quic-transport-08
276       C.23. Since draft-ietf-quic-transport-07
277       C.24. Since draft-ietf-quic-transport-06
278       C.25. Since draft-ietf-quic-transport-05
279       C.26. Since draft-ietf-quic-transport-04
280       C.27. Since draft-ietf-quic-transport-03
281       C.28. Since draft-ietf-quic-transport-02
282       C.29. Since draft-ietf-quic-transport-01
283       C.30. Since draft-ietf-quic-transport-00
284       C.31. Since draft-hamilton-quic-transport-protocol-01
285     Contributors
286     Authors' Addresses

288  1.  Overview

290     QUIC is a multiplexed and secure general-purpose transport protocol
291     that provides:

293     *  Stream multiplexing

295     *  Stream- and connection-level flow control

297     *  Low-latency connection establishment

299     *  Connection migration and resilience to NAT rebinding

301     *  Authenticated and encrypted header and payload

303     QUIC establishes a connection, which is a stateful interaction
304     between a client and server. The primary purpose of a connection is
305     to support the structured exchange of data by an application
306     protocol.

308     Streams are means by which an application protocol exchanges
309     information. Streams are ordered sequences of bytes. Two types of
310     stream can be created: bidirectional streams, which allow both
311     endpoints to send data; and unidirectional streams, which allow a
312     single endpoint to send. A credit-based scheme is used to limit
313     stream creation and to bound the amount of data that can be sent.

315     The QUIC handshake combines negotiation of cryptographic and
316     transport parameters. The handshake is structured to permit the
317     exchange of application data as soon as possible. This includes an
318     option for clients to send data immediately (0-RTT), which might
319     require prior communication to enable.

321     QUIC connections are not strictly bound to a single network path.
322     Connection migration uses connection identifiers to allow connections
323     to transfer to a new network path.

325     Frames are used in QUIC to communicate between endpoints. One or
326     more frames are assembled into packets. QUIC authenticates all
327     packets and encrypts as much as is practical. QUIC packets are
328     carried in UDP datagrams ([UDP]) to better facilitate deployment in
329     existing systems and networks.

331     Once established, multiple options are provided for connection
332     termination. Applications can manage a graceful shutdown, endpoints
333     can negotiate a timeout period, errors can cause immediate connection
334     teardown, and a stateless mechanism provides for termination of
335     connections after one endpoint has lost state.

337  1.1.  Document Structure

339     This document describes the core QUIC protocol and is structured as
340     follows:

342     *  Streams are the basic service abstraction that QUIC provides.

344        -  Section 2 describes core concepts related to streams,

346        -  Section 3 provides a reference model for stream states, and

348        -  Section 4 outlines the operation of flow control.

350     *  Connections are the context in which QUIC endpoints communicate.

352        -  Section 5 describes core concepts related to connections,

354        -  Section 6 describes version negotiation,

356        -  Section 7 details the process for establishing connections,

358        -  Section 8 specifies critical denial of service mitigation
359           mechanisms,

361        -  Section 9 describes how endpoints migrate a connection to a new
362           network path,

364        -  Section 10 lists the options for terminating an open
365           connection, and

367        -  Section 11 provides general guidance for error handling.

369     *  Packets and frames are the basic unit used by QUIC to communicate.

371        -  Section 12 describes concepts related to packets and frames,

373        -  Section 13 defines models for the transmission, retransmission,
374           and acknowledgement of data, and

376        -  Section 14 specifies rules for managing the size of packets.

378     *  Finally, encoding details of QUIC protocol elements are described
379        in:

381        -  Section 15 (Versions),

383        -  Section 16 (Integer Encoding),

384        -  Section 17 (Packet Headers),

386        -  Section 18 (Transport Parameters),

388        -  Section 19 (Frames), and

390        -  Section 20 (Errors).

392     Accompanying documents describe QUIC's loss detection and congestion
393     control [QUIC-RECOVERY], and the use of TLS for key negotiation
394     [QUIC-TLS].

396     This document defines QUIC version 1, which conforms to the protocol
397     invariants in [QUIC-INVARIANTS].

399     To refer to QUIC version 1, cite this document. References to the
400     limited set of version-independent properties of QUIC can cite
401     [QUIC-INVARIANTS].

403  1.2.  Terms and Definitions

405     The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
406     "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
407     "OPTIONAL" in this document are to be interpreted as described in
408     BCP 14 [RFC2119] [RFC8174] when, and only when, they appear in all
409     capitals, as shown here.

411     Commonly used terms in the document are described below.

413     QUIC:  The transport protocol described by this document. QUIC is a
414        name, not an acronym.

416     QUIC packet:  A complete processable unit of QUIC that can be
417        encapsulated in a UDP datagram. Multiple QUIC packets can be
418        encapsulated in a single UDP datagram.

420     Ack-eliciting Packet:  A QUIC packet that contains frames other than
421        ACK, PADDING, and CONNECTION_CLOSE. These cause a recipient to
422        send an acknowledgment; see Section 13.2.1.

424     Endpoint:  An entity that can participate in a QUIC connection by
425        generating, receiving, and processing QUIC packets. There are
426        only two types of endpoint in QUIC: client and server.

428     Client:  The endpoint that initiates a QUIC connection.

430     Server:  The endpoint that accepts a QUIC connection.

432     Address:  When used without qualification, the tuple of IP version,
433        IP address, and UDP port number that represents one end of a
434        network path.

436     Connection ID:  An identifier that is used to identify a QUIC
437        connection at an endpoint. Each endpoint selects one or more
438        Connection IDs for its peer to include in packets sent towards the
439        endpoint. This value is opaque to the peer.

441     Stream:  A unidirectional or bidirectional channel of ordered bytes
442        within a QUIC connection. A QUIC connection can carry multiple
443        simultaneous streams.

445     Application:  An entity that uses QUIC to send and receive data.

447  1.3.  Notational Conventions

449     Packet and frame diagrams in this document use a custom format. The
450     purpose of this format is to summarize, not define, protocol
451     elements. Prose defines the complete semantics and details of
452     structures.

454     Complex fields are named and then followed by a list of fields
455     surrounded by a pair of matching braces. Each field in this list is
456     separated by commas.

458     Individual fields include length information, plus indications about
459     fixed value, optionality, or repetitions. Individual fields use the
460     following notational conventions, with all lengths in bits:

462     x (A):  Indicates that x is A bits long

464     x (i):  Indicates that x uses the variable-length encoding in
465        Section 16

467     x (A..B):  Indicates that x can be any length from A to B; A can be
468        omitted to indicate a minimum of zero bits and B can be omitted to
469        indicate no set upper limit; values in this format always end on
470        an octet boundary

472     x (?) = C:  Indicates that x has a fixed value of C with the length
473        described by ?, as above

475     x (?) = C..D:  Indicates that x has a value in the range from C to D,
476        inclusive, with the length described by ?, as above

478     [x (E)]:  Indicates that x is optional (and has length of E)

479     x (E) ...:  Indicates that x is repeated zero or more times (and that
480        each instance is length E)

482     This document uses network byte order (that is, big endian) values.
483     Fields are placed starting from the high-order bits of each byte.

485     By convention, individual fields reference a complex field by using
486     the name of the complex field.

488     For example:

490     Example Structure {
491       One-bit Field (1),
492       7-bit Field with Fixed Value (7) = 61,
493       Field with Variable-Length Integer (i),
494       Arbitrary-Length Field (..),
495       Variable-Length Field (8..24),
496       Field With Minimum Length (16..),
497       Field With Maximum Length (..128),
498       [Optional Field (64)],
499       Repeated Field (8) ...,
500     }

502                        Figure 1: Example Format

504  2.  Streams

506     Streams in QUIC provide a lightweight, ordered byte-stream
507     abstraction to an application. Streams can be unidirectional or
508     bidirectional.

510     Streams can be created by sending data. Other processes associated
511     with stream management - ending, cancelling, and managing flow
512     control - are all designed to impose minimal overheads. For
513     instance, a single STREAM frame (Section 19.8) can open, carry data
514     for, and close a stream. Streams can also be long-lived and can last
515     the entire duration of a connection.

517     Streams can be created by either endpoint, can concurrently send data
518     interleaved with other streams, and can be cancelled. QUIC does not
519     provide any means of ensuring ordering between bytes on different
520     streams.

522     QUIC allows for an arbitrary number of streams to operate
523     concurrently and for an arbitrary amount of data to be sent on any
524     stream, subject to flow control constraints and stream limits; see
525     Section 4.

527  2.1.  Stream Types and Identifiers

529     Streams can be unidirectional or bidirectional. Unidirectional
530     streams carry data in one direction: from the initiator of the stream
531     to its peer. Bidirectional streams allow for data to be sent in both
532     directions.

534     Streams are identified within a connection by a numeric value,
535     referred to as the stream ID. A stream ID is a 62-bit integer (0 to
536     2^62-1) that is unique for all streams on a connection. Stream IDs
537     are encoded as variable-length integers; see Section 16. A QUIC
538     endpoint MUST NOT reuse a stream ID within a connection.

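As an informal illustration of the stream ID rules in Section 2.1, the following Python sketch checks the 62-bit range and reads the two least significant bits of a stream ID, which carry the initiator and the directionality (the four combinations are the ones listed in Table 1). The names `MAX_STREAM_ID`, `StreamType`, `classify_stream_id`, `is_server_initiated`, and `is_unidirectional` are hypothetical; they are not defined by this document or taken from any QUIC library.

```python
from enum import IntEnum

# Assumption for illustration: stream IDs are plain Python ints.
MAX_STREAM_ID = 2**62 - 1  # a stream ID is a 62-bit integer (0 to 2^62-1)


class StreamType(IntEnum):
    """The four stream types, keyed by the two least significant bits."""
    CLIENT_BIDI = 0x0  # Client-Initiated, Bidirectional
    SERVER_BIDI = 0x1  # Server-Initiated, Bidirectional
    CLIENT_UNI = 0x2   # Client-Initiated, Unidirectional
    SERVER_UNI = 0x3   # Server-Initiated, Unidirectional


def classify_stream_id(stream_id: int) -> StreamType:
    """Return the stream type encoded in the low two bits of stream_id."""
    if not 0 <= stream_id <= MAX_STREAM_ID:
        raise ValueError("stream ID must be in the range 0 to 2^62-1")
    return StreamType(stream_id & 0x3)


def is_server_initiated(stream_id: int) -> bool:
    """Bit 0x1 set means the server initiated the stream."""
    return bool(stream_id & 0x1)


def is_unidirectional(stream_id: int) -> bool:
    """Bit 0x2 set means the stream is unidirectional."""
    return bool(stream_id & 0x2)
```

For example, `classify_stream_id(0)` yields `StreamType.CLIENT_BIDI`, and any value above 2^62-1 is rejected with `ValueError`. This sketch does not track which stream IDs have already been used, so it does not by itself enforce the rule that a stream ID is never reused within a connection.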
540     The least significant bit (0x1) of the stream ID identifies the
541     initiator of the stream. Client-initiated streams have even-numbered
542     stream IDs (with the bit set to 0), and server-initiated streams have
543     odd-numbered stream IDs (with the bit set to 1).

545     The second least significant bit (0x2) of the stream ID distinguishes
546     between bidirectional streams (with the bit set to 0) and
547     unidirectional streams (with the bit set to 1).

549     The two least significant bits from a stream ID therefore identify a
550     stream as one of four types, as summarized in Table 1.

552                  +======+==================================+
553                  | Bits | Stream Type                      |
554                  +======+==================================+
555                  | 0x0  | Client-Initiated, Bidirectional  |
556                  +------+----------------------------------+
557                  | 0x1  | Server-Initiated, Bidirectional  |
558                  +------+----------------------------------+
559                  | 0x2  | Client-Initiated, Unidirectional |
560                  +------+----------------------------------+
561                  | 0x3  | Server-Initiated, Unidirectional |
562                  +------+----------------------------------+

564                        Table 1: Stream ID Types

566     The stream space for each type begins at the minimum value (0x0
567     through 0x3 respectively); successive streams of each type are
568     created with numerically increasing stream IDs. A stream ID that is
569     used out of order results in all streams of that type with lower-
570     numbered stream IDs also being opened.

572  2.2.  Sending and Receiving Data

574     STREAM frames (Section 19.8) encapsulate data sent by an application.
575     An endpoint uses the Stream ID and Offset fields in STREAM frames to
576     place data in order.

578     Endpoints MUST be able to deliver stream data to an application as an
579     ordered byte-stream. Delivering an ordered byte-stream requires that
580     an endpoint buffer any data that is received out of order, up to the
581     advertised flow control limit.

583     QUIC makes no specific allowances for delivery of stream data out of
584     order. However, implementations MAY choose to offer the ability to
585     deliver data out of order to a receiving application.

587     An endpoint could receive data for a stream at the same stream offset
588     multiple times. Data that has already been received can be
589     discarded. The data at a given offset MUST NOT change if it is sent
590     multiple times; an endpoint MAY treat receipt of different data at
591     the same offset within a stream as a connection error of type
592     PROTOCOL_VIOLATION.

594     Streams are an ordered byte-stream abstraction with no other
595     structure visible to QUIC. STREAM frame boundaries are not expected
596     to be preserved when data is transmitted, retransmitted after packet
597     loss, or delivered to the application at a receiver.

599     An endpoint MUST NOT send data on any stream without ensuring that it
600     is within the flow control limits set by its peer. Flow control is
601     described in detail in Section 4.

603  2.3.  Stream Prioritization

605     Stream multiplexing can have a significant effect on application
606     performance if resources allocated to streams are correctly
607     prioritized.

609     QUIC does not provide a mechanism for exchanging prioritization
610     information. Instead, it relies on receiving priority information
611     from the application that uses QUIC.

613     A QUIC implementation SHOULD provide ways in which an application can
614     indicate the relative priority of streams. An implementation uses
615     information provided by the application to determine how to allocate
616     resources to active streams.

618  2.4.  Operations on Streams

620     This document does not define an API for QUIC, but instead defines a
621     set of functions on streams that application protocols can rely upon.
622     An application protocol can assume that an implementation of QUIC
623     provides an interface that includes the operations described in this
624     section. An implementation designed for use with a specific
625     application protocol might provide only those operations that are
626     used by that protocol.

628     On the sending part of a stream, an application protocol can:

630     *  write data, understanding when stream flow control credit
631        (Section 4.1) has successfully been reserved to send the written
632        data;

634     *  end the stream (clean termination), resulting in a STREAM frame
635        (Section 19.8) with the FIN bit set; and

637     *  reset the stream (abrupt termination), resulting in a RESET_STREAM
638        frame (Section 19.4) if the stream was not already in a terminal
639        state.

641     On the receiving part of a stream, an application protocol can:

643     *  read data; and

645     *  abort reading of the stream and request closure, possibly
646        resulting in a STOP_SENDING frame (Section 19.5).

648     An application protocol can also request to be informed of state
649     changes on streams, including when the peer has opened or reset a
650     stream, when a peer aborts reading on a stream, when new data is
651     available, and when data can or cannot be written to the stream due
652     to flow control.

654  3.  Stream States

656     This section describes streams in terms of their send or receive
657     components. Two state machines are described: one for the streams on
658     which an endpoint transmits data (Section 3.1), and another for
659     streams on which an endpoint receives data (Section 3.2).

661     Unidirectional streams use the applicable state machine directly.
662     Bidirectional streams use both state machines. For the most part,
663     the use of these state machines is the same whether the stream is
664     unidirectional or bidirectional. The conditions for opening a stream
665     are slightly more complex for a bidirectional stream because the
666     opening of either the send or receive side causes the stream to open
667     in both directions.

669     The state machines shown in this section are largely informative.
670 This document uses stream states to describe rules for when and how 671 different types of frames can be sent and the reactions that are 672 expected when different types of frames are received. Though these 673 state machines are intended to be useful in implementing QUIC, these 674 states are not intended to constrain implementations. An 675 implementation can define a different state machine as long as its 676 behavior is consistent with an implementation that implements these 677 states. 679 Note: In some cases, a single event or action can cause a transition 680 through multiple states. For instance, sending STREAM with a FIN 681 bit set can cause two state transitions for a sending stream: from 682 the Ready state to the Send state, and from the Send state to the 683 Data Sent state. 685 3.1. Sending Stream States 687 Figure 2 shows the states for the part of a stream that sends data to 688 a peer. 690 o 691 | Create Stream (Sending) 692 | Peer Creates Bidirectional Stream 693 v 694 +-------+ 695 | Ready | Send RESET_STREAM 696 | |-----------------------. 697 +-------+ | 698 | | 699 | Send STREAM / | 700 | STREAM_DATA_BLOCKED | 701 | | 702 | Peer Creates | 703 | Bidirectional Stream | 704 v | 705 +-------+ | 706 | Send | Send RESET_STREAM | 707 | |---------------------->| 708 +-------+ | 709 | | 710 | Send STREAM + FIN | 711 v v 712 +-------+ +-------+ 713 | Data | Send RESET_STREAM | Reset | 714 | Sent |------------------>| Sent | 715 +-------+ +-------+ 716 | | 717 | Recv All ACKs | Recv ACK 718 v v 719 +-------+ +-------+ 720 | Data | | Reset | 721 | Recvd | | Recvd | 722 +-------+ +-------+ 724 Figure 2: States for Sending Parts of Streams 726 The sending part of a stream that the endpoint initiates (types 0 and 727 2 for clients, 1 and 3 for servers) is opened by the application. 728 The "Ready" state represents a newly created stream that is able to 729 accept data from the application. 
Stream data might be buffered in 730 this state in preparation for sending. 732 Sending the first STREAM or STREAM_DATA_BLOCKED frame causes a 733 sending part of a stream to enter the "Send" state. An 734 implementation might choose to defer allocating a stream ID to a 735 stream until it sends the first STREAM frame and enters this state, 736 which can allow for better stream prioritization. 738 The sending part of a bidirectional stream initiated by a peer (type 739 0 for a server, type 1 for a client) starts in the "Ready" state when 740 the receiving part is created. 742 In the "Send" state, an endpoint transmits - and retransmits as 743 necessary - stream data in STREAM frames. The endpoint respects the 744 flow control limits set by its peer, and continues to accept and 745 process MAX_STREAM_DATA frames. An endpoint in the "Send" state 746 generates STREAM_DATA_BLOCKED frames if it is blocked from sending by 747 stream or connection flow control limits (Section 4.1). 749 After the application indicates that all stream data has been sent 750 and a STREAM frame containing the FIN bit is sent, the sending part 751 of the stream enters the "Data Sent" state. From this state, the 752 endpoint only retransmits stream data as necessary. The endpoint 753 does not need to check flow control limits or send 754 STREAM_DATA_BLOCKED frames for a stream in this state. 755 MAX_STREAM_DATA frames might be received until the peer receives the 756 final stream offset. The endpoint can safely ignore any 757 MAX_STREAM_DATA frames it receives from its peer for a stream in this 758 state. 760 Once all stream data has been successfully acknowledged, the sending 761 part of the stream enters the "Data Recvd" state, which is a terminal 762 state. 764 From any of the "Ready", "Send", or "Data Sent" states, an 765 application can signal that it wishes to abandon transmission of 766 stream data. Alternatively, an endpoint might receive a STOP_SENDING 767 frame from its peer.
In either case, the endpoint sends a 768 RESET_STREAM frame, which causes the stream to enter the "Reset Sent" 769 state. 771 An endpoint MAY send a RESET_STREAM as the first frame that mentions 772 a stream; this causes the sending part of that stream to open and 773 then immediately transition to the "Reset Sent" state. 775 Once a packet containing a RESET_STREAM has been acknowledged, the 776 sending part of the stream enters the "Reset Recvd" state, which is a 777 terminal state. 779 3.2. Receiving Stream States 781 Figure 3 shows the states for the part of a stream that receives data 782 from a peer. The states for a receiving part of a stream mirror only 783 some of the states of the sending part of the stream at the peer. 784 The receiving part of a stream does not track states on the sending 785 part that cannot be observed, such as the "Ready" state. Instead, 786 the receiving part of a stream tracks the delivery of data to the 787 application, some of which cannot be observed by the sender. 789 o 790 | Recv STREAM / STREAM_DATA_BLOCKED / RESET_STREAM 791 | Create Bidirectional Stream (Sending) 792 | Recv MAX_STREAM_DATA / STOP_SENDING (Bidirectional) 793 | Create Higher-Numbered Stream 794 v 795 +-------+ 796 | Recv | Recv RESET_STREAM 797 | |-----------------------. 
798 +-------+ | 799 | | 800 | Recv STREAM + FIN | 801 v | 802 +-------+ | 803 | Size | Recv RESET_STREAM | 804 | Known |---------------------->| 805 +-------+ | 806 | | 807 | Recv All Data | 808 v v 809 +-------+ Recv RESET_STREAM +-------+ 810 | Data |--- (optional) --->| Reset | 811 | Recvd | Recv All Data | Recvd | 812 +-------+<-- (optional) ----+-------+ 813 | | 814 | App Read All Data | App Read RST 815 v v 816 +-------+ +-------+ 817 | Data | | Reset | 818 | Read | | Read | 819 +-------+ +-------+ 821 Figure 3: States for Receiving Parts of Streams 823 The receiving part of a stream initiated by a peer (types 1 and 3 for 824 a client, or 0 and 2 for a server) is created when the first STREAM, 825 STREAM_DATA_BLOCKED, or RESET_STREAM frame is received for that 826 stream. For bidirectional streams initiated by a peer, receipt of a 827 MAX_STREAM_DATA or STOP_SENDING frame for the sending part of the 828 stream also creates the receiving part. The initial state for the 829 receiving part of a stream is "Recv". 831 The receiving part of a stream enters the "Recv" state when the 832 sending part of a bidirectional stream initiated by the endpoint 833 (type 0 for a client, type 1 for a server) enters the "Ready" state. 835 An endpoint opens a bidirectional stream when a MAX_STREAM_DATA or 836 STOP_SENDING frame is received from the peer for that stream. 837 Receiving a MAX_STREAM_DATA frame for an unopened stream indicates 838 that the remote peer has opened the stream and is providing flow 839 control credit. Receiving a STOP_SENDING frame for an unopened 840 stream indicates that the remote peer no longer wishes to receive 841 data on this stream. Either frame might arrive before a STREAM or 842 STREAM_DATA_BLOCKED frame if packets are lost or reordered. 844 Before a stream is created, all streams of the same type with lower- 845 numbered stream IDs MUST be created. This ensures that the creation 846 order for streams is consistent on both endpoints. 
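The rule that lower-numbered streams of the same type are created first can be sketched as follows. This is an illustrative sketch only: the names stream_type and implicitly_opened, and the next_unopened dictionary used for bookkeeping, are assumptions made for this example and are not part of this document.

```python
from typing import Dict, List, Tuple


def stream_type(stream_id: int) -> Tuple[str, str]:
    """Classify a stream ID by its two least significant bits (Table 1)."""
    initiator = "server" if stream_id & 0x1 else "client"
    direction = "unidirectional" if stream_id & 0x2 else "bidirectional"
    return initiator, direction


def implicitly_opened(stream_id: int, next_unopened: Dict[int, int]) -> List[int]:
    """Return every stream ID that becomes open when stream_id is first used.

    next_unopened maps the two type bits (0x0 through 0x3) to the lowest
    stream ID of that type that has not been opened yet. Stream IDs of one
    type are spaced 4 apart, starting at the value of the type bits.
    """
    type_bits = stream_id & 0x3
    start = next_unopened.get(type_bits, type_bits)
    opened = list(range(start, stream_id + 1, 4))
    if opened:
        next_unopened[type_bits] = stream_id + 4
    return opened
```

For example, if the first client-initiated bidirectional stream ID an endpoint sees is 8, streams 0, 4, and 8 are all treated as open; a later frame for stream 4 opens nothing new.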
848 In the "Recv" state, the endpoint receives STREAM and 849 STREAM_DATA_BLOCKED frames. Incoming data is buffered and can be 850 reassembled into the correct order for delivery to the application. 851 As data is consumed by the application and buffer space becomes 852 available, the endpoint sends MAX_STREAM_DATA frames to allow the 853 peer to send more data. 855 When a STREAM frame with a FIN bit is received, the final size of the 856 stream is known; see Section 4.4. The receiving part of the stream 857 then enters the "Size Known" state. In this state, the endpoint no 858 longer needs to send MAX_STREAM_DATA frames, it only receives any 859 retransmissions of stream data. 861 Once all data for the stream has been received, the receiving part 862 enters the "Data Recvd" state. This might happen as a result of 863 receiving the same STREAM frame that causes the transition to "Size 864 Known". After all data has been received, any STREAM or 865 STREAM_DATA_BLOCKED frames for the stream can be discarded. 867 The "Data Recvd" state persists until stream data has been delivered 868 to the application. Once stream data has been delivered, the stream 869 enters the "Data Read" state, which is a terminal state. 871 Receiving a RESET_STREAM frame in the "Recv" or "Size Known" states 872 causes the stream to enter the "Reset Recvd" state. This might cause 873 the delivery of stream data to the application to be interrupted. 875 It is possible that all stream data has already been received when a 876 RESET_STREAM is received (that is, in the "Data Recvd" state). 877 Similarly, it is possible for remaining stream data to arrive after 878 receiving a RESET_STREAM frame (the "Reset Recvd" state). An 879 implementation is free to manage this situation as it chooses. 881 Sending RESET_STREAM means that an endpoint cannot guarantee delivery 882 of stream data; however there is no requirement that stream data not 883 be delivered if a RESET_STREAM is received. 
An implementation MAY 884 interrupt delivery of stream data, discard any data that was not 885 consumed, and signal the receipt of the RESET_STREAM. A RESET_STREAM 886 signal might be suppressed or withheld if stream data is completely 887 received and is buffered to be read by the application. If the 888 RESET_STREAM is suppressed, the receiving part of the stream remains 889 in "Data Recvd". 891 Once the application receives the signal indicating that the stream 892 was reset, the receiving part of the stream transitions to the "Reset 893 Read" state, which is a terminal state. 895 3.3. Permitted Frame Types 897 The sender of a stream sends just three frame types that affect the 898 state of a stream at either sender or receiver: STREAM 899 (Section 19.8), STREAM_DATA_BLOCKED (Section 19.13), and RESET_STREAM 900 (Section 19.4). 902 A sender MUST NOT send any of these frames from a terminal state 903 ("Data Recvd" or "Reset Recvd"). A sender MUST NOT send a STREAM or 904 STREAM_DATA_BLOCKED frame for a stream in the "Reset Sent" state or 905 any terminal state, that is, after sending a RESET_STREAM frame. A 906 receiver could receive any of these three frames in any state, due to 907 the possibility of delayed delivery of packets carrying them. 909 The receiver of a stream sends MAX_STREAM_DATA (Section 19.10) and 910 STOP_SENDING frames (Section 19.5). 912 The receiver only sends MAX_STREAM_DATA in the "Recv" state. A 913 receiver MAY send STOP_SENDING in any state where it has not received 914 a RESET_STREAM frame; that is, states other than "Reset Recvd" or 915 "Reset Read". However, there is little value in sending a 916 STOP_SENDING frame in the "Data Recvd" state, since all stream data 917 has been received. A sender could receive either of these two frames 918 in any state as a result of delayed delivery of packets. 920 3.4. Bidirectional Stream States 922 A bidirectional stream is composed of sending and receiving parts.
923 Implementations may represent states of the bidirectional stream as 924 composites of sending and receiving stream states. The simplest 925 model presents the stream as "open" when either sending or receiving 926 parts are in a non-terminal state and "closed" when both sending and 927 receiving streams are in terminal states. 929 Table 2 shows a more complex mapping of bidirectional stream states 930 that loosely correspond to the stream states in HTTP/2 [HTTP2]. This 931 shows that multiple states on sending or receiving parts of streams 932 are mapped to the same composite state. Note that this is just one 933 possibility for such a mapping; this mapping requires that data is 934 acknowledged before the transition to a "closed" or "half-closed" 935 state. 937 +======================+======================+=================+ 938 | Sending Part | Receiving Part | Composite State | 939 +======================+======================+=================+ 940 | No Stream/Ready | No Stream/Recv *1 | idle | 941 +----------------------+----------------------+-----------------+ 942 | Ready/Send/Data Sent | Recv/Size Known | open | 943 +----------------------+----------------------+-----------------+ 944 | Ready/Send/Data Sent | Data Recvd/Data Read | half-closed | 945 | | | (remote) | 946 +----------------------+----------------------+-----------------+ 947 | Ready/Send/Data Sent | Reset Recvd/Reset | half-closed | 948 | | Read | (remote) | 949 +----------------------+----------------------+-----------------+ 950 | Data Recvd | Recv/Size Known | half-closed | 951 | | | (local) | 952 +----------------------+----------------------+-----------------+ 953 | Reset Sent/Reset | Recv/Size Known | half-closed | 954 | Recvd | | (local) | 955 +----------------------+----------------------+-----------------+ 956 | Reset Sent/Reset | Data Recvd/Data Read | closed | 957 | Recvd | | | 958 +----------------------+----------------------+-----------------+ 959 | Reset Sent/Reset | Reset 
Recvd/Reset | closed | 960 | Recvd | Read | | 961 +----------------------+----------------------+-----------------+ 962 | Data Recvd | Data Recvd/Data Read | closed | 963 +----------------------+----------------------+-----------------+ 964 | Data Recvd | Reset Recvd/Reset | closed | 965 | | Read | | 966 +----------------------+----------------------+-----------------+ 968 Table 2: Possible Mapping of Stream States to HTTP/2 970 Note (*1): A stream is considered "idle" if it has not yet been 971 created, or if the receiving part of the stream is in the "Recv" 972 state without yet having received any frames. 974 3.5. Solicited State Transitions 976 If an application is no longer interested in the data it is receiving 977 on a stream, it can abort reading the stream and specify an 978 application error code. 980 If the stream is in the "Recv" or "Size Known" states, the transport 981 SHOULD signal this by sending a STOP_SENDING frame to prompt closure 982 of the stream in the opposite direction. This typically indicates 983 that the receiving application is no longer reading data it receives 984 from the stream, but it is not a guarantee that incoming data will be 985 ignored. 987 STREAM frames received after sending a STOP_SENDING frame are still 988 counted toward connection and stream flow control, even though these 989 frames can be discarded upon receipt. 991 A STOP_SENDING frame requests that the receiving endpoint send a 992 RESET_STREAM frame. An endpoint that receives a STOP_SENDING frame 993 MUST send a RESET_STREAM frame if the stream is in the Ready or Send 994 state. If the stream is in the "Data Sent" state, an endpoint MAY 995 defer sending the RESET_STREAM frame until the packets containing 996 outstanding data are acknowledged or declared lost. If any 997 outstanding data is declared lost, the endpoint SHOULD send a 998 RESET_STREAM frame instead of retransmitting the data. 
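The reaction of a sending part to a STOP_SENDING frame, as described above, might be sketched as below. The SendState enumeration, the on_stop_sending function, and the returned action strings are hypothetical names chosen for this example. Returning "no_action" for the "Reset Sent" state and for terminal states is an assumption of this sketch, based on a RESET_STREAM having already been sent or all data having been acknowledged.

```python
from enum import Enum


class SendState(Enum):
    READY = "Ready"
    SEND = "Send"
    DATA_SENT = "Data Sent"
    DATA_RECVD = "Data Recvd"
    RESET_SENT = "Reset Sent"
    RESET_RECVD = "Reset Recvd"


def on_stop_sending(state: SendState, outstanding_data_lost: bool = False) -> str:
    """Choose how the sending part reacts to a received STOP_SENDING frame."""
    if state in (SendState.READY, SendState.SEND):
        # MUST send RESET_STREAM in the Ready or Send state.
        return "send_reset_stream"
    if state is SendState.DATA_SENT:
        # MAY defer until outstanding data is acknowledged or declared lost;
        # SHOULD send RESET_STREAM rather than retransmit lost data.
        if outstanding_data_lost:
            return "send_reset_stream"
        return "may_defer_reset_stream"
    # Assumption of this sketch: nothing further is needed once a
    # RESET_STREAM has been sent or all data has been acknowledged.
    return "no_action"
```

When the endpoint does send a RESET_STREAM frame, it would normally copy the application error code from the STOP_SENDING frame, as discussed in the text that follows.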
1000 An endpoint SHOULD copy the error code from the STOP_SENDING frame to 1001 the RESET_STREAM frame it sends, but MAY use any application error 1002 code. The endpoint that sends a STOP_SENDING frame MAY ignore the 1003 error code carried in any RESET_STREAM frame it receives. 1005 STOP_SENDING SHOULD only be sent for a stream that has not been reset 1006 by the peer. STOP_SENDING is most useful for streams in the "Recv" 1007 or "Size Known" states. 1009 An endpoint is expected to send another STOP_SENDING frame if a 1010 packet containing a previous STOP_SENDING is lost. However, once 1011 either all stream data or a RESET_STREAM frame has been received for 1012 the stream - that is, the stream is in any state other than "Recv" or 1013 "Size Known" - sending a STOP_SENDING frame is unnecessary. 1015 An endpoint that wishes to terminate both directions of a 1016 bidirectional stream can terminate one direction by sending a 1017 RESET_STREAM frame, and it can encourage prompt termination in the 1018 opposite direction by sending a STOP_SENDING frame. 1020 4. Flow Control 1022 It is necessary to limit the amount of data that a receiver could 1023 buffer, to prevent a fast sender from overwhelming a slow receiver, 1024 or to prevent a malicious sender from consuming a large amount of 1025 memory at a receiver. To enable a receiver to limit memory 1026 commitment to a connection and to apply back pressure on the sender, 1027 streams are flow controlled both individually and as an aggregate. A 1028 QUIC receiver controls the maximum amount of data the sender can send 1029 on a stream at any time, as described in Section 4.1 and Section 4.2. 1031 Similarly, to limit concurrency within a connection, a QUIC endpoint 1032 controls the maximum cumulative number of streams that its peer can 1033 initiate, as described in Section 4.5. 1035 Data sent in CRYPTO frames is not flow controlled in the same way as 1036 stream data.
QUIC relies on the cryptographic protocol 1037 implementation to avoid excessive buffering of data; see [QUIC-TLS]. 1038 The implementation SHOULD provide an interface to QUIC to tell it 1039 about its buffering limits so that there is not excessive buffering 1040 at multiple layers. 1042 4.1. Data Flow Control 1044 QUIC employs a limit-based flow-control scheme where a receiver 1045 advertises the limit of total bytes it is prepared to receive on a 1046 given stream or for the entire connection. This leads to two levels 1047 of data flow control in QUIC: 1049 * Stream flow control, which prevents a single stream from consuming 1050 the entire receive buffer for a connection by limiting the amount 1051 of data that can be sent on any stream. 1053 * Connection flow control, which prevents senders from exceeding a 1054 receiver's buffer capacity for the connection, by limiting the 1055 total bytes of stream data sent in STREAM frames on all streams. 1057 Senders MUST NOT send data in excess of either limit. 1059 A receiver sets initial limits for all streams by sending transport 1060 parameters during the handshake (Section 7.4). A receiver sends 1061 MAX_STREAM_DATA (Section 19.10) or MAX_DATA (Section 19.9) frames to 1062 the sender to advertise larger limits. 1064 A receiver can advertise a larger limit for a stream by sending a 1065 MAX_STREAM_DATA frame with the corresponding stream ID. A 1066 MAX_STREAM_DATA frame indicates the maximum absolute byte offset of a 1067 stream. A receiver could use the current offset of data consumed to 1068 determine the flow control offset to be advertised. 1070 A receiver can advertise a larger limit for a connection by sending a 1071 MAX_DATA frame, which indicates the maximum of the sum of the 1072 absolute byte offsets of all streams. A receiver maintains a 1073 cumulative sum of bytes received on all streams, which is used to 1074 check for violations of the advertised connection or stream data 1075 limits. 
A receiver might use a sum of bytes consumed on all streams 1076 to determine the maximum data limit to be advertised. 1078 Once a receiver advertises a limit for the connection or a stream, it 1079 MAY advertise a smaller limit, but this has no effect. 1081 A receiver MUST close the connection with a FLOW_CONTROL_ERROR error 1082 (Section 11) if the sender violates the advertised connection or 1083 stream data limits. 1085 A sender MUST ignore any MAX_STREAM_DATA or MAX_DATA frames that do 1086 not increase flow control limits. 1088 If a sender has sent data up to the limit, it will be unable to send 1089 new data and is considered blocked. A sender SHOULD send a 1090 STREAM_DATA_BLOCKED or DATA_BLOCKED frame to indicate it has data to 1091 write but is blocked by flow control limits. If a sender is blocked 1092 for a period longer than the idle timeout (Section 10.1), the 1093 receiver might close the connection even when the sender has data 1094 that is available for transmission. To keep the connection from 1095 closing, a sender that is flow control limited SHOULD periodically 1096 send a STREAM_DATA_BLOCKED or DATA_BLOCKED frame when it has no ack- 1097 eliciting packets in flight. 1099 4.2. Increasing Flow Control Limits 1101 Implementations decide when and how much credit to advertise in 1102 MAX_STREAM_DATA and MAX_DATA frames, but this section offers a few 1103 considerations. 1105 To avoid blocking a sender, a receiver MAY send a MAX_STREAM_DATA or 1106 MAX_DATA frame multiple times within a round trip or send it early 1107 enough to allow for recovery from loss of the frame. 1109 Control frames contribute to connection overhead. Therefore, 1110 frequently sending MAX_STREAM_DATA and MAX_DATA frames with small 1111 changes is undesirable. On the other hand, if updates are less 1112 frequent, larger increments to limits are necessary to avoid blocking 1113 a sender, requiring larger resource commitments at the receiver. 
1114 There is a trade-off between resource commitment and overhead when 1115 determining how large a limit is advertised. 1117 A receiver can use an autotuning mechanism to tune the frequency and 1118 amount of advertised additional credit based on a round-trip time 1119 estimate and the rate at which the receiving application consumes 1120 data, similar to common TCP implementations. As an optimization, an 1121 endpoint could send frames related to flow control only when there 1122 are other frames to send or when a peer is blocked, ensuring that 1123 flow control does not cause extra packets to be sent. 1125 A blocked sender is not required to send STREAM_DATA_BLOCKED or 1126 DATA_BLOCKED frames. Therefore, a receiver MUST NOT wait for a 1127 STREAM_DATA_BLOCKED or DATA_BLOCKED frame before sending a 1128 MAX_STREAM_DATA or MAX_DATA frame; doing so could result in the 1129 sender being blocked for the rest of the connection. Even if the 1130 sender sends these frames, waiting for them will result in the sender 1131 being blocked for at least an entire round trip. 1133 When a sender receives credit after being blocked, it might be able 1134 to send a large amount of data in response, resulting in short-term 1135 congestion; see Section 6.9 in [QUIC-RECOVERY] for a discussion of 1136 how a sender can avoid this congestion. 1138 4.3. Handling Stream Cancellation 1140 Endpoints need to eventually agree on the amount of flow control 1141 credit that has been consumed, to avoid either exceeding flow control 1142 limits or deadlocking. 1144 On receipt of a RESET_STREAM frame, an endpoint will tear down state 1145 for the matching stream and ignore further data arriving on that 1146 stream. 1148 RESET_STREAM terminates one direction of a stream abruptly. For a 1149 bidirectional stream, RESET_STREAM has no effect on data flow in the 1150 opposite direction. 
Both endpoints MUST maintain flow control state 1151 for the stream in the unterminated direction until that direction 1152 enters a terminal state, or until one of the endpoints sends 1153 CONNECTION_CLOSE. 1155 4.4. Stream Final Size 1157 The final size is the amount of flow control credit that is consumed 1158 by a stream. Assuming that every contiguous byte on the stream was 1159 sent once, the final size is the number of bytes sent. More 1160 generally, this is one higher than the offset of the byte with the 1161 largest offset sent on the stream, or zero if no bytes were sent. 1163 The final size of a stream is always signaled to the recipient. The 1164 final size is the sum of the Offset and Length fields of a STREAM 1165 frame with a FIN flag, noting that these fields might be implicit. 1166 Alternatively, the Final Size field of a RESET_STREAM frame carries 1167 this value. This ensures that all ways that a stream can be closed 1168 result in the number of bytes on the stream being reliably 1169 transmitted. This guarantees that both endpoints agree on how much 1170 flow control credit was consumed by the stream. 1172 An endpoint will know the final size for a stream when the receiving 1173 part of the stream enters the "Size Known" or "Reset Recvd" state 1174 (Section 3). The receiver MUST use the final size of the stream to 1175 account for all bytes sent on the stream in its connection level flow 1176 controller. 1178 An endpoint MUST NOT send data on a stream at or beyond the final 1179 size. 1181 Once a final size for a stream is known, it cannot change. If a 1182 RESET_STREAM or STREAM frame is received indicating a change in the 1183 final size for the stream, an endpoint SHOULD respond with a 1184 FINAL_SIZE_ERROR error; see Section 11. A receiver SHOULD treat 1185 receipt of data at or beyond the final size as a FINAL_SIZE_ERROR 1186 error, even after a stream is closed. 
Generating these errors is not 1187 mandatory, but only because requiring that an endpoint generate these 1188 errors also means that the endpoint needs to maintain the final size 1189 state for closed streams, which could mean a significant state 1190 commitment. 1192 4.5. Controlling Concurrency 1194 An endpoint limits the cumulative number of incoming streams a peer 1195 can open. Only streams with a stream ID less than (max_stream * 4 + 1196 initial_stream_id_for_type) can be opened; see Table 1. Initial 1197 limits are set in the transport parameters; see Section 18.2. 1198 Subsequent limits are advertised using MAX_STREAMS frames; see 1199 Section 19.11. Separate limits apply to unidirectional and 1200 bidirectional streams. 1202 If a max_streams transport parameter or MAX_STREAMS frame is received 1203 with a value greater than 2^60, this would allow a maximum stream ID 1204 that cannot be expressed as a variable-length integer; see 1205 Section 16. If either is received, the connection MUST be closed 1206 immediately with a connection error of type FRAME_ENCODING_ERROR; see 1207 Section 10.2. 1209 Endpoints MUST NOT exceed the limit set by their peer. An endpoint 1210 that receives a frame with a stream ID exceeding the limit it has 1211 sent MUST treat this as a connection error of type STREAM_LIMIT_ERROR 1212 (Section 11). 1214 Once a receiver advertises a stream limit using the MAX_STREAMS 1215 frame, advertising a smaller limit has no effect. A receiver MUST 1216 ignore any MAX_STREAMS frame that does not increase the stream limit. 1218 As with stream and connection flow control, this document leaves when 1219 and how many streams to advertise to a peer via MAX_STREAMS to 1220 implementations. Implementations might choose to increase limits as 1221 streams close to keep the number of streams available to peers 1222 roughly consistent. 
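The stream limit arithmetic in this section can be illustrated with a short sketch. The names StreamLimitError, FrameEncodingError, validate_max_streams, and check_stream_id are hypothetical; they stand in for however an implementation reports the STREAM_LIMIT_ERROR and FRAME_ENCODING_ERROR connection errors.

```python
MAX_STREAMS_CEILING = 1 << 60  # larger values yield stream IDs above 2^62-1


class FrameEncodingError(Exception):
    """Stand-in for a connection error of type FRAME_ENCODING_ERROR."""


class StreamLimitError(Exception):
    """Stand-in for a connection error of type STREAM_LIMIT_ERROR."""


def validate_max_streams(value: int) -> int:
    """Reject a received stream limit greater than 2^60."""
    if value > MAX_STREAMS_CEILING:
        raise FrameEncodingError("stream limit exceeds 2^60")
    return value


def check_stream_id(stream_id: int, max_streams: int) -> None:
    """Raise if stream_id is outside the limit advertised for its type.

    max_streams is the cumulative limit for streams of the same type as
    stream_id. The initial stream ID for a type equals its two type bits.
    """
    initial_stream_id_for_type = stream_id & 0x3
    if stream_id >= max_streams * 4 + initial_stream_id_for_type:
        raise StreamLimitError("stream ID %d exceeds limit" % stream_id)
```

With a limit of 2 for server-initiated unidirectional streams, stream IDs 3 and 7 are permitted and stream ID 11 is not.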
1224 An endpoint that is unable to open a new stream due to the peer's 1225 limits SHOULD send a STREAMS_BLOCKED frame (Section 19.14). This 1226 signal is considered useful for debugging. An endpoint MUST NOT wait 1227 to receive this signal before advertising additional credit, since 1228 doing so will mean that the peer will be blocked for at least an 1229 entire round trip, and potentially for longer if the peer chooses not 1230 to send STREAMS_BLOCKED frames. 1232 4.6. Flow Control Performance 1234 An endpoint that is unable to ensure that a peer has flow control 1235 credit on the order of the current BDP will have receive throughput 1236 limited by flow control. Lost packets can cause gaps in the receive 1237 buffer, delaying the application from consuming data and freeing up 1238 flow control window. 1240 Sending timely updates of flow control limits can improve 1241 performance. Sending packets only to provide flow control updates 1242 can increase network load and adversely affect performance. Sending 1243 flow control updates along with other frames, such as ACK frames, 1244 reduces the cost of those updates. 1246 5. Connections 1248 A QUIC connection is shared state between a client and a server. 1250 Each connection starts with a handshake phase, during which the two 1251 endpoints establish a shared secret using the cryptographic handshake 1252 protocol [QUIC-TLS] and negotiate the application protocol. The 1253 handshake (Section 7) confirms that both endpoints are willing to 1254 communicate (Section 8.1) and establishes parameters for the 1255 connection (Section 7.4). 1257 An application protocol can use the connection during the handshake 1258 phase with some limitations. 0-RTT allows application data to be 1259 sent by a client before receiving a response from the server. 1260 However, 0-RTT provides no protection against replay attacks; see 1261 Section 9.2 of [QUIC-TLS]. 
A server can also send application data 1262 to a client before it receives the final cryptographic handshake 1263 messages that allow it to confirm the identity and liveness of the 1264 client. These capabilities allow an application protocol to offer 1265 the option of trading some security guarantees for reduced latency. 1267 The use of connection IDs (Section 5.1) allows connections to migrate 1268 to a new network path, both as a direct choice of an endpoint and 1269 when forced by a change in a middlebox. Section 9 describes 1270 mitigations for the security and privacy issues associated with 1271 migration. 1273 For connections that are no longer needed or desired, there are 1274 several ways for a client and server to terminate a connection 1275 (Section 10). 1277 5.1. Connection ID 1279 Each connection possesses a set of connection identifiers, or 1280 connection IDs, each of which can identify the connection. 1281 Connection IDs are independently selected by endpoints; each endpoint 1282 selects the connection IDs that its peer uses. 1284 The primary function of a connection ID is to ensure that changes in 1285 addressing at lower protocol layers (UDP, IP) do not cause packets 1286 for a QUIC connection to be delivered to the wrong endpoint. Each 1287 endpoint selects connection IDs using an implementation-specific (and 1288 perhaps deployment-specific) method that will allow packets with that 1289 connection ID to be routed back to the endpoint and to be identified 1290 by the endpoint upon receipt. 1292 Connection IDs MUST NOT contain any information that can be used by 1293 an external observer (that is, one that does not cooperate with the 1294 issuer) to correlate them with other connection IDs for the same 1295 connection. As a trivial example, this means the same connection ID 1296 MUST NOT be issued more than once on the same connection. 1298 Packets with long headers include Source Connection ID and 1299 Destination Connection ID fields. 
These fields are used to set the 1300 connection IDs for new connections; see Section 7.2 for details. 1302 Packets with short headers (Section 17.3) only include the 1303 Destination Connection ID and omit the explicit length. The length 1304 of the Destination Connection ID field is expected to be known to 1305 endpoints. Endpoints using a load balancer that routes based on 1306 connection ID could agree with the load balancer on a fixed length 1307 for connection IDs, or agree on an encoding scheme. A fixed portion 1308 could encode an explicit length, which allows the entire connection 1309 ID to vary in length and still be used by the load balancer. 1311 A Version Negotiation (Section 17.2.1) packet echoes the connection 1312 IDs selected by the client, both to ensure correct routing toward the 1313 client and to demonstrate that the packet is in response to a packet 1314 sent by the client. 1316 A zero-length connection ID can be used when a connection ID is not 1317 needed to route to the correct endpoint. However, multiplexing 1318 connections on the same local IP address and port while using zero- 1319 length connection IDs will cause failures in the presence of peer 1320 connection migration, NAT rebinding, and client port reuse. An 1321 endpoint MUST NOT use the same IP address and port for multiple 1322 connections with zero-length connection IDs, unless it is certain 1323 that those protocol features are not in use. 1325 When an endpoint uses a non-zero-length connection ID, it needs to 1326 ensure that the peer has a supply of connection IDs from which to 1327 choose for packets sent to the endpoint. These connection IDs are 1328 supplied by the endpoint using the NEW_CONNECTION_ID frame 1329 (Section 19.15). 1331 5.1.1. Issuing Connection IDs 1333 Each Connection ID has an associated sequence number to assist in 1334 detecting when NEW_CONNECTION_ID or RETIRE_CONNECTION_ID frames refer 1335 to the same value. 
The initial connection ID issued by an endpoint 1336 is sent in the Source Connection ID field of the long packet header 1337 (Section 17.2) during the handshake. The sequence number of the 1338 initial connection ID is 0. If the preferred_address transport 1339 parameter is sent, the sequence number of the supplied connection ID 1340 is 1. 1342 Additional connection IDs are communicated to the peer using 1343 NEW_CONNECTION_ID frames (Section 19.15). The sequence number on 1344 each newly issued connection ID MUST increase by 1. The connection 1345 ID randomly selected by the client in the Initial packet and any 1346 connection ID provided by a Retry packet are not assigned sequence 1347 numbers unless a server opts to retain them as its initial connection 1348 ID. 1350 When an endpoint issues a connection ID, it MUST accept packets that 1351 carry this connection ID for the duration of the connection or until 1352 its peer invalidates the connection ID via a RETIRE_CONNECTION_ID 1353 frame (Section 19.16). Connection IDs that are issued and not 1354 retired are considered active; any active connection ID is valid for 1355 use with the current connection at any time, in any packet type. 1356 This includes the connection ID issued by the server via the 1357 preferred_address transport parameter. 1359 An endpoint SHOULD ensure that its peer has a sufficient number of 1360 available and unused connection IDs. Endpoints advertise the number 1361 of active connection IDs they are willing to maintain using the 1362 active_connection_id_limit transport parameter. An endpoint MUST NOT 1363 provide more connection IDs than the peer's limit. An endpoint MAY 1364 send connection IDs that temporarily exceed a peer's limit if the 1365 NEW_CONNECTION_ID frame also requires the retirement of any excess, 1366 by including a sufficiently large value in the Retire Prior To field. 
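The issuing rules above might be tracked as in the following Python sketch. It is not a definitive implementation: the class name, the use of os.urandom, and the default length of 8 bytes are assumptions made for illustration, and the first call to issue() stands in for the connection ID sent during the handshake (sequence number 0). The sketch does not model the temporary excess permitted when Retire Prior To is raised.

```python
import os


class ConnectionIdIssuer:
    """Illustrative bookkeeping for connection IDs an endpoint issues."""

    def __init__(self, peer_active_limit, cid_length=8):
        self.peer_active_limit = peer_active_limit  # peer's active_connection_id_limit
        self.cid_length = cid_length                # assumed to be non-zero
        self.active = {}                            # sequence number -> connection ID
        self.next_sequence = 0
        self._issued = set()                        # every value ever issued

    def can_issue(self):
        # Never provide more active connection IDs than the peer's limit.
        return len(self.active) < self.peer_active_limit

    def issue(self):
        if not self.can_issue():
            raise RuntimeError("peer's active_connection_id_limit reached")
        cid = os.urandom(self.cid_length)
        while cid in self._issued:  # the same value is never issued twice
            cid = os.urandom(self.cid_length)
        self._issued.add(cid)
        sequence = self.next_sequence
        self.next_sequence += 1  # sequence numbers increase by 1
        self.active[sequence] = cid
        return sequence, cid

    def on_retire(self, sequence):
        # The peer sent RETIRE_CONNECTION_ID for this sequence number.
        self.active.pop(sequence, None)
```

Random bytes are used here so that issued values carry nothing an observer could use to link them; a deployment that routes on connection ID would need its own encoding.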
1368 A NEW_CONNECTION_ID frame might cause an endpoint to add some active 1369 connection IDs and retire others based on the value of the Retire 1370 Prior To field. After processing a NEW_CONNECTION_ID frame and 1371 adding and retiring active connection IDs, if the number of active 1372 connection IDs exceeds the value advertised in its 1373 active_connection_id_limit transport parameter, an endpoint MUST 1374 close the connection with an error of type CONNECTION_ID_LIMIT_ERROR. 1376 An endpoint SHOULD supply a new connection ID when the peer retires a 1377 connection ID. If an endpoint provided fewer connection IDs than the 1378 peer's active_connection_id_limit, it MAY supply a new connection ID 1379 when it receives a packet with a previously unused connection ID. An 1380 endpoint MAY limit the total number of connection IDs issued for each 1381 connection to avoid the risk of running out of connection IDs; see 1382 Section 10.3.2. An endpoint MAY also limit the issuance of 1383 connection IDs to reduce the amount of per-path state it maintains, 1384 such as path validation status, as its peer might interact with it 1385 over as many paths as there are issued connection IDs. 1387 An endpoint that initiates migration and requires non-zero-length 1388 connection IDs SHOULD ensure that the pool of connection IDs 1389 available to its peer allows the peer to use a new connection ID on 1390 migration, as the peer will be unable to respond if the pool is 1391 exhausted. 1393 5.1.2. Consuming and Retiring Connection IDs 1395 An endpoint can change the connection ID it uses for a peer to 1396 another available one at any time during the connection. An endpoint 1397 consumes connection IDs in response to a migrating peer; see 1398 Section 9.5 for more. 1400 An endpoint maintains a set of connection IDs received from its peer, 1401 any of which it can use when sending packets. 
When the endpoint 1402 wishes to remove a connection ID from use, it sends a 1403 RETIRE_CONNECTION_ID frame to its peer. Sending a 1404 RETIRE_CONNECTION_ID frame indicates that the connection ID will not 1405 be used again and requests that the peer replace it with a new 1406 connection ID using a NEW_CONNECTION_ID frame. 1408 As discussed in Section 9.5, endpoints limit the use of a connection 1409 ID to packets sent from a single local address to a single 1410 destination address. Endpoints SHOULD retire connection IDs when 1411 they are no longer actively using either the local or destination 1412 address for which the connection ID was used. 1414 An endpoint might need to stop accepting previously issued connection 1415 IDs in certain circumstances. Such an endpoint can cause its peer to 1416 retire connection IDs by sending a NEW_CONNECTION_ID frame with an 1417 increased Retire Prior To field. The endpoint SHOULD continue to 1418 accept the previously issued connection IDs until they are retired by 1419 the peer. If the endpoint can no longer process the indicated 1420 connection IDs, it MAY close the connection. 1422 Upon receipt of an increased Retire Prior To field, the peer MUST 1423 stop using the corresponding connection IDs and retire them with 1424 RETIRE_CONNECTION_ID frames before adding the newly provided 1425 connection ID to the set of active connection IDs. This ordering 1426 allows an endpoint to replace all active connection IDs without the 1427 possibility of a peer having no available connection IDs and without 1428 exceeding the limit the peer sets in the active_connection_id_limit 1429 transport parameter; see Section 18.2. Failure to cease using the 1430 connection IDs when requested can result in connection failures, as 1431 the issuing endpoint might be unable to continue using the connection 1432 IDs with the active connection. 
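The ordering described above (retire first, then add) can be sketched in Python as follows. All names are hypothetical, and the sketch is a simplification: it does not handle duplicate or reordered NEW_CONNECTION_ID frames, and it records the sequence numbers to retire rather than building RETIRE_CONNECTION_ID frames.

```python
class ConnectionIdLimitError(Exception):
    """Hypothetical stand-in for a CONNECTION_ID_LIMIT_ERROR connection error."""


class PeerConnectionIds:
    """Illustrative set of connection IDs received from a peer."""

    def __init__(self, active_connection_id_limit, initial_cid):
        self.limit = active_connection_id_limit
        self.active = {0: initial_cid}  # sequence number -> connection ID
        self.retire_prior_to = 0
        self.pending_retire = []        # sequence numbers awaiting RETIRE_CONNECTION_ID

    def on_new_connection_id(self, sequence, cid, retire_prior_to):
        # Step 1: stop using and retire everything below an increased
        # Retire Prior To value before adding the new connection ID.
        if retire_prior_to > self.retire_prior_to:
            self.retire_prior_to = retire_prior_to
            for old in sorted(s for s in self.active if s < retire_prior_to):
                del self.active[old]
                self.pending_retire.append(old)
        # Step 2: add the newly provided connection ID.
        self.active[sequence] = cid
        # Step 3: enforce the limit this endpoint advertised.
        if len(self.active) > self.limit:
            raise ConnectionIdLimitError(len(self.active))
```

Because retirement happens before the addition, a peer can replace every active connection ID with a single frame without pushing the receiver over its advertised limit.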
1434 An endpoint SHOULD limit the number of connection IDs it has retired 1435 locally for which RETIRE_CONNECTION_ID frames have not yet been acknowledged. An endpoint SHOULD allow 1436 for sending and tracking a number of RETIRE_CONNECTION_ID frames of 1437 at least twice the value of the active_connection_id_limit transport parameter. An endpoint MUST NOT 1438 forget a connection ID without retiring it, though it MAY choose to 1439 treat having connection IDs in need of retirement that exceed this 1440 limit as a connection error of type CONNECTION_ID_LIMIT_ERROR. 1442 Endpoints SHOULD NOT issue updates of the Retire Prior To field 1443 before receiving RETIRE_CONNECTION_ID frames that retire all 1444 connection IDs indicated by the previous Retire Prior To value. 1446 5.2. Matching Packets to Connections 1448 Incoming packets are classified on receipt. Packets can either be 1449 associated with an existing connection, or - for servers - 1450 potentially create a new connection. 1452 Endpoints try to associate a packet with an existing connection. If 1453 the packet has a non-zero-length Destination Connection ID 1454 corresponding to an existing connection, QUIC processes that packet 1455 accordingly. Note that more than one connection ID can be associated 1456 with a connection; see Section 5.1. 1458 If the Destination Connection ID is zero length and the addressing 1459 information in the packet matches the addressing information the 1460 endpoint uses to identify a connection with a zero-length connection 1461 ID, QUIC processes the packet as part of that connection. An 1462 endpoint can use just destination IP and port or both source and 1463 destination addresses for identification, though this makes 1464 connections fragile as described in Section 5.1. 1466 Endpoints can send a Stateless Reset (Section 10.3) for any packets 1467 that cannot be attributed to an existing connection. A stateless 1468 reset allows a peer to more quickly identify when a connection 1469 becomes unusable.
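One way to picture the classification above is the following Python sketch. The class and method names are hypothetical, connections are represented by arbitrary objects, and the sketch assumes the endpoint identifies zero-length connection ID connections by destination (local) IP address and port only.

```python
class PacketDemux:
    """Illustrative lookup from an incoming packet to a connection."""

    def __init__(self):
        self.by_cid = {}      # Destination Connection ID (bytes) -> connection
        self.by_address = {}  # (local IP, local port) -> connection

    def register_cid(self, cid, connection):
        # More than one connection ID can map to the same connection.
        self.by_cid[cid] = connection

    def register_zero_length(self, local_address, connection):
        self.by_address[local_address] = connection

    def match(self, dcid, local_address):
        """Return the matching connection, or None if there is none."""
        if dcid:  # non-zero-length Destination Connection ID
            return self.by_cid.get(dcid)
        # Zero-length connection ID: fall back to addressing information.
        return self.by_address.get(local_address)
```

A result of None is the case in which an endpoint can send a Stateless Reset or, for a server, consider whether the packet creates a new connection.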
1471 Packets that are matched to an existing connection are discarded if 1472 the packets are inconsistent with the state of that connection. For 1473 example, packets are discarded if they indicate a different protocol 1474 version than that of the connection, or if the removal of packet 1475 protection is unsuccessful once the expected keys are available. 1477 Invalid packets that lack strong integrity protection, such as 1478 Initial, Retry, or Version Negotiation, MAY be discarded. An 1479 endpoint MUST generate a connection error if processing the contents 1480 of these packets prior to discovering an error resulted in changes to 1481 connection state that cannot be reverted. 1483 5.2.1. Client Packet Handling 1485 Valid packets sent to clients always include a Destination Connection 1486 ID that matches a value the client selects. Clients that choose to 1487 receive zero-length connection IDs can use the local address and port 1488 to identify a connection. Packets that do not match an existing 1489 connection, based on Destination Connection ID or, if this value is 1490 zero-length, local IP address and port, are discarded. 1492 Due to packet reordering or loss, a client might receive packets for 1493 a connection that are encrypted with a key it has not yet computed. 1494 The client MAY drop these packets, or MAY buffer them in anticipation 1495 of later packets that allow it to compute the key. 1497 If a client receives a packet that has an unsupported version, it 1498 MUST discard that packet. 1500 5.2.2. Server Packet Handling 1502 If a server receives a packet that indicates an unsupported version 1503 but is large enough to initiate a new connection for any supported 1504 version, the server SHOULD send a Version Negotiation packet as 1505 described in Section 6.1. A server MAY limit the number of packets 1506 to which it responds with a Version Negotiation packet. Servers MUST 1507 drop smaller packets that specify unsupported versions. 
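The size check for unsupported versions can be sketched as below. This is a sketch under stated assumptions: the function name and return strings are hypothetical, and the 1200-byte default assumes that the smallest datagram able to initiate a connection in any supported version is the minimum that Section 14.1 sets for this version.

```python
MIN_INITIAL_DATAGRAM_SIZE = 1200  # assumed minimum for this version; see Section 14.1


def classify_by_version(version, datagram_size, supported_versions,
                        min_size=MIN_INITIAL_DATAGRAM_SIZE):
    """Decide how a server treats a packet based on its version field."""
    if version in supported_versions:
        return "process"              # continue with normal packet handling
    if datagram_size >= min_size:
        return "version_negotiation"  # SHOULD respond; the server MAY rate limit
    return "drop"                     # smaller packets MUST be dropped
```

For example, classify_by_version(0x0a0a0a0a, 1200, {0x00000001}) selects a Version Negotiation response, while the same version in a 100-byte datagram is dropped.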
1509 The first packet for an unsupported version can use different 1510 semantics and encodings for any version-specific field. In 1511 particular, different packet protection keys might be used for 1512 different versions. Servers that do not support a particular version 1513 are unlikely to be able to decrypt the payload of the packet or 1514 properly interpret the result. Servers SHOULD respond with a Version 1515 Negotiation packet, provided that the datagram is sufficiently long. 1517 Packets with a supported version, or no version field, are matched to 1518 a connection using the connection ID or - for packets with zero- 1519 length connection IDs - the local address and port. These packets 1520 are processed using the selected connection; otherwise, the server 1521 continues below. 1523 If the packet is an Initial packet fully conforming with the 1524 specification, the server proceeds with the handshake (Section 7). 1525 This commits the server to the version that the client selected. 1527 If a server refuses to accept a new connection, it SHOULD send an 1528 Initial packet containing a CONNECTION_CLOSE frame with error code 1529 CONNECTION_REFUSED. 1531 If the packet is a 0-RTT packet, the server MAY buffer a limited 1532 number of these packets in anticipation of a late-arriving Initial 1533 packet. Clients are not able to send Handshake packets prior to 1534 receiving a server response, so servers SHOULD ignore any such 1535 packets. 1537 Servers MUST drop incoming packets under all other circumstances. 1539 5.2.3. Considerations for Simple Load Balancers 1541 A server deployment could load balance among servers using only 1542 source and destination IP addresses and ports. Changes to the 1543 client's IP address or port could result in packets being forwarded 1544 to the wrong server. Such a server deployment could use one of the 1545 following methods for connection continuity when a client's address 1546 changes. 
1548 * Servers could use an out-of-band mechanism to forward packets to 1549 the correct server based on Connection ID. 1551 * If servers can use a dedicated server IP address or port, other 1552 than the one that the client initially connects to, they could use 1553 the preferred_address transport parameter to request that clients 1554 move connections to that dedicated address. Note that clients 1555 could choose not to use the preferred address. 1557 A server in a deployment that does not implement a solution to 1558 maintain connection continuity when the client address changes SHOULD 1559 indicate migration is not supported using the 1560 disable_active_migration transport parameter. The 1561 disable_active_migration transport parameter does not prohibit 1562 connection migration after a client has acted on a preferred_address 1563 transport parameter. 1565 Server deployments that use this simple form of load balancing MUST 1566 avoid the creation of a stateless reset oracle; see Section 21.10. 1568 5.3. Operations on Connections 1570 This document does not define an API for QUIC, but instead defines a 1571 set of functions for QUIC connections that application protocols can 1572 rely upon. An application protocol can assume that an implementation 1573 of QUIC provides an interface that includes the operations described 1574 in this section. An implementation designed for use with a specific 1575 application protocol might provide only those operations that are 1576 used by that protocol. 1578 When implementing the client role, an application protocol can: 1580 * open a connection, which begins the exchange described in 1581 Section 7; 1583 * enable Early Data when available; and 1585 * be informed when Early Data has been accepted or rejected by a 1586 server. 
1588 When implementing the server role, an application protocol can: 1590 * listen for incoming connections, which prepares for the exchange 1591 described in Section 7; 1593 * if Early Data is supported, embed application-controlled data in 1594 the TLS resumption ticket sent to the client; and 1596 * if Early Data is supported, retrieve application-controlled data 1597 from the client's resumption ticket and enable rejecting Early 1598 Data based on that information. 1600 In either role, an application protocol can: 1602 * configure minimum values for the initial number of permitted 1603 streams of each type, as communicated in the transport parameters 1604 (Section 7.4); 1606 * control resource allocation of various types, including flow 1607 control and the number of permitted streams of each type; 1609 * identify whether the handshake has completed successfully or is 1610 still ongoing; 1612 * keep a connection from silently closing, either by generating PING 1613 frames (Section 19.2) or by requesting that the transport send 1614 additional frames before the idle timeout expires (Section 10.1); 1615 and 1617 * immediately close (Section 10.2) the connection. 1619 6. Version Negotiation 1621 Version negotiation allows a server to indicate that it does not 1622 support the version the client used. A server sends a Version 1623 Negotiation packet in response to each packet that might initiate a 1624 new connection; see Section 5.2 for details. 1626 The size of the first packet sent by a client will determine whether 1627 a server sends a Version Negotiation packet. Clients that support 1628 multiple QUIC versions SHOULD pad the first UDP datagram they send to 1629 the largest of the minimum datagram sizes from all versions they 1630 support. This ensures that the server responds if there is a 1631 mutually supported version. 
A server might not send a Version 1632 Negotiation packet if the datagram it receives is smaller than the 1633 minimum size specified in a different version; see Section 14.1. 1635 6.1. Sending Version Negotiation Packets 1637 If the version selected by the client is not acceptable to the 1638 server, the server responds with a Version Negotiation packet; see 1639 Section 17.2.1. This includes a list of versions that the server 1640 will accept. An endpoint MUST NOT send a Version Negotiation packet 1641 in response to receiving a Version Negotiation packet. 1643 This system allows a server to process packets with unsupported 1644 versions without retaining state. Though either the Initial packet 1645 or the Version Negotiation packet that is sent in response could be 1646 lost, the client will send new packets until it successfully receives 1647 a response or it abandons the connection attempt. A client that abandons the attempt 1648 discards all state for the connection and does not send any 1649 more packets on the connection. 1651 A server MAY limit the number of Version Negotiation packets it 1652 sends. For instance, a server that is able to recognize packets as 1653 0-RTT might choose not to send Version Negotiation packets in 1654 response to 0-RTT packets with the expectation that it will 1655 eventually receive an Initial packet. 1657 6.2. Handling Version Negotiation Packets 1659 Version Negotiation packets are designed to allow future versions of 1660 QUIC to negotiate the version in use between endpoints. Future 1661 versions of QUIC might change how implementations that support 1662 multiple versions of QUIC react to Version Negotiation packets when 1663 attempting to establish a connection using this version. 1665 A client that supports only this version of QUIC MUST abandon the 1666 current connection attempt if it receives a Version Negotiation 1667 packet, with the following two exceptions.
A client MUST discard any 1668 Version Negotiation packet if it has received and successfully 1669 processed any other packet, including an earlier Version Negotiation 1670 packet. A client MUST discard a Version Negotiation packet that 1671 lists the QUIC version selected by the client. 1673 How to perform version negotiation is left as future work defined by 1674 future versions of QUIC. In particular, that future work will ensure 1675 robustness against version downgrade attacks; see Section 21.11. 1677 6.2.1. Version Negotiation Between Draft Versions 1679 [[RFC editor: please remove this section before publication.]] 1680 When a draft implementation receives a Version Negotiation packet, it 1681 MAY use it to attempt a new connection with one of the versions 1682 listed in the packet, instead of abandoning the current connection 1683 attempt; see Section 6.2. 1685 The client MUST check that the Destination and Source Connection ID 1686 fields match the Source and Destination Connection ID fields in a 1687 packet that the client sent. If this check fails, the packet MUST be 1688 discarded. 1690 Once the Version Negotiation packet is determined to be valid, the 1691 client then selects an acceptable protocol version from the list 1692 provided by the server. The client then attempts to create a new 1693 connection using that version. The new connection MUST use a new 1694 random Destination Connection ID different from the one it had 1695 previously sent. 1697 Note that this mechanism does not protect against downgrade attacks 1698 and MUST NOT be used outside of draft implementations. 1700 6.3. Using Reserved Versions 1702 For a server to use a new version in the future, clients need to 1703 correctly handle unsupported versions. Some version numbers 1704 (0x?a?a?a?a as defined in Section 15) are reserved for inclusion in 1705 fields that contain version numbers. 
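The reserved pattern 0x?a?a?a?a can be tested with a bit mask, as in this short Python sketch. The function names are hypothetical; the mask simply checks that the low four bits of every byte of the 32-bit version are 1010 (hex 0xa).

```python
import secrets


def is_reserved_version(version):
    """True if a 32-bit version number matches the pattern 0x?a?a?a?a."""
    return (version & 0x0F0F0F0F) == 0x0A0A0A0A


def random_reserved_version():
    """Pick a reserved version with random high nibbles."""
    return (secrets.randbits(32) & 0xF0F0F0F0) | 0x0A0A0A0A
```

An endpoint that wants to exercise its peer's handling of unknown versions could draw a value from random_reserved_version() each time rather than always sending the same one.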
1707 Endpoints MAY add reserved versions to any field where unknown or 1708 unsupported versions are ignored to test that a peer correctly 1709 ignores the value. For instance, an endpoint could include a 1710 reserved version in a Version Negotiation packet; see Section 17.2.1. 1711 Endpoints MAY send packets with a reserved version to test that a 1712 peer correctly discards the packet. 1714 7. Cryptographic and Transport Handshake 1716 QUIC relies on a combined cryptographic and transport handshake to 1717 minimize connection establishment latency. QUIC uses the CRYPTO 1718 frame (Section 19.6) to transmit the cryptographic handshake. 1719 Version 0x00000001 of QUIC uses TLS as described in [QUIC-TLS]; a 1720 different QUIC version number could indicate that a different 1721 cryptographic handshake protocol is in use. 1723 QUIC provides reliable, ordered delivery of the cryptographic 1724 handshake data. QUIC packet protection is used to encrypt as much of 1725 the handshake protocol as possible. The cryptographic handshake MUST 1726 provide the following properties: 1728 * authenticated key exchange, where 1730 - a server is always authenticated, 1732 - a client is optionally authenticated, 1734 - every connection produces distinct and unrelated keys, and 1736 - keying material is usable for packet protection for both 0-RTT 1737 and 1-RTT packets 1739 * authenticated values for transport parameters of both endpoints, 1740 and confidentiality protection for server transport parameters 1741 (see Section 7.4) 1743 * authenticated negotiation of an application protocol (TLS uses 1744 ALPN [RFC7301] for this purpose) 1746 An endpoint verifies support for Explicit Congestion Notification 1747 (ECN) by observing whether the ACK frames acknowledging the first 1748 packets it sends carry ECN counts, as described in Section 13.4.2. 1750 The CRYPTO frame can be sent in different packet number spaces 1751 (Section 12.3). 
The offsets used by CRYPTO frames to ensure ordered 1752 delivery of cryptographic handshake data start from zero in each 1753 packet number space. 1755 Figure 4 shows a simplified handshake and the exchange of packets and 1756 frames that are used to advance the handshake. Exchange of 1757 application data during the handshake is enabled where possible, 1758 shown with a '*'. Once completed, endpoints are able to exchange 1759 application data. 1761 Client Server 1763 Initial (CRYPTO) 1764 0-RTT (*) ----------> 1765 Initial (CRYPTO) 1766 Handshake (CRYPTO) 1767 <---------- 1-RTT (*) 1768 Handshake (CRYPTO) 1769 1-RTT (*) ----------> 1770 <---------- 1-RTT (HANDSHAKE_DONE,*) 1772 1-RTT (*) <=========> 1-RTT (*) 1774 Figure 4: Simplified QUIC Handshake 1776 Endpoints MUST explicitly negotiate an application protocol. This 1777 avoids situations where there is a disagreement about the protocol 1778 that is in use. 1780 7.1. Example Handshake Flows 1782 Details of how TLS is integrated with QUIC are provided in 1783 [QUIC-TLS], but some examples are provided here. An extension of 1784 this exchange to support client address validation is shown in 1785 Section 8.1.2. 1787 Once any address validation exchanges are complete, the cryptographic 1788 handshake is used to agree on cryptographic keys. The cryptographic 1789 handshake is carried in Initial (Section 17.2.2) and Handshake 1790 (Section 17.2.4) packets. 1792 Figure 5 provides an overview of the 1-RTT handshake. Each line 1793 shows a QUIC packet with the packet type and packet number shown 1794 first, followed by the frames that are typically contained in those 1795 packets. So, for instance the first packet is of type Initial, with 1796 packet number 0, and contains a CRYPTO frame carrying the 1797 ClientHello. 1799 Multiple QUIC packets - even of different packet types - can be 1800 coalesced into a single UDP datagram; see Section 12.2. 
As a result, 1801 this handshake may consist of as few as 4 UDP datagrams, or any 1802 number more (subject to limits inherent to the protocol, such as 1803 congestion control or anti-amplification). For instance, the 1804 server's first flight contains Initial packets, Handshake packets, 1805 and "0.5-RTT data" in 1-RTT packets with a short header. 1807 Client Server 1809 Initial[0]: CRYPTO[CH] -> 1811 Initial[0]: CRYPTO[SH] ACK[0] 1812 Handshake[0]: CRYPTO[EE, CERT, CV, FIN] 1813 <- 1-RTT[0]: STREAM[1, "..."] 1815 Initial[1]: ACK[0] 1816 Handshake[0]: CRYPTO[FIN], ACK[0] 1817 1-RTT[0]: STREAM[0, "..."], ACK[0] -> 1819 Handshake[1]: ACK[0] 1820 <- 1-RTT[1]: HANDSHAKE_DONE, STREAM[3, "..."], ACK[0] 1822 Figure 5: Example 1-RTT Handshake 1824 Figure 6 shows an example of a connection with a 0-RTT handshake and 1825 a single packet of 0-RTT data. Note that as described in 1826 Section 12.3, the server acknowledges 0-RTT data in 1-RTT packets, 1827 and the client sends 1-RTT packets in the same packet number space. 1829 Client Server 1831 Initial[0]: CRYPTO[CH] 1832 0-RTT[0]: STREAM[0, "..."] -> 1834 Initial[0]: CRYPTO[SH] ACK[0] 1835 Handshake[0] CRYPTO[EE, FIN] 1836 <- 1-RTT[0]: STREAM[1, "..."] ACK[0] 1838 Initial[1]: ACK[0] 1839 Handshake[0]: CRYPTO[FIN], ACK[0] 1840 1-RTT[1]: STREAM[0, "..."] ACK[0] -> 1842 Handshake[1]: ACK[0] 1843 <- 1-RTT[1]: HANDSHAKE_DONE, STREAM[3, "..."], ACK[1] 1845 Figure 6: Example 0-RTT Handshake 1847 7.2. Negotiating Connection IDs 1849 A connection ID is used to ensure consistent routing of packets, as 1850 described in Section 5.1. The long header contains two connection 1851 IDs: the Destination Connection ID is chosen by the recipient of the 1852 packet and is used to provide consistent routing; the Source 1853 Connection ID is used to set the Destination Connection ID used by 1854 the peer. 1856 During the handshake, packets with the long header (Section 17.2) are 1857 used to establish the connection IDs used by both endpoints. 
Each 1858 endpoint uses the Source Connection ID field to specify the 1859 connection ID that is used in the Destination Connection ID field of 1860 packets being sent to it. After processing the first Initial 1861 packet, each endpoint sets the Destination Connection ID field in 1862 subsequent packets it sends to the value of the Source Connection ID 1863 field that it received. 1865 When an Initial packet is sent by a client that has not previously 1866 received an Initial or Retry packet from the server, the client 1867 populates the Destination Connection ID field with an unpredictable 1868 value. This Destination Connection ID MUST be at least 8 bytes in 1869 length. Until a packet is received from the server, the client MUST 1870 use the same Destination Connection ID value on all packets in this 1871 connection. 1873 The Destination Connection ID field from the first Initial packet 1874 sent by a client is used to determine packet protection keys for 1875 Initial packets. These keys change after receiving a Retry packet; 1876 see Section 5.2 of [QUIC-TLS]. 1878 The client populates the Source Connection ID field with a value of 1879 its choosing and sets the Source Connection ID Length field to 1880 indicate the length. 1882 The first flight of 0-RTT packets uses the same Destination Connection 1883 ID and Source Connection ID values as the client's first Initial 1884 packet. 1886 Upon first receiving an Initial or Retry packet from the server, the 1887 client uses the Source Connection ID supplied by the server as the 1888 Destination Connection ID for subsequent packets, including any 0-RTT 1889 packets. This means that a client might have to change the 1890 connection ID it sets in the Destination Connection ID field twice 1891 during connection establishment: once in response to a Retry, and 1892 once in response to an Initial packet from the server.
Once a client 1893 has received a valid Initial packet from the server, it MUST discard 1894 any subsequent packet it receives with a different Source Connection 1895 ID. 1897 A client MUST change the Destination Connection ID it uses for 1898 sending packets in response to only the first received Initial or 1899 Retry packet. A server MUST set the Destination Connection ID it 1900 uses for sending packets based on the first received Initial packet. 1901 Any further changes to the Destination Connection ID are only 1902 permitted if the values are taken from NEW_CONNECTION_ID frames; if 1903 subsequent Initial packets include a different Source Connection ID, 1904 they MUST be discarded. This avoids unpredictable outcomes that 1905 might otherwise result from stateless processing of multiple Initial 1906 packets with different Source Connection IDs. 1908 The Destination Connection ID that an endpoint sends can change over 1909 the lifetime of a connection, especially in response to connection 1910 migration (Section 9); see Section 5.1.1 for details. 1912 7.3. Authenticating Connection IDs 1914 The choice each endpoint makes about connection IDs during the 1915 handshake is authenticated by including all values in transport 1916 parameters; see Section 7.4. This ensures that all connection IDs 1917 used for the handshake are also authenticated by the cryptographic 1918 handshake. 1920 Each endpoint includes the value of the Source Connection ID field 1921 from the first Initial packet it sent in the 1922 initial_source_connection_id transport parameter; see Section 18.2. 1923 A server includes the Destination Connection ID field from the first 1924 Initial packet it received from the client in the 1925 original_destination_connection_id transport parameter; if the server 1926 sent a Retry packet, this refers to the first Initial packet received 1927 before sending the Retry packet. 
If it sends a Retry packet, a 1928 server also includes the Source Connection ID field from the Retry 1929 packet in the retry_source_connection_id transport parameter. 1931 The values provided by a peer for these transport parameters MUST 1932 match the values that an endpoint used in the Destination and Source 1933 Connection ID fields of Initial packets that it sent. Including 1934 connection ID values in transport parameters and verifying them 1935 ensures that an attacker cannot influence the choice of 1936 connection ID for a successful connection by injecting packets 1937 carrying attacker-chosen connection IDs during the handshake. 1939 An endpoint MUST treat absence of the initial_source_connection_id 1940 transport parameter from either endpoint or absence of the 1941 original_destination_connection_id transport parameter from the 1942 server as a connection error of type TRANSPORT_PARAMETER_ERROR. 1944 An endpoint MUST treat the following as a connection error of type 1945 TRANSPORT_PARAMETER_ERROR or PROTOCOL_VIOLATION: 1947 * absence of the retry_source_connection_id transport parameter from 1948 the server after receiving a Retry packet, 1950 * presence of the retry_source_connection_id transport parameter 1951 when no Retry packet was received, or 1953 * a mismatch between values received from a peer in these transport 1954 parameters and the value sent in the corresponding Destination or 1955 Source Connection ID fields of Initial packets. 1957 If a zero-length connection ID is selected, the corresponding 1958 transport parameter is included with a zero-length value. 1960 Figure 7 shows the connection IDs (with DCID=Destination Connection 1961 ID, SCID=Source Connection ID) that are used in a complete handshake. 1962 The exchange of Initial packets is shown, plus the later exchange of 1963 1-RTT packets that includes the connection ID established during the 1964 handshake.
1966 Client Server 1968 Initial: DCID=S1, SCID=C1 -> 1969 <- Initial: DCID=C1, SCID=S3 1970 ... 1971 1-RTT: DCID=S3 -> 1972 <- 1-RTT: DCID=C1 1974 Figure 7: Use of Connection IDs in a Handshake 1976 Figure 8 shows a similar handshake that includes a Retry packet. 1978 Client Server 1980 Initial: DCID=S1, SCID=C1 -> 1981 <- Retry: DCID=C1, SCID=S2 1982 Initial: DCID=S2, SCID=C1 -> 1983 <- Initial: DCID=C1, SCID=S3 1984 ... 1985 1-RTT: DCID=S3 -> 1986 <- 1-RTT: DCID=C1 1988 Figure 8: Use of Connection IDs in a Handshake with Retry 1990 In both cases (Figure 7 and Figure 8), the client sets the value of 1991 the initial_source_connection_id transport parameter to "C1". 1993 When the handshake does not include a Retry (Figure 7), the server 1994 sets original_destination_connection_id to "S1" and 1995 initial_source_connection_id to "S3". In this case, the server does 1996 not include a retry_source_connection_id transport parameter. 1998 When the handshake includes a Retry (Figure 8), the server sets 1999 original_destination_connection_id to "S1", 2000 retry_source_connection_id to "S2", and initial_source_connection_id 2001 to "S3". 2003 Each endpoint validates transport parameters set by the peer. The 2004 client confirms that the retry_source_connection_id transport 2005 parameter is absent if it did not process a Retry packet. 2007 7.4. Transport Parameters 2009 During connection establishment, both endpoints make authenticated 2010 declarations of their transport parameters. Endpoints are required 2011 to comply with the restrictions that each parameter defines; the 2012 description of each parameter includes rules for its handling. 2014 Transport parameters are declarations that are made unilaterally by 2015 each endpoint. Each endpoint can choose values for transport 2016 parameters independent of the values chosen by its peer. 2018 The encoding of the transport parameters is detailed in Section 18. 
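The client-side connection ID checks from Section 7.3 can be sketched as follows. This is a minimal illustration and not part of the specification: the class and function names are hypothetical, None models an absent parameter, and the sketch always raises a TRANSPORT_PARAMETER_ERROR stand-in even where the text also permits PROTOCOL_VIOLATION.

```python
from dataclasses import dataclass
from typing import Optional


class TransportParameterError(Exception):
    """Stands in for a connection error of type TRANSPORT_PARAMETER_ERROR."""


@dataclass
class ServerParams:
    # Hypothetical container for the server's connection ID parameters.
    # None models an absent parameter; b"" models a zero-length connection ID.
    original_destination_connection_id: Optional[bytes]
    initial_source_connection_id: Optional[bytes]
    retry_source_connection_id: Optional[bytes] = None


def check_server_connection_ids(
    params: ServerParams,
    first_dcid: bytes,                   # DCID of the client's first Initial
    server_scid: bytes,                  # SCID of the server's Initial
    retry_scid: Optional[bytes] = None,  # SCID of the Retry, if one was processed
) -> None:
    """Raise TransportParameterError if the server's values do not match."""
    if params.original_destination_connection_id is None:
        raise TransportParameterError("original_destination_connection_id absent")
    if params.initial_source_connection_id is None:
        raise TransportParameterError("initial_source_connection_id absent")
    if params.original_destination_connection_id != first_dcid:
        raise TransportParameterError("original_destination_connection_id mismatch")
    if params.initial_source_connection_id != server_scid:
        raise TransportParameterError("initial_source_connection_id mismatch")
    if retry_scid is None:
        # No Retry was processed, so the parameter has to be absent.
        if params.retry_source_connection_id is not None:
            raise TransportParameterError("unexpected retry_source_connection_id")
    elif params.retry_source_connection_id is None:
        raise TransportParameterError("retry_source_connection_id absent")
    elif params.retry_source_connection_id != retry_scid:
        raise TransportParameterError("retry_source_connection_id mismatch")
```

With the values from Figure 8, a client would call check_server_connection_ids(ServerParams(b"S1", b"S3", b"S2"), b"S1", b"S3", b"S2"), which returns without raising.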
2020 QUIC includes the encoded transport parameters in the cryptographic 2021 handshake. Once the handshake completes, the transport parameters 2022 declared by the peer are available. Each endpoint validates the 2023 values provided by its peer. 2025 Definitions for each of the defined transport parameters are included 2026 in Section 18.2. 2028 An endpoint MUST treat receipt of a transport parameter with an 2029 invalid value as a connection error of type 2030 TRANSPORT_PARAMETER_ERROR. 2032 An endpoint MUST NOT send a parameter more than once in a given 2033 transport parameters extension. An endpoint SHOULD treat receipt of 2034 duplicate transport parameters as a connection error of type 2035 TRANSPORT_PARAMETER_ERROR. 2037 Endpoints use transport parameters to authenticate the negotiation of 2038 connection IDs during the handshake; see Section 7.3. 2040 Application Layer Protocol Negotiation (ALPN; see [ALPN]) allows 2041 clients to offer multiple application protocols during connection 2042 establishment. The transport parameters that a client includes 2043 during the handshake apply to all application protocols that the 2044 client offers. Application protocols can recommend values for 2045 transport parameters, such as the initial flow control limits. 2046 However, application protocols that set constraints on values for 2047 transport parameters could make it impossible for a client to offer 2048 multiple application protocols if these constraints conflict. 2050 7.4.1. Values of Transport Parameters for 0-RTT 2052 Using 0-RTT depends on both client and server using protocol 2053 parameters that were negotiated from a previous connection. To 2054 enable 0-RTT, endpoints store the value of the server transport 2055 parameters from a connection and apply them to any 0-RTT packets that 2056 are sent in subsequent connections to that peer. 
This information is 2057 stored with any information required by the application protocol or 2058 cryptographic handshake; see Section 4.6 of [QUIC-TLS]. 2060 Remembered transport parameters apply to the new connection until the 2061 handshake completes and the client starts sending 1-RTT packets. 2062 Once the handshake completes, the client uses the transport 2063 parameters established in the handshake. Not all transport 2064 parameters are remembered, as some do not apply to future connections 2065 or they have no effect on use of 0-RTT. 2067 The definition of a new transport parameter (Section 7.4.2) MUST 2068 specify whether storing the transport parameter for 0-RTT is 2069 mandatory, optional, or prohibited. A client need not store a 2070 transport parameter it cannot process. 2072 A client MUST NOT use remembered values for the following parameters: 2073 ack_delay_exponent, max_ack_delay, initial_source_connection_id, 2074 original_destination_connection_id, preferred_address, 2075 retry_source_connection_id, and stateless_reset_token. The client 2076 MUST use the server's new values in the handshake instead; if the 2077 server does not provide new values, the default value is used. 2079 A client that attempts to send 0-RTT data MUST remember all other 2080 transport parameters used by the server. The server can remember 2081 these transport parameters, or store an integrity-protected copy of 2082 the values in the ticket and recover the information when accepting 2083 0-RTT data. A server uses the transport parameters in determining 2084 whether to accept 0-RTT data. 2086 If 0-RTT data is accepted by the server, the server MUST NOT reduce 2087 any limits or alter any values that might be violated by the client 2088 with its 0-RTT data. In particular, a server that accepts 0-RTT data 2089 MUST NOT set values for the following parameters (Section 18.2) that 2090 are smaller than the remembered value of the parameters. 
2092 * active_connection_id_limit 2094 * initial_max_data 2096 * initial_max_stream_data_bidi_local 2098 * initial_max_stream_data_bidi_remote 2100 * initial_max_stream_data_uni 2102 * initial_max_streams_bidi 2104 * initial_max_streams_uni 2105 Omitting or setting a zero value for certain transport parameters can 2106 result in 0-RTT data being enabled, but not usable. The applicable 2107 subset of transport parameters that permit sending of application 2108 data SHOULD be set to non-zero values for 0-RTT. This includes 2109 initial_max_data and either initial_max_streams_bidi and 2110 initial_max_stream_data_bidi_remote, or initial_max_streams_uni and 2111 initial_max_stream_data_uni. 2113 A server MAY store and recover the previously sent values of the 2114 max_idle_timeout, max_udp_payload_size, and disable_active_migration 2115 parameters and reject 0-RTT if it selects smaller values. Lowering 2116 the values of these parameters while also accepting 0-RTT data could 2117 degrade the performance of the connection. Specifically, lowering 2118 the max_udp_payload_size could result in dropped packets leading to 2119 worse performance compared to rejecting 0-RTT data outright. 2121 A server MUST either reject 0-RTT data or abort a handshake if the 2122 implied values for transport parameters cannot be supported. 2124 When sending frames in 0-RTT packets, a client MUST only use 2125 remembered transport parameters; importantly, it MUST NOT use updated 2126 values that it learns from the server's updated transport parameters 2127 or from frames received in 1-RTT packets. Updated values of 2128 transport parameters from the handshake apply only to 1-RTT packets. 2129 For instance, flow control limits from remembered transport 2130 parameters apply to all 0-RTT packets even if those values are 2131 increased by the handshake or by frames sent in 1-RTT packets. 
A 2132 server MAY treat use of updated transport parameters in 0-RTT as a 2133 connection error of type PROTOCOL_VIOLATION. 2135 7.4.2. New Transport Parameters 2137 New transport parameters can be used to negotiate new protocol 2138 behavior. An endpoint MUST ignore transport parameters that it does 2139 not support. Absence of a transport parameter therefore disables any 2140 optional protocol feature that is negotiated using the parameter. As 2141 described in Section 18.1, some identifiers are reserved in order to 2142 exercise this requirement. 2144 New transport parameters can be registered according to the rules in 2145 Section 22.2. 2147 7.5. Cryptographic Message Buffering 2149 Implementations need to maintain a buffer of CRYPTO data received out 2150 of order. Because there is no flow control of CRYPTO frames, an 2151 endpoint could potentially force its peer to buffer an unbounded 2152 amount of data. 2154 Implementations MUST support buffering at least 4096 bytes of data 2155 received in out-of-order CRYPTO frames. Endpoints MAY choose to 2156 allow more data to be buffered during the handshake. A larger limit 2157 during the handshake could allow for larger keys or credentials to be 2158 exchanged. An endpoint's buffer size does not need to remain 2159 constant during the life of the connection. 2161 Being unable to buffer CRYPTO frames during the handshake can lead to 2162 a connection failure. If an endpoint's buffer is exceeded during the 2163 handshake, it can expand its buffer temporarily to complete the 2164 handshake. If an endpoint does not expand its buffer, it MUST close 2165 the connection with a CRYPTO_BUFFER_EXCEEDED error code. 2167 Once the handshake completes, if an endpoint is unable to buffer all 2168 data in a CRYPTO frame, it MAY discard that CRYPTO frame and all 2169 CRYPTO frames received in the future, or it MAY close the connection 2170 with a CRYPTO_BUFFER_EXCEEDED error code. 
Packets containing 2171 discarded CRYPTO frames MUST be acknowledged because the packet has 2172 been received and processed by the transport even though the CRYPTO 2173 frame was discarded. 2175 8. Address Validation 2177 Address validation ensures that an endpoint cannot be used for a 2178 traffic amplification attack. In such an attack, a packet is sent to 2179 a server with spoofed source address information that identifies a 2180 victim. If a server generates more or larger packets in response to 2181 that packet, the attacker can use the server to send more data toward 2182 the victim than it would be able to send on its own. 2184 The primary defense against amplification attack is verifying that an 2185 endpoint is able to receive packets at the transport address that it 2186 claims. Address validation is performed both during connection 2187 establishment (see Section 8.1) and during connection migration (see 2188 Section 8.2). 2190 8.1. Address Validation During Connection Establishment 2192 Connection establishment implicitly provides address validation for 2193 both endpoints. In particular, receipt of a packet protected with 2194 Handshake keys confirms that the client received the Initial packet 2195 from the server. Once the server has successfully processed a 2196 Handshake packet from the client, it can consider the client address 2197 to have been validated. 2199 Additionally, a server MAY consider the client address validated if 2200 the client uses a connection ID chosen by the server and the 2201 connection ID contains at least 64 bits of entropy. 2203 Prior to validating the client address, servers MUST NOT send more 2204 than three times as many bytes as the number of bytes they have 2205 received. This limits the magnitude of any amplification attack that 2206 can be mounted using spoofed source addresses. 
For the purposes of 2207 avoiding amplification prior to address validation, servers MUST 2208 count all of the payload bytes received in datagrams that are 2209 uniquely attributed to a single connection. This includes datagrams 2210 that contain packets that are successfully processed and datagrams 2211 that contain packets that are all discarded. 2213 Clients MUST ensure that UDP datagrams containing Initial packets 2214 have UDP payloads of at least 1200 bytes, adding padding to packets 2215 in the datagram as necessary. A client that sends padded datagrams 2216 allows the server to send more data prior to completing address 2217 validation. 2219 Loss of an Initial or Handshake packet from the server can cause a 2220 deadlock if the client does not send additional Initial or Handshake 2221 packets. A deadlock could occur when the server reaches its anti- 2222 amplification limit and the client has received acknowledgements for 2223 all the data it has sent. In this case, when the client has no 2224 reason to send additional packets, the server will be unable to send 2225 more data because it has not validated the client's address. To 2226 prevent this deadlock, clients MUST send a packet on a probe timeout 2227 (PTO, see Section 6.2 of [QUIC-RECOVERY]). Specifically, the client 2228 MUST send an Initial packet in a UDP datagram that contains at least 2229 1200 bytes if it does not have Handshake keys, and otherwise send a 2230 Handshake packet. 2232 A server might wish to validate the client address before starting 2233 the cryptographic handshake. QUIC uses a token in the Initial packet 2234 to provide address validation prior to completing the handshake. 2235 This token is delivered to the client during connection establishment 2236 with a Retry packet (see Section 8.1.2) or in a previous connection 2237 using the NEW_TOKEN frame (see Section 8.1.3). 
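The server-side byte accounting behind the anti-amplification limit in Section 8.1 can be sketched as follows. This is an illustrative sketch only: the class and method names are hypothetical, and a real server would also apply the congestion controller's limits.

```python
from typing import Optional

AMPLIFICATION_FACTOR = 3      # "three times as many bytes" (Section 8.1)
MIN_INITIAL_DATAGRAM = 1200   # minimum UDP payload for client Initial datagrams


class AmplificationLimiter:
    """Tracks how much a server may send before the client address is validated."""

    def __init__(self) -> None:
        self.bytes_received = 0
        self.bytes_sent = 0
        self.address_validated = False

    def on_datagram_received(self, payload_len: int) -> None:
        # Count every datagram attributed to this connection, including
        # datagrams whose packets were all discarded.
        self.bytes_received += payload_len

    def on_datagram_sent(self, payload_len: int) -> None:
        self.bytes_sent += payload_len

    def on_address_validated(self) -> None:
        self.address_validated = True

    def send_budget(self) -> Optional[int]:
        """Bytes that may still be sent; None means no amplification limit."""
        if self.address_validated:
            return None
        limit = AMPLIFICATION_FACTOR * self.bytes_received
        return max(0, limit - self.bytes_sent)

    def can_send(self, payload_len: int) -> bool:
        budget = self.send_budget()
        return budget is None or payload_len <= budget


def initial_padding_needed(payload_len: int) -> int:
    """Padding bytes a client adds so an Initial datagram reaches 1200 bytes."""
    return max(0, MIN_INITIAL_DATAGRAM - payload_len)
```

A single padded 1200-byte Initial gives the server a budget of 3600 bytes, which is why padding lets the server send more data before address validation completes.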
2239 In addition to sending limits imposed prior to address validation, 2240 servers are also constrained in what they can send by the limits set 2241 by the congestion controller. Clients are only constrained by the 2242 congestion controller. 2244 8.1.1. Token Construction 2246 A token sent in a NEW_TOKEN frame or a Retry packet MUST be 2247 constructed in a way that allows the server to identify how it was 2248 provided to a client. These tokens are carried in the same field, 2249 but require different handling from servers. 2251 8.1.2. Address Validation using Retry Packets 2253 Upon receiving the client's Initial packet, the server can request 2254 address validation by sending a Retry packet (Section 17.2.5) 2255 containing a token. This token MUST be repeated by the client in all 2256 Initial packets it sends for that connection after it receives the 2257 Retry packet. 2259 In response to processing an Initial containing a token that was 2260 provided in a Retry packet, a server cannot send another Retry 2261 packet; it can only refuse the connection or permit it to proceed. 2263 As long as it is not possible for an attacker to generate a valid 2264 token for its own address (see Section 8.1.4) and the client is able 2265 to return that token, it proves to the server that it received the 2266 token. 2268 A server can also use a Retry packet to defer the state and 2269 processing costs of connection establishment. Requiring the server 2270 to provide a different connection ID, along with the 2271 original_destination_connection_id transport parameter defined in 2272 Section 18.2, forces the server to demonstrate that it, or an entity 2273 it cooperates with, received the original Initial packet from the 2274 client. Providing a different connection ID also grants a server 2275 some control over how subsequent packets are routed. This can be 2276 used to direct connections to a different server instance.
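One way a server might construct a Retry token that identifies how it was provided and that an attacker cannot forge for its own address is sketched below. This is an assumption-laden illustration, not a format defined by this document: the token layout, the type values, the 10-second lifetime, and the function names are all hypothetical. The sketch only authenticates the token with an HMAC; a deployment that wants to hide the token contents would encrypt them as well.

```python
import hashlib
import hmac
import struct
import time
from typing import Optional

TOKEN_TYPE_RETRY = 0x01      # hypothetical type tag for tokens sent in Retry
TOKEN_TYPE_NEW_TOKEN = 0x02  # hypothetical type tag for NEW_TOKEN tokens
RETRY_LIFETIME = 10          # seconds; assumed "short time" for Retry tokens
_HEADER = "!BQ"              # type (1 byte) + issue time (8 bytes), no padding
_HEADER_LEN = struct.calcsize(_HEADER)
_TAG_LEN = hashlib.sha256().digest_size


def _addr_bytes(ip: str, port: int) -> bytes:
    return ip.encode("ascii") + struct.pack("!H", port)


def make_retry_token(key: bytes, ip: str, port: int,
                     now: Optional[float] = None) -> bytes:
    """Return type || issue time || HMAC(type, issue time, client address)."""
    issued = int(time.time() if now is None else now)
    header = struct.pack(_HEADER, TOKEN_TYPE_RETRY, issued)
    tag = hmac.new(key, header + _addr_bytes(ip, port), hashlib.sha256).digest()
    return header + tag


def check_retry_token(key: bytes, token: bytes, ip: str, port: int,
                      now: Optional[float] = None) -> bool:
    """Accept only an unmodified, unexpired Retry token for this address."""
    if len(token) != _HEADER_LEN + _TAG_LEN:
        return False
    header, tag = token[:_HEADER_LEN], token[_HEADER_LEN:]
    kind, issued = struct.unpack(_HEADER, header)
    if kind != TOKEN_TYPE_RETRY:
        return False
    expected = hmac.new(key, header + _addr_bytes(ip, port),
                        hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return False
    current = time.time() if now is None else now
    return 0 <= current - issued <= RETRY_LIFETIME
```

Because the client address is an input to the HMAC rather than a field of the token, a token replayed from a different address or port fails the check without the address ever appearing in the token.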
2278 If a server receives a client Initial that can be unprotected but 2279 contains an invalid Retry token, it knows the client will not accept 2280 another Retry token. The server can discard such a packet and allow 2281 the client to time out to detect handshake failure, but that could 2282 impose a significant latency penalty on the client. Instead, the 2283 server SHOULD immediately close (Section 10.2) the connection with an 2284 INVALID_TOKEN error. Note that a server has not established any 2285 state for the connection at this point and so does not enter the 2286 closing period. 2288 A flow showing the use of a Retry packet is shown in Figure 9. 2290 Client Server 2292 Initial[0]: CRYPTO[CH] -> 2294 <- Retry+Token 2296 Initial+Token[1]: CRYPTO[CH] -> 2298 Initial[0]: CRYPTO[SH] ACK[1] 2299 Handshake[0]: CRYPTO[EE, CERT, CV, FIN] 2300 <- 1-RTT[0]: STREAM[1, "..."] 2302 Figure 9: Example Handshake with Retry 2304 8.1.3. Address Validation for Future Connections 2306 A server MAY provide clients with an address validation token during 2307 one connection that can be used on a subsequent connection. Address 2308 validation is especially important with 0-RTT because a server 2309 potentially sends a significant amount of data to a client in 2310 response to 0-RTT data. 2312 The server uses the NEW_TOKEN frame (Section 19.7) to provide the 2313 client with an address validation token that can be used to validate 2314 future connections. In a future connection, the client includes this 2315 token in Initial packets to provide address validation. The client 2316 MUST include the token in all Initial packets it sends, unless a 2317 Retry replaces the token with a newer one. The client MUST NOT use 2318 the token provided in a Retry for future connections. Servers MAY 2319 discard any Initial packet that does not carry the expected token. 
2321 Unlike the token that is created for a Retry packet, which is used 2322 immediately, the token sent in the NEW_TOKEN frame might be used 2323 after some period of time has passed. Thus, a token SHOULD have an 2324 expiration time, which could be either an explicit expiration time or 2325 an issued timestamp that can be used to dynamically calculate the 2326 expiration time. A server can store the expiration time or include 2327 it in an encrypted form in the token. 2329 A token issued with NEW_TOKEN MUST NOT include information that would 2330 allow values to be linked by an observer to the connection on which 2331 it was issued, unless the values are encrypted. For example, it 2332 cannot include the previous connection ID or addressing information. 2333 A server MUST ensure that every NEW_TOKEN frame it sends is unique 2334 across all clients, with the exception of those sent to repair losses 2335 of previously sent NEW_TOKEN frames. Information that allows the 2336 server to distinguish between tokens from Retry and NEW_TOKEN MAY be 2337 accessible to entities other than the server. 2339 It is unlikely that the client port number is the same on two 2340 different connections; validating the port is therefore unlikely to 2341 be successful. 2343 A token received in a NEW_TOKEN frame is applicable to any server 2344 that the connection is considered authoritative for (e.g., server 2345 names included in the certificate). When connecting to a server for 2346 which the client retains an applicable and unused token, it SHOULD 2347 include that token in the Token field of its Initial packet. 2348 Including a token might allow the server to validate the client 2349 address without an additional round trip. 
A client MUST NOT include 2350 a token that is not applicable to the server that it is connecting 2351 to, unless the client has the knowledge that the server that issued 2352 the token and the server the client is connecting to are jointly 2353 managing the tokens. A client MAY use a token from any previous 2354 connection to that server. 2356 A token allows a server to correlate activity between the connection 2357 where the token was issued and any connection where it is used. 2358 Clients that want to break continuity of identity with a server MAY 2359 discard tokens provided using the NEW_TOKEN frame. In comparison, a 2360 token obtained in a Retry packet MUST be used immediately during the 2361 connection attempt and cannot be used in subsequent connection 2362 attempts. 2364 A client SHOULD NOT reuse a NEW_TOKEN token for different connection 2365 attempts. Reusing a token allows connections to be linked by 2366 entities on the network path; see Section 9.5. 2368 Clients might receive multiple tokens on a single connection. Aside 2369 from preventing linkability, any token can be used in any connection 2370 attempt. Servers can send additional tokens to either enable address 2371 validation for multiple connection attempts or to replace older 2372 tokens that might become invalid. For a client, this ambiguity means 2373 that sending the most recent unused token is most likely to be 2374 effective. Though saving and using older tokens has no negative 2375 consequences, clients can regard older tokens as being less likely to be 2376 useful to the server for address validation. 2378 When a server receives an Initial packet with an address validation 2379 token, it MUST attempt to validate the token, unless it has already 2380 completed address validation. If the token is invalid then the 2381 server SHOULD proceed as if the client did not have a validated 2382 address, including potentially sending a Retry.
A server SHOULD 2383 encode tokens provided with NEW_TOKEN frames and Retry packets 2384 differently, and validate the latter more strictly. If the 2385 validation succeeds, the server SHOULD then allow the handshake to 2386 proceed. 2388 Note: The rationale for treating the client as unvalidated rather 2389 than discarding the packet is that the client might have received 2390 the token in a previous connection using the NEW_TOKEN frame, and 2391 if the server has lost state, it might be unable to validate the 2392 token at all, leading to connection failure if the packet is 2393 discarded. 2395 In a stateless design, a server can use encrypted and authenticated 2396 tokens to pass information to clients that the server can later 2397 recover and use to validate a client address. Tokens are not 2398 integrated into the cryptographic handshake and so they are not 2399 authenticated. For instance, a client might be able to reuse a 2400 token. To avoid attacks that exploit this property, a server can 2401 limit its use of tokens to only the information needed to validate 2402 client addresses. 2404 Clients MAY use tokens obtained on one connection for any connection 2405 attempt using the same version. When selecting a token to use, 2406 clients do not need to consider other properties of the connection 2407 that is being attempted, including the choice of possible application 2408 protocols, session tickets, or other connection properties. 2410 8.1.4. Address Validation Token Integrity 2412 An address validation token MUST be difficult to guess. Including a 2413 large enough random value in the token would be sufficient, but this 2414 depends on the server remembering the value it sends to clients. 2416 A token-based scheme allows the server to offload any state 2417 associated with validation to the client. For this design to work, 2418 the token MUST be covered by integrity protection against 2419 modification or falsification by clients. 
Without integrity 2420 protection, malicious clients could generate or guess values for 2421 tokens that would be accepted by the server. Only the server 2422 requires access to the integrity protection key for tokens. 2424 There is no need for a single well-defined format for the token 2425 because the server that generates the token also consumes it. Tokens 2426 sent in Retry packets SHOULD include information that allows the 2427 server to verify that the source IP address and port in client 2428 packets remain constant. 2430 Tokens sent in NEW_TOKEN frames MUST include information that allows 2431 the server to verify that the client IP address has not changed from 2432 when the token was issued. Servers can use tokens from NEW_TOKEN in 2433 deciding not to send a Retry packet, even if the client address has 2434 changed. If the client IP address has changed, the server MUST 2435 adhere to the anti-amplification limits found in Section 8.1. Note 2436 that in the presence of NAT, this requirement might be insufficient 2437 to protect other hosts that share the NAT from amplification attack. 2439 Attackers could replay tokens to use servers as amplifiers in DDoS 2440 attacks. To protect against such attacks, servers MUST ensure that 2441 replay of tokens is prevented or limited. Servers SHOULD ensure that 2442 tokens sent in Retry packets are only accepted for a short time. 2443 Tokens that are provided in NEW_TOKEN frames (Section 19.7) need to 2444 be valid for longer, but SHOULD NOT be accepted multiple times in a 2445 short period. Servers are encouraged to allow tokens to be used only 2446 once, if possible; tokens MAY include additional information about 2447 clients to further narrow applicability or reuse. 2449 8.2. Path Validation 2451 Path validation is used during connection migration (see Section 9) 2452 by the migrating endpoint to verify reachability of a peer from a new 2453 local address. 
In path validation, endpoints test reachability 2454 between a specific local address and a specific peer address, where 2455 an address is the two-tuple of IP address and port. 2457 Path validation tests that packets sent on a path to a peer are 2458 received by that peer. Path validation is used to ensure that 2459 packets received from a migrating peer do not carry a spoofed source 2460 address. 2462 Path validation does not validate that a peer can send in the return 2463 direction. The peer performs independent validation of the return 2464 path. 2466 Path validation can be used at any time by either endpoint. For 2467 instance, an endpoint might check that a peer is still in possession 2468 of its address after a period of quiescence. 2470 Path validation is not designed as a NAT traversal mechanism. Though 2471 the mechanism described here might be effective for the creation of 2472 NAT bindings that support NAT traversal, the expectation is that one 2473 or other peer is able to receive packets without first having sent a 2474 packet on that path. Effective NAT traversal needs additional 2475 synchronization mechanisms that are not provided here. 2477 An endpoint MAY include PATH_CHALLENGE and PATH_RESPONSE frames that 2478 are used for path validation with other frames. In particular, an 2479 endpoint can pad a packet carrying a PATH_CHALLENGE for Path Maximum 2480 Transfer Unit (PMTU) discovery (see Section 14.2.1), or an endpoint 2481 can include a PATH_RESPONSE with its own PATH_CHALLENGE. 2483 When probing a new path, an endpoint might want to ensure that its 2484 peer has an unused connection ID available for responses. The 2485 endpoint can send NEW_CONNECTION_ID and PATH_CHALLENGE frames in the 2486 same packet. This ensures that an unused connection ID will be 2487 available to the peer when sending a response. 2489 8.3. 
Initiating Path Validation 2491 To initiate path validation, an endpoint sends a PATH_CHALLENGE frame 2492 containing an unpredictable payload on the path to be validated. 2494 An endpoint MAY send multiple PATH_CHALLENGE frames to guard against 2495 packet loss. However, an endpoint SHOULD NOT send multiple 2496 PATH_CHALLENGE frames in a single packet. An endpoint SHOULD NOT 2497 send a PATH_CHALLENGE more frequently than it would an Initial 2498 packet, ensuring that connection migration is no more load on a new 2499 path than establishing a new connection. 2501 The endpoint MUST use unpredictable data in every PATH_CHALLENGE 2502 frame so that it can associate the peer's response with the 2503 corresponding PATH_CHALLENGE. 2505 8.4. Path Validation Responses 2507 On receiving a PATH_CHALLENGE frame, an endpoint MUST respond by 2508 echoing the data contained in the PATH_CHALLENGE frame in a 2509 PATH_RESPONSE frame. A PATH_RESPONSE frame does not need to be sent 2510 on the network path where the PATH_CHALLENGE was received; a 2511 PATH_RESPONSE can be sent on any network path. An endpoint MUST NOT 2512 delay transmission of a packet containing a PATH_RESPONSE frame 2513 unless constrained by congestion control. 2515 An endpoint MUST NOT send more than one PATH_RESPONSE frame in 2516 response to one PATH_CHALLENGE frame; see Section 13.3. The peer is 2517 expected to send more PATH_CHALLENGE frames as necessary to evoke 2518 additional PATH_RESPONSE frames. 2520 8.5. Successful Path Validation 2522 Path validation succeeds when a PATH_RESPONSE frame is received that 2523 contains the data that was sent in a previous PATH_CHALLENGE frame. 2524 This validates the path on which the PATH_CHALLENGE was sent. 2526 Receipt of an acknowledgment for a packet containing a PATH_CHALLENGE 2527 frame is not adequate validation, since the acknowledgment can be 2528 spoofed by a malicious peer. 2530 8.6. 
Failed Path Validation 2532 Path validation only fails when the endpoint attempting to validate 2533 the path abandons its attempt to validate the path. 2535 Endpoints SHOULD abandon path validation based on a timer. When 2536 setting this timer, implementations are cautioned that the new path 2537 could have a longer round-trip time than the original. A value of 2538 three times the larger of the current Probe Timeout (PTO) or the 2539 initial timeout (that is, 2*kInitialRtt) as defined in 2540 [QUIC-RECOVERY] is RECOMMENDED. That is: 2542 validation_timeout = max(3*PTO, 6*kInitialRtt) 2544 This timeout allows for multiple PTOs to expire prior to failing path 2545 validation, so that loss of a single PATH_CHALLENGE or PATH_RESPONSE 2546 frame does not cause path validation failure. 2548 Note that the endpoint might receive packets containing other frames 2549 on the new path, but a PATH_RESPONSE frame with appropriate data is 2550 required for path validation to succeed. 2552 When an endpoint abandons path validation, it determines that the 2553 path is unusable. This does not necessarily imply a failure of the 2554 connection - endpoints can continue sending packets over other paths 2555 as appropriate. If no paths are available, an endpoint can wait for 2556 a new path to become available or close the connection. 2558 A path validation might be abandoned for other reasons besides 2559 failure. Primarily, this happens if a connection migration to a new 2560 path is initiated while a path validation on the old path is in 2561 progress. 2563 9. Connection Migration 2565 The use of a connection ID allows connections to survive changes to 2566 endpoint addresses (IP address and port), such as those caused by an 2567 endpoint migrating to a new network. This section describes the 2568 process by which an endpoint migrates to a new address. 2570 The design of QUIC relies on endpoints retaining a stable address for 2571 the duration of the handshake. 
An endpoint MUST NOT initiate 2572 connection migration before the handshake is confirmed, as defined in 2573 section 4.1.2 of [QUIC-TLS]. 2575 If the peer sent the disable_active_migration transport parameter, an 2576 endpoint also MUST NOT send packets (including probing packets; see 2577 Section 9.1) from a different local address to the address the peer 2578 used during the handshake. An endpoint that has sent this transport 2579 parameter, but detects that a peer has nonetheless migrated to a 2580 different remote address MUST either drop the incoming packets on 2581 that path without generating a stateless reset or proceed with path 2582 validation and allow the peer to migrate. Generating a stateless 2583 reset or closing the connection would allow third parties in the 2584 network to cause connections to close by spoofing or otherwise 2585 manipulating observed traffic. 2587 Not all changes of peer address are intentional, or active, 2588 migrations. The peer could experience NAT rebinding: a change of 2589 address due to a middlebox, usually a NAT, allocating a new outgoing 2590 port or even a new outgoing IP address for a flow. An endpoint MUST 2591 perform path validation (Section 8.2) if it detects any change to a 2592 peer's address, unless it has previously validated that address. 2594 When an endpoint has no validated path on which to send packets, it 2595 MAY discard connection state. An endpoint capable of connection 2596 migration MAY wait for a new path to become available before 2597 discarding connection state. 2599 This document limits migration of connections to new client 2600 addresses, except as described in Section 9.6. Clients are 2601 responsible for initiating all migrations. Servers do not send non- 2602 probing packets (see Section 9.1) toward a client address until they 2603 see a non-probing packet from that address. If a client receives 2604 packets from an unknown server address, the client MUST discard these 2605 packets. 
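When a change of peer address triggers path validation as required above, the attempt is bounded by the timer recommended in Section 8.6. The following is a minimal sketch of that computation. The function name and the 333 ms default for kInitialRtt are assumptions made for illustration; the authoritative value is the one defined in [QUIC-RECOVERY].

```python
def validation_timeout(pto: float, k_initial_rtt: float = 0.333) -> float:
    """Recommended path validation timeout, in seconds (Section 8.6).

    pto           -- current Probe Timeout in seconds
    k_initial_rtt -- initial RTT constant in seconds (assumed default)
    """
    # Three times the larger of the PTO and the initial timeout
    # (2 * kInitialRtt), written as in the draft.
    return max(3 * pto, 6 * k_initial_rtt)
```

With a small PTO the kInitialRtt term dominates, which is what protects a new path whose round-trip time is longer than that of the original path.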
2607 9.1. Probing a New Path 2609 An endpoint MAY probe for peer reachability from a new local address 2610 using path validation (Section 8.2) prior to migrating the connection 2611 to the new local address. Failure of path validation simply means 2612 that the new path is not usable for this connection. Failure to 2613 validate a path does not cause the connection to end unless there are 2614 no valid alternative paths available. 2616 An endpoint uses a new connection ID for probes sent from a new local 2617 address; see Section 9.5 for further discussion. An endpoint that 2618 uses a new local address needs to ensure that at least one new 2619 connection ID is available at the peer. That can be achieved by 2620 including a NEW_CONNECTION_ID frame in the probe. 2622 Receiving a PATH_CHALLENGE frame from a peer indicates that the peer 2623 is probing for reachability on a path. An endpoint sends a 2624 PATH_RESPONSE frame in response, as per Section 8.2. 2626 PATH_CHALLENGE, PATH_RESPONSE, NEW_CONNECTION_ID, and PADDING frames 2627 are "probing frames", and all other frames are "non-probing frames". 2628 A packet containing only probing frames is a "probing packet", and a 2629 packet containing any other frame is a "non-probing packet". 2631 9.2. Initiating Connection Migration 2633 An endpoint can migrate a connection to a new local address by 2634 sending packets containing non-probing frames from that address. 2636 Each endpoint validates its peer's address during connection 2637 establishment. Therefore, a migrating endpoint can send to its peer 2638 knowing that the peer is willing to receive at the peer's current 2639 address. Thus an endpoint can migrate to a new local address without 2640 first validating the peer's address. 2642 When migrating, the new path might not support the endpoint's current 2643 sending rate. Therefore, the endpoint resets its congestion 2644 controller and RTT estimate, as described in Section 9.4. 
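Because migration is signalled by non-probing frames, an implementation needs the classification from Section 9.1. A minimal sketch follows; representing frames by their names as strings, and rejecting a packet with no frames, are assumptions of this example.

```python
# Frame types that Section 9.1 defines as "probing frames".
PROBING_FRAMES = frozenset(
    {"PATH_CHALLENGE", "PATH_RESPONSE", "NEW_CONNECTION_ID", "PADDING"}
)


def is_probing_packet(frame_types) -> bool:
    """Return True if every frame in the packet is a probing frame."""
    frames = list(frame_types)
    if not frames:
        # Assumption: a packet without frames is invalid, so it is
        # rejected here rather than classified.
        raise ValueError("packet contains no frames")
    return all(frame in PROBING_FRAMES for frame in frames)
```

A single non-probing frame, such as a STREAM frame, is enough to make the whole packet a non-probing packet.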
2646 The new path might not have the same ECN capability. Therefore, the 2647 endpoint verifies ECN capability as described in Section 13.4. 2649 To establish reachability on the new path, an endpoint initiates path 2650 validation (Section 8.2) on the new path. An endpoint MAY defer path 2651 validation until after a peer sends the next non-probing frame to its 2652 new address. 2654 Path validation is necessary to verify reachability of a peer on a 2655 new network path. Acknowledgments cannot be used for path validation 2656 as they contain insufficient entropy and might be spoofed. No method 2657 is provided to establish return reachability, as endpoints 2658 independently determine reachability on each direction of a path. 2660 9.3. Responding to Connection Migration 2662 Receiving a packet from a new peer address containing a non-probing 2663 frame indicates that the peer has migrated to that address. 2665 In response to such a packet, an endpoint MUST start sending 2666 subsequent packets to the new peer address and MUST initiate path 2667 validation (Section 8.2) to verify the peer's ownership of the 2668 unvalidated address. 2670 An endpoint MAY send data to an unvalidated peer address, but it MUST 2671 protect against potential attacks as described in Section 9.3.1 and 2672 Section 9.3.2. An endpoint MAY skip validation of a peer address if 2673 that address has been seen recently. In particular, if an endpoint 2674 returns to a previously-validated path after detecting some form of 2675 spurious migration, skipping address validation and restoring loss 2676 detection and congestion state can reduce the performance impact of 2677 the attack. 2679 An endpoint only changes the address that it sends packets to in 2680 response to the highest-numbered non-probing packet. This ensures 2681 that an endpoint does not send packets to an old peer address in the 2682 case that it receives reordered packets. 
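The rule that only the highest-numbered non-probing packet changes the send address can be sketched as follows. The class and method names are hypothetical, addresses are modelled as (IP, port) tuples, and the path validation that must follow a switch is left to the caller.

```python
from dataclasses import dataclass


@dataclass
class PeerAddressTracker:
    current_addr: tuple                # address packets are currently sent to
    largest_non_probing_pn: int = -1   # highest non-probing packet number seen

    def on_packet(self, src_addr: tuple, packet_number: int,
                  is_probing: bool) -> bool:
        """Process a received packet; return True if the send address changed.

        A True result means the caller should start path validation
        for the new address (Section 8.2).
        """
        if is_probing:
            return False  # probing packets never trigger migration
        if packet_number <= self.largest_non_probing_pn:
            return False  # reordered or duplicate packet: ignore its address
        self.largest_non_probing_pn = packet_number
        if src_addr == self.current_addr:
            return False
        self.current_addr = src_addr
        return True
```

A late packet from the old address with a lower packet number leaves the send address untouched, which is the reordering case described above.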
2684 After changing the address to which it sends non-probing packets, an 2685 endpoint could abandon any path validation for other addresses. 2687 Receiving a packet from a new peer address might be the result of a 2688 NAT rebinding at the peer. 2690 After verifying a new client address, the server SHOULD send new 2691 address validation tokens (Section 8) to the client. 2693 9.3.1. Peer Address Spoofing 2695 It is possible that a peer is spoofing its source address to cause an 2696 endpoint to send excessive amounts of data to an unwilling host. If 2697 the endpoint sends significantly more data than the spoofing peer, 2698 connection migration might be used to amplify the volume of data that 2699 an attacker can generate toward a victim. 2701 As described in Section 9.3, an endpoint is required to validate a 2702 peer's new address to confirm the peer's possession of the new 2703 address. Until a peer's address is deemed valid, an endpoint MUST 2704 limit the rate at which it sends data to this address. The endpoint 2705 MUST NOT send more than a minimum congestion window's worth of data 2706 per estimated round-trip time (kMinimumWindow, as defined in 2707 [QUIC-RECOVERY]). In the absence of this limit, an endpoint risks 2708 being used for a denial of service attack against an unsuspecting 2709 victim. Note that since the endpoint will not have any round-trip 2710 time measurements to this address, the estimate SHOULD be the default 2711 initial value; see [QUIC-RECOVERY]. 2713 If an endpoint skips validation of a peer address as described above, 2714 it does not need to limit its sending rate. 2716 9.3.2. On-Path Address Spoofing 2718 An on-path attacker could cause a spurious connection migration by 2719 copying and forwarding a packet with a spoofed address such that it 2720 arrives before the original packet. 
The packet with the spoofed 2721 address will be seen to come from a migrating connection, and the 2722 original packet will be seen as a duplicate and dropped. After a 2723 spurious migration, validation of the source address will fail 2724 because the entity at the source address does not have the necessary 2725 cryptographic keys to read or respond to the PATH_CHALLENGE frame 2726 that is sent to it even if it wanted to. 2728 To protect the connection from failing due to such a spurious 2729 migration, an endpoint MUST revert to using the last validated peer 2730 address when validation of a new peer address fails. Additionally, 2731 receipt of packets with higher packet numbers from the legitimate 2732 peer address will trigger another connection migration. This will 2733 cause the validation of the address of the spurious migration to be 2734 abandoned, thus containing migrations initiated by the attacker 2735 injecting a single packet. 2737 If an endpoint has no state about the last validated peer address, it 2738 MUST close the connection silently by discarding all connection 2739 state. This results in new packets on the connection being handled 2740 generically. For instance, an endpoint MAY send a stateless reset in 2741 response to any further incoming packets. 2743 9.3.3. Off-Path Packet Forwarding 2745 An off-path attacker that can observe packets might forward copies of 2746 genuine packets to endpoints. If the copied packet arrives before 2747 the genuine packet, this will appear as a NAT rebinding. Any genuine 2748 packet will be discarded as a duplicate. If the attacker is able to 2749 continue forwarding packets, it might be able to cause migration to a 2750 path via the attacker. This places the attacker on path, giving it 2751 the ability to observe or drop all subsequent packets. 2753 This style of attack relies on the attacker using a path that has 2754 approximately the same characteristics as the direct path between 2755 endpoints. 
The attack is more reliable if relatively few packets are 2756 sent or if packet loss coincides with the attempted attack. 2758 A non-probing packet received on the original path that increases the 2759 maximum received packet number will cause the endpoint to move back 2760 to that path. Eliciting packets on this path increases the 2761 likelihood that the attack is unsuccessful. Therefore, mitigation of 2762 this attack relies on triggering the exchange of packets. 2764 In response to an apparent migration, endpoints MUST validate the 2765 previously active path using a PATH_CHALLENGE frame. This induces 2766 the sending of new packets on that path. If the path is no longer 2767 viable, the validation attempt will time out and fail; if the path is 2768 viable, but no longer desired, the validation will succeed, but only 2769 results in probing packets being sent on the path. 2771 An endpoint that receives a PATH_CHALLENGE on an active path SHOULD 2772 send a non-probing packet in response. If the non-probing packet 2773 arrives before any copy made by an attacker, this results in the 2774 connection being migrated back to the original path. Any subsequent 2775 migration to another path restarts this entire process. 2777 This defense is imperfect, but this is not considered a serious 2778 problem. If the path via the attacker is reliably faster than the 2779 original path despite multiple attempts to use that original path, it 2780 is not possible to distinguish between an attack and an improvement in 2781 routing. 2783 An endpoint could also use heuristics to improve detection of this 2784 style of attack. For instance, NAT rebinding is improbable if 2785 packets were recently received on the old path; similarly, rebinding 2786 is rare on IPv6 paths. Endpoints can also look for duplicated 2787 packets. Conversely, a change in connection ID is more likely to 2788 indicate an intentional migration rather than an attack. 2790 9.4.
Loss Detection and Congestion Control 2792 The capacity available on the new path might not be the same as the 2793 old path. Packets sent on the old path MUST NOT contribute to 2794 congestion control or RTT estimation for the new path. 2796 On confirming a peer's ownership of its new address, an endpoint MUST 2797 immediately reset the congestion controller and round-trip time 2798 estimator for the new path to initial values (see Appendices A.3 and 2799 B.3 in [QUIC-RECOVERY]) unless the only change in the peer's address 2800 is its port number. Because port-only changes are commonly the 2801 result of NAT rebinding or other middlebox activity, the endpoint MAY 2802 instead retain its congestion control state and round-trip estimate 2803 in those cases instead of reverting to initial values. In cases 2804 where congestion control state retained from an old path is used on a 2805 new path with substantially different characteristics, a sender may 2806 transmit too aggressively until the congestion controller and the RTT 2807 estimator have adapted. Generally, implementations are advised to be 2808 cautious when using previous values on a new path. 2810 There may be apparent reordering at the receiver when an endpoint 2811 sends data and probes from/to multiple addresses during the migration 2812 period, since the two resulting paths may have different round-trip 2813 times. A receiver of packets on multiple paths will still send ACK 2814 frames covering all received packets. 2816 While multiple paths might be used during connection migration, a 2817 single congestion control context and a single loss recovery context 2818 (as described in [QUIC-RECOVERY]) may be adequate. For instance, an 2819 endpoint might delay switching to a new congestion control context 2820 until it is confirmed that an old path is no longer needed (such as 2821 the case in Section 9.3.3). 
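The decision of whether congestion control and RTT state has to be reset after a confirmed address change can be sketched as below. The function name is hypothetical and addresses are modelled as (IP string, port) tuples; whether to actually retain state on a port-only change remains an implementation choice.

```python
import ipaddress


def must_reset_path_state(old_addr: tuple, new_addr: tuple) -> bool:
    """Return True if the congestion controller and RTT estimator
    have to be reset to initial values for the new peer address.

    A port-only change (typically NAT rebinding) returns False,
    meaning the endpoint MAY keep its existing state.
    """
    old_ip = ipaddress.ip_address(old_addr[0])
    new_ip = ipaddress.ip_address(new_addr[0])
    # Comparing parsed addresses ignores differences in textual form,
    # for example in how an IPv6 address is abbreviated.
    return old_ip != new_ip
```

Even when this returns False, the caution above still applies: retained state can be too aggressive if the new path differs substantially from the old one.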
2823 A sender can make exceptions for probe packets so that their loss 2824 detection is independent and does not unduly cause the congestion 2825 controller to reduce its sending rate. An endpoint might set a 2826 separate timer when a PATH_CHALLENGE is sent, which is cancelled if 2827 the corresponding PATH_RESPONSE is received. If the timer fires 2828 before the PATH_RESPONSE is received, the endpoint might send a new 2829 PATH_CHALLENGE, and restart the timer for a longer period of time. 2830 This timer SHOULD be set as described in Section 6.2.1 of 2831 [QUIC-RECOVERY] and MUST NOT be more aggressive. 2833 9.5. Privacy Implications of Connection Migration 2835 Using a stable connection ID on multiple network paths would allow a 2836 passive observer to correlate activity between those paths. An 2837 endpoint that moves between networks might not wish to have their 2838 activity correlated by any entity other than their peer, so different 2839 connection IDs are used when sending from different local addresses, 2840 as discussed in Section 5.1. For this to be effective, endpoints 2841 need to ensure that connection IDs they provide cannot be linked by 2842 any other entity. 2844 At any time, endpoints MAY change the Destination Connection ID they 2845 transmit with to a value that has not been used on another path. 2847 An endpoint MUST NOT reuse a connection ID when sending from more 2848 than one local address, for example when initiating connection 2849 migration as described in Section 9.2 or when probing a new network 2850 path as described in Section 9.1. 2852 Similarly, an endpoint MUST NOT reuse a connection ID when sending to 2853 more than one destination address. 
Due to network changes outside 2854 the control of its peer, an endpoint might receive packets from a new 2855 source address with the same destination connection ID, in which case 2856 it MAY continue to use the current connection ID with the new remote 2857 address while still sending from the same local address. 2859 These requirements regarding connection ID reuse apply only to the 2860 sending of packets, as unintentional changes in path without a change 2861 in connection ID are possible. For example, after a period of 2862 network inactivity, NAT rebinding might cause packets to be sent on a 2863 new path when the client resumes sending. An endpoint responds to 2864 such an event as described in Section 9.3. 2866 Using different connection IDs for packets sent in both directions on 2867 each new network path eliminates the use of the connection ID for 2868 linking packets from the same connection across different network 2869 paths. Header protection ensures that packet numbers cannot be used 2870 to correlate activity. This does not prevent other properties of 2871 packets, such as timing and size, from being used to correlate 2872 activity. 2874 An endpoint SHOULD NOT initiate migration with a peer that has 2875 requested a zero-length connection ID, because traffic over the new 2876 path might be trivially linkable to traffic over the old one. If the 2877 server is able to associate packets with a zero-length connection ID 2878 to the right connection, it means that the server is using other 2879 information to demultiplex packets. For example, a server might 2880 provide a unique address to every client, for instance using HTTP 2881 alternative services [ALTSVC]. Information that might allow correct 2882 routing of packets across multiple network paths will also allow 2883 activity on those paths to be linked by entities other than the peer. 
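The requirement earlier in this section that a connection ID not be reused when sending from more than one local address can be enforced with simple bookkeeping. This is a sketch under stated assumptions: the class and method names are hypothetical, and the exception for a peer whose address changes outside its control is not modelled.

```python
class ConnectionIdBinder:
    """Remembers the local address each connection ID was first used from."""

    def __init__(self):
        self._local_addr_for_cid = {}

    def try_use(self, cid: bytes, local_addr: tuple) -> bool:
        """Bind cid to local_addr on first use.

        Returns True if sending with cid from local_addr is allowed,
        False if cid is already bound to a different local address.
        """
        bound = self._local_addr_for_cid.setdefault(cid, local_addr)
        return bound == local_addr
```

When try_use returns False, the sender needs a connection ID that has not been used before, which is why a supply of unused connection IDs matters for migration.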
2885 A client might wish to reduce linkability by employing a new 2886 connection ID and source UDP port when sending traffic after a period 2887 of inactivity. Changing the UDP port from which it sends packets at 2888 the same time might cause the packet to appear as a connection 2889 migration. This ensures that the mechanisms that support migration 2890 are exercised even for clients that do not experience NAT rebindings 2891 or genuine migrations. Changing port number can cause a peer to 2892 reset its congestion state (see Section 9.4), so the port SHOULD only 2893 be changed infrequently. 2895 An endpoint that exhausts available connection IDs cannot probe new 2896 paths or initiate migration, nor can it respond to probes or attempts 2897 by its peer to migrate. To ensure that migration is possible and 2898 packets sent on different paths cannot be correlated, endpoints 2899 SHOULD provide new connection IDs before peers migrate; see 2900 Section 5.1.1. If a peer might have exhausted available connection 2901 IDs, a migrating endpoint could include a NEW_CONNECTION_ID frame in 2902 all packets sent on a new network path. 2904 9.6. Server's Preferred Address 2906 QUIC allows servers to accept connections on one IP address and 2907 attempt to transfer these connections to a more preferred address 2908 shortly after the handshake. This is particularly useful when 2909 clients initially connect to an address shared by multiple servers 2910 but would prefer to use a unicast address to ensure connection 2911 stability. This section describes the protocol for migrating a 2912 connection to a preferred server address. 2914 Migrating a connection to a new server address mid-connection is not 2915 supported by the version of QUIC specified in this document. If a 2916 client receives packets from a new server address when the client has 2917 not initiated a migration to that address, the client SHOULD discard 2918 these packets. 2920 9.6.1. 
Communicating a Preferred Address 2922 A server conveys a preferred address by including the 2923 preferred_address transport parameter in the TLS handshake. 2925 Servers MAY communicate a preferred address of each address family 2926 (IPv4 and IPv6) to allow clients to pick the one most suited to their 2927 network attachment. 2929 Once the handshake is confirmed, the client SHOULD select one of the 2930 two addresses provided by the server and initiate path validation 2931 (see Section 8.2). A client constructs packets using any previously 2932 unused active connection ID, taken from either the preferred_address 2933 transport parameter or a NEW_CONNECTION_ID frame. 2935 As soon as path validation succeeds, the client SHOULD begin sending 2936 all future packets to the new server address using the new connection 2937 ID and discontinue use of the old server address. If path validation 2938 fails, the client MUST continue sending all future packets to the 2939 server's original IP address. 2941 9.6.2. Migration to a Preferred Address 2943 A client that migrates to a preferred address MUST validate the 2944 address it chooses before migrating; see Section 21.4.3. 2946 A server might receive a packet addressed to its preferred IP address 2947 at any time after it accepts a connection. If this packet contains a 2948 PATH_CHALLENGE frame, the server sends a packet containing a 2949 PATH_RESPONSE frame as per Section 8.2. The server MUST send non- 2950 probing packets from its original address until it receives a non- 2951 probing packet from the client at its preferred address and until the 2952 server has validated the new path. 2954 The server MUST probe on the path toward the client from its 2955 preferred address. This helps to guard against spurious migration 2956 initiated by an attacker. 
2958 Once the server has completed its path validation and has received a 2959 non-probing packet with a new largest packet number on its preferred 2960 address, the server begins sending non-probing packets to the client 2961 exclusively from its preferred IP address. It SHOULD drop packets 2962 for this connection received on the old IP address, but MAY continue 2963 to process delayed packets. 2965 The addresses that a server provides in the preferred_address 2966 transport parameter are only valid for the connection in which they 2967 are provided. A client MUST NOT use these for other connections, 2968 including connections that are resumed from the current connection. 2970 9.6.3. Interaction of Client Migration and Preferred Address 2972 A client might need to perform a connection migration before it has 2973 migrated to the server's preferred address. In this case, the client 2974 SHOULD perform path validation to both the original and preferred 2975 server address from the client's new address concurrently. 2977 If path validation of the server's preferred address succeeds, the 2978 client MUST abandon validation of the original address and migrate to 2979 using the server's preferred address. If path validation of the 2980 server's preferred address fails but validation of the server's 2981 original address succeeds, the client MAY migrate to its new address 2982 and continue sending to the server's original address. 2984 If packets received at the server's preferred address have a 2985 different source address than observed from the client during the 2986 handshake, the server MUST protect against potential attacks as 2987 described in Section 9.3.1 and Section 9.3.2. In addition to 2988 intentional simultaneous migration, this might also occur because the 2989 client's access network used a different NAT binding for the server's 2990 preferred address. 
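The client-side outcome of validating both server addresses concurrently from a new client address, as described at the start of this section, can be summarised in a small decision function. The function name and the string results are hypothetical; each argument is True (validation succeeded), False (failed), or None (still in progress).

```python
from typing import Optional


def server_address_choice(preferred_ok: Optional[bool],
                          original_ok: Optional[bool]) -> str:
    """Decide which server address the client uses after migrating."""
    if preferred_ok is True:
        # Validation of the original address is abandoned in this case.
        return "preferred"
    if preferred_ok is None:
        return "pending"   # the preferred address could still validate
    # Validation of the preferred address has failed.
    if original_ok is True:
        return "original"
    if original_ok is None:
        return "pending"
    return "failed"        # neither server address is reachable
```

Note that a successful validation of the original address does not settle the outcome while the preferred address is still being validated.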
2992 Servers SHOULD initiate path validation to the client's new address 2993 upon receiving a probe packet from a different address. Servers MUST 2994 NOT send more than a minimum congestion window's worth of non-probing 2995 packets to the new address before path validation is complete. 2997 A client that migrates to a new address SHOULD use a preferred 2998 address from the same address family for the server. 3000 The connection ID provided in the preferred_address transport 3001 parameter is not specific to the addresses that are provided. This 3002 connection ID is provided to ensure that the client has a connection 3003 ID available for migration, but the client MAY use this connection ID 3004 on any path. 3006 9.7. Use of IPv6 Flow-Label and Migration 3008 Endpoints that send data using IPv6 SHOULD apply an IPv6 flow label 3009 in compliance with [RFC6437], unless the local API does not allow 3010 setting IPv6 flow labels. 3012 The IPv6 flow label SHOULD be a pseudo-random function of the source 3013 and destination addresses, source and destination UDP ports, and the 3014 Destination Connection ID field. The flow label generation MUST be 3015 designed to minimize the chances of linkability with a previously 3016 used flow label, as this would enable correlating activity on 3017 multiple paths; see Section 9.5. 3019 A possible implementation is to compute the flow label as a 3020 cryptographic hash function of the source and destination addresses, 3021 source and destination UDP ports, Destination Connection ID field, 3022 and a local secret. 3024 10. Connection Termination 3026 An established QUIC connection can be terminated in one of three 3027 ways: 3029 * idle timeout (Section 10.1) 3031 * immediate close (Section 10.2) 3033 * stateless reset (Section 10.3) 3035 An endpoint MAY discard connection state if it does not have a 3036 validated path on which it can send packets; see Section 8.2. 3038 10.1. 
Idle Timeout 3040 If a max_idle_timeout is specified by either peer in its transport 3041 parameters (Section 18.2), the connection is silently closed and its 3042 state is discarded when it remains idle for longer than the minimum 3043 of both peers' max_idle_timeout values. 3045 Each endpoint advertises a max_idle_timeout, but the effective value 3046 at an endpoint is computed as the minimum of the two advertised 3047 values. By announcing a max_idle_timeout, an endpoint commits to 3048 initiating an immediate close (Section 10.2) if it abandons the 3049 connection prior to the effective value. 3051 An endpoint restarts its idle timer when a packet from its peer is 3052 received and processed successfully. An endpoint also restarts its 3053 idle timer when sending an ack-eliciting packet if no other ack- 3054 eliciting packets have been sent since last receiving and processing 3055 a packet. Restarting this timer when sending a packet ensures that 3056 connections are not closed after new activity is initiated. 3058 To avoid excessively small idle timeout periods, endpoints MUST 3059 increase the idle timeout period to be at least three times the 3060 current Probe Timeout (PTO). This allows for multiple PTOs to expire 3061 prior to idle timeout, ensuring the idle timeout does not expire as a 3062 result of a single packet loss. 3064 10.1.1. Liveness Testing 3066 An endpoint that sends packets close to the effective timeout risks 3067 having them be discarded at the peer, since the idle timeout period 3068 might have expired at the peer before these packets arrive. 3070 An endpoint can send a PING or another ack-eliciting frame to test 3071 the connection for liveness if the peer could time out soon, such as 3072 within a PTO; see Section 6.2 of [QUIC-RECOVERY]. This is especially 3073 useful if any available application data cannot be safely retried. 3074 Note that the application determines what data is safe to retry. 3076 10.1.2.
Deferring Idle Timeout 3078 An endpoint might need to send ack-eliciting packets to avoid an idle 3079 timeout if it is expecting response data, but does not have or is 3080 unable to send application data. 3082 An implementation of QUIC might provide applications with an option 3083 to defer an idle timeout. This facility could be used when the 3084 application wishes to avoid losing state that has been associated 3085 with an open connection, but does not expect to exchange application 3086 data for some time. With this option, an endpoint could send a PING 3087 frame (Section 19.2) periodically, which will cause the peer to 3088 restart its idle timeout period. Sending a packet containing a PING 3089 frame restarts the idle timeout for this endpoint also if this is the 3090 first ack-eliciting packet sent since receiving a packet. Sending a 3091 PING frame causes the peer to respond with an acknowledgment, which 3092 also restarts the idle timeout for the endpoint. 3094 Application protocols that use QUIC SHOULD provide guidance on when 3095 deferring an idle timeout is appropriate. Unnecessary sending of 3096 PING frames could have a detrimental effect on performance. 3098 A connection will time out if no packets are sent or received for a 3099 period longer than the time negotiated using the max_idle_timeout 3100 transport parameter; see Section 10. However, state in middleboxes 3101 might time out earlier than that. Though REQ-5 in [RFC4787] 3102 recommends a 2 minute timeout interval, experience shows that sending 3103 packets every 30 seconds is necessary to prevent the majority of 3104 middleboxes from losing state for UDP flows [GATEWAY]. 3106 10.2. Immediate Close 3108 An endpoint sends a CONNECTION_CLOSE frame (Section 19.19) to 3109 terminate the connection immediately. A CONNECTION_CLOSE frame 3110 causes all streams to immediately become closed; open streams can be 3111 assumed to be implicitly reset. 
3113 After sending a CONNECTION_CLOSE frame, an endpoint immediately 3114 enters the closing state; see Section 10.2.1. After receiving a 3115 CONNECTION_CLOSE frame, endpoints enter the draining state; see 3116 Section 10.2.2. 3118 An immediate close can be used after an application protocol has 3119 arranged to close a connection. This might be after the application 3120 protocol negotiates a graceful shutdown. The application protocol 3121 can exchange messages that are needed for both application endpoints 3122 to agree that the connection can be closed, after which the 3123 application requests that QUIC close the connection. When QUIC 3124 consequently closes the connection, a CONNECTION_CLOSE frame with an 3125 application-supplied error code will be used to signal closure to the 3126 peer. 3128 The closing and draining connection states exist to ensure that 3129 connections close cleanly and that delayed or reordered packets are 3130 properly discarded. These states SHOULD persist for at least three 3131 times the current Probe Timeout (PTO) interval as defined in 3132 [QUIC-RECOVERY]. 3134 Disposing of connection state prior to exiting the closing or 3135 draining state could result in an endpoint generating a 3136 stateless reset unnecessarily when it receives a late-arriving 3137 packet. Endpoints that have some alternative means to ensure that 3138 late-arriving packets do not induce a response, such as those that 3139 are able to close the UDP socket, MAY end these states earlier to 3140 allow for faster resource recovery. Servers that retain an open 3141 socket for accepting new connections SHOULD NOT end the closing or 3142 draining states early. 3144 Once its closing or draining state ends, an endpoint SHOULD discard 3145 all connection state. The endpoint MAY send a stateless reset in 3146 response to any further incoming packets belonging to this 3147 connection. 3149 10.2.1.
Closing Connection State 3151 An endpoint enters the closing state after initiating an immediate 3152 close. 3154 In the closing state, an endpoint retains only enough information to 3155 generate a packet containing a CONNECTION_CLOSE frame and to identify 3156 packets as belonging to the connection. An endpoint in the closing 3157 state sends a packet containing a CONNECTION_CLOSE frame in response 3158 to any incoming packet that it attributes to the connection. 3160 An endpoint SHOULD limit the rate at which it generates packets in 3161 the closing state. For instance, an endpoint could wait for a 3162 progressively increasing number of received packets or amount of time 3163 before responding to received packets. 3165 An endpoint's selected connection ID and the QUIC version are 3166 sufficient information to identify packets for a closing connection; 3167 the endpoint MAY discard all other connection state. An endpoint 3168 that is closing is not required to process any received frame. An 3169 endpoint MAY retain packet protection keys for incoming packets to 3170 allow it to read and process a CONNECTION_CLOSE frame. 3172 An endpoint MAY drop packet protection keys when entering the closing 3173 state and send a packet containing a CONNECTION_CLOSE frame in 3174 response to any UDP datagram that is received. However, an endpoint 3175 that discards packet protection keys cannot identify and discard 3176 invalid packets. To avoid being used for an amplification attack, such 3177 endpoints MUST limit the cumulative size of packets they send to three 3178 times the cumulative size of the packets that are received and 3179 attributed to the connection. To minimize the state that an endpoint 3180 maintains for a closing connection, endpoints MAY send the exact same 3181 packet in response to any received packet.
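The three-times limit for an endpoint that has dropped its packet protection keys, combined with reuse of one identical closing packet, can be sketched as follows. The class and method names are hypothetical, sizes are counted in bytes, and additional rate limiting (which the text above recommends) is omitted.

```python
from typing import Optional


class ClosingStateResponder:
    """Replays one pre-built CONNECTION_CLOSE packet, subject to the
    three-times anti-amplification limit."""

    def __init__(self, close_packet: bytes):
        self.close_packet = close_packet  # the exact packet sent every time
        self.bytes_received = 0
        self.bytes_sent = 0

    def on_datagram(self, size: int) -> Optional[bytes]:
        """Account for a received datagram attributed to the connection.

        Returns the closing packet if sending it keeps the cumulative
        bytes sent within three times the cumulative bytes received;
        otherwise returns None.
        """
        self.bytes_received += size
        budget = 3 * self.bytes_received
        if self.bytes_sent + len(self.close_packet) > budget:
            return None
        self.bytes_sent += len(self.close_packet)
        return self.close_packet
```

Only two counters and the packet itself are kept, which fits the goal of minimizing state for a closing connection.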
3183 Note: Allowing retransmission of a closing packet is an exception to 3184 the requirement that a new packet number be used for each packet 3185 in Section 12.3. Sending new packet numbers is primarily of 3186 advantage to loss recovery and congestion control, which are not 3187 expected to be relevant for a closed connection. Retransmitting 3188 the final packet requires less state. 3190 While in the closing state, an endpoint could receive packets from a 3191 new source address, possibly indicating a connection migration; see 3192 Section 9. An endpoint in the closing state MUST either discard 3193 packets received from an unvalidated address or limit the cumulative 3194 size of packets it sends to an unvalidated address to three times the 3195 size of packets it receives from that address. 3197 An endpoint is not expected to handle key updates when it is closing 3198 (Section 6 of [QUIC-TLS]). A key update might prevent the endpoint 3199 from moving from the closing state to the draining state, as the 3200 endpoint will not be able to process subsequently received packets, 3201 but it otherwise has no impact. 3203 10.2.2. Draining Connection State 3205 The draining state is entered once an endpoint receives a 3206 CONNECTION_CLOSE frame, which indicates that its peer is closing or 3207 draining. While otherwise identical to the closing state, an 3208 endpoint in the draining state MUST NOT send any packets. Retaining 3209 packet protection keys is unnecessary once a connection is in the 3210 draining state. 3212 An endpoint that receives a CONNECTION_CLOSE frame MAY send a single 3213 packet containing a CONNECTION_CLOSE frame before entering the 3214 draining state, using a NO_ERROR code if appropriate. An endpoint 3215 MUST NOT send further packets. Doing so could result in a constant 3216 exchange of CONNECTION_CLOSE frames until one of the endpoints exits 3217 the closing state. 
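The difference between the closing and draining states can be illustrated with a very small state tracker. This is a sketch under stated assumptions: the names State and CloseTracker are hypothetical, packets are represented by strings, and timers, rate limiting, and packet protection are omitted.

```python
from enum import Enum, auto


class State(Enum):
    OPEN = auto()
    CLOSING = auto()
    DRAINING = auto()


class CloseTracker:
    """Hypothetical model of the closing and draining states."""

    def __init__(self):
        self.state = State.OPEN
        self.sent = []  # stand-in for the packets handed to the network

    def close(self):
        """Local immediate close: send CONNECTION_CLOSE and enter closing."""
        if self.state is State.OPEN:
            self.sent.append("CONNECTION_CLOSE")
            self.state = State.CLOSING

    def on_connection_close(self):
        """The peer's CONNECTION_CLOSE arrived: enter draining."""
        if self.state is State.OPEN:
            # Optional single reply before draining; nothing is sent afterwards.
            self.sent.append("CONNECTION_CLOSE")
        self.state = State.DRAINING

    def on_other_packet(self):
        """Any other packet attributed to the connection."""
        if self.state is State.CLOSING:
            self.sent.append("CONNECTION_CLOSE")
        # In the draining state nothing is sent.
```

Once the tracker is draining, no call adds to the sent list, which is the property that prevents an endless exchange of CONNECTION_CLOSE frames.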
3219 An endpoint MAY enter the draining state from the closing state if it 3220 receives a CONNECTION_CLOSE frame, which indicates that the peer is 3221 also closing or draining. In this case, the draining state SHOULD 3222 end when the closing state would have ended. In other words, the 3223 endpoint uses the same end time, but ceases transmission of any 3224 packets on this connection. 3226 10.2.3. Immediate Close During the Handshake 3228 When sending CONNECTION_CLOSE, the goal is to ensure that the peer 3229 will process the frame. Generally, this means sending the frame in a 3230 packet with the highest level of packet protection to avoid the 3231 packet being discarded. After the handshake is confirmed (see 3232 Section 4.1.2 of [QUIC-TLS]), an endpoint MUST send any 3233 CONNECTION_CLOSE frames in a 1-RTT packet. However, prior to 3234 confirming the handshake, it is possible that more advanced packet 3235 protection keys are not available to the peer, so another 3236 CONNECTION_CLOSE frame MAY be sent in a packet that uses a lower 3237 packet protection level. More specifically: 3239 * A client will always know whether the server has Handshake keys 3240 (see Section 17.2.2.1), but it is possible that a server does not 3241 know whether the client has Handshake keys. Under these 3242 circumstances, a server SHOULD send a CONNECTION_CLOSE frame in 3243 both Handshake and Initial packets to ensure that at least one of 3244 them is processable by the client. 3246 * A client that sends CONNECTION_CLOSE in a 0-RTT packet cannot be 3247 assured that the server has accepted 0-RTT. Sending a 3248 CONNECTION_CLOSE frame in an Initial packet makes it more likely 3249 that the server can receive the close signal, even if the 3250 application error code might not be received. 3252 * Prior to confirming the handshake, a peer might be unable to 3253 process 1-RTT packets, so an endpoint SHOULD send CONNECTION_CLOSE 3254 in both Handshake and 1-RTT packets. 
A server SHOULD also send 3255 CONNECTION_CLOSE in an Initial packet. 3257 Sending a CONNECTION_CLOSE of type 0x1d in an Initial or Handshake 3258 packet could expose application state or be used to alter application 3259 state. A CONNECTION_CLOSE of type 0x1d MUST be replaced by a 3260 CONNECTION_CLOSE of type 0x1c when sending the frame in Initial or 3261 Handshake packets. Otherwise, information about the application 3262 state might be revealed. Endpoints MUST clear the value of the 3263 Reason Phrase field and SHOULD use the APPLICATION_ERROR code when 3264 converting to a CONNECTION_CLOSE of type 0x1c. 3266 CONNECTION_CLOSE frames sent in multiple packet types can be 3267 coalesced into a single UDP datagram; see Section 12.2. 3269 An endpoint can send a CONNECTION_CLOSE frame in an Initial packet. 3270 This might be in response to unauthenticated information received in 3271 Initial or Handshake packets. Such an immediate close might expose 3272 legitimate connections to a denial of service. QUIC does not include 3273 defensive measures for on-path attacks during the handshake; see 3274 Section 21.1. However, at the cost of reducing feedback about errors 3275 for legitimate peers, some forms of denial of service can be made 3276 more difficult for an attacker if endpoints discard illegal packets 3277 rather than terminating a connection with CONNECTION_CLOSE. For this 3278 reason, endpoints MAY discard packets rather than immediately close 3279 if errors are detected in packets that lack authentication. 3281 An endpoint that has not established state, such as a server that 3282 detects an error in an Initial packet, does not enter the closing 3283 state. An endpoint that has no state for the connection does not 3284 enter a closing or draining period on sending a CONNECTION_CLOSE 3285 frame. 3287 10.3. Stateless Reset 3289 A stateless reset is provided as an option of last resort for an 3290 endpoint that does not have access to the state of a connection. 
A 3291 crash or outage might result in peers continuing to send data to an 3292 endpoint that is unable to properly continue the connection. An 3293 endpoint MAY send a stateless reset in response to receiving a packet 3294 that it cannot associate with an active connection. 3296 A stateless reset is not appropriate for signaling error conditions. 3297 An endpoint that wishes to communicate a fatal connection error MUST 3298 use a CONNECTION_CLOSE frame if it has sufficient state to do so. 3300 To support this process, a token is sent by endpoints. The token is 3301 carried in the Stateless Reset Token field of a NEW_CONNECTION_ID 3302 frame. Servers can also specify a stateless_reset_token transport 3303 parameter during the handshake that applies to the connection ID that 3304 it selected during the handshake; clients cannot use this transport 3305 parameter because their transport parameters do not have 3306 confidentiality protection. These tokens are protected by 3307 encryption, so only client and server know their value. Tokens are 3308 invalidated when their associated connection ID is retired via a 3309 RETIRE_CONNECTION_ID frame (Section 19.16). 3311 An endpoint that receives packets that it cannot process sends a 3312 packet in the following layout: 3314 Stateless Reset { 3315 Fixed Bits (2) = 1, 3316 Unpredictable Bits (38..), 3317 Stateless Reset Token (128), 3318 } 3320 Figure 10: Stateless Reset Packet 3322 This design ensures that a stateless reset packet is - to the extent 3323 possible - indistinguishable from a regular packet with a short 3324 header. 3326 A stateless reset uses an entire UDP datagram, starting with the 3327 first two bits of the packet header. The remainder of the first byte 3328 and an arbitrary number of bytes following it are set to values that 3329 SHOULD be indistinguishable from random. The last 16 bytes of the 3330 datagram contain a Stateless Reset Token. 
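The layout in Figure 10 can be assembled in a few lines. The following is a sketch, not a normative construction: the function name build_stateless_reset and its parameters are hypothetical, and the caller is assumed to have already chosen a datagram length that respects the size limits on stateless resets.

```python
import os

TOKEN_LEN = 16
# 2 fixed bits + at least 38 unpredictable bits = 5 bytes, plus the token.
MIN_LEN = 5 + TOKEN_LEN


def build_stateless_reset(token: bytes, length: int) -> bytes:
    """Return a datagram of `length` bytes that ends in `token`."""
    if len(token) != TOKEN_LEN:
        raise ValueError("Stateless Reset Token must be 16 bytes")
    if length < MIN_LEN:
        raise ValueError("datagram too short to carry a stateless reset")
    filler = bytearray(os.urandom(length - TOKEN_LEN))
    # Set the two most significant bits of the first byte to 0b01 so the
    # datagram looks like a packet with a short header.
    filler[0] = (filler[0] & 0x3F) | 0x40
    return bytes(filler) + token
```

os.urandom is used here so that the filler bytes are indistinguishable from random; any cryptographically secure source would serve the same purpose.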
3332 To entities other than its intended recipient, a stateless reset will 3333 appear to be a packet with a short header. For the stateless reset 3334 to appear as a valid QUIC packet, the Unpredictable Bits field needs 3335 to include at least 38 bits of data (or 5 bytes, less the two fixed 3336 bits). 3338 The resulting minimum size of 21 bytes does not guarantee that a 3339 stateless reset is difficult to distinguish from other packets if the 3340 recipient requires the use of a connection ID. To achieve that end, 3341 the endpoint SHOULD pad all packets it sends to at least 22 bytes 3342 longer than the minimum connection ID that it might request the peer 3343 to include in packets that the peer sends. This ensures that any 3344 stateless reset sent by the peer is indistinguishable from a valid 3345 packet sent to the endpoint. An endpoint that sends a stateless 3346 reset in response to a packet that is 43 bytes or shorter SHOULD send 3347 a stateless reset that is one byte shorter than the packet it 3348 responds to. 3350 These values assume that the Stateless Reset Token is the same length 3351 as the minimum expansion of the packet protection AEAD. Additional 3352 unpredictable bytes are necessary if the endpoint could have 3353 negotiated a packet protection scheme with a larger minimum 3354 expansion. 3356 An endpoint MUST NOT send a stateless reset that is three times or 3357 more larger than the packet it receives to avoid being used for 3358 amplification. Section 10.3.3 describes additional limits on 3359 stateless reset size. 3361 Endpoints MUST discard packets that are too small to be valid QUIC 3362 packets. With the set of AEAD functions defined in [QUIC-TLS], 3363 packets that are smaller than 21 bytes are never valid. 3365 Endpoints MUST send stateless reset packets formatted as a packet 3366 with a short header. 
However, endpoints MUST treat any packet ending 3367 in a valid stateless reset token as a stateless reset, as other QUIC 3368 versions might allow the use of a long header. 3370 An endpoint MAY send a stateless reset in response to a packet with a 3371 long header. Sending a stateless reset is not effective prior to the 3372 stateless reset token being available to a peer. In this QUIC 3373 version, packets with a long header are only used during connection 3374 establishment. Because the stateless reset token is not available 3375 until connection establishment is complete or near completion, 3376 ignoring an unknown packet with a long header might be as effective 3377 as sending a stateless reset. 3379 An endpoint cannot determine the Source Connection ID from a packet 3380 with a short header, therefore it cannot set the Destination 3381 Connection ID in the stateless reset packet. The Destination 3382 Connection ID will therefore differ from the value used in previous 3383 packets. A random Destination Connection ID makes the connection ID 3384 appear to be the result of moving to a new connection ID that was 3385 provided using a NEW_CONNECTION_ID frame (Section 19.15). 3387 Using a randomized connection ID results in two problems: 3389 * The packet might not reach the peer. If the Destination 3390 Connection ID is critical for routing toward the peer, then this 3391 packet could be incorrectly routed. This might also trigger 3392 another Stateless Reset in response; see Section 10.3.3. A 3393 Stateless Reset that is not correctly routed is an ineffective 3394 error detection and recovery mechanism. In this case, endpoints 3395 will need to rely on other methods - such as timers - to detect 3396 that the connection has failed. 3398 * The randomly generated connection ID can be used by entities other 3399 than the peer to identify this as a potential stateless reset. 
An 3400 endpoint that occasionally uses different connection IDs might 3401 introduce some uncertainty about this. 3403 This stateless reset design is specific to QUIC version 1. An 3404 endpoint that supports multiple versions of QUIC needs to generate a 3405 stateless reset that will be accepted by peers that support any 3406 version that the endpoint might support (or might have supported 3407 prior to losing state). Designers of new versions of QUIC need to be 3408 aware of this and either reuse this design, or use a portion of the 3409 packet other than the last 16 bytes for carrying data. 3411 10.3.1. Detecting a Stateless Reset 3413 An endpoint detects a potential stateless reset using the trailing 16 3414 bytes of the UDP datagram. An endpoint remembers all Stateless Reset 3415 Tokens associated with the connection IDs and remote addresses for 3416 datagrams it has recently sent. This includes Stateless Reset Tokens 3417 from NEW_CONNECTION_ID frames and the server's transport parameters 3418 but excludes Stateless Reset Tokens associated with connection IDs 3419 that are either unused or retired. The endpoint identifies a 3420 received datagram as a stateless reset by comparing the last 16 bytes 3421 of the datagram with all Stateless Reset Tokens associated with the 3422 remote address on which the datagram was received. 3424 This comparison can be performed for every inbound datagram. 3425 Endpoints MAY skip this check if any packet from a datagram is 3426 successfully processed. However, the comparison MUST be performed 3427 when the first packet in an incoming datagram either cannot be 3428 associated with a connection, or cannot be decrypted. 3430 An endpoint MUST NOT check for any Stateless Reset Tokens associated 3431 with connection IDs it has not used or for connection IDs that have 3432 been retired. 
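The detection step can be sketched as a lookup keyed by remote address followed by a comparison of the trailing 16 bytes. The function name is_stateless_reset and the tokens_by_remote mapping are hypothetical; the mapping is assumed to hold only tokens for connection IDs that are in use and not retired. The sketch uses Python's hmac.compare_digest, which compares in constant time, so that the comparison does not reveal how many leading bytes matched.

```python
import hmac

TOKEN_LEN = 16
MIN_PACKET_LEN = 21  # smaller datagrams are never valid QUIC packets


def is_stateless_reset(datagram: bytes, remote_addr, tokens_by_remote) -> bool:
    """Return True if the datagram ends in a token known for remote_addr."""
    if len(datagram) < MIN_PACKET_LEN:
        return False
    trailer = datagram[-TOKEN_LEN:]
    matched = False
    # Check every candidate without an early exit.
    for token in tokens_by_remote.get(remote_addr, ()):
        matched |= hmac.compare_digest(trailer, token)
    return matched
```

An endpoint would typically call such a function only when the first packet in a datagram cannot be associated with a connection or cannot be decrypted.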
3434 When comparing a datagram to Stateless Reset Token values, endpoints 3435 MUST perform the comparison without leaking information about the 3436 value of the token. For example, performing this comparison in 3437 constant time protects the value of individual Stateless Reset Tokens 3438 from information leakage through timing side channels. Another 3439 approach would be to store and compare the transformed values of 3440 Stateless Reset Tokens instead of the raw token values, where the 3441 transformation is defined as a cryptographically-secure pseudo-random 3442 function using a secret key (e.g., block cipher, HMAC [RFC2104]). An 3443 endpoint is not expected to protect information about whether a 3444 packet was successfully decrypted, or the number of valid Stateless 3445 Reset Tokens. 3447 If the last 16 bytes of the datagram are identical in value to a 3448 Stateless Reset Token, the endpoint MUST enter the draining period 3449 and not send any further packets on this connection. 3451 10.3.2. Calculating a Stateless Reset Token 3453 The stateless reset token MUST be difficult to guess. In order to 3454 create a Stateless Reset Token, an endpoint could randomly generate 3455 ([RFC4086]) a secret for every connection that it creates. However, 3456 this presents a coordination problem when there are multiple 3457 instances in a cluster or a storage problem for an endpoint that 3458 might lose state. Stateless reset specifically exists to handle the 3459 case where state is lost, so this approach is suboptimal. 3461 A single static key can be used across all connections to the same 3462 endpoint by generating the proof using a second iteration of a 3463 preimage-resistant function that takes a static key and the 3464 connection ID chosen by the endpoint (see Section 5.1) as input. 
An 3465 endpoint could use HMAC [RFC2104] (for example, HMAC(static_key, 3466 connection_id)) or HKDF [RFC5869] (for example, using the static key 3467 as input keying material, with the connection ID as salt). The 3468 output of this function is truncated to 16 bytes to produce the 3469 Stateless Reset Token for that connection. 3471 An endpoint that loses state can use the same method to generate a 3472 valid Stateless Reset Token. The connection ID comes from the packet 3473 that the endpoint receives. 3475 This design relies on the peer always sending a connection ID in its 3476 packets so that the endpoint can use the connection ID from a packet 3477 to reset the connection. An endpoint that uses this design MUST 3478 either use the same connection ID length for all connections or 3479 encode the length of the connection ID such that it can be recovered 3480 without state. In addition, it cannot provide a zero-length 3481 connection ID. 3483 Revealing the Stateless Reset Token allows any entity to terminate 3484 the connection, so a value can only be used once. This method for 3485 choosing the Stateless Reset Token means that the combination of 3486 connection ID and static key MUST NOT be used for another connection. 3487 A denial of service attack is possible if the same connection ID is 3488 used by instances that share a static key, or if an attacker can 3489 cause a packet to be routed to an instance that has no state but the 3490 same static key; see Section 21.10. A connection ID from a 3491 connection that is reset by revealing the Stateless Reset Token MUST 3492 NOT be reused for new connections at nodes that share a static key. 3494 The same Stateless Reset Token MUST NOT be used for multiple 3495 connection IDs. Endpoints are not required to compare new values 3496 against all previous values, but a duplicate value MAY be treated as 3497 a connection error of type PROTOCOL_VIOLATION. 
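The HMAC-based derivation described above can be sketched as follows. The choice of SHA-256 as the hash is an assumption made for illustration (the text only names HMAC and HKDF), and the function name stateless_reset_token is hypothetical.

```python
import hashlib
import hmac


def stateless_reset_token(static_key: bytes, connection_id: bytes) -> bytes:
    """Derive a 16-byte token as HMAC(static_key, connection_id), truncated."""
    if not connection_id:
        # This scheme needs a connection ID in every packet from the peer.
        raise ValueError("a zero-length connection ID cannot be used here")
    digest = hmac.new(static_key, connection_id, hashlib.sha256).digest()
    return digest[:16]
```

Because the output depends only on the static key and the connection ID, an instance that has lost all connection state can recompute the same token from the connection ID in a received packet.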
3499 Note that Stateless Reset packets do not have any cryptographic 3500 protection. 3502 10.3.3. Looping 3504 The design of a Stateless Reset is such that without knowing the 3505 stateless reset token it is indistinguishable from a valid packet. 3506 For instance, if a server sends a Stateless Reset to another server 3507 it might receive another Stateless Reset in response, which could 3508 lead to an infinite exchange. 3510 An endpoint MUST ensure that every Stateless Reset that it sends is 3511 smaller than the packet that triggered it, unless it maintains state 3512 sufficient to prevent looping. In the event of a loop, this results 3513 in packets eventually being too small to trigger a response. 3515 An endpoint can remember the number of Stateless Reset packets that 3516 it has sent and stop generating new Stateless Reset packets once a 3517 limit is reached. Using separate limits for different remote 3518 addresses will ensure that Stateless Reset packets can be used to 3519 close connections when other peers or connections have exhausted 3520 limits. 3522 Reducing the size of a Stateless Reset below 41 bytes means that the 3523 packet could reveal to an observer that it is a Stateless Reset, 3524 depending upon the length of the peer's connection IDs. Conversely, 3525 refusing to send a Stateless Reset in response to a small packet 3526 might result in Stateless Reset not being useful in detecting cases 3527 of broken connections where only very small packets are sent; such 3528 failures might only be detected by other means, such as timers. 3530 11. Error Handling 3532 An endpoint that detects an error SHOULD signal the existence of that 3533 error to its peer. Both transport-level and application-level errors 3534 can affect an entire connection; see Section 11.1. Only application- 3535 level errors can be isolated to a single stream; see Section 11.2. 
3537 The most appropriate error code (Section 20) SHOULD be included in 3538 the frame that signals the error. Where this specification 3539 identifies error conditions, it also identifies the error code that 3540 is used; though these are worded as requirements, different 3541 implementation strategies might lead to different errors being 3542 reported. In particular, an endpoint MAY use any applicable error 3543 code when it detects an error condition; a generic error code (such 3544 as PROTOCOL_VIOLATION or INTERNAL_ERROR) can always be used in place 3545 of specific error codes. 3547 A stateless reset (Section 10.3) is not suitable for any error that 3548 can be signaled with a CONNECTION_CLOSE or RESET_STREAM frame. A 3549 stateless reset MUST NOT be used by an endpoint that has the state 3550 necessary to send a frame on the connection. 3552 11.1. Connection Errors 3554 Errors that result in the connection being unusable, such as an 3555 obvious violation of protocol semantics or corruption of state that 3556 affects an entire connection, MUST be signaled using a 3557 CONNECTION_CLOSE frame (Section 19.19). An endpoint MAY close the 3558 connection in this manner even if the error only affects a single 3559 stream. 3561 Application-specific protocol errors are signaled using the 3562 CONNECTION_CLOSE frame with a frame type of 0x1d. Errors that are 3563 specific to the transport, including all those described in this 3564 document, are carried in the CONNECTION_CLOSE frame with a frame type 3565 of 0x1c. 3567 A CONNECTION_CLOSE frame could be sent in a packet that is lost. An 3568 endpoint SHOULD be prepared to retransmit a packet containing a 3569 CONNECTION_CLOSE frame if it receives more packets on a terminated 3570 connection. Limiting the number of retransmissions and the time over 3571 which this final packet is sent limits the effort expended on 3572 terminated connections. 
3574 An endpoint that chooses not to retransmit packets containing a 3575 CONNECTION_CLOSE frame risks a peer missing the first such packet. 3576 The only mechanism available to an endpoint that continues to receive 3577 data for a terminated connection is to use the stateless reset 3578 process (Section 10.3). 3580 11.2. Stream Errors 3582 If an application-level error affects a single stream, but otherwise 3583 leaves the connection in a recoverable state, the endpoint can send a 3584 RESET_STREAM frame (Section 19.4) with an appropriate error code to 3585 terminate just the affected stream. 3587 Resetting a stream without the involvement of the application 3588 protocol could cause the application protocol to enter an 3589 unrecoverable state. RESET_STREAM MUST only be instigated by the 3590 application protocol that uses QUIC. 3592 The semantics of the application error code carried in RESET_STREAM 3593 are defined by the application protocol. Only the application 3594 protocol is able to cause a stream to be terminated. A local 3595 instance of the application protocol uses a direct API call and a 3596 remote instance uses the STOP_SENDING frame, which triggers an 3597 automatic RESET_STREAM. 3599 Application protocols SHOULD define rules for handling streams that 3600 are prematurely cancelled by either endpoint. 3602 12. Packets and Frames 3604 QUIC endpoints communicate by exchanging packets. Packets have 3605 confidentiality and integrity protection; see Section 12.1. Packets 3606 are carried in UDP datagrams; see Section 12.2. 3608 This version of QUIC uses the long packet header during connection 3609 establishment; see Section 17.2. Packets with the long header are 3610 Initial (Section 17.2.2), 0-RTT (Section 17.2.3), Handshake 3611 (Section 17.2.4), and Retry (Section 17.2.5). Version negotiation 3612 uses a version-independent packet with a long header; see 3613 Section 17.2.1. 
3615 Packets with the short header are designed for minimal overhead and 3616 are used after a connection is established and 1-RTT keys are 3617 available; see Section 17.3. 3619 12.1. Protected Packets 3621 QUIC packets have different levels of cryptographic protection based 3622 on the type of packet. Details of packet protection are found in 3623 [QUIC-TLS]; this section includes an overview of the protections that 3624 are provided. 3626 Version Negotiation packets have no cryptographic protection; see 3627 [QUIC-INVARIANTS]. 3629 Retry packets use an authenticated encryption with associated data 3630 function (AEAD; [AEAD]) to protect against accidental modification. 3632 Initial packets use an AEAD, the keys for which are derived using a 3633 value that is visible on the wire. Initial packets therefore do not 3634 have effective confidentiality protection. Initial protection exists 3635 to ensure that the sender of the packet is on the network path. Any 3636 entity that receives an Initial packet from a client can recover the 3637 keys that will allow them to both read the contents of the packet and 3638 generate Initial packets that will be successfully authenticated at 3639 either endpoint. 3641 All other packets are protected with keys derived from the 3642 cryptographic handshake. The cryptographic handshake ensures that 3643 only the communicating endpoints receive the corresponding keys for 3644 Handshake, 0-RTT, and 1-RTT packets. Packets protected with 0-RTT 3645 and 1-RTT keys have strong confidentiality and integrity protection. 3647 The Packet Number field that appears in some packet types has 3648 alternative confidentiality protection that is applied as part of 3649 header protection; see Section 5.4 of [QUIC-TLS] for details. The 3650 underlying packet number increases with each packet sent in a given 3651 packet number space; see Section 12.3 for details. 3653 12.2. 
Coalescing Packets 3655 Initial (Section 17.2.2), 0-RTT (Section 17.2.3), and Handshake 3656 (Section 17.2.4) packets contain a Length field that determines the 3657 end of the packet. The length includes both the Packet Number and 3658 Payload fields, both of which are confidentiality protected and 3659 initially of unknown length. The length of the Payload field is 3660 learned once header protection is removed. 3662 Using the Length field, a sender can coalesce multiple QUIC packets 3663 into one UDP datagram. This can reduce the number of UDP datagrams 3664 needed to complete the cryptographic handshake and start sending 3665 data. This can also be used to construct PMTU probes; see 3666 Section 14.4.1. Receivers MUST be able to process coalesced packets. 3668 Coalescing packets in order of increasing encryption levels (Initial, 3669 0-RTT, Handshake, 1-RTT; see Section 4.1.4 of [QUIC-TLS]) makes it 3670 more likely the receiver will be able to process all the packets in a 3671 single pass. A packet with a short header does not include a length, 3672 so it can only be the last packet included in a UDP datagram. An 3673 endpoint SHOULD include multiple frames in a single packet if they 3674 are to be sent at the same encryption level, instead of coalescing 3675 multiple packets at the same encryption level. 3677 Receivers MAY route based on the information in the first packet 3678 contained in a UDP datagram. Senders MUST NOT coalesce QUIC packets 3679 with different connection IDs into a single UDP datagram. Receivers 3680 SHOULD ignore any subsequent packets with a different Destination 3681 Connection ID than the first packet in the datagram. 3683 Every QUIC packet that is coalesced into a single UDP datagram is 3684 separate and complete. The receiver of coalesced QUIC packets MUST 3685 individually process each QUIC packet and separately acknowledge 3686 them, as if they were received as the payload of different UDP 3687 datagrams. 
For example, if decryption fails (because the keys are 3688 not available or any other reason), the receiver MAY either discard 3689 or buffer the packet for later processing and MUST attempt to process 3690 the remaining packets. 3692 Retry packets (Section 17.2.5), Version Negotiation packets 3693 (Section 17.2.1), and packets with a short header (Section 17.3) do 3694 not contain a Length field and so cannot be followed by other packets 3695 in the same UDP datagram. Note also that there is no situation where 3696 a Retry or Version Negotiation packet is coalesced with another 3697 packet. 3699 12.3. Packet Numbers 3701 The packet number is an integer in the range 0 to 2^62-1. This 3702 number is used in determining the cryptographic nonce for packet 3703 protection. Each endpoint maintains a separate packet number for 3704 sending and receiving. 3706 Packet numbers are limited to this range because they need to be 3707 representable in whole in the Largest Acknowledged field of an ACK 3708 frame (Section 19.3). When present in a long or short header 3709 however, packet numbers are reduced and encoded in 1 to 4 bytes; see 3710 Section 17.1. 3712 Version Negotiation (Section 17.2.1) and Retry (Section 17.2.5) 3713 packets do not include a packet number. 3715 Packet numbers are divided into 3 spaces in QUIC: 3717 * Initial space: All Initial packets (Section 17.2.2) are in this 3718 space. 3720 * Handshake space: All Handshake packets (Section 17.2.4) are in 3721 this space. 3723 * Application data space: All 0-RTT (Section 17.2.3) and 1-RTT 3724 (Section 17.3) encrypted packets are in this space. 3726 As described in [QUIC-TLS], each packet type uses different 3727 protection keys. 3729 Conceptually, a packet number space is the context in which a packet 3730 can be processed and acknowledged. Initial packets can only be sent 3731 with Initial packet protection keys and acknowledged in packets that 3732 are also Initial packets. 
Similarly, Handshake packets are sent at 3733 the Handshake encryption level and can only be acknowledged in 3734 Handshake packets. 3736 This enforces cryptographic separation between the data sent in the 3737 different packet number spaces. Packet numbers in each space start 3738 at packet number 0. Subsequent packets sent in the same packet 3739 number space MUST increase the packet number by at least one. 3741 0-RTT and 1-RTT data exist in the same packet number space to make 3742 loss recovery algorithms easier to implement between the two packet 3743 types. 3745 A QUIC endpoint MUST NOT reuse a packet number within the same packet 3746 number space in one connection. If the packet number for sending 3747 reaches 2^62 - 1, the sender MUST close the connection without 3748 sending a CONNECTION_CLOSE frame or any further packets; an endpoint 3749 MAY send a Stateless Reset (Section 10.3) in response to further 3750 packets that it receives. 3752 A receiver MUST discard a newly unprotected packet unless it is 3753 certain that it has not processed another packet with the same packet 3754 number from the same packet number space. Duplicate suppression MUST 3755 happen after removing packet protection for the reasons described in 3756 Section 9.3 of [QUIC-TLS]. 3758 Endpoints that track all individual packets for the purposes of 3759 detecting duplicates are at risk of accumulating excessive state. 3760 The data required for detecting duplicates can be limited by 3761 maintaining a minimum packet number below which all packets are 3762 immediately dropped. Any minimum needs to account for large 3763 variations in round trip time, which includes the possibility that a 3764 peer might probe network paths with much larger round trip times; see 3765 Section 9. 3767 Packet number encoding at a sender and decoding at a receiver are 3768 described in Section 17.1. 3770 12.4. 
Frames and Frame Types 3772 The payload of QUIC packets, after removing packet protection, 3773 consists of a sequence of complete frames, as shown in Figure 11. 3774 Version Negotiation, Stateless Reset, and Retry packets do not 3775 contain frames. 3777 Packet Payload { 3778 Frame (8..) ..., 3779 } 3781 Figure 11: QUIC Payload 3783 The payload of a packet that contains frames MUST contain at least 3784 one frame, and MAY contain multiple frames and multiple frame types. 3785 Frames always fit within a single QUIC packet and cannot span 3786 multiple packets. 3788 Each frame begins with a Frame Type, indicating its type, followed by 3789 additional type-dependent fields: 3791 Frame { 3792 Frame Type (i), 3793 Type-Dependent Fields (..), 3794 } 3796 Figure 12: Generic Frame Layout 3798 The frame types defined in this specification are listed in Table 3. 3799 The Frame Type in ACK, STREAM, MAX_STREAMS, STREAMS_BLOCKED, and 3800 CONNECTION_CLOSE frames is used to carry other frame-specific flags. 3801 For all other frames, the Frame Type field simply identifies the 3802 frame. These frames are explained in more detail in Section 19. 
3804 +=============+======================+===============+======+======+
3805 | Type Value  | Frame Type Name      | Definition    | Pkts | Spec |
3806 +=============+======================+===============+======+======+
3807 | 0x00        | PADDING              | Section 19.1  | IH01 | NP   |
3808 +-------------+----------------------+---------------+------+------+
3809 | 0x01        | PING                 | Section 19.2  | IH01 |      |
3810 +-------------+----------------------+---------------+------+------+
3811 | 0x02 - 0x03 | ACK                  | Section 19.3  | IH_1 | NC   |
3812 +-------------+----------------------+---------------+------+------+
3813 | 0x04        | RESET_STREAM         | Section 19.4  | __01 |      |
3814 +-------------+----------------------+---------------+------+------+
3815 | 0x05        | STOP_SENDING         | Section 19.5  | __01 |      |
3816 +-------------+----------------------+---------------+------+------+
3817 | 0x06        | CRYPTO               | Section 19.6  | IH_1 |      |
3818 +-------------+----------------------+---------------+------+------+
3819 | 0x07        | NEW_TOKEN            | Section 19.7  | ___1 |      |
3820 +-------------+----------------------+---------------+------+------+
3821 | 0x08 - 0x0f | STREAM               | Section 19.8  | __01 | F    |
3822 +-------------+----------------------+---------------+------+------+
3823 | 0x10        | MAX_DATA             | Section 19.9  | __01 |      |
3824 +-------------+----------------------+---------------+------+------+
3825 | 0x11        | MAX_STREAM_DATA      | Section 19.10 | __01 |      |
3826 +-------------+----------------------+---------------+------+------+
3827 | 0x12 - 0x13 | MAX_STREAMS          | Section 19.11 | __01 |      |
3828 +-------------+----------------------+---------------+------+------+
3829 | 0x14        | DATA_BLOCKED         | Section 19.12 | __01 |      |
3830 +-------------+----------------------+---------------+------+------+
3831 | 0x15        | STREAM_DATA_BLOCKED  | Section 19.13 | __01 |      |
3832 +-------------+----------------------+---------------+------+------+
3833 | 0x16 - 0x17 | STREAMS_BLOCKED      | Section 19.14 | __01 |      |
3834 +-------------+----------------------+---------------+------+------+
3835 | 0x18        | NEW_CONNECTION_ID    | Section 19.15 | __01 | P    |
3836 +-------------+----------------------+---------------+------+------+
3837 | 0x19        | RETIRE_CONNECTION_ID | Section 19.16 | __01 |      |
3838 +-------------+----------------------+---------------+------+------+
3839 | 0x1a        | PATH_CHALLENGE       | Section 19.17 | __01 | P    |
3840 +-------------+----------------------+---------------+------+------+
3841 | 0x1b        | PATH_RESPONSE        | Section 19.18 | __01 | P    |
3842 +-------------+----------------------+---------------+------+------+
3843 | 0x1c - 0x1d | CONNECTION_CLOSE     | Section 19.19 | ih01 |      |
3844 +-------------+----------------------+---------------+------+------+
3845 | 0x1e        | HANDSHAKE_DONE       | Section 19.20 | ___1 |      |
3846 +-------------+----------------------+---------------+------+------+
3848 Table 3: Frame Types
3850 The "Pkts" column in Table 3 lists the types of packets that each 3851 frame type could appear in, indicated by the following characters: 3853 I: Initial (Section 17.2.2) 3855 H: Handshake (Section 17.2.4) 3857 0: 0-RTT (Section 17.2.3) 3859 1: 1-RTT (Section 17.3) 3861 ih: Only a CONNECTION_CLOSE frame of type 0x1c can appear in Initial 3862 or Handshake packets. 3864 Section 4 of [QUIC-TLS] provides more detail about these 3865 restrictions. Note that all frames can appear in 1-RTT packets. An 3866 endpoint MUST treat receipt of a frame in a packet type that is not 3867 permitted as a connection error of type PROTOCOL_VIOLATION. 3869 The "Spec" column in Table 3 summarizes any special rules governing 3870 the processing or generation of the frame type, as indicated by the 3871 following characters: 3873 N: Packets containing only frames with this marking are not ack- 3874 eliciting; see Section 13.2. 3876 C: Packets containing only frames with this marking do not count 3877 toward bytes in flight for congestion control purposes; see 3878 [QUIC-RECOVERY].

3880 P: Packets containing only frames with this marking can be used to
3881    probe new network paths during connection migration; see
3882    Section 9.1.

3884 F: The content of frames with this marking is flow controlled; see
3885    Section 4.

3887 The "Pkts" and "Spec" columns in Table 3 do not form part of the IANA
3888 registry; see Section 22.3.

3890 An endpoint MUST treat the receipt of a frame of unknown type as a
3891 connection error of type FRAME_ENCODING_ERROR.

3893 All QUIC frames are idempotent in this version of QUIC. That is, a
3894 valid frame does not cause undesirable side effects or errors when
3895 received more than once.

3897 The Frame Type field uses a variable-length integer encoding (see
3898 Section 16) with one exception. To ensure simple and efficient
3899 implementations of frame parsing, a frame type MUST use the shortest
3900 possible encoding. For frame types defined in this document, this
3901 means a single-byte encoding, even though it is possible to encode
3902 these values as a two-, four-, or eight-byte variable-length integer.
3903 For instance, though 0x4001 is a legitimate two-byte encoding for a
3904 variable-length integer with a value of 1, PING frames are always
3905 encoded as a single byte with the value 0x01. This rule applies to
3906 all current and future QUIC frame types. An endpoint MAY treat the
3907 receipt of a frame type that uses a longer encoding than necessary as
3908 a connection error of type PROTOCOL_VIOLATION.

3910 13. Packetization and Reliability

3912 A sender sends one or more frames in a QUIC packet; see Section 12.4.

3914 A sender can minimize per-packet bandwidth and computational costs by
3915 including as many frames as possible in each QUIC packet. A sender
3916 MAY wait for a short period of time to collect multiple frames before
3917 sending a packet that is not maximally packed, to avoid sending out
3918 large numbers of small packets.
An implementation MAY use knowledge 3919 about application sending behavior or heuristics to determine whether 3920 and for how long to wait. This waiting period is an implementation 3921 decision, and an implementation should be careful to delay 3922 conservatively, since any delay is likely to increase application- 3923 visible latency. 3925 Stream multiplexing is achieved by interleaving STREAM frames from 3926 multiple streams into one or more QUIC packets. A single QUIC packet 3927 can include multiple STREAM frames from one or more streams. 3929 One of the benefits of QUIC is avoidance of head-of-line blocking 3930 across multiple streams. When a packet loss occurs, only streams 3931 with data in that packet are blocked waiting for a retransmission to 3932 be received, while other streams can continue making progress. Note 3933 that when data from multiple streams is included in a single QUIC 3934 packet, loss of that packet blocks all those streams from making 3935 progress. Implementations are advised to include as few streams as 3936 necessary in outgoing packets without losing transmission efficiency 3937 to underfilled packets. 3939 13.1. Packet Processing 3941 A packet MUST NOT be acknowledged until packet protection has been 3942 successfully removed and all frames contained in the packet have been 3943 processed. For STREAM frames, this means the data has been enqueued 3944 in preparation to be received by the application protocol, but it 3945 does not require that data is delivered and consumed. 3947 Once the packet has been fully processed, a receiver acknowledges 3948 receipt by sending one or more ACK frames containing the packet 3949 number of the received packet. 3951 An endpoint SHOULD treat receipt of an acknowledgment for a packet it 3952 did not send as a connection error of type PROTOCOL_VIOLATION, if it 3953 is able to detect the condition. 3955 13.2. 
Generating Acknowledgements 3957 Endpoints acknowledge all packets they receive and process. However, 3958 only ack-eliciting packets cause an ACK frame to be sent within the 3959 maximum ack delay. Packets that are not ack-eliciting are only 3960 acknowledged when an ACK frame is sent for other reasons. 3962 When sending a packet for any reason, an endpoint SHOULD attempt to 3963 include an ACK frame if one has not been sent recently. Doing so 3964 helps with timely loss detection at the peer. 3966 In general, frequent feedback from a receiver improves loss and 3967 congestion response, but this has to be balanced against excessive 3968 load generated by a receiver that sends an ACK frame in response to 3969 every ack-eliciting packet. The guidance offered below seeks to 3970 strike this balance. 3972 13.2.1. Sending ACK Frames 3974 Every packet SHOULD be acknowledged at least once, and ack-eliciting 3975 packets MUST be acknowledged at least once within the maximum delay 3976 an endpoint communicated using the max_ack_delay transport parameter; 3977 see Section 18.2. max_ack_delay declares an explicit contract: an 3978 endpoint promises to never intentionally delay acknowledgments of an 3979 ack-eliciting packet by more than the indicated value. If it does, 3980 any excess accrues to the RTT estimate and could result in spurious 3981 or delayed retransmissions from the peer. A sender uses the 3982 receiver's max_ack_delay value in determining timeouts for timer- 3983 based retransmission, as detailed in Section 6.2 of [QUIC-RECOVERY]. 3985 An endpoint MUST acknowledge all ack-eliciting Initial and Handshake 3986 packets immediately and all ack-eliciting 0-RTT and 1-RTT packets 3987 within its advertised max_ack_delay, with the following exception. 3988 Prior to handshake confirmation, an endpoint might not have packet 3989 protection keys for decrypting Handshake, 0-RTT, or 1-RTT packets 3990 when they are received. 
It might therefore buffer them and 3991 acknowledge them when the requisite keys become available. 3993 Since packets containing only ACK frames are not congestion 3994 controlled, an endpoint MUST NOT send more than one such packet in 3995 response to receiving an ack-eliciting packet. 3997 An endpoint MUST NOT send a non-ack-eliciting packet in response to a 3998 non-ack-eliciting packet, even if there are packet gaps that precede 3999 the received packet. This avoids an infinite feedback loop of 4000 acknowledgements, which could prevent the connection from ever 4001 becoming idle. Non-ack-eliciting packets are eventually acknowledged 4002 when the endpoint sends an ACK frame in response to other events. 4004 In order to assist loss detection at the sender, an endpoint SHOULD 4005 generate and send an ACK frame without delay when it receives an ack- 4006 eliciting packet either: 4008 * when the received packet has a packet number less than another 4009 ack-eliciting packet that has been received, or 4011 * when the packet has a packet number larger than the highest- 4012 numbered ack-eliciting packet that has been received and there are 4013 missing packets between that packet and this packet. 4015 Similarly, packets marked with the ECN Congestion Experienced (CE) 4016 codepoint in the IP header SHOULD be acknowledged immediately, to 4017 reduce the peer's response time to congestion events. 4019 The algorithms in [QUIC-RECOVERY] are expected to be resilient to 4020 receivers that do not follow the guidance offered above. However, an 4021 implementation should only deviate from these requirements after 4022 careful consideration of the performance implications of a change, 4023 for connections made by the endpoint and for other users of the 4024 network. 4026 An endpoint that is only sending ACK frames will not receive 4027 acknowledgments from its peer unless those acknowledgements are 4028 included in packets with ack-eliciting frames. 
An endpoint SHOULD 4029 send an ACK frame with other frames when there are new ack-eliciting 4030 packets to acknowledge. When only non-ack-eliciting packets need to 4031 be acknowledged, an endpoint MAY wait until an ack-eliciting packet 4032 has been received to include an ACK frame with outgoing frames. 4034 A receiver MUST NOT send an ack-eliciting frame in all packets that 4035 would otherwise be non-ack-eliciting, to avoid an infinite feedback 4036 loop of acknowledgements. 4038 13.2.2. Acknowledgement Frequency 4040 A receiver determines how frequently to send acknowledgements in 4041 response to ack-eliciting packets. This determination involves a 4042 trade-off. 4044 Endpoints rely on timely acknowledgment to detect loss; see Section 6 4045 of [QUIC-RECOVERY]. Window-based congestion controllers, such as the 4046 one in Section 7 of [QUIC-RECOVERY], rely on acknowledgments to 4047 manage their congestion window. In both cases, delaying 4048 acknowledgments can adversely affect performance. 4050 On the other hand, reducing the frequency of packets that carry only 4051 acknowledgements reduces packet transmission and processing cost at 4052 both endpoints. It can improve connection throughput on severely 4053 asymmetric links and reduce the volume of acknowledgment traffic 4054 using return path capacity; see Section 3 of [RFC3449]. 4056 A receiver SHOULD send an ACK frame after receiving at least two ack- 4057 eliciting packets. This recommendation is general in nature and 4058 consistent with recommendations for TCP endpoint behavior [RFC5681]. 4059 Knowledge of network conditions, knowledge of the peer's congestion 4060 controller, or further research and experimentation might suggest 4061 alternative acknowledgment strategies with better performance 4062 characteristics. 4064 A receiver MAY process multiple available packets before determining 4065 whether to send an ACK frame in response. 4067 13.2.3. 
Managing ACK Ranges 4069 When an ACK frame is sent, one or more ranges of acknowledged packets 4070 are included. Including acknowledgements for older packets reduces 4071 the chance of spurious retransmissions caused by losing previously 4072 sent ACK frames, at the cost of larger ACK frames. 4074 ACK frames SHOULD always acknowledge the most recently received 4075 packets, and the more out-of-order the packets are, the more 4076 important it is to send an updated ACK frame quickly, to prevent the 4077 peer from declaring a packet as lost and spuriously retransmitting 4078 the frames it contains. An ACK frame is expected to fit within a 4079 single QUIC packet. If it does not, then older ranges (those with 4080 the smallest packet numbers) are omitted. 4082 A receiver limits the number of ACK Ranges (Section 19.3.1) it 4083 remembers and sends in ACK frames, both to limit the size of ACK 4084 frames and to avoid resource exhaustion. After receiving 4085 acknowledgments for an ACK frame, the receiver SHOULD stop tracking 4086 those acknowledged ACK Ranges. Senders can expect acknowledgements 4087 for most packets, but QUIC does not guarantee receipt of an 4088 acknowledgment for every packet that the receiver processes. 4090 It is possible that retaining many ACK Ranges could cause an ACK 4091 frame to become too large. A receiver can discard unacknowledged ACK 4092 Ranges to limit ACK frame size, at the cost of increased 4093 retransmissions from the sender. This is necessary if an ACK frame 4094 would be too large to fit in a packet. Receivers MAY also limit ACK 4095 frame size further to preserve space for other frames or to limit the 4096 capacity that acknowledgments consume. 4098 A receiver MUST retain an ACK Range unless it can ensure that it will 4099 not subsequently accept packets with numbers in that range. 4100 Maintaining a minimum packet number that increases as ranges are 4101 discarded is one way to achieve this with minimal state. 
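
The minimum-packet-number approach mentioned above can be sketched as follows. This is an illustrative sketch only, not a prescribed algorithm: the class name AckRangeTracker, the max_ranges limit, and the choice to drop the oldest range first are assumptions made for this example. Whenever a range is discarded, the floor min_pn is raised past it, so a packet numbered inside a discarded range is never accepted again.

```python
class AckRangeTracker:
    """Tracks received packet numbers as sorted, disjoint inclusive ranges.

    Hypothetical helper: max_ranges is an assumed local limit.
    """

    def __init__(self, max_ranges=4):
        self.max_ranges = max_ranges
        self.ranges = []   # sorted list of [low, high], both inclusive
        self.min_pn = 0    # packets numbered below this are never accepted

    def on_packet(self, pn):
        """Return True if pn is accepted, False if duplicate or below the floor."""
        if pn < self.min_pn:
            return False
        if any(low <= pn <= high for low, high in self.ranges):
            return False

        # Insert the new packet number and merge adjacent ranges.
        self.ranges.append([pn, pn])
        self.ranges.sort()
        merged = [self.ranges[0]]
        for low, high in self.ranges[1:]:
            if low <= merged[-1][1] + 1:
                merged[-1][1] = max(merged[-1][1], high)
            else:
                merged.append([low, high])
        self.ranges = merged

        # Discard the oldest ranges and raise the floor past them.
        while len(self.ranges) > self.max_ranges:
            _, high = self.ranges.pop(0)
            self.min_pn = max(self.min_pn, high + 1)
        return True
```

The only state kept beyond the retained ranges is a single integer, which is what makes this approach cheap. The cost is that a late packet arriving below the floor is dropped even if it was never received before.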
4103 Receivers can discard all ACK Ranges, but they MUST retain the 4104 largest packet number that has been successfully processed as that is 4105 used to recover packet numbers from subsequent packets; see 4106 Section 17.1. 4108 A receiver SHOULD include an ACK Range containing the largest 4109 received packet number in every ACK frame. The Largest Acknowledged 4110 field is used in ECN validation at a sender and including a lower 4111 value than what was included in a previous ACK frame could cause ECN 4112 to be unnecessarily disabled; see Section 13.4.2. 4114 Section 13.2.4 describes an exemplary approach for determining what 4115 packets to acknowledge in each ACK frame. Though the goal of this 4116 algorithm is to generate an acknowledgment for every packet that is 4117 processed, it is still possible for acknowledgments to be lost. 4119 13.2.4. Limiting Ranges by Tracking ACK Frames 4121 When a packet containing an ACK frame is sent, the largest 4122 acknowledged in that frame may be saved. When a packet containing an 4123 ACK frame is acknowledged, the receiver can stop acknowledging 4124 packets less than or equal to the largest acknowledged in the sent 4125 ACK frame. 4127 A receiver that sends only non-ack-eliciting packets, such as ACK 4128 frames, might not receive an acknowledgement for a long period of 4129 time. This could cause the receiver to maintain state for a large 4130 number of ACK frames for a long period of time, and ACK frames it 4131 sends could be unnecessarily large. In such a case, a receiver could 4132 send a PING or other small ack-eliciting frame occasionally, such as 4133 once per round trip, to elicit an ACK from the peer. 4135 In cases without ACK frame loss, this algorithm allows for a minimum 4136 of 1 RTT of reordering. In cases with ACK frame loss and reordering, 4137 this approach does not guarantee that every acknowledgement is seen 4138 by the sender before it is no longer included in the ACK frame. 
4139 Packets could be received out of order and all subsequent ACK frames 4140 containing them could be lost. In this case, the loss recovery 4141 algorithm could cause spurious retransmissions, but the sender will 4142 continue making forward progress. 4144 13.2.5. Measuring and Reporting Host Delay 4146 An endpoint measures the delays intentionally introduced between the 4147 time the packet with the largest packet number is received and the 4148 time an acknowledgment is sent. The endpoint encodes this 4149 acknowledgement delay in the ACK Delay field of an ACK frame; see 4150 Section 19.3. This allows the receiver of the ACK frame to adjust 4151 for any intentional delays, which is important for getting a better 4152 estimate of the path RTT when acknowledgments are delayed. 4154 A packet might be held in the OS kernel or elsewhere on the host 4155 before being processed. An endpoint MUST NOT include delays that it 4156 does not control when populating the ACK Delay field in an ACK frame. 4157 However, endpoints SHOULD include buffering delays caused by 4158 unavailability of decryption keys, since these delays can be large 4159 and are likely to be non-repeating. 4161 When the measured acknowledgement delay is larger than its 4162 max_ack_delay, an endpoint SHOULD report the measured delay. This 4163 information is especially useful during the handshake when delays 4164 might be large; see Section 13.2.1. 4166 13.2.6. ACK Frames and Packet Protection 4168 ACK frames MUST only be carried in a packet that has the same packet 4169 number space as the packet being acknowledged; see Section 12.1. For 4170 instance, packets that are protected with 1-RTT keys MUST be 4171 acknowledged in packets that are also protected with 1-RTT keys. 4173 Packets that a client sends with 0-RTT packet protection MUST be 4174 acknowledged by the server in packets protected by 1-RTT keys. 
This 4175 can mean that the client is unable to use these acknowledgments if 4176 the server cryptographic handshake messages are delayed or lost. 4177 Note that the same limitation applies to other data sent by the 4178 server protected by the 1-RTT keys. 4180 13.2.7. PADDING Frames Consume Congestion Window 4182 Packets containing PADDING frames are considered to be in flight for 4183 congestion control purposes [QUIC-RECOVERY]. Packets containing only 4184 PADDING frames therefore consume congestion window but do not 4185 generate acknowledgments that will open the congestion window. To 4186 avoid a deadlock, a sender SHOULD ensure that other frames are sent 4187 periodically in addition to PADDING frames to elicit acknowledgments 4188 from the receiver. 4190 13.3. Retransmission of Information 4192 QUIC packets that are determined to be lost are not retransmitted 4193 whole. The same applies to the frames that are contained within lost 4194 packets. Instead, the information that might be carried in frames is 4195 sent again in new frames as needed. 4197 New frames and packets are used to carry information that is 4198 determined to have been lost. In general, information is sent again 4199 when a packet containing that information is determined to be lost 4200 and sending ceases when a packet containing that information is 4201 acknowledged. 4203 * Data sent in CRYPTO frames is retransmitted according to the rules 4204 in [QUIC-RECOVERY], until all data has been acknowledged. Data in 4205 CRYPTO frames for Initial and Handshake packets is discarded when 4206 keys for the corresponding packet number space are discarded. 4208 * Application data sent in STREAM frames is retransmitted in new 4209 STREAM frames unless the endpoint has sent a RESET_STREAM for that 4210 stream. Once an endpoint sends a RESET_STREAM frame, no further 4211 STREAM frames are needed. 
4213 * ACK frames carry the most recent set of acknowledgements and the 4214 acknowledgement delay from the largest acknowledged packet, as 4215 described in Section 13.2.1. Delaying the transmission of packets 4216 containing ACK frames or resending old ACK frames can cause the 4217 peer to generate an inflated RTT sample or unnecessarily disable 4218 ECN. 4220 * Cancellation of stream transmission, as carried in a RESET_STREAM 4221 frame, is sent until acknowledged or until all stream data is 4222 acknowledged by the peer (that is, either the "Reset Recvd" or 4223 "Data Recvd" state is reached on the sending part of the stream). 4224 The content of a RESET_STREAM frame MUST NOT change when it is 4225 sent again. 4227 * Similarly, a request to cancel stream transmission, as encoded in 4228 a STOP_SENDING frame, is sent until the receiving part of the 4229 stream enters either a "Data Recvd" or "Reset Recvd" state; see 4230 Section 3.5. 4232 * Connection close signals, including packets that contain 4233 CONNECTION_CLOSE frames, are not sent again when packet loss is 4234 detected, but as described in Section 10. 4236 * The current connection maximum data is sent in MAX_DATA frames. 4237 An updated value is sent in a MAX_DATA frame if the packet 4238 containing the most recently sent MAX_DATA frame is declared lost, 4239 or when the endpoint decides to update the limit. Care is 4240 necessary to avoid sending this frame too often as the limit can 4241 increase frequently and cause an unnecessarily large number of 4242 MAX_DATA frames to be sent; see Section 4.2. 4244 * The current maximum stream data offset is sent in MAX_STREAM_DATA 4245 frames. Like MAX_DATA, an updated value is sent when the packet 4246 containing the most recent MAX_STREAM_DATA frame for a stream is 4247 lost or when the limit is updated, with care taken to prevent the 4248 frame from being sent too often. 
An endpoint SHOULD stop sending
4249 MAX_STREAM_DATA frames when the receiving part of the stream
4250 enters a "Size Known" state.

4252 * The limit on streams of a given type is sent in MAX_STREAMS
4253 frames. Like MAX_DATA, an updated value is sent when a packet
4254 containing the most recent MAX_STREAMS frame for a stream type is
4255 declared lost or when the limit is updated, with care taken to
4256 prevent the frame from being sent too often.

4258 * Blocked signals are carried in DATA_BLOCKED, STREAM_DATA_BLOCKED,
4259 and STREAMS_BLOCKED frames. DATA_BLOCKED frames have connection
4260 scope, STREAM_DATA_BLOCKED frames have stream scope, and
4261 STREAMS_BLOCKED frames are scoped to a specific stream type. A new
4262 frame is sent if the packet containing the most recent frame for a
4263 scope is lost, but only while the endpoint is blocked on the
4264 corresponding limit. These frames always include the limit that
4265 is causing blocking at the time that they are transmitted.

4267 * A liveness or path validation check using PATH_CHALLENGE frames is
4268 sent periodically until a matching PATH_RESPONSE frame is received
4269 or until there is no remaining need for liveness or path
4270 validation checking. PATH_CHALLENGE frames include a different
4271 payload each time they are sent.

4273 * Responses to path validation using PATH_RESPONSE frames are sent
4274 just once. The peer is expected to send more PATH_CHALLENGE
4275 frames as necessary to evoke additional PATH_RESPONSE frames.

4277 * New connection IDs are sent in NEW_CONNECTION_ID frames and
4278 retransmitted if the packet containing them is lost.
4279 Retransmissions of this frame carry the same sequence number
4280 value. Likewise, retired connection IDs are sent in
4281 RETIRE_CONNECTION_ID frames and retransmitted if the packet
4282 containing them is lost.

4284 * NEW_TOKEN frames are retransmitted if the packet containing them
4285 is lost.
No special support is made for detecting reordered and 4286 duplicated NEW_TOKEN frames other than a direct comparison of the 4287 frame contents. 4289 * PING and PADDING frames contain no information, so lost PING or 4290 PADDING frames do not require repair. 4292 * The HANDSHAKE_DONE frame MUST be retransmitted until it is 4293 acknowledged. 4295 Endpoints SHOULD prioritize retransmission of data over sending new 4296 data, unless priorities specified by the application indicate 4297 otherwise; see Section 2.3. 4299 Even though a sender is encouraged to assemble frames containing up- 4300 to-date information every time it sends a packet, it is not forbidden 4301 to retransmit copies of frames from lost packets. A sender that 4302 retransmits copies of frames needs to handle decreases in available 4303 payload size due to change in packet number length, connection ID 4304 length, and path MTU. A receiver MUST accept packets containing an 4305 outdated frame, such as a MAX_DATA frame carrying a smaller maximum 4306 data than one found in an older packet. 4308 A sender SHOULD avoid retransmitting information from packets once 4309 they are acknowledged. This includes packets that are acknowledged 4310 after being declared lost, which can happen in the presence of 4311 network reordering. Doing so requires senders to retain information 4312 about packets after they are declared lost. A sender can discard 4313 this information after a period of time elapses that adequately 4314 allows for reordering, such as a PTO (Section 6.2 of 4315 [QUIC-RECOVERY]), or on other events, such as reaching a memory 4316 limit. 4318 Upon detecting losses, a sender MUST take appropriate congestion 4319 control action. The details of loss detection and congestion control 4320 are described in [QUIC-RECOVERY]. 4322 13.4. Explicit Congestion Notification 4324 QUIC endpoints can use Explicit Congestion Notification (ECN) 4325 [RFC3168] to detect and respond to network congestion. 
ECN allows a 4326 network node to indicate congestion in the network by setting a 4327 codepoint in the IP header of a packet instead of dropping it. 4328 Endpoints react to congestion by reducing their sending rate in 4329 response, as described in [QUIC-RECOVERY]. 4331 Note that supporting ECN requires being able to set and read the ECN 4332 codepoints in the IP headers of datagrams carrying QUIC packets. On 4333 platforms where this is not possible, QUIC cannot support ECN. 4335 To use ECN, QUIC endpoints first determine whether a path supports 4336 ECN marking and the peer is able to access the ECN codepoint in the 4337 IP header. A network path does not support ECN if ECN marked packets 4338 get dropped or ECN markings are rewritten on the path. An endpoint 4339 validates the use of ECN on the path, both during connection 4340 establishment and when migrating to a new path (Section 9). 4342 13.4.1. ECN Counts 4344 On receiving a QUIC packet with an ECT or CE codepoint, an ECN- 4345 enabled endpoint that can access the ECN codepoints from the 4346 enclosing IP packet increases the corresponding ECT(0), ECT(1), or CE 4347 count, and includes these counts in subsequent ACK frames; see 4348 Section 13.2 and Section 19.3. 4350 An IP packet that results in no QUIC packets being processed does not 4351 increase ECN counts. A QUIC packet detected by a receiver as a 4352 duplicate does not affect the receiver's local ECN codepoint counts; 4353 see Section 21.9 for relevant security concerns. 4355 If an endpoint receives a QUIC packet without an ECT or CE codepoint 4356 in the IP packet header, it responds per Section 13.2 with an ACK 4357 frame without increasing any ECN counts. If an endpoint does not 4358 implement ECN support or does not have access to received ECN 4359 codepoints, it does not increase ECN counts. 4361 Coalesced packets (see Section 12.2) mean that several packets can 4362 share the same IP header. 
The ECN counts for the ECN codepoint 4363 received in the associated IP header are incremented once for each 4364 QUIC packet, not per enclosing IP packet or UDP datagram. 4366 Each packet number space maintains separate acknowledgement state and 4367 separate ECN counts. For example, if one each of an Initial, 0-RTT, 4368 Handshake, and 1-RTT QUIC packet are coalesced, the corresponding 4369 counts for the Initial and Handshake packet number space will be 4370 incremented by one and the counts for the application data packet 4371 number space will be increased by two. 4373 13.4.2. ECN Validation 4375 It is possible for faulty network devices to corrupt or erroneously 4376 drop packets with ECN markings. To provide robust connectivity in 4377 the presence of such devices, each endpoint independently validates 4378 ECN counts and disables ECN if the path is not showing consistent 4379 support for ECN. 4381 Endpoints validate ECN for packets sent on each network path 4382 independently. An endpoint thus validates ECN on new connection 4383 establishment, when switching to a server's preferred address, and on 4384 active connection migration to a new path. Appendix B describes one 4385 possible algorithm for testing paths for ECN support. 4387 Even if an endpoint does not use ECN markings on packets it 4388 transmits, the endpoint MUST provide feedback about ECN markings 4389 received from the peer if they are accessible. Failing to report ECN 4390 counts will cause the peer to disable ECN marking. 4392 13.4.2.1. Sending ECN Markings 4394 To start ECN validation, an endpoint SHOULD do the following when 4395 sending packets on a new path to a peer: 4397 * Set the ECT(0) codepoint in the IP header of early outgoing 4398 packets sent on a new path to the peer ([RFC8311]). 4400 * If all packets that were sent with the ECT(0) codepoint are 4401 eventually deemed lost (Section 6 of [QUIC-RECOVERY]), validation 4402 is deemed to have failed. 
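
The two steps above can be sketched as a small piece of sender state. This is a simplified illustration under stated assumptions: the class name EcnProbe and its methods are hypothetical, the probe_limit of ten packets borrows the marking budget suggested in the text that follows, and failure is only evaluated once that budget has been spent. The codepoint values are those of [RFC3168]. Validation of the ECN counts reported in ACK frames is a separate check and is not shown.

```python
ECT0 = 0b10     # ECT(0) codepoint in the two ECN bits of the IP header
NOT_ECT = 0b00  # no ECN marking


class EcnProbe:
    """Marks early packets on a path with ECT(0) and detects total loss.

    Hypothetical sketch; probe_limit is an assumed marking budget.
    """

    def __init__(self, probe_limit=10):
        self.probe_limit = probe_limit
        self.marked = set()      # ECT(0) packets neither acked nor lost yet
        self.sent_marked = 0
        self.acked_marked = 0
        self.failed = False

    def codepoint_for(self, pn):
        """Return the ECN codepoint to set on outgoing packet pn."""
        if self.failed or self.sent_marked >= self.probe_limit:
            return NOT_ECT
        self.marked.add(pn)
        self.sent_marked += 1
        return ECT0

    def on_acked(self, pn):
        if pn in self.marked:
            self.marked.discard(pn)
            self.acked_marked += 1

    def on_lost(self, pn):
        self.marked.discard(pn)
        # Fail only if every ECT(0)-marked packet was deemed lost.
        if (self.sent_marked >= self.probe_limit
                and not self.marked
                and self.acked_marked == 0):
            self.failed = True
```

A single acknowledged ECT(0) packet is enough to keep this check from failing, since the condition in the text is that all marked packets are lost.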
4404 To reduce the chances of misinterpreting congestive loss as packets 4405 dropped by a faulty network element, an endpoint could set the ECT(0) 4406 codepoint for only the first ten outgoing packets on a path, or for a 4407 period of three RTTs, whichever occurs first. 4409 Other methods of probing paths for ECN support are possible, as are 4410 different marking strategies. Implementations MAY use other methods 4411 defined in RFCs; see [RFC8311]. Implementations that use the ECT(1) 4412 codepoint need to perform ECN validation using ECT(1) counts. 4414 13.4.2.2. Receiving ACK Frames 4416 Erroneous application of ECN marks in the network can result in 4417 degraded connection performance. An endpoint that receives an ACK 4418 frame with ECN counts therefore validates the counts before using 4419 them. It performs this validation by comparing newly received counts 4420 against those from the last successfully processed ACK frame. Any 4421 increase in ECN counts is validated based on the markings that were 4422 applied to packets that are newly acknowledged in the ACK frame. 4424 If an ACK frame newly acknowledges a packet that the endpoint sent 4425 with either ECT(0) or ECT(1) codepoints set, ECN validation fails if 4426 ECN counts are not present in the ACK frame. This check detects a 4427 network element that zeroes out ECN bits or a peer that is unable to 4428 access ECN markings. 4430 ECN validation fails if the sum of the increase in ECT(0) and ECN-CE 4431 counts is less than the number of newly acknowledged packets that 4432 were originally sent with an ECT(0) marking. Similarly, ECN 4433 validation fails if the sum of the increases to ECT(1) and ECN-CE 4434 counts is less than the number of newly acknowledged packets sent 4435 with an ECT(1) marking. These checks can detect removal of ECN 4436 markings in the network. 4438 An endpoint could miss acknowledgements for a packet when ACK frames 4439 are lost. 
It is therefore possible for the total increase in ECT(0),
4440 ECT(1), and ECN-CE counts to be greater than the number of packets
4441 acknowledged in an ACK frame. This is why counts are permitted to be
4442 larger than might be accounted for by newly acknowledged packets.

4444 ECN validation MAY fail if the total count for an ECT(0) or ECT(1)
4445 marking exceeds the total number of packets sent with the
4446 corresponding marking. In particular, an endpoint that never applies
4447 a particular marking can fail validation when a non-zero count for
4448 the corresponding marking is received. This check can detect when
4449 packets are marked ECT(0) or ECT(1) in the network.

4451 Processing ECN counts out of order can result in validation failure.
4452 An endpoint SHOULD skip ECN validation on an ACK frame that does not
4453 increase the largest acknowledged packet number.

4455 13.4.2.3. Validation Outcomes

4457 If validation fails, then the endpoint stops sending ECN markings in
4458 subsequent IP packets with the expectation that either the network
4459 path or the peer does not support ECN.

4461 Upon successful validation, an endpoint can continue to set ECT
4462 codepoints in subsequent packets with the expectation that the path
4463 is ECN-capable. However, network routing and path elements can change
4464 mid-connection; an endpoint MUST disable ECN if validation fails
4465 at any point in the connection.

4467 Even if validation fails, an endpoint MAY revalidate ECN on the same
4468 path at any later time in the connection.

4470 14. Packet Size

4472 The QUIC packet size includes the QUIC header and protected payload,
4473 but not the UDP or IP headers.

4475 QUIC depends upon the path supporting IP packets of at least 1280 bytes.
4476 This is the IPv6 minimum size ([IPv6]) and is also supported by most
4477 modern IPv4 networks. Assuming the minimum IP header size, this
4478 results in a QUIC maximum packet size of 1232 bytes for IPv6 and 1252
4479 bytes for IPv4.
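
The two figures follow from subtracting the minimum IP header and the UDP header from the 1280-byte IP packet. The short sketch below reproduces that arithmetic. The header sizes are standard values not stated in this section (40 bytes for IPv6, 20 bytes for IPv4 without options, 8 bytes for UDP), and the function name is hypothetical.

```python
MIN_IP_PACKET = 1280                    # bytes, from the text above
UDP_HEADER = 8                          # bytes
IP_HEADER = {"IPv6": 40, "IPv4": 20}    # minimum header sizes, no options


def max_quic_packet_size(family):
    """QUIC packet size that fits in a 1280-byte IP packet."""
    return MIN_IP_PACKET - IP_HEADER[family] - UDP_HEADER


for family in ("IPv6", "IPv4"):
    print(family, max_quic_packet_size(family))
```

IPv4 options or IPv6 extension headers would reduce the available space further, which is why the text qualifies the result with the minimum IP header size.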
4481 The QUIC maximum packet size is the largest size of QUIC packet that 4482 can be sent across a network path using a single packet. Any maximum 4483 packet size larger than 1200 bytes can be discovered using Path 4484 Maximum Transmission Unit Discovery (PMTUD; see Section 14.2.1) or 4485 Datagram Packetization Layer PMTU Discovery (DPLPMTUD; see 4486 Section 14.3). 4488 Enforcement of the max_udp_payload_size transport parameter 4489 (Section 18.2) might act as an additional limit on the maximum packet 4490 size. A sender can avoid exceeding this limit, once the value is 4491 known. However, prior to learning the value of the transport 4492 parameter, endpoints risk datagrams being lost if they send packets 4493 larger than the smallest allowed maximum packet size of 1200 bytes. 4495 UDP datagrams MUST NOT be fragmented at the IP layer. In IPv4 4496 ([IPv4]), the DF bit MUST be set to prevent fragmentation on the 4497 path. 4499 14.1. Initial Packet Size 4501 A client MUST expand the payload of all UDP datagrams carrying 4502 Initial packets to at least the smallest allowed maximum packet size 4503 (1200 bytes) by adding PADDING frames to the Initial packet or by 4504 coalescing the Initial packet; see Section 12.2. Sending a UDP 4505 datagram of this size ensures that the network path from the client 4506 to the server supports a reasonable Path Maximum Transmission Unit 4507 (PMTU). This also helps reduce the amplitude of amplification 4508 attacks caused by server responses toward an unverified client 4509 address; see Section 8. 4511 Datagrams containing Initial packets MAY exceed 1200 bytes if the 4512 client believes that the network path and peer both support the size 4513 that it chooses. 4515 A server MUST discard an Initial packet that is carried in a UDP 4516 datagram with a payload that is less than the smallest allowed 4517 maximum packet size of 1200 bytes. 
A server MAY also immediately 4518 close the connection by sending a CONNECTION_CLOSE frame with an 4519 error code of PROTOCOL_VIOLATION; see Section 10.2.3. 4521 The server MUST also limit the number of bytes it sends before 4522 validating the address of the client; see Section 8. 4524 14.2. Path Maximum Transmission Unit 4526 The Path Maximum Transmission Unit (PMTU) is the maximum size of the 4527 entire IP packet including the IP header, UDP header, and UDP 4528 payload. The UDP payload includes the QUIC packet header, protected 4529 payload, and any authentication fields. The PMTU can depend on path 4530 characteristics, and can therefore change over time. The largest UDP 4531 payload an endpoint sends at any given time is referred to as the 4532 endpoint's maximum packet size. 4534 An endpoint SHOULD use DPLPMTUD (Section 14.3) or PMTUD 4535 (Section 14.2.1) to determine whether the path to a destination will 4536 support a desired maximum packet size without fragmentation. In the 4537 absence of these mechanisms, QUIC endpoints SHOULD NOT send IP 4538 packets larger than the smallest allowed maximum packet size. 4540 Both DPLPMTUD and PMTUD send IP packets that are larger than the 4541 current maximum packet size, referred to as PMTU probes. All QUIC 4542 packets that are not sent in a PMTU probe SHOULD be sized to fit 4543 within the maximum packet size to avoid the packet being fragmented 4544 or dropped ([RFC8085]). 4546 If a QUIC endpoint determines that the PMTU between any pair of local 4547 and remote IP addresses has fallen below the smallest allowed maximum 4548 packet size of 1200 bytes, it MUST immediately cease sending QUIC 4549 packets, except for those in PMTU probes or those containing 4550 CONNECTION_CLOSE frames, on the affected path. An endpoint MAY 4551 terminate the connection if an alternative path cannot be found. 4553 Each pair of local and remote addresses could have a different PMTU. 
4554 QUIC implementations that implement any kind of PMTU discovery 4555 therefore SHOULD maintain a maximum packet size for each combination 4556 of local and remote IP addresses. 4558 A QUIC implementation MAY be more conservative in computing the 4559 maximum packet size to allow for unknown tunnel overheads or IP 4560 header options/extensions. 4562 14.2.1. Handling of ICMP Messages by PMTUD 4564 Path Maximum Transmission Unit Discovery (PMTUD; [RFC1191], 4565 [RFC8201]) relies on reception of ICMP messages (e.g., IPv6 Packet 4566 Too Big messages) that indicate when a packet is dropped because it 4567 is larger than the local router MTU. DPLPMTUD can also optionally 4568 use these messages. This use of ICMP messages is potentially 4569 vulnerable to off-path attacks that successfully guess the addresses 4570 used on the path and reduce the PMTU to a bandwidth-inefficient 4571 value. 4573 An endpoint MUST ignore an ICMP message that claims the PMTU has 4574 decreased below the minimum QUIC packet size. 4576 The requirements for generating ICMP ([RFC1812], [RFC4443]) state 4577 that the quoted packet should contain as much of the original packet 4578 as possible without exceeding the minimum MTU for the IP version. 4579 The size of the quoted packet can actually be smaller, or the 4580 information unintelligible, as described in Section 1.1 of 4581 [DPLPMTUD]. 4583 QUIC endpoints using PMTUD SHOULD validate ICMP messages to protect 4584 from off-path injection as specified in [RFC8201] and Section 5.2 of 4585 [RFC8085]. This validation SHOULD use the quoted packet supplied in 4586 the payload of an ICMP message to associate the message with a 4587 corresponding transport connection (see Section 4.6.1 of [DPLPMTUD]). 4588 ICMP message validation MUST include matching IP addresses and UDP 4589 ports ([RFC8085]) and, when possible, connection IDs to an active 4590 QUIC session. The endpoint SHOULD ignore all ICMP messages that fail 4591 validation. 
4593 An endpoint MUST NOT increase PMTU based on ICMP messages; see 4594 Section 3, clause 6 of [DPLPMTUD]. Any reduction in the QUIC maximum 4595 packet size in response to ICMP messages MAY be provisional until 4596 QUIC's loss detection algorithm determines that the quoted packet has 4597 actually been lost. 4599 14.3. Datagram Packetization Layer PMTU Discovery 4601 Datagram Packetization Layer PMTU Discovery (DPLPMTUD; [DPLPMTUD]) 4602 relies on tracking loss or acknowledgment of QUIC packets that are 4603 carried in PMTU probes. PMTU probes for DPLPMTUD that use the 4604 PADDING frame implement "Probing using padding data", as defined in 4605 Section 4.1 of [DPLPMTUD]. 4607 Endpoints SHOULD set the initial value of BASE_PMTU (see Section 5.1 4608 of [DPLPMTUD]) to be consistent with the minimum QUIC packet size. 4609 The MIN_PLPMTU is the same as the BASE_PMTU. 4611 QUIC endpoints implementing DPLPMTUD maintain a maximum packet size 4612 (DPLPMTUD MPS) for each combination of local and remote IP addresses. 4614 14.3.1. DPLPMTUD and Initial Connectivity 4616 From the perspective of DPLPMTUD, QUIC is an acknowledged 4617 packetization layer (PL). A sender can therefore enter the DPLPMTUD 4618 BASE state when the QUIC connection handshake has been completed. 4620 14.3.2. Validating the QUIC Path with DPLPMTUD 4622 QUIC provides an acknowledged PL, therefore a sender does not 4623 implement the DPLPMTUD CONFIRMATION_TIMER while in the 4624 SEARCH_COMPLETE state; see Section 5.2 of [DPLPMTUD]. 4626 14.3.3. Handling of ICMP Messages by DPLPMTUD 4628 An endpoint using DPLPMTUD requires the validation of any received 4629 ICMP Packet Too Big (PTB) message before using the PTB information, 4630 as defined in Section 4.6 of [DPLPMTUD]. In addition to UDP port 4631 validation, QUIC validates an ICMP message by using other PL 4632 information (e.g., validation of connection IDs in the quoted packet 4633 of any received ICMP message). 
4635 The considerations for processing ICMP messages described in 4636 Section 14.2.1 also apply if these messages are used by DPLPMTUD. 4638 14.4. Sending QUIC PMTU Probes 4640 PMTU probes are ack-eliciting packets. 4642 Endpoints could limit the content of PMTU probes to PING and PADDING 4643 frames as packets that are larger than the current maximum packet 4644 size are more likely to be dropped by the network. Loss of a QUIC 4645 packet that is carried in a PMTU probe is therefore not a reliable 4646 indication of congestion and SHOULD NOT trigger a congestion control 4647 reaction; see Section 3, Bullet 7 of [DPLPMTUD]. However, PMTU 4648 probes consume congestion window, which could delay subsequent 4649 transmission by an application. 4651 14.4.1. PMTU Probes Containing Source Connection ID 4653 Endpoints that rely on the destination connection ID for routing 4654 incoming QUIC packets are likely to require that the connection ID be 4655 included in PMTU probes to route any resulting ICMP messages 4656 (Section 14.2.1) back to the correct endpoint. However, only long 4657 header packets (Section 17.2) contain the Source Connection ID field, 4658 and long header packets are not decrypted or acknowledged by the peer 4659 once the handshake is complete. 4661 One way to construct a PMTU probe is to coalesce (see Section 12.2) a 4662 packet with a long header, such as a Handshake or 0-RTT packet 4663 (Section 17.2), with a short header packet in a single UDP datagram. 4664 If the resulting PMTU probe reaches the endpoint, the packet with the 4665 long header will be ignored, but the short header packet will be 4666 acknowledged. If the PMTU probe causes an ICMP message to be sent, 4667 the first part of the probe will be quoted in that message. If the 4668 Source Connection ID field is within the quoted portion of the probe, 4669 that could be used for routing or validation of the ICMP message. 
4671 Note: The purpose of using a packet with a long header is only to 4672 ensure that the quoted packet contained in the ICMP message 4673 contains a Source Connection ID field. This packet does not need 4674 to be a valid packet and it can be sent even if there is no 4675 current use for packets of that type. 4677 15. Versions 4679 QUIC versions are identified using a 32-bit unsigned number. 4681 The version 0x00000000 is reserved to represent version negotiation. 4682 This version of the specification is identified by the number 4683 0x00000001. 4685 Other versions of QUIC might have different properties from this 4686 version. The properties of QUIC that are guaranteed to be consistent 4687 across all versions of the protocol are described in 4688 [QUIC-INVARIANTS]. 4690 Version 0x00000001 of QUIC uses TLS as a cryptographic handshake 4691 protocol, as described in [QUIC-TLS]. 4693 Versions with the most significant 16 bits of the version number 4694 cleared are reserved for use in future IETF consensus documents. 4696 Versions that follow the pattern 0x?a?a?a?a are reserved for use in 4697 forcing version negotiation to be exercised. That is, any version 4698 number where the low four bits of all bytes is 1010 (in binary). A 4699 client or server MAY advertise support for any of these reserved 4700 versions. 4702 Reserved version numbers will never represent a real protocol; a 4703 client MAY use one of these version numbers with the expectation that 4704 the server will initiate version negotiation; a server MAY advertise 4705 support for one of these versions and can expect that clients ignore 4706 the value. 4708 [[RFC editor: please remove the remainder of this section before 4709 publication.]] 4711 The version number for the final version of this specification 4712 (0x00000001), is reserved for the version of the protocol that is 4713 published as an RFC. 
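The two reserved ranges described in this section can be tested with simple bit masks. The following is a minimal sketch, not a normative definition; the function names are hypothetical.

```python
# Illustrative sketch; function names are hypothetical.

def is_reserved_for_negotiation(version: int) -> bool:
    """True if the version follows the pattern 0x?a?a?a?a, that is,
    the low four bits of every byte are 1010 in binary."""
    if not 0 <= version <= 0xFFFFFFFF:
        raise ValueError("QUIC versions are 32-bit unsigned numbers")
    return (version & 0x0F0F0F0F) == 0x0A0A0A0A


def is_reserved_for_ietf(version: int) -> bool:
    """True if the most significant 16 bits of the version are cleared."""
    if not 0 <= version <= 0xFFFFFFFF:
        raise ValueError("QUIC versions are 32-bit unsigned numbers")
    return (version >> 16) == 0
```

For instance, 0x1a2a3a4a and 0x0a0a0a0a both match the negotiation-forcing pattern, while 0x00000001 does not but does fall in the range with the upper 16 bits cleared.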
4715 Version numbers used to identify IETF drafts are created by adding 4716 the draft number to 0xff000000. For example, draft-ietf-quic- 4717 transport-13 would be identified as 0xff00000d. 4719 Implementors are encouraged to register version numbers of QUIC that 4720 they are using for private experimentation on the GitHub wiki at 4721 https://github.com/quicwg/base-drafts/wiki/QUIC-Versions. 4723 16. Variable-Length Integer Encoding 4725 QUIC packets and frames commonly use a variable-length encoding for 4726 non-negative integer values. This encoding ensures that smaller 4727 integer values need fewer bytes to encode. 4729 The QUIC variable-length integer encoding reserves the two most 4730 significant bits of the first byte to encode the base 2 logarithm of 4731 the integer encoding length in bytes. The integer value is encoded 4732 on the remaining bits, in network byte order. 4734 This means that integers are encoded on 1, 2, 4, or 8 bytes and can 4735 encode 6, 14, 30, or 62 bit values respectively. Table 4 summarizes 4736 the encoding properties. 4738 +======+========+=============+=======================+ 4739 | 2Bit | Length | Usable Bits | Range | 4740 +======+========+=============+=======================+ 4741 | 00 | 1 | 6 | 0-63 | 4742 +------+--------+-------------+-----------------------+ 4743 | 01 | 2 | 14 | 0-16383 | 4744 +------+--------+-------------+-----------------------+ 4745 | 10 | 4 | 30 | 0-1073741823 | 4746 +------+--------+-------------+-----------------------+ 4747 | 11 | 8 | 62 | 0-4611686018427387903 | 4748 +------+--------+-------------+-----------------------+ 4750 Table 4: Summary of Integer Encodings 4752 For example, the eight byte sequence c2 19 7c 5e ff 14 e8 8c (in 4753 hexadecimal) decodes to the decimal value 151288809941952652; the 4754 four byte sequence 9d 7f 3e 7d decodes to 494878333; the two byte 4755 sequence 7b bd decodes to 15293; and the single byte 25 decodes to 37 4756 (as does the two byte sequence 40 25). 
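The encoding can be sketched in a few lines. This is an illustrative implementation under the rules stated above, not reference code; the function names and the choice to return the consumed length alongside the value are assumptions made for the example.

```python
from typing import Tuple


def decode_varint(data: bytes, offset: int = 0) -> Tuple[int, int]:
    """Decode one variable-length integer starting at data[offset].

    Returns (value, encoded_length_in_bytes).
    """
    if offset >= len(data):
        raise ValueError("no data to decode")
    first = data[offset]
    # The two most significant bits hold log2 of the encoded length.
    length = 1 << (first >> 6)
    if offset + length > len(data):
        raise ValueError("truncated variable-length integer")
    value = first & 0x3F
    for byte in data[offset + 1:offset + length]:
        value = (value << 8) | byte
    return value, length


def encode_varint(value: int) -> bytes:
    """Encode value using the shortest of the four encodings."""
    if value < 0:
        raise ValueError("only non-negative integers can be encoded")
    for prefix, length in enumerate((1, 2, 4, 8)):
        usable_bits = 8 * length - 2
        if value < (1 << usable_bits):
            return (value | (prefix << usable_bits)).to_bytes(length, "big")
    raise ValueError("value does not fit in 62 bits")
```

Applied to the byte sequences above, decode_varint returns the listed values. Note that encode_varint always picks the shortest form, so it produces the single byte 25 for 37 and never the two byte sequence 40 25, although both decode to the same value.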
4758 Versions (Section 15) and error codes (Section 20) are described 4759 using integers, but do not use this encoding. 4761 17. Packet Formats 4763 All numeric values are encoded in network byte order (that is, big- 4764 endian) and all field sizes are in bits. Hexadecimal notation is 4765 used for describing the value of fields. 4767 17.1. Packet Number Encoding and Decoding 4769 Packet numbers are integers in the range 0 to 2^62-1 (Section 12.3). 4770 When present in long or short packet headers, they are encoded in 1 4771 to 4 bytes. The number of bits required to represent the packet 4772 number is reduced by including only the least significant bits of the 4773 packet number. 4775 The encoded packet number is protected as described in Section 5.4 of 4776 [QUIC-TLS]. 4778 Prior to receiving an acknowledgement for a packet number space, the 4779 full packet number MUST be included. 4781 After an acknowledgement is received for a packet number space, the 4782 sender MUST use a packet number size able to represent more than 4783 twice as large a range than the difference between the largest 4784 acknowledged packet and packet number being sent. A peer receiving 4785 the packet will then correctly decode the packet number, unless the 4786 packet is delayed in transit such that it arrives after many higher- 4787 numbered packets have been received. An endpoint SHOULD use a large 4788 enough packet number encoding to allow the packet number to be 4789 recovered even if the packet arrives after packets that are sent 4790 afterwards. 4792 As a result, the size of the packet number encoding is at least one 4793 bit more than the base-2 logarithm of the number of contiguous 4794 unacknowledged packet numbers, including the new packet. 
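A sender might select the encoding size along the following lines. This is a sketch only: the function names are hypothetical, and it reads "more than twice as large a range" as a strict inequality, which is a conservative interpretation at exact powers of two.

```python
from typing import Optional


def packet_number_length(full_pn: int, largest_acked: Optional[int]) -> int:
    """Return the number of bytes (1 to 4) to use for the packet number.

    largest_acked is None if no acknowledgment has been received in
    this packet number space yet.
    """
    if largest_acked is None:
        # Nothing acknowledged yet: cover the whole packet number.
        num_unacked = full_pn + 1
    else:
        if full_pn <= largest_acked:
            raise ValueError("packet number must exceed largest acknowledged")
        num_unacked = full_pn - largest_acked
    for num_bytes in (1, 2, 3, 4):
        # The encoded range has to be more than twice the difference.
        if (1 << (8 * num_bytes)) > 2 * num_unacked:
            return num_bytes
    raise ValueError("too many unacknowledged packets for a 4-byte encoding")


def truncate_packet_number(full_pn: int, num_bytes: int) -> bytes:
    """Keep only the least significant num_bytes of the packet number."""
    mask = (1 << (8 * num_bytes)) - 1
    return (full_pn & mask).to_bytes(num_bytes, "big")
```

With largest_acked set to 0xabe8bc, this sketch selects 2 bytes for packet 0xac5c02 and 3 bytes for packet 0xace8fe.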
4796 For example, if an endpoint has received an acknowledgment for packet 4797 0xabe8bc, sending a packet with a number of 0xac5c02 requires a 4798 packet number encoding with 16 bits or more; whereas the 24-bit 4799 packet number encoding is needed to send a packet with a number of 4800 0xace8fe. 4802 At a receiver, protection of the packet number is removed prior to 4803 recovering the full packet number. The full packet number is then 4804 reconstructed based on the number of significant bits present, the 4805 value of those bits, and the largest packet number received on a 4806 successfully authenticated packet. Recovering the full packet number 4807 is necessary to successfully remove packet protection. 4809 Once header protection is removed, the packet number is decoded by 4810 finding the packet number value that is closest to the next expected 4811 packet. The next expected packet is the highest received packet 4812 number plus one. For example, if the highest successfully 4813 authenticated packet had a packet number of 0xa82f30ea, then a packet 4814 containing a 16-bit value of 0x9b32 will be decoded as 0xa82f9b32. 4815 Example pseudo-code for packet number decoding can be found in 4816 Appendix A. 4818 17.2. Long Header Packets 4819 Long Header Packet { 4820 Header Form (1) = 1, 4821 Fixed Bit (1) = 1, 4822 Long Packet Type (2), 4823 Type-Specific Bits (4), 4824 Version (32), 4825 Destination Connection ID Length (8), 4826 Destination Connection ID (0..160), 4827 Source Connection ID Length (8), 4828 Source Connection ID (0..160), 4829 } 4831 Figure 13: Long Header Packet Format 4833 Long headers are used for packets that are sent prior to the 4834 establishment of 1-RTT keys. Once 1-RTT keys are available, a sender 4835 switches to sending packets using the short header (Section 17.3). 4836 The long form allows for special packets - such as the Version 4837 Negotiation packet - to be represented in this uniform fixed-length 4838 packet format. 
Packets that use the long header contain the 4839 following fields: 4841 Header Form: The most significant bit (0x80) of byte 0 (the first 4842 byte) is set to 1 for long headers. 4844 Fixed Bit: The next bit (0x40) of byte 0 is set to 1. Packets 4845 containing a zero value for this bit are not valid packets in this 4846 version and MUST be discarded. 4848 Long Packet Type: The next two bits (those with a mask of 0x30) of 4849 byte 0 contain a packet type. Packet types are listed in Table 5. 4851 Type-Specific Bits: The lower four bits (those with a mask of 0x0f) 4852 of byte 0 are type-specific. 4854 Version: The QUIC Version is a 32-bit field that follows the first 4855 byte. This field indicates the version of QUIC that is in use and 4856 determines how the rest of the protocol fields are interpreted. 4858 Destination Connection ID Length: The byte following the version 4859 contains the length in bytes of the Destination Connection ID 4860 field that follows it. This length is encoded as an 8-bit 4861 unsigned integer. In QUIC version 1, this value MUST NOT exceed 4862 20. Endpoints that receive a version 1 long header with a value 4863 larger than 20 MUST drop the packet. In order to properly form a 4864 Version Negotiation packet, servers SHOULD be able to read longer 4865 connection IDs from other QUIC versions. 4867 Destination Connection ID: The Destination Connection ID field 4868 follows the Destination Connection ID Length field, which 4869 indicates the length of this field. Section 7.2 describes the use 4870 of this field in more detail. 4872 Source Connection ID Length: The byte following the Destination 4873 Connection ID contains the length in bytes of the Source 4874 Connection ID field that follows it. This length is encoded as an 4875 8-bit unsigned integer. In QUIC version 1, this value MUST NOT 4876 exceed 20 bytes. Endpoints that receive a version 1 long header 4877 with a value larger than 20 MUST drop the packet.
In order to 4878 properly form a Version Negotiation packet, servers SHOULD be able 4879 to read longer connection IDs from other QUIC versions. 4881 Source Connection ID: The Source Connection ID field follows the 4882 Source Connection ID Length field, which indicates the length of 4883 this field. Section 7.2 describes the use of this field in more 4884 detail. 4886 In this version of QUIC, the following packet types with the long 4887 header are defined: 4889 +======+===========+================+ 4890 | Type | Name | Section | 4891 +======+===========+================+ 4892 | 0x0 | Initial | Section 17.2.2 | 4893 +------+-----------+----------------+ 4894 | 0x1 | 0-RTT | Section 17.2.3 | 4895 +------+-----------+----------------+ 4896 | 0x2 | Handshake | Section 17.2.4 | 4897 +------+-----------+----------------+ 4898 | 0x3 | Retry | Section 17.2.5 | 4899 +------+-----------+----------------+ 4901 Table 5: Long Header Packet Types 4903 The header form bit, Destination and Source Connection ID lengths, 4904 Destination and Source Connection ID fields, and Version fields of a 4905 long header packet are version-independent. The other fields in the 4906 first byte are version-specific. See [QUIC-INVARIANTS] for details 4907 on how packets from different versions of QUIC are interpreted. 4909 The interpretation of the fields and the payload are specific to a 4910 version and packet type. While type-specific semantics for this 4911 version are described in the following sections, several long-header 4912 packets in this version of QUIC contain these additional fields: 4914 Reserved Bits: Two bits (those with a mask of 0x0c) of byte 0 are 4915 reserved across multiple packet types. These bits are protected 4916 using header protection; see Section 5.4 of [QUIC-TLS]. The value 4917 included prior to protection MUST be set to 0. 
An endpoint MUST 4918 treat receipt of a packet that has a non-zero value for these bits 4919 after removing both packet and header protection as a connection 4920 error of type PROTOCOL_VIOLATION. Discarding such a packet after 4921 only removing header protection can expose the endpoint to 4922 attacks; see Section 9.3 of [QUIC-TLS]. 4924 Packet Number Length: In packet types that contain a Packet Number 4925 field, the least significant two bits (those with a mask of 0x03) 4926 of byte 0 contain the length of the packet number, encoded as an 4927 unsigned, two-bit integer that is one less than the length of the 4928 packet number field in bytes. That is, the length of the packet 4929 number field is the value of this field, plus one. These bits are 4930 protected using header protection; see Section 5.4 of [QUIC-TLS]. 4932 Length: The length of the remainder of the packet (that is, the 4933 Packet Number and Payload fields) in bytes, encoded as a variable- 4934 length integer (Section 16). 4936 Packet Number: The packet number field is 1 to 4 bytes long. The 4937 packet number is protected using header protection; see 4938 Section 5.4 of [QUIC-TLS]. The length of the packet number field 4939 is encoded in the Packet Number Length bits of byte 0; see above. 4941 17.2.1. Version Negotiation Packet 4943 A Version Negotiation packet is inherently not version-specific. 4944 Upon receipt by a client, it will be identified as a Version 4945 Negotiation packet based on the Version field having a value of 0. 4947 The Version Negotiation packet is a response to a client packet that 4948 contains a version that is not supported by the server, and is only 4949 sent by servers. 
4951 The layout of a Version Negotiation packet is: 4953 Version Negotiation Packet { 4954 Header Form (1) = 1, 4955 Unused (7), 4956 Version (32) = 0, 4957 Destination Connection ID Length (8), 4958 Destination Connection ID (0..2040), 4959 Source Connection ID Length (8), 4960 Source Connection ID (0..2040), 4961 Supported Version (32) ..., 4962 } 4963 Figure 14: Version Negotiation Packet 4965 The value in the Unused field is selected randomly by the server. 4966 Clients MUST ignore the value of this field. Servers SHOULD set the 4967 most significant bit of this field (0x40) to 1 so that Version 4968 Negotiation packets appear to have the Fixed Bit field. 4970 The Version field of a Version Negotiation packet MUST be set to 4971 0x00000000. 4973 The server MUST include the value from the Source Connection ID field 4974 of the packet it receives in the Destination Connection ID field. 4975 The value for Source Connection ID MUST be copied from the 4976 Destination Connection ID of the received packet, which is initially 4977 randomly selected by a client. Echoing both connection IDs gives 4978 clients some assurance that the server received the packet and that 4979 the Version Negotiation packet was not generated by an off-path 4980 attacker. 4982 As future versions of QUIC may support Connection IDs larger than the 4983 version 1 limit, Version Negotiation packets could carry Connection 4984 IDs that are longer than 20 bytes. 4986 The remainder of the Version Negotiation packet is a list of 32-bit 4987 versions that the server supports. 4989 A Version Negotiation packet is not acknowledged. It is only sent in 4990 response to a packet that indicates an unsupported version; see 4991 Section 5.2.2. 4993 The Version Negotiation packet does not include the Packet Number and 4994 Length fields present in other packets that use the long header form. 4995 Consequently, a Version Negotiation packet consumes an entire UDP 4996 datagram. 
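The construction rules above can be sketched as a small server-side helper. This is illustrative only: the function and parameter names are hypothetical, and it follows the SHOULD-level recommendation of setting the 0x40 bit while choosing the remaining Unused bits at random.

```python
import os
import struct
from typing import Iterable


def build_version_negotiation(received_dcid: bytes,
                              received_scid: bytes,
                              supported_versions: Iterable[int]) -> bytes:
    """Build a Version Negotiation packet answering a client packet.

    received_dcid and received_scid are the connection IDs as they
    appeared in the client's packet; they are swapped in the response.
    """
    if len(received_dcid) > 255 or len(received_scid) > 255:
        raise ValueError("connection ID length must fit in one byte")
    # Header Form = 1, 0x40 set, remaining six Unused bits random.
    first_byte = 0x80 | 0x40 | (os.urandom(1)[0] & 0x3F)
    packet = bytearray([first_byte])
    packet += struct.pack("!I", 0x00000000)        # Version = 0
    packet.append(len(received_scid))              # Destination Connection ID
    packet += received_scid
    packet.append(len(received_dcid))              # Source Connection ID
    packet += received_dcid
    for version in supported_versions:             # Supported Version list
        packet += struct.pack("!I", version)
    return bytes(packet)
```

Because the packet carries no Length field, the result is meant to be sent as the entire payload of a UDP datagram.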
4998 A server MUST NOT send more than one Version Negotiation packet in 4999 response to a single UDP datagram. 5001 See Section 6 for a description of the version negotiation process. 5003 17.2.2. Initial Packet 5005 An Initial packet uses long headers with a type value of 0x0. It 5006 carries the first CRYPTO frames sent by the client and server to 5007 perform key exchange, and carries ACKs in either direction. 5009 Initial Packet { 5010 Header Form (1) = 1, 5011 Fixed Bit (1) = 1, 5012 Long Packet Type (2) = 0, 5013 Reserved Bits (2), 5014 Packet Number Length (2), 5015 Version (32), 5016 Destination Connection ID Length (8), 5017 Destination Connection ID (0..160), 5018 Source Connection ID Length (8), 5019 Source Connection ID (0..160), 5020 Token Length (i), 5021 Token (..), 5022 Length (i), 5023 Packet Number (8..32), 5024 Packet Payload (..), 5025 } 5027 Figure 15: Initial Packet 5029 The Initial packet contains a long header as well as the Length and 5030 Packet Number fields; see Section 17.2. The first byte contains the 5031 Reserved and Packet Number Length bits; see also Section 17.2. 5032 Between the Source Connection ID and Length fields, there are two 5033 additional fields specific to the Initial packet. 5035 Token Length: A variable-length integer specifying the length of the 5036 Token field, in bytes. This value is zero if no token is present. 5037 Initial packets sent by the server MUST set the Token Length field 5038 to zero; clients that receive an Initial packet with a non-zero 5039 Token Length field MUST either discard the packet or generate a 5040 connection error of type PROTOCOL_VIOLATION. 5042 Token: The value of the token that was previously provided in a 5043 Retry packet or NEW_TOKEN frame; see Section 8.1. 5045 Packet Payload: The payload of the packet. 
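The fields of an Initial packet that are not covered by header protection can be located with a short parser. The following is a sketch under stated assumptions: the type and function names are hypothetical, the 20-byte connection ID limit assumes version 1, and the Reserved Bits, Packet Number Length, and Packet Number are deliberately left uninterpreted because they are only meaningful once header protection has been removed.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class InitialHeader:
    version: int
    dcid: bytes
    scid: bytes
    token: bytes
    length: int      # Length field: Packet Number plus Packet Payload, in bytes
    pn_offset: int   # offset of the still-protected Packet Number field


def _read_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Read a variable-length integer; return (value, next position)."""
    if pos >= len(buf):
        raise ValueError("truncated packet")
    size = 1 << (buf[pos] >> 6)
    if pos + size > len(buf):
        raise ValueError("truncated packet")
    value = buf[pos] & 0x3F
    for byte in buf[pos + 1:pos + size]:
        value = (value << 8) | byte
    return value, pos + size


def _read_cid(buf: bytes, pos: int) -> Tuple[bytes, int]:
    """Read a length-prefixed connection ID; return (cid, next position)."""
    if pos >= len(buf):
        raise ValueError("truncated packet")
    cid_len = buf[pos]
    if cid_len > 20:
        raise ValueError("connection ID longer than 20 bytes")
    end = pos + 1 + cid_len
    if end > len(buf):
        raise ValueError("truncated packet")
    return buf[pos + 1:end], end


def parse_initial_header(buf: bytes) -> InitialHeader:
    if len(buf) < 5:
        raise ValueError("truncated packet")
    first = buf[0]
    if not first & 0x80:
        raise ValueError("not a long header packet")
    if not first & 0x40:
        raise ValueError("Fixed Bit is zero")
    if (first & 0x30) >> 4 != 0x0:
        raise ValueError("not an Initial packet")
    version = int.from_bytes(buf[1:5], "big")
    dcid, pos = _read_cid(buf, 5)
    scid, pos = _read_cid(buf, pos)
    token_len, pos = _read_varint(buf, pos)
    if pos + token_len > len(buf):
        raise ValueError("truncated packet")
    token = buf[pos:pos + token_len]
    length, pos = _read_varint(buf, pos + token_len)
    if pos + length > len(buf):
        raise ValueError("Length exceeds datagram")
    return InitialHeader(version, dcid, scid, token, length, pos)
```

The returned pn_offset and length together delimit this packet within the datagram, which is what allows a receiver to find any packet coalesced after it.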
5047 In order to prevent tampering by version-unaware middleboxes, Initial 5048 packets are protected with connection- and version-specific keys 5049 (Initial keys) as described in [QUIC-TLS]. This protection does not 5050 provide confidentiality or integrity against on-path attackers, but 5051 provides some level of protection against off-path attackers. 5053 The client and server use the Initial packet type for any packet that 5054 contains an initial cryptographic handshake message. This includes 5055 all cases where a new packet containing the initial cryptographic 5056 message needs to be created, such as the packets sent after receiving 5057 a Retry packet (Section 17.2.5). 5059 A server sends its first Initial packet in response to a client 5060 Initial. A server may send multiple Initial packets. The 5061 cryptographic key exchange could require multiple round trips or 5062 retransmissions of this data. 5064 The payload of an Initial packet includes a CRYPTO frame (or frames) 5065 containing a cryptographic handshake message, ACK frames, or both. 5066 PING, PADDING, and CONNECTION_CLOSE frames of type 0x1c are also 5067 permitted. An endpoint that receives an Initial packet containing 5068 other frames can either discard the packet as spurious or treat it as 5069 a connection error. 5071 The first packet sent by a client always includes a CRYPTO frame that 5072 contains the start or all of the first cryptographic handshake 5073 message. The first CRYPTO frame sent always begins at an offset of 5074 0; see Section 7. 5076 Note that if the server sends a HelloRetryRequest, the client will 5077 send another series of Initial packets. These Initial packets will 5078 continue the cryptographic handshake and will contain CRYPTO frames 5079 starting at an offset matching the size of the CRYPTO frames sent in 5080 the first flight of Initial packets. 5082 17.2.2.1. 
Abandoning Initial Packets 5084 A client stops both sending and processing Initial packets when it 5085 sends its first Handshake packet. A server stops sending and 5086 processing Initial packets when it receives its first Handshake 5087 packet. Though packets might still be in flight or awaiting 5088 acknowledgment, no further Initial packets need to be exchanged 5089 beyond this point. Initial packet protection keys are discarded (see 5090 Section 4.9.1 of [QUIC-TLS]) along with any loss recovery and 5091 congestion control state; see Section 6.4 of [QUIC-RECOVERY]. 5093 Any data in CRYPTO frames is discarded - and no longer retransmitted 5094 - when Initial keys are discarded. 5096 17.2.3. 0-RTT 5098 A 0-RTT packet uses long headers with a type value of 0x1, followed 5099 by the Length and Packet Number fields; see Section 17.2. The first 5100 byte contains the Reserved and Packet Number Length bits; see 5101 Section 17.2. A 0-RTT packet is used to carry "early" data from the 5102 client to the server as part of the first flight, prior to handshake 5103 completion. As part of the TLS handshake, the server can accept or 5104 reject this early data. 5106 See Section 2.3 of [TLS13] for a discussion of 0-RTT data and its 5107 limitations. 5109 0-RTT Packet { 5110 Header Form (1) = 1, 5111 Fixed Bit (1) = 1, 5112 Long Packet Type (2) = 1, 5113 Reserved Bits (2), 5114 Packet Number Length (2), 5115 Version (32), 5116 Destination Connection ID Length (8), 5117 Destination Connection ID (0..160), 5118 Source Connection ID Length (8), 5119 Source Connection ID (0..160), 5120 Length (i), 5121 Packet Number (8..32), 5122 Packet Payload (..), 5123 } 5125 Figure 16: 0-RTT Packet 5127 Packet numbers for 0-RTT protected packets use the same space as 5128 1-RTT protected packets. 5130 After a client receives a Retry packet, 0-RTT packets are likely to 5131 have been lost or discarded by the server. 
A client SHOULD attempt 5132 to resend data in 0-RTT packets after it sends a new Initial packet. 5133 New packet numbers MUST be used for any new packets that are sent; as 5134 described in Section 17.2.5.3, reusing packet numbers could 5135 compromise packet protection. 5137 A client only receives acknowledgments for its 0-RTT packets once the 5138 handshake is complete, as defined in Section 4.1.1 of [QUIC-TLS]. 5140 A client MUST NOT send 0-RTT packets once it starts processing 1-RTT 5141 packets from the server. This means that 0-RTT packets cannot 5142 contain any response to frames from 1-RTT packets. For instance, a 5143 client cannot send an ACK frame in a 0-RTT packet, because that can 5144 only acknowledge a 1-RTT packet. An acknowledgment for a 1-RTT 5145 packet MUST be carried in a 1-RTT packet. 5147 A server SHOULD treat a violation of remembered limits 5148 (Section 7.4.1) as a connection error of an appropriate type (for 5149 instance, a FLOW_CONTROL_ERROR for exceeding stream data limits). 5151 17.2.4. Handshake Packet 5153 A Handshake packet uses long headers with a type value of 0x2, 5154 followed by the Length and Packet Number fields; see Section 17.2. 5155 The first byte contains the Reserved and Packet Number Length bits; 5156 see Section 17.2. It is used to carry cryptographic handshake 5157 messages and acknowledgments from the server and client.
5159 Handshake Packet { 5160 Header Form (1) = 1, 5161 Fixed Bit (1) = 1, 5162 Long Packet Type (2) = 2, 5163 Reserved Bits (2), 5164 Packet Number Length (2), 5165 Version (32), 5166 Destination Connection ID Length (8), 5167 Destination Connection ID (0..160), 5168 Source Connection ID Length (8), 5169 Source Connection ID (0..160), 5170 Length (i), 5171 Packet Number (8..32), 5172 Packet Payload (..), 5173 } 5175 Figure 17: Handshake Protected Packet 5177 Once a client has received a Handshake packet from a server, it uses 5178 Handshake packets to send subsequent cryptographic handshake messages 5179 and acknowledgments to the server. 5181 The Destination Connection ID field in a Handshake packet contains a 5182 connection ID that is chosen by the recipient of the packet; the 5183 Source Connection ID includes the connection ID that the sender of 5184 the packet wishes to use; see Section 7.2. 5186 Handshake packets are their own packet number space, and thus the 5187 first Handshake packet sent by a server contains a packet number of 5188 0. 5190 The payload of this packet contains CRYPTO frames and could contain 5191 PING, PADDING, or ACK frames. Handshake packets MAY contain 5192 CONNECTION_CLOSE frames of type 0x1c. Endpoints MUST treat receipt 5193 of Handshake packets with other frames as a connection error. 5195 Like Initial packets (see Section 17.2.2.1), data in CRYPTO frames 5196 for Handshake packets is discarded - and no longer retransmitted - 5197 when Handshake protection keys are discarded. 5199 17.2.5. Retry Packet 5201 A Retry packet uses a long packet header with a type value of 0x3. 5202 It carries an address validation token created by the server. It is 5203 used by a server that wishes to perform a retry; see Section 8.1. 
5205 Retry Packet { 5206 Header Form (1) = 1, 5207 Fixed Bit (1) = 1, 5208 Long Packet Type (2) = 3, 5209 Unused (4), 5210 Version (32), 5211 Destination Connection ID Length (8), 5212 Destination Connection ID (0..160), 5213 Source Connection ID Length (8), 5214 Source Connection ID (0..160), 5215 Retry Token (..), 5216 Retry Integrity Tag (128), 5217 } 5219 Figure 18: Retry Packet 5221 A Retry packet (shown in Figure 18) does not contain any protected 5222 fields. The value in the Unused field is set to an arbitrary value 5223 by the server; a client MUST ignore these bits. In addition to the 5224 fields from the long header, it contains these additional fields: 5226 Retry Token: An opaque token that the server can use to validate the 5227 client's address. 5229 Retry Integrity Tag: See the Retry Packet Integrity section of 5230 [QUIC-TLS]. 5232 17.2.5.1. Sending a Retry Packet 5234 The server populates the Destination Connection ID with the 5235 connection ID that the client included in the Source Connection ID of 5236 the Initial packet. 5238 The server includes a connection ID of its choice in the Source 5239 Connection ID field. This value MUST NOT be equal to the Destination 5240 Connection ID field of the packet sent by the client. A client MUST 5241 discard a Retry packet that contains a Source Connection ID field 5242 that is identical to the Destination Connection ID field of its 5243 Initial packet. The client MUST use the value from the Source 5244 Connection ID field of the Retry packet in the Destination Connection 5245 ID field of subsequent packets that it sends. 5247 A server MAY send Retry packets in response to Initial and 0-RTT 5248 packets. A server can either discard or buffer 0-RTT packets that it 5249 receives. A server can send multiple Retry packets as it receives 5250 Initial or 0-RTT packets. A server MUST NOT send more than one Retry 5251 packet in response to a single UDP datagram. 5253 17.2.5.2. 
Handling a Retry Packet 5255 A client MUST accept and process at most one Retry packet for each 5256 connection attempt. After the client has received and processed an 5257 Initial or Retry packet from the server, it MUST discard any 5258 subsequent Retry packets that it receives. 5260 Clients MUST discard Retry packets that have a Retry Integrity Tag 5261 that cannot be validated; see the Retry Packet Integrity section of 5262 [QUIC-TLS]. This diminishes an off-path attacker's ability to inject 5263 a Retry packet and protects against accidental corruption of Retry 5264 packets. A client MUST discard a Retry packet with a zero-length 5265 Retry Token field. 5267 The client responds to a Retry packet with an Initial packet that 5268 includes the provided Retry Token to continue connection 5269 establishment. 5271 A client sets the Destination Connection ID field of this Initial 5272 packet to the value from the Source Connection ID in the Retry 5273 packet. Changing Destination Connection ID also results in a change 5274 to the keys used to protect the Initial packet. It also sets the 5275 Token field to the token provided in the Retry. The client MUST NOT 5276 change the Source Connection ID because the server could include the 5277 connection ID as part of its token validation logic; see 5278 Section 8.1.4. 5280 A Retry packet does not include a packet number and cannot be 5281 explicitly acknowledged by a client. 5283 17.2.5.3. Continuing a Handshake After Retry 5285 Subsequent Initial packets from the client include the connection ID 5286 and token values from the Retry packet. The client copies the Source 5287 Connection ID field from the Retry packet to the Destination 5288 Connection ID field and uses this value until an Initial packet with 5289 an updated value is received; see Section 7.2. The value of the 5290 Token field is copied to all subsequent Initial packets; see 5291 Section 8.1.2. 
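The client-side rules for Retry packets given in Sections 17.2.5.1 through 17.2.5.3 can be sketched as follows. This is a minimal illustration, not a definitive implementation: the names RetryPacket, ClientState, and handle_retry are hypothetical, and the Retry Integrity Tag check defined in [QUIC-TLS] is assumed to have been performed elsewhere, with its result passed in as a boolean.

```python
from dataclasses import dataclass


@dataclass
class RetryPacket:
    # Hypothetical, already-parsed view of a Retry packet.
    source_connection_id: bytes
    retry_token: bytes
    integrity_tag_valid: bool  # assumed result of the [QUIC-TLS] tag check


@dataclass
class ClientState:
    # Hypothetical per-connection-attempt state kept by a client.
    dcid: bytes            # Destination Connection ID used in Initial packets
    scid: bytes            # Source Connection ID; not changed by a Retry
    token: bytes = b""     # Token field of subsequent Initial packets
    server_packet_processed: bool = False  # an Initial or Retry was processed


def handle_retry(state: ClientState, retry: RetryPacket) -> bool:
    """Return True if the Retry was accepted, False if it was discarded."""
    # At most one Retry is processed; later ones are discarded.
    if state.server_packet_processed:
        return False
    # Discard packets whose Retry Integrity Tag cannot be validated.
    if not retry.integrity_tag_valid:
        return False
    # Discard a Retry with a zero-length Retry Token.
    if len(retry.retry_token) == 0:
        return False
    # Discard a Retry whose Source Connection ID equals the Destination
    # Connection ID of the client's Initial packet.
    if retry.source_connection_id == state.dcid:
        return False
    # Subsequent Initial packets use the new Destination Connection ID and
    # carry the token. The Source Connection ID and the packet number
    # counters are deliberately left untouched.
    state.dcid = retry.source_connection_id
    state.token = retry.retry_token
    state.server_packet_processed = True
    return True
```

Because the Destination Connection ID changes, a caller would also need to derive new Initial keys before sending the next Initial packet; that step is outside the scope of this sketch.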
5293 Other than updating the Destination Connection ID and Token fields, 5294 the Initial packet sent by the client is subject to the same 5295 restrictions as the first Initial packet. A client MUST use the same 5296 cryptographic handshake message it included in this packet. A server 5297 MAY treat a packet that contains a different cryptographic handshake 5298 message as a connection error or discard it. 5300 A client MAY attempt 0-RTT after receiving a Retry packet by sending 5301 0-RTT packets to the connection ID provided by the server. A client 5302 MUST NOT change the cryptographic handshake message it sends in 5303 response to receiving a Retry. 5305 A client MUST NOT reset the packet number for any packet number space 5306 after processing a Retry packet. In particular, 0-RTT packets 5307 contain confidential information that will most likely be 5308 retransmitted on receiving a Retry packet. The keys used to protect 5309 these new 0-RTT packets will not change as a result of responding to 5310 a Retry packet. However, the data sent in these packets could be 5311 different than what was sent earlier. Sending these new packets with 5312 the same packet number is likely to compromise the packet protection 5313 for those packets because the same key and nonce could be used to 5314 protect different content. A server MAY abort the connection if it 5315 detects that the client reset the packet number. 5317 A server acknowledges the use of a Retry packet for a connection 5318 using the retry_source_connection_id transport parameter; see 5319 Section 18.2. If the server sends a Retry packet, it also 5320 subsequently includes the value of the Source Connection ID field 5321 from the Retry packet in its retry_source_connection_id transport 5322 parameter. 
5324 If the client received and processed a Retry packet, it MUST validate 5325 that the retry_source_connection_id transport parameter is present 5326 and correct; otherwise, it MUST validate that the transport parameter 5327 is absent. A client MUST treat a failed validation as a connection 5328 error of type PROTOCOL_VIOLATION. 5330 17.3. Short Header Packets 5332 This version of QUIC defines a single packet type that uses the short 5333 packet header. 5335 Short Header Packet { 5336 Header Form (1) = 0, 5337 Fixed Bit (1) = 1, 5338 Spin Bit (1), 5339 Reserved Bits (2), 5340 Key Phase (1), 5341 Packet Number Length (2), 5342 Destination Connection ID (0..160), 5343 Packet Number (8..32), 5344 Packet Payload (..), 5345 } 5347 Figure 19: Short Header Packet Format 5349 The short header can be used after the version and 1-RTT keys are 5350 negotiated. Packets that use the short header contain the following 5351 fields: 5353 Header Form: The most significant bit (0x80) of byte 0 is set to 0 5354 for the short header. 5356 Fixed Bit: The next bit (0x40) of byte 0 is set to 1. Packets 5357 containing a zero value for this bit are not valid packets in this 5358 version and MUST be discarded. 5360 Spin Bit: The third most significant bit (0x20) of byte 0 is the 5361 latency spin bit, set as described in Section 17.3.1. 5363 Reserved Bits: The next two bits (those with a mask of 0x18) of byte 5364 0 are reserved. These bits are protected using header protection; 5365 see Section 5.4 of [QUIC-TLS]. The value included prior to 5366 protection MUST be set to 0. An endpoint MUST treat receipt of a 5367 packet that has a non-zero value for these bits, after removing 5368 both packet and header protection, as a connection error of type 5369 PROTOCOL_VIOLATION. Discarding such a packet after only removing 5370 header protection can expose the endpoint to attacks; see 5371 Section 9.3 of [QUIC-TLS]. 
5373 Key Phase: The next bit (0x04) of byte 0 indicates the key phase, 5374 which allows a recipient of a packet to identify the packet 5375 protection keys that are used to protect the packet. See 5376 [QUIC-TLS] for details. This bit is protected using header 5377 protection; see Section 5.4 of [QUIC-TLS]. 5379 Packet Number Length: The least significant two bits (those with a 5380 mask of 0x03) of byte 0 contain the length of the packet number, 5381 encoded as an unsigned, two-bit integer that is one less than the 5382 length of the packet number field in bytes. That is, the length 5383 of the packet number field is the value of this field, plus one. 5384 These bits are protected using header protection; see Section 5.4 5385 of [QUIC-TLS]. 5387 Destination Connection ID: The Destination Connection ID is a 5388 connection ID that is chosen by the intended recipient of the 5389 packet. See Section 5.1 for more details. 5391 Packet Number: The packet number field is 1 to 4 bytes long. The 5392 packet number has confidentiality protection separate from packet 5393 protection, as described in Section 5.4 of [QUIC-TLS]. The length 5394 of the packet number field is encoded in the Packet Number Length 5395 field. See Section 17.1 for details. 5397 Packet Payload: Packets with a short header always include a 1-RTT 5398 protected payload. 5400 The header form bit and the connection ID field of a short header 5401 packet are version-independent. The remaining fields are specific to 5402 the selected QUIC version. See [QUIC-INVARIANTS] for details on how 5403 packets from different versions of QUIC are interpreted. 5405 17.3.1. Latency Spin Bit 5407 The latency spin bit enables passive latency monitoring from 5408 observation points on the network path throughout the duration of a 5409 connection. The spin bit is only present in the short packet header, 5410 since it is possible to measure the initial RTT of a connection by 5411 observing the handshake.
Therefore, the spin bit is available after 5412 version negotiation and connection establishment are completed. On- 5413 path measurement and use of the latency spin bit is further discussed 5414 in [QUIC-MANAGEABILITY]. 5416 The spin bit is an OPTIONAL feature of QUIC. A QUIC stack that 5417 chooses to support the spin bit MUST implement it as specified in 5418 this section. 5420 Each endpoint unilaterally decides if the spin bit is enabled or 5421 disabled for a connection. Implementations MUST allow administrators 5422 of clients and servers to disable the spin bit either globally or on 5423 a per-connection basis. Even when the spin bit is not disabled by 5424 the administrator, endpoints MUST disable their use of the spin bit 5425 for a random selection of at least one in every 16 network paths, or 5426 for one in every 16 connection IDs. As each endpoint disables the 5427 spin bit independently, this ensures that the spin bit signal is 5428 disabled on approximately one in eight network paths. 5430 When the spin bit is disabled, endpoints MAY set the spin bit to any 5431 value, and MUST ignore any incoming value. It is RECOMMENDED that 5432 endpoints set the spin bit to a random value either chosen 5433 independently for each packet or chosen independently for each 5434 connection ID. 5436 If the spin bit is enabled for the connection, the endpoint maintains 5437 a spin value for each network path and sets the spin bit in the short 5438 header to the currently stored value when a packet with a short 5439 header is sent on that path. The spin value is initialized to 0 in 5440 the endpoint for each network path. Each endpoint also remembers the 5441 highest packet number seen from its peer on each path. 5443 When a server receives a short header packet that increases the 5444 highest packet number seen by the server from the client on a given 5445 network path, it sets the spin value for that path to be equal to the 5446 spin bit in the received packet. 
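The per-path spin value rules of this section (reflection by the server, inversion by the client, and the reset on a connection ID change) can be sketched as follows. This is an illustrative sketch only: the class name SpinBitPath and its method names are hypothetical, and it assumes the spin bit is enabled for the connection and that packet numbers have already been fully decoded.

```python
class SpinBitPath:
    """Hypothetical spin bit state for one network path of one endpoint."""

    def __init__(self, is_client: bool):
        self.is_client = is_client
        self.spin_value = 0    # initialized to 0 for each network path
        self.highest_pn = -1   # highest packet number seen from the peer

    def on_short_header_received(self, packet_number: int, spin_bit: int) -> None:
        # Only a packet that increases the highest packet number seen
        # on this path updates the spin value.
        if packet_number <= self.highest_pn:
            return
        self.highest_pn = packet_number
        if self.is_client:
            # The client sets its value to the inverse of the received bit.
            self.spin_value = spin_bit ^ 1
        else:
            # The server reflects the received bit.
            self.spin_value = spin_bit

    def outgoing_spin_bit(self) -> int:
        # Value placed in the spin bit of short header packets sent on this path.
        return self.spin_value

    def on_connection_id_changed(self) -> None:
        # The spin value is reset when the connection ID used on the path changes.
        self.spin_value = 0
```

Ignoring packets that do not advance the highest packet number keeps reordered packets from producing spurious transitions in the signal.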
5448 When a client receives a short header packet that increases the 5449 highest packet number seen by the client from the server on a given 5450 network path, it sets the spin value for that path to the inverse of 5451 the spin bit in the received packet. 5453 An endpoint resets the spin value for a network path to zero when 5454 changing the connection ID being used on that network path. 5456 With this mechanism, the server reflects the spin value received, 5457 while the client 'spins' it after one RTT. On-path observers can 5458 measure the time between two spin bit toggle events to estimate the 5459 end-to-end RTT of a connection. 5461 18. Transport Parameter Encoding 5463 The extension_data field of the quic_transport_parameters extension 5464 defined in [QUIC-TLS] contains the QUIC transport parameters. They 5465 are encoded as a sequence of transport parameters, as shown in 5466 Figure 20: 5468 Transport Parameters { 5469 Transport Parameter (..) ..., 5470 } 5472 Figure 20: Sequence of Transport Parameters 5474 Each transport parameter is encoded as an (identifier, length, value) 5475 tuple, as shown in Figure 21: 5477 Transport Parameter { 5478 Transport Parameter ID (i), 5479 Transport Parameter Length (i), 5480 Transport Parameter Value (..), 5481 } 5483 Figure 21: Transport Parameter Encoding 5485 The Transport Parameter Length field contains the length of the 5486 Transport Parameter Value field. 5488 QUIC encodes transport parameters into a sequence of bytes, which is 5489 then included in the cryptographic handshake. 5491 18.1. Reserved Transport Parameters 5493 Transport parameters with an identifier of the form "31 * N + 27" for 5494 integer values of N are reserved to exercise the requirement that 5495 unknown transport parameters be ignored. These transport parameters 5496 have no semantics, and may carry arbitrary values. 5498 18.2. Transport Parameter Definitions 5500 This section details the transport parameters defined in this 5501 document. 
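The (identifier, length, value) encoding described in Section 18 and the reserved identifier pattern of Section 18.1 can be sketched as follows. This is a minimal sketch under stated assumptions: all function names are hypothetical, and the variable-length integer helpers follow the two-bit length prefix scheme of Section 16.

```python
def encode_varint(value: int) -> bytes:
    """Encode a QUIC variable-length integer (Section 16)."""
    if value < 0 or value >= 1 << 62:
        raise ValueError("value out of range for a variable-length integer")
    if value < 1 << 6:
        return value.to_bytes(1, "big")
    if value < 1 << 14:
        return (value | 0x4000).to_bytes(2, "big")
    if value < 1 << 30:
        return (value | 0x80000000).to_bytes(4, "big")
    return (value | 0xC000000000000000).to_bytes(8, "big")


def decode_varint(data: bytes, pos: int = 0):
    """Return (value, next_pos) for the integer starting at data[pos]."""
    if pos >= len(data):
        raise ValueError("truncated variable-length integer")
    length = 1 << (data[pos] >> 6)  # 1, 2, 4, or 8 bytes
    if pos + length > len(data):
        raise ValueError("truncated variable-length integer")
    value = data[pos] & 0x3F
    for byte in data[pos + 1:pos + length]:
        value = (value << 8) | byte
    return value, pos + length


def encode_transport_parameters(params) -> bytes:
    """Encode a list of (identifier, value_bytes) pairs."""
    out = bytearray()
    for param_id, value in params:
        out += encode_varint(param_id)
        out += encode_varint(len(value))
        out += value
    return bytes(out)


def decode_transport_parameters(data: bytes):
    """Decode a byte sequence into a list of (identifier, value_bytes)."""
    params = []
    pos = 0
    while pos < len(data):
        param_id, pos = decode_varint(data, pos)
        length, pos = decode_varint(data, pos)
        if pos + length > len(data):
            raise ValueError("truncated transport parameter value")
        params.append((param_id, data[pos:pos + length]))
        pos += length
    return params


def is_reserved_id(param_id: int) -> bool:
    """True for identifiers of the form 31 * N + 27 (non-negative N)."""
    return param_id % 31 == 27
```

For example, a max_idle_timeout (0x01) of 30000 milliseconds followed by a zero-length disable_active_migration (0x0c) would encode as the bytes 01 04 80 00 75 30 0c 00. Integer-valued parameters are themselves encoded as variable-length integers inside the value field, which is why the value above occupies four bytes.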
5503 Many transport parameters listed here have integer values. Those 5504 transport parameters that are identified as integers use a variable- 5505 length integer encoding; see Section 16. Transport parameters have a 5506 default value of 0 if the transport parameter is absent unless 5507 otherwise stated. 5509 The following transport parameters are defined: 5511 original_destination_connection_id (0x00): The value of the 5512 Destination Connection ID field from the first Initial packet sent 5513 by the client; see Section 7.3. This transport parameter is only 5514 sent by a server. 5516 max_idle_timeout (0x01): The max idle timeout is a value in 5517 milliseconds that is encoded as an integer; see Section 10.1. 5518 Idle timeout is disabled when both endpoints omit this transport 5519 parameter or specify a value of 0. 5521 stateless_reset_token (0x02): A stateless reset token is used in 5522 verifying a stateless reset; see Section 10.3. This parameter is 5523 a sequence of 16 bytes. This transport parameter MUST NOT be sent 5524 by a client, but MAY be sent by a server. A server that does not 5525 send this transport parameter cannot use stateless reset 5526 (Section 10.3) for the connection ID negotiated during the 5527 handshake. 5529 max_udp_payload_size (0x03): The maximum UDP payload size parameter 5530 is an integer value that limits the size of UDP payloads that the 5531 endpoint is willing to receive. UDP datagrams with payloads 5532 larger than this limit are not likely to be processed by the 5533 receiver. 5535 The default for this parameter is the maximum permitted UDP 5536 payload of 65527. Values below 1200 are invalid. 5538 This limit does act as an additional constraint on datagram size 5539 in the same way as the path MTU, but it is a property of the 5540 endpoint and not the path; see Section 14. It is expected that 5541 this is the space an endpoint dedicates to holding incoming 5542 packets.
5544 initial_max_data (0x04): The initial maximum data parameter is an 5545 integer value that contains the initial value for the maximum 5546 amount of data that can be sent on the connection. This is 5547 equivalent to sending a MAX_DATA (Section 19.9) for the connection 5548 immediately after completing the handshake. 5550 initial_max_stream_data_bidi_local (0x05): This parameter is an 5551 integer value specifying the initial flow control limit for 5552 locally-initiated bidirectional streams. This limit applies to 5553 newly created bidirectional streams opened by the endpoint that 5554 sends the transport parameter. In client transport parameters, 5555 this applies to streams with an identifier with the least 5556 significant two bits set to 0x0; in server transport parameters, 5557 this applies to streams with the least significant two bits set to 5558 0x1. 5560 initial_max_stream_data_bidi_remote (0x06): This parameter is an 5561 integer value specifying the initial flow control limit for peer- 5562 initiated bidirectional streams. This limit applies to newly 5563 created bidirectional streams opened by the endpoint that receives 5564 the transport parameter. In client transport parameters, this 5565 applies to streams with an identifier with the least significant 5566 two bits set to 0x1; in server transport parameters, this applies 5567 to streams with the least significant two bits set to 0x0. 5569 initial_max_stream_data_uni (0x07): This parameter is an integer 5570 value specifying the initial flow control limit for unidirectional 5571 streams. This limit applies to newly created unidirectional 5572 streams opened by the endpoint that receives the transport 5573 parameter. In client transport parameters, this applies to 5574 streams with an identifier with the least significant two bits set 5575 to 0x3; in server transport parameters, this applies to streams 5576 with the least significant two bits set to 0x2. 
5578 initial_max_streams_bidi (0x08): The initial maximum bidirectional 5579 streams parameter is an integer value that contains the initial 5580 maximum number of bidirectional streams the peer may initiate. If 5581 this parameter is absent or zero, the peer cannot open 5582 bidirectional streams until a MAX_STREAMS frame is sent. Setting 5583 this parameter is equivalent to sending a MAX_STREAMS 5584 (Section 19.11) of the corresponding type with the same value. 5586 initial_max_streams_uni (0x09): The initial maximum unidirectional 5587 streams parameter is an integer value that contains the initial 5588 maximum number of unidirectional streams the peer may initiate. 5589 If this parameter is absent or zero, the peer cannot open 5590 unidirectional streams until a MAX_STREAMS frame is sent. Setting 5591 this parameter is equivalent to sending a MAX_STREAMS 5592 (Section 19.11) of the corresponding type with the same value. 5594 ack_delay_exponent (0x0a): The acknowledgement delay exponent is an 5595 integer value indicating an exponent used to decode the ACK Delay 5596 field in the ACK frame (Section 19.3). If this value is absent, a 5597 default value of 3 is assumed (indicating a multiplier of 8). 5598 Values above 20 are invalid. 5600 max_ack_delay (0x0b): The maximum acknowledgement delay is an 5601 integer value indicating the maximum amount of time in 5602 milliseconds by which the endpoint will delay sending 5603 acknowledgments. This value SHOULD include the receiver's 5604 expected delays in alarms firing. For example, if a receiver sets 5605 a timer for 5ms and alarms commonly fire up to 1ms late, then it 5606 should send a max_ack_delay of 6ms. If this value is absent, a 5607 default of 25 milliseconds is assumed. Values of 2^14 or greater 5608 are invalid. 
5610 disable_active_migration (0x0c): The disable active migration 5611 transport parameter is included if the endpoint does not support 5612 active connection migration (Section 9) on the address being used 5613 during the handshake. When a peer sets this transport parameter, 5614 an endpoint MUST NOT use a new local address when sending to the 5615 address that the peer used during the handshake. This transport 5616 parameter does not prohibit connection migration after a client 5617 has acted on a preferred_address transport parameter. This 5618 parameter is a zero-length value. 5620 preferred_address (0x0d): The server's preferred address is used to 5621 effect a change in server address at the end of the handshake, as 5622 described in Section 9.6. This transport parameter is only sent 5623 by a server. Servers MAY choose to only send a preferred address 5624 of one address family by sending an all-zero address and port 5625 (0.0.0.0:0 or [::]:0) for the other family. IP addresses are 5626 encoded in network byte order. 5628 The preferred_address transport parameter contains an address and 5629 port for both IP version 4 and 6. The four-byte IPv4 Address 5630 field is followed by the associated two-byte IPv4 Port field. 5631 This is followed by a 16-byte IPv6 Address field and two-byte IPv6 5632 Port field. After address and port pairs, a Connection ID Length 5633 field describes the length of the following Connection ID field. 5634 Finally, a 16-byte Stateless Reset Token field includes the 5635 stateless reset token associated with the connection ID. The 5636 format of this transport parameter is shown in Figure 22. 5638 The Connection ID field and the Stateless Reset Token field 5639 contain an alternative connection ID that has a sequence number of 5640 1; see Section 5.1.1.
Having these values sent alongside the 5641 preferred address ensures that there will be at least one unused 5642 active connection ID when the client initiates migration to the 5643 preferred address. 5645 The Connection ID and Stateless Reset Token fields of a preferred 5646 address are identical in syntax and semantics to the corresponding 5647 fields of a NEW_CONNECTION_ID frame (Section 19.15). A server 5648 that chooses a zero-length connection ID MUST NOT provide a 5649 preferred address. Similarly, a server MUST NOT include a zero- 5650 length connection ID in this transport parameter. A client MUST 5651 treat violation of these requirements as a connection error of 5652 type TRANSPORT_PARAMETER_ERROR. 5654 Preferred Address { 5655 IPv4 Address (32), 5656 IPv4 Port (16), 5657 IPv6 Address (128), 5658 IPv6 Port (16), 5659 Connection ID Length (8), 5660 Connection ID (..), 5661 Stateless Reset Token (128), 5662 } 5664 Figure 22: Preferred Address format 5666 active_connection_id_limit (0x0e): The active connection ID limit is 5667 an integer value specifying the maximum number of connection IDs 5668 from the peer that an endpoint is willing to store. This value 5669 includes the connection ID received during the handshake, that 5670 received in the preferred_address transport parameter, and those 5671 received in NEW_CONNECTION_ID frames. The value of the 5672 active_connection_id_limit parameter MUST be at least 2. An 5673 endpoint that receives a value less than 2 MUST close the 5674 connection with an error of type TRANSPORT_PARAMETER_ERROR. If 5675 this transport parameter is absent, a default of 2 is assumed. If 5676 an endpoint issues a zero-length connection ID, it will never send 5677 a NEW_CONNECTION_ID frame and therefore ignores the 5678 active_connection_id_limit value received from its peer. 
5680 initial_source_connection_id (0x0f): The value that the endpoint 5681 included in the Source Connection ID field of the first Initial 5682 packet it sends for the connection; see Section 7.3. 5684 retry_source_connection_id (0x10): The value that the server 5685 included in the Source Connection ID field of a Retry packet; see 5686 Section 7.3. This transport parameter is only sent by a server. 5688 If present, transport parameters that set initial flow control limits 5689 (initial_max_stream_data_bidi_local, 5690 initial_max_stream_data_bidi_remote, and initial_max_stream_data_uni) 5691 are equivalent to sending a MAX_STREAM_DATA frame (Section 19.10) on 5692 every stream of the corresponding type immediately after opening. If 5693 the transport parameter is absent, streams of that type start with a 5694 flow control limit of 0. 5696 A client MUST NOT include any server-only transport parameter: 5697 original_destination_connection_id, preferred_address, 5698 retry_source_connection_id, or stateless_reset_token. A server MUST 5699 treat receipt of any of these transport parameters as a connection 5700 error of type TRANSPORT_PARAMETER_ERROR. 5702 19. Frame Types and Formats 5704 As described in Section 12.4, packets contain one or more frames. 5705 This section describes the format and semantics of the core QUIC 5706 frame types. 5708 19.1. PADDING Frames 5710 A PADDING frame (type=0x00) has no semantic value. PADDING frames 5711 can be used to increase the size of a packet. Padding can be used to 5712 increase an initial client packet to the minimum required size, or to 5713 provide protection against traffic analysis for protected packets. 5715 PADDING frames are formatted as shown in Figure 23, which shows that 5716 PADDING frames have no content. That is, a PADDING frame consists of 5717 the single byte that identifies the frame as a PADDING frame. 5719 PADDING Frame { 5720 Type (i) = 0x00, 5721 } 5723 Figure 23: PADDING Frame Format 5725 19.2. 
PING Frames 5727 Endpoints can use PING frames (type=0x01) to verify that their peers 5728 are still alive or to check reachability to the peer. 5730 PING frames are formatted as shown in Figure 24, which shows that 5731 PING frames have no content. 5733 PING Frame { 5734 Type (i) = 0x01, 5735 } 5737 Figure 24: PING Frame Format 5739 The receiver of a PING frame simply needs to acknowledge the packet 5740 containing this frame. 5742 The PING frame can be used to keep a connection alive when an 5743 application or application protocol wishes to prevent the connection 5744 from timing out; see Section 10.1.2. 5746 19.3. ACK Frames 5748 Receivers send ACK frames (types 0x02 and 0x03) to inform senders of 5749 packets they have received and processed. The ACK frame contains one 5750 or more ACK Ranges. ACK Ranges identify acknowledged packets. If 5751 the frame type is 0x03, ACK frames also contain the sum of QUIC 5752 packets with associated ECN marks received on the connection up until 5753 this point. QUIC implementations MUST properly handle both types 5754 and, if they have enabled ECN for packets they send, they SHOULD use 5755 the information in the ECN section to manage their congestion state. 5757 QUIC acknowledgements are irrevocable. Once acknowledged, a packet 5758 remains acknowledged, even if it does not appear in a future ACK 5759 frame. This is unlike reneging for TCP SACKs ([RFC2018]). 5761 Packets from different packet number spaces can be identified using 5762 the same numeric value. An acknowledgment for a packet needs to 5763 indicate both a packet number and a packet number space. This is 5764 accomplished by having each ACK frame only acknowledge packet numbers 5765 in the same space as the packet in which the ACK frame is contained. 5767 Version Negotiation and Retry packets cannot be acknowledged because 5768 they do not contain a packet number. 
Rather than relying on ACK 5769 frames, these packets are implicitly acknowledged by the next Initial 5770 packet sent by the client. 5772 ACK frames are formatted as shown in Figure 25. 5774 ACK Frame { 5775 Type (i) = 0x02..0x03, 5776 Largest Acknowledged (i), 5777 ACK Delay (i), 5778 ACK Range Count (i), 5779 First ACK Range (i), 5780 ACK Range (..) ..., 5781 [ECN Counts (..)], 5782 } 5784 Figure 25: ACK Frame Format 5786 ACK frames contain the following fields: 5788 Largest Acknowledged: A variable-length integer representing the 5789 largest packet number the peer is acknowledging; this is usually 5790 the largest packet number that the peer has received prior to 5791 generating the ACK frame. Unlike the packet number in the QUIC 5792 long or short header, the value in an ACK frame is not truncated. 5794 ACK Delay: A variable-length integer encoding the acknowledgement 5795 delay in microseconds; see Section 13.2.5. It is decoded by 5796 multiplying the value in the field by 2 to the power of the 5797 ack_delay_exponent transport parameter sent by the sender of the 5798 ACK frame; see Section 18.2. Compared to simply expressing the 5799 delay as an integer, this encoding allows for a larger range of 5800 values within the same number of bytes, at the cost of lower 5801 resolution. 5803 ACK Range Count: A variable-length integer specifying the number of 5804 Gap and ACK Range fields in the frame. 5806 First ACK Range: A variable-length integer indicating the number of 5807 contiguous packets preceding the Largest Acknowledged that are 5808 being acknowledged. The First ACK Range is encoded as an ACK 5809 Range; see Section 19.3.1 starting from the Largest Acknowledged. 5810 That is, the smallest packet acknowledged in the range is 5811 determined by subtracting the First ACK Range value from the 5812 Largest Acknowledged. 
5814 ACK Ranges: Contains additional ranges of packets that are 5815 alternately not acknowledged (Gap) and acknowledged (ACK Range); 5816 see Section 19.3.1. 5818 ECN Counts: The three ECN Counts; see Section 19.3.2. 5820 19.3.1. ACK Ranges 5822 Each ACK Range consists of alternating Gap and ACK Range Length values in 5823 descending packet number order. ACK Ranges can be repeated. The 5824 number of Gap and ACK Range Length values is determined by the ACK Range 5825 Count field; one of each value is present for each value in the ACK 5826 Range Count field. 5828 ACK Ranges are structured as shown in Figure 26. 5830 ACK Range { 5831 Gap (i), 5832 ACK Range Length (i), 5833 } 5835 Figure 26: ACK Ranges 5837 The fields that form each ACK Range are: 5839 Gap: A variable-length integer indicating the number of contiguous 5840 unacknowledged packets preceding the packet number one lower than 5841 the smallest in the preceding ACK Range. 5843 ACK Range Length: A variable-length integer indicating the number of 5844 contiguous acknowledged packets preceding the largest packet 5845 number, as determined by the preceding Gap. 5847 Gap and ACK Range Length values use a relative integer encoding for 5848 efficiency. Though each encoded value is positive, the values are 5849 subtracted, so that each ACK Range describes progressively lower- 5850 numbered packets. 5852 Each ACK Range acknowledges a contiguous range of packets by 5853 indicating the number of acknowledged packets that precede the 5854 largest packet number in that range. A value of zero indicates that 5855 only the largest packet number is acknowledged. Larger ACK Range 5856 values indicate a larger range, with corresponding lower values for 5857 the smallest packet number in the range.
Thus, given a largest 5858 packet number for the range, the smallest value is determined by the 5859 formula: 5861 smallest = largest - ack_range 5863 An ACK Range acknowledges all packets between the smallest packet 5864 number and the largest, inclusive. 5866 The largest value for an ACK Range is determined by cumulatively 5867 subtracting the size of all preceding ACK Ranges and Gaps. 5869 Each Gap indicates a range of packets that are not being 5870 acknowledged. The number of packets in the gap is one higher than 5871 the encoded value of the Gap field. 5873 The value of the Gap field establishes the largest packet number 5874 value for the subsequent ACK Range using the following formula: 5876 largest = previous_smallest - gap - 2 5878 If any computed packet number is negative, an endpoint MUST generate 5879 a connection error of type FRAME_ENCODING_ERROR. 5881 19.3.2. ECN Counts 5883 The ACK frame uses the least significant bit (that is, type 0x03) to 5884 indicate ECN feedback and report receipt of QUIC packets with 5885 associated ECN codepoints of ECT(0), ECT(1), or CE in the packet's IP 5886 header. ECN Counts are only present when the ACK frame type is 0x03. 5888 When present, there are 3 ECN counts, as shown in Figure 27. 5890 ECN Counts { 5891 ECT0 Count (i), 5892 ECT1 Count (i), 5893 ECN-CE Count (i), 5894 } 5896 Figure 27: ECN Count Format 5898 The three ECN Counts are: 5900 ECT0 Count: A variable-length integer representing the total number 5901 of packets received with the ECT(0) codepoint in the packet number 5902 space of the ACK frame. 5904 ECT1 Count: A variable-length integer representing the total number 5905 of packets received with the ECT(1) codepoint in the packet number 5906 space of the ACK frame. 5908 ECN-CE Count: A variable-length integer representing the total number of 5909 packets received with the CE codepoint in the packet number space 5910 of the ACK frame. 5912 ECN counts are maintained separately for each packet number space.
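The two formulas of Section 19.3.1 can be applied in sequence to recover the acknowledged packet numbers from an ACK frame. The following is a minimal sketch, assuming the fields have already been decoded from their variable-length integer form; the names decode_ack_ranges and FrameEncodingError are hypothetical.

```python
class FrameEncodingError(ValueError):
    """Stands in for a connection error of type FRAME_ENCODING_ERROR."""


def decode_ack_ranges(largest_acknowledged, first_ack_range, ack_ranges):
    """Return acknowledged ranges as (smallest, largest) inclusive pairs.

    ack_ranges is a list of (gap, ack_range_length) pairs in the order
    they appear in the frame, that is, in descending packet number order.
    """
    largest = largest_acknowledged
    # smallest = largest - ack_range
    smallest = largest - first_ack_range
    if smallest < 0:
        raise FrameEncodingError("negative packet number in First ACK Range")
    acknowledged = [(smallest, largest)]

    for gap, ack_range_length in ack_ranges:
        # largest = previous_smallest - gap - 2
        largest = smallest - gap - 2
        if largest < 0:
            raise FrameEncodingError("negative packet number after Gap")
        # smallest = largest - ack_range
        smallest = largest - ack_range_length
        if smallest < 0:
            raise FrameEncodingError("negative packet number in ACK Range")
        acknowledged.append((smallest, largest))

    return acknowledged
```

With Largest Acknowledged 100, First ACK Range 2, and the pairs (1, 3) and (0, 0), the sketch yields the ranges 98 to 100, 92 to 95, and 90 to 90. Packets 96 and 97 fall in the first gap (encoded value 1, so two packets), and packet 91 falls in the second (encoded value 0, so one packet).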
5914 19.4. RESET_STREAM Frames 5916 An endpoint uses a RESET_STREAM frame (type=0x04) to abruptly 5917 terminate the sending part of a stream. 5919 After sending a RESET_STREAM, an endpoint ceases transmission and 5920 retransmission of STREAM frames on the identified stream. A receiver 5921 of RESET_STREAM can discard any data that it already received on that 5922 stream. 5924 An endpoint that receives a RESET_STREAM frame for a send-only stream 5925 MUST terminate the connection with error STREAM_STATE_ERROR. 5927 RESET_STREAM frames are formatted as shown in Figure 28. 5929 RESET_STREAM Frame { 5930 Type (i) = 0x04, 5931 Stream ID (i), 5932 Application Protocol Error Code (i), 5933 Final Size (i), 5934 } 5936 Figure 28: RESET_STREAM Frame Format 5938 RESET_STREAM frames contain the following fields: 5940 Stream ID: A variable-length integer encoding of the Stream ID of 5941 the stream being terminated. 5943 Application Protocol Error Code: A variable-length integer 5944 containing the application protocol error code (see Section 20.2) 5945 that indicates why the stream is being closed. 5947 Final Size: A variable-length integer indicating the final size of 5948 the stream as determined by the RESET_STREAM sender, in units of bytes; see 5949 Section 4.4. 5951 19.5. STOP_SENDING Frames 5953 An endpoint uses a STOP_SENDING frame (type=0x05) to communicate that 5954 incoming data is being discarded on receipt per application request. 5955 STOP_SENDING requests that a peer cease transmission on a stream. 5957 A STOP_SENDING frame can be sent for streams in the Recv or Size 5958 Known states; see Section 3.1. Receiving a STOP_SENDING frame for a 5959 locally-initiated stream that has not yet been created MUST be 5960 treated as a connection error of type STREAM_STATE_ERROR. An 5961 endpoint that receives a STOP_SENDING frame for a receive-only stream 5962 MUST terminate the connection with error STREAM_STATE_ERROR. 5964 STOP_SENDING frames are formatted as shown in Figure 29.
5966 STOP_SENDING Frame { 5967 Type (i) = 0x05, 5968 Stream ID (i), 5969 Application Protocol Error Code (i), 5970 } 5972 Figure 29: STOP_SENDING Frame Format 5974 STOP_SENDING frames contain the following fields: 5976 Stream ID: A variable-length integer carrying the Stream ID of the 5977 stream being ignored. 5979 Application Protocol Error Code: A variable-length integer 5980 containing the application-specified reason the sender is ignoring 5981 the stream; see Section 20.2. 5983 19.6. CRYPTO Frames 5985 A CRYPTO frame (type=0x06) is used to transmit cryptographic 5986 handshake messages. It can be sent in all packet types except 0-RTT. 5987 The CRYPTO frame offers the cryptographic protocol an in-order stream 5988 of bytes. CRYPTO frames are functionally identical to STREAM frames, 5989 except that they do not bear a stream identifier; they are not flow 5990 controlled; and they do not carry markers for optional offset, 5991 optional length, and the end of the stream. 5993 CRYPTO frames are formatted as shown in Figure 30. 5995 CRYPTO Frame { 5996 Type (i) = 0x06, 5997 Offset (i), 5998 Length (i), 5999 Crypto Data (..), 6000 } 6002 Figure 30: CRYPTO Frame Format 6004 CRYPTO frames contain the following fields: 6006 Offset: A variable-length integer specifying the byte offset in the 6007 stream for the data in this CRYPTO frame. 6009 Length: A variable-length integer specifying the length of the 6010 Crypto Data field in this CRYPTO frame. 6012 Crypto Data: The cryptographic message data. 6014 There is a separate flow of cryptographic handshake data in each 6015 encryption level, each of which starts at an offset of 0. This 6016 implies that each encryption level is treated as a separate CRYPTO 6017 stream of data. 6019 The largest offset delivered on a stream - the sum of the offset and 6020 data length - cannot exceed 2^62-1. 
Receipt of a frame that exceeds 6021 this limit MUST be treated as a connection error of type 6022 FRAME_ENCODING_ERROR or CRYPTO_BUFFER_EXCEEDED. 6024 Unlike STREAM frames, which include a Stream ID indicating to which 6025 stream the data belongs, the CRYPTO frame carries data for a single 6026 stream per encryption level. The stream does not have an explicit 6027 end, so CRYPTO frames do not have a FIN bit. 6029 19.7. NEW_TOKEN Frames 6031 A server sends a NEW_TOKEN frame (type=0x07) to provide the client 6032 with a token to send in the header of an Initial packet for a future 6033 connection. 6035 NEW_TOKEN frames are formatted as shown in Figure 31. 6037 NEW_TOKEN Frame { 6038 Type (i) = 0x07, 6039 Token Length (i), 6040 Token (..), 6041 } 6043 Figure 31: NEW_TOKEN Frame Format 6045 NEW_TOKEN frames contain the following fields: 6047 Token Length: A variable-length integer specifying the length of the 6048 token in bytes. 6050 Token: An opaque blob that the client may use with a future Initial 6051 packet. The token MUST NOT be empty. An endpoint MUST treat 6052 receipt of a NEW_TOKEN frame with an empty Token field as a 6053 connection error of type FRAME_ENCODING_ERROR. 6055 An endpoint might receive multiple NEW_TOKEN frames that contain the 6056 same token value if packets containing the frame are incorrectly 6057 determined to be lost. Endpoints are responsible for discarding 6058 duplicate values, which might be used to link connection attempts; 6059 see Section 8.1.3. 6061 Clients MUST NOT send NEW_TOKEN frames. Servers MUST treat receipt 6062 of a NEW_TOKEN frame as a connection error of type 6063 PROTOCOL_VIOLATION. 6065 19.8. STREAM Frames 6067 STREAM frames implicitly create a stream and carry stream data. The 6068 STREAM frame Type field takes the form 0b00001XXX (or the set of 6069 values from 0x08 to 0x0f). 
The three low-order bits of the frame 6070 type determine the fields that are present in the frame: 6072 * The OFF bit (0x04) in the frame type is set to indicate that there 6073 is an Offset field present. When set to 1, the Offset field is 6074 present. When set to 0, the Offset field is absent and the Stream 6075 Data starts at an offset of 0 (that is, the frame contains the 6076 first bytes of the stream, or the end of a stream that includes no 6077 data). 6079 * The LEN bit (0x02) in the frame type is set to indicate that there 6080 is a Length field present. If this bit is set to 0, the Length 6081 field is absent and the Stream Data field extends to the end of 6082 the packet. If this bit is set to 1, the Length field is present. 6084 * The FIN bit (0x01) of the frame type is set only on frames that 6085 contain the final size of the stream. Setting this bit indicates 6086 that the frame marks the end of the stream. 6088 An endpoint MUST terminate the connection with error 6089 STREAM_STATE_ERROR if it receives a STREAM frame for a locally- 6090 initiated stream that has not yet been created, or for a send-only 6091 stream. 6093 STREAM frames are formatted as shown in Figure 32. 6095 STREAM Frame { 6096 Type (i) = 0x08..0x0f, 6097 Stream ID (i), 6098 [Offset (i)], 6099 [Length (i)], 6100 Stream Data (..), 6101 } 6103 Figure 32: STREAM Frame Format 6105 STREAM frames contain the following fields: 6107 Stream ID: A variable-length integer indicating the stream ID of the 6108 stream; see Section 2.1. 6110 Offset: A variable-length integer specifying the byte offset in the 6111 stream for the data in this STREAM frame. This field is present 6112 when the OFF bit is set to 1. When the Offset field is absent, 6113 the offset is 0. 6115 Length: A variable-length integer specifying the length of the 6116 Stream Data field in this STREAM frame. This field is present 6117 when the LEN bit is set to 1. 
When the LEN bit is set to 0, the 6118 Stream Data field consumes all the remaining bytes in the packet. 6120 Stream Data: The bytes from the designated stream to be delivered. 6122 When a Stream Data field has a length of 0, the offset in the STREAM 6123 frame is the offset of the next byte that would be sent. 6125 The first byte in the stream has an offset of 0. The largest offset 6126 delivered on a stream - the sum of the offset and data length - 6127 cannot exceed 2^62-1, as it is not possible to provide flow control 6128 credit for that data. Receipt of a frame that exceeds this limit 6129 MUST be treated as a connection error of type FRAME_ENCODING_ERROR or 6130 FLOW_CONTROL_ERROR. 6132 19.9. MAX_DATA Frames 6134 A MAX_DATA frame (type=0x10) is used in flow control to inform the 6135 peer of the maximum amount of data that can be sent on the connection 6136 as a whole. 6138 MAX_DATA frames are formatted as shown in Figure 33. 6140 MAX_DATA Frame { 6141 Type (i) = 0x10, 6142 Maximum Data (i), 6143 } 6145 Figure 33: MAX_DATA Frame Format 6147 MAX_DATA frames contain the following field: 6149 Maximum Data: A variable-length integer indicating the maximum 6150 amount of data that can be sent on the entire connection, in units 6151 of bytes. 6153 All data sent in STREAM frames counts toward this limit. The sum of 6154 the largest received offsets on all streams - including streams in 6155 terminal states - MUST NOT exceed the value advertised by a receiver. 6156 An endpoint MUST terminate a connection with a FLOW_CONTROL_ERROR 6157 error if it receives more data than the maximum data value that it 6158 has sent. This includes violations of remembered limits in Early 6159 Data; see Section 7.4.1. 6161 19.10. MAX_STREAM_DATA Frames 6163 A MAX_STREAM_DATA frame (type=0x11) is used in flow control to inform 6164 a peer of the maximum amount of data that can be sent on a stream. 
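The connection-level accounting described for MAX_DATA in Section 19.9 can be sketched as follows. This is a minimal, non-normative illustration: the class and method names are hypothetical, the exception class merely stands in for a connection error of type FLOW_CONTROL_ERROR, and the sketch assumes that "largest received offset" means the offset of the end of the data carried in a STREAM frame.

```python
class FlowControlError(Exception):
    """Stands in for a connection error of type FLOW_CONTROL_ERROR."""


class ConnectionFlowControl:
    """Tracks received data against the Maximum Data value sent to the peer."""

    def __init__(self, max_data):
        self.max_data = max_data      # largest Maximum Data value advertised
        # Stream ID -> largest received offset. Entries are never removed,
        # so streams in terminal states continue to count toward the limit.
        self.largest_offset = {}

    def on_stream_data(self, stream_id, offset, length):
        """Account for one STREAM frame; return the connection-wide total."""
        previous = self.largest_offset.get(stream_id, 0)
        # Retransmitted or reordered data might not raise the largest offset.
        new_largest = max(previous, offset + length)
        total = sum(self.largest_offset.values()) - previous + new_largest
        if total > self.max_data:
            raise FlowControlError("connection-level limit exceeded")
        self.largest_offset[stream_id] = new_largest
        return total
```

The same structure applies per stream for MAX_STREAM_DATA, with a single largest received offset compared against the largest Maximum Stream Data value sent for that stream.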
6166 A MAX_STREAM_DATA frame can be sent for streams in the Recv state; 6167 see Section 3.1. Receiving a MAX_STREAM_DATA frame for a locally- 6168 initiated stream that has not yet been created MUST be treated as a 6169 connection error of type STREAM_STATE_ERROR. An endpoint that 6170 receives a MAX_STREAM_DATA frame for a receive-only stream MUST 6171 terminate the connection with error STREAM_STATE_ERROR. 6173 MAX_STREAM_DATA frames are formatted as shown in Figure 34. 6175 MAX_STREAM_DATA Frame { 6176 Type (i) = 0x11, 6177 Stream ID (i), 6178 Maximum Stream Data (i), 6179 } 6181 Figure 34: MAX_STREAM_DATA Frame Format 6183 MAX_STREAM_DATA frames contain the following fields: 6185 Stream ID: The stream ID of the affected stream, encoded as a 6186 variable-length integer. 6188 Maximum Stream Data: A variable-length integer indicating the 6189 maximum amount of data that can be sent on the identified stream, 6190 in units of bytes. 6192 When counting data toward this limit, an endpoint accounts for the 6193 largest received offset of data that is sent or received on the 6194 stream. Loss or reordering can mean that the largest received offset 6195 on a stream can be greater than the total size of data received on 6196 that stream. Receiving STREAM frames might not increase the largest 6197 received offset. 6199 The data sent on a stream MUST NOT exceed the largest maximum stream 6200 data value advertised by the receiver. An endpoint MUST terminate a 6201 connection with an error of type FLOW_CONTROL_ERROR if it receives more data 6202 than the largest maximum stream data that it has sent for the 6203 affected stream. This includes violations of remembered limits in 6204 Early Data; see Section 7.4.1. 6206 19.11. MAX_STREAMS Frames 6208 A MAX_STREAMS frame (type=0x12 or 0x13) informs the peer of the 6209 cumulative number of streams of a given type it is permitted to open.
6210 A MAX_STREAMS frame with a type of 0x12 applies to bidirectional 6211 streams, and a MAX_STREAMS frame with a type of 0x13 applies to 6212 unidirectional streams. 6214 MAX_STREAMS frames are formatted as shown in Figure 35. 6216 MAX_STREAMS Frame { 6217 Type (i) = 0x12..0x13, 6218 Maximum Streams (i), 6219 } 6221 Figure 35: MAX_STREAMS Frame Format 6223 MAX_STREAMS frames contain the following field: 6225 Maximum Streams: A count of the cumulative number of streams of the 6226 corresponding type that can be opened over the lifetime of the 6227 connection. This value cannot exceed 2^60, as it is not possible 6228 to encode stream IDs larger than 2^62-1. Receipt of a frame that 6229 permits opening of a stream larger than this limit MUST be treated 6230 as a connection error of type FRAME_ENCODING_ERROR. 6232 Loss or reordering can cause an endpoint to receive a MAX_STREAMS frame that 6233 states a lower stream limit than the endpoint has previously received. 6234 MAX_STREAMS frames that do not increase the stream limit MUST be 6235 ignored. 6237 An endpoint MUST NOT open more streams than permitted by the current 6238 stream limit set by its peer. For instance, a server that receives a 6239 unidirectional stream limit of 3 is permitted to open streams 3, 7, 6240 and 11, but not stream 15. An endpoint MUST terminate a connection 6241 with an error of type STREAM_LIMIT_ERROR if a peer opens more streams than was 6242 permitted. This includes violations of remembered limits in Early 6243 Data; see Section 7.4.1. 6245 Note that these frames (and the corresponding transport parameters) 6246 do not describe the number of streams that can be opened 6247 concurrently. The limit includes streams that have been closed as 6248 well as those that are open. 6250 19.12. DATA_BLOCKED Frames 6252 A sender SHOULD send a DATA_BLOCKED frame (type=0x14) when it wishes 6253 to send data, but is unable to do so due to connection-level flow 6254 control; see Section 4.
DATA_BLOCKED frames can be used as input to 6255 tuning of flow control algorithms; see Section 4.2. 6257 DATA_BLOCKED frames are formatted as shown in Figure 36. 6259 DATA_BLOCKED Frame { 6260 Type (i) = 0x14, 6261 Maximum Data (i), 6262 } 6264 Figure 36: DATA_BLOCKED Frame Format 6266 DATA_BLOCKED frames contain the following field: 6268 Maximum Data: A variable-length integer indicating the connection- 6269 level limit at which blocking occurred. 6271 19.13. STREAM_DATA_BLOCKED Frames 6273 A sender SHOULD send a STREAM_DATA_BLOCKED frame (type=0x15) when it 6274 wishes to send data, but is unable to do so due to stream-level flow 6275 control. This frame is analogous to DATA_BLOCKED (Section 19.12). 6277 An endpoint that receives a STREAM_DATA_BLOCKED frame for a send-only 6278 stream MUST terminate the connection with error STREAM_STATE_ERROR. 6280 STREAM_DATA_BLOCKED frames are formatted as shown in Figure 37. 6282 STREAM_DATA_BLOCKED Frame { 6283 Type (i) = 0x15, 6284 Stream ID (i), 6285 Maximum Stream Data (i), 6286 } 6288 Figure 37: STREAM_DATA_BLOCKED Frame Format 6290 STREAM_DATA_BLOCKED frames contain the following fields: 6292 Stream ID: A variable-length integer indicating the stream that is 6293 blocked due to flow control. 6295 Maximum Stream Data: A variable-length integer indicating the offset 6296 of the stream at which the blocking occurred. 6298 19.14. STREAMS_BLOCKED Frames 6300 A sender SHOULD send a STREAMS_BLOCKED frame (type=0x16 or 0x17) when 6301 it wishes to open a stream, but is unable to do so due to the maximum 6302 stream limit set by its peer; see Section 19.11. A STREAMS_BLOCKED 6303 frame of type 0x16 is used to indicate reaching the bidirectional 6304 stream limit, and a STREAMS_BLOCKED frame of type 0x17 is used to 6305 indicate reaching the unidirectional stream limit.
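The relationship between a cumulative stream limit and the stream IDs it permits, which underlies both MAX_STREAMS (Section 19.11) and STREAMS_BLOCKED, can be sketched as follows. The sketch is non-normative and relies on the stream ID layout of Section 2.1, in which the two least significant bits of a stream ID identify the stream type; the function names and the exception class are illustrative only.

```python
MAX_STREAMS_LIMIT = 2**60   # larger counts would need stream IDs above 2^62-1


class FrameEncodingError(Exception):
    """Stands in for a connection error of type FRAME_ENCODING_ERROR."""


def check_maximum_streams(maximum_streams):
    """Validate a received Maximum Streams value and return it."""
    if maximum_streams > MAX_STREAMS_LIMIT:
        raise FrameEncodingError("Maximum Streams exceeds 2^60")
    return maximum_streams


def may_open(stream_id, stream_limit):
    """True if stream_limit, for this stream's type, permits stream_id."""
    # Dropping the two type bits gives the zero-based index of the stream
    # among streams of its type; a limit of N permits indices 0 to N-1.
    return (stream_id >> 2) < stream_limit


def highest_permitted_stream_id(stream_limit, stream_type_bits):
    """Largest stream ID of the given type under stream_limit, or None."""
    if stream_limit == 0:
        return None
    return ((stream_limit - 1) << 2) | stream_type_bits
```

With a unidirectional stream limit of 3, a server (stream type bits 0x3) can open streams 3, 7, and 11, so may_open(15, 3) is False. A sender that finds may_open() false for the next stream it needs is in the situation that STREAMS_BLOCKED reports.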
6307 A STREAMS_BLOCKED frame does not open the stream, but informs the 6308 peer that a new stream was needed and the stream limit prevented the 6309 creation of the stream. 6311 STREAMS_BLOCKED frames are formatted as shown in Figure 38. 6313 STREAMS_BLOCKED Frame { 6314 Type (i) = 0x16..0x17, 6315 Maximum Streams (i), 6316 } 6318 Figure 38: STREAMS_BLOCKED Frame Format 6320 STREAMS_BLOCKED frames contain the following field: 6322 Maximum Streams: A variable-length integer indicating the maximum 6323 number of streams allowed at the time the frame was sent. This 6324 value cannot exceed 2^60, as it is not possible to encode stream 6325 IDs larger than 2^62-1. Receipt of a frame that encodes a larger 6326 stream ID MUST be treated as a STREAM_LIMIT_ERROR or a 6327 FRAME_ENCODING_ERROR. 6329 19.15. NEW_CONNECTION_ID Frames 6331 An endpoint sends a NEW_CONNECTION_ID frame (type=0x18) to provide 6332 its peer with alternative connection IDs that can be used to break 6333 linkability when migrating connections; see Section 9.5. 6335 NEW_CONNECTION_ID frames are formatted as shown in Figure 39. 6337 NEW_CONNECTION_ID Frame { 6338 Type (i) = 0x18, 6339 Sequence Number (i), 6340 Retire Prior To (i), 6341 Length (8), 6342 Connection ID (8..160), 6343 Stateless Reset Token (128), 6344 } 6346 Figure 39: NEW_CONNECTION_ID Frame Format 6348 NEW_CONNECTION_ID frames contain the following fields: 6350 Sequence Number: The sequence number assigned to the connection ID 6351 by the sender, encoded as a variable-length integer; see 6352 Section 5.1.1. 6354 Retire Prior To: A variable-length integer indicating which 6355 connection IDs should be retired; see Section 5.1.2. 6357 Length: An 8-bit unsigned integer containing the length of the 6358 connection ID. Values less than 1 and greater than 20 are invalid 6359 and MUST be treated as a connection error of type 6360 FRAME_ENCODING_ERROR. 6362 Connection ID: A connection ID of the specified length. 
6364 Stateless Reset Token: A 128-bit value that will be used for a 6365 stateless reset when the associated connection ID is used; see 6366 Section 10.3. 6368 An endpoint MUST NOT send this frame if it currently requires that 6369 its peer send packets with a zero-length Destination Connection ID. 6370 Changing the length of a connection ID to or from zero-length makes 6371 it difficult to identify when the value of the connection ID changed. 6372 An endpoint that is sending packets with a zero-length Destination 6373 Connection ID MUST treat receipt of a NEW_CONNECTION_ID frame as a 6374 connection error of type PROTOCOL_VIOLATION. 6376 Transmission errors, timeouts and retransmissions might cause the 6377 same NEW_CONNECTION_ID frame to be received multiple times. Receipt 6378 of the same frame multiple times MUST NOT be treated as a connection 6379 error. A receiver can use the sequence number supplied in the 6380 NEW_CONNECTION_ID frame to handle receiving the same 6381 NEW_CONNECTION_ID frame multiple times. 6383 If an endpoint receives a NEW_CONNECTION_ID frame that repeats a 6384 previously issued connection ID with a different Stateless Reset 6385 Token or a different sequence number, or if a sequence number is used 6386 for different connection IDs, the endpoint MAY treat that receipt as 6387 a connection error of type PROTOCOL_VIOLATION. 6389 The Retire Prior To field counts connection IDs established during 6390 connection setup and the preferred_address transport parameter; see 6391 Section 5.1.2. The Retire Prior To field MUST be less than or equal 6392 to the Sequence Number field. Receiving a value greater than the 6393 Sequence Number MUST be treated as a connection error of type 6394 FRAME_ENCODING_ERROR. 6396 Once a sender indicates a Retire Prior To value, smaller values sent 6397 in subsequent NEW_CONNECTION_ID frames have no effect. 
A receiver 6398 MUST ignore any Retire Prior To fields that do not increase the 6399 largest received Retire Prior To value. 6401 An endpoint that receives a NEW_CONNECTION_ID frame with a sequence 6402 number smaller than the Retire Prior To field of a previously 6403 received NEW_CONNECTION_ID frame MUST send a corresponding 6404 RETIRE_CONNECTION_ID frame that retires the newly received connection 6405 ID, unless it has already done so for that sequence number. 6407 19.16. RETIRE_CONNECTION_ID Frames 6409 An endpoint sends a RETIRE_CONNECTION_ID frame (type=0x19) to 6410 indicate that it will no longer use a connection ID that was issued 6411 by its peer. This may include the connection ID provided during the 6412 handshake. Sending a RETIRE_CONNECTION_ID frame also serves as a 6413 request to the peer to send additional connection IDs for future use; 6414 see Section 5.1. New connection IDs can be delivered to a peer using 6415 the NEW_CONNECTION_ID frame (Section 19.15). 6417 Retiring a connection ID invalidates the stateless reset token 6418 associated with that connection ID. 6420 RETIRE_CONNECTION_ID frames are formatted as shown in Figure 40. 6422 RETIRE_CONNECTION_ID Frame { 6423 Type (i) = 0x19, 6424 Sequence Number (i), 6425 } 6427 Figure 40: RETIRE_CONNECTION_ID Frame Format 6429 RETIRE_CONNECTION_ID frames contain the following field: 6431 Sequence Number: The sequence number of the connection ID being 6432 retired; see Section 5.1.2. 6434 Receipt of a RETIRE_CONNECTION_ID frame containing a sequence number 6435 greater than any previously sent to the peer MUST be treated as a 6436 connection error of type PROTOCOL_VIOLATION. 6438 The sequence number specified in a RETIRE_CONNECTION_ID frame MUST 6439 NOT refer to the Destination Connection ID field of the packet in 6440 which the frame is contained. The peer MAY treat this as a 6441 connection error of type PROTOCOL_VIOLATION. 
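The receiver-side rules for NEW_CONNECTION_ID frames in Section 19.15 can be sketched as follows. This is a simplified, non-normative illustration: the class, method, and exception names are hypothetical, the connection ID from the handshake is assumed to have sequence number 0, and the optional PROTOCOL_VIOLATION checks for conflicting reuse of a connection ID or sequence number are omitted.

```python
class FrameEncodingError(Exception):
    """Stands in for a connection error of type FRAME_ENCODING_ERROR."""


class ConnectionIdReceiver:
    """Tracks connection IDs issued by the peer."""

    def __init__(self, handshake_connection_id):
        self.active = {0: handshake_connection_id}  # sequence number -> ID
        self.retired = set()                        # sequence numbers retired
        self.retire_prior_to = 0                    # largest value received

    def on_new_connection_id(self, sequence_number, retire_prior_to, cid):
        """Process one frame; return the sequence numbers that now need a
        RETIRE_CONNECTION_ID frame."""
        if not 1 <= len(cid) <= 20:
            raise FrameEncodingError("invalid connection ID length")
        if retire_prior_to > sequence_number:
            raise FrameEncodingError("Retire Prior To exceeds Sequence Number")
        # Values that do not increase the largest received value are ignored.
        self.retire_prior_to = max(self.retire_prior_to, retire_prior_to)
        # A repeated frame is not an error; an ID that was already retired
        # is not added again.
        if sequence_number not in self.retired:
            self.active.setdefault(sequence_number, cid)
        to_retire = sorted(s for s in self.active if s < self.retire_prior_to)
        for s in to_retire:
            del self.active[s]
            self.retired.add(s)
        return to_retire
```

A frame that arrives late with a sequence number below an already received Retire Prior To value is added and retired in the same call, so the caller sends the corresponding RETIRE_CONNECTION_ID frame exactly once for that sequence number.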
6443 An endpoint cannot send this frame if it was provided with a zero- 6444 length connection ID by its peer. An endpoint that provides a zero- 6445 length connection ID MUST treat receipt of a RETIRE_CONNECTION_ID 6446 frame as a connection error of type PROTOCOL_VIOLATION. 6448 19.17. PATH_CHALLENGE Frames 6450 Endpoints can use PATH_CHALLENGE frames (type=0x1a) to check 6451 reachability to the peer and for path validation during connection 6452 migration. 6454 PATH_CHALLENGE frames are formatted as shown in Figure 41. 6456 PATH_CHALLENGE Frame { 6457 Type (i) = 0x1a, 6458 Data (64), 6459 } 6461 Figure 41: PATH_CHALLENGE Frame Format 6463 PATH_CHALLENGE frames contain the following field: 6465 Data: This 8-byte field contains arbitrary data. 6467 Including 64 bits of entropy in a PATH_CHALLENGE frame ensures that 6468 it is easier to receive the packet than it is to guess the value 6469 correctly. 6471 The recipient of this frame MUST generate a PATH_RESPONSE frame 6472 (Section 19.18) containing the same Data. 6474 19.18. PATH_RESPONSE Frames 6476 A PATH_RESPONSE frame (type=0x1b) is sent in response to a 6477 PATH_CHALLENGE frame. 6479 PATH_RESPONSE frames are formatted as shown in Figure 42, which is 6480 identical to the PATH_CHALLENGE frame (Section 19.17). 6482 PATH_RESPONSE Frame { 6483 Type (i) = 0x1b, 6484 Data (64), 6485 } 6487 Figure 42: PATH_RESPONSE Frame Format 6489 If the content of a PATH_RESPONSE frame does not match the content of 6490 a PATH_CHALLENGE frame previously sent by the endpoint, the endpoint 6491 MAY generate a connection error of type PROTOCOL_VIOLATION. 6493 19.19. CONNECTION_CLOSE Frames 6495 An endpoint sends a CONNECTION_CLOSE frame (type=0x1c or 0x1d) to 6496 notify its peer that the connection is being closed. The 6497 CONNECTION_CLOSE with a frame type of 0x1c is used to signal errors 6498 at only the QUIC layer, or the absence of errors (with the NO_ERROR 6499 code). 
The CONNECTION_CLOSE frame with a type of 0x1d is used to 6500 signal an error with the application that uses QUIC. 6502 If there are open streams that have not been explicitly closed, they 6503 are implicitly closed when the connection is closed. 6505 CONNECTION_CLOSE frames are formatted as shown in Figure 43. 6507 CONNECTION_CLOSE Frame { 6508 Type (i) = 0x1c..0x1d, 6509 Error Code (i), 6510 [Frame Type (i)], 6511 Reason Phrase Length (i), 6512 Reason Phrase (..), 6513 } 6515 Figure 43: CONNECTION_CLOSE Frame Format 6517 CONNECTION_CLOSE frames contain the following fields: 6519 Error Code: A variable-length integer error code that indicates the 6520 reason for closing this connection. A CONNECTION_CLOSE frame of 6521 type 0x1c uses codes from the space defined in Section 20.1. A 6522 CONNECTION_CLOSE frame of type 0x1d uses codes from the 6523 application protocol error code space; see Section 20.2. 6525 Frame Type: A variable-length integer encoding the type of frame 6526 that triggered the error. A value of 0 (equivalent to the mention 6527 of the PADDING frame) is used when the frame type is unknown. The 6528 application-specific variant of CONNECTION_CLOSE (type 0x1d) does 6529 not include this field. 6531 Reason Phrase Length: A variable-length integer specifying the 6532 length of the reason phrase in bytes. Because a CONNECTION_CLOSE 6533 frame cannot be split between packets, any limits on packet size 6534 will also limit the space available for a reason phrase. 6536 Reason Phrase: A human-readable explanation for why the connection 6537 was closed. This can be zero length if the sender chooses not to 6538 give details beyond the Error Code. This SHOULD be a UTF-8 6539 encoded string [RFC3629]. 6541 The application-specific variant of CONNECTION_CLOSE (type 0x1d) can 6542 only be sent using 0-RTT or 1-RTT packets; see Section 4 of 6543 [QUIC-TLS]. 
When an application wishes to abandon a connection 6544 during the handshake, an endpoint can send a CONNECTION_CLOSE frame 6545 (type 0x1c) with an error code of APPLICATION_ERROR in an Initial or 6546 a Handshake packet. 6548 19.20. HANDSHAKE_DONE Frames 6550 The server uses a HANDSHAKE_DONE frame (type=0x1e) to signal 6551 confirmation of the handshake to the client. 6553 HANDSHAKE_DONE frames are formatted as shown in Figure 44, which 6554 shows that HANDSHAKE_DONE frames have no content. 6556 HANDSHAKE_DONE Frame { 6557 Type (i) = 0x1e, 6558 } 6560 Figure 44: HANDSHAKE_DONE Frame Format 6562 A HANDSHAKE_DONE frame can only be sent by the server. Servers MUST 6563 NOT send a HANDSHAKE_DONE frame before completing the handshake. A 6564 server MUST treat receipt of a HANDSHAKE_DONE frame as a connection 6565 error of type PROTOCOL_VIOLATION. 6567 19.21. Extension Frames 6569 QUIC frames do not use a self-describing encoding. An endpoint 6570 therefore needs to understand the syntax of all frames before it can 6571 successfully process a packet. This allows for efficient encoding of 6572 frames, but it means that an endpoint cannot send a frame of a type 6573 that is unknown to its peer. 6575 An extension to QUIC that wishes to use a new type of frame MUST 6576 first ensure that a peer is able to understand the frame. An 6577 endpoint can use a transport parameter to signal its willingness to 6578 receive extension frame types. One transport parameter can indicate 6579 support for one or more extension frame types. 6581 Extensions that modify or replace core protocol functionality 6582 (including frame types) will be difficult to combine with other 6583 extensions that modify or replace the same functionality unless the 6584 behavior of the combination is explicitly defined. Such extensions 6585 SHOULD define their interaction with previously-defined extensions 6586 modifying the same protocol components. 
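The negotiation requirement for extension frames can be sketched as follows. This is a non-normative illustration only: the transport parameter identifier and frame type values below are invented for the example and are not registered code points, and the function names are hypothetical.

```python
# Frame types defined in this document: 0x00 through 0x1e.
CORE_FRAME_TYPES = range(0x00, 0x1f)

# Hypothetical extension registry: transport parameter ID -> the extension
# frame types that the parameter signals willingness to receive. One
# transport parameter can cover more than one frame type.
EXAMPLE_EXTENSIONS = {
    0x40000001: {0x4001, 0x4002},
}


def enabled_frame_types(peer_transport_parameters, registry=EXAMPLE_EXTENSIONS):
    """Extension frame types the peer has indicated it can receive."""
    enabled = set()
    for parameter_id in peer_transport_parameters:
        enabled |= registry.get(parameter_id, set())
    return enabled


def may_send(frame_type, peer_transport_parameters):
    """True if frame_type is a core type or was enabled by the peer."""
    if frame_type in CORE_FRAME_TYPES:
        return True
    return frame_type in enabled_frame_types(peer_transport_parameters)
```

Gating transmission on the transport parameters received from the peer reflects the constraint that frames are not self-describing: a peer that did not advertise support cannot parse the frame, or anything that follows it in the packet.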
6588 Extension frames MUST be congestion controlled and MUST cause an ACK 6589 frame to be sent. The exception is extension frames that replace or 6590 supplement the ACK frame. Extension frames are not included in flow 6591 control unless specified in the extension. 6593 An IANA registry is used to manage the assignment of frame types; see 6594 Section 22.3. 6596 20. Error Codes 6598 QUIC transport error codes and application error codes are 62-bit 6599 unsigned integers. 6601 20.1. Transport Error Codes 6603 This section lists the defined QUIC transport error codes that can be 6604 used in a CONNECTION_CLOSE frame with a type of 0x1c. These errors 6605 apply to the entire connection. 6607 NO_ERROR (0x0): An endpoint uses this with CONNECTION_CLOSE to 6608 signal that the connection is being closed abruptly in the absence 6609 of any error. 6611 INTERNAL_ERROR (0x1): The endpoint encountered an internal error and 6612 cannot continue with the connection. 6614 CONNECTION_REFUSED (0x2): The server refused to accept a new 6615 connection. 6617 FLOW_CONTROL_ERROR (0x3): An endpoint received more data than it 6618 permitted in its advertised data limits; see Section 4. 6620 STREAM_LIMIT_ERROR (0x4): An endpoint received a frame for a stream 6621 identifier that exceeded its advertised stream limit for the 6622 corresponding stream type. 6624 STREAM_STATE_ERROR (0x5): An endpoint received a frame for a stream 6625 that was not in a state that permitted that frame; see Section 3. 6627 FINAL_SIZE_ERROR (0x6): (1) An endpoint received a STREAM frame 6628 containing data that exceeded the previously established final 6629 size, (2) an endpoint received a STREAM frame or a RESET_STREAM 6630 frame containing a final size that was lower than the size of 6631 stream data that was already received, or (3) an endpoint received a 6632 STREAM frame or a RESET_STREAM frame containing a different final 6633 size to the one already established.
6635 FRAME_ENCODING_ERROR (0x7): An endpoint received a frame that was 6636 badly formatted; for instance, a frame of an unknown type or an 6637 ACK frame that has more acknowledgment ranges than the remainder 6638 of the packet could carry. 6640 TRANSPORT_PARAMETER_ERROR (0x8): An endpoint received transport 6641 parameters that were badly formatted, included an invalid value, 6642 omitted a mandatory transport parameter, included a forbidden 6643 transport parameter, or were otherwise in error. 6645 CONNECTION_ID_LIMIT_ERROR (0x9): The number of connection IDs 6646 provided by the peer exceeds the advertised 6647 active_connection_id_limit. 6649 PROTOCOL_VIOLATION (0xa): An endpoint detected an error with 6650 protocol compliance that was not covered by more specific error 6651 codes. 6653 INVALID_TOKEN (0xb): A server received a client Initial that 6654 contained an invalid Token field. 6656 APPLICATION_ERROR (0xc): The application or application protocol 6657 caused the connection to be closed. 6659 CRYPTO_BUFFER_EXCEEDED (0xd): An endpoint has received more data in 6660 CRYPTO frames than it can buffer. 6662 AEAD_LIMIT_REACHED (0xe): An endpoint has reached the 6663 confidentiality or integrity limit for the AEAD algorithm used by 6664 the given connection. 6666 CRYPTO_ERROR (0x1XX): The cryptographic handshake failed. A range 6667 of 256 values is reserved for carrying error codes specific to the 6668 cryptographic handshake that is used. Codes for errors occurring 6669 when TLS is used for the crypto handshake are described in 6670 Section 4.8 of [QUIC-TLS]. 6672 See Section 22.4 for details of registering new error codes. 6674 In defining these error codes, several principles are applied. Error 6675 conditions that might require specific action on the part of a 6676 recipient are given unique codes. Errors that represent common 6677 conditions are given specific codes.
Absent either of these 6678 conditions, error codes are used to identify a general function of 6679 the stack, like flow control or transport parameter handling. 6680 Finally, generic errors are provided for conditions where 6681 implementations are unable or unwilling to use more specific codes. 6683 20.2. Application Protocol Error Codes 6685 The management of application error codes is left to application 6686 protocols. Application protocol error codes are used for the 6687 RESET_STREAM frame (Section 19.4), the STOP_SENDING frame 6688 (Section 19.5), and the CONNECTION_CLOSE frame with a type of 0x1d 6689 (Section 19.19). 6691 21. Security Considerations 6693 21.1. Handshake Denial of Service 6695 As an encrypted and authenticated transport, QUIC provides a range of 6696 protections against denial of service. Once the cryptographic 6697 handshake is complete, QUIC endpoints discard most packets that are 6698 not authenticated, greatly limiting the ability of an attacker to 6699 interfere with existing connections. 6701 Once a connection is established, QUIC endpoints might accept some 6702 unauthenticated ICMP packets (see Section 14.2.1), but the use of 6703 these packets is extremely limited. The only other type of packet 6704 that an endpoint might accept is a stateless reset (Section 10.3), 6705 which relies on the token being kept secret until it is used. 6707 During the creation of a connection, QUIC only provides protection 6708 against attacks from off the network path. All QUIC packets contain 6709 proof that the recipient saw a preceding packet from its peer. 6711 Addresses cannot change during the handshake, so endpoints can 6712 discard packets that are received on a different network path. 6714 The Source and Destination Connection ID fields are the primary means 6715 of protection against an off-path attack during the handshake. These 6716 are required to match those set by a peer.
Except for Initial packets and 6717 stateless resets, an endpoint only accepts packets that 6718 include a Destination Connection ID field that matches a value the 6719 endpoint previously chose. This is the only protection offered for 6720 Version Negotiation packets. 6722 The Destination Connection ID field in an Initial packet is selected 6723 by a client to be unpredictable, which serves an additional purpose. 6724 The packets that carry the cryptographic handshake are protected with 6725 a key that is derived from this connection ID and a salt specific to 6726 the QUIC version. This allows endpoints to use the same process for 6727 authenticating packets that they receive as they use after the 6728 cryptographic handshake completes. Packets that cannot be 6729 authenticated are discarded. Protecting packets in this fashion 6730 provides a strong assurance that the sender of the packet saw the 6731 Initial packet and understood it. 6733 These protections are not intended to be effective against an 6734 attacker that is able to receive QUIC packets prior to the connection 6735 being established. Such an attacker can potentially send packets 6736 that will be accepted by QUIC endpoints. This version of QUIC 6737 attempts to detect this sort of attack, but it expects that endpoints 6738 will fail to establish a connection rather than recovering. For the 6739 most part, the cryptographic handshake protocol [QUIC-TLS] is 6740 responsible for detecting tampering during the handshake. 6742 Endpoints are permitted to use other methods to detect and attempt to 6743 recover from interference with the handshake. Invalid packets can be 6744 identified and discarded using other methods, but no specific method 6745 is mandated in this document. 6747 21.2. Amplification Attack 6749 An attacker might be able to receive an address validation token 6750 (Section 8) from a server and then release the IP address it used to 6751 acquire that token.
At a later time, the attacker may initiate a 6752 0-RTT connection with a server by spoofing this same address, which 6753 might now address a different (victim) endpoint. The attacker can 6754 thus potentially cause the server to send an initial congestion 6755 window's worth of data towards the victim. 6757 Servers SHOULD provide mitigations for this attack by limiting the 6758 usage and lifetime of address validation tokens; see Section 8.1.3. 6760 21.3. Optimistic ACK Attack 6762 An endpoint that acknowledges packets it has not received might cause 6763 a congestion controller to permit sending at rates beyond what the 6764 network supports. An endpoint MAY skip packet numbers when sending 6765 packets to detect this behavior. An endpoint can then immediately 6766 close the connection with a connection error of type 6767 PROTOCOL_VIOLATION; see Section 10.2. 6769 21.4. Request Forgery Attacks 6771 A request forgery attack occurs where an endpoint causes its peer to 6772 issue a request towards a victim, with the request controlled by the 6773 endpoint. Request forgery attacks aim to provide an attacker with 6774 access to capabilities of its peer that might otherwise be 6775 unavailable to the attacker. For a networking protocol, a request 6776 forgery attack is often used to exploit any implicit authorization 6777 conferred on the peer by the victim due to the peer's location in the 6778 network. 6780 For request forgery to be effective, an attacker needs to be able to 6781 influence what packets the peer sends and where these packets are 6782 sent. If an attacker can target a vulnerable service with a 6783 controlled payload, that service might perform actions that are 6784 attributed to the attacker's peer, but decided by the attacker. 
6786 For example, cross-site request forgery [CSRF] exploits on the Web 6787 cause a client to issue requests that include authorization cookies 6788 [COOKIE], allowing one site access to information and actions that 6789 are intended to be restricted to a different site. 6791 As QUIC runs over UDP, the primary attack modality of concern is one 6792 where an attacker can select the address to which its peer sends UDP 6793 datagrams and can control some of the unprotected content of those 6794 packets. As much of the data sent by QUIC endpoints is protected, 6795 this includes control over ciphertext. An attack is successful if an 6796 attacker can cause a peer to send a UDP datagram to a host that will 6797 perform some action based on content in the datagram. 6799 This section discusses ways in which QUIC might be used for request 6800 forgery attacks. 6802 This section also describes limited countermeasures that can be 6803 implemented by QUIC endpoints. These mitigations can be employed 6804 unilaterally by a QUIC implementation or deployment, without 6805 potential targets for request forgery attacks taking action. However, 6806 these countermeasures could be insufficient if UDP-based services do 6807 not properly authorize requests. 6809 21.4.1. Control Options for Endpoints 6811 QUIC offers some opportunities for an attacker to influence or 6812 control where its peer sends UDP datagrams: 6814 * initial connection establishment (Section 7), where a server is 6815 able to choose where a client sends datagrams, for example, by 6816 populating DNS records; 6818 * preferred addresses (Section 9.6), where a server is able to 6819 choose where a client sends datagrams; and 6821 * spoofed connection migrations (Section 9.3.1), where a client is 6822 able to use source address spoofing to select where a server sends 6823 subsequent datagrams. 6825 In all three cases, the attacker can cause its peer to send datagrams 6826 to a victim that might not understand QUIC.
That is, these packets 6827 are sent by the peer prior to address validation; see Section 8. 6829 Outside of the encrypted portion of packets, QUIC offers an endpoint 6830 several options for controlling the content of UDP datagrams that its 6831 peer sends. The Destination Connection ID field offers direct 6832 control over bytes that appear early in packets sent by the peer; see 6833 Section 5.1. The Token field in Initial packets offers a server 6834 control over other bytes of Initial packets; see Section 17.2.2. 6836 There are no measures in this version of QUIC to prevent indirect 6837 control over the encrypted portions of packets. It is necessary to 6838 assume that endpoints are able to control the contents of frames that 6839 a peer sends, especially those frames that convey application data, 6840 such as STREAM frames. Though this depends to some degree on details 6841 of the application protocol, some control is possible in many 6842 protocol usage contexts. As the attacker has access to packet 6843 protection keys, they are likely to be capable of predicting how a 6844 peer will encrypt future packets. Successful control over datagram 6845 content then only requires that the attacker be able to predict the 6846 packet number and placement of frames in packets with some amount of 6847 reliability. 6849 This section assumes that limiting control over datagram content is 6850 not feasible. The focus of the mitigations in subsequent sections is 6851 on limiting the ways in which datagrams that are sent prior to 6852 address validation can be used for request forgery. 6854 21.4.2. Request Forgery with Client Initial Packets 6856 An attacker acting as a server can choose the IP address and port on 6857 which it advertises its availability, so Initial packets from clients 6858 are assumed to be available for use in this sort of attack. 
The 6859 address validation implicit in the handshake ensures that, for a new 6860 connection, a client will not send other types of packets to a 6861 destination that does not understand QUIC or is not willing to accept 6862 a QUIC connection. 6864 Initial packet protection (Section 5.2 of [QUIC-TLS]) makes it 6865 difficult for servers to control the content of Initial packets sent 6866 by clients. A client choosing an unpredictable Destination 6867 Connection ID ensures that servers are unable to control any of the 6868 encrypted portion of Initial packets from clients. 6870 However, the Token field is open to server control and does allow a 6871 server to use clients to mount request forgery attacks. Use of 6872 tokens provided with the NEW_TOKEN frame (Section 8.1.3) offers the 6873 only option for request forgery during connection establishment. 6875 Clients, however, are not obligated to use the NEW_TOKEN frame. 6876 Request forgery attacks that rely on the Token field can be avoided 6877 if clients send an empty Token field when the server address has 6878 changed from when the NEW_TOKEN frame was received. 6880 Therefore, clients SHOULD NOT send a token received in a NEW_TOKEN 6881 frame from one server address in an Initial packet that is sent to a 6882 different server address. As strict equality might reduce the 6883 utility of this mechanism, clients MAY employ heuristics that result 6884 in different server addresses being treated as equivalent, such as 6885 treating addresses with a shared prefix of sufficient length as being 6886 functionally equivalent (for instance, /24 in IPv4 or /56 in IPv6). 6887 In addition, clients SHOULD treat a preferred address that is 6888 successfully validated as equivalent to the address on which the 6889 connection was made; see Section 9.6. 6891 Sending a Retry packet (Section 17.2.5) offers a server the option to 6892 change the Token field.
After sending a Retry, the server can also 6893 control the Destination Connection ID field of subsequent Initial 6894 packets from the client. This also might allow indirect control over 6895 the encrypted content of Initial packets. However, the exchange of a 6896 Retry packet validates the server's address, thereby preventing the 6897 use of subsequent Initial packets for request forgery. 6899 21.4.3. Request Forgery with Preferred Addresses 6901 Servers can specify a preferred address, which clients then migrate 6902 to after confirming the handshake; see Section 9.6. The Destination 6903 Connection ID field of packets that the client sends to a preferred 6904 address can be used for request forgery. 6906 A client MUST NOT send non-probing frames to a preferred address 6907 prior to validating that address; see Section 8. This greatly 6908 reduces the options that a server has to control the encrypted 6909 portion of datagrams. 6911 This document does not offer any additional countermeasures that are 6912 specific to use of preferred addresses and can be implemented by 6913 endpoints. The generic measures described in Section 21.4.5 could be 6914 used as further mitigation. 6916 21.4.4. Request Forgery with Spoofed Migration 6918 Clients are able to present a spoofed source address as part of an 6919 apparent connection migration to cause a server to send datagrams to 6920 that address. 6922 The Destination Connection ID field in any packets that a server 6923 subsequently sends to this spoofed address can be used for request 6924 forgery. A client might also be able to influence the ciphertext. 6926 A server that only sends probing packets (Section 9.1) to an address 6927 prior to address validation provides an attacker with only limited 6928 control over the encrypted portion of datagrams. However, 6929 particularly for NAT rebinding, this can adversely affect 6930 performance. 
If the server sends frames carrying application data, 6931 an attacker might be able to control most of the content of 6932 datagrams. 6934 This document does not offer specific countermeasures that can be 6935 implemented by endpoints aside from the generic measures described in 6936 Section 21.4.5. However, countermeasures for address spoofing at the 6937 network level, in particular ingress filtering [BCP38], are 6938 especially effective against attacks that use spoofing and originate 6939 from an external network. 6941 21.4.5. Generic Request Forgery Countermeasures 6943 The most effective defense against request forgery attacks is to 6944 modify vulnerable services to use strong authentication. However, 6945 this is not always something that is within the control of a QUIC 6946 deployment. This section outlines some other steps that QUIC 6947 endpoints could take unilaterally. These additional steps are all 6948 discretionary as, depending on circumstances, they could interfere 6949 with or prevent legitimate uses. 6951 Services offered over loopback interfaces (that is, the IPv6 address 6952 ::1 or the IPv4 address 127.0.0.1) often lack proper authentication. 6953 Endpoints MAY prevent connection attempts or migration to a loopback 6954 address. Endpoints SHOULD NOT allow connections or migration to a 6955 loopback address if the same service was previously available at a 6956 different interface or if the address was provided by a service at a 6957 non-loopback address. Endpoints that depend on these capabilities 6958 could offer an option to disable these protections. 6960 Similarly, endpoints could regard a change in address to a link-local 6961 address [RFC4291] or an address in a private-use range [RFC1918] from 6962 a global, unique-local [RFC4193], or non-private address as a 6963 potential attempt at request forgery.
Endpoints could refuse to use 6964 these addresses entirely, but that carries a significant risk of 6965 interfering with legitimate uses. Endpoints SHOULD NOT refuse to use 6966 an address unless they have specific knowledge about the network 6967 indicating that sending datagrams to unvalidated addresses in a given 6968 range is not safe. 6970 Endpoints MAY choose to reduce the risk of request forgery by not 6971 including values from NEW_TOKEN frames in Initial packets or by only 6972 sending probing frames in packets prior to completing address 6973 validation. Note that this does not prevent an attacker from using 6974 the Destination Connection ID field for an attack. 6976 Endpoints are not expected to have specific information about the 6977 location of servers that could be vulnerable targets of a request 6978 forgery attack. However, it might be possible over time to identify 6979 specific UDP ports that are common targets of attacks or particular 6980 patterns in datagrams that are used for attacks. Endpoints MAY 6981 choose to avoid sending datagrams to these ports or not send 6982 datagrams that match these patterns prior to validating the 6983 destination address. Endpoints MAY retire connection IDs containing 6984 patterns known to be problematic without using them. 6986 Note: Modifying endpoints to apply these protections is more 6987 efficient than deploying network-based protections, as endpoints 6988 do not need to perform any additional processing when sending to 6989 an address that has been validated. 6991 21.5. Slowloris Attacks 6993 The attacks commonly known as Slowloris ([SLOWLORIS]) try to keep 6994 many connections to the target endpoint open and hold them open as 6995 long as possible. These attacks can be executed against a QUIC 6996 endpoint by generating the minimum amount of activity necessary to 6997 avoid being closed for inactivity. 
This might involve sending small 6998 amounts of data, gradually opening flow control windows in order to 6999 control the sender rate, or manufacturing ACK frames that simulate a 7000 high loss rate. 7002 QUIC deployments SHOULD provide mitigations for the Slowloris 7003 attacks, such as increasing the maximum number of clients the server 7004 will allow, limiting the number of connections a single IP address is 7005 allowed to make, imposing restrictions on the minimum transfer speed 7006 a connection is allowed to have, and restricting the length of time 7007 an endpoint is allowed to stay connected. 7009 21.6. Stream Fragmentation and Reassembly Attacks 7011 An adversarial sender might intentionally send fragments of stream 7012 data in an attempt to cause disproportionate receive buffer memory 7013 commitment and/or creation of a large and inefficient data structure. 7015 An adversarial receiver might intentionally not acknowledge packets 7016 containing stream data in an attempt to force the sender to store the 7017 unacknowledged stream data for retransmission. 7019 The attack on receivers is mitigated if flow control windows 7020 correspond to available memory. However, some receivers will over- 7021 commit memory and advertise flow control offsets in the aggregate 7022 that exceed actual available memory. The over-commitment strategy 7023 can lead to better performance when endpoints are well behaved, but 7024 renders endpoints vulnerable to the stream fragmentation attack. 7026 QUIC deployments SHOULD provide mitigations against stream 7027 fragmentation attacks. Mitigations could consist of avoiding over- 7028 committing memory, limiting the size of tracking data structures, 7029 delaying reassembly of STREAM frames, implementing heuristics based 7030 on the age and duration of reassembly holes, or some combination. 7032 21.7. Stream Commitment Attack 7034 An adversarial endpoint can open a large number of streams, 7035 exhausting state on an endpoint. 
The adversarial endpoint could 7036 repeat the process on a large number of connections, in a manner 7037 similar to SYN flooding attacks in TCP. 7039 Normally, clients will open streams sequentially, as explained in 7040 Section 2.1. However, when several streams are initiated at short 7041 intervals, loss or reordering can cause STREAM frames that open 7042 streams to be received out of sequence. On receiving a higher- 7043 numbered stream ID, a receiver is required to open all intervening 7044 streams of the same type; see Section 3.2. Thus, on a new 7045 connection, opening stream 4000000 opens 1,000,001 client- 7046 initiated bidirectional streams (stream IDs 0, 4, 8, and so on up to 4000000). 7048 The number of active streams is limited by the 7049 initial_max_streams_bidi and initial_max_streams_uni transport 7050 parameters, as explained in Section 4.5. If chosen judiciously, 7051 these limits mitigate the effect of the stream commitment attack. 7052 However, setting the limit too low could affect performance when 7053 applications expect to open a large number of streams. 7055 21.8. Peer Denial of Service 7057 QUIC and TLS both contain frames or messages that have legitimate 7058 uses in some contexts, but that can be abused to cause a peer to 7059 expend processing resources without having any observable impact on 7060 the state of the connection. 7062 Messages can also be used to change and revert state in small or 7063 inconsequential ways, such as by sending small increments to flow 7064 control limits. 7066 If processing costs are disproportionately large in comparison to 7067 bandwidth consumption or effect on state, then this could allow a 7068 malicious peer to exhaust processing capacity. 7070 While there are legitimate uses for all messages, implementations 7071 SHOULD track cost of processing relative to progress and treat 7072 excessive quantities of any non-productive packets as indicative of 7073 an attack.
Endpoints MAY respond to this condition with a connection 7074 error or by dropping packets. 7076 21.9. Explicit Congestion Notification Attacks 7078 An on-path attacker could manipulate the value of ECN codepoints in 7079 the IP header to influence the sender's rate. [RFC3168] discusses 7080 manipulations and their effects in more detail. 7082 An off-path attacker can duplicate and send packets with modified 7083 ECN codepoints to affect the sender's rate. If duplicate packets are 7084 discarded by a receiver, an off-path attacker will need to race the 7085 duplicate packet against the original to be successful in this 7086 attack. Therefore, QUIC endpoints ignore the ECN codepoint field on 7087 an IP packet unless at least one QUIC packet in that IP packet is 7088 successfully processed; see Section 13.4. 7090 21.10. Stateless Reset Oracle 7092 Stateless resets create a possible denial of service attack analogous 7093 to a TCP reset injection. This attack is possible if an attacker is 7094 able to cause a stateless reset token to be generated for a 7095 connection with a selected connection ID. An attacker that can cause 7096 this token to be generated can reset an active connection with the 7097 same connection ID. 7099 If a packet can be routed to different instances that share a static 7100 key, for example, by changing an IP address or port, then an attacker 7101 can cause the server to send a stateless reset. To defend against 7102 this style of denial of service, endpoints that share a static key 7103 for stateless reset (see Section 10.3.2) MUST be arranged so that 7104 packets with a given connection ID always arrive at an instance that 7105 has connection state, unless that connection is no longer active. 7107 More generally, servers MUST NOT generate a stateless reset if a 7108 connection with the corresponding connection ID could be active on 7109 any endpoint using the same static key.
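The rule above can be illustrated with a minimal Python sketch. It assumes the stateless reset token is derived with HMAC-SHA256 over the connection ID under the shared static key and truncated to 128 bits; the function names and the possibly_active set (connection IDs that might still have state on any instance sharing the key) are hypothetical and are not defined by this document.

```python
import hashlib
import hmac

TOKEN_LEN = 16  # a stateless reset token is 128 bits long


def stateless_reset_token(static_key: bytes, connection_id: bytes) -> bytes:
    """Derive a token from the static key and a connection ID (assumed HMAC construction)."""
    return hmac.new(static_key, connection_id, hashlib.sha256).digest()[:TOKEN_LEN]


def may_send_stateless_reset(connection_id: bytes, possibly_active: set) -> bool:
    """Allow a reset only if no instance sharing the key could still hold state for this ID."""
    return connection_id not in possibly_active
```

Because every instance that holds the static key computes the same token for a given connection ID, a packet misrouted to an instance without state would reveal the token of a connection that is still live elsewhere. The guard function is where a deployment would encode its knowledge of which connection IDs could still be active.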
7111 In the case of a cluster that uses dynamic load balancing, it is 7112 possible that a change in load balancer configuration could occur 7113 while an active instance retains connection state. Even if an 7114 instance retains connection state, the change in routing and 7115 resulting stateless reset will result in the connection being 7116 terminated. If there is no chance of the packet being routed to the 7117 correct instance, it is better to send a stateless reset than wait 7118 for the connection to time out. However, this is acceptable only if 7119 the routing cannot be influenced by an attacker. 7121 21.11. Version Downgrade 7123 This document defines QUIC Version Negotiation packets in Section 6 7124 that can be used to negotiate the QUIC version used between two 7125 endpoints. However, this document does not specify how this 7126 negotiation will be performed between this version and subsequent 7127 future versions. In particular, Version Negotiation packets do not 7128 contain any mechanism to prevent version downgrade attacks. Future 7129 versions of QUIC that use Version Negotiation packets MUST define a 7130 mechanism that is robust against version downgrade attacks. 7132 21.12. Targeted Attacks by Routing 7134 Deployments should limit the ability of an attacker to target a new 7135 connection to a particular server instance. This means that client- 7136 controlled fields, such as the initial Destination Connection ID used 7137 on Initial and 0-RTT packets SHOULD NOT be used by themselves to make 7138 routing decisions. Ideally, routing decisions are made independently 7139 of client-selected values; a Source Connection ID can be selected to 7140 route later packets to the same server. 7142 21.13. Overview of Security Properties 7144 A complete security analysis of QUIC is outside the scope of this 7145 document. 
This section provides an informal description of the 7146 desired security properties as an aid to implementors and to help 7147 guide protocol analysis. 7149 QUIC assumes the threat model described in [SEC-CONS] and provides 7150 protections against many of the attacks that arise from that model. 7152 For this purpose, attacks are divided into passive and active 7153 attacks. Passive attackers have the capability to read packets from 7154 the network, while active attackers also have the capability to write 7155 packets into the network. However, a passive attack could involve an 7156 attacker with the ability to cause a routing change or other 7157 modification in the path taken by packets that comprise a connection. 7159 Attackers are additionally categorized as either on-path attackers or 7160 off-path attackers; see Section 3.5 of [SEC-CONS]. An on-path 7161 attacker can read, modify, or remove any packet it observes such that 7162 it no longer reaches its destination, while an off-path attacker 7163 observes the packets but cannot prevent the original packet from 7164 reaching its intended destination. Both types of attackers can also 7165 transmit arbitrary packets. 7167 Properties of the handshake, protected packets, and connection 7168 migration are considered separately. 7170 21.13.1. Handshake 7172 The QUIC handshake incorporates the TLS 1.3 handshake and inherits 7173 the cryptographic properties described in Appendix E.1 of [TLS13]. 7174 Many of the security properties of QUIC depend on the TLS handshake 7175 providing these properties. Any attack on the TLS handshake could 7176 affect QUIC. 7178 Any attack on the TLS handshake that compromises the secrecy or 7179 uniqueness of session keys affects other security guarantees provided 7180 by QUIC that depend on these keys.
For instance, migration 7181 (Section 9) depends on the efficacy of confidentiality protections, 7182 both for the negotiation of keys using the TLS handshake and for QUIC 7183 packet protection, to avoid linkability across network paths. 7185 An attack on the integrity of the TLS handshake might allow an 7186 attacker to affect the selection of application protocol or QUIC 7187 version. 7189 In addition to the properties provided by TLS, the QUIC handshake 7190 provides some defense against DoS attacks on the handshake. 7192 21.13.1.1. Anti-Amplification 7194 Address validation (Section 8) is used to verify that an entity that 7195 claims a given address is able to receive packets at that address. 7196 Address validation limits amplification attack targets to addresses 7197 for which an attacker can observe packets. 7199 Prior to validation, endpoints are limited in what they are able to 7200 send. During the handshake, a server cannot send more than three 7201 times the data it receives; clients that initiate new connections or 7202 migrate to a new network path are limited. 7204 21.13.1.2. Server-Side DoS 7206 Computing the server's first flight for a full handshake is 7207 potentially expensive, requiring both a signature and a key exchange 7208 computation. In order to prevent computational DoS attacks, the 7209 Retry packet provides a cheap token exchange mechanism that allows 7210 servers to validate a client's IP address prior to doing any 7211 expensive computations at the cost of a single round trip. After a 7212 successful handshake, servers can issue new tokens to a client, which 7213 will allow new connection establishment without incurring this cost. 7215 21.13.1.3. On-Path Handshake Termination 7217 An on-path or off-path attacker can force a handshake to fail by 7218 replacing or racing Initial packets. 
Once valid Initial packets have 7219 been exchanged, subsequent Handshake packets are protected with the 7220 handshake keys and an on-path attacker cannot force handshake failure 7221 other than by dropping packets to cause endpoints to abandon the 7222 attempt. 7224 An on-path attacker can also replace the addresses of packets on 7225 either side and therefore cause the client or server to have an 7226 incorrect view of the remote addresses. Such an attack is 7227 indistinguishable from the functions performed by a NAT. 7229 21.13.1.4. Parameter Negotiation 7231 The entire handshake is cryptographically protected, with the Initial 7232 packets being encrypted with per-version keys and the Handshake and 7233 later packets being encrypted with keys derived from the TLS key 7234 exchange. Further, parameter negotiation is folded into the TLS 7235 transcript and thus provides the same integrity guarantees as 7236 ordinary TLS negotiation. An attacker can observe the client's 7237 transport parameters (as long as it knows the version-specific salt) 7238 but cannot observe the server's transport parameters and cannot 7239 influence parameter negotiation. 7241 Connection IDs are unencrypted but integrity protected in all 7242 packets. 7244 This version of QUIC does not incorporate a version negotiation 7245 mechanism; implementations of incompatible versions will simply fail 7246 to establish a connection. 7248 21.13.2. Protected Packets 7250 Packet protection (Section 12.1) provides authentication and 7251 encryption of all packets except Version Negotiation packets, though 7252 Initial and Retry packets have limited encryption and authentication 7253 based on version-specific inputs; see [QUIC-TLS] for more details. 7254 This section considers passive and active attacks against protected 7255 packets. 
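The limited protection of Initial packets can be illustrated with a short Python sketch of the first derivation step. It assumes, following [QUIC-TLS], that the Initial secret is produced by HKDF-Extract (RFC 5869, here with SHA-256) from a version-specific salt and the client's Destination Connection ID. EXAMPLE_SALT is a placeholder and is not the salt of any real QUIC version; the function names are hypothetical, and the later expansion into packet protection keys is omitted.

```python
import hashlib
import hmac

# Placeholder only: NOT the salt of any real QUIC version.
EXAMPLE_SALT = bytes(20)


def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    """HKDF-Extract (RFC 5869): HMAC keyed with the salt over the input keying material."""
    return hmac.new(salt, ikm, hashlib.sha256).digest()


def initial_secret(version_salt: bytes, client_dcid: bytes) -> bytes:
    """Both inputs are public or visible on the wire, so any observer can repeat this step."""
    return hkdf_extract(version_salt, client_dcid)
```

Since neither input is secret, an observer that knows the salt and sees the client's first Initial packet computes the same value as the endpoints. This is why Initial packet protection guards against off-path injection but not against an attacker that can read the handshake.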
7257 Both on-path and off-path attackers can mount a passive attack in 7258 which they save observed packets for an offline attack against packet 7259 protection at a future time; this is true for any observer of any 7260 packet on any network. 7262 A blind attacker, one who injects packets without being able to 7263 observe valid packets for a connection, is unlikely to be successful, 7264 since packet protection ensures that valid packets are only generated 7265 by endpoints that possess the key material established during the 7266 handshake; see Section 7 and Section 21.13.1. Similarly, any active 7267 attacker that observes packets and attempts to insert new data or 7268 modify existing data in those packets should not be able to generate 7269 packets deemed valid by the receiving endpoint. 7271 A spoofing attack, in which an active attacker rewrites unprotected 7272 parts of a packet that it forwards or injects, such as the source or 7273 destination address, is only effective if the attacker can forward 7274 packets to the original endpoint. Packet protection ensures that the 7275 packet payloads can only be processed by the endpoints that completed 7276 the handshake, and invalid packets are ignored by those endpoints. 7278 An attacker can also modify the boundaries between packets and UDP 7279 datagrams, causing multiple packets to be coalesced into a single 7280 datagram, or splitting coalesced packets into multiple datagrams. 7281 Aside from datagrams containing Initial packets, which require 7282 padding, modification of how packets are arranged in datagrams has no 7283 functional effect on a connection, although it might change some 7284 performance characteristics. 7286 21.13.3. Connection Migration 7288 Connection Migration (Section 9) provides endpoints with the ability 7289 to transition between IP addresses and ports on multiple paths, using 7290 one path at a time for transmission and receipt of non-probing 7291 frames. 
Path validation (Section 8.2) establishes that a peer is 7292 both willing and able to receive packets sent on a particular path. 7293 This helps reduce the effects of address spoofing by limiting the 7294 number of packets sent to a spoofed address. 7296 This section describes the intended security properties of connection 7297 migration when under various types of DoS attacks. 7299 21.13.3.1. On-Path Active Attacks 7301 An attacker that can cause a packet it observes to no longer reach 7302 its intended destination is considered an on-path attacker. When an 7303 attacker is present between a client and server, endpoints are 7304 required to send packets through the attacker to establish 7305 connectivity on a given path. 7307 An on-path attacker can: 7309 * Inspect packets 7311 * Modify IP and UDP packet headers 7313 * Inject new packets 7315 * Delay packets 7317 * Reorder packets 7319 * Drop packets 7321 * Split and merge datagrams along packet boundaries 7323 An on-path attacker cannot: 7325 * Modify an authenticated portion of a packet and cause the 7326 recipient to accept that packet 7328 An on-path attacker has the opportunity to modify the packets that it 7329 observes; however, any modifications to an authenticated portion of a 7330 packet will cause it to be dropped by the receiving endpoint as 7331 invalid, as packet payloads are both authenticated and encrypted. 7333 In the presence of an on-path attacker, QUIC aims to provide the 7334 following properties: 7336 1. An on-path attacker can prevent use of a path for a connection, 7337 causing the connection to fail if it cannot use a different path that does 7338 not contain the attacker. This can be achieved by dropping all 7339 packets, modifying them so that they fail to decrypt, or other 7340 methods. 7342 2. An on-path attacker can prevent migration to a new path for which 7343 the attacker is also on-path by causing path validation to fail 7344 on the new path. 7346 3.
An on-path attacker cannot prevent a client from migrating to a 7347 path for which the attacker is not on-path. 7349 4. An on-path attacker can reduce the throughput of a connection by 7350 delaying packets or dropping them. 7352 5. An on-path attacker cannot cause an endpoint to accept a packet 7353 for which it has modified an authenticated portion of that 7354 packet. 7356 21.13.3.2. Off-Path Active Attacks 7358 An off-path attacker is not directly on the path between a client and 7359 server, but might be able to obtain copies of some or all packets 7360 sent between the client and the server. It is also able to send 7361 copies of those packets to either endpoint. 7363 An off-path attacker can: 7365 * Inspect packets 7367 * Inject new packets 7369 * Reorder injected packets 7371 An off-path attacker cannot: 7373 * Modify any part of a packet 7375 * Delay packets 7377 * Drop packets 7379 * Reorder original packets 7381 An off-path attacker can create modified copies of packets that it has 7382 observed and inject those copies into the network, potentially with 7383 spoofed source and destination addresses. 7385 For the purposes of this discussion, it is assumed that an off-path 7386 attacker has the ability to inject a modified copy of a packet 7387 into the network that will reach the destination endpoint prior to 7388 the arrival of the original packet observed by the attacker. In 7389 other words, an attacker has the ability to consistently "win" a race 7390 with the legitimate packets between the endpoints, potentially 7391 causing the original packet to be ignored by the recipient. 7393 It is also assumed that an attacker has the resources necessary to 7394 affect NAT state, potentially both causing an endpoint to lose its 7395 NAT binding, and an attacker to obtain the same port for use with its 7396 traffic. 7398 In the presence of an off-path attacker, QUIC aims to provide the 7399 following properties: 7401 1.
An off-path attacker can race packets and attempt to become a 7402 "limited" on-path attacker. 7404 2. An off-path attacker can cause path validation to succeed for 7405 forwarded packets with the source address listed as the off-path 7406 attacker as long as it can provide improved connectivity between 7407 the client and the server. 7409 3. An off-path attacker cannot cause a connection to close once the 7410 handshake has completed. 7412 4. An off-path attacker cannot cause migration to a new path to fail 7413 if it cannot observe the new path. 7415 5. An off-path attacker can become a limited on-path attacker during 7416 migration to a new path for which it is also an off-path 7417 attacker. 7419 6. An off-path attacker can become a limited on-path attacker by 7420 affecting shared NAT state such that it sends packets to the 7421 server from the same IP address and port that the client 7422 originally used. 7424 21.13.3.3. Limited On-Path Active Attacks 7426 A limited on-path attacker is an off-path attacker that has offered 7427 improved routing of packets by duplicating and forwarding original 7428 packets between the server and the client, causing those packets to 7429 arrive before the original copies such that the original packets are 7430 dropped by the destination endpoint. 7432 A limited on-path attacker differs from an on-path attacker in that 7433 it is not on the original path between endpoints, and therefore the 7434 original packets sent by an endpoint are still reaching their 7435 destination. This means that a future failure to route copied 7436 packets to the destination faster than their original path will not 7437 prevent the original packets from reaching the destination. 
7439 A limited on-path attacker can: 7441 * Inspect packets 7443 * Inject new packets 7445 * Modify unencrypted packet headers 7447 * Reorder packets 7449 A limited on-path attacker cannot: 7451 * Delay packets so that they arrive later than packets sent on the 7452 original path 7454 * Drop packets 7456 * Modify the authenticated and encrypted portion of a packet and 7457 cause the recipient to accept that packet 7459 A limited on-path attacker can only delay packets up to the point 7460 that the original packets arrive before the duplicate packets, 7461 meaning that it cannot offer routing with worse latency than the 7462 original path. If a limited on-path attacker drops packets, the 7463 original copy will still arrive at the destination endpoint. 7465 In the presence of a limited on-path attacker, QUIC aims to provide 7466 the following properties: 7468 1. A limited on-path attacker cannot cause a connection to close 7469 once the handshake has completed. 7471 2. A limited on-path attacker cannot cause an idle connection to 7472 close if the client is first to resume activity. 7474 3. A limited on-path attacker can cause an idle connection to be 7475 deemed lost if the server is the first to resume activity. 7477 Note that these guarantees are the same guarantees provided for any 7478 NAT, for the same reasons. 7480 22. IANA Considerations 7482 This document establishes several registries for the management of 7483 codepoints in QUIC. These registries operate on a common set of 7484 policies as defined in Section 22.1. 7486 22.1. Registration Policies for QUIC Registries 7488 All QUIC registries allow for both provisional and permanent 7489 registration of codepoints. This section documents policies that are 7490 common to these registries. 7492 22.1.1. Provisional Registrations 7494 Provisional registrations of codepoints are intended to allow for 7495 private use and experimentation with extensions to QUIC.
Provisional 7496 registrations only require the inclusion of the codepoint value and 7497 contact information. However, provisional registrations could be 7498 reclaimed and reassigned for another purpose. 7500 Provisional registrations require Expert Review, as defined in 7501 Section 4.5 of [RFC8126]. Designated expert(s) are advised that only 7502 registrations for an excessive proportion of remaining codepoint 7503 space or the very first unassigned value (see Section 22.1.2) can be 7504 rejected. 7506 Provisional registrations will include a date field that indicates 7507 when the registration was last updated. A request to update the date 7508 on any provisional registration can be made without review from the 7509 designated expert(s). 7511 All QUIC registries include the following fields to support 7512 provisional registration: 7514 Value: The assigned codepoint. 7516 Status: "Permanent" or "Provisional". 7518 Specification: A reference to a publicly available specification for 7519 the value. 7521 Date: The date of last update to the registration. 7523 Contact: Contact details for the registrant. 7525 Notes: Supplementary notes about the registration. 7527 Provisional registrations MAY omit the Specification and Notes 7528 fields, plus any additional fields that might be required for a 7529 permanent registration. The Date field is not required as part of 7530 requesting a registration as it is set to the date the registration 7531 is created or updated. 7533 22.1.2. Selecting Codepoints 7535 New uses of codepoints from QUIC registries SHOULD use a randomly 7536 selected codepoint that excludes both existing allocations and the 7537 first unallocated codepoint in the selected space. Requests for 7538 multiple codepoints MAY use a contiguous range. This minimizes the 7539 risk that differing semantics are attributed to the same codepoint by 7540 different implementations. 
Use of the first codepoint in a range is 7541 intended for use by specifications that are developed through the 7542 standards process [STD] and its allocation MUST be negotiated with 7543 IANA before use. 7545 For codepoints that are encoded in variable-length integers 7546 (Section 16), such as frame types, codepoints that encode to four or 7547 eight bytes (that is, values 2^14 and above) SHOULD be used unless 7548 the usage is especially sensitive to having a longer encoding. 7550 Applications to register codepoints in QUIC registries MAY include a 7551 codepoint as part of the registration. IANA MUST allocate the 7552 selected codepoint unless that codepoint is already assigned or the 7553 codepoint is the first unallocated codepoint in the registry. 7555 22.1.3. Reclaiming Provisional Codepoints 7557 A request might be made to remove an unused provisional registration 7558 from the registry to reclaim space in a registry, or portion of the 7559 registry (such as the 64-16383 range for codepoints that use 7560 variable-length encodings). This SHOULD be done only for the 7561 codepoints with the earliest recorded date and entries that have been 7562 updated less than a year prior SHOULD NOT be reclaimed. 7564 A request to remove a codepoint MUST be reviewed by the designated 7565 expert(s). The expert(s) MUST attempt to determine whether the 7566 codepoint is still in use. Experts are advised to contact the listed 7567 contacts for the registration, plus as wide a set of protocol 7568 implementers as possible in order to determine whether any use of the 7569 codepoint is known. The expert(s) are advised to allow at least four 7570 weeks for responses. 7572 If any use of the codepoints is identified by this search or a 7573 request to update the registration is made, the codepoint MUST NOT be 7574 reclaimed. Instead, the date on the registration is updated. A note 7575 might be added for the registration recording relevant information 7576 that was learned. 
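The reclamation guidance in this section can be sketched in code. The following Python is illustrative only: the "Registration" record, the function names, and the use of 365 days as "a year" are assumptions made for this example and are not part of any IANA tooling. It orders provisional entries by their recorded date, skips entries updated less than a year ago, and refreshes the date instead of reclaiming when use is reported or an update is requested.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Iterable, List

# Assumption: "a year" is approximated as 365 days for this sketch.
ONE_YEAR = timedelta(days=365)


@dataclass
class Registration:
    """Hypothetical subset of the registry fields in Section 22.1.1."""
    value: int
    status: str          # "Permanent" or "Provisional"
    last_updated: date   # the registry's Date field
    contact: str


def reclamation_candidates(entries: Iterable[Registration],
                           today: date) -> List[Registration]:
    """Provisional entries at least a year old, earliest date first."""
    eligible = [
        e for e in entries
        if e.status == "Provisional" and today - e.last_updated >= ONE_YEAR
    ]
    return sorted(eligible, key=lambda e: e.last_updated)


def review_outcome(entry: Registration, use_found: bool,
                   update_requested: bool, today: date) -> bool:
    """Return True if the entry may be removed after expert review.

    If any use is identified or an update is requested, the codepoint
    is not reclaimed and the date on the registration is refreshed.
    """
    if use_found or update_requested:
        entry.last_updated = today
        return False
    return True
```

Returning the candidates oldest first mirrors the advice that reclamation be attempted only for the codepoints with the earliest recorded date; the expert review itself (contacting registrants and implementers, waiting at least four weeks) remains a manual step that this sketch reduces to two boolean inputs.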
7578 If no use of the codepoint was identified and no request was made to 7579 update the registration, the codepoint MAY be removed from the 7580 registry. 7582 This process also applies to requests to change a provisional 7583 registration into a permanent registration, except that the goal is 7584 not to determine whether there is no use of the codepoint, but to 7585 determine that the registration is an accurate representation of any 7586 deployed usage. 7588 22.1.4. Permanent Registrations 7590 Permanent registrations in QUIC registries use the Specification 7591 Required policy ([RFC8126]), unless otherwise specified. The 7592 designated expert(s) verify that a specification exists and is 7593 readily accessible. Expert(s) are encouraged to be biased towards 7594 approving registrations unless they are abusive, frivolous, or 7595 actively harmful (not merely aesthetically displeasing, or 7596 architecturally dubious). The creation of a registry MAY specify 7597 additional constraints on permanent registrations. 7599 The creation of a registry MAY identify a range of codepoints where 7600 registrations are governed by a different registration policy. For 7601 instance, the registries for 62-bit codepoints in this document have 7602 stricter policies for codepoints in the range from 0 to 63. 7604 Any stricter requirements for permanent registrations do not prevent 7605 provisional registrations for affected codepoints. For instance, a 7606 provisional registration for a frame type (Section 22.3) of 61 could 7607 be requested. 7609 All registrations made by Standards Track publications MUST be 7610 permanent. 7612 All registrations in this document are assigned a permanent status 7613 and list as contact the IETF (quic@ietf.org). 7615 22.2. QUIC Transport Parameter Registry 7617 IANA [SHALL add/has added] a registry for "QUIC Transport Parameters" 7618 under a "QUIC" heading. 7620 The "QUIC Transport Parameters" registry governs a 62-bit space. 
7621 This registry follows the registration policy from Section 22.1. 7622 Permanent registrations in this registry are assigned using the 7623 Specification Required policy ([RFC8126]). 7625 In addition to the fields in Section 22.1.1, permanent registrations 7626 in this registry MUST include the following field: 7628 Parameter Name: A short mnemonic for the parameter. 7630 The initial contents of this registry are shown in Table 6. 7632 +=======+=====================================+===============+ 7633 | Value | Parameter Name | Specification | 7634 +=======+=====================================+===============+ 7635 | 0x00 | original_destination_connection_id | Section 18.2 | 7636 +-------+-------------------------------------+---------------+ 7637 | 0x01 | max_idle_timeout | Section 18.2 | 7638 +-------+-------------------------------------+---------------+ 7639 | 0x02 | stateless_reset_token | Section 18.2 | 7640 +-------+-------------------------------------+---------------+ 7641 | 0x03 | max_udp_payload_size | Section 18.2 | 7642 +-------+-------------------------------------+---------------+ 7643 | 0x04 | initial_max_data | Section 18.2 | 7644 +-------+-------------------------------------+---------------+ 7645 | 0x05 | initial_max_stream_data_bidi_local | Section 18.2 | 7646 +-------+-------------------------------------+---------------+ 7647 | 0x06 | initial_max_stream_data_bidi_remote | Section 18.2 | 7648 +-------+-------------------------------------+---------------+ 7649 | 0x07 | initial_max_stream_data_uni | Section 18.2 | 7650 +-------+-------------------------------------+---------------+ 7651 | 0x08 | initial_max_streams_bidi | Section 18.2 | 7652 +-------+-------------------------------------+---------------+ 7653 | 0x09 | initial_max_streams_uni | Section 18.2 | 7654 +-------+-------------------------------------+---------------+ 7655 | 0x0a | ack_delay_exponent | Section 18.2 | 7656 
+-------+-------------------------------------+---------------+ 7657 | 0x0b | max_ack_delay | Section 18.2 | 7658 +-------+-------------------------------------+---------------+ 7659 | 0x0c | disable_active_migration | Section 18.2 | 7660 +-------+-------------------------------------+---------------+ 7661 | 0x0d | preferred_address | Section 18.2 | 7662 +-------+-------------------------------------+---------------+ 7663 | 0x0e | active_connection_id_limit | Section 18.2 | 7664 +-------+-------------------------------------+---------------+ 7665 | 0x0f | initial_source_connection_id | Section 18.2 | 7666 +-------+-------------------------------------+---------------+ 7667 | 0x10 | retry_source_connection_id | Section 18.2 | 7668 +-------+-------------------------------------+---------------+ 7670 Table 6: Initial QUIC Transport Parameters Entries 7672 Additionally, each value of the format "31 * N + 27" for integer 7673 values of N (that is, 27, 58, 89, ...) is reserved and MUST NOT be 7674 assigned by IANA. 7676 22.3. QUIC Frame Types Registry 7678 IANA [SHALL add/has added] a registry for "QUIC Frame Types" under a 7679 "QUIC" heading. 7681 The "QUIC Frame Types" registry governs a 62-bit space. This 7682 registry follows the registration policy from Section 22.1. 7683 Permanent registrations in this registry are assigned using the 7684 Specification Required policy ([RFC8126]), except for values between 7685 0x00 and 0x3f (in hexadecimal; inclusive), which are assigned using 7686 Standards Action or IESG Approval as defined in Sections 4.9 and 4.10 7687 of [RFC8126]. 7689 In addition to the fields in Section 22.1.1, permanent registrations 7690 in this registry MUST include the following field: 7692 Frame Name: A short mnemonic for the frame type.
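As an illustration of the codepoint selection advice in Section 22.1.2 and the reserved transport parameter values in Section 22.2, the following Python sketch picks a random codepoint for a new registration. All names here are hypothetical, and the choice to draw from the four-byte variable-length integer range (2^14 through 2^30 - 1) is an assumption made for this example; the eight-byte range would satisfy the same advice. The sketch also assumes that the set of allocated values is sparse, so that rejection sampling terminates quickly.

```python
import secrets
from typing import Callable, Optional, Set

# Values in this range encode as four-byte variable-length integers.
FOUR_BYTE_MIN = 1 << 14
FOUR_BYTE_MAX = (1 << 30) - 1


def is_reserved_transport_parameter(value: int) -> bool:
    """True for values of the form 31 * N + 27 (27, 58, 89, ...)."""
    return value >= 27 and (value - 27) % 31 == 0


def first_unallocated(allocated: Set[int], low: int) -> int:
    """Smallest value at or above `low` that is not yet allocated."""
    value = low
    while value in allocated:
        value += 1
    return value


def pick_codepoint(allocated: Set[int],
                   low: int = FOUR_BYTE_MIN,
                   high: int = FOUR_BYTE_MAX,
                   skip: Optional[Callable[[int], bool]] = None) -> int:
    """Randomly pick a codepoint in [low, high].

    Existing allocations and the first unallocated codepoint in the
    range are excluded, along with anything `skip` rejects.
    """
    excluded_first = first_unallocated(allocated, low)
    while True:
        candidate = low + secrets.randbelow(high - low + 1)
        if candidate in allocated or candidate == excluded_first:
            continue
        if skip is not None and skip(candidate):
            continue
        return candidate
```

For a transport parameter, a caller might pass "skip=is_reserved_transport_parameter" so that reserved values are never proposed; for a frame type, the "skip" argument can be omitted.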
7694 In addition to the advice in Section 22.1, specifications for new 7695 permanent registrations SHOULD describe the means by which an 7696 endpoint might determine that it can send the identified type of 7697 frame. An accompanying transport parameter registration is expected 7698 for most registrations; see Section 22.2. Specifications for 7699 permanent registrations also need to describe the format and assigned 7700 semantics of any fields in the frame. 7702 The initial contents of this registry are tabulated in Table 3. Note 7703 that the registry does not include the "Pkts" and "Spec" columns from 7704 Table 3. 7706 22.4. QUIC Transport Error Codes Registry 7708 IANA [SHALL add/has added] a registry for "QUIC Transport Error 7709 Codes" under a "QUIC" heading. 7711 The "QUIC Transport Error Codes" registry governs a 62-bit space. 7712 This space is split into three regions that are governed by different 7713 policies. Permanent registrations in this registry are assigned 7714 using the Specification Required policy ([RFC8126]), except for 7715 values between 0x00 and 0x3f (in hexadecimal; inclusive), which are 7716 assigned using Standards Action or IESG Approval as defined in 7717 Sections 4.9 and 4.10 of [RFC8126]. 7719 In addition to the fields in Section 22.1.1, permanent registrations 7720 in this registry MUST include the following fields: 7722 Code: A short mnemonic for the error code. 7724 Description: A brief description of the error code semantics, which 7725 MAY be a summary if a specification reference is provided. 7727 The initial contents of this registry are shown in Table 7.
7729 +======+===========================+================+===============+ 7730 |Value | Error | Description | Specification | 7731 +======+===========================+================+===============+ 7732 | 0x0 | NO_ERROR | No error | Section 20 | 7733 +------+---------------------------+----------------+---------------+ 7734 | 0x1 | INTERNAL_ERROR | Implementation | Section 20 | 7735 | | | error | | 7736 +------+---------------------------+----------------+---------------+ 7737 | 0x2 | CONNECTION_REFUSED |Server refuses a| Section 20 | 7738 | | | connection | | 7739 +------+---------------------------+----------------+---------------+ 7740 | 0x3 | FLOW_CONTROL_ERROR | Flow control | Section 20 | 7741 | | | error | | 7742 +------+---------------------------+----------------+---------------+ 7743 | 0x4 | STREAM_LIMIT_ERROR |Too many streams| Section 20 | 7744 | | | opened | | 7745 +------+---------------------------+----------------+---------------+ 7746 | 0x5 | STREAM_STATE_ERROR | Frame received | Section 20 | 7747 | | | in invalid | | 7748 | | | stream state | | 7749 +------+---------------------------+----------------+---------------+ 7750 | 0x6 | FINAL_SIZE_ERROR |Change to final | Section 20 | 7751 | | | size | | 7752 +------+---------------------------+----------------+---------------+ 7753 | 0x7 | FRAME_ENCODING_ERROR | Frame encoding | Section 20 | 7754 | | | error | | 7755 +------+---------------------------+----------------+---------------+ 7756 | 0x8 | TRANSPORT_PARAMETER_ERROR | Error in | Section 20 | 7757 | | | transport | | 7758 | | | parameters | | 7759 +------+---------------------------+----------------+---------------+ 7760 | 0x9 | CONNECTION_ID_LIMIT_ERROR | Too many | Section 20 | 7761 | | | connection IDs | | 7762 | | | received | | 7763 +------+---------------------------+----------------+---------------+ 7764 | 0xa | PROTOCOL_VIOLATION |Generic protocol| Section 20 | 7765 | | | violation | | 7766 
+------+---------------------------+----------------+---------------+ 7767 | 0xb | INVALID_TOKEN | Invalid Token | Section 20 | 7768 | | | Received | | 7769 +------+---------------------------+----------------+---------------+ 7770 | 0xc | APPLICATION_ERROR | Application | Section 20 | 7771 | | | error | | 7772 +------+---------------------------+----------------+---------------+ 7773 | 0xd | CRYPTO_BUFFER_EXCEEDED | CRYPTO data | Section 20 | 7774 | | | buffer | | 7775 | | | overflowed | | 7776 +------+---------------------------+----------------+---------------+ 7777 Table 7: Initial QUIC Transport Error Codes Entries 7779 23. References 7781 23.1. Normative References 7783 [DPLPMTUD] Fairhurst, G., Jones, T., Tuexen, M., Ruengeler, I., and 7784 T. Voelker, "Packetization Layer Path MTU Discovery for 7785 Datagram Transports", Work in Progress, Internet-Draft, 7786 draft-ietf-tsvwg-datagram-plpmtud-22, June 10, 2020, 7787 . 7790 [IPv4] Postel, J., "Internet Protocol", STD 5, RFC 791, 7791 DOI 10.17487/RFC0791, September 1981, 7792 . 7794 [QUIC-RECOVERY] 7795 Iyengar, J., Ed. and I. Swett, Ed., "QUIC Loss Detection 7796 and Congestion Control", Work in Progress, Internet-Draft, 7797 draft-ietf-quic-recovery-30, September 10, 2020, 7798 . 7800 [QUIC-TLS] Thomson, M., Ed. and S. Turner, Ed., "Using Transport 7801 Layer Security (TLS) to Secure QUIC", Work in Progress, 7802 Internet-Draft, draft-ietf-quic-tls-30, September 10, 7803 2020, 7804 . 7806 [RFC1191] Mogul, J.C. and S.E. Deering, "Path MTU discovery", 7807 RFC 1191, DOI 10.17487/RFC1191, November 1990, 7808 . 7810 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 7811 Requirement Levels", BCP 14, RFC 2119, 7812 DOI 10.17487/RFC2119, March 1997, 7813 . 7815 [RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition 7816 of Explicit Congestion Notification (ECN) to IP", 7817 RFC 3168, DOI 10.17487/RFC3168, September 2001, 7818 . 
7820 [RFC3629] Yergeau, F., "UTF-8, a transformation format of ISO 7821 10646", STD 63, RFC 3629, DOI 10.17487/RFC3629, November 7822 2003, . 7824 [RFC4086] Eastlake 3rd, D., Schiller, J., and S. Crocker, 7825 "Randomness Requirements for Security", BCP 106, RFC 4086, 7826 DOI 10.17487/RFC4086, June 2005, 7827 . 7829 [RFC6437] Amante, S., Carpenter, B., Jiang, S., and J. Rajahalme, 7830 "IPv6 Flow Label Specification", RFC 6437, 7831 DOI 10.17487/RFC6437, November 2011, 7832 . 7834 [RFC8085] Eggert, L., Fairhurst, G., and G. Shepherd, "UDP Usage 7835 Guidelines", BCP 145, RFC 8085, DOI 10.17487/RFC8085, 7836 March 2017, . 7838 [RFC8126] Cotton, M., Leiba, B., and T. Narten, "Guidelines for 7839 Writing an IANA Considerations Section in RFCs", BCP 26, 7840 RFC 8126, DOI 10.17487/RFC8126, June 2017, 7841 . 7843 [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 7844 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, 7845 May 2017, . 7847 [RFC8201] McCann, J., Deering, S., Mogul, J., and R. Hinden, Ed., 7848 "Path MTU Discovery for IP version 6", STD 87, RFC 8201, 7849 DOI 10.17487/RFC8201, July 2017, 7850 . 7852 [RFC8311] Black, D., "Relaxing Restrictions on Explicit Congestion 7853 Notification (ECN) Experimentation", RFC 8311, 7854 DOI 10.17487/RFC8311, January 2018, 7855 . 7857 [TLS13] Rescorla, E., "The Transport Layer Security (TLS) Protocol 7858 Version 1.3", RFC 8446, DOI 10.17487/RFC8446, August 2018, 7859 . 7861 [UDP] Postel, J., "User Datagram Protocol", STD 6, RFC 768, 7862 DOI 10.17487/RFC0768, August 1980, 7863 . 7865 23.2. Informative References 7867 [AEAD] McGrew, D., "An Interface and Algorithms for Authenticated 7868 Encryption", RFC 5116, DOI 10.17487/RFC5116, January 2008, 7869 . 7871 [ALPN] Friedl, S., Popov, A., Langley, A., and E. Stephan, 7872 "Transport Layer Security (TLS) Application-Layer Protocol 7873 Negotiation Extension", RFC 7301, DOI 10.17487/RFC7301, 7874 July 2014, . 
7876 [ALTSVC] Nottingham, M., McManus, P., and J. Reschke, "HTTP 7877 Alternative Services", RFC 7838, DOI 10.17487/RFC7838, 7878 April 2016, . 7880 [BCP38] Ferguson, P. and D. Senie, "Network Ingress Filtering: 7881 Defeating Denial of Service Attacks which employ IP Source 7882 Address Spoofing", RFC 2267, DOI 10.17487/RFC2267, January 7883 1998, . 7885 [COOKIE] Barth, A., "HTTP State Management Mechanism", RFC 6265, 7886 DOI 10.17487/RFC6265, April 2011, 7887 . 7889 [CSRF] Barth, A., Jackson, C., and J. Mitchell, "Robust defenses 7890 for cross-site request forgery", 7891 DOI 10.1145/1455770.1455782, Proceedings of the 15th ACM 7892 conference on Computer and communications security - 7893 CCS '08, 2008, . 7895 [EARLY-DESIGN] 7896 Roskind, J., "QUIC: Multiplexed Transport Over UDP", 7897 December 2, 2013, . 7899 [GATEWAY] Hätönen, S., Nyrhinen, A., Eggert, L., Strowes, S., 7900 Sarolahti, P., and M. Kojo, "An experimental study of home 7901 gateway characteristics", DOI 10.1145/1879141.1879174, 7902 Proceedings of the 10th annual conference on Internet 7903 measurement - IMC '10, 2010, 7904 . 7906 [HTTP2] Belshe, M., Peon, R., and M. Thomson, Ed., "Hypertext 7907 Transfer Protocol Version 2 (HTTP/2)", RFC 7540, 7908 DOI 10.17487/RFC7540, May 2015, 7909 . 7911 [IPv6] Deering, S. and R. Hinden, "Internet Protocol, Version 6 7912 (IPv6) Specification", STD 86, RFC 8200, 7913 DOI 10.17487/RFC8200, July 2017, 7914 . 7916 [QUIC-INVARIANTS] 7917 Thomson, M., "Version-Independent Properties of QUIC", 7918 Work in Progress, Internet-Draft, draft-ietf-quic- 7919 invariants-10, September 10, 2020, 7920 . 7923 [QUIC-MANAGEABILITY] 7924 Kuehlewind, M. and B. Trammell, "Manageability of the QUIC 7925 Transport Protocol", Work in Progress, Internet-Draft, 7926 draft-ietf-quic-manageability-07, July 8, 2020, 7927 . 7930 [RFC1812] Baker, F., Ed., "Requirements for IP Version 4 Routers", 7931 RFC 1812, DOI 10.17487/RFC1812, June 1995, 7932 . 
7934 [RFC1918] Rekhter, Y., Moskowitz, B., Karrenberg, D., de Groot, G. 7935 J., and E. Lear, "Address Allocation for Private 7936 Internets", BCP 5, RFC 1918, DOI 10.17487/RFC1918, 7937 February 1996, . 7939 [RFC2018] Mathis, M., Mahdavi, J., Floyd, S., and A. Romanow, "TCP 7940 Selective Acknowledgment Options", RFC 2018, 7941 DOI 10.17487/RFC2018, October 1996, 7942 . 7944 [RFC2104] Krawczyk, H., Bellare, M., and R. Canetti, "HMAC: Keyed- 7945 Hashing for Message Authentication", RFC 2104, 7946 DOI 10.17487/RFC2104, February 1997, 7947 . 7949 [RFC3449] Balakrishnan, H., Padmanabhan, V., Fairhurst, G., and M. 7950 Sooriyabandara, "TCP Performance Implications of Network 7951 Path Asymmetry", BCP 69, RFC 3449, DOI 10.17487/RFC3449, 7952 December 2002, . 7954 [RFC4193] Hinden, R. and B. Haberman, "Unique Local IPv6 Unicast 7955 Addresses", RFC 4193, DOI 10.17487/RFC4193, October 2005, 7956 . 7958 [RFC4291] Hinden, R. and S. Deering, "IP Version 6 Addressing 7959 Architecture", RFC 4291, DOI 10.17487/RFC4291, February 7960 2006, . 7962 [RFC4443] Conta, A., Deering, S., and M. Gupta, Ed., "Internet 7963 Control Message Protocol (ICMPv6) for the Internet 7964 Protocol Version 6 (IPv6) Specification", STD 89, 7965 RFC 4443, DOI 10.17487/RFC4443, March 2006, 7966 . 7968 [RFC4787] Audet, F., Ed. and C. Jennings, "Network Address 7969 Translation (NAT) Behavioral Requirements for Unicast 7970 UDP", BCP 127, RFC 4787, DOI 10.17487/RFC4787, January 7971 2007, . 7973 [RFC5681] Allman, M., Paxson, V., and E. Blanton, "TCP Congestion 7974 Control", RFC 5681, DOI 10.17487/RFC5681, September 2009, 7975 . 7977 [RFC5869] Krawczyk, H. and P. Eronen, "HMAC-based Extract-and-Expand 7978 Key Derivation Function (HKDF)", RFC 5869, 7979 DOI 10.17487/RFC5869, May 2010, 7980 . 7982 [RFC7301] Friedl, S., Popov, A., Langley, A., and E. Stephan, 7983 "Transport Layer Security (TLS) Application-Layer Protocol 7984 Negotiation Extension", RFC 7301, DOI 10.17487/RFC7301, 7985 July 2014, . 
7987 [SEC-CONS] Rescorla, E. and B. Korver, "Guidelines for Writing RFC 7988 Text on Security Considerations", BCP 72, RFC 3552, 7989 DOI 10.17487/RFC3552, July 2003, 7990 . 7992 [SLOWLORIS] 7993 RSnake Hansen, R., "Welcome to Slowloris...", June 2009, 7994 . 7997 [STD] Bradner, S., "The Internet Standards Process -- Revision 7998 3", BCP 9, RFC 2026, DOI 10.17487/RFC2026, October 1996, 7999 . 8001 Appendix A. Sample Packet Number Decoding Algorithm 8003 The pseudo-code in Figure 45 shows how an implementation can decode 8004 packet numbers after header protection has been removed. 8006 DecodePacketNumber(largest_pn, truncated_pn, pn_nbits): 8007 expected_pn = largest_pn + 1 8008 pn_win = 1 << pn_nbits 8009 pn_hwin = pn_win / 2 8010 pn_mask = pn_win - 1 8011 // The incoming packet number should be greater than 8012 // expected_pn - pn_hwin and less than or equal to 8013 // expected_pn + pn_hwin 8014 // 8015 // This means we cannot just strip the trailing bits from 8016 // expected_pn and add the truncated_pn because that might 8017 // yield a value outside the window. 8018 // 8019 // The following code calculates a candidate value and 8020 // makes sure it's within the packet number window. 8021 // Note the extra checks to prevent overflow and underflow. 8022 candidate_pn = (expected_pn & ~pn_mask) | truncated_pn 8023 if candidate_pn <= expected_pn - pn_hwin and 8024 candidate_pn < (1 << 62) - pn_win: 8025 return candidate_pn + pn_win 8026 if candidate_pn > expected_pn + pn_hwin and 8027 candidate_pn >= pn_win: 8028 return candidate_pn - pn_win 8029 return candidate_pn 8031 Figure 45: Sample Packet Number Decoding Algorithm 8033 Appendix B. Sample ECN Validation Algorithm 8035 Each time an endpoint commences sending on a new network path, it 8036 determines whether the path supports ECN; see Section 13.4. If the 8037 path supports ECN, the goal is to use ECN. Endpoints might also 8038 periodically reassess a path that was determined to not support ECN. 
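The per-path ECN states that the remainder of this appendix describes can be sketched as a small state machine. The Python below is a minimal illustration under stated assumptions: the class and method names are hypothetical, the testing period is counted in packets only (the round-trip-time bound is omitted), and validation of ECN counts against a remembered baseline is reduced to a boolean supplied by the caller.

```python
from enum import Enum


class EcnState(Enum):
    TESTING = "testing"
    UNKNOWN = "unknown"
    FAILED = "failed"
    CAPABLE = "capable"


class EcnPath:
    """Hypothetical per-path ECN state tracker."""

    def __init__(self, testing_packets: int = 10) -> None:
        # Starting to test a path sets the state to "testing".
        self.state = EcnState.TESTING
        self.testing_packets = testing_packets
        self.marked_sent = 0

    def on_packet_sent(self) -> str:
        """Return the marking for the packet being sent and advance state."""
        if self.state in (EcnState.TESTING, EcnState.CAPABLE):
            marking = "ECT(0)"
        else:
            marking = "Not-ECT"
        if self.state is EcnState.TESTING:
            self.marked_sent += 1
            if self.marked_sent >= self.testing_packets:
                # The testing period has ended.
                self.state = EcnState.UNKNOWN
        return marking

    def on_ack_validated(self, counts_valid: bool, marked_acked: bool) -> None:
        """Apply the result of validating ECN counts from an ACK frame."""
        if self.state is EcnState.FAILED:
            return  # a failed path stays failed in this sketch
        if not counts_valid:
            self.state = EcnState.FAILED
        elif self.state is EcnState.UNKNOWN and marked_acked:
            self.state = EcnState.CAPABLE

    def on_all_marked_lost_or_ce(self) -> None:
        """All marked packets were declared lost or were all CE marked."""
        self.state = EcnState.FAILED
```

An endpoint that periodically reassesses a failed path could do so by creating a fresh "EcnPath" for it; that policy is left out of the sketch.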
8040 This section describes one method for testing new paths. This 8041 algorithm is intended to show how a path might be tested for ECN 8042 support. Endpoints can implement different methods. 8044 The path is assigned an ECN state that is one of "testing", 8045 "unknown", "failed", or "capable". On paths with a "testing" or 8046 "capable" state the endpoint sends packets with an ECT marking, by 8047 default ECT(0); otherwise, the endpoint sends unmarked packets. 8049 To start testing a path, the ECN state is set to "testing" and 8050 existing ECN counts are remembered as a baseline. 8052 The testing period runs for a number of packets or round-trip times, 8053 as determined by the endpoint. The goal is not to limit the duration 8054 of the testing period, but to ensure that enough marked packets are 8055 sent for received ECN counts to provide a clear indication of how the 8056 path treats marked packets. Section 13.4.2.2 suggests limiting this 8057 to 10 packets or 3 round-trip times. 8059 After the testing period ends, the ECN state for the path becomes 8060 "unknown". From the "unknown" state, successful validation of the 8061 ECN counts in an ACK frame (see Section 13.4.2.2) causes the ECN state 8062 for the path to become "capable", unless no marked packet has been 8063 acknowledged. 8065 If validation of ECN counts fails at any time, the ECN state for the 8066 affected path becomes "failed". An endpoint can also mark the ECN 8067 state for a path as "failed" if marked packets are all declared lost 8068 or if they are all CE marked. 8070 Following this algorithm ensures that ECN is rarely disabled for 8071 paths that properly support ECN. Any path that incorrectly modifies 8072 markings will cause ECN to be disabled. For those rare cases where 8073 marked packets are discarded by the path, the short duration of the 8074 testing period limits the number of losses incurred. 8076 Appendix C.
Change Log 8078 *RFC Editor's Note:* Please remove this section prior to 8079 publication of a final version of this document. 8081 Issue and pull request numbers are listed with a leading octothorp. 8083 C.1. Since draft-ietf-quic-transport-29 8085 * Require the same connection ID on coalesced packets (#3800, #3930) 8087 * Allow caching of packets that can't be decrypted, by allowing the 8088 reported acknowledgment delay to exceed max_ack_delay prior to 8089 confirming the handshake (#3821, #3980, #4035, #3874) 8091 * Allow connection ID to be used for address validation (#3834, 8092 #3924) 8094 * Required protocol operations are no longer directed at 8095 implementations, but are features provided to application 8096 protocols (#3838, #3935) 8098 * Narrow requirements for reset of congestion state on path change 8099 (#3842, #3945) 8101 * Add a three times amplification limit for sending of 8102 CONNECTION_CLOSE with reduced state (#3845, #3864) 8104 * Change error code for invalid RETIRE_CONNECTION_ID frames (#3860, 8105 #3861) 8107 * Recommend retention of state for lost packets to allow for late 8108 arrival and avoid unnecessary retransmission (#3956, #3957) 8110 * Allow a server to reject connections if a client reuses packet 8111 numbers after Retry (#3989, #3990) 8113 * Limit recommendation for immediate acknowledgment to when ack- 8114 eliciting packets are reordered (#4001, #4000) 8116 C.2. Since draft-ietf-quic-transport-28 8118 * Made SERVER_BUSY error (0x2) more generic, now CONNECTION_REFUSED 8119 (#3709, #3690, #3694) 8121 * Allow TRANSPORT_PARAMETER_ERROR when validating connection IDs 8122 (#3703, #3691) 8124 * Integrate QUIC-specific language from draft-ietf-tsvwg-datagram- 8125 plpmtud (#3695, #3702) 8127 * disable_active_migration does not apply to the addresses offered 8128 in server_preferred_address (#3608, #3670) 8130 C.3. 
Since draft-ietf-quic-transport-27 8132 * Allowed CONNECTION_CLOSE in any packet number space, with a 8133 requirement to use a new transport-level error for application- 8134 specific errors in Initial and Handshake packets (#3430, #3435, 8135 #3440) 8137 * Clearer requirements for address validation (#2125, #3327) 8139 * Security analysis of handshake and migration (#2143, #2387, #2925) 8141 * The entire payload of a datagram is used when counting bytes for 8142 mitigating amplification attacks (#3333, #3470) 8144 * Connection IDs can be used at any time, including in the handshake 8145 (#3348, #3560, #3438, #3565) 8147 * Only one ACK should be sent for each instance of reordering 8148 (#3357, #3361) 8150 * Remove text allowing a server to proceed with a bad Retry token 8151 (#3396, #3398) 8153 * Ignore active_connection_id_limit with a zero-length connection ID 8154 (#3427, #3426) 8156 * Require active_connection_id_limit be remembered for 0-RTT (#3423, 8157 #3425) 8159 * Require ack_delay not be remembered for 0-RTT (#3433, #3545) 8161 * Redefined max_packet_size to max_udp_datagram_size (#3471, #3473) 8163 * Guidance on limiting outstanding attempts to retire connection IDs 8164 (#3489, #3509, #3557, #3547) 8166 * Restored text on dropping bogus Version Negotiation packets 8167 (#3532, #3533) 8169 * Clarified that largest acknowledged needs to be saved, but not 8170 necessarily signaled in all cases (#3541, #3581) 8172 * Addressed linkability risk with the use of preferred_address 8173 (#3559, #3563) 8175 * Added authentication of handshake connection IDs (#3439, #3499) 8177 * Opening a stream in the wrong direction is an error (#3527) 8179 C.4. Since draft-ietf-quic-transport-26 8181 * Change format of transport parameters to use varints (#3294, 8182 #3169) 8184 C.5. 
Since draft-ietf-quic-transport-25 8186 * Define the use of CONNECTION_CLOSE prior to establishing 8187 connection state (#3269, #3297, #3292) 8189 * Allow use of address validation tokens after client address 8190 changes (#3307, #3308) 8192 * Define the timer for address validation (#2910, #3339) 8194 C.6. Since draft-ietf-quic-transport-24 8196 * Added HANDSHAKE_DONE to signal handshake confirmation (#2863, 8197 #3142, #3145) 8199 * Add integrity check to Retry packets (#3014, #3274, #3120) 8201 * Specify handling of reordered NEW_CONNECTION_ID frames (#3194, 8202 #3202) 8204 * Require checking of sequence numbers in RETIRE_CONNECTION_ID 8205 (#3037, #3036) 8207 * active_connection_id_limit is enforced (#3193, #3197, #3200, 8208 #3201) 8210 * Correct overflow in packet number decode algorithm (#3187, #3188) 8212 * Allow use of CRYPTO_BUFFER_EXCEEDED for CRYPTO frame errors 8213 (#3258, #3186) 8215 * Define applicability and scope of NEW_TOKEN (#3150, #3152, #3155, 8216 #3156) 8218 * Tokens from Retry and NEW_TOKEN must be differentiated (#3127, 8219 #3128) 8221 * Allow CONNECTION_CLOSE in response to invalid token (#3168, #3107) 8223 * Treat an invalid CONNECTION_CLOSE as an invalid frame (#2475, 8224 #3230, #3231) 8226 * Throttle when sending CONNECTION_CLOSE after discarding state 8227 (#3095, #3157) 8229 * Application-variant of CONNECTION_CLOSE can only be sent in 0-RTT 8230 or 1-RTT packets (#3158, #3164) 8232 * Advise sending while blocked to avoid idle timeout (#2744, #3266) 8234 * Define error codes for invalid frames (#3027, #3042) 8236 * Idle timeout is symmetric (#2602, #3099) 8238 * Prohibit IP fragmentation (#3243, #3280) 8240 * Define the use of provisional registration for all registries 8241 (#3109, #3020, #3102, #3170) 8243 * Packets on one path must not adjust values for a different path 8244 (#2909, #3139) 8246 C.7. 
Since draft-ietf-quic-transport-23 8248 * Allow ClientHello to span multiple packets (#2928, #3045) 8250 * Client Initial size constraints apply to UDP datagram payload 8251 (#3053, #3051) 8253 * Stateless reset changes (#2152, #2993) 8255 - tokens need to be compared in constant time 8257 - detection uses UDP datagrams, not packets 8259 - tokens cannot be reused (#2785, #2968) 8261 * Clearer rules for sharing of UDP ports and use of connection IDs 8262 when doing so (#2844, #2851) 8264 * A new connection ID is necessary when responding to migration 8265 (#2778, #2969) 8267 * Stronger requirements for connection ID retirement (#3046, #3096) 8269 * NEW_TOKEN cannot be empty (#2978, #2977) 8271 * PING can be sent at any encryption level (#3034, #3035) 8273 * CONNECTION_CLOSE is not ack-eliciting (#3097, #3098) 8275 * Frame encoding error conditions updated (#3027, #3042) 8277 * Non-ack-eliciting packets cannot be sent in response to non-ack- 8278 eliciting packets (#3100, #3104) 8280 * Servers have to change connection IDs in Retry (#2837, #3147) 8282 C.8. 
Since draft-ietf-quic-transport-22 8284 * Rules for preventing correlation by connection ID tightened 8285 (#2084, #2929) 8287 * Clarified use of CONNECTION_CLOSE in Handshake packets (#2151, 8288 #2541, #2688) 8290 * Discourage regressions of largest acknowledged in ACK (#2205, 8291 #2752) 8293 * Improved robustness of validation process for ECN counts (#2534, 8294 #2752) 8296 * Require endpoints to ignore spurious migration attempts (#2342, 8297 #2893) 8299 * Transport parameter for disabling migration clarified to allow NAT 8300 rebinding (#2389, #2893) 8302 * Document principles for defining new error codes (#2388, #2880) 8304 * Reserve transport parameters for greasing (#2550, #2873) 8306 * A maximum ACK delay of 0 is used for handshake packet number 8307 spaces (#2646, #2638) 8309 * Improved rules for use of congestion control state on new paths 8310 (#2685, #2918) 8312 * Removed recommendation to coordinate spin for multiple connections 8313 that share a path (#2763, #2882) 8315 * Allow smaller stateless resets and recommend a smaller minimum on 8316 packets that might trigger a stateless reset (#2770, #2869, #2927, 8317 #3007). 8319 * Provide guidance around the interface to QUIC as used by 8320 application protocols (#2805, #2857) 8322 * Frames other than STREAM can cause STREAM_LIMIT_ERROR (#2825, 8323 #2826) 8325 * Tighter rules about processing of rejected 0-RTT packets (#2829, 8326 #2840, #2841) 8328 * Explanation of the effect of Retry on 0-RTT packets (#2842, #2852) 8330 * Cryptographic handshake needs to provide server transport 8331 parameter encryption (#2920, #2921) 8333 * Moved ACK generation guidance from recovery draft to transport 8334 draft (#1860, #2916). 8336 C.9. Since draft-ietf-quic-transport-21 8338 * Connection ID lengths are now one octet, but limited in version 1 8339 to 20 octets of length (#2736, #2749) 8341 C.10. 
Since draft-ietf-quic-transport-20 8343 * Error codes are encoded as variable-length integers (#2672, #2680) 8345 * NEW_CONNECTION_ID includes a request to retire old connection IDs 8346 (#2645, #2769) 8348 * Tighter rules for generating and explicitly eliciting ACK frames 8349 (#2546, #2794) 8351 * Recommend having only one packet per encryption level in a 8352 datagram (#2308, #2747) 8354 * More normative language about use of stateless reset (#2471, 8355 #2574) 8357 * Allow reuse of stateless reset tokens (#2732, #2733) 8359 * Allow, but not require, enforcing non-duplicate transport 8360 parameters (#2689, #2691) 8362 * Added an active_connection_id_limit transport parameter (#1994, 8363 #1998) 8365 * max_ack_delay transport parameter defaults to 0 (#2638, #2646) 8367 * When sending 0-RTT, only remembered transport parameters apply 8368 (#2458, #2360, #2466, #2461) 8370 * Define handshake completion and confirmation; define clearer rules 8371 for when encryption keys should be discarded (#2214, #2267, #2673) 8373 * Prohibit path migration prior to handshake confirmation (#2309, 8374 #2370) 8376 * PATH_RESPONSE no longer needs to be received on the validated path 8377 (#2582, #2580, #2579, #2637) 8379 * PATH_RESPONSE frames are not stored and retransmitted (#2724, 8380 #2729) 8382 * Document hack for enabling routing of ICMP when doing PMTU probing 8383 (#1243, #2402) 8385 C.11.
Since draft-ietf-quic-transport-19 8387 * Refine discussion of 0-RTT transport parameters (#2467, #2464) 8388 * Fewer transport parameters need to be remembered for 0-RTT (#2624, 8389 #2467) 8391 * Spin bit text incorporated (#2564) 8393 * Close the connection when maximum stream ID in MAX_STREAMS exceeds 8394 2^62 - 1 (#2499, #2487) 8396 * New connection ID required for intentional migration (#2414, 8397 #2413) 8399 * Connection ID issuance can be rate-limited (#2436, #2428) 8401 * The "QUIC bit" is ignored in Version Negotiation (#2400, #2561) 8403 * Initial packets from clients need to be padded to 1200 unless a 8404 Handshake packet is sent as well (#2522, #2523) 8406 * CRYPTO frames can be discarded if too much data is buffered 8407 (#1834, #2524) 8409 * Stateless reset uses a short header packet (#2599, #2600) 8411 C.12. Since draft-ietf-quic-transport-18 8413 * Removed version negotiation; version negotiation, including 8414 authentication of the result, will be addressed in the next 8415 version of QUIC (#1773, #2313) 8417 * Added discussion of the use of IPv6 flow labels (#2348, #2399) 8419 * A connection ID can't be retired in a packet that uses that 8420 connection ID (#2101, #2420) 8422 * Idle timeout transport parameter is in milliseconds (from seconds) 8423 (#2453, #2454) 8425 * Endpoints are required to use new connection IDs when they use new 8426 network paths (#2413, #2414) 8428 * Increased the set of permissible frames in 0-RTT (#2344, #2355) 8430 C.13. 
Since draft-ietf-quic-transport-17 8432 * Stream-related errors now use STREAM_STATE_ERROR (#2305) 8434 * Endpoints discard initial keys as soon as handshake keys are 8435 available (#1951, #2045) 8437 * Expanded conditions for ignoring ICMP packet too big messages 8438 (#2108, #2161) 8440 * Remove rate control from PATH_CHALLENGE/PATH_RESPONSE (#2129, 8441 #2241) 8443 * Endpoints are permitted to discard malformed initial packets 8444 (#2141) 8446 * Clarified ECN implementation and usage requirements (#2156, #2201) 8448 * Disable ECN count verification for packets that arrive out of 8449 order (#2198, #2215) 8451 * Use Probe Timeout (PTO) instead of RTO (#2206, #2238) 8453 * Loosen constraints on retransmission of ACK ranges (#2199, #2245) 8455 * Limit Retry and Version Negotiation to once per datagram (#2259, 8456 #2303) 8458 * Set a maximum value for max_ack_delay transport parameter (#2282, 8459 #2301) 8461 * Allow server preferred address for both IPv4 and IPv6 (#2122, 8462 #2296) 8464 * Corrected requirements for migration to a preferred address 8465 (#2146, #2349) 8467 * ACK of non-existent packet is illegal (#2298, #2302) 8469 C.14. 
Since draft-ietf-quic-transport-16 8471 * Stream limits are defined as counts, not maximums (#1850, #1906) 8473 * Require amplification attack defense after closing (#1905, #1911) 8475 * Remove reservation of application error code 0 for STOPPING 8476 (#1804, #1922) 8478 * Renumbered frames (#1945) 8480 * Renumbered transport parameters (#1946) 8482 * Numeric transport parameters are expressed as varints (#1608, 8483 #1947, #1955) 8485 * Reorder the NEW_CONNECTION_ID frame (#1952, #1963) 8487 * Rework the first byte (#2006) 8489 - Fix the 0x40 bit 8491 - Change type values for long header 8493 - Add spin bit to short header (#631, #1988) 8495 - Encrypt the remainder of the first byte (#1322) 8497 - Move packet number length to first byte 8499 - Move ODCIL to first byte of retry packets 8501 - Simplify packet number protection (#1575) 8503 * Allow STOP_SENDING to open a remote bidirectional stream (#1797, 8504 #2013) 8506 * Added mitigation for off-path migration attacks (#1278, #1749, 8507 #2033) 8509 * Don't let the PMTU drop below 1280 (#2063, #2069) 8511 * Require peers to replace retired connection IDs (#2085) 8513 * Servers are required to ignore Version Negotiation packets (#2088) 8515 * Tokens are repeated in all Initial packets (#2089) 8517 * Clarified how PING frames are sent after loss (#2094) 8519 * Initial keys are discarded once Handshake keys are available (#1951, 8520 #2045) 8522 * ICMP PTB validation clarifications (#2161, #2109, #2108) 8524 C.15. Since draft-ietf-quic-transport-15 8526 Substantial editorial reorganization; no technical changes. 8528 C.16.
Since draft-ietf-quic-transport-14 8530 * Merge ACK and ACK_ECN (#1778, #1801) 8532 * Explicitly communicate max_ack_delay (#981, #1781) 8533 * Validate original connection ID after Retry packets (#1710, #1486, 8534 #1793) 8536 * Idle timeout is optional and has no specified maximum (#1765) 8538 * Update connection ID handling; add RETIRE_CONNECTION_ID type 8539 (#1464, #1468, #1483, #1484, #1486, #1495, #1729, #1742, #1799, 8540 #1821) 8542 * Include a Token in all Initial packets (#1649, #1794) 8544 * Prevent handshake deadlock (#1764, #1824) 8546 C.17. Since draft-ietf-quic-transport-13 8548 * Streams open when higher-numbered streams of the same type open 8549 (#1342, #1549) 8551 * Split initial stream flow control limit into 3 transport 8552 parameters (#1016, #1542) 8554 * All flow control transport parameters are optional (#1610) 8556 * Removed UNSOLICITED_PATH_RESPONSE error code (#1265, #1539) 8558 * Permit stateless reset in response to any packet (#1348, #1553) 8560 * Recommended defense against stateless reset spoofing (#1386, 8561 #1554) 8563 * Prevent infinite stateless reset exchanges (#1443, #1627) 8565 * Forbid processing of the same packet number twice (#1405, #1624) 8567 * Added a packet number decoding example (#1493) 8569 * More precisely define idle timeout (#1429, #1614, #1652) 8571 * Corrected format of Retry packet and prevented looping (#1492, 8572 #1451, #1448, #1498) 8574 * Permit 0-RTT after receiving Version Negotiation or Retry (#1507, 8575 #1514, #1621) 8577 * Permit Retry in response to 0-RTT (#1547, #1552) 8579 * Looser verification of ECN counters to account for ACK loss 8580 (#1555, #1481, #1565) 8582 * Remove frame type field from APPLICATION_CLOSE (#1508, #1528) 8584 C.18. 
Since draft-ietf-quic-transport-12 8586 * Changes to integration of the TLS handshake (#829, #1018, #1094, 8587 #1165, #1190, #1233, #1242, #1252, #1450, #1458) 8589 - The cryptographic handshake uses CRYPTO frames, not stream 0 8591 - QUIC packet protection is used in place of TLS record 8592 protection 8594 - Separate QUIC packet number spaces are used for the handshake 8596 - Changed Retry to be independent of the cryptographic handshake 8598 - Added NEW_TOKEN frame and Token fields to Initial packet 8600 - Limit the use of HelloRetryRequest to address TLS needs (like 8601 key shares) 8603 * Enable server to transition connections to a preferred address 8604 (#560, #1251, #1373) 8606 * Added ECN feedback mechanisms and handling; new ACK_ECN frame 8607 (#804, #805, #1372) 8609 * Changed rules and recommendations for use of new connection IDs 8610 (#1258, #1264, #1276, #1280, #1419, #1452, #1453, #1465) 8612 * Added a transport parameter to disable intentional connection 8613 migration (#1271, #1447) 8615 * Packets with different connection IDs can't be coalesced (#1287, 8616 #1423) 8618 * Fixed sampling method for packet number encryption; the length 8619 field in long headers includes the packet number field in addition 8620 to the packet payload (#1387, #1389) 8622 * Stateless Reset is now symmetric and subject to size constraints 8623 (#466, #1346) 8625 * Added frame type extension mechanism (#58, #1473) 8627 C.19. Since draft-ietf-quic-transport-11 8628 * Enable server to transition connections to a preferred address 8629 (#560, #1251) 8631 * Packet numbers are encrypted (#1174, #1043, #1048, #1034, #850, 8632 #990, #734, #1317, #1267, #1079) 8634 * Packet numbers use a variable-length encoding (#989, #1334) 8636 * STREAM frames can now be empty (#1350) 8638 C.20.
Since draft-ietf-quic-transport-10 8640 * Swap payload length and packet number fields in long header 8641 (#1294) 8643 * Clarified that CONNECTION_CLOSE is allowed in Handshake packet 8644 (#1274) 8646 * Spin bit reserved (#1283) 8648 * Coalescing multiple QUIC packets in a UDP datagram (#1262, #1285) 8650 * A more complete connection migration (#1249) 8652 * Refine opportunistic ACK defense text (#305, #1030, #1185) 8654 * A Stateless Reset Token isn't mandatory (#818, #1191) 8656 * Removed implicit stream opening (#896, #1193) 8658 * An empty STREAM frame can be used to open a stream without sending 8659 data (#901, #1194) 8661 * Define stream counts in transport parameters rather than a maximum 8662 stream ID (#1023, #1065) 8664 * STOP_SENDING is now prohibited before streams are used (#1050) 8666 * Recommend including ACK in Retry packets and allow PADDING (#1067, 8667 #882) 8669 * Endpoints now become closing after an idle timeout (#1178, #1179) 8671 * Remove implication that Version Negotiation is sent when a packet 8672 of the wrong version is received (#1197) 8674 C.21. Since draft-ietf-quic-transport-09 8675 * Added PATH_CHALLENGE and PATH_RESPONSE frames to replace PING with 8676 Data and PONG frame. Changed ACK frame type from 0x0e to 0x0d. 8677 (#1091, #725, #1086) 8679 * A server can now only send 3 packets without validating the client 8680 address (#38, #1090) 8682 * Delivery order of stream data is no longer strongly specified 8683 (#252, #1070) 8685 * Rework of packet handling and version negotiation (#1038) 8687 * Stream 0 is now exempt from flow control until the handshake 8688 completes (#1074, #725, #825, #1082) 8690 * Improved retransmission rules for all frame types: information is 8691 retransmitted, not packets or frames (#463, #765, #1095, #1053) 8693 * Added an error code for server busy signals (#1137) 8695 * Endpoints now set the connection ID that their peer uses. 8696 Connection IDs are variable length.
Removed the 8697 omit_connection_id transport parameter and the corresponding short 8698 header flag. (#1089, #1052, #1146, #821, #745, #821, #1166, #1151) 8700 C.22. Since draft-ietf-quic-transport-08 8702 * Clarified requirements for BLOCKED usage (#65, #924) 8704 * BLOCKED frame now includes reason for blocking (#452, #924, #927, 8705 #928) 8707 * GAP limitation in ACK Frame (#613) 8709 * Improved PMTUD description (#614, #1036) 8711 * Clarified stream state machine (#634, #662, #743, #894) 8713 * Reserved versions don't need to be generated deterministically 8714 (#831, #931) 8716 * You don't always need the draining period (#871) 8718 * Stateless reset clarified as version-specific (#930, #986) 8720 * initial_max_stream_id_x transport parameters are optional (#970, 8721 #971) 8723 * ACK delay assumes a default value during the handshake (#1007, 8724 #1009) 8726 * Removed transport parameters from NewSessionTicket (#1015) 8728 C.23. Since draft-ietf-quic-transport-07 8730 * The long header now has version before packet number (#926, #939) 8732 * Rename and consolidate packet types (#846, #822, #847) 8734 * Packet types are assigned new codepoints and the Connection ID 8735 Flag is inverted (#426, #956) 8737 * Removed type for Version Negotiation and use Version 0 (#963, 8738 #968) 8740 * Streams are split into unidirectional and bidirectional (#643, 8741 #656, #720, #872, #175, #885) 8743 - Stream limits now have separate uni- and bi-directional 8744 transport parameters (#909, #958) 8746 - Stream limit transport parameters are now optional and default 8747 to 0 (#970, #971) 8749 * The stream state machine has been split into read and write (#634, 8750 #894) 8752 * Employ variable-length integer encodings throughout (#595) 8754 * Improvements to connection close 8756 - Added distinct closing and draining states (#899, #871) 8758 - Draining period can terminate early (#869, #870) 8760 - Clarifications about stateless reset (#889, #890) 8762 * Address validation 
for connection migration (#161, #732, #878) 8764 * Clearly defined retransmission rules for BLOCKED (#452, #65, #924) 8766 * negotiated_version is sent in server transport parameters (#710, 8767 #959) 8769 * Increased the range over which packet numbers are randomized 8770 (#864, #850, #964) 8772 C.24. Since draft-ietf-quic-transport-06 8774 * Replaced FNV-1a with AES-GCM for all "Cleartext" packets (#554) 8776 * Split error code space between application and transport (#485) 8778 * Stateless reset token moved to end (#820) 8780 * 1-RTT-protected long header types removed (#848) 8782 * No acknowledgments during draining period (#852) 8784 * Remove "application close" as a separate close type (#854) 8786 * Remove timestamps from the ACK frame (#841) 8788 * Require transport parameters to only appear once (#792) 8790 C.25. Since draft-ietf-quic-transport-05 8792 * Stateless token is server-only (#726) 8794 * Refactor section on connection termination (#733, #748, #328, 8795 #177) 8797 * Limit size of Version Negotiation packet (#585) 8799 * Clarify when and what to ack (#736) 8801 * Renamed STREAM_ID_NEEDED to STREAM_ID_BLOCKED 8803 * Clarify Keep-alive requirements (#729) 8805 C.26. 
Since draft-ietf-quic-transport-04 8807 * Introduce STOP_SENDING frame, RESET_STREAM only resets in one 8808 direction (#165) 8810 * Removed GOAWAY; application protocols are responsible for graceful 8811 shutdown (#696) 8813 * Reduced the number of error codes (#96, #177, #184, #211) 8815 * Version validation fields can't move or change (#121) 8817 * Removed versions from the transport parameters in a 8818 NewSessionTicket message (#547) 8820 * Clarify the meaning of "bytes in flight" (#550) 8822 * Public reset is now stateless reset and not visible to the path 8823 (#215) 8825 * Reordered bits and fields in STREAM frame (#620) 8827 * Clarifications to the stream state machine (#572, #571) 8829 * Increased the maximum length of the Largest Acknowledged field in 8830 ACK frames to 64 bits (#629) 8832 * truncate_connection_id is renamed to omit_connection_id (#659) 8834 * CONNECTION_CLOSE terminates the connection like TCP RST (#330, 8835 #328) 8837 * Update labels used in HKDF-Expand-Label to match TLS 1.3 (#642) 8839 C.27. Since draft-ietf-quic-transport-03 8841 * Change STREAM and RESET_STREAM layout 8843 * Add MAX_STREAM_ID settings 8845 C.28. 
Since draft-ietf-quic-transport-02 8847 * The size of the initial packet payload has a fixed minimum (#267, 8848 #472) 8850 * Define when Version Negotiation packets are ignored (#284, #294, 8851 #241, #143, #474) 8853 * The 64-bit FNV-1a algorithm is used for integrity protection of 8854 unprotected packets (#167, #480, #481, #517) 8856 * Rework initial packet types to change how the connection ID is 8857 chosen (#482, #442, #493) 8859 * No timestamps are forbidden in unprotected packets (#542, #429) 8861 * Cryptographic handshake is now on stream 0 (#456) 8863 * Remove congestion control exemption for cryptographic handshake 8864 (#248, #476) 8866 * Version 1 of QUIC uses TLS; a new version is needed to use a 8867 different handshake protocol (#516) 8869 * STREAM frames have a reduced number of offset lengths (#543, #430) 8871 * Split some frames into separate connection- and stream- level 8872 frames (#443) 8874 - WINDOW_UPDATE split into MAX_DATA and MAX_STREAM_DATA (#450) 8876 - BLOCKED split to match WINDOW_UPDATE split (#454) 8878 - Define STREAM_ID_NEEDED frame (#455) 8880 * A NEW_CONNECTION_ID frame supports connection migration without 8881 linkability (#232, #491, #496) 8883 * Transport parameters for 0-RTT are retained from a previous 8884 connection (#405, #513, #512) 8886 - A client in 0-RTT no longer required to reset excess streams 8887 (#425, #479) 8889 * Expanded security considerations (#440, #444, #445, #448) 8891 C.29. 
Since draft-ietf-quic-transport-01 8893 * Defined short and long packet headers (#40, #148, #361) 8895 * Defined a versioning scheme and stable fields (#51, #361) 8897 * Define reserved version values for "greasing" negotiation (#112, 8898 #278) 8900 * The initial packet number is randomized (#35, #283) 8902 * Narrow the packet number encoding range requirement (#67, #286, 8903 #299, #323, #356) 8905 * Defined client address validation (#52, #118, #120, #275) 8907 * Define transport parameters as a TLS extension (#49, #122) 8909 * SCUP and COPT parameters are no longer valid (#116, #117) 8911 * Transport parameters for 0-RTT are either remembered from before, 8912 or assume default values (#126) 8914 * The server chooses connection IDs in its final flight (#119, #349, 8915 #361) 8917 * The server echoes the Connection ID and packet number fields when 8918 sending a Version Negotiation packet (#133, #295, #244) 8920 * Defined a minimum packet size for the initial handshake packet 8921 from the client (#69, #136, #139, #164) 8923 * Path MTU Discovery (#64, #106) 8925 * The initial handshake packet from the client needs to fit in a 8926 single packet (#338) 8928 * Forbid acknowledgment of packets containing only ACK and PADDING 8929 (#291) 8931 * Require that frames are processed when packets are acknowledged 8932 (#381, #341) 8934 * Removed the STOP_WAITING frame (#66) 8936 * Don't require retransmission of old timestamps for lost ACK frames 8937 (#308) 8939 * Clarified that frames are not retransmitted, but the information 8940 in them can be (#157, #298) 8942 * Error handling definitions (#335) 8944 * Split error codes into four sections (#74) 8946 * Forbid the use of Public Reset where CONNECTION_CLOSE is possible 8947 (#289) 8949 * Define packet protection rules (#336) 8951 * Require that stream be entirely delivered or reset, including 8952 acknowledgment of all STREAM frames or the RESET_STREAM, before it 8953 closes (#381) 8955 * Remove stream reservation from 
state machine (#174, #280) 8957 * Only stream 1 does not contribute to connection-level flow control 8958 (#204) 8960 * Stream 1 counts towards the maximum concurrent stream limit (#201, 8961 #282) 8963 * Remove connection-level flow control exclusion for some streams 8964 (except 1) (#246) 8966 * RESET_STREAM affects connection-level flow control (#162, #163) 8968 * Flow control accounting uses the maximum data offset on each 8969 stream, rather than bytes received (#378) 8971 * Moved length-determining fields to the start of STREAM and ACK 8972 (#168, #277) 8974 * Added the ability to pad between frames (#158, #276) 8976 * Remove error code and reason phrase from GOAWAY (#352, #355) 8978 * GOAWAY includes a final stream number for both directions (#347) 8980 * Error codes for RESET_STREAM and CONNECTION_CLOSE are now at a 8981 consistent offset (#249) 8983 * Defined priority as the responsibility of the application protocol 8984 (#104, #303) 8986 C.30. Since draft-ietf-quic-transport-00 8988 * Replaced DIVERSIFICATION_NONCE flag with KEY_PHASE flag 8990 * Defined versioning 8992 * Reworked description of packet and frame layout 8994 * Error code space is divided into regions for each component 8996 * Use big endian for all numeric values 8998 C.31. Since draft-hamilton-quic-transport-protocol-01 9000 * Adopted as base for draft-ietf-quic-tls 9002 * Updated authors/editors list 9004 * Added IANA Considerations section 9006 * Moved Contributors and Acknowledgments to appendices 9008 Contributors 9010 The original design and rationale behind this protocol draw 9011 significantly from work by Jim Roskind [EARLY-DESIGN]. 9013 The IETF QUIC Working Group received an enormous amount of support 9014 from many people. 
The following people provided substantive 9015 contributions to this document: 9017 * Alessandro Ghedini 9019 * Alyssa Wilk 9021 * Antoine Delignat-Lavaud 9023 * Brian Trammell 9025 * Christian Huitema 9027 * Colin Perkins 9029 * David Schinazi 9031 * Dmitri Tikhonov 9033 * Eric Kinnear 9035 * Eric Rescorla 9037 * Gorry Fairhurst 9039 * Ian Swett 9041 * Igor Lubashev 9043 * 奥 一穂 (Kazuho Oku) 9045 * Lucas Pardue 9047 * Magnus Westerlund 9049 * Marten Seemann 9051 * Martin Duke 9053 * Mike Bishop 9055 * Mikkel Fahnøe Jørgensen 9057 * Mirja Kühlewind 9059 * Nick Banks 9060 * Nick Harper 9062 * Patrick McManus 9064 * Roberto Peon 9066 * Ryan Hamilton 9068 * Subodh Iyengar 9070 * Tatsuhiro Tsujikawa 9072 * Ted Hardie 9074 * Tom Jones 9076 * Victor Vasiliev 9078 Authors' Addresses 9080 Jana Iyengar (editor) 9081 Fastly 9083 Email: jri.ietf@gmail.com 9085 Martin Thomson (editor) 9086 Mozilla 9088 Email: mt@lowentropy.net