QUIC                                                     J. Iyengar, Ed.
Internet-Draft                                                    Fastly
Intended status: Standards Track                           I. Swett, Ed.
Expires: January 9, 2020                                          Google
                                                           July 08, 2019


               QUIC Loss Detection and Congestion Control
                      draft-ietf-quic-recovery-21

Abstract

   This document describes loss detection and congestion control
   mechanisms for QUIC.

Note to Readers

   Discussion of this draft takes place on the QUIC working group
   mailing list (quic@ietf.org), which is archived at
   https://mailarchive.ietf.org/arch/search/?email_list=quic [1].

   Working Group information can be found at https://github.com/quicwg
   [2]; source code and issues list for this draft can be found at
   https://github.com/quicwg/base-drafts/labels/-recovery [3].

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on January 9, 2020.

Copyright Notice

   Copyright (c) 2019 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Conventions and Definitions
   3.  Design of the QUIC Transmission Machinery
     3.1.  Relevant Differences Between QUIC and TCP
       3.1.1.  Separate Packet Number Spaces
       3.1.2.  Monotonically Increasing Packet Numbers
       3.1.3.  Clearer Loss Epoch
       3.1.4.  No Reneging
       3.1.5.  More ACK Ranges
       3.1.6.  Explicit Correction For Delayed Acknowledgements
   4.  Generating Acknowledgements
     4.1.  Crypto Handshake Data
     4.2.  ACK Ranges
     4.3.  Receiver Tracking of ACK Frames
     4.4.  Measuring and Reporting Host Delay
   5.  Estimating the Round-Trip Time
     5.1.  Generating RTT samples
     5.2.  Estimating min_rtt
     5.3.  Estimating smoothed_rtt and rttvar
   6.  Loss Detection
     6.1.  Acknowledgement-based Detection
       6.1.1.  Packet Threshold
       6.1.2.  Time Threshold
     6.2.  Crypto Retransmission Timeout
     6.3.  Probe Timeout
       6.3.1.  Computing PTO
       6.3.2.  Sending Probe Packets
       6.3.3.  Loss Detection
     6.4.  Retry and Version Negotiation
     6.5.  Discarding Keys and Packet State
     6.6.  Discussion
   7.  Congestion Control
     7.1.  Explicit Congestion Notification
     7.2.  Slow Start
     7.3.  Congestion Avoidance
     7.4.  Recovery Period
     7.5.  Ignoring Loss of Undecryptable Packets
     7.6.  Probe Timeout
     7.7.  Persistent Congestion
     7.8.  Pacing
     7.9.  Under-utilizing the Congestion Window
   8.  Security Considerations
     8.1.  Congestion Signals
     8.2.  Traffic Analysis
     8.3.  Misreporting ECN Markings
   9.  IANA Considerations
   10. References
     10.1.  Normative References
     10.2.  Informative References
     10.3.  URIs
   Appendix A.  Loss Recovery Pseudocode
     A.1.  Tracking Sent Packets
       A.1.1.  Sent Packet Fields
     A.2.  Constants of interest
     A.3.  Variables of interest
     A.4.  Initialization
     A.5.  On Sending a Packet
     A.6.  On Receiving an Acknowledgment
     A.7.  On Packet Acknowledgment
     A.8.  Setting the Loss Detection Timer
     A.9.  On Timeout
     A.10. Detecting Lost Packets
   Appendix B.  Congestion Control Pseudocode
     B.1.  Constants of interest
     B.2.  Variables of interest
     B.3.  Initialization
     B.4.  On Packet Sent
     B.5.  On Packet Acknowledgement
     B.6.  On New Congestion Event
     B.7.  Process ECN Information
     B.8.  On Packets Lost
   Appendix C.  Change Log
     C.1.  Since draft-ietf-quic-recovery-20
     C.2.  Since draft-ietf-quic-recovery-19
     C.3.  Since draft-ietf-quic-recovery-18
     C.4.  Since draft-ietf-quic-recovery-17
     C.5.  Since draft-ietf-quic-recovery-16
     C.6.  Since draft-ietf-quic-recovery-14
     C.7.  Since draft-ietf-quic-recovery-13
     C.8.  Since draft-ietf-quic-recovery-12
     C.9.  Since draft-ietf-quic-recovery-11
     C.10. Since draft-ietf-quic-recovery-10
     C.11. Since draft-ietf-quic-recovery-09
     C.12. Since draft-ietf-quic-recovery-08
     C.13. Since draft-ietf-quic-recovery-07
     C.14. Since draft-ietf-quic-recovery-06
     C.15. Since draft-ietf-quic-recovery-05
     C.16. Since draft-ietf-quic-recovery-04
     C.17. Since draft-ietf-quic-recovery-03
     C.18. Since draft-ietf-quic-recovery-02
     C.19. Since draft-ietf-quic-recovery-01
     C.20. Since draft-ietf-quic-recovery-00
     C.21. Since draft-iyengar-quic-loss-recovery-01
   Acknowledgments
   Authors' Addresses

1.  Introduction

   QUIC is a new multiplexed and secure transport atop UDP.
   QUIC builds on decades of transport and security experience, and
   implements mechanisms that make it attractive as a modern
   general-purpose transport. The QUIC protocol is described in
   [QUIC-TRANSPORT].

   QUIC implements the spirit of existing TCP loss recovery mechanisms,
   described in RFCs, various Internet-drafts, and also those prevalent
   in the Linux TCP implementation. This document describes QUIC
   congestion control and loss recovery, and where applicable,
   attributes the TCP equivalent in RFCs, Internet-drafts, academic
   papers, and/or TCP implementations.

2.  Conventions and Definitions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

   Definitions of terms that are used in this document:

   ACK-only:  Any packet containing only one or more ACK frame(s).

   In-flight:  Packets are considered in-flight when they have been sent
      and are not ACK-only, and they are not acknowledged, declared
      lost, or abandoned along with old keys.

   Ack-eliciting Frames:  All frames besides ACK or PADDING are
      considered ack-eliciting.

   Ack-eliciting Packets:  Packets that contain ack-eliciting frames
      elicit an ACK from the receiver within the maximum ack delay and
      are called ack-eliciting packets.

   Crypto Packets:  Packets containing CRYPTO data sent in Initial or
      Handshake packets.

   Out-of-order Packets:  Packets that do not increase the largest
      received packet number for their packet number space by exactly
      one. Packets arrive out of order when earlier packets are lost or
      delayed.

3.  Design of the QUIC Transmission Machinery

   All transmissions in QUIC are sent with a packet-level header, which
   indicates the encryption level and includes a packet sequence number
   (referred to below as a packet number). The encryption level
   indicates the packet number space, as described in [QUIC-TRANSPORT].
   Packet numbers never repeat within a packet number space for the
   lifetime of a connection. Packet numbers monotonically increase
   within a space, preventing ambiguity.

   This design obviates the need for disambiguating between
   transmissions and retransmissions and eliminates significant
   complexity from QUIC's interpretation of TCP loss detection
   mechanisms.

   QUIC packets can contain multiple frames of different types. The
   recovery mechanisms ensure that data and frames that need reliable
   delivery are acknowledged or declared lost and sent in new packets as
   necessary. The types of frames contained in a packet affect recovery
   and congestion control logic:

   o  All packets are acknowledged, though packets that contain no
      ack-eliciting frames are only acknowledged along with
      ack-eliciting packets.

   o  Long header packets that contain CRYPTO frames are critical to the
      performance of the QUIC handshake and use shorter timers for
      acknowledgement and retransmission.

   o  Packets that contain only ACK frames do not count toward
      congestion control limits and are not considered in-flight.

   o  PADDING frames cause packets to contribute toward bytes in flight
      without directly causing an acknowledgment to be sent.

3.1.  Relevant Differences Between QUIC and TCP

   Readers familiar with TCP's loss detection and congestion control
   will find algorithms here that parallel well-known TCP ones.
   Protocol differences between QUIC and TCP however contribute to
   algorithmic differences. We briefly describe these protocol
   differences below.

3.1.1.  Separate Packet Number Spaces

   QUIC uses separate packet number spaces for each encryption level,
   except 0-RTT and all generations of 1-RTT keys use the same packet
   number space. Separate packet number spaces ensure that
   acknowledgement of packets sent with one level of encryption will not
   cause spurious retransmission of packets sent with a different
   encryption level. Congestion control and round-trip time (RTT)
   measurement are unified across packet number spaces.

3.1.2.  Monotonically Increasing Packet Numbers

   TCP conflates transmission order at the sender with delivery order at
   the receiver, which results in retransmissions of the same data
   carrying the same sequence number, and consequently leads to
   "retransmission ambiguity". QUIC separates the two: QUIC uses a
   packet number to indicate transmission order, and any application
   data is sent in one or more streams, with delivery order determined
   by stream offsets encoded within STREAM frames.

   QUIC's packet number is strictly increasing within a packet number
   space, and directly encodes transmission order. A higher packet
   number signifies that the packet was sent later, and a lower packet
   number signifies that the packet was sent earlier. When a packet
   containing ack-eliciting frames is detected lost, QUIC rebundles
   necessary frames in a new packet with a new packet number, removing
   ambiguity about which packet is acknowledged when an ACK is received.
   Consequently, more accurate RTT measurements can be made, spurious
   retransmissions are trivially detected, and mechanisms such as Fast
   Retransmit can be applied universally, based only on packet number.

   This design point significantly simplifies loss detection mechanisms
   for QUIC. Most TCP mechanisms implicitly attempt to infer
   transmission ordering based on TCP sequence numbers - a non-trivial
   task, especially when TCP timestamps are not available.

3.1.3.  Clearer Loss Epoch

   QUIC ends a loss epoch when a packet sent after loss is declared is
   acknowledged. TCP waits for the gap in the sequence number space to
   be filled, and so if a segment is lost multiple times in a row, the
   loss epoch may not end for several round trips. Because both should
   reduce their congestion windows only once per epoch, QUIC will do it
   correctly once for every round trip that experiences loss, while TCP
   may only do it once across multiple round trips.

3.1.4.  No Reneging

   QUIC ACKs contain information that is similar to TCP SACK, but QUIC
   does not allow any acked packet to be reneged, greatly simplifying
   implementations on both sides and reducing memory pressure on the
   sender.

3.1.5.  More ACK Ranges

   QUIC supports many ACK ranges, as opposed to TCP's 3 SACK ranges. In
   high loss environments, this speeds recovery, reduces spurious
   retransmits, and ensures forward progress without relying on
   timeouts.

3.1.6.  Explicit Correction For Delayed Acknowledgements

   QUIC endpoints measure the delay incurred between when a packet is
   received and when the corresponding acknowledgment is sent, allowing
   a peer to maintain a more accurate round-trip time estimate (see
   Section 4.4).

4.  Generating Acknowledgements

   An acknowledgement SHOULD be sent immediately upon receipt of a
   second ack-eliciting packet. QUIC recovery algorithms do not assume
   the peer sends an ACK immediately when receiving a second
   ack-eliciting packet.

   In order to accelerate loss recovery and reduce timeouts, the
   receiver SHOULD send an immediate ACK after it receives an
   out-of-order packet. It could send immediate ACKs for in-order
   packets for a period of time that SHOULD NOT exceed 1/8 RTT unless
   more out-of-order packets arrive. If every packet arrives
   out-of-order, then an immediate ACK SHOULD be sent for every
   received packet.
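
   The receiver-side rules above can be illustrated with a small
   sketch. It is not part of the draft: the class name AckPolicy, its
   fields, and the on_packet() method are hypothetical, and the sketch
   assumes that every packet passed in is ack-eliciting and belongs to
   a single packet number space. Times are in seconds.

```python
from dataclasses import dataclass


@dataclass
class AckPolicy:
    """Hypothetical receiver-side helper (illustration only)."""
    largest_received: int = -1    # largest packet number seen so far
    pending: int = 0              # ack-eliciting packets not yet ACKed
    immediate_until: float = 0.0  # end of the post-reordering window

    def on_packet(self, pn: int, now: float, rtt: float) -> bool:
        """Return True if an ACK should be sent immediately."""
        # Out-of-order: does not advance the largest received by exactly one.
        out_of_order = pn != self.largest_received + 1
        self.largest_received = max(self.largest_received, pn)
        if out_of_order:
            # Keep sending immediate ACKs for at most 1/8 RTT afterwards.
            self.immediate_until = now + rtt / 8
        self.pending += 1
        immediate = (out_of_order
                     or now < self.immediate_until
                     or self.pending >= 2)
        if immediate:
            self.pending = 0  # the ACK covers everything received so far
        return immediate
```

   When on_packet() returns False, an implementation would normally
   arm a delayed-ACK timer bounded by its max_ack_delay; that timer is
   left out of the sketch.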

   Similarly, packets marked with the ECN Congestion Experienced (CE)
   codepoint in the IP header SHOULD be acknowledged immediately, to
   reduce the peer's response time to congestion events.

   As an optimization, a receiver MAY process multiple packets before
   sending any ACK frames in response. In this case the receiver can
   determine whether an immediate or delayed acknowledgement should be
   generated after processing incoming packets.

4.1.  Crypto Handshake Data

   In order to quickly complete the handshake and avoid spurious
   retransmissions due to crypto retransmission timeouts, crypto packets
   SHOULD use a very short ack delay, such as the local timer
   granularity. ACK frames SHOULD be sent immediately when the crypto
   stack indicates all data for that packet number space has been
   received.

4.2.  ACK Ranges

   When an ACK frame is sent, one or more ranges of acknowledged packets
   are included. Including older packets reduces the chance of spurious
   retransmits caused by losing previously sent ACK frames, at the cost
   of larger ACK frames.

   ACK frames SHOULD always acknowledge the most recently received
   packets, and the more out-of-order the packets are, the more
   important it is to send an updated ACK frame quickly, to prevent the
   peer from declaring a packet as lost and spuriously retransmitting
   the frames it contains.

   Below is one recommended approach for determining what packets to
   include in an ACK frame.

4.3.  Receiver Tracking of ACK Frames

   When a packet containing an ACK frame is sent, the largest
   acknowledged in that frame may be saved. When a packet containing an
   ACK frame is acknowledged, the receiver can stop acknowledging
   packets less than or equal to the largest acknowledged in the sent
   ACK frame.

   In cases without ACK frame loss, this algorithm allows for a minimum
   of 1 RTT of reordering.
   In cases with ACK frame loss and reordering, this approach does not
   guarantee that every acknowledgement is seen by the sender before it
   is no longer included in the ACK frame. Packets could be received
   out of order and all subsequent ACK frames containing them could be
   lost. In this case, the loss recovery algorithm may cause spurious
   retransmits, but the sender will continue making forward progress.

4.4.  Measuring and Reporting Host Delay

   An endpoint measures the delays intentionally introduced between when
   an ack-eliciting packet is received and the corresponding
   acknowledgment is sent. The endpoint encodes this delay for the
   largest acknowledged packet in the Ack Delay field of an ACK frame
   (see Section 19.3 of [QUIC-TRANSPORT]). This allows the receiver of
   the ACK to adjust for any intentional delays, which is important for
   delayed acknowledgements, when estimating the path RTT. A packet
   might be held in the OS kernel or elsewhere on the host before being
   processed. An endpoint SHOULD NOT include these unintentional delays
   when populating the Ack Delay field in an ACK frame.

   An endpoint MUST NOT excessively delay acknowledgements of
   ack-eliciting packets. The maximum ack delay is communicated in the
   max_ack_delay transport parameter; see Section 18.1 of
   [QUIC-TRANSPORT]. max_ack_delay implies an explicit contract: an
   endpoint promises to never delay acknowledgments of an ack-eliciting
   packet by more than the indicated value. If it does, any excess
   accrues to the RTT estimate and could result in spurious
   retransmissions from the peer. For Initial and Handshake packets, a
   max_ack_delay of 0 is used.

5.  Estimating the Round-Trip Time

   At a high level, an endpoint measures the time from when a packet was
   sent to when it is acknowledged as a round-trip time (RTT) sample.
   The endpoint uses RTT samples and peer-reported host delays
   (Section 4.4) to generate a statistical description of the
   connection's RTT. An endpoint computes the following three values:
   the minimum value observed over the lifetime of the connection
   (min_rtt), an exponentially-weighted moving average (smoothed_rtt),
   and the variance in the observed RTT samples (rttvar).

5.1.  Generating RTT samples

   An endpoint generates an RTT sample on receiving an ACK frame that
   meets the following two conditions:

   o  the largest acknowledged packet number is newly acknowledged, and

   o  at least one of the newly acknowledged packets was ack-eliciting.

   The RTT sample, latest_rtt, is generated as the time elapsed since
   the largest acknowledged packet was sent:

   latest_rtt = ack_time - send_time_of_largest_acked

   An RTT sample is generated using only the largest acknowledged packet
   in the received ACK frame. This is because a peer reports host
   delays for only the largest acknowledged packet in an ACK frame.
   While the reported host delay is not used by the RTT sample
   measurement, it is used to adjust the RTT sample in subsequent
   computations of smoothed_rtt and rttvar (Section 5.3).

   To avoid generating multiple RTT samples using the same packet, an
   ACK frame SHOULD NOT be used to update RTT estimates if it does not
   newly acknowledge the largest acknowledged packet.

   An RTT sample MUST NOT be generated on receiving an ACK frame that
   does not newly acknowledge at least one ack-eliciting packet. A peer
   does not send an ACK frame on receiving only non-ack-eliciting
   packets, so an ACK frame that is subsequently sent can include an
   arbitrarily large Ack Delay field. Ignoring such ACK frames avoids
   complications in subsequent smoothed_rtt and rttvar computations.

   A sender might generate multiple RTT samples per RTT when multiple
   ACK frames are received within an RTT.
   As suggested in [RFC6298], doing so might result in inadequate
   history in smoothed_rtt and rttvar. Ensuring that RTT estimates
   retain sufficient history is an open research question.

5.2.  Estimating min_rtt

   min_rtt is the minimum RTT observed over the lifetime of the
   connection. min_rtt is set to the latest_rtt on the first sample in
   a connection, and to the lesser of min_rtt and latest_rtt on
   subsequent samples.

   An endpoint uses only locally observed times in computing the min_rtt
   and does not adjust for host delays reported by the peer
   (Section 4.4). Doing so allows the endpoint to set a lower bound for
   the smoothed_rtt based entirely on what it observes (see
   Section 5.3), and limits potential underestimation due to
   erroneously-reported delays by the peer.

5.3.  Estimating smoothed_rtt and rttvar

   smoothed_rtt is an exponentially-weighted moving average of an
   endpoint's RTT samples, and rttvar is the endpoint's estimated
   variance in the RTT samples.

   The calculation of smoothed_rtt uses path latency after adjusting RTT
   samples for host delays (Section 4.4). For packets sent in the
   ApplicationData packet number space, a peer limits any delay in
   sending an acknowledgement for an ack-eliciting packet to no greater
   than the value it advertised in the max_ack_delay transport
   parameter. Consequently, when a peer reports an Ack Delay that is
   greater than its max_ack_delay, the delay is attributed to reasons
   out of the peer's control, such as scheduler latency at the peer or
   loss of previous ACK frames. Any delays beyond the peer's
   max_ack_delay are therefore considered effectively part of path delay
   and incorporated into the smoothed_rtt estimate.
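
   As an illustration only, the estimator described in Sections 5.2
   and 5.3 might be sketched in Python as follows. The class name
   RttEstimator, its method on_sample(), and the handshake_space flag
   are hypothetical names introduced for this sketch. It assumes the
   caller has already checked the sample-generation conditions of
   Section 5.1 and passes all times in milliseconds. The update order
   follows the pseudocode given in this section.

```python
from typing import Optional


class RttEstimator:
    """Hypothetical RTT estimator (illustration only)."""

    def __init__(self, max_ack_delay: float) -> None:
        self.max_ack_delay = max_ack_delay  # peer's transport parameter
        self.min_rtt: Optional[float] = None
        self.smoothed_rtt: Optional[float] = None
        self.rttvar: Optional[float] = None

    def on_sample(self, latest_rtt: float, ack_delay: float,
                  handshake_space: bool = False) -> None:
        if self.smoothed_rtt is None:
            # First sample in the connection.
            self.min_rtt = latest_rtt
            self.smoothed_rtt = latest_rtt
            self.rttvar = latest_rtt / 2
            return
        # min_rtt uses only locally observed times (Section 5.2).
        self.min_rtt = min(self.min_rtt, latest_rtt)
        if handshake_space:
            # Ack Delay is ignored for Initial and Handshake packets.
            ack_delay = 0.0
        # Delay beyond max_ack_delay is treated as path delay.
        ack_delay = min(ack_delay, self.max_ack_delay)
        adjusted_rtt = latest_rtt
        if self.min_rtt + ack_delay < latest_rtt:
            adjusted_rtt = latest_rtt - ack_delay
        # As in the draft's pseudocode, smoothed_rtt is updated before
        # rttvar_sample is taken.
        self.smoothed_rtt = 7 / 8 * self.smoothed_rtt + 1 / 8 * adjusted_rtt
        rttvar_sample = abs(self.smoothed_rtt - adjusted_rtt)
        self.rttvar = 3 / 4 * self.rttvar + 1 / 4 * rttvar_sample
```

   For example, with max_ack_delay = 25, a first sample of 100 followed
   by a sample of 140 with a reported Ack Delay of 50 uses a clamped
   delay of 25 and therefore an adjusted sample of 115.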

   When adjusting an RTT sample using peer-reported acknowledgement
   delays, an endpoint:

   o  MUST ignore the Ack Delay field of the ACK frame for packets sent
      in the Initial and Handshake packet number space.

   o  MUST use the lesser of the value reported in Ack Delay field of
      the ACK frame and the peer's max_ack_delay transport parameter
      (Section 4.4).

   o  MUST NOT apply the adjustment if the resulting RTT sample is
      smaller than the min_rtt. This limits the underestimation that a
      misreporting peer can cause to the smoothed_rtt.

   On the first RTT sample in a connection, the smoothed_rtt is set to
   the latest_rtt.

   smoothed_rtt and rttvar are computed as follows, similar to
   [RFC6298]. On the first RTT sample in a connection:

   smoothed_rtt = latest_rtt
   rttvar = latest_rtt / 2

   On subsequent RTT samples, smoothed_rtt and rttvar evolve as follows:

   ack_delay = min(Ack Delay in ACK Frame, max_ack_delay)
   adjusted_rtt = latest_rtt
   if (min_rtt + ack_delay < latest_rtt):
     adjusted_rtt = latest_rtt - ack_delay
   smoothed_rtt = 7/8 * smoothed_rtt + 1/8 * adjusted_rtt
   rttvar_sample = abs(smoothed_rtt - adjusted_rtt)
   rttvar = 3/4 * rttvar + 1/4 * rttvar_sample

6.  Loss Detection

   QUIC senders use both ack information and timeouts to detect lost
   packets, and this section provides a description of these algorithms.

   If a packet is lost, the QUIC transport needs to recover from that
   loss, such as by retransmitting the data, sending an updated frame,
   or abandoning the frame. For more information, see Section 13.2 of
   [QUIC-TRANSPORT].

6.1.  Acknowledgement-based Detection

   Acknowledgement-based loss detection implements the spirit of TCP's
   Fast Retransmit [RFC5681], Early Retransmit [RFC5827], FACK [FACK],
   SACK loss recovery [RFC6675], and RACK [RACK]. This section provides
   an overview of how these algorithms are implemented in QUIC.

   A packet is declared lost if it meets all the following conditions:

   o  The packet is unacknowledged, in-flight, and was sent prior to an
      acknowledged packet.

   o  Either its packet number is kPacketThreshold smaller than an
      acknowledged packet (Section 6.1.1), or it was sent long enough in
      the past (Section 6.1.2).

   The acknowledgement indicates that a packet sent later was delivered,
   while the packet and time thresholds provide some tolerance for
   packet reordering.

   Spuriously declaring packets as lost leads to unnecessary
   retransmissions and may result in degraded performance due to the
   actions of the congestion controller upon detecting loss.
   Implementations that detect spurious retransmissions and increase the
   reordering threshold in packets or time MAY choose to start with
   smaller initial reordering thresholds to minimize recovery latency.

6.1.1.  Packet Threshold

   The RECOMMENDED initial value for the packet reordering threshold
   (kPacketThreshold) is 3, based on best practices for TCP loss
   detection [RFC5681] [RFC6675].

   Some networks may exhibit higher degrees of reordering, causing a
   sender to detect spurious losses. Implementers MAY use algorithms
   developed for TCP, such as TCP-NCR [RFC4653], to improve QUIC's
   reordering resilience.

6.1.2.  Time Threshold

   Once a later packet within the same packet number space has been
   acknowledged, an endpoint SHOULD declare an earlier packet lost if it
   was sent a threshold amount of time in the past. To avoid declaring
   packets as lost too early, this time threshold MUST be set to at
   least kGranularity. The time threshold is:

   kTimeThreshold * max(SRTT, latest_RTT, kGranularity)

   If packets sent prior to the largest acknowledged packet cannot yet
   be declared lost, then a timer SHOULD be set for the remaining time.
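
   The packet and time thresholds can be combined as in the following
   sketch. It is an illustration under stated assumptions, not the
   draft's pseudocode: the function name detect_lost(), the shape of
   its arguments, and the 1 ms value chosen for kGranularity are
   assumptions of this example, while kPacketThreshold = 3 and
   kTimeThreshold = 9/8 are the RECOMMENDED values of Sections 6.1.1
   and 6.1.2. Times are in milliseconds.

```python
from typing import Dict, List, Optional, Tuple

K_PACKET_THRESHOLD = 3    # RECOMMENDED in Section 6.1.1
K_TIME_THRESHOLD = 9 / 8  # RECOMMENDED in Section 6.1.2
K_GRANULARITY = 1.0       # assumed timer granularity, in ms


def detect_lost(sent: Dict[int, float], largest_acked: int, now: float,
                smoothed_rtt: float, latest_rtt: float
                ) -> Tuple[List[int], Optional[float]]:
    """Hypothetical helper: classify unacknowledged in-flight packets.

    `sent` maps packet number -> send time for packets of one packet
    number space that are still unacknowledged and in flight.
    Returns the packet numbers declared lost and, if some earlier
    packets cannot be declared lost yet, the time at which to re-check.
    """
    loss_delay = K_TIME_THRESHOLD * max(smoothed_rtt, latest_rtt,
                                        K_GRANULARITY)
    lost: List[int] = []
    timer: Optional[float] = None
    for pn, sent_time in sorted(sent.items()):
        if pn >= largest_acked:
            continue  # not sent prior to an acknowledged packet
        deadline = sent_time + loss_delay
        if largest_acked - pn >= K_PACKET_THRESHOLD or deadline <= now:
            lost.append(pn)
        elif timer is None or deadline < timer:
            timer = deadline  # earliest time this packet could be lost
    return lost, timer
```

   With smoothed_rtt = latest_rtt = 80, the loss delay is 90. Given
   packets 6 and 7 sent at times 0 and 60, largest_acked = 8 and
   now = 100, packet 6 is declared lost by the time threshold and the
   timer is set to 150 for packet 7.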

   Using max(SRTT, latest_RTT) protects from the two following cases:

   o  the latest RTT sample is lower than the SRTT, perhaps due to
      reordering where the acknowledgement encountered a shorter path;

   o  the latest RTT sample is higher than the SRTT, perhaps due to a
      sustained increase in the actual RTT, but the smoothed SRTT has
      not yet caught up.

   The RECOMMENDED time threshold (kTimeThreshold), expressed as a
   round-trip time multiplier, is 9/8.

   Implementations MAY experiment with absolute thresholds, thresholds
   from previous connections, adaptive thresholds, or including RTT
   variance. Smaller thresholds reduce reordering resilience and
   increase spurious retransmissions, and larger thresholds increase
   loss detection delay.

6.2.  Crypto Retransmission Timeout

   Data in CRYPTO frames is critical to QUIC transport and crypto
   negotiation, so a more aggressive timeout is used to retransmit it.

   The initial crypto retransmission timeout SHOULD be set to twice the
   initial RTT.

   At the beginning, there are no prior RTT samples within a connection.
   Resumed connections over the same network SHOULD use the previous
   connection's final smoothed RTT value as the resumed connection's
   initial RTT. If no previous RTT is available, or if the network
   changes, the initial RTT SHOULD be set to 500ms, resulting in a 1
   second initial handshake timeout as recommended in [RFC6298].

   A connection MAY use the delay between sending a PATH_CHALLENGE and
   receiving a PATH_RESPONSE to seed initial_rtt for a new path, but the
   delay SHOULD NOT be considered an RTT sample.

   When a crypto packet is sent, the sender MUST set a timer for twice
   the smoothed RTT. This timer MUST be updated when a new crypto
   packet is sent and when an acknowledgement is received which computes
   a new RTT sample. Upon timeout, the sender MUST retransmit all
   unacknowledged CRYPTO data if possible.
The sender MUST NOT declare 616 in-flight crypto packets as lost when the crypto timer expires. 618 On each consecutive expiration of the crypto timer without receiving 619 an acknowledgement for a new packet, the sender MUST double the 620 crypto retransmission timeout and set a timer for this period. 622 Until the server has validated the client's address on the path, the 623 amount of data it can send is limited, as specified in Section 8.1 of 625 [QUIC-TRANSPORT]. If not all unacknowledged CRYPTO data can be sent, 626 then all unacknowledged CRYPTO data sent in Initial packets should be 627 retransmitted. If no data can be sent, then no timer should be armed 628 until data has been received from the client. 630 Because the server could be blocked until more packets are received, 631 the client MUST ensure that the crypto retransmission timer is set if 632 there is unacknowledged crypto data or if the client does not yet 633 have 1-RTT keys. If the crypto retransmission timer expires before 634 the client has 1-RTT keys, the client might not 635 have any crypto data to retransmit. However, the client MUST send a 636 new packet, containing only PADDING frames if necessary, to allow the 637 server to continue sending data. If Handshake keys are available to 638 the client, it MUST send a Handshake packet, and otherwise it MUST 639 send an Initial packet in a UDP datagram of at least 1200 bytes. 641 Because packets only containing PADDING do not elicit an 642 acknowledgement, they may never be acknowledged, but they are removed 643 from bytes in flight when the client gets Handshake keys and the 644 Initial keys are discarded. 646 The crypto retransmission timer is not set if the time threshold 647 (Section 6.1.2) loss detection timer is set. The time threshold loss 648 detection timer is expected to both expire earlier than the crypto 649 retransmission timeout and be less likely to spuriously retransmit 650 data.
The Initial and Handshake packet number spaces will typically 651 contain a small number of packets, so losses are less likely to be 652 detected using packet-threshold loss detection. 654 When the crypto retransmission timer is active, the probe timer 655 (Section 6.3) is not active. 657 6.3. Probe Timeout 659 A Probe Timeout (PTO) triggers a probe packet when ack-eliciting data 660 is in flight but an acknowledgement is not received within the 661 expected period of time. A PTO enables a connection to recover from 662 loss of tail packets or acks. The PTO algorithm used in QUIC 663 implements the reliability functions of Tail Loss Probe [TLP] [RACK], 664 RTO [RFC5681] and F-RTO algorithms for TCP [RFC5682], and the timeout 665 computation is based on TCP's retransmission timeout period 666 [RFC6298]. 668 6.3.1. Computing PTO 670 When an ack-eliciting packet is transmitted, the sender schedules a 671 timer for the PTO period as follows: 673 PTO = smoothed_rtt + max(4*rttvar, kGranularity) + max_ack_delay 675 kGranularity, smoothed_rtt, rttvar, and max_ack_delay are defined in 676 Appendix A.2 and Appendix A.3. 678 The PTO period is the amount of time that a sender ought to wait for 679 an acknowledgement of a sent packet. This time period includes the 680 estimated network roundtrip-time (smoothed_rtt), the variance in the 681 estimate (4*rttvar), and max_ack_delay, to account for the maximum 682 time by which a receiver might delay sending an acknowledgement. 684 The PTO value MUST be set to at least kGranularity, to avoid the 685 timer expiring immediately. 687 When a PTO timer expires, the sender probes the network as described 688 in the next section. The PTO period MUST be set to twice its current 689 value. This exponential reduction in the sender's rate is important 690 because the PTOs might be caused by loss of packets or 691 acknowledgements due to severe congestion. 693 A sender computes its PTO timer every time an ack-eliciting packet is 694 sent. 
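The PTO computation in Section 6.3.1 can be sketched in Python as follows. This is an illustrative sketch only: the function name, the millisecond units, and the 1 ms kGranularity value are assumptions, and the doubling on each consecutive expiration is expressed through a pto_count argument in the same way as the pseudocode in Appendix A.8.

```python
# Assumed timer granularity in milliseconds (system dependent in the draft).
K_GRANULARITY_MS = 1


def pto_period_ms(smoothed_rtt_ms, rttvar_ms, max_ack_delay_ms, pto_count=0):
    """Return the PTO period, doubled once per consecutive PTO expiration.

    PTO = smoothed_rtt + max(4*rttvar, kGranularity) + max_ack_delay
    """
    base = (smoothed_rtt_ms
            + max(4 * rttvar_ms, K_GRANULARITY_MS)
            + max_ack_delay_ms)
    # Each expiration without an acknowledgement doubles the period.
    return base * (2 ** pto_count)
```

With smoothed_rtt = 100 ms, rttvar = 10 ms, and max_ack_delay = 25 ms, the first PTO period is 165 ms; after two consecutive expirations it grows to 660 ms.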
A sender might choose to optimize this by setting the timer 695 fewer times if it knows that more ack-eliciting packets will be sent 696 within a short period of time. 698 6.3.2. Sending Probe Packets 700 When a PTO timer expires, a sender MUST send at least one ack- 701 eliciting packet as a probe, unless there is no data available to 702 send. An endpoint MAY send up to two ack-eliciting packets, to avoid 703 an expensive consecutive PTO expiration due to a single packet loss. 705 It is possible that the sender has no new or previously-sent data to 706 send. As an example, consider the following sequence of events: new 707 application data is sent in a STREAM frame, deemed lost, then 708 retransmitted in a new packet, and then the original transmission is 709 acknowledged. In the absence of any new application data, a PTO 710 timer expiration now would find the sender with no new or previously- 711 sent data to send. 713 When there is no data to send, the sender SHOULD send a PING or other 714 ack-eliciting frame in a single packet, re-arming the PTO timer. 716 Alternatively, instead of sending an ack-eliciting packet, the sender 717 MAY mark any packets still in flight as lost. Doing so avoids 718 sending an additional packet, but increases the risk that loss is 719 declared too aggressively, resulting in an unnecessary rate reduction 720 by the congestion controller. 722 Consecutive PTO periods increase exponentially, and as a result, 723 connection recovery latency increases exponentially as packets 724 continue to be dropped in the network. Sending two packets on PTO 725 expiration increases resilience to packet drops, thus reducing the 726 probability of consecutive PTO events. 728 Probe packets sent on a PTO MUST be ack-eliciting. A probe packet 729 SHOULD carry new data when possible. 
A probe packet MAY carry 730 retransmitted unacknowledged data when new data is unavailable, when 731 flow control does not permit new data to be sent, or to 732 opportunistically reduce loss recovery delay. Implementations MAY 733 use alternate strategies for determining the content of probe 734 packets, including sending new or retransmitted data based on the 735 application's priorities. 737 When the PTO timer expires multiple times and new data cannot be 738 sent, implementations must choose between sending the same payload 739 every time or sending different payloads. Sending the same payload 740 may be simpler and ensures the highest priority frames arrive first. 741 Sending different payloads each time reduces the chances of spurious 742 retransmission. 744 6.3.3. Loss Detection 746 Delivery or loss of packets in flight is established when an ACK 747 frame is received that newly acknowledges one or more packets. 749 A PTO timer expiration event does not indicate packet loss and MUST 750 NOT cause prior unacknowledged packets to be marked as lost. When an 751 acknowledgement is received that newly acknowledges packets, loss 752 detection proceeds as dictated by packet and time threshold 753 mechanisms; see Section 6.1. 755 6.4. Retry and Version Negotiation 757 A Retry or Version Negotiation packet causes a client to send another 758 Initial packet, effectively restarting the connection process and 759 resetting congestion control and loss recovery state, including 760 resetting any pending timers. Either packet indicates that the 761 Initial was received but not processed. Neither packet can be 762 treated as an acknowledgment for the Initial. 764 The client MAY however compute an RTT estimate to the server as the 765 time period from when the first Initial was sent to when a Retry or a 766 Version Negotiation packet is received. The client MAY use this 767 value to seed the RTT estimator for a subsequent connection attempt 768 to the server. 770 6.5. 
Discarding Keys and Packet State 772 When packet protection keys are discarded (see Section 4.9 of 773 [QUIC-TLS]), all packets that were sent with those keys can no longer 774 be acknowledged because their acknowledgements cannot be processed 775 anymore. The sender MUST discard all recovery state associated with 776 those packets and MUST remove them from the count of bytes in flight. 778 Endpoints stop sending and receiving Initial packets once they start 779 exchanging Handshake packets (see Section 17.2.2.1 of 780 [QUIC-TRANSPORT]). At this point, recovery state for all in-flight 781 Initial packets is discarded. 783 When 0-RTT is rejected, recovery state for all in-flight 0-RTT 784 packets is discarded. 786 If a server accepts 0-RTT, but does not buffer 0-RTT packets that 787 arrive before Initial packets, early 0-RTT packets will be declared 788 lost, but that is expected to be infrequent. 790 It is expected that keys are discarded after packets encrypted with 791 them would be acknowledged or declared lost. Initial secrets however 792 might be destroyed sooner, as soon as handshake keys are available 793 (see Section 4.9.1 of [QUIC-TLS]). 795 6.6. Discussion 797 The majority of constants were derived from best common practices 798 among widely deployed TCP implementations on the internet. 799 Exceptions follow. 801 A shorter delayed ack time of 25ms was chosen because longer delayed 802 acks can delay loss recovery, and for the small number of connections 803 where less than one packet per 25ms is delivered, acking every packet is 804 beneficial to congestion control and loss recovery. 806 7. Congestion Control 808 QUIC's congestion control is based on TCP NewReno [RFC6582]. NewReno 809 is a congestion window based congestion control algorithm. QUIC specifies the 810 congestion window in bytes rather than packets due to finer control 811 and the ease of appropriate byte counting [RFC3465].
813 QUIC hosts MUST NOT send packets if they would increase 814 bytes_in_flight (defined in Appendix B.2) beyond the available 815 congestion window, unless the packet is a probe packet sent after a 816 PTO timer expires, as described in Section 6.3. 818 Implementations MAY use other congestion control algorithms, such as 819 Cubic [RFC8312], and endpoints MAY use different algorithms from one 820 another. The signals QUIC provides for congestion control are 821 generic and are designed to support different algorithms. 823 7.1. Explicit Congestion Notification 825 If a path has been verified to support ECN, QUIC treats a Congestion 826 Experienced codepoint in the IP header as a signal of congestion. 827 This document specifies an endpoint's response when its peer receives 828 packets with the Congestion Experienced codepoint. As discussed in 829 [RFC8311], endpoints are permitted to experiment with other response 830 functions. 832 7.2. Slow Start 834 QUIC begins every connection in slow start and exits slow start upon 835 loss or upon increase in the ECN-CE counter. QUIC re-enters slow 836 start anytime the congestion window is less than ssthresh, which only 837 occurs after persistent congestion is declared. While in slow start, 838 QUIC increases the congestion window by the number of bytes 839 acknowledged when each acknowledgment is processed. 841 7.3. Congestion Avoidance 843 Slow start exits to congestion avoidance. Congestion avoidance in 844 NewReno uses an additive increase multiplicative decrease (AIMD) 845 approach that increases the congestion window by one maximum packet 846 size per congestion window acknowledged. When a loss is detected, 847 NewReno halves the congestion window and sets the slow start 848 threshold to the new congestion window. 850 7.4. Recovery Period 852 Recovery is a period of time beginning with detection of a lost 853 packet or an increase in the ECN-CE counter. 
Because QUIC does not 854 retransmit packets, it defines the end of recovery as a packet sent 855 after the start of recovery being acknowledged. This is slightly 856 different from TCP's definition of recovery, which ends when the lost 857 packet that started recovery is acknowledged. 859 The recovery period limits congestion window reduction to once per 860 round trip. During recovery, the congestion window remains unchanged 861 irrespective of new losses or increases in the ECN-CE counter. 863 7.5. Ignoring Loss of Undecryptable Packets 865 During the handshake, some packet protection keys might not be 866 available when a packet arrives. In particular, Handshake and 0-RTT 867 packets cannot be processed until the Initial packets arrive, and 868 1-RTT packets cannot be processed until the handshake completes. 869 Endpoints MAY ignore the loss of Handshake, 0-RTT, and 1-RTT packets 870 that might arrive before the peer has packet protection keys to 871 process those packets. 873 7.6. Probe Timeout 875 Probe packets MUST NOT be blocked by the congestion controller. A 876 sender MUST however count these packets as being additionally in 877 flight, since these packets add network load without establishing 878 packet loss. Note that sending probe packets might cause the 879 sender's bytes in flight to exceed the congestion window until an 880 acknowledgement is received that establishes loss or delivery of 881 packets. 883 7.7. Persistent Congestion 885 When an ACK frame is received that establishes loss of all in-flight 886 packets sent over a long enough period of time, the network is 887 considered to be experiencing persistent congestion. Commonly, this 888 can be established by consecutive PTOs, but since the PTO timer is 889 reset when a new ack-eliciting packet is sent, an explicit duration 890 must be used to account for those cases where PTOs do not occur or 891 are substantially delayed. 
This duration is computed as follows: 893 (smoothed_rtt + 4 * rttvar + max_ack_delay) * 894 kPersistentCongestionThreshold 896 For example, assume: 898 smoothed_rtt = 1 rttvar = 0 max_ack_delay = 0 899 kPersistentCongestionThreshold = 3 901 If an ack-eliciting packet is sent at time = 0, the following 902 scenario would illustrate persistent congestion: 904 +-----+------------------------+ 905 | t=0 | Send Pkt #1 (App Data) | 906 +-----+------------------------+ 907 | t=1 | Send Pkt #2 (PTO 1) | 908 | | | 909 | t=3 | Send Pkt #3 (PTO 2) | 910 | | | 911 | t=7 | Send Pkt #4 (PTO 3) | 912 | | | 913 | t=8 | Recv ACK of Pkt #4 | 914 +-----+------------------------+ 916 The first three packets are determined to be lost when the ACK of 917 packet 4 is received at t=8. The congestion period is calculated as 918 the time between the oldest and newest lost packets: (3 - 0) = 3. 919 The duration for persistent congestion is equal to: (1 * 920 kPersistentCongestionThreshold) = 3. Because the threshold was 921 reached and because none of the packets between the oldest and the 922 newest packets are acknowledged, the network is considered to have 923 experienced persistent congestion. 925 When persistent congestion is established, the sender's congestion 926 window MUST be reduced to the minimum congestion window 927 (kMinimumWindow). This response of collapsing the congestion window 928 on persistent congestion is functionally similar to a sender's 929 response on a Retransmission Timeout (RTO) in TCP [RFC5681] after 930 Tail Loss Probes (TLP) [TLP]. 932 7.8. Pacing 934 This document does not specify a pacer, but it is RECOMMENDED that a 935 sender pace sending of all in-flight packets based on input from the 936 congestion controller. For example, a pacer might distribute the 937 congestion window over the SRTT when used with a window-based 938 controller, and a pacer might use the rate estimate of a rate-based 939 controller.
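As one possible reading of "distribute the congestion window over the SRTT", a window-based pacer could space packets so that a full congestion window is sent evenly across one smoothed RTT. The Python sketch below illustrates that idea only; this document does not specify a pacer, and the function name, parameters, and units are hypothetical.

```python
def pacing_interval_ms(smoothed_rtt_ms, congestion_window_bytes,
                       packet_size_bytes):
    """Return the gap between packet sends for a simple window-based pacer.

    Assumption: the whole congestion window is spread evenly over one
    smoothed RTT, so each packet is allotted a share of the SRTT that is
    proportional to its size.
    """
    if congestion_window_bytes <= 0:
        raise ValueError("congestion window must be positive")
    return smoothed_rtt_ms * packet_size_bytes / congestion_window_bytes
```

With an SRTT of 100 ms, a 12000-byte congestion window, and 1200-byte packets, this sketch sends one packet every 10 ms instead of sending ten packets back to back.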
941 An implementation should take care to architect its congestion 942 controller to work well with a pacer. For instance, a pacer might 943 wrap the congestion controller and control the availability of the 944 congestion window, or a pacer might pace out packets handed to it by 945 the congestion controller. Timely delivery of ACK frames is 946 important for efficient loss recovery. Packets containing only ACK 947 frames should therefore not be paced, to avoid delaying their 948 delivery to the peer. 950 As an example of a well-known and publicly available implementation 951 of a flow pacer, implementers are referred to the Fair Queue packet 952 scheduler (fq qdisc) in Linux (3.11 onwards). 954 7.9. Under-utilizing the Congestion Window 956 A congestion window that is under-utilized SHOULD NOT be increased in 957 either slow start or congestion avoidance. Under-utilization can happen due to 958 insufficient application data or flow control credit. 960 A sender MAY use the pipeACK method described in Section 4.3 of 961 [RFC7661] to determine if the congestion window is sufficiently 962 utilized. 964 A sender that paces packets (see Section 7.8) might delay sending 965 packets and not fully utilize the congestion window due to this 966 delay. A sender should not consider itself application limited if it 967 would have fully utilized the congestion window without pacing delay. 969 Bursting more than an initial window's worth of data into the network 970 might cause short-term congestion and losses. Implementations 971 SHOULD either use pacing or reduce their congestion window to limit 972 such bursts. 974 A sender MAY implement alternate mechanisms to update its congestion 975 window after periods of under-utilization, such as those proposed for 976 TCP in [RFC7661]. 978 8. Security Considerations 980 8.1. Congestion Signals 982 Congestion control fundamentally involves the consumption of signals 983 - both loss and ECN codepoints - from unauthenticated entities.
On- 984 path attackers can spoof or alter these signals. An attacker can 985 cause endpoints to reduce their sending rate by dropping packets, or 986 alter send rate by changing ECN codepoints. 988 8.2. Traffic Analysis 990 Packets that carry only ACK frames can be heuristically identified by 991 observing packet size. Acknowledgement patterns may expose 992 information about link characteristics or application behavior. 993 Endpoints can use PADDING frames or bundle acknowledgments with other 994 frames to reduce leaked information. 996 8.3. Misreporting ECN Markings 998 A receiver can misreport ECN markings to alter the congestion 999 response of a sender. Suppressing reports of ECN-CE markings could 1000 cause a sender to increase their send rate. This increase could 1001 result in congestion and loss. 1003 A sender MAY attempt to detect suppression of reports by marking 1004 occasional packets that they send with ECN-CE. If a packet marked 1005 with ECN-CE is not reported as having been marked when the packet is 1006 acknowledged, the sender SHOULD then disable ECN for that path. 1008 Reporting additional ECN-CE markings will cause a sender to reduce 1009 their sending rate, which is similar in effect to advertising reduced 1010 connection flow control limits and so no advantage is gained by doing 1011 so. 1013 Endpoints choose the congestion controller that they use. Though 1014 congestion controllers generally treat reports of ECN-CE markings as 1015 equivalent to loss [RFC8311], the exact response for each controller 1016 could be different. Failure to correctly respond to information 1017 about ECN markings is therefore difficult to detect. 1019 9. IANA Considerations 1021 This document has no IANA actions. Yet. 1023 10. References 1025 10.1. Normative References 1027 [QUIC-TLS] 1028 Thomson, M., Ed. and S. Turner, Ed., "Using TLS to Secure 1029 QUIC", draft-ietf-quic-tls-21 (work in progress), July 1030 2019. 1032 [QUIC-TRANSPORT] 1033 Iyengar, J., Ed. 
and M. Thomson, Ed., "QUIC: A UDP-Based 1034 Multiplexed and Secure Transport", draft-ietf-quic- 1035 transport-21 (work in progress), July 2019. 1037 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 1038 Requirement Levels", BCP 14, RFC 2119, 1039 DOI 10.17487/RFC2119, March 1997, 1040 . 1042 [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 1043 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, 1044 May 2017, . 1046 [RFC8311] Black, D., "Relaxing Restrictions on Explicit Congestion 1047 Notification (ECN) Experimentation", RFC 8311, 1048 DOI 10.17487/RFC8311, January 2018, 1049 . 1051 10.2. Informative References 1053 [FACK] Mathis, M. and J. Mahdavi, "Forward Acknowledgement: 1054 Refining TCP Congestion Control", ACM SIGCOMM , August 1055 1996. 1057 [RACK] Cheng, Y., Cardwell, N., Dukkipati, N., and P. Jha, "RACK: 1058 a time-based fast loss detection algorithm for TCP", 1059 draft-ietf-tcpm-rack-05 (work in progress), April 2019. 1061 [RFC3465] Allman, M., "TCP Congestion Control with Appropriate Byte 1062 Counting (ABC)", RFC 3465, DOI 10.17487/RFC3465, February 1063 2003, . 1065 [RFC4653] Bhandarkar, S., Reddy, A., Allman, M., and E. Blanton, 1066 "Improving the Robustness of TCP to Non-Congestion 1067 Events", RFC 4653, DOI 10.17487/RFC4653, August 2006, 1068 . 1070 [RFC5681] Allman, M., Paxson, V., and E. Blanton, "TCP Congestion 1071 Control", RFC 5681, DOI 10.17487/RFC5681, September 2009, 1072 . 1074 [RFC5682] Sarolahti, P., Kojo, M., Yamamoto, K., and M. Hata, 1075 "Forward RTO-Recovery (F-RTO): An Algorithm for Detecting 1076 Spurious Retransmission Timeouts with TCP", RFC 5682, 1077 DOI 10.17487/RFC5682, September 2009, 1078 . 1080 [RFC5827] Allman, M., Avrachenkov, K., Ayesta, U., Blanton, J., and 1081 P. Hurtig, "Early Retransmit for TCP and Stream Control 1082 Transmission Protocol (SCTP)", RFC 5827, 1083 DOI 10.17487/RFC5827, May 2010, 1084 . 1086 [RFC6298] Paxson, V., Allman, M., Chu, J., and M. 
Sargent, 1087 "Computing TCP's Retransmission Timer", RFC 6298, 1088 DOI 10.17487/RFC6298, June 2011, 1089 . 1091 [RFC6582] Henderson, T., Floyd, S., Gurtov, A., and Y. Nishida, "The 1092 NewReno Modification to TCP's Fast Recovery Algorithm", 1093 RFC 6582, DOI 10.17487/RFC6582, April 2012, 1094 . 1096 [RFC6675] Blanton, E., Allman, M., Wang, L., Jarvinen, I., Kojo, M., 1097 and Y. Nishida, "A Conservative Loss Recovery Algorithm 1098 Based on Selective Acknowledgment (SACK) for TCP", 1099 RFC 6675, DOI 10.17487/RFC6675, August 2012, 1100 . 1102 [RFC6928] Chu, J., Dukkipati, N., Cheng, Y., and M. Mathis, 1103 "Increasing TCP's Initial Window", RFC 6928, 1104 DOI 10.17487/RFC6928, April 2013, 1105 . 1107 [RFC7661] Fairhurst, G., Sathiaseelan, A., and R. Secchi, "Updating 1108 TCP to Support Rate-Limited Traffic", RFC 7661, 1109 DOI 10.17487/RFC7661, October 2015, 1110 . 1112 [RFC8312] Rhee, I., Xu, L., Ha, S., Zimmermann, A., Eggert, L., and 1113 R. Scheffenegger, "CUBIC for Fast Long-Distance Networks", 1114 RFC 8312, DOI 10.17487/RFC8312, February 2018, 1115 . 1117 [TLP] Dukkipati, N., Cardwell, N., Cheng, Y., and M. Mathis, 1118 "Tail Loss Probe (TLP): An Algorithm for Fast Recovery of 1119 Tail Losses", draft-dukkipati-tcpm-tcp-loss-probe-01 (work 1120 in progress), February 2013. 1122 10.3. URIs 1124 [1] https://mailarchive.ietf.org/arch/search/?email_list=quic 1126 [2] https://github.com/quicwg 1128 [3] https://github.com/quicwg/base-drafts/labels/-recovery 1130 Appendix A. Loss Recovery Pseudocode 1132 We now describe an example implementation of the loss detection 1133 mechanisms described in Section 6. 1135 A.1. Tracking Sent Packets 1137 To correctly implement congestion control, a QUIC sender tracks every 1138 ack-eliciting packet until the packet is acknowledged or lost. 
It is 1139 expected that implementations will be able to access this information 1140 by packet number and crypto context and store the per-packet fields 1141 (Appendix A.1.1) for loss recovery and congestion control. 1143 After a packet is declared lost, the endpoint can track it for an 1144 amount of time comparable to the maximum expected packet reordering, 1145 such as 1 RTT. This allows for detection of spurious 1146 retransmissions. 1148 Sent packets are tracked for each packet number space, and ACK 1149 processing only applies to a single space. 1151 A.1.1. Sent Packet Fields 1153 packet_number: The packet number of the sent packet. 1155 ack_eliciting: A boolean that indicates whether a packet is ack- 1156 eliciting. If true, it is expected that an acknowledgement will 1157 be received, though the peer could delay sending the ACK frame 1158 containing it by up to the MaxAckDelay. 1160 in_flight: A boolean that indicates whether the packet counts 1161 towards bytes in flight. 1163 is_crypto_packet: A boolean that indicates whether the packet 1164 contains cryptographic handshake messages critical to the 1165 completion of the QUIC handshake. In this version of QUIC, this 1166 includes any packet with the long header that includes a CRYPTO 1167 frame. 1169 sent_bytes: The number of bytes sent in the packet, not including 1170 UDP or IP overhead, but including QUIC framing overhead. 1172 time_sent: The time the packet was sent. 1174 A.2. Constants of interest 1176 Constants used in loss recovery are based on a combination of RFCs, 1177 papers, and common practice. Some may need to be changed or 1178 negotiated in order to better suit a variety of environments. 1180 kPacketThreshold: Maximum reordering in packets before packet 1181 threshold loss detection considers a packet lost. The RECOMMENDED 1182 value is 3. 1184 kTimeThreshold: Maximum reordering in time before time threshold 1185 loss detection considers a packet lost. Specified as an RTT 1186 multiplier. 
The RECOMMENDED value is 9/8. 1188 kGranularity: Timer granularity. This is a system-dependent value. 1189 However, implementations SHOULD use a value no smaller than 1ms. 1191 kInitialRtt: The RTT used before an RTT sample is taken. The 1192 RECOMMENDED value is 500ms. 1194 kPacketNumberSpace: An enum to enumerate the three packet number 1195 spaces. 1197 enum kPacketNumberSpace { 1198 Initial, 1199 Handshake, 1200 ApplicationData, 1201 } 1203 A.3. Variables of interest 1205 Variables required to implement the congestion control mechanisms are 1206 described in this section. 1208 loss_detection_timer: Multi-modal timer used for loss detection. 1210 crypto_count: The number of times all unacknowledged CRYPTO data has 1211 been retransmitted without receiving an ack. 1213 pto_count: The number of times a PTO has been sent without receiving 1214 an ack. 1216 time_of_last_sent_ack_eliciting_packet: The time the most recent 1217 ack-eliciting packet was sent. 1219 time_of_last_sent_crypto_packet: The time the most recent crypto 1220 packet was sent. 1222 largest_acked_packet[kPacketNumberSpace]: The largest packet number 1223 acknowledged in the packet number space so far. 1225 latest_rtt: The most recent RTT measurement made when receiving an 1226 ack for a previously unacked packet. 1228 smoothed_rtt: The smoothed RTT of the connection, computed as 1229 described in [RFC6298] 1231 rttvar: The RTT variance, computed as described in [RFC6298] 1232 min_rtt: The minimum RTT seen in the connection, ignoring ack delay. 1234 max_ack_delay: The maximum amount of time by which the receiver 1235 intends to delay acknowledgments for packets in the 1236 ApplicationData packet number space. The actual ack_delay in a 1237 received ACK frame may be larger due to late timers, reordering, 1238 or lost ACKs. 
1240 loss_time[kPacketNumberSpace]: The time at which the next packet in 1241 that packet number space will be considered lost based on 1242 exceeding the reordering window in time. 1244 sent_packets[kPacketNumberSpace]: An association of packet numbers 1245 in a packet number space to information about them. Described in 1246 detail above in Appendix A.1. 1248 A.4. Initialization 1250 At the beginning of the connection, initialize the loss detection 1251 variables as follows: 1253 loss_detection_timer.reset() 1254 crypto_count = 0 1255 pto_count = 0 1256 latest_rtt = 0 1257 smoothed_rtt = 0 1258 rttvar = 0 1259 min_rtt = 0 1260 max_ack_delay = 0 1261 time_of_last_sent_ack_eliciting_packet = 0 1262 time_of_last_sent_crypto_packet = 0 1263 for pn_space in [ Initial, Handshake, ApplicationData ]: 1264 largest_acked_packet[pn_space] = infinite 1265 loss_time[pn_space] = 0 1267 A.5. On Sending a Packet 1269 After a packet is sent, information about the packet is stored. The 1270 parameters to OnPacketSent are described in detail above in 1271 Appendix A.1.1. 1273 Pseudocode for OnPacketSent follows: 1275 OnPacketSent(packet_number, pn_space, ack_eliciting, 1276 in_flight, is_crypto_packet, sent_bytes): 1277 sent_packets[pn_space][packet_number].packet_number = 1278 packet_number 1279 sent_packets[pn_space][packet_number].time_sent = now 1280 sent_packets[pn_space][packet_number].ack_eliciting = 1281 ack_eliciting 1282 sent_packets[pn_space][packet_number].in_flight = in_flight 1283 if (in_flight): 1284 if (is_crypto_packet): 1285 time_of_last_sent_crypto_packet = now 1286 if (ack_eliciting): 1287 time_of_last_sent_ack_eliciting_packet = now 1288 OnPacketSentCC(sent_bytes) 1289 sent_packets[pn_space][packet_number].size = sent_bytes 1290 SetLossDetectionTimer() 1292 A.6. On Receiving an Acknowledgment 1294 When an ACK frame is received, it may newly acknowledge any number of 1295 packets. 
1297 Pseudocode for OnAckReceived and UpdateRtt follows: 1299 OnAckReceived(ack, pn_space): 1300 if (largest_acked_packet[pn_space] == infinite): 1301 largest_acked_packet[pn_space] = ack.largest_acked 1302 else: 1303 largest_acked_packet[pn_space] = 1304 max(largest_acked_packet[pn_space], ack.largest_acked) 1306 // Nothing to do if there are no newly acked packets. 1307 newly_acked_packets = DetermineNewlyAckedPackets(ack, pn_space) 1308 if (newly_acked_packets.empty()): 1309 return 1311 // If the largest acknowledged is newly acked and 1312 // at least one ack-eliciting was newly acked, update the RTT. 1313 if (sent_packets[pn_space][ack.largest_acked] && 1314 IncludesAckEliciting(newly_acked_packets)): 1315 latest_rtt = 1316 now - sent_packets[pn_space][ack.largest_acked].time_sent 1317 ack_delay = 0 1318 if pn_space == ApplicationData: 1319 ack_delay = ack.ack_delay 1320 UpdateRtt(ack_delay) 1322 // Process ECN information if present. 1324 if (ACK frame contains ECN information): 1325 ProcessECN(ack) 1327 for acked_packet in newly_acked_packets: 1328 OnPacketAcked(acked_packet, pn_space) 1330 DetectLostPackets(pn_space) 1332 crypto_count = 0 1333 pto_count = 0 1335 SetLossDetectionTimer() 1337 UpdateRtt(ack_delay): 1338 // First RTT sample. 1339 if (smoothed_rtt == 0): 1340 min_rtt = latest_rtt 1341 smoothed_rtt = latest_rtt 1342 rttvar = latest_rtt / 2 1343 return 1345 // min_rtt ignores ack delay. 1346 min_rtt = min(min_rtt, latest_rtt) 1347 // Limit ack_delay by max_ack_delay 1348 ack_delay = min(ack_delay, max_ack_delay) 1349 // Adjust for ack delay if plausible. 1350 adjusted_rtt = latest_rtt 1351 if (latest_rtt > min_rtt + ack_delay): 1352 adjusted_rtt = latest_rtt - ack_delay 1354 rttvar = 3/4 * rttvar + 1/4 * abs(smoothed_rtt - adjusted_rtt) 1355 smoothed_rtt = 7/8 * smoothed_rtt + 1/8 * adjusted_rtt 1357 A.7. On Packet Acknowledgment 1359 When a packet is acknowledged for the first time, the following 1360 OnPacketAcked function is called.
Note that a single ACK frame may 1361 newly acknowledge several packets. OnPacketAcked must be called once 1362 for each of these newly acknowledged packets. 1364 OnPacketAcked takes two parameters: acked_packet, which is the struct 1365 detailed in Appendix A.1.1, and the packet number space that this ACK 1366 frame was sent for. 1368 Pseudocode for OnPacketAcked follows: 1370 OnPacketAcked(acked_packet, pn_space): 1371 if (acked_packet.in_flight): 1372 OnPacketAckedCC(acked_packet) 1373 sent_packets[pn_space].remove(acked_packet.packet_number) 1375 A.8. Setting the Loss Detection Timer 1377 QUIC loss detection uses a single timer for all timeout loss 1378 detection. The duration of the timer is based on the timer's mode, 1379 which is set in the packet and timer events further below. The 1380 function SetLossDetectionTimer defined below shows how the single 1381 timer is set. 1383 This algorithm may result in the timer being set in the past, 1384 particularly if timers wake up late. Timers set in the past SHOULD 1385 fire immediately. 1387 Pseudocode for SetLossDetectionTimer follows: 1389 // Returns the earliest loss_time and the packet number 1390 // space it's from. Returns 0 if all times are 0. 1391 GetEarliestLossTime(): 1392 time = loss_time[Initial] 1393 space = Initial 1394 for pn_space in [ Handshake, ApplicationData ]: 1395 if loss_time[pn_space] != 0 && 1396 (time == 0 || loss_time[pn_space] < time): 1397 time = loss_time[pn_space]; 1398 space = pn_space 1399 return time, space 1401 SetLossDetectionTimer(): 1402 loss_time, _ = GetEarliestLossTime() 1403 if (loss_time != 0): 1404 // Time threshold loss detection. 1405 loss_detection_timer.update(loss_time) 1406 return 1408 if (has unacknowledged crypto data 1409 || endpoint is client without 1-RTT keys): 1410 // Crypto retransmission timer. 
       if (smoothed_rtt == 0):
         timeout = 2 * kInitialRtt
       else:
         timeout = 2 * smoothed_rtt
       timeout = max(timeout, kGranularity)
       timeout = timeout * (2 ^ crypto_count)
       loss_detection_timer.update(
         time_of_last_sent_crypto_packet + timeout)
       return

     // Don't arm timer if there are no ack-eliciting packets
     // in flight.
     if (no ack-eliciting packets in flight):
       loss_detection_timer.cancel()
       return

     // Calculate PTO duration
     timeout =
       smoothed_rtt + max(4 * rttvar, kGranularity) + max_ack_delay
     timeout = timeout * (2 ^ pto_count)

     loss_detection_timer.update(
       time_of_last_sent_ack_eliciting_packet + timeout)

A.9. On Timeout

   When the loss detection timer expires, the timer's mode determines
   the action to be performed.

   Pseudocode for OnLossDetectionTimeout follows:

   OnLossDetectionTimeout():
     loss_time, pn_space = GetEarliestLossTime()
     if (loss_time != 0):
       // Time threshold loss detection.
       DetectLostPackets(pn_space)
     // Retransmit crypto data if no packets were lost
     // and there is crypto data to retransmit.
     else if (has unacknowledged crypto data):
       // Crypto retransmission timeout.
       RetransmitUnackedCryptoData()
       crypto_count++
     else if (endpoint is client without 1-RTT keys):
       // Client sends an anti-deadlock packet: Initial is padded
       // to earn more anti-amplification credit,
       // a Handshake packet proves address ownership.
       if (has Handshake keys):
         SendOneHandshakePacket()
       else:
         SendOnePaddedInitialPacket()
       crypto_count++
     else:
       // PTO. Send new data if available, else retransmit old data.
       // If neither is available, send a single PING frame.
       SendOneOrTwoPackets()
       pto_count++

     SetLossDetectionTimer()

A.10. Detecting Lost Packets

   DetectLostPackets is called every time an ACK is received, and from
   OnLossDetectionTimeout when the time threshold loss detection timer
   expires. It operates on the sent_packets for the given packet number
   space.

   Pseudocode for DetectLostPackets follows:

   DetectLostPackets(pn_space):
     assert(largest_acked_packet[pn_space] != infinite)
     loss_time[pn_space] = 0
     lost_packets = {}
     loss_delay = kTimeThreshold * max(latest_rtt, smoothed_rtt)

     // Minimum time of kGranularity before packets are deemed lost.
     loss_delay = max(loss_delay, kGranularity)

     // Packets sent before this time are deemed lost.
     lost_send_time = now() - loss_delay

     foreach unacked in sent_packets[pn_space]:
       if (unacked.packet_number > largest_acked_packet[pn_space]):
         continue

       // Mark packet as lost, or set time when it should be marked.
       if (unacked.time_sent <= lost_send_time ||
           largest_acked_packet[pn_space] >=
             unacked.packet_number + kPacketThreshold):
         sent_packets[pn_space].remove(unacked.packet_number)
         if (unacked.in_flight):
           lost_packets.insert(unacked)
       else:
         if (loss_time[pn_space] == 0):
           loss_time[pn_space] = unacked.time_sent + loss_delay
         else:
           loss_time[pn_space] = min(loss_time[pn_space],
                                     unacked.time_sent + loss_delay)

     // Inform the congestion controller of lost packets and
     // let it decide whether to retransmit immediately.
     if (!lost_packets.empty()):
       OnPacketsLost(lost_packets)

Appendix B. Congestion Control Pseudocode

   We now describe an example implementation of the congestion
   controller described in Section 7.

B.1. Constants of interest

   Constants used in congestion control are based on a combination of
   RFCs, papers, and common practice. Some may need to be changed or
   negotiated in order to better suit a variety of environments.

   kMaxDatagramSize: The sender's maximum payload size. Does not
      include UDP or IP overhead.
      The maximum packet size is used for calculating initial and
      minimum congestion windows. The RECOMMENDED value is 1200 bytes.

   kInitialWindow: Default limit on the initial amount of data in
      flight, in bytes. Taken from [RFC6928], but increased slightly to
      account for the smaller 8-byte overhead of UDP versus 20 bytes
      for TCP. The RECOMMENDED value is the minimum of 10 *
      kMaxDatagramSize and max(2 * kMaxDatagramSize, 14720).

   kMinimumWindow: Minimum congestion window in bytes. The RECOMMENDED
      value is 2 * kMaxDatagramSize.

   kLossReductionFactor: Reduction in congestion window when a new loss
      event is detected. The RECOMMENDED value is 0.5.

   kPersistentCongestionThreshold: Period of time for persistent
      congestion to be established, specified as a PTO multiplier. The
      rationale for this threshold is to enable a sender to use initial
      PTOs for aggressive probing, as TCP does with Tail Loss Probe
      (TLP) [TLP] [RACK], before establishing persistent congestion, as
      TCP does with a Retransmission Timeout (RTO) [RFC5681]. The
      RECOMMENDED value for kPersistentCongestionThreshold is 3, which
      is approximately equivalent to having two TLPs before an RTO in
      TCP.

B.2. Variables of interest

   Variables required to implement the congestion control mechanisms are
   described in this section.

   ecn_ce_counter: The highest value reported for the ECN-CE counter by
      the peer in an ACK frame. This variable is used to detect
      increases in the reported ECN-CE counter.

   bytes_in_flight: The sum of the size in bytes of all sent packets
      that contain at least one ack-eliciting or PADDING frame, and have
      not been acked or declared lost. The size does not include IP or
      UDP overhead, but does include the QUIC header and AEAD overhead.
      Packets only containing ACK frames do not count towards
      bytes_in_flight to ensure congestion control does not impede
      congestion feedback.

   congestion_window: Maximum number of bytes-in-flight that may be
      sent.

   congestion_recovery_start_time: The time when QUIC first detects
      congestion due to loss or ECN, causing it to enter congestion
      recovery. When a packet sent after this time is acknowledged,
      QUIC exits congestion recovery.

   ssthresh: Slow start threshold in bytes. When the congestion window
      is below ssthresh, the mode is slow start and the window grows by
      the number of bytes acknowledged.

B.3. Initialization

   At the beginning of the connection, initialize the congestion control
   variables as follows:

      congestion_window = kInitialWindow
      bytes_in_flight = 0
      congestion_recovery_start_time = 0
      ssthresh = infinite
      ecn_ce_counter = 0

B.4. On Packet Sent

   Whenever a packet is sent, and it contains non-ACK frames, the packet
   increases bytes_in_flight.

   OnPacketSentCC(bytes_sent):
     bytes_in_flight += bytes_sent

B.5. On Packet Acknowledgment

   Invoked from loss detection's OnPacketAcked and is supplied with the
   acked_packet from sent_packets.

   InCongestionRecovery(sent_time):
     return sent_time <= congestion_recovery_start_time

   OnPacketAckedCC(acked_packet):
     // Remove from bytes_in_flight.
     bytes_in_flight -= acked_packet.size
     if (InCongestionRecovery(acked_packet.time_sent)):
       // Do not increase congestion window in recovery period.
       return
     if (IsAppLimited()):
       // Do not increase congestion_window if application
       // limited.
       return
     if (congestion_window < ssthresh):
       // Slow start.
       congestion_window += acked_packet.size
     else:
       // Congestion avoidance.
       congestion_window += kMaxDatagramSize * acked_packet.size
           / congestion_window

B.6. On New Congestion Event

   Invoked from ProcessECN and OnPacketsLost when a new congestion event
   is detected. May start a new recovery period and reduce the
   congestion window.

   CongestionEvent(sent_time):
     // Start a new congestion event if packet was sent after the
     // start of the previous congestion recovery period.
     if (!InCongestionRecovery(sent_time)):
       congestion_recovery_start_time = now()
       congestion_window *= kLossReductionFactor
       congestion_window = max(congestion_window, kMinimumWindow)
       ssthresh = congestion_window

B.7. Process ECN Information

   Invoked when an ACK frame with an ECN section is received from the
   peer.

   ProcessECN(ack):
     // If the ECN-CE counter reported by the peer has increased,
     // this could be a new congestion event.
     if (ack.ce_counter > ecn_ce_counter):
       ecn_ce_counter = ack.ce_counter
       CongestionEvent(sent_packets[ack.largest_acked].time_sent)

B.8. On Packets Lost

   Invoked from DetectLostPackets when packets are deemed lost.

   InPersistentCongestion(largest_lost_packet):
     pto = smoothed_rtt + max(4 * rttvar, kGranularity) +
       max_ack_delay
     congestion_period = pto * kPersistentCongestionThreshold
     // Determine if all packets in the time period before the
     // newest lost packet, including the edges, are marked
     // lost
     return AreAllPacketsLost(largest_lost_packet,
                              congestion_period)

   OnPacketsLost(lost_packets):
     // Remove lost packets from bytes_in_flight.
     for (lost_packet : lost_packets):
       bytes_in_flight -= lost_packet.size
     largest_lost_packet = lost_packets.last()
     CongestionEvent(largest_lost_packet.time_sent)

     // Collapse congestion window if persistent congestion
     if (InPersistentCongestion(largest_lost_packet)):
       congestion_window = kMinimumWindow

Appendix C. Change Log

      *RFC Editor's Note:* Please remove this section prior to
      publication of a final version of this document.

   Issue and pull request numbers are listed with a leading octothorp.

C.1. Since draft-ietf-quic-recovery-20

   o Path validation can be used as initial RTT value (#2644, #2687)

   o max_ack_delay transport parameter defaults to 0 (#2638, #2646)

   o Ack Delay only measures intentional delays induced by the
     implementation (#2596, #2786)

C.2. Since draft-ietf-quic-recovery-19

   o Change kPersistentThreshold from an exponent to a multiplier
     (#2557)

   o Send a PING if the PTO timer fires and there's nothing to send
     (#2624)

   o Set loss delay to at least kGranularity (#2617)

   o Merge application limited and sending after idle sections. Always
     limit burst size instead of requiring resetting CWND to initial
     CWND after idle (#2605)

   o Rewrite RTT estimation, allow RTT samples where a newly acked
     packet is ack-eliciting but the largest_acked is not (#2592)

   o Don't arm the handshake timer if there is no handshake data
     (#2590)

   o Clarify that the time threshold loss alarm takes precedence over
     the crypto handshake timer (#2590, #2620)

   o Change initial RTT to 500ms to align with RFC6298 (#2184)

C.3. Since draft-ietf-quic-recovery-18

   o Change IW byte limit to 14720 from 14600 (#2494)

   o Update PTO calculation to match RFC6298 (#2480, #2489, #2490)

   o Improve loss detection's description of multiple packet number
     spaces and pseudocode (#2485, #2451, #2417)

   o Declare persistent congestion even if non-probe packets are sent
     and don't make persistent congestion more aggressive than RTO
     verified was (#2365, #2244)

   o Move pseudocode to the appendices (#2408)

   o What to send on multiple PTOs (#2380)

C.4. Since draft-ietf-quic-recovery-17

   o After Probe Timeout discard in-flight packets or send another
     (#2212, #1965)

   o Endpoints discard initial keys as soon as handshake keys are
     available (#1951, #2045)

   o 0-RTT state is discarded when 0-RTT is rejected (#2300)

   o Loss detection timer is cancelled when no ack-eliciting frames are
     in flight (#2117, #2093)

   o Packets are declared lost if they are in flight (#2104)

   o After becoming idle, either pace packets or reset the congestion
     controller (#2138, #2187)

   o Process ECN counts before marking packets lost (#2142)

   o Mark packets lost before resetting crypto_count and pto_count
     (#2208, #2209)

   o Congestion and loss recovery state are discarded when keys are
     discarded (#2327)

C.5. Since draft-ietf-quic-recovery-16

   o Unify TLP and RTO into a single PTO; eliminate min RTO, min TLP
     and min crypto timeouts; eliminate timeout validation (#2114,
     #2166, #2168, #1017)

   o Redefine congestion avoidance in terms of when the period starts
     (#1928, #1930)

   o Document what needs to be tracked for packets that are in flight
     (#765, #1724, #1939)

   o Integrate both time and packet thresholds into loss detection
     (#1969, #1212, #934, #1974)

   o Reduce congestion window after idle, unless pacing is used (#2007,
     #2023)

   o Disable RTT calculation for packets that don't elicit
     acknowledgment (#2060, #2078)

   o Limit ack_delay by max_ack_delay (#2060, #2099)

   o Initial keys are discarded once Handshake keys are available
     (#1951, #2045)

   o Reorder ECN and loss detection in pseudocode (#2142)

   o Only cancel loss detection timer if no ack-eliciting packets are
     in flight (#2093, #2117)

C.6. Since draft-ietf-quic-recovery-14

   o Used max_ack_delay from transport params (#1796, #1782)

   o Merge ACK and ACK_ECN (#1783)

C.7. Since draft-ietf-quic-recovery-13

   o Corrected the lack of ssthresh reduction in CongestionEvent
     pseudocode (#1598)

   o Considerations for ECN spoofing (#1426, #1626)

   o Clarifications for PADDING and congestion control (#837, #838,
     #1517, #1531, #1540)

   o Reduce early retransmission timer to RTT/8 (#945, #1581)

   o Packets are declared lost after an RTO is verified (#935, #1582)

C.8. Since draft-ietf-quic-recovery-12

   o Changes to manage separate packet number spaces and encryption
     levels (#1190, #1242, #1413, #1450)

   o Added ECN feedback mechanisms and handling; new ACK_ECN frame
     (#804, #805, #1372)

C.9. Since draft-ietf-quic-recovery-11

   No significant changes.

C.10. Since draft-ietf-quic-recovery-10

   o Improved text on ack generation (#1139, #1159)

   o Make references to TCP recovery mechanisms informational (#1195)

   o Define time_of_last_sent_handshake_packet (#1171)

   o Added signal from TLS that the data it includes needs to be sent
     in a Retry packet (#1061, #1199)

   o Minimum RTT (min_rtt) is initialized with an infinite value
     (#1169)

C.11. Since draft-ietf-quic-recovery-09

   No significant changes.

C.12. Since draft-ietf-quic-recovery-08

   o Clarified pacing and RTO (#967, #977)

C.13. Since draft-ietf-quic-recovery-07

   o Include Ack Delay in RTO (and TLP) computations (#981)

   o Ack Delay in SRTT computation (#961)

   o Default RTT and Slow Start (#590)

   o Many editorial fixes.

C.14. Since draft-ietf-quic-recovery-06

   No significant changes.

C.15. Since draft-ietf-quic-recovery-05

   o Add more congestion control text (#776)

C.16. Since draft-ietf-quic-recovery-04

   No significant changes.

C.17. Since draft-ietf-quic-recovery-03

   No significant changes.

C.18. Since draft-ietf-quic-recovery-02

   o Integrate F-RTO (#544, #409)

   o Add congestion control (#545, #395)

   o Require connection abort if a skipped packet was acknowledged
     (#415)

   o Simplify RTO calculations (#142, #417)

C.19. Since draft-ietf-quic-recovery-01

   o Overview added to loss detection

   o Changed initial default RTT to 100ms

   o Added time-based loss detection and fixed early retransmit

   o Clarified loss recovery for handshake packets

   o Fixed references and made TCP references informative

C.20. Since draft-ietf-quic-recovery-00

   o Improved description of constants and ACK behavior

C.21. Since draft-iyengar-quic-loss-recovery-01

   o Adopted as base for draft-ietf-quic-recovery

   o Updated authors/editors list

   o Added table of contents

Acknowledgments

Authors' Addresses

   Jana Iyengar (editor)
   Fastly

   Email: jri.ietf@gmail.com


   Ian Swett (editor)
   Google

   Email: ianswett@google.com