QUIC                                                     J. Iyengar, Ed.
Internet-Draft                                                    Fastly
Intended status: Standards Track                           I. Swett, Ed.
Expires: 25 July 2020                                             Google
                                                         22 January 2020

               QUIC Loss Detection and Congestion Control
                      draft-ietf-quic-recovery-25

Abstract

   This document describes loss detection and congestion control
   mechanisms for QUIC.

Note to Readers

   Discussion of this draft takes place on the QUIC working group
   mailing list (quic@ietf.org), which is archived at
   https://mailarchive.ietf.org/arch/search/?email_list=quic.

   Working Group information can be found at https://github.com/quicwg;
   source code and issues list for this draft can be found at
   https://github.com/quicwg/base-drafts/labels/-recovery.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 25 July 2020.

Copyright Notice

   Copyright (c) 2020 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.
   Please review these documents carefully, as they describe your rights
   and restrictions with respect to this document.  Code Components
   extracted from this document must include Simplified BSD License text
   as described in Section 4.e of the Trust Legal Provisions and are
   provided without warranty as described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Conventions and Definitions
   3.  Design of the QUIC Transmission Machinery
     3.1.  Relevant Differences Between QUIC and TCP
       3.1.1.  Separate Packet Number Spaces
       3.1.2.  Monotonically Increasing Packet Numbers
       3.1.3.  Clearer Loss Epoch
       3.1.4.  No Reneging
       3.1.5.  More ACK Ranges
       3.1.6.  Explicit Correction For Delayed Acknowledgements
   4.  Estimating the Round-Trip Time
     4.1.  Generating RTT samples
     4.2.  Estimating min_rtt
     4.3.  Estimating smoothed_rtt and rttvar
   5.  Loss Detection
     5.1.  Acknowledgement-based Detection
       5.1.1.  Packet Threshold
       5.1.2.  Time Threshold
     5.2.  Probe Timeout
       5.2.1.  Computing PTO
     5.3.  Handshakes and New Paths
       5.3.1.  Sending Probe Packets
       5.3.2.  Loss Detection
     5.4.  Handling Retry Packets
     5.5.  Discarding Keys and Packet State
   6.  Congestion Control
     6.1.  Explicit Congestion Notification
     6.2.  Slow Start
     6.3.  Congestion Avoidance
     6.4.  Recovery Period
     6.5.  Ignoring Loss of Undecryptable Packets
     6.6.  Probe Timeout
     6.7.  Persistent Congestion
     6.8.  Pacing
     6.9.  Under-utilizing the Congestion Window
   7.  Security Considerations
     7.1.  Congestion Signals
     7.2.  Traffic Analysis
     7.3.  Misreporting ECN Markings
   8.  IANA Considerations
   9.  References
     9.1.  Normative References
     9.2.  Informative References
   Appendix A.  Loss Recovery Pseudocode
     A.1.  Tracking Sent Packets
       A.1.1.  Sent Packet Fields
     A.2.  Constants of interest
     A.3.  Variables of interest
     A.4.  Initialization
     A.5.  On Sending a Packet
     A.6.  On Receiving an Acknowledgment
     A.7.  On Packet Acknowledgment
     A.8.  Setting the Loss Detection Timer
     A.9.  On Timeout
     A.10. Detecting Lost Packets
   Appendix B.  Congestion Control Pseudocode
     B.1.  Constants of interest
     B.2.  Variables of interest
     B.3.  Initialization
     B.4.  On Packet Sent
     B.5.  On Packet Acknowledgement
     B.6.  On New Congestion Event
     B.7.  Process ECN Information
     B.8.  On Packets Lost
   Appendix C.  Change Log
     C.1.  Since draft-ietf-quic-recovery-24
     C.2.  Since draft-ietf-quic-recovery-23
     C.3.  Since draft-ietf-quic-recovery-22
     C.4.  Since draft-ietf-quic-recovery-21
     C.5.  Since draft-ietf-quic-recovery-20
     C.6.  Since draft-ietf-quic-recovery-19
     C.7.  Since draft-ietf-quic-recovery-18
     C.8.  Since draft-ietf-quic-recovery-17
     C.9.  Since draft-ietf-quic-recovery-16
     C.10. Since draft-ietf-quic-recovery-14
     C.11. Since draft-ietf-quic-recovery-13
     C.12. Since draft-ietf-quic-recovery-12
     C.13. Since draft-ietf-quic-recovery-11
     C.14. Since draft-ietf-quic-recovery-10
     C.15. Since draft-ietf-quic-recovery-09
     C.16. Since draft-ietf-quic-recovery-08
     C.17. Since draft-ietf-quic-recovery-07
     C.18. Since draft-ietf-quic-recovery-06
     C.19. Since draft-ietf-quic-recovery-05
     C.20. Since draft-ietf-quic-recovery-04
     C.21. Since draft-ietf-quic-recovery-03
     C.22. Since draft-ietf-quic-recovery-02
     C.23. Since draft-ietf-quic-recovery-01
     C.24. Since draft-ietf-quic-recovery-00
     C.25. Since draft-iyengar-quic-loss-recovery-01
   Appendix D.  Contributors
   Acknowledgments
   Authors' Addresses

1.  Introduction

   QUIC is a new multiplexed and secure transport atop UDP.  QUIC builds
   on decades of transport and security experience, and implements
   mechanisms that make it attractive as a modern general-purpose
   transport.  The QUIC protocol is described in [QUIC-TRANSPORT].

   QUIC implements the spirit of existing TCP congestion control and
   loss recovery mechanisms, described in RFCs, various Internet-drafts,
   and also those prevalent in the Linux TCP implementation.  This
   document describes QUIC congestion control and loss recovery, and
   where applicable, attributes the TCP equivalent in RFCs,
   Internet-drafts, academic papers, and/or TCP implementations.

2.  Conventions and Definitions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described in BCP
   14 [RFC2119] [RFC8174] when, and only when, they appear in all
   capitals, as shown here.

   Definitions of terms that are used in this document:

   ACK-only:  Any packet containing only one or more ACK frame(s).

   In-flight:  Packets are considered in-flight when they have been sent
      and are not ACK-only, and they are not acknowledged, declared
      lost, or abandoned along with old keys.

   Ack-eliciting Frames:  All frames other than ACK, PADDING, and
      CONNECTION_CLOSE are considered ack-eliciting.

   Ack-eliciting Packets:  Packets that contain ack-eliciting frames
      elicit an ACK from the receiver within the maximum ack delay and
      are called ack-eliciting packets.

3.  Design of the QUIC Transmission Machinery

   All transmissions in QUIC are sent with a packet-level header, which
   indicates the encryption level and includes a packet sequence number
   (referred to below as a packet number).  The encryption level
   indicates the packet number space, as described in [QUIC-TRANSPORT].
   Packet numbers never repeat within a packet number space for the
   lifetime of a connection.  Packet numbers are sent in monotonically
   increasing order within a space, preventing ambiguity.

   This design obviates the need for disambiguating between
   transmissions and retransmissions and eliminates significant
   complexity from QUIC's interpretation of TCP loss detection
   mechanisms.

   QUIC packets can contain multiple frames of different types.  The
   recovery mechanisms ensure that data and frames that need reliable
   delivery are acknowledged or declared lost and sent in new packets as
   necessary.  The types of frames contained in a packet affect recovery
   and congestion control logic:

   *  All packets are acknowledged, though packets that contain no
      ack-eliciting frames are only acknowledged along with
      ack-eliciting packets.

   *  Long header packets that contain CRYPTO frames are critical to the
      performance of the QUIC handshake and use shorter timers for
      acknowledgement.

   *  Packets containing frames besides ACK or CONNECTION_CLOSE frames
      count toward congestion control limits and are considered
      in-flight.

   *  PADDING frames cause packets to contribute toward bytes in flight
      without directly causing an acknowledgment to be sent.

3.1.  Relevant Differences Between QUIC and TCP

   Readers familiar with TCP's loss detection and congestion control
   will find algorithms here that parallel well-known TCP ones.
   Protocol differences between QUIC and TCP, however, contribute to
   algorithmic differences.  These protocol differences are briefly
   described below.

3.1.1.  Separate Packet Number Spaces

   QUIC uses a separate packet number space for each encryption level,
   except that 0-RTT and all generations of 1-RTT keys share the same
   packet number space.  Separate packet number spaces ensure that
   acknowledgement of packets sent with one level of encryption will not
   cause spurious retransmission of packets sent with a different
   encryption level.  Congestion control and round-trip time (RTT)
   measurement are unified across packet number spaces.

3.1.2.  Monotonically Increasing Packet Numbers

   TCP conflates transmission order at the sender with delivery order at
   the receiver, which results in retransmissions of the same data
   carrying the same sequence number, and consequently leads to
   "retransmission ambiguity".  QUIC separates the two: QUIC uses a
   packet number to indicate transmission order, and any application
   data is sent in one or more streams, with delivery order determined
   by stream offsets encoded within STREAM frames.

   QUIC's packet number is strictly increasing within a packet number
   space, and directly encodes transmission order.  A higher packet
   number signifies that the packet was sent later, and a lower packet
   number signifies that the packet was sent earlier.
   When a packet containing ack-eliciting frames is detected lost, QUIC
   rebundles necessary frames in a new packet with a new packet number,
   removing ambiguity about which packet is acknowledged when an ACK is
   received.  Consequently, more accurate RTT measurements can be made,
   spurious retransmissions are trivially detected, and mechanisms such
   as Fast Retransmit can be applied universally, based only on packet
   number.

   This design point significantly simplifies loss detection mechanisms
   for QUIC.  Most TCP mechanisms implicitly attempt to infer
   transmission ordering based on TCP sequence numbers - a non-trivial
   task, especially when TCP timestamps are not available.

3.1.3.  Clearer Loss Epoch

   QUIC starts a loss epoch when a packet is lost and ends one when any
   packet sent after the epoch starts is acknowledged.  TCP waits for
   the gap in the sequence number space to be filled, and so if a
   segment is lost multiple times in a row, the loss epoch may not end
   for several round trips.  Because both should reduce their congestion
   windows only once per epoch, QUIC will do it once for every round
   trip that experiences loss, while TCP may only do it once across
   multiple round trips.

3.1.4.  No Reneging

   QUIC ACKs contain information that is similar to TCP SACK, but QUIC
   does not allow any acked packet to be reneged, greatly simplifying
   implementations on both sides and reducing memory pressure on the
   sender.

3.1.5.  More ACK Ranges

   QUIC supports many ACK ranges, as opposed to TCP's 3 SACK ranges.  In
   high loss environments, this speeds recovery, reduces spurious
   retransmits, and ensures forward progress without relying on
   timeouts.

3.1.6.  Explicit Correction For Delayed Acknowledgements

   QUIC endpoints measure the delay incurred between when a packet is
   received and when the corresponding acknowledgment is sent, allowing
   a peer to maintain a more accurate round-trip time estimate (see
   Section 13.2 of [QUIC-TRANSPORT]).

4.  Estimating the Round-Trip Time

   At a high level, an endpoint measures the time from when a packet was
   sent to when it is acknowledged as a round-trip time (RTT) sample.
   The endpoint uses RTT samples and peer-reported host delays (see
   Section 13.2 of [QUIC-TRANSPORT]) to generate a statistical
   description of the network path's RTT.  An endpoint computes the
   following three values for each path: the minimum value observed over
   the lifetime of the path (min_rtt), an exponentially-weighted moving
   average (smoothed_rtt), and the mean deviation (referred to as
   "variation" in the rest of this document) in the observed RTT samples
   (rttvar).

4.1.  Generating RTT samples

   An endpoint generates an RTT sample on receiving an ACK frame that
   meets the following two conditions:

   *  the largest acknowledged packet number is newly acknowledged, and

   *  at least one of the newly acknowledged packets was ack-eliciting.

   The RTT sample, latest_rtt, is generated as the time elapsed since
   the largest acknowledged packet was sent:

   latest_rtt = ack_time - send_time_of_largest_acked

   An RTT sample is generated using only the largest acknowledged packet
   in the received ACK frame.  This is because a peer reports ACK delays
   for only the largest acknowledged packet in an ACK frame.  While the
   reported ACK delay is not used by the RTT sample measurement, it is
   used to adjust the RTT sample in subsequent computations of
   smoothed_rtt and rttvar (Section 4.3).
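
   The two conditions for generating an RTT sample can be sketched as
   follows.  This is a minimal illustration and not part of the
   specification: the SentPacket record, the rtt_sample function, and
   their parameters are hypothetical names chosen for this example, and
   all times are assumed to be in seconds.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass
class SentPacket:
    """Hypothetical record kept for each sent, not yet acknowledged packet."""
    time_sent: float      # send time, in seconds
    ack_eliciting: bool   # True if the packet carried ack-eliciting frames


def rtt_sample(sent: Dict[int, SentPacket],
               acked_numbers: Iterable[int],
               largest_acked: int,
               ack_time: float) -> Optional[float]:
    """Return latest_rtt for one ACK frame, or None if no sample is generated.

    `sent` maps packet numbers to packets that are still unacknowledged,
    so a packet number found in `sent` is newly acknowledged by this frame.
    """
    newly_acked = [sent[pn] for pn in acked_numbers if pn in sent]
    # Condition 1: the largest acknowledged packet is newly acknowledged.
    if largest_acked not in sent:
        return None
    # Condition 2: at least one newly acknowledged packet was ack-eliciting.
    if not any(p.ack_eliciting for p in newly_acked):
        return None
    # latest_rtt = ack_time - send_time_of_largest_acked
    return ack_time - sent[largest_acked].time_sent
```

   The sketch only decides whether a sample exists and computes
   latest_rtt; removing acknowledged packets from the map and applying
   the ACK delay adjustment are left to the caller.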

   To avoid generating multiple RTT samples for a single packet, an ACK
   frame SHOULD NOT be used to update RTT estimates if it does not newly
   acknowledge the largest acknowledged packet.

   An RTT sample MUST NOT be generated on receiving an ACK frame that
   does not newly acknowledge at least one ack-eliciting packet.  A peer
   does not send an ACK frame on receiving only non-ack-eliciting
   packets, so an ACK frame that is subsequently sent can include an
   arbitrarily large Ack Delay field.  Ignoring such ACK frames avoids
   complications in subsequent smoothed_rtt and rttvar computations.

   A sender might generate multiple RTT samples per RTT when multiple
   ACK frames are received within an RTT.  As suggested in [RFC6298],
   doing so might result in inadequate history in smoothed_rtt and
   rttvar.  Ensuring that RTT estimates retain sufficient history is an
   open research question.

4.2.  Estimating min_rtt

   min_rtt is the minimum RTT observed for a given network path.
   min_rtt is set to the latest_rtt on the first RTT sample, and to the
   lesser of min_rtt and latest_rtt on subsequent samples.  In this
   document, min_rtt is used by loss detection to reject implausibly
   small RTT samples.

   An endpoint uses only locally observed times in computing the min_rtt
   and does not adjust for ACK delays reported by the peer.  Doing so
   allows the endpoint to set a lower bound for the smoothed_rtt based
   entirely on what it observes (see Section 4.3), and limits potential
   underestimation due to erroneously-reported delays by the peer.

   The RTT for a network path may change over time.  If a path's actual
   RTT decreases, the min_rtt will adapt immediately on the first low
   sample.  If the path's actual RTT increases, the min_rtt will not
   adapt to it, allowing future RTT samples that are smaller than the
   new RTT to be included in smoothed_rtt.

4.3.  Estimating smoothed_rtt and rttvar

   smoothed_rtt is an exponentially-weighted moving average of an
   endpoint's RTT samples, and rttvar is the variation in the RTT
   samples, estimated using a mean variation.

   The calculation of smoothed_rtt uses path latency after adjusting RTT
   samples for ACK delays.  For packets sent in the ApplicationData
   packet number space, a peer limits any delay in sending an
   acknowledgement for an ack-eliciting packet to no greater than the
   value it advertised in the max_ack_delay transport parameter.
   Consequently, when a peer reports an Ack Delay that is greater than
   its max_ack_delay, the delay is attributed to reasons out of the
   peer's control, such as scheduler latency at the peer or loss of
   previous ACK frames.  Any delays beyond the peer's max_ack_delay are
   therefore considered effectively part of path delay and incorporated
   into the smoothed_rtt estimate.

   When adjusting an RTT sample using peer-reported acknowledgement
   delays, an endpoint:

   *  MUST ignore the Ack Delay field of the ACK frame for packets sent
      in the Initial and Handshake packet number space.

   *  MUST use the lesser of the value reported in Ack Delay field of
      the ACK frame and the peer's max_ack_delay transport parameter.

   *  MUST NOT apply the adjustment if the resulting RTT sample is
      smaller than the min_rtt.  This limits the underestimation that a
      misreporting peer can cause to the smoothed_rtt.

   On the first RTT sample for a network path, the smoothed_rtt is set
   to the latest_rtt.

   smoothed_rtt and rttvar are computed as follows, similar to
   [RFC6298].
   On the first RTT sample for a network path:

   smoothed_rtt = latest_rtt
   rttvar = latest_rtt / 2

   On subsequent RTT samples, smoothed_rtt and rttvar evolve as follows:

   ack_delay = min(Ack Delay in ACK Frame, max_ack_delay)
   adjusted_rtt = latest_rtt
   if (min_rtt + ack_delay < latest_rtt):
     adjusted_rtt = latest_rtt - ack_delay
   smoothed_rtt = 7/8 * smoothed_rtt + 1/8 * adjusted_rtt
   rttvar_sample = abs(smoothed_rtt - adjusted_rtt)
   rttvar = 3/4 * rttvar + 1/4 * rttvar_sample

5.  Loss Detection

   QUIC senders use acknowledgements to detect lost packets, and a probe
   timeout (Section 5.2) to ensure acknowledgements are received.  This
   section provides a description of these algorithms.

   If a packet is lost, the QUIC transport needs to recover from that
   loss, such as by retransmitting the data, sending an updated frame,
   or abandoning the frame.  For more information, see Section 13.3 of
   [QUIC-TRANSPORT].

5.1.  Acknowledgement-based Detection

   Acknowledgement-based loss detection implements the spirit of TCP's
   Fast Retransmit [RFC5681], Early Retransmit [RFC5827], FACK [FACK],
   SACK loss recovery [RFC6675], and RACK [RACK].  This section provides
   an overview of how these algorithms are implemented in QUIC.

   A packet is declared lost if it meets all the following conditions:

   *  The packet is unacknowledged, in-flight, and was sent prior to an
      acknowledged packet.

   *  Either its packet number is kPacketThreshold smaller than an
      acknowledged packet (Section 5.1.1), or it was sent long enough in
      the past (Section 5.1.2).

   The acknowledgement indicates that a packet sent later was delivered,
   and the packet and time thresholds provide some tolerance for packet
   reordering.

   Spuriously declaring packets as lost leads to unnecessary
   retransmissions and may result in degraded performance due to the
   actions of the congestion controller upon detecting loss.
   Implementations that detect spurious retransmissions and increase the
   reordering threshold in packets or time MAY choose to start with
   smaller initial reordering thresholds to minimize recovery latency.

5.1.1.  Packet Threshold

   The RECOMMENDED initial value for the packet reordering threshold
   (kPacketThreshold) is 3, based on best practices for TCP loss
   detection [RFC5681] [RFC6675].  Implementations SHOULD NOT use a
   packet threshold less than 3, to keep in line with TCP [RFC5681].

   Some networks may exhibit higher degrees of reordering, causing a
   sender to detect spurious losses.  Implementers MAY use algorithms
   developed for TCP, such as TCP-NCR [RFC4653], to improve QUIC's
   reordering resilience.

5.1.2.  Time Threshold

   Once a later packet within the same packet number space has been
   acknowledged, an endpoint SHOULD declare an earlier packet lost if it
   was sent a threshold amount of time in the past.  To avoid declaring
   packets as lost too early, this time threshold MUST be set to at
   least kGranularity.  The time threshold is:

   max(kTimeThreshold * max(smoothed_rtt, latest_rtt), kGranularity)

   If packets sent prior to the largest acknowledged packet cannot yet
   be declared lost, then a timer SHOULD be set for the remaining time.

   Using max(smoothed_rtt, latest_rtt) protects against the following
   two cases:

   *  the latest RTT sample is lower than the smoothed RTT, perhaps due
      to reordering where the acknowledgement encountered a shorter
      path;

   *  the latest RTT sample is higher than the smoothed RTT, perhaps due
      to a sustained increase in the actual RTT, but the smoothed RTT
      has not yet caught up.

   The RECOMMENDED time threshold (kTimeThreshold), expressed as a
   round-trip time multiplier, is 9/8.

   Implementations MAY experiment with absolute thresholds, thresholds
   from previous connections, adaptive thresholds, or including RTT
   variation.
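
   The packet and time thresholds of Section 5.1.1 and Section 5.1.2
   can be sketched together as follows.  This is an illustration under
   stated assumptions rather than a reference implementation: the
   function names and parameters are hypothetical, times are in seconds,
   and the 1 ms value used for kGranularity is an assumed timer
   granularity.  The caller is assumed to pass only packets that are
   unacknowledged and in-flight.

```python
K_PACKET_THRESHOLD = 3    # RECOMMENDED kPacketThreshold
K_TIME_THRESHOLD = 9 / 8  # RECOMMENDED kTimeThreshold (RTT multiplier)
K_GRANULARITY = 0.001     # assumed timer granularity, in seconds


def loss_delay(smoothed_rtt: float, latest_rtt: float) -> float:
    """max(kTimeThreshold * max(smoothed_rtt, latest_rtt), kGranularity)"""
    return max(K_TIME_THRESHOLD * max(smoothed_rtt, latest_rtt),
               K_GRANULARITY)


def is_lost(packet_number: int, time_sent: float, largest_acked: int,
            now: float, smoothed_rtt: float, latest_rtt: float) -> bool:
    """Decide whether an unacknowledged, in-flight packet is declared lost."""
    # The packet must have been sent prior to an acknowledged packet.
    if packet_number >= largest_acked:
        return False
    # Packet threshold: at least kPacketThreshold behind an acked packet.
    by_packets = largest_acked - packet_number >= K_PACKET_THRESHOLD
    # Time threshold: sent at least loss_delay ago.
    by_time = now - time_sent >= loss_delay(smoothed_rtt, latest_rtt)
    return by_packets or by_time
```

   When is_lost returns False only because the time threshold has not
   yet elapsed, the remaining time (time_sent + loss_delay - now) is the
   value for which the loss detection timer would be set.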
   Smaller thresholds reduce reordering resilience and increase spurious
   retransmissions, and larger thresholds increase loss detection delay.

5.2.  Probe Timeout

   A Probe Timeout (PTO) triggers sending one or two probe datagrams
   when ack-eliciting packets are not acknowledged within the expected
   period of time or the handshake has not been completed.  A PTO
   enables a connection to recover from loss of tail packets or
   acknowledgements.

   As with loss detection, the probe timeout is per packet number space.
   The PTO algorithm used in QUIC implements the reliability functions
   of Tail Loss Probe [RACK], RTO [RFC5681], and F-RTO algorithms for
   TCP [RFC5682].  The timeout computation is based on TCP's
   retransmission timeout period [RFC6298].

5.2.1.  Computing PTO

   When an ack-eliciting packet is transmitted, the sender schedules a
   timer for the PTO period as follows:

   PTO = smoothed_rtt + max(4*rttvar, kGranularity) + max_ack_delay

   kGranularity, smoothed_rtt, rttvar, and max_ack_delay are defined in
   Appendix A.2 and Appendix A.3.

   The PTO period is the amount of time that a sender ought to wait for
   an acknowledgement of a sent packet.  This time period includes the
   estimated network round-trip time (smoothed_rtt), the variation in
   the estimate (4*rttvar), and max_ack_delay, to account for the
   maximum time by which a receiver might delay sending an
   acknowledgement.  When the PTO is armed for Initial or Handshake
   packet number spaces, the max_ack_delay is 0, as specified in
   Section 13.2.5 of [QUIC-TRANSPORT].

   The PTO value MUST be set to at least kGranularity, to avoid the
   timer expiring immediately.

   A sender computes its PTO timer every time an ack-eliciting packet is
   sent.
   When ack-eliciting packets are in-flight in multiple packet number
   spaces, the timer MUST be set for the packet number space with the
   earliest timeout, except for ApplicationData, which MUST be ignored
   until the handshake completes; see Section 4.1.1 of [QUIC-TLS].  Not
   arming the PTO for ApplicationData prioritizes completing the
   handshake and prevents the server from sending a 1-RTT packet on a
   PTO before it has the keys to process a 1-RTT packet.

   When a PTO timer expires, the PTO period MUST be set to twice its
   current value.  This exponential reduction in the sender's rate is
   important because consecutive PTOs might be caused by loss of packets
   or acknowledgements due to severe congestion.  Even when there are
   ack-eliciting packets in-flight in multiple packet number spaces, the
   exponential increase in probe timeout occurs across all spaces to
   prevent excess load on the network.  For example, a timeout in the
   Initial packet number space doubles the length of the timeout in the
   Handshake packet number space.

   The life of a connection that is experiencing consecutive PTOs is
   limited by the endpoint's idle timeout.

   The probe timer is not set if the time threshold (Section 5.1.2) loss
   detection timer is set.  The time threshold loss detection timer is
   expected to both expire earlier than the PTO and be less likely to
   spuriously retransmit data.

5.3.  Handshakes and New Paths

   The initial probe timeout for a new connection or new path SHOULD be
   set to twice the initial RTT.  Resumed connections over the same
   network SHOULD use the previous connection's final smoothed RTT value
   as the resumed connection's initial RTT.  If no previous RTT is
   available, the initial RTT SHOULD be set to 500ms, resulting in a 1
   second initial timeout as recommended in [RFC6298].
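
   The PTO computation of Section 5.2.1, the doubling on each
   consecutive expiration, and the initial timeout described above can
   be sketched as follows.  This is a minimal illustration, not a
   normative algorithm: the function names and the pto_count parameter
   (the number of consecutive PTO expirations) are hypothetical, times
   are in seconds, and the 1 ms kGranularity is an assumed value.

```python
K_GRANULARITY = 0.001  # assumed timer granularity, in seconds
K_INITIAL_RTT = 0.5    # initial RTT used when no previous RTT is available


def pto_period(smoothed_rtt: float, rttvar: float, max_ack_delay: float,
               pto_count: int = 0) -> float:
    """PTO period after `pto_count` consecutive PTO expirations.

    Base value: smoothed_rtt + max(4*rttvar, kGranularity) + max_ack_delay.
    Each expiration doubles the period, hence the factor 2 ** pto_count.
    For the Initial and Handshake spaces, pass max_ack_delay = 0.
    """
    base = smoothed_rtt + max(4 * rttvar, K_GRANULARITY) + max_ack_delay
    return base * (2 ** pto_count)


def initial_pto(initial_rtt: float = K_INITIAL_RTT) -> float:
    """Initial probe timeout before any RTT sample: twice the initial RTT."""
    return 2 * initial_rtt
```

   For example, with smoothed_rtt = 100 ms, rttvar = 10 ms, and
   max_ack_delay = 25 ms, the first PTO period is 165 ms, and after two
   consecutive expirations it is 660 ms.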

   A connection MAY use the delay between sending a PATH_CHALLENGE and
   receiving a PATH_RESPONSE to set the initial RTT (see kInitialRtt in
   Appendix A.2) for a new path, but the delay SHOULD NOT be considered
   an RTT sample.

   Until the server has validated the client's address on the path, the
   amount of data it can send is limited to three times the amount of
   data received, as specified in Section 8.1 of [QUIC-TRANSPORT].  If
   no data can be sent, then the PTO alarm MUST NOT be armed until
   datagrams have been received from the client.

   Since the server could be blocked until more packets are received
   from the client, it is the client's responsibility to send packets to
   unblock the server until it is certain that the server has finished
   its address validation (see Section 8 of [QUIC-TRANSPORT]).  That is,
   the client MUST set the probe timer if the client has not received an
   acknowledgement for one of its Handshake or 1-RTT packets.

   Prior to handshake completion, when few or no RTT samples have been
   generated, it is possible that the probe timer expiration is due to
   an incorrect RTT estimate at the client.  To allow the client to
   improve its RTT estimate, the new packet that it sends MUST be
   ack-eliciting.  If Handshake keys are available to the client, it
   MUST send a Handshake packet, and otherwise it MUST send an Initial
   packet in a UDP datagram of at least 1200 bytes.

   Initial packets and Handshake packets might never be acknowledged,
   but they are removed from bytes in flight when the Initial and
   Handshake keys are discarded.

5.3.1.  Sending Probe Packets

   When a PTO timer expires, a sender MUST send at least one
   ack-eliciting packet in the packet number space as a probe, unless
   there is no data available to send.
An endpoint MAY send up to two full-sized datagrams containing
ack-eliciting packets, to avoid an expensive consecutive PTO
expiration due to a single lost datagram or to transmit data from
multiple packet number spaces.

In addition to sending data in the packet number space for which the
timer expired, the sender SHOULD send ack-eliciting packets from
other packet number spaces with in-flight data, coalescing packets if
possible.

When the PTO timer expires, and there is new or previously sent
unacknowledged data, it MUST be sent.

It is possible that the sender has no new or previously sent data to
send. As an example, consider the following sequence of events: new
application data is sent in a STREAM frame, deemed lost, then
retransmitted in a new packet, and then the original transmission is
acknowledged. When there is no data to send, the sender SHOULD send
a PING or other ack-eliciting frame in a single packet, re-arming the
PTO timer.

Alternatively, instead of sending an ack-eliciting packet, the sender
MAY mark any packets still in flight as lost. Doing so avoids
sending an additional packet, but increases the risk that loss is
declared too aggressively, resulting in an unnecessary rate reduction
by the congestion controller.

Consecutive PTO periods increase exponentially, and as a result,
connection recovery latency increases exponentially as packets
continue to be dropped in the network. Sending two packets on PTO
expiration increases resilience to packet drops, thus reducing the
probability of consecutive PTO events.

Probe packets sent on a PTO MUST be ack-eliciting. A probe packet
SHOULD carry new data when possible. A probe packet MAY carry
retransmitted unacknowledged data when new data is unavailable, when
flow control does not permit new data to be sent, or to
opportunistically reduce loss recovery delay.
Implementations MAY 651 use alternative strategies for determining the content of probe 652 packets, including sending new or retransmitted data based on the 653 application's priorities. 655 When the PTO timer expires multiple times and new data cannot be 656 sent, implementations must choose between sending the same payload 657 every time or sending different payloads. Sending the same payload 658 may be simpler and ensures the highest priority frames arrive first. 659 Sending different payloads each time reduces the chances of spurious 660 retransmission. 662 5.3.2. Loss Detection 664 Delivery or loss of packets in flight is established when an ACK 665 frame is received that newly acknowledges one or more packets. 667 A PTO timer expiration event does not indicate packet loss and MUST 668 NOT cause prior unacknowledged packets to be marked as lost. When an 669 acknowledgement is received that newly acknowledges packets, loss 670 detection proceeds as dictated by packet and time threshold 671 mechanisms; see Section 5.1. 673 5.4. Handling Retry Packets 675 A Retry packet causes a client to send another Initial packet, 676 effectively restarting the connection process. A Retry packet 677 indicates that the Initial was received, but not processed. A Retry 678 packet cannot be treated as an acknowledgment, because it does not 679 indicate that a packet was processed or specify the packet number. 681 Clients that receive a Retry packet reset congestion control and loss 682 recovery state, including resetting any pending timers. Other 683 connection state, in particular cryptographic handshake messages, is 684 retained; see Section 17.2.5 of [QUIC-TRANSPORT]. 686 The client MAY compute an RTT estimate to the server as the time 687 period from when the first Initial was sent to when a Retry or a 688 Version Negotiation packet is received. The client MAY use this 689 value in place of its default for the initial RTT estimate. 691 5.5. 
Discarding Keys and Packet State 693 When packet protection keys are discarded (see Section 4.10 of 694 [QUIC-TLS]), all packets that were sent with those keys can no longer 695 be acknowledged because their acknowledgements cannot be processed 696 anymore. The sender MUST discard all recovery state associated with 697 those packets and MUST remove them from the count of bytes in flight. 699 Endpoints stop sending and receiving Initial packets once they start 700 exchanging Handshake packets (see Section 17.2.2.1 of 701 [QUIC-TRANSPORT]). At this point, recovery state for all in-flight 702 Initial packets is discarded. 704 When 0-RTT is rejected, recovery state for all in-flight 0-RTT 705 packets is discarded. 707 If a server accepts 0-RTT, but does not buffer 0-RTT packets that 708 arrive before Initial packets, early 0-RTT packets will be declared 709 lost, but that is expected to be infrequent. 711 It is expected that keys are discarded after packets encrypted with 712 them would be acknowledged or declared lost. Initial secrets however 713 might be destroyed sooner, as soon as handshake keys are available 714 (see Section 4.10.1 of [QUIC-TLS]). 716 6. Congestion Control 718 This document specifies a Reno congestion controller for QUIC 719 [RFC6582]. 721 The signals QUIC provides for congestion control are generic and are 722 designed to support different algorithms. Endpoints can unilaterally 723 choose a different algorithm to use, such as Cubic [RFC8312]. 725 If an endpoint uses a different controller than that specified in 726 this document, the chosen controller MUST conform to the congestion 727 control guidelines specified in Section 3.1 of [RFC8085]. 729 The algorithm in this document specifies and uses the controller's 730 congestion window in bytes. 
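As an illustration of the key-discard rule in Section 5.5, the
following minimal Python sketch drops all recovery state for a packet
number space and removes its packets from the count of bytes in
flight. The class and method names are hypothetical and are not part
of the pseudocode in the appendices; timers and RTT state are left
out for brevity.

```python
from dataclasses import dataclass


@dataclass
class SentPacket:
    packet_number: int
    in_flight: bool
    sent_bytes: int


class Recovery:
    """Hypothetical container for per-space sent-packet state."""

    def __init__(self):
        self.sent_packets = {"Initial": {}, "Handshake": {},
                             "ApplicationData": {}}
        self.bytes_in_flight = 0

    def on_packet_sent(self, space, pkt):
        self.sent_packets[space][pkt.packet_number] = pkt
        if pkt.in_flight:
            self.bytes_in_flight += pkt.sent_bytes

    def discard_space(self, space):
        """Forget every packet sent with keys that were just discarded."""
        for pkt in self.sent_packets[space].values():
            if pkt.in_flight:
                # These packets can no longer be acknowledged, so they
                # must stop counting against the congestion window.
                self.bytes_in_flight -= pkt.sent_bytes
        self.sent_packets[space].clear()
```

For example, discarding the Initial space after a 1200-byte Initial
packet and an 800-byte Handshake packet were sent leaves only the 800
Handshake bytes in flight.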

An endpoint MUST NOT send a packet if it would cause bytes_in_flight
(see Appendix B.2) to be larger than the congestion window, unless
the packet is sent on a PTO timer expiration (see Section 5.2).

6.1.  Explicit Congestion Notification

If a path has been verified to support ECN [RFC3168] [RFC8311], QUIC
treats a Congestion Experienced (CE) codepoint in the IP header as a
signal of congestion. This document specifies an endpoint's response
when its peer receives packets with the Congestion Experienced
codepoint.

6.2.  Slow Start

QUIC begins every connection in slow start and exits slow start upon
loss or upon increase in the ECN-CE counter. QUIC re-enters slow
start any time the congestion window is less than ssthresh, which
only occurs after persistent congestion is declared. While in slow
start, QUIC increases the congestion window by the number of bytes
acknowledged when each acknowledgment is processed.

6.3.  Congestion Avoidance

Slow start exits to congestion avoidance. Congestion avoidance in
NewReno uses an additive increase multiplicative decrease (AIMD)
approach that increases the congestion window by one maximum packet
size per congestion window acknowledged. When a loss is detected,
NewReno halves the congestion window and sets the slow start
threshold to the new congestion window.

6.4.  Recovery Period

Recovery is a period of time beginning with detection of a lost
packet or an increase in the ECN-CE counter. Because QUIC does not
retransmit packets, it defines the end of recovery as a packet sent
after the start of recovery being acknowledged. This is slightly
different from TCP's definition of recovery, which ends when the lost
packet that started recovery is acknowledged.

The recovery period limits congestion window reduction to once per
round trip.
During recovery, the congestion window remains unchanged 773 irrespective of new losses or increases in the ECN-CE counter. 775 6.5. Ignoring Loss of Undecryptable Packets 777 During the handshake, some packet protection keys might not be 778 available when a packet arrives. In particular, Handshake and 0-RTT 779 packets cannot be processed until the Initial packets arrive, and 780 1-RTT packets cannot be processed until the handshake completes. 781 Endpoints MAY ignore the loss of Handshake, 0-RTT, and 1-RTT packets 782 that might arrive before the peer has packet protection keys to 783 process those packets. 785 6.6. Probe Timeout 787 Probe packets MUST NOT be blocked by the congestion controller. A 788 sender MUST however count these packets as being additionally in 789 flight, since these packets add network load without establishing 790 packet loss. Note that sending probe packets might cause the 791 sender's bytes in flight to exceed the congestion window until an 792 acknowledgement is received that establishes loss or delivery of 793 packets. 795 6.7. Persistent Congestion 797 When an ACK frame is received that establishes loss of all in-flight 798 packets sent over a long enough period of time, the network is 799 considered to be experiencing persistent congestion. Commonly, this 800 can be established by consecutive PTOs, but since the PTO timer is 801 reset when a new ack-eliciting packet is sent, an explicit duration 802 must be used to account for those cases where PTOs do not occur or 803 are substantially delayed. 
This duration is computed as follows:

   (smoothed_rtt + 4 * rttvar + max_ack_delay) *
       kPersistentCongestionThreshold

For example, assume:

   smoothed_rtt = 1
   rttvar = 0
   max_ack_delay = 0
   kPersistentCongestionThreshold = 3

If an ack-eliciting packet is sent at time = 0, the following
scenario would illustrate persistent congestion:

   +-----+------------------------+
   | t=0 | Send Pkt #1 (App Data) |
   +=====+========================+
   | t=1 | Send Pkt #2 (PTO 1)    |
   +-----+------------------------+
   | t=3 | Send Pkt #3 (PTO 2)    |
   +-----+------------------------+
   | t=7 | Send Pkt #4 (PTO 3)    |
   +-----+------------------------+
   | t=8 | Recv ACK of Pkt #4     |
   +-----+------------------------+

                Table 1

The first three packets are determined to be lost when the
acknowledgement of packet 4 is received at t=8. The congestion
period is calculated as the time between the oldest and newest lost
packets: (3 - 0) = 3. The duration for persistent congestion is
equal to: (1 * kPersistentCongestionThreshold) = 3. Because the
threshold was reached and because none of the packets between the
oldest and the newest packets are acknowledged, the network is
considered to have experienced persistent congestion.

When persistent congestion is established, the sender's congestion
window MUST be reduced to the minimum congestion window
(kMinimumWindow). This response of collapsing the congestion window
on persistent congestion is functionally similar to a sender's
response on a Retransmission Timeout (RTO) in TCP [RFC5681] after
Tail Loss Probes (TLP) [RACK].

6.8.  Pacing

This document does not specify a pacer, but it is RECOMMENDED that a
sender pace sending of all in-flight packets based on input from the
congestion controller.
For example, a pacer might distribute the congestion window over the
smoothed RTT when used with a window-based controller, and a pacer
might use the rate estimate of a rate-based controller.

An implementation should take care to architect its congestion
controller to work well with a pacer. For instance, a pacer might
wrap the congestion controller and control the availability of the
congestion window, or a pacer might pace out packets handed to it by
the congestion controller. Timely delivery of ACK frames is
important for efficient loss recovery. Packets containing only ACK
frames should therefore not be paced, to avoid delaying their
delivery to the peer.

Sending multiple packets into the network without any delay between
them creates a packet burst that might cause short-term congestion
and losses. Implementations MUST either use pacing or limit such
bursts to the initial congestion window, which is recommended to be
the minimum of 10 * max_datagram_size and
max(2 * max_datagram_size, 14720), where max_datagram_size is the
current maximum size of a datagram for the connection, not including
UDP or IP overhead.

As an example of a well-known and publicly available implementation
of a flow pacer, implementers are referred to the Fair Queue packet
scheduler (fq qdisc) in Linux (3.11 onwards).

6.9.  Under-utilizing the Congestion Window

When bytes in flight is smaller than the congestion window and
sending is not pacing limited, the congestion window is
under-utilized. When this occurs, the congestion window SHOULD NOT
be increased in either slow start or congestion avoidance. This can
happen due to insufficient application data or flow control credit.

A sender MAY use the pipeACK method described in Section 4.3 of
[RFC7661] to determine if the congestion window is sufficiently
utilized.
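Returning to the burst limit and pacing guidance in Section 6.8, the
recommended initial window and one possible pacing interval can be
sketched in Python. This is only an illustration: this document does
not specify a pacer, and the function names and the integer
microsecond units are assumptions. The interval spreads one
congestion window evenly over one smoothed RTT, which is the
window-based example given in Section 6.8.

```python
def initial_window(max_datagram_size):
    """Recommended initial congestion window, in bytes."""
    return min(10 * max_datagram_size,
               max(2 * max_datagram_size, 14720))


def pacing_interval_us(congestion_window, smoothed_rtt_us, packet_size):
    """Microseconds to wait after sending packet_size bytes.

    Assumes a simple pacer that distributes congestion_window bytes
    uniformly across one smoothed RTT.
    """
    return packet_size * smoothed_rtt_us // congestion_window
```

For a 1200-byte max_datagram_size the initial window is 12000 bytes,
and with a 100ms smoothed RTT a 1200-byte packet would be followed by
a 10ms gap.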
888 A sender that paces packets (see Section 6.8) might delay sending 889 packets and not fully utilize the congestion window due to this 890 delay. A sender should not consider itself application limited if it 891 would have fully utilized the congestion window without pacing delay. 893 A sender MAY implement alternative mechanisms to update its 894 congestion window after periods of under-utilization, such as those 895 proposed for TCP in [RFC7661]. 897 7. Security Considerations 899 7.1. Congestion Signals 901 Congestion control fundamentally involves the consumption of signals 902 - both loss and ECN codepoints - from unauthenticated entities. On- 903 path attackers can spoof or alter these signals. An attacker can 904 cause endpoints to reduce their sending rate by dropping packets, or 905 alter send rate by changing ECN codepoints. 907 7.2. Traffic Analysis 909 Packets that carry only ACK frames can be heuristically identified by 910 observing packet size. Acknowledgement patterns may expose 911 information about link characteristics or application behavior. 912 Endpoints can use PADDING frames or bundle acknowledgments with other 913 frames to reduce leaked information. 915 7.3. Misreporting ECN Markings 917 A receiver can misreport ECN markings to alter the congestion 918 response of a sender. Suppressing reports of ECN-CE markings could 919 cause a sender to increase their send rate. This increase could 920 result in congestion and loss. 922 A sender MAY attempt to detect suppression of reports by marking 923 occasional packets that they send with ECN-CE. If a packet sent with 924 ECN-CE is not reported as having been CE marked when the packet is 925 acknowledged, then the sender SHOULD disable ECN for that path. 927 Reporting additional ECN-CE markings will cause a sender to reduce 928 their sending rate, which is similar in effect to advertising reduced 929 connection flow control limits and so no advantage is gained by doing 930 so. 
932 Endpoints choose the congestion controller that they use. Though 933 congestion controllers generally treat reports of ECN-CE markings as 934 equivalent to loss [RFC8311], the exact response for each controller 935 could be different. Failure to correctly respond to information 936 about ECN markings is therefore difficult to detect. 938 8. IANA Considerations 940 This document has no IANA actions. Yet. 942 9. References 944 9.1. Normative References 946 [QUIC-TLS] Thomson, M., Ed. and S. Turner, Ed., "Using TLS to Secure 947 QUIC", Work in Progress, Internet-Draft, draft-ietf-quic- 948 tls-25, 22 January 2020, 949 . 951 [QUIC-TRANSPORT] 952 Iyengar, J., Ed. and M. Thomson, Ed., "QUIC: A UDP-Based 953 Multiplexed and Secure Transport", Work in Progress, 954 Internet-Draft, draft-ietf-quic-transport-25, 22 January 955 2020, . 958 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 959 Requirement Levels", BCP 14, RFC 2119, 960 DOI 10.17487/RFC2119, March 1997, 961 . 963 [RFC8085] Eggert, L., Fairhurst, G., and G. Shepherd, "UDP Usage 964 Guidelines", BCP 145, RFC 8085, DOI 10.17487/RFC8085, 965 March 2017, . 967 [RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC 968 2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, 969 May 2017, . 971 9.2. Informative References 973 [FACK] Mathis, M. and J. Mahdavi, "Forward Acknowledgement: 974 Refining TCP Congestion Control", ACM SIGCOMM , August 975 1996. 977 [RACK] Cheng, Y., Cardwell, N., Dukkipati, N., and P. Jha, "RACK: 978 a time-based fast loss detection algorithm for TCP", Work 979 in Progress, Internet-Draft, draft-ietf-tcpm-rack-05, 26 980 April 2019, . 983 [RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition 984 of Explicit Congestion Notification (ECN) to IP", 985 RFC 3168, DOI 10.17487/RFC3168, September 2001, 986 . 988 [RFC4653] Bhandarkar, S., Reddy, A. L. N., Allman, M., and E. 
989 Blanton, "Improving the Robustness of TCP to Non- 990 Congestion Events", RFC 4653, DOI 10.17487/RFC4653, August 991 2006, . 993 [RFC5681] Allman, M., Paxson, V., and E. Blanton, "TCP Congestion 994 Control", RFC 5681, DOI 10.17487/RFC5681, September 2009, 995 . 997 [RFC5682] Sarolahti, P., Kojo, M., Yamamoto, K., and M. Hata, 998 "Forward RTO-Recovery (F-RTO): An Algorithm for Detecting 999 Spurious Retransmission Timeouts with TCP", RFC 5682, 1000 DOI 10.17487/RFC5682, September 2009, 1001 . 1003 [RFC5827] Allman, M., Avrachenkov, K., Ayesta, U., Blanton, J., and 1004 P. Hurtig, "Early Retransmit for TCP and Stream Control 1005 Transmission Protocol (SCTP)", RFC 5827, 1006 DOI 10.17487/RFC5827, May 2010, 1007 . 1009 [RFC6298] Paxson, V., Allman, M., Chu, J., and M. Sargent, 1010 "Computing TCP's Retransmission Timer", RFC 6298, 1011 DOI 10.17487/RFC6298, June 2011, 1012 . 1014 [RFC6582] Henderson, T., Floyd, S., Gurtov, A., and Y. Nishida, "The 1015 NewReno Modification to TCP's Fast Recovery Algorithm", 1016 RFC 6582, DOI 10.17487/RFC6582, April 2012, 1017 . 1019 [RFC6675] Blanton, E., Allman, M., Wang, L., Jarvinen, I., Kojo, M., 1020 and Y. Nishida, "A Conservative Loss Recovery Algorithm 1021 Based on Selective Acknowledgment (SACK) for TCP", 1022 RFC 6675, DOI 10.17487/RFC6675, August 2012, 1023 . 1025 [RFC6928] Chu, J., Dukkipati, N., Cheng, Y., and M. Mathis, 1026 "Increasing TCP's Initial Window", RFC 6928, 1027 DOI 10.17487/RFC6928, April 2013, 1028 . 1030 [RFC7661] Fairhurst, G., Sathiaseelan, A., and R. Secchi, "Updating 1031 TCP to Support Rate-Limited Traffic", RFC 7661, 1032 DOI 10.17487/RFC7661, October 2015, 1033 . 1035 [RFC8311] Black, D., "Relaxing Restrictions on Explicit Congestion 1036 Notification (ECN) Experimentation", RFC 8311, 1037 DOI 10.17487/RFC8311, January 2018, 1038 . 1040 [RFC8312] Rhee, I., Xu, L., Ha, S., Zimmermann, A., Eggert, L., and 1041 R. 
Scheffenegger, "CUBIC for Fast Long-Distance Networks", 1042 RFC 8312, DOI 10.17487/RFC8312, February 2018, 1043 . 1045 Appendix A. Loss Recovery Pseudocode 1047 We now describe an example implementation of the loss detection 1048 mechanisms described in Section 5. 1050 A.1. Tracking Sent Packets 1052 To correctly implement congestion control, a QUIC sender tracks every 1053 ack-eliciting packet until the packet is acknowledged or lost. It is 1054 expected that implementations will be able to access this information 1055 by packet number and crypto context and store the per-packet fields 1056 (Appendix A.1.1) for loss recovery and congestion control. 1058 After a packet is declared lost, the endpoint can track it for an 1059 amount of time comparable to the maximum expected packet reordering, 1060 such as 1 RTT. This allows for detection of spurious 1061 retransmissions. 1063 Sent packets are tracked for each packet number space, and ACK 1064 processing only applies to a single space. 1066 A.1.1. Sent Packet Fields 1068 packet_number: The packet number of the sent packet. 1070 ack_eliciting: A boolean that indicates whether a packet is ack- 1071 eliciting. If true, it is expected that an acknowledgement will 1072 be received, though the peer could delay sending the ACK frame 1073 containing it by up to the MaxAckDelay. 1075 in_flight: A boolean that indicates whether the packet counts 1076 towards bytes in flight. 1078 sent_bytes: The number of bytes sent in the packet, not including 1079 UDP or IP overhead, but including QUIC framing overhead. 1081 time_sent: The time the packet was sent. 1083 A.2. Constants of interest 1085 Constants used in loss recovery are based on a combination of RFCs, 1086 papers, and common practice. 1088 kPacketThreshold: Maximum reordering in packets before packet 1089 threshold loss detection considers a packet lost. The RECOMMENDED 1090 value is 3. 
1092 kTimeThreshold: Maximum reordering in time before time threshold 1093 loss detection considers a packet lost. Specified as an RTT 1094 multiplier. The RECOMMENDED value is 9/8. 1096 kGranularity: Timer granularity. This is a system-dependent value. 1097 However, implementations SHOULD use a value no smaller than 1ms. 1099 kInitialRtt: The RTT used before an RTT sample is taken. The 1100 RECOMMENDED value is 500ms. 1102 kPacketNumberSpace: An enum to enumerate the three packet number 1103 spaces. 1105 enum kPacketNumberSpace { 1106 Initial, 1107 Handshake, 1108 ApplicationData, 1109 } 1111 A.3. Variables of interest 1113 Variables required to implement the congestion control mechanisms are 1114 described in this section. 1116 latest_rtt: The most recent RTT measurement made when receiving an 1117 ack for a previously unacked packet. 1119 smoothed_rtt: The smoothed RTT of the connection, computed as 1120 described in [RFC6298] 1122 rttvar: The RTT variation, computed as described in [RFC6298] 1124 min_rtt: The minimum RTT seen in the connection, ignoring ack delay. 1126 max_ack_delay: The maximum amount of time by which the receiver 1127 intends to delay acknowledgments for packets in the 1128 ApplicationData packet number space. The actual ack_delay in a 1129 received ACK frame may be larger due to late timers, reordering, 1130 or lost ACK frames. 1132 loss_detection_timer: Multi-modal timer used for loss detection. 1134 pto_count: The number of times a PTO has been sent without receiving 1135 an ack. 1137 time_of_last_sent_ack_eliciting_packet[kPacketNumberSpace]: The time 1138 the most recent ack-eliciting packet was sent. 1140 largest_acked_packet[kPacketNumberSpace]: The largest packet number 1141 acknowledged in the packet number space so far. 1143 loss_time[kPacketNumberSpace]: The time at which the next packet in 1144 that packet number space will be considered lost based on 1145 exceeding the reordering window in time. 
1147 sent_packets[kPacketNumberSpace]: An association of packet numbers 1148 in a packet number space to information about them. Described in 1149 detail above in Appendix A.1. 1151 A.4. Initialization 1153 At the beginning of the connection, initialize the loss detection 1154 variables as follows: 1156 loss_detection_timer.reset() 1157 pto_count = 0 1158 latest_rtt = 0 1159 smoothed_rtt = 0 1160 rttvar = 0 1161 min_rtt = 0 1162 max_ack_delay = 0 1163 for pn_space in [ Initial, Handshake, ApplicationData ]: 1164 largest_acked_packet[pn_space] = infinite 1165 time_of_last_sent_ack_eliciting_packet[pn_space] = 0 1166 loss_time[pn_space] = 0 1168 A.5. On Sending a Packet 1170 After a packet is sent, information about the packet is stored. The 1171 parameters to OnPacketSent are described in detail above in 1172 Appendix A.1.1. 1174 Pseudocode for OnPacketSent follows: 1176 OnPacketSent(packet_number, pn_space, ack_eliciting, 1177 in_flight, sent_bytes): 1178 sent_packets[pn_space][packet_number].packet_number = 1179 packet_number 1180 sent_packets[pn_space][packet_number].time_sent = now 1181 sent_packets[pn_space][packet_number].ack_eliciting = 1182 ack_eliciting 1183 sent_packets[pn_space][packet_number].in_flight = in_flight 1184 if (in_flight): 1185 if (ack_eliciting): 1186 time_of_last_sent_ack_eliciting_packet[pn_space] = now 1187 OnPacketSentCC(sent_bytes) 1188 sent_packets[pn_space][packet_number].size = sent_bytes 1189 SetLossDetectionTimer() 1191 A.6. On Receiving an Acknowledgment 1193 When an ACK frame is received, it may newly acknowledge any number of 1194 packets. 1196 Pseudocode for OnAckReceived and UpdateRtt follow: 1198 OnAckReceived(ack, pn_space): 1199 if (largest_acked_packet[pn_space] == infinite): 1200 largest_acked_packet[pn_space] = ack.largest_acked 1201 else: 1202 largest_acked_packet[pn_space] = 1203 max(largest_acked_packet[pn_space], ack.largest_acked) 1205 // Nothing to do if there are no newly acked packets. 
     newly_acked_packets = DetermineNewlyAckedPackets(ack, pn_space)
     if (newly_acked_packets.empty()):
       return

     // If the largest acknowledged is newly acked and
     // at least one ack-eliciting was newly acked, update the RTT.
     if (sent_packets[pn_space].contains(ack.largest_acked) &&
         IncludesAckEliciting(newly_acked_packets)):
       latest_rtt =
         now - sent_packets[pn_space][ack.largest_acked].time_sent
       ack_delay = 0
       if (pn_space == ApplicationData):
         ack_delay = ack.ack_delay
       UpdateRtt(ack_delay)

     // Process ECN information if present.
     if (ACK frame contains ECN information):
       ProcessECN(ack, pn_space)

     // OnPacketAcked takes the sent-packet struct (see Appendix A.7).
     for acked_packet in newly_acked_packets:
       OnPacketAcked(acked_packet, pn_space)

     DetectLostPackets(pn_space)

     pto_count = 0

     SetLossDetectionTimer()


   UpdateRtt(ack_delay):
     // First RTT sample.
     if (smoothed_rtt == 0):
       min_rtt = latest_rtt
       smoothed_rtt = latest_rtt
       rttvar = latest_rtt / 2
       return

     // min_rtt ignores ack delay.
     min_rtt = min(min_rtt, latest_rtt)
     // Limit ack_delay by max_ack_delay
     ack_delay = min(ack_delay, max_ack_delay)
     // Adjust for ack delay if plausible.
     adjusted_rtt = latest_rtt
     if (latest_rtt > min_rtt + ack_delay):
       adjusted_rtt = latest_rtt - ack_delay

     rttvar = 3/4 * rttvar + 1/4 * abs(smoothed_rtt - adjusted_rtt)
     smoothed_rtt = 7/8 * smoothed_rtt + 1/8 * adjusted_rtt

A.7.  On Packet Acknowledgment

When a packet is acknowledged for the first time, the following
OnPacketAcked function is called. Note that a single ACK frame may
newly acknowledge several packets. OnPacketAcked must be called once
for each of these newly acknowledged packets.

OnPacketAcked takes two parameters: acked_packet, which is the struct
detailed in Appendix A.1.1, and the packet number space that this ACK
frame was sent for.
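The UpdateRtt pseudocode in Appendix A.6 translates almost directly
into runnable code. The following Python sketch is one such
translation, offered only as an illustration. The class name and the
choice of floating-point milliseconds are assumptions; the arithmetic
follows the pseudocode.

```python
class RttEstimator:
    """Illustrative RTT estimator; all times are in milliseconds."""

    def __init__(self, max_ack_delay=0.0):
        self.latest_rtt = 0.0
        self.smoothed_rtt = 0.0
        self.rttvar = 0.0
        self.min_rtt = 0.0
        self.max_ack_delay = max_ack_delay

    def update(self, latest_rtt, ack_delay):
        self.latest_rtt = latest_rtt
        if self.smoothed_rtt == 0:
            # First RTT sample seeds all three estimates.
            self.min_rtt = latest_rtt
            self.smoothed_rtt = latest_rtt
            self.rttvar = latest_rtt / 2
            return
        # min_rtt ignores ack delay.
        self.min_rtt = min(self.min_rtt, latest_rtt)
        # The peer's reported delay is capped by max_ack_delay.
        ack_delay = min(ack_delay, self.max_ack_delay)
        # Subtract the ack delay only if the result stays above min_rtt.
        adjusted_rtt = latest_rtt
        if latest_rtt > self.min_rtt + ack_delay:
            adjusted_rtt = latest_rtt - ack_delay
        self.rttvar = (0.75 * self.rttvar
                       + 0.25 * abs(self.smoothed_rtt - adjusted_rtt))
        self.smoothed_rtt = 0.875 * self.smoothed_rtt + 0.125 * adjusted_rtt
```

With max_ack_delay set to 25, a first sample of 100 followed by a
sample of 140 carrying an ack delay of 30 yields an adjusted RTT of
115, an rttvar of 41.25, and a smoothed RTT of 101.875.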

Pseudocode for OnPacketAcked follows:

   OnPacketAcked(acked_packet, pn_space):
     if (acked_packet.in_flight):
       OnPacketAckedCC(acked_packet)
     sent_packets[pn_space].remove(acked_packet.packet_number)

A.8.  Setting the Loss Detection Timer

QUIC loss detection uses a single timer for all timeout loss
detection. The duration of the timer is based on the timer's mode,
which is set in the packet and timer events further below. The
function SetLossDetectionTimer defined below shows how the single
timer is set.

This algorithm may result in the timer being set in the past,
particularly if timers wake up late. Timers set in the past SHOULD
fire immediately.

Pseudocode for SetLossDetectionTimer follows:

   GetEarliestTimeAndSpace(times):
     time = times[Initial]
     space = Initial
     for pn_space in [ Handshake, ApplicationData ]:
       if (times[pn_space] != 0 &&
           (time == 0 || times[pn_space] < time) &&
           // Skip ApplicationData until handshake completion.
           (pn_space != ApplicationData ||
             IsHandshakeComplete())):
         time = times[pn_space]
         space = pn_space
     return time, space

   PeerNotAwaitingAddressValidation():
     // Assume clients validate the server's address implicitly.
     if (endpoint is server):
       return true
     // Servers complete address validation when a
     // protected packet is received.
     return has received Handshake ACK ||
            has received 1-RTT ACK

   SetLossDetectionTimer():
     earliest_loss_time, _ = GetEarliestTimeAndSpace(loss_time)
     if (earliest_loss_time != 0):
       // Time threshold loss detection.
       loss_detection_timer.update(earliest_loss_time)
       return

     if (no ack-eliciting packets in flight &&
         PeerNotAwaitingAddressValidation()):
       loss_detection_timer.cancel()
       return

     // Use a default timeout if there are no RTT measurements
     if (smoothed_rtt == 0):
       timeout = 2 * kInitialRtt
     else:
       // Calculate PTO duration
       timeout = smoothed_rtt + max(4 * rttvar, kGranularity) +
         max_ack_delay
     timeout = timeout * (2 ^ pto_count)

     sent_time, _ = GetEarliestTimeAndSpace(
       time_of_last_sent_ack_eliciting_packet)
     loss_detection_timer.update(sent_time + timeout)

A.9.  On Timeout

When the loss detection timer expires, the timer's mode determines
the action to be performed.

Pseudocode for OnLossDetectionTimeout follows:

   OnLossDetectionTimeout():
     earliest_loss_time, pn_space =
       GetEarliestTimeAndSpace(loss_time)
     if (earliest_loss_time != 0):
       // Time threshold loss detection
       DetectLostPackets(pn_space)
       SetLossDetectionTimer()
       return

     if (endpoint is client without 1-RTT keys):
       // Client sends an anti-deadlock packet: Initial is padded
       // to earn more anti-amplification credit,
       // a Handshake packet proves address ownership.
       if (has Handshake keys):
         SendOneAckElicitingHandshakePacket()
       else:
         SendOneAckElicitingPaddedInitialPacket()
     else:
       // PTO. Send new data if available, else retransmit old data.
       // If neither is available, send a single PING frame.
       _, pn_space = GetEarliestTimeAndSpace(
         time_of_last_sent_ack_eliciting_packet)
       SendOneOrTwoAckElicitingPackets(pn_space)

     pto_count++
     SetLossDetectionTimer()

A.10.  Detecting Lost Packets

DetectLostPackets is called every time an ACK is received and
operates on the sent_packets for that packet number space.
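The GetEarliestTimeAndSpace helper used by the timer pseudocode in
Appendices A.8 and A.9 can be sketched in Python as follows. This is
an illustration only. Passing the handshake state as a parameter and
representing packet number spaces as strings are assumptions made to
keep the example self-contained. As in the pseudocode, a time of 0
means "not set".

```python
INITIAL = "Initial"
HANDSHAKE = "Handshake"
APPLICATION_DATA = "ApplicationData"


def get_earliest_time_and_space(times, handshake_complete):
    """Return (time, space) for the earliest non-zero entry in times."""
    time, space = times[INITIAL], INITIAL
    for pn_space in (HANDSHAKE, APPLICATION_DATA):
        # Skip ApplicationData until handshake completion.
        if pn_space == APPLICATION_DATA and not handshake_complete:
            continue
        candidate = times[pn_space]
        if candidate != 0 and (time == 0 or candidate < time):
            time, space = candidate, pn_space
    return time, space
```

With times of 0, 5, and 3 for the Initial, Handshake, and
ApplicationData spaces, the helper selects Handshake before the
handshake completes and ApplicationData afterwards.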

Pseudocode for DetectLostPackets follows:

   DetectLostPackets(pn_space):
     assert(largest_acked_packet[pn_space] != infinite)
     loss_time[pn_space] = 0
     lost_packets = {}
     loss_delay = kTimeThreshold * max(latest_rtt, smoothed_rtt)

     // Minimum time of kGranularity before packets are deemed lost.
     loss_delay = max(loss_delay, kGranularity)

     // Packets sent before this time are deemed lost.
     lost_send_time = now() - loss_delay

     foreach unacked in sent_packets[pn_space]:
       if (unacked.packet_number > largest_acked_packet[pn_space]):
         continue

       // Mark packet as lost, or set time when it should be marked.
       if (unacked.time_sent <= lost_send_time ||
           largest_acked_packet[pn_space] >=
             unacked.packet_number + kPacketThreshold):
         sent_packets[pn_space].remove(unacked.packet_number)
         if (unacked.in_flight):
           lost_packets.insert(unacked)
       else:
         if (loss_time[pn_space] == 0):
           loss_time[pn_space] = unacked.time_sent + loss_delay
         else:
           loss_time[pn_space] = min(loss_time[pn_space],
                                     unacked.time_sent + loss_delay)

     // Inform the congestion controller of lost packets and
     // let it decide whether to retransmit immediately.
     if (!lost_packets.empty()):
       OnPacketsLost(lost_packets)

Appendix B.  Congestion Control Pseudocode

We now describe an example implementation of the congestion
controller described in Section 6.

B.1.  Constants of interest

Constants used in congestion control are based on a combination of
RFCs, papers, and common practice.

kInitialWindow:  Default limit on the initial amount of data in
   flight, in bytes. The RECOMMENDED value is the minimum of
   10 * max_datagram_size and max(2 * max_datagram_size, 14720).
   This follows the analysis and recommendations in [RFC6928],
   increasing the byte limit to account for the smaller 8 byte
   overhead of UDP compared to the 20 byte overhead for TCP.

   kMinimumWindow:  Minimum congestion window in bytes.  The RECOMMENDED
      value is 2 * max_datagram_size.

   kLossReductionFactor:  Reduction in congestion window when a new loss
      event is detected.  The RECOMMENDED value is 0.5.

   kPersistentCongestionThreshold:  Period of time for persistent
      congestion to be established, specified as a PTO multiplier.  The
      rationale for this threshold is to enable a sender to use initial
      PTOs for aggressive probing, as TCP does with Tail Loss Probe
      (TLP) [RACK], before establishing persistent congestion, as TCP
      does with a Retransmission Timeout (RTO) [RFC5681].  The
      RECOMMENDED value for kPersistentCongestionThreshold is 3, which
      is approximately equivalent to having two TLPs before an RTO in
      TCP.

B.2.  Variables of interest

   Variables required to implement the congestion control mechanisms are
   described in this section.

   max_datagram_size:  The sender's current maximum payload size.  Does
      not include UDP or IP overhead.  The max datagram size is used for
      congestion window computations.  An endpoint sets the value of
      this variable based on its PMTU (see Section 14.1 of
      [QUIC-TRANSPORT]), with a minimum value of 1200 bytes.

   ecn_ce_counters[kPacketNumberSpace]:  The highest value reported for
      the ECN-CE counter in the packet number space by the peer in an
      ACK frame.  This value is used to detect increases in the reported
      ECN-CE counter.

   bytes_in_flight:  The sum of the size in bytes of all sent packets
      that contain at least one ack-eliciting or PADDING frame, and have
      not been acked or declared lost.  The size does not include IP or
      UDP overhead, but does include the QUIC header and AEAD overhead.
      Packets only containing ACK frames do not count towards
      bytes_in_flight to ensure congestion control does not impede
      congestion feedback.

   congestion_window:  Maximum number of bytes-in-flight that may be
      sent.

   congestion_recovery_start_time:  The time when QUIC first detects
      congestion due to loss or ECN, causing it to enter congestion
      recovery.  When a packet sent after this time is acknowledged,
      QUIC exits congestion recovery.

   ssthresh:  Slow start threshold in bytes.  When the congestion window
      is below ssthresh, the mode is slow start and the window grows by
      the number of bytes acknowledged.

B.3.  Initialization

   At the beginning of the connection, initialize the congestion control
   variables as follows:

      congestion_window = kInitialWindow
      bytes_in_flight = 0
      congestion_recovery_start_time = 0
      ssthresh = infinite
      for pn_space in [ Initial, Handshake, ApplicationData ]:
        ecn_ce_counters[pn_space] = 0

B.4.  On Packet Sent

   Whenever a packet is sent, and it contains non-ACK frames, the packet
   increases bytes_in_flight.

      OnPacketSentCC(bytes_sent):
        bytes_in_flight += bytes_sent

B.5.  On Packet Acknowledgement

   Invoked from loss detection's OnPacketAcked and is supplied with the
   acked_packet from sent_packets.

      InCongestionRecovery(sent_time):
        return sent_time <= congestion_recovery_start_time

      OnPacketAckedCC(acked_packet):
        // Remove from bytes_in_flight.
        bytes_in_flight -= acked_packet.size
        if (InCongestionRecovery(acked_packet.time_sent)):
          // Do not increase congestion window in recovery period.
          return
        if (IsAppOrFlowControlLimited()):
          // Do not increase congestion_window if application
          // limited or flow control limited.
          return
        if (congestion_window < ssthresh):
          // Slow start.
          congestion_window += acked_packet.size
        else:
          // Congestion avoidance.
          congestion_window += max_datagram_size * acked_packet.size
              / congestion_window

B.6.
On New Congestion Event

   Invoked from ProcessECN and OnPacketsLost when a new congestion event
   is detected.  May start a new recovery period and reduces the
   congestion window.

      CongestionEvent(sent_time):
        // Start a new congestion event if packet was sent after the
        // start of the previous congestion recovery period.
        if (!InCongestionRecovery(sent_time)):
          congestion_recovery_start_time = now()
          congestion_window *= kLossReductionFactor
          congestion_window = max(congestion_window, kMinimumWindow)
          ssthresh = congestion_window

B.7.  Process ECN Information

   Invoked when an ACK frame with an ECN section is received from the
   peer.

      ProcessECN(ack, pn_space):
        // If the ECN-CE counter reported by the peer has increased,
        // this could be a new congestion event.
        if (ack.ce_counter > ecn_ce_counters[pn_space]):
          ecn_ce_counters[pn_space] = ack.ce_counter
          CongestionEvent(sent_packets[ack.largest_acked].time_sent)

B.8.  On Packets Lost

   Invoked from DetectLostPackets when packets are deemed lost.

      InPersistentCongestion(largest_lost_packet):
        pto = smoothed_rtt + max(4 * rttvar, kGranularity) +
          max_ack_delay
        congestion_period = pto * kPersistentCongestionThreshold
        // Determine if all packets in the time period before the
        // newest lost packet, including the edges, are marked
        // lost
        return AreAllPacketsLost(largest_lost_packet,
                                 congestion_period)

      OnPacketsLost(lost_packets):
        // Remove lost packets from bytes_in_flight.
        for (lost_packet : lost_packets):
          bytes_in_flight -= lost_packet.size
        largest_lost_packet = lost_packets.last()
        CongestionEvent(largest_lost_packet.time_sent)

        // Collapse congestion window if persistent congestion
        if (InPersistentCongestion(largest_lost_packet)):
          congestion_window = kMinimumWindow

Appendix C.
Change Log

      *RFC Editor's Note:* Please remove this section prior to
      publication of a final version of this document.

   Issue and pull request numbers are listed with a leading octothorp.

C.1.  Since draft-ietf-quic-recovery-24

   *  Require congestion control of some sort (#3247, #3244, #3248)

   *  Set a minimum reordering threshold (#3256, #3240)

   *  PTO is specific to a packet number space (#3067, #3074, #3066)

C.2.  Since draft-ietf-quic-recovery-23

   *  Define under-utilizing the congestion window (#2630, #2686, #2675)

   *  PTO MUST send data if possible (#3056, #3057)

   *  Connection Close is not ack-eliciting (#3097, #3098)

   *  MUST limit bursts to the initial congestion window (#3160)

   *  Define the current max_datagram_size for congestion control
      (#3041, #3167)

C.3.  Since draft-ietf-quic-recovery-22

   *  PTO should always send an ack-eliciting packet (#2895)

   *  Unify the Handshake Timer with the PTO timer (#2648, #2658, #2886)

   *  Move ACK generation text to transport draft (#1860, #2916)

C.4.  Since draft-ietf-quic-recovery-21

   *  No changes

C.5.  Since draft-ietf-quic-recovery-20

   *  Path validation can be used as initial RTT value (#2644, #2687)

   *  max_ack_delay transport parameter defaults to 0 (#2638, #2646)

   *  Ack Delay only measures intentional delays induced by the
      implementation (#2596, #2786)

C.6.  Since draft-ietf-quic-recovery-19

   *  Change kPersistentThreshold from an exponent to a multiplier
      (#2557)

   *  Send a PING if the PTO timer fires and there's nothing to send
      (#2624)

   *  Set loss delay to at least kGranularity (#2617)

   *  Merge application limited and sending after idle sections.
      Always limit burst size instead of requiring resetting CWND to
      initial CWND after idle (#2605)

   *  Rewrite RTT estimation, allow RTT samples where a newly acked
      packet is ack-eliciting but the largest_acked is not (#2592)

   *  Don't arm the handshake timer if there is no handshake data
      (#2590)

   *  Clarify that the time threshold loss alarm takes precedence over
      the crypto handshake timer (#2590, #2620)

   *  Change initial RTT to 500ms to align with RFC6298 (#2184)

C.7.  Since draft-ietf-quic-recovery-18

   *  Change IW byte limit to 14720 from 14600 (#2494)

   *  Update PTO calculation to match RFC6298 (#2480, #2489, #2490)

   *  Improve loss detection's description of multiple packet number
      spaces and pseudocode (#2485, #2451, #2417)

   *  Declare persistent congestion even if non-probe packets are sent
      and don't make persistent congestion more aggressive than RTO
      verified was (#2365, #2244)

   *  Move pseudocode to the appendices (#2408)

   *  What to send on multiple PTOs (#2380)

C.8.  Since draft-ietf-quic-recovery-17

   *  After Probe Timeout discard in-flight packets or send another
      (#2212, #1965)

   *  Endpoints discard initial keys as soon as handshake keys are
      available (#1951, #2045)

   *  0-RTT state is discarded when 0-RTT is rejected (#2300)

   *  Loss detection timer is cancelled when ack-eliciting frames are in
      flight (#2117, #2093)

   *  Packets are declared lost if they are in flight (#2104)

   *  After becoming idle, either pace packets or reset the congestion
      controller (#2138, #2187)

   *  Process ECN counts before marking packets lost (#2142)

   *  Mark packets lost before resetting crypto_count and pto_count
      (#2208, #2209)

   *  Congestion and loss recovery state are discarded when keys are
      discarded (#2327)

C.9.
Since draft-ietf-quic-recovery-16

   *  Unify TLP and RTO into a single PTO; eliminate min RTO, min TLP
      and min crypto timeouts; eliminate timeout validation (#2114,
      #2166, #2168, #1017)

   *  Redefine congestion avoidance in terms of when the period starts
      (#1928, #1930)

   *  Document what needs to be tracked for packets that are in flight
      (#765, #1724, #1939)

   *  Integrate both time and packet thresholds into loss detection
      (#1969, #1212, #934, #1974)

   *  Reduce congestion window after idle, unless pacing is used (#2007,
      #2023)

   *  Disable RTT calculation for packets that don't elicit
      acknowledgment (#2060, #2078)

   *  Limit ack_delay by max_ack_delay (#2060, #2099)

   *  Initial keys are discarded once Handshake keys are available
      (#1951, #2045)

   *  Reorder ECN and loss detection in pseudocode (#2142)

   *  Only cancel loss detection timer if ack-eliciting packets are in
      flight (#2093, #2117)

C.10.  Since draft-ietf-quic-recovery-14

   *  Used max_ack_delay from transport params (#1796, #1782)

   *  Merge ACK and ACK_ECN (#1783)

C.11.  Since draft-ietf-quic-recovery-13

   *  Corrected the lack of ssthresh reduction in CongestionEvent
      pseudocode (#1598)

   *  Considerations for ECN spoofing (#1426, #1626)

   *  Clarifications for PADDING and congestion control (#837, #838,
      #1517, #1531, #1540)

   *  Reduce early retransmission timer to RTT/8 (#945, #1581)

   *  Packets are declared lost after an RTO is verified (#935, #1582)

C.12.  Since draft-ietf-quic-recovery-12

   *  Changes to manage separate packet number spaces and encryption
      levels (#1190, #1242, #1413, #1450)

   *  Added ECN feedback mechanisms and handling; new ACK_ECN frame
      (#804, #805, #1372)

C.13.  Since draft-ietf-quic-recovery-11

   No significant changes.

C.14.
Since draft-ietf-quic-recovery-10

   *  Improved text on ack generation (#1139, #1159)

   *  Make references to TCP recovery mechanisms informational (#1195)

   *  Define time_of_last_sent_handshake_packet (#1171)

   *  Added signal from TLS that the data it includes needs to be sent
      in a Retry packet (#1061, #1199)

   *  Minimum RTT (min_rtt) is initialized with an infinite value
      (#1169)

C.15.  Since draft-ietf-quic-recovery-09

   No significant changes.

C.16.  Since draft-ietf-quic-recovery-08

   *  Clarified pacing and RTO (#967, #977)

C.17.  Since draft-ietf-quic-recovery-07

   *  Include Ack Delay in RTO (and TLP) computations (#981)

   *  Ack Delay in SRTT computation (#961)

   *  Default RTT and Slow Start (#590)

   *  Many editorial fixes.

C.18.  Since draft-ietf-quic-recovery-06

   No significant changes.

C.19.  Since draft-ietf-quic-recovery-05

   *  Add more congestion control text (#776)

C.20.  Since draft-ietf-quic-recovery-04

   No significant changes.

C.21.  Since draft-ietf-quic-recovery-03

   No significant changes.

C.22.  Since draft-ietf-quic-recovery-02

   *  Integrate F-RTO (#544, #409)

   *  Add congestion control (#545, #395)

   *  Require connection abort if a skipped packet was acknowledged
      (#415)

   *  Simplify RTO calculations (#142, #417)

C.23.  Since draft-ietf-quic-recovery-01

   *  Overview added to loss detection

   *  Changes initial default RTT to 100ms

   *  Added time-based loss detection and fixes early retransmit

   *  Clarified loss recovery for handshake packets

   *  Fixed references and made TCP references informative

C.24.  Since draft-ietf-quic-recovery-00

   *  Improved description of constants and ACK behavior

C.25.  Since draft-iyengar-quic-loss-recovery-01

   *  Adopted as base for draft-ietf-quic-recovery

   *  Updated authors/editors list

   *  Added table of contents

Appendix D.
Contributors

   The IETF QUIC Working Group received an enormous amount of support
   from many people.  The following people provided substantive
   contributions to this document: Alessandro Ghedini, Benjamin
   Saunders, Gorry Fairhurst, 奥 一穂 (Kazuho Oku), Lars Eggert, Magnus
   Westerlund, Marten Seemann, Martin Duke, Martin Thomson, Nick Banks,
   Praveen Balasubramaniam.

Acknowledgments

Authors' Addresses

   Jana Iyengar (editor)
   Fastly

   Email: jri.ietf@gmail.com


   Ian Swett (editor)
   Google

   Email: ianswett@google.com