idnits 2.17.1 draft-iyengar-quic-loss-recovery-01.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack a Security Considerations section. ** The document seems to lack an IANA Considerations section. (See Section 2.2 of https://www.ietf.org/id-info/checklist for how to handle the case when there are no actions for IANA.) Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year == The document seems to lack the recommended RFC 2119 boilerplate, even if it appears to use RFC 2119 keywords. (The document does seem to have the reference to RFC 2119 which the ID-Checklist requires). -- The document date (October 31, 2016) is 2734 days in the past. Is this intentional? Checking references for intended status: Experimental ---------------------------------------------------------------------------- == Missing Reference: 'RFC 6298' is mentioned on line 185, but not defined == Unused Reference: 'RFC2119' is defined on line 425, but no explicit reference was found in the text Summary: 2 errors (**), 0 flaws (~~), 4 warnings (==), 1 comment (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 Network Working Group J. Iyengar 3 Internet-Draft I. Swett 4 Intended status: Experimental Google 5 Expires: May 4, 2017 October 31, 2016 7 QUIC Congestion Control And Loss Recovery 8 draft-iyengar-quic-loss-recovery-01 10 Abstract 12 QUIC is a new multiplexed and secure transport atop UDP. QUIC builds 13 on decades of transport and security experience, and implements 14 mechanisms that make it attractive as a modern general-purpose 15 transport. QUIC implements the spirit of known TCP loss detection 16 mechanisms, described in RFCs, various Internet-drafts, and also 17 those prevalent in the Linux TCP implementation. This document 18 describes QUIC loss detection and congestion control, and attributes 19 the TCP equivalent in RFCs, Internet-drafts, academic papers, and TCP 20 implementations. 22 Status of This Memo 24 This Internet-Draft is submitted in full conformance with the 25 provisions of BCP 78 and BCP 79. 27 Internet-Drafts are working documents of the Internet Engineering 28 Task Force (IETF). Note that other groups may also distribute 29 working documents as Internet-Drafts. The list of current Internet- 30 Drafts is at http://datatracker.ietf.org/drafts/current/. 32 Internet-Drafts are draft documents valid for a maximum of six months 33 and may be updated, replaced, or obsoleted by other documents at any 34 time. It is inappropriate to use Internet-Drafts as reference 35 material or to cite them other than as "work in progress." 37 This Internet-Draft will expire on May 4, 2017. 39 Copyright Notice 41 Copyright (c) 2016 IETF Trust and the persons identified as the 42 document authors. All rights reserved. 44 This document is subject to BCP 78 and the IETF Trust's Legal 45 Provisions Relating to IETF Documents 46 (http://trustee.ietf.org/license-info) in effect on the date of 47 publication of this document. Please review these documents 48 carefully, as they describe your rights and restrictions with respect 49 to this document. Code Components extracted from this document must 50 include Simplified BSD License text as described in Section 4.e of 51 the Trust Legal Provisions and are provided without warranty as 52 described in the Simplified BSD License. 54 1. Introduction 56 QUIC is a new multiplexed and secure transport atop UDP. QUIC builds 57 on decades of transport and security experience, and implements 58 mechanisms that make it attractive as a modern general-purpose 59 transport. The QUIC protocol is described in [draft-hamilton-quic- 60 transport-protocol]. 62 QUIC implements the spirit of known TCP loss recovery mechanisms, 63 described in RFCs, various Internet-drafts, and also those prevalent 64 in the Linux TCP implementation. This document describes QUIC 65 congestion control and loss recovery, and where applicable, 66 attributes the TCP equivalent in RFCs, Internet-drafts, academic 67 papers, and/or TCP implementations. 69 This document first describes pre-requisite parts of the QUIC 70 transmission machinery, then discusses QUIC's default congestion 71 control and loss detection mechanisms, and finally lists the various 72 TCP mechanisms that QUIC loss detection implements (in spirit.) 74 2. Design of the QUIC Transmission Machinery 76 All transmissions in QUIC are sent with a packet-level header, which 77 includes a packet sequence number (referred to below as a packet 78 number). These packet numbers never repeat in the lifetime of a 79 connection, and are monotonically increasing, which makes duplicate 80 detection trivial. This fundamental design decision obviates the 81 need for disambiguating between transmissions and retransmissions and 82 eliminates significant complexity from QUIC's interpretation of TCP 83 loss detection mechanisms. 85 Every packet may contain several frames. We outline the frames that 86 are important to the loss detection and congestion control machinery 87 below. 89 o Retransmittable frames are frames requiring reliable delivery. 90 The most common are STREAM frames, which typically contain 91 application data. 93 o Crypto handshake data is also sent as STREAM data, and uses the 94 reliability machinery of QUIC underneath. 96 o ACK frames contain acknowledgment information. QUIC uses a SACK- 97 based scheme, where acks express up to 256 ranges. The ACK frame 98 also includes a receive timestamp for each packet newly acked. 100 2.1. Relevant Differences Between QUIC and TCP 102 There are some notable differences between QUIC and TCP which are 103 important for reasoning about the differences between the loss 104 recovery mechanisms employed by the two protocols. We briefly 105 describe these differences below. 107 2.1.1. Monotonically Increasing Packet Numbers 109 TCP conflates transmission sequence number at the sender with 110 delivery sequence number at the receiver, which results in 111 retransmissions of the same data carrying the same sequence number, 112 and consequently to problems caused by "retransmission ambiguity". 113 QUIC separates the two: QUIC uses a packet sequence number (referred 114 to as the "packet number") for transmissions, and any data that is to 115 be delivered to the receiving application(s) is sent in one or more 116 streams, with stream offsets encoded within STREAM frames inside of 117 packets that determine delivery order. 119 QUIC's packet number is strictly increasing, and directly encodes 120 transmission order. A higher QUIC packet number signifies that the 121 packet was sent later, and a lower QUIC packet number signifies that 122 the packet was sent earlier. When a packet containing frames is 123 deemed lost, QUIC rebundles necessary frames in a new packet with a 124 new packet number, removing ambiguity about which packet is 125 acknowledged when an ACK is received. Consequently, more accurate 126 RTT measurements can be made, spurious retransmissions are trivially 127 detected, and mechanisms such as Fast Retransmit can be applied 128 universally, based only on packet number. 130 This design point significantly simplifies loss detection mechanisms 131 for QUIC. Most TCP mechanisms implicitly attempt to infer 132 transmission ordering based on TCP sequence numbers --- a non-trivial 133 task, especially when TCP timestamps are not available. 135 2.1.2. No Reneging 137 QUIC ACKs contain information that is equivalent to TCP SACK, but 138 QUIC does not allow any acked packet to be reneged, greatly 139 simplifying implementations on both sides and reducing memory 140 pressure on the sender. 142 2.1.3. More ACK Ranges 144 QUIC supports up to 256 ACK ranges, opposed to TCP's 3 SACK ranges. 145 In high loss environments, this speeds recovery. 147 2.1.4. Explicit Correction For Delayed Acks 149 QUIC ACKs explicitly encode the delay incurred at the receiver 150 between when a packet is received and when the corresponding ACK is 151 sent. This allows the receiver of the ACK to adjust for receiver 152 delays, specifically the delayed ack timer, when estimating the path 153 RTT. This mechanism also allows a receiver to measure and report the 154 delay from when a packet was received by the OS kernel, which is 155 useful in receivers which may incur delays such as context-switch 156 latency before a userspace QUIC receiver processes a received packet. 158 3. Loss Detection 160 We now describe QUIC's loss detection as functions that should be 161 called on packet transmission, when a packet is acked, and timer 162 expiration events. 164 3.1. Variables of interest 166 We first describe the variables required to implement the loss 167 detection mechanisms described in this section. 169 o loss_detection_alarm: Multi-modal alarm used for loss detection. 171 o alarm_mode: QUIC maintains a single loss detection alarm, which 172 switches between various modes. This mode is used to determine 173 the duration of the alarm. 175 o handshake_count: The number of times the handshake packets have 176 been retransmitted without receiving an ack. 178 o tlp_count: The number of times a tail loss probe has been sent 179 without receiving an ack. 181 o rto_count: The number of times an rto has been sent without 182 receiving and ack. 184 o smoothed_rtt: The smoothed RTT of the connection, computed as 185 described in [RFC 6298]. TODO: Describe RTT computations. 187 o reordering_threshold: The largest delta between the largest acked 188 retransmittable packet and a packet containing retransmittable 189 frames before it's declared lost. 191 o time_loss: When true, loss detection operates solely based on 192 reordering threshold in time, rather than in packet number gaps. 194 3.2. Initialization 196 At the beginning of the connection, initialize the loss detection 197 variables as follows: 199 loss_detection_alarm.reset(); 200 handshake_count = 0; 201 tlp_count = 0; 202 rto_count = 0; 203 smoothed_rtt = 0; 204 reordering_threshold = 3; 205 time_loss = false; 207 3.3. Setting the Loss Detection Alarm 209 QUIC loss detection uses a single alarm for all timer-based loss 210 detection. The duration of the alarm is based on the alarm's mode, 211 which is set in the packet and timer events further below. The 212 function SetLossDetectionAlarm defined below shows how the single 213 timer is set based on the alarm mode. 215 Pseudocode for SetLossDetectionAlarm follows. 217 SetLossDetectionAlarm(): 218 if (retransmittable packets are outstanding): 220 loss_detection_alarm.cancel(); 221 return; 222 if (handshake packets are outstanding): 224 alarm_duration = max(1.5 * smoothed_rtt, 10ms) << handshake_count; 225 handshake_count++; 226 else if (largest sent packet is acked): 227 // Set alarm based on short timer for early retransmit. 228 alarm_duration = 0.25 x smoothed_rtt; 229 else if (tlp_count < 2): 231 if (retransmittable_packets_outstanding = 1): 232 alarm_duration = max(1.5 x smoothed_rtt + delayed_ack_timer, 233 2 x smoothed_rtt); 234 else: 235 alarm_duration = max (10ms, 2 x smoothed_rtt); 236 tlp_count++; 237 else: 239 if (rto_count = 0): 240 alarm_duration = max(200ms, smoothed_rtt + 4 x rttvar); 241 else: 242 alarm_duration = loss_detection_alarm.get_delay() << 1; 244 rto_count++; 246 loss_detecton_alarm.set(now + alarm_duration); 248 3.4. On Sending a Packet 250 After any packet is sent, be it a new transmission or a rebundled 251 transmission, the following OnPacketSent function is called. The 252 parameters to OnPacketSent are as follows. 254 o packet_number: The packet number of the sent packet. 256 o is_retransmittble: A boolean that indicates whether the packet 257 contains at least one frame requiring reliable deliver. The 258 retransmittability of various QUIC frames is described in [draft- 259 hamilton-quic-protocol]. If false, it is still acceptable for an 260 ack to be received for this packet. However, a caller MUST NOT 261 set is_retransmittable to true if an ack is not expected. 263 Pseudocode for OnPacketSent follows. 265 OnPacketSent(packet_number, is_retransmittable): 266 if is_retransmittable: 267 SetLossDetectionAlarm() 269 3.5. On Packet Acknowledgment 271 When a packet is acked for the first time, the following 272 OnPacketAcked function is called. Note that a single ACK frame may 273 newly acknowledge several packets. OnPacketAcked must be called once 274 for each of these newly acked packets. 276 OnPacketAcked takes one parameter, acked_packet, which is the packet 277 number of the newly acked packet, and returns a list of packet 278 numbers that are detected as lost. 280 Pseudocode for OnPacketAcked follows. 282 OnPacketAcked(acked_packet): 283 handshake_count = 0; 284 tlp_count = 0; 285 rto_count = 0; 286 UpdateRtt(); // TODO: document RTT estimator. 287 DetectLostPackets(acked_packet); 288 SetLossDetectionAlarm(); 290 3.6. On Alarm Firing 292 QUIC uses one loss recovery alarm, which when set, can be in one of 293 several modes. When the alarm fires, the mode determines the action 294 to be performed. OnAlarm returns a list of packet numbers that are 295 detected as lost. 297 Pseudocode for OnAlarm follows. 299 OnAlarm(acked_packet): 300 lost_packets = DetectLostPackets(acked_packet); 301 MaybeRetransmitLostPackets(); 302 SetLossDetectionAlarm(); 304 3.7. Detecting Lost Packets 306 Packets in QUIC are only considered lost once a larger packet number 307 is acknowledged. DetectLostPackets is called every time there is a 308 new largest packet or if the loss detection alarm fires the previous 309 largest acked packet is supplied. 311 DetectLostPackets takes one parameter, acked_packet, which is the 312 packet number of the largest acked packet, and returns a list of 313 packet numbers detected as lost. 315 Pseudocode for DetectLostPackets follows. 317 DetectLostPackets(acked_packet): 318 lost_packets = {}; 319 foreach (unacked_packet less than acked_packet): 320 if (unacked_packet.time_sent < 321 acked_packet.time_sent - 1/8 * smoothed_rtt): 322 lost_packets.insert(unacked_packet.packet_number); 323 else if (unacked_packet.packet_number < 324 acked_packet.packet_number - reordering_threshold) 325 lost_packets.insert(unacked_packet.packet_number); 326 return lost_packets; 328 4. Congestion Control 330 (describe NewReno-style congestion control for QUIC.) 332 5. TCP mechanisms in QUIC 334 QUIC implements the spirit of a variety of RFCs, Internet drafts, and 335 other well-known TCP loss recovery mechanisms, though the 336 implementation details differ from the TCP implementations. 338 5.1. RFC 6298 (RTO computation) 340 QUIC calculates SRTT and RTTVAR according to the standard formulas. 341 An RTT sample is only taken if the delayed ack correction is smaller 342 than the measured RTT (otherwise a negative RTT would result), and 343 the ack's contains a new, larger largest observed packet number. 344 min_rtt is only based on the observed RTT, but SRTT uses the delayed 345 ack correction delta. 347 As described above, QUIC implements RTO with the standard timeout and 348 CWND reduction. However, QUIC retransmits the earliest outstanding 349 packets rather than the latest, because QUIC doesn't have 350 retransmission ambiguity. QUIC uses the commonly accepted min RTO of 351 200ms instead of the 1s the RFC specifies. 353 5.2. FACK Loss Recovery (paper) 355 QUIC implements the algorithm for early loss recovery described in 356 the FACK paper (and implemented in the Linux kernel.) QUIC uses the 357 packet number to measure the FACK reordering threshold. Currently 358 QUIC does not implement an adaptive threshold as many TCP 359 implementations(ie: the Linux kernel) do. 361 5.3. RFC 3782, RFC 6582 (NewReno Fast Recovery) 363 QUIC only reduces its CWND once per congestion window, in keeping 364 with the NewReno RFC. It tracks the largest outstanding packet at 365 the time the loss is declared and any losses which occur before that 366 packet number are considered part of the same loss event. It's worth 367 noting that some TCP implementations may do this on a sequence number 368 basis, and hence consider multiple losses of the same packet a single 369 loss event. 371 5.4. TLP (draft) 373 QUIC always sends two tail loss probes before RTO is triggered. QUIC 374 invokes tail loss probe even when a loss is outstanding, which is 375 different than some TCP implementations. 377 5.5. RFC 5827 (Early Retransmit) with Delay Timer 379 QUIC implements early retransmit with a timer in order to minimize 380 spurious retransmits. The timer is set to 1/4 SRTT after the final 381 outstanding packet is acked. 383 5.6. RFC 5827 (F-RTO) 385 QUIC implements F-RTO by not reducing the CWND and SSThresh until a 386 subsequent ack is received and it's sure the RTO was not spurious. 387 Conceptually this is similar, but it makes for a much cleaner 388 implementation with fewer edge cases. 390 5.7. RFC 6937 (Proportional Rate Reduction) 392 PRR-SSRB is implemented by QUIC in the epoch when recovering from a 393 loss. 395 5.8. TCP Cubic (draft) with optional RFC 5681 (Reno) 397 TCP Cubic is the default congestion control algorithm in QUIC. Reno 398 is also an easily available option which may be requested via 399 connection options and is fully implemented. 401 5.9. Hybrid Slow Start (paper) 403 QUIC implements hybrid slow start, but disables ack train detection, 404 because it has shown to falsely trigger when coupled with packet 405 pacing, which is also on by default in QUIC. Currently the minimum 406 delay increase is 4ms, the maximum is 16ms, and within that range 407 QUIC exits slow start if the min_rtt within a round increases by more 408 than ⅛ of the connection mi 410 5.10. RACK (draft) 412 QUIC's loss detection is by it's time-ordered nature, very similar to 413 RACK. Though QUIC defaults to loss detection based on reordering 414 threshold in packets, it could just as easily be based on fractions 415 of an rtt, as RACK does. 417 n 419 _rtt. 421 6. References 423 6.1. Normative References 425 [RFC2119] Bradner, S., "Key Words for use in RFCs to Indicate 426 Requirement Levels", March 1997. 428 6.2. Informative References 430 [draft-hamilton-quic-transport-protocol] 431 Hamilton, R., Iyengar, J., Swett, I., and A. Wilk, "QUIC: 432 A UDP-Based Multiplexed and Secure Transport", July 2016. 434 Authors' Addresses 436 Janardhan Iyengar 437 Google 439 Email: jri@google.com 441 Ian Swett 442 Google 444 Email: ianswett@google.com