Network Working Group                                 P. Balasubramanian
Internet-Draft                                                  Y. Huang
Intended status: Standards Track                                      M.
Olson
Expires: July 10, 2021                                         Microsoft
                                                         January 6, 2021

                 HyStart++: Modified Slow Start for TCP
                   draft-ietf-tcpm-hystartplusplus-01

Abstract

   This document describes HyStart++, a simple modification to the slow
   start phase of TCP congestion control algorithms. Traditional slow
   start can overshoot the ideal send rate and cause heavy packet loss
   within a round-trip time, which results in poor performance.
   HyStart++ combines one variant of HyStart with Limited Slow Start
   (LSS) to prevent overshooting of the ideal sending rate, while also
   mitigating the poor performance that can result from false positives
   when HyStart is used alone.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current Internet-
   Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on July 10, 2021.

Copyright Notice

   Copyright (c) 2021 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.
   Code Components extracted from this document must include Simplified
   BSD License text as described in Section 4.e of the Trust Legal
   Provisions and are provided without warranty as described in the
   Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  Definitions
   4.  HyStart++ Algorithm
     4.1.  Use of HyStart Delay Increase and Limited Slow Start
     4.2.  Algorithm Details
     4.3.  Tuning constants
   5.  Deployments and Performance Evaluations
   6.  Security Considerations
   7.  IANA Considerations
   8.  Acknowledgements
   9.  References
     9.1.  Normative References
     9.2.  Informative References
   Authors' Addresses

1.  Introduction

   [RFC5681] describes the slow start congestion control algorithm for
   TCP. The slow start algorithm is used when the congestion window
   (cwnd) is less than the slow start threshold (ssthresh). During slow
   start, in the absence of packet loss signals, a TCP sender increases
   cwnd exponentially to probe the network capacity. Such fast growth
   can lead to overshooting the ideal sending rate and cause significant
   packet loss. This is counter-productive for the TCP flow itself, and
   also impacts the rest of the traffic sharing the bottleneck link.
   TCP has several mechanisms for loss recovery, but they are only
   effective for moderate loss.
   When these techniques are unable to recover lost packets, a
   last-resort retransmission timeout (RTO) is used to trigger packet
   recovery. In most operating systems, the minimum RTO is set to a
   large value (200 msec or 300 msec) to prevent spurious timeouts.
   This results in a long idle time which drastically impairs flow
   completion times.

   HyStart++ adds delay increase as a signal to exit slow start before
   any packet loss occurs. This is one of two algorithms specified in
   [HyStart]. After the HyStart delay algorithm finds an exit point,
   LSS is used in conjunction with congestion avoidance for further
   congestion window increases until the first packet loss is detected.
   HyStart++ reduces packet loss and retransmissions, and improves
   goodput in lab measurements as well as real world deployments.

2.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

3.  Definitions

   We repeat here some definitions from [RFC5681] to aid the reader.

   SENDER MAXIMUM SEGMENT SIZE (SMSS): The SMSS is the size of the
   largest segment that the sender can transmit. This value can be
   based on the maximum transmission unit of the network, the path MTU
   discovery [RFC1191, RFC4821] algorithm, RMSS (see next item), or
   other factors. The size does not include the TCP/IP headers and
   options.

   RECEIVER MAXIMUM SEGMENT SIZE (RMSS): The RMSS is the size of the
   largest segment the receiver is willing to accept. This is the value
   specified in the MSS option sent by the receiver during connection
   startup. Or, if the MSS option is not used, it is 536 bytes
   [RFC1122]. The size does not include the TCP/IP headers and options.

   RECEIVER WINDOW (rwnd): The most recently advertised receiver window.

   CONGESTION WINDOW (cwnd): A TCP state variable that limits the amount
   of data a TCP can send. At any given time, a TCP MUST NOT send data
   with a sequence number higher than the sum of the highest
   acknowledged sequence number and the minimum of cwnd and rwnd.

4.  HyStart++ Algorithm

4.1.  Use of HyStart Delay Increase and Limited Slow Start

   [HyStart] specifies two algorithms (a "Delay Increase" algorithm and
   an "Inter-Packet Arrival" algorithm) to be run in parallel to detect
   that the sending rate has reached capacity. In practice, the
   Inter-Packet Arrival algorithm does not perform well and is not able
   to detect congestion early, primarily due to ACK compression. The
   idea of the Delay Increase algorithm is to look for RTT spikes, which
   suggest that the bottleneck buffer is filling up.

   After the HyStart "Delay Increase" algorithm triggers an exit from
   slow start, LSS (described in [RFC3742]) is used to increase cwnd
   until congestion is observed. LSS is used because the HyStart exit
   is often premature as a result of RTT fluctuations or transient queue
   buildup. LSS still grows cwnd quickly, but much more slowly than
   traditional slow start. LSS helps avoid massive packet losses and
   the subsequent time spent in loss recovery or retransmission timeout.

4.2.  Algorithm Details

   We assume that Appropriate Byte Counting (as described in [RFC3465])
   is in use and L is the cwnd increase limit. The choice of value of L
   is up to the implementation.

   A round is chosen to be approximately the Round-Trip Time (RTT).
   A round can be approximated using sequence numbers as follows:

      Define windowEnd as a sequence number initialized to SND.UNA

      When windowEnd is ACKed, the current round ends and windowEnd is
      set to SND.NXT

   At the start of each round during slow start:

      lastRoundMinRTT = currentRoundMinRTT

      currentRoundMinRTT = infinity

      rttSampleCount = 0

   For each arriving ACK in slow start, where N is the number of
   previously unacknowledged bytes acknowledged in the arriving ACK:

      Update the cwnd

         cwnd = cwnd + min (N, L * SMSS)

      Keep track of minimum observed RTT

         currentRoundMinRTT = min(currentRoundMinRTT, currRTT)

         where currRTT is the measured RTT based on the incoming ACK

         rttSampleCount += 1

      For rounds where cwnd is at or higher than LOW_CWND segments and
      N_RTT_SAMPLE RTT samples have been obtained, check if delay
      increase triggers slow start exit

         if (cwnd >= (LOW_CWND * SMSS) AND
             rttSampleCount >= N_RTT_SAMPLE)

            RttThresh = clamp(MIN_RTT_THRESH, lastRoundMinRTT / 8,
                              MAX_RTT_THRESH)

            if (currentRoundMinRTT >= (lastRoundMinRTT + RttThresh))

               ssthresh = cwnd

               exit slow start and enter LSS

   For each arriving ACK in LSS, where N is the number of previously
   unacknowledged bytes acknowledged in the arriving ACK:

      K = cwnd / (LSS_DIVISOR * ssthresh)

      cwnd = max(cwnd + (min (N, L * SMSS) / K), CA_cwnd())

   CA_cwnd() denotes the cwnd that a congestion control algorithm would
   have increased to if congestion avoidance started instead of LSS.
   LSS grows cwnd quickly, but for long-lived flows in high-BDP
   networks, the congestion avoidance algorithm could increase cwnd much
   faster. For example, CUBIC congestion avoidance [RFC8312] in its
   convex region can ramp up cwnd rapidly. Taking the max can help
   improve performance when exiting slow start prematurely.

   HyStart++ ends when congestion is observed.

4.3.
Tuning constants

   It is RECOMMENDED that a HyStart++ implementation use the following
   constants:

      LOW_CWND = 16

      MIN_RTT_THRESH = 4 msec

      MAX_RTT_THRESH = 16 msec

      LSS_DIVISOR = 0.25

      N_RTT_SAMPLE = 8

   These constants have been determined with lab measurements and real
   world deployments. An implementation MAY tune them for different
   network characteristics.

   Using smaller values of LOW_CWND will cause the algorithm to kick in
   before the last round RTT can be measured, particularly if the
   implementation uses an initial cwnd of 10 MSS. Higher values will
   delay the detection of delay increase and reduce the ability of
   HyStart++ to prevent overshoot problems.

   The delay increase sensitivity is determined by MIN_RTT_THRESH and
   MAX_RTT_THRESH. Smaller values of MIN_RTT_THRESH may cause spurious
   exits from slow start. Larger values of MAX_RTT_THRESH may result in
   slow start not exiting until loss is encountered for connections on
   large RTT paths.

   A TCP implementation is required to take at least one RTT sample each
   round. Using lower values of N_RTT_SAMPLE will lower the accuracy of
   the measured RTT for the round; higher values will improve accuracy
   at the cost of more processing.

   The value of LSS_DIVISOR SHOULD NOT exceed 0.5, which is the value
   recommended in [RFC3742]. Otherwise the cwnd growth could again
   become too aggressive and overshoot the ideal send rate. Smaller
   values will cause the algorithm to be less aggressive and may leave
   some cwnd growth on the table.

   An implementation SHOULD use HyStart++ only for the initial slow
   start and fall back to using traditional slow start for the remainder
   of the connection lifetime. This is acceptable because subsequent
   slow starts will use the discovered ssthresh value to exit slow
   start.
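
   The per-round and per-ACK steps of Section 4.2, combined with the
   recommended constants listed in this section, can be sketched in
   Python as follows. This is an illustrative reading, not a reference
   implementation: the class name HyStartPP, its field and method
   names, the default SMSS of 1460 bytes, the choice L = 8, and the
   use of milliseconds for RTT values are all assumptions made for the
   example. Round detection via windowEnd and the CA_cwnd() comparison
   in LSS are omitted for brevity.

```python
import math
from dataclasses import dataclass

# Recommended constants from this section.
LOW_CWND = 16          # segments
MIN_RTT_THRESH = 4.0   # msec
MAX_RTT_THRESH = 16.0  # msec
LSS_DIVISOR = 0.25
N_RTT_SAMPLE = 8


def clamp(low, value, high):
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(value, high))


@dataclass
class HyStartPP:
    # Hypothetical state holder; names and defaults are assumptions.
    smss: int = 1460              # sender maximum segment size, bytes
    limit_l: int = 8              # L, the ABC increase limit (implementation choice)
    cwnd: int = 10 * 1460         # congestion window, bytes
    ssthresh: float = math.inf    # slow start threshold, bytes
    in_lss: bool = False
    last_round_min_rtt: float = math.inf
    current_round_min_rtt: float = math.inf
    rtt_sample_count: int = 0

    def start_round(self):
        """Run at the start of each round during slow start."""
        self.last_round_min_rtt = self.current_round_min_rtt
        self.current_round_min_rtt = math.inf
        self.rtt_sample_count = 0

    def on_ack(self, acked_bytes, rtt_ms):
        """Process one ACK that newly acknowledges acked_bytes bytes."""
        increase = min(acked_bytes, self.limit_l * self.smss)
        if self.in_lss:
            # Limited Slow Start: growth is divided by K.
            k = self.cwnd / (LSS_DIVISOR * self.ssthresh)
            self.cwnd += int(increase / k)
            return

        # Standard slow start growth with Appropriate Byte Counting.
        self.cwnd += increase
        self.current_round_min_rtt = min(self.current_round_min_rtt, rtt_ms)
        self.rtt_sample_count += 1

        # Delay increase check, only once enough samples exist.
        if (self.cwnd >= LOW_CWND * self.smss
                and self.rtt_sample_count >= N_RTT_SAMPLE):
            rtt_thresh = clamp(MIN_RTT_THRESH,
                               self.last_round_min_rtt / 8,
                               MAX_RTT_THRESH)
            if self.current_round_min_rtt >= self.last_round_min_rtt + rtt_thresh:
                self.ssthresh = self.cwnd
                self.in_lss = True  # exit slow start and enter LSS
```

   In this sketch the first round can never trigger an exit, because
   lastRoundMinRTT is still infinity. At the moment of exit cwnd equals
   ssthresh, so K = 1 / LSS_DIVISOR = 4 and each ACK initially grows
   cwnd by one quarter of what slow start would have added.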
   An implementation MAY use HyStart++ to grow the restart window
   ([RFC5681]) after a long idle period.

5.  Deployments and Performance Evaluations

   As of the time of writing, HyStart++ has been enabled by default for
   all TCP connections in Windows for two years. The original HyStart
   has been enabled by default for all TCP connections in Linux TCP for
   a decade.

   In lab measurements with Windows TCP, HyStart++ shows both goodput
   improvements and reductions in packet loss and retransmissions. For
   example, across a variety of tests on a 100 Mbps link with a
   bottleneck buffer sized to the bandwidth-delay product, HyStart++
   reduces bytes retransmitted by 50% and retransmission timeouts by
   36%.

   In an A/B test across a large Windows device population, out of 52
   billion TCP connections, 0.7% of connections move from 1 RTO to 0
   RTOs and another 0.7% of connections move from 2 RTOs to 1 RTO with
   HyStart++. This test did not focus on send-heavy connections, and
   the impact on send-heavy connections is likely much higher. We plan
   to conduct more such production experiments to gather more data in
   the future.

6.  Security Considerations

   HyStart++ enhances slow start and inherits the general security
   considerations discussed in [RFC5681].

7.  IANA Considerations

   This document has no actions for IANA.

8.  Acknowledgements

   Neal Cardwell suggested the idea of using the maximum of the cwnd
   values computed by LSS and congestion avoidance after exiting slow
   start.

9.  References

9.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC3465]  Allman, M., "TCP Congestion Control with Appropriate Byte
              Counting (ABC)", RFC 3465, DOI 10.17487/RFC3465, February
              2003.

   [RFC3742]  Floyd, S., "Limited Slow-Start for TCP with Large
              Congestion Windows", RFC 3742, DOI 10.17487/RFC3742, March
              2004.

   [RFC5681]  Allman, M., Paxson, V., and E. Blanton, "TCP Congestion
              Control", RFC 5681, DOI 10.17487/RFC5681, September 2009.

9.2.  Informative References

   [HyStart]  Ha, S. and I. Rhee, "Hybrid Slow Start for High-Bandwidth
              and Long-Distance Networks",
              DOI 10.1145/1851182.1851192, International Workshop on
              Protocols for Fast Long-Distance Networks, 2008.

   [RFC8312]  Rhee, I., Xu, L., Ha, S., Zimmermann, A., Eggert, L., and
              R. Scheffenegger, "CUBIC for Fast Long-Distance Networks",
              RFC 8312, DOI 10.17487/RFC8312, February 2018.

Authors' Addresses

   Praveen Balasubramanian
   Microsoft
   One Microsoft Way
   Redmond, WA 98052
   USA

   Phone: +1 425 538 2782
   Email: pravb@microsoft.com

   Yi Huang
   Microsoft

   Phone: +1 425 703 0447
   Email: huanyi@microsoft.com

   Matt Olson
   Microsoft

   Phone: +1 425 538 8598
   Email: maolson@microsoft.com