Network Working Group                                 P. Balasubramanian
Internet-Draft                                                  Y. Huang
Intended status: Standards Track                                M. Olson
Expires: January 13, 2022                                      Microsoft
                                                           July 12, 2021


                 HyStart++: Modified Slow Start for TCP
                   draft-ietf-tcpm-hystartplusplus-02

Abstract

   This document describes HyStart++, a simple modification to the slow
   start phase of TCP congestion control algorithms. Traditional slow
   start can overshoot the ideal send rate and cause large packet loss
   within a round-trip time, which results in poor performance.
   HyStart++ uses the delay increase variant of HyStart to prevent
   overshooting of the ideal sending rate, while also mitigating poor
   performance which can result from false positives.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on January 13, 2022.

Copyright Notice

   Copyright (c) 2021 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.
   Code Components extracted from this document must include Simplified
   BSD License text as described in Section 4.e of the Trust Legal
   Provisions and are provided without warranty as described in the
   Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  Definitions
   4.  HyStart++ Algorithm
     4.1.  Summary
     4.2.  Algorithm Details
     4.3.  Tuning constants
   5.  Deployments and Performance Evaluations
   6.  Security Considerations
   7.  IANA Considerations
   8.  References
     8.1.  Normative References
     8.2.  Informative References
   Authors' Addresses

1.  Introduction

   [RFC5681] describes the slow start congestion control algorithm for
   TCP. The slow start algorithm is used when the congestion window
   (cwnd) is less than the slow start threshold (ssthresh). During slow
   start, in the absence of packet loss signals, a TCP sender increases
   cwnd exponentially to probe the network capacity. Such fast growth
   can lead to overshooting the ideal sending rate and cause significant
   packet loss. This is counter-productive for the TCP flow itself, and
   also impacts the rest of the traffic sharing the bottleneck link.
   TCP has several mechanisms for loss recovery, but they are only
   effective for moderate loss.
   When these techniques are unable to recover lost packets, a
   last-resort retransmission timeout (RTO) is used to trigger packet
   recovery. In most operating systems, the minimum RTO is set to a
   large value (200 msec or 300 msec) to prevent spurious timeouts.
   This results in a long idle time which drastically impairs flow
   completion times.

   HyStart++ adds delay increase as a signal to exit slow start before
   any packet loss occurs. This is one of two algorithms specified in
   [HyStart]. After the HyStart delay algorithm finds an exit point, a
   Conservative Slow Start (CSS) phase is used to determine if the slow
   start exit was spurious. This provides protection against jitter and
   prevents performance problems that result from early slow start exit
   due to false positives. HyStart++ reduces packet loss and
   retransmissions, and improves goodput in lab measurements as well as
   real world deployments.

2.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

3.  Definitions

   We repeat here some definitions from [RFC5681] to aid the reader.

   SENDER MAXIMUM SEGMENT SIZE (SMSS): The SMSS is the size of the
   largest segment that the sender can transmit. This value can be
   based on the maximum transmission unit of the network, the path MTU
   discovery [RFC1191, RFC4821] algorithm, RMSS (see next item), or
   other factors. The size does not include the TCP/IP headers and
   options.

   RECEIVER MAXIMUM SEGMENT SIZE (RMSS): The RMSS is the size of the
   largest segment the receiver is willing to accept. This is the value
   specified in the MSS option sent by the receiver during connection
   startup. Or, if the MSS option is not used, it is 536 bytes
   [RFC1122]. The size does not include the TCP/IP headers and options.

   RECEIVER WINDOW (rwnd): The most recently advertised receiver window.

   CONGESTION WINDOW (cwnd): A TCP state variable that limits the amount
   of data a TCP can send. At any given time, a TCP MUST NOT send data
   with a sequence number higher than the sum of the highest
   acknowledged sequence number and the minimum of cwnd and rwnd.

4.  HyStart++ Algorithm

4.1.  Summary

   [HyStart] specifies two algorithms (a "Delay Increase" algorithm and
   an "Inter-Packet Arrival" algorithm) to be run in parallel to detect
   that the sending rate has reached capacity. In practice, the
   Inter-Packet Arrival algorithm does not perform well and is not able
   to detect congestion early, primarily due to ACK compression. The
   idea of the Delay Increase algorithm is to look for RTT spikes, which
   suggest that the bottleneck buffer is filling up.

   In HyStart++, a TCP sender uses traditional slow start and then uses
   the "Delay Increase" algorithm to trigger an exit from slow start.
   But instead of using a congestion avoidance algorithm, the sender
   uses a Conservative Slow Start (CSS) algorithm to determine if the
   exit was spurious. If the exit is determined to be spurious, slow
   start is resumed. If the exit is determined to be not spurious, the
   sender enters congestion avoidance.

4.2.  Algorithm Details

   We assume that Appropriate Byte Counting (as described in [RFC3465])
   is in use and L is the cwnd increase limit. The choice of value of L
   is up to the implementation.

   A round is chosen to be approximately the Round-Trip Time (RTT).
   Rounds can be approximated using sequence numbers as follows:

      Define windowEnd as a sequence number initialized to SND.UNA

      When windowEnd is ACKed, the current round ends and windowEnd is
      set to SND.NXT

   At the start of each round during normal slow start and CSS:

      lastRoundMinRTT = currentRoundMinRTT

      currentRoundMinRTT = infinity

      rttSampleCount = 0

   For each arriving ACK in slow start, where N is the number of
   previously unacknowledged bytes acknowledged in the arriving ACK:

      Update the cwnd

         cwnd = cwnd + min (N, L * SMSS)

      Keep track of minimum observed RTT

         currentRoundMinRTT = min(currentRoundMinRTT, currRTT)

         where currRTT is the RTT sampled from the incoming ACK

         rttSampleCount += 1

      For rounds where cwnd is at or higher than LOW_CWND and
      N_RTT_SAMPLE RTT samples have been obtained, check if delay
      increase triggers slow start exit

         if (cwnd >= (LOW_CWND * SMSS) AND rttSampleCount >=
         N_RTT_SAMPLE)
            RttThresh = clamp(MIN_RTT_THRESH, lastRoundMinRTT / 8,
            MAX_RTT_THRESH)

            if (currentRoundMinRTT >= (lastRoundMinRTT + RttThresh))

               cssBaselineMinRtt = currentRoundMinRTT

               exit slow start and enter CSS

   CSS lasts CSS_ROUNDS rounds. If the transition into CSS happens in
   the middle of a round, that partial round counts towards the limit.

   For each arriving ACK in CSS, where N is the number of previously
   unacknowledged bytes acknowledged in the arriving ACK:

      Update the cwnd

         cwnd = cwnd + (min (N, L * SMSS) / CSS_GROWTH_DIVISOR)

      Keep track of minimum observed RTT

         currentRoundMinRTT = min(currentRoundMinRTT, currRTT)

         where currRTT is the sampled RTT from the incoming ACK

         rttSampleCount += 1

      For CSS rounds where N_RTT_SAMPLE RTT samples have been obtained,
      check if the current round's minRTT drops below the baseline,
      indicating that the HyStart exit was spurious.

         if (currentRoundMinRTT < cssBaselineMinRtt)

            cssBaselineMinRtt = infinity

            resume slow start including HyStart++

   If CSS_ROUNDS rounds are complete, enter congestion avoidance.

      ssthresh = cwnd

   If congestion is observed anytime during slow start or CSS, enter
   congestion avoidance.

      ssthresh = cwnd

4.3.  Tuning constants

   It is RECOMMENDED that a HyStart++ implementation use the following
   constants:

      LOW_CWND = 16

      MIN_RTT_THRESH = 4 msec

      MAX_RTT_THRESH = 16 msec

      N_RTT_SAMPLE = 8

      CSS_GROWTH_DIVISOR = 4

      CSS_ROUNDS = 5

   These constants have been determined with lab measurements and real
   world deployments. An implementation MAY tune them for different
   network characteristics.

   Using smaller values of LOW_CWND will cause the algorithm to kick in
   before the last round RTT can be measured, particularly if the
   implementation uses an initial cwnd of 10 MSS. Higher values will
   delay the detection of delay increase and reduce the ability of
   HyStart++ to prevent overshoot problems.

   The delay increase sensitivity is determined by MIN_RTT_THRESH and
   MAX_RTT_THRESH. Smaller values of MIN_RTT_THRESH may cause spurious
   exits from slow start. Larger values of MAX_RTT_THRESH may result in
   slow start not exiting until loss is encountered for connections on
   large RTT paths.

   A TCP implementation is required to take at least one RTT sample each
   round. Using lower values of N_RTT_SAMPLE will lower the accuracy of
   the measured RTT for the round; higher values will improve accuracy
   at the cost of more processing.

   CSS_GROWTH_DIVISOR SHOULD be at least 2. Otherwise the cwnd growth
   could again become too aggressive and overshoot the ideal send rate.
   Values larger than 4 will cause the algorithm to be less aggressive
   and possibly less performant.

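   The per-ACK processing of Section 4.2, combined with the recommended
   constants listed above, can be illustrated with the following
   non-normative Python sketch. The class name HyStartPP, its method
   and field names, the default SMSS of 1460 bytes, the limit L = 2,
   the 10-segment initial window, and the use of integer division for
   CSS growth are all assumptions made for illustration and are not
   part of this specification. Round boundaries are assumed to be
   signaled by the caller rather than tracked through SND.UNA and
   SND.NXT.

```python
import math
from dataclasses import dataclass

# Recommended constants from Section 4.3 (RTT values in milliseconds).
LOW_CWND = 16
MIN_RTT_THRESH = 4.0
MAX_RTT_THRESH = 16.0
N_RTT_SAMPLE = 8
CSS_GROWTH_DIVISOR = 4
CSS_ROUNDS = 5

SLOW_START, CSS, CONG_AVOID = "slow start", "CSS", "congestion avoidance"


@dataclass
class HyStartPP:
    """Illustrative HyStart++ state; names and defaults are assumptions."""
    smss: int = 1460                 # assumed SMSS, in bytes
    limit: int = 2                   # assumed ABC increase limit L
    cwnd: int = 10 * 1460            # assumed initial window, in bytes
    ssthresh: float = math.inf
    phase: str = SLOW_START
    last_round_min_rtt: float = math.inf
    current_round_min_rtt: float = math.inf
    rtt_sample_count: int = 0
    css_baseline_min_rtt: float = math.inf
    css_rounds_done: int = 0

    def start_round(self) -> None:
        """Called by the owner when windowEnd is ACKed (a round ends)."""
        if self.phase == CSS:
            # The partial round in which CSS began counts toward the limit.
            self.css_rounds_done += 1
            if self.css_rounds_done >= CSS_ROUNDS:
                self.on_congestion()  # same action: ssthresh = cwnd
                return
        self.last_round_min_rtt = self.current_round_min_rtt
        self.current_round_min_rtt = math.inf
        self.rtt_sample_count = 0

    def on_congestion(self) -> None:
        """Leave slow start or CSS and enter congestion avoidance."""
        self.ssthresh = self.cwnd
        self.phase = CONG_AVOID

    def on_ack(self, acked_bytes: int, rtt_ms: float) -> None:
        """Process one ACK covering acked_bytes new bytes (N in the text)."""
        if self.phase == CONG_AVOID:
            return  # congestion avoidance growth is out of scope here
        growth = min(acked_bytes, self.limit * self.smss)
        if self.phase == CSS:
            growth //= CSS_GROWTH_DIVISOR  # integer division is an assumption
        self.cwnd += growth
        self.current_round_min_rtt = min(self.current_round_min_rtt, rtt_ms)
        self.rtt_sample_count += 1
        if self.rtt_sample_count < N_RTT_SAMPLE:
            return
        if self.phase == SLOW_START:
            if self.cwnd < LOW_CWND * self.smss:
                return
            # clamp(MIN_RTT_THRESH, lastRoundMinRTT / 8, MAX_RTT_THRESH)
            rtt_thresh = max(MIN_RTT_THRESH,
                             min(self.last_round_min_rtt / 8, MAX_RTT_THRESH))
            if self.current_round_min_rtt >= self.last_round_min_rtt + rtt_thresh:
                self.css_baseline_min_rtt = self.current_round_min_rtt
                self.css_rounds_done = 0
                self.phase = CSS
        elif self.current_round_min_rtt < self.css_baseline_min_rtt:
            # The delay increase did not persist: the exit was spurious.
            self.css_baseline_min_rtt = math.inf
            self.phase = SLOW_START
```

   In this sketch, a sender would call on_ack() for every ACK that
   carries an RTT sample, start_round() each time windowEnd is ACKed,
   and on_congestion() when loss or another congestion signal is
   observed. During the first round lastRoundMinRTT is still infinite,
   so the delay increase check cannot trigger.
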
   Smaller values of CSS_ROUNDS may miss detecting jitter and larger
   values may limit performance.

   An implementation SHOULD use HyStart++ only for the initial slow
   start (when ssthresh is still at its initial, arbitrarily high value
   per [RFC5681]) and fall back to using traditional slow start for the
   remainder of the connection lifetime. This is acceptable because
   subsequent slow starts will use the discovered ssthresh value to exit
   slow start and avoid the overshoot problem. An implementation MAY
   use HyStart++ to grow the restart window ([RFC5681]) after a long
   idle period.

5.  Deployments and Performance Evaluations

   As of the time of writing, HyStart++ has been enabled by default for
   all TCP connections in Windows for two years. The original HyStart
   has been enabled by default for all TCP connections in Linux TCP for
   a decade.

   In lab measurements with Windows TCP, HyStart++ shows both goodput
   improvements as well as reductions in packet loss and
   retransmissions. For example, across a variety of tests on a 100
   Mbps link with a bottleneck buffer size equal to the bandwidth-delay
   product, HyStart++ reduces bytes retransmitted by 50% and
   retransmission timeouts by 36%.

   In an A/B test across a large Windows device population, out of 52
   billion TCP connections, 0.7% of connections move from 1 RTO to 0
   RTOs and another 0.7% of connections move from 2 RTOs to 1 RTO with
   HyStart++. This test did not focus on send-heavy connections, and
   the impact on send-heavy connections is likely much higher. We plan
   to conduct more such production experiments to gather more data in
   the future.

6.  Security Considerations

   HyStart++ enhances slow start and inherits the general security
   considerations discussed in [RFC5681].

7.  IANA Considerations

   This document has no actions for IANA.

8.  References

8.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC3465]  Allman, M., "TCP Congestion Control with Appropriate Byte
              Counting (ABC)", RFC 3465, DOI 10.17487/RFC3465, February
              2003.

   [RFC5681]  Allman, M., Paxson, V., and E. Blanton, "TCP Congestion
              Control", RFC 5681, DOI 10.17487/RFC5681, September 2009.

8.2.  Informative References

   [HyStart]  Ha, S. and I. Rhee, "Hybrid Slow Start for High-Bandwidth
              and Long-Distance Networks",
              DOI 10.1145/1851182.1851192, International Workshop on
              Protocols for Fast Long-Distance Networks, 2008.

Authors' Addresses

   Praveen Balasubramanian
   Microsoft
   One Microsoft Way
   Redmond, WA 98052
   USA

   Phone: +1 425 538 2782
   Email: pravb@microsoft.com


   Yi Huang
   Microsoft

   Phone: +1 425 703 0447
   Email: huanyi@microsoft.com


   Matt Olson
   Microsoft

   Phone: +1 425 538 8598
   Email: maolson@microsoft.com