idnits 2.17.1

draft-ietf-tcpm-hystartplusplus-04.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (23 January 2022) is 824 days in the past.  Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Missing Reference: 'RFC1191' is mentioned on line 101, but not defined

  == Missing Reference: 'RFC4821' is mentioned on line 101, but not defined

  == Missing Reference: 'RFC1122' is mentioned on line 109, but not defined

  ** Downref: Normative reference to an Experimental RFC: RFC 3465

  -- Obsolete informational reference (is this intentional?): RFC 8312
     (Obsoleted by RFC 9438)

     Summary: 1 error (**), 0 flaws (~~), 4 warnings (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

Network Working Group                                 P. Balasubramanian
Internet-Draft                                                  Y. Huang
Intended status: Standards Track                                M.
Olson
Expires: 27 July 2022                                          Microsoft
                                                         23 January 2022

                 HyStart++: Modified Slow Start for TCP
                   draft-ietf-tcpm-hystartplusplus-04

Abstract

   This document describes HyStart++, a simple modification to the slow
   start phase of TCP congestion control algorithms.  Traditional slow
   start can overshoot the ideal send rate in many cases, causing high
   packet loss and poor performance.  HyStart++ uses a delay increase
   heuristic to find an exit point before possible overshoot.  It also
   adds a mitigation to prevent jitter from causing premature slow start
   exit.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on 27 July 2022.

Copyright Notice

   Copyright (c) 2022 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Revised BSD License text as described in Section 4.e of the
   Trust Legal Provisions and are provided without warranty as described
   in the Revised BSD License.

Table of Contents

   1.  Introduction
   2.  Terminology
   3.  Definitions
   4.  HyStart++ Algorithm
     4.1.  Summary
     4.2.  Algorithm Details
     4.3.  Tuning constants
   5.  Deployments and Performance Evaluations
   6.  Security Considerations
   7.  IANA Considerations
   8.  References
     8.1.  Normative References
     8.2.  Informative References
   Authors' Addresses

1.  Introduction

   [RFC5681] describes the slow start congestion control algorithm for
   TCP.  The slow start algorithm is used when the congestion window
   (cwnd) is less than the slow start threshold (ssthresh).  During slow
   start, in the absence of packet loss signals, TCP increases cwnd
   exponentially to probe the network capacity.  This fast growth can
   overshoot the ideal sending rate and cause significant packet loss
   which cannot always be recovered efficiently.

   HyStart++ uses delay increase as a signal to exit slow start before
   potential packet loss occurs as a result of overshoot.  This is one
   of two algorithms specified in [HyStart].  After the slow start exit,
   a novel Conservative Slow Start (CSS) phase is used to determine
   whether the slow start exit was premature and, if so, to resume slow
   start.  This mitigation improves performance in the presence of
   jitter.
   HyStart++ reduces packet loss and retransmissions, and improves
   goodput in lab measurements and real-world deployments.

2.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

3.  Definitions

   We repeat here some definitions from [RFC5681] to aid the reader.

   SENDER MAXIMUM SEGMENT SIZE (SMSS): The SMSS is the size of the
   largest segment that the sender can transmit.  This value can be
   based on the maximum transmission unit of the network, the path MTU
   discovery [RFC1191, RFC4821] algorithm, RMSS (see next item), or
   other factors.  The size does not include the TCP/IP headers and
   options.

   RECEIVER MAXIMUM SEGMENT SIZE (RMSS): The RMSS is the size of the
   largest segment the receiver is willing to accept.  This is the value
   specified in the MSS option sent by the receiver during connection
   startup.  Or, if the MSS option is not used, it is 536 bytes
   [RFC1122].  The size does not include the TCP/IP headers and options.

   RECEIVER WINDOW (rwnd): The most recently advertised receiver window.

   CONGESTION WINDOW (cwnd): A TCP state variable that limits the amount
   of data a TCP can send.  At any given time, a TCP MUST NOT send data
   with a sequence number higher than the sum of the highest
   acknowledged sequence number and the minimum of cwnd and rwnd.

4.  HyStart++ Algorithm

4.1.  Summary

   [HyStart] specifies two algorithms (a "Delay Increase" algorithm and
   an "Inter-Packet Arrival" algorithm) to be run in parallel to detect
   that the sending rate has reached capacity.  In practice, the
   Inter-Packet Arrival algorithm does not perform well and is not able
   to detect congestion early, primarily due to ACK compression.
   The idea of the Delay Increase algorithm is to look for spikes in RTT
   (round-trip time), which suggest that the bottleneck buffer is
   filling up.

   In HyStart++, a TCP sender uses traditional slow start and then uses
   the "Delay Increase" algorithm to trigger an exit from slow start.
   But instead of going straight from slow start to congestion
   avoidance, the sender spends a number of RTTs in a Conservative Slow
   Start (CSS) phase to determine whether the exit from slow start was
   premature.  During CSS, the congestion window is grown exponentially
   as in regular slow start, but with a smaller exponential base,
   resulting in less aggressive growth.  If the RTT decreases during
   CSS, the sender concludes that the RTT spike was not related to
   congestion caused by the connection sending at a rate greater than
   the ideal send rate, and the connection resumes slow start.  If the
   RTT inflation persists throughout CSS, the connection enters
   congestion avoidance.

4.2.  Algorithm Details

   For the pseudocode, we assume that Appropriate Byte Counting (as
   described in [RFC3465]) is in use and L is the cwnd increase limit as
   discussed in [RFC3465].
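
   As a rough illustration of the "smaller exponential base" mentioned
   in Section 4.1, the Python sketch below compares how far cwnd grows
   in one round of standard slow start and of CSS.  It assumes an
   idealized sender: every in-flight segment is acknowledged once per
   round, each ACK covers exactly one SMSS (so the ABC limit L * SMSS
   never binds), and the divisor is 4, the value recommended in
   Section 4.3.  The function name and the segment size are
   illustrative and not part of this specification.

```python
def grow_one_round(cwnd: int, smss: int, divisor: int = 1) -> int:
    """Return cwnd after one round in which every in-flight byte is ACKed.

    Assumes one ACK per full-sized segment, so N == SMSS for each ACK and
    the ABC limit L * SMSS never binds.  divisor=1 models standard slow
    start; divisor=CSS_GROWTH_DIVISOR models CSS.
    """
    acks = cwnd // smss          # ACKs arriving during this round
    for _ in range(acks):
        cwnd += smss // divisor  # per-ACK increase: N / divisor
    return cwnd


SMSS = 1000                      # illustrative segment size in bytes
start = 100 * SMSS

print(grow_one_round(start, SMSS) / start)             # 2.0  (slow start)
print(grow_one_round(start, SMSS, divisor=4) / start)  # 1.25 (CSS)
```

   Under these assumptions the window doubles per round in slow start
   but grows by only a quarter per round in CSS.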

   lastRoundMinRTT and currentRoundMinRTT are initialized to infinity at
   initialization time.

   HyStart++ measures rounds using sequence numbers, as follows:

      Define windowEnd as a sequence number initialized to SND.UNA

      When windowEnd is ACKed, the current round ends and windowEnd is
      set to SND.NXT

   At the start of each round during standard slow start ([RFC5681]) and
   CSS:

      lastRoundMinRTT = currentRoundMinRTT

      currentRoundMinRTT = infinity

      rttSampleCount = 0

   For each arriving ACK in slow start, where N is the number of
   previously unacknowledged bytes acknowledged in the arriving ACK:

      Update the cwnd

      -  cwnd = cwnd + min (N, L * SMSS)

      Keep track of minimum observed RTT

      -  currentRoundMinRTT = min(currentRoundMinRTT, currRTT)

      -  where currRTT is the RTT sampled from the latest incoming ACK

      -  rttSampleCount += 1

      For rounds where N_RTT_SAMPLE RTT samples have been obtained and
      currentRoundMinRTT and lastRoundMinRTT are valid, check if delay
      increase triggers slow start exit

      -  if (rttSampleCount >= N_RTT_SAMPLE AND currentRoundMinRTT !=
         infinity AND lastRoundMinRTT != infinity)

         o  RttThresh = clamp(MIN_RTT_THRESH, lastRoundMinRTT / 8,
            MAX_RTT_THRESH)

         o  if (currentRoundMinRTT >= (lastRoundMinRTT + RttThresh))

            +  cssBaselineMinRtt = currentRoundMinRTT

            +  exit slow start and enter CSS

   CSS lasts at most CSS_ROUNDS rounds.  If the transition into CSS
   happens in the middle of a round, that partial round counts towards
   the limit.
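
   The slow start steps above can be sketched in Python as follows.
   This is a minimal illustration under stated assumptions, not a
   reference implementation: the class and method names are
   hypothetical, RTTs are in milliseconds, sequence numbers are plain
   integers with no wraparound, the ABC limit L defaults to 2 purely
   for illustration, and the ACK that covers windowEnd is counted in
   the round it closes.  The constants are the values recommended in
   Section 4.3.  CSS itself is not modeled.

```python
import math

# Values recommended in Section 4.3; RTTs in this sketch are in milliseconds.
MIN_RTT_THRESH = 4
MAX_RTT_THRESH = 16
N_RTT_SAMPLE = 8


class HyStartSlowStart:
    """Slow start with the delay-increase check; CSS is not modeled here."""

    def __init__(self, cwnd, smss, snd_una, abc_limit=2):
        self.cwnd = cwnd
        self.smss = smss
        self.abc_limit = abc_limit          # "L"; 2 is an illustrative default
        self.last_round_min_rtt = math.inf
        self.current_round_min_rtt = math.inf
        self.rtt_sample_count = 0
        self.window_end = snd_una           # sequence number that ends the round
        self.css_baseline_min_rtt = math.inf

    def _start_round(self, snd_nxt):
        """Roll the per-round state over at the start of a new round."""
        self.window_end = snd_nxt
        self.last_round_min_rtt = self.current_round_min_rtt
        self.current_round_min_rtt = math.inf
        self.rtt_sample_count = 0

    def on_ack(self, ack_seq, acked_bytes, rtt, snd_nxt):
        """Handle one ACK; return True if slow start should exit into CSS."""
        # Update the cwnd (Appropriate Byte Counting, capped at L * SMSS).
        self.cwnd += min(acked_bytes, self.abc_limit * self.smss)

        # Keep track of the minimum RTT observed in this round.
        self.current_round_min_rtt = min(self.current_round_min_rtt, rtt)
        self.rtt_sample_count += 1

        exit_to_css = False
        if (self.rtt_sample_count >= N_RTT_SAMPLE
                and self.current_round_min_rtt != math.inf
                and self.last_round_min_rtt != math.inf):
            # clamp(MIN_RTT_THRESH, lastRoundMinRTT / 8, MAX_RTT_THRESH)
            rtt_thresh = max(MIN_RTT_THRESH,
                             min(self.last_round_min_rtt / 8, MAX_RTT_THRESH))
            if self.current_round_min_rtt >= self.last_round_min_rtt + rtt_thresh:
                self.css_baseline_min_rtt = self.current_round_min_rtt
                exit_to_css = True

        # Assumption: the ACK covering window_end closes the round after its
        # own sample is counted; sequence-number wraparound is ignored.
        if ack_seq >= self.window_end:
            self._start_round(snd_nxt)
        return exit_to_css
```

   A caller would invoke on_ack() for every ACK of new data while in
   slow start and switch to the CSS processing described next once it
   returns True.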

   For each arriving ACK in CSS, where N is the number of previously
   unacknowledged bytes acknowledged in the arriving ACK:

      Update the cwnd

      -  cwnd = cwnd + (min (N, L * SMSS) / CSS_GROWTH_DIVISOR)

      Keep track of minimum observed RTT

      -  currentRoundMinRTT = min(currentRoundMinRTT, currRTT)

      -  where currRTT is the sampled RTT from the incoming ACK

      -  rttSampleCount += 1

      For CSS rounds where N_RTT_SAMPLE RTT samples have been obtained,
      check if the current round's minRTT drops below the baseline,
      indicating that the HyStart exit was spurious.

      -  if (currentRoundMinRTT < cssBaselineMinRtt)

         o  cssBaselineMinRtt = infinity

         o  resume slow start including HyStart++

   If CSS_ROUNDS rounds are complete, enter congestion avoidance.

   *  ssthresh = cwnd

   If loss or ECN-marking is observed at any time during standard slow
   start or CSS, enter congestion avoidance.

   *  ssthresh = cwnd

4.3.  Tuning constants

   It is RECOMMENDED that a HyStart++ implementation use the following
   constants:

   *  MIN_RTT_THRESH = 4 msec

   *  MAX_RTT_THRESH = 16 msec

   *  N_RTT_SAMPLE = 8

   *  CSS_GROWTH_DIVISOR = 4

   *  CSS_ROUNDS = 5

   These constants have been determined with lab measurements and
   real-world deployments.  An implementation MAY tune them for
   different network characteristics.

   The delay increase sensitivity is determined by MIN_RTT_THRESH and
   MAX_RTT_THRESH.  Smaller values of MIN_RTT_THRESH may cause spurious
   exits from slow start.  Larger values of MAX_RTT_THRESH may result in
   slow start not exiting until loss is encountered for connections on
   large RTT paths.

   A TCP implementation is required to take at least one RTT sample each
   round.  Using lower values of N_RTT_SAMPLE will lower the accuracy of
   the measured RTT for the round; higher values will improve accuracy
   at the cost of more processing.

   CSS_GROWTH_DIVISOR MUST be at least 2.
   A value of 1 results in the same aggressive behavior as regular slow
   start.  Values larger than 4 will cause the algorithm to be less
   aggressive and may be less performant.

   Smaller values of CSS_ROUNDS may miss detecting jitter and larger
   values may limit performance.

   An implementation SHOULD use HyStart++ only for the initial slow
   start (when ssthresh is at its initial, arbitrarily high value per
   [RFC5681]) and fall back to using traditional slow start for the
   remainder of the connection lifetime.  This is acceptable because
   subsequent slow starts will use the discovered ssthresh value to exit
   slow start and avoid the overshoot problem.  An implementation MAY
   use HyStart++ to grow the restart window ([RFC5681]) after a long
   idle period.

5.  Deployments and Performance Evaluations

   As of the time of writing, HyStart++ as described in draft versions
   01 through 04 has been enabled by default for all TCP connections in
   the Windows operating system for over three years.  The original
   HyStart has been enabled by default for all TCP connections in the
   Linux operating system using the default congestion control module
   CUBIC ([RFC8312]) for a decade.

   In lab measurements with Windows TCP, HyStart++ shows both goodput
   improvements and reductions in packet loss and retransmissions.  For
   example, across a variety of tests on a 100 Mbps link with a
   bottleneck buffer size equal to the bandwidth-delay product,
   HyStart++ reduces bytes retransmitted by 50% and retransmission
   timeouts by 36%.

   In an A/B test for HyStart++ draft 01 across a large Windows device
   population, out of 52 billion TCP connections, 0.7% of connections
   move from 1 RTO to 0 RTOs and another 0.7% of connections move from 2
   RTOs to 1 RTO with HyStart++.  This test did not focus on send-heavy
   connections, and the impact on send-heavy connections is likely much
   higher.
   We plan to conduct more such production experiments to gather more
   data in the future.

6.  Security Considerations

   HyStart++ enhances slow start and inherits the general security
   considerations discussed in [RFC5681].

7.  IANA Considerations

   This document has no actions for IANA.

8.  References

8.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC3465]  Allman, M., "TCP Congestion Control with Appropriate Byte
              Counting (ABC)", RFC 3465, DOI 10.17487/RFC3465,
              February 2003.

   [RFC5681]  Allman, M., Paxson, V., and E. Blanton, "TCP Congestion
              Control", RFC 5681, DOI 10.17487/RFC5681, September 2009.

8.2.  Informative References

   [HyStart]  Ha, S. and I. Rhee, "Hybrid Slow Start for High-Bandwidth
              and Long-Distance Networks",
              DOI 10.1145/1851182.1851192, International Workshop on
              Protocols for Fast Long-Distance Networks, 2008.

   [RFC8312]  Rhee, I., Xu, L., Ha, S., Zimmermann, A., Eggert, L., and
              R. Scheffenegger, "CUBIC for Fast Long-Distance Networks",
              RFC 8312, DOI 10.17487/RFC8312, February 2018.

Authors' Addresses

   Praveen Balasubramanian
   Microsoft
   One Microsoft Way
   Redmond, WA 98052
   United States of America

   Phone: +1 425 538 2782
   Email: pravb@microsoft.com

   Yi Huang
   Microsoft

   Phone: +1 425 703 0447
   Email: huanyi@microsoft.com

   Matt Olson
   Microsoft

   Phone: +1 425 538 8598
   Email: maolson@microsoft.com