idnits 2.17.1

draft-balasubramanian-tcpm-hystartplusplus-02.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (January 22, 2020) is 1555 days in the past.  Is this
     intentional?

  Checking references for intended status: Informational
  ----------------------------------------------------------------------------

  == Missing Reference: 'RFC1191' is mentioned on line 113, but not defined

  == Missing Reference: 'RFC4821' is mentioned on line 113, but not defined

  == Missing Reference: 'RFC1122' is mentioned on line 121, but not defined

  ** Obsolete normative reference: RFC 793 (Obsoleted by RFC 9293)

  -- Obsolete informational reference (is this intentional?): RFC 8312
     (Obsoleted by RFC 9438)

     Summary: 1 error (**), 0 flaws (~~), 4 warnings (==), 2 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

2   Network Working Group                                 P. Balasubramanian
3   Internet-Draft                                                  Y. Huang
4   Intended status: Informational                                  M. Olson
5   Expires: July 25, 2020                                         Microsoft
6                                                           January 22, 2020

8                    HyStart++: Modified Slow Start for TCP
9                draft-balasubramanian-tcpm-hystartplusplus-02

11  Abstract

13     This informational memo describes HyStart++, a simple modification to
14     the slow start phase of TCP congestion control algorithms.  HyStart++
15     combines the use of one variant of HyStart and Limited Slow Start
16     (LSS) to prevent overshooting of the ideal sending rate value, while
17     also mitigating poor performance which can result from false
18     positives when HyStart is used alone.  This memo also describes the
19     details of the current implementation in the Windows operating
20     system.

22  Status of This Memo

24     This Internet-Draft is submitted in full conformance with the
25     provisions of BCP 78 and BCP 79.

27     Internet-Drafts are working documents of the Internet Engineering
28     Task Force (IETF).  Note that other groups may also distribute
29     working documents as Internet-Drafts.  The list of current Internet-
30     Drafts is at https://datatracker.ietf.org/drafts/current/.

32     Internet-Drafts are draft documents valid for a maximum of six months
33     and may be updated, replaced, or obsoleted by other documents at any
34     time.  It is inappropriate to use Internet-Drafts as reference
35     material or to cite them other than as "work in progress."

37     This Internet-Draft will expire on July 25, 2020.

39  Copyright Notice

41     Copyright (c) 2020 IETF Trust and the persons identified as the
42     document authors.  All rights reserved.

44     This document is subject to BCP 78 and the IETF Trust's Legal
45     Provisions Relating to IETF Documents
46     (https://trustee.ietf.org/license-info) in effect on the date of
47     publication of this document.  Please review these documents
48     carefully, as they describe your rights and restrictions with respect
49     to this document.  Code Components extracted from this document must
50     include Simplified BSD License text as described in Section 4.e of
51     the Trust Legal Provisions and are provided without warranty as
52     described in the Simplified BSD License.

54  Table of Contents

56     1.  Introduction
57     2.  Terminology
58     3.  Definitions
59     4.  HyStart++ Algorithm
60       4.1.  Use of HyStart Delay Increase and Limited Slow Start
61       4.2.  Algorithm Details
62       4.3.  Constants used and tuning
63     5.  Security Considerations
64     6.  IANA Considerations
65     7.  Acknowledgements
66     8.  References
67       8.1.  Normative References
68       8.2.  Informative References
69     Authors' Addresses

71  1.  Introduction

73     [RFC0793] and [RFC5681] describe the slow start mechanism for TCP.
74     The slow start algorithm is used when the congestion window (cwnd) is
75     less than the slow start threshold (ssthresh).  During slow start, a
76     TCP increments cwnd by at most SMSS bytes per ACK.  In the absence of
77     packet loss signals, slow start effectively doubles the congestion
78     window each round trip time.

80     While traditional TCP slow start can ramp up very quickly, it
81     frequently overshoots the ideal sending rate and causes a lot of
82     unnecessary packet drops.  TCP has several mechanisms for loss
83     recovery, but they are only effective for moderate loss.  When these
84     techniques are unable to recover lost packets, a last-resort
85     retransmission timeout (RTO) is used to trigger packet recovery.  In
86     most operating systems, the minimum RTO is set to a large value (200
87     ms or 300 ms) to prevent spurious timeouts.  This results in a long
88     idle time which drastically impairs flow completion times.

90     HyStart++ adds delay increase as a signal to exit slow start before
91     any packet loss occurs.  This is one of two algorithms specified in
92     [HyStart].  After the HyStart delay algorithm finds an exit point,
93     LSS is used for further congestion window increases until the first
94     packet loss occurs.

96     This document describes HyStart++ as implemented in the Microsoft
97     Windows operating system.  HyStart++ is widely deployed on the public
98     Internet.  Precise documentation of running code enables follow-up
99     IETF Experimental or Standards Track RFCs.  It also enables other
100    implementations and sharing of results for various workloads.

102 2.  Terminology

104    The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
105    "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
106    document are to be interpreted as described in [RFC2119].

108 3.  Definitions

110    SENDER MAXIMUM SEGMENT SIZE (SMSS): The SMSS is the size of the
111    largest segment that the sender can transmit.  This value can be
112    based on the maximum transmission unit of the network, the path MTU
113    discovery [RFC1191, RFC4821] algorithm, RMSS (see next item), or
114    other factors.  The size does not include the TCP/IP headers and
115    options.

117    RECEIVER MAXIMUM SEGMENT SIZE (RMSS): The RMSS is the size of the
118    largest segment the receiver is willing to accept.  This is the value
119    specified in the MSS option sent by the receiver during connection
120    startup.  Or, if the MSS option is not used, it is 536 bytes
121    [RFC1122].  The size does not include the TCP/IP headers and options.

123    CONGESTION WINDOW (cwnd): A TCP state variable that limits the amount
124    of data a TCP can send.  At any given time, a TCP MUST NOT send data
125    with a sequence number higher than the sum of the highest
126    acknowledged sequence number and the minimum of cwnd and rwnd.

128 4.  HyStart++ Algorithm

130 4.1.  Use of HyStart Delay Increase and Limited Slow Start

132    [HyStart] specifies two algorithms (a "Delay Increase" algorithm and
133    an "Inter-Packet Arrival" algorithm) to be run in parallel to detect
134    that the sending rate has reached capacity.  In practice, the Inter-
135    Packet Arrival algorithm does not perform well and is not able to
136    detect congestion early, primarily due to ACK compression.  The idea
137    of the Delay Increase algorithm is to look for RTT spikes, which
138    suggest that the bottleneck buffer is filling up.

140    After the HyStart "Delay Increase" algorithm triggers an exit from
141    slow start, LSS (described in [RFC3742]) is used to increase cwnd
142    until the first packet loss occurs.  LSS is used because the HyStart
143    exit is often premature as a result of RTT fluctuations or transient
144    queue buildup.  LSS grows cwnd quickly, but much more slowly than
145    traditional slow start.  LSS helps avoid massive packet losses and
146    subsequent time spent in loss recovery or retransmission timeout.

148 4.2.  Algorithm Details

150    A round is chosen to be approximately the Round-Trip Time (RTT).
151    Round can be approximated using sequence numbers as follows:

153       Define windowEnd as a sequence number initialized to SND.UNA

155       When windowEnd is ACKed, the current round ends and windowEnd is
156       set to SND.NXT

158    At the start of each round during slow start:

160       lastRoundMinRTT = currentRoundMinRTT

162       currentRoundMinRTT = infinity

164       rttSampleCount = 0

166    For each arriving ACK in slow start, where N is the number of
167    previously unacknowledged bytes acknowledged in the arriving ACK:

169       Update the cwnd

171          cwnd = cwnd + min(N, SMSS)

173       Keep track of minimum observed RTT

175          currentRoundMinRTT = min(currentRoundMinRTT, currRTT)

177          where currRTT is the measured RTT based on the incoming ACK

179          rttSampleCount += 1

181       For rounds where cwnd is at or higher than LOW_CWND and
182       N_RTT_SAMPLE RTT samples have been obtained, check if delay
183       increase triggers slow start exit

185          if (cwnd >= (LOW_CWND * SMSS) AND rttSampleCount >=
186          N_RTT_SAMPLE)

188             RttThresh = clamp(MIN_RTT_THRESH, lastRoundMinRTT / 8,
189             MAX_RTT_THRESH)

191             if (currentRoundMinRTT >= (lastRoundMinRTT + RttThresh))
192                ssthresh = cwnd

194                exit slow start and enter LSS

196    For each arriving ACK in LSS, where N is the number of previously
197    unacknowledged bytes acknowledged in the arriving ACK:

199       K = cwnd / (LSS_DIVISOR * ssthresh)

201       cwnd = max(cwnd + (N / K), CA_cwnd())

203    CA_cwnd() denotes the cwnd that a congestion control algorithm would
204    have increased to if congestion avoidance started instead of LSS.
205    LSS grows cwnd very fast but for long-lived flows in high BDP
206    networks, the congestion avoidance algorithm could increase cwnd much
207    faster.  For example, CUBIC congestion avoidance [RFC8312] in the
208    convex region can ramp up cwnd rapidly.  Taking the max can help
209    improve performance when exiting slow start prematurely.

211    HyStart++ ends when congestion is observed.

213 4.3.  Constants used and tuning

215    The Windows operating system implementation of HyStart++ uses the
216    following constants:

218       LOW_CWND = 16

220       MIN_RTT_THRESH = 4 msec

222       MAX_RTT_THRESH = 16 msec

224       LSS_DIVISOR = 0.25

226       N_RTT_SAMPLE = 8

228    An implementation MAY experiment with these constants and tune them
229    for different network characteristics.  The Windows operating system
230    implementation uses the same values for all connections.  The maximum
231    value of LSS_DIVISOR SHOULD NOT exceed 0.5 which is the value
232    recommended in [RFC3742].

234    An implementation MAY choose to use HyStart++ for all slow starts
235    including those after a retransmission timeout or a long idle
236    period.  The Windows operating system implementation uses HyStart++
237    only for the initial slow start and uses traditional slow start for
238    subsequent ones.  This is acceptable because subsequent slow starts
239    will use the discovered ssthresh value to exit slow start.

241 5.  Security Considerations

243    HyStart++ enhances slow start and inherits the general security
244    considerations discussed in [RFC5681].

246 6.  IANA Considerations

248    This document has no actions for IANA.

250 7.  Acknowledgements

252    Neal Cardwell suggested the idea for using the maximum of cwnd value
253    computed by LSS and congestion avoidance after exiting slow start.

255 8.  References

257 8.1.  Normative References

259    [RFC0793]  Postel, J., "Transmission Control Protocol", STD 7,
260               RFC 793, DOI 10.17487/RFC0793, September 1981.

263    [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
264               Requirement Levels", BCP 14, RFC 2119,
265               DOI 10.17487/RFC2119, March 1997.

268    [RFC3742]  Floyd, S., "Limited Slow-Start for TCP with Large
269               Congestion Windows", RFC 3742, DOI 10.17487/RFC3742, March
270               2004.

272    [RFC5681]  Allman, M., Paxson, V., and E. Blanton, "TCP Congestion
273               Control", RFC 5681, DOI 10.17487/RFC5681, September 2009.

276 8.2.  Informative References

278    [HyStart]  Ha, S. and I. Rhee, "Hybrid Slow Start for High-Bandwidth
279               and Long-Distance Networks",
280               DOI 10.1145/1851182.1851192, International Workshop on
281               Protocols for Fast Long-Distance Networks, 2008.

285    [RFC8312]  Rhee, I., Xu, L., Ha, S., Zimmermann, A., Eggert, L., and
286               R. Scheffenegger, "CUBIC for Fast Long-Distance Networks",
287               RFC 8312, DOI 10.17487/RFC8312, February 2018.

290 Authors' Addresses

292    Praveen Balasubramanian
293    Microsoft
294    One Microsoft Way
295    Redmond, WA 98052
296    USA

298    Phone: +1 425 538 2782
299    Email: pravb@microsoft.com

301    Yi Huang
302    Microsoft

304    Phone: +1 425 703 0447
305    Email: huanyi@microsoft.com

307    Matt Olson
308    Microsoft

310    Phone: +1 425 538 8598
311    Email: maolson@microsoft.com
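
Illustrative sketch (not part of the memo)

   The per-ACK steps of Section 4.2 can be sketched in Python as shown
   below.  This is a minimal sketch under stated assumptions, not the
   Windows implementation: the class name HyStartPP, its method names,
   and the ca_cwnd argument (a caller-supplied stand-in for CA_cwnd())
   are hypothetical.  RTT values are assumed to be in milliseconds, and
   round detection via windowEnd is left to the caller, which is
   expected to invoke start_round() at each round boundary.

```python
import math

# Constants from Section 4.3 (values used by the Windows implementation).
LOW_CWND = 16            # in segments
MIN_RTT_THRESH = 4.0     # msec
MAX_RTT_THRESH = 16.0    # msec
LSS_DIVISOR = 0.25
N_RTT_SAMPLE = 8


class HyStartPP:
    """Hypothetical holder for the per-connection state of Section 4.2."""

    def __init__(self, smss, initial_cwnd):
        self.smss = smss                  # bytes
        self.cwnd = initial_cwnd          # bytes
        self.ssthresh = math.inf
        self.in_lss = False
        self.last_round_min_rtt = math.inf
        self.current_round_min_rtt = math.inf
        self.rtt_sample_count = 0

    def start_round(self):
        """Called by the caller when windowEnd is ACKed (new round)."""
        self.last_round_min_rtt = self.current_round_min_rtt
        self.current_round_min_rtt = math.inf
        self.rtt_sample_count = 0

    def on_ack(self, acked_bytes, rtt_ms, ca_cwnd=0):
        """Process one ACK; ca_cwnd is an assumed stand-in for CA_cwnd()."""
        if self.in_lss:
            # Limited Slow Start growth, bounded below by congestion avoidance.
            k = self.cwnd / (LSS_DIVISOR * self.ssthresh)
            self.cwnd = max(self.cwnd + acked_bytes / k, ca_cwnd)
            return

        # Standard slow start growth plus RTT bookkeeping.
        self.cwnd += min(acked_bytes, self.smss)
        self.current_round_min_rtt = min(self.current_round_min_rtt, rtt_ms)
        self.rtt_sample_count += 1

        # Delay increase check, only with enough cwnd and RTT samples.
        if (self.cwnd >= LOW_CWND * self.smss
                and self.rtt_sample_count >= N_RTT_SAMPLE):
            rtt_thresh = min(max(self.last_round_min_rtt / 8, MIN_RTT_THRESH),
                             MAX_RTT_THRESH)
            if self.current_round_min_rtt >= self.last_round_min_rtt + rtt_thresh:
                self.ssthresh = self.cwnd
                self.in_lss = True        # exit slow start and enter LSS
```

   For example, with SMSS = 1000 bytes and an initial cwnd of 10000
   bytes, a first round of ten ACKs at 50 msec followed by a round of
   ACKs at 60 msec exits slow start on the eighth ACK of the second
   round, because 60 >= 50 + clamp(4, 50 / 8, 16).  At that point cwnd
   equals ssthresh, so K is 4 and each further full-sized ACK grows
   cwnd by SMSS / 4.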