idnits 2.17.1

draft-fairhurst-tcpm-newcwv-03.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  -- The draft header indicates that this document obsoletes RFC2861, but the
     abstract doesn't seem to directly say this.  It does mention RFC2861
     though, so this could be OK.

  -- The draft header indicates that this document updates RFC5681, but the
     abstract doesn't seem to mention this, which it should.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

     (Using the creation date from RFC5681, updated by this document, for
     RFC5378 checks: 2006-01-26)

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer.
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (June 06, 2012) is 4342 days in the past.  Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  ** Obsolete normative reference: RFC 2861 (Obsoleted by RFC 7661)

  ** Obsolete normative reference: RFC 2988 (Obsoleted by RFC 6298)

  ** Obsolete normative reference: RFC 3517 (Obsoleted by RFC 6675)

     Summary: 3 errors (**), 0 flaws (~~), 1 warning (==), 4 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

TCPM Working Group                                          G. Fairhurst
Internet-Draft                                           A. Sathiaseelan
Updates: 5681 (if approved)                       University of Aberdeen
Obsoletes: 2861 (if approved)                              June 06, 2012
Intended status: Standards Track
Expires: December 06, 2012

             Updating TCP to support Variable-Rate Traffic
                     draft-fairhurst-tcpm-newcwv-03

Abstract

   This document addresses issues that arise when TCP is used to support
   variable-rate traffic that exhibits periods where the transmission
   rate is limited by the application rather than the congestion window.
   It updates TCP to allow a TCP sender to restart quickly following
   either an idle or application-limited interval.  The method is
   expected to benefit variable-rate TCP applications, while also
   providing an appropriate response if congestion is experienced.

   It also evaluates TCP Congestion Window Validation, CWV, an IETF
   experimental specification defined in RFC 2861, and concludes that
   CWV sought to address important issues, but failed to deliver a
   widely used solution.  This document therefore recommends that the
   IETF should consider moving RFC 2861 from Experimental to Historic
   status, and replacing it with the current specification.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on December 06, 2012.

Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Reviewing experience with TCP-CWV
   3.  Terminology
   4.  An updated TCP response to idle and application-limited periods
       4.1.  A method for preserving cwnd in idle and
             application-limited periods
       4.2.  The nonvalidated phase
       4.3.  TCP congestion control during the nonvalidated phase
             4.3.1.  Adjustment at the end of the nonvalidated phase
             4.3.2.  Response to congestion in the nonvalidated phase
       4.4.  Determining a safe period to preserve cwnd
   5.  Security Considerations
   6.  IANA Considerations
   7.  Acknowledgments
   8.  References
       8.1.  Normative References
       8.2.  Informative References
   Authors' Addresses

1.  Introduction

   TCP is used to support a range of application behaviours.  The TCP
   congestion window (cwnd) controls the number of packets/bytes that a
   TCP flow may have in the network at any time.  A bulk application
   will always have data available to transmit.  The rate at which it
   sends is therefore limited by the maximum permitted by the receiver
   and congestion windows.  In contrast, a variable-rate application
   will experience periods when the sender is either idle or is unable
   to send at the maximum rate permitted by the cwnd.  This latter case
   is called application-limited.  The focus of this document is on the
   operation of TCP in such an idle or application-limited case.

   Standard TCP [RFC5681] requires the cwnd to be reset to the restart
   window (RW) when an application becomes idle.  [RFC2861] noted that
   this TCP behaviour was not always observed in current
   implementations.  Recent experiments [Bis08] confirm that this is
   still the case.

   Standard TCP does not control growth of the cwnd when a variable-rate
   TCP sender is application-limited.
   An application-limited sender may therefore grow a cwnd beyond that
   corresponding to the current transmit rate, resulting in a value that
   does not reflect current information about the state of the network
   path the flow is using.  Use of such an invalid cwnd may result in
   reduced application performance and/or could significantly contribute
   to network congestion.

   [RFC2861] proposed a solution to these issues in an experimental
   method known as Congestion Window Validation (CWV).  CWV was intended
   to help reduce cases where TCP accumulated an invalid cwnd.  The use
   of CWV with an application, and its drawbacks, are discussed in
   Section 2.

   Section 4 specifies an alternative to CWV that seeks to address the
   same issues, but does this in a way that is expected to mitigate the
   impact on an application that varies its transmission rate.  The
   method described applies to both an application-limited and an idle
   condition.

2.  Reviewing experience with TCP-CWV

   RFC 2861 described a simple modification to the TCP congestion
   control algorithm that decayed the cwnd after the transition to a
   "sufficiently-long" idle period.  This used the slow-start threshold
   (ssthresh) to save information about the previous value of the
   congestion window.  The approach relaxed the standard TCP behaviour
   [RFC5681] for an idle session, with the intention of improving
   application performance.  CWV also modified the behaviour for an
   application-limited session where a sender transmitted at a rate less
   than that allowed by cwnd.

   RFC 2861 has been implemented in some mainstream operating systems as
   the default behaviour [Bis08].  Analysis (e.g., [Bis10]) has shown
   that a TCP sender using CWV is able to use available capacity on a
   shared path after an idle period.  This can benefit some
   applications, especially over long-delay paths, when compared to
   slow-start restart specified by standard TCP.
   However, CWV would only benefit an application if the idle period
   were less than several Retransmission Time Out (RTO) intervals
   [RFC2988], since the behaviour would otherwise be the same as for
   standard TCP, which resets the cwnd to the RW after this period.

   Experience with CWV suggests that although CWV benefits the network
   in an application-limited scenario (reducing the probability of
   network congestion), the behaviour can be too conservative for many
   common variable-rate applications.  This mechanism therefore does not
   offer the desirable increase in application performance for
   variable-rate applications, and it is unclear whether applications
   actually use this mechanism in the general Internet.

   It is therefore concluded that CWV is often a poor solution for many
   variable-rate applications.  It has the correct motivation, but takes
   the wrong approach to solving this problem.

3.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

   The document assumes familiarity with the terminology of TCP
   congestion control [RFC5681].

4.  An updated TCP response to idle and application-limited periods

   This section proposes an update to the TCP congestion control
   behaviour during an idle or application-limited period.  The new
   method permits a TCP sender to preserve the cwnd when an application
   becomes idle for a period of time (set in this specification to five
   minutes).  This period, where actual usage is less than allowed by
   cwnd, is named the non-validated phase.  The method allows an
   application to resume transmission at a previous rate without
   incurring the delay of slow-start.
   However, if the TCP sender experiences congestion using the preserved
   cwnd, it is required to immediately reset the cwnd to an appropriate
   value specified by the method.  If a sender does not take advantage
   of the preserved cwnd within five minutes, the value of cwnd is
   reduced, ensuring the value then reflects the capacity that was
   recently actually used.

   The method requires that the TCP SACK option is enabled.  This allows
   the sender to select a cwnd following a congestion event that is
   based on the measured path capacity, better reflecting the fair
   share.  A similar approach was proposed by TCP Jump Start [Liu07], as
   a congestion response after more rapid opening of a TCP connection.

   It is expected that this update will satisfy the requirements of many
   variable-rate applications and at the same time provide an
   appropriate method for use in the Internet.  It also reduces the
   incentive for an application to send data simply to keep transport
   congestion state (this is sometimes known as "padding").

   The new method does not differentiate between times when the sender
   has become idle or application-limited.  This is partly a response to
   recognition that some applications wish to transmit at a variable
   rate, and that it can be hard to make a distinction between
   application-limited and idle behaviour.  This is expected to
   encourage applications and TCP stacks to use standards-based
   congestion control methods.  It may also encourage the use of
   long-lived connections where this offers benefit (such as persistent
   HTTP).

   The method is specified in the following subsections.

4.1.  A method for preserving cwnd in idle and application-limited
      periods

   The method described in this document updates [RFC5681].  Use of the
   method REQUIRES a TCP sender and the corresponding receiver to enable
   the TCP SACK option [RFC3517].

   [RFC5681] defines a variable, FlightSize, that indicates the amount
   of outstanding data in the network.  In [RFC5681] this is used during
   loss recovery, whereas in this method it is also used during normal
   data transfer.  A sender is not required to continuously track this
   value, but SHOULD measure the volume of data in the network with a
   sampling period of not less than one RTT.

4.2.  The nonvalidated phase

   The updated method creates a new TCP sender phase that captures
   whether the cwnd reflects a validated or non-validated value.  The
   phases are defined as:

   o  Validated phase: FlightSize >= (2/3)*cwnd.  This is the normal
      phase, where cwnd is expected to be an approximate indication of
      the capacity currently available along the network path, and the
      standard methods are used (currently [RFC5681]).

   o  Non-validated phase: FlightSize < (2/3)*cwnd.  This is the phase
      where the cwnd has a value based on a previous measurement of the
      available capacity, and the usage of this capacity has not been
      validated in the previous RTT.  That is, it is not known whether
      the cwnd reflects the capacity currently available along the
      network path.  The mechanisms to be used in this phase seek to
      determine whether any resumed rate remains safe for the Internet
      path, i.e., they quickly reduce the rate if the flow is known to
      induce congestion.  These mechanisms are specified in Section 4.3.

4.3.  TCP congestion control during the nonvalidated phase

   A TCP sender that enters the non-validated phase MUST preserve the
   cwnd (i.e., this neither grows nor reduces while the sender remains
   in this phase).  The phase is concluded after a fixed period of time
   (five minutes, as explained in Section 4.4) or when the sender
   transmits using the full cwnd (i.e., it is no longer
   application-limited).
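
   As an illustration only, the phase test of Section 4.2 might be
   sketched as below.  The function name is_validated and the use of
   byte counts for both arguments are assumptions made for this sketch;
   they are not defined by this document.  Integer arithmetic is used so
   that the comparison at exactly two-thirds of cwnd is not affected by
   floating-point rounding.

```python
def is_validated(flight_size: int, cwnd: int) -> bool:
    """Return True in the validated phase, False in the non-validated phase.

    Implements the Section 4.2 test FlightSize >= (2/3)*cwnd.  Both
    arguments are assumed to be byte counts (an assumption of this sketch).
    """
    # Multiply both sides by 3 so the comparison stays in integer arithmetic.
    return 3 * flight_size >= 2 * cwnd


# Example: with cwnd = 12000 bytes the boundary lies at 8000 bytes in flight.
print(is_validated(8000, 12000))  # True  (validated phase)
print(is_validated(7999, 12000))  # False (non-validated phase)
```

   While the test returns False, the sender is in the non-validated
   phase and the cwnd is held constant, as required above.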

   The behaviour in the non-validated phase is specified as:

   o  If the sender consumes all the available space within the cwnd
      (i.e., the remaining unused cwnd in bytes is less than one Sender
      Maximum Segment Size, SMSS), then the sender MUST exit the
      non-validated phase.

   o  If the sender receives an indication of congestion while in the
      non-validated phase (i.e., it detects loss or receives an Explicit
      Congestion Notification, ECN, mark [RFC3168]), the sender MUST
      exit the non-validated phase (reducing the cwnd as defined in
      Section 4.3.2).

   o  If the Retransmission Time Out (RTO) expires while in the
      non-validated phase, the sender MUST exit the non-validated phase.
      It then resumes using the Standard TCP RTO mechanism [RFC5681].
      (The resulting reduction of cwnd is appropriate, since any
      accumulated path history is considered unreliable.)

   The threshold value of cwnd required for the sender to enter the
   non-validated phase is intentionally different to that required to
   leave the phase.  This introduces hysteresis to avoid rapid
   oscillation between the phases.  Note that a change between phases
   does not significantly impact an application-limited sender, but
   serves to determine its behaviour if it substantially increases its
   transmission rate.

4.3.1.  Adjustment at the end of the nonvalidated phase

   During the non-validated phase, the sender may produce bursts of data
   of up to the cwnd in size.  While this is no different to standard
   TCP, it is desirable to control the maximum burst size, e.g., by
   setting a burst size limit, using a pacing algorithm, or some other
   method.

   A sender that remains in the non-validated phase for a period greater
   than five minutes is required to adjust its congestion control state.
   At the end of the non-validated phase, the sender MUST update cwnd:

      cwnd = max(FlightSize*2, IW).

   where IW is the TCP initial window [RFC5681].
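
   The exit conditions and the end-of-phase cwnd update above can be
   sketched as follows.  This is a minimal sketch under stated
   assumptions, not a definitive implementation: the function and
   parameter names are hypothetical, all sizes are assumed to be in
   bytes, and the "remaining unused cwnd" is assumed to be cwnd minus
   FlightSize, which the text does not state explicitly.

```python
def must_exit_non_validated(cwnd: int, flight_size: int, smss: int,
                            congestion_indicated: bool = False,
                            rto_expired: bool = False) -> bool:
    """Return True if any of the three exit conditions listed above holds."""
    # Assumption: "remaining unused cwnd" is taken to be cwnd - FlightSize.
    cwnd_fully_used = (cwnd - flight_size) < smss
    return cwnd_fully_used or congestion_indicated or rto_expired


def cwnd_at_end_of_phase(flight_size: int, iw: int) -> int:
    """cwnd = max(FlightSize*2, IW), applied when the five minutes elapse."""
    return max(2 * flight_size, iw)


# Illustrative values only (bytes): SMSS = 1460, IW = 3 * SMSS = 4380.
print(must_exit_non_validated(cwnd=14600, flight_size=14000, smss=1460))  # True
print(must_exit_non_validated(cwnd=14600, flight_size=5000, smss=1460))   # False
print(cwnd_at_end_of_phase(flight_size=3000, iw=4380))  # 6000
print(cwnd_at_end_of_phase(flight_size=1000, iw=4380))  # 4380
```

   The last call shows the role of IW as a floor: a sender that has
   been almost idle does not end the phase with a cwnd smaller than the
   initial window.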

   (This allows an application to continue to send at the currently
   utilised rate, and not incur delay should it increase to twice the
   utilised rate.)

   The sender MUST also reset ssthresh:

      ssthresh = max(ssthresh, 3*cwnd/4).

   (This adjustment of ssthresh ensures that the sender records that it
   has safely sustained the present rate.  The change is beneficial to
   application-limited flows that encounter occasional congestion, and
   could otherwise suffer an unwanted additional delay in recovering the
   transmission rate.)

   After completing this adjustment, the sender MAY re-enter the
   non-validated phase, if required (see Section 4.2).

4.3.2.  Response to congestion in the nonvalidated phase

   Reception of congestion feedback while in the non-validated phase is
   interpreted as an indication that it was inappropriate for the
   sender to use the preserved cwnd.  The sender is therefore required
   to quickly reduce the rate to avoid further congestion.  Since the
   cwnd does not have a validated value, a new cwnd value must be
   selected based on the utilised rate.

   A sender that detects a packet drop or receives an ECN-marked packet
   MUST calculate a safe cwnd, based on the volume of acknowledged data:

      cwnd = FlightSize - R.

   where R is the volume of data that was reported as unacknowledged by
   the SACK information.  This follows the method proposed for Jump
   Start [Liu07].

   At the end of the recovery phase, the TCP sender MUST reset the cwnd
   using the method below:

      cwnd = ((FlightSize - R)/2).

4.4.  Determining a safe period to preserve cwnd

   This section documents the rationale for selecting the maximum period
   that cwnd may be preserved.

   Limiting the period for which cwnd is preserved avoids the
   undesirable side effects that would result if the cwnd were to be
   preserved for an arbitrarily long period, which was a part of the
   problem that CWV originally attempted to address.
   The period for which a sender may safely preserve the cwnd is a
   function of the period for which a network path is expected to
   sustain the capacity reflected by cwnd.  There is no ideal choice for
   this time.

   The period of five minutes was chosen as a compromise that was larger
   than the idle intervals of common applications, but not significantly
   larger than the period for which the capacity of an Internet path may
   commonly be regarded as stable.  The capacity of wired networks is
   usually relatively stable for periods of several minutes, and load
   stability increases with the capacity.  This suggests that cwnd may
   be preserved for at least a few minutes.

   There are cases where the TCP throughput exhibits significant
   variability over a time less than five minutes.  Examples could
   include wireless topologies, where the TCP rate may fluctuate on the
   order of a few seconds as a consequence of medium access protocol
   instabilities.  Mobility changes may also impact TCP performance over
   short time scales.  Senders that observe such rapid changes in the
   path characteristic may also experience increased congestion with the
   new method; however, such variation would likely also impact TCP's
   behaviour when supporting interactive and bulk applications.

   Routing algorithms may modify the network path, disrupting the RTT
   measurement and changing the capacity available to a TCP connection;
   however, such changes do not often occur within a time frame of a few
   minutes.

   The value of five minutes is therefore expected to be sufficient for
   most current applications.  Simulation studies also suggest that for
   many practical applications, the performance using this value will
   not be significantly different to that observed using a non-standard
   method that does not reset the cwnd after idle.

   Finally, other TCP sender mechanisms have used a five-minute timer,
   and there could be simplifications in some implementations by reusing
   the same interval.

5.  Security Considerations

   General security considerations concerning TCP congestion control are
   discussed in [RFC5681].  This document describes an algorithm that
   updates one aspect of the congestion control procedures, and so the
   considerations described in RFC 5681 also apply to this algorithm.

6.  IANA Considerations

   There are no IANA considerations.

7.  Acknowledgments

   The authors acknowledge the contributions of Dr I Biswas and Dr R
   Secchi in supporting the evaluation of CWV and for their help in
   developing the mechanisms proposed in this draft.  We also
   acknowledge comments received from the Internet Congestion Control
   Research Group.

8.  References

8.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC2861]  Handley, M., Padhye, J. and S. Floyd, "TCP Congestion
              Window Validation", RFC 2861, June 2000.

   [RFC2988]  Paxson, V. and M. Allman, "Computing TCP's Retransmission
              Timer", RFC 2988, November 2000.

   [RFC3168]  Ramakrishnan, K., Floyd, S. and D. Black, "The Addition of
              Explicit Congestion Notification (ECN) to IP", RFC 3168,
              September 2001.

   [RFC3517]  Blanton, E., Allman, M., Fall, K. and L. Wang, "A
              Conservative Selective Acknowledgment (SACK)-based Loss
              Recovery Algorithm for TCP", RFC 3517, April 2003.

   [RFC5681]  Allman, M., Paxson, V. and E. Blanton, "TCP Congestion
              Control", RFC 5681, September 2009.

8.2.  Informative References

   [Bis08]    Biswas, I. and G. Fairhurst, "A Practical Evaluation of
              Congestion Window Validation Behaviour", 9th Annual
              Postgraduate Symposium in the Convergence of
              Telecommunications, Networking and Broadcasting (PGNet),
              Liverpool, UK, June 2008.

   [Bis10]    Biswas, I., Sathiaseelan, A., Secchi, R. and G. Fairhurst,
              "Analysing TCP for Bursty Traffic", Int'l J. of
              Communications, Network and System Sciences, 7(3), June
              2010.

   [Liu07]    Liu, D., Allman, M., Jin, S. and L. Wang, "Congestion
              Control without a Startup Phase", 5th International
              Workshop on Protocols for Fast Long-Distance Networks
              (PFLDnet), Los Angeles, California, USA, February 2007.

Authors' Addresses

   Godred Fairhurst
   University of Aberdeen
   School of Engineering
   Fraser Noble Building
   Aberdeen, Scotland  AB24 3UE
   UK

   Email: gorry@erg.abdn.ac.uk
   URI:   http://www.erg.abdn.ac.uk

   Arjuna Sathiaseelan
   University of Aberdeen
   School of Engineering
   Fraser Noble Building
   Aberdeen, Scotland  AB24 3UE
   UK

   Email: arjuna@erg.abdn.ac.uk
   URI:   http://www.erg.abdn.ac.uk