TCPM Working Group                                          G. Fairhurst
Internet-Draft                                           A. Sathiaseelan
Obsoletes: 2861 (if approved)                     University of Aberdeen
Updates: 5681 (if approved)                              August 06, 2012
Intended status: Standards Track
Expires: February 7, 2013

          Updating TCP to support Application-Limited Traffic
                    draft-fairhurst-tcpm-newcwv-04

Abstract

   This document addresses issues that arise when TCP is used to support
   traffic that exhibits periods where the transmission rate is limited
   by the application rather than the congestion window.  It updates TCP
   to allow a TCP sender to restart quickly following either an idle or
   application-limited interval.  The method is expected to benefit
   application-limited TCP applications, while also providing an
   appropriate response if congestion is experienced.

   It also evaluates TCP Congestion Window Validation, CWV, an IETF
   experimental specification defined in RFC 2861, and concludes that
   CWV sought to address important issues, but failed to deliver a
   widely used solution.  This document therefore proposes an update to
   the status of RFC 2861 by recommending it is moved from Experimental
   to Historic status, and that it is replaced by the current
   specification.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on February 7, 2013.

Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Reviewing experience with TCP-CWV
   3.  Terminology
   4.  An updated TCP response to idle and application-limited periods
     4.1.  A method for preserving cwnd in idle and
           application-limited periods.
     4.2.  The nonvalidated phase
     4.3.  TCP congestion control during the nonvalidated phase
       4.3.1.  Response to congestion in the nonvalidated phase
       4.3.2.  Adjustment at the end of the nonvalidated phase
   5.  Determining a safe period to preserve cwnd
   6.  Security Considerations
   7.  IANA Considerations
   8.  Acknowledgments
   9.  Other related work - Author Notes
   10. References
     10.1.  Normative References
     10.2.  Informative References
   Authors' Addresses

1.  Introduction

   TCP is used to support a range of application behaviours.  The TCP
   congestion window (cwnd) controls the number of packets/bytes that a
   TCP flow may have in the network at any time.  A bulk application
   will always have data available to transmit.  The rate at which it
   sends is therefore limited by the maximum permitted by the receiver
   and congestion windows.  In contrast, a rate-limited application will
   experience periods when the sender is either idle or is unable to
   send at the maximum rate permitted by the cwnd.  This latter case is
   called application-limited.  The focus of this document is on the
   operation of TCP in such an idle or application-limited case.

   Standard TCP [RFC5681] requires the cwnd to be reset to the restart
   window (RW) when an application becomes idle.  [RFC2861] noted that
   this TCP behaviour was not always observed in current
   implementations.  Recent experiments [Bis08] confirm this to still be
   the case.

   Standard TCP does not control growth of the cwnd when a TCP sender is
   application-limited.
   An application-limited sender may therefore grow a cwnd beyond that
   corresponding to the current transmit rate, resulting in a value that
   does not reflect current information about the state of the network
   path the flow is using.  Use of such an invalid cwnd may result in
   reduced application performance and/or could significantly contribute
   to network congestion.

   [RFC2861] proposed a solution to these issues in an experimental
   method known as Congestion Window Validation (CWV).  CWV was intended
   to help reduce cases where TCP accumulated an invalid cwnd.  The use
   and drawbacks of using CWV with an application are discussed in
   Section 2.

   Section 4 specifies an alternative to CWV that seeks to address the
   same issues, but does this in a way that is expected to mitigate the
   impact on an application that varies its transmission rate.  The
   method described applies to both an application-limited and an idle
   condition.

2.  Reviewing experience with TCP-CWV

   RFC 2861 described a simple modification to the TCP congestion
   control algorithm that decayed the cwnd after the transition to a
   "sufficiently-long" idle period.  This used the slow-start threshold
   (ssthresh) to save information about the previous value of the
   congestion window.  The approach relaxed the standard TCP behaviour
   [RFC5681] for an idle session, with the aim of improving application
   performance.  CWV also modified the behaviour for an
   application-limited session where a sender transmitted at a rate less
   than allowed by cwnd.

   RFC 2861 has been implemented in some mainstream operating systems as
   the default behaviour [Bis08].  Analysis (e.g. [Bis10]) has shown
   that a TCP sender using CWV is able to use available capacity on a
   shared path after an idle period.  This can benefit some
   applications, especially over long delay paths, when compared to
   slow-start restart specified by standard TCP.
   However, CWV would only benefit an application if the idle period
   were less than several Retransmission Time Out (RTO) intervals
   [RFC6298], since the behaviour would otherwise be the same as for
   standard TCP, which resets the cwnd to the RW after this period.

   Experience with CWV suggests that although CWV benefits the network
   in an application-limited scenario (reducing the probability of
   network congestion), the behaviour can be too conservative for many
   common rate-limited applications.  This mechanism therefore does not
   offer the desirable increase in application performance for
   rate-limited applications, and it is unclear whether applications
   actually use this mechanism in the general Internet.

   It is therefore concluded that CWV is often a poor solution for many
   rate-limited applications.  It has the correct motivation, but has
   the wrong approach to solving this problem.

3.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

   The document assumes familiarity with the terminology of TCP
   congestion control [RFC5681].

4.  An updated TCP response to idle and application-limited periods

   This section proposes an update to the TCP congestion control
   behaviour during an idle or application-limited period.  The new
   method permits a TCP sender to preserve the cwnd when an application
   becomes idle for a period of time (set in this specification to 5
   minutes, see section 5).  This period, where actual usage is less
   than allowed by cwnd, is named the non-validated phase.  The method
   allows an application to resume transmission at a previous rate
   without incurring the delay of slow-start.
   However, if the TCP sender experiences congestion using the preserved
   cwnd, it is required to immediately reset the cwnd to an appropriate
   value specified by the method.  If a sender does not take advantage
   of the preserved cwnd within five minutes, the value of cwnd is
   reduced, ensuring the value then reflects the capacity that was
   recently actually used.

   The method requires that the TCP SACK option is enabled.  This allows
   the sender to select a cwnd following a congestion event that is
   based on the measured path capacity, better reflecting the
   fair-share.  A similar approach was proposed by TCP Jump Start
   [Liu07], as a congestion response after more rapid opening of a TCP
   connection.

   It is expected that this update will satisfy the requirements of many
   rate-limited applications and at the same time provide an appropriate
   method for use in the Internet.  It also reduces the incentive for an
   application to send data simply to keep transport congestion state.
   (This is sometimes known as "padding".)

   The new method does not differentiate between times when the sender
   has become idle or application-limited.  This is partly a response to
   recognition that some applications wish to transmit at a limited
   rate, and that it can be hard to make a distinction between
   application-limited and idle behaviour.  This is expected to
   encourage applications and TCP stacks to use standards-based
   congestion control methods.  It may also encourage the use of
   long-lived connections where this offers benefit (such as persistent
   HTTP).

   The method is specified in the following subsections.

4.1.  A method for preserving cwnd in idle and application-limited
      periods.

   The method described in this document updates [RFC5681].  Use of the
   method REQUIRES a TCP sender and the corresponding receiver to enable
   the TCP SACK option [RFC3517].

   [RFC5681] defines a variable, FlightSize, that indicates the amount
   of outstanding data in the network.  This is equal to the value of
   Pipe calculated based on the pipe algorithm [RFC3517].  In [RFC5681]
   this value is used during loss recovery, whereas in this method it is
   also used during normal data transfer.  A sender is not required to
   continuously track this value, but SHOULD measure the volume of data
   in the network with a sampling period of not less than one RTT.

4.2.  The nonvalidated phase

   The updated method creates a new TCP sender phase that captures
   whether the cwnd reflects a validated or non-validated value.  The
   phases are defined as:

   o  Validated phase: FlightSize >= (3/4)*cwnd.  This is the normal
      phase, where cwnd is expected to be an approximate indication of
      the capacity currently available along the network path, and the
      standard methods are used to increase cwnd (currently [RFC5681]).

   o  Non-validated phase: FlightSize < (1/4)*cwnd.  This is the phase
      where the cwnd has a value based on a previous measurement of the
      available capacity, and the usage of this capacity has not been
      validated in the previous RTT.  That is, it is not known whether
      the cwnd reflects the capacity currently available along the
      network path.  The mechanisms to be used in this phase seek to
      determine whether any resumed rate remains safe for the Internet
      path, i.e., they quickly reduce the rate if the flow is known to
      induce congestion.  These mechanisms are specified in section 4.3.

   The values 1/4 and 3/4 were selected to reduce the effects of
   variations in the measured FlightSize.

4.3.  TCP congestion control during the nonvalidated phase

   A TCP sender that enters the non-validated phase MUST preserve the
   cwnd (i.e., this neither grows nor reduces while the sender remains
   in this phase).
   The phase is concluded after a fixed period of time (five minutes, as
   explained in section 5) or when the sender transmits using the full
   cwnd (i.e., it is no longer application-limited).

   The behaviour in the non-validated phase is specified as:

   o  If the sender consumes all the available space within the cwnd
      (i.e., the remaining unused cwnd in bytes is less than one Sender
      Maximum Segment Size, SMSS), then the sender MUST exit the
      non-validated phase.  The threshold value of cwnd required for the
      sender to enter the non-validated phase is intentionally different
      to that required to leave the phase.  This introduces hysteresis
      to avoid rapid oscillation between the phases.  Note that a change
      between phases does not significantly impact an
      application-limited sender, but serves to determine its behaviour
      if it substantially increases its transmission rate.

   o  If the sender receives an indication of congestion while in the
      non-validated phase (i.e., detects loss, or an Explicit Congestion
      Notification, ECN, mark [RFC3168]), the sender MUST exit the
      non-validated phase (reducing the cwnd as defined in section
      4.3.1).

   o  If the Retransmission Time Out (RTO) expires while in the
      non-validated phase, the sender MUST exit the non-validated phase.
      It then resumes using the Standard TCP RTO mechanism [RFC5681].
      (The resulting reduction of cwnd described in section 4.3.2 is
      appropriate, since any accumulated path history is considered
      unreliable.)

4.3.1.  Response to congestion in the nonvalidated phase

   Reception of congestion feedback while in the non-validated phase is
   interpreted as an indication that it was inappropriate for the sender
   to use the preserved cwnd.  The sender is therefore required to
   quickly reduce the rate to avoid further congestion.  Since the cwnd
   does not have a validated value, a new cwnd value must be selected
   based on the utilised rate.
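
As an illustration only (not part of the specification), the entry threshold of section 4.2 and the exit conditions of section 4.3 can be sketched as a small state machine. The names Sender, Phase, update_phase and NVP_LIMIT_SECONDS are hypothetical, and the sketch assumes that a sender leaving the non-validated phase returns to the validated phase. It reports why the phase ended and leaves the cwnd and ssthresh adjustments of sections 4.3.1 and 4.3.2 to the caller.

```python
from dataclasses import dataclass
from enum import Enum

# Preservation period of five minutes (section 5); the name is hypothetical.
NVP_LIMIT_SECONDS = 5 * 60


class Phase(Enum):
    VALIDATED = "validated"
    NON_VALIDATED = "non-validated"


@dataclass
class Sender:
    cwnd: int                      # congestion window, in bytes
    smss: int                      # sender maximum segment size, in bytes
    phase: Phase = Phase.VALIDATED
    nvp_entered_at: float = 0.0    # time the non-validated phase began


def update_phase(s, flight_size, now, congestion=False, rto_expired=False):
    """Apply one FlightSize sample; return the exit reason or None."""
    if s.phase is Phase.VALIDATED:
        # Entry threshold from section 4.2: FlightSize < (1/4)*cwnd.
        if flight_size < s.cwnd / 4:
            s.phase = Phase.NON_VALIDATED
            s.nvp_entered_at = now
        return None

    # Exit conditions from section 4.3. cwnd itself is left untouched
    # here: it is preserved while the sender stays in this phase.
    if congestion:
        reason = "congestion"      # loss or ECN mark
    elif rto_expired:
        reason = "rto"
    elif s.cwnd - flight_size < s.smss:
        reason = "cwnd-used"       # less than one SMSS of cwnd unused
    elif now - s.nvp_entered_at >= NVP_LIMIT_SECONDS:
        reason = "timer"           # five-minute limit reached
    else:
        return None

    # Assumption: leaving the phase means returning to the validated phase.
    s.phase = Phase.VALIDATED
    return reason
```

The entry test (FlightSize below a quarter of cwnd) and the exit test (less than one SMSS of cwnd left unused) are deliberately different, which is the hysteresis described in section 4.3.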

   A sender that detects a packet-drop or receives an ECN marked packet
   MUST calculate a safe cwnd, by setting it to the value specified in
   Section 3.2 of [RFC5681].

   At the end of the recovery phase, the TCP sender MUST reset the cwnd
   using the method below:

      cwnd = ((FlightSize - R)/2).

   where R is the volume of data that was reported as unacknowledged by
   the SACK information.  This follows the method proposed for Jump
   Start [Liu07].

   The inclusion of the term R makes this adjustment more conservative
   than standard TCP.  This is required, since the sender may have sent
   more segments than Standard TCP would have done.

   If the sender implements a method that allows it to identify the
   number of ECN-marked segments within a window that were observed by
   the receiver, the sender SHOULD use the method above, further
   reducing R by the number of marked segments.

4.3.2.  Adjustment at the end of the nonvalidated phase

   During the non-validated phase, the sender may produce bursts of data
   of up to the cwnd in size.  While this is no different to standard
   TCP, it is desirable to control the maximum burst size, e.g., by
   setting a burst size limit, using a pacing algorithm, or some other
   method [Hug01].

   An application that remains in the non-validated phase for a period
   greater than five minutes is required to adjust its congestion
   control state.  At the end of the non-validated phase, the sender
   MUST update the ssthresh:

      ssthresh = max(ssthresh, 3*cwnd/4).

   (This adjustment of ssthresh ensures that the sender records that it
   has safely sustained the present rate.  The change is beneficial to
   application-limited flows that encounter occasional congestion, and
   could otherwise suffer an unwanted additional delay in recovering the
   transmission rate.)

   The sender MUST then update cwnd:

      cwnd = max(FlightSize*2, IW).
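
The two adjustments in sections 4.3.1 and 4.3.2 are simple arithmetic, and the following sketch works them through in code. It is an illustration under stated assumptions rather than a definitive implementation: the function names are hypothetical, all quantities are taken to be in bytes, and integer (floor) division is assumed because the text does not say how to round.

```python
def cwnd_after_recovery(flight_size, r):
    """Section 4.3.1: cwnd = (FlightSize - R)/2 at the end of recovery.

    r is the volume of data reported as unacknowledged by the SACK
    information. Floor division is an assumption of this sketch.
    """
    return (flight_size - r) // 2


def adjust_at_end_of_phase(cwnd, ssthresh, flight_size, iw):
    """Section 4.3.2: return (new_cwnd, new_ssthresh) after the timer ends.

    ssthresh is updated first, using the preserved cwnd, and only then
    is cwnd reduced towards the rate actually in use.
    """
    new_ssthresh = max(ssthresh, 3 * cwnd // 4)
    new_cwnd = max(flight_size * 2, iw)
    return new_cwnd, new_ssthresh
```

For example, with FlightSize = 30000 bytes and R = 6000 bytes, the recovery reset gives a cwnd of 12000 bytes. With a preserved cwnd of 40000 bytes, ssthresh of 20000 bytes, FlightSize of 8000 bytes and an initial window of 4380 bytes, the end-of-phase adjustment gives ssthresh = 30000 bytes and cwnd = 16000 bytes.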
   In the cwnd update of this section, IW is the TCP initial window
   [RFC5681].

   (This allows an application to continue to send at the currently
   utilised rate, and not incur delay should it increase to twice the
   utilised rate.)

   After completing this adjustment, the sender MAY re-enter the
   non-validated phase, if required (see section 4.2).

5.  Determining a safe period to preserve cwnd

   This section documents the rationale for selecting the maximum period
   that cwnd may be preserved.

   Limiting the period for which cwnd is preserved avoids undesirable
   side effects that would result if the cwnd were to be preserved for
   an arbitrarily long period, which was a part of the problem that CWV
   originally attempted to address.  The period a sender may safely
   preserve the cwnd is a function of the period that a network path is
   expected to sustain the capacity reflected by cwnd.  There is no
   ideal choice for this time.

   The period of five minutes was chosen as a compromise that was larger
   than the idle intervals of common applications, but not significantly
   larger than the period for which the capacity of an Internet path may
   commonly be regarded as stable.  The capacity of wired networks is
   usually relatively stable for periods of several minutes, and load
   stability increases with the capacity.  This suggests that cwnd may
   be preserved for at least a few minutes.

   There are cases where the TCP throughput exhibits significant
   variability over a time less than five minutes.  Examples could
   include wireless topologies, where TCP rate variations may fluctuate
   on the order of a few seconds as a consequence of medium access
   protocol instabilities.  Mobility changes may also impact TCP
   performance over short time scales.
   Senders that observe such rapid changes in the path characteristic
   may also experience increased congestion with the new method;
   however, such variation would likely also impact TCP's behaviour when
   supporting interactive and bulk applications.

   Routing algorithms may modify the network path, disrupting the RTT
   measurement and changing the capacity available to a TCP connection;
   however, such changes do not often occur within a time frame of a few
   minutes.

   The value of five minutes is therefore expected to be sufficient for
   most current applications.  Simulation studies also suggest that for
   many practical applications, the performance using this value will
   not be significantly different to that observed using a non-standard
   method that does not reset the cwnd after idle.

   Finally, other TCP sender mechanisms have used a 5 minute timer, and
   there could be simplifications in some implementations by reusing the
   same interval.  TCP defines a default user timeout of 5 minutes
   [RFC0793], i.e., how long transmitted data may remain unacknowledged
   before a connection is forcefully closed.

6.  Security Considerations

   General security considerations concerning TCP congestion control are
   discussed in [RFC5681].  This document describes an algorithm that
   updates one aspect of the congestion control procedures, and so the
   considerations described in RFC 5681 also apply to this algorithm.

7.  IANA Considerations

   There are no IANA considerations.

8.  Acknowledgments

   The authors acknowledge the contributions of Dr I Biswas and Dr R
   Secchi in supporting the evaluation of CWV and for their help in
   developing the mechanisms proposed in this draft.  We also
   acknowledge comments received from the Internet Congestion Control
   Research Group, in particular Yuchung Cheng, Mirja Kuehlewind, and
   Joe Touch.

9.  Other related work - Author Notes

   There are several issues to be discussed more widely:

   o  Should the method explicitly state a procedure for limiting
      burstiness or pacing?

         This is often regarded as good practice, but isn't a formal
         part of TCP.  draft-hughes-restart-00.txt provides some
         discussion of this topic.

   o  There is potential interaction with the proposal to raise the
      TCP Initial Window to ten segments; do these cases need to be
      elaborated?

         This relates to draft-ietf-tcpm-initcwnd.

         The two methods have different functions and different
         responses to loss/congestion.

         IW=10 proposes an experimental update to TCP that would allow
         faster opening of the cwnd, and also a large (same size)
         restart window.  This approach is based on the assumption that
         many forward paths can sustain bursts of up to ten segments
         without (appreciable) loss.  Such a significant increase in
         cwnd must be matched with an equally large reduction of cwnd if
         loss/congestion is detected, and such a congestion indication
         is likely to require future use of IW=10 to be disabled for
         this path for some time.  This guards against the unwanted
         behaviour of a series of short flows continuously flooding a
         network path without network congestion feedback.

         In contrast, new-CWV proposes a standards-track update with a
         rationale that relies on recent previous path history to select
         an appropriate cwnd after restart.

         The behaviour differs in three ways:

         1) For applications that send little initially, new-CWV may
         constrain more than IW=10, but would not require the connection
         to reset any path information when a restart incurred loss.  In
         contrast, new-CWV would allow the TCP connection to preserve
         the cached cwnd; any loss would impact cwnd, but not impact
         other flows.

         2) For applications that utilise more capacity than provided by
         a cwnd=10, this method would permit a larger restart window
         compared to a restart using IW=10.  This is justified by the
         recent path history.

         3) new-CWV is also intended to be used for application-limited
         use, where the application sends, but does not seek to fully
         utilise the cwnd.  In this case, new-CWV constrains the cwnd to
         that justified by the recent path history.  The performance
         trade-offs are hence different, and it would be possible to
         enable new-CWV when also using IW=10, and yield the benefits of
         this.

   o  There is potential overlap with the Laminar proposal
      (draft-mathis-tcpm-tcp-laminar).

         The current draft was intended as a standards-track update to
         TCP, rather than a new transport variant.  At least, it would
         be good to understand how the two interact and whether there is
         a possibility of a single method.

10.  References

10.1.  Normative References

   [RFC0793]  Postel, J., "Transmission Control Protocol", STD 7,
              RFC 793, September 1981.

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC2861]  Handley, M., Padhye, J., and S. Floyd, "TCP Congestion
              Window Validation", RFC 2861, June 2000.

   [RFC3168]  Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
              of Explicit Congestion Notification (ECN) to IP",
              RFC 3168, September 2001.

   [RFC3517]  Blanton, E., Allman, M., Fall, K., and L. Wang, "A
              Conservative Selective Acknowledgment (SACK)-based Loss
              Recovery Algorithm for TCP", RFC 3517, April 2003.

   [RFC5681]  Allman, M., Paxson, V., and E. Blanton, "TCP Congestion
              Control", RFC 5681, September 2009.

   [RFC6298]  Paxson, V., Allman, M., Chu, J., and M. Sargent,
              "Computing TCP's Retransmission Timer", RFC 6298,
              June 2011.

10.2.  Informative References

   [Bis08]    Biswas and Fairhurst, "A Practical Evaluation of
              Congestion Window Validation Behaviour", 9th Annual
              Postgraduate Symposium in the Convergence of
              Telecommunications, Networking and Broadcasting (PGNet),
              Liverpool, UK, June 2008.

   [Bis10]    Biswas, Sathiaseelan, Secchi, and Fairhurst, "Analysing
              TCP for Bursty Traffic", Int'l J. of Communications,
              Network and System Sciences, 7(3), June 2010.

   [Hug01]    Hughes, Touch, and Heidemann, "Issues in TCP Slow-Start
              Restart After Idle" (Work-in-Progress), December 2001.

   [Liu07]    Liu, Allman, Jin, and Wang, "Congestion Control without a
              Startup Phase", 5th International Workshop on Protocols
              for Fast Long-Distance Networks (PFLDnet), Los Angeles,
              California, USA, February 2007.

Authors' Addresses

   Godred Fairhurst
   University of Aberdeen
   School of Engineering
   Fraser Noble Building
   Aberdeen, Scotland  AB24 3UE
   UK

   Email: gorry@erg.abdn.ac.uk
   URI:   http://www.erg.abdn.ac.uk

   Arjuna Sathiaseelan
   University of Aberdeen
   School of Engineering
   Fraser Noble Building
   Aberdeen, Scotland  AB24 3UE
   UK

   Email: arjuna@erg.abdn.ac.uk
   URI:   http://www.erg.abdn.ac.uk