TCPM Working Group                                          G. Fairhurst
Internet-Draft                                           A. Sathiaseelan
Obsoletes: 2861 (if approved)                                  R. Secchi
Updates: 5681 (if approved)                       University of Aberdeen
Intended status: Standards Track                        October 10, 2013
Expires: April 13, 2014


              Updating TCP to support Rate-Limited Traffic
                       draft-ietf-tcpm-newcwv-03

Abstract

   This document proposes an update to RFC 5681 to address issues that
   arise when TCP is used to support traffic that exhibits periods where
   the sending rate is limited by the application rather than the
   congestion window. It updates TCP to allow a TCP sender to restart
   quickly following either an idle or rate-limited interval. This
   method is expected to benefit applications that send rate-limited
   traffic using TCP, while also providing an appropriate response if
   congestion is experienced.

   It also evaluates the Experimental specification of TCP Congestion
   Window Validation, CWV, defined in RFC 2861, and concludes that RFC
   2861 sought to address important issues, but failed to deliver a
   widely used solution. This document therefore recommends that the
   status of RFC 2861 be moved from Experimental to Historic, and that
   it be replaced by the current specification.

   NOTE: The standards status of this WG document is under review for
   consideration as either Experimental (EXP) or Proposed Standard (PS).
   This decision will be made later as the document is finalised.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on April 13, 2014.

Copyright Notice

   Copyright (c) 2013 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
   2.  Reviewing experience with TCP-CWV
   3.  Terminology
   4.  An updated TCP response to idle and application-limited
       periods
     4.1.  A method for preserving cwnd during the idle and
           application-limited periods.
     4.2.  Initialisation
     4.3.  The nonvalidated phase
     4.4.  TCP congestion control during the nonvalidated phase
       4.4.1.  Response to congestion in the nonvalidated phase
       4.4.2.  Adjustment at the end of the nonvalidated phase
       4.4.3.  Examples of Implementation
   5.  Determining a safe period to preserve cwnd
   6.  Security Considerations
   7.  IANA Considerations
   8.  Acknowledgments
   9.  Author Notes
     9.1.  Other related work
     9.2.  Revision notes
   10. References
     10.1.  Normative References
     10.2.  Informative References
   Authors' Addresses

1.  Introduction

   TCP is used to support a range of application behaviours. The TCP
   congestion window (cwnd) controls the number of unacknowledged
   packets/bytes that a TCP flow may have in the network at any time, a
   value known as the FlightSize [RFC5681]. A bulk application will
   always have data available to transmit. The rate at which it sends
   is therefore limited by the maximum permitted by the receiver
   advertised window and the sender congestion window (cwnd). In
   contrast, a rate-limited application will experience periods when the
   sender is either idle or is unable to send at the maximum rate
   permitted by the cwnd.
   This latter case is called rate-limited. The focus of this document
   is on the operation of TCP in such an idle or rate-limited case.

   Standard TCP [RFC5681] requires the cwnd to be reset to the restart
   window (RW) when an application becomes idle. [RFC2861] noted that
   this TCP behaviour was not always observed in current
   implementations. Recent experiments [Bis08] confirm this to still be
   the case.

   Standard TCP does not impose additional restrictions on the growth of
   the cwnd when a TCP sender is rate-limited. A rate-limited sender
   may therefore grow a cwnd far beyond that corresponding to the
   current transmit rate, resulting in a value that does not reflect
   current information about the state of the network path the flow is
   using. Use of such an invalid cwnd may result in reduced application
   performance and/or could significantly contribute to network
   congestion.

   [RFC2861] proposed a solution to these issues in an experimental
   method known as Congestion Window Validation (CWV). CWV was intended
   to help reduce cases where TCP accumulated an invalid cwnd. The use
   and drawbacks of using the CWV algorithm in RFC 2861 with an
   application are discussed in Section 2.

   Section 3 defines relevant terminology.

   Section 4 specifies an alternative to CWV that seeks to address the
   same issues, but does this in a way that is expected to mitigate the
   impact on an application that varies its sending rate. The method
   described applies to both a rate-limited and an idle condition.
   Section 5 describes the rationale for selecting the safe period to
   preserve the cwnd.

2.  Reviewing experience with TCP-CWV

   RFC 2861 described a simple modification to the TCP congestion
   control algorithm that decayed the cwnd after the transition to a
   "sufficiently-long" idle period.
   This used the slow-start threshold
   (ssthresh) to save information about the previous value of the
   congestion window. The approach relaxed the standard TCP behaviour
   [RFC5681] for an idle session, with the intention of improving
   application performance. CWV also modified the behaviour for a
   rate-limited session where a sender transmitted at a rate less than
   allowed by cwnd.

   RFC 2861 has been implemented in some mainstream operating systems as
   the default behaviour [Bis08]. Analysis (e.g. [Bis10] [Fai12]) has
   shown that a TCP sender using CWV is able to use available capacity
   on a shared path after an idle period. This can benefit some
   applications, especially over long delay paths, when compared to the
   slow-start restart specified by standard TCP. However, CWV would
   only benefit an application if the idle period were less than several
   Retransmission Time Out (RTO) intervals [RFC6298], since the
   behaviour would otherwise be the same as for standard TCP, which
   resets the cwnd to the TCP Restart Window (RW) after this period.

   Experience with RFC 2861 suggests that although the CWV method
   benefited the network in a rate-limited scenario (reducing the
   probability of network congestion), the behaviour was too
   conservative for many common rate-limited applications. This
   mechanism did not therefore offer the desirable increase in
   application performance for rate-limited applications and it is
   unclear whether applications actually use this mechanism in the
   general Internet.

   It is therefore concluded that CWV, as defined in RFC 2861, was often
   a poor solution for many rate-limited applications. It had the
   correct motivation, but had the wrong approach to solving this
   problem.

3.  Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

   The document assumes familiarity with the terminology of TCP
   congestion control [RFC5681].

   The following new terminology is introduced:

   pipeACK sample: A measure of the volume of data acknowledged by the
   network within an RTT.

   pipeACK variable: A variable that measures the available capacity
   using the set of pipeACK samples.

   pipeACK Sampling Period: The maximum period that a measured pipeACK
   sample may influence the pipeACK variable.

   Non-validated phase: The phase where the cwnd reflects a previous
   measurement of the available path capacity.

   Non-validated period, NVP: The maximum period for which cwnd is
   preserved in the non-validated phase.

   Rate-limited: A TCP flow that does not consume more than one half of
   cwnd, and hence operates in the non-validated phase.

   Validated phase: The phase where the cwnd reflects a current estimate
   of the available path capacity.

4.  An updated TCP response to idle and application-limited periods

   This section proposes an update to the TCP congestion control
   behaviour during an idle or rate-limited period. The new method
   permits a TCP sender to preserve the cwnd when an application becomes
   idle for a period of time (the non-validated period, NVP, see section
   5). The period where actual usage is less than allowed by cwnd is
   named the non-validated phase. This method allows an application
   to resume transmission at a previous rate without incurring the delay
   of slow-start. However, if the TCP sender experiences congestion
   using the preserved cwnd, it is required to immediately reset the
   cwnd to an appropriate value specified by the method.
   If a sender
   does not take advantage of the preserved cwnd within the NVP, the
   value of cwnd is reduced, ensuring the value better reflects the
   capacity that was actually used recently.

   It is expected that this update will satisfy the requirements of many
   rate-limited applications and at the same time provide an appropriate
   method for use in the Internet. It also reduces the incentive for an
   application to send data simply to keep transport congestion state.
   (This is sometimes known as "padding".)

   The new method does not differentiate between times when the sender
   has become idle or rate-limited. This is partly a response to
   recognition that some applications wish to transmit at a rate less
   than allowed by the sender cwnd, and that it can be hard to make a
   distinction between rate-limited and idle behaviour. This is
   expected to encourage applications and TCP stacks to use
   standards-based congestion control methods. It may also encourage
   the use of long-lived connections where this offers benefit (such as
   persistent HTTP).

   The method is specified in the following subsections.

4.1.  A method for preserving cwnd during the idle and
      application-limited periods.

   [RFC5681] defines a variable, FlightSize, that indicates the amount
   of outstanding data in the network. This is assumed to be equal to
   the value of Pipe calculated based on the pipe algorithm [RFC3517].
   In RFC 5681 this value is used during loss recovery, whereas in this
   method a new variable "pipeACK" is introduced to measure the
   acknowledged size of the pipe, which is used to determine if the
   sender has validated the cwnd.

   A sender determines a pipeACK sample by measuring the volume of data
   that was acknowledged by the network over the period of a measured
   Round Trip Time (RTT).
   Using the variables defined in [RFC3517], a
   value could be measured by caching the value of HighACK and after one
   RTT measuring the difference between the cached HighACK value and the
   current HighACK value. Other equivalent methods may be used.

   A sender is not required to continuously update the pipeACK variable
   after each received ACK, but SHOULD perform a pipeACK sample at least
   once per RTT when it has sent unacknowledged segments.

   The pipeACK variable MAY consider multiple pipeACK samples over the
   pipeACK Sampling Period. The value of the pipeACK variable MUST NOT
   exceed the maximum (highest value) within the sampling period. This
   specification defines the pipeACK Sampling Period as Max(3*RTT, 1
   second). This period enables a sender to compensate for large
   fluctuations in the sending rate, where there may be pauses in
   transmission, and allows the pipeACK variable to reflect the largest
   recently measured pipeACK sample.

   When no measurements are available, the pipeACK variable is set to
   the "undefined" value. This value is used to inhibit entering the
   non-validated phase until the first new measurement of a pipeACK
   sample.

   The method RECOMMENDS that the TCP SACK option [RFC3517] is enabled.
   This allows the sender to more accurately determine the number of
   missing bytes during the loss recovery phase, and using this method
   will result in a higher cwnd following loss.

4.2.  Initialisation

   A sender starts a TCP connection in the Validated phase and
   initialises the pipeACK variable to the "undefined" value. This
   value inhibits use of the variable in CWV calculations.

4.3.  The nonvalidated phase

   The updated method creates a new TCP sender phase that captures
   whether the cwnd reflects a validated or non-validated value. The
   phases are defined as:

   o  Validated phase: pipeACK >= (1/2)*cwnd, or pipeACK is undefined.
      This is the normal phase, where cwnd is expected to be an
      approximate indication of the capacity currently available along
      the network path, and the standard methods are used to increase
      cwnd (currently [RFC5681]). The rule for transitioning to the
      non-validated phase is specified in section 4.4.

   o  Non-validated phase: pipeACK < (1/2)*cwnd. This is the phase where
      the cwnd has a value based on a previous measurement of the
      available capacity, and the usage of this capacity has not been
      validated in the pipeACK Sampling Period. That is, it is not
      known whether the cwnd reflects the currently available capacity
      along the network path. The mechanisms to be used in this phase
      seek to determine a safe value for cwnd and an appropriate
      reaction to congestion. These mechanisms are specified in section
      4.4.

   The value 1/2 was selected to reduce the effects of variations in the
   pipeACK variable, and to allow the sender some flexibility in when it
   sends data.

4.4.  TCP congestion control during the nonvalidated phase

   A TCP sender MUST enter the non-validated phase when the pipeACK is
   less than (1/2)*cwnd.

   A TCP sender that enters the non-validated phase will preserve the
   cwnd (i.e., this neither grows nor reduces while the sender remains
   in this phase). If the sender receives an indication of congestion
   (loss or Explicit Congestion Notification, ECN, mark [RFC3168]) it
   uses the method described below. The phase is concluded after a
   fixed period of time (the NVP, as explained in section 4.4.2) or when
   the sender transmits sufficient data so that pipeACK > (1/2)*cwnd
   (i.e. it is no longer rate-limited).

   The behaviour in the non-validated phase is specified as:

   o  The cwnd is not increased when ACK packets are received in this
      phase.

   o  If the sender receives an indication of congestion while in the
      non-validated phase (i.e. detects loss, or an ECN mark), the
      sender MUST exit the non-validated phase (reducing the cwnd as
      defined in section 4.4.1).

   o  If the Retransmission Time Out (RTO) expires while in the
      non-validated phase, the sender MUST exit the non-validated phase.
      It then resumes using the Standard TCP RTO mechanism [RFC5681].
      (The resulting reduction of cwnd described in section 4.4.2 is
      appropriate, since any accumulated path history is considered
      unreliable).

   o  A sender with a pipeACK variable greater than (1/2)*cwnd SHOULD
      enter the validated phase. (A rate-limited sender will not
      normally be impacted by whether it is in a validated or
      non-validated phase, since it will normally not consume the entire
      cwnd. However, a change to the validated phase will release the
      sender from constraints on the growth of cwnd, and restore the use
      of the standard congestion response.)

4.4.1.  Response to congestion in the nonvalidated phase

   Reception of congestion feedback while in the non-validated phase is
   interpreted as an indication that it was inappropriate for the sender
   to use the preserved cwnd. The sender is therefore required to
   quickly reduce the rate to avoid further congestion. Since the cwnd
   does not have a validated value, a new cwnd value must be selected
   based on the utilised rate.

   A sender that detects a packet-drop, or receives an indication of an
   ECN marked packet, MUST record the current FlightSize in the variable
   LossFlightSize and MUST calculate a safe cwnd for loss recovery using
   the method below:

      cwnd = (Max(pipeACK,LossFlightSize))/2.

   This new cwnd is set to reflect that a non-validated cwnd may be
   larger than the actual FlightSize, or recently used FlightSize
   (recorded in pipeACK). The updated cwnd therefore prevents overshoot
   by a sender significantly increasing its transmission rate during the
   recovery period.
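
   As an informative illustration only, the phase test of section 4.3
   and the cwnd calculation above might be sketched in Python as shown
   below. The function names, the use of None to represent the
   "undefined" pipeACK value, and the use of integer byte counts with
   floor division are assumptions made for this sketch; they are not
   defined by this document.

```python
from typing import Optional


def in_non_validated_phase(pipe_ack: Optional[int], cwnd: int) -> bool:
    """Phase test: non-validated when pipeACK < (1/2)*cwnd.

    An undefined pipeACK (None here, an assumption of this sketch)
    keeps the sender in the validated phase.
    """
    if pipe_ack is None:
        return False
    # Compare 2*pipeACK with cwnd so the test stays in integer arithmetic.
    return 2 * pipe_ack < cwnd


def cwnd_at_congestion(pipe_ack: int, loss_flight_size: int) -> int:
    """cwnd = Max(pipeACK, LossFlightSize)/2.

    Applied when loss or an ECN mark is detected in the non-validated
    phase. Values are byte counts; floor division is an assumption.
    """
    return max(pipe_ack, loss_flight_size) // 2
```

   For instance, with a preserved cwnd of 40000 bytes, a pipeACK of
   12000 bytes and a LossFlightSize of 9000 bytes, the sender is in the
   non-validated phase and the sketch selects a cwnd of 6000 bytes.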

   At the end of the recovery phase, the TCP sender MUST reset the cwnd
   using the method below:

      cwnd = (Max(pipeACK,LossFlightSize) - R)/2.

   where R is the volume of data that was retransmitted during the
   recovery phase. This follows the method proposed for Jump Start
   [Liu07]. The inclusion of the term R makes the adjustment more
   conservative than standard TCP. (This is required, since the sender
   may have sent more segments than a Standard TCP sender would have
   done. The additional reduction is beneficial when the LossFlightSize
   significantly overshoots the available path capacity incurring
   significant loss, for instance an intense traffic burst following a
   non-validated period.)

   If the sender implements a method that allows it to identify the
   number of ECN-marked segments within a window that were observed by
   the receiver, the sender SHOULD use the method above, further
   reducing R by the number of marked segments.

   The sender MUST also re-initialise the pipeACK variable to the
   "undefined" value. This ensures that standard TCP methods are used
   immediately after completing loss recovery until a new pipeACK value
   can be determined.

4.4.2.  Adjustment at the end of the nonvalidated phase

   During the non-validated phase, a sender can produce bursts of data
   of up to the cwnd in size. While this is no different to standard
   TCP, it is desirable to control the maximum burst size, e.g. by
   setting a burst size limit, using a pacing algorithm, or some other
   method [Hug01].

   An application that remains in the non-validated phase for a period
   greater than the NVP is required to adjust its congestion control
   state. If the sender exits the non-validated phase after this
   period, it MUST update the ssthresh:

      ssthresh = max(ssthresh, 3*cwnd/4).

   (This adjustment of ssthresh ensures that the sender records that it
   has safely sustained the present rate.
   The change is beneficial to
   rate-limited flows that encounter occasional congestion, and could
   otherwise suffer an unwanted additional delay in recovering the
   sending rate.)

   The sender MUST then update cwnd to be not greater than:

      cwnd = max(1/2*cwnd, IW).

   where IW is the appropriate TCP initial window used by the TCP
   sender (e.g. [RFC5681]).

   (This adjustment ensures that the sender responds conservatively at
   the end of the non-validated phase by reducing the cwnd to better
   reflect the current rate of the sender. The cwnd update does not
   take into account the FlightSize or pipeACK value because these
   values only reflect historical data and do not reflect the current
   sending rate.)

4.4.3.  Examples of Implementation

   This section is intended to provide informative examples of
   implementation methods. Implementations may choose to use other
   methods that comply with the normative requirements.

   XXX This section is work in progress - discussion is welcome to help
   complete this section XXX

   A pipeACK sample may be measured once each RTT. This reduces the
   sender processing burden of calculating a sample after each
   acknowledgement and also reduces storage requirements at the sender.

   Since application behaviour can be bursty using CWV, it may be
   desirable to implement a maximum filter to accumulate the measured
   values so that the pipeACK variable records the largest pipeACK
   sample within the pipeACK Sampling Period. One simple way to
   implement this is to divide the pipeACK Sampling Period into several
   (e.g. 5) equal length measurement periods. The sender then records
   the start time for each measurement period and the highest measured
   pipeACK sample. At the end of the measurement period, any
   measurement(s) that are older than the pipeACK Sampling Period are
   discarded. The pipeACK variable is then assigned the largest of the
   set of the highest measured values.
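
   The maximum filter described above could, as one possibility, be
   sketched in Python as follows. The class and method names, the use
   of floating-point seconds for time, and the use of None for the
   "undefined" value are assumptions of this sketch rather than part of
   the method; other data structures would serve equally well.

```python
from typing import List, Optional


def sampling_period(rtt: float) -> float:
    """pipeACK Sampling Period = Max(3*RTT, 1 second), in seconds."""
    return max(3.0 * rtt, 1.0)


class PipeAckFilter:
    """Maximum filter over equal-length measurement periods (hypothetical)."""

    def __init__(self, period: float, n_periods: int = 5) -> None:
        self.period = period                 # pipeACK Sampling Period
        self.bin_len = period / n_periods    # one measurement period
        self.bins: List[List[float]] = []    # [start_time, highest_sample]

    def _expire(self, now: float) -> None:
        # Discard measurement periods that started more than one
        # Sampling Period ago.
        self.bins = [b for b in self.bins if now - b[0] <= self.period]

    def add_sample(self, now: float, sample: float) -> None:
        # Record the sample in the current measurement period, or open
        # a new period if the current one has ended (or none exists).
        if self.bins and now - self.bins[-1][0] < self.bin_len:
            self.bins[-1][1] = max(self.bins[-1][1], sample)
        else:
            self.bins.append([now, sample])
        self._expire(now)

    def value(self, now: float) -> Optional[float]:
        # pipeACK variable: largest retained sample, or None (undefined)
        # when no measurement falls within the Sampling Period.
        self._expire(now)
        if not self.bins:
            return None
        return max(b[1] for b in self.bins)
```

   In this sketch a sender would call add_sample() about once per RTT
   and read value() when testing whether pipeACK is below (1/2)*cwnd.
   Storing only one maximum per measurement period bounds the state to
   a handful of entries regardless of the RTT.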

   [Figure: a time line divided into measurement periods holding
   Sample A (peak value 5), Sample B (peak value 3), an interval with
   no sample, Sample C (peak value 4), and Sample D (value 2 so far,
   still accumulating at the current time). The pipeACK Sampling
   Period extends back from the current time.]

            Figure XX: Example of measuring pipeACK samples

   Figure XX shows an example of how measurement samples may be
   collected. At the time represented by the figure, new samples are
   being accumulated into Sample D. Three previous samples also fall
   within the pipeACK Sampling Period: A, B, and C. There was also a
   period of inactivity between samples B and C during which no
   measurements were taken. The current value of the pipeACK variable
   will be 5, the maximum across all samples.

   After one further measurement period, Sample A will be discarded,
   since it is then older than the pipeACK Sampling Period, and the
   pipeACK variable will be recalculated. Its value will be the larger
   of Sample C or the final value accumulated in Sample D.

   Note that the NVP does not necessarily require a new timer to
   be implemented. An alternative is to record a timestamp when the
   sender enters the non-validated phase. Each time a sender transmits
   a new segment, this timestamp may be used to determine if the NVP
   has expired. If the period expires, the sender may take into account
   how many units of the NVP have passed and make one reduction (as
   defined in section 4.4.2) for each NVP.

5.  Determining a safe period to preserve cwnd

   This section documents the rationale for selecting the maximum period
   that cwnd may be preserved, known as the non-validated period, NVP.

   Limiting the period that cwnd may be preserved avoids undesirable
   side effects that would result if the cwnd were to be kept
   unnecessarily high for an arbitrarily long period, which was a part
   of the problem that CWV originally attempted to address. The period
   a sender may safely preserve the cwnd is a function of the period
   that a network path is expected to sustain the capacity reflected by
   cwnd. There is no ideal choice for this time.

   A period of five minutes was chosen for this NVP. This is a
   compromise that was larger than the idle intervals of common
   applications, but not sufficiently larger than the period for which
   the capacity of an Internet path may commonly be regarded as stable.
   The capacity of wired networks is usually relatively stable for
   periods of several minutes, and load stability increases with the
   capacity. This suggests that cwnd may be preserved for at least a
   few minutes.

   There are cases where the TCP throughput exhibits significant
   variability over a time less than five minutes. Examples could
   include wireless topologies, where TCP rate variations may fluctuate
   on the order of a few seconds as a consequence of medium access
   protocol instabilities. Mobility changes may also impact TCP
   performance over short time scales. Senders that observe such rapid
   changes in the path characteristic may also experience increased
   congestion with the new method; however, such variation would likely
   also impact TCP's behaviour when supporting interactive and bulk
   applications.

   Routing algorithms may modify the network path, disrupting the RTT
   measurement and changing the capacity available to a TCP connection;
   however, such changes do not often occur within a time frame of a few
   minutes.

   The value of five minutes is therefore expected to be sufficient for
   most current applications. Simulation studies (e.g.
   [Bis11]) also
   suggest that for many practical applications, the performance using
   this value will not be significantly different to that observed using
   a non-standard method that does not reset the cwnd after idle.

   Finally, other TCP sender mechanisms have used a 5 minute timer, and
   there could be simplifications in some implementations by reusing the
   same interval. TCP defines a default user timeout of 5 minutes
   [RFC0793], i.e. how long transmitted data may remain unacknowledged
   before a connection is forcefully closed.

6.  Security Considerations

   General security considerations concerning TCP congestion control are
   discussed in [RFC5681]. This document describes an algorithm that
   updates one aspect of the congestion control procedures, and so the
   considerations described in RFC 5681 also apply to this algorithm.

7.  IANA Considerations

   There are no IANA considerations.

8.  Acknowledgments

   The authors acknowledge the contributions of Dr I Biswas and Mr Ziaul
   Hossain in supporting the evaluation of CWV and for their help in
   developing the mechanisms proposed in this draft. We also
   acknowledge comments received from the Internet Congestion Control
   Research Group, in particular Yuchung Cheng, Mirja Kuehlewind, and
   Joe Touch. This work was part-funded by the European Community under
   its Seventh Framework Programme through the Reducing Internet
   Transport Latency (RITE) project (ICT-317700).

9.  Author Notes

9.1.  Other related work

   There are several issues to be discussed more widely:

   o  Should the method explicitly state a procedure for limiting
      burstiness or pacing?

      This is often regarded as good practice, but is not presently a
      formal part of TCP. draft-hughes-restart-00.txt provides some
      discussion of this topic.

   o  There are potential interactions with the Experimental update in
      [RFC6928] that raises the TCP initial window to ten segments. Do
      these cases need to be elaborated?

      This relates to the Experimental specification for increasing
      the TCP IW defined in RFC 6928.

      The two methods have different functions and different responses
      to loss/congestion.

      RFC 6928 proposes an experimental update to TCP that would
      increase the IW to ten segments. This would allow faster
      opening of the cwnd, and also a large (same size) restart
      window. This approach is based on the assumption that many
      forward paths can sustain bursts of up to ten segments without
      (appreciable) loss. Such a significant increase in cwnd must
      be matched with an equally large reduction of cwnd if loss/
      congestion is detected, and such a congestion indication is
      likely to require future use of IW=10 to be disabled for this
      path for some time. This guards against the unwanted behaviour
      of a series of short flows continuously flooding a network path
      without network congestion feedback.

      In contrast, this document proposes an update with a rationale
      that relies on recent previous path history to select an
      appropriate cwnd after restart.

      The behaviour differs in three ways:

      1) For applications that send little initially, new-cwv may
      constrain more than RFC 6928, but would not require the
      connection to reset any path information when a restart
      incurred loss. In contrast, new-cwv would allow the TCP
      connection to preserve the cached cwnd; any loss would impact
      cwnd, but not impact other flows.

      2) For applications that utilise more capacity than provided by
      a cwnd of 10 segments, this method would permit a larger
      restart window compared to a restart using the method in RFC
      6928. This is justified by the recent path history.
619 3) new-cwv is also intended to be used for rate-limited 620 applications, where the application sends, but does not seek to 621 fully utilise the cwnd. In this case, new-cwv constrains the 622 cwnd to that justified by the recent path history. The 623 performance trade-offs are hence different, and it would be 624 possible to enable new-cwv when also using the method in RFC 625 6928, and yield benefits. 627 o There is potential overlap with the Laminar proposal 628 (draft-mathis-tcpm-tcp-laminar). 630 The current draft was intended as a standards-track update to 631 TCP, rather than a new transport variant. At least, it would 632 be good to understand how the two interact and whether there is 633 a possibility of a single method. 635 o There is potential performance loss in loss of a short burst 636 (off list with M Allman). 638 A sender can transmit several segments then become idle. If 639 the first segments are all ACK'ed the ssthresh collapses to a 640 small value (no new data is sent by the idle sender). Loss of 641 the later data results in congestion (e.g. maybe a RED drop or 642 some other cause, rather than the maximum rate of this flow). 643 When the sender performs loss recovery it may have an 644 appreciable pipeACK and cwnd, but a very low FlightSize - the 645 Standard algorithm results in an unusually low cwnd (1/2 646 FlightSize). 648 A constant rate flow would have maintained a FlightSize 649 appropriate to pipeACK (cwnd if it is a bulk flow). 651 This could be fixed by adding a new state variable? It could 652 also be argued this is a corner case (e.g. loss of only the 653 last segments would have resulted in RTO), but the impact could be 654 significant. 656 o There is potential interaction with TCP Control Block Sharing (M 657 Welzl). 659 An application that is non-validated can accumulate a cwnd that 660 is larger than the actual capacity. Is this a fair value to 661 use in TCB sharing?
663 We propose that TCB sharing should use the pipeACK in place of 664 cwnd when a TCP sender is in the Nonvalidated phase. This 665 value better reflects the capacity that the flow has utilised 666 in the network path. 668 9.2. Revision notes 670 RFC-Editor note: please remove this section prior to publication. 672 Draft 03 was submitted to ICCRG to receive comments and feedback. 674 Draft 04 contained the first set of clarifications after feedback: 676 o Changed name to application limited and used the term rate-limited 677 in all places. 679 o Added justification and many minor changes suggested on the list. 681 o Added text to tie-in with more accurate ECN marking. 683 o Added ref to Hug01 685 Draft 05 contained various updates: 687 o New text to redefine how to measure the acknowledged pipe, 688 differentiating this from the FlightSize, and hence avoiding 689 previous issues with infrequent large bursts of data not being 690 validated. A key new feature is that pipeACK only triggers 691 leaving the NVP after the size of the pipe has been acknowledged. 692 This removed the need for hysteresis. 694 o Reduction values were changed to 1/2, following analysis of 695 suggestions from ICCRG. This also sets the "target" cwnd as twice 696 the used rate for the non-validated case. 698 o Introduced a symbolic name (NVP) to denote the 5 minute period. 700 Draft 06 contained various updates: 702 o Required reset of pipeACK after congestion. 704 o Added comment on the effect of congestion after a short burst (M. 705 Allman). 707 o Correction of minor typos. 709 WG draft 00 contained various updates: 711 o Updated initialisation of pipeACK to maximum value. 713 o Added note on intended status still to be determined. 715 WG draft 01 contained: 717 o Added corrections from Richard Scheffenegger. 719 o Raffaello Secchi added to the mechanism, based on implementation 720 experience.
722 o Removed the requirement for the TCP SACK option 723 [RFC3517] to be enabled - Although it may be desirable to use 724 SACK, this is not essential to the algorithm. 726 o Added the notion of the sampling period to accommodate large rate 727 variations and ensure that the method is stable. This algorithm 728 is to be validated through implementation. 730 WG draft 02 contained: 732 o Clarified language around pipeACK variable and pipeACK sample - 733 Feedback from Aris Angelogiannopoulos. 735 WG draft 03 contained: 737 o Editorial corrections - Feedback from Anna Brunstrom. 739 o An adjustment to the procedure at the start and end of loss 740 recovery to align the two equations. 742 o Further clarification of the "undefined" value of the pipeACK 743 variable. 745 10. References 747 10.1. Normative References 749 [RFC0793] Postel, J., "Transmission Control Protocol", STD 7, 750 RFC 793, September 1981. 752 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 753 Requirement Levels", BCP 14, RFC 2119, March 1997. 755 [RFC2861] Handley, M., Padhye, J., and S. Floyd, "TCP Congestion 756 Window Validation", RFC 2861, June 2000. 758 [RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition 759 of Explicit Congestion Notification (ECN) to IP", 760 RFC 3168, September 2001. 762 [RFC3517] Blanton, E., Allman, M., Fall, K., and L. Wang, "A 763 Conservative Selective Acknowledgment (SACK)-based Loss 764 Recovery Algorithm for TCP", RFC 3517, April 2003. 766 [RFC5681] Allman, M., Paxson, V., and E. Blanton, "TCP Congestion 767 Control", RFC 5681, September 2009. 769 [RFC6298] Paxson, V., Allman, M., Chu, J., and M. Sargent, 770 "Computing TCP's Retransmission Timer", RFC 6298, 771 June 2011. 773 [RFC6928] Chu, J., Dukkipati, N., Cheng, Y., and M. Mathis, 774 "Increasing TCP's Initial Window", RFC 6928, April 2013. 776 10.2.
Informative References 778 [Bis08] Biswas and Fairhurst, "A Practical Evaluation of 779 Congestion Window Validation Behaviour, 9th Annual 780 Postgraduate Symposium in the Convergence of 781 Telecommunications, Networking and Broadcasting (PGNet), 782 Liverpool, UK", June 2008. 784 [Bis10] Biswas, Sathiaseelan, Secchi, and Fairhurst, "Analysing 785 TCP for Bursty Traffic, Int'l J. of Communications, 786 Network and System Sciences, 7(3)", June 2010. 788 [Bis11] Biswas, "PhD Thesis, Internet congestion control for 789 variable rate TCP traffic, School of Engineering, 790 University of Aberdeen", June 2011. 792 [Fai12] Fairhurst, Biswas, Biswas, and Biswas, "Enhancing TCP 793 Performance to support Variable-Rate Traffic, 2nd Capacity 794 Sharing Workshop, ACM CoNEXT, Nice, France, 10th December 795 2012.", June 2008. 797 [Hug01] Hughes, Touch, and Heidemann, "Issues in TCP 798 Slow-Start Restart After Idle (Work-in-Progress)", 799 December 2001. 801 [Liu07] Liu, Allman, Jin, and Wang, "Congestion Control without a 802 Startup Phase, 5th International Workshop on Protocols for 803 Fast Long-Distance Networks (PFLDnet), Los Angeles, 804 California, USA", February 2007. 806 Authors' Addresses 808 Godred Fairhurst 809 University of Aberdeen 810 School of Engineering 811 Fraser Noble Building 812 Aberdeen, Scotland AB24 3UE 813 UK 815 Email: gorry@erg.abdn.ac.uk 816 URI: http://www.erg.abdn.ac.uk 818 Arjuna Sathiaseelan 819 University of Aberdeen 820 School of Engineering 821 Fraser Noble Building 822 Aberdeen, Scotland AB24 3UE 823 UK 825 Email: arjuna@erg.abdn.ac.uk 826 URI: http://www.erg.abdn.ac.uk 828 Raffaello Secchi 829 University of Aberdeen 830 School of Engineering 831 Fraser Noble Building 832 Aberdeen, Scotland AB24 3UE 833 UK 835 Email: raffaello@erg.abdn.ac.uk 836 URI: http://www.erg.abdn.ac.uk