idnits 2.17.1 draft-ietf-tcpm-newcwv-02.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- -- The draft header indicates that this document obsoletes RFC2861, but the abstract doesn't seem to directly say this. It does mention RFC2861 though, so this could be OK. -- The draft header indicates that this document updates RFC5681, but the abstract doesn't seem to directly say this. It does mention RFC5681 though, so this could be OK. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year (Using the creation date from RFC5681, updated by this document, for RFC5378 checks: 2006-01-26) -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (July 01, 2013) is 3950 days in the past. Is this intentional? 
Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) ** Obsolete normative reference: RFC 793 (Obsoleted by RFC 9293) ** Obsolete normative reference: RFC 2861 (Obsoleted by RFC 7661) ** Obsolete normative reference: RFC 3517 (Obsoleted by RFC 6675) ** Downref: Normative reference to an Experimental RFC: RFC 6928 Summary: 4 errors (**), 0 flaws (~~), 1 warning (==), 4 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 2 TCPM Working Group G. Fairhurst 3 Internet-Draft A. Sathiaseelan 4 Obsoletes: 2861 (if approved) R. Secchi 5 Updates: 5681 (if approved) University of Aberdeen 6 Intended status: Standards Track July 01, 2013 7 Expires: January 2, 2014 9 Updating TCP to support Rate-Limited Traffic 10 draft-ietf-tcpm-newcwv-02 12 Abstract 14 This document proposes an update to RFC 5681 to address issues that 15 arise when TCP is used to support traffic that exhibits periods where 16 the sending rate is limited by the application rather than the 17 congestion window. It updates TCP to allow a TCP sender to restart 18 quickly following either an idle or rate-limited interval. This 19 method is expected to benefit applications that send rate-limited 20 traffic using TCP, while also providing an appropriate response if 21 congestion is experienced. 23 It also evaluates the Experimental specification of TCP Congestion 24 Window Validation, CWV, defined in RFC 2861, and concludes that RFC 25 2861 sought to address important issues, but failed to deliver a 26 widely used solution. This document therefore recommends that the 27 status of RFC 2861 is moved from Experimental to Historic, and that 28 it is replaced by the current specification. 
30 NOTE: The standards status of this WG document is under review for 31 consideration as either Experimental (EXP) or Proposed Standard (PS). 32 This decision will be made later as the document is finalised. 34 Status of this Memo 36 This Internet-Draft is submitted in full conformance with the 37 provisions of BCP 78 and BCP 79. 39 Internet-Drafts are working documents of the Internet Engineering 40 Task Force (IETF). Note that other groups may also distribute 41 working documents as Internet-Drafts. The list of current Internet- 42 Drafts is at http://datatracker.ietf.org/drafts/current/. 44 Internet-Drafts are draft documents valid for a maximum of six months 45 and may be updated, replaced, or obsoleted by other documents at any 46 time. It is inappropriate to use Internet-Drafts as reference 47 material or to cite them other than as "work in progress." 48 This Internet-Draft will expire on January 2, 2014. 50 Copyright Notice 52 Copyright (c) 2013 IETF Trust and the persons identified as the 53 document authors. All rights reserved. 55 This document is subject to BCP 78 and the IETF Trust's Legal 56 Provisions Relating to IETF Documents 57 (http://trustee.ietf.org/license-info) in effect on the date of 58 publication of this document. Please review these documents 59 carefully, as they describe your rights and restrictions with respect 60 to this document. Code Components extracted from this document must 61 include Simplified BSD License text as described in Section 4.e of 62 the Trust Legal Provisions and are provided without warranty as 63 described in the Simplified BSD License. 65 Table of Contents 67 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 68 2. Reviewing experience with TCP-CWV . . . . . . . . . . . . . . 5 69 3. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 5 70 4. An updated TCP response to idle and application-limited 71 periods . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 72 4.1. 
A method for preserving cwnd during the idle and 73 application-limited periods. . . . . . . . . . . . . . . . 7 74 4.2. Initialisation . . . . . . . . . . . . . . . . . . . . . . 8 75 4.3. The nonvalidated phase . . . . . . . . . . . . . . . . . . 8 76 4.4. TCP congestion control during the nonvalidated phase . . . 8 77 4.4.1. Response to congestion in the nonvalidated phase . . . 9 78 4.4.2. Adjustment at the end of the nonvalidated phase . . . 10 79 4.4.3. Examples of Implementation . . . . . . . . . . . . . . 11 80 5. Determining a safe period to preserve cwnd . . . . . . . . . . 12 81 6. Security Considerations . . . . . . . . . . . . . . . . . . . 13 82 7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 13 83 8. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 13 84 9. Author Notes . . . . . . . . . . . . . . . . . . . . . . . . . 14 85 9.1. Other related work . . . . . . . . . . . . . . . . . . . . 14 86 9.2. Revision notes . . . . . . . . . . . . . . . . . . . . . . 16 87 10. References . . . . . . . . . . . . . . . . . . . . . . . . . . 17 88 10.1. Normative References . . . . . . . . . . . . . . . . . . . 17 89 10.2. Informative References . . . . . . . . . . . . . . . . . . 18 90 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 19 92 1. Introduction 94 TCP is used to support a range of application behaviours. The TCP 95 congestion window (cwnd) controls the number of unacknowledged 96 packets/bytes that a TCP flow may have in the network at any time, a 97 value known as the FlightSize [RFC5681]. A bulk application will 98 always have data available to transmit. The rate at which it sends 99 is therefore limited by the maximum permitted by the receiver 100 advertised window and the sender congestion window (cwnd). In 101 contrast, a rate-limited application will experience periods when the 102 sender is either idle or is unable to send at the maximum rate 103 permitted by the cwnd. 
This latter case is called rate-limited. The 104 focus of this document is on the operation of TCP in such an idle or 105 rate-limited case. 107 Standard TCP [RFC5681] requires the cwnd to be reset to the restart 108 window (RW) when an application becomes idle. [RFC2861] noted that 109 this TCP behaviour was not always observed in current 110 implementations. Recent experiments [Bis08] confirm this to still be 111 the case. 113 Standard TCP does not impose additional restrictions on the growth of 114 the cwnd when a TCP sender is rate-limited. A rate-limited sender 115 may therefore grow a cwnd far beyond that corresponding to the 116 current transmit rate, resulting in a value that does not reflect 117 current information about the state of the network path the flow is 118 using. Use of such an invalid cwnd may result in reduced application 119 performance and/or could significantly contribute to network 120 congestion. 122 [RFC2861] proposed a solution to these issues in an experimental 123 method known as Congestion Window Validation (CWV). CWV was intended 124 to help reduce cases where TCP accumulated an invalid cwnd. The use 125 and drawbacks of using the CWV algorithm in RFC 2861 with an 126 application are discussed in Section 2. 128 Section 3 defines relevant terminology. 130 Section 4 specifies an alternative to CWV that seeks to address the 131 same issues, but does this in a way that is expected to mitigate the 132 impact on an application that varies its sending rate. The method 133 described applies to both a rate-limited and an idle condition. 134 Section 5 describes the rationale for selecting the safe period to 135 preserve the cwnd. 137 2. Reviewing experience with TCP-CWV 139 RFC 2861 described a simple modification to the TCP congestion 140 control algorithm that decayed the cwnd after the transition to a 141 "sufficiently-long" idle period. 
This used the slow-start threshold
(ssthresh) to save information about the previous value of the
congestion window. The approach relaxed the standard TCP behaviour
[RFC5681] for an idle session, with the intention of improving
application performance. CWV also modified the behaviour for a
rate-limited session where a sender transmitted at a rate less than
allowed by cwnd.

RFC 2861 has been implemented in some mainstream operating systems as
the default behaviour [Bis08]. Analysis (e.g. [Bis10] [Fai12]) has
shown that a TCP sender using CWV is able to use available capacity
on a shared path after an idle period. This can benefit some
applications, especially over long delay paths, when compared to the
slow-start restart specified by standard TCP. However, CWV would
only benefit an application if the idle period were less than several
Retransmission Time Out (RTO) intervals [RFC6298], since the
behaviour would otherwise be the same as for standard TCP, which
resets the cwnd to the TCP Restart Window (RW) after this period.

Experience with RFC 2861 suggests that although the CWV method
benefited the network in a rate-limited scenario (reducing the
probability of network congestion), the behaviour was too
conservative for many common rate-limited applications. The
mechanism therefore did not offer the desired increase in
application performance for rate-limited applications, and it is
unclear whether applications actually use this mechanism in the
general Internet.

It is therefore concluded that CWV, as defined in RFC 2861, was often
a poor solution for many rate-limited applications. It had the
correct motivation, but took the wrong approach to solving this
problem.

3. Terminology

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119].

The document assumes familiarity with the terminology of TCP
congestion control [RFC5681].

The following new terminology is introduced:

pipeACK sample: A measure of the volume of data acknowledged by the
network within an RTT.

pipeACK variable: A variable that measures the available capacity
using the set of pipeACK samples.

pipeACK Sampling Period: The maximum period that a measured pipeACK
sample may influence the pipeACK variable.

Non-validated phase: The phase where the cwnd reflects a previous
measurement of the available path capacity.

Non-validated period, NVP: The maximum period for which cwnd is
preserved in the non-validated phase.

Rate-limited: A TCP flow that does not consume more than one half of
cwnd, and hence operates in the non-validated phase.

Validated phase: The phase where the cwnd reflects a current estimate
of the available path capacity.

4. An updated TCP response to idle and application-limited periods

This section proposes an update to the TCP congestion control
behaviour during an idle or rate-limited period. The new method
permits a TCP sender to preserve the cwnd when an application becomes
idle for a period of time (the non-validated period, NVP, see section
5). The period during which actual usage is less than allowed by
cwnd is called the non-validated phase. This method allows an
application to resume transmission at a previous rate without
incurring the delay of slow-start. However, if the TCP sender
experiences congestion using the preserved cwnd, it is required to
immediately reset the cwnd to an appropriate value specified by the
method.
If a sender
does not take advantage of the preserved cwnd within the NVP, the
value of cwnd is reduced, ensuring the value better reflects the
capacity that was actually used recently.

It is expected that this update will satisfy the requirements of many
rate-limited applications and at the same time provide an appropriate
method for use in the Internet. It also reduces the incentive for an
application to send data simply to keep transport congestion state.
(This is sometimes known as "padding").

The new method does not differentiate between times when the sender
has become idle or rate-limited. This is partly a response to
recognition that some applications wish to transmit at a rate less
than allowed by the sender cwnd, and that it can be hard to make a
distinction between rate-limited and idle behaviour. This is
expected to encourage applications and TCP stacks to use
standards-based congestion control methods. It may also encourage
the use of long-lived connections where this offers benefit (such as
persistent HTTP).

The method is specified in the following subsections.

4.1. A method for preserving cwnd during the idle and
application-limited periods.

[RFC5681] defines a variable, FlightSize, that indicates the amount
of outstanding data in the network. This is assumed to be equal to
the value of Pipe calculated based on the pipe algorithm [RFC3517].
In RFC 5681 this value is used during loss recovery, whereas in this
method a new variable "pipeACK" is introduced to measure the
acknowledged size of the pipe, which is used to determine if the
sender has validated the cwnd.

A sender determines a pipeACK sample by measuring the volume of data
that was acknowledged by the network over the period of a measured
Round Trip Time (RTT).
Using the variables defined in [RFC3517], a
value could be measured by caching the value of HighACK and after one
RTT measuring the difference between the cached HighACK value and the
current HighACK value. Other equivalent methods may be used.

A sender is not required to continuously update the pipeACK variable
after each received ACK, but SHOULD measure a pipeACK sample at least
once per RTT when it has sent unacknowledged segments.

The pipeACK variable MAY consider multiple pipeACK samples over the
pipeACK Sampling Period. The value of the pipeACK variable MUST NOT
exceed the maximum (highest value) measured within the sampling
period. This specification defines the pipeACK Sampling Period as
Max(3*RTT, 1 second). This period enables a sender to compensate for
large fluctuations in the sending rate, where there may be pauses in
transmission, and allows the pipeACK variable to reflect the largest
recently measured pipeACK sample.

When no measurements are available, the pipeACK variable is set to
the maximum (undefined) value. This value is used to inhibit
entering the non-validated phase until the first new measurement of a
pipeACK sample.

The method RECOMMENDS that the TCP SACK option [RFC3517] is enabled.
This allows the sender to more accurately determine the number of
missing bytes during the loss recovery phase, and using this method
will result in a higher cwnd following loss.

4.2. Initialisation

A sender starts a TCP connection in the Validated phase and
initialises the pipeACK variable to the maximum (undefined) value.

4.3. The nonvalidated phase

The updated method creates a new TCP sender phase that captures
whether the cwnd reflects a validated or non-validated value. The
phases are defined as:

o Validated phase: pipeACK >= (1/2)*cwnd.
   This is the normal phase,
   where cwnd is expected to be an approximate indication of the
   capacity currently available along the network path, and the
   standard methods are used to increase cwnd (currently [RFC5681]).
   The rule for transitioning to the non-validated phase is specified
   in section 4.4.

o Non-validated phase: pipeACK < (1/2)*cwnd. This is the phase where
   the cwnd has a value based on a previous measurement of the
   available capacity, and the usage of this capacity has not been
   validated in the pipeACK Sampling Period. That is, it is not
   known whether the cwnd reflects the currently available capacity
   along the network path. The mechanisms to be used in this phase
   seek to determine a safe value for cwnd and an appropriate
   reaction to congestion. These mechanisms are specified in section
   4.4.

The value 1/2 was selected to reduce the effects of variations in the
pipeACK variable, and to allow the sender some flexibility in when it
sends data.

4.4. TCP congestion control during the nonvalidated phase

A TCP sender MUST enter the non-validated phase when the pipeACK is
less than (1/2)*cwnd.

A TCP sender that enters the non-validated phase will preserve the
cwnd (i.e., this neither grows nor reduces while the sender remains
in this phase). If the sender receives an indication of congestion
(loss or Explicit Congestion Notification, ECN, mark [RFC3168]) it
uses the method described below. The phase is concluded after a
fixed period of time (the NVP, as explained in section 4.4.2) or when
the sender transmits sufficient data so that pipeACK > (1/2)*cwnd
(i.e. it is no longer rate-limited).

The behaviour in the non-validated phase is specified as:

o The cwnd is not increased when ACK packets are received in this
   phase.

o If the sender receives an indication of congestion while in the
   non-validated phase (i.e.
   detects loss, or an ECN mark), the
   sender MUST exit the non-validated phase (reducing the cwnd as
   defined in section 4.4.1).

o If the Retransmission Time Out (RTO) expires while in the
   non-validated phase, the sender MUST exit the non-validated phase.
   It then resumes using the Standard TCP RTO mechanism [RFC5681].
   (The resulting reduction of cwnd described in section 4.4.2 is
   appropriate, since any accumulated path history is considered
   unreliable).

o A sender with a pipeACK variable greater than (1/2)*cwnd SHOULD
   enter the validated phase. (A rate-limited sender will not
   normally be impacted by whether it is in a validated or
   non-validated phase, since it will normally not consume the entire
   cwnd. However, a change to the validated phase will release the
   sender from constraints on the growth of cwnd, and restore the use
   of the standard congestion response.)

4.4.1. Response to congestion in the nonvalidated phase

Reception of congestion feedback while in the non-validated phase is
interpreted as an indication that it was inappropriate for the sender
to use the preserved cwnd. The sender is therefore required to
quickly reduce the rate to avoid further congestion. Since the cwnd
does not have a validated value, a new cwnd value must be selected
based on the utilised rate.

A sender that detects a packet-drop or receives an ECN marked packet
MUST record the current FlightSize in the variable LossFlightSize and
calculate a safe cwnd, by setting it to the value specified in
Section 3.2 of [RFC5681].

A TCP sender MUST calculate a safe cwnd to use for loss recovery
using the method below:

   cwnd = Min(cwnd/2, Max(pipeACK, LossFlightSize)).

This new cwnd is set to reflect that a non-validated cwnd may be much
larger than the actual FlightSize, or recently used FlightSize
(recorded in pipeACK).
The updated cwnd therefore prevents overshoot
by a sender significantly increasing its transmission rate during the
recovery period.

At the end of the recovery phase, the TCP sender MUST reset the cwnd
using the method below:

   cwnd = ((LossFlightSize - R)/2).

where R is the volume of data that was retransmitted during the
recovery phase. This follows the method proposed for Jump Start
[Liu07]. The inclusion of the term R makes this adjustment more
conservative than standard TCP. (This is required, since the sender
may have sent more segments than a Standard TCP sender would have
done. The additional reduction is beneficial when the LossFlightSize
significantly overshoots the available path capacity incurring
significant loss, for instance an intense traffic burst following a
non-validated period.)

If the sender implements a method that allows it to identify the
number of ECN-marked segments within a window that were observed by
the receiver, the sender SHOULD use the method above, further
reducing R by the number of marked segments.

The sender MUST also re-initialise the pipeACK variable to the
maximum (undefined) value. This ensures that standard TCP methods
are used immediately after completing loss recovery.

4.4.2. Adjustment at the end of the nonvalidated phase

During the non-validated phase, a sender can produce bursts of data
of up to the cwnd in size. While this is no different to standard
TCP, it is desirable to control the maximum burst size, e.g. by
setting a burst size limit, using a pacing algorithm, or some other
method [Hug01].

A sender that remains in the non-validated phase for a period
greater than the NVP is required to adjust its congestion control
state. If the sender exits the non-validated phase after this
period, it MUST update the ssthresh:

   ssthresh = max(ssthresh, 3*cwnd/4).
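
The window calculations of sections 4.4.1 and 4.4.2 can be illustrated
with a minimal, non-normative sketch. It covers the safe cwnd used
for loss recovery, the reset at the end of recovery, and the
adjustment at the end of the NVP (the ssthresh update above, followed
by the reduction of cwnd to max(cwnd/2, IW) that this section goes on
to specify). The class and function names are hypothetical. The use
of byte units and integer division is an assumption of the sketch,
as is R never exceeding LossFlightSize.

```python
from dataclasses import dataclass


@dataclass
class CcState:
    """Hypothetical container for the sender's congestion state."""
    cwnd: int      # congestion window, in bytes (assumed unit)
    ssthresh: int  # slow-start threshold, in bytes (assumed unit)


def on_congestion_nonvalidated(state, pipe_ack, flight_size):
    """Section 4.4.1: pick a safe cwnd for loss recovery.

    Records the current FlightSize as LossFlightSize and returns it
    so the caller can use it at the end of recovery.
    """
    loss_flight_size = flight_size
    state.cwnd = min(state.cwnd // 2, max(pipe_ack, loss_flight_size))
    return loss_flight_size


def on_recovery_end(state, loss_flight_size, retransmitted):
    """Section 4.4.1: cwnd = (LossFlightSize - R)/2.

    Assumes retransmitted (R) <= loss_flight_size; no lower bound is
    applied because the text does not state one.
    """
    state.cwnd = (loss_flight_size - retransmitted) // 2


def on_nvp_end(state, initial_window):
    """Section 4.4.2: adjustment after one NVP in the non-validated phase."""
    # ssthresh is updated first, using the cwnd that was preserved.
    state.ssthresh = max(state.ssthresh, 3 * state.cwnd // 4)
    state.cwnd = max(state.cwnd // 2, initial_window)
```

For example, a sender with a preserved cwnd of 40000 bytes, a pipeACK
of 12000 bytes and a FlightSize of 8000 bytes would enter recovery
with a cwnd of 12000 bytes, since Min(20000, Max(12000, 8000)) is
12000.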

(This adjustment of ssthresh ensures that the sender records that it
has safely sustained the present rate. The change is beneficial to
rate-limited flows that encounter occasional congestion, and could
otherwise suffer an unwanted additional delay in recovering the
sending rate.)

The sender MUST then update cwnd to be not greater than:

   cwnd = max(1/2*cwnd, IW).

where IW is the appropriate TCP initial window used by the TCP
sender (e.g. [RFC5681]).

(This adjustment ensures that the sender responds conservatively at
the end of the non-validated phase by reducing the cwnd to better
reflect the current rate of the sender. The cwnd update does not
take into account the FlightSize or pipeACK value because these
values only reflect historical data and do not reflect the current
sending rate.)

4.4.3. Examples of Implementation

This section is intended to provide informative examples of
implementation methods. Implementations may choose to use other
methods that comply with the normative requirements.

XXX This section is work in progress - discussion is welcome to help
complete this section XXX

A pipeACK sample may be measured once each RTT. This reduces the
processing burden compared with recalculating after each
acknowledgement, and also reduces storage requirements at the sender.

Since application behaviour can be bursty using CWV, it may be
desirable to implement a maximum filter to accumulate the measured
values so that the pipeACK variable records the largest pipeACK
sample within the pipeACK Sampling Period. One simple way to
implement this is to divide the pipeACK Sampling Period into several
(e.g. 5) equal length measurement periods. The sender then records
the start time for each measurement period and the highest measured
pipeACK sample.
At the end of the measurement period, any
measurement(s) that are older than the pipeACK Sampling Period are
discarded. The pipeACK variable is then assigned the largest of the
set of the highest measured values.

+----------+----------+           +----------+---......
| Sample A | Sample B | No        | Sample C | Sample D
|          |          | Sample    |          |
|  |\ 5    |          |           |          |
|  | |     |          |           |   /\ 4   |
|  | |     |  |\ 3    |           |  |  \    |
|  |  \    |  | \---  |           |  /   \   |    /| 2
|/     \------|     - |           | /     \------/  \...
+----------+---------\+----/ /----+/---------+-------------> Time

<------------------------------------------------|
                  Sampling Period            Current Time

Figure XX: Example of measuring pipeACK samples

Figure XX shows an example of how measurement samples may be
collected. At the time represented by the figure, new samples are
being accumulated into Sample D. Three previous samples also fall
within the pipeACK Sampling Period: A, B, and C. There was also a
period of inactivity between Samples B and C during which no
measurements were taken. The current value of the pipeACK variable
will be 5, the maximum across all samples.

After one further measurement period, Sample A will be discarded,
since it is then older than the pipeACK Sampling Period, and the
pipeACK variable will be recalculated. Its value will be the larger
of Sample C or the final value accumulated in Sample D.

The NVP does not necessarily require a new timer to be implemented.
An alternative is to record a timestamp when the sender enters the
non-validated phase. Each time a sender transmits a new segment,
this timestamp may be used to determine if the NVP has expired. If
the period has expired, the sender may take into account how many
multiples of the NVP have elapsed and make one reduction (as defined
in section 4.4.2) for each NVP.

5.
Determining a safe period to preserve cwnd

This section documents the rationale for selecting the maximum period
that cwnd may be preserved, known as the non-validated period, NVP.

Limiting the period that cwnd may be preserved avoids undesirable
side effects that would result if the cwnd were to be kept
unnecessarily high for an arbitrarily long period, which was a part
of the problem that CWV originally attempted to address. The period
a sender may safely preserve the cwnd is a function of the period
that a network path is expected to sustain the capacity reflected by
cwnd. There is no ideal choice for this time.

A period of five minutes was chosen for this NVP. This is a
compromise that was larger than the idle intervals of common
applications, but not sufficiently larger than the period for which
the capacity of an Internet path may commonly be regarded as stable.
The capacity of wired networks is usually relatively stable for
periods of several minutes, and load stability increases with
capacity. This suggests that cwnd may be preserved for at least a
few minutes.

There are cases where the TCP throughput exhibits significant
variability over a time less than five minutes. Examples could
include wireless topologies, where TCP rate variations may fluctuate
on the order of a few seconds as a consequence of medium access
protocol instabilities. Mobility changes may also impact TCP
performance over short time scales. Senders that observe such rapid
changes in the path characteristic may also experience increased
congestion with the new method; however, such variation would likely
also impact TCP's behaviour when supporting interactive and bulk
applications.
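
As an informative illustration of how the five-minute NVP could be
applied without a dedicated timer, the timestamp approach described
in section 4.4.3 might be sketched as follows. This is a sketch under
stated assumptions rather than a definitive implementation: the
function and parameter names are hypothetical, times are in seconds,
and windows are in bytes with integer division.

```python
NVP_SECONDS = 5 * 60  # the five-minute NVP chosen in this section


def apply_nvp_expiry(cwnd, ssthresh, entered_at, now, initial_window):
    """Apply one end-of-NVP reduction per whole NVP that has elapsed.

    entered_at is the timestamp recorded on entering the non-validated
    phase; now is the time at which a new segment is about to be sent.
    Returns the updated (cwnd, ssthresh) and the number of NVPs applied.
    """
    periods = int((now - entered_at) // NVP_SECONDS)
    for _ in range(periods):
        # Same two updates as the adjustment at the end of the
        # non-validated phase: ssthresh first, then cwnd.
        ssthresh = max(ssthresh, 3 * cwnd // 4)
        cwnd = max(cwnd // 2, initial_window)
    return cwnd, ssthresh, periods
```

A caller would be expected to advance or clear the recorded timestamp
after applying the reductions; how this is done is left open here.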
527 Routing algorithms may modify the network path, disrupting the RTT 528 measurement and changing the capacity available to a TCP connection, 529 however such changes do not often occur within a time frame of a few 530 minutes. 532 The value of five minutes is therefore expected to be sufficient for 533 most current applications. Simulation studies (e.g. [Bis11]) also 534 suggest that for many practical applications, the performance using 535 this value will not be significantly different to that observed using 536 a non-standard method that does not reset the cwnd after idle. 538 Finally, other TCP sender mechanisms have used a 5 minute timer, and 539 there could be simplifications in some implementations by reusing the 540 same interval. TCP defines a default user timeout of 5 minutes 541 [RFC0793] i.e. how long transmitted data may remain unacknowledged 542 before a connection is forcefully closed. 544 6. Security Considerations 546 General security considerations concerning TCP congestion control are 547 discussed in [RFC5681]. This document describes an algorithm that 548 updates one aspect of the congestion control procedures, and so the 549 considerations described in RFC 5681 also apply to this algorithm. 551 7. IANA Considerations 553 There are no IANA considerations. 555 8. Acknowledgments 557 The authors acknowledge the contributions of Dr I Biswas, Mr Ziaul 558 Hossain in supporting the evaluation of CWV and for their help in 559 developing the mechanisms proposed in this draft. We also 560 acknowledge comments received from the Internet Congestion Control 561 Research Group, in particular Yuchung Cheng, Mirja Kuehlewind, and 562 Joe Touch. This work was part-funded by the European Community under 563 its Seventh Framework Programme through the Reducing Internet 564 Transport Latency (RITE) project (ICT-317700). 566 9. Author Notes 568 9.1. 
Other related work 570 There are several issues to be discussed more widely: 572 o Should the method explicitly state a procedure for limiting 573 burstiness or pacing? 575 This is often regarded as good practice, but is not presently a 576 formal part of TCP. draft-hughes-restart-00.txt provides some 577 discussion of this topic. 579 o There are potential interactions with the Experimental update in 580 [RFC6928] that raises the TCP initial Window to ten segments, do 581 these cases need to be elaborated? 583 This relates to the Experimental specification for increasing 584 the TCP IW defined in RFC 6928. 586 The two methods have different functions and different response 587 to loss/congestion. 589 RFC 6928 proposes an experimental update to TCP that would 590 increase the IW to ten segments. This would allow faster 591 opening of the cwnd, and also a large (same size) restart 592 window. This approach is based on the assumption that many 593 forward paths can sustain bursts of up to ten segments without 594 (appreciable) loss. Such a significant increase in cwnd must 595 be matched with an equally large reduction of cwnd if loss/ 596 congestion is detected, and such a congestion indication is 597 likely to require future use of IW=10 to be disabled for this 598 path for some time. This guards against the unwanted behaviour 599 of a series of short flows continuously flooding a network path 600 without network congestion feedback. 602 In contrast, this document proposes an update with a rationale 603 that relies on recent previous path history to select an 604 appropriate cwnd after restart. 606 The behaviour differs in three ways: 608 1) For applications that send little initially, new-cwv may 609 constrain more than RFC 6928, but would not require the 610 connection to reset any path information when a restart 611 incurred loss. 
In contrast, new-cwv would allow the TCP 612 connection to preserve the cached cwnd; any loss would impact 613 cwnd, but not impact other flows. 615 2) For applications that utilise more capacity than provided by 616 a cwnd of 10 segments, this method would permit a larger 617 restart window compared to a restart using the method in RFC 618 6928. This is justified by the recent path history. 620 3) new-cwv is intended to also be used for rate-limited 621 applications, where the application sends, but does not seek to 622 fully utilise the cwnd. In this case, new-cwv constrains the 623 cwnd to that justified by the recent path history. The 624 performance trade-offs are hence different, and it would be 625 possible to enable new-cwv when also using the method in RFC 626 6928, and yield benefits. 628 o There is potential overlap with the Laminar proposal 629 (draft-mathis-tcpm-tcp-laminar) 631 The current draft was intended as a standards-track update to 632 TCP, rather than a new transport variant. At least, it would 633 be good to understand how the two interact and whether there is 634 a possibility of a single method. 636 o There is potential performance loss following loss of a short burst 637 (off list with M Allman) 639 A sender can transmit several segments then become idle. If 640 the first segments are all ACK'ed, the ssthresh collapses to a 641 small value (no new data is sent by the idle sender). Loss of 642 the later data results in congestion (e.g. a RED drop or 643 some other cause, rather than the maximum rate of this flow). 644 When the sender performs loss recovery it may have an 645 appreciable pipeACK and cwnd, but a very low FlightSize - the 646 Standard algorithm results in an unusually low cwnd (1/2 647 FlightSize). 649 A constant rate flow would have maintained a FlightSize 650 appropriate to pipeACK (cwnd if it is a bulk flow). 652 Could this be fixed by adding a new state variable? It could 653 also be argued this is a corner case (e.g.
loss of only the 654 last segments would have resulted in RTO), but the impact could be 655 significant. 657 o There is potential interaction with TCP Control Block Sharing (M 658 Welzl) 660 An application that is non-validated can accumulate a cwnd that 661 is larger than the actual capacity. Is this a fair value to 662 use in TCB sharing? 664 We propose that TCB sharing should use the pipeACK in place of 665 cwnd when a TCP sender is in the Nonvalidated phase. This 666 value better reflects the capacity that the flow has utilised 667 in the network path. 669 9.2. Revision notes 671 RFC-Editor note: please remove this section prior to publication. 673 Draft 03 was submitted to ICCRG to receive comments and feedback. 675 Draft 04 contained the first set of clarifications after feedback: 677 o Changed name to application limited and used the term rate-limited 678 in all places. 680 o Added justification and many minor changes suggested on the list. 682 o Added text to tie-in with more accurate ECN marking. 684 o Added ref to Hug01 686 Draft 05 contained various updates: 688 o New text to redefine how to measure the acknowledged pipe, 689 differentiating this from the FlightSize, and hence avoiding 690 previous issues with infrequent large bursts of data not being 691 validated. A key new feature is that pipeACK only triggers 692 leaving the NVP after the size of the pipe has been acknowledged. 693 This removed the need for hysteresis. 695 o Reduction values were changed to 1/2, following analysis of 696 suggestions from ICCRG. This also sets the "target" cwnd as twice 697 the used rate for the non-validated case. 699 o Introduced a symbolic name (NVP) to denote the 5 minute period. 701 Draft 06 contained various updates: 703 o Required reset of pipeACK after congestion. 705 o Added comment on the effect of congestion after a short burst (M. 706 Allman). 708 o Correction of minor typos.
710 WG draft 00 contained various updates: 712 o Updated initialisation of pipeACK to maximum value. 714 o Added note on intended status still to be determined. 716 WG draft 01 contained: 718 o Added corrections from Richard Scheffenegger. 720 o Raffaello Secchi added to the mechanism, based on implementation 721 experience. 723 o Removed the requirement for the TCP SACK option 724 [RFC3517] to be enabled. Although it may be desirable to use 725 SACK, this is not essential to the algorithm. 727 o Added the notion of the sampling period to accommodate large rate 728 variations and ensure that the method is stable. This algorithm 729 is to be validated through implementation. 731 WG draft 02 contained: 733 o Clarified language around pipeACK variable and pipeACK sample 734 (feedback from Aris Angelogiannopoulos). 736 10. References 738 10.1. Normative References 740 [RFC0793] Postel, J., "Transmission Control Protocol", STD 7, 741 RFC 793, September 1981. 743 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 744 Requirement Levels", BCP 14, RFC 2119, March 1997. 746 [RFC2861] Handley, M., Padhye, J., and S. Floyd, "TCP Congestion 747 Window Validation", RFC 2861, June 2000. 749 [RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition 750 of Explicit Congestion Notification (ECN) to IP", 751 RFC 3168, September 2001. 753 [RFC3517] Blanton, E., Allman, M., Fall, K., and L. Wang, "A 754 Conservative Selective Acknowledgment (SACK)-based Loss 755 Recovery Algorithm for TCP", RFC 3517, April 2003. 757 [RFC5681] Allman, M., Paxson, V., and E. Blanton, "TCP Congestion 758 Control", RFC 5681, September 2009. 760 [RFC6298] Paxson, V., Allman, M., Chu, J., and M. Sargent, 761 "Computing TCP's Retransmission Timer", RFC 6298, 762 June 2011. 764 [RFC6928] Chu, J., Dukkipati, N., Cheng, Y., and M. Mathis, 765 "Increasing TCP's Initial Window", RFC 6928, April 2013. 767 10.2.
Informative References 769 [Bis08] Biswas and Fairhurst, "A Practical Evaluation of 770 Congestion Window Validation Behaviour, 9th Annual 771 Postgraduate Symposium in the Convergence of 772 Telecommunications, Networking and Broadcasting (PGNet), 773 Liverpool, UK", June 2008. 775 [Bis10] Biswas, Sathiaseelan, Secchi, and Fairhurst, "Analysing 776 TCP for Bursty Traffic, Int'l J. of Communications, 777 Network and System Sciences, 7(3)", June 2010. 779 [Bis11] Biswas, "PhD Thesis, Internet congestion control for 780 variable rate TCP traffic, School of Engineering, 781 University of Aberdeen", June 2011. 783 [Fai12] Fairhurst, Biswas, Biswas, and Biswas, "Enhancing TCP 784 Performance to support Variable-Rate Traffic, 2nd Capacity 785 Sharing Workshop, ACM CoNEXT, Nice, France, 10th December 786 2012.", June 2008. 788 [Hug01] Hughes, Touch, and Heidemann, "Issues in TCP 789 Slow-Start Restart After Idle (Work-in-Progress)", 790 December 2001. 792 [Liu07] Liu, Allman, Jin, and Wang, "Congestion Control without a 793 Startup Phase, 5th International Workshop on Protocols for 794 Fast Long-Distance Networks (PFLDnet), Los Angeles, 795 California, USA", February 2007. 797 Authors' Addresses 799 Godred Fairhurst 800 University of Aberdeen 801 School of Engineering 802 Fraser Noble Building 803 Aberdeen, Scotland AB24 3UE 804 UK 806 Email: gorry@erg.abdn.ac.uk 807 URI: http://www.erg.abdn.ac.uk 809 Arjuna Sathiaseelan 810 University of Aberdeen 811 School of Engineering 812 Fraser Noble Building 813 Aberdeen, Scotland AB24 3UE 814 UK 816 Email: arjuna@erg.abdn.ac.uk 817 URI: http://www.erg.abdn.ac.uk 819 Raffaello Secchi 820 University of Aberdeen 821 School of Engineering 822 Fraser Noble Building 823 Aberdeen, Scotland AB24 3UE 824 UK 826 Email: raffaello@erg.abdn.ac.uk 827 URI: http://www.erg.abdn.ac.uk