TCP Maintenance and Minor                                   T. Henderson
Extensions Working Group                                          Boeing
Internet-Draft                                                  S. Floyd
Obsoletes: 3782 (if approved)                                       ICSI
Intended status: Standards Track                               A. Gurtov
Expires: July 18, 2012                                University of Oulu
                                                              Y. Nishida
                                                            WIDE Project
                                                        January 18, 2012

        The NewReno Modification to TCP's Fast Recovery Algorithm
                   draft-ietf-tcpm-rfc3782-bis-05.txt

Abstract

   RFC 5681 documents the following four intertwined TCP
   congestion control algorithms: slow start, congestion avoidance, fast
   retransmit, and fast recovery. RFC 5681 explicitly allows
   certain modifications of these algorithms, including modifications
   that use the TCP Selective Acknowledgement (SACK) option (RFC 2883),
   and modifications that respond to "partial acknowledgments" (ACKs
   which cover new data, but not all the data outstanding when loss was
   detected) in the absence of SACK. This document describes a specific
   algorithm for responding to partial acknowledgments, referred to as
   NewReno. This response to partial acknowledgments was first proposed
   by Janey Hoe. This document obsoletes RFC 3782.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on July 18, 2012.

Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as
   the document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

   This document may contain material from IETF Documents or IETF
   Contributions published or made publicly available before November
   10, 2008. The person(s) controlling the copyright in some of this
   material may not have granted the IETF Trust the right to allow
   modifications of such material outside the IETF Standards Process.
   Without obtaining an adequate license from the person(s) controlling
   the copyright in such materials, this document may not be modified
   outside the IETF Standards Process, and derivative works of it may
   not be created outside the IETF Standards Process, except to format
   it for publication as an RFC or to translate it into languages other
   than English.

1. Introduction

   For the typical implementation of the TCP Fast Recovery algorithm
   described in [RFC5681] (first implemented in the 1990 BSD Reno
   release, and referred to as the Reno algorithm in [FF96]), the TCP
   data sender only retransmits a packet after a retransmit timeout has
   occurred, or after three duplicate acknowledgments have arrived
   triggering the Fast Retransmit algorithm. A single retransmit
   timeout might result in the retransmission of several data packets,
   but each invocation of the Fast Retransmit algorithm in RFC 5681
   leads to the retransmission of only a single data packet.

   Two problems arise with Reno TCP when multiple packet losses occur
   in a single window. First, Reno will often take a timeout, as
   has been documented in [Hoe95]. Second, even if a retransmission
   timeout is avoided, multiple fast retransmits and window reductions
   can occur, as documented in [F94]. When multiple packet losses
   occur, if the SACK option [RFC2883] is available, the TCP sender
   has the information to make intelligent decisions about which packets
   to retransmit and which packets not to retransmit during Fast
   Recovery. This document applies to TCP connections that are
   unable to use the TCP Selective Acknowledgement (SACK) option,
   either because the option is not locally supported or
   because the TCP peer did not indicate a willingness to use SACK.

   In the absence of SACK, there is little information available to the
   TCP sender in making retransmission decisions during Fast
   Recovery.
   From the three duplicate acknowledgments, the sender
   infers a packet loss, and retransmits the indicated packet. After
   this, the data sender could receive additional duplicate
   acknowledgments, as the data receiver acknowledges additional data
   packets that were already in flight when the sender entered Fast
   Retransmit.

   In the case of multiple packets dropped from a single window of data,
   the first new information available to the sender comes when the
   sender receives an acknowledgment for the retransmitted packet (that
   is, the packet retransmitted when Fast Retransmit was first
   entered). If there is a single packet drop and no reordering, then
   the acknowledgment for this packet will acknowledge all of the
   packets transmitted before Fast Retransmit was entered. However, if
   there are multiple packet drops, then the acknowledgment for the
   retransmitted packet will acknowledge some but not all of the packets
   transmitted before the Fast Retransmit. We call this acknowledgment
   a partial acknowledgment.

   Along with several other suggestions, [Hoe95] suggested that during
   Fast Recovery the TCP data sender respond to a partial
   acknowledgment by inferring that the next in-sequence packet has been
   lost, and retransmitting that packet. This document describes a
   modification to the Fast Recovery algorithm in RFC 5681 that
   incorporates a response to partial acknowledgments received during
   Fast Recovery. We call this modified Fast Recovery algorithm
   NewReno, because it is a slight but significant variation of the
   basic Reno algorithm in RFC 5681. This document does not discuss the
   other suggestions in [Hoe95] and [Hoe96], such as a change to the
   ssthresh parameter during Slow-Start, or the proposal to send a new
   packet for every two duplicate acknowledgments during Fast
   Recovery.
   The version of NewReno in this document also draws on
   other discussions of NewReno in the literature [LM97, Hen98].

   We do not claim that the NewReno version of Fast Recovery described
   here is an optimal modification of Fast Recovery for responding to
   partial acknowledgments, for TCP connections that are unable to use
   SACK. Based on our experiences with the NewReno modification in the
   NS simulator [NS] and with numerous implementations of NewReno, we
   believe that this modification improves the performance of the Fast
   Retransmit and Fast Recovery algorithms in a wide variety of
   scenarios. Previous versions of this RFC [RFC2582, RFC3782] provide
   simulation-based evidence of the possible performance gains.

2. Terminology and Definitions

   This document assumes that the reader is familiar with the terms
   SENDER MAXIMUM SEGMENT SIZE (SMSS), CONGESTION WINDOW (cwnd), and
   FLIGHT SIZE (FlightSize) defined in [RFC5681].

   This document defines an additional sender-side state variable
   called RECOVER:

   RECOVER:
      When in Fast Recovery, this variable records the send sequence
      number that must be acknowledged before the Fast Recovery
      procedure is declared to be over.

3. The Fast Retransmit and Fast Recovery Algorithms in NewReno

3.1. Protocol Overview

   The basic idea of these extensions to the Fast Retransmit and
   Fast Recovery algorithms described in Section 3.2 of [RFC5681]
   is as follows. The TCP sender can infer, from the arrival of
   duplicate acknowledgments, whether multiple losses in the same
   window of data have most likely occurred, and avoid taking a
   retransmit timeout or making multiple congestion window reductions
   due to such an event.

   The NewReno modification applies to the Fast Recovery procedure that
   begins when three duplicate ACKs are received and ends when either a
   retransmission timeout occurs or an ACK arrives that acknowledges all
   of the data up to and including the data that was outstanding when
   the Fast Recovery procedure began.

3.2. Specification

   The procedures specified in Section 3.2 of [RFC5681] are followed
   with the following modifications. Note that this specification
   avoids the use of the key words defined in RFC 2119 [RFC2119] since
   it mainly provides sender-side implementation guidance for
   performance improvement, and does not affect interoperability.

   1) Initialization of TCP protocol control block:
      When the TCP protocol control block is initialized, Recover is
      set to the initial send sequence number.

   2) Three duplicate ACKs:
      When the third duplicate ACK is received, the TCP sender first
      checks the value of Recover to see if the Cumulative
      Acknowledgment field covers more than Recover. If so, the value
      of Recover is incremented to the value of the highest sequence
      number transmitted by the TCP so far. The TCP then enters Fast
      Retransmit (step 2 of Section 3.2 of [RFC5681]). If not, the TCP
      does not enter fast retransmit and does not reset ssthresh.

   3) Response to newly acknowledged data:
      Step 6 of [RFC5681] specifies the response to the next ACK that
      acknowledges previously unacknowledged data. When an ACK
      arrives that acknowledges new data, this ACK could be the
      acknowledgment elicited by the retransmission from step 2, or
      elicited by a later retransmission. There are two cases.

      Full acknowledgments:
      If this ACK acknowledges all of the data up to and including
      Recover, then the ACK acknowledges all the intermediate
      segments sent between the original transmission of the lost
      segment and the receipt of the third duplicate ACK.
      Set cwnd to
      either (1) min (ssthresh, max(FlightSize, SMSS) + SMSS) or
      (2) ssthresh, where ssthresh is the value set when Fast
      Retransmit was entered, and where FlightSize in (1) is the amount
      of data presently outstanding. This is termed "deflating" the
      window. If the second option is selected, the implementation
      is encouraged to take measures to avoid a possible burst of
      data, in case the amount of data outstanding in the network is
      much less than the new congestion window allows. A simple
      mechanism is to limit the number of data packets that can be sent
      in response to a single acknowledgment. Exit the Fast Recovery
      procedure.

      Partial acknowledgments:
      If this ACK does *not* acknowledge all of the data up to and
      including Recover, then this is a partial ACK. In this case,
      retransmit the first unacknowledged segment. Deflate the
      congestion window by the amount of new data acknowledged by the
      cumulative acknowledgment field. If the partial ACK
      acknowledges at least one SMSS of new data, then add back SMSS
      bytes to the congestion window. This artificially
      inflates the congestion window in order to reflect the additional
      segment that has left the network. Send a new segment if
      permitted by the new value of cwnd. This "partial window
      deflation" attempts to ensure that, when Fast Recovery eventually
      ends, approximately ssthresh amount of data will be outstanding
      in the network. Do not exit the Fast Recovery procedure (i.e.,
      if any duplicate ACKs subsequently arrive, execute Step 4 of
      Section 3.2 of [RFC5681]).

      For the first partial ACK that arrives during Fast Recovery, also
      reset the retransmit timer. Timer management is discussed in
      more detail in Section 4.

   4) Retransmit timeouts:
      After a retransmit timeout, record the highest sequence number
      transmitted in the variable Recover and exit the Fast
      Recovery procedure if applicable.
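
   The four steps above can be sketched in code. The following Python
   fragment is a minimal, non-normative illustration, not a reference
   implementation. The class name NewRenoSender, its field and method
   names, and the SMSS value are hypothetical. Sequence numbers are
   plain integers (wraparound is ignored), retransmission and the
   sending of new data are left to the caller, and the ssthresh and
   cwnd values set on entry to Fast Retransmit follow steps 2 and 3 of
   Section 3.2 of [RFC5681]. Option (1) is used for the Full ACK case.

```python
from dataclasses import dataclass

SMSS = 1000  # assumed sender maximum segment size, in bytes


@dataclass
class NewRenoSender:
    """Hypothetical sender state; sequence numbers never wrap here."""
    snd_una: int   # oldest unacknowledged sequence number
    snd_nxt: int   # next new sequence number to be sent
    recover: int   # step 1: starts at the initial send sequence number
    cwnd: int
    ssthresh: int
    dupacks: int = 0
    in_fast_recovery: bool = False

    def flight_size(self) -> int:
        return self.snd_nxt - self.snd_una

    def on_dup_ack(self, ack: int) -> str:
        if self.in_fast_recovery:
            self.cwnd += SMSS  # RFC 5681, Section 3.2, step 4
            return "inflate"
        self.dupacks += 1
        if self.dupacks < 3:
            return "wait"
        # Step 2: does the ACK cover more than Recover?
        if ack - 1 > self.recover:
            self.recover = self.snd_nxt - 1  # highest sequence number sent
            self.ssthresh = max(self.flight_size() // 2, 2 * SMSS)
            self.cwnd = self.ssthresh + 3 * SMSS
            self.in_fast_recovery = True
            return "fast_retransmit"
        return "ignore"  # no Fast Retransmit, ssthresh unchanged

    def on_new_ack(self, ack: int) -> str:
        newly_acked = ack - self.snd_una
        self.snd_una = ack
        self.dupacks = 0
        if not self.in_fast_recovery:
            return "normal"  # ordinary window growth is not modelled
        if ack - 1 >= self.recover:
            # Step 3, full acknowledgment, option (1): deflate and exit.
            self.cwnd = min(self.ssthresh,
                            max(self.flight_size(), SMSS) + SMSS)
            self.in_fast_recovery = False
            return "exit_recovery"
        # Step 3, partial acknowledgment: partial window deflation.
        self.cwnd -= newly_acked
        if newly_acked >= SMSS:
            self.cwnd += SMSS
        return "retransmit_first_unacked"

    def on_timeout(self) -> None:
        # Step 4: remember the highest sequence number and leave recovery.
        self.recover = self.snd_nxt - 1
        self.in_fast_recovery = False
        self.dupacks = 0
```

   In this sketch the returned strings only name the action the caller
   would take; a real sender would also manage the retransmit timer as
   described for the first partial ACK.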

   Step 2 above specifies a check that the Cumulative Acknowledgment
   field covers more than Recover. Because the acknowledgment field
   contains the sequence number that the sender next expects to receive,
   the acknowledgment "ack_number" covers more than Recover when:

      ack_number - 1 > Recover;

   i.e., at least one byte more of data is acknowledged beyond the
   highest byte that was outstanding when Fast Retransmit was last
   entered.

   Note that in Step 3 above, the congestion window is deflated after
   a partial acknowledgment is received. The congestion window was
   likely to have been inflated considerably when the partial
   acknowledgment was received. In addition, depending on the original
   pattern of packet losses, the partial acknowledgment might
   acknowledge nearly a window of data. In this case, if the congestion
   window was not deflated, the data sender might be able to send nearly
   a window of data back-to-back.

   This document does not specify the sender's response to duplicate
   ACKs when the Fast Retransmit/Fast Recovery algorithm is not
   invoked. This is addressed in other documents, such as those
   describing the Limited Transmit procedure [RFC3042]. This document
   also does not address issues of adjusting the duplicate
   acknowledgment threshold, but assumes the threshold specified in
   the IETF standards; the current standard is [RFC5681], which
   specifies a threshold of three duplicate acknowledgments.

   As a final note, we would observe that in the absence of the SACK
   option, the data sender is working from limited information. When
   the issue of recovery from multiple dropped packets from a single
   window of data is of particular importance, the best alternative
   would be to use the SACK option.

4. Handling Duplicate Acknowledgments After A Timeout

   After each retransmit timeout, the highest sequence number
   transmitted so far is recorded in the variable "recover".
   If, after a retransmit timeout, the TCP data sender retransmits three
   consecutive packets that have already been received by the data
   receiver, then the TCP data sender will receive three duplicate
   acknowledgments that do not cover more than "recover". In this
   case, the duplicate acknowledgments are not an indication of a new
   instance of congestion. They are simply an indication that the
   sender has unnecessarily retransmitted at least three packets.

   However, when a retransmitted packet is itself dropped, the sender
   can also receive three duplicate acknowledgments that do not cover
   more than "recover". In this case, the sender would have been
   better off if it had initiated Fast Retransmit. For a TCP sender
   that implements the algorithm specified in Section 3.2 of this
   document, the sender does not infer a packet drop from duplicate
   acknowledgments in this scenario. As always, the retransmit timer
   is the backup mechanism for inferring packet loss in this case.

   There are several heuristics, based on timestamps or on the amount of
   advancement of the cumulative acknowledgment field, that allow the
   sender to distinguish, in some cases, between three duplicate
   acknowledgments following a retransmitted packet that was dropped,
   and three duplicate acknowledgments from the unnecessary
   retransmission of three packets [Gur03, GF04]. The TCP sender may
   use such a heuristic to decide to invoke a Fast Retransmit in some
   cases, even when the three duplicate acknowledgments do not cover
   more than "recover".
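
   As a rough illustration, the two heuristics, whose exact conditions
   are given in Sections 4.1 and 4.2, might be coded as the following
   Python checks. This is a sketch under stated assumptions, not a
   definitive implementation: the function and parameter names and the
   SMSS value are hypothetical, and sequence-number and timestamp
   wraparound are ignored. Each function would be consulted only when
   the third duplicate ACK does not cover more than "recover".

```python
SMSS = 1460  # assumed sender maximum segment size, in bytes


def ack_heuristic(cwnd: int, highest_ack: int, prev_highest_ack: int) -> bool:
    """Section 4.1 check.

    Returns True when the duplicate ACKs are taken to indicate a lost
    segment (enter Fast Retransmit), False when they are taken to result
    from unnecessary retransmissions.
    """
    ack_advance = highest_ack - prev_highest_ack
    return cwnd > SMSS and ack_advance <= 4 * SMSS


def timestamp_heuristic(echoed_timestamp: int, stored_timestamp: int) -> bool:
    """Section 4.2 check.

    echoed_timestamp is the timestamp echoed in the last non-duplicate
    acknowledgment; stored_timestamp is the timestamp the sender stored
    for the last acknowledged segment. Returns True when the duplicate
    ACKs are taken to indicate a lost segment.
    """
    return echoed_timestamp == stored_timestamp
```

   Keeping each heuristic as a pure predicate makes it easy to consult
   from the third-duplicate-ACK path without touching any other sender
   state.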

   For example, when three duplicate acknowledgments are caused by the
   unnecessary retransmission of three packets, this is likely to be
   accompanied by the cumulative acknowledgment field advancing by at
   least four segments. Similarly, a heuristic based on timestamps uses
   the fact that when there is a hole in the sequence space, the
   timestamp echoed in the duplicate acknowledgment is the timestamp of
   the most recent data packet that advanced the cumulative
   acknowledgment field [RFC1323]. If timestamps are used, and the
   sender stores the timestamp of the last acknowledged segment, then
   the timestamp echoed by duplicate acknowledgments can be used to
   distinguish between a retransmitted packet that was dropped and
   three duplicate acknowledgments from the unnecessary
   retransmission of three packets.

4.1. ACK Heuristic

   If the ACK-based heuristic is used, then following the advancement of
   the cumulative acknowledgment field, the sender stores the value of
   the previous cumulative acknowledgment as prev_highest_ack, and
   stores the latest cumulative ACK as highest_ack. In addition, the
   following check is performed if, in Step 2 of Section 3.2, the
   Cumulative Acknowledgment field does not cover more than "recover".

   1*) If the Cumulative Acknowledgment field didn't cover more than
       "recover", check to see if the congestion window is greater
       than SMSS bytes and the difference between highest_ack and
       prev_highest_ack is at most 4*SMSS bytes. If true, duplicate
       ACKs indicate a lost segment (enter Fast Retransmit).
       Otherwise, duplicate ACKs likely result from unnecessary
       retransmissions (do not enter Fast Retransmit).

   The congestion window check serves to protect against fast retransmit
   immediately after a retransmit timeout.

   If several ACKs are lost, the sender can see a jump in the cumulative
   ACK of more than three segments, and the heuristic can fail.
   [RFC5681] recommends that a receiver should
   send duplicate ACKs for every out-of-order data packet, such as a
   data packet received during Fast Recovery. The ACK heuristic is more
   likely to fail if the receiver does not follow this advice, because
   then a smaller number of ACK losses are needed to produce a
   sufficient jump in the cumulative ACK.

4.2. Timestamp Heuristic

   If this heuristic is used, the sender stores the timestamp of the
   last acknowledged segment. In addition, the last sentence of step
   2 in Section 3.2 is replaced as follows:

   1**) If the Cumulative Acknowledgment field didn't cover more than
        "recover", check to see if the echoed timestamp in the last
        non-duplicate acknowledgment equals the
        stored timestamp. If true, duplicate ACKs indicate a lost
        segment (enter Fast Retransmit). Otherwise, duplicate
        ACKs likely result from unnecessary retransmissions (do not
        enter Fast Retransmit).

   The timestamp heuristic works correctly both when the receiver
   echoes timestamps as specified by [RFC1323] and when it follows the
   proposed revisions of that specification. However, if the receiver
   arbitrarily echoes timestamps, the heuristic can fail. The heuristic
   can also fail if a timeout was spurious and returning ACKs are not
   from retransmitted segments. This can be prevented by detection
   algorithms such as [RFC3522].

5. Implementation Issues for the Data Receiver

   [RFC5681] specifies that "Out-of-order data segments SHOULD be
   acknowledged immediately, in order to accelerate loss recovery."
   Neal Cardwell has noted that some data receivers do not send an
   immediate acknowledgment when they send a partial acknowledgment,
   but instead wait first for their delayed acknowledgment timer to
   expire [C98]. As [C98] notes, this severely limits the potential
   benefit of NewReno by delaying the receipt of the partial
   acknowledgment at the data sender.
   Echoing [RFC5681], our
   recommendation is that the data receiver send an immediate
   acknowledgment for an out-of-order segment, even when that
   out-of-order segment fills a hole in the buffer.

6. Implementation Issues for the Data Sender

   In Section 3.2, step 3 above, it is noted that implementations should
   take measures to avoid a possible burst of data when leaving Fast
   Recovery, in case the amount of new data that the sender is eligible
   to send due to the new value of the congestion window is large. This
   can arise during NewReno when ACKs are lost or treated as pure window
   updates, thereby causing the sender to underestimate the number of
   new segments that can be sent during the recovery procedure.
   Specifically, bursts can occur when the FlightSize is much less than
   the new congestion window when exiting from Fast Recovery. One
   simple mechanism to avoid a burst of data when leaving Fast Recovery
   is to limit the number of data packets that can be sent in response
   to a single acknowledgment. (This is known as "maxburst_" in the ns
   simulator.) Other possible mechanisms for avoiding bursts include
   rate-based pacing, or setting the slow-start threshold to the
   resultant congestion window and then resetting the congestion window
   to FlightSize. A recommendation on the general mechanism to avoid
   excessively bursty sending patterns is outside the scope of this
   document.

   An implementation may want to use a separate flag to record whether
   or not it is presently in the Fast Recovery procedure. The use of
   the value of the duplicate acknowledgment counter for this purpose is
   not reliable because it can be reset upon window updates and
   out-of-order acknowledgments.
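
   The simple burst limit described at the start of this section can be
   sketched as follows. This Python fragment is only an illustration
   under stated assumptions: the function name and the max_burst
   parameter are hypothetical (the latter is modelled loosely on the
   "maxburst_" idea mentioned above), only full-sized segments are
   counted, and the receiver's advertised window is ignored.

```python
def segments_allowed(cwnd: int, flight_size: int, smss: int,
                     max_burst: int) -> int:
    """Number of new full-sized segments to send for one incoming ACK.

    The congestion window alone may permit many segments when
    flight_size is far below cwnd (for example just after leaving
    Fast Recovery); max_burst caps what is sent per acknowledgment.
    """
    window_room = max(cwnd - flight_size, 0) // smss
    return min(window_room, max_burst)
```

   For example, with cwnd of 20000 bytes, FlightSize of 2000 bytes, and
   SMSS of 1000 bytes, the window alone would allow 18 segments, but a
   max_burst of 4 limits the response to a single ACK to 4 segments.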

   When updating the Cumulative Acknowledgment field outside of
   Fast Recovery, the "recover" state variable may also need to be
   updated in order to continue to permit possible entry into Fast
   Recovery (Section 3.2, step 2). This issue arises when an update
   of the Cumulative Acknowledgment field results in a sequence
   wraparound that affects the ordering between the Cumulative
   Acknowledgment field and the "recover" state variable. Entry
   into Fast Recovery is only possible when the Cumulative
   Acknowledgment field covers more than the "recover" state variable.

   It is important for the sender to respond correctly to duplicate ACKs
   received when the sender is no longer in Fast Recovery (e.g., because
   of a Retransmit Timeout). The Limited Transmit procedure [RFC3042]
   describes possible responses to the first and second duplicate
   acknowledgments. When three or more duplicate acknowledgments are
   received, the Cumulative Acknowledgment field doesn't cover more
   than "recover", and a new Fast Recovery is not invoked, it is
   important that the sender not execute the Fast Recovery steps (4) and
   (5) of Section 3.2 of [RFC5681]. Otherwise, the sender could end up
   in a chain of spurious timeouts. We mention this only because
   several NewReno implementations had this bug, including the
   implementation in the NS simulator.

   It has been observed that some TCP implementations enter a slow start
   or congestion avoidance window updating algorithm immediately after
   the cwnd is set by the equation found in (Section 3.2, step 3), even
   without a new external event generating the cwnd change. Note that
   after cwnd is set based on the procedure for exiting Fast Recovery
   (Section 3.2, step 3), cwnd should not be updated until a further
   event occurs (e.g., arrival of an ack, or timeout) after this
   adjustment.

7. Security Considerations

   [RFC5681] discusses general security considerations concerning TCP
   congestion control. This document describes a specific algorithm
   that conforms with the congestion control requirements of [RFC5681],
   and so those considerations apply to this algorithm, too. There are
   no known additional security concerns for this specific algorithm.

8. IANA Considerations

   This document has no actions for IANA.

9. Conclusions

   This document specifies the NewReno Fast Retransmit and Fast Recovery
   algorithms for TCP. This NewReno modification to TCP can even be
   important for TCP implementations that support the SACK option,
   because the SACK option can only be used for TCP connections when
   both TCP end-nodes support the SACK option. NewReno performs better
   than Reno (RFC5681) in a number of scenarios discussed in
   previous versions of this RFC ([RFC2582], [RFC3782]).

   A number of options to the basic algorithm presented in Section 3 are
   also referenced in Appendix A to this document. These include the
   handling of the retransmission timer, the response to partial
   acknowledgments, and whether or not the sender must maintain a state
   variable called Recover. Our belief is that the differences
   between these variants of NewReno are small compared to the
   differences between Reno and NewReno. That is, the important thing
   is to implement NewReno instead of Reno, for a TCP connection
   without SACK; it is less important exactly which of the variants of
   NewReno is implemented.

10. Acknowledgments

   Many thanks to Anil Agarwal, Mark Allman, Armando Caro, Jeffrey Hsu,
   Vern Paxson, Kacheong Poon, Keyur Shah, and Bernie Volz for detailed
   feedback on this document or on its precursor, RFC 2582. Jeffrey
   Hsu provided clarifications on the handling of the recover variable
   that were applied to RFC 3782 as errata, and now are in Section 6
   of this document.
   Yoshifumi Nishida contributed a modification
   to the fast recovery algorithm to account for the case in which
   flightsize is 0 when the TCP sender leaves fast recovery, and the
   TCP receiver uses delayed acknowledgments. Alexander Zimmermann
   provided several suggestions to improve the clarity of the document.

11. References

11.1. Normative References

   [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
             Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC5681] Allman, M., Paxson, V. and E. Blanton, "TCP Congestion
             Control", RFC 5681, September 2009.

11.2. Informative References

   [C98]     Cardwell, N., "delayed ACKs for retransmitted packets:
             ouch!", November 1998, Email to the tcpimpl mailing list,
             Message-ID
             "Pine.LNX.4.02A.9811021421340.26785-100000@sake.cs.
             washington.edu",
             archived at "http://tcp-impl.lerc.nasa.gov/tcp-impl".

   [FF96]    Fall, K. and S. Floyd, "Simulation-based Comparisons of
             Tahoe, Reno and SACK TCP", Computer Communication Review,
             July 1996.
             URL "ftp://ftp.ee.lbl.gov/papers/sacks.ps.Z".

   [F94]     Floyd, S., "TCP and Successive Fast Retransmits", Technical
             report, October 1994. URL
             "ftp://ftp.ee.lbl.gov/papers/fastretrans.ps".

   [GF04]    Gurtov, A. and S. Floyd, "Resolving Acknowledgment
             Ambiguity in non-SACK TCP", Next Generation Teletraffic and
             Wired/Wireless Advanced Networking (NEW2AN'04), February
             2004. URL "http://www.cs.helsinki.fi/u/gurtov/papers/
             heuristics.html".

   [Gur03]   Gurtov, A., "[Tsvwg] resolving the problem of unnecessary
             fast retransmits in go-back-N", email to the tsvwg mailing
             list, message ID <3F25B467.9020609@cs.helsinki.fi>,
             July 28, 2003. URL "http://www1.ietf.org/mail-archive/
             working-groups/tsvwg/current/msg04334.html".

   [Hen98]   Henderson, T., "Re: NewReno and the 2001 Revision",
             September 1998.
             Email to the tcpimpl mailing list, Message ID
             "Pine.BSI.3.95.980923224136.26134A-100000@raptor.CS.
             Berkeley.EDU",
             archived at "http://tcp-impl.lerc.nasa.gov/tcp-impl".

   [Hoe95]   Hoe, J., "Startup Dynamics of TCP's Congestion Control and
             Avoidance Schemes", Master's Thesis, MIT, 1995.

   [Hoe96]   Hoe, J., "Improving the Start-up Behavior of a Congestion
             Control Scheme for TCP", ACM SIGCOMM, August 1996. URL
             "http://www.acm.org/sigcomm/sigcomm96/program.html".

   [LM97]    Lin, D. and R. Morris, "Dynamics of Random Early
             Detection", SIGCOMM 97, September 1997. URL
             "http://www.acm.org/sigcomm/sigcomm97/program.html".

   [NS]      The Network Simulator (NS).
             URL "http://www.isi.edu/nsnam/ns/".

   [RFC1323] Jacobson, V., Braden, R. and D. Borman, "TCP Extensions for
             High Performance", RFC 1323, May 1992.

   [RFC2582] Floyd, S. and T. Henderson, "The NewReno Modification to
             TCP's Fast Recovery Algorithm", RFC 2582, April 1999.

   [RFC2883] Floyd, S., J. Mahdavi, M. Mathis, and M. Podolsky, "An
             Extension to the Selective Acknowledgement (SACK) Option
             for TCP", RFC 2883, July 2000.

   [RFC3042] Allman, M., Balakrishnan, H. and S. Floyd, "Enhancing TCP's
             Loss Recovery Using Limited Transmit", RFC 3042,
             January 2001.

   [RFC3522] Ludwig, R. and M. Meyer, "The Eifel Detection Algorithm for
             TCP", RFC 3522, April 2003.

   [RFC3782] Floyd, S., T. Henderson, and A. Gurtov, "The NewReno
             Modification to TCP's Fast Recovery Algorithm", RFC 3782,
             April 2004.

Appendix A. Additional Information

   Previous versions of this RFC ([RFC2582], [RFC3782]) contained
   additional informative material on the following subjects, and
   may be consulted by readers who may want more information about
   possible variants to the algorithm and who may want references
   to specific [NS] simulations that provide NewReno test cases.

   Section 4 of [RFC3782] discusses some alternative behaviors for
   resetting the retransmit timer after a partial acknowledgment.

   Section 5 of [RFC3782] discusses some alternative behaviors for
   performing retransmission after a partial acknowledgment.

   Section 6 of [RFC3782] provides more information about the
   motivation for the sender's state variable Recover.

   Section 9 of [RFC3782] introduces some NS simulation test
   suites for NewReno. In addition, references to simulation
   results can be found throughout [RFC3782].

   Section 10 of [RFC3782] provides a comparison of Reno and
   NewReno TCP.

   Section 11 of [RFC3782] listed changes relative to [RFC2582].

Appendix B. Changes Relative to RFC 3782

   In [RFC3782], the cwnd after Full ACK reception will be set to
   (1) min (ssthresh, FlightSize + SMSS) or (2) ssthresh. However,
   the first option carries a risk of performance degradation: if
   FlightSize is zero, the result will be 1 SMSS. This means TCP can
   transmit only 1 segment at this moment, which can delay the ACK at
   the receiver because of the delayed ACK algorithm.

   The FlightSize on Full ACK reception can be zero in some situations.
   A typical example is where the sending window size during fast
   recovery is small. In this case, the retransmitted packet and new
   data packets can be transmitted within a short interval. If all
   these packets successfully arrive, the receiver may generate a Full
   ACK that acknowledges all outstanding data. Even if the window size
   is not small, loss of ACK packets or a receive buffer shortage during
   fast recovery can also increase the possibility of falling into this
   situation.
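
   The effect can be checked numerically by comparing option (1) as
   given in [RFC3782] with option (1) as given in Section 3.2, step 3
   of this document. The Python functions below are a small
   illustrative sketch; their names and the example values are
   hypothetical.

```python
def cwnd_after_full_ack_rfc3782(ssthresh: int, flight_size: int,
                                smss: int) -> int:
    """Option (1) from RFC 3782: min(ssthresh, FlightSize + SMSS)."""
    return min(ssthresh, flight_size + smss)


def cwnd_after_full_ack_this_document(ssthresh: int, flight_size: int,
                                      smss: int) -> int:
    """Option (1) from Section 3.2 of this document:
    min(ssthresh, max(FlightSize, SMSS) + SMSS)."""
    return min(ssthresh, max(flight_size, smss) + smss)
```

   With ssthresh of 8000 bytes, SMSS of 1000 bytes, and FlightSize of
   zero, the first function yields 1000 bytes (one segment) and the
   second yields 2000 bytes (two segments). When FlightSize is at
   least one SMSS, the two formulas give the same result.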

   The proposed fix in this document sets cwnd to at least 2*SMSS if
   the implementation uses option 1 in the Full ACK case (Section 3.2,
   step 3, option 1). This ensures that the sender TCP transmits at
   least two segments on Full ACK reception.

   In addition, errata for RFC 3782 (editorial clarification to
   Section 8 of RFC 2582, which is now Section 6 of this document)
   have been applied.

   The specification text (Section 3.2 herein) was rewritten to more
   closely track Section 3.2 of [RFC5681].

   Sections 4, 5, and 9-11 of [RFC3782] were removed, and Appendix A
   of this document was added instead to back-reference this
   informative material. A few references that have no citation in
   the main body of the draft have been removed.

Appendix C. Document Revision History

   To be removed upon publication

   +----------+--------------------------------------------------+
   | Revision | Comments                                         |
   +----------+--------------------------------------------------+
   | draft-00 | RFC3782 errata applied, and changes applied from |
   |          | draft-nishida-newreno-modification-02            |
   +----------+--------------------------------------------------+
   | draft-01 | Non-normative sections moved to appendices,      |
   |          | editorial clarifications applied as suggested    |
   |          | by Alexander Zimmermann.                         |
   +----------+--------------------------------------------------+
   | draft-02 | Better align specification text with RFC5681.    |
   |          | Replace informative appendices by a new appendix |
   |          | that just provides back-references to earlier    |
   |          | NewReno RFCs.                                    |
   +----------+--------------------------------------------------+
   | draft-03 | Document refresh and fix id-nits                 |
   +----------+--------------------------------------------------+
   | draft-04 | Address editorial comments received from secdir  |
   |          | review (provided by Tom Yu).                     |
   +----------+--------------------------------------------------+
   | draft-05 | Address IESG review comments from David          |
   |          | Harrington, and Gen-ART review comments from     |
   |          | Ben Campbell.                                    |
   +----------+--------------------------------------------------+

Authors' Addresses

   Tom Henderson
   The Boeing Company

   EMail: thomas.r.henderson@boeing.com

   Sally Floyd
   International Computer Science Institute

   Phone: +1 (510) 666-2989
   EMail: floyd@acm.org
   URL: http://www.icir.org/floyd/

   Andrei Gurtov
   University of Oulu
   Centre for Wireless Communications CWC
   P.O. Box 4500
   FI-90014 University of Oulu
   Finland

   EMail: gurtov@ee.oulu.fi

   Yoshifumi Nishida
   WIDE Project
   Endo 5322
   Fujisawa, Kanagawa 252-8520
   Japan

   EMail: nishida@wide.ad.jp