idnits 2.17.1 draft-ietf-tcpm-early-rexmt-00.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** It looks like you're using RFC 3978 boilerplate. You should update this to the boilerplate described in the IETF Trust License Policy document (see https://trustee.ietf.org/license-info), which is required now. -- Found old boilerplate from RFC 3978, Section 5.1 on line 21. -- Found old boilerplate from RFC 3978, Section 5.5, updated by RFC 4748 on line 599. -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 575. -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 582. -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 588. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- == No 'Intended status' indicated for this document; assuming Proposed Standard Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack an IANA Considerations section. (See Section 2.2 of https://www.ietf.org/id-info/checklist for how to handle the case when there are no actions for IANA.) ** There are 2 instances of too long lines in the document, the longest one being 1 character in excess of 72. ** The abstract seems to contain references ([RFC2119]), which it shouldn't. Please replace those with straight textual mentions of the documents in question. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust Copyright Line does not match the current year -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. 
If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (August 2008) is 5705 days in the past. Is this intentional? Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Missing Reference: 'RFC2119' is mentioned on line 57, but not defined == Missing Reference: 'BPS99' is mentioned on line 314, but not defined == Unused Reference: 'AA02' is defined on line 412, but no explicit reference was found in the text == Unused Reference: 'LK98' is defined on line 446, but no explicit reference was found in the text == Unused Reference: 'Mor97' is defined on line 450, but no explicit reference was found in the text == Unused Reference: 'RFC3150' is defined on line 470, but no explicit reference was found in the text ** Obsolete normative reference: RFC 793 (Obsoleted by RFC 9293) ** Obsolete normative reference: RFC 2581 (Obsoleted by RFC 5681) ** Obsolete normative reference: RFC 2988 (Obsoleted by RFC 6298) ** Downref: Normative reference to an Experimental RFC: RFC 3522 ** Obsolete normative reference: RFC 4960 (Obsoleted by RFC 9260) -- Obsolete informational reference (is this intentional?): RFC 2582 (Obsoleted by RFC 3782) -- Obsolete informational reference (is this intentional?): RFC 3517 (Obsoleted by RFC 6675) Summary: 9 errors (**), 0 flaws (~~), 8 warnings (==), 9 comments (--). Run idnits with the --verbose option for more detailed information about the items above. 
-------------------------------------------------------------------------------- 1 Internet Engineering Task Force Mark Allman 2 INTERNET DRAFT ICSI 3 File: draft-ietf-tcpm-early-rexmt-00.txt Konstantin Avrachenkov 4 INRIA 5 Urtzi Ayesta 6 LAAS-CNRS 7 Josh Blanton 8 Ohio University 9 Per Hurtig 10 Karlstad University 11 August 2008 12 Expires: February 2009 14 Early Retransmit for TCP and SCTP 16 Status of this Memo 18 By submitting this Internet-Draft, each author represents that any 19 applicable patent or other IPR claims of which he or she is aware 20 have been or will be disclosed, and any of which he or she becomes 21 aware will be disclosed, in accordance with Section 6 of BCP 79. 23 Internet-Drafts are working documents of the Internet Engineering 24 Task Force (IETF), its areas, and its working groups. Note that 25 other groups may also distribute working documents as 26 Internet-Drafts. 28 Internet-Drafts are draft documents valid for a maximum of six 29 months and may be updated, replaced, or obsoleted by other documents 30 at any time. It is inappropriate to use Internet-Drafts as 31 reference material or to cite them other than as "work in progress." 33 The list of current Internet-Drafts can be accessed at 34 http://www.ietf.org/ietf/1id-abstracts.txt. 36 The list of Internet-Draft Shadow Directories can be accessed at 37 http://www.ietf.org/shadow.html. 39 Copyright Notice 41 Copyright (C) The IETF Trust (2008). 43 Abstract 45 This document proposes a new mechanism for TCP and SCTP that can be 46 used to recover lost segments when a connection's congestion window 47 is small. The "Early Retransmit" mechanism allows the transport to 48 reduce, in certain special circumstances, the number of duplicate 49 acknowledgments required to trigger a fast retransmission. This 50 allows the transport to use fast retransmit to recover packet losses 51 that would otherwise require a lengthy retransmission timeout. 
53 Terminology 54 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 55 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 56 document are to be interpreted as described in RFC 2119 [RFC2119]. 58 1 Introduction 60 Many researchers have studied problems with TCP [RFC793,RFC2581] 61 when the congestion window is small and have outlined possible 62 mechanisms to mitigate these problems 63 [Mor97,BPS+98,Bal98,LK98,RFC3150,AA02]. SCTP's [RFC4960] loss 64 recovery and congestion control mechanisms are based on TCP and 65 therefore the same problems impact the performance of SCTP 66 connections. When the transport detects a missing segment, the 67 connection enters a loss recovery phase. There are several variants 68 of the loss recovery phase depending on the TCP implementation. TCP 69 can use slow start based recovery or Fast Recovery [RFC2581], 70 NewReno [RFC2582], and loss recovery based on selective 71 acknowledgments (SACKs) [RFC2018,FF96,RFC3517]. SCTP's loss 72 recovery is not as varied due to the built-in selective 73 acknowledgments. 75 All the above variants have two methods for invoking loss recovery. 76 First, if an acknowledgment (ACK) for a given segment is not 77 received in a certain amount of time, a retransmission timer fires 78 and the segment is resent [RFC2988,RFC4960]. Second, the "Fast 79 Retransmit" algorithm resends a segment when three duplicate ACKs 80 arrive at the sender [Jac88,RFC2581]. Duplicate ACKs are triggered 81 by out-of-order arrivals at the receiver. However, because 82 duplicate ACKs from the receiver are triggered by both packet loss 83 and packet reordering in the network path, the sender waits for 84 three duplicate ACKs in an attempt to disambiguate packet loss from 85 packet reordering. With small congestion windows, it may not 86 be possible to generate the required number of duplicate ACKs to 87 trigger Fast Retransmit when a loss does happen.
89 Small windows can occur in a number of situations, such as: 91 (1) The connection is constrained by end-to-end congestion control 92 when the connection's share of the path is small, the path has a 93 small bandwidth-delay product, or the transport is ascertaining 94 the available bandwidth in the first few round-trip times of 95 slow start. 97 (2) The connection is "application limited" and has only a limited 98 amount of data to send. This can happen any time the 99 application does not produce enough data to fill the congestion 100 window. In particular, every connection becomes 101 application limited as it ends. 103 (3) The connection is limited by the receiver's advertised window. 105 The transport's retransmission timeout (RTO) is based on measured 106 round-trip times (RTT) between the sender and receiver, as specified 107 in [RFC2988] (for TCP) and [RFC4960] (for SCTP). To prevent 108 spurious retransmissions of segments that are only delayed and not 109 lost, the minimum RTO is conservatively chosen to be 1 second. 110 Therefore, it behooves TCP senders to detect and recover from as 111 many losses as possible without incurring a lengthy timeout during 112 which the connection remains idle. However, if not enough duplicate 113 ACKs arrive from the receiver, the Fast Retransmit algorithm is 114 never triggered. This situation occurs when the congestion window 115 is small, when a large number of segments in a window are lost, or at 116 the end of a transfer as data drains from the network. For 117 instance, consider a congestion window (cwnd) of three segments. If 118 one segment is dropped by the network, then at most two duplicate 119 ACKs will arrive at the sender, assuming no ACK loss. Since three 120 duplicate ACKs are required to trigger Fast Retransmit, a timeout 121 will be required to resend the dropped packet.
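The three-segment example can be checked with a small sketch. It assumes, beyond what the text states, that exactly one segment is lost, that the receiver sends one ACK per arriving segment (no delayed ACKs), and that no new data is sent in response to duplicate ACKs. The function names are illustrative only and do not come from any implementation.

```python
DUP_ACK_THRESHOLD = 3  # standard Fast Retransmit threshold


def dup_acks_after_single_loss(flight_size, lost_index):
    """Duplicate ACKs seen by the sender when one segment of a flight is lost.

    Assumptions (ours): lost_index is 0-based, every segment sent after the
    lost one arrives, and each such arrival triggers one duplicate ACK.
    """
    if not 0 <= lost_index < flight_size:
        raise ValueError("lost_index must identify a segment in the flight")
    return flight_size - lost_index - 1


def fast_retransmit_fires(flight_size, lost_index):
    """True if the loss can be repaired without waiting for the RTO."""
    return dup_acks_after_single_loss(flight_size, lost_index) >= DUP_ACK_THRESHOLD


# cwnd of three segments with the first one dropped: only two duplicate
# ACKs can arrive, so the standard threshold is never reached.
print(dup_acks_after_single_loss(3, 0), fast_retransmit_fires(3, 0))
```

Under these assumptions a flight needs at least four segments, with the loss at the front, before the standard threshold can be met.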
123 [BPS+98] shows that roughly 56% of retransmissions sent by a busy 124 web server are sent after the RTO timer expires, while only 44% are 125 handled by Fast Retransmit. In addition, only 4% of the RTO 126 timer-based retransmissions could have been avoided with SACK, which 127 has to continue to disambiguate reordering from genuine loss. 128 Furthermore, [All00] shows that for one particular web server the 129 median transfer size is less than four segments, indicating that 130 more than half of the connections will be forced to rely on the RTO 131 timer to recover from any losses that occur. Thus, loss recovery 132 that does not rely on the conservative RTO is likely to be 133 beneficial for short TCP transfers. 135 The Limited Transmit mechanism introduced in [RFC3042] allows a TCP 136 sender to transmit previously unsent data upon the reception of each 137 of the two duplicate ACKs that precede a Fast Retransmit. SCTP 138 [RFC4960] uses SACK information to calculate the number of 139 outstanding segments in the network. Hence, when the first two 140 duplicate ACKs arrive at the sender they will indicate that data has 141 left the network and allow the sender to transmit new data (if 142 available) similar to TCP's Limited Transmit algorithm. In the 143 remainder of this document we use "Limited Transmit" to include both 144 TCP and SCTP mechanisms for sending in response to the first two 145 duplicate ACKs. By sending these two new segments the sender is 146 attempting to induce additional duplicate ACKs (if appropriate) so 147 that Fast Retransmit will be triggered before the retransmission 148 timeout expires. The "Early Retransmit" mechanism outlined in this 149 document covers the case when previously unsent data is not 150 available for transmission or cannot be transmitted due to an 151 advertised window limitation. 
153 2 Early Retransmit Algorithm 155 The Early Retransmit algorithm calls for lowering the threshold for 156 triggering Fast Retransmit when the amount of outstanding data is 157 small and when no previously unsent data can be transmitted (so 158 that Limited Transmit cannot be used). Duplicate ACKs are triggered 159 by each arriving out-of-order segment. Therefore, Fast Retransmit 160 will not be invoked when there are fewer than four outstanding 161 segments (assuming only one segment loss in the window). However, 162 TCP and SCTP are not required to track the number of outstanding 163 segments, but rather the number of outstanding bytes or messages. 164 Therefore, applying the intuitive notion of a transport with fewer 165 than four segments outstanding is more complicated than it first 166 appears. In section 2.1 we describe a "byte-based" variant of Early 167 Retransmit that attempts to roughly map the number of outstanding 168 bytes to a number of outstanding packets that is then used when 169 deciding whether to trigger Early Retransmit. In section 2.2 we 170 describe a "packet-based" variant that represents a more precise 171 algorithm for triggering Early Retransmit. The precision comes at 172 the cost of requiring additional state to be kept by the TCP sender. 173 In both cases we describe SACK-based and non-SACK-based versions of 174 the scheme (of course, the non-SACK version will not apply to SCTP). 176 2.1 Byte-based Early Retransmit 178 A TCP or SCTP sender MAY use byte-based Early Retransmit. 180 A sender employing byte-based Early Retransmit MUST use the 181 following two conditions to determine when an Early Retransmit is 182 sent: 184 (2.a) The amount of outstanding data (ownd), i.e., data sent but not yet 185 acknowledged, is less than 4*SMSS bytes. 187 (2.b) There is either no unsent data ready for transmission at the 188 sender or the advertised window does not permit new segments 189 to be transmitted.
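Conditions (2.a) and (2.b) can be read as a single predicate. The following is a minimal sketch rather than a normative definition. The function and parameter names are hypothetical, and whether the advertised window admits a new segment is taken as a boolean input instead of being computed.

```python
def byte_based_er_eligible(ownd, smss, unsent_bytes, rwnd_allows_new_segment):
    """Check conditions (2.a) and (2.b) of byte-based Early Retransmit.

    ownd -- outstanding data in bytes (sent but not yet acknowledged)
    smss -- sender maximum segment size in bytes
    unsent_bytes -- data queued at the sender but not yet transmitted
    rwnd_allows_new_segment -- whether the advertised window permits a new
        segment (an input in this sketch; deriving it is out of scope)
    """
    cond_2a = ownd < 4 * smss
    cond_2b = unsent_bytes == 0 or not rwnd_allows_new_segment
    return cond_2a and cond_2b


# Small window, nothing left to send: Early Retransmit may be considered.
print(byte_based_er_eligible(3000, 1460, 0, True))
# Unsent data and an open window: Limited Transmit applies instead.
print(byte_based_er_eligible(3000, 1460, 1000, True))
```

Keeping the window check as an input keeps the sketch independent of how a particular stack tracks the receiver's advertised window.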
191 When the above two conditions hold and the connection does not 192 support SACK the duplicate ACK threshold used to trigger a 193 retransmission MUST be reduced to: 195 ER_thresh = ceiling (ownd/SMSS) - 1 (1) 197 duplicate ACKs, where ownd is in terms of bytes. 199 When conditions (2.a) and (2.b) hold and the connection does support 200 SACK, Early Retransmit MUST be used only when "ownd - SMSS" bytes 201 have been SACKed. 203 When conditions (2.a) and (2.b) do not hold, the transport MUST NOT 204 use Early Retransmit, but rather prefer the standard mechanisms, 205 including Limited Transmit. 207 As noted above, the drawback of this byte-based variant is precision 208 [HB08]. We illustrate this with two examples: 210 + Consider a non-SACK TCP sender that uses an SMSS of 1460 bytes 211 and transmits three segments each with 400 bytes of payload. 212 This is a case where Early Retransmit could aid loss recovery if 213 one segment is lost. However, in this case ER_thresh will 214 become zero, per equation (1), because the number of outstanding 215 bytes is a poor estimate of the number of outstanding packets. 217 A similar problem occurs for senders that employ SACK as the 218 expression "ownd - SMSS" will become negative. 220 + Next, consider a non-SACK TCP sender that uses an SMSS of 1460 221 bytes and transmits 10 segments each with 400 bytes of payload. 222 In this case ER_thresh will be two, per equation (1). Thus, 223 even though there are enough segments outstanding to trigger 224 Fast Retransmit with the standard duplicate ACK threshold Early 225 Retransmit will be triggered. This could cause or exacerbate 226 performance problems caused by packet reordering in the network. 228 2.2 Packet-based Early Retransmit 230 A TCP or SCTP sender MAY use packet-based Early Retransmit. 
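For comparison with the packet-based variant, the arithmetic behind the two byte-based examples of section 2.1 can be reproduced with a short sketch. The helper names are ours and not part of the specification. The code only evaluates equation (1) and the "ownd - SMSS" expression; it does not model a sender.

```python
def er_thresh_bytes(ownd, smss):
    """Equation (1): ceiling(ownd/SMSS) - 1, using integer arithmetic."""
    return -(-ownd // smss) - 1


def sack_bytes_needed(ownd, smss):
    """Bytes that must be SACKed before a SACK sender uses Early Retransmit."""
    return ownd - smss


SMSS = 1460

# Three 400-byte segments: the threshold collapses to zero and the SACK
# expression goes negative, although three packets are in flight.
print(er_thresh_bytes(3 * 400, SMSS), sack_bytes_needed(3 * 400, SMSS))

# Ten 400-byte segments: ownd (4000) is still below 4*SMSS (5840), so
# condition (2.a) holds and the threshold drops to two.
print(er_thresh_bytes(10 * 400, SMSS), 10 * 400 < 4 * SMSS)
```

Both results follow from estimating a packet count from a byte count, which is the imprecision the packet-based variant is meant to remove.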
232 A sender employing packet-based Early Retransmit MUST use the 233 following two conditions to determine when an Early Retransmit is 234 sent: 236 (3.a) The number of outstanding segments (oseg)---segments sent but 237 not yet acknowledged---is less than four. 239 (3.b) There is either no unsent data ready for transmission at the 240 sender or the advertised window does not permit new segments 241 to be transmitted. 243 When the above two conditions hold and the connection does not 244 support SACK the duplicate ACK threshold used to trigger a 245 retransmission MUST be reduced to: 247 ER_thresh = oseg - 1 (2) 249 duplicate ACKs, where oseg represents the number of outstanding 250 segments. (We discuss tracking the number of outstanding segments 251 below.) 253 When conditions (3.a) and (3.b) hold and the connection does support 254 SACK, Early Retransmit MUST be used only when "oseg - 1" segments 255 have been SACKed. 257 When conditions (3.a) and (3.b) do not hold, the transport MUST NOT 258 use Early Retransmit, but rather prefer the standard mechanisms, 259 including Limited Transmit. 261 This version of Early Retransmit solves the precision issues 262 discussed in the previous section. As noted previously, the cost is 263 that the implementation will have to track packet boundaries to form 264 an understanding as to how many actual segments have been 265 transmitted, but not acknowledged. This can be done by tracking the 266 boundaries of the three segments on the right side of the current 267 window (which involves tracking four sequence numbers in TCP). This 268 could be done by keeping a circular list of the packet boundaries, 269 for instance. Cumulative ACKs that do not fall within this region 270 indicate that at least four segments are outstanding and therefore 271 Early Retransmit MUST NOT be used. 
When the outstanding window 272 becomes small enough that Early Retransmit can be invoked, a full 273 understanding of the number of outstanding packets will be 274 available from the four sequence numbers retained. 276 3 Discussion 278 The SACK variant of the Early Retransmit algorithm is preferred to 279 the non-SACK variant due to its robustness in the face of ACK loss 280 (since SACKs are sent redundantly) and due to interactions with the 281 delayed ACK timer. Consider a flight of three segments, S1...S3, 282 with S2 being dropped by the network. When S1 arrives it is 283 in-order and so the receiver may or may not delay the ACK, leading 284 to two scenarios: 286 (A) The ACK for S1 is delayed: In this case the arrival of S3 will 287 trigger an ACK to be transmitted covering segment S1 (which was 288 previously unacknowledged). In this case Early Retransmit 289 without SACK will not prevent an RTO because no duplicate ACKs 290 will arrive. However, with SACK the ACK for S1 will also 291 include SACK information indicating that S3 has arrived at the 292 receiver. The sender can then invoke Early Retransmit on this 293 ACK because only one packet remains outstanding. 295 (B) The ACK for S1 is not delayed: In this case the arrival of S1 296 triggers an ACK of previously unacknowledged data. The arrival 297 of S3 triggers a duplicate ACK (because it is out-of-order). 298 Both ACKs will cover the same segment (S1). Therefore, 299 regardless of whether SACK is used Early Retransmit can be 300 performed by the sender (assuming no ACK loss). 302 Early Retransmit is less robust in the face of reordered segments 303 than when using the standard Fast Retransmit threshold. Research 304 shows that a general reduction in the number of duplicate ACKs 305 required to trigger Fast Retransmit to two (rather than three) leads 306 to a reduction in the ratio of good to bad retransmits by a factor 307 of three [Pax97]. 
However, this analysis did not include the 308 additional conditioning on the event that the ownd was smaller than 309 4 segments and that no new data was available for transmission. 311 A number of studies have shown that network reordering is not a rare 312 event across some network paths. Various measurement studies have 313 shown that reordering along most paths is negligible, but along 314 certain paths can be quite prevalent [Pax97,BPS99,BS02,Pir05]. 315 Evaluating Early Retransmit in the face of real packet reordering is 316 part of the experiment we hope to instigate with this document. 318 Next, we note two "worst case" scenarios for Early Retransmit: 320 (1) Persistent reordering of segments coupled with an application 321 that does not constantly send data can result in large numbers 322 of needless retransmissions when using Early Retransmit. For 323 instance, consider an application that sends data two segments 324 at a time, followed by an idle period when no data is queued for 325 delivery. If the network consistently reorders the two 326 segments, the sender will needlessly retransmit one out of every 327 two unique segments transmitted when using the above algorithm 328 (meaning that one-third of all segments sent are needless 329 retransmissions). However, this would only be a problem for 330 long-lived connections from applications that transmit in 331 spurts. 333 (2) Similar to the above, consider the case of 2 segment transfers 334 that always experience reordering. Just as in (1) above, one 335 out of every two unique data segments will be retransmitted 336 needlessly, therefore one-third of the traffic will be spurious. 338 Currently this document offers no suggestion on how to mitigate the 339 above problems. 
However, the worst cases are likely pathological. 340 Part of the experimentation that this document hopes to trigger 341 involves understanding whether such theoretical worst 342 case scenarios are prevalent in the network and, more generally, 343 exploring the tradeoff between spurious fast retransmits and the delay 344 imposed by the RTO. Appendix A does offer a survey of possible 345 mitigations that call for curtailing the use of Early Retransmit 346 when it is making poor retransmission decisions. 348 4 Related Work 350 Deployment of Explicit Congestion Notification (ECN) [Flo94,RFC3168] 351 may benefit connections with small congestion window sizes 352 [RFC2884]. ECN provides a method for indicating congestion to the 353 end-host without dropping segments. While some segment drops may 354 still occur, ECN may allow a transport to perform better with small 355 cwnd sizes because the sender will be required to detect fewer 356 segment losses [RFC2884]. 358 [Bal98] outlines another solution to the problem of having no new 359 segments to transmit into the network when the first two duplicate 360 ACKs arrive. In response to these duplicate ACKs, a TCP sender 361 transmits zero-byte segments to induce additional duplicate ACKs. 362 This method preserves the robustness of the standard Fast Retransmit 363 algorithm at the cost of injecting segments into the network that do 364 not deliver any data (and therefore potentially waste network 365 resources). 367 5 Security Considerations 369 The security considerations found in [RFC2581] apply to this 370 document. No additional security problems have been identified with 371 Early Retransmit at this time. 373 Acknowledgments 375 We thank Sally Floyd for her feedback in discussions about Early 376 Retransmit. The notion of Early Retransmit was originally sketched in 377 an Internet-Draft co-authored by Sally Floyd and Hari Balakrishnan.
378 Armando Caro and many members of the TSVWG and TCPM working groups 379 provided good discussions that helped shape this document. Our 380 thanks to all! 382 Normative References 384 [RFC793] Jon Postel. Transmission Control Protocol. Std 7, RFC 385 793. September 1981. 387 [RFC2018] Matt Mathis, Jamshid Mahdavi, Sally Floyd, Allyn Romanow. 388 TCP Selective Acknowledgement Options. RFC 2018, October 1996. 390 [RFC2581] Mark Allman, Vern Paxson, W. Richard Stevens. TCP 391 Congestion Control. RFC 2581, April 1999. 393 [RFC2883] Sally Floyd, Jamshid Mahdavi, Matt Mathis, Matt Podolsky. 394 An Extension to the Selective Acknowledgement (SACK) Option for 395 TCP. RFC 2883, July 2000. 397 [RFC2988] Vern Paxson, Mark Allman. Computing TCP's Retransmission 398 Timer. RFC 2988, April 2000. 400 [RFC3042] Mark Allman, Hari Balakrishnan, Sally Floyd. Enhancing 401 TCP's Loss Recovery Using Limited Transmit. RFC 3042, January 402 2001. 404 [RFC3522] Reiner Ludwig, Michael Meyer. The Eifel Detection 405 Algorithm for TCP. RFC 3522, April 2003. 407 [RFC4960] R. Stewart. Stream Control Transmission Protocol. RFC 408 4960, September 2007. 410 Informative References 412 [AA02] Urtzi Ayesta, Konstantin Avrachenkov, "The Effect of the 413 Initial Window Size and Limited Transmit Algorithm on the 414 Transient Behavior of TCP Transfers", In Proc. of the 15th ITC 415 Internet Specialist Seminar, Wurzburg, July 2002. 417 [All00] Mark Allman. A Web Server's View of the Transport Layer. 418 ACM Computer Communications Review, October 2000. 420 [Bal98] Hari Balakrishnan. Challenges to Reliable Data Transport 421 over Heterogeneous Wireless Networks. Ph.D. Thesis, University 422 of California at Berkeley, August 1998. 424 [BPS+98] Hari Balakrishnan, Venkata Padmanabhan, Srinivasan Seshan, 425 Mark Stemm, and Randy Katz. TCP Behavior of a Busy Web Server: 426 Analysis and Improvements. Proc. IEEE INFOCOM Conf., San 427 Francisco, CA, March 1998. 429 [BS02] John Bellardo, Stefan Savage. 
Measuring Packet Reordering, 430 ACM/USENIX Internet Measurement Workshop, November 2002. 432 [FF96] Kevin Fall, Sally Floyd. Simulation-based Comparisons of 433 Tahoe, Reno, and SACK TCP. ACM Computer Communication Review, 434 July 1996. 436 [Flo94] Sally Floyd. TCP and Explicit Congestion Notification. ACM 437 Computer Communication Review, October 1994. 439 [HB08] Per Hurtig, Anna Brunstrom. Enhancing SCTP Loss Recovery: An 440 Experimental Evaluation of Early Retransmit. Elsevier Computer 441 Communication, 2008, to appear. 443 [Jac88] Van Jacobson. Congestion Avoidance and Control. ACM 444 SIGCOMM 1988. 446 [LK98] Dong Lin, H.T. Kung. TCP Fast Recovery Strategies: Analysis 447 and Improvements. Proceedings of InfoCom, San Francisco, CA, 448 March 1998. 450 [Mor97] Robert Morris. TCP Behavior with Many Flows. Proceedings 451 of the Fifth IEEE International Conference on Network Protocols. 452 October 1997. 454 [Pax97] Vern Paxson. End-to-End Internet Packet Dynamics. ACM 455 SIGCOMM, September 1997. 457 [Pir05] N. M. Piratla, "A Theoretical Foundation, Metrics and 458 Modeling of Packet Reordering and Methodology of Delay Modeling 459 using Inter-packet Gaps," Ph.D. Dissertation, Department of 460 Electrical and Computer Engineering, Colorado State University, 461 Fort Collins, CO, Fall 2005. 463 [RFC2582] Sally Floyd, Tom Henderson. The NewReno Modification to 464 TCP's Fast Recovery Algorithm. RFC 2582, April 1999. 466 [RFC2884] Jamal Hadi Salim and Uvaiz Ahmed. Performance Evaluation 467 of Explicit Congestion Notification (ECN) in IP Networks. RFC 468 2884, July 2000. 470 [RFC3150] Spencer Dawkins, Gabriel Montenegro, Markku Kojo, Vincent 471 Magret. End-to-end Performance Implications of Slow Links. RFC 472 3150, July 2001. 474 [RFC3168] K. K. Ramakrishnan, Sally Floyd, David Black. The 475 Addition of Explicit Congestion Notification (ECN) to IP. RFC 476 3168, September 2001. 478 [RFC3517] Ethan Blanton, Mark Allman, Kevin Fall, Lili Wang. 
A 479 Conservative Selective Acknowledgment (SACK)-based Loss Recovery 480 Algorithm for TCP. RFC 3517, April 2003. 482 Authors' Addresses: 484 Mark Allman 485 International Computer Science Institute 486 1947 Center Street, Suite 600 487 Berkeley, CA 94704-1198 488 Phone: 440-235-1792 489 mallman@icir.org 490 http://www.icir.org/mallman/ 492 Konstantin Avrachenkov 493 INRIA 494 2004 route des Lucioles, B.P.93 495 06902, Sophia Antipolis 496 France 497 Phone: 00 33 492 38 7751 498 k.avrachenkov@sophia.inria.fr 499 http://www.inria.fr/mistral/personnel/K.Avrachenkov/moi.html 501 Urtzi Ayesta 502 LAAS-CNRS 503 7 Avenue Colonel Roche 504 31077 Toulouse 505 France 506 urtzi@laas.fr 507 http://www.laas.fr/~urtzi 509 Josh Blanton 510 Ohio University 511 301 Stocker Center 512 Athens, OH 45701 513 jblanton@irg.cs.ohiou.edu 515 Per Hurtig 516 Karlstad University 517 Department of Computer Science 518 Universitetsgatan 2 651 88 519 Karlstad Sweden 520 per.hurtig@kau.se 522 Appendix A: Research Issues in Adjusting the Duplicate ACK Threshold 524 Decreasing the number of duplicate ACKs required to trigger Fast 525 Retransmit, as suggested in section 2, has the drawback of making 526 Fast Retransmit less robust in the face of minor network reordering. 527 Two egregious examples of problems caused by reordering are given in 528 section 3. This appendix outlines several schemes that have been 529 suggested to mitigate the problems caused by Early Retransmit in the 530 face of packet reordering. These methods need further research 531 before they are suggested for general use (and the current consensus is 532 that the cases that make Early Retransmit unnecessarily retransmit a 533 large amount of data are pathological and therefore these 534 mitigations are not generally required). 536 MITIGATION A.1: Allow a connection to use Early Retransmit as long 537 as the algorithm is not injecting "too much" spurious data into 538 the network.
For instance, using the information provided by 539 TCP's DSACK option [RFC2883] or SCTP's Duplicate-TSN 540 notification, a sender can determine when segments sent via 541 Early Retransmit are needless. Likewise, using Eifel [RFC3522] 542 the sender can detect spurious Early Retransmits. Once spurious 543 Early Retransmits are detected the sender can either eliminate 544 the use of Early Retransmit or limit the use of the algorithm to 545 ensure that only an acceptably small fraction of the connection's 546 transmissions are spurious. For example, a connection could 547 stop using Early Retransmit after the first spurious retransmit 548 is detected. 550 MITIGATION A.2: If a sender cannot reliably determine if an Early 551 Retransmitted segment is spurious or not the sender could simply 552 limit Early Retransmits either to some fixed number per 553 connection (e.g., Early Retransmit is allowed only once per 554 connection) or to some small percentage of the total traffic 555 being transmitted. 557 MITIGATION A.3: Allow a connection to trigger Early Retransmit using 558 the criteria given in section 2, in addition to a "small" 559 timeout [Pax97]. For instance, a sender may have to wait for 2 560 duplicate ACKs and then T msec before Early Retransmit is 561 invoked. The added time gives reordered acknowledgments time to 562 arrive at the sender and avoid a needless retransmit. Designing 563 a method for choosing an appropriate timeout is part of the 564 research that this scheme would require. 566 Intellectual Property Statement 568 The IETF takes no position regarding the validity or scope of any 569 Intellectual Property Rights or other rights that might be claimed 570 to pertain to the implementation or use of the technology described 571 in this document or the extent to which any license under such 572 rights might or might not be available; nor does it represent that 573 it has made any independent effort to identify any such rights.
574 Information on the procedures with respect to rights in RFC 575 documents can be found in BCP 78 and BCP 79. 577 Copies of IPR disclosures made to the IETF Secretariat and any 578 assurances of licenses to be made available, or the result of an 579 attempt made to obtain a general license or permission for the use 580 of such proprietary rights by implementers or users of this 581 specification can be obtained from the IETF on-line IPR repository 582 at http://www.ietf.org/ipr. 584 The IETF invites any interested party to bring to its attention any 585 copyrights, patents or patent applications, or other proprietary 586 rights that may cover technology that may be required to implement 587 this standard. Please address the information to the IETF at 588 ietf-ipr@ietf.org. 590 Disclaimer of Validity 592 This document and the information contained herein are provided on 593 an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE 594 REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE 595 IETF TRUST AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL 596 WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY 597 WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE 598 ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS 599 FOR A PARTICULAR PURPOSE. 601 Copyright Statement 603 Copyright (C) The IETF Trust (2008). This document is subject 604 to the rights, licenses and restrictions contained in BCP 78, and 605 except as set forth therein, the authors retain all their rights. 607 Acknowledgment 609 Funding for the RFC Editor function is currently provided by the 610 Internet Society.