Internet Engineering Task Force                       Josh Blanton
INTERNET DRAFT                                        Ohio University
draft-allman-rto-backoff-05.txt                       Ethan Blanton
Expires: January 2008                                 Purdue University
                                                      Mark Allman
                                                      ICIR/ICSI
                                                      July 2007

   Using Spurious Retransmissions to Adapt the Retransmission Timeout
                    draft-allman-rto-backoff-05.txt

Status of this Memo

   By submitting this Internet-Draft, each author represents that any
   applicable patent or other IPR claims of which he or she is aware
   have been or will be disclosed, and any of which he or she becomes
   aware will be disclosed, in accordance with Section 6 of BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

Copyright Notice

   Copyright (C) The IETF Trust (2007).

Abstract

   This document describes a method for using spurious retransmission
   timeouts as the trigger for slightly changing the way TCP's
   retransmission timeout is computed in an effort to avoid subsequent
   unnecessary retransmissions.

Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
   NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and
   "OPTIONAL" in this document are to be interpreted as described
   in [RFC2119].

   The reader is expected to be familiar with the algorithm and
   terminology from [RFC2988].

1.  Introduction

   Various studies have shown that the retransmission timeout (RTO)
   estimator in [RFC2988] can trigger spurious retransmissions.  [AP99]
   shows that such unnecessary retransmissions are generally fairly
   rare.  However, [LK00] shows that in some networks (e.g., wireless
   networks) spurious retransmissions are more problematic due to
   occasional delay spikes that are not well predicted by TCP's RTO
   estimator.  In this document we outline one possible approach to
   mitigate the impact of premature RTO firings by altering the RTO
   estimator specified in [RFC2988].

   Several methods for detecting spurious timeouts have been developed
   [RFC3522,RFC3708,RFC4138].  Additionally, [RFC4015] outlines one
   possible response to detecting spurious timeouts.  This document
   outlines an alternative to [RFC4015].  In general terms, [RFC4015]
   specifies two actions upon the detection of an unnecessary RTO-based
   retransmission.  First, the sending rate prior to the spurious
   retransmission is restored.  Furthermore, the RTO is adapted by
   re-initializing the RTO estimator with the long round-trip time
   (RTT) measurement that caused the spurious RTO.  The approach given
   in [RFC4015] is reasonable if the underlying cause of the problem is
   a shift in the path RTT.  For instance, if the route a TCP
   connection is traversing changes and the new path's RTT is
   significantly longer than the previous path's RTT then simply
   re-initializing the RTO is a reasonable action.

   As specified in the next section, this document takes a slightly
   different approach from [RFC4015].  Generally, this document uses
   the failure of the RTO to wait long enough before triggering a
   retransmit as an indication that the RTO estimator itself is not
   properly capturing the variance present in the RTTs experienced by
   the TCP connection.  Therefore, this document calls for an
   additive contribution to the variance component in the RTO
   estimator upon the detection of spurious retransmission timeouts,
   in an effort to absorb that variance.  This change represents a
   preference to try to avoid future spurious timeouts rather than
   simply reacting to each spurious retransmission.

   We note that TCP implementations using the RTTM mechanism [RFC1323]
   to assess the RTT multiple times per RTT with the standard
   exponentially-weighted moving average (EWMA) gains from [RFC2988]
   retain less RTT history than when taking one RTT measurement per
   RTT.  [AP99] shows that "fast" EWMAs yield more spurious
   retransmissions than when using the standard gains with one RTT
   sample per RTT.  Therefore, an orthogonal change to TCP
   implementations that use RTTM that may prevent spurious RTOs is to
   set the EWMA gains based on the number of RTT samples taken per RTT
   such that the amount of history kept, in terms of time, is the same
   regardless of the number of RTT samples taken [Flo98,LS00].

2.  Parameter Changes

   As the basis for the changes proposed below, a TCP MUST support an
   IETF-specified spurious timeout detection method.  Currently,
   [RFC3522], [RFC3708] and [RFC4138] are such detection methods.  We
   note that the research literature includes alternative methods for
   detecting spurious retransmissions, e.g., the "retransmit bit"
   [LK00], but these schemes MUST NOT be used as part of the changes
   specified in this document until such time that the IETF approves a
   specification of these schemes.

   We also note that [RFC2988] explicitly allows for an RTO estimator
   that is more conservative than that given in [RFC2988] (which this
   document specifies).
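
   For reference, the [RFC2988] computation that the changes in this
   section build on can be sketched as follows.  This is an
   illustrative Python sketch, not part of the specification: the class
   name, the method names, and the default clock granularity G of 0.1
   seconds are assumptions made for the example, and the sketch omits
   the initial RTO value, the minimum RTO, and the timer back-off
   described in [RFC2988].

```python
from dataclasses import dataclass
from typing import Optional

K = 4          # RTTVAR multiplier from [RFC2988]
ALPHA = 1 / 8  # SRTT gain from [RFC2988]
BETA = 1 / 4   # RTTVAR gain from [RFC2988]


@dataclass
class Rfc2988Estimator:
    """Baseline estimator sketch; all times are in seconds."""
    G: float = 0.1                  # clock granularity (assumed value)
    srtt: Optional[float] = None    # smoothed RTT
    rttvar: Optional[float] = None  # RTT variation

    def sample(self, r: float) -> None:
        """Fold one RTT measurement r into SRTT and RTTVAR."""
        if self.srtt is None:
            # First measurement: SRTT <- R, RTTVAR <- R/2
            self.srtt = r
            self.rttvar = r / 2
        else:
            # RTTVAR is updated before SRTT, using the old SRTT
            self.rttvar = (1 - BETA) * self.rttvar + BETA * abs(self.srtt - r)
            self.srtt = (1 - ALPHA) * self.srtt + ALPHA * r

    def rto(self) -> float:
        """RTO <- SRTT + max(G, K*RTTVAR); valid after the first sample."""
        return self.srtt + max(self.G, K * self.rttvar)
```

   Under one reading of "more conservative", an estimator qualifies
   when the RTO it produces is never smaller than the value rto()
   returns for the same sequence of samples.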

   We also note that a TCP that can distinguish needed from unneeded
   retransmission timeouts does not need to use Karn's algorithm
   [KP87,RFC2988] and can accurately determine the RTT that causes a
   spurious retransmission.

   This document specifies that a TCP MAY change the RTO estimator
   given in [RFC2988] upon detection of a spurious timeout, as follows.

   The general idea behind the mechanism is to introduce an additive
   variance term, V, in addition to the multiplier K, which is applied
   to RTTVAR in the RTO calculation given in step (2.3) of [RFC2988],
   to allow for additional variance in the path's RTT.  The specific
   mechanism for TCPs using this change is:

   (A) A TCP using this method MUST replace the calculation of RTO in
       step (2.3) of [RFC2988] with:

           RTO <- SRTT + max(G, K*RTTVAR) + V                    (1)

       to include the additional variance term.

   (B) When a TCP connection is initiated, V is set to 0.

   (C) Upon the first expiration of the retransmission timer for a
       given sequence number, the values of SRTT and RTTVAR MUST be
       saved as SRTT_prev and RTTVAR_prev, respectively.

   (D) Upon detecting that a previous RTO-based retransmission was
       spurious, a TCP MUST calculate a V' using the RTT sample
       R', which is the time between when the original transmission of
       the given segment was sent and when that original
       transmission is acknowledged, as follows:

           V' = R' - (SRTT_prev + max(G, K*RTTVAR_prev))         (2)

       V' is then the difference between the previously
       calculated RTO and the RTO value that would have prevented
       the spurious retransmission.

       The value of V' MUST NOT be reduced for the remainder of the
       connection (as discussed in more detail below).

   (E) The values of SRTT and RTTVAR in use when the spurious
       retransmit occurred MUST replace the current values:

           SRTT = SRTT_prev                                      (3)
           RTTVAR = RTTVAR_prev                                  (4)

   (F) The R' RTT sample MUST be used to adjust SRTT and RTTVAR and
       therefore the RTO, per [RFC2988].

   The actual V that is used in the RTO calculation is determined by
   the size of the congestion window.  When a TCP has only a small
   number of outstanding segments, advanced loss recovery that relies
   on the receipt of three duplicate acknowledgments as a recovery
   trigger is not as effective as when the congestion window is larger.
   Therefore, TCP relies more heavily on the RTO in this regime.
   Furthermore, the impact caused by spurious timeouts in this
   situation---in terms of congestion window reduction and resource
   wastage by go-back-N transmission---is small.  Hence, when the
   congestion window is less than or equal to 4*SMSS bytes, a
   V of 0 SHOULD be used when calculating the RTO.  Once the congestion
   window size grows beyond 4*SMSS bytes, the calculated value of V
   SHOULD be used in the calculation of the RTO.

   This specification explicitly offers no way to reduce V after it
   has been inflated.  V is never reduced because the presence of
   spurious timeouts which inflated V indicates that the standard
   estimator is inadequate for accurately estimating the variance of
   the RTT across the network path and therefore reducing V would
   increase the chances of further spurious retransmissions.

   Finally, we note that bounding V' is not advisable.  Say V' would be
   set to 20 via equation (2).  If V' were, instead, bounded to 10 then
   legitimate RTOs would be forced to wait longer without offering
   solid protection against delay spikes (given that delay spikes that
   a V' of 10 will not handle have been observed).

3.  Advantages

   The advantage of tuning the RTO calculation to be more conservative
   after detecting spurious RTO-based retransmissions is in preventing
   further spurious RTOs.  In addition, spurious RTOs can cause
   go-back-N behavior [LK00] which can also be avoided by adapting the
   RTO to be more conservative.

4.  Disadvantages

   The disadvantage of tuning the RTO calculation to be more
   conservative is that legitimate RTO firings take longer and could
   hurt performance.  However, an important note is that the RTO should
   not be TCP's primary loss recovery strategy.  [RFC3782] and
   [RFC3517] provide methods for TCP to effectively repair multiple
   lost segments from a single window of data without falling back to
   using the RTO.  Further, research shows that these changes are
   widely implemented [MAF05].  Therefore, making TCP's RTO calculation
   more conservative should not hinder performance under normal
   circumstances.  Put differently, when using advanced loss recovery
   techniques the firing of the RTO should be an indication that the
   congestion situation in the network is fairly bad.  In this case, it
   may well be that making the RTO estimator more conservative is the
   right general approach.

   The common exception to the above argument is when the congestion
   window is small, such that these advanced loss recovery algorithms
   do not work effectively.  The mechanism in this document explicitly
   takes this case into account by not using the more conservative RTO
   estimate when the congestion window is small.

5.  Summary

   This document specifies a small change that makes the RTO
   calculation given in [RFC2988] more conservative upon the detection
   of spurious RTO-based retransmissions.  The root cause of spurious
   retransmits is an inaccurate assessment of the network conditions
   (in this case, of the RTT).
   Therefore, we tackle this by making the
   RTO calculation take into account an additional variance term.
   While this does lengthen the time required for legitimate
   retransmissions to fire, the RTO should not be TCP's primary means
   for retransmitting data and therefore this lengthened interval
   should only minimally impact overall performance and should only
   come into play when conditions along the network path have
   deteriorated significantly.  Finally, we note that this document
   makes the estimator given in [RFC2988] strictly more conservative
   and is therefore allowed via [RFC2988].

6.  Security Considerations

   This document calls for a simple parameter tweak and does not change
   the security considerations given in [RFC2988].

7.  IANA Considerations

   None.

Acknowledgments

   This document has benefited from discussions with Ted Faber, Aaron
   Falk, Joseph Ishac, Janardhan Iyengar, Sally Floyd, Vern Paxson and
   Joe Touch.

Normative References

   [RFC2119] S. Bradner.  Key words for use in RFCs to Indicate
       Requirement Levels, March 1997.  BCP 14, RFC 2119.

   [RFC2988] V. Paxson, M. Allman.  Computing TCP's Retransmission
       Timer, November 2000.  RFC 2988.

   [RFC3522] R. Ludwig, M. Meyer.  The Eifel Detection Algorithm for
       TCP, April 2003.  RFC 3522.

   [RFC3708] E. Blanton, M. Allman.  Using TCP Duplicate Selective
       Acknowledgement (DSACKs) and Stream Control Transmission
       Protocol (SCTP) Duplicate Transmission Sequence Numbers (TSNs)
       to Detect Spurious Retransmissions, February 2004.  RFC 3708.

   [RFC4138] P. Sarolahti, M. Kojo.  Forward RTO-Recovery (F-RTO): An
       Algorithm for Detecting Spurious Retransmission Timeouts with
       TCP and the Stream Control Transmission Protocol (SCTP), August
       2005.  RFC 4138.

Informative References

   [AP99] Mark Allman, Vern Paxson.  On Estimating End-to-End Network
       Path Properties.  ACM SIGCOMM, September 1999.

   [Flo98] Sally Floyd.
       Comments on RFC1323.bis, TCP-LW mailing list, May 1998.

   [KP87] Phil Karn, Craig Partridge.  Improving Round-Trip Time
       Estimates in Reliable Transport Protocols.  ACM SIGCOMM, August
       1987.

   [LK00] R. Ludwig, R. H. Katz.  The Eifel Algorithm: Making TCP
       Robust Against Spurious Retransmissions.  ACM Computer
       Communication Review, 30(1), January 2000.

   [LS00] R. Ludwig, K. Sklower.  The Eifel Retransmission Timer.  ACM
       Computer Communication Review, 30(3), July 2000.

   [MAF05] A. Medina, M. Allman, S. Floyd.  Measuring the Evolution of
       Transport Protocols in the Internet.  ACM Computer Communication
       Review, 35(2), April 2005.

   [RFC3517] E. Blanton, M. Allman, K. Fall, L. Wang.  A Conservative
       Selective Acknowledgment (SACK)-based Loss Recovery Algorithm
       for TCP, April 2003.  RFC 3517.

   [RFC3782] S. Floyd, T. Henderson, A. Gurtov.  The NewReno
       Modification to TCP's Fast Recovery Algorithm, April 2004.  RFC
       3782.

   [RFC4015] R. Ludwig, A. Gurtov.  The Eifel Response Algorithm for
       TCP, February 2005.  RFC 4015.
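
Appendix: Illustrative Sketch of the Mechanism

   Steps (A) through (F) of Section 2, together with the congestion
   window rule, can be sketched in Python as shown here.  This is a
   non-normative illustration under stated assumptions, not a
   reference implementation.  The class and method names, the default
   G of 0.1 seconds, and the default SMSS of 1460 bytes are invented
   for the example.  The sketch reads equation (2) as R' minus the
   whole of the previously computed RTO, in line with the prose
   description of V' as the difference between the two RTO values.
   Section 2 does not state how V is derived from successive values
   of V'; the sketch assumes V = max(V, V') so that V never decreases.
   Spurious timeout detection and timer back-off are outside its scope.

```python
from dataclasses import dataclass
from typing import Optional

K = 4          # RTTVAR multiplier from [RFC2988]
ALPHA = 1 / 8  # SRTT gain from [RFC2988]
BETA = 1 / 4   # RTTVAR gain from [RFC2988]


@dataclass
class AdaptiveRto:
    """Sketch of the modified estimator; times in seconds, cwnd in bytes."""
    srtt: float
    rttvar: float
    G: float = 0.1      # clock granularity (assumed value)
    smss: int = 1460    # sender maximum segment size (assumed value)
    v: float = 0.0      # step (B): V starts at 0
    srtt_prev: Optional[float] = None
    rttvar_prev: Optional[float] = None

    def sample(self, r: float) -> None:
        """Standard [RFC2988] update of RTTVAR, then SRTT."""
        self.rttvar = (1 - BETA) * self.rttvar + BETA * abs(self.srtt - r)
        self.srtt = (1 - ALPHA) * self.srtt + ALPHA * r

    def rto(self, cwnd: int) -> float:
        """Step (A): equation (1), with V applied only if cwnd > 4*SMSS."""
        v = self.v if cwnd > 4 * self.smss else 0.0
        return self.srtt + max(self.G, K * self.rttvar) + v

    def on_first_timeout(self) -> None:
        """Step (C): save the estimator state at the first timer expiry."""
        self.srtt_prev, self.rttvar_prev = self.srtt, self.rttvar

    def on_spurious_timeout(self, r_prime: float) -> None:
        """Steps (D) through (F), given the RTT sample R'."""
        if self.srtt_prev is None or self.rttvar_prev is None:
            raise RuntimeError("no saved state; call on_first_timeout first")
        # Step (D): equation (2), as read in the lead-in above
        v_prime = r_prime - (self.srtt_prev + max(self.G, K * self.rttvar_prev))
        # Assumption: V only ever grows
        self.v = max(self.v, v_prime)
        # Step (E): equations (3) and (4)
        self.srtt, self.rttvar = self.srtt_prev, self.rttvar_prev
        # Step (F): fold R' into SRTT and RTTVAR
        self.sample(r_prime)
```

   For example, with SRTT = 1.0 and RTTVAR = 0.25, a spurious timeout
   whose original transmission is acknowledged after R' = 3.0 seconds
   gives V' = 3.0 - (1.0 + 1.0) = 1.0 under the reading above.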

Authors' Addresses

   Josh Blanton
   Ohio University Internetworking Research Group
   301 Stocker Center
   Athens, OH 45701
   Email: jblanton@cs.ohiou.edu
   URL: http://irg.cs.ohiou.edu/~jblanton/

   Ethan Blanton
   Purdue University Computer Sciences
   305 North University Street
   West Lafayette, IN 47907
   Email: eblanton@cs.purdue.edu
   URL: http://www.cs.purdue.edu/homes/eblanton/

   Mark Allman
   ICSI Center for Internet Research
   1947 Center Street, Suite 600
   Berkeley, CA 94704-1198
   Phone: (440) 235-1792
   Email: mallman@icir.org
   URL: http://www.icir.org/mallman/

Intellectual Property Statement

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed
   to pertain to the implementation or use of the technology described
   in this document or the extent to which any license under such
   rights might or might not be available; nor does it represent that
   it has made any independent effort to identify any such rights.
   Information on the procedures with respect to rights in RFC
   documents can be found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use
   of such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository
   at http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.

Disclaimer of Validity

   This document and the information contained herein are provided on
   an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
   REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE
   IETF TRUST AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL
   WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY
   WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE
   ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS
   FOR A PARTICULAR PURPOSE.

Copyright Statement

   Copyright (C) The IETF Trust (2007).  This document is subject
   to the rights, licenses and restrictions contained in BCP 78, and
   except as set forth therein, the authors retain all their rights.

Acknowledgment

   Funding for the RFC Editor function is currently provided by the
   Internet Society.