idnits 2.17.1 draft-ietf-tcpm-rto-consider-03.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- == The document has an IETF Trust Provisions of 28 Dec 2009, Section 6.c(i) Publication Limitation clause. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- No issues found here. Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack an IANA Considerations section. (See Section 2.2 of https://www.ietf.org/id-info/checklist for how to handle the case when there are no actions for IANA.) ** The abstract seems to contain references ([RFC2119]), which it shouldn't. Please replace those with straight textual mentions of the documents in question. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust and authors Copyright Line does not match the current year -- The document date (April 15, 2016) is 2931 days in the past. Is this intentional? Checking references for intended status: Best Current Practice ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Missing Reference: 'RFC5681' is mentioned on line 301, but not defined -- Obsolete informational reference (is this intentional?): RFC 3940 (Obsoleted by RFC 5740) -- Obsolete informational reference (is this intentional?): RFC 4960 (Obsoleted by RFC 9260) Summary: 2 errors (**), 0 flaws (~~), 3 warnings (==), 3 comments (--). Run idnits with the --verbose option for more detailed information about the items above. 
-------------------------------------------------------------------------------- 1 Internet Engineering Task Force M. Allman 2 INTERNET-DRAFT ICSI 3 File: draft-ietf-tcpm-rto-consider-03.txt April 15, 2016 4 Intended Status: Best Current Practice 5 Expires: October 15, 2016 7 Retransmission Timeout Considerations 9 Status of this Memo 11 This document may not be modified, and derivative works of it may 12 not be created, except to format it for publication as an RFC or to 13 translate it into languages other than English. 15 This Internet-Draft is submitted in full conformance with the 16 provisions of BCP 78 and BCP 79. Internet-Drafts are working 17 documents of the Internet Engineering Task Force (IETF), its areas, 18 and its working groups. Note that other groups may also distribute 19 working documents as Internet-Drafts. 21 Internet-Drafts are draft documents valid for a maximum of six 22 months and may be updated, replaced, or obsoleted by other documents 23 at any time. It is inappropriate to use Internet-Drafts as 24 reference material or to cite them other than as "work in progress." 26 The list of current Internet-Drafts can be accessed at 27 http://www.ietf.org/1id-abstracts.html 29 The list of Internet-Draft Shadow Directories can be accessed at 30 http://www.ietf.org/shadow.html 32 This Internet-Draft will expire on October 15, 2016. 34 Copyright Notice 36 Copyright (c) 2016 IETF Trust and the persons identified as the 37 document authors. All rights reserved. 39 This document is subject to BCP 78 and the IETF Trust's Legal 40 Provisions Relating to IETF Documents 41 (http://trustee.ietf.org/license-info) in effect on the date of 42 publication of this document. Please review these documents 43 carefully, as they describe your rights and restrictions with 44 respect to this document. 
Code Components extracted from this 45 document must include Simplified BSD License text as described in 46 Section 4.e of the Trust Legal Provisions and are provided without 47 warranty as described in the Simplified BSD License. 49 Abstract 51 Each implementation of a retransmission timeout mechanism represents 52 a balance between correctness and timeliness and therefore no 53 implementation suits all situations. This document provides 54 high-level requirements for retransmission timeout schemes 55 appropriate for general use in the Internet. Within the 56 requirements, implementations have latitude to define particulars 57 that best address each situation. 59 Terminology 61 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 62 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 63 document are to be interpreted as described in BCP 14, RFC 2119 64 [RFC2119]. 66 1 Introduction 68 Despite our best intentions and most robust mechanisms, reliability 69 in networking ultimately requires a timeout and re-try mechanism. 70 Often there are more timely and precise mechanisms than a timeout 71 for repairing loss (e.g., TCP's fast retransmit [RFC5681], NewReno 72 [RFC6582] or selective acknowledgment scheme [RFC2018,RFC6675]) 73 which require information exchange between components in the system. 74 Such communication cannot be guaranteed. Alternatively, information 75 coding---e.g., FEC---can allow the recipient to recover from some 76 amount of lost information without use of a retransmission. This 77 latter provides probabilistic reliability. Finally, negative 78 acknowledgment schemes exist that do not depend on continuous 79 feedback to trigger retransmissions (e.g., [RFC3940]). However, 80 regardless of these useful alternatives, the only thing we can truly 81 depend on is the passage of time and therefore our ultimate backstop 82 to ensuring reliability is a timeout. 
(Note: There is a case when 83 we cannot count on the passage of time, but in this case we believe 84 repairing loss will be a moot point and hence we do not further 85 consider this case in this document.) 87 Various protocols have defined their own timeout mechanisms (e.g., 88 TCP [RFC6298], SCTP [RFC4960], SIP [RFC3261]). Ideally, if we knew 89 a segment would be lost before reaching the destination, a second 90 copy of it would be sent immediately after the first transmission. 91 However, in reality the specifics of retransmission timeouts often 92 represent a particular tradeoff between correctness and 93 responsiveness [AP99]. In other words, we want to simultaneously: 95 - Wait long enough to ensure the decision to retransmit is 96 correct. 98 - Bound the delay we impose on applications before 99 retransmitting. 101 However, serving both of these goals is difficult as they pull in 102 opposite directions: toward either (a) withholding needed 103 retransmissions too long in order to ensure the retransmissions are truly 104 needed or (b) not waiting long enough, in order to help application 105 responsiveness, and thereby sending spurious retransmissions. Given this 106 fundamental tradeoff [AP99], we have found that even though the 107 retransmission timeout (RTO) procedures are standardized, 108 implementations often add their own subtle imprint on the specifics 109 of the process to tilt the tradeoff between correctness and 110 responsiveness in some particular way. 112 At this point we recognize that often these specific tweaks are not 113 crucial for network safety. Hence, in this document we outline the 114 high-level requirements that are crucial for any retransmission 115 timeout scheme to follow. The intent is to then allow 116 implementations to instantiate mechanisms that best realize their 117 specific goals within this framework. 
These specific mechanisms 118 could be standardized by the IETF or ad-hoc, but as long as they 119 adhere to the requirements given in this document they would be 120 considered consistent with the standards. 122 Finally, we note the requirements in this document are applicable to 123 any protocol that uses a retransmission timeout mechanism. The 124 examples and discussion are framed in terms of TCP, however, that is 125 an artifact of where much of our experience with RTOs comes from and 126 should not be read as narrowing the scope of the requirements. 128 2 Scope 130 This document offers high-level requirements based on experience 131 with retransmission timer algorithms. However, this document 132 explicitly does not update or obsolete currently standardized 133 algorithms nor limit future standardization of specific RTO 134 mechanisms. Specifically: 136 (a) RTO mechanisms that are currently standardized are not updated 137 or obsoleted by this document. This holds even in cases where 138 the existing specification differs from the requirements in this 139 document (e.g., [RFC3261] uses a smaller initial RTO than this 140 document specifies). Existing standard specifications enjoy 141 their own consensus which this document does not change. 143 (b) Future standardization efforts that specify RTO mechanisms 144 SHOULD follow the requirements in this document. This follows 145 the definition of "SHOULD" [RFC2119] and is explicitly not a 146 "MUST". That is, the requirements in this document hold unless 147 the community has consensus that specific deviations in a 148 particular context are warranted. 150 (c) RTO mechanisms that are not standardized but adhere to the 151 requirements in the following section are deemed consistent with 152 the standards. This includes RTO mechanisms that are deviations 153 from a specific standardized algorithm, but are still within the 154 requirements below. 
156 More colloquially we note that each RTO implementation can be placed 157 into one of the following four categories: 159 - The implementation precisely follows a standard RTO mechanism 160 (e.g., [RFC6298]), as well as adhering to the requirements in this 161 document. 163 This document represents no change for this situation as such an 164 implementation is clearly standards compliant. 166 - The implementation does not precisely follow a standard RTO 167 mechanism and does not adhere to the requirements in this 168 document. 170 This document makes no change to this situation as such an 171 implementation is clearly not standards compliant. 173 - The implementation precisely follows a standard RTO mechanism 174 (e.g., [RFC3261]), but does not precisely adhere to the 175 requirements in this document. 177 This document represents no change for this situation as such an 178 implementation is considered standards compliant by virtue of 179 precisely implementing a standard mechanism that has community 180 consensus as a reasonable approach. That is, this document's 181 stance is to not limit the community's ability to make exceptions 182 to the requirements herein for particular cases. 184 - The implementation does not precisely follow a standard RTO 185 mechanism, yet does adhere to the requirements in this document. 187 This document represents a change for these implementations and 188 considers them to be consistent with the standards by virtue of 189 following the requirements herein that provide for an RTO safe for 190 operation in the Internet. 192 In other words, the requirements in this document can be viewed as 193 specifying the default properties of an RTO mechanism. 194 Specifications can more concretely nail down specifics within these 195 defaults or work outside the defaults as necessary. However, 196 implementations that fall within the defaults do not require 197 explicit specifications to be considered consistent with the 198 standards. 
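The four categories above reduce to a two-by-two decision on whether an implementation follows a standard RTO mechanism and whether it meets the requirements in this document. The following Python sketch is only an illustration of that reading; the function name, parameter names, and returned strings are hypothetical and are not defined by this document.

```python
def classify(follows_standard_rto: bool, meets_requirements: bool) -> str:
    """Return how the Scope section treats an RTO implementation (illustrative only)."""
    if follows_standard_rto:
        # Compliant by virtue of the standard mechanism's own consensus,
        # whether or not the requirements in this document are also met.
        return "standards compliant (no change)"
    if meets_requirements:
        # The one case this document changes: a non-standard mechanism
        # that stays within the requirements.
        return "consistent with the standards (change made by this document)"
    return "not standards compliant (no change)"
```

Only the combination of a non-standard mechanism that meets the requirements yields a different answer than it would without this document.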
200 3 Requirements 202 We now list the requirements that SHOULD apply when designing 203 retransmission timeout (RTO) mechanisms. 205 (1) In the absence of any knowledge about the latency of a path, the 206 RTO MUST be conservatively set to no less than 1 second. 208 This requirement ensures two important aspects of the RTO. 209 First, when transmitting into an unknown network, 210 retransmissions will not be sent before an ACK would reasonably 211 be expected to arrive; such premature retransmissions could waste scarce network 212 resources. Second, as noted below, sometimes retransmissions 213 can lead to ambiguities in assessing the latency of a network 214 path. Therefore, it is especially important for the first 215 latency sample to be free of ambiguities such that there is a 216 baseline for the remainder of the communication. 218 The specific constant (1 second) comes from the analysis of 219 Internet RTTs found in Appendix A of [RFC6298]. 221 (2) We specify three requirements that pertain to the sampling of 222 the latency across a path. 224 Often measuring the latency is framed as assessing the 225 round-trip time (RTT)---e.g., in TCP's RTO computation 226 specification [RFC6298]. This is somewhat misleading as the 227 latency is better framed as the "feedback time" (FT). In other 228 words, it is not simply a network property, but the length of 229 time before a sender should reasonably expect a response to a 230 query. 232 For instance, consider a DNS request from a client to a 233 resolver. When the request can be served from the resolver's 234 cache the FT likely approximates the network RTT between 235 the client and resolver well. However, on a cache miss the resolver 236 will have to request the needed information from authoritative 237 DNS servers, which will non-trivially increase the FT. In that case 238 the FT between the client and resolver does not closely 239 match the network-based RTT between the two hosts. 
241 (a) In steady state the RTO MUST be set based on recent 242 observations of both the FT and the variance of the FT. 244 In other words, the RTO should be based on a reasonable 245 amount of time that the sender should wait for an 246 acknowledgment of the data before retransmitting the given 247 data. 249 (b) FT observations MUST be taken regularly. 251 The exact definition of "regularly" is deliberately left 252 vague. TCP takes an FT sample roughly once per RTT, or if 253 using the timestamp option [RFC7323] on each acknowledgment 254 arrival. [AP99] shows that both these approaches result in 255 roughly equivalent performance for the RTO estimator. 256 Additionally, [AP99] shows that taking only a single FT 257 sample per TCP connection is suboptimal; hence the 258 requirement that the FT be sampled continuously throughout 259 the lifetime of a connection. For the purpose of this 260 requirement, we state that FT samples SHOULD be taken at 261 least once per RTT or as frequently as data is exchanged and 262 ACKed if that happens less frequently than every RTT. 263 However, we also recognize that it may not always be 264 practical to take an FT sample this often in all cases. 265 Hence, this once-per-RTT sampling requirement is explicitly 266 a "SHOULD" and not a "MUST". 268 (c) FT samples used in the computation of the RTO MUST NOT be 269 ambiguous. 271 Assume two copies of some segment X are transmitted at times 272 t0 and t1 and then segment X is acknowledged at time t2. In 273 some cases, it is not clear which copy of X triggered the 274 ACK and hence the actual FT is either t2-t1 or t2-t0, but 275 which of the two is unknown. Therefore, in this situation an 276 implementation MUST use Karn's algorithm [KP87,RFC6298] and 277 use neither version of the FT sample and hence not update 278 the RTO. 280 There are cases where two copies of some data are 281 transmitted in a way whereby the sender can tell which is 282 being acknowledged by an incoming ACK. 
E.g., TCP's 283 timestamp option [RFC7323] allows for segments to be 284 uniquely identified and hence avoid the ambiguity. In such 285 cases there is no ambiguity and the resulting samples can 286 update the RTO. 288 (3) Each time the RTO fires and causes a retransmission the value of 289 the RTO MUST be exponentially backed off such that the next 290 firing requires a longer interval. The backoff may be removed 291 after the successful transmission of non-retransmitted data. 293 A maximum value MAY be placed on the RTO provided it is at least 294 60 seconds (a la [RFC6298]). 296 This ensures network safety. 298 (4) Retransmission timeouts MUST be taken as indications of 299 congestion in the network and the sending rate adapted using a 300 standard mechanism (e.g., TCP collapses the congestion window to 301 one segment [RFC5681]). 303 This ensures network safety. 305 An exception is made to this rule if an IETF standardized 306 mechanism is used to determine that a particular loss is due to 307 a non-congestion event (e.g., packet corruption). In such a 308 case a congestion control action is not required. Additionally, 309 RTO-triggered congestion control actions may be reversed when a 310 standard mechanism determines that the cause of the loss was not 311 congestion after all. 313 4 Discussion 315 We note that research has shown the tension between the 316 responsiveness and correctness of retransmission timeouts seems to 317 be a fundamental tradeoff [AP99]. That is, making the RTO more 318 aggressive (e.g., via changing TCP's EWMA gains, lowering the 319 minimum RTO, etc.) can reduce the time spent waiting on needed 320 retransmissions. However, at the same time, such aggressiveness 321 leads to more needless retransmissions. Therefore, being as 322 aggressive as the requirements given in the previous section allow 323 in any particular situation may not be the best course of action 324 because an RTO expiration carries a requirement to slow down. 
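As a rough illustration of how requirements (1) through (4) in Section 3 fit together, the following Python sketch keeps a smoothed FT and FT variation, ignores ambiguous samples, doubles the RTO on each expiration, and collapses a congestion window. The EWMA gains (1/8 and 1/4), the factor of 4 on the variation, and the clock-granularity term are taken from the TCP-specific computation in [RFC6298]; they are one possible instantiation and are not mandated by this document. The class and field names, the 1 ms granularity, and the starting window of 10 segments are assumptions made only for this example.

```python
from dataclasses import dataclass
from typing import Optional

INITIAL_RTO = 1.0  # seconds; requirement (1): no less than 1 s with no path knowledge
MAX_RTO = 60.0     # seconds; requirement (3) permits a cap of at least 60 s
ALPHA = 1 / 8      # gain for the smoothed FT (value used by RFC 6298)
BETA = 1 / 4       # gain for the FT variation (value used by RFC 6298)
K = 4              # variation multiplier (value used by RFC 6298)
G = 0.001          # assumed clock granularity in seconds (hypothetical)


@dataclass
class RtoEstimator:
    srtt: Optional[float] = None   # smoothed feedback time
    ftvar: Optional[float] = None  # smoothed FT variation
    rto: float = INITIAL_RTO
    cwnd: int = 10                 # congestion window in segments (hypothetical start)

    def on_sample(self, ft: float, ambiguous: bool) -> None:
        """Fold one FT sample into the RTO, per requirements (2)(a) and (2)(c)."""
        if ambiguous:
            # Karn's algorithm: the ACK could match either copy of the
            # segment, so the sample is discarded and the RTO is unchanged.
            return
        if self.srtt is None or self.ftvar is None:
            # First unambiguous sample seeds both estimates.
            self.srtt = ft
            self.ftvar = ft / 2
        else:
            self.ftvar = (1 - BETA) * self.ftvar + BETA * abs(self.srtt - ft)
            self.srtt = (1 - ALPHA) * self.srtt + ALPHA * ft
        # Recomputing from the estimates also discards any earlier backoff;
        # this sketch assumes unambiguous samples come from newly
        # acknowledged, non-retransmitted data.
        self.rto = min(MAX_RTO, self.srtt + max(G, K * self.ftvar))

    def on_timeout(self) -> None:
        """React to an RTO expiration, per requirements (3) and (4)."""
        self.rto = min(MAX_RTO, 2 * self.rto)  # exponential backoff, capped
        self.cwnd = 1                          # treat the timeout as congestion
```

A caller would invoke on_sample() for each acknowledgment that yields an FT measurement and on_timeout() each time the timer fires. No steady-state minimum RTO is imposed here; choosing one is left to the implementation.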
326 While the tradeoff between responsiveness and correctness seems 327 fundamental, the tradeoff can be made less relevant if the sender 328 can detect and recover from spurious RTOs. Several mechanisms have 329 been proposed for this purpose, such as Eifel [RFC3522], F-RTO 330 [RFC5682] and DSACK [RFC2883,RFC3708]. Using such mechanisms may 331 allow a data originator to tip towards being more responsive without 332 incurring (as much of) the attendant costs of needless retransmits. 334 Also note that, in addition to the experiments discussed in [AP99], 335 the Linux TCP implementation has been using various non-standard RTO 336 mechanisms for many years seemingly without large scale problems 337 (e.g., using different EWMA gains). Further, a number of 338 implementations use minimum RTOs that are less than the 1 second 339 specified in [RFC6298]. While the implication of these deviations 340 from the standard may be more spurious retransmits (per [AP99]), we 341 are aware of no large scale problems caused by this change to the 342 minimum RTO. 344 Finally, we note that while allowing implementations to be more 345 aggressive may in fact increase the number of needless 346 retransmissions, the above requirements fail safe in that they insist 347 on exponential backoff of the RTO and a transmission rate reduction. 348 Therefore, allowing implementers latitude in their instantiations of 349 an RTO mechanism does not somehow open the floodgates to aggressive 350 behavior. Since there is a downside to being aggressive, the 351 incentives for proper behavior are retained in the mechanism. 353 5 Security Considerations 355 This document does not alter the security properties of 356 retransmission timeout mechanisms. See [RFC6298] for a discussion 357 of these within the context of TCP. 
359 Acknowledgments 361 This document benefits from years of discussions with Ethan Blanton, 362 Sally Floyd, Jana Iyengar, Shawn Ostermann, Vern Paxson, and the 363 members of the TCPM and TCP-IMPL working groups. Ran Atkinson, 364 Yuchung Cheng, Jonathan Looney and Michael Scharf provided useful 365 comments on a previous version of this draft. 367 Normative References 369 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 370 Requirement Levels", BCP 14, RFC 2119, March 1997. 372 Informative References 374 [AP99] Allman, M., V. Paxson, "On Estimating End-to-End Network Path 375 Properties", Proceedings of the ACM SIGCOMM Technical Symposium, 376 September 1999. 378 [KP87] Karn, P. and C. Partridge, "Improving Round-Trip Time 379 Estimates in Reliable Transport Protocols", SIGCOMM 87. 381 [RFC2018] Mathis, M., Mahdavi, J., Floyd, S., and A. Romanow, "TCP 382 Selective Acknowledgment Options", RFC 2018, October 1996. 384 [RFC2883] Floyd, S., Mahdavi, J., Mathis, M., and M. Podolsky, "An 385 Extension to the Selective Acknowledgement (SACK) Option for 386 TCP", RFC 2883, July 2000. 388 [RFC3261] Rosenberg, J., Schulzrinne, H., Camarillo, G., Johnston, 389 A., Peterson, J., Sparks, R., Handley, M., and E. Schooler, 390 "SIP: Session Initiation Protocol", RFC 3261, June 2002. 392 [RFC3522] Ludwig, R., M. Meyer, "The Eifel Detection Algorithm for 393 TCP", RFC 3522, April 2003. 395 [RFC3708] Blanton, E., M. Allman, "Using TCP Duplicate Selective 396 Acknowledgement (DSACKs) and Stream Control Transmission 397 Protocol (SCTP) Duplicate Transmission Sequence Numbers (TSNs) 398 to Detect Spurious Retransmissions", RFC 3708, February 2004. 400 [RFC3940] Adamson, B., C. Bormann, M. Handley, J. Macker, 401 "Negative-acknowledgment (NACK)-Oriented Reliable Multicast 402 (NORM) Protocol", RFC 3940, November 2004. 404 [RFC4960] Stewart, R., "Stream Control Transmission Protocol", RFC 405 4960, September 2007. 407 [RFC5682] Sarolahti, P., M. Kojo, K. Yamamoto, M. 
Hata, "Forward 408 RTO-Recovery (F-RTO): An Algorithm for Detecting Spurious 409 Retransmission Timeouts with TCP", RFC 5682, September 2009. 411 [RFC6298] Paxson, V., M. Allman, H.K. Chu, M. Sargent, "Computing 412 TCP's Retransmission Timer", RFC 6298, June 2011. 414 [RFC6582] Henderson, T., S. Floyd, A. Gurtov, Y. Nishida, "The 415 NewReno Modification to TCP's Fast Recovery Algorithm", RFC 6582, 416 April 2012. 418 [RFC6675] Blanton, E., M. Allman, L. Wang, I. Jarvinen, M. Kojo, 419 Y. Nishida, "A Conservative Loss Recovery Algorithm Based on 420 Selective Acknowledgment (SACK) for TCP", RFC 6675, August 2012. 422 [RFC7323] Borman, D., B. Braden, V. Jacobson, R. Scheffenegger, "TCP 423 Extensions for High Performance", RFC 7323, September 2014. 425 Authors' Addresses 427 Mark Allman 428 International Computer Science Institute 429 1947 Center St. Suite 600 430 Berkeley, CA 94704 431 EMail: mallman@icir.org 432 http://www.icir.org/mallman