idnits 2.17.1 

draft-allman-tcpm-rto-consider-02.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  -- The document has an IETF Trust Provisions (28 Dec 2009) Section 6.c(i)
     Publication Limitation clause.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an IANA Considerations section.  (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** The abstract seems to contain references ([RFC2119]), which it
     shouldn't.  Please replace those with straight textual mentions of the
     documents in question.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  -- The document date (November 2, 2015) is 3088 days in the past.  Is this
     intentional?


  Checking references for intended status: Best Current Practice
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Missing Reference: 'RFC5681' is mentioned on line 183, but not defined

  -- Obsolete informational reference (is this intentional?): RFC 3940
     (Obsoleted by RFC 5740)

  -- Obsolete informational reference (is this intentional?): RFC 4960
     (Obsoleted by RFC 9260)


     Summary: 2 errors (**), 0 flaws (~~), 2 warnings (==), 4 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

1	Internet Engineering Task Force                                M. Allman
2	INTERNET-DRAFT                                                      ICSI
3	File: draft-allman-tcpm-rto-consider-02.txt             November 2, 2015
4	Intended Status: Best Current Practice
5	Expires: May 2, 2016

7	                 Retransmission Timeout Considerations

9	Status of this Memo

11	    This document may not be modified, and derivative works of it may
12	    not be created, except to format it for publication as an RFC or to
13	    translate it into languages other than English.

15	    This Internet-Draft is submitted in full conformance with the
16	    provisions of BCP 78 and BCP 79.  Internet-Drafts are working
17	    documents of the Internet Engineering Task Force (IETF), its areas,
18	    and its working groups. Note that other groups may also distribute
19	    working documents as Internet-Drafts.

21	    Internet-Drafts are draft documents valid for a maximum of six
22	    months and may be updated, replaced, or obsoleted by other documents
23	    at any time. It is inappropriate to use Internet-Drafts as
24	    reference material or to cite them other than as "work in progress."

26	    The list of current Internet-Drafts can be accessed at
27	    http://www.ietf.org/1id-abstracts.html

29	    The list of Internet-Draft Shadow Directories can be accessed at
30	    http://www.ietf.org/shadow.html

32	    This Internet-Draft will expire on May 2, 2016.

34	Copyright Notice

36	    Copyright (c) 2015 IETF Trust and the persons identified as the
37	    document authors. All rights reserved.

39	    This document is subject to BCP 78 and the IETF Trust's Legal
40	    Provisions Relating to IETF Documents
41	    (http://trustee.ietf.org/license-info) in effect on the date of
42	    publication of this document. Please review these documents
43	    carefully, as they describe your rights and restrictions with
44	    respect to this document. Code Components extracted from this
45	    document must include Simplified BSD License text as described in
46	    Section 4.e of the Trust Legal Provisions and are provided without
47	    warranty as described in the Simplified BSD License.

49	Abstract

51	    Each implementation of a retransmission timeout mechanism must
52	    balance correctness and timeliness and therefore no implementation
53	    suits all situations.  This document provides high-level
54	    guidance for retransmission timeout schemes appropriate for general
55	    use in the Internet.  Within the guidelines, implementations have
56	    latitude to define particulars that best address each situation.

58	Terminology

60	    The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
61	    "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
62	    document are to be interpreted as described in BCP 14, RFC 2119
63	    [RFC2119].

65	1   Introduction

67	    Despite our best intentions and most robust mechanisms, reliability
68	    in networking ultimately requires a timeout and re-try mechanism.
69	    Often there are more timely and precise mechanisms for repairing
70	    loss (e.g., TCP's fast retransmit [RFC5681], NewReno [RFC6582] or
71	    selective acknowledgment scheme [RFC2018,RFC6675]) which require
72	    information exchange between components in the system.  Such
73	    communication cannot be guaranteed.  Alternatively, information
74	    coding can allow the recipient to recover from some amount of lost
75	    information without use of a retransmission.  The latter provides
76	    probabilistic reliability.  Finally, negative acknowledgment schemes
77	    exist that do not depend on positive feedback to prevent
78	    retransmissions (e.g., [RFC3940]).  However, regardless of these
79	    useful alternatives, the only thing we can truly depend on is the
80	    passage of time and therefore our ultimate backstop to ensuring
81	    reliability is a timeout.  (Note: There is a case when we cannot
82	    count on the passage of time, but in this case we believe repairing
83	    loss will be a moot point and hence we do not further consider this
84	    case in this document.)

86	    Various protocols have defined their own timeout mechanisms (e.g.,
87	    TCP [RFC6298], SCTP [RFC4960]).  The specifics of retransmission
88	    timeouts often represent a particular tradeoff between correctness
89	    and responsiveness [AP99].  That is, waiting long enough to ensure
90	    retransmission correctness leads to unacceptably high delays.  On
91	    the other hand, bounding delay often leads to incorrect
92	    retransmission decisions.  Therefore, we have found that even though
93	    the procedures are standardized, implementations also often add
94	    their own subtle imprint on the specifics of the process to tilt the
95	    tradeoff between correctness and responsiveness in some way.  At
96	    this point we recognize that often these specific tweaks are not
97	    crucial for network safety.  Hence, in this document we outline the
98	    high-level principles that are crucial for any retransmission
99	    timeout scheme to follow.  The intent is to then allow
100	    implementations of protocols and applications to instantiate
101	    mechanisms that best realize their specific goals within this
102	    framework.  These specific mechanisms could be standardized or
103	    ad-hoc, but as long as they adhere to the guidelines given in this
104	    document they would be considered consistent with the standards.

106	2   Guidelines

108	    We now list the four guidelines that apply when utilizing a
109	    retransmission timeout (RTO).

111	    (1) In the absence of any knowledge about the round-trip time (RTT)
112	        of a path, the RTO MUST be conservatively set to no less than 1
113	        second, per TCP's current default RTO [RFC6298].

115	        This guideline ensures two important aspects of the RTO.  First,
116	        when transmitting into an unknown network, retransmissions will
117	        not be sent before an ACK would reasonably be expected to arrive
118	        and so will not waste scarce network resources.  Second, as
119	        noted below, sometimes retransmissions can lead to ambiguities
120	        in assessing the RTT of a network path.  Therefore, it is
121	        especially important for the first RTT sample to be free of
122	        ambiguities such that there is a baseline for the remainder of
123	        the communication.

125	    (2) We specify three guidelines that pertain to the sampling of the
126	        RTT.

128	        (a) In steady state the RTO MUST be set based on recent
129	            observations of both the RTT and the variance of the RTT.

131	            In other words, the RTO should be based on a reasonable
132	            amount of time that the sender should wait for an
133	            acknowledgment of the data before retransmitting the given
134	            data.

136	        (b) RTT observations MUST be taken regularly.

138	            The exact definition of "regularly" is deliberately left
139	            vague.  TCP takes an RTT sample once per RTT or, if using
140	            the timestamp option [RFC7323], on each acknowledgment
141	            arrival.  [AP99] shows that both these approaches result in
142	            roughly equivalent performance for the RTO estimator.
143	            Additionally, [AP99] shows that taking only a single RTT
144	            sample per TCP connection is suboptimal.  Therefore, for the
145	            purpose of this guideline we state that RTT samples SHOULD
146	            be taken at least once per RTT or as frequently as data is
147	            exchanged and ACKed if that happens less frequently than
148	            every RTT.  However, we also recognize that it may not
149	            always be practical to take an RTT sample this often in all
150	            cases and hence this requirement is explicitly a "SHOULD"
151	            and not a "MUST".

153	        (c) RTT samples used in the computation of the RTO MUST NOT be
154	            ambiguous.

156	            Assume two copies of some segment X are transmitted at times
157	            t0 and t1 and then segment X is acknowledged at time t2.  In
158	            some cases, it is not clear which copy of X triggered the
159	            ACK and hence the actual RTT is either t2-t1 or t2-t0, but
160	            which one is unknown.  Therefore, in this situation an
161	            implementation MUST use Karn's algorithm [KP87,RFC6298] and
162	            use neither version of the RTT sample and hence not update
163	            the RTO.

165	            There are cases where two copies of some data are
166	            transmitted in a way whereby the sender can tell which is
167	            being acknowledged by an incoming ACK.  E.g., TCP's
168	            timestamp option [RFC7323] allows for segments to be
169	            uniquely identified and hence avoid the ambiguity.  In such
170	            cases there is no ambiguity and the resulting samples can
171	            update the RTO.

173	    (3) Each time the RTO fires and causes a retransmission the value of
174	        the RTO MUST be exponentially backed off such that the next
175	        firing requires a longer interval.  The backoff may be removed
176	        after the successful transmission of non-retransmitted data.

178	        This ensures network safety.

180	    (4) Retransmission timeouts MUST be taken as indications of
181	        congestion in the network and the sending rate adapted using a
182	        standard mechanism (e.g., TCP collapses the congestion window to
183	        one segment [RFC5681]).

185	        This ensures network safety.

187	        An exception is made to this rule if a standard mechanism is
188	        used to determine that a particular loss is due to a
189	        non-congestion event (e.g., bit errors or packet reordering).
190	        In such a case a congestion control action is not required.
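
As a non-normative illustration, guidelines (1) and (2) can be sketched
in Python as follows.  The gains and the variance multiplier are the
values given in [RFC6298]; the class name, the method name and the
`ambiguous` flag are hypothetical, and the sketch assumes the caller
already knows whether a sample is ambiguous.  Clock granularity and any
minimum RTO applied after the first sample are deliberately left out.

```python
from dataclasses import dataclass
from typing import Optional

INITIAL_RTO = 1.0  # seconds; guideline (1) value used when no RTT is known
ALPHA = 1 / 8      # SRTT gain, as in RFC 6298
BETA = 1 / 4       # RTTVAR gain, as in RFC 6298
K = 4              # variance multiplier, as in RFC 6298


@dataclass
class RtoEstimator:
    """Hypothetical RTO estimator covering guidelines (1) and (2)."""

    srtt: Optional[float] = None    # smoothed RTT, unset until first sample
    rttvar: Optional[float] = None  # smoothed RTT variation
    rto: float = INITIAL_RTO        # guideline (1): start at 1 second

    def on_rtt_sample(self, rtt: float, ambiguous: bool) -> None:
        """Fold one RTT observation (in seconds) into the RTO."""
        if ambiguous:
            # Guideline (2c), Karn's algorithm: discard the sample and
            # leave the RTO untouched.
            return
        if self.srtt is None:
            # First unambiguous sample seeds both estimators.
            self.srtt = rtt
            self.rttvar = rtt / 2
        else:
            # Guideline (2a): track both the RTT and its variation.
            # RTTVAR is updated first, using the previous SRTT.
            self.rttvar = (1 - BETA) * self.rttvar + BETA * abs(self.srtt - rtt)
            self.srtt = (1 - ALPHA) * self.srtt + ALPHA * rtt
        self.rto = self.srtt + K * self.rttvar
```

A caller would invoke on_rtt_sample() for each ACK that yields a
measurement, passing ambiguous=True when the acknowledged data was
retransmitted and nothing (such as the timestamp option) identifies
which copy was acknowledged.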

192	3   Discussion

194	    We note that research has shown the tension between responsiveness
195	    and correctness of TCP's RTO seems to be a fundamental tradeoff
196	    [AP99].  That is, making TCP's RTO more aggressive (via the EWMA
197	    gains, lowering the minimum RTO, etc.) can reduce the time spent
198	    waiting on needed retransmissions.  However, at the same time such
199	    aggressiveness leads to more needless retransmissions, as well.
200	    Therefore, being as aggressive as the guidelines sketched in the
201	    last section allow in any particular situation may not be the best
202	    course of action (e.g., because an RTO expiration carries a
203	    requirement to slow down).
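
The tradeoff can be made concrete with a small, purely illustrative
experiment.  The sketch below replays a synthetic RTT trace (invented
for this example, not measured data) through an RFC 6298-style
estimator and counts how often an ACK arrives after the RTO in force
would already have fired.  The function name, its parameters and the
trace are assumptions.  The sketch also assumes that late samples remain
unambiguous (e.g., via the timestamp option) and so still update the
estimator.

```python
def count_spurious_timeouts(rtts, min_rto, alpha=1 / 8, beta=1 / 4, k=4):
    """Count RTT samples that exceed the RTO in force when they arrive."""
    srtt = rttvar = None
    rto = max(1.0, min_rto)  # no RTT knowledge yet: at least 1 second
    spurious = 0
    for rtt in rtts:
        if rtt > rto:
            # The timer would have fired although the ACK was only late.
            spurious += 1
        if srtt is None:
            srtt, rttvar = rtt, rtt / 2
        else:
            rttvar = (1 - beta) * rttvar + beta * abs(srtt - rtt)
            srtt = (1 - alpha) * srtt + alpha * rtt
        # A lower floor lets the RTO hug the measured RTT more tightly.
        rto = max(min_rto, srtt + k * rttvar)
    return spurious


# Synthetic trace: a steady 100 ms path with one 500 ms delay spike.
trace = [0.1] * 10 + [0.5] + [0.1] * 5
conservative = count_spurious_timeouts(trace, min_rto=1.0)
aggressive = count_spurious_timeouts(trace, min_rto=0.0)
```

On this trace the 1 second floor absorbs the delay spike, whereas the
estimator without a floor has converged close to 100 ms by the time the
spike arrives and would retransmit needlessly.  The same tight RTO
would, of course, also repair a genuine loss much sooner.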

205	    While the tradeoff between responsiveness and correctness seems
206	    fundamental, the tradeoff can be made less relevant if the sender
207	    can detect and recover from spurious RTOs.  Several mechanisms have
208	    been proposed for this purpose, such as Eifel [RFC3522], F-RTO
209	    [RFC5682] and DSACK [RFC2883,RFC3708].  Using such mechanisms may
210	    allow a data originator to tip towards being more responsive without
211	    incurring (as much of) the attendant costs of needless retransmits.
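
As a rough illustration of how such detection can work, the following
sketch shows the central comparison of the Eifel detection algorithm
[RFC3522] in simplified form.  The function and parameter names are
hypothetical, timestamp wraparound is ignored, and the remaining steps
of the algorithm (including the response to a detected spurious
timeout) are omitted.

```python
from typing import Optional


def is_spurious_retransmit(retransmit_ts: int, ack_tsecr: Optional[int]) -> bool:
    """Return True if the ACK appears to acknowledge the original copy.

    retransmit_ts: timestamp value carried by the first retransmission.
    ack_tsecr:     timestamp echoed by the first acceptable ACK that
                   arrives after the retransmission, or None if the ACK
                   carried no timestamp option.
    """
    if ack_tsecr is None:
        # Without an echoed timestamp the ambiguity remains, so the
        # retransmission is conservatively treated as genuine.
        return False
    # An echo older than the retransmission must have been triggered by
    # the original transmission, so the timeout was spurious.
    return ack_tsecr < retransmit_ts
```

For example, if the retransmission carried timestamp 1000 and the ACK
echoes 950, the ACK was triggered by the original segment and the
timeout was spurious.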

213	    Also note that, in addition to the experiments discussed in [AP99],
214	    the Linux TCP implementation has been using various non-standard RTO
215	    mechanisms for many years seemingly without large scale problems
216	    (e.g., using different EWMA gains).  Also, a number of
217	    implementations use minimum RTOs that are less than the 1 second
218	    specified in [RFC6298].  While this may result in more spurious
219	    retransmits (per [AP99]), we are aware of no large scale problems
220	    caused by this change to the minimum RTO.

222	    Finally, we note that while allowing implementations to be more
223	    aggressive may in fact increase the number of needless
224	    retransmissions, the above guidelines fail safe in that they insist
225	    on exponential backoff of the RTO and a transmission rate reduction.
226	    Therefore, allowing implementers latitude in their instantiations of
227	    an RTO mechanism does not somehow open the flood gates to aggressive
228	    behavior.  Since there is a downside to being aggressive, the
229	    incentives for proper behavior are retained in the mechanism.
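
The fail-safe behavior of guidelines (3) and (4) can be sketched as
follows.  This is a minimal illustration, not a definitive
implementation: the names are hypothetical, the 60 second cap is an
assumption borrowed from [RFC6298] (which permits a maximum RTO of at
least 60 seconds), and the collapse of the congestion window to one
segment follows the TCP example given in guideline (4).

```python
from dataclasses import dataclass
from typing import List

MAX_RTO = 60.0  # seconds; assumed cap, RFC 6298 permits one of at least 60 s


@dataclass
class SenderState:
    """Hypothetical per-connection state touched by an RTO expiry."""

    rto: float  # current retransmission timeout, in seconds
    cwnd: int   # congestion window, in segments

    def on_rto_expiry(self) -> None:
        """Apply guidelines (3) and (4) when the timer fires."""
        self.rto = min(self.rto * 2, MAX_RTO)  # exponential backoff
        self.cwnd = 1                          # TCP-style rate reduction


def backoff_schedule(initial_rto: float, expiries: int) -> List[float]:
    """Return the RTO in force after each consecutive expiry."""
    state = SenderState(rto=initial_rto, cwnd=10)
    schedule = []
    for _ in range(expiries):
        state.on_rto_expiry()
        schedule.append(state.rto)
    return schedule
```

Whatever RTO an implementation starts from, each consecutive expiry
doubles the wait before the next retransmission and leaves the sender
with a single segment in flight, which is the downside to being
aggressive referred to above.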

231	4   Security Considerations

233	    This document does not alter the security properties of
234	    retransmission timeout mechanisms.  See [RFC6298] for a discussion
235	    of these within the context of TCP.

237	Acknowledgments

239	    This document benefits from years of discussions with Ethan Blanton,
240	    Sally Floyd, Shawn Ostermann, Vern Paxson and the members of the
241	    TCPM and TCP-IMPL working groups.  Ran Atkinson provided useful
242	    comments on a previous version of this draft.

244	Normative References

246	    [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
247	        Requirement Levels", BCP 14, RFC 2119, March 1997.

249	Informative References

251	    [AP99] Allman, M., V. Paxson, "On Estimating End-to-End Network Path
252	        Properties", Proceedings of the ACM SIGCOMM Technical Symposium,
253	        September 1999.

255	    [KP87] Karn, P. and C. Partridge, "Improving Round-Trip Time
256	        Estimates in Reliable Transport Protocols", SIGCOMM 87.

258	    [RFC2018] Mathis, M., Mahdavi, J., Floyd, S., and A. Romanow, "TCP
259	        Selective Acknowledgment Options", RFC 2018, October 1996.

261	    [RFC2883] Floyd, S., Mahdavi, J., Mathis, M., and M. Podolsky, "An
262	        Extension to the Selective Acknowledgement (SACK) Option for
263	        TCP", RFC 2883, July 2000.

265	    [RFC3522] Ludwig, R., M. Meyer, "The Eifel Detection Algorithm for
266	        TCP", RFC 3522, April 2003.

268	    [RFC3708] Blanton, E., M. Allman, "Using TCP Duplicate Selective
269	        Acknowledgement (DSACKs) and Stream Control Transmission
270	        Protocol (SCTP) Duplicate Transmission Sequence Numbers (TSNs)
271	        to Detect Spurious Retransmissions", RFC 3708, February 2004.

273	    [RFC3940] Adamson, B., C. Bormann, M. Handley, J. Macker,
274	        "Negative-acknowledgment (NACK)-Oriented Reliable Multicast
275	        (NORM) Protocol", RFC 3940, November 2004.

277	    [RFC4960] Stewart, R., "Stream Control Transmission Protocol", RFC
278	        4960, September 2007.

280	    [RFC5682] Sarolahti, P., M. Kojo, K. Yamamoto, M. Hata, "Forward
281	        RTO-Recovery (F-RTO): An Algorithm for Detecting Spurious
282	        Retransmission Timeouts with TCP", RFC 5682, September 2009.

284	    [RFC6298] Paxson, V., M. Allman, H.K. Chu, M. Sargent, "Computing
285	        TCP's Retransmission Timer", RFC 6298, June 2011.

287	    [RFC6582] Henderson, T., S. Floyd, A. Gurtov, Y. Nishida, "The
288	        NewReno Modification to TCP's Fast Recovery Algorithm", RFC
289	        6582, April 2012.

291	    [RFC6675] Blanton, E., M. Allman, L. Wang, I. Jarvinen, M. Kojo,
292	        Y. Nishida, "A Conservative Loss Recovery Algorithm Based on
293	        Selective Acknowledgment (SACK) for TCP", RFC 6675, August 2012.

295	    [RFC7323] Borman, D., B. Braden, V. Jacobson, R. Scheffenegger, "TCP
296	        Extensions for High Performance", RFC 7323, September 2014.

298	Authors' Addresses

300	   Mark Allman
301	   International Computer Science Institute
302	   1947 Center St.  Suite 600
303	   Berkeley, CA  94704

305	   Phone: 440-235-1792
306	   EMail: mallman@icir.org
307	   http://www.icir.org/mallman