idnits 2.17.1 

draft-paxson-tcp-rto-01.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Looks like you're using RFC 2026 boilerplate.  This must be updated to
     follow RFC 3978/3979, as updated by RFC 4748.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** The document seems to lack a 1id_guidelines paragraph about 6 months
     document validity -- however, there's a paragraph with a matching
     beginning. Boilerplate error?

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an IANA Considerations section.  (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** The document seems to lack separate sections for Informative/Normative
     References.  All references will be assumed normative when checking for
     downward references.

  ** There are 5 instances of too long lines in the document, the longest one
     being 2 characters in excess of 72.

  ** There are 6 instances of lines with control characters in the document.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD',
     or 'RECOMMENDED' is not an accepted usage according to RFC 2119.  Please
     use uppercase 'NOT' together with RFC 2119 keywords (if that is what you
     mean).
     
     Found 'SHOULD not' in this paragraph:
     
     Note that some implementations may use a "heartbeat" timer that in
     fact yield a value between 2.5 seconds and 3 seconds. Accordingly, a
     lower bound of 2.5 seconds is also acceptable, providing that the timer
     will never expire faster than 2.5 seconds. Implementations using a
     heartbeat timer with a granularity of G SHOULD not set the timer below
     2.5 + G seconds.

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- Couldn't find a document date in the document -- date freshness check
     skipped.


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Missing Reference: 'JBB92' is mentioned on line 140, but not defined

  -- Possible downref: Non-RFC (?) normative reference: ref. 'AP99'

  ** Obsolete normative reference: RFC 2581 (ref. 'APS99') (Obsoleted by RFC
     5681)

  -- Possible downref: Non-RFC (?) normative reference: ref. 'Jac88'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'JK88'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'KP87'

  ** Obsolete normative reference: RFC  793 (ref. 'Pos81') (Obsoleted by RFC
     9293)


     Summary: 8 errors (**), 0 flaws (~~), 3 warnings (==), 6 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

1	Internet Engineering Task Force                              Vern Paxson
2	INTERNET DRAFT                                                     ACIRI
3	File: draft-paxson-tcp-rto-01.txt                            Mark Allman
4	                                                            NASA GRC/BBN
5	                                                             April, 2000
6	                                                  Expires: October, 2000

8	                  Computing TCP's Retransmission Timer

10	Status of this Memo

12	    This document is an Internet-Draft and is in full conformance with
13	    all provisions of Section 10 of RFC2026.

15	    Internet-Drafts are working documents of the Internet Engineering
16	    Task Force (IETF), its areas, and its working groups.  Note that
17	    other groups may also distribute working documents as
18	    Internet-Drafts.

20	    Internet-Drafts are draft documents valid for a maximum of six
21	    months and may be updated, replaced, or obsoleted by other documents
22	    at any time.  It is inappropriate to use Internet-Drafts as
23	    reference material or to cite them other than as "work in progress."

25	    The list of current Internet-Drafts can be accessed at
26	    http://www.ietf.org/ietf/1id-abstracts.txt

28	    The list of Internet-Draft Shadow Directories can be accessed at
29	    http://www.ietf.org/shadow.html.

31	Abstract

33	    This document defines the standard algorithm TCP senders are
34	    required to use to compute and manage their retransmission timer.
35	    It expands on the discussion in section 4.2.3.1 of RFC 1122 and
36	    upgrades the requirement of supporting the algorithm from a SHOULD
37	    to a MUST.

39	1   Introduction

41	    The Transmission Control Protocol (TCP) [Pos81] uses a
42	    retransmission timer to ensure data delivery in the absence of any
43	    feedback from the remote data receiver.  The duration of this timer
44	    is referred to as RTO (retransmission timeout).  RFC 1122 [Bra89]
45	    specifies that the RTO should be calculated as outlined in [Jac88].

47	    This document codifies the algorithm for setting the RTO.  In
48	    addition, this document expands on the discussion in section 4.2.3.1
49	    of RFC 1122 and upgrades the requirement of supporting the algorithm
50	    from a SHOULD to a MUST.  RFC 2581 [APS99] outlines the algorithm
51	    TCP uses to begin sending after the RTO expires and a retransmission
52	    is sent.  This document does not alter the behavior outlined in RFC
53	    2581 [APS99].

55	    In some situations it may be beneficial for a TCP sender to be more
56	    conservative than the algorithms detailed in this document allow.
57	    However, a TCP MUST NOT be more aggressive than the following
58	    algorithms allow.

60	    The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
61	    "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
62	    document are to be interpreted as described in [Bra97].

64	2   The Basic Algorithm

66	    To compute the current RTO, a TCP sender maintains two state
67	    variables, SRTT (smoothed round-trip time) and RTTVAR (round-trip
68	    time variation).  In addition, we assume a clock granularity of G
69	    seconds.

71	    The rules governing the computation of SRTT, RTTVAR, and RTO are as
72	    follows:

74	    (2.1) Until a round-trip time (RTT) measurement has been made for a
75	          segment sent between the sender and receiver, the sender SHOULD
76	          set RTO <- 3 seconds (per RFC 1122 [Bra89]), though the
77	          "backing off" on repeated retransmission discussed in (5.5)
78	          still applies.

80	          Note that some implementations may use a "heartbeat" timer
81	          that in fact yields a value between 2.5 seconds and 3 seconds.
82	          Accordingly, a lower bound of 2.5 seconds is also acceptable,
83	          provided the timer never expires in less than 2.5 seconds.
84	          Implementations using a heartbeat timer with a granularity of
85	          G SHOULD NOT set the timer below 2.5 + G seconds.

87	    (2.2) When the first RTT measurement R is made, the host MUST set

89	              SRTT <- R
90	              RTTVAR <- R/2
91	              RTO <- SRTT + max (G, K*RTTVAR)

93	          where K = 4.

95	    (2.3) When a subsequent RTT measurement R' is made, a host MUST set

97	              RTTVAR <- (1 - beta) * RTTVAR + beta * |SRTT - R'|
98	              SRTT <- (1 - alpha) * SRTT + alpha * R'

100	          The value of SRTT used in the update to RTTVAR is its value
101	          before updating SRTT itself using the second assignment.  That
102	          is, updating RTTVAR and SRTT MUST be computed in the above
103	          order.

105	          The above SHOULD be computed using alpha=1/8 and beta=1/4 (as
106	          suggested in [JK88]).

108	          After the computation, a host MUST update
109	          RTO <- SRTT + max (G, K*RTTVAR)

111	    (2.4) Whenever RTO is computed, if it is less than 1 second then the
112	          RTO SHOULD be rounded up to 1 second.

114	          Traditionally, TCP implementations use coarse grain clocks to
115	          measure the RTT and trigger the RTO, which imposes a large
116	          minimum value on the RTO.  Research suggests that a large
117	          minimum RTO is needed to keep TCP conservative and avoid
118	          spurious retransmissions [AP99].  Therefore, this
119	          specification requires a large minimum RTO as a conservative
120	          approach, while at the same time acknowledging that at some
121	          future point, research may show that a smaller minimum RTO is
122	          acceptable or superior.

124	    (2.5) A maximum value MAY be placed on RTO provided it is at least 60
125	          seconds.
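
The rules above can be sketched in a few lines of Python.  This is a
minimal illustration, not a reference implementation: the class name
RtoEstimator, the method on_rtt_sample, and the default granularity
of 0.1 seconds are assumptions made for the example, and times are
plain floating-point seconds rather than clock ticks.

```python
class RtoEstimator:
    """Sketch of rules (2.1)-(2.5); all names here are illustrative."""

    K = 4
    ALPHA = 1 / 8
    BETA = 1 / 4

    def __init__(self, granularity=0.1, min_rto=1.0, max_rto=60.0):
        self.g = granularity      # clock granularity G in seconds (assumed)
        self.min_rto = min_rto    # lower bound from (2.4)
        self.max_rto = max_rto    # optional upper bound from (2.5)
        self.srtt = None          # None until the first RTT measurement
        self.rttvar = None
        self.rto = 3.0            # (2.1) initial RTO

    def on_rtt_sample(self, r):
        """Fold one RTT measurement r (seconds) into SRTT, RTTVAR, RTO."""
        if self.srtt is None:
            # (2.2) first measurement
            self.srtt = r
            self.rttvar = r / 2
        else:
            # (2.3) update RTTVAR first, using the old value of SRTT
            self.rttvar = ((1 - self.BETA) * self.rttvar
                           + self.BETA * abs(self.srtt - r))
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * r
        rto = self.srtt + max(self.g, self.K * self.rttvar)
        # (2.4) round up to the minimum, (2.5) apply the optional maximum
        self.rto = min(max(rto, self.min_rto), self.max_rto)
        return self.rto
```

For instance, a first sample of 0.5 seconds gives SRTT = 0.5,
RTTVAR = 0.25 and RTO = 0.5 + 4 * 0.25 = 1.5 seconds.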

127	3   Taking RTT Samples

129	    TCP MUST use Karn's algorithm [KP87] for taking RTT samples.  That
130	    is, RTT samples MUST NOT be made using segments that were
131	    retransmitted (and thus for which it is ambiguous whether the reply
132	    was for the first instance of the packet or a later instance).  The
133	    only case when TCP can safely take RTT samples from retransmitted
134	    segments is when the TCP timestamp option [JBB92] is employed, since
135	    the timestamp option removes the ambiguity regarding which instance
136	    of the data segment triggered the acknowledgment.
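
A small sketch may help make Karn's algorithm concrete.  It assumes,
for simplicity, that each segment is identified by a single sequence
number and acknowledged individually; real TCP ACKs are cumulative,
and the timestamp-option exception described above is not modeled.
The names RttSampler, on_send and on_ack are hypothetical.

```python
class RttSampler:
    """Tracks send times and discards samples for retransmitted segments."""

    def __init__(self):
        self._sent = {}   # seq -> (first_send_time, was_retransmitted)

    def on_send(self, seq, now):
        if seq in self._sent:
            # Retransmission: a later ACK for seq is ambiguous.
            first_send_time, _ = self._sent[seq]
            self._sent[seq] = (first_send_time, True)
        else:
            self._sent[seq] = (now, False)

    def on_ack(self, seq, now):
        """Return an RTT sample in seconds, or None if none can be taken."""
        entry = self._sent.pop(seq, None)
        if entry is None:
            return None               # unknown or already acknowledged
        send_time, retransmitted = entry
        if retransmitted:
            return None               # Karn's algorithm: no sample
        return now - send_time
```

A segment sent once and acknowledged yields a sample; a segment sent
twice yields None, so the estimator is simply not updated for it.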

138	    Traditionally, TCP implementations have taken one RTT measurement at
139	    a time (typically once per RTT).  However, when using the timestamp
140	    option, each ACK can be used as an RTT sample.  RFC 1323 [JBB92]
141	    suggests that TCP connections utilizing large congestion windows
142	    should take many RTT samples per window of data to avoid aliasing
143	    effects in the estimated RTT.  A TCP implementation MUST take at
144	    least one RTT measurement per RTT (unless that is not possible per
145	    Karn's algorithm).

147	    For fairly modest congestion window sizes, research suggests that
148	    timing each segment does not lead to a better RTT estimator [AP99].
149	    Additionally, when multiple samples are taken per RTT, the alpha and
150	    beta defined in section 2 may keep an inadequate RTT history.  A
151	    method for changing these constants is currently an open research
152	    question.

154	4   Clock Granularity

156	    There is no requirement for the clock granularity G used for
157	    computing RTT measurements and the different state variables.
158	    However, if the K*RTTVAR term in the RTO calculation equals zero,
159	    the variance term MUST be rounded to G seconds (i.e., use the
160	    equation given in step 2.3).

162	        RTO <- SRTT + max (G, K*RTTVAR)
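
To illustrate why the max (G, ...) term matters, the following sketch
feeds a run of identical RTT measurements through the equations of
section 2.  RTTVAR then decays toward zero, and only the G term keeps
RTO above SRTT.  The function name and the choice of inputs are
assumptions made for this example; the 1-second minimum of (2.4) is
deliberately left out so the effect of G is visible.

```python
K, ALPHA, BETA = 4, 1 / 8, 1 / 4


def rto_after_steady_samples(r, g, n):
    """RTO (without the floor of (2.4)) after n identical samples r."""
    srtt, rttvar = r, r / 2              # (2.2) first measurement
    for _ in range(n - 1):               # (2.3) for each later measurement
        rttvar = (1 - BETA) * rttvar + BETA * abs(srtt - r)
        srtt = (1 - ALPHA) * srtt + ALPHA * r
    return srtt + max(g, K * rttvar)
```

With r = 2.0 and g = 0.5, one sample gives RTO = 2.0 + 4 * 1.0 = 6.0
seconds, while after 50 identical samples K*RTTVAR is far below G and
RTO settles at SRTT + G = 2.5 seconds.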

164	    Experience has shown that finer clock granularities (<= 100 msec)
165	    perform somewhat better than more coarse granularities.

167	    Note that [Jac88] outlines several clever tricks that can be used to
168	    obtain better precision from coarse granularity timers.  These
169	    changes are widely implemented in current TCP implementations.

171	5   Managing the RTO Timer

173	    The following algorithm MUST be used for managing the retransmission
174	    timer:

176	    (5.1) Every time a packet containing data is sent (including a
177	          retransmission), if the timer is not running, start it running
178	          so that it will expire after RTO seconds (for the current value
179	          of RTO).

181	    (5.2) When all outstanding data has been acknowledged, turn off the
182	          retransmission timer.

184	    (5.3) When an ACK is received that acknowledges new data, restart the
185	          retransmission timer so that it will expire after RTO seconds
186	          (for the current value of RTO).

188	    When the retransmission timer expires, do the following:

190	    (5.4) Retransmit the earliest segment that has not been acknowledged
191	          by the TCP receiver.

193	    (5.5) The host MUST set RTO <- RTO * 2 ("back off the timer").  The
194	          maximum value discussed in (2.5) above may be used to provide an
195	          upper bound to this doubling operation.

197	    (5.6) Start the retransmission timer, such that it expires after RTO
198	          seconds (for the value of RTO after the doubling operation
199	          outlined in 5.5).
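
The steps (5.1) through (5.6) can be sketched as a small state
machine.  This is only an illustration under stated assumptions: time
is passed in explicitly as "now" (seconds), expiry is detected by a
polling method on_tick, and the retransmit callback stands in for
resending the earliest unacknowledged segment.  None of these names
come from this document.

```python
class RetransmitTimer:
    """Sketch of rules (5.1)-(5.6) driven by explicit timestamps."""

    def __init__(self, rto=3.0, max_rto=60.0):
        self.rto = rto
        self.max_rto = max_rto     # optional cap from (2.5)
        self.deadline = None       # None means the timer is off

    def on_data_sent(self, now):
        # (5.1) start the timer only if it is not already running
        if self.deadline is None:
            self.deadline = now + self.rto

    def on_new_ack(self, now, all_data_acked):
        if all_data_acked:
            self.deadline = None               # (5.2) turn the timer off
        else:
            self.deadline = now + self.rto     # (5.3) restart the timer

    def on_tick(self, now, retransmit):
        """Return True if the timer expired and was handled."""
        if self.deadline is None or now < self.deadline:
            return False
        retransmit()                                   # (5.4)
        self.rto = min(self.rto * 2, self.max_rto)     # (5.5) back off
        self.deadline = now + self.rto                 # (5.6) restart
        return True
```

Keeping the deadline as a single optional value makes "timer off" and
"timer running" explicit, which mirrors the wording of (5.1) and
(5.2).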

201	    Note that after retransmitting, once a new RTT measurement is
202	    obtained (which can only happen when new data has been sent and
203	    acknowledged), the computations outlined in section 2 are performed,
204	    including the computation of RTO, which may result in "collapsing"
205	    RTO back down after it has been subject to exponential backoff
206	    (rule 5.5).
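
As a worked illustration of this "collapsing", the sketch below backs
RTO off three times and then applies one new RTT measurement using
the equations of section 2.  The starting values (SRTT = 0.5,
RTTVAR = 0.25, RTO = 1.5, G = 0.1) and the helper name recompute_rto
are assumptions chosen for the example.

```python
def recompute_rto(srtt, rttvar, r, g=0.1, k=4, alpha=1 / 8, beta=1 / 4):
    """Apply (2.3) and (2.4) for one new RTT measurement r."""
    rttvar = (1 - beta) * rttvar + beta * abs(srtt - r)
    srtt = (1 - alpha) * srtt + alpha * r
    rto = max(1.0, srtt + max(g, k * rttvar))
    return srtt, rttvar, rto


# Three consecutive timer expirations double RTO each time (rule 5.5).
backed_off_rto = 1.5
for _ in range(3):
    backed_off_rto *= 2

# A new measurement of 0.5 seconds then recomputes RTO from SRTT and
# RTTVAR, ignoring the backed-off value entirely.
srtt, rttvar, collapsed_rto = recompute_rto(0.5, 0.25, 0.5)
```

Here RTO grows from 1.5 to 12 seconds under backoff, and the next
valid measurement brings it back down to 1.25 seconds.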

208	    Note that a TCP implementation MAY clear SRTT and RTTVAR after
209	    backing off the timer multiple times as it is likely that the
210	    current SRTT and RTTVAR are bogus in this situation.  Once SRTT and
211	    RTTVAR are cleared they should be initialized with the next RTT
212	    sample taken per (2.2) rather than using (2.3).

214	6   Security Considerations

216	    This document requires a TCP to wait for a given interval before
217	    retransmitting an unacknowledged segment.  An attacker could cause a
218	    TCP sender to compute a large value of RTO by adding delay to a
219	    timed packet's latency, or that of its acknowledgment.  However,
220	    the ability to add delay to a packet's latency often coincides with
221	    the ability to cause the packet to be lost, so it is difficult to
222	    see what an attacker might gain from such an attack that could cause
223	    more damage than simply discarding some of the TCP connection's
224	    packets.

226	    The Internet to a considerable degree relies on the correct
227	    implementation of the RTO algorithm (as well as those described in
228	    RFC 2581) in order to preserve network stability and avoid
229	    congestion collapse.  An attacker could cause TCP endpoints to
230	    respond more aggressively in the face of congestion by forging
231	    acknowledgments for segments before the receiver has actually
232	    received the data, thus lowering RTO to an unsafe value.  But to do
233	    so requires spoofing the acknowledgments correctly, which is
234	    difficult unless the attacker can monitor traffic along the path
235	    between the sender and the receiver.  In addition, even if the
236	    attacker can cause the sender's RTO to reach too small a value, it
237	    appears the attacker cannot leverage this into much of an attack
238	    (compared to the other damage they can do if they can spoof packets
239	    belonging to the connection), since the sending TCP will still back
240	    off its timer in the face of an incorrectly transmitted packet's
241	    loss due to actual congestion.

243	Acknowledgments

245	    The RTO algorithm described in this memo was originated by Van
246	    Jacobson in [Jac88].

248	References

250	    [AP99] Allman, M. and V. Paxson, "On Estimating End-to-End Network
251	        Path Properties", SIGCOMM 99.

253	    [APS99] Allman, M., V. Paxson and W. R. Stevens, "TCP Congestion
254	        Control", RFC 2581, April 1999.

256	    [Bra89] Braden, R., "Requirements for Internet Hosts --
257	        Communication Layers", STD 3, RFC 1122, October 1989.

259	    [Bra97] Bradner, S., "Key words for use in RFCs to Indicate
260	        Requirement Levels", BCP 14, RFC 2119, March 1997.

262	    [Jac88] Jacobson, V., "Congestion Avoidance and Control", Computer
263	        Communication Review, vol. 18, no. 4, pp. 314-329, Aug. 1988.

	    [JBB92] Jacobson, V., R. Braden and D. Borman, "TCP Extensions for
	        High Performance", RFC 1323, May 1992.

265	    [JK88] Jacobson, V. and M. Karels, "Congestion Avoidance and
266	        Control", ftp://ftp.ee.lbl.gov/papers/congavoid.ps.Z.

268	    [KP87] Karn, P. and C. Partridge, "Improving Round-Trip Time
269	        Estimates in Reliable Transport Protocols", SIGCOMM 87.

271	    [Pos81] Postel, J., "Transmission Control Protocol", STD 7, RFC 793,
272	        September 1981.

274	Authors' Addresses:

276	    Vern Paxson
277	    ACIRI / ICSI
278	    1947 Center Street
279	    Suite 600
280	    Berkeley, CA 94704-1198
281	    Phone: 510-642-4274 x302
282	    Fax: 510-643-7684
283	    vern@aciri.org
284	    http://www.aciri.org/vern/

286	    Mark Allman
287	    NASA Glenn Research Center/BBN Technologies
288	    Lewis Field
289	    21000 Brookpark Rd.  MS 54-2
290	    Cleveland, OH  44135
291	    Phone: 216-433-6586
292	    Fax: 216-433-8705
293	    mallman@grc.nasa.gov
294	    http://roland.grc.nasa.gov/~mallman