idnits 2.17.1 

draft-ietf-tsvwg-tcp-frto-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Cannot find the required boilerplate sections (Copyright, IPR, etc.) in
     this document.

     Expected boilerplate is as follows today (2024-04-26) according to
     https://trustee.ietf.org/license-info :

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.a:
        This Internet-Draft is submitted in full conformance with the provisions
        of BCP 78 and BCP 79.

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 2:
        Copyright (c) 2024 IETF Trust and the persons identified as the document
        authors.  All rights reserved.

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 3:
        This document is subject to BCP 78 and the IETF Trust's Legal Provisions
        Relating to IETF Documents
        (https://trustee.ietf.org/license-info) in effect on the date of
        publication of this document.  Please review these documents
        carefully, as they describe your rights and restrictions with
        respect to this document.  Code Components extracted from this
        document must include Simplified BSD License text as described in
        Section 4.e of the Trust Legal Provisions and are provided
        without warranty as described in the Simplified BSD License.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** The document is more than 15 pages and seems to lack a Table of Contents.

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an IANA Considerations section.  (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** The document seems to lack a both a reference to RFC 2119 and the
     recommended RFC 2119 boilerplate, even if it appears to use RFC 2119
     keywords. 

     RFC 2119 keyword, line 159: '...   A TCP sender MAY implement the basi...'
     RFC 2119 keyword, line 160: '...thm, the following steps MUST be taken...'
     RFC 2119 keyword, line 163: '..., the TCP sender SHOULD retransmit the...'
     RFC 2119 keyword, line 175: '..., the TCP sender MUST revert to the co...'
     RFC 2119 keyword, line 177: '.... The TCP sender MUST NOT enter step 3...'
     (20 more instances...)


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- Couldn't find a document date in the document -- date freshness check
     skipped.


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Missing Reference: 'RFC2026' is mentioned on line 16, but not defined

  == Missing Reference: 'RTO' is mentioned on line 779, but not defined

  == Missing Reference: 'SACK 8' is mentioned on line 787, but not defined

  ** Obsolete normative reference: RFC 2581 (ref. 'APS99') (Obsoleted by RFC
     5681)

  ** Obsolete normative reference: RFC 3517 (ref. 'BAFW03') (Obsoleted by RFC
     6675)

  ** Obsolete normative reference: RFC 2988 (ref. 'PA00') (Obsoleted by RFC
     6298)

  ** Obsolete normative reference: RFC  793 (ref. 'Pos81') (Obsoleted by RFC
     9293)

  ** Obsolete normative reference: RFC 2960 (ref. 'Ste00') (Obsoleted by RFC
     4960)

  -- Obsolete informational reference (is this intentional?): RFC 1323 (ref.
     'BBJ92') (Obsoleted by RFC 7323)

  -- Obsolete informational reference (is this intentional?): RFC 2582 (ref.
     'FH99') (Obsoleted by RFC 3782)


     Summary: 9 errors (**), 0 flaws (~~), 4 warnings (==), 4 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Internet Engineering Task Force                             P. Sarolahti
3	INTERNET DRAFT                                     Nokia Research Center
4	File: draft-ietf-tsvwg-tcp-frto-00.txt                           M. Kojo
5	                                                  University of Helsinki
6	                                                           October, 2003
7	                                                    Expires: April, 2004

9	                   F-RTO: An Algorithm for Detecting
10	           Spurious Retransmission Timeouts with TCP and SCTP

12	Status of this Memo

14	   This document is an Internet-Draft and is in full conformance with
15	   all provisions of Section 10 of [RFC2026].

17	   Internet-Drafts are working documents of the Internet Engineering
18	   Task Force (IETF), its areas, and its working groups.  Note that
19	   other groups may also distribute working documents as Internet-
20	   Drafts.

22	   Internet-Drafts are draft documents valid for a maximum of six months
23	   and may be updated, replaced, or obsoleted by other documents at any
24	   time.  It is inappropriate to use Internet-Drafts as reference
25	   material or to cite them other than as "work in progress."

27	   The list of current Internet-Drafts can be accessed at
28	   http://www.ietf.org/ietf/1id-abstracts.txt

30	   The list of Internet-Draft Shadow Directories can be accessed at
31	   http://www.ietf.org/shadow.html.

33	Abstract

35	   Spurious retransmission timeouts (RTOs) cause suboptimal TCP
36	   performance, because they often result in unnecessary retransmission
37	   of the last window of data. This document describes the "Forward RTO
38	   Recovery" (F-RTO) algorithm for detecting spurious TCP RTOs. F-RTO is
39	   a TCP sender only algorithm that does not require any TCP options to
40	   operate. After retransmitting the first unacknowledged segment
41	   triggered by an RTO, the F-RTO algorithm at a TCP sender monitors the
42	   incoming acknowledgements to determine whether the timeout was
43	   spurious and to decide whether to send new segments or retransmit
44	   unacknowledged segments. The algorithm effectively helps to avoid
45	   additional unnecessary retransmissions and thereby improves TCP
46	   performance in case of a spurious timeout. The F-RTO algorithm can
47	   also be applied with the SCTP protocol.

49	1.  Introduction

51	   The TCP protocol [Pos81] has two methods for triggering
52	   retransmissions.  Primarily, the TCP sender relies on incoming
53	   duplicate ACKs, which indicate that the receiver is missing some of
54	   the data. After a required amount of successive duplicate ACKs have
55	   arrived at the sender, it retransmits the first unacknowledged
56	   segment [APS99]. Secondarily, the TCP sender maintains a
57	   retransmission timer which triggers retransmission of segments, if
58	   they have not been acknowledged within the retransmission timer
59	   expiration period. When the retransmission timer expires, the TCP
60	   sender enters RTO recovery, where the congestion window is initialized
61	   to one segment and unacknowledged segments are retransmitted using
62	   the slow-start algorithm. The retransmission timer is adjusted
63	   dynamically based on the measured round-trip times [PA00].

65	   It has been pointed out that the retransmission timer can expire
66	   spuriously and trigger unnecessary retransmissions when no segments
67	   have been lost [GL02]. After a spurious RTO the late acknowledgements
68	   of original segments arrive at the sender, usually triggering
69	   unnecessary retransmissions of a whole window of segments during the
70	   RTO recovery.  Furthermore, after a spurious RTO a conventional TCP
71	   sender increases the congestion window on each late acknowledgement
72	   in slow start, injecting a large number of data segments into the
73	   network within one round-trip time.

75	   There are a number of potential reasons for spurious RTOs. First,
76	   some mobile networking technologies involve sudden delay peaks on
77	   transmission because of actions taken during a hand-off. Second,
78	   arrival of competing traffic, possibly with higher priority, on a
79	   low-bandwidth link or some other change in available bandwidth
80	   involves a sudden increase of round-trip time which may trigger a
81	   spurious retransmission timeout. A persistently reliable link layer
82	   can also cause a sudden delay when several data frames are lost for
83	   some reason. This document does not distinguish the different causes
84	   of such a delay, but discusses the spurious RTOs caused by a delay in
85	   general.

87	   This document describes an alternative RTO recovery algorithm called
88	   "Forward RTO-Recovery" (F-RTO) to be used for detecting spurious RTOs
89	   and thus avoiding unnecessary retransmissions following the RTO. When
90	   the RTO is not spurious, the F-RTO algorithm reverts to the
91	   conventional RTO recovery algorithm and should have similar behavior
92	   and performance. F-RTO does not require any TCP options in its
93	   operation, and it can be implemented by modifying only the TCP
94	   sender. This is different from alternative algorithms (Eifel [LK00],
95	   [LM03] and DSACK-based algorithms [BA02]) that have been suggested
96	   for detecting unnecessary retransmissions. The Eifel algorithm uses
97	   TCP timestamps [BBJ92] for detecting a spurious timeout and the
98	   DSACK-based algorithms require that the TCP Selective Acknowledgment
99	   Option [MMFR96] with DSACK extension [FMMP00] is in use. With DSACK,
100	   the TCP receiver can report if it has received a duplicate segment,
101	   making it possible for the sender to detect afterwards whether it has
102	   retransmitted segments unnecessarily.

104	   When an RTO occurs, the F-RTO sender retransmits the first
105	   unacknowledged segment normally, but tries to transmit new,
106	   previously unsent data after that. If the next two acknowledgements
107	   cover segments that were not retransmitted, the F-RTO sender can
108	   declare the RTO spurious and exit the RTO recovery. However, if
109	   either of the next two acknowledgements is a duplicate ACK, there is
110	   insufficient evidence of a spurious RTO; therefore the F-RTO sender
111	   retransmits the unacknowledged segments in slow start similarly to
112	   the traditional algorithm. With a SACK-enhanced version of the F-RTO
113	   algorithm, spurious RTOs may be detected even if duplicate ACKs
114	   arrive after an RTO. The F-RTO algorithm only attempts to detect and
115	   avoid unnecessary retransmissions after an RTO. Eifel and DSACK can
116	   also be used in detecting unnecessary retransmissions in other
117	   events, for example due to packet reordering.

119	   The F-RTO algorithm can also be applied with the SCTP protocol
120	   [Ste00], because SCTP's acknowledgement and packet retransmission
121	   concepts resemble those of TCP. When an SCTP retransmission timeout
122	   occurs, the SCTP sender is required to retransmit the outstanding
123	   data similarly to TCP, thus being prone to unnecessary
124	   retransmissions and congestion control adjustments, if delay spikes
125	   occur in the network. The SACK-enhanced version of F-RTO should be
126	   directly applicable to SCTP, which has selective acknowledgements as
127	   a built-in feature. For simplicity, this document mostly refers to
128	   TCP, but the algorithms and other discussion should be applicable
129	   also to SCTP.

131	   This document is organized as follows. Section 2 describes the basic
132	   F-RTO algorithm. Section 3 outlines an optional enhancement that uses
133	   the TCP SACK option. Section 4 discusses the possible actions to be
134	   taken after detecting a spurious RTO, Section 5 discusses SCTP
135	   considerations, and Section 6 discusses security considerations.

137	2.  F-RTO Algorithm

139	   An RTO is spurious if there are segments outstanding in the network
140	   that would have prevented the RTO, had their acknowledgements arrived
141	   earlier at the sender. F-RTO affects the TCP sender behavior only
142	   after a retransmission timeout; otherwise the TCP behavior remains
143	   unmodified.  When the RTO expires, the F-RTO algorithm monitors
144	   incoming acknowledgements and declares the RTO spurious if the TCP
145	   sender gets an acknowledgement for a segment that was not
146	   retransmitted due to the RTO. The actions taken in response to a
147	   spurious RTO are not specified in this document, but we discuss the
148	   different alternatives for congestion control in Section 4.

150	   Following the practice used with the Eifel Detection algorithm
151	   [LM03], we use the "SpuriousRecovery" variable to indicate whether
152	   the retransmission is declared spurious by the sender. This variable
153	   can be used as an input for a related response algorithm. With F-RTO,
154	   the outcome of SpuriousRecovery can either be SPUR_TO, indicating a
155	   spurious retransmission timeout; or FALSE, when the RTO is not
156	   declared spurious, and the TCP sender should follow the conventional
157	   RTO recovery algorithm.

159	   A TCP sender MAY implement the basic F-RTO algorithm, and if it
160	   chooses to apply the algorithm, the following steps MUST be taken
161	   after the retransmission timer expires.

163	   1) When RTO expires, the TCP sender SHOULD retransmit the first
164	      unacknowledged segment and set SpuriousRecovery to FALSE. Store
165	      the highest sequence number transmitted so far in variable
166	      "send_high".

168	   2) When the first acknowledgement after the RTO arrives at the
169	      sender, the sender chooses the following actions depending on
170	      whether the ACK advances the window or whether it is a duplicate
171	      ACK.

173	      a) If the acknowledgement is a duplicate ACK OR it is
174	         acknowledging a sequence number equal to (or above) the value
175	         of send_high, the TCP sender MUST revert to the conventional
176	         RTO recovery, and continue by retransmitting unacknowledged
177	         data in slow start. The TCP sender MUST NOT enter step 3 of
178	         this algorithm, and the SpuriousRecovery variable remains as
179	         FALSE.

181	      b) If the acknowledgement advances the window AND it is below the
182	         value of send_high, the TCP sender SHOULD transmit up to two
183	         new (previously unsent) segments and enter step 3 of this
184	         algorithm. If the TCP sender does not have enough unsent data,
185	         it SHOULD send only one segment. In addition, the TCP sender
186	         MAY override the Nagle algorithm and immediately send an
187	         undersized segment if needed. If the TCP sender does not have
188	         any new data to send, the TCP sender SHOULD transmit a segment
189	         from the retransmission queue. If the TCP sender retransmits the
190	         first unacknowledged segment, it MUST NOT enter step 3 of this
191	         algorithm but continue with the conventional RTO recovery
192	         algorithm.

194	   3) When the second acknowledgement after the RTO arrives at the
195	      sender, either declare the RTO spurious, or start retransmitting
196	      the unacknowledged segments.

198	      a) If the acknowledgement is a duplicate ACK, the TCP sender MUST
199	         set congestion window to no more than 3 * MSS, and continue
200	         with the slow start algorithm retransmitting unacknowledged
201	         segments. The sender leaves SpuriousRecovery set to FALSE.

203	      b) If the acknowledgement advances the window, i.e. it
204	         acknowledges data that was not retransmitted after the RTO, the
205	         TCP sender SHOULD declare the RTO spurious, set
206	         SpuriousRecovery to SPUR_TO and set the value of send_high
207	         variable to SND.UNA.
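
   The steps above can be sketched as a small state machine. This is an
   illustrative sketch only, not a normative implementation. The class
   and field names are hypothetical, sequence-number wraparound is
   ignored, and send_high is assumed to be stored as SND.NXT so that
   "ACK >= send_high" means all data outstanding at the RTO has been
   acknowledged. When no new data is available, the sketch simply falls
   back to conventional recovery, and it also applies the partial-ACK
   rule from the discussion later in this section.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    PENDING = "pending"            # still inside the F-RTO steps
    CONVENTIONAL = "conventional"  # revert to conventional RTO recovery
    SPUR_TO = "spur_to"            # RTO declared spurious


@dataclass
class BasicFrto:
    """Hypothetical sender state; all names here are illustrative."""
    snd_una: int            # SND.UNA: first unacknowledged byte
    snd_nxt: int            # SND.NXT: next new byte to send
    cwnd: int               # congestion window in bytes
    mss: int = 1000
    send_high: int = 0
    rexmit_end: int = 0     # end of the segment retransmitted in step 1
    step: int = 0           # 0 = not running, 2 or 3 = waiting in that step
    spurious_recovery: bool = False

    def on_rto(self) -> None:
        """Step 1 (the retransmission itself is not modelled)."""
        self.spurious_recovery = False
        self.send_high = self.snd_nxt   # assumption: stored as SND.NXT
        self.rexmit_end = min(self.snd_una + self.mss, self.snd_nxt)
        self.step = 2

    def on_ack(self, ack: int, have_new_data: bool = True) -> Outcome:
        """Steps 2 and 3, driven by the cumulative ACK number."""
        if self.step not in (2, 3):
            raise RuntimeError("F-RTO is not running")
        duplicate = ack <= self.snd_una
        if not duplicate:
            self.snd_una = ack
        if self.step == 2:
            if (duplicate or ack >= self.send_high
                    or ack < self.rexmit_end or not have_new_data):
                self.step = 0             # branch 2a, or simplified fallback
                return Outcome.CONVENTIONAL
            self.snd_nxt += 2 * self.mss  # branch 2b: two new segments
            self.step = 3
            return Outcome.PENDING
        self.step = 0
        if duplicate:                     # branch 3a
            self.cwnd = min(self.cwnd, 3 * self.mss)
            return Outcome.CONVENTIONAL
        self.spurious_recovery = True     # branch 3b
        self.send_high = self.snd_una
        return Outcome.SPUR_TO
```

   For example, with SND.UNA = 1000 and SND.NXT = 5000 at the time of
   the RTO, ACKs for 2000 and then 3000 lead to SPUR_TO, whereas a
   duplicate ACK for 1000 leads directly to conventional recovery.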

209	   The F-RTO sender takes cautious actions when it receives duplicate
210	   acknowledgements after an RTO. Since duplicate ACKs may indicate that
211	   segments have been lost, reliably detecting a spurious RTO is
212	   difficult without additional information. Therefore the safest
213	   alternative is to follow the conventional TCP recovery in those
214	   cases.

216	   If the first acknowledgement after RTO covers the send_high point at
217	   algorithm step (2a), there is not enough evidence that a non-
218	   retransmitted segment has arrived at the receiver after the RTO.
219	   This is a common case when a fast retransmission is lost and it has
220	   been retransmitted again after an RTO, while the rest of the
221	   unacknowledged segments have successfully been delivered to the TCP
222	   receiver before the RTO. Therefore the RTO cannot be declared
223	   spurious in this case.

225	   If the first acknowledgement after RTO does not acknowledge all of
226	   the data that was retransmitted in step 1, the TCP sender must not
227	   enter step 3 of this algorithm, but revert to the conventional RTO
228	   recovery. Otherwise, a malicious receiver acknowledging partial
229	   segments could cause the sender to declare the RTO spurious in a case
230	   where data was lost.

232	   The TCP sender is allowed to send two new segments in algorithm
233	   branch (2b), because the conventional TCP sender would retransmit two
234	   segments after one round-trip has elapsed since the RTO. If sending
235	   new data is not possible in algorithm branch (2b), or the receiver
236	   window limits the transmission, it has to send something in order to
237	   prevent the TCP transfer from stalling. In that case the following
238	   options are available for the sender.

240	   - Continue with the conventional RTO recovery algorithm and do not
241	     try to detect the spurious RTO. The disadvantage is that the sender
242	     may do unnecessary retransmissions due to possible spurious RTO. On
243	     the other hand, we believe that the benefits of detecting spurious
244	     RTO in application-limited or receiver-limited situations are
245	     not very significant.

247	   - Use additional information if available, e.g. TCP timestamps with
248	     the Eifel Detection algorithm, for detecting a spurious RTO.
249	     However, Eifel detection may yield different results from F-RTO
250	     when ACK losses and an RTO occur within the same round-trip time
251	     [SKR02].

253	   - Retransmit data from the tail of the retransmission queue and
254	     continue with step 3 of the F-RTO algorithm. It is possible that
255	     the retransmission is unnecessarily made, hence this option is not
256	     encouraged, except for hosts that are known to operate in an
257	     environment that is highly likely to have spurious RTOs. On the
258	     other hand, with this method it is possible to avoid several
259	     unnecessary retransmissions due to spurious RTO by doing only one
260	     retransmission that may be unnecessary.

262	   - Send a zero-sized segment below SND.UNA similar to TCP Keep-Alive
263	     probe and continue with step 3 of the F-RTO algorithm. Since the
264	     receiver replies with a duplicate ACK, the sender is able to detect
265	     from the incoming acknowledgement whether the RTO was spurious.
266	     While this method does not send data unnecessarily, it delays the
267	     recovery by one round-trip time in cases where the RTO was not
268	     spurious, and therefore is not encouraged.

270	   - In receiver-limited cases, send one octet of new data regardless of
271	     the advertised window limit, and continue with step 3 of the F-RTO
272	     algorithm. It is possible that the receiver has free buffer space
273	     to receive the data by the time the segment has propagated through
274	     the network, in which case no harm is done. If the receiver is not
275	     capable of receiving the segment, it rejects the segment, and sends
276	     a duplicate ACK.

278	   If the RTO is declared spurious, the TCP sender sets the value of
279	   send_high variable to SND.UNA in order to disable the NewReno
280	   "bugfix" [FH99]. The send_high variable was proposed for avoiding
281	   unnecessary multiple fast retransmits when RTO expires during fast
282	   recovery with NewReno TCP. As the sender has retransmitted only the
283	   segment that triggered the RTO, the problem addressed by the
284	   bugfix cannot occur. Therefore, if there are three duplicate ACKs
285	   arriving at the sender after the RTO, they are likely to indicate a
286	   packet loss, hence fast retransmit should be used to allow efficient
287	   recovery. If there are not enough duplicate ACKs arriving at the
288	   sender after a packet loss, the retransmission timer expires another
289	   time and the sender enters step 1 of this algorithm.

291	   When the RTO is declared spurious, the TCP sender cannot detect
292	   whether the unnecessary RTO retransmission was lost. In principle the
293	   loss of the RTO retransmission should be taken as a congestion
294	   signal, and thus there is a small possibility that the F-RTO sender
295	   violates the congestion control rules, if it chooses to fully revert
296	   congestion control parameters after detecting a spurious RTO. The
297	   Eifel detection algorithm has a similar property, while the DSACK
298	   option can be used to detect whether the retransmitted segment was
299	   successfully delivered to the receiver.

301	   The F-RTO algorithm has a side-effect on the TCP round-trip time
302	   measurement. Because the TCP sender can avoid most of the unnecessary
303	   retransmissions after detecting a spurious RTO, the sender is able to
304	   take round-trip time samples on the delayed segments. If the regular
305	   RTO recovery was used without TCP timestamps, this would not be
306	   possible due to retransmission ambiguity. As a result, the RTO
307	   estimator is likely to have more accurate and larger values with F-RTO
308	   than with the regular TCP after a spurious RTO that was triggered due
309	   to delayed segments. We believe this is an advantage in the networks
310	   that are prone to delay spikes.
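
   The effect on the RTO estimator can be illustrated with the
   estimator of [PA00]. The sketch below is illustrative only: the
   class name, the clock granularity, and the sample values are
   assumptions, while the update rules (alpha = 1/8, beta = 1/4, K = 4,
   a 3 second initial RTO, and a 1 second minimum) follow [PA00]. It
   shows that one round-trip sample taken from a delayed segment, which
   F-RTO makes available, raises the RTO noticeably.

```python
class RtoEstimator:
    """RTO computation following [PA00]; the class itself is hypothetical."""
    ALPHA = 1 / 8
    BETA = 1 / 4
    K = 4

    def __init__(self, granularity: float = 0.1, min_rto: float = 1.0):
        self.granularity = granularity  # assumed clock granularity G, seconds
        self.min_rto = min_rto
        self.srtt = None
        self.rttvar = None
        self.rto = 3.0                  # initial RTO before any sample

    def sample(self, r: float) -> float:
        """Feed one round-trip time measurement (seconds), return the RTO."""
        if self.srtt is None:
            self.srtt = r
            self.rttvar = r / 2
        else:
            # RTTVAR is updated first, using the old SRTT.
            self.rttvar = ((1 - self.BETA) * self.rttvar
                           + self.BETA * abs(self.srtt - r))
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * r
        self.rto = max(self.min_rto,
                       self.srtt + max(self.granularity, self.K * self.rttvar))
        return self.rto


est = RtoEstimator()
for _ in range(20):
    est.sample(0.2)             # steady 200 ms round-trip times
rto_before = est.rto
rto_after = est.sample(3.0)     # one sample from a segment delayed by a spike
```

   With these assumed values, rto_before sits at the 1 second minimum
   and rto_after exceeds 3 seconds, which makes a second spurious RTO
   during a similar delay spike less likely.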

312	   It is possible that the F-RTO algorithm does not always avoid
313	   unnecessary retransmissions after a spurious RTO. If packet
314	   reordering or packet duplication occurs on the segment that triggered
315	   the spurious RTO, the F-RTO algorithm may not detect the spurious RTO
316	   due to incoming duplicate ACKs. Additionally, if a spurious RTO
317	   occurs during fast recovery, the F-RTO algorithm often cannot detect
318	   the spurious RTO.  However, we consider these cases relatively rare,
319	   and note that in cases where F-RTO fails to detect the spurious RTO,
320	   it performs similarly to the regular RTO recovery.

322	3.  A SACK-enhanced version of the F-RTO algorithm

324	   This section describes an alternative version of the F-RTO algorithm,
325	   that makes use of TCP Selective Acknowledgement Option [MMFR96].  By
326	   using the SACK option the TCP sender can detect spurious RTOs in most
327	   of the cases when packet reordering or packet duplication is present.
328	   The difference from the basic F-RTO algorithm is that the sender may
329	   declare RTO spurious even when duplicate ACKs follow the RTO, if the
330	   SACK blocks acknowledge new data that was not transmitted after RTO.

332	   The principle of the algorithm presented in this section is also
333	   applicable to the SCTP protocol.

335	   Given that the TCP Selective Acknowledgement Option [MMFR96] is
336	   enabled for a TCP connection, TCP sender MAY implement the SACK-
337	   enhanced F-RTO algorithm. If the sender applies the SACK-enhanced F-
338	   RTO algorithm, it MUST follow the steps below.  This algorithm SHOULD
339	   NOT be applied, if the TCP sender is already in loss recovery when
340	   RTO occurs.  However, it should be possible to apply the principle of
341	   F-RTO within certain limitations also when RTO occurs during existing
342	   loss recovery. While this is a topic of further research, Appendix B
343	   briefly discusses the related issues.

345	   1) When RTO expires, the TCP sender SHOULD retransmit the first
346	      unacknowledged segment and set SpuriousRecovery to FALSE. Variable
347	      "send_high" is set to indicate the highest segment transmitted so
348	      far.

350	   2) Wait until the acknowledgement for the segment retransmitted due
351	      to RTO arrives at the sender. If duplicate ACKs arrive, store the
352	      incoming SACK information but stay in step 2. If RTO expires,
353	      restart the algorithm.

355	      a) if the cumulative ACK acknowledges all segments up to
356	         send_high, the TCP sender SHOULD revert to the conventional RTO
357	         recovery and it MUST set congestion window to no more than 2 *
358	         MSS. The sender does not enter step 3 of this algorithm.

360	      b) otherwise, the TCP sender SHOULD transmit up to two new
361	         (previously unsent) segments, within the limitations of the
362	         congestion window. If the TCP sender is not able to transmit
363	         any previously unsent data due to receiver window limitation or
364	         because it does not have any new data to send, it MAY follow
365	         one of the options presented in Section 2. However, if the TCP
366	         sender chooses to retransmit a data segment here, SACK of that
367	         segment MUST NOT be used for declaring a spurious RTO in step
368	         (3b).

370	   3) When the next acknowledgement arrives at the sender:

372	      a) if the ACK acknowledges data above send_high, either in SACK
373	         blocks or as a cumulative ACK, the sender MUST set congestion
374	         window to no more than 3 * MSS and proceed with conventional
375	         recovery, retransmitting unacknowledged segments. The sender
376	         SHOULD take this branch also when the acknowledgement is a
377	         duplicate ACK and it does not contain any new SACK blocks for
378	         previously unacknowledged data below send_high.

380	      b) if the ACK does not acknowledge data above send_high AND it
381	         acknowledges some previously unacknowledged data below
382	         send_high, the TCP sender SHOULD declare the RTO spurious and
383	         set SpuriousRecovery to SPUR_TO.

385	   If there are unacknowledged holes between the received SACK blocks,
386	   those segments SHOULD be retransmitted similarly to the conventional
387	   SACK recovery algorithm [BAFW03].  If the algorithm exits with
388	   SpuriousRecovery set to SPUR_TO, send_high SHOULD be set to SND.UNA,
389	   thus allowing fast recovery on incoming duplicate acknowledgements.
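
   The SACK-enhanced steps can be sketched in the same way. This is an
   illustrative sketch under several assumptions, not a normative
   implementation: all names are hypothetical, sequence numbers count
   whole segments instead of bytes, SACK blocks are half-open
   (start, end) ranges, send_high is stored as SND.NXT, new data is
   always available in step (2b), and the retransmission of holes
   between SACK blocks is not modelled.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Iterable, Set, Tuple


class Outcome(Enum):
    PENDING = "pending"            # still inside the F-RTO steps
    CONVENTIONAL = "conventional"  # proceed with conventional recovery
    SPUR_TO = "spur_to"            # RTO declared spurious


@dataclass
class SackFrto:
    """Hypothetical sender state, counted in whole segments."""
    snd_una: int                   # first unacknowledged segment
    snd_nxt: int                   # next new segment to send
    cwnd: int                      # congestion window in segments
    send_high: int = 0
    step: int = 0
    spurious_recovery: bool = False
    sacked: Set[int] = field(default_factory=set)

    def on_rto(self) -> None:
        """Step 1 (the retransmission itself is not modelled)."""
        self.spurious_recovery = False
        self.send_high = self.snd_nxt      # assumption: stored as SND.NXT
        self.sacked.clear()
        self.step = 2

    def on_ack(self, cum_ack: int,
               sack_blocks: Iterable[Tuple[int, int]] = ()) -> Outcome:
        if self.step not in (2, 3):
            raise RuntimeError("F-RTO is not running")
        reported = {n for lo, hi in sack_blocks for n in range(lo, hi)}
        # Newly SACKed segments that were outstanding when the RTO fired.
        new_below = {n for n in reported - self.sacked
                     if self.snd_una <= n < self.send_high}
        above = (cum_ack > self.send_high
                 or any(n >= self.send_high for n in reported))
        advanced = cum_ack > self.snd_una
        self.sacked |= reported            # store incoming SACK information
        if advanced:
            self.snd_una = cum_ack
        if self.step == 2:
            if not advanced:
                return Outcome.PENDING     # duplicate ACK: stay in step 2
            if cum_ack >= self.send_high:  # branch 2a
                self.cwnd = min(self.cwnd, 2)
                self.step = 0
                return Outcome.CONVENTIONAL
            self.snd_nxt += 2              # branch 2b: two new segments
            self.step = 3
            return Outcome.PENDING
        self.step = 0
        if above or not (advanced or new_below):   # branch 3a
            self.cwnd = min(self.cwnd, 3)
            return Outcome.CONVENTIONAL
        self.spurious_recovery = True      # branch 3b
        self.send_high = self.snd_una
        return Outcome.SPUR_TO
```

   Unlike the basic algorithm, this sketch declares the RTO spurious on
   a duplicate ACK in step 3, provided its SACK blocks cover previously
   unacknowledged segments below send_high and nothing above it.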

391	4.  Taking Actions after Detecting Spurious RTO

393	   Upon retransmission timeout, a conventional TCP sender assumes that
394	   outstanding segments are lost and starts retransmitting the
395	   unacknowledged segments. When the RTO is detected to be spurious, the
396	   TCP sender should not start retransmitting based on the RTO. For
397	   example, if the sender was in congestion avoidance phase transmitting
398	   new previously unsent segments, it should continue transmitting
399	   previously unsent segments after detecting spurious RTO. In addition,
400	   it is suggested that the RTO estimation is reinitialized and the RTO
401	   timer is adjusted to a more conservative value in order to avoid
402	   subsequent spurious RTOs [LG03].

404	   Different approaches have been suggested for adjusting the congestion
405	   control state after a spurious RTO. This document does not
406	   specifically recommend any of the alternatives below, but considers
407	   the response to spurious RTO as a subject of further research.

409	   1) Revert the congestion control parameters to the state before the
410	      RTO [LG03]. This appears to be a justified decision, because it is
411	      similar to the situation in which the RTO did not expire
412	      spuriously. However, two concerns exist with this approach:
413	      First, some detection mechanisms, such as F-RTO or the Eifel
414	      Detection algorithm, do not notice the loss of the spurious
415	      retransmission, thus introducing a small risk of violation of the
416	      congestion control principles. Second, a spurious RTO indicates
417	      that some part of the network was unable to deliver packets for a
418	      while, which can be considered as a potential indication of
419	      congestion.

421	   2) Reduce congestion window to half of its earlier value but revert
422	      slow start threshold to its earlier value [SL03].  This
423	      alternative takes measures to validate the congestion window after
424	      a period during which no data has been transmitted. This would be
425	      a justified action to take if the spurious RTO is assumed to be
426	      caused by changes in the network conditions, such as a change
427	      in the available bandwidth or a wireless handoff to another point
428	      of attachment in the network.

430	   3) Reduce ssthresh and congestion window when detecting a spurious
431	      RTO [SKR02]. For example, ssthresh and cwnd could be set to half
432	      of their earlier values, as done with the other congestion
433	      notification events. This alternative would be conservative enough
434	      considering the possibility of not detecting a packet loss of the
435	      RTO-triggered retransmission, but the TCP sender should avoid
436	      reducing the congestion window more than once in a round-trip
437	      time. Furthermore, if a spurious RTO occurs in the beginning of a
438	      TCP connection, this alternative causes the slow start to be
439	      canceled, which may sacrifice TCP performance.
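
   The arithmetic of the three alternatives can be compared side by
   side. The sketch below is illustrative only: the names are
   hypothetical, "earlier value" is taken to mean the value saved just
   before the RTO, and halving uses integer division without any lower
   bound. This document does not recommend one alternative over the
   others.

```python
from dataclasses import dataclass
from enum import Enum


class Response(Enum):
    FULL_REVERT = 1   # alternative 1: restore the pre-RTO state
    HALVE_CWND = 2    # alternative 2: halve cwnd, restore ssthresh
    HALVE_BOTH = 3    # alternative 3: halve both cwnd and ssthresh


@dataclass(frozen=True)
class CcState:
    cwnd: int         # congestion window, bytes
    ssthresh: int     # slow start threshold, bytes


def respond(before_rto: CcState, choice: Response) -> CcState:
    """Return the congestion control state to use after a spurious RTO,
    given the state that was saved just before the RTO."""
    if choice is Response.FULL_REVERT:
        return before_rto
    if choice is Response.HALVE_CWND:
        return CcState(before_rto.cwnd // 2, before_rto.ssthresh)
    if choice is Response.HALVE_BOTH:
        return CcState(before_rto.cwnd // 2, before_rto.ssthresh // 2)
    raise ValueError(f"unknown response: {choice!r}")


saved = CcState(cwnd=20000, ssthresh=64000)
results = {choice: respond(saved, choice) for choice in Response}
```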

441	5.  SCTP Considerations

443	   The SACK-enhanced F-RTO algorithm can be applied with the SCTP
444	   protocol. However, SCTP contains features that are not present in TCP
445	   and that need to be discussed when applying the F-RTO algorithm.

447	   An SCTP association can be multi-homed. The current retransmission
448	   policy states that retransmissions should go to alternative addresses.
449	   If the retransmission was due to a spurious RTO caused by a delay
450	   spike, it is possible that the acknowledgement for the retransmission
451	   arrives back at the sender before the acknowledgements of the
452	   original transmissions arrive. If this happens, a possible loss of the
453	   original transmission of the data chunk that was retransmitted due to
454	   the RTO may remain undetected when the F-RTO algorithm is applied and
455	   a delay spike triggered the RTO. Because the RTO was caused by the
456	   delay, and it was spurious in that respect, a suitable response is to
457	   continue by sending new data. However, if the original transmission
458	   was lost, fully reverting the congestion control parameters is too
459	   aggressive. Therefore, taking conservative actions on congestion
460	   control is recommended if the SCTP association is multi-homed and
461	   retransmissions go to an alternative address. The information in
462	   duplicate TSNs can then be used for reverting congestion control,
463	   if desired [BA02].

465	   Note that the forward transmissions made in F-RTO algorithm step (2b)
466	   should be destined to the primary address, since they are not
467	   retransmissions.

469	   When making a retransmission, an SCTP sender can bundle a number of
470	   unacknowledged data chunks and include them in the same packet. This
471	   needs to be considered when implementing F-RTO for SCTP. The basic
472	   principle of F-RTO still holds: in order to declare the RTO spurious,
473	   the sender must get an acknowledgement for a data chunk that was not
474	   retransmitted after the RTO. In other words, acknowledgements of data
475	   chunks that were bundled in the RTO retransmission must not be used
476	   for declaring the RTO spurious.
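
   The bundling rule can be pictured with a small Python sketch. The
   function name and the representation of chunks as plain TSN integers
   are assumptions made for illustration; a real SCTP stack tracks this
   state per chunk.

```python
def may_declare_spurious(newly_acked_tsns, rto_retransmitted_tsns):
    """Return True if at least one newly acknowledged chunk was NOT
    part of the RTO retransmission (including bundled chunks).

    Acknowledgements that only cover chunks bundled in the RTO
    retransmission carry no evidence that the RTO was spurious.
    """
    retransmitted = set(rto_retransmitted_tsns)
    return any(tsn not in retransmitted for tsn in newly_acked_tsns)
```

   For example, if the RTO retransmission bundled TSNs 6 and 7, an
   acknowledgement covering only 6 and 7 does not qualify, while one
   that also covers TSN 8 does.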

478	6.  Security Considerations

480	   The main security threat regarding F-RTO is the possibility of a
481	   receiver misleading the sender into setting too large a congestion
482	   window after an RTO. There are two possible ways a malicious receiver
483	   could trigger a wrong output from the F-RTO algorithm. First, the
484	   receiver can acknowledge data that it has not received. Second, it
485	   can delay the acknowledgement of a segment it has received earlier,
486	   and acknowledge the segment after the TCP sender has been misled into
487	   entering algorithm step 3.

489	   If the receiver acknowledges a segment it has not really received,
490	   the sender can be led to declare the RTO spurious in F-RTO algorithm
491	   step 3. However, since this causes the sender to have incorrect
492	   state, it cannot retransmit the segment that has never reached the
493	   receiver. Therefore, this attack is unlikely to be useful for the
494	   receiver to maliciously gain a larger congestion window.

496	   A common case of an RTO is that a fast retransmission of a segment is
497	   lost. If all other segments have been received, the RTO retransmis-
498	   sion causes the whole window to be acknowledged at once. This case is
499	   recognized in F-RTO algorithm branch (2a). However, if the receiver
500	   only acknowledges one segment after receiving the RTO retransmission,
501	   and then the rest of the segments, it could cause the RTO to be
502	   declared spurious when it is not. Therefore, it is suggested that
503	   when an RTO expires during the fast recovery phase, the sender does
504	   not fully revert the congestion window even if the RTO is declared
505	   spurious, but instead reduces the congestion window to 1. The sender
506	   can still act as usual to avoid unnecessary retransmissions. If a TCP
507	   sender implements a burst avoidance algorithm that limits the sending
508	   rate to be no higher than in slow start, this precaution is not
509	   needed, and the sender may apply F-RTO normally.
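
   A minimal Python sketch of this precaution follows. The function and
   parameter names are hypothetical, and returning the pre-RTO window in
   the normal case stands in for whichever response the sender would
   otherwise apply.

```python
def cwnd_after_spurious_rto(prior_cwnd, rto_during_fast_recovery,
                            has_burst_avoidance=False):
    """Pick the congestion window (in segments) once an RTO has been
    declared spurious.

    If the RTO expired during fast recovery and the sender has no burst
    avoidance that caps its rate at slow-start level, do not trust the
    spurious verdict for congestion control: fall back to 1 segment.
    """
    if rto_during_fast_recovery and not has_burst_avoidance:
        return 1
    # Assumption: the normal response here is a full revert.
    return prior_cwnd
```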

511	   If more than one segment is missing at the time when an RTO occurs,
512	   the receiver does not benefit from misleading the sender into
513	   declaring a spurious RTO, because the sender would then have to go
514	   through another recovery period to retransmit the missing segments,
515	   usually after an RTO.

517	Acknowledgements
518	   We are grateful to Reiner Ludwig, Andrei Gurtov, Josh Blanton, Mark
519	   Allman, Sally Floyd, Yogesh Swami, Mika Liljeberg, Ivan Arias
520	   Rodriguez, Sourabh Ladha, and Martin Duke for the discussion and
521	   feedback contributed to this text.

523	Normative References

525	   [APS99]   M. Allman, V. Paxson, and W. Stevens. TCP Congestion Con-
526	             trol. RFC 2581, April 1999.

528	   [BAFW03]  E. Blanton, M. Allman, K. Fall, and L. Wang. A Conservative
529	             Selective Acknowledgment (SACK)-based Loss Recovery
530	             Algorithm for TCP. RFC 3517, April 2003.

532	   [MMFR96]  M. Mathis, J. Mahdavi, S. Floyd, and A. Romanow. TCP Selec-
533	             tive Acknowledgement Options. RFC 2018, October 1996.

535	   [PA00]    V. Paxson and M. Allman. Computing TCP's Retransmission
536	             Timer. RFC 2988, November 2000.

538	   [Pos81]   J. Postel. Transmission Control Protocol. RFC 793, Septem-
539	             ber 1981.

541	   [Ste00]   R. Stewart, et al. Stream Control Transmission Protocol.
542	             RFC 2960, October 2000.

544	Informative References

546	   [ABF01]   M. Allman, H. Balakrishnan, and S. Floyd. Enhancing TCP's
547	             Loss Recovery Using Limited Transmit. RFC 3042, January
548	             2001.

550	   [BA02]    E. Blanton and M. Allman. On Making TCP more Robust to
551	             Packet Reordering. ACM Computer Communication Review,
552	             32(1), January 2002.

554	   [BBJ92]   D. Borman, R. Braden, and V. Jacobson. TCP Extensions for
555	             High Performance. RFC 1323, May 1992.

557	   [FH99]    S. Floyd and T. Henderson. The NewReno Modification to
558	             TCP's Fast Recovery Algorithm. RFC 2582, April 1999.

560	   [FMMP00]  S. Floyd, J. Mahdavi, M. Mathis, and M. Podolsky. An Exten-
561	             sion to the Selective Acknowledgement (SACK) Option to TCP.
562	             RFC 2883, July 2000.

564	   [GL02]    A. Gurtov and R. Ludwig. Evaluating the Eifel Algorithm for
565	             TCP in a GPRS Network. In Proc. of European Wireless,
566	             Florence, Italy, February 2002.

568	   [LG03]    R. Ludwig and A. Gurtov. The Eifel Response Algorithm for
569	             TCP. Internet draft "draft-ietf-tsvwg-tcp-eifel-
570	             response-03.txt".  March 2003. Work in progress.

572	   [LK00]    R. Ludwig and R.H. Katz. The Eifel Algorithm: Making TCP
573	             Robust Against Spurious Retransmissions. ACM Computer Com-
574	             munication Review, 30(1), January 2000.

576	   [LM03]    R. Ludwig and M. Meyer. The Eifel Detection Algorithm for
577	             TCP. RFC 3522, April 2003.

579	   [SKR02]   P. Sarolahti, M. Kojo, and K. Raatikainen. F-RTO: A New
580	             Recovery Algorithm for TCP Retransmission Timeouts. Univer-
581	             sity of Helsinki, Dept. of Computer Science. Series of Pub-
582	             lications C, No. C-2002-07. February 2002. Available at:
583	             http://www.cs.helsinki.fi/research/iwtcp/papers/f-rto.ps

585	   [SL03]    Y. Swami and K. Le. DCLOR: De-correlated Loss Recovery
586	             using SACK option for spurious timeouts. Internet draft
587	             "draft-swami-tsvwg-tcp-dclor-01.txt". April 2003. Work in
588	             progress.

590	Appendix A: Scenarios

592	   This section discusses different scenarios where RTOs occur and how
593	   the basic F-RTO algorithm performs in those scenarios. The
594	   interesting scenarios are a sudden delay triggering RTO, loss of a
595	   retransmitted packet during fast recovery, link outage causing the
596	   loss of several packets, and packet reordering. A performance
597	   evaluation with a more thorough analysis on a real implementation of
598	   F-RTO is given in [SKR02].

600	A.1.  Sudden delay

602	   The main motivation of F-RTO is to improve TCP performance when a
603	   delay spike triggers a spurious retransmission timeout. The example
604	   below illustrates the segments and acknowledgements transmitted by
605	   the TCP end hosts when a spurious RTO occurs but no packets are lost.
606	   For simplicity, delayed acknowledgements are not used in the example.
607	   In the example, the sender halves the congestion window and slow
608	   start threshold after detecting a spurious RTO.

610	         ...
611	          (cwnd = 6,
612	          ssthresh < 6,
613	          FlightSize = 5)
614	         1.  SEND 10 ---------------------------->
615	         2.          <---------------------------- ACK 6
616	         3.  SEND 11 ---------------------------->
617	         4.                       |
618	                               [delay]
619	                                  |
620	             [RTO]
621	         5.  SEND 6  ---------------------------->
622	                     <earlier xmitted SEG 6>  --->
623	         6.          <---------------------------- ACK 7
624	             [F-RTO step (2b)]
625	         7.  SEND 12 ---------------------------->
626	         8.  SEND 13 ---------------------------->
627	                     <earlier xmitted SEG 7>  --->
628	         9.          <---------------------------- ACK 8
629	             [F-RTO step (3b)]
630	             [SpuriousRecovery <- SPUR_TO]
631	             [cwnd <- 3, ssthresh <- 3]
632	         10.         <---------------------------- ACK 9
633	         11.         <---------------------------- ACK 10
634	         12.         <---------------------------- ACK 11
635	         13. SEND 14 ---------------------------->
636	         ...

638	   When a sudden delay long enough to trigger an RTO occurs at step 4,
639	   the TCP sender retransmits the first unacknowledged segment (step 5).
640	   The next ACK covers the RTO retransmission, because the originally
641	   transmitted segment 6 arrives at the receiver, so the TCP sender
642	   continues by sending two new data segments (steps 7, 8). Because the
643	   second acknowledgement arriving after the RTO acknowledges data that
644	   was not retransmitted due to the RTO (step 9), the TCP sender
645	   declares the RTO spurious and continues by sending new data. Because
646	   the TCP sender reduces cwnd when it detects the spurious RTO, it has
647	   to wait for some outstanding segments to leave the network before it
648	   can continue transmitting again at step 13.

650	A.2.  Loss of a retransmission

652	   If a retransmitted segment is lost, the only way to retransmit it
653	   again is to wait for the RTO to trigger the retransmission. Once the
654	   segment is successfully received, the receiver usually acknowledges
655	   several segments at once, because other segments in the same window
656	   have been successfully delivered before the retransmission arrives at
657	   the receiver. The example below shows a scenario where the
658	   retransmission of segment 6 is lost, as well as a later segment
659	   (segment 9) in the same window. The limited transmit [ABF01] and SACK
660	   TCP [MMFR96] enhancements are not in use in this example.

662	         ...
663	          (cwnd = 6,
664	          ssthresh < 6,
665	          FlightSize = 5)
666	             <segment 6 lost>
667	             <segment 9 lost>
668	         1.  SEND 10 ---------------------------->
669	         2.          <---------------------------- ACK 6
670	         3.  SEND 11 ---------------------------->
671	         4.          <---------------------------- ACK 6
672	         5.          <---------------------------- ACK 6
673	         6.          <---------------------------- ACK 6
674	         7.  SEND 6  --------------X
675	             <segment 6 lost>
676	             [ssthresh <- 3, cwnd <- ssthresh + 3 = 6]
677	         8.          <---------------------------- ACK 6
678	                                  |
679	                                  |
680	             [RTO]
681	             [ssthresh <- 2]
682	         9.  SEND 6  ---------------------------->
683	         10.         <---------------------------- ACK 9
684	             [F-RTO step (2b)]
685	         11. SEND 12 ---------------------------->
686	         12. SEND 13 ---------------------------->
687	         13.         <---------------------------- ACK 9
688	             [F-RTO step (3a)]
689	             [SpuriousRecovery <- FALSE]
690	             [cwnd <- 3]
691	         14. SEND 9  ---------------------------->
692	         15. SEND 10 ---------------------------->
693	         16. SEND 11 ---------------------------->
694	         17.         <---------------------------- ACK 11
695	         ...

697	   In the example above, segment 6 is lost and the sender retransmits it
698	   after three duplicate ACKs in step 7. However, the retransmission is
699	   also lost, and the sender has to wait for the RTO to expire before
700	   retransmitting it again. Because the first ACK following the RTO
701	   acknowledges the RTO retransmission (step 10), the sender transmits
702	   two new segments. The second ACK in step 13 does not acknowledge any
703	   previously unacknowledged data. Therefore, the F-RTO sender enters
704	   slow start and sets cwnd to 3 * MSS. The congestion window can be set
705	   to three segments because two round-trips have elapsed after the RTO.
706	   After this, the receiver acknowledges all segments transmitted prior
707	   to entering recovery, and the sender can continue transmitting new
708	   data in congestion avoidance.

710	A.3.  Link outage

712	   The example below illustrates the F-RTO behavior when four
713	   consecutive packets are lost in the network, causing the TCP sender
714	   to fall back to RTO recovery. Limited transmit and SACK are not used
715	   in this example.

717	         ...
718	          (cwnd = 6,
719	          ssthresh < 6,
720	          FlightSize = 5)
721	             <segments 6-9 lost>
722	         1.  SEND 10 ---------------------------->
723	         2.          <---------------------------- ACK 6
724	         3.  SEND 11 ---------------------------->
725	         4.          <---------------------------- ACK 6
726	                                  |
727	                                  |
728	             [RTO]
729	             [ssthresh <- 3]
730	         5.  SEND 6  ---------------------------->
731	         6.          <---------------------------- ACK 7
732	             [F-RTO step (2b)]
733	         7.  SEND 12 ---------------------------->
734	         8.  SEND 13 ---------------------------->
735	         9.          <---------------------------- ACK 7
736	             [F-RTO step (3a)]
737	             [SpuriousRecovery <- FALSE]
738	             [cwnd <- 3]
739	         10. SEND 7  ---------------------------->
740	         11. SEND 8  ---------------------------->
741	         12. SEND 9  ---------------------------->
742	         13.         <---------------------------- ACK 14

744	   Again, the F-RTO sender transmits two new segments (steps 7 and 8)
745	   after the RTO retransmission is acknowledged. Because the next ACK
746	   does not acknowledge any data that was not retransmitted after the
747	   RTO (step 9), the F-RTO sender proceeds with conventional recovery
748	   and slow start retransmissions.
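
   The decisions taken in the traces of A.1, A.2, and A.3 can be
   replayed with a short Python sketch of the basic algorithm's
   branches. It works in whole segment numbers, where ACK n acknowledges
   every segment below n. The function name and the convention that
   send_high is one past the highest segment sent before the RTO are
   assumptions of this sketch, and only the first two ACKs after the RTO
   are modeled.

```python
def classify_rto(snd_una, send_high, first_ack, second_ack):
    """Return (branch, spurious) for the two ACKs that follow an RTO.

    snd_una   -- first unacknowledged segment when the RTO expired
    send_high -- one past the highest segment sent before the RTO
    """
    # Branch (2a): duplicate ACK, or the whole pre-RTO window is
    # acknowledged at once; use conventional RTO recovery.
    if first_ack <= snd_una or first_ack >= send_high:
        return "2a", False
    # Branch (2b) was taken: the sender transmits up to two new
    # segments and waits for the next ACK.
    # Branch (3a): duplicate ACK; conventional recovery, cwnd = 3 segments.
    if second_ack <= first_ack:
        return "3a", False
    # Branch (3b): the window advances again, so data that was not
    # retransmitted has been acknowledged; the RTO is spurious.
    return "3b", True


# A.1 (sudden delay): ACK 7 then ACK 8
print(classify_rto(6, 12, 7, 8))   # ('3b', True)
# A.2 (lost retransmission): ACK 9 then ACK 9
print(classify_rto(6, 12, 9, 9))   # ('3a', False)
# A.3 (link outage): ACK 7 then ACK 7
print(classify_rto(6, 12, 7, 7))   # ('3a', False)
```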

750	A.4.  Packet reordering

752	   Since F-RTO modifies the TCP sender behavior only after a
753	   retransmission timeout and is intended to avoid unnecessary
754	   retransmissions only after a spurious RTO, we limit the discussion of
755	   the effects of packet reordering on F-RTO behavior to the cases where
756	   packet reordering occurs immediately after the RTO. When the TCP
757	   receiver gets an out-of-order segment, it generates a duplicate ACK.
758	   If the TCP sender implements the basic F-RTO algorithm, this may
759	   prevent the sender from detecting a spurious RTO.

761	   However, if the TCP sender applies the SACK-enhanced F-RTO, it is
762	   possible to detect a spurious RTO also when packet reordering occurs.
763	   We illustrate the behavior of SACK-enhanced F-RTO below when segment
764	   8 arrives before segments 6 and 7, and segments starting from segment
765	   6 are delayed in the network. In this example the TCP sender reduces
766	   the congestion window and slow start threshold in response to the
767	   spurious RTO.

769	         ...
770	          (cwnd = 6,
771	          ssthresh < 6,
772	          FlightSize = 5)
773	         1.  SEND 10 ---------------------------->
774	         2.          <---------------------------- ACK 6
775	         3.  SEND 11 ---------------------------->
776	         4.                       |
777	                               [delay]
778	                                  |
779	             [RTO]
780	         5.  SEND 6  ---------------------------->
781	                     <earlier xmitted SEG 8>  --->
782	         6.          <---------------------------- ACK 6
783	                                                   [SACK 8]
784	             [SACK F-RTO stays in step 2]
785	         7.          <earlier xmitted SEG 6>  --->
786	         8.          <---------------------------- ACK 7
787	                                                   [SACK 8]
788	             [SACK F-RTO step (2b)]
789	         9.  SEND 12 ---------------------------->
790	         10. SEND 13 ---------------------------->
791	         11.         <earlier xmitted SEG 7>  --->
792	         12.         <---------------------------- ACK 9
793	             [SACK F-RTO step (3b)]
794	             [SpuriousRecovery <- SPUR_TO]
795	             [ssthresh <- 3, cwnd <- 3]
796	         13.         <---------------------------- ACK 10
797	         14.         <---------------------------- ACK 11
798	         15. SEND 14 ---------------------------->
799	         ...

801	   After the RTO expires and the sender retransmits segment 6 (step 5),
802	   the receiver gets segment 8 and generates a duplicate ACK with SACK
803	   for segment 8. In response to the acknowledgement, the TCP sender
804	   does not send anything but stays in F-RTO step 2. Because the next
805	   acknowledgement advances the cumulative ACK point (step 8), the
806	   sender can transmit two new segments according to SACK-enhanced
807	   F-RTO. The next acknowledgement (step 12) acknowledges new data
808	   between 7 and 11 that was not acknowledged earlier (segment 7), so
809	   the F-RTO sender declares the RTO spurious.
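
   The reordering trace above can be replayed with the following Python
   sketch. It is a simplified reading of the SACK-enhanced steps as they
   appear in the trace, not a complete implementation. The function
   name, the use of whole segment numbers, the representation of each
   ACK as a (cumulative ACK, SACKed segments) pair, and send_high being
   one past the highest segment sent before the RTO are all assumptions
   of the sketch.

```python
def sack_frto_is_spurious(snd_una, send_high, retransmitted, acks):
    """Replay ACKs received after an RTO; return True if the RTO is
    declared spurious, False otherwise (including when ACKs run out).

    acks -- iterable of (cum_ack, sacked_segments) pairs, in arrival order
    """
    scoreboard = set()      # segments known to have reached the receiver
    step = 2
    for cum_ack, sacked in acks:
        covered = set(range(snd_una, cum_ack)) | set(sacked)
        newly_acked = covered - scoreboard
        scoreboard |= covered
        if step == 2:
            if cum_ack <= snd_una:
                continue    # duplicate ACK: update scoreboard, stay in step 2
            if cum_ack >= send_high:
                return False    # step (2a): conventional recovery
            step = 3        # step (2b): send up to two new segments
        else:
            # Step 3: spurious only if this ACK newly covers data below
            # send_high that was not retransmitted after the RTO.
            return any(seg < send_high and seg not in retransmitted
                       for seg in newly_acked)
    return False


# Trace from this section: dup ACK 6 [SACK 8], ACK 7 [SACK 8], ACK 9
trace = [(6, [8]), (7, [8]), (9, [])]
print(sack_frto_is_spurious(6, 12, {6}, trace))   # True
```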

811	Appendix B: Applying SACK-enhanced F-RTO when RTO occurs during loss
812	recovery

814	   We believe that a slightly modified SACK-enhanced F-RTO algorithm can
815	   be used to detect spurious RTOs also when an RTO occurs while an
816	   earlier loss recovery is underway. However, there are issues that
817	   need to be considered if F-RTO is applied in this case.

819	   The original SACK-based F-RTO requires in algorithm step 3 that an
820	   ACK acknowledges previously unacknowledged non-retransmitted data
821	   between SND.UNA and send_high. If the RTO takes place during an
822	   earlier (SACK-based) loss recovery, the F-RTO sender must only use
823	   acknowledgements for non-retransmitted segments transmitted before
824	   the SACK-based loss recovery started. This means that in order to
825	   declare the RTO spurious, the TCP sender must receive an
826	   acknowledgement for a non-retransmitted segment between SND.UNA and
827	   RecoveryPoint in algorithm step 3. RecoveryPoint is defined in the
828	   conservative SACK-recovery algorithm [BAFW03], and it is set to
829	   indicate the highest segment transmitted so far when SACK-based loss
830	   recovery begins. In other words, if the TCP sender receives an
831	   acknowledgement for a segment that was transmitted more than one RTO
832	   ago, it can declare the RTO spurious. Defining an efficient algorithm
833	   for checking these conditions remains a future work item.
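
   For illustration only, a naive (not efficient) version of this check
   could look as follows in Python. The function name, the use of whole
   segment numbers, and treating the range as inclusive of RecoveryPoint
   are assumptions of the sketch.

```python
def ack_proves_spurious(newly_acked, snd_una, recovery_point, retransmitted):
    """Return True if some newly acknowledged segment lies between
    SND.UNA and RecoveryPoint and was never retransmitted.

    newly_acked   -- segments newly covered by the ACK (cumulative or SACK)
    retransmitted -- segments retransmitted by loss recovery or by the RTO
    """
    return any(snd_una <= seg <= recovery_point and seg not in retransmitted
               for seg in newly_acked)
```

   This scans every newly acknowledged segment on each ACK, which is
   what an efficient algorithm would need to avoid.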

835	   When a spurious RTO is detected according to the rules given above,
836	   the response algorithm may need to consider this case separately, for
837	   example in terms of which segments to retransmit after the RTO and
838	   whether it is safe to revert the congestion control parameters. This
839	   is considered a topic of future research.

842	Authors' Addresses

844	   Pasi Sarolahti
845	   Nokia Research Center
846	   P.O. Box 407
847	   FIN-00045 NOKIA GROUP
848	   Finland
849	   Phone: +358 50 4876607
850	   EMail: pasi.sarolahti@nokia.com
851	   http://www.cs.helsinki.fi/u/sarolaht/

853	   Markku Kojo
854	   University of Helsinki
855	   Department of Computer Science
856	   P.O. Box 26
857	   FIN-00014 UNIVERSITY OF HELSINKI
858	   Finland

860	   Phone: +358 9 1914 4179
861	   EMail: markku.kojo@cs.helsinki.fi