idnits 2.17.1 

draft-sarolahti-tsvwg-tcp-frto-03.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Cannot find the required boilerplate sections (Copyright, IPR, etc.) in
     this document.

     Expected boilerplate is as follows today (2024-04-26) according to
     https://trustee.ietf.org/license-info :

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.a:
        This Internet-Draft is submitted in full conformance with the provisions
        of BCP 78 and BCP 79.

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 2:
        Copyright (c) 2024 IETF Trust and the persons identified as the document
        authors.  All rights reserved.

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 3:
        This document is subject to BCP 78 and the IETF Trust's Legal Provisions
        Relating to IETF Documents
        (https://trustee.ietf.org/license-info) in effect on the date of
        publication of this document.  Please review these documents
        carefully, as they describe your rights and restrictions with
        respect to this document.  Code Components extracted from this
        document must include Simplified BSD License text as described in
        Section 4.e of the Trust Legal Provisions and are provided
        without warranty as described in the Simplified BSD License.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an IANA Considerations section.  (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** The document seems to lack a both a reference to RFC 2119 and the
     recommended RFC 2119 boilerplate, even if it appears to use RFC 2119
     keywords. 

     RFC 2119 keyword, line 135: '..., the TCP sender SHOULD retransmit fir...'
     RFC 2119 keyword, line 139: '.... The TCP sender MAY postpone adjustin...'
     RFC 2119 keyword, line 154: '..., the TCP sender SHOULD revert to the ...'
     RFC 2119 keyword, line 157: '...      The sender MUST set cwnd to 1 * ...'
     RFC 2119 keyword, line 168: '...h, the TCP sender MAY transmit two new...'
     (26 more instances...)


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- Couldn't find a document date in the document -- date freshness check
     skipped.


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Missing Reference: 'RFC2026' is mentioned on line 16, but not defined

  ** Obsolete normative reference: RFC 2581 (ref. 'APS99') (Obsoleted by RFC
     5681)

  ** Obsolete normative reference: RFC 2988 (ref. 'PA00') (Obsoleted by RFC
     6298)

  ** Obsolete normative reference: RFC  793 (ref. 'Pos81') (Obsoleted by RFC
     9293)

  -- Obsolete informational reference (is this intentional?): RFC 1323 (ref.
     'BBJ92') (Obsoleted by RFC 7323)

  -- Obsolete informational reference (is this intentional?): RFC 2582 (ref.
     'FH99') (Obsoleted by RFC 3782)


     Summary: 6 errors (**), 0 flaws (~~), 2 warnings (==), 4 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Internet Engineering Task Force                             P. Sarolahti
3	INTERNET DRAFT                                     Nokia Research Center
4	File: draft-sarolahti-tsvwg-tcp-frto-03.txt                      M. Kojo
5	                                                  University of Helsinki
6	                                                           January, 2003
7	                                                     Expires: July, 2003

9	                F-RTO: A TCP RTO Recovery Algorithm for
10	                  Avoiding Unnecessary Retransmissions

12	Status of this Memo

14	   This document is an Internet-Draft and is in full conformance with
15	   all provisions of Section 10 of [RFC2026].

17	   Internet-Drafts are working documents of the Internet Engineering
18	   Task Force (IETF), its areas, and its working groups.  Note that
19	   other groups may also distribute working documents as Internet-
20	   Drafts.

22	   Internet-Drafts are draft documents valid for a maximum of six months
23	   and may be updated, replaced, or obsoleted by other documents at any
24	   time.  It is inappropriate to use Internet-Drafts as reference
25	   material or to cite them other than as "work in progress."

27	   The list of current Internet-Drafts can be accessed at
28	   http://www.ietf.org/ietf/1id-abstracts.txt

30	   The list of Internet-Draft Shadow Directories can be accessed at
31	   http://www.ietf.org/shadow.html.

33	Abstract

35	   Spurious retransmission timeouts (RTOs) cause suboptimal TCP
36	   performance, because they often result in unnecessary retransmission
37	   of the last window of data. This document describes the "Forward RTO
38	   Recovery" (F-RTO) algorithm for detecting spurious TCP RTOs. F-RTO is
39	   a TCP sender-only algorithm that does not require any TCP options to
40	   operate. After retransmitting the first unacknowledged segment
41	   triggered by an RTO, the F-RTO algorithm at a TCP sender monitors the
42	   incoming acknowledgements to determine whether the timeout was
43	   spurious and to decide whether to send new segments or retransmit
44	   unacknowledged segments. The algorithm effectively helps to avoid
45	   additional unnecessary retransmissions and thereby improves TCP
46	   performance in case of a spurious timeout.

48	1.  Introduction

50	   The TCP protocol [Pos81] has two methods for triggering
51	   retransmissions.  Primarily, the TCP sender relies on incoming
52	   duplicate ACKs, which indicate that the receiver is missing some of
53	   the data. After a required number of successive duplicate ACKs have
54	   arrived at the sender, it retransmits the first unacknowledged
55	   segment [APS99]. Secondarily, the TCP sender maintains a
56	   retransmission timer which triggers retransmission of segments, if
57	   they have not been acknowledged within the retransmission timer
58	   expiration period. When the retransmission timer expires, the
59	   congestion window is initialized to one segment and unacknowledged
60	   segments are retransmitted using the slow-start algorithm. The
61	   retransmission timer is adjusted dynamically based on the measured
62	   round-trip times [PA00].

64	   It has been pointed out that the retransmission timer can expire
65	   spuriously and trigger unnecessary retransmissions when no segments
66	   have been lost [GL02]. After a spurious RTO, the acknowledgements of
67	   the original segments arrive at the sender, usually triggering
68	   unnecessary retransmissions of a whole window of segments during the
69	   RTO recovery. Furthermore, after a spurious RTO a conventional TCP
70	   sender increases the congestion window in slow start, injecting a
71	   large number of data segments into the network within one round-trip
72	   time.

74	   There are a number of potential reasons for spurious RTOs. First,
75	   some mobile networking technologies involve sudden delay peaks on
76	   transmission because of actions taken during a hand-off. Second,
77	   arrival of competing traffic, possibly with higher priority, on a
78	   low-bandwidth link or some other change in available bandwidth
79	   involves a sudden increase of round-trip time which may trigger a
80	   spurious retransmission timeout. A persistently reliable link layer
81	   can also cause a sudden delay when several data frames are lost for
82	   some reason. This document does not distinguish the different causes
83	   of such a delay, but discusses the spurious RTO caused by delay in
84	   general.

86	   This document describes an alternative RTO recovery algorithm called
87	   "Forward RTO-Recovery" (F-RTO) to be used for detecting a spurious
88	   RTO and thus avoiding unnecessary retransmissions following it. When
89	   the RTO is not spurious, the F-RTO algorithm reverts to the
90	   conventional RTO recovery algorithm and should have similar
91	   performance. F-RTO does not require any TCP options in its operation,
92	   and it can be implemented by modifying only the TCP sender. This is
93	   different from alternative algorithms (Eifel [LK00] and DSACK-based
94	   algorithms [BA02]) that have been suggested for detecting unnecessary
95	   retransmissions.  The Eifel algorithm uses TCP timestamps for
96	   detecting a spurious timeout and the DSACK-based algorithms require
97	   that the SACK option with DSACK extension [FMMP00] is in use. With
98	   DSACK, the TCP receiver can report if it has received a duplicate
99	   segment, making it possible for the sender to detect afterwards
100	   whether it has made unnecessary retransmissions.

102	   When an RTO occurs, the F-RTO sender retransmits the first
103	   unacknowledged segment normally. If the next two acknowledgements
104	   advance the window, the F-RTO sender continues sending new data and
105	   exits the recovery.  However, if either of the next two
106	   acknowledgements is a duplicate ACK, there is insufficient evidence
107	   of a spurious RTO; therefore the F-RTO sender retransmits the
108	   unacknowledged segments in slow start similarly to the traditional
109	   algorithm. The F-RTO algorithm only attempts to avoid unnecessary
110	   retransmissions after an RTO. Eifel can also be used to avoid
111	   unnecessary retransmissions in other events, for example due to
112	   packet reordering.

114	   This document is organized as follows. Section 2 describes the basic
115	   F-RTO algorithm. Section 3 outlines an optional enhancement to the F-
116	   RTO algorithm that leverages the TCP Selective Acknowledgement
117	   Option [MMFR96], and Section 4 presents a variant of F-RTO that
118	   uses the TCP timestamp option. Section 5 discusses the possible
119	   actions to be taken after detecting a spurious RTO, and Section 6
120	   discusses the security considerations.

122	2.  F-RTO Algorithm

124	   The F-RTO algorithm affects the TCP sender behavior only after a
125	   retransmission timeout. Otherwise the TCP behavior remains
126	   unmodified.  This section describes a basic version of the F-RTO
127	   algorithm that does not require TCP options to work.  The actions
128	   taken in response to spurious RTO are not described in this document,
129	   but we discuss the different alternatives for congestion control in
130	   Section 5.

132	   When the retransmission timer expires, the F-RTO algorithm takes the
133	   following steps at the TCP sender.

135	   1) When the RTO expires, the TCP sender SHOULD retransmit the first
136	      unacknowledged segment.

138	      The highest sequence number transmitted so far is stored in
139	      variable "send_high". The TCP sender MAY postpone adjusting the
140	      congestion control parameters for the next two incoming ACKs,
141	      until it has more evidence on whether the RTO was spurious or
142	      not. If the TCP sender adjusts the congestion control parameters
143	      at this point, it may store the earlier values of the parameters
144	      to be able to restore the values when it detects that the RTO was
145	      spurious.

147	   2) When the first acknowledgement after the RTO arrives at the
148	      sender, the sender chooses the following actions depending on
149	      whether the ACK advances the window or whether it is a duplicate
150	      ACK.

152	      a) If the acknowledgement is a duplicate ACK OR it
153	         acknowledges a sequence number equal to or above the value of
154	         send_high, the TCP sender SHOULD revert to the conventional
155	         recovery and not enter step 3 of this algorithm.

157	         The sender MUST set cwnd to 1 * MSS. This duplicate ACK is
158	         triggered by a segment that was sent before the RTO
159	         retransmission.  This is possible, for example, if the RTO
160	         expired during fast recovery while forward transmissions are
161	         triggering duplicate ACKs.  Furthermore, if a segment
162	         retransmitted during fast recovery is lost, it needs to be
163	         retransmitted again by the retransmission timer. In this case it
164	         is also possible that the duplicate ACK is triggered by a new
165	         segment transmitted during the fast recovery before the RTO.

167	      b) If the acknowledgement advances the window AND it is below the
168	         value of send_high, the TCP sender MAY transmit two new
169	         (previously unsent) segments.

171	         Sending two new segments at this point is as aggressive as
172	         the conventional RTO recovery algorithm, which would have
173	         increased its cwnd to 2 * MSS when the first valid ACK arrives
174	         after RTO. It is possible that the sender can transmit only one
175	         new segment at this time, because the receiver window limits
176	         it, or because the TCP sender does not have more data to send.
177	         This does not prevent the algorithm from working. In any case,
178	         the TCP sender SHOULD transmit at least one segment, either new
179	         data or from the retransmission queue. If the sender
180	         retransmits the next unacknowledged segment, it MUST NOT enter
181	         step 3 of this algorithm, but instead continues retransmitting
182	         similarly to the conventional RTO recovery algorithm.

184	         If the first acknowledgement after RTO does not acknowledge all
185	         of the data that was retransmitted in step 1, the TCP sender
186	         MUST NOT enter step 3 of this algorithm. Otherwise, a malicious
187	         receiver acknowledging partial segments could cause the sender
188	         to declare the RTO spurious in a case where data was lost. When
189	         receiving an acknowledgement for a partial segment, the TCP
190	         sender SHOULD revert to conventional RTO recovery.

192	   3) When the second acknowledgement after the RTO arrives at the
193	      sender, either declare the RTO spurious, or start retransmitting
194	      the unacknowledged segments.

196	      a) If the acknowledgement is a duplicate ACK, the TCP sender MUST
197	         set the congestion window to no more than 3 * MSS, and continue
198	         with the slow start algorithm retransmitting unacknowledged
199	         segments.

201	         The duplicate ACK indicates that at least one segment other
202	         than the segment which triggered RTO is lost in the last window
203	         of data. There is insufficient evidence that the RTO was
204	         spurious. Therefore, the sender proceeds with retransmissions
205	         similarly to the conventional RTO recovery algorithm, with the
206	         send_high variable stored when the retransmission timer expired
207	         to avoid unnecessary fast retransmits.

209	      b) If the acknowledgement advances the window and acknowledges
210	         data beyond the highest sequence number that was retransmitted
211	         on RTO, the TCP sender SHOULD declare the RTO spurious.

213	         Because the TCP sender has retransmitted only one segment after
214	         the RTO, this acknowledgement indicates that an originally
215	         transmitted segment has arrived at the receiver. This is
216	         regarded as a strong indication of a spurious RTO. The TCP
217	         sender should not assume that the unacknowledged segments are
218	         lost, and it should continue by sending new, previously unsent
219	         segments.

221	         If this algorithm branch is taken, the TCP sender SHOULD set
222	         the value of send_high variable to SND.UNA in order to disable
223	         the Reno "bugfix" [FH99]. The send_high variable was proposed
224	         for avoiding unnecessary multiple fast retransmits when RTO
225	         expires during fast recovery with NewReno TCP. As the sender
226	         has retransmitted no segment other than the one that triggered
227	         the RTO, the problem addressed by the bugfix cannot occur.
228	         Therefore, if there are duplicate ACKs arriving at the sender
229	         after the RTO, they are likely to indicate a packet loss, hence
230	         fast retransmit should be used to allow efficient recovery.  If
231	         there are not enough duplicate ACKs arriving at the sender
232	         after a packet loss, the retransmission timer expires another
233	         time and the sender enters step 1 of this algorithm.
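
   The steps above can be sketched as a small state machine. This is a
   minimal illustration under stated assumptions, not a reference
   implementation: the class and method names, the returned action
   strings, the MSS value, and the "rexmit_end" field are hypothetical.
   send_high is modeled as the next unsent sequence number at the time
   of the RTO, and the cwnd used for the partial-ACK and no-new-data
   fallbacks is an assumption, since the text above does not give one.

```python
from dataclasses import dataclass

MSS = 1460  # illustrative segment size in bytes (assumption)


@dataclass
class FrtoSender:
    snd_una: int            # lowest unacknowledged sequence number
    snd_nxt: int            # next previously unsent sequence number
    cwnd: int = 10 * MSS
    send_high: int = 0
    rexmit_end: int = 0     # end of the segment retransmitted in step 1
    step: int = 0           # 0: not in F-RTO; 2 or 3: awaiting that step's ACK

    def on_rto(self) -> str:
        # Step 1: remember send_high and retransmit one segment.
        self.send_high = self.snd_nxt
        self.rexmit_end = self.snd_una + MSS
        self.step = 2
        return "retransmit_first"

    def _conventional(self, cwnd: int) -> str:
        self.cwnd = cwnd
        self.step = 0
        return "conventional"

    def on_ack(self, ack: int, new_segments: int = 2) -> str:
        is_dup = ack <= self.snd_una
        self.snd_una = max(self.snd_una, ack)
        if self.step == 2:
            if is_dup or ack >= self.send_high:
                return self._conventional(1 * MSS)        # branch 2a
            if ack < self.rexmit_end or new_segments == 0:
                # Partial ACK, or no new data can be sent. The cwnd
                # value here is this sketch's conservative assumption.
                return self._conventional(1 * MSS)
            self.snd_nxt += new_segments * MSS            # branch 2b
            self.step = 3
            return "send_new"
        if self.step == 3:
            if is_dup:                                    # branch 3a
                return self._conventional(min(self.cwnd, 3 * MSS))
            self.send_high = self.snd_una                 # branch 3b
            self.step = 0
            return "spurious"
        return "normal"
```

   In this sketch, two window-advancing ACKs after on_rto() yield
   "spurious", whereas a duplicate ACK at either step yields
   "conventional" with the congestion window reduced.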

235	   If the TCP sender does not have any new data to send in algorithm
236	   branch (2b), or the receiver window limits the transmission, the
237	   sender SHOULD revert to retransmitting unacknowledged data
238	   as regular TCP does. The motivation for this is to ensure
239	   that the flow of segments into the network does not stop. In the
240	   worst case, a stall would result in an additional RTO, significantly
241	   degrading TCP performance.  The TCP sender could try to proceed
242	   with the F-RTO algorithm by alternatively transmitting one segment
243	   from the tail of the retransmission queue, if it is not possible to
244	   transmit new data in algorithm step (2b). Another option would be to
245	   transmit data beyond the advertised receiver window. If the RTO was
246	   spurious, the receiver is likely to be able to store the segment at
247	   the time when it arrives. However, the current recommendation is to
248	   revert to the conventional RTO recovery if sending new data is not
249	   possible, because we believe the benefits of doing otherwise are not
250	   significant.

252	   After the RTO is declared spurious, the TCP sender cannot detect if
253	   the unnecessary RTO retransmission was lost. In principle the loss of
254	   the RTO retransmission should be taken as a congestion signal, and
255	   thus there is a small possibility that the F-RTO sender violates the
256	   congestion control rules, if it chooses to fully revert congestion
257	   control parameters after detecting a spurious RTO. The Eifel
258	   detection algorithm has a similar property, but the DSACK option can
259	   be used to detect whether the retransmitted segment was successfully
260	   delivered to the receiver.

262	   The F-RTO algorithm has a side-effect on the TCP round-trip time
263	   measurement. Because the TCP sender avoids most of the unnecessary
264	   retransmissions after a spurious RTO, the sender is able to take
265	   round-trip time samples of the delayed segments. This would not be
266	   possible, due to retransmission ambiguity, if the regular RTO recovery
267	   were used without TCP timestamps. As a result, the RTO estimator is
268	   likely to have larger values with F-RTO than with regular TCP after
269	   the spurious RTO. We believe this is an advantage in networks
270	   that are prone to delay spikes.
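
   To illustrate the effect on the RTO estimator, the following sketch
   applies the smoothing rules of [PA00] (alpha = 1/8, beta = 1/4,
   K = 4, one-second lower bound). The class name and the sample values
   are hypothetical, and the clock granularity term is assumed to be
   negligible.

```python
ALPHA, BETA, K = 1 / 8, 1 / 4, 4   # constants from [PA00]
MIN_RTO = 1.0                      # seconds, lower bound from [PA00]


class RtoEstimator:
    def __init__(self):
        self.srtt = None
        self.rttvar = None

    def sample(self, r: float) -> float:
        """Feed one round-trip time sample (seconds), return the new RTO."""
        if self.srtt is None:
            # First measurement.
            self.srtt = r
            self.rttvar = r / 2
        else:
            # RTTVAR is updated before SRTT, using the old SRTT.
            self.rttvar = (1 - BETA) * self.rttvar + BETA * abs(self.srtt - r)
            self.srtt = (1 - ALPHA) * self.srtt + ALPHA * r
        return self.rto

    @property
    def rto(self) -> float:
        return max(MIN_RTO, self.srtt + K * self.rttvar)


# Steady state: twenty 200 ms samples (made-up values).
regular, frto = RtoEstimator(), RtoEstimator()
for _ in range(20):
    regular.sample(0.2)
    frto.sample(0.2)

# With F-RTO the delayed original segment was not retransmitted, so its
# 3-second round-trip time can be sampled. The regular sender must
# discard that sample because of retransmission ambiguity.
frto.sample(3.0)
```

   In this example the regular estimator stays at the one-second
   minimum, while the F-RTO estimator rises above the length of the
   delay spike, making a second spurious RTO less likely.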

272	   It is possible that the F-RTO algorithm does not always avoid
273	   unnecessary retransmissions after a spurious RTO. If packet reordering
274	   or packet duplication occurs on the segment that triggered the
275	   spurious RTO, the F-RTO algorithm may not detect the spurious RTO.
276	   Additionally, if a spurious RTO occurs during fast recovery, the F-
277	   RTO algorithm often cannot detect the spurious RTO.  However, we
278	   consider these cases relatively rare, and note that in cases where F-
279	   RTO fails to detect the spurious RTO, it performs similarly to the
280	   regular RTO recovery.

282	3.  A SACK-enhanced version of the F-RTO algorithm

284	   This section describes an alternative version of the F-RTO algorithm
285	   that makes use of the TCP Selective Acknowledgement Option [MMFR96]. By
286	   using the SACK option, the TCP sender can detect spurious RTOs in most
287	   cases when packet reordering or packet duplication is present,
288	   or when the TCP sender is in loss recovery. The difference from the
289	   basic F-RTO algorithm is that the sender may declare the RTO spurious
290	   even when duplicate ACKs follow the RTO, if the SACK blocks
291	   acknowledge new data that was not transmitted after the RTO.

293	   DCLOR is a related TCP enhancement that uses the SACK option to avoid
294	   unnecessary retransmissions after a spurious RTO [SL02].  However,
295	   DCLOR is different from F-RTO in that it does not declare the RTO
296	   spurious before all segments outstanding when the RTO occurs have
297	   been acknowledged.

299	   The SACK-enhanced F-RTO algorithm takes the following steps:

301	   1) When the RTO expires, the TCP sender SHOULD retransmit the first
302	      unacknowledged segment.

304	      The TCP sender should also store the highest sequence number
305	      transmitted in variable "send_high".

307	   2) The first acknowledgement after RTO arrives at the sender.

309	      a) if the cumulative ACK acknowledges all segments up to send_high
310	         stored in algorithm step 1, the TCP sender SHOULD revert to the
311	         conventional RTO recovery and MUST set the congestion window to
312	         no more than 2 * MSS. The sender does not enter step 3 of this
313	         algorithm.

315	      b) otherwise, the TCP sender MAY transmit two new segments. If the
316	         TCP sender does not transmit any previously unsent data, it
317	         MUST NOT enter step 3 of this algorithm, but revert to the
318	         conventional RTO recovery.

320	   3) The second acknowledgement after RTO arrives at the sender.

322	      a) if the ACK acknowledges data above send_high, either in SACK
323	         blocks or as a cumulative ACK, the sender MUST set the congestion
324	         window to no more than 3 * MSS and proceed with slow start,
325	         retransmitting unacknowledged segments. The sender SHOULD take
326	         this branch also when the acknowledgement is a duplicate ACK
327	         and it does not contain any new SACK blocks for previously
328	         unacknowledged data below send_high.

330	      b) if the ACK does not acknowledge data above send_high and some
331	         previously unacknowledged data below send_high is acknowledged,
332	         the TCP sender SHOULD declare the RTO spurious.

334	         If there are unacknowledged holes between the received SACK
335	         blocks, those segments SHOULD be retransmitted similarly to the
336	         conventional SACK recovery algorithm. In addition, send_high
337	         should be set to its earlier value, since no loss recovery was
338	         needed due to the RTO.
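
   The SACK-enhanced steps can be sketched as follows. This is a
   simplified illustration, not a complete SACK implementation: the
   class names, action strings, and MSS value are hypothetical,
   send_high is modeled as the next unsent sequence number at the time
   of the RTO, and a SACK block counts as "new" when the exact
   (start, end) pair has not been seen before. The send_high handling
   after branch (3b) and the retransmission of holes are omitted.

```python
from dataclasses import dataclass, field

MSS = 1460  # illustrative segment size in bytes (assumption)


@dataclass
class Ack:
    cum: int             # cumulative ACK number
    sacks: tuple = ()    # SACK blocks as (start, end) pairs, end exclusive


@dataclass
class SackFrtoSender:
    snd_una: int
    snd_nxt: int
    cwnd: int = 10 * MSS
    send_high: int = 0
    seen_sacks: set = field(default_factory=set)
    step: int = 0        # 0: not in F-RTO; 2 or 3: awaiting that step's ACK

    def on_rto(self) -> str:
        # Step 1: remember send_high and retransmit one segment.
        self.send_high = self.snd_nxt
        self.step = 2
        return "retransmit_first"

    def _conventional(self, cwnd_cap=None) -> str:
        if cwnd_cap is not None:
            self.cwnd = min(self.cwnd, cwnd_cap)
        self.step = 0
        return "conventional"

    def on_ack(self, ack: Ack, new_segments: int = 2) -> str:
        # Classify the ACK before updating any sender state.
        above = ack.cum > self.send_high or any(
            end > self.send_high for _, end in ack.sacks)
        new_below = ack.cum > self.snd_una or any(
            blk not in self.seen_sacks and blk[0] < self.send_high
            for blk in ack.sacks)
        self.snd_una = max(self.snd_una, ack.cum)
        self.seen_sacks.update(ack.sacks)
        if self.step == 2:
            if ack.cum >= self.send_high:
                return self._conventional(2 * MSS)    # branch 2a
            if new_segments == 0:
                return self._conventional()           # 2b, no new data sent
            self.snd_nxt += new_segments * MSS        # branch 2b
            self.step = 3
            return "send_new"
        if self.step == 3:
            if above or not new_below:
                return self._conventional(3 * MSS)    # branch 3a
            self.step = 0
            return "spurious"                         # branch 3b
        return "normal"
```

   Unlike the basic algorithm, this sketch reports "spurious" when both
   ACKs after the RTO are duplicate ACKs, provided that their SACK
   blocks cover new data below send_high.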

340	   As with the basic version of the F-RTO algorithm, in step (2b) the
341	   sender may transmit only one segment if the receiver window does not
342	   allow more, or if there is no more application data.

344	4.  On using the TCP timestamps with F-RTO

346	   The basic F-RTO algorithm suggests applying the conventional RTO
347	   recovery if the receiver window or application limits the
348	   transmission of new, previously unsent data, and in such a case it is
349	   possible that the F-RTO algorithm cannot be used to detect a spurious
350	   RTO. The F-RTO sender can avoid the need to transmit new,
351	   previously unsent segments after the RTO if it has TCP timestamps
352	   [BBJ92] available. The Eifel detection algorithm [LK00] describes how
353	   the TCP timestamps can be used to avoid unnecessary retransmissions
354	   after a spurious RTO. However, if the RTO is declared spurious based
355	   on the timestamp echoed with the first acceptable ACK following the
356	   RTO, the TCP sender may falsely declare the RTO spurious and continue
357	   by transmitting new data when the RTO was caused by loss of
358	   acknowledgements. The Eifel algorithm may signal a spurious RTO
359	   falsely if the first data segment retransmitted after the RTO was not
360	   lost, but the corresponding acknowledgement was, and the
361	   acknowledgement does not include the DSACK option [FMMP00]. If the
362	   sender and receiver implement DSACK, this problem can be avoided.

364	   An alternative algorithm for detecting spurious RTOs by using TCP
365	   timestamps without DSACK is described below. When TCP timestamps are
366	   available, the F-RTO sender MAY apply the following algorithm.

368	   1) When the RTO expires, retransmit the first unacknowledged segment
369	      and store the timestamp of the retransmitted segment in variable
370	      "RetransmitTS". Store the highest sequence number transmitted so
371	      far in variable "send_high".

373	   2) Wait until the first ACK that acknowledges previously
374	      unacknowledged data arrives at the sender. If duplicate ACKs
375	      arrive, they are processed normally while the sender stays in this
376	      step of the algorithm.

378	      a) if the timestamp echoed with the ACK is later than or equal to
379	         what is stored in "RetransmitTS", the TCP sender SHOULD revert
380	         to the conventional RTO recovery and it MUST NOT enter step 3
381	         of this algorithm. The sender should adjust the congestion
382	         window according to the standard congestion control rules.

384	      b) if the timestamp echoed with the first ACK is earlier than what
385	         is stored in "RetransmitTS", the TCP sender SHOULD transmit the
386	         first unacknowledged segment and enter step 3 of this
387	         algorithm.

389	   3) When the next acknowledgement arrives at the sender, it SHOULD
390	      apply one of the following branches of the algorithm.

392	      a) if the timestamp echoed with the ACK is later than or equal to
393	         what is stored in "RetransmitTS", or if the acknowledgement is a
394	         duplicate ACK, the TCP sender SHOULD revert to the conventional
395	         RTO recovery. The TCP sender MUST set the congestion window to
396	         no more than 2 * MSS.

398	      b) if the timestamp echoed with the ACK is earlier than what is
399	         stored in "RetransmitTS", the TCP sender SHOULD declare the RTO
400	         spurious. send_high SHOULD be set to the value of SND.UNA to
401	         cancel the NewReno bugfix, as described in Section 2.
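
   A minimal sketch of the timestamp-based steps is given below. The
   class name, action strings, and MSS value are hypothetical.
   Timestamps are compared as plain integers, so wraparound of the
   timestamp value is ignored, and send_high is modeled as the next
   unsent sequence number at the time of the RTO.

```python
from dataclasses import dataclass

MSS = 1460  # illustrative segment size in bytes (assumption)


@dataclass
class TimestampFrtoSender:
    snd_una: int
    snd_nxt: int
    cwnd: int = 10 * MSS
    send_high: int = 0
    retransmit_ts: int = 0   # "RetransmitTS" in the text
    step: int = 0            # 0: not in F-RTO; 2 or 3: awaiting that step's ACK

    def on_rto(self, now_ts: int) -> str:
        # Step 1: record the timestamp carried by the retransmission.
        self.retransmit_ts = now_ts
        self.send_high = self.snd_nxt
        self.step = 2
        return "retransmit_first"

    def on_ack(self, ack: int, tsecr: int) -> str:
        is_dup = ack <= self.snd_una
        self.snd_una = max(self.snd_una, ack)
        if self.step == 2:
            if is_dup:
                return "dupack"            # processed normally, stay in step 2
            if tsecr >= self.retransmit_ts:
                self.step = 0              # branch 2a
                return "conventional"
            self.step = 3                  # branch 2b
            return "retransmit_first"
        if self.step == 3:
            self.step = 0
            if is_dup or tsecr >= self.retransmit_ts:
                self.cwnd = min(self.cwnd, 2 * MSS)   # branch 3a
                return "conventional"
            self.send_high = self.snd_una  # branch 3b
            return "spurious"
        return "normal"
```

   Here the RTO is declared spurious only after two consecutive ACKs
   echo a timestamp older than RetransmitTS, which is what protects
   against the lost-acknowledgement case described above.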

403	   The drawback of this algorithm compared to the original Eifel
404	   detection is that the above-presented algorithm can make two
405	   unnecessary retransmissions instead of one. In addition, packet
406	   reordering, packet duplication, or packet loss for the next segment
407	   after the one that triggered the RTO may prevent the detection of a
408	   spurious RTO.  Therefore, it may be desirable to apply the basic F-
409	   RTO or the SACK-enhanced version of the F-RTO algorithm whenever the
410	   sender is able to transmit previously unsent data when the first ACK
411	   after RTO arrives. However, we believe the algorithm above
412	   effectively avoids false spurious RTO signals.

414	5.  Taking Actions after Detecting Spurious RTO

416	   Upon retransmission timeout, a conventional TCP sender assumes that
417	   outstanding segments are lost and starts retransmitting the
418	   unacknowledged segments. When the RTO is detected to be spurious, the
419	   TCP sender should not start retransmitting based on the RTO. For
420	   example, if the sender was in congestion avoidance, transmitting
421	   new, previously unsent segments, it should continue transmitting
422	   previously unsent segments after detecting a spurious RTO. In addition,
423	   it is suggested that the RTO estimation is reinitialized and the RTO
424	   timer is adjusted to a more conservative value in order to avoid
425	   subsequent spurious RTOs [LG02].

427	   Different approaches have been suggested for adjusting the congestion
428	   control state after a spurious RTO. This document does not recommend
429	   any of the alternatives below, but considers the response to spurious
430	   RTO as a subject of further research.

432	   1) Revert the congestion control parameters to the state before the
433	      RTO [LG02]. This appears to be a justified decision, because it is
434	      similar to the situation in which the RTO did not expire
435	      spuriously. However, we identified two concerns with this approach:
436	      First, some detection mechanisms, such as F-RTO or the Eifel
437	      Detection algorithm, do not notice the loss of the spurious
438	      retransmission, thus introducing a small risk of violation of the
439	      congestion control principles. Second, a spurious RTO indicates
440	      that some part of the network was unable to deliver packets for a
441	      while, which can be considered as a potential indication of
442	      congestion.

444	   2) Reduce ssthresh and congestion window when detecting a spurious
445	      RTO [SKR02]. For example, ssthresh and cwnd could be set to half
446	      of their earlier values, as done with the other congestion
447	      notification events. This alternative would be conservative enough
448	      considering the possibility of not detecting a packet loss of the
449	      RTO-triggered retransmission, but the TCP sender should avoid
450	      reducing the congestion window more than once in a round-trip
451	      time.

453	   3) Reset congestion window to one segment and proceed with slow
454	      start, once the pipe is assumed to be empty from earlier packets
455	      [SL02]. This would be a justified action to take if the spurious
456	      RTO is assumed to be caused by changes in network
457	      conditions, such as a change in the available bandwidth or a
458	      wireless handoff to another point in the network. The disadvantage
459	      of this alternative is that it is rather inefficient on network
460	      paths with high delay, and on the other hand, it may result in
461	      slow start overshoot.
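
   The three alternatives can be compared with a short sketch. The
   function and type names are hypothetical, "saved" stands for the
   congestion control state stored when the RTO expired, and the
   one-MSS floor in the second alternative is an assumption of this
   sketch. The timing of the third alternative (waiting until the pipe
   is assumed to be empty) is not modeled, and its ssthresh is left
   unchanged because the text does not specify it.

```python
from dataclasses import dataclass

MSS = 1460  # illustrative segment size in bytes (assumption)


@dataclass
class CcState:
    cwnd: int
    ssthresh: int


def revert(saved: CcState) -> CcState:
    """Alternative 1: restore the state from before the RTO."""
    return CcState(saved.cwnd, saved.ssthresh)


def halve(saved: CcState) -> CcState:
    """Alternative 2: set cwnd and ssthresh to half of their earlier values."""
    return CcState(max(saved.cwnd // 2, MSS), max(saved.ssthresh // 2, MSS))


def restart(saved: CcState) -> CcState:
    """Alternative 3: reset cwnd to one segment and slow start again."""
    return CcState(MSS, saved.ssthresh)


saved = CcState(cwnd=20 * MSS, ssthresh=16 * MSS)
options = {f.__name__: f(saved) for f in (revert, halve, restart)}
```

   For the made-up starting state above, the three alternatives leave
   the sender with a congestion window of 20, 10, and 1 segments
   respectively, which shows how widely the proposed responses differ
   in aggressiveness.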

463	6.  Security Considerations

465	   No additional security threats on TCP due to the F-RTO algorithm are
466	   known.

468	Acknowledgements
469	   We are grateful to Reiner Ludwig, Andrei Gurtov, Josh Blanton, Mark
470	   Allman, Sally Floyd, Yogesh Swami, and Mika Liljeberg for the
471	   discussion and feedback contributed to this text.

473	Normative References

475	   [APS99]   M. Allman, V. Paxson, and W. Stevens. TCP Congestion Con-
476	             trol. RFC 2581, April 1999.

478	   [MMFR96]  M. Mathis, J. Mahdavi, S. Floyd, and A. Romanow. TCP Selec-
479	             tive Acknowledgement Options. RFC 2018, October 1996.

481	   [PA00]    V. Paxson and M. Allman. Computing TCP's Retransmission
482	             Timer. RFC 2988, November 2000.

484	   [Pos81]   J. Postel. Transmission Control Protocol. RFC 793,
485	             September 1981.

487	Informative References

489	   [ABF01]   M. Allman, H. Balakrishnan, and S. Floyd. Enhancing TCP's
490	             Loss Recovery Using Limited Transmit. RFC 3042, January
491	             2001.

493	   [BA02]    E. Blanton and M. Allman. On Making TCP more Robust to
494	             Packet Reordering. ACM Computer Communication Review,
495	             32(1), January 2002.

497	   [BBJ92]   D. Borman, R. Braden, and V. Jacobson. TCP Extensions for
498	             High Performance. RFC 1323, May 1992.

500	   [FH99]    S. Floyd and T. Henderson. The NewReno Modification to
501	             TCP's Fast Recovery Algorithm. RFC 2582, April 1999.

503	   [FMMP00]  S. Floyd, J. Mahdavi, M. Mathis, and M. Podolsky. An
504	             Extension to the Selective Acknowledgement (SACK) Option
505	             to TCP. RFC 2883, July 2000.

507	   [GL02]    A. Gurtov and R. Ludwig. Evaluating the Eifel Algorithm for
508	             TCP in a GPRS Network. In Proc. of European Wireless,
509	             Florence, Italy, February 2002.

511	   [LG02]    R. Ludwig and A. Gurtov. The Eifel Response Algorithm for
512	             TCP. Internet draft "draft-ietf-tsvwg-tcp-eifel-
513	             response-02.txt".  December 2002. Work in progress.

515	   [LK00]    R. Ludwig and R.H. Katz. The Eifel Algorithm: Making TCP
516	             Robust Against Spurious Retransmissions. ACM Computer
517	             Communication Review, 30(1), January 2000.

519	   [SKR02]   P. Sarolahti, M. Kojo, and K. Raatikainen. F-RTO: A New
520	             Recovery Algorithm for TCP Retransmission Timeouts.
521	             University of Helsinki, Dept. of Computer Science. Series
522	             of Publications C, No. C-2002-07. February 2002. Available
523	             at: http://www.cs.helsinki.fi/research/iwtcp/papers/f-rto.ps

525	   [SL02]    Y. Swami and K. Le. DCLOR: De-correlated Loss Recovery
526	             using SACK option for spurious timeouts. Internet draft
527	             "draft-swami-tsvwg-tcp-dclor-00.txt". November 2002. Work
528	             in progress.

530	Appendix A: Scenarios

532	   This section discusses different scenarios where RTOs occur and how
533	   the basic F-RTO algorithm performs in those scenarios. The
534	   interesting scenarios are a sudden delay triggering RTO, loss of a
535	   retransmitted packet during fast recovery, link outage causing the
536	   loss of several packets, and packet reordering. A performance
537	   evaluation with a more thorough analysis on a real implementation of
538	   F-RTO is given in [SKR02].

540	A.1.  Sudden delay

542	   An unexpectedly long delay can trigger an RTO, whether it occurs on
543	   a single packet blocking the following packets or appears as
544	   increased RTTs for several successive packets. The example below
545	   illustrates the sequence of packets and acknowledgements seen by a
546	   TCP sender that follows the F-RTO algorithm when a sudden delay
547	   triggers an RTO but no packets are lost. For simplicity, delayed
548	   acknowledgements are not used in the example.

550	         ...                (cwnd = 6, ssthresh < 6, FlightSize = 5)
551	         1.  SEND(10)
552	         2.  ACK(6)
553	         3.  SEND(11)
554	         4.  <delay + RTO>  (set ssthresh <- 3)
555	         5.  SEND(6)
556	         6.  ACK(7)
557	         7.  SEND(12)
558	         8.  SEND(13)
559	         9.  ACK(8)         (set cwnd <- 3, FlightSize = 6)
560	         10. ACK(9)         (cwnd = 3,  FlightSize = 5)
561	         11. ACK(10)        (cwnd = 3,  FlightSize = 4)
562	         12. ACK(11)        (cwnd = 4,  FlightSize = 3)
563	         13. SEND(14)
564	         ...

566	   When a sudden delay long enough to trigger RTO occurs at step 4, the
567	   TCP sender retransmits the first unacknowledged segment (step 5).
568	   Because the next ACK advances the cumulative ACK point, the TCP
569	   sender continues by sending two new data segments (steps 7, 8) and
570	   adjusts cwnd to 3 MSS. Because the second acknowledgement arriving
571	   after the RTO also advances the cumulative ACK point, the TCP sender
572	   exits recovery and continues in congestion avoidance. From this
573	   point on, retransmissions are invoked either by fast retransmit or
574	   by the retransmission timer. Because the TCP sender reduces cwnd
575	   when receiving the first ACK after RTO and sends the two new data
576	   segments at steps 7 and 8, it has to wait until the FlightSize is
577	   reduced to the level of the congestion window before it can continue
578	   transmitting again at step 13.
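
   The FlightSize values annotated in the example can be rechecked
   with a short trace. This is only a bookkeeping sketch: the function
   name trace_flight_size and the event encoding are hypothetical,
   sizes are counted in whole segments, and ACK(n) is read as "the
   receiver next expects segment n", which matches segments 5..9 being
   outstanding before step 1.

```python
def trace_flight_size(snd_una, snd_nxt, events):
    """Return FlightSize (in segments) after each event.

    events is a list of ("SEND", n) or ("ACK", n) tuples, where ACK(n)
    means the receiver next expects segment n, as in the example.
    """
    sizes = []
    for kind, n in events:
        if kind == "SEND":
            # A retransmission (n < snd_nxt) adds nothing new to the flight.
            snd_nxt = max(snd_nxt, n + 1)
        elif kind == "ACK":
            # Only a cumulative advance removes segments from the flight.
            snd_una = max(snd_una, n)
        else:
            raise ValueError(f"unknown event kind: {kind}")
        sizes.append(snd_nxt - snd_una)
    return sizes


# Steps 1-12 of the example. The RTO at step 4 is left out because it
# neither sends nor acknowledges data by itself.
A1_EVENTS = [
    ("SEND", 10), ("ACK", 6), ("SEND", 11),
    ("SEND", 6), ("ACK", 7), ("SEND", 12), ("SEND", 13),
    ("ACK", 8), ("ACK", 9), ("ACK", 10), ("ACK", 11),
]
A1_FLIGHT = trace_flight_size(5, 10, A1_EVENTS)
```

   The last four entries of A1_FLIGHT correspond to steps 9 through
   12. FlightSize drops below cwnd only at step 12, which is why the
   next new segment is sent at step 13.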

580	A.2.  Loss of a retransmission

582	   If a retransmitted segment is lost, the only way to retransmit it
583	   again is to wait for the RTO to trigger the retransmission. Once the
584	   segment is successfully received, the receiver usually acknowledges
585	   several segments cumulatively. The example below shows a scenario
586	   where the retransmission of segment 6 is lost, as well as a later
587	   segment (segment 9) in the same window. The limited transmit [ABF01]
588	   and SACK TCP [MMFR96] enhancements are not in use in this example.

590	         ...                (cwnd = 6, ssthresh < 6, FlightSize = 5)
591	             <segment 6 lost>
592	         1.  SEND(10)
593	         2.  ACK(6)
594	         3.  SEND(11)
595	         4.  ACK(6)
596	         5.  ACK(6)
597	         6.  ACK(6)
598	         7.  SEND(6)        (set cwnd <- 6, set ssthresh <- 3)
599	             <segment 6 lost>
600	         8.  ACK(6)
601	         9.  <RTO>          (set ssthresh <- 2)
602	         10. SEND(6)
603	         11. ACK(9)
604	         12. SEND(12)
605	         13. SEND(13)
606	         14. ACK(9)         (set cwnd <- 3)
607	         15. SEND(9)
608	         16. SEND(10)
609	         17. SEND(11)
610	         18. ACK(11)
611	         ...

613	   In the example above, segment 6 is lost and the sender retransmits it
614	   after three duplicate ACKs in step 7. However, the retransmission is
615	   also lost, and the sender has to wait for the RTO to expire before
616	   retransmitting it again. Because the first ACK following the RTO
617	   advances the cumulative ACK point (step 11), the sender transmits two
618	   new segments. The second ACK in step 14 does not advance the
619	   cumulative ACK point, so the sender enters slow start, sets cwnd to
620	   3 * MSS, and retransmits the next three unacknowledged segments,
621	   as per the F-RTO algorithm description given in Section 2. After
622	   this, the receiver acknowledges all segments transmitted prior to
623	   entering recovery and the sender can continue transmitting new data
624	   in congestion avoidance.
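
   The decision that separates the examples in A.1 and A.2 can be
   summarized as a check on the first two ACKs that follow the RTO.
   The sketch below is a simplification under stated assumptions: the
   function name frto_outcome is hypothetical, segments are numbered
   as in the examples, and any further conditions of the algorithm
   description in Section 2 are not modeled.

```python
def frto_outcome(snd_una, first_ack, second_ack):
    """Classify an RTO from the first two ACKs that follow it.

    snd_una is the lowest unacknowledged segment when the RTO fires.
    ACK(n) means the receiver next expects segment n.
    """
    if first_ack <= snd_una:
        # Duplicate ACK right after the RTO-triggered retransmission.
        return "conventional recovery"
    if second_ack <= first_ack:
        # The ACK point stalled after the two new segments were sent.
        return "conventional recovery"
    return "spurious RTO"
```

   With the numbers from the examples, frto_outcome(6, 7, 8) describes
   A.1, where both ACKs advance the cumulative ACK point, and
   frto_outcome(6, 9, 9) describes A.2, where the second ACK is a
   duplicate.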

626	A.3.  Link outage

628	   A performance study shows that F-RTO performs similarly to the
629	   regular recovery when consecutive packets are lost both up- and
630	   downstream as a result of link outage, triggering an RTO [SKR02]. If
631	   the RTO was not spurious and some data was actually lost, one of
632	   the next two ACKs after the RTO does not advance the cumulative ACK
633	   point, because the basic F-RTO retransmits only one segment after
634	   the RTO. As a result, the F-RTO sender continues by retransmitting
635	   unacknowledged segments similarly to the conventional RTO
636	   recovery.

638	A.4.  Packet reordering

640	   Since F-RTO modifies the TCP sender behavior only after a
641	   retransmission timeout and it is intended to avoid unnecessary
642	   retransmits only after spurious RTO, we limit the discussion on the
643	   effects of packet reordering on F-RTO behavior to the cases where
644	   packet reordering occurs immediately after RTO. We consider a
645	   retransmission timeout due to packet reordering to be a very rare
646	   case, since reordering often triggers fast retransmit due to
647	   duplicate ACKs caused by out-of-order segments. Should packet
648	   reordering occur after an RTO, duplicate ACKs arrive at the sender,
649	   causing the F-RTO algorithm to retransmit in slow start as a regular
650	   RTO recovery would do. Although this might not be the correct
651	   action, it is similar to the behavior of the regular TCP, making
652	   F-RTO a safe modification also in the presence of reordering.

654	Authors' Addresses

656	   Pasi Sarolahti
657	   Nokia Research Center
658	   P.O. Box 407
659	   FIN-00045 NOKIA GROUP
660	   Finland

662	   Phone: +358 50 4876607
663	   EMail: pasi.sarolahti@nokia.com
664	   http://www.cs.helsinki.fi/u/sarolaht/

666	   Markku Kojo
667	   University of Helsinki
668	   Department of Computer Science
669	   P.O. Box 26
670	   FIN-00014 UNIVERSITY OF HELSINKI
671	   Finland

673	   Phone: +358 9 1914 4179
674	   EMail: markku.kojo@cs.helsinki.fi