idnits 2.17.1 

draft-ietf-tcpm-frto-01.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** It looks like you're using RFC 3978 boilerplate.  You should update this
     to the boilerplate described in the IETF Trust License Policy document
     (see https://trustee.ietf.org/license-info), which is required now.

  -- Found old boilerplate from RFC 3667, Section 5.1 on line 989.

  -- Found old boilerplate from RFC 3978, Section 5.5 on line 1003.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 1014.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 1021.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 1027.

  ** Found boilerplate matching RFC 3978, Section 5.4, paragraph 1 (on line
     995), which is fine, but *also* found old RFC 2026, Section 10.4C,
     paragraph 1 text on line 38.

  ** The document claims conformance with section 10 of RFC 2026, but uses
     some RFC 3978/3979 boilerplate.  As RFC 3978/3979 replaces section 10 of
     RFC 2026, you should not claim conformance with it if you have changed to
     using RFC 3978/3979 boilerplate.

  ** The document seems to lack an RFC 3978 Section 5.1 IPR Disclosure
     Acknowledgement -- however, there's a paragraph with a matching
     beginning. Boilerplate error?

  ** This document has an original RFC 3978 Section 5.4 Copyright Line,
     instead of the newer IETF Trust Copyright according to RFC 4748.

  ** This document has an original RFC 3978 Section 5.5 Disclaimer, instead
     of the newer disclaimer which includes the IETF Trust according to RFC
     4748.

  ** The document uses RFC 3667 boilerplate or RFC 3978-like boilerplate
     instead of verbatim RFC 3978 boilerplate.  After 6 May 2005, submission
     of drafts without verbatim RFC 3978 boilerplate is not accepted.

     The following non-3978 patterns matched text found in the document. 
     That text should be removed or replaced:

        By submitting this Internet-Draft, I certify that any applicable patent
        or other IPR claims of which I am aware have been disclosed, or
        will be disclosed, and any of which I become aware will be
        disclosed, in accordance with RFC 3668.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard

  == The page length should not exceed 58 lines per page, but there was 23
     longer pages, the longest (page 13) being 71 lines

  == It seems as if not all pages are separated by form feeds - found 0 form
     feeds but 23 pages


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The abstract seems to contain references ([RFC2119]), which it
     shouldn't.  Please replace those with straight textual mentions of the
     documents in question.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The document has an RFC 3978 Section 5.2(a) Derivative Works Limitation
     clause.

  == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not
     match the current year

  == The document seems to lack the recommended RFC 2119 boilerplate, even if
     it appears to use RFC 2119 keywords. 

     (The document does seem to have the reference to RFC 2119 which the
     ID-Checklist requires).
  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- Couldn't find a document date in the document -- date freshness check
     skipped.


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Missing Reference: 'RTO' is mentioned on line 849, but not defined

  == Missing Reference: 'SACK 8' is mentioned on line 858, but not defined

  == Unused Reference: 'GL03' is defined on line 603, but no explicit
     reference was found in the text

  == Unused Reference: 'Sar03' is defined on line 630, but no explicit
     reference was found in the text

  ** Obsolete normative reference: RFC 2581 (ref. 'APS99') (Obsoleted by RFC
     5681)

  ** Obsolete normative reference: RFC 3517 (ref. 'BAFW03') (Obsoleted by RFC
     6675)

  ** Obsolete normative reference: RFC 3782 (ref. 'FHG04') (Obsoleted by RFC
     6582)

  ** Obsolete normative reference: RFC 2988 (ref. 'PA00') (Obsoleted by RFC
     6298)

  ** Obsolete normative reference: RFC  793 (ref. 'Pos81') (Obsoleted by RFC
     9293)

  ** Obsolete normative reference: RFC 2960 (ref. 'Ste00') (Obsoleted by RFC
     4960)

  -- Obsolete informational reference (is this intentional?): RFC 1323 (ref.
     'BBJ92') (Obsoleted by RFC 7323)

  -- Obsolete informational reference (is this intentional?): RFC  896 (ref.
     'Nag84') (Obsoleted by RFC 7805)


     Summary: 14 errors (**), 0 flaws (~~), 10 warnings (==), 9 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

1	Internet Engineering Task Force                             P. Sarolahti
2	INTERNET DRAFT                                     Nokia Research Center
3	File: draft-ietf-tcpm-frto-01.txt                                M. Kojo
4	                                                  University of Helsinki
5	                                                              July, 2004
6	                                                  Expires: January, 2005

8	                   F-RTO: An Algorithm for Detecting
9	           Spurious Retransmission Timeouts with TCP and SCTP

11	Status of this Memo

13	   This document is an Internet-Draft and is in full conformance with
14	   all provisions of Section 10 of RFC 2026.

16	   Internet-Drafts are working documents of the Internet Engineering
17	   Task Force (IETF), its areas, and its working groups.  Note that
18	   other groups may also distribute working documents as
19	   Internet-Drafts.

21	   Internet-Drafts are draft documents valid for a maximum of six months
22	   and may be updated, replaced, or obsoleted by other documents at any
23	   time.  It is inappropriate to use Internet-Drafts as reference
24	   material or to cite them other than as "work in progress."

26	   The list of current Internet-Drafts can be accessed at
27	   http://www.ietf.org/ietf/1id-abstracts.txt

29	   The list of Internet-Draft Shadow Directories can be accessed at
30	   http://www.ietf.org/shadow.html.

32	   This document may not be modified, and derivative works of it may not
33	   be created, except to publish it as an RFC and to translate it into
34	   languages other than English.

36	Copyright Notice

38	   Copyright (C) The Internet Society (2004).  All Rights Reserved.

40	Abstract

42	   Spurious retransmission timeouts cause suboptimal TCP performance,
43	   because they often result in unnecessary retransmission of the last
44	   window of data. This document describes the F-RTO detection algorithm
45	   for detecting spurious TCP retransmission timeouts. F-RTO is a TCP
46	   sender-only algorithm that does not require any TCP options to
47	   operate. After retransmitting the first unacknowledged segment
48	   triggered by a timeout, the F-RTO algorithm at a TCP sender monitors
49	   the incoming acknowledgments to determine whether the timeout was
50	   spurious and to decide whether to send new segments or retransmit
51	   unacknowledged segments. The algorithm effectively helps to avoid
52	   additional unnecessary retransmissions and thereby improves TCP
53	   performance in case of a spurious timeout. The F-RTO algorithm can
54	   also be applied to SCTP.

56	Terminology

58	   The keywords MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD,
59	   SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL, when they appear in this
60	   document, are to be interpreted as described in [RFC2119].

62	Table of Contents

64	   1.  Introduction
65	   2.  F-RTO Algorithm
66	        2.1  The Algorithm
67	        2.2  Discussion
68	   3.  SACK-enhanced version of the F-RTO algorithm
69	   4.  Taking Actions after Detecting Spurious RTO
70	   5.  SCTP Considerations
71	   6.  Security Considerations
72	   7.  IANA Considerations
73	   8.  Acknowledgments
74	   9.  References
75	   Appendix A: Scenarios
76	   Appendix B: SACK-enhanced F-RTO and Fast Recovery
77	   Appendix C: Discussion on Window Limited Cases

79	1.  Introduction

81	   The Transmission Control Protocol (TCP) [Pos81] has two methods for
82	   triggering retransmissions.  First, the TCP sender relies on incoming
83	   duplicate ACKs, which indicate that the receiver is missing some of
84	   the data. After a required number of successive duplicate ACKs have
85	   arrived at the sender, it retransmits the first unacknowledged
86	   segment [APS99] and continues with a loss recovery algorithm such as
87	   NewReno [FHG04] or SACK-based loss recovery [BAFW03]. Second, the TCP
88	   sender maintains a retransmission timer which triggers retransmission
89	   of segments, if they have not been acknowledged before the
90	   retransmission timeout (RTO) expires. When the retransmission timeout
91	   occurs, the TCP sender enters the RTO recovery where the congestion
92	   window is initialized to one segment and unacknowledged segments are
93	   retransmitted using the slow-start algorithm. The retransmission
94	   timer is adjusted dynamically based on the measured round-trip times
95	   [PA00].

97	   It has been pointed out that the retransmission timer can expire
98	   spuriously and cause unnecessary retransmissions when no segments
99	   have been lost [LK00, GL02, LM03]. After a spurious retransmission
100	   timeout the late acknowledgments of the original segments arrive at
101	   the sender, usually triggering unnecessary retransmissions of a whole
102	   window of segments during the RTO recovery.  Furthermore, after a
103	   spurious retransmission timeout a conventional TCP sender increases
104	   the congestion window on each late acknowledgment in slow start,
105	   injecting a large number of data segments to the network within one
106	   round-trip time, thus violating the packet conservation principle
107	   [Jac88].

109	   There are a number of potential reasons for spurious retransmission
110	   timeouts. First, some mobile networking technologies involve sudden
111	   delay spikes on transmission because of actions taken during a
112	   hand-off.  Second, arrival of competing traffic, possibly with higher
113	   priority, on a low-bandwidth link or some other change in available
114	   bandwidth can cause a sudden increase of round-trip time which may
115	   trigger a spurious retransmission timeout. A persistently reliable
116	   link layer can also cause a sudden delay when a data frame and
117	   several retransmissions of it are lost for some reason. This document
118	   does not distinguish between the different causes of such a delay
119	   spike, but discusses the spurious retransmission timeouts caused by a
120	   delay spike in general.

122	   This document describes the F-RTO detection algorithm. It is based on
123	   the detection mechanism of the "Forward RTO-Recovery" (F-RTO)
124	   algorithm [SKR03] that is used for detecting spurious retransmission
125	   timeouts and thus avoiding unnecessary retransmissions following the
126	   retransmission timeout. When the timeout is not spurious, the F-RTO
127	   algorithm reverts to the conventional RTO recovery algorithm and
128	   therefore has similar behavior and performance.  In contrast to
129	   alternative algorithms proposed for detecting unnecessary
130	   retransmissions (Eifel [LK00], [LM03] and DSACK-based algorithms
131	   [BA04]), F-RTO does not require any TCP options for its operation,
132	   and it can be implemented by modifying only the TCP sender.  The
133	   Eifel algorithm uses TCP timestamps [BBJ92] for detecting a spurious
134	   timeout upon arrival of the first acknowledgment after the
135	   retransmission. The DSACK-based algorithms require that the TCP
136	   Selective Acknowledgment Option [MMFR96] with the DSACK extension
137	   [FMMP00] is in use. With DSACK, the TCP receiver can report if it has
138	   received a duplicate segment, making it possible for the sender to
139	   detect afterwards whether it has retransmitted segments
140	   unnecessarily. The F-RTO algorithm only attempts to detect and avoid
141	   unnecessary retransmissions after an RTO. Eifel and DSACK can also be
142	   used for detecting unnecessary retransmissions caused by other
143	   events, for example packet reordering.

145	   When an RTO expires, the F-RTO sender retransmits the first
146	   unacknowledged segment as usual [APS99]. Deviating from the normal
147	   operation after a timeout, it then tries to transmit new, previously
148	   unsent data for the first acknowledgment that arrives after the
149	   timeout, provided that the acknowledgment advances the window. If the
150	   second acknowledgment that arrives after the timeout also advances
151	   the window, i.e., acknowledges data that was not retransmitted, the
152	   F-RTO sender declares the timeout spurious and exits the RTO
153	   recovery. However, if either of these two acknowledgments is a
154	   duplicate ACK, there is not sufficient evidence of a spurious timeout;
155	   therefore the F-RTO sender retransmits the unacknowledged segments in
156	   slow start similarly to the traditional algorithm. With a
157	   SACK-enhanced version of the F-RTO algorithm, spurious timeouts may
158	   be detected even if duplicate ACKs arrive after an RTO
159	   retransmission.

161	   The F-RTO algorithm can also be applied to the Stream Control
162	   Transmission Protocol (SCTP) [Ste00], because SCTP's acknowledgment
163	   and packet retransmission concepts are similar to those of TCP. For
164	   convenience, this document mostly refers to TCP, but the algorithms
165	   and other discussion are valid for SCTP as well.

167	   This document is organized as follows. Section 2 describes the basic
168	   F-RTO algorithm. Section 3 outlines an optional enhancement to the
169	   F-RTO algorithm that takes advantage of the TCP SACK option.  Section
170	   4 discusses the possible actions to be taken after detecting a
171	   spurious RTO. Section 5 gives considerations on applying F-RTO with
172	   SCTP, and Section 6 discusses the security considerations.

174	2.  F-RTO Algorithm

176	   A timeout is considered spurious if it would have been avoided had
177	   the sender waited longer for an acknowledgment to arrive [LM03].
178	   F-RTO affects the TCP sender behavior only after a retransmission
179	   timeout; otherwise the TCP behavior remains the same.  When the RTO
180	   expires, the F-RTO algorithm monitors incoming acknowledgments and
181	   declares the timeout spurious if the TCP sender gets an acknowledgment
182	   for a segment that was not retransmitted due to the timeout. The actions
183	   taken in response to a spurious timeout are not specified in this
184	   document, but we discuss some alternatives in Section 4. This section
185	   introduces the algorithm and then discusses the different steps of
186	   the algorithm in more detail.

188	   Following the practice used with the Eifel Detection algorithm
189	   [LM03], we use the "SpuriousRecovery" variable to indicate whether
190	   the retransmission is declared spurious by the sender. This variable
191	   can be used as an input for a corresponding response algorithm. With
192	   F-RTO, the value of SpuriousRecovery can be either SPUR_TO,
193	   indicating a spurious retransmission timeout, or FALSE, when the
194	   timeout is not declared spurious, and the TCP sender should follow
195	   the conventional RTO recovery algorithm.

197	2.1.  The Algorithm

199	   A TCP sender MAY implement the basic F-RTO algorithm, and if it
200	   chooses to apply the algorithm, the following steps MUST be taken
201	   after the retransmission timer expires. If the sender implements some
202	   loss recovery algorithm other than Reno or NewReno [FHG04], the F-RTO
203	   algorithm SHOULD NOT be entered when an earlier fast recovery is
204	   underway.

206	   1) When the RTO expires, the TCP sender SHOULD retransmit the first
207	      unacknowledged segment and set SpuriousRecovery to FALSE.  Also,
208	      the TCP SHOULD store the highest sequence number transmitted so
209	      far in variable "recover".

211	   2) When the first acknowledgment after the RTO retransmission arrives
212	      at the sender, the sender chooses the following actions depending
213	      on whether the ACK advances the window or whether it is a
214	      duplicate ACK.

216	      a) If the acknowledgment is a duplicate ACK OR it acknowledges a
217	         sequence number equal to the value of "recover" OR it does not
218	         acknowledge all of the data that was retransmitted in step 1,
219	         the TCP sender MUST revert to the conventional RTO recovery and
220	         continue by retransmitting unacknowledged data in slow start.

222	         The TCP sender MUST NOT enter step 3 of this algorithm, and the
223	         SpuriousRecovery variable remains as FALSE.

225	      b) Else, if the acknowledgment advances the window AND it is below
226	         the value of "recover", the TCP sender SHOULD transmit up to
227	         two new (previously unsent) segments and enter step 3 of this
228	         algorithm. If the TCP sender does not have enough unsent data,
229	         it SHOULD send only one segment. In addition, the TCP sender
230	         MAY override the Nagle algorithm [Nag84] and immediately send a
231	         segment if needed.  Note that sending two segments in this step
232	         is allowed by TCP congestion control requirements [APS99]: An
233	         F-RTO TCP sender simply chooses different segments to transmit.

235	         If the TCP sender does not have any new data to send, or the
236	         advertised window prohibits new transmissions, the recommended
237	         action is to skip step 3 of this algorithm and continue with
238	         slow start retransmissions following the conventional RTO
239	         recovery algorithm. However, alternative ways of handling the
240	         window limited cases that could result in better performance
241	         are discussed in Appendix C.

243	   3) When the second acknowledgment after the RTO retransmission
244	      arrives at the sender, the TCP sender either declares the timeout
245	      spurious, or starts retransmitting the unacknowledged segments.

247	      a) If the acknowledgment is a duplicate ACK, the TCP sender MUST
248	         set the congestion window to no more than 3 * MSS, and continue
249	         with the slow start algorithm retransmitting unacknowledged
250	         segments. The congestion window can be set to 3 * MSS because
251	         two round-trip times have elapsed since the RTO, and a
252	         conventional TCP sender would have increased cwnd to 3 * MSS in
253	         that time. The sender leaves SpuriousRecovery set to FALSE.

255	      b) If the acknowledgment advances the window, i.e., it acknowledges
256	         data that was not retransmitted after the timeout, the TCP
257	         sender SHOULD declare the timeout spurious, set
258	         SpuriousRecovery to SPUR_TO, and set the value of the "recover"
259	         variable to SND.UNA, the oldest unacknowledged sequence number
260	         [Pos81].
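
The steps above can be sketched as a small state machine. This is only an illustrative sketch, not part of the specification: the class name FrtoSender, the Action values, the can_send_new flag and the rexmit_end field are hypothetical names introduced here. It also assumes that ACK numbers carry the next expected sequence number, that "recover" is stored as SND.NXT at the time of the timeout, and that sequence numbers do not wrap. Actual segment transmission is left to the caller.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    CONVENTIONAL_RECOVERY = "conventional RTO recovery in slow start"
    SEND_NEW_DATA = "send up to two new segments"
    DECLARE_SPURIOUS = "timeout declared spurious"


@dataclass
class FrtoSender:
    snd_una: int                      # oldest unacknowledged sequence number
    snd_nxt: int                      # next sequence number to be sent
    mss: int                          # segment size in bytes
    cwnd: int                         # congestion window in bytes
    step: int = 0                     # 0 means F-RTO is not active
    recover: int = 0
    rexmit_end: int = 0               # end of the segment retransmitted in step 1
    spurious_recovery: str = "FALSE"

    def on_rto(self) -> None:
        """Step 1: the caller retransmits the first unacknowledged segment."""
        self.spurious_recovery = "FALSE"
        self.recover = self.snd_nxt   # assumption: stored as SND.NXT
        self.rexmit_end = min(self.snd_una + self.mss, self.snd_nxt)
        self.step = 2

    def on_ack(self, ack: int, can_send_new: bool = True) -> Action:
        """Steps 2 and 3: classify the first two ACKs after the RTO."""
        if self.step not in (2, 3):
            raise RuntimeError("F-RTO is not active")
        is_duplicate = ack <= self.snd_una
        if self.step == 2:
            # Step 2a: duplicate ACK, ACK covering "recover", or an ACK that
            # does not cover all of the retransmitted data.
            if is_duplicate or ack >= self.recover or ack < self.rexmit_end:
                self.step = 0
                return Action.CONVENTIONAL_RECOVERY
            self.snd_una = ack
            # Step 2b, window-limited case: skip step 3.
            if not can_send_new:
                self.step = 0
                return Action.CONVENTIONAL_RECOVERY
            self.step = 3
            return Action.SEND_NEW_DATA
        # Step 3: the second acknowledgment decides the outcome.
        self.step = 0
        if is_duplicate:
            # Step 3a: cwnd is limited to no more than 3 * MSS.
            self.cwnd = min(self.cwnd, 3 * self.mss)
            return Action.CONVENTIONAL_RECOVERY
        # Step 3b: the ACK covers data that was not retransmitted.
        self.snd_una = ack
        self.spurious_recovery = "SPUR_TO"
        self.recover = self.snd_una
        return Action.DECLARE_SPURIOUS
```

A caller would invoke on_rto() when the retransmission timer expires and then pass the next two acknowledgment numbers to on_ack(), acting on the returned value.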

262	2.2.  Discussion

264	   The F-RTO sender takes cautious actions when it receives duplicate
265	   acknowledgments after a retransmission timeout. Since duplicate ACKs
266	   may indicate that segments have been lost, reliably detecting a
267	   spurious timeout is difficult due to the lack of additional
268	   information. Therefore, it is prudent to follow the conventional TCP
269	   recovery in those cases.

271	   If the first acknowledgment after the RTO retransmission covers the
272	   "recover" point at algorithm step (2a), there is not enough evidence
273	   that a non-retransmitted segment has arrived at the receiver after
274	   the timeout.  This is a common case when a fast retransmission is
275	   lost and it has been retransmitted again after an RTO, while the rest
276	   of the unacknowledged segments have successfully been delivered to
277	   the TCP receiver before the retransmission timeout. Therefore the
278	   timeout cannot be declared spurious in this case.

280	   If the first acknowledgment after the RTO retransmission does not
281	   acknowledge all of the data that was retransmitted in step 1, the TCP
282	   sender reverts to the conventional RTO recovery. Otherwise, a
283	   malicious receiver acknowledging partial segments could cause the
284	   sender to declare the timeout spurious in a case where data was lost.

286	   The TCP sender is allowed to send two new segments in algorithm
287	   branch (2b), because the conventional TCP sender would transmit two
288	   segments when the first new ACK arrives after the RTO retransmission.
289	   If sending new data is not possible in algorithm branch (2b), or the
290	   receiver window limits the transmission, the TCP sender has to send
291	   something in order to prevent the TCP transfer from stalling. If no
292	   segments were sent, the pipe between sender and receiver may run out
293	   of segments, and no further acknowledgments would arrive. In this
294	   case the recommendation is to revert to the conventional RTO recovery
295	   with slow start retransmissions, but Appendix C discusses some
296	   alternative solutions for window limited situations.
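
The segment counts used in branches (2b) and (3a) can be checked with a short worked example of conventional slow start after an RTO. This is a simplified sketch: the function name is hypothetical, each new ACK is assumed to acknowledge exactly one full segment, and only segments sent after the timeout are counted as outstanding.

```python
def conventional_slow_start(mss, new_acks):
    """Return a list of (cwnd, segments_released) per new ACK after an RTO."""
    cwnd = mss        # congestion window is one segment after the RTO
    in_flight = mss   # only the RTO retransmission is counted as outstanding
    trace = []
    for _ in range(new_acks):
        in_flight -= mss                      # the ACK covers one full segment
        cwnd += mss                           # slow start: one MSS per new ACK
        released = (cwnd - in_flight) // mss  # segments the window now allows
        in_flight += released * mss
        trace.append((cwnd, released))
    return trace
```

With two new ACKs, this model releases two segments on the first ACK, which is the allowance F-RTO spends on new data in branch (2b), and reaches a congestion window of 3 * MSS on the second ACK, which is the limit used in branch (3a).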

298	   If the retransmission timeout is declared spurious, the TCP sender
299	   sets the value of the "recover" variable to SND.UNA in order to allow
300	   fast retransmit [FHG04]. The "recover" variable was proposed for
301	   avoiding unnecessary multiple fast retransmits when RTO expires
302	   during fast recovery with NewReno TCP. As the sender retransmits no
303	   segments other than the one that triggered the timeout, the
304	   problem of unnecessary multiple fast retransmits [FHG04] cannot
305	   occur. Therefore, if there are three duplicate ACKs arriving at the
306	   sender after the timeout, they are likely to indicate a packet loss,
307	   hence fast retransmit should be used to allow efficient recovery. If
308	   there are not enough duplicate ACKs arriving at the sender after a
309	   packet loss, the retransmission timer expires another time and the
310	   sender enters step 1 of this algorithm.

312	   When the timeout is declared spurious, the TCP sender cannot detect
313	   whether the unnecessary RTO retransmission was lost. In principle the
314	   loss of the RTO retransmission should be taken as a congestion
315	   signal, and thus there is a small possibility that the F-RTO sender
316	   violates the congestion control rules, if it chooses to fully revert
317	   congestion control parameters after detecting a spurious timeout. The
318	   Eifel detection algorithm has a similar property, while the DSACK
319	   option can be used to detect whether the retransmitted segment was
320	   successfully delivered to the receiver.

322	   The F-RTO algorithm has a side-effect on the TCP round-trip time
323	   measurement. Because the TCP sender can avoid most of the unnecessary
324	   retransmissions after detecting a spurious timeout, the sender is
325	   able to take round-trip time samples on the delayed segments. If the
326	   regular RTO recovery was used without TCP timestamps, this would not
327	   be possible due to the retransmission ambiguity. As a result, the RTO
328	   is likely to have more accurate and larger values with F-RTO than
329	   with the regular TCP after a spurious timeout that was triggered due
330	   to delayed segments. We believe this is an advantage in the networks
331	   that are prone to delay spikes.
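
The effect on the RTO can be illustrated with the standard retransmission timer computation [PA00]. The sketch below is an assumption-laden illustration rather than part of F-RTO: the class name, the clock granularity of 0.1 seconds, and the sample values are hypothetical, while the smoothing constants (alpha = 1/8, beta = 1/4, K = 4), the 3 second initial RTO, and the 1 second minimum follow [PA00].

```python
class RtoEstimator:
    ALPHA = 1 / 8
    BETA = 1 / 4
    K = 4

    def __init__(self, granularity=0.1, min_rto=1.0):
        self.granularity = granularity  # assumed clock granularity, seconds
        self.min_rto = min_rto
        self.srtt = None
        self.rttvar = None
        self.rto = 3.0                  # initial RTO before any RTT sample

    def update(self, sample):
        """Feed one RTT sample (seconds) and return the new RTO."""
        if self.srtt is None:
            # First measurement.
            self.srtt = sample
            self.rttvar = sample / 2
        else:
            # RTTVAR is updated before SRTT, using the old SRTT value.
            self.rttvar = ((1 - self.BETA) * self.rttvar
                           + self.BETA * abs(self.srtt - sample))
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * sample
        self.rto = max(self.min_rto,
                       self.srtt + max(self.granularity, self.K * self.rttvar))
        return self.rto
```

If a sender that has been measuring round-trip times of about 0.2 seconds is able to take a sample on a segment delayed by a 3 second spike, the variance term grows sharply and the resulting RTO is well above its previous value. Without that sample, the estimator would not reflect the delay spike at all.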

333	   It is possible that the F-RTO algorithm does not always avoid
334	   unnecessary retransmissions after a spurious timeout. If packet
335	   reordering or packet duplication occurs on the segment that triggered
336	   the spurious timeout, the F-RTO algorithm may not detect the spurious
337	   timeout due to incoming duplicate ACKs. Additionally, if a spurious
338	   timeout occurs during fast recovery, the F-RTO algorithm often cannot
339	   detect the spurious timeout, because the segments transmitted before
340	   the fast recovery trigger duplicate ACKs.  However, we consider these
341	   cases relatively rare, and note that in cases where F-RTO fails to
342	   detect the spurious timeout, it retransmits the unacknowledged
343	   segments in slow start and thus performs similarly to the regular RTO
344	   recovery.

346	3.  SACK-enhanced version of the F-RTO algorithm

348	   This section describes an alternative version of the F-RTO algorithm
349	   that makes use of the TCP Selective Acknowledgment Option [MMFR96].
350	   By using the SACK option, the TCP sender can detect spurious timeouts
351	   in most of the cases when packet reordering or packet duplication is
352	   present. The difference from the basic F-RTO algorithm is that the
353	   sender may declare a timeout spurious even when duplicate ACKs follow
354	   the RTO, if the SACK blocks acknowledge new data that was not
355	   transmitted after the RTO retransmission.

357	   Given that the TCP Selective Acknowledgment Option [MMFR96] is
358	   enabled for a TCP connection, a TCP sender MAY implement the
359	   SACK-enhanced F-RTO algorithm. If the sender applies the
360	   SACK-enhanced F-RTO algorithm, it MUST follow the steps below.  This
361	   algorithm SHOULD NOT be applied if the TCP sender is already in loss
362	   recovery when a retransmission timeout occurs.  However, it should be
363	   possible to apply the principle of F-RTO, within certain limitations,
364	   also when a retransmission timeout occurs during an existing loss
365	   recovery. While this is a topic of further research, Appendix B
366	   briefly discusses the related issues.

368	   1) When the RTO expires, the TCP sender SHOULD retransmit the first
369	      unacknowledged segment and set SpuriousRecovery to FALSE. Variable
370	      "recover" is set to indicate the highest segment transmitted so
371	      far. Following the recommendation in the SACK specification [MMFR96],
372	      the SACK scoreboard SHOULD be reset.

374	   2) Wait until the acknowledgment for the data retransmitted due to
375	      the timeout arrives at the sender. If duplicate ACKs arrive before
376	      the cumulative acknowledgment for retransmitted data, adjust the
377	      scoreboard according to the incoming SACK information but stay in
378	      step 2 waiting for the next new acknowledgment. If the RTO expires
379	      again, go to step 1 of the algorithm.

381	      a) if a cumulative ACK acknowledges a sequence number equal to
382	         "recover", the TCP sender SHOULD revert to the conventional RTO
383	         recovery and MUST set the congestion window to no more than 2 *
384	         MSS, as a regular TCP would do. The sender MUST NOT enter
385	         step 3 of this algorithm.

387	      b) else, if a cumulative ACK acknowledges a sequence number
388	         smaller than "recover" but larger than SND.UNA, the TCP sender
389	         SHOULD transmit up to two new (previously unsent) segments and
390	         proceed to step 3. If the TCP sender is not able to transmit
391	         any previously unsent data due to receiver window limitation or
392	         because it does not have any new data to send, the recommended
393	         action is to not enter step 3 of this algorithm but continue
394	         with slow start retransmissions following the conventional RTO
395	         recovery algorithm.

397	         It is also possible to apply some of the alternatives for
398	         handling window limited cases discussed in Appendix C. In this
399	         case, the TCP sender should also follow the recommendations
400	         concerning acknowledgments of retransmitted segments given in
401	         Appendix B.

403	   3) The next acknowledgment arrives at the sender. Either a duplicate
404	      ACK or a new cumulative ACK advancing the window applies in this
405	      step.

407	      a) if the ACK acknowledges a sequence number above "recover",
408	         either in SACK blocks or as a cumulative ACK, the sender MUST
409	         set the congestion window to no more than 3 * MSS and proceed
410	         with the conventional RTO recovery, retransmitting unacknowledged
411	         segments. The sender SHOULD take this branch also when the
412	         acknowledgment is a duplicate ACK and it does not acknowledge
413	         any new, previously unacknowledged data below "recover" in the
414	         SACK blocks. The sender leaves SpuriousRecovery set to FALSE.

416	      b) if the ACK does not acknowledge sequence numbers above
417	         "recover" AND it acknowledges data that was not acknowledged
418	         earlier either with cumulative acknowledgment or using SACK
419	         blocks, the TCP sender SHOULD declare the timeout spurious and
420	         set SpuriousRecovery to SPUR_TO. The retransmission timeout can
421	         be declared spurious, because the segment acknowledged with
422	         this ACK was transmitted before the timeout.

424	   If there are unacknowledged holes between the received SACK blocks,
425	   those segments SHOULD be retransmitted similarly to the conventional
426	   SACK recovery algorithm [BAFW03].  If the algorithm exits with
427	   SpuriousRecovery set to SPUR_TO, "recover" SHOULD be set to SND.UNA,
428	   thus allowing fast recovery on incoming duplicate acknowledgments.
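
The decision made in step 3 of the SACK-enhanced algorithm can be sketched as a single function. This is a hypothetical illustration: the function and parameter names are not from this document, SACK blocks are modeled as half-open (left, right) sequence ranges, "recover" is taken to be the sequence number just past the highest data sent before the timeout, and a SACK block counts as new only if no previously recorded block fully contains it, which is a simplification of a real scoreboard.

```python
def sack_frto_step3(recover, snd_una, cum_ack, sack_blocks, known_sacked):
    """Return the SpuriousRecovery value chosen in step 3.

    sack_blocks:  (left, right) ranges carried by the incoming ACK.
    known_sacked: (left, right) ranges already recorded in the scoreboard.
    """
    # Branch 3a: the ACK acknowledges data above "recover", either
    # cumulatively or in a SACK block.
    above_recover = cum_ack > recover or any(
        right > recover for _, right in sack_blocks
    )
    if above_recover:
        return "FALSE"

    def is_new(block):
        left, right = block
        return not any(
            k_left <= left and right <= k_right
            for k_left, k_right in known_sacked
        )

    # Branch 3b needs previously unacknowledged data, reported either by
    # the cumulative ACK or by a SACK block. Otherwise branch 3a applies.
    acks_new_data = cum_ack > snd_una or any(is_new(b) for b in sack_blocks)
    return "SPUR_TO" if acks_new_data else "FALSE"
```

When the function returns "FALSE", the sender would also limit the congestion window to no more than 3 * MSS and continue with the conventional RTO recovery, as branch (3a) requires.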

430	4.  Taking Actions after Detecting Spurious RTO

432	   Upon retransmission timeout, a conventional TCP sender assumes that
433	   outstanding segments are lost and starts retransmitting the
434	   unacknowledged segments. When the retransmission timeout is detected
435	   to be spurious, the TCP sender should not continue retransmitting
436	   based on the timeout. For example, if the sender was in the
437	   congestion avoidance phase transmitting new, previously unsent
438	   segments, it should continue transmitting previously unsent segments
439	   after detecting a spurious RTO. This document does not describe the
440	   response to a spurious timeout, but a response algorithm is described
441	   in another IETF document [LG04].

443	   Additionally, different response variants to spurious retransmission
444	   timeout have been discussed in various research papers [SKR03, GL03,
445	   Sar03] and Internet-Drafts [SL03]. The different response
446	   alternatives vary in whether the spurious retransmission timeout
447	   should be taken as a congestion signal, thus causing the congestion
448	   window or slow start threshold to be reduced at the sender, or
449	   whether the congestion control state should be fully reverted to the
450	   state valid prior to the retransmission timeout.

452	5.  SCTP Considerations

454	   SCTP has similar retransmission algorithms and congestion control to
455	   TCP. The SCTP T3-rtx timer for one destination address is maintained
456	   in the same way as the TCP retransmission timer, and after a T3-rtx
457	   expires, an SCTP sender retransmits unacknowledged data chunks in
458	   slow start like TCP does.  Therefore, SCTP is vulnerable to the
459	   negative effects of spurious retransmission timeouts, as TCP is.
460	   Because the RTO recovery algorithms are similar, the F-RTO algorithm
461	   logic can also be applied to SCTP. Since SCTP uses selective
462	   acknowledgments, the SACK-based variant of the algorithm is
463	   recommended, although the basic version can also be applied to SCTP.
464	   However, SCTP contains features that are not present in TCP and that
465	   need to be discussed when applying the F-RTO algorithm.

467	   SCTP associations can be multi-homed. The current retransmission
468	   policy states that retransmissions should go to alternative
469	   addresses. If the retransmission was due to spurious timeout caused
470	   by a delay spike, it is possible that the acknowledgment for the
471	   retransmission arrives back at the sender before the acknowledgments
472	   of the original transmissions arrive. If this happens, a possible
473	   loss of the original transmission of the data chunk that was
474	   retransmitted due to the spurious timeout may remain undetected when
475	   applying the F-RTO algorithm.  Because the timeout was caused by a
476	   delay spike, and it was spurious in that respect, a suitable response
477	   is to continue by sending new data. However, if the original
478	   transmission was lost, fully reverting the congestion control
479	   parameters is too aggressive. Therefore, taking conservative actions
480	   on congestion control is recommended if the SCTP association is
481	   multi-homed and retransmissions go to an alternative address. The
482	   information in duplicate TSNs can then be used for reverting
483	   congestion control, if desired [BA04].

485	   Note that the forward transmissions made in F-RTO algorithm step (2b)
486	   should be destined to the primary address, since they are not
487	   retransmissions.
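
   The address selection and the conservative response discussed above
   might be sketched as follows. The function names and the choice of
   the first alternate address are assumptions made for illustration;
   the addresses in the tests of such a sketch would be arbitrary.

```python
from typing import Sequence


def choose_destination(is_retransmission: bool, primary: str,
                       alternates: Sequence[str]) -> str:
    """Pick the destination address for an outgoing data chunk."""
    if is_retransmission and alternates:
        # Retransmissions go to an alternative address. Taking the
        # first one is an assumption of this sketch.
        return alternates[0]
    # New data, including the forward transmissions of F-RTO step
    # (2b), is destined to the primary address.
    return primary


def may_fully_revert(multi_homed: bool, rtx_to_alternate: bool) -> bool:
    """Return False when a conservative response is recommended."""
    return not (multi_homed and rtx_to_alternate)
```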

489	   When making a retransmission, an SCTP sender can bundle a number of
490	   unacknowledged data chunks and include them in the same packet. This
491	   needs to be considered when implementing F-RTO for SCTP. The basic
492	   principle of F-RTO still holds: in order to declare the timeout
493	   spurious, the sender must get an acknowledgment for a data chunk that
494	   was not retransmitted after the retransmission timeout. In other
495	   words, acknowledgments of data chunks that were bundled in the RTO
496	   retransmission must not be used for declaring the timeout spurious.
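
   A minimal sketch of this rule is given below. The function and
   parameter names are hypothetical. The sketch additionally assumes
   that only chunks outstanding at the time of the timeout count as
   evidence, and it ignores TSN wraparound for brevity.

```python
from typing import Iterable, Set


def ack_shows_spurious_timeout(newly_acked_tsns: Iterable[int],
                               rto_retransmitted_tsns: Set[int],
                               highest_tsn_at_rto: int) -> bool:
    """True if some newly acknowledged chunk was sent before the
    timeout and was not bundled in the RTO retransmission."""
    return any(
        tsn <= highest_tsn_at_rto and tsn not in rto_retransmitted_tsns
        for tsn in newly_acked_tsns
    )
```

   For example, if chunks 6 and 7 were bundled in the RTO
   retransmission, an acknowledgment covering only 6 and 7 does not
   qualify, whereas one that also covers chunk 8 does.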

498	6.  Security Considerations

500	   The main security threat regarding F-RTO is the possibility of a
501	   receiver misleading the sender to set too large a congestion window
502	   after an RTO.  There are two possible ways a malicious receiver could
503	   trigger a wrong output from the F-RTO algorithm. First, the receiver
504	   can acknowledge data that it has not received. Second, it can delay
505	   acknowledgment of a segment it has received earlier, and acknowledge
506	   the segment after the TCP sender has been deluded into entering
507	   algorithm step 3.

509	   If the receiver acknowledges a segment it has not really received,
510	   the sender can be led to declare spurious timeout in F-RTO algorithm
511	   step 3. However, since this causes the sender to have incorrect
512	   state, it cannot retransmit the segment that has never reached the
513	   receiver. Therefore, this attack is unlikely to be useful for the
514	   receiver to maliciously gain a larger congestion window.

516	   A common case for a retransmission timeout is that a fast
517	   retransmission of a segment is lost. If all other segments have been
518	   received, the RTO retransmission causes the whole window to be
519	   acknowledged at once. This case is recognized in F-RTO algorithm
520	   branch (2a). However, if the receiver only acknowledges one segment
521	   after receiving the RTO retransmission, and then the rest of the
522	   segments, it could cause the timeout to be declared spurious when it
523	   is not. Therefore, it is suggested that when an RTO expires during
524	   fast recovery phase, the sender would not fully revert the congestion
525	   window even if the timeout was declared spurious, but reduce the
526	   congestion window to 1. However, the sender can take actions to avoid
527	   unnecessary retransmissions normally. If a TCP sender implements a
528	   burst avoidance algorithm that limits the sending rate to be no
529	   higher than in slow start, this precaution is not needed, and the
530	   sender may apply F-RTO normally.
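
   The precaution suggested above could be expressed as in the sketch
   below. The function name and its parameters are hypothetical, and
   the congestion window is counted in segments here, which is an
   assumption of the sketch.

```python
def cwnd_after_spurious_rto(reverted_cwnd: int,
                            rto_during_fast_recovery: bool,
                            has_burst_avoidance: bool) -> int:
    """Congestion window (in segments) to use once a timeout has been
    declared spurious.

    reverted_cwnd is the value the response algorithm would restore
    if no precaution were taken.
    """
    if rto_during_fast_recovery and not has_burst_avoidance:
        # Do not fully revert: a receiver could have staged its
        # acknowledgments to make the timeout look spurious.
        return 1
    return reverted_cwnd
```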

532	   If more than one segment is missing at the time when a
533	   retransmission timeout occurs, the receiver does not benefit from
534	   misleading the sender to declare a spurious timeout, because the
535	   sender would then have to go through another recovery period to
536	   retransmit the missing segments, usually after an RTO has elapsed.

538	7.  IANA Considerations

540	   This document has no actions for IANA.

542	8.  Acknowledgments

544	   We are grateful to Reiner Ludwig, Andrei Gurtov, Josh Blanton, Mark
545	   Allman, Sally Floyd, Yogesh Swami, Mika Liljeberg, Ivan Arias
546	   Rodriguez, Sourabh Ladha, Martin Duke, Motoharu Miyake, Ted Faber,
547	   Samu Kontinen, and Kostas Pentikousis for the discussion and feedback
548	   contributed to this text.

550	9.  References

552	Normative References

554	   [APS99]   M. Allman, V. Paxson, and W. Stevens. TCP Congestion
555	             Control. RFC 2581, April 1999.

557	   [BAFW03]  E. Blanton, M. Allman, K. Fall, and L. Wang. A Conservative
558	             Selective Acknowledgment (SACK)-based Loss Recovery
559	             Algorithm for TCP. RFC 3517, April 2003.

561	   [RFC2119] S. Bradner. Key words for use in RFCs to Indicate
562	             Requirement Levels. RFC 2119, March 1997.

564	   [FHG04]   S. Floyd, T. Henderson, and A. Gurtov. The NewReno
565	             Modification to TCP's Fast Recovery Algorithm. RFC 3782,
566	             April 2004.

568	   [MMFR96]  M. Mathis, J. Mahdavi, S. Floyd, and A. Romanow. TCP
569	             Selective Acknowledgment Options. RFC 2018, October 1996.

571	   [PA00]    V. Paxson and M. Allman. Computing TCP's Retransmission
572	             Timer. RFC 2988, November 2000.

574	   [Pos81]   J. Postel. Transmission Control Protocol. RFC 793,
575	             September 1981.

577	   [Ste00]   R. Stewart, et al. Stream Control Transmission Protocol.
578	             RFC 2960, October 2000.

580	Informative References

582	   [ABF01]   M. Allman, H. Balakrishnan, and S. Floyd. Enhancing TCP's
583	             Loss Recovery Using Limited Transmit. RFC 3042, January
584	             2001.

586	   [BA04]    E. Blanton and M. Allman. Using TCP Duplicate Selective
587	             Acknowledgment (DSACKs) and Stream Control Transmission
588	             Protocol (SCTP) Duplicate Transmission Sequence Numbers
589	             (TSNs) to Detect Spurious Retransmissions. RFC 3708,
590	             February 2004.

592	   [BBJ92]   D. Borman, R. Braden, and V. Jacobson. TCP Extensions for
593	             High Performance. RFC 1323, May 1992.

595	   [FMMP00]  S. Floyd, J. Mahdavi, M. Mathis, and M. Podolsky. An
596	             Extension to the Selective Acknowledgment (SACK) Option to
597	             TCP. RFC 2883, July 2000.

599	   [GL02]    A. Gurtov and R. Ludwig. Evaluating the Eifel Algorithm for
600	             TCP in a GPRS Network. In Proc. of European Wireless,
601	             Florence, Italy, February 2002.

603	   [GL03]    A. Gurtov and R. Ludwig. Responding to Spurious Timeouts in
604	             TCP. In Proceedings of IEEE INFOCOM 03, San Francisco, CA,
605	             USA, March 2003.

607	   [Jac88]   V. Jacobson. Congestion Avoidance and Control. In
608	             Proceedings of ACM SIGCOMM 88.

610	   [LG04]    R. Ludwig and A. Gurtov. The Eifel Response Algorithm for
611	             TCP. Internet draft
612	             "draft-ietf-tsvwg-tcp-eifel-response-05.txt".  March 2004.
613	             Work in progress.

615	   [LK00]    R. Ludwig and R.H. Katz. The Eifel Algorithm: Making TCP
616	             Robust Against Spurious Retransmissions. ACM SIGCOMM
617	             Computer Communication Review, 30(1), January 2000.

619	   [LM03]    R. Ludwig and M. Meyer. The Eifel Detection Algorithm for
620	             TCP. RFC 3522, April 2003.

622	   [Nag84]   J. Nagle. Congestion Control in IP/TCP Internetworks. RFC
623	             896, January 1984.

625	   [SKR03]   P. Sarolahti, M. Kojo, and K. Raatikainen. F-RTO: An
626	             Enhanced Recovery Algorithm for TCP Retransmission
627	             Timeouts. ACM SIGCOMM Computer Communication Review, 33(2),
628	             April 2003.

630	   [Sar03]   P. Sarolahti. Congestion Control on Spurious TCP
631	             Retransmission Timeouts. In Proceedings of IEEE Globecom
632	             2003, San Francisco, CA, USA. December 2003.

634	   [SL03]    Y. Swami and K. Le. DCLOR: De-correlated Loss Recovery
635	             using SACK option for spurious timeouts. Internet draft
636	             "draft-swami-tsvwg-tcp-dclor-02.txt". September 2003. Work
637	             in progress.

639	Appendix A: Scenarios

641	   This section discusses different scenarios where RTOs occur and how
642	   the basic F-RTO algorithm performs in those scenarios. The
643	   interesting scenarios are a sudden delay triggering retransmission
644	   timeout, loss of a retransmitted packet during fast recovery, link
645	   outage causing the loss of several packets, and packet reordering. A
646	   performance evaluation with a more thorough analysis on a real
647	   implementation of F-RTO is given in [SKR03].

649	A.1.  Sudden delay

651	   The main motivation behind the F-RTO algorithm is to improve TCP
652	   performance when a delay spike triggers a spurious retransmission
653	   timeout.  The example below illustrates the segments and
654	   acknowledgments transmitted by the TCP end hosts when a spurious
655	   timeout occurs, but no packets are lost. For simplicity, delayed
656	   acknowledgments are not used in the example. The example below
657	   applies the Eifel Response Algorithm [LG04] after detecting a
658	   spurious timeout.

660	         ...
661	          (cwnd = 6, ssthresh < 6, FlightSize = 6)
662	         1.          <---------------------------- ACK 5
663	         2.  SEND 10 ---------------------------->
664	          (cwnd = 6, ssthresh < 6, FlightSize = 6)
665	         3.          <---------------------------- ACK 6
666	         4.  SEND 11 ---------------------------->
667	          (cwnd = 6, ssthresh < 6, FlightSize = 6)
668	         5.                       |
669	                               [delay]
670	                                  |
671	             [RTO]
672	             [F-RTO step (1)]
673	         6.  SEND 6  ---------------------------->
674	          (cwnd = 6, ssthresh = 3, FlightSize = 6)
675	                     <earlier xmitted SEG 6>  --->
676	         7.          <---------------------------- ACK 7
677	             [F-RTO step (2b)]
678	         8.  SEND 12 ---------------------------->
679	         9.  SEND 13 ---------------------------->
680	          (cwnd = 7, ssthresh = 3, FlightSize = 7)
681	                     <earlier xmitted SEG 7>  --->
682	         10.         <---------------------------- ACK 8
683	             [F-RTO step (3b)]
684	             [SpuriousRecovery <- SPUR_TO]
685	           (cwnd = 7, ssthresh = 6, FlightSize = 6)
686	         11. SEND 14 ---------------------------->
687	           (cwnd = 7, ssthresh = 6, FlightSize = 7)
688	         12.         <---------------------------- ACK 9
689	         13. SEND 15 ---------------------------->
690	           (cwnd = 7, ssthresh = 6, FlightSize = 7)
691	         14.         <---------------------------- ACK 10
692	         15. SEND 16 ---------------------------->
693	           (cwnd = 7, ssthresh = 6, FlightSize = 7)

695	         ...

697	   When a sudden delay long enough to trigger timeout occurs at step 5,
698	   the TCP sender retransmits the first unacknowledged segment (step 6).
699	   The next ACK covers the RTO retransmission because the originally
700	   transmitted segment 6 arrived at the receiver, and the TCP sender
701	   continues by sending two new data segments (steps 8, 9). Note that on
702	   F-RTO steps (1) and (2b) congestion window and FlightSize are not yet
703	   reset, because in case of possible spurious timeout the segments sent
704	   before the timeout are still in the network. However, the sender
705	   should still be as aggressive as conventional TCP. Because the
706	   second acknowledgment arriving after the RTO retransmission
707	   acknowledges data that was not retransmitted due to timeout (step
708	   10), the TCP sender declares the timeout as spurious and continues by
709	   sending new data on next acknowledgments. The congestion control
710	   state is also reverted, as required by the Eifel Response Algorithm.
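
   The decision path of this trace can be replayed with the small
   sketch below. It is reconstructed from the step labels shown in the
   traces of this appendix and is not a substitute for the algorithm
   definition given earlier in the document. The function name and the
   returned strings are hypothetical. As in the traces, "ACK n" means
   that segment n is the next one expected by the receiver.

```python
from typing import Iterable


def classify_basic_frto(snd_una: int, highest_sent: int,
                        acks: Iterable[int]) -> str:
    """Classify the cumulative ACKs that follow an RTO retransmission.

    snd_una:      first unacknowledged segment when the RTO expires
    highest_sent: highest segment sent before the RTO expires
    """
    step = 2
    for ack in acks:
        if step == 2:
            if ack <= snd_una or ack > highest_sent:
                # Duplicate ACK, or the whole window is acknowledged
                # at once: branch (2a).
                return "2a: conventional RTO recovery"
            snd_una = ack      # branch (2b): send two new segments
            step = 3
        else:
            if ack <= snd_una:
                return "3a: conventional RTO recovery"
            return "3b: timeout spurious"
    return "undecided"
```

   For the trace above, classify_basic_frto(6, 11, [7, 8]) follows
   branch (2b) on ACK 7 and reaches branch (3b) on ACK 8.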

712	A.2.  Loss of a retransmission

714	   If a retransmitted segment is lost, the only way to retransmit it
715	   again is to wait for the timeout to trigger the retransmission. Once
716	   the segment is successfully received, the receiver usually
717	   acknowledges several segments at once, because other segments in the
718	   same window have been successfully delivered before the
719	   retransmission arrives at the receiver. The example below shows a
720	   scenario where retransmission (of segment 6) is lost, as well as a
721	   later segment (segment 9) in the same window. The limited transmit
722	   [ABF01] or SACK TCP [MMFR96] enhancements are not in use in this
723	   example.

725	         ...
726	          (cwnd = 6, ssthresh < 6, FlightSize = 6)
727	             <segment 6 lost>
728	             <segment 9 lost>
729	         1.          <---------------------------- ACK 5
730	         2.  SEND 10 ---------------------------->
731	          (cwnd = 6, ssthresh < 6, FlightSize = 6)
732	         3.          <---------------------------- ACK 6
733	         4.  SEND 11 ---------------------------->
734	          (cwnd = 6, ssthresh < 6, FlightSize = 6)
735	         5.          <---------------------------- ACK 6
736	         6.          <---------------------------- ACK 6
737	         7.          <---------------------------- ACK 6
738	         8.  SEND 6  --------------X
739	          (cwnd = 6, ssthresh = 3, FlightSize = 6)
740	             <segment 6 lost>
741	         9.          <---------------------------- ACK 6
742	         10. SEND 12 ---------------------------->
743	          (cwnd = 7, ssthresh = 3, FlightSize = 7)
744	         11.         <---------------------------- ACK 6
745	         12. SEND 13 ---------------------------->
746	          (cwnd = 8, ssthresh = 3, FlightSize = 8)
747	             [RTO]
748	         13. SEND 6  ---------------------------->
749	          (cwnd = 8, ssthresh = 2, FlightSize = 8)
750	         14.         <---------------------------- ACK 9
751	             [F-RTO step (2b)]
752	         15. SEND 14 ---------------------------->
753	         16. SEND 15 ---------------------------->
754	          (cwnd = 7, ssthresh = 2, FlightSize = 7)
755	         17.         <---------------------------- ACK 9
756	             [F-RTO step (3a)]
757	             [SpuriousRecovery <- FALSE]
758	          (cwnd = 3, ssthresh = 2, FlightSize = 7)
759	         18. SEND 9  ---------------------------->
760	         19. SEND 10 ---------------------------->
761	         20. SEND 11 ---------------------------->
762	         ...

764	   In the example above, segment 6 is lost and the sender retransmits it
765	   after three duplicate ACKs in step 8. However, the retransmission is
766	   also lost, and the sender has to wait for the RTO to expire before
767	   retransmitting it again. Because the first ACK following the RTO
768	   retransmission acknowledges the RTO retransmission (step 14), the
769	   sender transmits two new segments. The second ACK in step 17 does not
770	   acknowledge any previously unacknowledged data. Therefore the F-RTO
771	   sender enters slow start and sets cwnd to 3 * MSS. The congestion
772	   window can be set to three segments, because two round-trips have
773	   elapsed after the retransmission timeout. After this the receiver
774	   acknowledges all segments transmitted prior to entering recovery and
775	   the sender can continue transmitting new data in congestion
776	   avoidance.

778	A.3.  Link outage

780	   The example below illustrates the F-RTO behavior when 4 consecutive
781	   packets are lost in the network causing the TCP sender to fall back
782	   to RTO recovery. Limited transmit and SACK are not used in this
783	   example.

785	         ...
786	          (cwnd = 6, ssthresh < 6, FlightSize = 6)
787	             <segments 6-9 lost>
788	         1.          <---------------------------- ACK 5
789	         2.  SEND 10 ---------------------------->
790	          (cwnd = 6, ssthresh < 6, FlightSize = 6)
791	         3.          <---------------------------- ACK 6
792	         4.  SEND 11 ---------------------------->
793	          (cwnd = 6, ssthresh < 6, FlightSize = 6)
794	         5.          <---------------------------- ACK 6
795	                                  |
796	                                  |
797	             [RTO]
798	         6.  SEND 6  ---------------------------->
799	          (cwnd = 6, ssthresh = 3, FlightSize = 6)
800	         7.          <---------------------------- ACK 7
801	             [F-RTO step (2b)]
802	         8.  SEND 12 ---------------------------->
803	         9.  SEND 13 ---------------------------->
804	          (cwnd = 7, ssthresh = 3, FlightSize = 7)
805	         10.         <---------------------------- ACK 7
806	             [F-RTO step (3a)]
807	             [SpuriousRecovery <- FALSE]
808	          (cwnd = 3, ssthresh = 3, FlightSize = 7)
809	         11. SEND 7  ---------------------------->
810	         12. SEND 8  ---------------------------->
811	         13. SEND 9  ---------------------------->

813	   Again, the F-RTO sender transmits two new segments (steps 8 and 9)
814	   after the RTO retransmission is acknowledged. Because the next ACK
815	   does not acknowledge any data that was not retransmitted after the
816	   retransmission timeout (step 10), the F-RTO sender proceeds with
817	   conventional recovery and slow start retransmissions.

819	A.4.  Packet reordering

821	   Since F-RTO modifies the TCP sender behavior only after a
822	   retransmission timeout and it is intended to avoid unnecessary
823	   retransmissions only after spurious timeout, we limit the discussion
824	   on the effects of packet reordering in F-RTO behavior to the cases
825	   where packet reordering occurs immediately after the retransmission
826	   timeout.  When the TCP receiver gets an out-of-order segment, it
827	   generates a duplicate ACK. If the TCP sender implements the basic
828	   F-RTO algorithm, this may prevent the sender from detecting a
829	   spurious timeout.

831	   However, if the TCP sender applies the SACK-enhanced F-RTO, it is
832	   possible to detect a spurious timeout also when packet reordering
833	   occurs. We illustrate the behavior of SACK-enhanced F-RTO below when
834	   segment 8 arrives before segments 6 and 7, and segments starting from
835	   segment 6 are delayed in the network. In this example the TCP sender
836	   reduces the congestion window and slow start threshold in response to
837	   spurious timeout.

839	         ...
840	          (cwnd = 6, ssthresh < 6, FlightSize = 6)
841	         1.          <---------------------------- ACK 5
842	         2.  SEND 10 ---------------------------->
843	          (cwnd = 6, ssthresh < 6, FlightSize = 6)
844	         3.          <---------------------------- ACK 6
845	         4.  SEND 11 ---------------------------->
846	         5.                       |
847	                               [delay]
848	                                  |
849	             [RTO]
850	         6.  SEND 6  ---------------------------->
851	          (cwnd = 6, ssthresh = 3, FlightSize = 6)
852	                     <earlier xmitted SEG 8>  --->
853	         7.          <---------------------------- ACK 6
854	                                                   [SACK 8]
855	             [SACK F-RTO stays in step 2]
856	         8.          <earlier xmitted SEG 6>  --->
857	         9.          <---------------------------- ACK 7
858	                                                   [SACK 8]
859	             [SACK F-RTO step (2b)]
860	         10. SEND 12 ---------------------------->
861	         11. SEND 13 ---------------------------->
862	           (cwnd = 7, ssthresh = 3, FlightSize = 7)
863	         12.         <earlier xmitted SEG 7>  --->
864	         13.         <---------------------------- ACK 9
865	             [SACK F-RTO step (3b)]
866	             [SpuriousRecovery <- SPUR_TO]
867	           (cwnd = 7, ssthresh = 6, FlightSize = 6)
868	         14. SEND 14 ---------------------------->
869	           (cwnd = 7, ssthresh = 6, FlightSize = 7)
870	         15.         <---------------------------- ACK 10
871	         16. SEND 15 ---------------------------->
872	         ...

874	   After the RTO expires and the sender retransmits segment 6 (step 6),
875	   the receiver gets segment 8 and generates a duplicate ACK with SACK
876	   for segment 8. In response to the acknowledgment the TCP sender does
877	   not send anything but stays in F-RTO step 2. Because the next
878	   acknowledgment advances the cumulative ACK point (step 9), the sender
879	   can transmit two new segments according to SACK-enhanced F-RTO. The
880	   next acknowledgment (step 13) acknowledges new data between 7 and 11
881	   that was not acknowledged earlier (segment 7), so the F-RTO sender
882	   declares the timeout spurious.
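
   The reordering trace can be replayed with the sketch below. It
   works on whole segment numbers rather than byte sequence numbers,
   and it is derived from the behavior described in this section, not
   from the normative algorithm text. The function name, the event
   format (cumulative ACK, set of SACKed segments), and the returned
   strings are hypothetical.

```python
from typing import Iterable, Set, Tuple


def replay_sack_frto(snd_una: int, recover: int,
                     acks: Iterable[Tuple[int, Set[int]]]) -> str:
    """Replay ACKs received after an RTO retransmission.

    snd_una: first unacknowledged segment when the RTO expires
    recover: highest segment sent before the RTO expires
    acks:    (cum_ack, sacked) pairs; cum_ack is the next segment
             expected by the receiver.
    """
    step = 2
    scoreboard: Set[int] = set()
    for cum_ack, sacked in acks:
        acked_now = set(range(snd_una, cum_ack)) | set(sacked)
        newly_acked = acked_now - scoreboard
        scoreboard |= acked_now
        if step == 2:
            if cum_ack <= snd_una:
                continue       # duplicate ACK: stay in step 2
            if cum_ack - 1 >= recover:
                return "not spurious (whole window acknowledged)"
            snd_una = cum_ack  # send two new segments, go to step 3
            step = 3
        else:
            if any(seg > recover for seg in acked_now) or not newly_acked:
                return "not spurious (3a)"
            return "spurious (3b)"
    return "undecided"
```

   With snd_una = 6, recover = 11, and the events (6, {8}), (7, {8}),
   (9, {}), the replay stays in step 2 on the first ACK, moves to step
   3 on the second, and declares the timeout spurious on the third
   because segment 7 is newly acknowledged.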

884	Appendix B: SACK-enhanced F-RTO and Fast Recovery
885	   We believe that a slightly modified SACK-enhanced F-RTO algorithm can
886	   be used to detect spurious timeouts also when RTO expires while an
887	   earlier loss recovery is underway. However, there are issues that
888	   need to be considered if F-RTO is applied in this case.

890	   The original SACK-based F-RTO requires in algorithm step 3 that an
891	   ACK acknowledges previously unacknowledged non-retransmitted data
892	   between SND.UNA and send_high. If RTO expires during earlier
893	   (SACK-based) loss recovery, the F-RTO sender must only use
894	   acknowledgments for non-retransmitted segments transmitted before the
895	   SACK-based loss recovery started. This means that in order to declare
896	   the timeout spurious the TCP sender must receive an acknowledgment
897	   for a non-retransmitted segment between SND.UNA and RecoveryPoint in
898	   algorithm step 3. RecoveryPoint is defined in the conservative
899	   SACK-recovery algorithm [BAFW03], and it is set to indicate the
900	   highest segment transmitted so far when SACK-based loss recovery
901	   begins. In other words, if the TCP sender receives an acknowledgment
902	   for a segment that was transmitted more than one RTO ago, it can
903	   declare the timeout spurious. Defining an efficient algorithm for
904	   checking these conditions remains a future work item.
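
   A naive form of this check is sketched below, only to make the
   condition concrete. It is not the efficient algorithm left as
   future work. The names are hypothetical, segments are identified by
   whole numbers, and treating both ends of the range as inclusive is
   an assumption of the sketch.

```python
from typing import Set


def may_declare_spurious(acked_segment: int, snd_una: int,
                         recovery_point: int,
                         retransmitted: Set[int]) -> bool:
    """True if the acknowledged segment was sent before the SACK-based
    loss recovery started and has never been retransmitted."""
    in_range = snd_una <= acked_segment <= recovery_point
    return in_range and acked_segment not in retransmitted
```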

906	   When spurious timeout is detected according to the rules given above,
907	   it may be possible that the response algorithm needs to consider this
908	   case separately, for example in terms of what segments to retransmit
909	   after RTO expires, and whether it is safe to revert the congestion
910	   control parameters in this case. This is considered as a topic of
911	   future research.

913	Appendix C: Discussion on Window Limited Cases

915	   When the advertised window limits the transmission of two new
916	   previously unsent segments, or there is no new data to send, it was
917	   recommended in F-RTO algorithm step (2b) that the TCP sender would
918	   continue with conventional RTO recovery algorithm. The disadvantage
919	   of doing this is that the sender may continue unnecessary
920	   retransmissions due to possible spurious timeout. This section
921	   briefly discusses the options that can potentially result in better
922	   performance when transmitting previously unsent data is not possible.

924	   - The TCP sender could reserve an unused space of a size of one or
925	     two segments in the advertised window to ensure the use of
926	     algorithms such as F-RTO or Limited Transmit [ABF01] in window
927	     limited situations. On the other hand, while doing this, the TCP
928	     sender should ensure that the window of outstanding segments is
929	     large enough to have a proper utilization of the available pipe.

931	   - Use additional information if available, e.g. TCP timestamps with
932	     the Eifel Detection algorithm, for detecting a spurious timeout.

934	     However, Eifel detection may yield different results from F-RTO
935	     when ACK losses and an RTO occur within the same round-trip time
936	     [SKR03].

938	   - Retransmit data from the tail of the retransmission queue and
939	     continue with step 3 of the F-RTO algorithm. It is possible that
940	     the retransmission is unnecessarily made, hence this option is not
941	     encouraged, except for hosts that are known to operate in an
942	     environment that is highly likely to have spurious timeouts. On the
943	     other hand, with this method it is possible to avoid several
944	     unnecessary retransmissions due to spurious timeout by doing only
945	     one retransmission that may be unnecessary.

947	   - Send a zero-sized segment below SND.UNA similar to TCP Keep-Alive
948	     probe and continue with step 3 of the F-RTO algorithm. Since the
949	     receiver replies with a duplicate ACK, the sender is able to detect
950	     from the incoming acknowledgment whether the timeout was spurious.
951	     While this method does not send data unnecessarily, it delays the
952	     recovery by one round-trip time in cases where the timeout was not
953	     spurious, and therefore is not encouraged.

955	   - In receiver-limited cases, send one octet of new data regardless of
956	     the advertised window limit, and continue with step 3 of the F-RTO
957	     algorithm. It is possible that the receiver has free buffer space
958	     to receive the data by the time the segment has propagated through
959	     the network, in which case no harm is done. If the receiver is not
960	     capable of receiving the segment, it rejects the segment and sends
961	     a duplicate ACK.
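
   As an illustration of the first option in the list above, the
   sketch below keeps a reserve of two segments in the advertised
   window during ordinary sending, so that room for new segments
   remains after a timeout. The function names and the byte-based
   accounting are assumptions of the sketch.

```python
def ordinary_send_limit(rwnd: int, mss: int,
                        reserve_segments: int = 2) -> int:
    """Bytes that ordinary transmission may keep outstanding while
    leaving reserve_segments full-sized segments unused."""
    return max(rwnd - reserve_segments * mss, 0)


def room_for_new_segments(rwnd: int, flight_size: int, mss: int) -> int:
    """Number of full-sized new segments the advertised window still
    allows, given the bytes currently outstanding."""
    return max(rwnd - flight_size, 0) // mss
```

   With an advertised window of ten segments, a sender that respects
   ordinary_send_limit() has at most eight segments outstanding when
   the RTO expires, so room_for_new_segments() reports two. The cost
   is a smaller window of outstanding data, as noted in the list.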

963	Authors' Addresses

965	   Pasi Sarolahti
966	   Nokia Research Center
967	   P.O. Box 407
968	   FIN-00045 NOKIA GROUP
969	   Finland

971	   Phone: +358 50 4876607
972	   EMail: pasi.sarolahti@nokia.com
973	   http://www.cs.helsinki.fi/u/sarolaht/

975	   Markku Kojo
976	   University of Helsinki
977	   Department of Computer Science
978	   P.O. Box 26
979	   FIN-00014 UNIVERSITY OF HELSINKI
980	   Finland
981	   Phone: +358 9 1914 4179
982	   EMail: markku.kojo@cs.helsinki.fi

984	IPR Disclosure Agreement

986	   By submitting this Internet-Draft, I certify that any applicable
987	   patent or other IPR claims of which I am aware have been disclosed,
988	   and any of which I become aware will be disclosed, in accordance with
989	   RFC 3668.

991	Full Copyright Statement

993	   Copyright (C) The Internet Society (2004).  This document is subject
994	   to the rights, licenses and restrictions contained in BCP 78, and
995	   except as set forth therein, the authors retain all their rights.

997	   This document and the information contained herein are provided on an
998	   "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS
999	   OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET
1000	   ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED,
1001	   INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE
1002	   INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
1003	   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

1005	Intellectual Property

1007	   The IETF takes no position regarding the validity or scope of any
1008	   Intellectual Property Rights or other rights that might be claimed to
1009	   pertain to the implementation or use of the technology described in
1010	   this document or the extent to which any license under such rights
1011	   might or might not be available; nor does it represent that it has
1012	   made any independent effort to identify any such rights.  Information
1013	   on the procedures with respect to rights in RFC documents can be
1014	   found in BCP 78 and BCP 79.

1016	   Copies of IPR disclosures made to the IETF Secretariat and any
1017	   assurances of licenses to be made available, or the result of an
1018	   attempt made to obtain a general license or permission for the use of
1019	   such proprietary rights by implementers or users of this
1020	   specification can be obtained from the IETF on-line IPR repository at
1021	   http://www.ietf.org/ipr.

1023	   The IETF invites any interested party to bring to its attention any
1024	   copyrights, patents or patent applications, or other proprietary
1025	   rights that may cover technology that may be required to implement
1026	   this standard.  Please address the information to the IETF at
1027	   ietf-ipr@ietf.org.

1029	Acknowledgement

1031	   Funding for the RFC Editor function is currently provided by the
1032	   Internet Society.