idnits 2.17.1 

draft-ietf-tcpm-rfc3782-bis-01.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** The document is more than 15 pages and seems to lack a Table of Contents.


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** There are 266 instances of too long lines in the document, the longest
     one being 20 characters in excess of 72.

  ** The abstract seems to contain references ([RFC5681], [RFC2883],
     [RFC3782], [Hoe95]), which it shouldn't.  Please replace those with
     straight textual mentions of the documents in question.

  -- The draft header indicates that this document obsoletes RFC3782, but the
     abstract doesn't seem to directly say this.  It does mention RFC3782
     though, so this could be OK.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == The document seems to lack the recommended RFC 2119 boilerplate, even if
     it appears to use RFC 2119 keywords. 

     (The document does seem to have the reference to RFC 2119 which the
     ID-Checklist requires).
  == The document seems to contain a disclaimer for pre-RFC5378 work, but was
     first submitted on or after 10 November 2008.  The disclaimer is usually
     necessary only for documents that revise or obsolete older RFCs, and that
     take significant amounts of text from those RFCs.  If you can contact all
     authors of the source material and they are willing to grant the BCP78
     rights to the IETF Trust, you can and should remove the disclaimer. 
     Otherwise, the disclaimer is needed and you can ignore this comment. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (March 14, 2011) is 4785 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  ** Obsolete normative reference: RFC 2988 (Obsoleted by RFC 6298)

  -- Obsolete informational reference (is this intentional?): RFC 2001 (ref.
     'F98') (Obsoleted by RFC 2581)

  -- Obsolete informational reference (is this intentional?): RFC 1323
     (Obsoleted by RFC 7323)

  -- Obsolete informational reference (is this intentional?): RFC 2582
     (Obsoleted by RFC 3782)

  -- Obsolete informational reference (is this intentional?): RFC 3782
     (Obsoleted by RFC 6582)


     Summary: 4 errors (**), 0 flaws (~~), 3 warnings (==), 6 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Network Working Group                                       T. Henderson
3	Internet-Draft                                                    Boeing
4	Obsoletes: 3782  (if approved)                                  S. Floyd
5	Intended status: Standards Track                                    ICSI
6	Expires:  September 15, 2011                                   A. Gurtov
7	                                                                    HIIT
8	                                                              Y. Nishida
9	                                                            WIDE Project
10	                                                          March 14, 2011

12	       The NewReno Modification to TCP's Fast Recovery Algorithm
13	                   draft-ietf-tcpm-rfc3782-bis-01.txt

15	Abstract

17	   RFC 5681 [RFC5681] documents the following four intertwined TCP
18	   congestion control algorithms: Slow Start, Congestion Avoidance, Fast
19	   Retransmit, and Fast Recovery.  RFC 5681 explicitly allows
20	   certain modifications of these algorithms, including modifications
21	   that use the TCP Selective Acknowledgement (SACK) option [RFC2883],
22	   and modifications that respond to "partial acknowledgments" (ACKs
23	   which cover new data, but not all the data outstanding when loss was
24	   detected) in the absence of SACK.  This document describes a specific
25	   algorithm for responding to partial acknowledgments, referred to as
26	   NewReno.  This response to partial acknowledgments was first proposed
27	   by Janey Hoe in [Hoe95].

29	   The purpose of this revision from [RFC3782] is to make errata changes
30	   and to adopt a proposal from Yoshifumi Nishida to slightly increase
31	   the minimum window size after Fast Recovery from one to two segments,
32	   to improve performance when the receiver uses delayed acknowledgments.

34	Status of this Memo

36	   This Internet-Draft is submitted to IETF in full conformance with the
37	   provisions of BCP 78 and BCP 79.

39	   Internet-Drafts are working documents of the Internet Engineering
40	   Task Force (IETF).  Note that other groups may also distribute
41	   working documents as Internet-Drafts.  The list of current Internet-
42	   Drafts is at http://datatracker.ietf.org/drafts/current/.

44	   Internet-Drafts are draft documents valid for a maximum of six months
45	   and may be updated, replaced, or obsoleted by other documents at any
46	   time.  It is inappropriate to use Internet-Drafts as reference
47	   material or to cite them other than as "work in progress."
48	   This Internet-Draft will expire on September 15, 2011.

50	Copyright Notice

52	   Copyright (c) 2011 IETF Trust and the persons identified as
53	   the document authors.  All rights reserved.

55	   This document is subject to BCP 78 and the IETF Trust's Legal
56	   Provisions Relating to IETF Documents
57	   (http://trustee.ietf.org/license-info) in effect on the date of
58	   publication of this document.  Please review these documents
59	   carefully, as they describe your rights and restrictions with respect
60	   to this document.  Code Components extracted from this document must
61	   include Simplified BSD License text as described in Section 4.e of
62	   the Trust Legal Provisions and are provided without warranty as
63	   described in the Simplified BSD License.

65	   This document may contain material from IETF Documents or IETF
66	   Contributions published or made publicly available before November
67	   10, 2008.  The person(s) controlling the copyright in some of this
68	   material may not have granted the IETF Trust the right to allow
69	   modifications of such material outside the IETF Standards Process.
70	   Without obtaining an adequate license from the person(s) controlling
71	   the copyright in such materials, this document may not be modified
72	   outside the IETF Standards Process, and derivative works of it may
73	   not be created outside the IETF Standards Process, except to format
74	   it for publication as an RFC or to translate it into languages other
75	   than English.

77	1.  Introduction

79	   For the typical implementation of the TCP Fast Recovery algorithm
80	   described in [RFC5681] (first implemented in the 1990 BSD Reno
81	   release, and referred to as the Reno algorithm in [FF96]), the TCP
82	   data sender only retransmits a packet after a retransmit timeout has
83	   occurred, or after three duplicate acknowledgments have arrived
84	   triggering the Fast Retransmit algorithm.  A single retransmit
85	   timeout might result in the retransmission of several data packets,
86	   but each invocation of the Fast Retransmit algorithm in RFC 5681
87	   leads to the retransmission of only a single data packet.

89	   Two problems arise with Reno TCP when multiple packet losses occur
90	   in a single window.  First, Reno will often take a timeout, as
91	   has been documented in [Hoe95].  Second, even if a retransmission
92	   timeout is avoided, multiple fast retransmits and window reductions
93	   can occur, as documented in [F94].  When multiple packet losses
94	   occur, if the SACK option [RFC2883] is available, the TCP sender
95	   has the information to make intelligent decisions about which packets
96	   to retransmit and which packets not to retransmit during Fast
97	   Recovery.  This document applies to TCP connections that are
98	   unable to use the TCP Selective Acknowledgement (SACK) option,
99	   either because the option is not locally supported or
100	   because the TCP peer did not indicate a willingness to use SACK.

102	   In the absence of SACK, there is little information available to the
103	   TCP sender in making retransmission decisions during Fast
104	   Recovery.  From the three duplicate acknowledgments, the sender
105	   infers a packet loss, and retransmits the indicated packet.  After
106	   this, the data sender could receive additional duplicate
107	   acknowledgments, as the data receiver acknowledges additional data
108	   packets that were already in flight when the sender entered Fast
109	   Retransmit.

111	   In the case of multiple packets dropped from a single window of data,
112	   the first new information available to the sender comes when the
113	   sender receives an acknowledgment for the retransmitted packet (that
114	   is, the packet retransmitted when Fast Retransmit was first
115	   entered).  If there is a single packet drop and no reordering, then the
116	   acknowledgment for this packet will acknowledge all of the packets
117	   transmitted before Fast Retransmit was entered.  However, if there
118	   are multiple packet drops, then the acknowledgment for the
119	   retransmitted packet will acknowledge some but not all of the packets
120	   transmitted before the Fast Retransmit.  We call this acknowledgment
121	   a partial acknowledgment.

123	   Along with several other suggestions, [Hoe95] suggested that during
124	   Fast Recovery the TCP data sender responds to a partial
125	   acknowledgment by inferring that the next in-sequence packet has been
126	   lost, and retransmitting that packet.  This document describes a
127	   modification to the Fast Recovery algorithm in RFC 5681 that
128	   incorporates a response to partial acknowledgments received during
129	   Fast Recovery.  We call this modified Fast Recovery algorithm
130	   NewReno, because it is a slight but significant variation of the
131	   basic Reno algorithm in RFC 5681.  This document does not discuss the
132	   other suggestions in [Hoe95] and [Hoe96], such as a change to the
133	   ssthresh parameter during Slow-Start, or the proposal to send a new
134	   packet for every two duplicate acknowledgments during Fast
135	   Recovery.  The version of NewReno in this document also draws on other
136	   discussions of NewReno in the literature [LM97, Hen98].

138	   We do not claim that the NewReno version of Fast Recovery described
139	   here is an optimal modification of Fast Recovery for responding to
140	   partial acknowledgments, for TCP connections that are unable to use
141	   SACK.  Based on our experiences with the NewReno modification in the
142	   NS simulator [NS] and with numerous implementations of NewReno, we
143	   believe that this modification improves the performance of the Fast
144	   Retransmit and Fast Recovery algorithms in a wide variety of
145	   scenarios.

147	2.  Terminology and Definitions

149	   In this document, the key words "MUST", "MUST NOT", "REQUIRED",
150	   "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY",
151	   and "OPTIONAL" are to be interpreted as described in BCP 14, RFC 2119
152	   [RFC2119].  This RFC indicates requirement levels for compliant TCP
153	   implementations implementing the NewReno Fast Retransmit and Fast
154	   Recovery algorithms described in this document.

156	   This document assumes that the reader is familiar with the terms
157	   SENDER MAXIMUM SEGMENT SIZE (SMSS), CONGESTION WINDOW (cwnd), and
158	   FLIGHT SIZE (FlightSize) defined in [RFC5681].  FLIGHT SIZE is
159	   defined as in [RFC5681] as follows:

161	      FLIGHT SIZE:
162	         The amount of data that has been sent but not yet cumulatively
163	         acknowledged.

165	3.  The Fast Retransmit and Fast Recovery Algorithms in NewReno

167	   The basic idea of these extensions to the Fast Retransmit and
168	   Fast Recovery algorithms described in [RFC5681] is as follows.
169	   The TCP sender can infer, from the arrival of duplicate
170	   acknowledgments, whether multiple losses in the same window of
171	   data have most likely occurred, and avoid taking a retransmit
172	   timeout or making multiple congestion window reductions due to
173	   such an event.

175	   The standard implementation of the Fast Retransmit and Fast Recovery
176	   algorithms is given in [RFC5681].  This section specifies the basic
177	   NewReno algorithm.  Section 4 describes heuristics for processing
178	   duplicate acknowledgments after a retransmission timeout.  Sections
179	   5 and 6 provide some guidance to implementors based on experience
180	   with NewReno implementations.  Several appendices provide more
181	   background information and describe variations that an implementor
182	   may want to consider when tuning performance for certain network
183	   scenarios.

185	   The NewReno modification applies to the Fast Recovery procedure that
186	   begins when three duplicate ACKs are received and ends when either a
187	   retransmission timeout occurs or an ACK arrives that acknowledges all
188	   of the data up to and including the data that was outstanding when
189	   the Fast Recovery procedure began.

191	   The NewReno algorithm specified in this document extends the
192	   implementation in [RFC5681] by introducing a variable specified as
193	   "recover" whose initial value is the initial send sequence number.
194	   This new variable is used by the sender to record the send sequence
195	   number that must be acknowledged before the Fast Recovery
196	   procedure is declared to be over.  This variable is used below
197	   in step 1, in the response to a partial or new
198	   acknowledgment in step 5, and in modifications to step 1 and the
199	   addition of step 6 for avoiding multiple Fast Retransmits caused by
200	   the retransmission of packets already received by the receiver.

202	   1)  Three duplicate ACKs:
203	       When the third duplicate ACK is received and the sender is not
204	       already in the Fast Recovery procedure, check to see if the
205	       Cumulative Acknowledgment field covers more than
206	       "recover".  If so, go to Step 1A.  Otherwise, go to Step 1B.

208	   1A) Invoking Fast Retransmit:
209	       If so, then set ssthresh to no more than the value given in
210	       equation 1 below.  (This is equation 4 from [RFC5681]).

212	         ssthresh = max (FlightSize / 2, 2*SMSS)           (1)

214	       In addition, record the highest sequence number transmitted in
215	       the variable "recover", and go to Step 2.

217	   1B) Not invoking Fast Retransmit:
218	       Do not enter the Fast Retransmit and Fast Recovery procedure.  In
219	       particular, do not change ssthresh, do not go to Step 2 to
220	       retransmit the "lost" segment, and do not execute Step 3 upon
221	       subsequent duplicate ACKs.

223	   2)  Entering Fast Retransmit:
224	       Retransmit the lost segment and set cwnd to ssthresh plus
225	       3*SMSS.  This artificially "inflates" the congestion window by the
226	       number of segments (three) that have left the network and the
227	       receiver has buffered.

229	   3)  Fast Recovery:
230	       For each additional duplicate ACK received while in Fast
231	       Recovery, increment cwnd by SMSS.  This artificially inflates
232	       the congestion window in order to reflect the additional segment
233	       that has left the network.

235	   4)  Fast Recovery, continued:
236	       Transmit a segment, if allowed by the new value of cwnd and the
237	       receiver's advertised window.

239	   5)  When an ACK arrives that acknowledges new data, this ACK could be
240	       the acknowledgment elicited by the retransmission from step 2, or
241	       elicited by a later retransmission.

243	       Full acknowledgments:
244	       If this ACK acknowledges all of the data up to and including
245	       "recover", then the ACK acknowledges all the intermediate
246	       segments sent between the original transmission of the lost
247	       segment and the receipt of the third duplicate ACK.  Set cwnd to
248	       either (1) min (ssthresh, max(FlightSize, SMSS) + SMSS) or
249	       (2) ssthresh, where ssthresh is the value set in step 1; this is
250	       termed "deflating" the window.  (We note that "FlightSize" in step 1
251	       referred to the amount of data outstanding in step 1, when Fast
252	       Recovery was entered, while "FlightSize" in step 5 refers to the
253	       amount of data outstanding in step 5, when Fast Recovery is
254	       exited.)  If the second option is selected, the implementation
255	       is encouraged to take measures to avoid a possible burst of
256	       data, in case the amount of data outstanding in the network is
257	       much less than the new congestion window allows.  A simple mechanism
258	       is to limit the number of data packets that can be sent in response
259	       to a single acknowledgment.  Exit the Fast Recovery procedure.

261	       Partial acknowledgments:
262	       If this ACK does *not* acknowledge all of the data up to and
263	       including "recover", then this is a partial ACK.  In this case,
264	       retransmit the first unacknowledged segment.  Deflate the
265	       congestion window by the amount of new data acknowledged by the
266	       cumulative acknowledgment field.  If the partial ACK
267	       acknowledges at least one SMSS of new data, then add back SMSS
268	       bytes to the congestion window.  As in Step 3, this artificially
269	       inflates the congestion window in order to reflect the additional
270	       segment that has left the network.  Send a new segment if
271	       permitted by the new value of cwnd.  This "partial window
272	       deflation" attempts to ensure that, when Fast Recovery eventually
273	       ends, approximately ssthresh amount of data will be outstanding
274	       in the network.  Do not exit the Fast Recovery procedure (i.e.,
275	       if any duplicate ACKs subsequently arrive, execute Steps 3 and
276	       4 above).

278	       For the first partial ACK that arrives during Fast Recovery, also
279	       reset the retransmit timer.  Timer management is discussed in
280	       more detail in Section 4.

282	   6)  Retransmit timeouts:
283	       After a retransmit timeout, record the highest sequence number
284	       transmitted in the variable "recover" and exit the Fast
285	       Recovery procedure if applicable.

287	   Step 1 specifies a check that the Cumulative Acknowledgment field
288	   covers more than "recover".  Because the acknowledgment field
289	   contains the sequence number that the sender next expects to receive,
290	   the acknowledgment "ack_number" covers more than "recover" when:

292	      ack_number - 1 > recover;

294	   i.e., at least one byte more of data is acknowledged beyond the
295	   highest byte that was outstanding when Fast Retransmit was last
296	   entered.

298	   Note that in Step 5, the congestion window is deflated after a
299	   partial acknowledgment is received.  The congestion window was
300	   likely to have been inflated considerably when the partial
301	   acknowledgment was received.  In addition, depending on the original
302	   pattern of packet losses, the partial acknowledgment might
303	   acknowledge nearly a window of data.  In this case, if the congestion
304	   window was not deflated, the data sender might be able to send nearly
305	   a window of data back-to-back.

307	   This document does not specify the sender's response to duplicate
308	   ACKs when the Fast Retransmit/Fast Recovery algorithm is not
309	   invoked.  This is addressed in other documents, such as those
310	   describing the Limited Transmit procedure [RFC3042].  This document
311	   also does not address issues of adjusting the duplicate acknowledgment
312	   threshold, but assumes the threshold specified in the IETF standards;
313	   the current standard is RFC 5681, which specifies a threshold of three
314	   duplicate acknowledgments.

316	   As a final note, we would observe that in the absence of the SACK
317	   option, the data sender is working from limited information.  When
318	   the issue of recovery from multiple dropped packets from a single
319	   window of data is of particular importance, the best alternative
320	   would be to use the SACK option.

322	4.  Handling Duplicate Acknowledgments After A Timeout

324	   After each retransmit timeout, the highest sequence number
325	   transmitted so far is recorded in the variable "recover".
326	   If, after a retransmit timeout, the TCP data sender retransmits three
327	   consecutive packets that have already been received by the data
328	   receiver, then the TCP data sender will receive three duplicate
329	   acknowledgments that do not cover more than "recover".  In this
330	   case, the duplicate acknowledgments are not an indication of a new
331	   instance of congestion.  They are simply an indication that the
332	   sender has unnecessarily retransmitted at least three packets.

334	   However, when a retransmitted packet is itself dropped, the sender
335	   can also receive three duplicate acknowledgments that do not cover
336	   more than "recover".  In this case, the sender would have been
337	   better off if it had initiated Fast Retransmit.  For a TCP that
338	   implements the algorithm specified in Section 3 of this document, the
339	   sender does not infer a packet drop from duplicate acknowledgments
340	   in this scenario.  As always, the retransmit timer is the backup
341	   mechanism for inferring packet loss in this case.

343	   There are several heuristics, based on timestamps or on the amount of
344	   advancement of the cumulative acknowledgment field, that allow the
345	   sender to distinguish, in some cases, between three duplicate
346	   acknowledgments following a retransmitted packet that was dropped,
347	   and three duplicate acknowledgments from the unnecessary
348	   retransmission of three packets [Gur03, GF04].  The TCP sender MAY use
349	   such a heuristic to decide to invoke a Fast Retransmit in some cases,
350	   even when the three duplicate acknowledgments do not cover more than
351	   "recover".

353	   For example, when three duplicate acknowledgments are caused by the
354	   unnecessary retransmission of three packets, this is likely to be
355	   accompanied by the cumulative acknowledgment field advancing by at
356	   least four segments.  Similarly, a heuristic based on timestamps uses
357	   the fact that when there is a hole in the sequence space, the
358	   timestamp echoed in the duplicate acknowledgment is the timestamp of
359	   the most recent data packet that advanced the cumulative
360	   acknowledgment field [RFC1323].  If timestamps are used, and the
361	   sender stores the timestamp of the last acknowledged segment, then
362	   the timestamp echoed by duplicate acknowledgments can be used to
363	   distinguish between a retransmitted packet that was dropped and
364	   three duplicate acknowledgments from the unnecessary
365	   retransmission of three packets.

367	4.1.  ACK Heuristic

369	   If the ACK-based heuristic is used, then following the advancement of
370	   the cumulative acknowledgment field, the sender stores the value of
371	   the previous cumulative acknowledgment as prev_highest_ack, and stores
372	   the latest cumulative ACK as highest_ack.  In addition, the following
373	   step is performed if Step 1 in Section 3 fails, before proceeding to
374	   Step 1B.

376	   1*)  If the Cumulative Acknowledgment field didn't cover more than
377	        "recover", check to see if the congestion window is greater
378	        than SMSS bytes and the difference between highest_ack and
379	        prev_highest_ack is at most 4*SMSS bytes.  If true, duplicate
380	        ACKs indicate a lost segment (proceed to Step 1A in Section
381	        3).  Otherwise, duplicate ACKs likely result from unnecessary
382	        retransmissions (proceed to Step 1B in Section 3).

384	   The congestion window check serves to protect against fast retransmit
385	   immediately after a retransmit timeout.

387	   If several ACKs are lost, the sender can see a jump in the cumulative
388	   ACK of more than three segments, and the heuristic can fail.
389	   RFC 5681 recommends that a receiver should
390	   send duplicate ACKs for every out-of-order data packet, such as a
391	   data packet received during Fast Recovery.  The ACK heuristic is more
392	   likely to fail if the receiver does not follow this advice, because
393	   then a smaller number of ACK losses are needed to produce a
394	   sufficient jump in the cumulative ACK.

396	4.2.  Timestamp Heuristic

398	   If this heuristic is used, the sender stores the timestamp of the
399	   last acknowledged segment.  In addition, the second paragraph of step
400	   1 in Section 3 is replaced as follows:

402	   1**) If the Cumulative Acknowledgment field didn't cover more than
403	        "recover", check to see if the echoed timestamp in the last
404	        non-duplicate acknowledgment equals the
405	        stored timestamp.  If true, duplicate ACKs indicate a lost
406	        segment (proceed to Step 1A in Section 3).  Otherwise, duplicate
407	        ACKs likely result from unnecessary retransmissions (proceed
408	        to Step 1B in Section 3).

410	   The timestamp heuristic works correctly, both when the receiver echoes
411	   timestamps as specified by [RFC1323], and by its revision attempts.
412	   However, if the receiver arbitrarily echoes timestamps, the heuristic
413	   can fail.  The heuristic can also fail if a timeout was spurious and
414	   returning ACKs are not from retransmitted segments.  This can be
415	   prevented by detection algorithms such as [RFC3522].

417	5.  Implementation Issues for the Data Receiver

419	   [RFC5681] specifies that "Out-of-order data segments SHOULD be
420	   acknowledged immediately, in order to accelerate loss recovery."
421	   Neal Cardwell has noted that some data receivers do not send an
422	   immediate acknowledgment when they send a partial acknowledgment,
423	   but instead wait first for their delayed acknowledgment timer to
424	   expire [C98].  As [C98] notes, this severely limits the potential
425	   benefit of NewReno by delaying the receipt of the partial
426	   acknowledgment at the data sender.  Echoing RFC 5681, our
427	   recommendation is that the data receiver send an immediate
428	   acknowledgment for an out-of-order segment, even when that
429	   out-of-order segment fills a hole in the buffer.

431	6.  Implementation Issues for the Data Sender

433	   In Section 3, Step 5 above, it is noted that implementations should
434	   take measures to avoid a possible burst of data when leaving Fast
435	   Recovery, in case the amount of new data that the sender is eligible
436	   to send due to the new value of the congestion window is large.  This
437	   can arise during NewReno when ACKs are lost or treated as pure window
438	   updates, thereby causing the sender to underestimate the number of
439	   new segments that can be sent during the recovery procedure.
440	   Specifically, bursts can occur when the FlightSize is much less than
441	   the new congestion window when exiting from Fast Recovery.  One
442	   simple mechanism to avoid a burst of data when leaving Fast Recovery
443	   is to limit the number of data packets that can be sent in response
444	   to a single acknowledgment.  (This is known as "maxburst_" in the ns
445	   simulator.)  Other possible mechanisms for avoiding bursts include
446	   rate-based pacing, or setting the slow-start threshold to the
447	   resultant congestion window and then resetting the congestion window
448	   to FlightSize.  A recommendation on the general mechanism to avoid
449	   excessively bursty sending patterns is outside the scope of this
450	   document.

452	   An implementation may want to use a separate flag to record whether
453	   or not it is presently in the Fast Recovery procedure.  The use of
454	   the value of the duplicate acknowledgment counter for this purpose is
455	   not reliable because it can be reset upon window updates and
456	   out-of-order acknowledgments.

458	   When updating the Cumulative Acknowledgment field outside of
459	   Fast Recovery, the "recover" state variable may also need to be
460	   updated in order to continue to permit possible entry into Fast
461	   Recovery (Section 3, step 1).  This issue arises when an update
462	   of the Cumulative Acknowledgment field results in a sequence
463	   wraparound that affects the ordering between the Cumulative
464	   Acknowledgment field and the "recover" state variable.  Entry
465	   into Fast Recovery is only possible when the Cumulative
466	   Acknowledgment field covers more than the "recover" state variable.

468	   It is important for the sender to respond correctly to duplicate ACKs
469	   received when the sender is no longer in Fast Recovery (e.g., because
470	   of a Retransmit Timeout).  The Limited Transmit procedure [RFC3042]
471	   describes possible responses to the first and second duplicate
472	   acknowledgments.  When three or more duplicate acknowledgments are
473	   received, the Cumulative Acknowledgment field doesn't cover more
474	   than "recover", and a new Fast Recovery is not invoked, it is
475	   important that the sender not execute the Fast Recovery steps (3) and
476	   (4) in Section 3.  Otherwise, the sender could end up in a chain of
477	   spurious timeouts.  We mention this only because several NewReno
478	   implementations had this bug, including the implementation in the NS
479	   simulator.

481	   It has been observed that some TCP implementations enter a slow start
482	   or congestion avoidance window updating algorithm immediately after
483	   the cwnd is set by the equation found in (Section 3, step 5), even
484	   without a new external event generating the cwnd change.  Note that
485	   after cwnd is set based on the procedure for exiting Fast Recovery
486	   (Section 3, step 5), cwnd SHOULD NOT be updated until a further
487	   event occurs (e.g., arrival of an ack, or timeout) after this
488	   adjustment.

490	7.  Security Considerations

492	   RFC 5681 discusses general security considerations concerning TCP
493	   congestion control.  This document describes a specific algorithm
494	   that conforms with the congestion control requirements of RFC 5681,
495	   and so those considerations apply to this algorithm, too.  There are
496	   no known additional security concerns for this specific algorithm.

498	8.  IANA Considerations

500	   This document has no actions for IANA.

502	9.  Conclusions

504	   This document specifies the NewReno Fast Retransmit and Fast Recovery
505	   algorithms for TCP.  This NewReno modification to TCP can even be
506	   important for TCP implementations that support the SACK option,
507	   because the SACK option can only be used for TCP connections when
508	   both TCP end-nodes support the SACK option.  NewReno performs better
509	   than Reno (RFC 5681) in a number of scenarios discussed herein.

511	   A number of options to the basic algorithm presented in Section 3 are
512	   also described in appendices to this document.  These include the
513	   handling of the retransmission timer (Appendix A), the response to
514	   partial acknowledgments (Appendix B), and whether or not the sender
515	   maintains a state variable called "recover" (Appendix C).
516	   Our belief is that the differences between these variants of NewReno
517	   are small compared to the differences between Reno and NewReno.
518	   That is, the important thing is to implement NewReno instead of Reno,
519	   for a TCP connection without SACK; it is less important exactly
520	   which of the variants of NewReno is implemented.

522	10.  Acknowledgments

524	   Many thanks to Anil Agarwal, Mark Allman, Armando Caro, Jeffrey Hsu,
525	   Vern Paxson, Kacheong Poon, Keyur Shah, and Bernie Volz for detailed
526	   feedback on this document or on its precursor, RFC 2582.  Jeffrey
527	   Hsu provided clarifications on the handling of the recover variable
528	   that were applied to RFC 3782 as errata, and now are in Section 8
529	   of this document.  Yoshifumi Nishida contributed a modification
530	   to the fast recovery algorithm to account for the case in which
531	   flightsize is 0 when the TCP sender leaves fast recovery, and the
532	   TCP receiver uses delayed acknowledgments.  Alexander Zimmermann
533	   provided several suggestions to improve the clarity of the document.

535	11.  References

537	11.1.  Normative References

539	   [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
540	             Requirement Levels", BCP 14, RFC 2119, March 1997.

542	   [RFC2988] Paxson, V. and M. Allman, "Computing TCP's Retransmission
543	             Timer", RFC 2988, November 2000.

545	   [RFC5681] Allman, M., Paxson, V. and  E. Blanton, "TCP Congestion
546	             Control", RFC 5681, September 2009.

548	11.2.  Informative References

550	   [C98]     Cardwell, N., "delayed ACKs for retransmitted packets: ouch!".
551	             November 1998,  Email to the tcpimpl mailing list, Message-ID
552	             "Pine.LNX.4.02A.9811021421340.26785-100000@sake.cs.washington.edu",
553	             archived at "http://tcp-impl.lerc.nasa.gov/tcp-impl".

555	   [F98]     Floyd, S., Revisions to RFC 2001, "Presentation to the TCPIMPL
556	             Working Group", August 1998.  URLs
557	             "ftp://ftp.ee.lbl.gov/talks/sf-tcpimpl-aug98.ps" and
558	             "ftp://ftp.ee.lbl.gov/talks/sf-tcpimpl-aug98.pdf".

560	   [F03]     Floyd, S., "Moving NewReno from Experimental to Proposed
561	             Standard?  Presentation to the TSVWG Working Group", March 2003.
562	             URLs "http://www.icir.org/floyd/talks/newreno-Mar03.ps" and
563	             "http://www.icir.org/floyd/talks/newreno-Mar03.pdf".

565	   [FF96]    Fall, K. and S. Floyd, "Simulation-based Comparisons of Tahoe,
566	             Reno and SACK TCP", Computer Communication Review, July 1996.  URL
567	             "ftp://ftp.ee.lbl.gov/papers/sacks.ps.Z".

569	   [F94]     Floyd, S., "TCP and Successive Fast Retransmits", Technical
570	             report, October 1994.  URL
571	             "ftp://ftp.ee.lbl.gov/papers/fastretrans.ps".

573	   [GF04]    Gurtov, A. and S. Floyd, "Resolving Acknowledgment Ambiguity
574	             in non-SACK TCP", Next Generation Teletraffic and
575	             Wired/Wireless Advanced Networking (NEW2AN'04), February
576	             2004.  URL "http://www.cs.helsinki.fi/u/gurtov/papers/
577	             heuristics.html".

579	   [Gur03]   Gurtov, A., "[Tsvwg] resolving the problem of unnecessary fast
580	             retransmits in go-back-N", email to the tsvwg mailing list, message
581	             ID <3F25B467.9020609@cs.helsinki.fi>, July 28, 2003.  URL
582	             "http://www1.ietf.org/mail-archive/working-groups/tsvwg/current/msg04334.html".

584	   [Hen98]   Henderson, T., Re: NewReno and the 2001 Revision. September
585	             1998.  Email to the tcpimpl mailing list, Message ID
586	             "Pine.BSI.3.95.980923224136.26134A-100000@raptor.CS.Berkeley.EDU",
587	             archived at "http://tcp-impl.lerc.nasa.gov/tcp-impl".

589	   [Hoe95]   Hoe, J., "Startup Dynamics of TCP's Congestion Control and
590	             Avoidance Schemes", Master's Thesis, MIT, 1995.

592	   [Hoe96]   Hoe, J., "Improving the Start-up Behavior of a Congestion
593	             Control Scheme for TCP", ACM SIGCOMM, August 1996.  URL
594	             "http://www.acm.org/sigcomm/sigcomm96/program.html".

596	   [LM97]    Lin, D. and R. Morris, "Dynamics of Random Early Detection",
597	             SIGCOMM 97, September 1997.  URL
598	             "http://www.acm.org/sigcomm/sigcomm97/program.html".

600	   [NS]      The Network Simulator (NS). URL "http://www.isi.edu/nsnam/ns/".

602	   [PF01]    Padhye, J. and S. Floyd, "Identifying the TCP Behavior of Web
603	             Servers", June 2001, SIGCOMM 2001.

605	   [RFC1323] Jacobson, V., Braden, R. and D. Borman, "TCP Extensions for
606	             High Performance", RFC 1323, May 1992.

608	   [RFC2582] Floyd, S. and T. Henderson, "The NewReno Modification to
609	             TCP's Fast Recovery Algorithm", RFC 2582, April 1999.

611	   [RFC2883] Floyd, S., J. Mahdavi, M. Mathis, and M. Podolsky, "The
612	             Selective Acknowledgment (SACK) Option for TCP, RFC 2883, July 2000.

614	   [RFC3042] Allman, M., Balakrishnan, H. and S. Floyd, "Enhancing TCP's
615	             Loss Recovery Using Limited Transmit", RFC 3042, January 2001.

617	   [RFC3522] Ludwig, R. and M. Meyer, "The Eifel Detection Algorithm for
618	             TCP", RFC 3522, April 2003.

620	   [RFC3782] Floyd, S., T. Henderson, and A. Gurtov, "The NewReno
621	             Modification to TCP's Fast Recovery Algorithm", RFC 3782, April 2004.

623	Appendix A.  Resetting the Retransmit Timer in Response to Partial
624	              Acknowledgments

626	             One possible variant to the response to partial acknowledgments
627	             specified in Section 3 concerns when to reset the retransmit timer
628	             after a partial acknowledgment.  The algorithm in Section 3, Step 5,
629	             resets the retransmit timer only after the first partial ACK.  In
630	             this case, if a large number of packets were dropped from a window of
631	             data, the TCP data sender's retransmit timer will ultimately expire,
632	             and the TCP data sender will invoke Slow-Start.  (This is illustrated
633	             on page 12 of [F98].)  We call this the Impatient variant of NewReno.
634	             We note that the Impatient variant in Section 3 doesn't follow the
635	             recommended algorithm in RFC 2988 of restarting the retransmit timer
636	             after every packet transmission or retransmission (step 5.1 of
637	             [RFC2988]).

639	             In contrast, the NewReno simulations in [FF96] illustrate the
640	             algorithm described above with the modification that the retransmit
641	             timer is reset after each partial acknowledgment.  We call this the
642	             Slow-but-Steady variant of NewReno.  In this case, for a window with
643	             a large number of packet drops, the TCP data sender retransmits at
644	             most one packet per roundtrip time.  (This behavior is illustrated in
645	             the New-Reno TCP simulation of Figure 5 in [FF96], and on page 11 of
646	             [F98]).

648	             When N packets have been dropped from a window of data for a large
649	             value of N, the Slow-but-Steady variant can remain in Fast Recovery
650	             for N round-trip times, retransmitting one more dropped packet each
651	             round-trip time; for these scenarios, the Impatient variant gives a
652	             faster recovery and better performance.
653	             The Impatient variant can be particularly important for TCP
654	             connections with large congestion windows.

656	             One can also construct scenarios where the Slow-but-Steady variant
657	             gives better performance than the Impatient variant.  As an example,
658	             this occurs when only a small number of packets are dropped, the RTO
659	             is sufficiently small that the retransmit timer expires, and
660	             performance would have been better without a retransmit timeout.

662	             The Slow-but-Steady variant can also achieve higher goodput than the
663	             Impatient variant, by avoiding unnecessary retransmissions.  This
664	             could be of special interest for cellular links, where every
665	             transmission costs battery power and money.  The
666	             Slow-but-Steady variant can also be more robust to delay variation in
667	             the network, where a delay spike might force the Impatient variant into
668	             a timeout and go-back-N recovery.

670	             Neither of the two variants discussed above are optimal.  Our
671	             recommendation is for the Impatient variant, as specified in Section
672	             3 of this document, because of the poor performance of the
673	             Slow-but-Steady variant for TCP connections with large congestion
674	             windows.

676	             One possibility for a more optimal algorithm would be one that
677	             recovered from multiple packet drops as quickly as does slow-start,
678	             while resetting the retransmit timers after each partial
679	             acknowledgment, as described in the section below.  We note,
680	             however, that there is a limitation to the potential performance in
681	             this case in the absence of the SACK option.

683	Appendix B.  Retransmissions after a Partial Acknowledgment

685	             One possible variant to the response to partial acknowledgments
686	             specified in Section 3 would be to retransmit more than one packet
687	             after each partial acknowledgment, and to reset the retransmit timer
688	             after each retransmission.  The algorithm specified in Section 3
689	             retransmits a single packet after each partial acknowledgment.  This
690	             is the most conservative alternative, in that it is the least likely
691	             to result in an unnecessarily-retransmitted packet.  A variant that
692	             would recover faster from a window with many packet drops would be to
693	             effectively Slow-Start, retransmitting two packets after each partial
694	             acknowledgment.  Such an approach would take less than N roundtrip
695	             times to recover from N losses [Hoe96].  However, in the absence of
696	             SACK, recovering as quickly as slow-start introduces the likelihood
697	             of unnecessarily retransmitting packets, and this could significantly
698	             complicate the recovery mechanisms.

700	             We note that the response to partial acknowledgments specified in
701	             Section 3 of this document and in RFC 2582 differs from the response
702	             in [FF96], even though both approaches only retransmit one packet in
703	             response to a partial acknowledgment.  Step 5 of Section 3 specifies
704	             that the TCP sender responds to a partial ACK by deflating the
705	             congestion window by the amount of new data acknowledged, adding
706	             back SMSS bytes if the partial ACK acknowledges at least SMSS bytes
707	             of new data, and sending a new segment if permitted by the new value
708	             of cwnd.  Thus, only one previously-sent packet is retransmitted in
709	             response to each partial acknowledgment, but additional new packets
710	             might be transmitted as well, depending on the amount of new data
711	             acknowledged by the partial acknowledgment.  In contrast, the
712	             variant of NewReno illustrated in [FF96] simply set the congestion
713	             window to ssthresh when a partial acknowledgment was received.  The
714	             approach in [FF96] is more conservative, and does not attempt to
715	             accurately track the actual number of outstanding packets after a
716	             partial acknowledgment is received.  While either of these
717	             approaches gives acceptable performance, the variant specified in
718	             Section 3 recovers more smoothly when multiple packets are dropped
719	             from a window of data.

721	Appendix C.  Avoiding Multiple Fast Retransmits

723	             This appendix describes the motivation for the sender's state
724	             variable "recover".

726	             In the absence of the SACK option or timestamps, a duplicate
727	             acknowledgment carries no information to identify the data packet or
728	             packets at the TCP data receiver that triggered that duplicate
729	             acknowledgment.  In this case, the TCP data sender is unable to
730	             distinguish between a duplicate acknowledgment that results from a
731	             lost or delayed data packet, and a duplicate acknowledgment that
732	             results from the sender's unnecessary retransmission of a data packet
733	             that had already been received at the TCP data receiver.  Because of
734	             this, with the Retransmit and Fast Recovery algorithms in Reno TCP,
735	             multiple segment losses from a single window of data can sometimes
736	             result in unnecessary multiple Fast Retransmits (and multiple
737	             reductions of the congestion window) [F94].

739	             With the Fast Retransmit and Fast Recovery algorithms in Reno TCP,
740	             the performance problems caused by multiple Fast Retransmits are
741	             relatively minor compared to the potential problems with Tahoe TCP,
742	             which does not implement Fast Recovery.  Nevertheless, unnecessary
743	             Fast Retransmits can occur with Reno TCP unless some explicit
744	             mechanism is added to avoid this, such as the use of the "recover"
745	             variable.  (This modification is called "bugfix" in [F98], and is
746	             illustrated on pages 7 and 9 of that document.  Unnecessary Fast
747	             Retransmits for Reno without "bugfix" is illustrated on page 6 of

749	             [F98].)

751	             Section 3 of [RFC2582] defined a default variant of NewReno TCP that
752	             did not use the variable "recover", and did not check if duplicate
753	             ACKs cover the variable "recover" before invoking Fast Retransmit.
754	             With this default variant from RFC 2582, the problem of multiple Fast
755	             Retransmits from a single window of data can occur after a Retransmit
756	             Timeout (as in page 8 of [F98]) or in scenarios with reordering.
757	             RFC 2582 also defined Careful and Less Careful variants of the NewReno
758	             algorithm, and recommended the Careful variant.

760	             The algorithm specified in Section 3 of this document corresponds to
761	             the Careful variant of NewReno TCP from RFC 2582, and eliminates the
762	             problem of multiple Fast Retransmits.  This algorithm uses the
763	             variable "recover", whose initial value is the initial send sequence
764	             number.  After each retransmit timeout, the highest sequence number
765	             transmitted so far is recorded in the variable "recover".

767	Appendix D.  Simulations

769	             This section provides pointers to simulation scripts available in
770	             the NS simulator that reproduce behavior described above.

772	             In Section 3, a simple mechanism is described to limit the number of
773	             data packets that can be sent in response to a single acknowledgment.
774	             This is known as "maxburst_" in the NS simulator.

776	             Simulations with NewReno are illustrated with the validation test
777	             "tcl/test/test-all-newreno" in the NS simulator.  The command
778	             "../../ns test-suite-newreno.tcl reno" shows a simulation with Reno
779	             TCP, illustrating the data sender's lack of response to a partial
780	             acknowledgment.  In contrast, the command "../../ns
781	             test-suite-newreno.tcl newreno_B" shows a simulation with the same
782	             scenario using the NewReno algorithms described in this paper.

784	             Regarding the handling of duplicate acknowledgments after a timeout,
785	             the congestion window check serves to protect against fast retransmit
786	             immediately after a retransmit timeout, similar to the
787	             "exitFastRetrans_" variable in NS.  Examples of applying the ACK
788	             heuristic (Section 4) are in validation tests "./test-all-newreno
789	             newreno_rto_loss_ack" and "./test-all-newreno newreno_rto_dup_ack" in
790	             directory "tcl/test" of the NS simulator.
791	             If several ACKs are lost, the sender can see a jump in the cumulative
792	             ACK of more than three segments, and the heuristic can fail.  A
793	             validation test for this scenario is "./test-all-newreno
794	             newreno_rto_loss_ackf".

796	             Examples of applying the timestamp heuristic (Section 4) are in
797	             validation tests "./test-all-newreno newreno_rto_loss_tsh" and
798	             "./test-all-newreno newreno_rto_dup_tsh".

800	             Section 6 described a problem involving possible spurious timeouts,
801	             and mentions that this bug existed in the NS simulator.
802	             This bug in the NS simulator was fixed in July 2003,
803	             with the variable "exitFastRetrans_".

805	             Regarding the Slow-but-Steady and Impatient variants described
806	             in Appendix A, The tests "ns
807	             test-suite-newreno.tcl impatient1" and "ns test-suite-newreno.tcl
808	             slow1" in the NS simulator illustrate a scenario in which the
809	             Impatient variant performs better than the Slow-but-Steady
810	             variant.  The Impatient variant can be particularly important for TCP
811	             connections with large congestion windows, as illustrated by the tests
812	             "ns test-suite-newreno.tcl impatient4" and "ns test-suite-newreno.tcl
813	             slow4" in the NS simulator.  The tests
814	             "ns test-suite-newreno.tcl impatient2" and
815	             "ns test-suite-newreno.tcl slow2" in the NS simulator illustrate
816	             scenarios in which the Slow-but-Steady variant outperforms the Impatient
817	             variant.  The tests "ns test-suite-newreno.tcl impatient3" and
818	             "ns test-suite-newreno.tcl slow3" in the NS simulator illustrate
819	             scenarios in which the Slow-but-Steady variants avoid unnecessary
820	             retransmissions.

822	             Appendix B describes different policies for partial window deflation.
823	             The [FF96] behavior can be seen in the NS
824	             simulator by setting the variable "partial_window_deflation_" for
825	             "Agent/TCP/Newreno" to 0; the behavior specified in Section 3 is
826	             achieved by setting "partial_window_deflation_" to 1.

828	             Section 3 of [RFC2582] defined a default variant of NewReno TCP that
829	             did not use the variable "recover", and did not check if duplicate
830	             ACKs cover the variable "recover" before invoking Fast Retransmit.
831	             With this default variant from RFC 2582, the problem of multiple Fast
832	             Retransmits from a single window of data can occur after a Retransmit
833	             Timeout (as in page 8 of [F98]) or in scenarios with reordering (as
834	             An NS validation test "./test-all-newreno newreno5_noBF" in
835	             directory "tcl/test" of the NS simulator illustartes the default
836	             variant of NewReno TCP that doesn't use the variable "recover";
837	             this gives performance similar to that on page 8 of [F03].

839	Appendix E.  Comparisons between Reno and NewReno TCP

841	             As we stated in the introduction, we believe that the NewReno
842	             modification described in this document improves the performance of
843	             the Fast Retransmit and Fast Recovery algorithms of Reno TCP in a
844	             wide variety of scenarios.  This has been discussed in some depth in

846	             [FF96], which illustrates Reno TCP's poor performance when multiple
847	             packets are dropped from a window of data and also illustrates
848	             NewReno TCP's good performance in that scenario.

850	             We do, however, know of one scenario where Reno TCP gives better
851	             performance than NewReno TCP, that we describe here for the sake of
852	             completeness.  Consider a scenario with no packet loss, but with
853	             sufficient reordering so that the TCP sender receives three duplicate
854	             acknowledgments.  This will trigger the Fast Retransmit and Fast
855	             Recovery algorithms.  With Reno TCP or with Sack TCP, this will
856	             result in the unnecessary retransmission of a single packet, combined
857	             with a halving of the congestion window (shown on pages 4 and 6 of
858	             [F03]).  With NewReno TCP, however, this reordering will also result
859	             in the unnecessary retransmission of an entire window of data (shown
860	             on page 5 of [F03]).

862	             While Reno TCP performs better than NewReno TCP in the presence of
863	             reordering, NewReno's superior performance in the presence of
864	             multiple packet drops generally outweighs its less optimal
865	             performance in the presence of reordering.  (Sack TCP is the
866	             preferred solution, with good performance in both scenarios.)  This
867	             document recommends the Fast Retransmit and Fast Recovery algorithms
868	             of NewReno TCP instead of those of Reno TCP for those TCP connections
869	             that do not support SACK.  We would also note that NewReno's Fast
870	             Retransmit and Fast Recovery mechanisms are widely deployed in TCP
871	             implementations in the Internet today, as documented in [PF01].  For
872	             example, tests of TCP implementations in several thousand web servers
873	             in 2001 showed that for those TCP connections where the web browser
874	             was not SACK-capable, more web servers used the Fast Retransmit and
875	             Fast Recovery algorithms of NewReno than those of Reno or Tahoe TCP
876	             [PF01].

878	Appendix F.  Changes Relative to RFC 2582

880	             The purpose of this document is to advance the NewReno's Fast
881	             Retransmit and Fast Recovery algorithms in RFC 2582 to Standards Track.

883	             The main change in this document relative to RFC 2582 is to specify
884	             the Careful variant of NewReno's Fast Retransmit and Fast Recovery
885	             algorithms.  The base algorithm described in RFC 2582 did not attempt
886	             to avoid unnecessary multiple Fast Retransmits that can occur after a
887	             timeout (described in more detail in the section above).  However,
888	             RFC 2582 also defined "Careful" and "Less Careful" variants that
889	             avoid these unnecessary Fast Retransmits, and recommended the Careful
890	             variant.  This document specifies the previously-named "Careful"
891	             variant as the basic version of NewReno.  As described below, this
892	             algorithm uses a variable "recover", whose initial value is the send
893	             sequence number.

895	             The algorithm specified in Section 3 checks whether the
896	             acknowledgment field of a partial acknowledgment covers *more* than
897	             "recover", as defined in Section 3.  Another possible variant would be
898	             to simply require that the acknowledgment field covers *more than or
899	             equal to* "recover" before initiating another Fast Retransmit.  We
900	             called this the Less Careful variant in RFC 2582.

902	             There are two separate scenarios in which the TCP sender could
903	             receive three duplicate acknowledgments acknowledging "recover" but
904	             no more than "recover".  One scenario would be that the data sender
905	             transmitted four packets with sequence numbers higher than "recover",
906	             that the first packet was dropped in the network, and the following
907	             three packets triggered three duplicate acknowledgments
908	             acknowledging "recover".  The second scenario would be that the
909	             sender unnecessarily retransmitted three packets below "recover", and
910	             that these three packets triggered three duplicate acknowledgments
911	             acknowledging "recover".  In the absence of SACK, the TCP sender is
912	             unable to distinguish between these two scenarios.

914	             For the Careful variant of Fast Retransmit, the data sender would
915	             have to wait for a retransmit timeout in the first scenario, but
916	             would not have an unnecessary Fast Retransmit in the second
917	             scenario.  For the Less Careful variant to Fast Retransmit, the data
918	             sender would Fast Retransmit as desired in the first scenario, and would
919	             unnecessarily Fast Retransmit in the second scenario.  This document
920	             only specifies the Careful variant in Section 3.  Unnecessary Fast
921	             Retransmits with the Less Careful variant in scenarios with
922	             reordering are illustrated in page 8 of [F03].

924	             The document also specifies two heuristics that the TCP sender MAY
925	             use to decide to invoke Fast Retransmit even when the three duplicate
926	             acknowledgments do not cover more than "recover".  These heuristics,
927	             an ACK-based heuristic and a timestamp heuristic, are described in
928	             Sections 6.1 and 6.2 respectively.

930	Appendix G.  Changes Relative to RFC 3782

932	             In [RFC3782], the cwnd after Full ACK reception will be set to
933	             (1) min (ssthresh, FlightSize + SMSS) or (2) ssthresh.  However,
934	             there is a risk in the first logic which results in performance
935	             degradation.  With the first logic, if FlightSize is zero, the result
936	             will be 1 SMSS. This means TCP can transmit only 1 segment at this
937	             moment, which can cause delay in ACK transmission at receiver due to
938	             delayed ACK algorithm.

940	             The FlightSize on Full ACK reception can be zero in some situations.
941	             A typical example is where sending window size during fast recovery is
942	             small. In this case, the retransmitted packet and new data packets can
943	             be transmitted within a short interval.  If all these packets
944	             successfully arrive, the receiver may generate a Full ACK that
945	             acknowledges all outstanding data.  Even if window size is not small,
946	             loss of ACK packets or receive buffer shortage during fast recovery can
947	             also increase the possibility to fall into this situation.

949	             The proposed fix in this document ensures that sender TCP transmits at
950	             least two segments on Full ACK reception.

952	             In addition, errata for RFC3782 (editorial clarification to Section 8
953	             of RFC2582, which is now Section 6 of this document) has been applied.

955	             Sections 4, 5, and 9-11 of RFC2582 were relocated to appendices of
956	             this document since they are non-normative and provide background
957	             information and references to simulation results.

959	Appendix H.  Document Revision History

961	             To be removed upon publication

963	             +----------+--------------------------------------------------+
964	             | Revision | Comments                                         |
965	             +----------+--------------------------------------------------+
966	             | draft-00 | RFC3782 errata applied, and changes applied from |
967	             |          | draft-nishida-newreno-modification-02            |
968	             +----------+--------------------------------------------------+
969	             | draft-01 | Non-normative sections moved to appendices,      |
970	             |          | editorial clarifications applied as suggested    |
971	             |          | by Alexander Zimmermann.                         |
972	             +----------+--------------------------------------------------+

974	Authors' Addresses

976	   Tom Henderson
977	   The Boeing Company

979	   EMail: thomas.r.henderson@boeing.com

981	   Sally Floyd
982	   International Computer Science Institute

984	   Phone: +1 (510) 666-2989
985	   EMail: floyd@acm.org
986	   URL: http://www.icir.org/floyd/
987	   Andrei Gurtov
988	   HIIT
989	   Helsinki Institute for Information Technology
990	   P.O. Box 19215
991	   00076 Aalto
992	   Finland

994	   EMail: gurtov@hiit.fi

996	   Yoshifumi Nishida
997	   WIDE Project
998	   Endo 5322
999	   Fujisawa, Kanagawa  252-8520
1000	   Japan

1002	   Email: nishida@wide.ad.jp