idnits 2.17.1 

draft-ietf-tcpm-rfc3782-bis-03.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == The document seems to contain a disclaimer for pre-RFC5378 work, but was
     first submitted on or after 10 November 2008.  The disclaimer is usually
     necessary only for documents that revise or obsolete older RFCs, and that
     take significant amounts of text from those RFCs.  If you can contact all
     authors of the source material and they are willing to grant the BCP78
     rights to the IETF Trust, you can and should remove the disclaimer. 
     Otherwise, the disclaimer is needed and you can ignore this comment. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (October 22, 2011) is 4563 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Unused Reference: 'RFC6298' is defined on line 512, but no explicit
     reference was found in the text

  == Unused Reference: 'F98' is defined on line 524, but no explicit
     reference was found in the text

  == Unused Reference: 'F03' is defined on line 529, but no explicit
     reference was found in the text

  == Unused Reference: 'PF01' is defined on line 575, but no explicit
     reference was found in the text

  -- Obsolete informational reference (is this intentional?): RFC 2001 (ref.
     'F98') (Obsoleted by RFC 2581)

  -- Obsolete informational reference (is this intentional?): RFC 1323
     (Obsoleted by RFC 7323)

  -- Obsolete informational reference (is this intentional?): RFC 2582
     (Obsoleted by RFC 3782)

  -- Obsolete informational reference (is this intentional?): RFC 3782
     (Obsoleted by RFC 6582)


     Summary: 0 errors (**), 0 flaws (~~), 6 warnings (==), 5 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.
--------------------------------------------------------------------------------


2	TCP Maintenance and Minor                                   T. Henderson
3	Extensions Working Group                                          Boeing
4	Internet-Draft                                                  S. Floyd
5	Obsoletes: 3782  (if approved)                                      ICSI
6	Intended status: Standards Track                               A. Gurtov
7	Expires:  April 22, 2012                                            HIIT
8	                                                              Y. Nishida
9	                                                            WIDE Project
10	                                                        October 22, 2011

12	       The NewReno Modification to TCP's Fast Recovery Algorithm
13	                   draft-ietf-tcpm-rfc3782-bis-03.txt

15	Abstract

17	   RFC 5681 documents the following four intertwined TCP
18	   congestion control algorithms: slow start, congestion avoidance, fast
19	   retransmit, and fast recovery.  RFC 5681 explicitly allows
20	   certain modifications of these algorithms, including modifications
21	   that use the TCP Selective Acknowledgement (SACK) option (RFC 2883),
22	   and modifications that respond to "partial acknowledgments" (ACKs
23	   which cover new data, but not all the data outstanding when loss was
24	   detected) in the absence of SACK.  This document describes a specific
25	   algorithm for responding to partial acknowledgments, referred to as
26	   NewReno.  This response to partial acknowledgments was first proposed
27	   by Janey Hoe.  This document obsoletes RFC 3782.

29	Status of this Memo

31	   This Internet-Draft is submitted in full conformance with the
32	   provisions of BCP 78 and BCP 79.

34	   Internet-Drafts are working documents of the Internet Engineering
35	   Task Force (IETF).  Note that other groups may also distribute
36	   working documents as Internet-Drafts.  The list of current Internet-
37	   Drafts is at http://datatracker.ietf.org/drafts/current/.

39	   Internet-Drafts are draft documents valid for a maximum of six months
40	   and may be updated, replaced, or obsoleted by other documents at any
41	   time.  It is inappropriate to use Internet-Drafts as reference
42	   material or to cite them other than as "work in progress."

44	   This Internet-Draft will expire on April 22, 2012.

46	Copyright Notice

48	   Copyright (c) 2011 IETF Trust and the persons identified as
49	   the document authors.  All rights reserved.

51	   This document is subject to BCP 78 and the IETF Trust's Legal
52	   Provisions Relating to IETF Documents
53	   (http://trustee.ietf.org/license-info) in effect on the date of
54	   publication of this document.  Please review these documents
55	   carefully, as they describe your rights and restrictions with respect
56	   to this document.  Code Components extracted from this document must
57	   include Simplified BSD License text as described in Section 4.e of
58	   the Trust Legal Provisions and are provided without warranty as
59	   described in the Simplified BSD License.

61	   This document may contain material from IETF Documents or IETF
62	   Contributions published or made publicly available before November
63	   10, 2008.  The person(s) controlling the copyright in some of this
64	   material may not have granted the IETF Trust the right to allow
65	   modifications of such material outside the IETF Standards Process.
66	   Without obtaining an adequate license from the person(s) controlling
67	   the copyright in such materials, this document may not be modified
68	   outside the IETF Standards Process, and derivative works of it may
69	   not be created outside the IETF Standards Process, except to format
70	   it for publication as an RFC or to translate it into languages other
71	   than English.

73	1.  Introduction

75	   For the typical implementation of the TCP Fast Recovery algorithm
76	   described in [RFC5681] (first implemented in the 1990 BSD Reno
77	   release, and referred to as the Reno algorithm in [FF96]), the TCP
78	   data sender only retransmits a packet after a retransmit timeout has
79	   occurred, or after three duplicate acknowledgments have arrived
80	   triggering the Fast Retransmit algorithm.  A single retransmit
81	   timeout might result in the retransmission of several data packets,
82	   but each invocation of the Fast Retransmit algorithm in RFC 5681
83	   leads to the retransmission of only a single data packet.

85	   Two problems arise with Reno TCP when multiple packet losses occur
86	   in a single window.  First, Reno will often take a timeout, as
87	   has been documented in [Hoe95].  Second, even if a retransmission
88	   timeout is avoided, multiple fast retransmits and window reductions
89	   can occur, as documented in [F94].  When multiple packet losses
90	   occur, if the SACK option [RFC2883] is available, the TCP sender
91	   has the information to make intelligent decisions about which packets
92	   to retransmit and which packets not to retransmit during Fast
93	   Recovery.  This document applies to TCP connections that are
94	   unable to use the TCP Selective Acknowledgement (SACK) option,
95	   either because the option is not locally supported or
96	   because the TCP peer did not indicate a willingness to use SACK.

98	   In the absence of SACK, there is little information available to the
99	   TCP sender in making retransmission decisions during Fast
100	   Recovery.  From the three duplicate acknowledgments, the sender
101	   infers a packet loss, and retransmits the indicated packet.  After
102	   this, the data sender could receive additional duplicate
103	   acknowledgments, as the data receiver acknowledges additional data
104	   packets that were already in flight when the sender entered Fast
105	   Retransmit.

107	   In the case of multiple packets dropped from a single window of data,
108	   the first new information available to the sender comes when the
109	   sender receives an acknowledgment for the retransmitted packet (that
110	   is, the packet retransmitted when Fast Retransmit was first
111	   entered).  If there is a single packet drop and no reordering, then
112	   the acknowledgment for this packet will acknowledge all of the
113	   packets transmitted before Fast Retransmit was entered.  However, if
114	   there are multiple packet drops, then the acknowledgment for the
115	   retransmitted packet will acknowledge some but not all of the packets
116	   transmitted before the Fast Retransmit.  We call this acknowledgment
117	   a partial acknowledgment.

119	   Along with several other suggestions, [Hoe95] suggested that during
120	   Fast Recovery the TCP data sender respond to a partial
121	   acknowledgment by inferring that the next in-sequence packet has been
122	   lost, and retransmitting that packet.  This document describes a
123	   modification to the Fast Recovery algorithm in RFC 5681 that
124	   incorporates a response to partial acknowledgments received during
125	   Fast Recovery.  We call this modified Fast Recovery algorithm
126	   NewReno, because it is a slight but significant variation of the
127	   basic Reno algorithm in RFC 5681.  This document does not discuss the
128	   other suggestions in [Hoe95] and [Hoe96], such as a change to the
129	   ssthresh parameter during Slow-Start, or the proposal to send a new
130	   packet for every two duplicate acknowledgments during Fast
131	   Recovery.  The version of NewReno in this document also draws on
132	   other discussions of NewReno in the literature [LM97, Hen98].

134	   We do not claim that the NewReno version of Fast Recovery described
135	   here is an optimal modification of Fast Recovery for responding to
136	   partial acknowledgments, for TCP connections that are unable to use
137	   SACK.  Based on our experiences with the NewReno modification in the
138	   NS simulator [NS] and with numerous implementations of NewReno, we
139	   believe that this modification improves the performance of the Fast
140	   Retransmit and Fast Recovery algorithms in a wide variety of
141	   scenarios.  Previous versions of this RFC [RFC2582, RFC3782] provide
142	   simulation-based evidence of the possible performance gains.

144	2.  Terminology and Definitions

146	   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
147	   NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED",  "MAY", and
148	   "OPTIONAL" in this document are to be interpreted as described in
149	   RFC 2119 [RFC2119].

151	   This document assumes that the reader is familiar with the terms
152	   SENDER MAXIMUM SEGMENT SIZE (SMSS), CONGESTION WINDOW (cwnd), and
153	   FLIGHT SIZE (FlightSize) defined in [RFC5681].  FLIGHT SIZE is
154	   defined as in [RFC5681] as follows:

156	      FLIGHT SIZE:
157	         The amount of data that has been sent but not yet cumulatively
158	         acknowledged.

160	   This document defines an additional sender-side state variable
161	   called RECOVER:

163	      RECOVER:
164	         When in Fast Recovery, this variable records the send sequence
165	         number that must be acknowledged before the Fast Recovery
166	         procedure is declared to be over.

168	3.  The Fast Retransmit and Fast Recovery Algorithms in NewReno

170	3.1.  Protocol Overview

172	   The basic idea of these extensions to the Fast Retransmit and
173	   Fast Recovery algorithms described in Section 3.2 of [RFC5681]
174	   is as follows.  The TCP sender can infer, from the arrival of
175	   duplicate acknowledgments, whether multiple losses in the same
176	   window of data have most likely occurred, and avoid taking a
177	   retransmit timeout or making multiple congestion window reductions
178	   due to such an event.

180	   The NewReno modification applies to the Fast Recovery procedure that
181	   begins when three duplicate ACKs are received and ends when either a
182	   retransmission timeout occurs or an ACK arrives that acknowledges all
183	   of the data up to and including the data that was outstanding when
184	   the Fast Recovery procedure began.

186	3.2.  Specification

188	   The procedures specified in Section 3.2 of [RFC5681] are followed
189	   with the following modifications.

191	   1)  Initialization of TCP protocol control block:
192	       When the TCP protocol control block is initialized, Recover is
193	       set to the initial send sequence number.

195	   2)  Three duplicate ACKs:
196	       When the third duplicate ACK is received, the TCP sender first
197	       checks the value of Recover to see if the Cumulative
198	       Acknowledgment field covers more than Recover.  If so, the value
199	       of Recover is incremented to the value of the highest sequence
200	       number transmitted by the TCP so far.  The TCP then enters Fast
201	       Retransmit (step 2 of Section 3.2 of [RFC5681]).  If not, the
202	       TCP does not enter fast retransmit and does not reset ssthresh.

204	   3)  Response to newly acknowledged data:
205	       Step 6 of [RFC5681] specifies the response to the next ACK that
206	       acknowledges previously unacknowledged data.  When an ACK
207	       arrives that acknowledges new data, this ACK could be the
208	       acknowledgment elicited by the retransmission from step 2, or
209	       elicited by a later retransmission.  There are two cases.

211	       Full acknowledgments:
212	       If this ACK acknowledges all of the data up to and including
213	       Recover, then the ACK acknowledges all the intermediate
214	       segments sent between the original transmission of the lost
215	       segment and the receipt of the third duplicate ACK.  Set cwnd to
216	       either (1) min (ssthresh, max(FlightSize, SMSS) + SMSS) or
217	       (2) ssthresh, where ssthresh is the value set when Fast
218	       Retransmit was entered, and where FlightSize in (1) is the amount
219	       of data presently outstanding.  This is termed "deflating" the
220	       window.  If the second option is selected, the implementation
221	       is encouraged to take measures to avoid a possible burst of
222	       data, in case the amount of data outstanding in the network is
223	       much less than the new congestion window allows.  A simple
224	       mechanism is to limit the number of data packets that can be sent
225	       in response to a single acknowledgment.  Exit the Fast Recovery
226	       procedure.

228	       Partial acknowledgments:
229	       If this ACK does *not* acknowledge all of the data up to and
230	       including Recover, then this is a partial ACK.  In this case,
231	       retransmit the first unacknowledged segment.  Deflate the
232	       congestion window by the amount of new data acknowledged by the
233	       cumulative acknowledgment field.  If the partial ACK
234	       acknowledges at least one SMSS of new data, then add back SMSS
235	       bytes to the congestion window.  This artificially
236	       inflates the congestion window in order to reflect the additional
237	       segment that has left the network.  Send a new segment if
238	       permitted by the new value of cwnd.  This "partial window
239	       deflation" attempts to ensure that, when Fast Recovery eventually
240	       ends, approximately ssthresh amount of data will be outstanding
241	       in the network.  Do not exit the Fast Recovery procedure (i.e.,
242	       if any duplicate ACKs subsequently arrive, execute Step 4 of
243	       Section 3.2 of [RFC5681]).

245	       For the first partial ACK that arrives during Fast Recovery, also
246	       reset the retransmit timer.  Timer management is discussed in
247	       more detail in Section 4.

249	   4)  Retransmit timeouts:
250	       After a retransmit timeout, record the highest sequence number
251	       transmitted in the variable Recover and exit the Fast
252	       Recovery procedure if applicable.
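
   The four steps above can be sketched in code.  The following Python
   fragment is illustrative only and is not part of the specification.
   The class name, the field names, and the byte-counting model are
   assumptions made for the example.  Sequence-number wraparound is
   ignored, window inflation on additional duplicate ACKs is omitted,
   and the ssthresh and cwnd values set on entry follow steps 2 and 3
   of Section 3.2 of [RFC5681].  The full-acknowledgment branch uses
   option (1) for deflating the window.

```python
from dataclasses import dataclass, field


@dataclass
class NewRenoSender:
    """Hypothetical byte-counting sender; sequence wraparound is ignored."""
    smss: int
    iss: int
    cwnd: int
    ssthresh: int = 1 << 30
    snd_una: int = 0        # first unacknowledged byte
    highest_sent: int = 0   # highest sequence number transmitted so far
    recover: int = 0
    in_fast_recovery: bool = False
    partial_acks: int = 0   # partial ACKs seen in the current recovery
    timer_resets: int = 0   # counts retransmit-timer resets (illustrative)
    retransmitted: list = field(default_factory=list)

    def __post_init__(self):
        # Step 1: Recover starts at the initial send sequence number.
        self.recover = self.iss

    def flight_size(self):
        # Data sent but not yet cumulatively acknowledged.
        return self.highest_sent + 1 - self.snd_una

    def on_third_dup_ack(self, ack_number):
        # Step 2: enter Fast Retransmit only if the ACK covers more
        # than Recover (ack_number - 1 > Recover).
        if ack_number - 1 > self.recover:
            self.recover = self.highest_sent
            self.ssthresh = max(self.flight_size() // 2, 2 * self.smss)
            self.retransmitted.append(self.snd_una)
            self.cwnd = self.ssthresh + 3 * self.smss
            self.in_fast_recovery = True
            self.partial_acks = 0
            return True
        return False

    def on_new_ack(self, ack_number):
        # Step 3: response to an ACK that acknowledges new data.
        acked = ack_number - self.snd_una
        self.snd_una = ack_number
        if not self.in_fast_recovery:
            return "normal"
        if ack_number - 1 >= self.recover:
            # Full acknowledgment: deflate the window, option (1).
            self.cwnd = min(self.ssthresh,
                            max(self.flight_size(), self.smss) + self.smss)
            self.in_fast_recovery = False
            return "full"
        # Partial acknowledgment: retransmit the first unacknowledged
        # segment, deflate by the amount acknowledged, add back one SMSS.
        self.retransmitted.append(self.snd_una)
        self.cwnd -= acked
        if acked >= self.smss:
            self.cwnd += self.smss
        self.partial_acks += 1
        if self.partial_acks == 1:
            self.timer_resets += 1  # reset timer on first partial ACK only
        return "partial"

    def on_timeout(self):
        # Step 4: record the highest sequence number sent, leave recovery.
        self.recover = self.highest_sent
        self.in_fast_recovery = False
```

   For instance, with an SMSS of 1000 bytes and bytes 1001 through 10000
   outstanding, a third duplicate ACK for 1001 sets Recover to 10000,
   ssthresh to 4500, and cwnd to 7500.  A partial ACK for 3001 then
   leaves cwnd at 6500 and causes the segment starting at 3001 to be
   retransmitted.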

254	   Step 2 above specifies a check that the Cumulative Acknowledgment
255	   field covers more than Recover.  Because the acknowledgment field
256	   contains the sequence number that the sender next expects to receive,
257	   the acknowledgment "ack_number" covers more than Recover when:

259	      ack_number - 1 > Recover;

261	   i.e., at least one byte more of data is acknowledged beyond the
262	   highest byte that was outstanding when Fast Retransmit was last
263	   entered.
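
   As an illustration, the check can be written with modular 32-bit
   sequence arithmetic.  This is a minimal sketch, not part of the
   specification.  The function names are hypothetical, and the use of
   the customary half-window comparison for TCP sequence numbers is an
   assumption of the example.

```python
SEQ_MOD = 1 << 32  # TCP sequence numbers are 32-bit values


def seq_gt(a, b):
    """True if sequence number a is later than b, modulo 2**32."""
    diff = (a - b) % SEQ_MOD
    return 0 < diff < (1 << 31)


def covers_more_than_recover(ack_number, recover):
    """Implements 'ack_number - 1 > Recover' with modular arithmetic."""
    return seq_gt((ack_number - 1) % SEQ_MOD, recover)
```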

265	   Note that in Step 3 above, the congestion window is deflated after
266	   a partial acknowledgment is received.  The congestion window was
267	   likely to have been inflated considerably when the partial
268	   acknowledgment was received.  In addition, depending on the original
269	   pattern of packet losses, the partial acknowledgment might
270	   acknowledge nearly a window of data.  In this case, if the congestion
271	   window was not deflated, the data sender might be able to send nearly
272	   a window of data back-to-back.
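
   The effect of deflation can be checked with simple arithmetic.  The
   sketch below is illustrative only.  The function name and parameters
   are hypothetical, and it assumes the sender may transmit whenever
   FlightSize is below cwnd.

```python
def sendable_after_partial_ack(cwnd, flight_before, acked, smss, deflate=True):
    """Bytes the window permits immediately after a partial ACK."""
    flight_after = flight_before - acked
    if deflate:
        # Partial window deflation as in Step 3.
        cwnd -= acked
        if acked >= smss:
            cwnd += smss
    return max(cwnd - flight_after, 0)
```

   With cwnd and FlightSize both at 16000 bytes, an SMSS of 1000 bytes,
   and a partial ACK covering 12000 bytes, the deflated window permits
   1000 bytes of new data, whereas an undeflated window would permit
   12000 bytes back-to-back.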

274	   This document does not specify the sender's response to duplicate
275	   ACKs when the Fast Retransmit/Fast Recovery algorithm is not
276	   invoked.  This is addressed in other documents, such as those
277	   describing the Limited Transmit procedure [RFC3042].  This document
278	   also does not address issues of adjusting the duplicate
279	   acknowledgment threshold, but assumes the threshold specified in the
280	   IETF standards; the current standard is [RFC5681], which specifies
281	   a threshold of three duplicate acknowledgments.

283	   As a final note, we would observe that in the absence of the SACK
284	   option, the data sender is working from limited information.  When
285	   the issue of recovery from multiple dropped packets from a single
286	   window of data is of particular importance, the best alternative
287	   would be to use the SACK option.

289	4.  Handling Duplicate Acknowledgments After A Timeout

291	   After each retransmit timeout, the highest sequence number
292	   transmitted so far is recorded in the variable "recover".
293	   If, after a retransmit timeout, the TCP data sender retransmits three
294	   consecutive packets that have already been received by the data
295	   receiver, then the TCP data sender will receive three duplicate
296	   acknowledgments that do not cover more than "recover".  In this
297	   case, the duplicate acknowledgments are not an indication of a new
298	   instance of congestion.  They are simply an indication that the
299	   sender has unnecessarily retransmitted at least three packets.

301	   However, when a retransmitted packet is itself dropped, the sender
302	   can also receive three duplicate acknowledgments that do not cover
303	   more than "recover".  In this case, the sender would have been
304	   better off if it had initiated Fast Retransmit.  For a TCP that
305	   implements the algorithm specified in Section 3 of this document, the
306	   sender does not infer a packet drop from duplicate acknowledgments
307	   in this scenario.  As always, the retransmit timer is the backup
308	   mechanism for inferring packet loss in this case.

310	   There are several heuristics, based on timestamps or on the amount of
311	   advancement of the cumulative acknowledgment field, that allow the
312	   sender to distinguish, in some cases, between three duplicate
313	   acknowledgments following a retransmitted packet that was dropped,
314	   and three duplicate acknowledgments from the unnecessary
315	   retransmission of three packets [Gur03, GF04].  The TCP sender MAY
316	   use such a heuristic to decide to invoke a Fast Retransmit in some
317	   cases, even when the three duplicate acknowledgments do not cover
318	   more than "recover".

320	   For example, when three duplicate acknowledgments are caused by the
321	   unnecessary retransmission of three packets, this is likely to be
322	   accompanied by the cumulative acknowledgment field advancing by at
323	   least four segments.  Similarly, a heuristic based on timestamps uses
324	   the fact that when there is a hole in the sequence space, the
325	   timestamp echoed in the duplicate acknowledgment is the timestamp of
326	   the most recent data packet that advanced the cumulative
327	   acknowledgment field [RFC1323].  If timestamps are used, and the
328	   sender stores the timestamp of the last acknowledged segment, then
329	   the timestamp echoed by duplicate acknowledgments can be used to
330	   distinguish between a retransmitted packet that was dropped and
331	   three duplicate acknowledgments from the unnecessary
332	   retransmission of three packets.

334	4.1.  ACK Heuristic

336	   If the ACK-based heuristic is used, then following the advancement of
337	   the cumulative acknowledgment field, the sender stores the value of
338	   the previous cumulative acknowledgment as prev_highest_ack, and
339	   stores the latest cumulative ACK as highest_ack.  In addition, the
340	   following check is performed if, in step 2 of Section 3.2, the
341	   Cumulative Acknowledgment field does not cover more than "recover".

343	   1*)  If the Cumulative Acknowledgment field didn't cover more than
344	        "recover", check to see if the congestion window is greater
345	        than SMSS bytes and the difference between highest_ack and
346	        prev_highest_ack is at most 4*SMSS bytes.  If true, duplicate
347	        ACKs indicate a lost segment (enter Fast Retransmit as in step
348	        2 of Section 3.2).  Otherwise, duplicate ACKs likely result from
349	        unnecessary retransmissions (do not enter Fast Retransmit).

351	   The congestion window check serves to protect against fast retransmit
352	   immediately after a retransmit timeout.
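
   The test in step 1*) can be sketched as a single predicate.  This is
   illustrative only.  The function name is hypothetical, all values are
   in bytes, and sequence-number wraparound is ignored.

```python
def ack_heuristic_indicates_loss(cwnd, smss, highest_ack, prev_highest_ack):
    """True if duplicate ACKs should be treated as a lost segment."""
    advance = highest_ack - prev_highest_ack
    # cwnd > SMSS guards against fast retransmit right after a timeout.
    return cwnd > smss and advance <= 4 * smss
```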

354	   If several ACKs are lost, the sender can see a jump in the cumulative
355	   ACK of more than three segments, and the heuristic can fail.
356	   [RFC5681] recommends that a receiver should
357	   send duplicate ACKs for every out-of-order data packet, such as a
358	   data packet received during Fast Recovery.  The ACK heuristic is more
359	   likely to fail if the receiver does not follow this advice, because
360	   then a smaller number of ACK losses are needed to produce a
361	   sufficient jump in the cumulative ACK.

363	4.2.  Timestamp Heuristic

365	   If this heuristic is used, the sender stores the timestamp of the
366	   last acknowledged segment.  In addition, the last sentence of step
367	   2 in Section 3.2 is replaced as follows:

369	   1**) If the Cumulative Acknowledgment field didn't cover more than
370	        "recover", check to see if the echoed timestamp in the last
371	        non-duplicate acknowledgment equals the
372	        stored timestamp.  If true, duplicate ACKs indicate a lost
373	        segment (enter Fast Retransmit as in step 2 of Section 3.2).
374	        Otherwise, duplicate ACKs likely result from unnecessary
375	        retransmissions (do not enter Fast Retransmit).
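
   Combined with the check against "recover", the timestamp test can be
   sketched as follows.  This is illustrative only.  The function and
   parameter names are hypothetical, how the sender obtains the two
   timestamp values is left to the implementation, and sequence-number
   wraparound is ignored.

```python
def enter_fast_retransmit(ack_number, recover, echoed_ts, stored_ts):
    """Decide whether the third duplicate ACK triggers Fast Retransmit.

    echoed_ts: timestamp echoed in the last non-duplicate ACK.
    stored_ts: timestamp the sender stored for the last acknowledged segment.
    """
    if ack_number - 1 > recover:
        return True  # the ACK covers more than "recover"
    # Timestamp heuristic: equal timestamps indicate a lost segment.
    return echoed_ts == stored_ts
```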

377	   The timestamp heuristic works correctly whether the receiver
378	   echoes timestamps as specified by [RFC1323] or by its proposed
379	   revisions.  However, if the receiver arbitrarily echoes timestamps,
380	   the heuristic can fail.  The heuristic can also fail if a timeout was
381	   spurious and returning ACKs are not from retransmitted segments.
382	   This can be prevented by detection algorithms such as [RFC3522].

384	5.  Implementation Issues for the Data Receiver

386	   [RFC5681] specifies that "Out-of-order data segments SHOULD be
387	   acknowledged immediately, in order to accelerate loss recovery."
388	   Neal Cardwell has noted that some data receivers do not send an
389	   immediate acknowledgment when they send a partial acknowledgment,
390	   but instead wait first for their delayed acknowledgment timer to
391	   expire [C98].  As [C98] notes, this severely limits the potential
392	   benefit of NewReno by delaying the receipt of the partial
393	   acknowledgment at the data sender.  Echoing [RFC5681], our
394	   recommendation is that the data receiver send an immediate
395	   acknowledgment for an out-of-order segment, even when that
396	   out-of-order segment fills a hole in the buffer.

398	6.  Implementation Issues for the Data Sender

400	   In Section 3.2, step 3 above, it is noted that implementations should
401	   take measures to avoid a possible burst of data when leaving Fast
402	   Recovery, in case the amount of new data that the sender is eligible
403	   to send due to the new value of the congestion window is large.  This
404	   can arise during NewReno when ACKs are lost or treated as pure window
405	   updates, thereby causing the sender to underestimate the number of
406	   new segments that can be sent during the recovery procedure.
407	   Specifically, bursts can occur when the FlightSize is much less than
408	   the new congestion window when exiting from Fast Recovery.  One
409	   simple mechanism to avoid a burst of data when leaving Fast Recovery
410	   is to limit the number of data packets that can be sent in response
411	   to a single acknowledgment.  (This is known as "maxburst_" in the ns
412	   simulator.)  Other possible mechanisms for avoiding bursts include
413	   rate-based pacing, or setting the slow-start threshold to the
414	   resultant congestion window and then resetting the congestion window
415	   to FlightSize.  A recommendation on the general mechanism to avoid
416	   excessively bursty sending patterns is outside the scope of this
417	   document.
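
   Two of the mechanisms mentioned above can be sketched briefly.  The
   fragment is illustrative only and is not a recommendation.  The
   function names and the "maxburst" parameter are hypothetical, and
   all window values are in bytes.

```python
def segments_allowed(cwnd, flight_size, smss, maxburst):
    """Segments that may be sent in response to a single ACK."""
    window_room = max(cwnd - flight_size, 0) // smss
    return min(window_room, maxburst)


def restart_from_flight_size(cwnd, flight_size):
    """Alternative: ssthresh takes the resultant cwnd, cwnd restarts
    at FlightSize.  Returns the pair (ssthresh, cwnd)."""
    return cwnd, flight_size
```

   With a congestion window of 10000 bytes, a FlightSize of 2000 bytes,
   and an SMSS of 1000 bytes, the window alone would permit eight
   segments at once; a limit of four segments per ACK halves that burst.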

419	   An implementation may want to use a separate flag to record whether
420	   or not it is presently in the Fast Recovery procedure.  The use of
421	   the value of the duplicate acknowledgment counter for this purpose is
422	   not reliable because it can be reset upon window updates and
423	   out-of-order acknowledgments.

425	   When updating the Cumulative Acknowledgment field outside of
426	   Fast Recovery, the "recover" state variable may also need to be
427	   updated in order to continue to permit possible entry into Fast
428	   Recovery (Section 3.2, step 2).  This issue arises when an update
429	   of the Cumulative Acknowledgment field results in a sequence
430	   wraparound that affects the ordering between the Cumulative
431	   Acknowledgment field and the "recover" state variable.  Entry
432	   into Fast Recovery is only possible when the Cumulative
433	   Acknowledgment field covers more than the "recover" state variable.
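
   The wraparound problem can be demonstrated with modular arithmetic.
   The sketch below is illustrative only.  The function names are
   hypothetical, and the half-window comparison is an assumption of the
   example.  It shows the problem and does not prescribe how "recover"
   should be updated.

```python
SEQ_MOD = 1 << 32  # TCP sequence numbers are 32-bit values


def seq_gt(a, b):
    """True if sequence number a is later than b, modulo 2**32."""
    diff = (a - b) % SEQ_MOD
    return 0 < diff < (1 << 31)


def entry_permitted(ack_number, recover):
    """Entry check: the ACK must cover more than "recover"."""
    return seq_gt((ack_number - 1) % SEQ_MOD, recover)


def stale_recover_blocks_entry(recover, bytes_acked_since):
    """True if a "recover" left untouched while the cumulative ACK
    advanced by bytes_acked_since now appears to be ahead of the ACK."""
    ack_number = (recover + bytes_acked_since) % SEQ_MOD
    return not entry_permitted(ack_number, recover)
```

   A "recover" value that is 5000 bytes behind the cumulative ACK
   permits entry, while one that has been left more than 2**31 bytes
   behind compares as being ahead and blocks entry.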

435	   It is important for the sender to respond correctly to duplicate ACKs
436	   received when the sender is no longer in Fast Recovery (e.g., because
437	   of a Retransmit Timeout).  The Limited Transmit procedure [RFC3042]
438	   describes possible responses to the first and second duplicate
439	   acknowledgments.  When three or more duplicate acknowledgments are
440	   received, the Cumulative Acknowledgment field doesn't cover more
441	   than "recover", and a new Fast Recovery is not invoked, it is
442	   important that the sender not execute the Fast Recovery steps (3) and
443	   (4) in Section 3.  Otherwise, the sender could end up in a chain of
444	   spurious timeouts.  We mention this only because several NewReno
445	   implementations had this bug, including the implementation in the NS
446	   simulator.

448	   It has been observed that some TCP implementations enter a slow start
449	   or congestion avoidance window updating algorithm immediately after
450	   the cwnd is set by the equation found in (Section 3.2, step 3), even
451	   without a new external event generating the cwnd change.  Note that
452	   after cwnd is set based on the procedure for exiting Fast Recovery
453	   (Section 3.2, step 3), cwnd SHOULD NOT be updated until a further
454	   event occurs (e.g., arrival of an ack, or timeout) after this
455	   adjustment.

457	7.  Security Considerations

459	   [RFC5681] discusses general security considerations concerning TCP
460	   congestion control.  This document describes a specific algorithm
461	   that conforms with the congestion control requirements of [RFC5681],
462	   and so those considerations apply to this algorithm, too.  There are
463	   no known additional security concerns for this specific algorithm.

465	8.  IANA Considerations

467	   This document has no actions for IANA.

469	9.  Conclusions

471	   This document specifies the NewReno Fast Retransmit and Fast Recovery
472	   algorithms for TCP.  This NewReno modification to TCP can be
473	   important even for TCP implementations that support the SACK option,
474	   because the SACK option can only be used for TCP connections when
475	   both TCP end-nodes support the SACK option.  NewReno performs better
476	   than Reno [RFC5681] in a number of scenarios discussed in
477	   previous versions of this RFC ([RFC2582], [RFC3782]).

479	   A number of options to the basic algorithm presented in Section 3 are
480	   also referenced in Appendix A to this document.  These include the
481	   handling of the retransmission timer, the response to partial
482	   acknowledgments, and whether or not the sender must maintain a state
483	   variable called Recover.  Our belief is that the differences
484	   between these variants of NewReno are small compared to the
485	   differences between Reno and NewReno.  That is, the important thing
486	   is to implement NewReno instead of Reno, for a TCP connection
487	   without SACK; it is less important exactly which of the variants of
488	   NewReno is implemented.

490	10.  Acknowledgments

492	   Many thanks to Anil Agarwal, Mark Allman, Armando Caro, Jeffrey Hsu,
493	   Vern Paxson, Kacheong Poon, Keyur Shah, and Bernie Volz for detailed
494	   feedback on this document or on its precursor, RFC 2582.  Jeffrey
495	   Hsu provided clarifications on the handling of the recover variable
496	   that were applied to RFC 3782 as errata, and now are in Section 6
497	   of this document.  Yoshifumi Nishida contributed a modification
498	   to the fast recovery algorithm to account for the case in which
499	   flightsize is 0 when the TCP sender leaves fast recovery, and the
500	   TCP receiver uses delayed acknowledgments.  Alexander Zimmermann
501	   provided several suggestions to improve the clarity of the document.

503	11.  References
504	11.1.  Normative References

506	   [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
507	             Requirement Levels", BCP 14, RFC 2119, March 1997.

509	   [RFC5681] Allman, M., Paxson, V., and E. Blanton, "TCP Congestion
510	             Control", RFC 5681, September 2009.

512	   [RFC6298] Paxson, V., Allman, M., Chu, J., and M. Sargent,
513	             "Computing TCP's Retransmission Timer", RFC 6298,
514	             June 2011.

516	11.2.  Informative References

518	   [C98]     Cardwell, N., "delayed ACKs for retransmitted packets:
519	             ouch!".  November 1998,  Email to the tcpimpl mailing list,
520	             Message-ID "Pine.LNX.4.02A.9811021421340.26785-100000@
521	             sake.cs.washington.edu",
522	             archived at "http://tcp-impl.lerc.nasa.gov/tcp-impl".

524	   [F98]     Floyd, S., "Revisions to RFC 2001", Presentation to the
525	             TCPIMPL Working Group, August 1998.  URLs
526	             "ftp://ftp.ee.lbl.gov/talks/sf-tcpimpl-aug98.ps" and
527	             "ftp://ftp.ee.lbl.gov/talks/sf-tcpimpl-aug98.pdf".

529	   [F03]     Floyd, S., "Moving NewReno from Experimental to Proposed
530	             Standard?", Presentation to the TSVWG Working Group, March
531	             2003.  URLs
532	             "http://www.icir.org/floyd/talks/newreno-Mar03.ps" and
533	             "http://www.icir.org/floyd/talks/newreno-Mar03.pdf".

535	   [FF96]    Fall, K. and S. Floyd, "Simulation-based Comparisons of
536	             Tahoe, Reno and SACK TCP", Computer Communication Review,
537	             July 1996.  URL "ftp://ftp.ee.lbl.gov/papers/sacks.ps.Z".

539	   [F94]     Floyd, S., "TCP and Successive Fast Retransmits", Technical
540	             report, October 1994.  URL
541	             "ftp://ftp.ee.lbl.gov/papers/fastretrans.ps".

543	   [GF04]    Gurtov, A. and S. Floyd, "Resolving Acknowledgment
544	             Ambiguity in non-SACK TCP", Next Generation Teletraffic and
545	             Wired/Wireless Advanced Networking (NEW2AN'04), February
546	             2004.  URL "http://www.cs.helsinki.fi/u/gurtov/papers/
547	             heuristics.html".

549	   [Gur03]   Gurtov, A., "[Tsvwg] resolving the problem of unnecessary
550	             fast retransmits in go-back-N", email to the tsvwg mailing
551	             list, message ID <3F25B467.9020609@cs.helsinki.fi>, July
552	             28, 2003.  URL "http://www1.ietf.org/mail-archive/
553	             working-groups/tsvwg/current/msg04334.html".

555	   [Hen98]   Henderson, T., "Re: NewReno and the 2001 Revision",
556	             September 1998.  Email to the tcpimpl mailing list,
557	             Message ID "Pine.BSI.3.95.980923224136.26134A-100000@
558	             raptor.CS.Berkeley.EDU", archived at
559	             "http://tcp-impl.lerc.nasa.gov/tcp-impl".

561	   [Hoe95]   Hoe, J., "Startup Dynamics of TCP's Congestion Control and
562	             Avoidance Schemes", Master's Thesis, MIT, 1995.

564	   [Hoe96]   Hoe, J., "Improving the Start-up Behavior of a Congestion
565	             Control Scheme for TCP", ACM SIGCOMM, August 1996.  URL
566	             "http://www.acm.org/sigcomm/sigcomm96/program.html".

568	   [LM97]    Lin, D. and R. Morris, "Dynamics of Random Early
569	             Detection", SIGCOMM 97, September 1997.  URL
570	             "http://www.acm.org/sigcomm/sigcomm97/program.html".

572	   [NS]      The Network Simulator (NS).
573	             URL "http://www.isi.edu/nsnam/ns/".

575	   [PF01]    Padhye, J. and S. Floyd, "Identifying the TCP Behavior of
576	             Web Servers", June 2001, SIGCOMM 2001.

578	   [RFC1323] Jacobson, V., Braden, R. and D. Borman, "TCP Extensions for
579	             High Performance", RFC 1323, May 1992.

581	   [RFC2582] Floyd, S. and T. Henderson, "The NewReno Modification to
582	             TCP's Fast Recovery Algorithm", RFC 2582, April 1999.

584	   [RFC2883] Floyd, S., J. Mahdavi, M. Mathis, and M. Podolsky, "An
585	             Extension to the Selective Acknowledgement (SACK) Option
586	             for TCP", RFC 2883, July 2000.

588	   [RFC3042] Allman, M., Balakrishnan, H. and S. Floyd, "Enhancing TCP's
589	             Loss Recovery Using Limited Transmit", RFC 3042, January
590	             2001.

592	   [RFC3522] Ludwig, R. and M. Meyer, "The Eifel Detection Algorithm for
593	             TCP", RFC 3522, April 2003.

595	   [RFC3782] Floyd, S., T. Henderson, and A. Gurtov, "The NewReno
596	             Modification to TCP's Fast Recovery Algorithm", RFC 3782,
597	             April 2004.

599	Appendix A.  Additional Information

601	   Previous versions of this RFC ([RFC2582], [RFC3782]) contained
602	   additional informative material on the subjects listed below.
603	   Readers who want more information about possible variants of the
604	   algorithm, or references to specific [NS] simulations that provide
605	   NewReno test cases, may consult those documents.

607	   Section 4 of [RFC3782] discusses some alternative behaviors for
608	   resetting the retransmit timer after a partial acknowledgment.

610	   Section 5 of [RFC3782] discusses some alternative behaviors for
611	   performing retransmission after a partial acknowledgment.

613	   Section 6 of [RFC3782] provides more information about the
614	   motivation for the sender's state variable Recover.

616	   Section 9 of [RFC3782] introduces some NS simulation test
617	   suites for NewReno.  In addition, references to simulation
618	   results can be found throughout [RFC3782].

620	   Section 10 of [RFC3782] provides a comparison of Reno and
621	   NewReno TCP.

623	   Section 11 of [RFC3782] lists changes relative to [RFC2582].

625	Appendix B.  Changes Relative to RFC 3782

627	   In [RFC3782], the cwnd after Full ACK reception is set to either
628	   (1) min (ssthresh, FlightSize + SMSS) or (2) ssthresh.  However,
629	   option (1) carries a risk of performance degradation.  If
630	   FlightSize is zero, option (1) yields a cwnd of 1 SMSS.  The TCP
631	   sender can then transmit only one segment at that moment, and the
632	   receiver's delayed ACK algorithm may delay the ACK for that
633	   segment.

635	   The FlightSize on Full ACK reception can be zero in some situations.
636	   A typical example is a small sending window during fast recovery.
637	   In this case, the retransmitted packet and new data packets can be
638	   transmitted within a short interval.  If all of these packets
639	   arrive successfully, the receiver may generate a Full ACK that
640	   acknowledges all outstanding data.  Even if the window is not small,
641	   loss of ACK packets or a receive buffer shortage during fast
642	   recovery can make this situation more likely.

644	   The fix proposed in this document ensures that the TCP sender can
645	   transmit at least two segments on Full ACK reception.
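
   As a rough illustration of the difference, the following Python
   sketch compares option (1) of [RFC3782] with one possible form of
   the fix.  The fixed expression, min (ssthresh, max (FlightSize,
   SMSS) + SMSS), is an assumption made for this sketch, since this
   appendix does not spell the expression out; the function names and
   the numeric values of SMSS and ssthresh are likewise hypothetical.

```python
def cwnd_rfc3782(ssthresh: int, flight_size: int, smss: int) -> int:
    """Option (1) of RFC 3782: min(ssthresh, FlightSize + SMSS)."""
    return min(ssthresh, flight_size + smss)


def cwnd_fixed(ssthresh: int, flight_size: int, smss: int) -> int:
    """Assumed form of the fix: min(ssthresh, max(FlightSize, SMSS) + SMSS)."""
    return min(ssthresh, max(flight_size, smss) + smss)


# Illustrative values only (bytes); not taken from the document.
SMSS = 1460
SSTHRESH = 10 * SMSS

# FlightSize is zero when the Full ACK arrives.
old = cwnd_rfc3782(SSTHRESH, 0, SMSS)
new = cwnd_fixed(SSTHRESH, 0, SMSS)

# cwnd expressed in segments: 1 under the old rule, 2 under the assumed fix.
print(old // SMSS, new // SMSS)
```

   With FlightSize equal to zero, the old rule leaves room for a
   single segment, which a delayed-ACK receiver may not acknowledge
   immediately, while the assumed fix leaves room for two.  For any
   FlightSize of at least 1 SMSS the two expressions agree.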

647	   In addition, errata for RFC3782 (editorial clarification to Section 8
648	   of RFC2582, which is now Section 6 of this document) have been
649	   applied.

651	   The specification text (Section 3.2 herein) was rewritten to more
652	   closely track Section 3.2 of [RFC5681].

654	   Sections 4, 5, 9-11 of [RFC3782] were removed, and instead Appendix
655	   A of this document was added to back-reference this informative
656	   material.

658	Appendix C.  Document Revision History

660	   To be removed upon publication

662	   +----------+--------------------------------------------------+
663	   | Revision | Comments                                         |
664	   +----------+--------------------------------------------------+
665	   | draft-00 | RFC3782 errata applied, and changes applied from |
666	   |          | draft-nishida-newreno-modification-02            |
667	   +----------+--------------------------------------------------+
668	   | draft-01 | Non-normative sections moved to appendices,      |
669	   |          | editorial clarifications applied as suggested    |
670	   |          | by Alexander Zimmermann.                         |
671	   +----------+--------------------------------------------------+
672	   | draft-02 | Better align specification text with RFC5681.    |
673	   |          | Replace informative appendices by a new appendix |
674	   |          | that just provides back-references to earlier    |
675	   |          | NewReno RFCs.                                    |
676	   +----------+--------------------------------------------------+

678	Authors' Addresses

680	   Tom Henderson
681	   The Boeing Company

683	   EMail: thomas.r.henderson@boeing.com

685	   Sally Floyd
686	   International Computer Science Institute

688	   Phone: +1 (510) 666-2989
689	   EMail: floyd@acm.org
690	   URL: http://www.icir.org/floyd/

692	   Andrei Gurtov
693	   HIIT
694	   Helsinki Institute for Information Technology
695	   P.O. Box 19215
696	   00076 Aalto
697	   Finland

699	   EMail: gurtov@hiit.fi
700	   Yoshifumi Nishida
701	   WIDE Project
702	   Endo 5322
703	   Fujisawa, Kanagawa  252-8520
704	   Japan

706	   EMail: nishida@wide.ad.jp