idnits 2.17.1 

draft-stevens-tcpca-spec-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Cannot find the required boilerplate sections (Copyright, IPR, etc.) in
     this document.

     Expected boilerplate is as follows today (2024-04-19) according to
     https://trustee.ietf.org/license-info :

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.a:
        This Internet-Draft is submitted in full conformance with the provisions
        of BCP 78 and BCP 79.

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 2:
        Copyright (c) 2024 IETF Trust and the persons identified as the document
        authors.  All rights reserved.

     IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 3:
        This document is subject to BCP 78 and the IETF Trust's Legal Provisions
        Relating to IETF Documents
        (https://trustee.ietf.org/license-info) in effect on the date of
        publication of this document.  Please review these documents
        carefully, as they describe your rights and restrictions with
        respect to this document.  Code Components extracted from this
        document must include Simplified BSD License text as described in
        Section 4.e of the Trust Legal Provisions and are provided
        without warranty as described in the Simplified BSD License.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** Missing expiration date.  The document expiration date should appear on
     the first and last page.

  ** The document seems to lack a 1id_guidelines paragraph about
     Internet-Drafts being working documents. 

  ** The document seems to lack a 1id_guidelines paragraph about 6 months
     document validity. 

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     current Internet-Drafts. 

  ** The document seems to lack a 1id_guidelines paragraph about the list of
     Shadow Directories. 

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an Introduction section.

  ** The document seems to lack an IANA Considerations section.  (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** The document seems to lack separate sections for Informative/Normative
     References.  All references will be assumed normative when checking for
     downward references.

  ** The abstract seems to contain references ([2], [3], [4], [5], [1]),
     which it shouldn't.  Please replace those with straight textual mentions
     of the documents in question.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (February 1996) is 10291 days in the past.  Is this
     intentional?


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  -- Possible downref: Non-RFC (?) normative reference: ref. '2'

  -- Possible downref: Non-RFC (?) normative reference: ref. '3'

  -- Possible downref: Non-RFC (?) normative reference: ref. '4'

  -- Possible downref: Non-RFC (?) normative reference: ref. '5'


     Summary: 10 errors (**), 0 flaws (~~), 1 warning (==), 6 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

1	INTERNET-DRAFT                                        W. Richard Stevens
2	Expires: August 26, 1996                                   February 1996
3	<draft-stevens-tcpca-spec-00.txt>

5	                 TCP Slow Start, Congestion Avoidance,
6	             Fast Retransmit, and Fast Recovery Algorithms

8	Status of this Memo

10	    This document is an Internet Draft.  Internet Drafts are working
11	    documents of the Internet Engineering Task Force (IETF), its Areas,
12	    and its Working Groups.  Note that other groups may also distribute
13	    working documents as Internet Drafts.

15	    Internet Drafts are draft documents valid for a maximum of six
16	    months.  Internet Drafts may be updated, replaced, or obsoleted by
17	    other documents at any time.  It is not appropriate to use Internet
18	    Drafts as reference material or to cite them other than as a
19	    "working draft" or "work in progress."

21	    To learn the current status of any Internet-Draft, please check the
22	    "1id-abstracts.txt" listing contained in the internet-drafts Shadow
23	    Directories on:

25	             ftp.is.co.za (Africa)
26	             nic.nordu.net (Europe)
27	             ds.internic.net (US East Coast)
28	             ftp.isi.edu (US West Coast)
29	             munnari.oz.au (Pacific Rim)

31	Abstract

33	    Modern implementations of TCP contain four intertwined algorithms
34	    that have never been fully documented as Internet standards:  slow
35	    start, congestion avoidance, fast retransmit, and fast recovery.
36	    [2] and [3] provide some details on these algorithms, [4] provides
37	    examples of the algorithms in action, and [5] provides the source
38	    code for the 4.4BSD implementation.  RFC 1122 requires that a TCP
39	    must implement slow start and congestion avoidance (Section 4.2.2.15
40	    of [1]), citing [2] as the reference, but fast retransmit and fast
41	    recovery were implemented after RFC 1122.  The purpose of this
42	    Internet Draft is to document these four algorithms for the
43	    Internet.

45	Acknowledgments

47	    Much of this memo is taken from "TCP/IP Illustrated, Volume 1:  The
48	    Protocols" by W. Richard Stevens (Addison-Wesley, 1994) and "TCP/IP
49	    Illustrated, Volume 2: The Implementation" by Gary R. Wright and W.
50	    Richard Stevens (Addison-Wesley, 1995).  This material is used with
51	    the permission of Addison-Wesley.

53	1.  Slow Start

55	    Old TCPs would start a connection with the sender injecting multiple
56	    segments into the network, up to the window size advertised by the
57	    receiver.  While this is OK when the two hosts are on the same LAN,
58	    if there are routers and slower links between the sender and the
59	    receiver, problems can arise.  Some intermediate router must queue
60	    the packets, and it's possible for that router to run out of space.
61	    [2] shows how this naive approach can reduce the throughput of a TCP
62	    connection drastically.

64	    The algorithm to avoid this is called slow start.  It operates by
65	    observing that the rate at which new packets should be injected into
66	    the network is the rate at which the acknowledgments are returned by
67	    the other end.

69	    Slow start adds another window to the sender's TCP:  the congestion
70	    window, called "cwnd".  When a new connection is established with a
71	    host on another network, the congestion window is initialized to one
72	    segment (i.e., the segment size announced by the other end, or the
73	    default, typically 536 or 512).  Each time an ACK is received, the
74	    congestion window is increased by one segment.  The sender can
75	    transmit up to the minimum of the congestion window and the
76	    advertised window.  The congestion window is flow control imposed by
77	    the sender, while the advertised window is flow control imposed by
78	    the receiver.  The former is based on the sender's assessment of
79	    perceived network congestion; the latter is related to the amount of
80	    available buffer space at the receiver for this connection.

82	    The sender starts by transmitting one segment and waiting for its
83	    ACK.  When that ACK is received, the congestion window is
84	    incremented from one to two, and two segments can be sent.  When
85	    each of those two segments is acknowledged, the congestion window is
86	    increased to four.  This provides an exponential increase, although
87	    it is not exactly exponential because the receiver may delay its
88	    ACKs, typically sending one ACK for every two segments that it
89	    receives.
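
	    As a rough illustration of the growth described above, the
	    following Python sketch counts cwnd in whole segments, one round
	    trip at a time.  The function name, the lossless network, and the
	    segs_per_ack parameter (used here to model delayed ACKs) are
	    assumptions made for this example only.

```python
def slow_start_rounds(rounds, segs_per_ack=1, rwnd=64):
    """Return cwnd (in segments) at the start of each round trip.

    Assumptions (not from the memo): no loss occurs, cwnd is counted in
    whole segments, and the receiver sends one ACK for every
    `segs_per_ack` segments (rounded up, so a lone segment is still ACKed).
    """
    cwnd = 1                      # slow start begins with one segment
    history = []
    for _ in range(rounds):
        history.append(cwnd)
        # The sender may transmit up to the minimum of the two windows.
        window = min(cwnd, rwnd)
        # Ceiling division: number of ACKs returned for this window.
        acks = -(-window // segs_per_ack)
        # Each ACK received opens the congestion window by one segment.
        cwnd += acks
    return history


print(slow_start_rounds(5))                  # one ACK per segment
print(slow_start_rounds(6, segs_per_ack=2))  # one ACK per two segments
```

	    Under these assumptions, one ACK per segment doubles the window
	    every round trip, while one ACK per two segments still grows
	    quickly but falls short of doubling.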

91	    At some point the capacity of the internet can be reached, and an
92	    intermediate router will start discarding packets.  This tells the
93	    sender that its congestion window has gotten too large.

95	    Early implementations performed slow start only if the other end was
96	    on a different network.  Current implementations always perform slow
97	    start.

99	2.  Congestion Avoidance

101	    Congestion can occur when data arrives on a big pipe (a fast LAN)
102	    and gets sent out a smaller pipe (a slower WAN).  Congestion can
103	    also occur when multiple input streams arrive at a router whose
104	    output capacity is less than the sum of the inputs.  Congestion
105	    avoidance is a way to deal with lost packets.  It is described in
106	    [2].

108	    The assumption of the algorithm is that packet loss caused by damage
109	    is very small (much less than 1%); therefore the loss of a packet
110	    signals congestion somewhere in the network between the source and
111	    destination.  There are two indications of packet loss:  a timeout
112	    occurring and the receipt of duplicate ACKs.

114	    Congestion avoidance and slow start are independent algorithms with
115	    different objectives.  But when congestion occurs TCP must slow down
116	    its transmission rate of packets into the network, and then invoke
117	    slow start to get things going again.  In practice they are
118	    implemented together.

120	    Congestion avoidance and slow start require that two variables be
121	    maintained for each connection: a congestion window, cwnd, and a
122	    slow start threshold size, ssthresh.  The combined algorithm
123	    operates as follows:

125	    1.  Initialization for a given connection sets cwnd to one segment
126	        and ssthresh to 65535 bytes.

128	    2.  The TCP output routine never sends more than the minimum of cwnd
129	        and the receiver's advertised window.

131	    3.  When congestion occurs (indicated by a timeout or the reception
132	        of duplicate ACKs), one-half of the current window size (the
133	        minimum of cwnd and the receiver's advertised window, but at
134	        least two segments) is saved in ssthresh.  Additionally, if the
135	        congestion is indicated by a timeout, cwnd is set to one segment
136	        (i.e., slow start).

138	    4.  When new data is acknowledged by the other end, increase cwnd,
139	        but the way it increases depends on whether TCP is performing
140	        slow start or congestion avoidance.

142	        If cwnd is less than or equal to ssthresh, TCP is in slow start;
143	        otherwise TCP is performing congestion avoidance.  Slow start
144	        continues until TCP is halfway to where it was when congestion
145	        occurred (since it recorded half of the window size that caused
146	        the problem in step 3), and then congestion avoidance takes
147	        over.

149	        Slow start has cwnd begin at one segment, and be incremented by
150	        one segment every time an ACK is received.  As mentioned
151	        earlier, this opens the window exponentially:  send one segment,
152	        then two, then four, and so on.  Congestion avoidance dictates
153	        that cwnd be incremented by 1/cwnd (with cwnd in segments) each
154	        time an ACK is received.  This is an additive increase, compared
155	        to slow start's exponential increase.  The increase in cwnd
156	        should be at most one segment each round-trip time (regardless
157	        of how many ACKs are received in that RTT), whereas slow start
158	        increments cwnd by the number of ACKs received in a round trip.
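
	    The four steps above can be sketched in Python as follows.  This
	    is a minimal illustration, not the 4.4BSD code: the class and
	    method names are hypothetical, cwnd and ssthresh are kept in
	    bytes, and the congestion avoidance increment is written as
	    segsize*segsize/cwnd, which is assumed here to be the byte-based
	    equivalent of adding 1/cwnd when cwnd is counted in segments.

```python
class CongestionState:
    """Per-connection slow start / congestion avoidance state (sketch)."""

    def __init__(self, segsize=512):
        self.segsize = segsize
        self.cwnd = segsize       # step 1: cwnd starts at one segment
        self.ssthresh = 65535     # step 1: ssthresh starts at 65535 bytes

    def send_window(self, rwnd):
        # Step 2: never send more than min(cwnd, advertised window).
        return min(self.cwnd, rwnd)

    def on_congestion(self, rwnd, timeout):
        # Step 3: save half the current window, but at least two segments.
        half = min(self.cwnd, rwnd) // 2
        self.ssthresh = max(half, 2 * self.segsize)
        if timeout:
            # A timeout additionally restarts slow start.
            self.cwnd = self.segsize

    def on_new_ack(self):
        # Step 4: growth depends on which algorithm is in effect.
        if self.cwnd <= self.ssthresh:
            # Slow start: one segment per ACK.
            self.cwnd += self.segsize
        else:
            # Congestion avoidance: roughly one segment per round trip.
            # Integer division is an assumption of this sketch.
            self.cwnd += self.segsize * self.segsize // self.cwnd


state = CongestionState(segsize=512)
for _ in range(3):
    state.on_new_ack()
state.on_congestion(rwnd=8192, timeout=True)
print(state.cwnd, state.ssthresh)
```

	    After the timeout in this example, cwnd climbs by whole segments
	    until it exceeds ssthresh, and only then switches to the smaller
	    per-ACK increment.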

160	    Many implementations incorrectly add a small fraction of the segment
161	    size (typically the segment size divided by 8) during congestion
162	    avoidance.  This is wrong and should not be emulated in future
163	    releases.
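
	    A small worked example may show why the extra term matters.  The
	    Python sketch below assumes that the fraction is added on every
	    ACK, on top of a per-ACK increment of segsize*segsize/cwnd bytes,
	    that one ACK arrives per segment in flight, and that cwnd is
	    treated as constant within the round trip.  The function name and
	    its parameters are hypothetical.

```python
def growth_per_rtt(cwnd, segsize, extra_divisor=0):
    """Approximate bytes added to cwnd by one round trip of ACKs.

    Assumptions of this sketch: one ACK per full segment in flight, and
    cwnd held fixed while the increments for the round trip are summed.
    `extra_divisor=8` models the extra segsize/8 term; 0 disables it.
    """
    acks = cwnd // segsize                 # ACKs received in one round trip
    per_ack = segsize * segsize / cwnd     # byte form of 1/cwnd segments
    if extra_divisor:
        per_ack += segsize / extra_divisor # the extra fraction per ACK
    return acks * per_ack


print(growth_per_rtt(8192, 512))                   # without the extra term
print(growth_per_rtt(8192, 512, extra_divisor=8))  # with segsize/8 added
```

	    Under these assumptions, a cwnd of 8192 bytes with 512-byte
	    segments grows by 512 bytes (one segment) per round trip without
	    the extra term, and by 1536 bytes (three segments) with it, which
	    exceeds the one-segment-per-round-trip limit.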

165	3.  Fast Retransmit

167	    Modifications to the congestion avoidance algorithm were proposed in
168	    1990 [3].  Before describing the change, realize that TCP may
169	    generate an immediate acknowledgment (a duplicate ACK) when an out-
170	    of-order segment is received (Section 4.2.2.21 of [1], with a note
171	    that one reason for doing so was for the experimental fast-
172	    retransmit algorithm).  This duplicate ACK should not be delayed.
173	    The purpose of this duplicate ACK is to let the other end know that
174	    a segment was received out of order, and to tell it what sequence
175	    number is expected.

177	    Since TCP does not know whether a duplicate ACK is caused by a lost
178	    segment or just a reordering of segments, it waits for a small
179	    number of duplicate ACKs to be received.  It is assumed that if
180	    there is just a reordering of the segments, there will be only one
181	    or two duplicate ACKs before the reordered segment is processed,
182	    which will then generate a new ACK.  If three or more duplicate ACKs
183	    are received in a row, it is a strong indication that a segment has
184	    been lost.  TCP then performs a retransmission of what appears to be
185	    the missing segment, without waiting for a retransmission timer to
186	    expire.  This is the fast retransmit algorithm.
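
	    The duplicate ACK counting described above can be sketched in
	    Python as follows.  This is a simplified illustration: the
	    function name is hypothetical, ACKs are represented only by their
	    cumulative acknowledgment numbers, and any further checks a real
	    TCP applies before treating an ACK as a duplicate are left out.

```python
DUP_ACK_THRESHOLD = 3   # the third duplicate ACK triggers retransmission


def fast_retransmit_points(acks):
    """Return the ACK numbers at which a fast retransmit would be sent.

    `acks` is the sequence of cumulative ACK numbers seen by the sender.
    """
    retransmits = []
    last_ack = None
    dup_count = 0
    for ack in acks:
        if ack == last_ack:
            # Same ACK number again: a duplicate ACK.
            dup_count += 1
            if dup_count == DUP_ACK_THRESHOLD:
                # Strong indication of loss: resend the segment that
                # starts at `ack` without waiting for the timer.
                retransmits.append(ack)
        else:
            # New data acknowledged: restart the duplicate count.
            last_ack = ack
            dup_count = 0
    return retransmits


# Three duplicates of ACK 2000 trigger one retransmission.
print(fast_retransmit_points([1000, 2000, 2000, 2000, 2000, 5000]))
# One duplicate each (simple reordering) triggers none.
print(fast_retransmit_points([1000, 2000, 2000, 3000, 3000]))
```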

188	4.  Fast Recovery

189	    After fast retransmit sends what appears to be the missing segment,
190	    congestion avoidance, but not slow start, is performed.  This is the
191	    fast recovery algorithm.  It is an improvement that allows high
192	    throughput under moderate congestion, especially for large windows.

194	    The reason for not performing slow start in this case is that the
195	    receipt of the duplicate ACKs tells TCP more than just that a packet
196	    has been lost.  Since the receiver can only generate the duplicate
197	    ACK when another segment is received, that segment has left the
198	    network and is in the receiver's buffer.  That is, there is still
199	    data flowing between the two ends, and TCP does not want to reduce
200	    the flow abruptly by going into slow start.

202	    The fast retransmit and fast recovery algorithms are usually
203	    implemented together as follows.

205	    1.  When the third duplicate ACK in a row is received, set ssthresh
206	        to one-half the current congestion window, cwnd, but no less
207	        than two segments.  Retransmit the missing segment.  Set cwnd to
208	        ssthresh plus 3 times the segment size.  This inflates the
209	        congestion window by the number of segments that have left the
210	        network and which the other end has cached (3).

212	    2.  Each time another duplicate ACK arrives, increment cwnd by the
213	        segment size.  This inflates the congestion window for the
214	        additional segment that has left the network.  Transmit a
215	        packet, if allowed by the new value of cwnd.

217	    3.  When the next ACK arrives that acknowledges new data, set cwnd
218	        to ssthresh (the value set in step 1).  This ACK should be the
219	        acknowledgment of the retransmission from step 1, one round-trip
220	        time after the retransmission.  Additionally, this ACK should
221	        acknowledge all the intermediate segments sent between the lost
222	        packet and the receipt of the first duplicate ACK.  This step is
223	        congestion avoidance, since TCP is down to one-half the rate it
224	        was at when the packet was lost.
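
	    The three steps above can be sketched in Python as follows.  The
	    class, its method names, and the list used to record
	    retransmissions are hypothetical; cwnd and ssthresh are kept in
	    bytes, and actual segment transmission is omitted.

```python
class FastRecovery:
    """Fast retransmit and fast recovery window handling (sketch)."""

    def __init__(self, segsize, cwnd):
        self.segsize = segsize
        self.cwnd = cwnd
        self.ssthresh = 65535
        self.dup_acks = 0
        self.retransmitted = []   # hypothetical log of resent sequence numbers

    def on_dup_ack(self, missing_seq):
        self.dup_acks += 1
        if self.dup_acks == 3:
            # Step 1: halve the window (at least two segments), resend
            # the missing segment, and inflate cwnd by the three
            # segments known to have left the network.
            self.ssthresh = max(self.cwnd // 2, 2 * self.segsize)
            self.retransmitted.append(missing_seq)
            self.cwnd = self.ssthresh + 3 * self.segsize
        elif self.dup_acks > 3:
            # Step 2: each further duplicate ACK means one more segment
            # has left the network, so inflate cwnd by one segment.
            self.cwnd += self.segsize

    def on_new_ack(self):
        # Step 3: new data acknowledged; deflate cwnd to ssthresh and
        # continue in congestion avoidance.
        if self.dup_acks >= 3:
            self.cwnd = self.ssthresh
        self.dup_acks = 0


conn = FastRecovery(segsize=512, cwnd=8192)
for _ in range(4):
    conn.on_dup_ack(missing_seq=1000)
print(conn.cwnd, conn.ssthresh)   # inflated window during recovery
conn.on_new_ack()
print(conn.cwnd)                  # deflated to ssthresh
```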

226	    The fast retransmit algorithm first appeared in the 4.3BSD Tahoe
227	    release, but it was incorrectly followed by slow start.  The fast
228	    recovery algorithm appeared in the 4.3BSD Reno release.

230	5.  Security Considerations

232	    Security considerations are not discussed in this memo.

234	6.  References

236	    [1]  B. Braden, ed., "Requirements for Internet Hosts --
237	         Communication Layers," RFC 1122, Oct. 1989.

239	    [2]  V. Jacobson, "Congestion Avoidance and Control," Computer
240	         Communication Review, vol. 18, no. 4, pp. 314-329, Aug. 1988.
241	         ftp://ftp.ee.lbl.gov/papers/congavoid.ps.Z.

243	    [3]  V. Jacobson, "Modified TCP Congestion Avoidance Algorithm,"
244	         end2end-interest mailing list, April 30, 1990.
245	         ftp://ftp.isi.edu/end2end/end2end-interest-1990.mail.

247	    [4]  W. R. Stevens, "TCP/IP Illustrated, Volume 1: The Protocols",
248	         Addison-Wesley, 1994.

250	    [5]  G. R. Wright, W. R. Stevens, "TCP/IP Illustrated, Volume 2:
251	         The Implementation", Addison-Wesley, 1995.

253	Author's Address:

255	    W. Richard Stevens
256	    1202 E. Paseo del Zorro
257	    Tucson, AZ  85718

259	    Phone: 520-297-9416

261	    EMail: rstevens@noao.edu

263	    Expires: August 26, 1996