idnits 2.17.1 

draft-ietf-tsvwg-initwin-03.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Looks like you're using RFC 2026 boilerplate.  This must be updated to
     follow RFC 3978/3979, as updated by RFC 4748.


  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** The document seems to lack a 1id_guidelines paragraph about 6 months
     document validity -- however, there's a paragraph with a matching
     beginning. Boilerplate error?

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard


  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an Introduction section.

  ** The document seems to lack an IANA Considerations section.  (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** The document seems to lack separate sections for Informative/Normative
     References.  All references will be assumed normative when checking for
     downward references.

  ** There are 9 instances of too long lines in the document, the longest one
     being 11 characters in excess of 72.

  ** There is 1 instance of lines with control characters in the document.

  ** The abstract seems to contain references ([RFC2119]), which it
     shouldn't.  Please replace those with straight textual mentions of the
     documents in question.


  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not
     match the current year

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer. 
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- Couldn't find a document date in the document -- date freshness check
     skipped.


  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Missing Reference: 'RFC2481' is mentioned on line 573, but not defined

  ** Obsolete undefined reference: RFC 2481 (Obsoleted by RFC 3168)

  == Missing Reference: 'MMFR96' is mentioned on line 585, but not defined

  == Unused Reference: 'FF96' is defined on line 461, but no explicit
     reference was found in the text

  == Unused Reference: 'FJ93' is defined on line 470, but no explicit
     reference was found in the text

  == Unused Reference: 'Flo96' is defined on line 477, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC2309' is defined on line 516, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC3168' is defined on line 541, but no explicit
     reference was found in the text

  -- Possible downref: Non-RFC (?) normative reference: ref. 'AHO98'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'All97a'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'All97b'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'All00'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'FF96'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'FF98'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'FJ93'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'Flo94'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'Flo96'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'Flo97'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'KAGT98'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'Mor97'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'Nic97'

  ** Obsolete normative reference: RFC  821 (ref. 'Pos82') (Obsoleted by RFC
     2821)

  ** Downref: Normative reference to an Informational RFC: RFC 1945

  ** Obsolete normative reference: RFC 2068 (Obsoleted by RFC 2616)

  ** Obsolete normative reference: RFC 2309 (Obsoleted by RFC 7567)

  ** Obsolete normative reference: RFC 2414 (Obsoleted by RFC 3390)

  ** Downref: Normative reference to an Informational RFC: RFC 2415

  ** Downref: Normative reference to an Informational RFC: RFC 2416

  ** Obsolete normative reference: RFC 2581 (Obsoleted by RFC 5681)

  ** Obsolete normative reference: RFC 2988 (Obsoleted by RFC 6298)


     Summary: 18 errors (**), 0 flaws (~~), 9 warnings (==), 15 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------


2	Internet Engineering Task Force                              Mark Allman
3	INTERNET DRAFT                                              BBN/NASA GRC
4	File: draft-ietf-tsvwg-initwin-03.txt                        April, 2002
5	                                                  Expires: October, 2002
6	                                                             Sally Floyd
7	                                                                    ICIR
8	                                                         Craig Partridge
9	                                                        BBN Technologies

11	                    Increasing TCP's Initial Window

13	Status of this Memo

15	    This document is an Internet-Draft and is in full conformance with
16	    all provisions of Section 10 of RFC2026.

18	    Internet-Drafts are working documents of the Internet Engineering
19	    Task Force (IETF), its areas, and its working groups.  Note that
20	    other groups may also distribute working documents as
21	    Internet-Drafts.

23	    Internet-Drafts are draft documents valid for a maximum of six
24	    months and may be updated, replaced, or obsoleted by other documents
25	    at any time.  It is inappropriate to use Internet-Drafts as
26	    reference material or to cite them other than as "work in progress."

28	    The list of current Internet-Drafts can be accessed at
29	    http://www.ietf.org/ietf/1id-abstracts.txt

31	    The list of Internet-Draft Shadow Directories can be accessed at
32	    http://www.ietf.org/shadow.html.

34	Abstract

36	    This document specifies an optional standard for TCP to increase the
37	    permitted initial window from one segment to roughly 4K bytes,
38	    replacing RFC 2414.  This document discusses the advantages and
39	    disadvantages of the higher initial window.  The document includes
40	    discussion of experiments and simulations showing that the higher
41	    initial window does not lead to congestion collapse. Finally, the
42	    document provides guidance on implementation issues.

44	Terminology

46	    The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
47	    "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
48	    document are to be interpreted as described in RFC 2119 [RFC2119].

50	1.  TCP Modification

52	    This document updates [RFC2414] and specifies an increase in the
53	    permitted upper bound for TCP's initial window from one segment to
54	    between two and four segments.  In most cases, this change results
55	    in an upper bound on the initial window of roughly 4K bytes
56	    (although given a large segment size, the permitted initial window
57	    of two segments may be significantly larger than 4K bytes).  The
58	    upper bound for the initial window is given more precisely in (1):

60	          min (4*MSS, max (2*MSS, 4380 bytes))			    (1)

62	    Equivalently, the upper bound for the initial window size is based
63	    on the maximum segment size (MSS), as follows:

65	        If (MSS <= 1095 bytes)
66	            then win <= 4 * MSS;
67	        If (1095 bytes < MSS < 2190 bytes)
68	            then win <= 4380;
69	        If (2190 bytes <= MSS)
70	            then win <= 2 * MSS;
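
As an illustration only, equation (1) and the equivalent three-range form above can be written as a short Python sketch. The function names are hypothetical and are not part of this specification; the MSS and the result are in bytes.

```python
def initial_window_upper_bound(mss: int) -> int:
    """Upper bound on the initial window in bytes, per equation (1)."""
    if mss <= 0:
        raise ValueError("MSS must be a positive number of bytes")
    return min(4 * mss, max(2 * mss, 4380))


def initial_window_piecewise(mss: int) -> int:
    """The same bound written as the three MSS ranges listed above."""
    if mss <= 1095:
        return 4 * mss
    if mss < 2190:
        return 4380
    return 2 * mss


if __name__ == "__main__":
    # Print the bound for a few illustrative MSS values.
    for mss in (536, 1095, 1460, 2190, 4096):
        print(mss, initial_window_upper_bound(mss))
```

For a 1460-byte MSS the bound is 4380 bytes (three segments), and for a 4096-byte MSS it is 8192 bytes (two segments).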

72	    This increased initial window is optional: a TCP MAY start with
73	    a larger initial window.  However, we expect that most
74	    general-purpose TCP implementations would choose to use the larger
75	    initial congestion window given in equation (1) above.

77	    This upper bound for the initial window size represents a change
78	    from RFC 2581 [RFC2581], which specified that the congestion window
79	    be initialized to one or two segments.

81	    This change applies to the initial window of the connection in the
82	    first round trip time (RTT) of data transmission following the TCP
83	    three-way handshake.  Neither the SYN/ACK nor its acknowledgment
84	    (ACK) in the three-way handshake should increase the initial window
85	    size above that outlined in equation (1).  If the SYN or SYN/ACK is
86	    lost, the initial window used by a sender after a correctly
87	    transmitted SYN MUST be one segment consisting of MSS bytes.

89	    TCP implementations use slow start in as many as three different
90	    ways: (1) to start a new connection (the initial window); (2) to
91	    restart transmission after a long idle period (the restart window);
92	    and (3) to restart transmission after a retransmit timeout (the loss
93	    window).  The change specified in this document affects the value of
94	    the initial window.  Optionally, a TCP MAY set the restart window to
95	    the minimum of the value used for the initial window and the current
96	    value of cwnd (in other words, using a larger value for the restart
97	    window should never increase the size of cwnd).  These changes do
98	    NOT change the loss window, which must remain 1 segment of MSS bytes
99	    (to permit the lowest possible window size in the case of severe
100	    congestion).
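
The relationship between the restart window and the loss window described above can be sketched in Python as follows. This is a minimal illustration with hypothetical function names; all values are in bytes.

```python
def restart_window(initial_window: int, cwnd: int) -> int:
    """Optional restart window after a long idle period.

    Taking the minimum ensures that restarting never increases cwnd.
    """
    return min(initial_window, cwnd)


def loss_window(mss: int) -> int:
    """Loss window after a retransmit timeout: always one segment."""
    return mss
```

A connection whose cwnd has fallen to 2920 bytes restarts with 2920 bytes rather than 4380, while a connection with a large cwnd restarts with at most the initial window.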

102	2.  Implementation Issues

104	    When larger initial windows are implemented along with Path MTU
105	    Discovery [RFC1191], and the MSS being used is found to be too large,
106	    the congestion window `cwnd' SHOULD be reduced to prevent large
107	    bursts of smaller segments.  Specifically, `cwnd' SHOULD be reduced
108	    by the ratio of the old segment size to the new segment size.

110	    When larger initial windows are implemented along with Path MTU
111	    Discovery [RFC1191], alternatives are to set the "Don't Fragment"
112	    (DF) bit in all segments in the initial window, or to set the "Don't
113	    Fragment" (DF) bit in one of the segments.  It is an open question
114	    which of these two alternatives is best; we would hope that
115	    implementation experiences will shed light on this question.  In the
116	    first case of setting the DF bit in all segments, if the initial
117	    packets are too large, then all of the initial packets will be
118	    dropped in the network.  In the second case of setting the DF bit in
119	    only one segment, if the initial packets are too large, then all but
120	    one of the initial packets will be fragmented in the network.  When
121	    the second case is followed, setting the DF bit in the last segment
122	    in the initial window provides the least chance for needless
123	    retransmissions when the initial segment size is found to be too
124	    large, because it minimizes the chances of duplicate ACKs triggering
125	    a Fast Retransmit.  However, more attention needs to be paid to the
126	    interaction between larger initial windows and Path MTU Discovery.

128	    The larger initial window specified in this document is not intended
129	    as encouragement for web browsers to open multiple simultaneous
130	    TCP connections all with large initial windows.  When web browsers
131	    open simultaneous TCP connections to the same destination, this
132	    works against TCP's congestion control mechanisms [FF98], regardless
133	    of the size of the initial window.  Combining this behavior with
134	    larger initial windows further increases the unfairness to other
135	    traffic in the network.

137	3.  Advantages of Larger Initial Windows

139	    1.  When the initial window is one segment, a receiver employing
140	        delayed ACKs [RFC1122] is forced to wait for a timeout before
141	        generating an ACK.  With an initial window of at least two
142	        segments, the receiver will generate an ACK after the second
143	        data segment arrives.  This eliminates the wait on the timeout
144	        (often up to 200 msec, and possibly up to 500 msec [RFC1122]).

146	    2.  For connections transmitting only a small amount of data, a
147	        larger initial window reduces the transmission time (assuming at
148	        most moderate segment drop rates).  For many email (SMTP
149	        [Pos82]) and web page (HTTP [RFC1945, RFC2068]) transfers that
150	        are less than 4K bytes, the larger initial window would reduce
151	        the data transfer time to a single RTT.

153	    3.  For connections that will be able to use large congestion
154	        windows, this modification eliminates up to three RTTs and a
155	        delayed ACK timeout during the initial slow-start phase.  This
156	        will be of particular benefit for high-bandwidth large-
157	        propagation-delay TCP connections, such as those over satellite
158	        links.
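
The effect on short transfers can be illustrated with a deliberately idealized Python model of slow start. It assumes no loss, no delayed ACKs, and a window that doubles every RTT. These assumptions, and the function name, are for illustration only; with delayed ACKs the window grows more slowly, so the savings described in item 3 can be larger than this model shows.

```python
def slow_start_rounds(total_segments: int, initial_window: int) -> int:
    """Round trips needed to send total_segments under idealized slow start."""
    if total_segments <= 0 or initial_window <= 0:
        raise ValueError("arguments must be positive")
    rounds = 0
    sent = 0
    window = initial_window  # in segments
    while sent < total_segments:
        sent += window   # one window of segments per round trip
        window *= 2      # idealized doubling, no delayed ACKs
        rounds += 1
    return rounds
```

In this model a three-segment transfer takes two round trips with a one-segment initial window and one round trip with a three-segment initial window.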

160	4.  Disadvantages of Larger Initial Windows for the Individual
161	    Connection

163	    In high-congestion environments, particularly for routers that have
164	    a bias against bursty traffic (as in the typical Drop Tail router
165	    queues), a TCP connection can sometimes be better off starting with
166	    an initial window of one segment.  There are scenarios where a TCP
167	    connection slow-starting from an initial window of one segment might
168	    not have segments dropped, while a TCP connection starting with an
169	    initial window of four segments might experience unnecessary
170	    retransmits due to the inability of the router to handle small
171	    bursts.  This could result in an unnecessary retransmit timeout.
172	    For a large-window connection that is able to recover without a
173	    retransmit timeout, this could result in an unnecessarily-early
174	    transition from the slow-start to the congestion-avoidance phase of
175	    the window increase algorithm.  These premature segment drops are
176	    unlikely to occur in uncongested networks with sufficient buffering
177	    or in moderately-congested networks where the congested router uses
178	    active queue management (such as Random Early Detection [FJ93,
179	    RFC2309]).

181	    Some TCP connections will receive better performance with the larger
182	    initial window even if the burstiness of the initial window results
183	    in premature segment drops.  This will be true if (1) the TCP
184	    connection recovers from the segment drop without a retransmit
185	    timeout, and (2) the TCP connection is ultimately limited to a small
186	    congestion window by either network congestion or by the receiver's
187	    advertised window.

189	5.  Disadvantages of Larger Initial Windows for the Network

191	    In terms of the potential for congestion collapse, we consider two
192	    separate potential dangers for the network.  The first danger would
193	    be a scenario where a large number of segments on congested links
194	    were duplicate segments that had already been received at the
195	    receiver.  The second danger would be a scenario where a large
196	    number of segments on congested links were segments that would be
197	    dropped later in the network before reaching their final
198	    destination.

200	    In terms of the negative effect on other traffic in the network, a
201	    potential disadvantage of larger initial windows would be that they
202	    increase the general packet drop rate in the network.  We discuss
203	    these three issues below.

205	    Duplicate segments:

207	        As described in the previous section, the larger initial window
208	        could occasionally result in a segment dropped from the initial
209	        window, when that segment might not have been dropped if the
210	        sender had slow-started from an initial window of one segment.
211	        However, Appendix A shows that even in this case, the larger
212	        initial window would not result in the transmission of a large
213	        number of duplicate segments.

215	    Segments dropped later in the network:

217	        How much would the larger initial window for TCP increase the
218	        number of segments on congested links that would be dropped
219	        before reaching their final destination?  This is a problem that
220	        can only occur for connections with multiple congested links,
221	        where some segments might use scarce bandwidth on the first
222	        congested link along the path, only to be dropped later along
223	        the path.

225	        First, many of the TCP connections will have only one congested
226	        link along the path.  Segments dropped from these connections do
227	        not "waste" scarce bandwidth, and do not contribute to
228	        congestion collapse.

230	        However, some network paths will have multiple congested links,
231	        and segments dropped from the initial window could use scarce
232	        bandwidth along the earlier congested links before ultimately
233	        being dropped on subsequent congested links.  To the extent that
234	        the drop rate is independent of the initial window used by TCP
235	        segments, the problem of congested links carrying segments that
236	        will be dropped before reaching their destination will be
237	        similar for TCP connections that start by sending four segments
238	        or one segment.

240	    An increased packet drop rate:

242	        For a network with a high segment drop rate, increasing the TCP
243	        initial window could increase the segment drop rate even
244	        further.  This is in part because routers with Drop Tail queue
245	        management have difficulties with bursty traffic in times of
246	        congestion.  However, given uncorrelated arrivals for TCP
247	        connections, the larger TCP initial window should not
248	        significantly increase the segment drop rate.  Simulation-based
249	        explorations of these issues are discussed in Section 8.2.

251	    These potential dangers for the network are explored in simulations
252	    and experiments described in Section 8.  Our judgment is that
253	    while there are dangers of congestion collapse in the current
254	    Internet (see [FF98] for a discussion of the dangers of congestion
255	    collapse from an increased deployment of UDP connections without
256	    end-to-end congestion control), there is no such danger to the
257	    network from increasing the TCP initial window to 4K bytes.

259	6.  Interactions with the Retransmission Timer

261	    Using a larger initial burst of data can exacerbate existing
262	    problems with spurious retransmit timeouts on low-bandwidth paths,
263	    assuming the standard algorithm for determining the TCP
264	    retransmission timeout (RTO) [RFC2988].  The problem is that across
265	    low-bandwidth network paths on which the transmission time of a
266	    packet is a large portion of the round-trip time, the small packets
267	    used to establish a TCP connection do not seed the RTO estimator
268	    appropriately.  When the first window of data packets is
269	    transmitted, the sender's retransmit timer could expire before the
270	    acknowledgments for those packets are received.  As each
271	    acknowledgment arrives, the retransmit timer is generally reset.
272	    Thus, the retransmit timer will not expire as long as an
273	    acknowledgment arrives at least once a second, given the one-second
274	    minimum on the RTO recommended in RFC 2988.

276	    For instance, consider a 9.6 Kbps link.  The initial RTT measurement
277	    will be on the order of 67 msec, if we simply consider the
278	    transmission time of 2 packets (the SYN and SYN-ACK) each consisting
279	    of 40 bytes.  Using the RTO estimator given in [RFC2988], this
280	    yields an initial RTO of 201 msec (67 + 4*(67/2)).  However, we
281	    round the RTO to 1 second as specified in RFC 2988.  Then assume we
282	    send an initial window of one or more 1500-byte packets (1460 data
283	    bytes plus overhead).  Each packet will take on the order of 1.25
284	    seconds to transmit.  Clearly the RTO will fire before the ACK for
285	    the first packet returns, causing a spurious timeout.  In this case,
286	    a larger initial window of three or four packets exacerbates the
287	    problems caused by this spurious timeout.
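
The arithmetic in this example can be reproduced with the following Python sketch. It follows the first-measurement rule of [RFC2988] (SRTT = R, RTTVAR = R/2, RTO = SRTT + 4*RTTVAR) but, as a simplifying assumption, ignores the clock granularity term. The function names are hypothetical.

```python
def transmission_time(num_bytes: int, link_bps: float) -> float:
    """Seconds needed to serialize num_bytes onto a link of link_bps."""
    return num_bytes * 8 / link_bps


def initial_rto(first_rtt: float, min_rto: float = 1.0) -> float:
    """RTO after the first RTT sample, with the one-second minimum.

    Clock granularity is ignored here for simplicity.
    """
    srtt = first_rtt
    rttvar = first_rtt / 2
    return max(min_rto, srtt + 4 * rttvar)


if __name__ == "__main__":
    link_bps = 9600
    handshake_rtt = transmission_time(2 * 40, link_bps)  # SYN + SYN-ACK
    print("handshake RTT (s):", handshake_rtt)
    print("RTO before rounding up (s):", initial_rto(handshake_rtt, min_rto=0.0))
    print("RTO in use (s):", initial_rto(handshake_rtt))
    print("1500-byte packet (s):", transmission_time(1500, link_bps))
```

The 1.25-second transmission time of a single data packet exceeds the 1-second RTO, which is why the timer fires before the first ACK can return.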

289	    One way to deal with this problem is to make the RTO algorithm more
290	    conservative.  During the initial window of data, for instance, we
291	    could update the RTO for each acknowledgment received.  In
292	    addition, if the retransmit timer expires for some packet lost in
293	    the first window of data, we could leave the exponential-backoff of
294	    the retransmit timer engaged until at least one valid RTT
295	    measurement is received that involves a data packet.

297	    Another method would be to refrain from taking an RTT sample during
298	    connection establishment, leaving the default RTO in place until TCP
299	    takes a sample from a data segment and the corresponding ACK.  While
300	    this method likely helps prevent spurious retransmits, it also slows
301	    the data transfer down if loss occurs before the RTO is seeded.

303	    This specification leaves the decision about what to do (if
304	    anything) with regard to the RTO when using a larger initial window
305	    to the implementer.

307	7.  Typical Levels of Burstiness for TCP Traffic

309	    Larger TCP initial windows would not dramatically increase the
310	    burstiness of TCP traffic in the Internet today, because such
311	    traffic is already fairly bursty.  Bursts of two and three segments
312	    are already typical of TCP [Flo97]; a delayed ACK (covering two
313	    previously unacknowledged segments) received during congestion
314	    avoidance causes the congestion window to slide and two segments to
315	    be sent.  The same delayed ACK received during slow start causes the
316	    window to slide by two segments and then be incremented by one
317	    segment, resulting in a three-segment burst.  While not necessarily
318	    typical, bursts of four and five segments for TCP are not rare.
319	    Assuming delayed ACKs, a single dropped ACK causes the subsequent
320	    ACK to cover four previously unacknowledged segments.  During
321	    congestion avoidance this leads to a four-segment burst and during
322	    slow start a five-segment burst is generated.
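
The burst sizes quoted above follow from a simple rule, sketched here in Python. The sketch assumes that the window slides by the number of newly acknowledged segments, that slow start adds one segment per ACK, and that the small per-ACK growth during congestion avoidance can be ignored. The function name is hypothetical.

```python
def burst_size(segments_acked: int, in_slow_start: bool) -> int:
    """Segments released by one ACK covering segments_acked new segments."""
    if segments_acked < 1:
        raise ValueError("the ACK must cover at least one new segment")
    slide = segments_acked                # window slides past acked data
    growth = 1 if in_slow_start else 0    # slow start: +1 segment per ACK
    return slide + growth
```

A normal delayed ACK (two segments) gives bursts of two or three segments, and an ACK following a single dropped ACK (four segments) gives bursts of four or five.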

324	    There are also changes in progress that reduce the performance
325	    problems posed by moderate traffic bursts.  One such change is the
326	    deployment of higher-speed links in some parts of the network, where
327	    a burst of 4K bytes can represent a small quantity of data.  A
328	    second change, for routers with sufficient buffering, is the
329	    deployment of queue management mechanisms such as RED, which is
330	    designed to be tolerant of transient traffic bursts.

332	8.  Simulations and Experimental Results

334	8.1 Studies of TCP Connections using the Larger Initial Window

336	    This section surveys simulations and experiments that have been used
337	    to explore the effect of larger initial windows on TCP
338	    connections.  The first set of experiments
339	    explores performance over satellite links.  Larger initial windows
340	    have been shown to improve performance of TCP connections over
341	    satellite channels [All97b].  In this study, an initial window of
342	    four segments (512 byte MSS) resulted in throughput improvements of
343	    up to 30% (depending upon transfer size).  [KAGT98] shows that the
344	    use of larger initial windows results in a decrease in transfer time
345	    in HTTP tests over the ACTS satellite system.  A study involving
346	    simulations of a large number of HTTP transactions over hybrid fiber
347	    coax (HFC) indicates that the use of larger initial windows
348	    decreases the time required to load WWW pages [Nic97].

350	    A second set of experiments has explored TCP performance over dialup
351	    modem links.  In experiments over a 28.8 Kbps dialup channel [All97a,
352	    AHO98], a four-segment initial window decreased the transfer time of
353	    a 16 KB file by roughly 10%, with no accompanying increase in the
354	    drop rate.  A particular area of concern has been TCP performance
355	    over low speed tail circuits (e.g., dialup modem links) with routers
356	    with small buffers.  A simulation study [RFC2416] investigated the
357	    effects of using a larger initial window on a host connected by a
358	    slow modem link and a router with a 3 packet buffer.  The study
359	    concluded that for the scenario investigated, the use of larger
360	    initial windows was not harmful to TCP performance.  Questions have
361	    been raised concerning the effects of larger initial windows on the
362	    transfer time for short transfers in this environment, but these
363	    effects have not been quantified.  A question has also been raised
364	    concerning the possible effect on existing TCP connections sharing
365	    the link.

367	    Finally, [All00] illustrates that the percentage of connections at a
368	    particular web server that experience loss in the initial window of
369	    data transmission increases with the size of the initial congestion
370	    window.  However, the increase is in line with what would be
371	    expected from sending a larger burst into the network.

373	8.2 Studies of Networks using Larger Initial Windows

375	    This section surveys simulations and experiments investigating the
376	    impact of the larger window on other TCP connections sharing the
377	    path.  Experiments in [All97a, AHO98] show that for 16 KB transfers
378	    to 100 Internet hosts, four-segment initial windows resulted in a
379	    small increase in the drop rate of 0.04 segments/transfer.  While
380	    the drop rate increased slightly, the transfer time was reduced by
381	    roughly 25% for transfers using the four-segment (512 byte MSS)
382	    initial window when compared to an initial window of one segment.

384	    One scenario of concern is heavily loaded links.  For instance,
385	    several years ago one of the trans-Atlantic links was so heavily
386	    loaded that the correct congestion window size for each connection
387	    was about one segment.  In this environment, new connections using
388	    larger initial windows would be starting with windows that were four
389	    times too big.  What would the effects be?  Do connections thrash?

391	    A simulation study [RFC2415] explores the impact of a larger initial
392	    window on competing network traffic.  In this investigation, HTTP
393	    and FTP flows share a single congested gateway (where the number of
394	    HTTP and FTP flows varies from one simulation set to another).  For
395	    each simulation set, the paper examines aggregate link utilization
396	    and packet drop rates, median web page delay, and network power for
397	    the FTP transfers.  The larger initial window generally resulted in
398	    increased throughput, slightly-increased packet drop rates, and an
399	    increase in overall network power.  With the exception of one
400	    scenario, the larger initial window resulted in an increase in the
401	    drop rate of less than 1% above the loss rate experienced when using
402	    a one-segment initial window; in this scenario, the drop rate
403	    increased from 3.5% with one-segment initial windows, to 4.5% with
404	    four-segment initial windows.  The overall conclusions were that
405	    increasing the TCP initial window to three packets (or 4380 bytes)
406	    helps to improve perceived performance.

408	    Morris [Mor97] investigated larger initial windows in a very
409	    congested network with transfers of size 20K.  The loss rate in
410	    networks where all TCP connections use an initial window of four
411	    segments is shown to be 1-2% greater than in a network where all
412	    connections use an initial window of one segment.  This relationship
413	    held in scenarios where the loss rates with one-segment initial
414	    windows ranged from 1% to 11%.  In addition, in networks where
415	    connections used an initial window of four segments, TCP connections
416	    spent more time waiting for the retransmit timer (RTO) to expire to
417	    resend a segment than was spent when using an initial window of one
418	    segment.  The time spent waiting for the RTO timer to expire
419	    represents idle time when no useful work was being accomplished for
420	    that connection.  These results show that in a very congested
421	    environment, where each connection's share of the bottleneck
422	    bandwidth is close to one segment, using a larger initial window can
423	    cause a perceptible increase in both loss rates and retransmit
424	    timeouts.

426	9.  Security Considerations

428	    This document discusses the initial congestion window permitted for
429	    TCP connections.  Changing this value does not raise any known new
430	    security issues with TCP.

432	10. Conclusion
433	    This document specifies a small change to TCP that will likely be
434	    beneficial to short-lived TCP connections and those over links with
435	    long RTTs (saving several RTTs during the initial slow-start phase).

437	11. Acknowledgments

439	    We would like to acknowledge Vern Paxson, Tim Shepard, members of
440	    the End-to-End-Interest Mailing List, and members of the IETF TCP
441	    Implementation Working Group for continuing discussions of these
442	    issues and for feedback on this document.

444	12. References

446	    [AHO98] Mark Allman, Chris Hayes, and Shawn Ostermann, An Evaluation
447	        of TCP with Larger Initial Windows, March 1998.  Submitted to
448	        ACM Computer Communication Review.  URL:
449	        "http://roland.lerc.nasa.gov/~mallman/papers/initwin.ps".

451	    [All97a] Mark Allman.  An Evaluation of TCP with Larger Initial
452	        Windows.  40th IETF Meeting -- TCP Implementations WG.
453	        December, 1997.  Washington, DC.

455	    [All97b] Mark Allman.  Improving TCP Performance Over Satellite
456	        Channels.  Master's thesis, Ohio University, June 1997.

458	    [All00] Mark Allman. A Web Server's View of the Transport Layer. ACM
459	        Computer Communication Review, 30(5), October 2000.

461	    [FF96] Fall, K., and Floyd, S., Simulation-based Comparisons of
462	        Tahoe, Reno, and SACK TCP.  Computer Communication Review,
463	        26(3), July 1996.

465	    [FF98] Sally Floyd, Kevin Fall.  Promoting the Use of End-to-End
466	        Congestion Control in the Internet.  Submitted to IEEE
467	        Transactions on Networking.  URL
468	        "http://www-nrg.ee.lbl.gov/floyd/end2end-paper.html".

470	    [FJ93] Floyd, S., and Jacobson, V., Random Early Detection gateways
471	        for Congestion Avoidance. IEEE/ACM Transactions on Networking,
472	        V.1 N.4, August 1993, p. 397-413.

474	    [Flo94] Floyd, S., TCP and Explicit Congestion Notification.
475	        Computer Communication Review, 24(5):10-23, October 1994.

477	    [Flo96] Floyd, S., Issues of TCP with SACK. Technical report,
478	        January 1996.  Available from http://www-nrg.ee.lbl.gov/floyd/.

480	    [Flo97] Floyd, S., Increasing TCP's Initial Window.  Viewgraphs,
481	        40th IETF Meeting - TCP Implementations WG. December, 1997.  URL
482	        "ftp://ftp.ee.lbl.gov/talks/sf-tcp-ietf97.ps".

484	    [KAGT98] Hans Kruse, Mark Allman, Jim Griner, Diepchi Tran.  HTTP
485	        Page Transfer Rates Over Geo-Stationary Satellite Links.  March
486	        1998.  Proceedings of the Sixth International Conference on
487	        Telecommunication Systems.  URL
488	        "http://roland.lerc.nasa.gov/~mallman/papers/nash98.ps".

490	    [Mor97] Robert Morris.  Private communication, 1997.  Cited for
491	        acknowledgement purposes only.

493	    [Nic97] Kathleen Nichols.  Improving Network Simulation with
494	        Feedback.  Com21, Inc. Technical Report.  Available from
495	        http://www.com21.com/pages/papers/068.pdf.

497	    [Pos82] Postel, J., "Simple Mail Transfer Protocol", STD 10, RFC
498	        821, August 1982.

500	    [RFC1122] Braden, R., "Requirements for Internet Hosts --
501	        Communication Layers", STD 3, RFC 1122, October 1989.

503	    [RFC1191] Mogul, J., and S. Deering, "Path MTU Discovery", RFC 1191,
504	        November 1990.

506	    [RFC1945] Berners-Lee, T., Fielding, R., and H. Nielsen, "Hypertext
507	        Transfer Protocol -- HTTP/1.0", RFC 1945, May 1996.

509	    [RFC2068] Fielding, R., Mogul, J., Gettys, J., Frystyk, H., and T.
510	        Berners-Lee, "Hypertext Transfer Protocol -- HTTP/1.1", RFC
511	        2068, January 1997.

513	    [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
514	        Requirement Levels", BCP 14, RFC 2119, March 1997.

516	    [RFC2309] Braden, B., Clark, D., Crowcroft, J., Davie, B., Deering,
517	        S., Estrin, D., Floyd, S., Jacobson, V., Minshall, G.,
518	        Partridge, C., Peterson, L., Ramakrishnan, K., Shenker, S.,
519	        Wroclawski, J., and L.  Zhang, "Recommendations on Queue
520	        Management and Congestion Avoidance in the Internet", RFC 2309,
521	        April 1998.

523	    [RFC2414] Allman, M., Floyd, S., and C. Partridge, "Increasing TCP's
524	        Initial Window", RFC 2414, September 1998.

526	    [RFC2415] Poduri, K., and K. Nichols, "Simulation Studies of
527	        Increased Initial TCP Window Size", RFC 2415, September 1998.

529	    [RFC2416] Shepard, T., and C. Partridge, "When TCP Starts Up With
530	        Four Packets Into Only Three Buffers", RFC 2416, September 1998.

532	    [RFC2581] Allman, M., Paxson, V., and W. Stevens, "TCP Congestion
533	        Control", RFC 2581, April 1999.

535	    [RFC2988] Paxson, V., and M. Allman, "Computing TCP's
536	        Retransmission Timer", RFC 2988, November 2000.

538	    [RFC3042] Allman, M., Balakrishnan, H., and S. Floyd, "Enhancing
539	        TCP's Loss Recovery Using Limited Transmit", RFC 3042, January
540	        2001.

541	    [RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The
542	        Addition of Explicit Congestion Notification (ECN) to IP", RFC
543	        3168, September 2001.

545	13. Authors' Addresses

547	    Mark Allman
548	    BBN Technologies/NASA Glenn Research Center
549	    21000 Brookpark Road
550	    MS 54-5
551	    Cleveland, OH 44135
552	    EMail: mallman@bbn.com
553	    http://roland.lerc.nasa.gov/~mallman/

555	    Sally Floyd
556	    ICSI Center for Internet Research
557	    1947 Center St, Suite 600
558	    Berkeley, CA 94704
559	    Phone: +1 (510) 666-2989
560	    EMail: floyd@icir.org
561	    http://www.icir.org/floyd/

563	    Craig Partridge
564	    BBN Technologies
565	    10 Moulton Street
566	    Cambridge, MA 02138
568	    EMail: craig@bbn.com

570	14.  Appendix - Duplicate Segments

572	    In the current environment (without Explicit Congestion Notification
573	    [Flo94] [RFC3168]), all TCPs use segment drops as indications from
574	    the network about the limits of available bandwidth.  We argue here
575	    that the change to a larger initial window should not result in the
576	    sender retransmitting a large number of duplicate segments that have
577	    already arrived at the receiver.

579	    If one segment is dropped from the initial window, there are three
580	    different ways for TCP to recover: (1) Slow-starting from a window
581	    of one segment, as is done after a retransmit timeout, or after Fast
582	    Retransmit in Tahoe TCP; (2) Fast Recovery without selective
583	    acknowledgments (SACK), as is done after three duplicate ACKs in
584	    Reno TCP; and (3) Fast Recovery with SACK, for TCP where both the
585	    sender and the receiver support the SACK option [MMFR96].  In all
586	    three cases, if a single segment is dropped from the initial window,
587	    no duplicate segments (i.e., segments that have already been
588	    received at the receiver) are transmitted.  Note that for a TCP
589	    sending four 512-byte segments in the initial window, a single
590	    segment drop will not require a retransmit timeout, but can be
591	    recovered from using the Fast Retransmit algorithm (unless the
592	    retransmit timer expires prematurely).  In addition, a single
593	    segment dropped from an initial window of three segments might be
594	    repaired using the fast retransmit algorithm, depending on which
595	    segment is dropped and whether or not delayed ACKs are used.  For
596	    example, dropping the first segment of a three segment initial
597	    window will always require waiting for a timeout, in the absence of
598	    Limited Transmit [RFC3042].  However, dropping the third segment
599	    will always allow recovery via the fast retransmit algorithm, as
600	    long as no ACKs are lost.
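
    The single-drop cases above can be explored with a small model.
    The sketch below is illustrative only and rests on assumptions that
    are not part of this document: the sender always has more data to
    send, no ACKs are lost, the retransmit timer does not expire
    prematurely, Limited Transmit is not used, the receiver ACKs every
    out-of-order segment immediately, and (when delayed ACKs are
    enabled) it ACKs every second in-order segment.  The function name
    dup_acks_after_single_drop() is hypothetical.

```python
from collections import deque

DUPACK_THRESHOLD = 3  # duplicate ACKs needed to trigger Fast Retransmit


def dup_acks_after_single_drop(iw, dropped, delayed_acks=False):
    """Count duplicate ACKs the sender sees when one segment of the
    initial window (segments 1..iw) is lost.  Simplified model only."""
    snd_una, snd_nxt, cwnd = 1, iw + 1, iw
    rcv_nxt, pending, dup_acks = 1, 0, 0
    # Segments travel in FIFO order; the dropped one never arrives.
    in_flight = deque(s for s in range(1, iw + 1) if s != dropped)

    def on_ack(ack):
        nonlocal snd_una, snd_nxt, cwnd, dup_acks
        if ack > snd_una:
            # ACK for new data: slow start adds one segment to cwnd,
            # then the sender fills the window with new segments.
            snd_una, cwnd = ack, cwnd + 1
            while snd_nxt < snd_una + cwnd:
                in_flight.append(snd_nxt)
                snd_nxt += 1
        else:
            dup_acks += 1

    while in_flight or pending:
        if not in_flight:
            # Nothing else is arriving: the delayed-ACK timer fires.
            pending = 0
            on_ack(rcv_nxt)
            continue
        seg = in_flight.popleft()
        if seg == rcv_nxt:
            # In-order segment: ACK now, or wait for a second one.
            rcv_nxt += 1
            pending += 1
            if not delayed_acks or pending == 2:
                pending = 0
                on_ack(rcv_nxt)
        else:
            # Out-of-order segment: the receiver ACKs immediately.
            pending = 0
            on_ack(rcv_nxt)
    return dup_acks


# Report, per drop position, whether Fast Retransmit would be triggered
# without and with delayed ACKs.
for iw in (3, 4):
    for dropped in range(1, iw + 1):
        n = dup_acks_after_single_drop(iw, dropped)
        d = dup_acks_after_single_drop(iw, dropped, delayed_acks=True)
        print(iw, dropped, n >= DUPACK_THRESHOLD, d >= DUPACK_THRESHOLD)
```

    In this model, every single drop from a four-segment window yields
    at least three duplicate ACKs.  For a three-segment window, losing
    the first segment never reaches the threshold, losing the third
    always does, and losing the second does so only without delayed
    ACKs.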

602	    Next we consider scenarios where the initial window contains two to
603	    four segments, and at least two of those segments are dropped.  If
604	    all segments in the initial window are dropped, then clearly no
605	    duplicate segments are retransmitted, as the receiver has not yet
606	    received any segments.  (It is still a possibility that these
607	    dropped segments used scarce bandwidth on the way to their drop
608	    point; this issue was discussed in Section 5.)

610	    When two segments are dropped from an initial window of three
611	    segments, the sender will only send a duplicate segment if the first
612	    two of the three segments were dropped, and the sender does not
613	    receive a packet with the SACK option acknowledging the third
614	    segment.

616	    When two segments are dropped from an initial window of four
617	    segments, an examination of the six possible scenarios (which we
618	    don't go through here) shows that, depending on the position of the
619	    dropped packets, in the absence of SACK the sender might send one
620	    duplicate segment.  There are no scenarios in which the sender sends
621	    two duplicate segments.
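
    The six scenarios are the ways of choosing which two of the four
    segments are dropped.  The sketch below only lists those drop
    patterns; it does not model the recovery behavior in each one.  The
    helper name drop_patterns() is hypothetical.

```python
from itertools import combinations


def drop_patterns(window, drops):
    """Return every choice of `drops` dropped segments (numbered from 1)
    out of an initial window of `window` segments."""
    return list(combinations(range(1, window + 1), drops))


# Two drops out of a four-segment initial window.
for pattern in drop_patterns(4, 2):
    print(pattern)
```

    The same helper gives four patterns for three drops out of four
    segments, and three patterns for two drops out of three segments.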

623	    When three segments are dropped from an initial window of four
624	    segments, then, in the absence of SACK, it is possible that one
625	    duplicate segment will be sent, depending on the position of the
626	    dropped segments.

628	    The summary is that in the absence of SACK, there are some scenarios
629	    with multiple segment drops from the initial window where one
630	    duplicate segment will be transmitted.  There are no scenarios where
631	    more than one duplicate segment will be transmitted.  Our conclusion
632	    is that the number of duplicate segments transmitted as a result of
633	    a larger initial window should be small.

635	15. Full Copyright Statement

637	    Copyright (C) The Internet Society (2001). All Rights Reserved.

639	    This document and translations of it may be copied and furnished to
640	    others, and derivative works that comment on or otherwise explain it
641	    or assist in its implementation may be prepared, copied, published
642	    and distributed, in whole or in part, without restriction of any
643	    kind, provided that the above copyright notice and this paragraph are
644	    included on all such copies and derivative works. However, this
645	    document itself may not be modified in any way, such as by removing
646	    the copyright notice or references to the Internet Society or other
647	    Internet organizations, except as needed for the purpose of
648	    developing Internet standards in which case the procedures for
649	    copyrights defined in the Internet Standards process must be
650	    followed, or as required to translate it into languages other than
651	    English.

653	    The limited permissions granted above are perpetual and will not be
654	    revoked by the Internet Society or its successors or assigns.

656	    This document and the information contained herein is provided on an
657	    "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
658	    TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
659	    BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
660	    HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
661	    MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.