< draft-floyd-incr-init-win-01.txt   draft-floyd-incr-init-win-02.txt >
Internet Engineering Task Force Mark Allman TCP Implementation Working Group M. Allman
INTERNET DRAFT NASA Lewis/Sterling Software INTERNET DRAFT NASA Lewis/Sterling Software
File: draft-floyd-incr-init-win-01.txt Sally Floyd File: draft-floyd-incr-init-win-02.txt S. Floyd
LBL LBNL
Craig Partridge C. Partridge
BBN Technologies BBN Technologies
March, 1998 April, 1998
Expires: September, 1998
Increasing TCP's Initial Window Increasing TCP's Initial Window
Status of this Memo Status of this Memo
This document is an Internet-Draft. Internet-Drafts are working This document is an Internet-Draft. Internet-Drafts are working
documents of the Internet Engineering Task Force (IETF), its areas, documents of the Internet Engineering Task Force (IETF), its areas,
and its working groups. Note that other groups may also distribute and its working groups. Note that other groups may also distribute
working documents as Internet-Drafts. working documents as Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six Internet-Drafts are draft documents valid for a maximum of six
months and may be updated, replaced, or obsoleted by other documents months and may be updated, replaced, or obsoleted by other documents
at any time. It is inappropriate to use Internet-Drafts as at any time. It is inappropriate to use Internet-Drafts as
reference material or to cite them other than as ``work in reference material or to cite them other than as ``work in
progress.'' progress.''
To learn the current status of any Internet-Draft, please check the To view the entire list of current Internet-Drafts, please check
``1id-abstracts.txt'' listing contained in the Internet-Drafts the "1id-abstracts.txt" listing contained in the Internet-Drafts
Shadow Directories on ftp.is.co.za (Africa), nic.nordu.net (Europe), Shadow Directories on ftp.is.co.za (Africa), ftp.nordu.net
munnari.oz.au (Pacific Rim), ds.internic.net (US East Coast), or (Northern Europe), ftp.nis.garr.it (Southern Europe), munnari.oz.au
ftp.isi.edu (US West Coast). (Pacific Rim), ftp.ietf.org (US East Coast), or ftp.isi.edu
(US West Coast).
Abstract Abstract
This is a note to suggest changing the permitted initial window in This document specifies an increase in the permitted initial window
TCP from 1 segment to roughly 4K bytes. This draft considers the for TCP from one segment to roughly 4K bytes. This document
advantages and disadvantages of such a change, as well as outlining discusses the advantages and disadvantages of such a change,
some experimental results that indicate the costs and benefits of outlining experimental results that indicate the costs and benefits
making such a change to TCP, and pointing out remaining research of such a change to TCP.
questions.
Terminology
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL
NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in
RFC 2119 [RFC2119].
1. TCP Modification 1. TCP Modification
This draft suggests allowing the initial window used by a TCP This document specifies an increase in the permitted upper bound
connection to increase from 1 segment to between 2 and 4 segments. for TCP's initial window from one segment to between two
In most cases, this will result in an initial window of roughly 4K and four segments. In most cases, this change results in an upper
bytes (although given a large segment size, the initial window could bound on the initial window of roughly 4K bytes (although given a
be significantly larger than 4K bytes). The proposed initial window large segment size, the permitted initial window of two segments
size is given in (1): could be significantly larger than 4K bytes). The upper bound for
the initial window is given more precisely in (1):
min (4*MSS, max (2*MSS, 4380 bytes)) (1) min (4*MSS, max (2*MSS, 4380 bytes)) (1)
Or, more specifically the initial window size is based on the Equivalently, the upper bound for the initial window size
maximum segment size (MSS), as follows: is based on the maximum segment size (MSS), as follows:
MSS <= 1095 bytes: If (MSS <= 1095 bytes)
win = 4 * MSS then win <= 4 * MSS;
1095 bytes < MSS < 2190 bytes: If (1095 bytes < MSS < 2190 bytes)
win = 4380 then win <= 4380;
MSS >= 2190 bytes: If (2190 bytes <= MSS)
win = 2 * MSS then win <= 2 * MSS;
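The bound in equation (1) and the three-case table above can be checked against each other with a short sketch. The function names below are illustrative only and do not come from the draft; the sketch assumes the MSS is given in bytes.

```python
def initial_window_upper_bound(mss: int) -> int:
    """Upper bound on the initial window, in bytes, per equation (1)."""
    if mss <= 0:
        raise ValueError("MSS must be a positive number of bytes")
    return min(4 * mss, max(2 * mss, 4380))


def piecewise_upper_bound(mss: int) -> int:
    """The same bound written as the three MSS ranges from the table."""
    if mss <= 0:
        raise ValueError("MSS must be a positive number of bytes")
    if mss <= 1095:
        return 4 * mss          # small segments: four segments allowed
    if mss < 2190:
        return 4380             # mid-sized segments: fixed 4380 bytes
    return 2 * mss              # large segments: two segments allowed


if __name__ == "__main__":
    for mss in (512, 1095, 1460, 2190, 4096):
        print(mss, initial_window_upper_bound(mss))
```

For example, an MSS of 512 bytes gives a bound of 2048 bytes (four segments), while an MSS of 1460 bytes gives 4380 bytes (three segments).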
This increased initial window would be optional: that a TCP MAY This increased initial window is optional: that a TCP MAY
start with a larger initial window, not that it SHOULD. start with a larger initial window, not that it SHOULD.
This change would only apply to the initial window of the This upper bound for the initial window size represents a change
connection, in the first round trip time (RTT) of transmission from RFC 2001 [S97], which specifies that the congestion window be
following the TCP three-way handshake. That is, the SYN/ACK in the initialized to one segment. If implementation experience proves
three way handshake should not increase the initial window size successful, then the intent is for this change to be incorporated
above that outlined in equation (1). However, if the SYN or SYN/ACK into a revision to RFC 2001.
is lost the initial window used after a correctly transmitted SYN
MUST be 1 segment.
Some TCP implementations use slow start to re-start transmission This change applies to the initial window of the connection in the
after a long idle period. In this case, the initial window used first round trip time (RTT) of transmission following the TCP
should be the same as the initial window used at the beginning of three-way handshake. Neither the SYN/ACK nor its acknowledgment
the transfer. The change proposed in this document would not change (ACK) in the three-way handshake should increase the initial window
the behavior after a retransmit timeout, when the sender would size above that outlined in equation (1). If the SYN or SYN/ACK is
continue to slow start from an initial window of one segment. lost, the initial window used by a sender after a correctly
transmitted SYN MUST be one segment.
2. Advantages of Larger Initial Windows TCP implementations use slow start in as many as three different ways:
(1) to start a new connection (the initial window); (2) to restart
a transmission after a long idle period (the restart window); and
(3) to restart after a retransmit timeout (the loss window). The
change proposed in this document affects the value of the initial
window. Optionally, a TCP MAY set the restart window to the
same value used for the initial window. These changes do NOT
change the loss window, which must remain 1 segment (to permit the lowest
possible window size in the case of severe congestion).
1. For connections transmitting only a small amount of data, a 2. Implementation Issues
larger initial window would reduce the transmission time
(assuming moderate segment drop rates). For many email (SMTP
[Pos82]) and web page (HTTP [BLFN96, FJGFBL97]) transfers that
are less than 4K bytes, the larger initial window would reduce
the data transfer time to a single RTT.
2. For connections that will be able to use large congestion When larger initial windows are implemented along with Path MTU
windows, this modification eliminates up to three RTTs and a Discovery [MD90], and the MSS being used is found to be too large,
delayed ACK timeout during the initial slow-start phase. This the congestion window `cwnd' SHOULD be reduced to prevent large
would be of particular benefit for high-bandwidth bursts of smaller segments. Specifically, `cwnd' SHOULD be reduced
large-propagation-delay TCP connections, such as those over by the ratio of the old segment size to the new segment size.
satellite links.
3. When the initial window is 1 segment, a receiver employing When larger initial windows are implemented along with Path MTU
delayed acknowledgments (ACK) [Bra89] is forced to wait for a Discovery [MD90], alternatives are to set the "Don't Fragment" (DF)
timeout before generating an ACK. With a larger initial window, bit in all segments in the initial window, or to set the "Don't
the receiver will be able to generate an ACK after the second Fragment" (DF) bit in one of the segments. It is an open question
data segment arrives. This eliminates the need to wait on the which of these two alternatives is best; we would hope that
timeout (0.1 seconds, or more). implementation experiences will shed light on this. In the first
case of setting the DF bit in all segments, if the initial packets
are too large, then all of the initial packets will be dropped in
the network. In the second case of setting the DF bit in only one
segment, if the initial packets are too large, then all but one of
the initial packets will be fragmented in the network. When the
second case is followed, setting the DF bit in the last segment in
the initial window provides the least chance for needless
retransmissions when the initial segment size is found to be too
large, because it minimizes the chances of duplicate ACKs
triggering a Fast Retransmit. However, more attention needs to be
paid to the interaction between larger initial windows and Path MTU
Discovery.
3. Implementation Issues The larger initial window proposed in this document is not intended
as an encouragement for web browsers to open multiple simultaneous
TCP connections all with large initial windows. When web browsers
open simultaneous TCP connections to the same destination, this
works against TCP's congestion control mechanisms [FF98],
regardless of the size of the initial window. Combining this
behavior with larger initial windows further increases the
unfairness to other traffic in the network.
When larger initial windows are implemented along with Path MTU 3. Advantages of Larger Initial Windows
Discovery [MD90], only one of the segments in the initial window
should have the "Don't Fragment" (DF) bit set. Preliminary analysis
indicates that setting the DF bit in the last segment in the initial
window provides the least chance for needless retransmissions and
large line-rate bursts of segments when the initial segment size is
found to be too large. In addition, if the MSS being used is found
to be too large, the cwnd should be reduced to prevent large bursts
of smaller segments. Specifically, cwnd should be reduced by the
ratio of the old segment size to the new segment size. However,
more attention needs to be paid to the interaction between larger
initial windows and Path MTU Discovery.
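The cwnd reduction described for Path MTU Discovery can be sketched as follows. This is one reading of "reduced by the ratio of the old segment size to the new segment size": the byte-counted cwnd is scaled by new_mss/old_mss, so the window holds the same number of (now smaller) segments. The function name and the byte-based cwnd are assumptions made for illustration, not details taken from the draft.

```python
def reduce_cwnd_for_smaller_mss(cwnd_bytes: int, old_mss: int, new_mss: int) -> int:
    """Scale a byte-counted cwnd when the MSS turns out to be too large.

    Assumption: cwnd is kept in bytes and is divided by old_mss/new_mss,
    which preserves the number of segments the window represents.
    """
    if old_mss <= 0 or new_mss <= 0:
        raise ValueError("segment sizes must be positive")
    if new_mss >= old_mss:
        return cwnd_bytes       # the MSS did not shrink; nothing to reduce
    return cwnd_bytes * new_mss // old_mss


if __name__ == "__main__":
    # Three 1460-byte segments become three 536-byte segments.
    print(reduce_cwnd_for_smaller_mss(4380, 1460, 536))
```

Without this reduction, a 4380-byte window would permit a burst of eight 536-byte segments rather than three.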
The larger initial window proposed in this document SHOULD NOT be 1. When the initial window is one segment, a receiver employing
viewed as an encouragement for web browsers to open multiple delayed ACKs [Bra89] is forced to wait for a timeout before
simultaneous TCP connections all with larger initial windows. (Web generating an ACK. With an initial window of at least two
browsers should not open four simultaneous TCP connections to the segments, the receiver will generate an ACK after the second
same destination in any case, because this works against TCP's data segment arrives. This eliminates the wait on the timeout
congestion control mechanisms [FF98]). (often up to 200 msec).
2. For connections transmitting only a small amount of data, a
larger initial window reduces the transmission time (assuming
moderate segment drop rates). For many email (SMTP [Pos82])
and web page (HTTP [BLFN96, FJGFBL97]) transfers that are less
than 4K bytes, the larger initial window would reduce the data
transfer time to a single RTT.
3. For connections that will be able to use large congestion
windows, this modification eliminates up to three RTTs and a
delayed ACK timeout during the initial slow-start phase. This
would be of particular benefit for high-bandwidth
large-propagation-delay TCP connections, such as those over
satellite links.
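The round-trip savings listed above can be illustrated with a deliberately simplified slow-start model. The sketch assumes no segment loss, an ACK for every segment (no delayed ACKs), a congestion window that doubles each RTT, and no receiver-window limit; the function name is hypothetical. Because delayed ACKs are ignored, the model understates the savings the text attributes to avoiding the delayed ACK timeout.

```python
def rounds_to_send(total_segments: int, initial_window: int) -> int:
    """Count the RTT rounds needed to send total_segments in slow start.

    Simplifying assumptions: no loss, one ACK per segment, and the
    congestion window (in segments) doubles after every round.
    """
    if total_segments < 0 or initial_window < 1:
        raise ValueError("need total_segments >= 0 and initial_window >= 1")
    cwnd = initial_window
    remaining = total_segments
    rounds = 0
    while remaining > 0:
        remaining -= min(cwnd, remaining)   # send one window of data
        cwnd *= 2                           # slow start: double per RTT
        rounds += 1
    return rounds


if __name__ == "__main__":
    for segments in (4, 32):
        print(segments, rounds_to_send(segments, 1), rounds_to_send(segments, 4))
```

Under these assumptions a four-segment transfer completes in one round with a four-segment initial window, compared with three rounds when starting from one segment.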
4. Disadvantages of Larger Initial Windows for the Individual 4. Disadvantages of Larger Initial Windows for the Individual
Connection Connection
In high-congestion environments, particularly for routers that have In high-congestion environments, particularly for routers that have
a bias against bursty traffic (as in the typical Drop Tail router a bias against bursty traffic (as in the typical Drop Tail router
queues), a TCP connection can sometimes be better off starting with queues), a TCP connection can sometimes be better off starting with
an initial window of one segment. There are scenarios where a TCP an initial window of one segment. There are scenarios where a TCP
connection slow-starting from an initial window of one segment might connection slow-starting from an initial window of one segment might
not have segments dropped, while a TCP connection starting with an not have segments dropped, while a TCP connection starting with an
initial window of four segments might experience unnecessary initial window of four segments might experience unnecessary
retransmits due to the inability of the router to handle small retransmits due to the inability of the router to handle small
bursts. This could result in an unnecessary retransmit timeout. bursts. This could result in an unnecessary retransmit timeout.
For a large-window connection that is able to recover without a For a large-window connection that is able to recover without a
retransmit timeout, this could result in an unnecessarily-early retransmit timeout, this could result in an unnecessarily-early
transition from the slow-start to the congestion-avoidance phase of transition from the slow-start to the congestion-avoidance phase of
the window increase algorithm. These premature segment drops should the window increase algorithm. These premature segment drops should
not happen in uncongested networks, or in moderately-congested not occur in uncongested networks with sufficient buffering or in
networks where the congested router used active queue management moderately-congested networks where the congested router uses
(such as Random Early Detection [FJ93]). active queue management (such as Random Early Detection [FJ93]).
Some TCP connections will receive better performance with the higher Some TCP connections will receive better performance with the higher
initial window even if the burstiness of the initial window results initial window even if the burstiness of the initial window results
in premature segment drops. This will be true if (1) the TCP in premature segment drops. This will be true if (1) the TCP
connection recovers from the segment drop without a retransmit connection recovers from the segment drop without a retransmit
timeout, and (2) the TCP connection is ultimately limited to a small timeout, and (2) the TCP connection is ultimately limited to a small
congestion window by either network congestion or by the receiver's congestion window by either network congestion or by the receiver's
advertised window. advertised window.
5. Disadvantages of Larger Initial Windows for the Network 5. Disadvantages of Larger Initial Windows for the Network
We consider two separate potential dangers for the network. The In terms of the potential for congestion collapse, we consider two
first danger would be a scenario where a large number of segments on separate potential dangers for the network. The first danger would
congested links were duplicate or unnecessarily-retransmitted be a scenario where a large number of segments on congested links
segments that had already been received at the receiver. The second were duplicate segments that had already been received at the
danger would be a scenario where a large number of segments on receiver. The second danger would be a scenario where a large
congested links were segments that would be dropped later in the number of segments on congested links were segments that would be
network before reaching their final destination. dropped later in the network before reaching their final
destination.
Unnecessarily-retransmitted segments: In terms of the negative effect on other traffic in the network, a
potential disadvantage of larger initial windows would be that they
increase the general packet drop rate in the network. We discuss
these three issues below.
As described in the previous section, the larger initial window Duplicate segments:
could occasionally result in a segment dropped from the initial
window, when that segment might not have been dropped if the As described in the previous section, the larger initial window
sender had slow-started from an initial window of one segment. could occasionally result in a segment dropped from the initial
However, Appendix A shows that even in this case, the larger window, when that segment might not have been dropped if the
initial window would not result in a large number of sender had slow-started from an initial window of one segment.
unnecessarily-retransmitted segments. However, Appendix A shows that even in this case, the larger
initial window would not result in the transmission of a large
number of duplicate segments.
Segments dropped later in the network: Segments dropped later in the network:
How much would the larger initial window for TCP increase the How much would the larger initial window for TCP increase the
number of segments on congested links that would be dropped number of segments on congested links that would be dropped
before reaching their final destination? This is a problem that before reaching their final destination? This is a problem that
can only occur for connections with multiple congested links, can only occur for connections with multiple congested links,
where some segments might use scarce bandwidth on the first where some segments might use scarce bandwidth on the first
congested link along the path, only to be dropped later along congested link along the path, only to be dropped later along
the path. the path.
First, many of the TCP connections will have only one congested First, many of the TCP connections will have only one congested
link along the path. Segments dropped from these connections do link along the path. Segments dropped from these connections do
not ``waste'' scarce bandwidth, and do not contribute to not ``waste'' scarce bandwidth, and do not contribute to
congestion collapse. congestion collapse.
However, some network paths will have multiple congested links, However, some network paths will have multiple congested links,
and segments dropped from the initial window could use scarce and segments dropped from the initial window could use scarce
bandwidth along the earlier congested links before being dropped bandwidth along the earlier congested links before ultimately
on subsequent congested links. To the extent that the drop rate being dropped on subsequent congested links. To the extent
is independent of the initial window used by TCP segments, the that the drop rate is independent of the initial window used by
problem of congested links carrying segments that will be TCP segments, the problem of congested links carrying segments
dropped before reaching their destination will be similar for that will be dropped before reaching their destination will be
TCP connections that start by sending four segments or one similar for TCP connections that start by sending four segments
segment. or one segment.
For a network with a high segment drop rate, increasing the An increased packet drop rate:
initial TCP congestion window could increase the segment drop
rate even further. This is in part because routers with drop
tail queue management have difficulties with bursty traffic in
times of congestion. However, this should be a second order
effect. Given uncorrelated arrivals for TCP connections, the
larger initial TCP congestion window should generally not
significantly increase the segment drop rate.
6. Network Changes For a network with a high segment drop rate, increasing the TCP
initial window could increase the segment drop rate even
further. This is in part because routers with Drop Tail queue
management have difficulties with bursty traffic in times of
congestion. However, given uncorrelated arrivals for TCP
connections, the larger TCP initial window should not
significantly increase the segment drop rate. Simulation-based
explorations of these issues are discussed in Section 7.2.
There are other changes in the network that make a larger initial These potential dangers for the network are explored in simulations
window less of a problem. These include the increasing deployment and experiments described in the section below. Our judgement
of higher-speed links where 4K bytes is a rather small quantity of would be that, while there are dangers of congestion collapse in the
data and the deployment of queue management mechanisms such as RED current Internet (see [FF98] for a discussion of the dangers of
that are more tolerant of transient traffic bursts. The current congestion collapse from an increased deployment of UDP connections
dangers of congestion collapse most likely now come not from a 4K without end-to-end congestion control), there is no such danger to
initial burst from TCP connections, but from the increased the network from increasing the TCP initial window to 4K bytes.
deployment of UDP connections without end-to-end congestion control.
7. Concerns 6. Typical Levels of Burstiness for TCP Traffic
All the experiments (see section 8) with larger initial windows have Larger TCP initial windows would not dramatically increase the
tested how the larger window affects the TCP connection that uses burstiness of TCP traffic in the Internet today, because such
the larger window. No one has thoroughly studied the impact of the traffic is already fairly bursty. Bursts of two and three segments
larger window on other TCP connections. In particular, no one has a are already typical of TCP [Flo97]. A delayed ACK (covering two
thorough set of answers about what happens when a TCP bursts a previously unacknowledged segments) received during congestion
larger initial window into or across a path already being shared by avoidance causes the congestion window to slide and two segments to
a set of established TCP connections. be sent. The same delayed ACK received during slow start causes
the window to slide by two segments and then be incremented by one
segment, resulting in a three-segment burst. While not necessarily
typical, bursts of four and five segments for TCP are not rare.
Assuming delayed ACKs, a single dropped ACK causes the subsequent
ACK to cover four previously unacknowledged segments. During
congestion avoidance this leads to a four-segment burst and during
slow start a five-segment burst is generated.
Part of the reason for this omission is the assumption that the There are also changes in progress that reduce the performance
effect is small. For example, in much of the Internet bursts of 2 problems posed by moderate traffic bursts. One such change is the
and 3 segments are common and bursts of 4 and 5 segments are not deployment of higher-speed links in some parts of the network,
rare. A delayed ACK (covering two previously unacknowledged where a burst of 4K bytes can represent a small quantity of data.
segments) received during congestion avoidance causes the window to A second change, for routers with sufficient buffering, is the
slide and 2 segments to be sent. The same delayed ACK received deployment of queue management mechanisms such as RED, which is
during slow start causes the window to slide by 2 segments and then designed to be tolerant of transient traffic bursts.
be incremented by 1 segment, leading to a 3 segment burst. Assuming
delayed ACKs, a single dropped ACK causes the subsequent ACK to
cover 4 previously unacknowledged segments. During congestion
avoidance this leads to a 4 segment burst and during slow start a 5
segment burst is generated.
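The burst sizes quoted above follow from simple arithmetic, sketched here. The model assumes that an arriving ACK lets the sender transmit one segment for every newly acknowledged segment (the window slides), plus one more segment in slow start (the window is incremented by one). The small per-ACK window growth in congestion avoidance is ignored, and the function name is illustrative.

```python
def burst_size(acked_segments: int, slow_start: bool) -> int:
    """Segments sent back-to-back in response to a single ACK.

    acked_segments: previously unacknowledged segments this ACK covers.
    slow_start: True if cwnd is also incremented by one segment.
    """
    if acked_segments < 1:
        raise ValueError("an ACK must cover at least one new segment")
    return acked_segments + (1 if slow_start else 0)


if __name__ == "__main__":
    # A delayed ACK covers two segments; after one dropped ACK, the next covers four.
    for acked in (2, 4):
        print(acked, burst_size(acked, slow_start=False), burst_size(acked, slow_start=True))
```

A four-segment initial window therefore falls within the range of bursts that this model already produces after a single dropped ACK.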
However, there are some common scenarios where a larger initial 7. Simulations and Experimental Results
window might have an effect. One example is low speed tail circuits
with routers with small buffers. For instance, imagine a dialup
link connecting routers each of which have a handful of buffers.
Further imagine the link is already being shared by a few TCP
connections. Then a new connection launches a large initial window,
causing losses. How long will it be before the connections resume
sharing the link fairly? Are there any signs of a capture effect,
in which the new TCP gets a large fraction of the bandwidth? (A
capture effect could ensure that, say, an SMTP server got more
bandwidth than a long running FTP).
Another scenario of concern is heavily loaded links. For instance, 7.1 Studies of TCP Connections using the Larger Initial Window
a couple of years ago, one of the trans-Atlantic links was so
heavily loaded that the correct congestion window size for a
connection was about one segment. In this environment, new
connections using larger initial windows would be starting with
windows that were four times too big. What would the effects be?
Do connections thrash?
8. Experimental Results This section surveys simulations and experiments that have been
used to explore the effect of larger initial windows on the TCP
connection using that larger window. The first set of experiments
explores performance over satellite links. Larger initial windows
have been shown to improve performance of TCP connections over
satellite channels [All97b]. In this study, an initial window of
four segments (512 byte MSS) resulted in throughput improvements of
up to 30% (depending upon transfer size). [HAGT98] shows that the
use of larger initial windows results in a decrease in transfer
time in HTTP tests over the ACTS satellite system. A study
involving simulations of a large number of HTTP transactions over
hybrid fiber coax (HFC) indicates that the use of larger initial
windows decreases the time required to load WWW pages [Nic97].
8.1 Studies of TCP Connections using Larger Initial Windows A second set of experiments has explored TCP performance over
dialup modem links. In experiments over a 28.8 Kbps dialup channel
[All97a, AHO98], a four-segment initial window decreased the
transfer time of a 16KB file by roughly 10%, with no accompanying
increase in the drop rate. A particular area of concern has been
TCP performance over low speed tail circuits (e.g., dialup modem
links) with routers with small buffers. A simulation study [SP97]
investigated the effects of using a larger initial window on a host
connected by a slow modem link and a router with a 3 packet
buffer. The study concluded that for the scenario investigated,
the use of larger initial windows was not harmful to TCP
performance. Questions have been raised concerning the effects of
larger initial windows on the transfer time for short transfers in
this environment, but these effects have not been quantified. A
question has also been raised concerning the possible effect on
existing TCP connections sharing the link.
A number of studies have been done using larger initial windows. 7.2 Studies of Networks using Larger Initial Windows
The first study considers the effects on the global Internet, as
well as on slow dialup modem links [All97a]. These test results
show that for 16 KB transfers to 100 Internet hosts, 4 segment
initial windows resulted in an increase in the drop rate of 0.04
segments/transfer. While the drop rate increased slightly, the
transfer time was reduced by roughly 25% for transfers using a 4
segment (512 byte MSS) initial window when compared to an initial
window of 1 segment. Tests over a 28.8 Kbps dialup channel showed no
increase in the drop rate and a transfer time decrease of roughly
10% over standard TCP when using a 4 segment initial window.
In another study, larger initial windows have been shown to improve This section surveys simulations and experiments investigating the
performance over satellite channels [All97b]. In this study, an impact of the larger window on other TCP connections sharing the
initial window of 4 segments (512 byte MSS) resulted in throughput path. Experiments in [All97a, AHO98] show that for 16 KB transfers
improvements of up to 30% (depending upon transfer size). to 100 Internet hosts, four-segment initial windows resulted in a
small increase in the drop rate of 0.04 segments/transfer. While
the drop rate increased slightly, the transfer time was reduced by
roughly 25% for transfers using the four-segment (512 byte MSS)
initial window when compared to an initial window of one segment.
Next, a study involving simulations of a large number of HTTP One scenario of concern is heavily loaded links. For
transactions over hybrid fiber coax (HFC) indicates that the use of instance, a couple of years ago, one of the trans-Atlantic links
larger initial windows decreases the time required to load WWW pages was so heavily loaded that the correct congestion window size for a
[Nic97]. [HAGT98] also shows that the use of larger initial windows connection was about one segment. In this environment, new
results in a decrease in transfer time in HTTP tests over the ACTS connections using larger initial windows would be starting with
satellite system. windows that were four times too big. What would the effects be?
Do connections thrash?
A study investigated the effects of using a larger initial window on A simulation study in [PN98] explores the impact of a larger
a host connected by a slow modem link and a router with a 3 packet initial window on competing network traffic. In this
buffer [SP97]. This study found that in this environment, larger investigation, HTTP and FTP flows share a single congested gateway
initial windows slightly improved performance. (where the number of HTTP and FTP flows varies from one simulation
set to another). For each simulation set, the paper examines
aggregate link utilization and packet drop rates, median web page
delay, and network power for the FTP transfers. The larger initial
window generally resulted in increased throughput,
slightly-increased packet drop rates, and an increase in overall
network power. With the exception of one scenario, the larger
initial window resulted in an increase in the drop rate of less
than 1% above the loss rate experienced when using a one-segment
initial window; in this scenario, the drop rate increased from
3.5% with one-segment initial windows, to 4.5% with four-segment
initial windows. The overall conclusions were that increasing the
TCP initial window to three packets (or 4380 bytes) helps to
improve perceived performance.
8.2 Studies of Networks using Larger Initial Windows Morris [Mor97] investigated larger initial windows in a very
congested network with transfers of size 20K. The loss rate in
networks where all TCP connections use an initial window of four
segments is shown to be 1-2% greater than in a network where all
connections use an initial window of one segment. This
relationship held in scenarios where the loss rates with
one-segment initial windows ranged from 1% to 11%. In addition, in
networks where connections used an initial window of four segments,
TCP connections spent more time waiting for the retransmit timer
(RTO) to expire to resend a segment than was spent when using an
initial window of one segment. The time spent waiting for the RTO
timer to expire represents idle time when no useful work was being
accomplished for that connection. These results show that in a
very congested environment, where each connection's share of the
bottleneck bandwidth is close to one segment, using a larger
initial window can cause a perceptible increase in both loss rates
and retransmit timeouts.
A simulation study of how the use of a larger initial window impacts
competing network traffic is outlined in [PN98]. In this
investigation, a number of HTTP and FTP flows were sharing a
congested gateway (the exact number of flows was varied in this
study). The study showed improvement in HTTP transfer times on the
order of 30% in many scenarios. In addition, a larger initial
window slightly increased the segment drop rate (only one scenario
increased the drop rate more than 1% above the loss rate experienced
when using an initial window of 1 segment).
Morris [Mor97] investigated larger initial windows in a very congested
network. The loss rate in networks where all TCP connections use an
initial window of 4 segments is shown to be 1-2% greater than in a
network where all connections use an initial window of 1 segment.
In addition, in networks where connections used an initial window of
4 segments, roughly 5-10% more time was spent waiting for the
retransmit timer (RTO) to expire to resend a segment than was spent
when using an initial window of 1 segment. The time spent waiting
for the RTO timer to expire represents idle time when no useful work
was being accomplished. These results show that in a very congested
environment, where each connection's share of the bottleneck
bandwidth is close to 1 segment, using a larger initial window
degrades performance.
8. Security Considerations
This document discusses the initial congestion window permitted
for TCP connections. Changing this value does not raise any known
new security issues with TCP.
9. Conclusion
This document proposes a small change to TCP that may be beneficial to
short-lived TCP connections and those over links with long RTTs
(saving several RTTs during the initial slow-start phase).
10. Acknowledgments
We would like to acknowledge Vern Paxson, Tim Shepard, members of
the End-to-End-Interest Mailing List, and members of the IETF TCP
Implementation Working Group for continuing discussions of these
issues and for feedback on this document.
11. References
[All97a] Mark Allman. An Evaluation of TCP with Larger Initial
Windows. 40th IETF Meeting -- TCP Implementations WG.
December, 1997. Washington, DC.
[AHO98] Mark Allman, Chris Hayes, and Shawn Ostermann, An
Evaluation of TCP with Larger Initial Windows, March 1998.
Submitted to ACM Computer Communication Review. URL
"http://gigahertz.lerc.nasa.gov/~mallman/papers/initwin.ps".
[All97b] Mark Allman. Improving TCP Performance Over Satellite
Channels. Master's thesis, Ohio University, June 1997.
[BLFN96] Tim Berners-Lee, R. Fielding, and H. Nielsen. Hypertext
Transfer Protocol -- HTTP/1.0, May 1996. RFC 1945.
[Bra89] Robert Braden. Requirements for Internet Hosts --
Communication Layers, October 1989. RFC 1122.
[FF96] Fall, K., and Floyd, S., Simulation-based Comparisons of
Tahoe, Reno, and SACK TCP. Computer Communication Review,
26(3), July 1996.
[FF98] Sally Floyd, Kevin Fall. Promoting the Use of End-to-End
Congestion Control in the Internet. Submitted to IEEE
Transactions on Networking. URL
"http://www-nrg.ee.lbl.gov/floyd/end2end-paper.html".
[FJGFBL97] R. Fielding, Jeffrey C. Mogul, Jim Gettys, H. Frystyk,
and Tim Berners-Lee. Hypertext Transfer Protocol -- HTTP/1.1,
January 1997. RFC 2068.
[FJ93] Floyd, S., and Jacobson, V., Random Early Detection gateways
for Congestion Avoidance. IEEE/ACM Transactions on Networking,
V.1 N.4, August 1993, p. 397-413.
[Flo94] Floyd, S., TCP and Explicit Congestion Notification.
Computer Communication Review, 24(5):10-23, October 1994.
[Flo96] Floyd, S., Issues of TCP with SACK. Technical report, January
1996. Available from http://www-nrg.ee.lbl.gov/floyd/.
[Flo97] Floyd, S., Increasing TCP's Initial Window. Viewgraphs,
40th IETF Meeting - TCP Implementations WG. December, 1997.
URL "ftp://ftp.ee.lbl.gov/talks/sf-tcp-ietf97.ps".
[KAGT98] Hans Kruse, Mark Allman, Jim Griner, Diepchi Tran. HTTP
Page Transfer Rates Over Geo-Stationary Satellite Links. March
1998. Proceedings of the Sixth International Conference on
Telecommunication Systems. URL
"http://gigahertz.lerc.nasa.gov/~mallman/papers/nash98.ps".
[MD90] Jeffrey C. Mogul and Steve Deering. Path MTU Discovery,
November 1990. RFC 1191.
[MMFR96] Matt Mathis, Jamshid Mahdavi, Sally Floyd and Allyn
Romanow. TCP Selective Acknowledgment Options, October 1996.
RFC 2018.
[Mor97] Robert Morris. Private communication, 1997. Cited for
acknowledgement purposes only.
[Nic97] Kathleen Nichols. Improving Network Simulation with
Feedback. Com21, Inc. Technical Report. Available from
http://www.com21.com/pages/papers/068.pdf.
[PN98] Poduri, K., and Nichols, K., Simulation Studies of Increased
Initial TCP Window Size, February 1998. Internet-Draft
draft-ietf-tcpimpl-poduri-00.txt (work in progress).
[Pos82] Jon Postel. Simple Mail Transfer Protocol, August 1982.
RFC 821.
[RF97] Ramakrishnan, K.K., and Floyd, S., A Proposal to Add Explicit
Congestion Notification (ECN) to IPv6 and to TCP. Internet-Draft
draft-kksjf-ecn-00.txt (work in progress). November 1997.
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
Requirement Levels", BCP 14, RFC 2119, March 1997.
[S97] W. Stevens, TCP Slow Start, Congestion Avoidance, Fast
Retransmit, and Fast Recovery Algorithms. RFC 2001, Proposed
Standard, January 1997.
[SP97] Tim Shepard and Craig Partridge. When TCP Starts Up With
Four Packets Into Only Three Buffers, July 1997. Internet-Draft
draft-shepard-TCP-4-packets-3-buff-00.txt (work in progress).
Appendix A
In the current environment (without Explicit Congestion Notification
[Flo94] [RF97]), all TCPs use segment drops as indications from the
network about the limits of available bandwidth. The change to a
larger initial window should not result in a large number of
unnecessarily-retransmitted segments.
If a segment is dropped from the initial window, there are three
different ways for TCP to recover: (1) Slow-starting from a window
of one segment, as is done after a retransmit timeout, or after Fast
Retransmit in Tahoe TCP; (2) Fast Recovery without selective
acknowledgments (SACK), as is done after three duplicate ACKs in
Reno TCP; and (3) Fast Recovery with SACK, for TCP where both the
sender and the receiver support the SACK option [MMFR96]. In all
three cases, if a single segment is dropped from the initial window,
there are no unnecessarily-retransmitted segments. Note that for a
TCP sending four 512-byte segments in the initial window, a single
segment drop will not require a retransmit timeout, but can be
recovered from using the Fast Retransmit algorithm. In addition, a
single segment dropped from an initial window of three segments may
be repaired using the fast retransmit algorithm, depending on which
segment is dropped and whether or not delayed ACKs are used. For
example, dropping the first segment of a three segment initial
window will always require waiting for a timeout. However, dropping
the third segment will always allow recovery via the fast retransmit
algorithm.
We now consider the case when multiple segments are dropped from the
initial window. Using the first recovery method, slow-starting from
a window of one segment, the number of unnecessarily-retransmitted
segments is limited [FF96]. In the second case of Fast Recovery
without SACK, multiple segment drops from a window of data generally
result in a retransmit timeout. Again, the number of
unnecessarily-retransmitted segments is small. In the third case,
of Fast Recovery with SACK, there can only be
unnecessarily-retransmitted segments if a precise pattern of ACK
segments is also lost [Flo96], or if segments are
seriously-reordered in the network. In any case, the number of
unnecessarily-retransmitted segments due to a larger initial window
should be small.
12. Author's Addresses
Mark Allman
NASA Lewis Research Center/Sterling Software
21000 Brookpark Road
MS 54-2
Cleveland, OH 44135
mallman@lerc.nasa.gov
http://gigahertz.lerc.nasa.gov/~mallman/
Sally Floyd
Lawrence Berkeley National Laboratory
One Cyclotron Road
Berkeley, CA 94720
floyd@ee.lbl.gov
Craig Partridge
BBN Technologies
10 Moulton Street
Cambridge, MA 02138
craig@bbn.com
13. Appendix - Duplicate Segments
In the current environment (without Explicit Congestion
Notification [Flo94] [RF97]), all TCPs use segment drops as
indications from the network about the limits of available
bandwidth. We argue here that the change to a larger initial
window should not result in the sender retransmitting
a large number of duplicate segments that have already been
received at the receiver.
If one segment is dropped from the initial window, there are three
different ways for TCP to recover: (1) Slow-starting from a window
of one segment, as is done after a retransmit timeout, or after Fast
Retransmit in Tahoe TCP; (2) Fast Recovery without selective
acknowledgments (SACK), as is done after three duplicate ACKs in
Reno TCP; and (3) Fast Recovery with SACK, for TCP where both the
sender and the receiver support the SACK option [MMFR96]. In all
three cases, if a single segment is dropped from the initial window,
no duplicate segments (i.e., segments that have already been
received at the receiver) are transmitted. Note that for a
TCP sending four 512-byte segments in the initial window, a single
segment drop will not require a retransmit timeout, but can be
recovered from using the Fast Retransmit algorithm (unless the
retransmit timer expires prematurely). In addition, a single
segment dropped from an initial window of three segments might be
repaired using the fast retransmit algorithm, depending on which
segment is dropped and whether or not delayed ACKs are used. For
example, dropping the first segment of a three segment initial
window will always require waiting for a timeout. However,
dropping the third segment will always allow recovery via the fast
retransmit algorithm, as long as no ACKs are lost.
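The single-drop cases above can be checked with a toy model. The following is a minimal sketch under stated assumptions, not a faithful TCP implementation: the sender always has more data to send, grows its window by one segment per new ACK (slow start), and sends nothing in response to duplicate ACKs; no ACKs are lost; a delayed-ACK receiver acknowledges every second in-order segment, acknowledges out-of-order segments immediately, and its delayed-ACK timer fires before the retransmit timer. The function name and parameters are hypothetical.

```python
from collections import deque

DUPACK_THRESHOLD = 3  # duplicate ACKs needed to trigger Fast Retransmit


def recovery_after_single_drop(init_win, dropped, delayed_acks):
    """Return "fast retransmit" or "timeout" for one drop in the initial window.

    Segments are numbered from 1; an ACK carries the next segment expected.
    """
    expected = 1            # receiver: next in-order segment
    pending = False         # receiver: one in-order segment not yet ACKed
    snd_una = 1             # sender: oldest unacknowledged segment
    next_seq = init_win + 1
    cwnd = init_win
    dupacks = 0
    wire = deque(s for s in range(1, init_win + 1) if s != dropped)

    while True:
        acks = []
        while wire:                       # receiver handles arriving segments
            seg = wire.popleft()
            if seg == expected:
                expected += 1
                if delayed_acks and not pending:
                    pending = True        # hold the ACK for a second segment
                else:
                    acks.append(expected)
                    pending = False
            else:                         # out of order: ACK immediately
                acks.append(expected)
                pending = False
        if not acks and pending:          # delayed-ACK timer fires
            acks.append(expected)
            pending = False
        if not acks:
            return "timeout"              # nothing left to clock out data

        for ack in acks:                  # sender handles returning ACKs
            if ack > snd_una:
                snd_una, cwnd, dupacks = ack, cwnd + 1, 0
                while next_seq - snd_una < cwnd:
                    wire.append(next_seq)  # only the original drop is lost
                    next_seq += 1
            else:
                dupacks += 1
                if dupacks == DUPACK_THRESHOLD:
                    return "fast retransmit"


if __name__ == "__main__":
    for win in (3, 4):
        for drop in range(1, win + 1):
            for delack in (False, True):
                print(win, drop, delack,
                      recovery_after_single_drop(win, drop, delack))
```

Within this model, every single drop from a four-segment window is repaired by Fast Retransmit, while for a three-segment window the outcome depends on the position of the drop and, for the middle segment, on whether delayed ACKs are in use.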
Next we consider scenarios where the initial window contains
two to four segments, and at least two of those segments are dropped.
If all segments in the initial window are dropped, then clearly
no duplicate segments are retransmitted, as the receiver has not yet
received any segments. (It is still a possibility that these dropped
segments used scarce bandwidth on the way to their drop point;
this issue was discussed in Section 5.)
When two segments are dropped from an initial window of three
segments, the sender will only send a duplicate segment if the
first two of the three segments were dropped, and the sender does
not receive a packet with the SACK option acknowledging the third
segment.
When two segments are dropped from an initial window of four
segments, an examination of the six possible scenarios (which we
don't go through here) shows that, depending on the position of the
dropped packets, in the absence of SACK the sender might send one
duplicate segment. There are no scenarios in which the sender
sends two duplicate segments.
When three segments are dropped from an initial window of four segments,
then, in the absence of SACK, it is possible that one duplicate
segment will be sent, depending on the position of the dropped segments.
The summary is that in the absence of SACK, there are some
scenarios with multiple segment drops from the initial window where
one duplicate segment will be transmitted. There are no scenarios
where more than one duplicate segment will be transmitted. Our
conclusion is that the number of duplicate segments transmitted as
a result of a larger initial window should be small.
14. Full Copyright Statement
[This section would be filled in with the standard template if
this document advances to an RFC.]