Internet Engineering Task Force Mark Allman
INTERNET DRAFT BBN/NASA GRC
File: draft-allman-tcp-abc-03.txt September, 2002
Expires: March, 2003
TCP Congestion Control with Appropriate Byte Counting
Status of this Memo
This document is an Internet-Draft and is in full conformance with
all provisions of Section 10 of RFC2026.
Internet-Drafts are working documents of the Internet Engineering
Task Force (IETF), its areas, and its working groups. Note that
other groups may also distribute working documents as
Internet-Drafts.
Internet-Drafts are draft documents valid for a maximum of six
months and may be updated, replaced, or obsoleted by other documents
at any time. It is inappropriate to use Internet-Drafts as
reference material or to cite them other than as "work in progress."
The list of current Internet-Drafts can be accessed at
http://www.ietf.org/ietf/1id-abstracts.txt
The list of Internet-Draft Shadow Directories can be accessed at
http://www.ietf.org/shadow.html.
Abstract
This document proposes a small modification to the way TCP increases
its congestion window. Rather than the traditional method of
increasing the congestion window by a constant amount for each
arriving acknowledgment, we suggest basing the increase
on the number of previously unacknowledged bytes each ACK covers.
This change improves the performance of TCP, as well as closes a
security hole TCP receivers can use to induce the sender into
increasing the sending rate too rapidly.
Terminology
Much of the language in this document is taken from [RFC2581].
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
document are to be interpreted as described in [RFC2119].
1 Introduction
This document proposes a modified algorithm for increasing TCP's
congestion window (cwnd) that improves both performance and
security. Rather than increasing a TCP's congestion window based on
the number of acknowledgments (ACKs) that arrive at the data sender
the congestion window is increased based on the number of bytes
acknowledged by the arriving ACKs. The algorithm improves
performance by mitigating the impact of delayed ACKs on the growth
of cwnd. At the same time, the algorithm provides more appropriate
cwnd growth in response to ACKs that cover only small amounts of
data (less than a full segment size). More appropriate cwnd growth
can both improve performance and prevent inappropriate cwnd
growth in response to a misbehaving receiver.
This document is organized as follows. Section 2 outlines the
modified algorithm for increasing TCP's congestion window. Section
3 discusses the advantages of using the modified algorithm. Section
4 discusses the disadvantages of the approach outlined in this
document. Section 5 outlines some of the fairness issues that must
be considered for the modified algorithm. Section 6 discusses
security considerations.
2 A Modified Algorithm for Increasing the Congestion Window
As originally outlined in [Jac88] and specified in [RFC2581], TCP
uses two algorithms for increasing the congestion window (cwnd).
During steady-state, TCP uses the Congestion Avoidance algorithm to
linearly increase the value of cwnd. At the beginning of a
transfer, after a retransmission timeout or after a long idle period
(in some implementations), TCP uses the Slow Start algorithm to
increase cwnd exponentially. According to RFC 2581 slow start bases
the cwnd increase on the number of incoming acknowledgments. During
congestion avoidance RFC 2581 allows more latitude in increasing
cwnd, but traditionally implementations have based the increase on
the number of arriving ACKs. In the following two subsections, we
detail modifications to these algorithms to increase cwnd based on
the number of bytes being acknowledged by each arriving ACK, rather
than by the number of ACKs that arrive. We call these changes
``Appropriate Byte Counting'' (ABC) [All99].
2.1 Congestion Avoidance
RFC 2581 specifies that cwnd should be increased by 1 segment per
round-trip time (RTT) during the congestion avoidance phase of a
transfer. Traditionally, TCPs have approximated this increase by
increasing cwnd by 1/cwnd for each arriving ACK. This algorithm
opens cwnd by roughly 1 segment per RTT if the receiver ACKs each
incoming segment and no ACK loss occurs. However, if the receiver
implements delayed ACKs [Bra89] the receiver returns roughly half as
many ACKs which causes the sender to open cwnd more conservatively
(by approximately 1 segment every second RTT). The approach that we
suggest is to store the number of bytes that have been ACKed in a
bytes_acked variable in the TCP control block. When bytes_acked
becomes greater than or equal to the value of the congestion window,
bytes_acked is reduced by the value of cwnd. Next, cwnd is
incremented by a full-sized segment (SMSS). The algorithm suggested
above is specifically allowed by RFC 2581 during congestion
avoidance because it opens the window by at most 1 segment per RTT.
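The bookkeeping described above can be sketched as follows. This is a minimal illustration rather than a reference implementation: the class name, the method name, and the SMSS value of 1460 bytes are assumptions made for the example, and only the cwnd and bytes_acked variables come from the text.

```python
SMSS = 1460  # assumed sender maximum segment size, in bytes


class CongestionAvoidanceABC:
    """Hypothetical holder for the two control-block variables used above."""

    def __init__(self, cwnd):
        self.cwnd = cwnd        # congestion window, in bytes
        self.bytes_acked = 0    # bytes ACKed since the last cwnd increase

    def on_ack(self, newly_acked):
        """Process an ACK covering `newly_acked` previously unACKed bytes."""
        self.bytes_acked += newly_acked
        if self.bytes_acked >= self.cwnd:
            self.bytes_acked -= self.cwnd  # reduce by the current cwnd first
            self.cwnd += SMSS              # then open the window by one segment


# A delayed-ACK receiver: every ACK covers two full-sized segments.
tcp = CongestionAvoidanceABC(cwnd=10 * SMSS)
for _ in range(5):             # five ACKs cover one full window of data
    tcp.on_ack(2 * SMSS)
print(tcp.cwnd // SMSS)        # 11 segments after one RTT
```

In this sketch the window still grows by one segment per RTT even though the receiver returns only half as many ACKs.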
2.2 Slow Start
RFC 2581 states that the sender increments the congestion window by
at most 1*SMSS bytes for each arriving acknowledgment during slow
start. We propose that a TCP sender SHOULD increase cwnd by the
number of previously unacknowledged bytes ACKed by each incoming
acknowledgment provided the increase is not more than L bytes.
Choosing the limit on the increase, L, is discussed in the next
subsection. When the number of previously unacknowledged bytes
ACKed is less than or equal to 1*SMSS bytes or L is less than or
equal to 1*SMSS bytes this proposal is no more aggressive (and
possibly less aggressive) than allowed by RFC 2581. However,
increasing cwnd by more than 1*SMSS bytes in response to a single
ACK is more aggressive than allowed by RFC 2581. We believe the
more aggressive version of the slow start algorithm still falls
within the spirit of the principles outlined in [Jac88] and is safe
for experimentation in shared networks provided an appropriate limit
is applied (see next section).
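As a rough sketch of the slow start rule proposed in this section, the per-ACK increase is the number of newly acknowledged bytes, capped at L. The function name and the SMSS value of 1460 bytes are assumptions chosen for illustration.

```python
SMSS = 1460  # assumed sender maximum segment size, in bytes


def slow_start_increase(cwnd, newly_acked, limit):
    """Return the new cwnd after one ACK during slow start.

    newly_acked: previously unacknowledged bytes covered by this ACK.
    limit:       the limit L on the per-ACK increase, in bytes.
    """
    return cwnd + min(newly_acked, limit)


cwnd = 4 * SMSS
# A delayed ACK covering two full-sized segments:
print(slow_start_increase(cwnd, 2 * SMSS, limit=2 * SMSS) // SMSS)  # 6
print(slow_start_increase(cwnd, 2 * SMSS, limit=1 * SMSS) // SMSS)  # 5
# An ACK covering only 100 bytes increases cwnd by just 100 bytes:
print(slow_start_increase(cwnd, 100, limit=2 * SMSS) - cwnd)        # 100
```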
2.3 Choosing the Limit
The limit, L, chosen for the cwnd increase during slow start
controls the aggressiveness of the algorithm. Choosing L=1*SMSS
bytes provides behavior that is no more aggressive than allowed by
RFC 2581. However, ABC with L=1*SMSS bytes is more conservative in
a number of key ways (as discussed in the next section) and
therefore, we believe that even though with L=1*SMSS bytes TCP
stacks will see little performance benefit, ABC SHOULD be used.
A very large L could potentially lead to large line-rate bursts of
traffic in the face of a large amount of ACK loss or in the case
when the receiver sends ``stretch ACKs'' (ACKs for more than the two
full-sized segments allowed by the delayed ACK algorithm) [Pax97].
This document specifies that TCP implementations MAY use L=2*SMSS
bytes and MUST NOT use L > 2*SMSS bytes. This choice balances
between being conservative (L=1*SMSS bytes) and potentially being
very aggressive. In addition, L=2*SMSS bytes exactly balances the
negative impact of the delayed ACK algorithm (as discussed in more
detail in section 3.2). Note that when L=2*SMSS bytes cwnd growth
is roughly the same as the case when the standard algorithms are
used in conjunction with a receiver that transmits an ACK for each
incoming segment [All98].
The exception to the above suggestion is during a slow start phase
that follows a retransmission timeout (RTO). In this situation, a
TCP MUST use L=1*SMSS as specified in RFC 2581 since ACKs for large
amounts of previously unacknowledged data are common during this
phase of a transfer. These ACKs do not necessarily indicate how
much data has left the network in the last RTT and therefore ABC
cannot accurately determine how much to increase cwnd. As an
example, say segment N is dropped by the network and segments N+1
and N+2 arrive successfully at the receiver. The sender will
receive only two duplicate ACKs and therefore must rely on the
retransmission timer (RTO) to detect the loss. When the RTO expires
segment N is retransmitted. The ACK sent in response to the
retransmission will be for segment N+2. However, this ACK does not
indicate that three segments have left the network in the last RTT,
but rather only a single segment left the network. Therefore, the
appropriate cwnd increment is at most 1*SMSS bytes.
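The choice of L and the retransmission example above can be sketched as follows. The function name, its parameters, and the SMSS value of 1460 bytes are hypothetical and introduced only for this illustration.

```python
SMSS = 1460  # assumed sender maximum segment size, in bytes


def abc_limit(after_rto, allow_two_segments=True):
    """Pick the limit L (in bytes) for the slow start increase.

    after_rto:          True while in the slow start phase that follows
                        a retransmission timeout.
    allow_two_segments: True if the implementation opts for L=2*SMSS.
    """
    if after_rto:
        return 1 * SMSS  # MUST use L=1*SMSS after an RTO
    return 2 * SMSS if allow_two_segments else 1 * SMSS


# Segment N is retransmitted after the RTO; the resulting ACK covers
# segments N, N+1 and N+2, yet only one segment just left the network.
newly_acked = 3 * SMSS
increase = min(newly_acked, abc_limit(after_rto=True))
print(increase // SMSS)  # 1
```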
2.4 RTO Implications
[Jac88] shows that increases in cwnd of more than a factor of two in
succeeding RTTs can cause spurious retransmissions on slow links
where the bandwidth dominates the RTT, assuming the RTO estimator
given in [Jac88] and [RFC2988]. ABC stays within this limit of no
more than doubling cwnd in successive RTTs by capping the increase
(no matter what L is employed) by the number of previously
unacknowledged bytes covered by each incoming ACK.
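The doubling bound can be written out as a short derivation. The symbols a_1, ..., a_k (the bytes newly acknowledged by the ACKs that arrive in one RTT) and Delta (the total cwnd increase in that RTT) are introduced here for illustration, and the sketch assumes that at most one window of data is outstanding during the RTT.

```latex
% a_i: bytes newly acknowledged by the i-th ACK arriving in one RTT
% \Delta: total increase applied to cwnd during that RTT
\begin{align*}
  \sum_{i=1}^{k} a_i &\le \mathit{cwnd}
      && \text{(at most one window of data is outstanding)} \\
  \Delta = \sum_{i=1}^{k} \min(a_i, L) &\le \sum_{i=1}^{k} a_i \le \mathit{cwnd}
      && \text{(each increase is capped by the bytes the ACK covers)} \\
  \mathit{cwnd} + \Delta &\le 2 \cdot \mathit{cwnd}
\end{align*}
```

Under that assumption the bound holds for any value of L.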
3 Advantages
This section outlines several advantages of using the ABC algorithm
to increase cwnd, rather than the standard ACK counting algorithm
given in [RFC2581].
3.1 More Appropriate Congestion Window Increase
The ABC algorithm outlined in section 2 increases TCP's cwnd in
proportion to the amount of data actually sent into the network.
ACK counting, on the other hand, increments cwnd by a constant upon
the arrival of each ACK. For instance, consider an interactive
connection (e.g., ssh or telnet) in which ACKs generally
cover only a few bytes of data, but cwnd is increased by 1*SMSS
bytes for each ACK received. When a large amount of data needs
to be transmitted (e.g., displaying a large file) the data is sent in
one large burst because the cwnd grows by 1*SMSS bytes per ACK
rather than based on the actual amount of capacity used. Such a
line-rate burst of data can potentially cause a large amount of
segment loss.
Congestion Window Validation (CWV) [RFC2861] also helps with the
above problem. CWV limits the amount of unused cwnd a TCP connection can
accumulate. ABC can be used in conjunction with CWV to obtain an
accurate measure of the network path.
3.2 Mitigate the Impact of Delayed ACKs and Lost ACKs
Delayed ACKs [RFC1122,RFC2581] allow a TCP receiver to refrain from
sending an ACK for each incoming segment. However, a receiver
SHOULD send an ACK for every second full-sized segment that arrives.
Furthermore, a receiver MUST NOT withhold an ACK for more than 500
ms. By reducing the number of ACKs sent to the data originator the
receiver is slowing the growth of the congestion window under an ACK
counting system. Using ABC with L=2*SMSS bytes can roughly negate
the negative impact imposed by delayed ACKs by allowing cwnd to be
increased for ACKs that are withheld by the receiver. This allows
the congestion window to grow in a manner similar to the case when
the receiver ACKs each incoming segment, but without adding extra
traffic to the network. Simulation studies have shown increased
throughput when a TCP sender uses ABC when compared to the standard
ACK counting algorithm [All99], especially for short transfers that
never leave the initial slow start period.
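To illustrate the effect of delayed ACKs on slow start, the following toy model compares ACK counting with ABC using L=2*SMSS. It is a simplified sketch and not one of the simulations from [All99]: it counts in whole segments, assumes no loss, assumes an initial window of 2 segments, and assumes the receiver ACKs every second segment plus any odd leftover segment. All function names are hypothetical.

```python
def acks_for_window(segments):
    """Delayed-ACK receiver: one ACK per two segments, plus one ACK
    for an odd leftover segment. Returns the segments each ACK covers."""
    acks = [2] * (segments // 2)
    if segments % 2:
        acks.append(1)
    return acks


def grow(cwnd, rtts, per_ack_increase):
    """Return cwnd (in segments) at the start of each RTT of slow start."""
    history = [cwnd]
    for _ in range(rtts):
        cwnd += sum(per_ack_increase(covered)
                    for covered in acks_for_window(cwnd))
        history.append(cwnd)
    return history


ack_counting = grow(2, 4, lambda covered: 1)            # +1 segment per ACK
abc_l2 = grow(2, 4, lambda covered: min(covered, 2))    # ABC with L=2*SMSS

print(ack_counting)  # [2, 3, 5, 8, 12]
print(abc_l2)        # [2, 4, 8, 16, 32]
```

In this model ACK counting grows cwnd by roughly 1.5 times per RTT, while ABC with L=2*SMSS doubles it, matching the growth seen when the receiver ACKs every segment.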
Note that delayed ACKs should not be an issue during slow
start-based loss recovery, as RFC 2581 recommends that receivers
should not delay ACKs that cover out-of-order segments. Therefore,
as discussed above, ABC with L > 1*SMSS is inappropriate for such
slow start-based loss recovery and MUST NOT be used.
3.3 Prevents Attacks from Misbehaving Receivers
[SCWA99] outlines several methods for a receiver to induce a TCP
sender into violating congestion control and transmitting data at a
potentially inappropriate rate. One of the outlined attacks is
``ACK Splitting''. This scheme involves the receiver sending
multiple ACKs for each incoming data segment, each ACKing only a
small portion of the original TCP data segment. Since TCP senders
have traditionally used ACK counting to increase cwnd, ACK splitting
causes inappropriately rapid cwnd growth and, in turn, a potentially
inappropriate sending rate. A TCP sender that uses ABC can prevent
this attack from being used to undermine standard congestion control
because the cwnd increase is based on the number of bytes ACKed,
rather than the number of ACKs received.
To prevent misbehaving receivers from inducing inappropriate sender
behavior we suggest TCP implementations use ABC, even if L=1*SMSS
bytes (i.e., not allowing ABC to provide more aggressive cwnd growth
than allowed by RFC 2581).
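The difference between the two schemes under ACK splitting can be sketched as follows. The function names, the SMSS value of 1460 bytes, and the choice of four 365-byte ACKs are assumptions made for the example.

```python
SMSS = 1460  # assumed sender maximum segment size, in bytes


def ack_counting_increase(ack_sizes):
    """Slow start with ACK counting: 1*SMSS per ACK, whatever it covers."""
    return SMSS * len(ack_sizes)


def abc_increase(ack_sizes, limit=SMSS):
    """Slow start with ABC: bytes newly ACKed per ACK, capped at L."""
    return sum(min(size, limit) for size in ack_sizes)


honest = [SMSS]      # one ACK for the whole segment
split = [365] * 4    # four ACKs, each covering a quarter of the segment

print(ack_counting_increase(honest), ack_counting_increase(split))  # 1460 5840
print(abc_increase(honest), abc_increase(split))                    # 1460 1460
```

With ACK counting the splitting receiver obtains four times the cwnd increase for the same segment, while with ABC the increase is the same in both cases.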
4 Disadvantages
The main disadvantages of using ABC with L=2*SMSS bytes are an
increase in the burstiness of TCP and a small increase in the
overall loss rate. [All98] discusses the two ways that ABC
increases the burstiness of the TCP sender. First, the ``micro
burstiness'' of the connection is increased. In other words, the
number of segments sent in response to each incoming ACK is
increased by at most 1 segment when using ABC with L=2*SMSS bytes in
conjunction with a receiver that is sending delayed ACKs. During
slow start this translates into an increase from sending 2
back-to-back segments to sending 3 back-to-back packets in response
to an ACK for a single packet. Or, an increase from 3 packets to 4
packets when receiving a delayed ACK for two outstanding packets.
Note that ACK loss can cause larger bursts. However, ABC only
increases the burst size by at most 1*SMSS bytes per ACK received
when compared to the standard behavior. This slight increase in the
burstiness should only cause problems for devices that have very
small buffers. In addition, ABC increases the ``macro burstiness''
of the TCP sender in response to delayed ACKs. Rather than
increasing cwnd by roughly 1.5 times per RTT, ABC roughly doubles
the congestion window every RTT. However, doubling cwnd every RTT
fits within the spirit of slow start, as originally outlined in
[Jac88].
With the increased burstiness comes a modest increase in the loss
rate for a TCP connection employing ABC (see the next section for a
short discussion on the fairness of ABC to non-ABC flows). The
additional loss can be directly attributed to the increased
aggressiveness of ABC. During slow start cwnd is increased more
rapidly and therefore when loss occurs cwnd is larger and more drops
are likely. Similarly, a congestion avoidance cycle takes roughly
half as long when using ABC and delayed ACKs when compared to an ACK
counting implementation. In other words, a TCP sender reaches the
capacity of the network path, drops a packet and reduces the
congestion window by half roughly twice as often when using ABC.
However, as discussed above, in spite of the additional loss an ABC
TCP sender generally obtains better overall performance than a
non-ABC TCP [All99].
Due to the increase in the packet drop rate we suggest ABC be
implemented in conjunction with selective acknowledgments
[RFC2018].
5 Fairness Considerations
[All99] presents several simple simulations conducted to measure the
impact of ABC on competing traffic (both ABC and non-ABC). The
experiments show that while ABC increases the drop rate for the
connection using ABC, competing traffic is not greatly affected.
The experiments show that standard TCP and ABC both obtain roughly
the same throughput regardless of the variant of the competing
traffic. The simulations also reaffirm that ABC outperforms non-ABC
TCP in an environment with varying types of TCP connections. On the
other hand, the simulations presented in [All99] are not necessarily
realistic and therefore we are encouraging more experimentation in the Internet.
6 Security Considerations
As discussed in section 3.3 ABC protects a TCP from a misbehaving
receiver that induces the sender into transmitting at an
inappropriate rate with an ``ACK splitting'' attack. This, in turn,
protects the network from an overly aggressive sender.
7 Conclusions
We RECOMMEND that all TCP stacks be modified to use ABC with
L=1*SMSS bytes. This change does not increase the aggressiveness of
TCP. Furthermore, simulations of ABC with L=2*SMSS bytes show a
promising performance improvement that we encourage researchers to
experiment with in the Internet.
Acknowledgments
This draft has benefited from discussions with and encouragement
from Sally Floyd. Van Jacobson and Reiner Ludwig provided valuable
input on the implications of byte counting on the RTO.
Normative References
[RFC1122] B. Braden, ed., Requirements for Internet Hosts --
Communication Layers, RFC 1122, Oct. 1989.
[RFC2119] S. Bradner, Key words for use in RFCs to Indicate
Requirement Levels, BCP 14, RFC 2119, March 1997.
[RFC2581] Mark Allman, Vern Paxson, W. Richard Stevens. TCP
Congestion Control, April 1999. RFC 2581.
Non-Normative References
[All98] Mark Allman. On the Generation and Use of TCP
Acknowledgments. ACM Computer Communication Review, 28(5), October
1998.
[All99] Mark Allman. TCP Byte Counting Refinements. ACM Computer
Communication Review, 29(3), July 1999.
[Jac88] Van Jacobson. Congestion Avoidance and Control. ACM
SIGCOMM 1988.
[Pax97] Vern Paxson. Automated Packet Trace Analysis of TCP
Implementations. ACM SIGCOMM, September 1997.
[RFC2018] M. Mathis, J. Mahdavi, S. Floyd, A. Romanow. TCP Selective
Acknowledgment Options. RFC 2018, October 1996.
[RFC2861] Mark Handley, Jitendra Padhye, Sally Floyd. TCP
Congestion Window Validation, June 2000. RFC 2861.
[SCWA99] Stefan Savage, Neal Cardwell, David Wetherall, Tom
Anderson. TCP Congestion Control with a Misbehaving Receiver.
ACM Computer Communication Review, 29(5), October 1999.
Author's Addresses:
Mark Allman
BBN Technologies/NASA Glenn Research Center
Lewis Field
21000 Brookpark Rd. MS 54-5
Cleveland, OH 44135
Phone: 216-433-6586
Fax: 216-433-8705
mallman@bbn.com
http://roland.grc.nasa.gov/~mallman