| < draft-allman-tcp-abc-03.txt | draft-allman-tcp-abc-04.txt > | |||
|---|---|---|---|---|
| A new Request for Comments is now available in online RFC libraries. | ||||
| Internet Engineering Task Force Mark Allman | RFC 3465 | |||
| INTERNET DRAFT BBN/NASA GRC | ||||
| File: draft-allman-tcp-abc-03.txt September, 2002 | ||||
| Expires: March, 2003 | ||||
| TCP Congestion Control with Appropriate Byte Counting | ||||
| Status of this Memo | ||||
| This document is an Internet-Draft and is in full conformance with | ||||
| all provisions of Section 10 of RFC2026. | ||||
| Internet-Drafts are working documents of the Internet Engineering | ||||
| Task Force (IETF), its areas, and its working groups. Note that | ||||
| other groups may also distribute working documents as | ||||
| Internet-Drafts. | ||||
| Internet-Drafts are draft documents valid for a maximum of six | ||||
| months and may be updated, replaced, or obsoleted by other documents | ||||
| at any time. It is inappropriate to use Internet-Drafts as | ||||
| reference material or to cite them other than as "work in progress." | ||||
| The list of current Internet-Drafts can be accessed at | ||||
| http://www.ietf.org/ietf/1id-abstracts.txt | ||||
| The list of Internet-Draft Shadow Directories can be accessed at | ||||
| http://www.ietf.org/shadow.html. | ||||
| Abstract | ||||
| This document proposes a small modification to the way TCP increases | ||||
| its congestion window. Rather than the traditional method of | ||||
| increasing the congestion window by a constant amount for each | ||||
| arriving acknowledgment, we suggest basing the increase on the | ||||
| number of previously unacknowledged bytes each ACK covers. This | ||||
| change improves the performance of TCP, as well as closes a security | ||||
| hole TCP receivers can use to induce the sender into increasing the | ||||
| sending rate too rapidly. | ||||
| Terminology | ||||
| Much of the language in this document is taken from [RFC2581]. | ||||
| The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | ||||
| "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | ||||
| document are to be interpreted as described in [RFC2119]. | ||||
| 1 Introduction | ||||
| This document proposes a modified algorithm for increasing TCP's | ||||
| congestion window (cwnd) that improves both performance and | ||||
| security. Rather than increasing a TCP's congestion window based on | ||||
| the number of acknowledgments (ACKs) that arrive at the data sender, | ||||
| the congestion window is increased based on the number of bytes | ||||
| acknowledged by the arriving ACKs. The algorithm improves | ||||
| performance by mitigating the impact of delayed ACKs on the growth | ||||
| of cwnd. At the same time, the algorithm provides more appropriate | ||||
| cwnd growth in response to ACKs that cover only small amounts of | ||||
| data (less than a full segment size). More appropriate cwnd growth | ||||
| can both improve performance and prevent inappropriate cwnd | ||||
| growth in response to a misbehaving receiver. | ||||
| This document is organized as follows. Section 2 outlines the | ||||
| modified algorithm for increasing TCP's congestion window. Section | ||||
| 3 discusses the advantages of using the modified algorithm. Section | ||||
| 4 discusses the disadvantages of the approach outlined in this | ||||
| document. Section 5 outlines some of the fairness issues that must | ||||
| be considered for the modified algorithm. Section 6 discusses | ||||
| security considerations. | ||||
| 2 A Modified Algorithm for Increasing the Congestion Window | ||||
| As originally outlined in [Jac88] and specified in [RFC2581], TCP | ||||
| uses two algorithms for increasing the congestion window (cwnd). | ||||
| During steady-state, TCP uses the Congestion Avoidance algorithm to | ||||
| linearly increase the value of cwnd. At the beginning of a | ||||
| transfer, after a retransmission timeout or after a long idle period | ||||
| (in some implementations), TCP uses the Slow Start algorithm to | ||||
| increase cwnd exponentially. According to RFC 2581, slow start bases | ||||
| the cwnd increase on the number of incoming acknowledgments. During | ||||
| congestion avoidance, RFC 2581 allows more latitude in increasing | ||||
| cwnd, but traditionally implementations have based the increase on | ||||
| the number of arriving ACKs. In the following two subsections, we | ||||
| detail modifications to these algorithms to increase cwnd based on | ||||
| the number of bytes being acknowledged by each arriving ACK, rather | ||||
| than by the number of ACKs that arrive. We call these changes | ||||
| ``Appropriate Byte Counting'' (ABC) [All99]. | ||||
| 2.1 Congestion Avoidance | ||||
| RFC 2581 specifies that cwnd should be increased by 1 segment per | ||||
| round-trip time (RTT) during the congestion avoidance phase of a | ||||
| transfer. Traditionally, TCPs have approximated this increase by | ||||
| increasing cwnd by 1/cwnd for each arriving ACK. This algorithm | ||||
| opens cwnd by roughly 1 segment per RTT if the receiver ACKs each | ||||
| incoming segment and no ACK loss occurs. However, if the receiver | ||||
| implements delayed ACKs [Bra89] the receiver returns roughly half as | ||||
| many ACKs which causes the sender to open cwnd more conservatively | ||||
| (by approximately 1 segment every second RTT). The approach that we | ||||
| suggest is to store the number of bytes that have been ACKed in a | ||||
| bytes_acked variable in the TCP control block. When bytes_acked | ||||
| becomes greater than or equal to the value of the congestion window, | ||||
| bytes_acked is reduced by the value of cwnd. Next, cwnd is | ||||
| incremented by a full-sized segment (SMSS). The algorithm suggested | ||||
| above is specifically allowed by RFC 2581 during congestion | ||||
| avoidance because it opens the window by at most 1 segment per RTT. | ||||
| 2.2 Slow Start | ||||
| RFC 2581 states that the sender increments the congestion window by | ||||
| at most 1*SMSS bytes for each arriving acknowledgment during slow | ||||
| start. We propose that a TCP sender SHOULD increase cwnd by the | ||||
| number of previously unacknowledged bytes ACKed by each incoming | ||||
| acknowledgment, provided the increase is not more than L bytes. | ||||
| Choosing the limit on the increase, L, is discussed in the next | ||||
| subsection. When the number of previously unacknowledged bytes | ||||
| ACKed is less than or equal to 1*SMSS bytes or L is less than or | ||||
| equal to 1*SMSS bytes this proposal is no more aggressive (and | ||||
| possibly less aggressive) than allowed by RFC 2581. However, | ||||
| increasing cwnd by more than 1*SMSS bytes in response to a single | ||||
| ACK is more aggressive than allowed by RFC 2581. We believe the | ||||
| more aggressive version of the slow start algorithm still falls | ||||
| within the spirit of the principles outlined in [Jac88] and is safe | ||||
| for experimentation in shared networks provided an appropriate limit | ||||
| is applied (see next section). | ||||
| 2.3 Choosing the Limit | ||||
| The limit, L, chosen for the cwnd increase during slow start | ||||
| controls the aggressiveness of the algorithm. Choosing L=1*SMSS | ||||
| bytes provides behavior that is no more aggressive than allowed by | ||||
| RFC 2581. However, ABC with L=1*SMSS bytes is more conservative in | ||||
| a number of key ways (as discussed in the next section) and | ||||
| therefore, we believe that even though with L=1*SMSS bytes TCP | ||||
| stacks will see little performance benefit, ABC SHOULD be used. | ||||
| A very large L could potentially lead to large line-rate bursts of | ||||
| traffic in the face of a large amount of ACK loss or in the case | ||||
| when the receiver sends ``stretch ACKs'' (ACKs for more than the two | ||||
| full-sized segments allowed by the delayed ACK algorithm) [Pax97]. | ||||
| This document specifies that TCP implementations MAY use L=2*SMSS | ||||
| bytes and MUST NOT use L > 2*SMSS bytes. This choice balances | ||||
| between being conservative (L=1*SMSS bytes) and potentially being | ||||
| very aggressive. In addition, L=2*SMSS bytes exactly balances the | ||||
| negative impact of the delayed ACK algorithm (as discussed in more | ||||
| detail in section 3.2). Note that when L=2*SMSS bytes cwnd growth | ||||
| is roughly the same as the case when the standard algorithms are | ||||
| used in conjunction with a receiver that transmits an ACK for each | ||||
| incoming segment [All98]. | ||||
| The exception to the above suggestion is during a slow start phase | ||||
| that follows a retransmission timeout (RTO). In this situation, a | ||||
| TCP MUST use L=1*SMSS as specified in RFC 2581 since ACKs for large | ||||
| amounts of previously unacknowledged data are common during this | ||||
| phase of a transfer. These ACKs do not necessarily indicate how | ||||
| much data has left the network in the last RTT and therefore ABC | ||||
| cannot accurately determine how much to increase cwnd. As an | ||||
| example, say segment N is dropped by the network and segments N+1 | ||||
| and N+2 arrive successfully at the receiver. The sender will | ||||
| receive only two duplicate ACKs and therefore must rely on the | ||||
| retransmission timer (RTO) to detect the loss. When the RTO expires, | ||||
| segment N is retransmitted. The ACK sent in response to the | ||||
| retransmission will be for segment N+2. However, this ACK does not | ||||
| indicate that three segments have left the network in the last RTT, | ||||
| but rather only a single segment left the network. Therefore, the | ||||
| appropriate cwnd increment is at most 1*SMSS bytes. | ||||
| 2.4 RTO Implications | ||||
| [Jac88] shows that increases in cwnd of more than a factor of two in | ||||
| succeeding RTTs can cause spurious retransmissions on slow links | ||||
| where the bandwidth dominates the RTT, assuming the RTO estimator | ||||
| given in [Jac88] and [RFC2988]. ABC stays within this limit of no | ||||
| more than doubling cwnd in successive RTTs by capping the increase | ||||
| (no matter what L is employed) by the number of previously | ||||
| unacknowledged bytes covered by each incoming ACK. | ||||
| 3 Advantages | ||||
| This section outlines several advantages of using the ABC algorithm | ||||
| to increase cwnd, rather than the standard ACK counting algorithm | ||||
| given in [RFC2581]. | ||||
| 3.1 More Appropriate Congestion Window Increase | ||||
| The ABC algorithm outlined in section 2 increases TCP's cwnd in | ||||
| proportion to the amount of data actually sent into the network. | ||||
| ACK counting, on the other hand, increments cwnd by a constant upon | ||||
| the arrival of each ACK. For instance, consider an interactive | ||||
| connection (e.g., ssh or telnet) in which ACKs generally | ||||
| cover only a few bytes of data, but cwnd is increased by 1*SMSS | ||||
| bytes for each ACK received. When a large amount of data needs to | ||||
| be transmitted (e.g., displaying a large file) the data is sent in | ||||
| one large burst because the cwnd grows by 1*SMSS bytes per ACK | ||||
| rather than based on the actual amount of capacity used. Such a | ||||
| line-rate burst of data can potentially cause a large amount of | ||||
| segment loss. | ||||
| Congestion Window Validation (CWV) [RFC2861] also mitigates the above | ||||
| problem. CWV limits the amount of unused cwnd a TCP connection can | ||||
| accumulate. ABC can be used in conjunction with CWV to obtain an | ||||
| accurate measure of the network path. | ||||
| 3.2 Mitigate the Impact of Delayed ACKs and Lost ACKs | ||||
| Delayed ACKs [RFC1122,RFC2581] allow a TCP receiver to refrain from | ||||
| sending an ACK for each incoming segment. However, a receiver | ||||
| SHOULD send an ACK for every second full-sized segment that arrives. | ||||
| Furthermore, a receiver MUST NOT withhold an ACK for more than 500 | ||||
| ms. By reducing the number of ACKs sent to the data originator, the | ||||
| receiver is slowing the growth of the congestion window under an ACK | ||||
| counting system. Using ABC with L=2*SMSS bytes can roughly negate | ||||
| the negative impact imposed by delayed ACKs by allowing cwnd to be | ||||
| increased for ACKs that are withheld by the receiver. This allows | ||||
| the congestion window to grow in a manner similar to the case when | ||||
| the receiver ACKs each incoming segment, but without adding extra | ||||
| traffic to the network. Simulation studies have shown increased | ||||
| throughput when a TCP sender uses ABC when compared to the standard | ||||
| ACK counting algorithm [All99], especially for short transfers that | ||||
| never leave the initial slow start period. | ||||
| Note that delayed ACKs should not be an issue during slow | ||||
| start-based loss recovery, as RFC 2581 recommends that receivers | ||||
| should not delay ACKs that cover out-of-order segments. Therefore, | ||||
| as discussed above, ABC with L > 1*SMSS is inappropriate for such | ||||
| slow start based loss recovery and MUST NOT be used. | ||||
| 3.3 Prevents Attacks from Misbehaving Receivers | ||||
| [SCWA99] outlines several methods for a receiver to induce a TCP | ||||
| sender into violating congestion control and transmitting data at a | ||||
| potentially inappropriate rate. One of the outlined attacks is | ||||
| ``ACK Splitting''. This scheme involves the receiver sending | ||||
| multiple ACKs for each incoming data segment, each ACKing only a | ||||
| small portion of the original TCP data segment. Since TCP senders | ||||
| have traditionally used ACK counting to increase cwnd, ACK splitting | ||||
| causes inappropriately rapid cwnd growth and, in turn, a potentially | ||||
| inappropriate sending rate. A TCP sender that uses ABC can prevent | ||||
| this attack from being used to undermine standard congestion control | ||||
| because the cwnd increase is based on the number of bytes ACKed, | ||||
| rather than the number of ACKs received. | ||||
| To prevent misbehaving receivers from inducing inappropriate sender | ||||
| behavior we suggest TCP implementations use ABC, even if L=1*SMSS | ||||
| bytes (i.e., not allowing ABC to provide more aggressive cwnd growth | ||||
| than allowed by RFC 2581). | ||||
| 4 Disadvantages | ||||
| The main disadvantages of using ABC with L=2*SMSS bytes are an | ||||
| increase in the burstiness of TCP and a small increase in the | ||||
| overall loss rate. [All98] discusses the two ways that ABC | ||||
| increases the burstiness of the TCP sender. First, the ``micro | ||||
| burstiness'' of the connection is increased. In other words, the | ||||
| number of segments sent in response to each incoming ACK is | ||||
| increased by at most 1 segment when using ABC with L=2*SMSS bytes in | ||||
| conjunction with a receiver that is sending delayed ACKs. During | ||||
| slow start this translates into an increase from sending 2 | ||||
| back-to-back segments to sending 3 back-to-back segments in response | ||||
| to an ACK for a single segment, or an increase from 3 segments to 4 | ||||
| segments when receiving a delayed ACK for two outstanding segments. | ||||
| Note that ACK loss can cause larger bursts. However, ABC only | ||||
| increases the burst size by at most 1*SMSS bytes per ACK received | ||||
| when compared to the standard behavior. This slight increase in the | ||||
| burstiness should only cause problems for devices that have very | ||||
| small buffers. In addition, ABC increases the ``macro burstiness'' | ||||
| of the TCP sender in response to delayed ACKs. Rather than | ||||
| increasing cwnd by roughly 1.5 times per RTT, ABC roughly doubles | ||||
| the congestion window every RTT. However, doubling cwnd every RTT | ||||
| fits within the spirit of slow start, as originally outlined in | ||||
| [Jac88]. | ||||
| With the increased burstiness comes a modest increase in the loss | ||||
| rate for a TCP connection employing ABC (see the next section for a | ||||
| short discussion on the fairness of ABC to non-ABC flows). The | ||||
| additional loss is directly attributable to the increased | ||||
| aggressiveness of ABC. During slow start cwnd is increased more | ||||
| rapidly and therefore when loss occurs cwnd is larger and more drops | ||||
| are likely. Similarly, a congestion avoidance cycle takes roughly | ||||
| half as long when using ABC and delayed ACKs when compared to an ACK | ||||
| counting implementation. In other words, a TCP sender reaches the | ||||
| capacity of the network path, drops a packet and reduces the | ||||
| congestion window by half roughly twice as often when using ABC. | ||||
| However, as discussed above, in spite of the additional loss an ABC | ||||
| TCP sender generally obtains better overall performance than a | ||||
| non-ABC TCP [All99]. | ||||
| Due to the increase in the packet drop rate, we suggest ABC be | ||||
| implemented in conjunction with selective acknowledgments | ||||
| [RFC2018]. | ||||
| 5 Fairness Considerations | ||||
| [All99] presents several simple simulations conducted to measure the | ||||
| impact of ABC on competing traffic (both ABC and non-ABC). The | ||||
| experiments show that while ABC increases the drop rate for the | ||||
| connection using ABC, competing traffic is not greatly affected. | ||||
| The experiments show that standard TCP and ABC both obtain roughly | ||||
| the same throughput regardless of the variant of the competing | ||||
| traffic. The simulations also reaffirm that ABC outperforms non-ABC | ||||
| TCP in an environment with varying types of TCP connections. On the | ||||
| other hand, the simulations presented in [All99] are not necessarily | ||||
| realistic and therefore we are encouraging more experimentation in | ||||
| the Internet. | ||||
| 6 Security Considerations | ||||
| As discussed in section 3.3, ABC protects a TCP from a misbehaving | ||||
| receiver that induces the sender into transmitting at an | ||||
| inappropriate rate with an ``ACK splitting'' attack. This, in turn, | ||||
| protects the network from an overly aggressive sender. | ||||
| 7 Conclusions | ||||
| We RECOMMEND that all TCP stacks be modified to use ABC with | ||||
| L=1*SMSS bytes. This change does not increase the aggressiveness of | ||||
| TCP. Furthermore, simulations of ABC with L=2*SMSS bytes show a | ||||
| promising performance improvement that we encourage researchers to | ||||
| experiment with in the Internet. | ||||
| Acknowledgments | ||||
| This draft has benefited from discussions with and encouragement | ||||
| from Sally Floyd. Van Jacobson and Reiner Ludwig provided valuable | ||||
| input on the implications of byte counting on the RTO. | ||||
| Normative References | ||||
| [RFC1122] B. Braden, ed., Requirements for Internet Hosts -- | ||||
| Communication Layers, RFC 1122, Oct. 1989. | ||||
| [RFC2119] S. Bradner, Key words for use in RFCs to Indicate | ||||
| Requirement Levels, BCP 14, RFC 2119, March 1997. | ||||
| [RFC2581] Mark Allman, Vern Paxson, W. Richard Stevens. TCP | ||||
| Congestion Control, April 1999. RFC 2581. | ||||
| Non-Normative References | Title: TCP Congestion Control with Appropriate Byte | |||
| Counting (ABC) | ||||
| Author(s): M. Allman | ||||
| Status: Experimental | ||||
| Date: February 2003 | ||||
| Mailbox: mallman@bbn.com | ||||
| Pages: 10 | ||||
| Characters: 23486 | ||||
| Updates/Obsoletes/SeeAlso: None | ||||
| [All98] Mark Allman. On the Generation and Use of TCP | I-D Tag: draft-allman-tcp-abc-04.txt | |||
| Acknowledgments. ACM Computer Communication Review, 29(3), July | ||||
| 1999. | ||||
| [All99] Mark Allman. TCP Byte Counting Refinements. ACM Computer | URL: ftp://ftp.rfc-editor.org/in-notes/rfc3465.txt | |||
| Communication Review, 29(3), July 1999. | ||||
| [Jac88] Van Jacobson. Congestion Avoidance and Control. ACM | This document proposes a small modification to the way TCP increases | |||
| SIGCOMM 1988. | its congestion window. Rather than the traditional method of | |||
| increasing the congestion window by a constant amount for each | ||||
| arriving acknowledgment, the document suggests basing the increase | ||||
| on the number of previously unacknowledged bytes each ACK covers. | ||||
| This change improves the performance of TCP, as well as closes a | ||||
| security hole TCP receivers can use to induce the sender into | ||||
| increasing the sending rate too rapidly. | ||||
| [Pax97] Vern Paxson. Automated Packet Trace Analysis of TCP | This memo defines an Experimental Protocol for the Internet community. | |||
| Implementations. ACM SIGCOMM, September 1997. | It does not specify an Internet standard of any kind. Discussion and | |||
| suggestions for improvement are requested. Distribution of this memo | ||||
| is unlimited. | ||||
| [RFC2018] M. Mathis, J. Mahdavi, S. Floyd, A. Romanow. TCP Selective | This announcement is sent to the IETF list and the RFC-DIST list. | |||
| Acknowledgment Options. RFC 2018, October 1996 | Requests to be added to or deleted from the IETF distribution list | |||
| should be sent to IETF-REQUEST@IETF.ORG. Requests to be | ||||
| added to or deleted from the RFC-DIST distribution list should | ||||
| be sent to RFC-DIST-REQUEST@RFC-EDITOR.ORG. | ||||
| [RFC2861] Mark Handley, Jitendra Padhye, Sally Floyd. TCP | Details on obtaining RFCs via FTP or EMAIL may be obtained by sending | |||
| Congestion Window Validation, June 2000. RFC 2861. | an EMAIL message to rfc-info@RFC-EDITOR.ORG with the message body | |||
| help: ways_to_get_rfcs. For example: | ||||
| [SCWA99] Stefan Savage, Neal Cardwell, David Wetherall, Tom | To: rfc-info@RFC-EDITOR.ORG | |||
| Anderson. TCP Congestion Control with a Misbehaving Receiver. | Subject: getting rfcs | |||
| ACM Computer Communication Review, 29(5), October 1999. | ||||
| Author's Addresses: | help: ways_to_get_rfcs | |||
| Mark Allman | Requests for special distribution should be addressed to either the | |||
| BBN Technologies/NASA Glenn Research Center | author of the RFC in question, or to RFC-Manager@RFC-EDITOR.ORG. Unless | |||
| Lewis Field | specifically noted otherwise on the RFC itself, all RFCs are for | |||
| 21000 Brookpark Rd. MS 54-5 | unlimited distribution. | |||
| Cleveland, OH 44135 | Submissions for Requests for Comments should be sent to | |||
| Phone: 216-433-6586 | RFC-EDITOR@RFC-EDITOR.ORG. Please consult RFC 2223, Instructions to RFC | |||
| Fax: 216-433-8705 | Authors, for further information. | |||
| mallman@bbn.com | ||||
| http://roland.grc.nasa.gov/~mallman | ||||
| End of changes. 12 change blocks. | ||||
| 354 lines changed or deleted | 36 lines changed or added | |||
This html diff was produced by rfcdiff 1.48. The latest version is available from http://tools.ietf.org/tools/rfcdiff/ | ||||
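
The cwnd update rules described in sections 2.1 through 2.3 of the draft can be illustrated with a minimal Python sketch. It is not taken from the draft or from any TCP stack. The class name `AbcSender`, its field names, the `after_rto` flag, and the use of `cwnd < ssthresh` to select slow start (borrowed from RFC 2581) are all assumptions made for illustration. L is expressed here as a whole number of segments (1 or 2), which is a simplification of the byte limit in the text.

```python
from dataclasses import dataclass


@dataclass
class AbcSender:
    """Hypothetical sender state for illustrating ABC; all names are assumptions."""
    smss: int                 # sender maximum segment size, in bytes
    cwnd: int                 # congestion window, in bytes
    ssthresh: int             # slow start threshold, in bytes (from RFC 2581)
    limit_segments: int = 2   # L in units of SMSS; the draft forbids L > 2*SMSS
    after_rto: bool = False   # True during the slow start that follows an RTO
    bytes_acked: int = 0      # bytes ACKed but not yet credited (section 2.1)

    def __post_init__(self) -> None:
        if self.limit_segments not in (1, 2):
            raise ValueError("L must be 1*SMSS or 2*SMSS")

    def on_ack(self, newly_acked: int) -> None:
        """Update cwnd for an ACK covering `newly_acked` previously unACKed bytes."""
        if newly_acked <= 0:
            return  # duplicate ACKs cover no new data and open no window
        if self.cwnd < self.ssthresh:
            # Slow start (section 2.2): grow by the bytes ACKed, capped at L.
            # After an RTO, L is forced to 1*SMSS (section 2.3).
            segments = 1 if self.after_rto else self.limit_segments
            self.cwnd += min(newly_acked, segments * self.smss)
        else:
            # Congestion avoidance (section 2.1): one SMSS per cwnd of ACKed data.
            self.bytes_acked += newly_acked
            if self.bytes_acked >= self.cwnd:
                self.bytes_acked -= self.cwnd
                self.cwnd += self.smss
```

In this sketch, an ACK that covers only a small part of a segment (as in the ACK splitting attack of section 3.3) opens cwnd by only that many bytes, while a stretch ACK covering three segments is still capped at L.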