| < draft-floyd-tcp-highspeed-02.txt | draft-floyd-tcp-highspeed-03.txt > | |||
|---|---|---|---|---|
| Internet Engineering Task Force Sally Floyd | Internet Engineering Task Force Sally Floyd | |||
| INTERNET DRAFT ICSI | INTERNET-DRAFT ICSI | |||
| draft-floyd-tcp-highspeed-02.txt February, 2003 | draft-floyd-tcp-highspeed-03.txt 29 June 2003 | |||
| Expires: December 2003 | ||||
| HighSpeed TCP for Large Congestion Windows | HighSpeed TCP for Large Congestion Windows | |||
| Status of this Memo | Status of this Document | |||
| This document is an Internet-Draft and is in full conformance with | This document is an Internet-Draft and is in full conformance with | |||
| all provisions of Section 10 of RFC2026. | all provisions of Section 10 of RFC2026. | |||
| Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
| Task Force (IETF), its areas, and its working groups. Note that | Task Force (IETF), its areas, and its working groups. Note that | |||
| other groups may also distribute working documents as Internet- | other groups may also distribute working documents as Internet- | |||
| Drafts. | Drafts. | |||
| Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six | |||
| and may be updated, replaced, or obsoleted by other documents at any | months and may be updated, replaced, or obsoleted by other documents | |||
| time. It is inappropriate to use Internet-Drafts as reference | at any time. It is inappropriate to use Internet-Drafts as | |||
| material or to cite them other than as "work in progress." | reference material or to cite them other than as "work in progress." | |||
| The list of current Internet-Drafts can be accessed at | The list of current Internet-Drafts can be accessed at | |||
| http://www.ietf.org/ietf/1id-abstracts.txt | http://www.ietf.org/ietf/1id-abstracts.txt | |||
| The list of Internet-Draft Shadow Directories can be accessed at | The list of Internet-Draft Shadow Directories can be accessed at | |||
| http://www.ietf.org/shadow.html. | http://www.ietf.org/shadow.html. | |||
| Abstract | Abstract | |||
| This document proposes HighSpeed TCP, a modification to TCP's | This document proposes HighSpeed TCP, a modification to TCP's | |||
| congestion control mechanism for use with TCP connections with large | congestion control mechanism for use with TCP connections with | |||
| congestion windows. The congestion control mechanisms of the current | large congestion windows. The congestion control mechanisms | |||
| Standard TCP constrain the congestion windows that can be achieved | of the current Standard TCP constrain the congestion windows | |||
| by TCP in realistic environments. For example, for a Standard TCP | that can be achieved by TCP in realistic environments. For | |||
| connection with 1500-byte packets and a 100 ms round-trip time, | example, for a Standard TCP connection with 1500-byte packets | |||
| achieving a steady-state throughput of 10 Gbps would require an | and a 100 ms round-trip time, achieving a steady-state | |||
| average congestion window of 83,333 segments, and a packet drop rate | throughput of 10 Gbps would require an average congestion | |||
| of at most one congestion event every 5,000,000,000 packets (or | window of 83,333 segments, and a packet drop rate of at most | |||
| equivalently, at most one congestion event every 1 2/3 hours). This | one congestion event every 5,000,000,000 packets (or | |||
| is widely acknowledged as an unrealistic constraint. To address this | equivalently, at most one congestion event every 1 2/3 hours). | |||
| limitation of TCP, this document proposes HighSpeed TCP, and solicits | This is widely acknowledged as an unrealistic constraint. To | |||
| experimentation and feedback from the wider community. | address this limitation of TCP, this document proposes | |||
| HighSpeed TCP, and solicits experimentation and feedback from | ||||
| the wider community. | ||||
| TO BE DELETED BY THE RFC EDITOR UPON PUBLICATION: | TO BE DELETED BY THE RFC EDITOR UPON PUBLICATION: | |||
| Changes from draft-floyd-tcp-highspeed-01.txt: | Changes from draft-floyd-tcp-highspeed-02.txt: | |||
| * Added a section on "Tradeoffs for Choosing Congestion Control | * Added a section on "Deployment issues." | |||
| Parameters". | ||||
| * Added mention of Scalable TCP from Tom Kelly. | * Added a short section on "Implementation issues." | |||
| Changes from draft-floyd-tcp-highspeed-00.txt: | * Added a section on "Limiting burstiness on short time | |||
| scales". | ||||
| * Added a discussion on related work about changing the PMTU. | * Added to the discussion on convergence times. | |||
| * Added a discussion of an alternate, linear response function. | * Clarified that "log" is "log base 10". | |||
| * Added a discussion of the TCP window scale option. | * Clarified that W = Low_window and W_1 = High_window, in the | |||
| equation for b(w). | ||||
| * Added a discussion of HighSpeed TCP as roughly emulating the | Changes from draft-floyd-tcp-highspeed-01.txt: | |||
| congestion control response of N parallel TCP connections. | ||||
| * Added a discussion of the time to converge to fairness. | * Added a section on "Tradeoffs for Choosing Congestion | |||
| Control Parameters". | ||||
| * Expanded the Introduction. | * Added mention of Scalable TCP from Tom Kelly. | |||
| Changes from draft-floyd-tcp-highspeed-00.txt: | ||||
| * Added a discussion on related work about changing the PMTU. | ||||
| * Added a discussion of an alternate, linear response | ||||
| function. | ||||
| * Added a discussion of the TCP window scale option. | ||||
| * Added a discussion of HighSpeed TCP as roughly emulating the | ||||
| congestion control response of N parallel TCP connections. | ||||
| * Added a discussion of the time to converge to fairness. | ||||
| * Expanded the Introduction. | ||||
| Table of Contents | ||||
| 1. Introduction. | ||||
| 2. The Problem Description. | ||||
| 3. Design Guidelines. | ||||
| 4. Non-Goals. | ||||
| 5. Modifying the TCP Response Function. | ||||
| 6. Fairness Implications of the HighSpeed Response Function. | ||||
| 7. Translating the HighSpeed Response Function into Congestion Control Parameters. | ||||
| 8. An alternate, linear response function. | ||||
| 9. Tradeoffs for Choosing Congestion Control Parameters. | ||||
| 9.1. The Number of Round-Trip Times between Loss Events. | ||||
| 9.2. The Number of Packet Drops per Loss Event, with Drop-Tail. | ||||
| 10. Related Issues. | ||||
| 10.1. Slow-Start. | ||||
| 10.2. Limiting burstiness on short time scales. | ||||
| 10.3. Other limitations on window size. | ||||
| 10.4. Implementation issues. | ||||
| 11. Deployment issues. | ||||
| 11.1. Deployment issues of HighSpeed TCP. | ||||
| 11.2. Deployment issues of Scalable TCP. | ||||
| 12. Related Work in HighSpeed TCP. | ||||
| 13. Relationship to other Work. | ||||
| 14. Conclusions. | ||||
| 15. Acknowledgements. | ||||
| 16. Normative References. | ||||
| 17. Informative References. | ||||
| 18. Security Considerations. | ||||
| 19. IANA Considerations. | ||||
| 20. TCP's Loss Event Rate in Steady-State. | ||||
| 1. Introduction. | 1. Introduction. | |||
| This document proposes HighSpeed TCP, a modification to TCP's | This document proposes HighSpeed TCP, a modification to TCP's | |||
| congestion control mechanism for use with TCP connections with large | congestion control mechanism for use with TCP connections with large | |||
| congestion windows. In a steady-state environment, with a packet | congestion windows. In a steady-state environment, with a packet | |||
| loss rate p, the current Standard TCP's average congestion window is | loss rate p, the current Standard TCP's average congestion window is | |||
| roughly 1.2/sqrt(p) segments. This places a serious constraint on | roughly 1.2/sqrt(p) segments. This places a serious constraint on | |||
| the congestion windows that can be achieved by TCP in realistic | the congestion windows that can be achieved by TCP in realistic | |||
| environments. For example, for a Standard TCP connection with | environments. For example, for a Standard TCP connection with | |||
| 1500-byte packets and a 100 ms round-trip time, achieving a steady- | 1500-byte packets and a 100 ms round-trip time, achieving a steady- | |||
| state throughput of 10 Gbps would require an average congestion | state throughput of 10 Gbps would require an average congestion | |||
| window of 83,333 segments, and a packet drop rate of at most one | window of 83,333 segments, and a packet drop rate of at most one | |||
| congestion event every 5,000,000,000 packets (or equivalently, at | congestion event every 5,000,000,000 packets (or equivalently, at | |||
| most one congestion event every 1 2/3 hours). | most one congestion event every 1 2/3 hours). The average packet | |||
| drop rate of at most 2*10^(-10) needed for full link utilization in | ||||
| this environment corresponds to a bit error rate of at most | ||||
| 2*10^(-14), and this is an unrealistic requirement for current | ||||
| networks. | ||||
| To address this fundamental limitation of TCP and of the TCP response | To address this fundamental limitation of TCP and of the TCP | |||
| function (the function mapping the steady-state packet drop rate to | response function (the function mapping the steady-state packet drop | |||
| TCP's average sending rate in packets per round-trip time), this | rate to TCP's average sending rate in packets per round-trip time), | |||
| document proposes modifying the TCP response function for regimes | this document describes a modified TCP response function for regimes | |||
| with higher congestion windows. This document also solicits | with higher congestion windows. This document also solicits | |||
| experimentation and feedback on HighSpeed TCP from the wider | experimentation and feedback on HighSpeed TCP from the wider | |||
| community. | community. | |||
| Because HighSpeed TCP's modified response function would only take | Because HighSpeed TCP's modified response function would only take | |||
| effect with higher congestion windows, HighSpeed TCP does not modify | effect with higher congestion windows, HighSpeed TCP does not modify | |||
| TCP behavior in environments with mild to heavy congestion, and | TCP behavior in environments with mild to heavy congestion, and | |||
| therefore does not introduce any new dangers of congestion collapse. | therefore does not introduce any new dangers of congestion collapse. | |||
| However, if relative fairness between HighSpeed TCP connections is to | However, if relative fairness between HighSpeed TCP connections is | |||
| be preserved, then in our view any modification to the TCP response | to be preserved, then in our view any modification to the TCP | |||
| function should be globally-agreed-upon in the IETF, rather than made | response function should be addressed in the IETF, rather than made | |||
| as ad hoc decisions by individual implementors or TCP senders. | as ad hoc decisions by individual implementors or TCP senders. | |||
| Modifications to the TCP response function would also have | Modifications to the TCP response function would also have | |||
| implications for transport protocols that use TFRC and other forms of | implications for transport protocols that use TFRC and other forms | |||
| equation-based congestion control, as these congestion control | of equation-based congestion control, as these congestion control | |||
| mechanisms directly use the TCP response function [TFRC]. | mechanisms directly use the TCP response function [RFC3448]. | |||
| This proposal for HighSpeed TCP focuses specifically on a proposed | This proposal for HighSpeed TCP focuses specifically on a proposed | |||
| change to the TCP response function, and its implications for TCP. | change to the TCP response function, and its implications for TCP. | |||
| This document does not address what we view as a separate fundamental | This document does not address what we view as a separate | |||
| issue, of the mechanisms required to enable best-effort connections | fundamental issue, of the mechanisms required to enable best-effort | |||
| to *start* with large initial windows. In our view, while HighSpeed | connections to *start* with large initial windows. In our view, | |||
| TCP proposes a somewhat fundamental change to the TCP response | while HighSpeed TCP proposes a somewhat fundamental change to the | |||
| function, at the same time it is a relatively simple change to | TCP response function, at the same time it is a relatively simple | |||
| implement in a single TCP sender, and presents no dangers in terms of | change to implement in a single TCP sender, and presents no dangers | |||
| congestion collapse. In contrast, in our view, the problem of | in terms of congestion collapse. In contrast, in our view, the | |||
| enabling connections to *start* with large initial windows is | problem of enabling connections to *start* with large initial | |||
| inherently more risky and structurally more difficult, requiring some | windows is inherently more risky and structurally more difficult, | |||
| form of explicit feedback from all of the routers along the path. | requiring some form of explicit feedback from all of the routers | |||
| This is another reason why we would propose addressing the problem of | along the path. This is another reason why we would propose | |||
| starting with large initial windows separately, and on a separate | addressing the problem of starting with large initial windows | |||
| timetable, from the problem of modifying the TCP response function. | separately, and on a separate timetable, from the problem of | |||
| modifying the TCP response function. | ||||
| 2. The Problem Description. | 2. The Problem Description. | |||
| This section describes the number of round-trip times between | This section describes the number of round-trip times between | |||
| congestion events required for a Standard TCP flow to achieve an | congestion events required for a Standard TCP flow to achieve an | |||
| average throughput of B bps, given packets of D bytes and a round- | average throughput of B bps, given packets of D bytes and a round- | |||
| trip time of R seconds. A congestion event refers to a window of | trip time of R seconds. A congestion event refers to a window of | |||
| data with one or more dropped or ECN-marked packets (where ECN stands | data with one or more dropped or ECN-marked packets (where ECN | |||
| for Explicit Congestion Notification). | stands for Explicit Congestion Notification). | |||
| From Appendix A, achieving an average TCP throughput of B bps | From Appendix A, achieving an average TCP throughput of B bps | |||
| requires a loss event at most every BR/(12D) round-trip times. This | requires a loss event at most every BR/(12D) round-trip times. This | |||
| is illustrated in Table 1, for R = 0.1 seconds and D = 1500 bytes. | is illustrated in Table 1, for R = 0.1 seconds and D = 1500 bytes. | |||
| The table also gives the average congestion window W of BR/(8D), and | The table also gives the average congestion window W of BR/(8D), and | |||
| the steady-state packet drop rate P of 1.5/W^2. | the steady-state packet drop rate P of 1.5/W^2. | |||
| TCP Throughput (Mbps) RTTs Between Losses W P | TCP Throughput (Mbps) RTTs Between Losses W P | |||
| --------------------- ------------------- ------ ----- | --------------------- ------------------- ---- ----- | |||
| 1 5.5 8.3 0.02 | 1 5.5 8.3 0.02 | |||
| 10 55.5 83.3 0.0002 | 10 55.5 83.3 0.0002 | |||
| 100 555.5 833.3 0.000002 | 100 555.5 833.3 0.000002 | |||
| 1000 5555.5 8333.3 0.00000002 | 1000 5555.5 8333.3 0.00000002 | |||
| 10000 55555.5 83333.3 0.0000000002 | 10000 55555.5 83333.3 0.0000000002 | |||
| Table 1: RTTs Between Congestion Events for Standard TCP, for | Table 1: RTTs Between Congestion Events for Standard TCP, for | |||
| 1500-Byte Packets and a Round-Trip Time of 0.1 Seconds. | 1500-Byte Packets and a Round-Trip Time of 0.1 Seconds. | |||
| This document proposes HighSpeed TCP, a minimal modification to TCP's | This document proposes HighSpeed TCP, a minimal modification to | |||
| increase and decrease parameters, for TCP connections with larger | TCP's increase and decrease parameters, for TCP connections with | |||
| congestion windows, to allow TCP to achieve high throughput with more | larger congestion windows, to allow TCP to achieve high throughput | |||
| realistic requirements for the steady-state packet drop rate. | with more realistic requirements for the steady-state packet drop | |||
| Equivalently, HighSpeed TCP has more realistic requirements for the | rate. Equivalently, HighSpeed TCP has more realistic requirements | |||
| number of round-trip times between loss events. | for the number of round-trip times between loss events. | |||
| 3. Design Guidelines. | 3. Design Guidelines. | |||
| Our proposal for HighSpeed TCP is motivated by the following | Our proposal for HighSpeed TCP is motivated by the following | |||
| requirements: | requirements: | |||
| * Achieve high per-connection throughput without requiring | * Achieve high per-connection throughput without requiring | |||
| unrealistically low packet loss rates. | unrealistically low packet loss rates. | |||
| * Reach high throughput reasonably quickly when in slow-start. | * Reach high throughput reasonably quickly when in slow-start. | |||
| * Reach high throughput without overly long delays when recovering | * Reach high throughput without overly long delays when recovering | |||
| from multiple retransmit timeouts, or when ramping-up from a period | from multiple retransmit timeouts, or when ramping-up from a period | |||
| with small congestion windows. | with small congestion windows. | |||
| * No additional feedback or support required from routers: | * No additional feedback or support required from routers: | |||
| For example, the goal is for acceptable performance in both ECN- | For example, the goal is for acceptable performance in both ECN- | |||
| capable and non-ECN-capable environments, and with Drop-Tail as well | capable and non-ECN-capable environments, and with Drop-Tail as well | |||
| as with Active Queue Management such as RED in the routers. | as with Active Queue Management such as RED in the routers. | |||
| * No additional feedback required from TCP receivers. | * No additional feedback required from TCP receivers. | |||
| * TCP-compatible performance in environments with moderate or high | * TCP-compatible performance in environments with moderate or high | |||
| congestion: | congestion: | |||
| Equivalently, the requirement is that there be no additional load on | Equivalently, the requirement is that there be no additional load on | |||
| the network (in terms of increased packet drop rates) in environments | the network (in terms of increased packet drop rates) in | |||
| with moderate or high congestion. | environments with moderate or high congestion. | |||
| * Performance at least as good as Standard TCP in environments with | * Performance at least as good as Standard TCP in environments with | |||
| moderate or high congestion. | moderate or high congestion. | |||
| * Acceptable transient performance, in terms of increases in the | * Acceptable transient performance, in terms of increases in the | |||
| congestion window in one round-trip time, responses to severe | congestion window in one round-trip time, responses to severe | |||
| congestion, and convergence times to fairness. | congestion, and convergence times to fairness. | |||
| Currently, users wishing to achieve throughputs of 1Gbps or more | Currently, users wishing to achieve throughputs of 1 Gbps or more | |||
| typically open up multiple TCP connections in parallel, or use MulTCP | typically open up multiple TCP connections in parallel, or use | |||
| [CO98,GRK99], which behaves roughly like the aggregate of N virtual | MulTCP [CO98,GRK99], which behaves roughly like the aggregate of N | |||
| TCP connections. While this approach suffices for the occasional | virtual TCP connections. While this approach suffices for the | |||
| user on well-provisioned links, it leaves the parameter N to be | occasional user on well-provisioned links, it leaves the parameter N | |||
| determined by the user, and results in more aggressive performance | to be determined by the user, and results in more aggressive | |||
| and higher steady-state packet drop rates if used in environments | performance and higher steady-state packet drop rates if used in | |||
| with periods of moderate or high congestion. We believe that a new | environments with periods of moderate or high congestion. We | |||
| approach is needed that offers more flexibility, more effectively | believe that a new approach is needed that offers more flexibility, | |||
| scales to a wide range of available bandwidths, and competes more | more effectively scales to a wide range of available bandwidths, and | |||
| fairly with Standard TCP in congested environments. | competes more fairly with Standard TCP in congested environments. | |||
| 4. Non-Goals. | 4. Non-Goals. | |||
| The following are explicitly *not* goals of our work: | The following are explicitly *not* goals of our work: | |||
| * Non-goal: TCP-compatible performance in environments with very low | * Non-goal: TCP-compatible performance in environments with very low | |||
| packet drop rates. | packet drop rates. | |||
| We note that our proposal does not require, or deliver, TCP- | We note that our proposal does not require, or deliver, TCP- | |||
| compatible performance in environments with very low packet drop | compatible performance in environments with very low packet drop | |||
| rates, e.g., with packet loss rates of 10^-5 or 10^-6. As we discuss | rates, e.g., with packet loss rates of 10^-5 or 10^-6. As we | |||
| later in this document, we assume that Standard TCP is unable to make | discuss later in this document, we assume that Standard TCP is | |||
| effective use of the available bandwidth in environments with loss | unable to make effective use of the available bandwidth in | |||
| rates of 10^-6 in any case, so that it is acceptable and appropriate | environments with loss rates of 10^-6 in any case, so that it is | |||
| for HighSpeed TCP to perform more aggressively than Standard TCP in | acceptable and appropriate for HighSpeed TCP to perform more | |||
| such an environment. | aggressively than Standard TCP in such an environment. | |||
| * Non-goal: Ramping-up more quickly than allowed by slow-start. | * Non-goal: Ramping-up more quickly than allowed by slow-start. | |||
| It is our belief that ramping-up more quickly than allowed by slow- | It is our belief that ramping-up more quickly than allowed by slow- | |||
| start would necessitate more explicit feedback from routers along the | start would necessitate more explicit feedback from routers along | |||
| path. The proposal for HighSpeed TCP is focused on changes to TCP | the path. The proposal for HighSpeed TCP is focused on changes to | |||
| that could be effectively deployed in the current Internet | TCP that could be effectively deployed in the current Internet | |||
| environment. | environment. | |||
| * Non-goal: Avoiding oscillations in environments with only one-way, | * Non-goal: Avoiding oscillations in environments with only one-way, | |||
| long-lived flows all with the same round-trip times. | long-lived flows all with the same round-trip times. | |||
| While we agree that attention to oscillatory behavior is useful, | While we agree that attention to oscillatory behavior is useful, | |||
| avoiding oscillations in aggregate throughput has not been our | avoiding oscillations in aggregate throughput has not been our | |||
| primary consideration, particularly for simplified environments | primary consideration, particularly for simplified environments | |||
| limited to one-way, long-lived flows all with the same, large round- | limited to one-way, long-lived flows all with the same, large round- | |||
| trip times. Our assessment is that some oscillatory behavior in | trip times. Our assessment is that some oscillatory behavior in | |||
| these extreme environments is an acceptable price to pay for the | these extreme environments is an acceptable price to pay for the | |||
| other benefits of HighSpeed TCP. | other benefits of HighSpeed TCP. | |||
| 5. Modifying the TCP Response Function. | 5. Modifying the TCP Response Function. | |||
| The TCP response function, w = 1.2/sqrt(p), gives TCP's average | The TCP response function, w = 1.2/sqrt(p), gives TCP's average | |||
| congestion window w in MSS-sized segments, as a function of the | congestion window w in MSS-sized segments, as a function of the | |||
| steady-state packet drop rate p [FF98]. This TCP response function | steady-state packet drop rate p [FF98]. This TCP response function | |||
| is a direct consequence of TCP's Additive Increase Multiplicative | is a direct consequence of TCP's Additive Increase Multiplicative | |||
| Decrease (AIMD) mechanisms of increasing the congestion window by | Decrease (AIMD) mechanisms of increasing the congestion window by | |||
| roughly one segment per round-trip time in the absence of congestion, | roughly one segment per round-trip time in the absence of | |||
| and halving the congestion window in response to a round-trip time | congestion, and halving the congestion window in response to a | |||
| with a congestion event. This response function for Standard TCP is | round-trip time with a congestion event. This response function for | |||
| reflected in the table below. In this proposal we restrict our | Standard TCP is reflected in the table below. In this proposal we | |||
| attention to TCP performance in environments with packet loss rates | restrict our attention to TCP performance in environments with | |||
| of at most 10^-2, and so we can ignore the more complex response | packet loss rates of at most 10^-2, and so we can ignore the more | |||
| functions that are required to model TCP performance in more | complex response functions that are required to model TCP | |||
| congested environments with retransmit timeouts. From Appendix A, an | performance in more congested environments with retransmit timeouts. | |||
| average congestion window of W corresponds to an average of W/1.5 | From Appendix A, an average congestion window of W corresponds to an | |||
| round-trip times between loss events for Standard TCP. | average of 2/3 W round-trip times between loss events for Standard | |||
| TCP (with the congestion window varying from 2/3 W to 4/3 W). | ||||
| Packet Drop Rate P Congestion Window W RTTs Between Losses | Packet Drop Rate P Congestion Window W RTTs Between Losses | |||
| ------------------ ------------------- ------------------- | ------------------ ------------------- ------------------- | |||
| 10^-2 12 8 | 10^-2 12 8 | |||
| 10^-3 38 25 | 10^-3 38 25 | |||
| 10^-4 120 80 | 10^-4 120 80 | |||
| 10^-5 379 252 | 10^-5 379 252 | |||
| 10^-6 1200 800 | 10^-6 1200 800 | |||
| 10^-7 3795 2530 | 10^-7 3795 2530 | |||
| 10^-8 12000 8000 | 10^-8 12000 8000 | |||
| 10^-9 37948 25298 | 10^-9 37948 25298 | |||
| 10^-10 120000 80000 | 10^-10 120000 80000 | |||
| Table 2: TCP Response Function for Standard TCP. The average | Table 2: TCP Response Function for Standard TCP. The average | |||
| congestion window W in MSS-sized segments is given as a function of | congestion window W in MSS-sized segments is given as a function of | |||
| the packet drop rate P. | the packet drop rate P. | |||
| To specify a modified response function for HighSpeed TCP, we use | To specify a modified response function for HighSpeed TCP, we use | |||
| three parameters, Low_Window, High_Window, and High_P. To ensure TCP | three parameters, Low_Window, High_Window, and High_P. To ensure | |||
| compatibility, the HighSpeed response function uses the same response | TCP compatibility, the HighSpeed response function uses the same | |||
| function as Standard TCP when the current congestion window is at | response function as Standard TCP when the current congestion window | |||
| most Low_Window, and uses the HighSpeed response function when the | is at most Low_Window, and uses the HighSpeed response function when | |||
| current congestion window is greater than Low_Window. In this | the current congestion window is greater than Low_Window. In this | |||
| document we set Low_Window to 38 MSS-sized segments, corresponding to | document we set Low_Window to 38 MSS-sized segments, corresponding | |||
| a packet drop rate of 10^-3 for TCP. | to a packet drop rate of 10^-3 for TCP. | |||
| To specify the upper end of the HighSpeed response function, we | To specify the upper end of the HighSpeed response function, we | |||
| specify the packet drop rate needed in the HighSpeed response | specify the packet drop rate needed in the HighSpeed response | |||
| function to achieve an average congestion window of 83000 segments. | function to achieve an average congestion window of 83000 segments. | |||
| This is roughly the window needed to sustain 10Gbps throughput, for a | This is roughly the window needed to sustain 10 Gbps throughput, for | |||
| TCP connection with the default packet size and round-trip time used | a TCP connection with the default packet size and round-trip time | |||
| earlier in this document. For High_Window set to 83000, we specify | used earlier in this document. For High_Window set to 83000, we | |||
| High_P of 10^-7; that is, with HighSpeed TCP a packet drop rate of | specify High_P of 10^-7; that is, with HighSpeed TCP a packet drop | |||
| 10^-7 allows the HighSpeed TCP connection to achieve an average | rate of 10^-7 allows the HighSpeed TCP connection to achieve an | |||
| congestion window of 83000 segments. We believe that this loss rate | average congestion window of 83000 segments. We believe that this | |||
| sets an achieveable target for high-speed environments, while still | loss rate sets an achievable target for high-speed environments, | |||
| allowing acceptable fairness for the HighSpeed response function when | while still allowing acceptable fairness for the HighSpeed response | |||
| competing with Standard TCP in environments with packet drop rates of | function when competing with Standard TCP in environments with | |||
| 10^-4 or 10^-5. | packet drop rates of 10^-4 or 10^-5. | |||
| For simplicity, for the HighSpeed response function we maintain the | For simplicity, for the HighSpeed response function we maintain the | |||
| property that the response function gives a straight line on a log- | property that the response function gives a straight line on a log- | |||
| log scale (as does the response function for Standard TCP, for low to | log scale (as does the response function for Standard TCP, for low | |||
| moderate congestion). This results in the following response | to moderate congestion). This results in the following response | |||
| function, for values of the average congestion window W greater than | function, for values of the average congestion window W greater than | |||
| Low_Window: | Low_Window: | |||
| W = (p/Low_P)^S Low_Window, | W = (p/Low_P)^S Low_Window, | |||
| for Low_P the packet drop rate corresponding to Low_Window, and for S | for Low_P the packet drop rate corresponding to Low_Window, and for | |||
| as the following constant [FRS02]: | S as the following constant [FRS02]: | |||
| S = (log High_Window - log Low_Window)/(log High_P - log Low_P). | S = (log High_Window - log Low_Window)/(log High_P - log Low_P). | |||
| For example, for Low_Window set to 38, we have Low_P of 10^-3 (for | (In this paper, "log x" refers to the log base 10.) For example, | |||
| compatibility with Standard TCP). Thus, for High_Window set to 83000 | for Low_Window set to 38, we have Low_P of 10^-3 (for compatibility | |||
| and High_P set to 10^-7, we get the following response function: | with Standard TCP). Thus, for High_Window set to 83000 and High_P | |||
| set to 10^-7, we get the following response function: | ||||
| W = 0.12/p^0.835. (1) | W = 0.12/p^0.835. (1) | |||
| This HighSpeed response function is illustrated in Table 3 below. | This HighSpeed response function is illustrated in Table 3 below. | |||
| For HighSpeed TCP, the number of round-trip times between losses, | For HighSpeed TCP, the number of round-trip times between losses, | |||
| 1/(pW), equals 12.7 W^0.2, for W > 38 segments. | 1/(pW), equals 12.7 W^0.2, for W > 38 segments. | |||
| Packet Drop Rate P Congestion Window W RTTs Between Losses | Packet Drop Rate P Congestion Window W RTTs Between Losses | |||
| ------------------ ------------------- ------------------- | ------------------ ------------------- ------------------- | |||
| 10^-2 12 8 | 10^-2 12 8 | |||
| 10^-3 38 25 | 10^-3 38 25 | |||
| 10^-4 263 38 | 10^-4 263 38 | |||
| 10^-5 1795 57 | 10^-5 1795 57 | |||
| 10^-6 12279 83 | 10^-6 12279 83 | |||
| 10^-7 83981 123 | 10^-7 83981 123 | |||
| 10^-8 574356 180 | 10^-8 574356 180 | |||
| 10^-9 3928088 264 | 10^-9 3928088 264 | |||
| 10^-10 26864653 388 | 10^-10 26864653 388 | |||
| Table 3: TCP Response Function for HighSpeed TCP. The average | Table 3: TCP Response Function for HighSpeed TCP. The average | |||
| congestion window W in MSS-sized segments is given as a function of | congestion window W in MSS-sized segments is given as a function of | |||
| the packet drop rate P. | the packet drop rate P. | |||
| We believe that the problem of backward compatibility with Standard | We believe that the problem of backward compatibility with Standard | |||
| TCP requires a response function that is quite close to that of | TCP requires a response function that is quite close to that of | |||
| Standard TCP for loss rates of 10^-1, 10^-2, or 10^-3. We believe, | Standard TCP for loss rates of 10^-1, 10^-2, or 10^-3. We believe, | |||
| however, that such stringent TCP-compatibility is not required for | however, that such stringent TCP-compatibility is not required for | |||
| smaller loss rates, and that an appropriate response function is one | smaller loss rates, and that an appropriate response function is one | |||
| that gives a plausible packet drop rate for a connection throughput | that gives a plausible packet drop rate for a connection throughput | |||
| of 10Gbps. This also gives a slowly increasing number of round-trip | of 10 Gbps. This also gives a slowly increasing number of round- | |||
| times between loss events as a function of a decreasing packet drop | trip times between loss events as a function of a decreasing packet | |||
| rate. | drop rate. | |||
| Another way to look at the HighSpeed response function is to consider | Another way to look at the HighSpeed response function is to | |||
| that HighSpeed TCP is roughly emulating the congestion control | consider that HighSpeed TCP is roughly emulating the congestion | |||
| response of N parallel TCP connections, where N is initially one, and | control response of N parallel TCP connections, where N is initially | |||
| where N increases as a function of the HighSpeed TCP's congestion | one, and where N increases as a function of the HighSpeed TCP's | |||
| window. Thus for the HighSpeed response function in Equation (1) | congestion window. Thus for the HighSpeed response function in | |||
| above, the response function can be viewed as equivalent to that of | Equation (1) above, the response function can be viewed as | |||
| N(W) parallel TCP connections, where N(W) varies as a function of the | equivalent to that of N(W) parallel TCP connections, where N(W) | |||
| congestion window W. Recall that for a single standard TCP | varies as a function of the congestion window W. Recall that for a | |||
| connection, the average congestion window equals 1.2/sqrt(p). For N | single standard TCP connection, the average congestion window equals | |||
| parallel TCP connections, the aggregate congestion window W_n equals | 1.2/sqrt(p). For N parallel TCP connections, the aggregate | |||
| N*1.2/sqrt(p). From the HighSpeed response function in Equation (1) | congestion window for the N connections equals N*1.2/sqrt(p). From | |||
| and the relationship above, we can derive the following: | the HighSpeed response function in Equation (1) and the relationship | |||
| above, we can derive the following: | ||||
| N(W) = 0.23*W^(0.4) | N(W) = 0.23*W^(0.4) | |||
| for N(W) the number of parallel TCP connections emulated by the | for N(W) the number of parallel TCP connections emulated by the | |||
| HighSpeed TCP response function, and for N(W) >= 1. This is shown in | HighSpeed TCP response function, and for N(W) >= 1. This is shown | |||
| Table 4 below. | in Table 4 below. | |||
| Congestion Window W Number N(W) of Parallel TCPs | Congestion Window W Number N(W) of Parallel TCPs | |||
| ------------------- ------------------------- | ------------------- ------------------------- | |||
| 1 1 | 1 1 | |||
| 10 1 | 10 1 | |||
| 100 1.4 | 100 1.4 | |||
| 1,000 3.6 | 1,000 3.6 | |||
| 10,000 9.2 | 10,000 9.2 | |||
| 100,000 23.0 | 100,000 23.0 | |||
| Table 4: Number N(W) of parallel TCP connections roughly emulated by | Table 4: Number N(W) of parallel TCP connections roughly emulated by | |||
| the HighSpeed TCP response function. | the HighSpeed TCP response function. | |||
| We do not in this document attempt to seriously evaluate the | We do not in this document attempt to seriously evaluate the | |||
| HighSpeed response function for congestion windows greater than | HighSpeed response function for congestion windows greater than | |||
| 100,000 packets. We believe that we will learn more about the | 100,000 packets. We believe that we will learn more about the | |||
| requirements for sustaining the throughput of best-effort connections | requirements for sustaining the throughput of best-effort | |||
| in that range as we gain more experience with HighSpeed TCP with | connections in that range as we gain more experience with HighSpeed | |||
| congestion windows of thousands and tens of thousands of packets. | TCP with congestion windows of thousands and tens of thousands of | |||
| There also might be limitations to the per-connection throughput that | packets. There also might be limitations to the per-connection | |||
| can be realistically achieved for best-effort traffic in the absence | throughput that can be realistically achieved for best-effort | |||
| of additional support or feedback from the routers along the path. | traffic, in terms of congestion window of hundreds of thousands of | |||
| packets or more, in the absence of additional support or feedback | ||||
| from the routers along the path. | ||||
| 6. Fairness Implications of the HighSpeed Response Function. | 6. Fairness Implications of the HighSpeed Response Function. | |||
| The Standard and HighSpeed Response Functions can be used directly to | The Standard and HighSpeed Response Functions can be used directly | |||
| infer the relative fairness between flows using the two response | to infer the relative fairness between flows using the two response | |||
| functions. For example, given a packet drop rate P, assume that | functions. For example, given a packet drop rate P, assume that | |||
| Standard TCP has an average congestion window of W_Standard, and | Standard TCP has an average congestion window of W_Standard, and | |||
| HighSpeed TCP has a higher average congestion window of W_HighSpeed. | HighSpeed TCP has a higher average congestion window of W_HighSpeed. | |||
| In this case, a single HighSpeed TCP connection is receiving | ||||
| W_HighSpeed/W_Standard times the throughput of a single Standard TCP | ||||
| connection competing in the same environment. | ||||
| This relative fairness is illustrated below in Table 5, for the | In this case, a single HighSpeed TCP connection is receiving | |||
| parameters used for the HighSpeed response function in the section | W_HighSpeed/W_Standard times the throughput of a single Standard TCP | |||
| above. The second column gives the relative fairness, for the | connection competing in the same environment. | |||
| steady-state packet drop rate specified in the first column. To help | ||||
| calibrate, the third column gives the aggregate average congestion | ||||
| window for the two TCP connections, and the fourth column gives the | ||||
| bandwidth that would be needed by the two connections to achieve that | ||||
| aggregate window and packet drop rate, given 100 ms round-trip times | ||||
| and 1500-byte packets. | ||||
| Packet Drop Rate P Fairness Aggregate Window Bandwidth | This relative fairness is illustrated below in Table 5, for the | |||
| ------------------ -------- ---------------- --------- | parameters used for the HighSpeed response function in the section | |||
| 10^-2 1.0 24 2.8 Mbps | above. The second column gives the relative fairness, for the | |||
| 10^-3 1.0 76 9.1 Mbps | steady-state packet drop rate specified in the first column. To | |||
| 10^-4 2.2 383 45.9 Mbps | help calibrate, the third column gives the aggregate average | |||
| 10^-5 4.7 2174 260.8 Mbps | congestion window for the two TCP connections, and the fourth column | |||
| 10^-6 10.2 13479 1.6 Gbps | gives the bandwidth that would be needed by the two connections to | |||
| 10^-7 22.1 87776 10.5 Gbps | achieve that aggregate window and packet drop rate, given 100 ms | |||
| 10^-8 47.9 586356 70.3 Gbps | round-trip times and 1500-byte packets. | |||
| 10^-9 103.5 3966036 475.9 Gbps | ||||
| 10^-10 223.9 26984653 3238.1 Gbps | ||||
| Table 5: Relative Fairness between the HighSpeed and Standard | Packet Drop Rate P Fairness Aggregate Window Bandwidth | |||
| Response Functions. | ------------------ -------- ---------------- --------- | |||
| 10^-2 1.0 24 2.8 Mbps | ||||
| 10^-3 1.0 76 9.1 Mbps | ||||
| 10^-4 2.2 383 45.9 Mbps | ||||
| 10^-5 4.7 2174 260.8 Mbps | ||||
| 10^-6 10.2 13479 1.6 Gbps | ||||
| 10^-7 22.1 87776 10.5 Gbps | ||||
| Thus, for packet drop rates of 10^-4, a flow with the HighSpeed | Table 5: Relative Fairness between the HighSpeed and Standard | |||
| response function can expect to receive 2.2 times the throughput of a | Response Functions. | |||
| flow using the Standard response function, given the same round-trip | ||||
| times and packet sizes. With packet drop rates of 10^-6 (or 10^-7), | ||||
| the unfairness is more severe, and we have entered the regime where a | ||||
| Standard TCP connection requires a congestion event at most every 800 | ||||
| (or 2530) round-trip times in order to make use of the available | ||||
| bandwidth. Our judgement would be that there are not a lot of TCP | ||||
| connections effectively operating in this regime today, with | ||||
| congestion windows of thousands of packets, and that therefore the | ||||
| benefits of the HighSpeed response function would outweigh the | ||||
| unfairness that would be experienced by Standard TCP in this regime. | ||||
| However, one purpose of this document is to solicit feedback on this | ||||
| issue. The parameter Low_Window determines directly the point of | ||||
| divergence between the Standard and HighSpeed Response Functions. | ||||
| The third column of Table 5, the Aggregate Window, gives the | Thus, for packet drop rates of 10^-4, a flow with the HighSpeed | |||
| aggregate congestion window of the two competing TCP connections, | response function can expect to receive 2.2 times the throughput of | |||
| with HighSpeed and Standard TCP, given the packet drop rate specified | a flow using the Standard response function, given the same round- | |||
| in the first column. From Table 5, a HighSpeed TCP connection would | trip times and packet sizes. With packet drop rates of 10^-6 (or | |||
| receive ten times the bandwidth of a Standard TCP in an environment | 10^-7), the unfairness is more severe, and we have entered the | |||
| with a packet drop rate of 10^-6. This would occur when the two | regime where a Standard TCP connection requires at most one | |||
| flows sharing a single pipe achieved an aggregate window of 13479 | congestion event every 800 (or 2530) round-trip times in order to | |||
| packets. Given a round-trip time of 100 ms and a packet size of 1500 | make use of the available bandwidth. Our judgement would be that | |||
| bytes, this would occur with an available bandwidth for the two | there are not a lot of TCP connections effectively operating in this | |||
| competing flows of 1.6 Gbps. | regime today, with congestion windows of thousands of packets, and | |||
| that therefore the benefits of the HighSpeed response function would | ||||
| outweigh the unfairness that would be experienced by Standard TCP in | ||||
| this regime. However, one purpose of this document is to solicit | ||||
| feedback on this issue. The parameter Low_Window determines | ||||
| directly the point of divergence between the Standard and HighSpeed | ||||
| Response Functions. | ||||
| Next we consider the time that it takes two HighSpeed TCP flows to | The third column of Table 5, the Aggregate Window, gives the | |||
| converge to fairness. The worst case for convergence to fairness | aggregate congestion window of the two competing TCP connections, | |||
| occurs when a new flow is starting up, competing against a high- | with HighSpeed and Standard TCP, given the packet drop rate | |||
| bandwidth existing flow, and the new flow suffers a packet drop and | specified in the first column. From Table 5, a HighSpeed TCP | |||
| exits slow-start while its window is still small. In the worst case, | connection would receive ten times the bandwidth of a Standard TCP | |||
| consider that the new flow has entered the congestion avoidance phase | in an environment with a packet drop rate of 10^-6. This would | |||
| while its window is only one packet. A standard TCP flow in | occur when the two flows sharing a single pipe achieved an aggregate | |||
| congestion avoidance increases its window by at most one packet per | window of 13479 packets. Given a round-trip time of 100 ms and a | |||
| round-trip time, and after N round-trip times has only achieved a | packet size of 1500 bytes, this would occur with an available | |||
| window of N packets (when starting with a window of 1 in the first | bandwidth for the two competing flows of 1.6 Gbps. | |||
| round-trip time). In contrast, a HighSpeed TCP flow increases much | ||||
| faster than a standard TCP flow while in the congestion avoidance | ||||
| phase, and we can expect its convergence to fairness to be much | ||||
| better. This is shown in Table 6 below. The script used to generate | ||||
| this table is given in Appendix C. | ||||
| RTT HS_Window Standard_TCP_Window | Next we consider the time that it takes a standard or HighSpeed TCP | |||
| --- --------- ------------------- | flow to converge to fairness against a pre-existing HighSpeed TCP | |||
| 100 131 100 | flow. The worst case for convergence to fairness occurs when a new | |||
| 200 475 200 | flow is starting up, competing against a high-bandwidth existing | |||
| 300 1131 300 | flow, and the new flow suffers a packet drop and exits slow-start | |||
| 400 2160 400 | while its window is still small. In the worst case, consider that | |||
| 500 3601 500 | the new flow has entered the congestion avoidance phase while its | |||
| 600 5477 600 | window is only one packet. A standard TCP flow in congestion | |||
| 700 7799 700 | avoidance increases its window by at most one packet per round-trip | |||
| 800 10567 800 | time, and after N round-trip times has only achieved a window of N | |||
| 900 13774 900 | packets (when starting with a window of 1 in the first round-trip | |||
| 1000 17409 1000 | time). In contrast, a HighSpeed TCP flow increases much faster | |||
| 1100 21455 1100 | than a standard TCP flow while in the congestion avoidance phase, | |||
| 1200 25893 1200 | and we can expect its convergence to fairness to be much better. | |||
| 1300 30701 1300 | This is shown in Table 6 below. The script used to generate this | |||
| 1400 35856 1400 | table is given in Appendix C. | |||
| 1500 41336 1500 | ||||
| 1600 47115 1600 | ||||
| 1700 53170 1700 | ||||
| 1800 59477 1800 | ||||
| 1900 66013 1900 | ||||
| 2000 72754 2000 | ||||
| Table 6: For a HighSpeed and a Standard TCP connection, the | RTT HS_Window Standard_TCP_Window | |||
| congestion window during congestion avoidance phase (starting with a | --- --------- ------------------- | |||
| congestion window of 1 packet during RTT 1). | 100 131 100 | |||
| 200 475 200 | ||||
| 300 1131 300 | ||||
| 400 2160 400 | ||||
| 500 3601 500 | ||||
| 600 5477 600 | ||||
| 700 7799 700 | ||||
| 800 10567 800 | ||||
| 900 13774 900 | ||||
| 1000 17409 1000 | ||||
| 1100 21455 1100 | ||||
| 1200 25893 1200 | ||||
| 1300 30701 1300 | ||||
| 1400 35856 1400 | ||||
| 1500 41336 1500 | ||||
| 1600 47115 1600 | ||||
| 1700 53170 1700 | ||||
| 1800 59477 1800 | ||||
| 1900 66013 1900 | ||||
| 2000 72754 2000 | ||||
| The classic paper on relative fairness is from Chiu and Jain [CJ89]. | Table 6: For a HighSpeed and a Standard TCP connection, the | |||
| This paper shows that AIMD (Additive Increase Multiplicative | congestion window during congestion avoidance phase (starting with a | |||
| Decrease) converges to fairness in an environment with synchronized | congestion window of 1 packet during RTT 1). | |||
| congestion events. From [CJ89], it is easy to see that MIMD and AIAD | ||||
| do not converge to fairness in this environment. However, the | ||||
| results of [CJ89] do not apply to an asynchronous environment such as | ||||
| that of the current Internet, where the frequency of congestion | ||||
| feedback can be different for different flows. For example, it has | ||||
| been shown that MIMD converges to fair states in a model with | ||||
| proportional instead of synchronous feedback in terms of packet drops | ||||
| [GV02]. Thus, we are not concerned about abandoning a strict model | The classic paper on relative fairness is from Chiu and Jain [CJ89]. | |||
| of AIMD for HighSpeed TCP. | This paper shows that AIMD (Additive Increase Multiplicative | |||
| Decrease) converges to fairness in an environment with synchronized | ||||
| congestion events. From [CJ89], it is easy to see that MIMD and | ||||
| AIAD do not converge to fairness in this environment. However, the | ||||
| results of [CJ89] do not apply to an asynchronous environment such | ||||
| as that of the current Internet, where the frequency of congestion | ||||
| feedback can be different for different flows. For example, it has | ||||
| been shown that MIMD converges to fair states in a model with | ||||
| proportional instead of synchronous feedback in terms of packet | ||||
| drops [GV02]. Thus, we are not concerned about abandoning a strict | ||||
| model of AIMD for HighSpeed TCP. | ||||
| 7. Translating the HighSpeed Response Function into Congestion Control | 7. Translating the HighSpeed Response Function into Congestion Control | |||
| Parameters. | Parameters. | |||
| For equation-based congestion control such as TFRC, the HighSpeed | For equation-based congestion control such as TFRC, the HighSpeed | |||
| Response Function above could be used directly by the TFRC congestion | Response Function above could be used directly by the TFRC | |||
| control mechanism. However, for TCP the HighSpeed response function | congestion control mechanism. However, for TCP the HighSpeed | |||
| would have to be translated into additive increase and multiplicative | response function has to be translated into additive increase and | |||
| decrease parameters. The HighSpeed response function cannot be | multiplicative decrease parameters. The HighSpeed response function | |||
| achieved by TCP with an additive increase of one segment per round- | cannot be achieved by TCP with an additive increase of one segment | |||
| trip time and a multiplicative decrease of halving the current | per round-trip time and a multiplicative decrease of halving the | |||
| congestion window; HighSpeed TCP will have to modify either the | current congestion window; HighSpeed TCP will have to modify either | |||
| increase or the decrease parameter, or both. We have concluded that | the increase or the decrease parameter, or both. We have concluded | |||
| HighSpeed TCP is most likely to achieve an acceptable compromise | that HighSpeed TCP is most likely to achieve an acceptable | |||
| between moderate increases and timely decreases by modifying both the | compromise between moderate increases and timely decreases by | |||
| increase and the decrease parameter. | modifying both the increase and the decrease parameter. | |||
| That is, for HighSpeed TCP let the congestion window increase by a(w) | That is, for HighSpeed TCP let the congestion window increase by | |||
| segments per round-trip time in the absence of congestion, and let | a(w) segments per round-trip time in the absence of congestion, and | |||
| the congestion window decrease to w(1-b(w)) segments in response to a | let the congestion window decrease to w(1-b(w)) segments in response | |||
| round-trip time with one or more loss events. Thus, in response to a | to a round-trip time with one or more loss events. Thus, in | |||
| single acknowledgement HighSpeed TCP increases its congestion window | response to a single acknowledgement HighSpeed TCP increases its | |||
| in segments as follows: | congestion window in segments as follows: | |||
| w <- w + a(w)/w. | w <- w + a(w)/w. | |||
| In response to a congestion event, HighSpeed TCP decreases as | In response to a congestion event, HighSpeed TCP decreases as | |||
| follows: | follows: | |||
| w <- (1-b(w))w. | w <- (1-b(w))w. | |||
| For Standard TCP, a(w) = 1 and b(w) = 1/2, regardless of the value of | For Standard TCP, a(w) = 1 and b(w) = 1/2, regardless of the value | |||
| w. HighSpeed TCP uses the same values of a(w) and b(w) for w <= | of w. HighSpeed TCP uses the same values of a(w) and b(w) for w <= | |||
| Low_Window. This section specifies a(w) and b(w) for HighSpeed TCP | Low_Window. This section specifies a(w) and b(w) for HighSpeed TCP | |||
| for larger values of w. | for larger values of w. | |||
| For w = High_Window, we have specified a loss rate of High_P. From | For w = High_Window, we have specified a loss rate of High_P. From | |||
| [FRS02], or from elementary calculations, this requires the following | [FRS02], or from elementary calculations, this requires the | |||
| relationship between a(w) and b(w) for w = High_Window: | following relationship between a(w) and b(w) for w = High_Window: | |||
| a(w) = High_Window^2 * High_P * 2 * b(w)/(2-b(w)). (2) | a(w) = High_Window^2 * High_P * 2 * b(w)/(2-b(w)). (2) | |||
| We use the parameter High_Decrease to specify the decrease parameter | We use the parameter High_Decrease to specify the decrease parameter | |||
| b(w) for w = High_Window, and use Equation (2) to derive the increase | b(w) for w = High_Window, and use Equation (2) to derive the | |||
| parameter a(w) for w = High_Window. Along with High_P = 10^-7 and | increase parameter a(w) for w = High_Window. Along with High_P = | |||
| High_Window = 83000, for example, we specify High_Decrease = 0.1, | 10^-7 and High_Window = 83000, for example, we specify High_Decrease | |||
| specifying that b(83000) = 0.1, giving a decrease of 10% after a | = 0.1, specifying that b(83000) = 0.1, giving a decrease of 10% | |||
| congestion event. Equation (2) then gives a(83000) = 72, for an | after a congestion event. Equation (2) then gives a(83000) = 72, | |||
| increase of 72 segments, or just under 0.1%, within a round-trip | for an increase of 72 segments, or just under 0.1%, within a round- | |||
| time, for w = 83000. | trip time, for w = 83000. | |||
| This moderate decrease strikes us as acceptable, particularly when | This moderate decrease strikes us as acceptable, particularly when | |||
| coupled with the role of TCP's ACK-clocking in limiting the sending | coupled with the role of TCP's ACK-clocking in limiting the sending | |||
| rate in response to more severe congestion [BBFS01]. A more severe | rate in response to more severe congestion [BBFS01]. A more severe | |||
| decrease would require a more aggressive increase in the congestion | decrease would require a more aggressive increase in the congestion | |||
| window for a round-trip time without congestion. In particular, a | window for a round-trip time without congestion. In particular, a | |||
| decrease factor High_Decrease of 0.5, as in Standard TCP, would | decrease factor High_Decrease of 0.5, as in Standard TCP, would | |||
| require an increase of 459 segments per round-trip time when w = | require an increase of 459 segments per round-trip time when w = | |||
| 83000. | 83000. | |||
| Given decrease parameters of b(w) = 1/2 for w = Low_Window, and b(w) | Given decrease parameters of b(w) = 1/2 for w = Low_Window, and b(w) | |||
| = High_Decrease for w = High_Window, we are left to specify the value | = High_Decrease for w = High_Window, we are left to specify the | |||
| of b(w) for other values of w > Low_Window. From [FRS02], we let | value of b(w) for other values of w > Low_Window. From [FRS02], we | |||
| b(w) vary linearly as the log of w, as follows: | let b(w) vary linearly as the log of w, as follows: | |||
| b(w) = (High_Decrease - 0.5) (log(w)-log(W)) / (log(W_1)-log(W)) + | b(w) = (High_Decrease - 0.5) (log(w)-log(W)) / (log(W_1)-log(W)) + | |||
| 0.5. | 0.5, | |||
| The increase parameter a(w) can then be computed as follows: | for W = Low_Window and W_1 = High_Window. The increase parameter | |||
| a(w) can then be computed as follows: | ||||
| a(w) = w^2 * p(w) * 2 * b(w)/(2-b(w)), | a(w) = w^2 * p(w) * 2 * b(w)/(2-b(w)), | |||
| for p(w) the packet drop rate for congestion window w. From | for p(w) the packet drop rate for congestion window w. From | |||
| inverting Equation (1), we get p(w) as follows: | inverting Equation (1), we get p(w) as follows: | |||
| p(w) = 0.078/w^1.2. | p(w) = 0.078/w^1.2. | |||
| We assume that experimental implementations of HighSpeed TCP for | We assume that experimental implementations of HighSpeed TCP for | |||
| further investigation will use a pre-computed look-up table for | further investigation will use a pre-computed look-up table for | |||
| finding a(w) and b(w). For example, the implementation from Tom | finding a(w) and b(w). For example, the implementation from Tom | |||
| Dunigan adjusts the a(w) and b(w) parameters every 0.1 seconds. In | Dunigan adjusts the a(w) and b(w) parameters every 0.1 seconds. In | |||
| the appendix we give such a table for our default values of | the appendix we give such a table for our default values of | |||
| Low_Window = 38, High_Window = 83,000, High_P = 10^-7, and | Low_Window = 38, High_Window = 83,000, High_P = 10^-7, and | |||
| High_Decrease = 0.1. These are also the default values in the NS | High_Decrease = 0.1. These are also the default values in the NS | |||
| simulator; example simulations in NS can be run with the command | simulator; example simulations in NS can be run with the command | |||
| "./test-all-tcpHighspeed" in the directory tcl/test. | "./test-all-tcpHighspeed" in the directory tcl/test. | |||
| 8. An alternate, linear response function. | 8. An alternate, linear response function. | |||
| In this section we explore an alternate, linear response function for | In this section we explore an alternate, linear response function | |||
| HighSpeed TCP that has been proposed by a number of other people, in | for HighSpeed TCP that has been proposed by a number of other | |||
| particular by Glenn Vinnicombe and Tom Kelly. Similarly, it has been | people, in particular by Glenn Vinnicombe and Tom Kelly. Similarly, | |||
| suggested by others that a less "ad-hoc" guideline for a response | it has been suggested by others that a less "ad-hoc" guideline for a | |||
| function for HighSpeed TCP would be to specify a constant value for | response function for HighSpeed TCP would be to specify a constant | |||
| the number of round-trip times between congestion events. | value for the number of round-trip times between congestion events. | |||
| Assume that we keep the value of Low_Window as 38 MSS-sized segments, | Assume that we keep the value of Low_Window as 38 MSS-sized | |||
| indicating when the HighSpeed response function diverges from the | segments, indicating when the HighSpeed response function diverges | |||
| current TCP response function, but that we modify the High_Window and | from the current TCP response function, but that we modify the | |||
| High_P parameters that specify the upper range of the HighSpeed | High_Window and High_P parameters that specify the upper range of | |||
| response function. In particular, consider the response function | the HighSpeed response function. In particular, consider the | |||
| given by High_Window = 380,000 and High_P = 10^-7, with Low_Window = | response function given by High_Window = 380,000 and High_P = 10^-7, | |||
| 38 and Low_P = 10^-3 as before. | with Low_Window = 38 and Low_P = 10^-3 as before. | |||
| Using the equations in Section 5, this would give the following | Using the equations in Section 5, this would give the following | |||
| Linear response function, for w > Low_Window: | Linear response function, for w > Low_Window: | |||
| W = 0.038/p. | W = 0.038/p. | |||
| This Linear HighSpeed response function is illustrated in Table 7 | This Linear HighSpeed response function is illustrated in Table 7 | |||
| below. For HighSpeed TCP, the number of round-trip times between | below. For HighSpeed TCP, the number of round-trip times between | |||
| losses, 1/(pW), equals 1/0.038, or approximately 26, for W > 38 | losses, 1/(pW), equals 1/0.038, or approximately 26, for W > 38 | |||
| segments. | segments. | |||
| Packet Drop Rate P Congestion Window W RTTs Between Losses | Packet Drop Rate P Congestion Window W RTTs Between Losses | |||
| ------------------ ------------------- ------------------- | ------------------ ------------------- ------------------- | |||
| 10^-2 12 8 | 10^-2 12 8 | |||
| 10^-3 38 26 | 10^-3 38 26 | |||
| 10^-4 380 26 | 10^-4 380 26 | |||
| 10^-5 3800 26 | 10^-5 3800 26 | |||
| 10^-6 38000 26 | 10^-6 38000 26 | |||
| 10^-7 380000 26 | 10^-7 380000 26 | |||
| 10^-8 3800000 26 | 10^-8 3800000 26 | |||
| 10^-9 38000000 26 | 10^-9 38000000 26 | |||
| 10^-10 380000000 26 | 10^-10 380000000 26 | |||
| Table 7: An Alternate, Linear TCP Response Function for HighSpeed | Table 7: An Alternate, Linear TCP Response Function for HighSpeed | |||
| TCP. The average congestion window W in MSS-sized segments is given | TCP. The average congestion window W in MSS-sized segments is given | |||
| as a function of the packet drop rate P. | as a function of the packet drop rate P. | |||
| Given a constant decrease b(w) of 1/2, this would give an increase | Given a constant decrease b(w) of 1/2, this would give an increase | |||
| a(w) of w/Low_Window, or equivalently, an constant increase of | a(w) of w/Low_Window, or equivalently, a constant increase of | |||
| 1/Low_Window packets per acknowledgement, for w > Low_Window. | 1/Low_Window packets per acknowledgement, for w > Low_Window. | |||
| Another possibility is Scalable TCP [K03], which uses a fixed | Another possibility is Scalable TCP [K03], which uses a fixed | |||
| decrease b(w) of 1/8 and a fixed increase per acknowledgement of | decrease b(w) of 1/8 and a fixed increase per acknowledgement of | |||
| 0.01. This gives an increase a(w) per window of 0.005 w, for a TCP | 0.01. This gives an increase a(w) per window of 0.005 w, for a TCP | |||
| with delayed acknowledgements. | with delayed acknowledgements, for pure MIMD. | |||
| The relative fairness between the alternate Linear response function | The relative fairness between the alternate Linear response function | |||
| and the standard TCP response function is illustrated below in Table | and the standard TCP response function is illustrated below in Table | |||
| 8. | 8. | |||
| Packet Drop Rate P Fairness Aggregate Window Bandwidth | Packet Drop Rate P Fairness Aggregate Window Bandwidth | |||
| ------------------ -------- ---------------- --------- | ------------------ -------- ---------------- --------- | |||
| 10^-2 1.0 24 2.8 Mbps | 10^-2 1.0 24 2.8 Mbps | |||
| 10^-3 1.0 76 9.1 Mbps | 10^-3 1.0 76 9.1 Mbps | |||
| 10^-4 3.2 500 60.0 Mbps | 10^-4 3.2 500 60.0 Mbps | |||
| 10^-5 15.1 4179 501.4 Mbps | 10^-5 15.1 4179 501.4 Mbps | |||
| 10^-6 31.6 39200 4.7 Gbps | 10^-6 31.6 39200 4.7 Gbps | |||
| 10^-7 100.1 383795 46.0 Gbps | 10^-7 100.1 383795 46.0 Gbps | |||
| 10^-8 316.6 3812000 457.4 Gbps | ||||
| 10^-9 1001.3 38037948 4564.5 Gbps | ||||
| 10^-10 3166.6 380120000 45614.4 Gbps | ||||
| Table 8: Relative Fairness between the Linear HighSpeed and Standard | Table 8: Relative Fairness between the Linear HighSpeed and Standard | |||
| Response Functions. | Response Functions. | |||
| One attraction of the linear response function is that it is scale- | One attraction of the linear response function is that it is scale- | |||
| invariant, with a fixed increase in the congestion window per | invariant, with a fixed increase in the congestion window per | |||
| acknowledgement, and a fixed number of round-trip times between loss | acknowledgement, and a fixed number of round-trip times between loss | |||
| events. My own assumption would be that having a fixed length for | events. My own assumption would be that having a fixed length for | |||
| the congestion epoch in round-trip times, regardless of the packet | the congestion epoch in round-trip times, regardless of the packet | |||
| drop rate, would be a poor fit for an imprecise and imperfect world | drop rate, would be a poor fit for an imprecise and imperfect world | |||
| with routers with a range of queue management mechanisms, such as the | with routers with a range of queue management mechanisms, such as | |||
| Drop-Tail queue management that is common today. For example, a | the Drop-Tail queue management that is common today. For example, a | |||
| response function with a fixed length for the congestion epoch in | response function with a fixed length for the congestion epoch in | |||
| round-trip times might give less clearly-differentiated feedback in | round-trip times might give less clearly-differentiated feedback in | |||
| an environment with steady-state background losses at fixed intervals | an environment with steady-state background losses at fixed | |||
| for all flows (as might occur with a wireless link with occasional | intervals for all flows (as might occur with a wireless link with | |||
| short error bursts, giving losses for all flows every N seconds | occasional short error bursts, giving losses for all flows every N | |||
| regardless of their sending rate). | seconds regardless of their sending rate). | |||
| While it is not a goal to have perfect fairness in an environment | While it is not a goal to have perfect fairness in an environment | |||
| with synchronized losses, it would be good to have moderately | with synchronized losses, it would be good to have moderately | |||
| acceptable performance in this regime. This goal might argue against | acceptable performance in this regime. This goal might argue | |||
| a response function with a constant number of round-trip times | against a response function with a constant number of round-trip | |||
| between congestion events. However, this is a question that could | times between congestion events. However, this is a question that | |||
| clearly use additional research and investigation. In addition, | could clearly use additional research and investigation. In | |||
| flows with different round-trip times would have different time | addition, flows with different round-trip times would have different | |||
| durations for congestion epochs even in the model with a linear | time durations for congestion epochs even in the model with a linear | |||
| response function. | response function. | |||
| The third column of Table 8, the Aggregate Window, gives the | The third column of Table 8, the Aggregate Window, gives the | |||
| aggregate congestion window of two competing TCP connections, one | aggregate congestion window of two competing TCP connections, one | |||
| with Linear HighSpeed TCP and one with Standard TCP, given the packet | with Linear HighSpeed TCP and one with Standard TCP, given the | |||
| drop rate specified in the first column. From Table 8, a Linear | packet drop rate specified in the first column. From Table 8, a | |||
| HighSpeed TCP connection would receive fifteen times the bandwidth of | Linear HighSpeed TCP connection would receive fifteen times the | |||
| a Standard TCP in an environment with a packet drop rate of 10^-5. | bandwidth of a Standard TCP in an environment with a packet drop | |||
| This would occur when the two flows sharing a single pipe achieved an | rate of 10^-5. This would occur when the two flows sharing a single | |||
| aggregate window of 4179 packets. Given a round-trip time of 100 ms | pipe achieved an aggregate window of 4179 packets. Given a round- | |||
| and a packet size of 1500 bytes, this would occur with an available | trip time of 100 ms and a packet size of 1500 bytes, this would | |||
| bandwidth for the two competing flows of 501 Mbps. Thus, because the | occur with an available bandwidth for the two competing flows of 501 | |||
| Linear HighSpeed TCP is more aggressive than the HighSpeed TCP | Mbps. Thus, because the Linear HighSpeed TCP is more aggressive | |||
| proposed above, it also is less fair when competing with Standard TCP | than the HighSpeed TCP proposed above, it also is less fair when | |||
| in a high-bandwidth environment. | competing with Standard TCP in a high-bandwidth environment. | |||
| 9. Tradeoffs for Choosing Congestion Control Parameters. | 9. Tradeoffs for Choosing Congestion Control Parameters. | |||
| A range of metrics can be used for evaluating choices for congestion | A range of metrics can be used for evaluating choices for congestion | |||
| control parameters for HighSpeed TCP. My assumption in this section | control parameters for HighSpeed TCP. My assumption in this section | |||
| is that for a response function of the form w = c/p^d, for constant c | is that for a response function of the form w = c/p^d, for constant | |||
| and exponent d, the only response functions that would be considered | c and exponent d, the only response functions that would be | |||
| are response functions with 1/2 <= d <= 1. The two ends of this | considered are response functions with 1/2 <= d <= 1. The two ends | |||
| spectrum are represented by current TCP, with d = 1/2, and by the | of this spectrum are represented by current TCP, with d = 1/2, and | |||
| linear response function described in Section 8 above, with d = 1. | by the linear response function described in Section 8 above, with d | |||
| HighSpeed TCP lies somewhere in the middle of the spectrum, with d = | = 1. HighSpeed TCP lies somewhere in the middle of the spectrum, | |||
| 0.835. | with d = 0.835. | |||
| Response functions with exponents less than 1/2 can be eliminated | Response functions with exponents less than 1/2 can be eliminated | |||
| from consideration because they would be even worse than standard TCP | from consideration because they would be even worse than standard | |||
| in accommodating connections with high congestion windows. | TCP in accommodating connections with high congestion windows. | |||
| 9.1. The Number of Round-Trip Times between Loss Events. | 9.1. The Number of Round-Trip Times between Loss Events. | |||
| Response functions with exponents greater than 1 can be eliminated | Response functions with exponents greater than 1 can be eliminated | |||
| from consideration because for these response functions, the number | from consideration because for these response functions, the number | |||
| of round-trip times between loss events decreases as congestion | of round-trip times between loss events decreases as congestion | |||
| decreases. For a response function of w = c/p^d, with one loss event | decreases. For a response function of w = c/p^d, with one loss | |||
| or congestion event every 1/p packets, the number of round-trip times | event or congestion event every 1/p packets, the number of round- | |||
| between loss events is w^((1/d)-1)/c^(1/d). Thus, for standard TCP | trip times between loss events is w^((1/d)-1)/c^(1/d). Thus, for | |||
| the number of round-trip times between loss events is linear in w. | standard TCP the number of round-trip times between loss events is | |||
| In contrast, one attraction of the linear response function, as | linear in w. In contrast, one attraction of the linear response | |||
| described in Section 8 above, is that it is scale-invariant, in terms | function, as described in Section 8 above, is that it is scale- | |||
| of a fixed increase in the congestion window per acknowledgement, and | invariant, in terms of a fixed increase in the congestion window per | |||
| a fixed number of round-trip times between loss events. | acknowledgement, and a fixed number of round-trip times between loss | |||
| events. | ||||
| However, for a response function with d > 1, the number of round-trip | However, for a response function with d > 1, the number of round- | |||
| times between loss events would be proportional to w^((1/d)-1), for a | trip times between loss events would be proportional to w^((1/d)-1), | |||
| negative exponent ((1/d)-1), getting smaller as w increases. This | for a negative exponent ((1/d)-1), getting smaller as w increases. | |||
| would seem undesirable. | This would seem undesirable. | |||
| 9.2. The Number of Packet Drops per Loss Event, with Drop-Tail. | 9.2. The Number of Packet Drops per Loss Event, with Drop-Tail. | |||
| A TCP connection increases its sending rate by a(w) packets per | A TCP connection increases its sending rate by a(w) packets per | |||
| round-trip time, and in a Drop-Tail environment, this is likely to | round-trip time, and in a Drop-Tail environment, this is likely to | |||
| result in a(w) dropped packets during a single loss event. One | result in a(w) dropped packets during a single loss event. One | |||
| attraction of standard TCP is that it has a fixed increase per round- | attraction of standard TCP is that it has a fixed increase per | |||
| trip time of one packet, minimizing the number of packets that would | round-trip time of one packet, minimizing the number of packets that | |||
| be dropped in a Drop-Tail environment. For an environment with some | would be dropped in a Drop-Tail environment. For an environment | |||
| form of Active Queue Management, and in particular for an environment | with some form of Active Queue Management, and in particular for an | |||
| that uses ECN, the number of packets dropped in a single congestion | environment that uses ECN, the number of packets dropped in a single | |||
| event would not be a problem. However, even in these environments, | congestion event would not be a problem. However, even in these | |||
| larger increases in the sending rate per round-trip time result in | environments, larger increases in the sending rate per round-trip | |||
| larger stresses on the ability of the queues in the router to absorb | time result in larger stresses on the ability of the queues in the | |||
| the fluctuations. | router to absorb the fluctuations. | |||
| HighSpeed TCP occupies a middle ground between the metrics of a moderate | HighSpeed TCP occupies a middle ground between the metrics of a | |||
| number of round-trip times between loss events, and a moderate | moderate number of round-trip times between loss events, and a | |||
| increase in the sending rate per round-trip time. As shown in | moderate increase in the sending rate per round-trip time. As shown | |||
| Appendix B, for a congestion window of 83,000 packets, HighSpeed TCP | in Appendix B, for a congestion window of 83,000 packets, HighSpeed | |||
| increases its sending rate by 70 packets per round-trip time, | TCP increases its sending rate by 70 packets per round-trip time, | |||
| resulting in roughly 70 packet drops for each congestion event in a | resulting in at most 70 packet drops when the buffer overflows in a | |||
| Drop-Tail environment. This increased aggressiveness is the price | Drop-Tail environment. This increased aggressiveness is the price | |||
| paid by HighSpeed TCP for its increased scalability. A large number | paid by HighSpeed TCP for its increased scalability. A large number | |||
| of packets dropped per congestion event could result in synchronized | of packets dropped per congestion event could result in synchronized | |||
| drops from multiple flows, with a possible loss of throughput as a | drops from multiple flows, with a possible loss of throughput as a | |||
| result. | result. | |||
| Scalable TCP has an increase a(w) of 0.005 w packets per round-trip | Scalable TCP has an increase a(w) of 0.005 w packets per round-trip | |||
| time. For a congestion window of 83,000 packets, this gives an | time. For a congestion window of 83,000 packets, this gives an | |||
| increase of 415 packets per round-trip time, resulting in roughly 415 | increase of 415 packets per round-trip time, resulting in roughly | |||
| packet drops per congestion event in a Drop-Tail environment. | 415 packet drops per congestion event in a Drop-Tail environment. | |||
| Thus, HighSpeed TCP and its variants place increased demands on queue | Thus, HighSpeed TCP and its variants place increased demands on | |||
| management in routers, relative to Standard TCP. (This is rather | queue management in routers, relative to Standard TCP. (This is | |||
| similar to the increased demands on queue management that would | rather similar to the increased demands on queue management that | |||
| result from using N parallel TCP connections instead of a single | would result from using N parallel TCP connections instead of a | |||
| Standard TCP connection.) | single Standard TCP connection.) | |||
| 10. Slow-Start. | 10. Related Issues | |||
| A companion internet-draft on "Limited Slow-Start for TCP with Large | 10.1. Slow-Start. | |||
| Congestion Windows" [F02b] proposes a modification to TCP's slow- | ||||
| start procedure that can significantly improve the performance of TCP | ||||
| connections slow-starting up to large congestion windows. For TCP | ||||
| connections that are able to use congestion windows of thousands (or | ||||
| tens of thousands) of MSS-sized segments (for MSS the sender's | ||||
| MAXIMUM SEGMENT SIZE), the current slow-start procedure can result in | ||||
| increasing the congestion window by thousands of segments in a single | ||||
| round-trip time. Such an increase can easily result in thousands of | ||||
| packets being dropped in one round-trip time. This is often counter- | ||||
| productive for the TCP flow itself, and is also hard on the rest of | ||||
| the traffic sharing the congested link. | ||||
| [F02b] proposes Limited Slow-Start, limiting the number of segments | A companion internet-draft on "Limited Slow-Start for TCP with | |||
| by which the congestion window is increased for one window of data | Large Congestion Windows" [F02b] proposes a modification to TCP's | |||
| during slow-start, in order to improve performance for TCP | slow-start procedure that can significantly improve the performance | |||
| connections with large congestion windows. We have separated out | of TCP connections slow-starting up to large congestion windows. | |||
| Limited Slow-Start to a separate draft because it can be used both | For TCP connections that are able to use congestion windows of | |||
| with Standard or with HighSpeed TCP. | thousands (or tens of thousands) of MSS-sized segments (for MSS the | |||
| sender's MAXIMUM SEGMENT SIZE), the current slow-start procedure can | ||||
| result in increasing the congestion window by thousands of segments | ||||
| in a single round-trip time. Such an increase can easily result in | ||||
| thousands of packets being dropped in one round-trip time. This is | ||||
| often counter-productive for the TCP flow itself, and is also hard | ||||
| on the rest of the traffic sharing the congested link. | ||||
| Limited Slow-Start is illustrated in the NS simulator, for snapshots | [F02b] proposes Limited Slow-Start, limiting the number of segments | |||
| after May 1, 2002, in the tests "./test-all-tcpHighspeed tcp1A" and | by which the congestion window is increased for one window of data | |||
| "./test-all-tcpHighspeed tcpHighspeed1" in the subdirectory | during slow-start, in order to improve performance for TCP | |||
| "tcl/lib". | connections with large congestion windows. We have separated out | |||
| Limited Slow-Start to a separate draft because it can be used both | ||||
| with Standard or with HighSpeed TCP. | ||||
| In order for best-effort flows to safely start-up faster than slow- | Limited Slow-Start is illustrated in the NS simulator, for snapshots | |||
| start, e.g., in future high-bandwidth networks, we believe that it | after May 1, 2002, in the tests "./test-all-tcpHighspeed tcp1A" and | |||
| would be necessary for the flow to have explicit feedback from the | "./test-all-tcpHighspeed tcpHighspeed1" in the subdirectory | |||
| routers along the path. There are a number of proposals for this, | "tcl/lib". | |||
| ranging from a minimal proposal for an IP option that allows TCP SYN | ||||
| packets to collect information from routers along the path about the | ||||
| allowed initial sending rate [J02], to proposals with more power that | ||||
| require more fine-tuned and continuous feedback from routers. These | ||||
| proposals all are somewhat longer-term proposals that the HighSpeed | ||||
| TCP proposal in this document, requiring longer lead times and more | ||||
| coordination for deployment, and will be discussed in later | ||||
| documents. | ||||
| 11. Other limitations on window size. | In order for best-effort flows to safely start-up faster than slow- | |||
| start, e.g., in future high-bandwidth networks, we believe that it | ||||
| would be necessary for the flow to have explicit feedback from the | ||||
| routers along the path. There are a number of proposals for this, | ||||
| ranging from a minimal proposal for an IP option that allows TCP SYN | ||||
| packets to collect information from routers along the path about the | ||||
| allowed initial sending rate [J02], to proposals with more power | ||||
| that require more fine-tuned and continuous feedback from routers. | ||||
| These proposals all are somewhat longer-term proposals than the | ||||
| HighSpeed TCP proposal in this document, requiring longer lead times | ||||
| and more coordination for deployment, and will be discussed in later | ||||
| documents. | ||||
| The TCP header uses a 16-bit field to report the receive window size | 10.2. Limiting burstiness on short time scales. | |||
| to the sender. Unmodified, this allows a window size of at most | ||||
| 2**16 = 65K bytes. With window scaling, the maximum window size is | Because the congestion window achieved by a HighSpeed TCP connection | |||
| 2**30 = 1073M bytes [RFC 1323]. Given 1500-byte packets, this allows | could be quite large, there is a possibility for the sender to send | |||
| a window of up to 715,000 packets. | a large burst of packets in response to a single acknowledgement. | |||
| This could happen, for example, when there is congestion or | ||||
| reordering on the reverse path, and the sender receives an | ||||
| acknowledgement acknowledging hundreds or thousands of new packets. | ||||
| Such a burst would also result if the application was idle for a | ||||
| short period of time less than a round-trip time, and then suddenly | ||||
| had lots of data available to send. In this case, it would be | ||||
| useful for the HighSpeed TCP connection to have some method for | ||||
| limiting bursts. | ||||
| We do not in this document specify TCP mechanisms for reducing the | ||||
| short-term burstiness. One possible mechanism is to use some form | ||||
| of rate-based pacing, and another possibility is to use maxburst, | ||||
| which limits the number of packets that are sent in response to a | ||||
| single acknowledgement. We would caution, however, against a | ||||
| permanent reduction in the congestion window as a mechanism for | ||||
| limiting short-term bursts. Such a mechanism has been deployed in | ||||
| some TCP stacks, and our view would be that using permanent | ||||
| reductions of the congestion window to reduce transient bursts would | ||||
| be a bad idea [Fl03]. | ||||
| 10.3. Other limitations on window size. | ||||
| The TCP header uses a 16-bit field to report the receive window size | ||||
| to the sender. Unmodified, this allows a window size of at most | ||||
| 2**16 = 65K bytes. With window scaling, the maximum window size is | ||||
| 2**30 = 1073M bytes [RFC 1323]. Given 1500-byte packets, this | ||||
| allows a window of up to 715,000 packets. | ||||
| 10.4. Implementation issues. | ||||
| One implementation issue that has been raised with HighSpeed TCP is | ||||
| that with congestion windows of 4MB or more, the handling of | ||||
| successive SACK packets after a packet is dropped becomes very time- | ||||
| consuming at the TCP sender [S03]. Tom Kelly's Scalable TCP | ||||
| includes a "SACK Fast Path" patch that addresses this problem. | ||||
| The issues addressed in the Web100 project, the Net100 project, and | ||||
| related projects about the tuning necessary to achieve high | ||||
| bandwidth data rates with TCP apply to HighSpeed TCP as well | ||||
| [Net100, Web100]. | ||||
| 11. Deployment issues. | ||||
| 11.1. Deployment issues of HighSpeed TCP | ||||
| We do not claim that the HighSpeed TCP modification to TCP described | ||||
| in this paper is an optimal transport protocol for high-bandwidth | ||||
| environments. Based on our experiences with HighSpeed TCP in the NS | ||||
| simulator [NS], on simulation studies [SA03], and on experimental | ||||
| reports [ABLLS03,D02,CC03,F03], we believe that HighSpeed TCP | ||||
| improves the performance of TCP in high-bandwidth environments, and | ||||
| we are documenting it for the benefit of the IETF community. We | ||||
| encourage the use of HighSpeed TCP, and of its underlying response | ||||
| function, and we further encourage feedback about operational | ||||
| experiences with this or related modifications. | ||||
| We note that in environments typical of much of the current | ||||
| Internet, HighSpeed TCP behaves exactly as does Standard TCP today. | ||||
| This is the case any time the congestion window is less than 38 | ||||
| segments. | ||||
| Bandwidth Avg Cwnd w (pkts) Increase a(w) Decrease b(w) | ||||
| --------- ----------------- ------------- ------------- | ||||
| 1.5 Mbps 12.5 1 0.50 | ||||
| 10 Mbps 83 1 0.50 | ||||
| 100 Mbps 833 6 0.35 | ||||
| 1 Gbps 8333 26 0.22 | ||||
| 10 Gbps 83333 70 0.10 | ||||
| Table 9: Performance of a HighSpeed TCP connection. | ||||
| To help calibrate, Table 9 considers a TCP connection with 1500-byte | ||||
| packets, an RTT of 100 ms (including average queueing delay), and no | ||||
| competing traffic, and shows the average congestion window if that | ||||
| TCP connection had a pipe all to itself and fully used the link | ||||
| bandwidth, for a range of bandwidths for the pipe. This assumes | ||||
| that the TCP connection would use Table 12 in determining its | ||||
| increase and decrease parameters. The first column of Table 9 gives | ||||
| the bandwidth, and the second column gives the average congestion | ||||
| window w needed to utilize that bandwidth. The third column shows | ||||
| the increase a(w) in segments per RTT for window w. The fourth | ||||
| column shows the decrease b(w) for that window w (where the TCP | ||||
| sender decreases the congestion window from w to w(1-b(w)) segments | ||||
| after a loss event). We note that the actual congestion window when | ||||
| a loss occurs is likely to be greater than the average congestion | ||||
| window w in column 2, so the decrease parameter used could be | ||||
| slightly smaller than the one given in column 4 of Table 9. | ||||
| Table 9 shows that a HighSpeed TCP over a 10 Mbps link behaves | ||||
| exactly the same as a Standard TCP connection, even in the absence | ||||
| of competing traffic. One can think of the congestion window | ||||
| staying generally in the range of 55 to 110 segments, with the | ||||
| HighSpeed TCP behavior being exactly the same as the behavior of | ||||
| Standard TCP. (If the congestion window is ever 128 segments or | ||||
| more, then the HighSpeed TCP increases by two segments per RTT | ||||
| instead of by one, and uses a decrease parameter of 0.44 instead of | ||||
| 0.50.) | ||||
| Table 9 shows that for a HighSpeed TCP connection over a 100 Mbps | ||||
| link, with no competing traffic, HighSpeed TCP behaves roughly as | ||||
| aggressively as six parallel TCP connections, increasing its | ||||
| congestion window by roughly six segments per round-trip time, and | ||||
| with a decrease parameter of roughly 1/3 (corresponding to | ||||
| decreasing down to 2/3-rds of its old congestion window, rather than | ||||
| to half, in response to a loss event). | ||||
| For a Standard TCP connection in this environment, the congestion | ||||
| window could be thought of as varying generally in the range of 550 | ||||
| to 1100 segments, with an average packet drop rate of 2.2 * 10^-6 | ||||
| (corresponding to a bit error rate of 1.8 * 10^-10), or | ||||
| equivalently, roughly 55 seconds between congestion events. While a | ||||
| Standard TCP connection could sustain such a low packet drop rate in | ||||
| a carefully controlled environment with minimal competing traffic, | ||||
| we would contend that in an uncontrolled best-effort environment | ||||
| with even a small amount of competing traffic, the occasional | ||||
| congestion events from smaller competing flows could easily be | ||||
| sufficient to prevent a Standard TCP flow with no lower-speed | ||||
| bottlenecks from fully utilizing the available bandwidth of the | ||||
| underutilized 100 Mbps link. | ||||
| That is, we would contend that in the environment of 100 Mbps links | ||||
| with a significant amount of available bandwidth, Standard TCP would | ||||
| sometimes be unable to fully utilize the link bandwidth, and that | ||||
| HighSpeed TCP would be an improvement in this regard. We would | ||||
| further contend that in this environment, the behavior of HighSpeed | ||||
| TCP is sufficiently close to that of Standard TCP that HighSpeed TCP | ||||
| would be safe to deploy in the current Internet. | ||||
| 11.2. Deployment issues of Scalable TCP | ||||
| We believe that Scalable TCP and HighSpeed TCP have sufficiently | ||||
| similar response functions that they could easily coexist in the | ||||
| Internet. However, we have not investigated Scalable TCP | ||||
| sufficiently to be able to claim, in this document, that Scalable | ||||
| TCP is safe for a widespread deployment in the current Internet. | ||||
| Bandwidth Avg Cwnd w (pkts) Increase a(w) Decrease b(w) | ||||
| --------- ----------------- ------------- ------------- | ||||
| 1.5 Mbps 12.5 1 0.50 | ||||
| 10 Mbps 83 0.4 0.125 | ||||
| 100 Mbps 833 4.1 0.125 | ||||
| 1 Gbps 8333 41.6 0.125 | ||||
| 10 Gbps 83333 416.5 0.125 | ||||
| Table 10: Performance of a Scalable TCP connection. | ||||
| Table 10 shows the performance of a Scalable TCP connection with | ||||
| 1500-byte packets, an RTT of 100 ms (including average queueing | ||||
| delay), and no competing traffic. The TCP connection is assumed to | ||||
| use delayed acknowledgements. The first column of Table 10 gives | ||||
| the bandwidth, the second column gives the average congestion window | ||||
| needed to utilize that bandwidth, and the third and fourth columns | ||||
| give the increase and decrease parameters. | ||||
| Note that even in an environment with a 10 Mbps link, Scalable TCP's | ||||
| behavior is considerably different from that of Standard TCP. The | ||||
| increase parameter is smaller than that of Standard TCP, and the | ||||
| decrease is smaller also, 1/8-th instead of 1/2. That is, for 10 | ||||
| Mbps links, Scalable TCP increases less aggressively than Standard | ||||
| TCP or HighSpeed TCP, but decreases less aggressively as well. | ||||
| In an environment with a 100 Mbps link, Scalable TCP has an increase | ||||
| parameter of roughly four segments per round-trip time, with the | ||||
| same decrease parameter of 1/8-th. A comparison of Tables 9 and 10 | ||||
| shows that for this scenario of 100 Mbps links, HighSpeed TCP | ||||
| increases more aggressively than Scalable TCP. | ||||
| Next we consider the relative fairness between Standard TCP, | ||||
| HighSpeed TCP and Scalable TCP. The relative fairness between | ||||
| HighSpeed TCP and Standard TCP was shown in Table 5 earlier in this | ||||
| document, and the relative fairness between Scalable TCP and | ||||
| Standard TCP was shown in Table 8. Following the approach in | ||||
| Section 6, for a given packet drop rate p, for p < 10^-3, we can | ||||
| estimate the relative fairness between Scalable and HighSpeed TCP as | ||||
| W_Scalable/W_HighSpeed. This relative fairness is shown in Table 11 | ||||
| below. The bandwidth in the last column of Table 11 is the | ||||
| aggregate bandwidth of the two competing flows given 100 ms round- | ||||
| trip times and 1500-byte packets. | ||||
| Packet Drop Rate P Fairness Aggregate Window Bandwidth | ||||
| ------------------ -------- ---------------- --------- | ||||
| 10^-2 1.0 24 2.8 Mbps | ||||
| 10^-3 1.0 76 9.1 Mbps | ||||
| 10^-4 1.4 643 77.1 Mbps | ||||
| 10^-5 2.1 5595 671.4 Mbps | ||||
| 10^-6 3.1 50279 6.0 Gbps | ||||
| 10^-7 4.5 463981 55.7 Gbps | ||||
| Table 11: Relative Fairness between the Scalable and HighSpeed | ||||
| Response Functions. | ||||
| The second row of Table 11 shows that for a Scalable TCP and a | ||||
| HighSpeed TCP flow competing in an environment with 100 ms RTTs and | ||||
| a 10 Mbps pipe, the two flows would receive essentially the same | ||||
| bandwidth. The next row shows that for a Scalable TCP and a | ||||
| HighSpeed TCP flow competing in an environment with 100 ms RTTs and | ||||
| a 100 Mbps pipe, the Scalable TCP flow would receive roughly 50% | ||||
| more bandwidth than would HighSpeed TCP. Table 11 shows the | ||||
| relative fairness in higher-bandwidth environments as well. This | ||||
| relative fairness seems sufficient that there should be no problems | ||||
| with Scalable TCP and HighSpeed TCP coexisting in the same | ||||
| environment as Experimental variants of TCP. | ||||
| We note that one question that requires more investigation with | ||||
| Scalable TCP is that of convergence to fairness in environments with | ||||
| Drop-Tail queue management. | ||||
| 12. Related Work in HighSpeed TCP. | 12. Related Work in HighSpeed TCP. | |||
| HighSpeed TCP has been separately investigated in simulations by | HighSpeed TCP has been separately investigated in simulations by | |||
| Sylvia Ratnasamy and by Evandro de Souza, and reports of some of | Sylvia Ratnasamy and by Evandro de Souza [SA03]. The simulations in | |||
| these simulations should be available shortly. The simulations by | [SA03] verify the fairness properties of HighSpeed TCP when sharing | |||
| Evandro verify the fairness properties of HighSpeed TCP when sharing | a link with Standard TCP. | |||
| a link with Standard TCP. | ||||
| These simulations explore the relative fairness of HighSpeed TCP | These simulations explore the relative fairness of HighSpeed TCP | |||
| flows when competing with Standard TCP. The simulation environment | flows when competing with Standard TCP. The simulation environment | |||
| include background forward and reverse-path TCP traffic limited by | includes background forward and reverse-path TCP traffic limited by | |||
| the TCP receive window, along with a small amount of forward and | the TCP receive window, along with a small amount of forward and | |||
| reverse-path traffic from the web traffic generator. Most of the | reverse-path traffic from the web traffic generator. Most of the | |||
| simulations so far explore performance on a simple dumbbell topology | simulations so far explore performance on a simple dumbbell topology | |||
| with a 1Gbps link with a propagation delay of 50 ms. Simulations | with a 1 Gbps link with a propagation delay of 50 ms. Simulations | |||
| have been run both the Adaptive RED and with DropTail queue | have been run with Adaptive RED and with DropTail queue management. | |||
| management. | ||||
| Future work to explore in more detail includes convergence times | The simulations in [SA03] explore performance with a varying number | |||
| after new flows start up; recovery time after a transient outage; the | of competing flows, with the competing traffic being all standard | |||
| response to sudden severe congestion; and investigations of the | TCP; all HighSpeed TCP; or a mix of standard and HighSpeed TCP. For | |||
| potential for oscillations. Additional future work includes | the simulations in [SA03] with RED queue management, the relative | |||
| evaluating more fully the choices of parameters for HighSpeed TCP. | fairness between standard and HighSpeed TCP is consistent with the | |||
| We invite contributions from others in this work. | relative fairness predicted in Table 5. For the simulations with | |||
| Drop Tail queues, the relative fairness is more skewed, with the | ||||
| HighSpeed TCP flows receiving an even larger share of the link | ||||
| bandwidth. This is not surprising; with Active Queue Management at | ||||
| the congested link, the fraction of packet drops received by each | ||||
| flow should be roughly proportional to that flow's share of the link | ||||
| bandwidth, while this property no longer holds with Drop Tail queue | ||||
| management. We also note that relative fairness in simulations with | ||||
| Drop Tail queue management can sometimes depend on small details of | ||||
| the simulation scenario, and that Drop Tail simulations need special | ||||
| care to avoid phase effects [F92]. | ||||
| Suggestions for other citations of related work would also be welcome. | [SA03] explores the bandwidth `stolen' by HighSpeed TCP from | |||
| standard TCP by exploring the fraction of the link bandwidth N | ||||
| standard TCP flows receive when competing against N other standard | ||||
| TCP flows, and comparing this to the fraction of the link bandwidth | ||||
| the N standard TCP flows receive when competing against N HighSpeed | ||||
| TCP flows. For the 1 Gbps simulation scenarios dominated by long- | ||||
| lived traffic, a small number of standard TCP flows are able to | ||||
| achieve high link utilization, and the HighSpeed TCP flows can be | ||||
| viewed as stealing bandwidth from the competing standard TCP flows, | ||||
| as predicted in Section 6 on the Fairness Implications of the | ||||
| HighSpeed Response Function. However, [SA03] shows that when even a | ||||
| small fraction of the link bandwidth is used by more bursty, short | ||||
| TCP connections, the standard TCP flows are unable to achieve high | ||||
| link utilization, and the HighSpeed TCP flows in this case are not | ||||
| `stealing' bandwidth from the standard TCP flows, but instead are | ||||
| using bandwidth that otherwise would not be utilized. | ||||
| 13. Relationship to other Work. | The conclusions of [SA03] are that "HighSpeed TCP behaved as forseen | |||
| by its response function, and appears to be a real and viable option | ||||
| for use on high-speed wide area TCP connections." | ||||
| Our assumption is that HighSpeed TCP will be used along with the TCP | Future work that could be explored in more detail includes | |||
| SACK option, and also with the increased Initial Window of three or | convergence times after new flows start up; recovery time after a | |||
| four segments, as allowed by [AFP02]. For paths that have | transient outage; the response to sudden severe congestion; and | |||
| substantial reordering, TCP performance would be greatly improved by | investigations of the potential for oscillations. We invite | |||
| some of the mechanisms still in the research stages for robust | contributions from others in this work. | |||
| performance in the presence of reordered packets. | ||||
| Our view is that HighSpeed TCP is largely orthogonal to proposals for | 13. Relationship to other Work. | |||
| higher PMTU (Path MTU) values [M02]. Unlike changes to the PMTU, | ||||
| HighSpeed TCP does not require any changes in the network or at the | ||||
| TCP receiver, and works well in the current Internet. Our assumption | ||||
| is that HighSpeed TCP would be useful even with larger values for the | ||||
| PMTU. In particular, unlike the current congestion window, the PMTU | ||||
| gives no information about the bandwidth-delay product available to | ||||
| that particular flow. | ||||
| A related approach is that of a virtual MTU, where the actual MTU of | Our assumption is that HighSpeed TCP will be used with the TCP SACK | |||
| the path might be limited [VMSS,S02]. The virtual MTU approach has | option, and also with the increased Initial Window of three or four | |||
| not been fully investigated, and we do not explore the virtual MTU | segments, as allowed by [RFC3390]. For paths that have substantial | |||
| approach further in this document. | reordering, TCP performance would be greatly improved by some of the | |||
| mechanisms still in the research stages for robust performance in | ||||
| the presence of reordered packets. | ||||
| 14. Conclusions. | Our view is that HighSpeed TCP is largely orthogonal to proposals | |||
| for higher PMTU (Path MTU) values [M02]. Unlike changes to the | ||||
| PMTU, HighSpeed TCP does not require any changes in the network or | ||||
| at the TCP receiver, and works well in the current Internet. Our | ||||
| assumption is that HighSpeed TCP would be useful even with larger | ||||
| values for the PMTU. Unlike the current congestion window, the PMTU | ||||
| gives no information about the bandwidth-delay product available to | ||||
| that particular flow. | ||||
| This is an initial proposal, and we are asking for feedback from the | A related approach is that of a virtual MTU, where the actual MTU of | |||
| wider community. We have explored this proposal in simulations, | the path might be limited [VMSS,S02]. The virtual MTU approach has | |||
| though we have not yet finished our reports on these simulations. We | not been fully investigated, and we do not explore the virtual MTU | |||
| would welcome additional analysis, simulations, and particularly, | approach further in this document. | |||
| experimentation. More information on simulations and experiments is | ||||
| available from the HighSpeed TCP Web Page [HSTCP]. | ||||
| There are three parameters that determine the HighSpeed Response | 14. Conclusions. | |||
| Function, and an additional parameter that determines HighSpeed TCP's | ||||
| tradeoffs between increases and decreases using that response | ||||
| function. We solicit feedback on our setting of these parameters as | ||||
| well as on other issues. | ||||
| We are bringing this proposal to the IETF to be considered as an | This document has proposed HighSpeed TCP, a modification to TCP's | |||
| Experimental RFC. One reason to bring this to the IETF at this stage | congestion control mechanism for use with TCP connections with large | |||
| is that HighSpeed TCP proposes a rather significant change in the | congestion windows. We have explored this proposal in simulations, | |||
| underlying TCP response function, and in our view any such change | and others have explored HighSpeed TCP with experiments, and we | |||
| would have to be globally agreed-upon. It seems advisable to us to | believe HighSpeed TCP to be safe to deploy on the current Internet. | |||
| bring such a proposal to the IETF for feedback even in its | We would welcome additional analysis, simulations, and particularly, | |||
| preliminary stages. | experimentation. More information on simulations and experiments is | |||
| available from the HighSpeed TCP Web Page [HSTCP]. There are | ||||
| several independent implementations of HighSpeed TCP [D02,F03] and | ||||
| of Scalable TCP [K03] for further investigation. | ||||
| Another reason to bring this proposal to the IETF is that, while | We are bringing this proposal to the IETF to be considered as an | |||
| several people have conducted evaluations of HighSpeed TCP using | Experimental RFC. | |||
| simulations, our belief is that the "real" evaluations will have to | ||||
| happen in experiments and in actual deployment. As part of this | ||||
| experimentation, HighSpeed TCP has been implemented in the Linux | ||||
| 2.4.16 Web100 kernel [HSTCP]. It seemed to us that it was advisable, | ||||
| at this stage, to bring the proposal for HighSpeed TCP to the IETF | ||||
| and to seek Experimental status. | ||||
| 15. Acknowledgements | 15. Acknowledgements | |||
| The HighSpeed TCP proposal is from joint work with Sylvia Ratnasamy | The HighSpeed TCP proposal is from joint work with Sylvia Ratnasamy | |||
| and Scott Shenker (and was initiated by Scott Shenker). Additional | and Scott Shenker (and was initiated by Scott Shenker). Additional | |||
| investigations of HighSpeed TCP were joint work with Evandro de Souza | investigations of HighSpeed TCP were joint work with Evandro de | |||
| and Deb Agarwal. We thank Tom Dunigan for the implementation in the | Souza and Deb Agarwal. We thank Tom Dunigan for the implementation | |||
| Linux 2.4.16 Web100 kernel, and for resulting experimentation with | in the Linux 2.4.16 Web100 kernel, and for resulting experimentation | |||
| HighSpeed TCP. We are grateful to the End-to-End Research Group, the | with HighSpeed TCP. We are grateful to the End-to-End Research | |||
| members of the Transport Area Working Group, and to members of the | Group, the members of the Transport Area Working Group, and to | |||
| IPAM program in Large Scale Communication Networks for feedback. We | members of the IPAM program in Large Scale Communication Networks | |||
| thank Glenn Vinnicombe for framing the Linear response function in | for feedback. We thank Glenn Vinnicombe for framing the Linear | |||
| the parameters of HighSpeed TCP. We are also grateful for | response function in the parameters of HighSpeed TCP. We are also | |||
| contributions and feedback from the following individuals: Tom Kelly, | grateful for contributions and feedback from the following | |||
| Jitendra Padhye, Stanislav Shalunov, Paul Sutter, Brian Tierney, Joe | individuals: Les Cottrell, Mitchell Erblich, Jeffrey Hsu, Tom Kelly, | |||
| Touch. Thanks to Jeffrey Hsu and Andrew Reiter for feedback on | Jitendra Padhye, Andrew Reiter, Stanislav Shalunov, Alex Solan, Paul | |||
| earlier versions of this document. | Sutter, Brian Tierney, Joe Touch. | |||
| 16. Normative References | 16. Normative References | |||
| [RFC2581] M. Allman and V. Paxson, "TCP Congestion Control", RFC | [RFC2581] M. Allman, V. Paxson, and W. Stevens, "TCP Congestion | |||
| 2581, April 1999. | Control", RFC 2581, April 1999. | |||
| 17. Informative References | 17. Informative References | |||
| [AFP02] Allman, M., Floyd, S., and Partridge, C., "Increasing TCP's | [ABLLS03] A. Antony, J. Blom, C. de Laat, J. Lee, and W. Sjouw, | |||
| Initial Window", internet-draft draft-ietf-tsvwg-initwin-04.txt, | Macroscopic Examination of TCP Flows over Transatlantic Links, | |||
| work-in-progress, June 2002. | January 2003. URL | |||
| "http://carol.wins.uva.nl/%7Edelaat/techrep-2003-2-tcp.pdf". | ||||
| [BBFS01] Deepak Bansal, Hari Balakrishnan, Sally Floyd, and Scott | [BBFS01] Deepak Bansal, Hari Balakrishnan, Sally Floyd, and Scott | |||
| Shenker, "Dynamic Behavior of Slowly-Responsive Congestion Control | Shenker, "Dynamic Behavior of Slowly-Responsive Congestion Control | |||
| Algorithms", SIGCOMM 2001, August 2001. | Algorithms", SIGCOMM 2001, August 2001. | |||
| [CJ89] D. Chiu and R. Jain, "Analysis of the Increase and Decrease | [CC03] Fabrizio Coccetti and Les Cottrell, TCP Stack Measurements on | |||
| Algorithms for Congestion Avoidance in Computer Networks", Computer | Lightly Loaded Testbeds, 2003. URL "http://www- | |||
| Networks and ISDN Systems, Vol. 17, pp. 1-14, 1989. | iepm.slac.stanford.edu/monitoring/bulk/fast/". | |||
| [CO98] J. Crowcroft and P. Oechslin, "Differentiated end-to-end | [CJ89] D. Chiu and R. Jain, "Analysis of the Increase and Decrease | |||
| services using a weighted proportional fair share TCP", Computer | Algorithms for Congestion Avoidance in Computer Networks", Computer | |||
| Communication Review, 28(3):53--69, 1998. | Networks and ISDN Systems, Vol. 17, pp. 1-14, 1989. | |||
| [FF98] Floyd, S., and Fall, K., "Promoting the Use of End-to-End | [CO98] J. Crowcroft and P. Oechslin, "Differentiated End-to-end | |||
| Congestion Control in the Internet", IEEE/ACM Transactions on | Services using a Weighted Proportional Fair Share TCP", Computer | |||
| Networking, August 1999. | Communication Review, 28(3):53--69, 1998. | |||
| [FRS02] Sally Floyd, Sylvia Ratnasamy, and Scott Shenker, "Modifying | [D02] Tom Dunigan, Floyd's TCP slow-start and AIMD mods, URL | |||
| TCP's Congestion Control for High Speeds", May 2002. URL | "http://www.csm.ornl.gov/~dunigan/net100/floyd.html". | |||
| "http://www.icir.org/floyd/notes.html". | ||||
| [GRK99] Panos Gevros, Fulvio Risso and Peter Kirstein, "Analysis of a | [F03] Gareth Fairey, High-Speed TCP, 2003. URL | |||
| Method for Differential TCP Service" In Proceedings of the IEEE | "http://www.hep.man.ac.uk/u/garethf/hstcp/". | |||
| GLOBECOM'99, Symposium on Global Internet , December 1999, Rio de | ||||
| Janeiro, Brazil. | ||||
| [GV02] S. Gorinsky and H. Vin, "Extended Analysis of Binary | [F92] S. Floyd and V. Jacobson, On Traffic Phase Effects in Packet- | |||
| Adjustment Algorithms", Technical Report TR2002-39, Department of | Switched Gateways, Internetworking: Research and Experience, V.3 | |||
| Computer Sciences, The University of Texas at Austin, August 2002. | N.3, September 1992, p.115-156. URL | |||
| URL "http://www.cs.utexas.edu/users/gorinsky/pubs.html". | "http://www.icir.org/floyd/papers.html". | |||
| [HSTCP] HighSpeed TCP Web Page, URL | [Fl03] Sally Floyd, "Re: [Tsvwg] taking NewReno (RFC 2582) to | |||
| "http://www.icir.org/floyd/hstcp.html". | Proposed Standard", Email to the tsvwg mailing list, May 14, 2003, | |||
| URLs "http://www1.ietf.org/mail-archive/working- | ||||
| groups/tsvwg/current/msg04086.html" and "http://www1.ietf.org/mail- | ||||
| archive/working-groups/tsvwg/current/msg04087.html". | ||||
| [J02] Amit Jain and Sally Floyd, "Quick-Start for TCP and IP", | [FF98] Floyd, S., and Fall, K., "Promoting the Use of End-to-End | |||
| internet draft draft-amit-quick-start-00.txt, work in progress, 2002. | Congestion Control in the Internet", IEEE/ACM Transactions on | |||
| Networking, August 1999. | ||||
| [K03] Tom Kelly, "Scalable TCP: Improving Performance in HighSpeed | [FRS02] Sally Floyd, Sylvia Ratnasamy, and Scott Shenker, "Modifying | |||
| Wide Area Networks", February 2003. URL "http://www- | TCP's Congestion Control for High Speeds", May 2002. URL | |||
| lce.eng.cam.ac.uk/~ctk21/scalable/". | "http://www.icir.org/floyd/notes.html". | |||
| [M02] Matt Mathis, "Raising the Internet MTU", Web Page, URL | [GRK99] Panos Gevros, Fulvio Risso and Peter Kirstein, "Analysis of | |||
| "http://www.psc.edu/~mathis/MTU/". | a Method for Differential TCP Service" In Proceedings of the IEEE | |||
| GLOBECOM'99, Symposium on Global Internet , December 1999, Rio de | ||||
| Janeiro, Brazil. | ||||
| [RFC 1323] V. Jacobson, R. Braden, and D. Borman, TCP Extensions for | [GV02] S. Gorinsky and H. Vin, "Extended Analysis of Binary | |||
| High Performance, RFC 1323, May 1992. | Adjustment Algorithms", Technical Report TR2002-39, Department of | |||
| Computer Sciences, The University of Texas at Austin, August 2002. | ||||
| URL "http://www.cs.utexas.edu/users/gorinsky/pubs.html". | ||||
| [S02] Stanislav Shalunov, TCP Armonk, draft, 2002, URL | [HSTCP] HighSpeed TCP Web Page, URL | |||
| "http://www.internet2.edu/~shalunov/tcpar/". | "http://www.icir.org/floyd/hstcp.html". | |||
| [TFRC] Mark Handley, Jitendra Padhye, Sally Floyd, and Joerg Widmer, | [J02] Amit Jain and Sally Floyd, "Quick-Start for TCP and IP", | |||
| TCP Friendly Rate Control (TFRC): Protocol Specification, internet | internet draft draft-amit-quick-start-02.txt, work in progress, | |||
| draft draft-ietf-tsvwg-tfrc-04.txt, work in progress, 2002. | 2002. | |||
| [VMSS] "Web100 at ORNL", Web Page, | [K03] Tom Kelly, "Scalable TCP: Improving Performance in HighSpeed | |||
| "http://www.csm.ornl.gov/~dunigan/netperf/web100.html". | Wide Area Networks", February 2003. URL "http://www- | |||
| lce.eng.cam.ac.uk/~ctk21/scalable/". | ||||
| [M02] Matt Mathis, "Raising the Internet MTU", Web Page, URL | ||||
| "http://www.psc.edu/~mathis/MTU/". | ||||
| [Net100] The DOE/MICS Net100 project. URL | ||||
| "http://www.csm.ornl.gov/~dunigan/net100/". | ||||
| [NS] The NS Simulator, "http://www.isi.edu/nsnam/ns/". | ||||
| [RFC 1323] V. Jacobson, R. Braden, and D. Borman, TCP Extensions for | ||||
| High Performance, RFC 1323, May 1992. | ||||
| [RFC3390] Allman, M., Floyd, S., and Partridge, C., "Increasing | ||||
| TCP's Initial Window", RFC 3390, October 2002. | ||||
| [RFC3448] Mark Handley, Jitendra Padhye, Sally Floyd, and Joerg | ||||
| Widmer, TCP Friendly Rate Control (TFRC): Protocol Specification, | ||||
| RFC 3448, January 2003. | ||||
| [SA03] Souza, E., and Agarwal, D.A., A HighSpeed TCP Study: | ||||
| Characteristics and Deployment Issues, LBNL Technical Report | ||||
| LBNL-53215. URL "http://www.icir.org/floyd/hstcp.html". | ||||
| [S02] Stanislav Shalunov, TCP Armonk, draft, 2002, URL | ||||
| "http://www.internet2.edu/~shalunov/tcpar/". | ||||
| [S03] Alex Solan, private communication, 2003. | ||||
| [VMSS] "Web100 at ORNL", Web Page, | ||||
| "http://www.csm.ornl.gov/~dunigan/netperf/web100.html". | ||||
| [Web100] The Web100 project. URL "http://www.web100.org/". | ||||
| 18. Security Considerations | 18. Security Considerations | |||
| This proposal makes no changes to the underlying security of TCP. | This proposal makes no changes to the underlying security of TCP. | |||
| 19. IANA Considerations | 19. IANA Considerations | |||
| There are no IANA considerations regarding this document. | There are no IANA considerations regarding this document. | |||
| A. TCP's Loss Event Rate in Steady-State | 20. TCP's Loss Event Rate in Steady-State | |||
| This section gives the number of round-trip times between congestion | This section gives the number of round-trip times between congestion | |||
| events for a TCP flow with D-byte packets, for D=1500, as a function | events for a TCP flow with D-byte packets, for D=1500, as a function | |||
| of the connection's average throughput B in bps. To achieve this | of the connection's average throughput B in bps. To achieve this | |||
| average throughput B, a TCP connection with round-trip time R in | average throughput B, a TCP connection with round-trip time R in | |||
| seconds requires an average congestion window w of BR/(8D) segments. | seconds requires an average congestion window w of BR/(8D) segments. | |||
| In steady-state, TCP's average congestion window w is roughly | In steady-state, TCP's average congestion window w is roughly | |||
| 1.2/sqrt(p) segments. This is equivalent to a loss event at most | 1.2/sqrt(p) segments. This is equivalent to a loss event at most | |||
| once every 1/p packets, or at most once every 1/(pw) = w/1.5 round- | once every 1/p packets, or at most once every 1/(pw) = w/1.5 round- | |||
| trip times. Substituting for w, this is a loss event at most every | trip times. Substituting for w, this is a loss event at most every | |||
| (BR)/(12D) round-trip times. | (BR)/(12D) round-trip times. | |||
| As an example, for R = 0.1 seconds and D = 1500 bytes, this gives | As an example, for R = 0.1 seconds and D = 1500 bytes, this gives | |||
| B/180000 round-trip times between loss events. | B/180000 round-trip times between loss events. | |||
| B. A table for a(w) and b(w). | B. A table for a(w) and b(w). | |||
| This section gives a table for the increase and decrease parameters | This section gives a table for the increase and decrease parameters | |||
| a(w) and b(w) for HighSpeed TCP, for the default values of Low_Window | a(w) and b(w) for HighSpeed TCP, for the default values of Low_Window | |||
| = 38, High_Window = 83000, High_P = 10^-7, and High_Decrease = 0.1. | = 38, High_Window = 83000, High_P = 10^-7, and High_Decrease = 0.1. | |||
| w a(w) b(w) | w a(w) b(w) | |||
| ---- ---- ---- | ---- ---- ---- | |||
| 38 1 0.50 | 38 1 0.50 | |||
| skipping to change at page 24, line 32 ¶ | skipping to change at page 33, line 32 ¶ | |||
| 61799 65 0.12 | 61799 65 0.12 | |||
| 64851 66 0.11 | 64851 66 0.11 | |||
| 68113 67 0.11 | 68113 67 0.11 | |||
| 71617 68 0.11 | 71617 68 0.11 | |||
| 75401 69 0.10 | 75401 69 0.10 | |||
| 79517 70 0.10 | 79517 70 0.10 | |||
| 84035 71 0.10 | 84035 71 0.10 | |||
| 89053 72 0.10 | 89053 72 0.10 | |||
| 94717 73 0.09 | 94717 73 0.09 | |||
| Table 9: Parameters for HighSpeed TCP. | Table 12: Parameters for HighSpeed TCP. | |||
| This table was computed with the following Perl program: | This table was computed with the following Perl program: | |||
| $top = 100000; | $top = 100000; | |||
| $num = 38; | $num = 38; | |||
| if ($num == 38) { | if ($num == 38) { | |||
| print " w a(w) b(w)\n"; | print " w a(w) b(w)\n"; | |||
| print " ---- ---- ----\n"; | print " ---- ---- ----\n"; | |||
| print " 38 1 0.50\n"; | print " 38 1 0.50\n"; | |||
| $oldb = 0.50; | $oldb = 0.50; | |||
| skipping to change at page 25, line 24 ¶ | skipping to change at page 34, line 24 ¶ | |||
| while ($num < $top) { | while ($num < $top) { | |||
| $bw = (0.1 -0.5)*(log($num)-log(38))/(log(83000)-log(38))+0.5; | $bw = (0.1 -0.5)*(log($num)-log(38))/(log(83000)-log(38))+0.5; | |||
| $aw = ($num**2*2.0*$bw) / ((2.0-$bw)*$num**1.2*12.8); | $aw = ($num**2*2.0*$bw) / ((2.0-$bw)*$num**1.2*12.8); | |||
| if ($aw > $olda + 1) { | if ($aw > $olda + 1) { | |||
| printf "%6d %5d %3.2f\n", $num, $aw, $bw; | printf "%6d %5d %3.2f\n", $num, $aw, $bw; | |||
| $olda = $aw; | $olda = $aw; | |||
| } | } | |||
| $num ++; | $num ++; | |||
| } | } | |||
| Table 10: Perl Program for computing parameters for HighSpeed TCP. | Table 13: Perl Program for computing parameters for HighSpeed TCP. | |||
| C. Exploring the time to converge to fairness. | C. Exploring the time to converge to fairness. | |||
| This section gives the Perl program used to compute the congestion | This section gives the Perl program used to compute the congestion | |||
| window growth during congestion avoidance. | window growth during congestion avoidance. | |||
| $top = 2001; | $top = 2001; | |||
| $hswin = 1; | $hswin = 1; | |||
| $regwin = 1; | $regwin = 1; | |||
| $rtt = 1; | $rtt = 1; | |||
| skipping to change at page 26, line 30 ¶ | skipping to change at page 35, line 30 ¶ | |||
| } | } | |||
| if ($rtt >= $lastrtt + $rttstep) { | if ($rtt >= $lastrtt + $rttstep) { | |||
| printf "%5d %9d %10d\n", $rtt, $hswin, $regwin; | printf "%5d %9d %10d\n", $rtt, $hswin, $regwin; | |||
| $lastrtt = $rtt; | $lastrtt = $rtt; | |||
| } | } | |||
| $hswin += $aw; | $hswin += $aw; | |||
| $regwin += 1; | $regwin += 1; | |||
| $rtt ++; | $rtt ++; | |||
| } | } | |||
| Table 11: Perl Program for computing the window in congestion | Table 14: Perl Program for computing the window in congestion | |||
| avoidance. | avoidance. | |||
| AUTHORS' ADDRESSES | AUTHORS' ADDRESSES | |||
| Sally Floyd | Sally Floyd | |||
| Phone: +1 (510) 666-2989 | Phone: +1 (510) 666-2989 | |||
| ICIR (ICSI Center for Internet Research) | ICIR (ICSI Center for Internet Research) | |||
| Email: floyd@icir.org | Email: floyd@icir.org | |||
| URL: http://www.icir.org/floyd/ | URL: http://www.icir.org/floyd/ | |||
| End of changes. 167 change blocks. | ||||
| 813 lines changed or deleted | 1138 lines changed or added | |||
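The steady-state relation in the section "TCP's Loss Event Rate in Steady-State" can be checked numerically. The following is a minimal sketch in Perl, not part of either draft: the subroutine names window_segments and rtts_between_losses are hypothetical, and the script only evaluates the two formulas stated in that section, w = BR/(8D) and w/1.5 = BR/(12D), for R = 0.1 seconds and D = 1500 bytes.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Average congestion window, in segments, needed to sustain a throughput
# of $bps bits per second with round-trip time $rtt (seconds) and
# $bytes-byte packets: w = BR/(8D).
sub window_segments {
    my ($bps, $rtt, $bytes) = @_;
    return $bps * $rtt / (8 * $bytes);
}

# Round-trip times between loss events in steady state:
# w/1.5 = BR/(12D).
sub rtts_between_losses {
    my ($bps, $rtt, $bytes) = @_;
    return window_segments($bps, $rtt, $bytes) / 1.5;
}

# Illustrative parameters taken from the example in the text.
my ($rtt, $bytes) = (0.1, 1500);

foreach my $bps (1e7, 1e8, 1e9, 1e10) {
    printf "%6.0f Mbps  w = %9.1f segments  %9.1f RTTs between loss events\n",
        $bps / 1e6,
        window_segments($bps, $rtt, $bytes),
        rtts_between_losses($bps, $rtt, $bytes);
}
```

With these parameters the second formula reduces to B/180000, so a 10 Gbps flow sees roughly 55,556 round-trip times between loss events, which is about 1.5 hours at a 100 ms round-trip time.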