Internet Engineering Task Force                        Jamal Hadi Salim
Internet Draft                                          Nortel Networks
Expires: Sept 2000                                          Uvaiz Ahmed
draft-hadi-jhsua-ecnperf-01.txt                     Carleton University
                                                             March 2000

 Performance Evaluation of Explicit Congestion Notification (ECN) in IP
                                Networks

Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   http://www.ietf.org/shadow.html.

Abstract

   This draft presents a performance study of the Explicit Congestion
   Notification (ECN) mechanism in the TCP/IP protocol using our
   implementation on the Linux Operating System.  ECN is an end-to-end
   congestion avoidance mechanism proposed by [6] and incorporated into
   RFC 2481 [7].  We study the behavior of ECN for both bulk and
   transactional transfers.  Our experiments show that there is
   improvement in throughput over NON ECN (TCP employing any of Reno,
   SACK/FACK, or New Reno congestion control) in the case of bulk
   transfers and substantial improvement for transactional transfers.

   A more complete pdf version of this document is available at:
   http://www7.nortel.com:8080/CTL/ecnperf.pdf

   This draft in its current revision is missing a lot of the visual
   representations and experimental results found in the pdf version.
1. Introduction

   In current IP networks, congestion management is left to the
   protocols running on top of IP.  An IP router when congested simply
   drops packets.  TCP is the dominant transport protocol today [26].
   TCP infers that there is congestion in the network by detecting
   network (when it receives an ACK packet that has the ECN-Echo flag
   set) is equivalent to the Fast Retransmit/Recovery algorithm (when
   there is a congestion loss) in NON-ECN-capable TCP, i.e., the sender
   halves the congestion window cwnd and reduces the slow start
   threshold ssthresh.  Fast Retransmit/Recovery is still available for
   ECN capable stacks for responding to three duplicate acknowledgments.
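   The sender-side reaction described above can be sketched as follows.
   This is an illustration of the RFC 2481 behavior, not the actual
   Linux kernel code; the variable names and the assumed MSS are ours:

```python
# Sketch of the congestion response described above: the reaction to an
# ECN-Echo ACK is the same window reduction as Fast Retransmit/Recovery,
# except that no packet was lost and nothing needs retransmitting.
# Illustrative only -- not the Linux kernel implementation.

MSS = 1460  # assumed sender maximum segment size, in bytes

def on_congestion_signal(cwnd, ssthresh):
    """React to either an ECE-marked ACK or a third duplicate ACK:
    halve the congestion window and set ssthresh to the new window
    (floored at two segments, as is conventional)."""
    ssthresh = max(cwnd // 2, 2 * MSS)
    cwnd = ssthresh
    return cwnd, ssthresh

# Example: a 16-segment window is halved on the first ECE-marked ACK.
cwnd, ssthresh = on_congestion_signal(16 * MSS, 8 * MSS)
```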
4. Experimental setup

   For testing purposes we have added ECN to the Linux TCP/IP stack,
   kernel versions 2.0.32, 2.2.5, and 2.3.43 (earlier revisions of 2.3
   were also tested).  The 2.0.32 implementation conforms to RFC 2481
   [7] for the end systems only.  In the 2.1, 2.2, and 2.3 cases we
   have also modified the router portion, as well as the end systems,
   to conform to the RFC.  An outdated version of the 2.0 code is
   available at [18].  Note that Linux version 2.0.32 implements TCP
   Reno congestion control, while kernels >= 2.2.0 default to New Reno
   but will opt for a SACK/FACK combination when the remote end
   understands SACK.  Our initial tests were carried out with the 2.0
   kernel at the end systems and 2.1 (pre 2.2) for the router part.
   The majority of the test results here apply to the 2.0 tests.  We
   did repeat these tests on a different testbed with faster machines
   (moving from Pentium to Pentium-II class hardware) for the 2.2 and
   2.3 kernels, so the 2.0 and 2.2/2.3 results are not directly
   comparable.  We have updated this draft release to reflect the tests
   against SACK and New Reno.
4.1. Testbed setup

                                        -----   ----
                                       | ECN | | ECN |
                                       | ON  | | OFF |
         data direction ---->>          -----   ----
                                          |      |
      server                              |      |
       ----      ------      ------       |      |
   All the physical links are 10Mbps Ethernet.  Using Class Based
   Queuing (CBQ) [22], packets from the data server are constricted to
   a 1.5Mbps pipe at the router R1.  Data is always retrieved from the
   server towards the clients labelled "ECN ON", "ECN OFF", and "C".
   Since the pipe from the server is 10Mbps, this creates congestion at
   the exit from the router towards the clients for competing flows.
   The machines labelled "ECN ON" and "ECN OFF" are running the same
   version of Linux and have exactly the same hardware configuration.
   The server is always ECN capable (and can handle NON ECN flows as
   well using the standard congestion algorithms).  The machine
   labelled "C" is used to create congestion in the network.  Router R2
   acts as a path-delay controller; with it we adjust the RTT the
   clients see.  Router R1 has RED implemented in it and is capable of
   supporting ECN flows.  The path-delay router is a PC running the
   Nistnet [16] package on a Linux platform.  The latency of the link
   for the experiments was set to 20 milliseconds.
4.2. Validating the Implementation

   We spent time validating that the implementation was conformant to
   the specification in RFC 2481.  To do this, the popular tcpdump
   sniffer [24] was modified to show the packets being marked.  We
   visually inspected tcpdump traces to validate conformance to the
   RFC under many different scenarios.  We also modified tcptrace [25]
   in order to plot the marked packets for visualization and analysis.
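   The marking that the modified tcpdump displays lives in the low two
   bits of the IPv4 TOS octet, as defined by RFC 2481 (ECT is bit 6,
   value 0x02; CE is bit 7, value 0x01).  A minimal decoding sketch,
   with our own function name:

```python
# Decode the ECN field of an IPv4 TOS octet per RFC 2481:
# ECT (ECN-Capable Transport) = 0x02, CE (Congestion Experienced) = 0x01.
# (RFC 2481 was the experimental spec this implementation follows.)
ECT = 0x02
CE = 0x01

def ecn_bits(tos):
    """Return (ect, ce) flags extracted from a TOS byte."""
    return bool(tos & ECT), bool(tos & CE)

# An ECN-capable packet carries ECT; one marked by a RED/ECN router
# carries both ECT and CE.
assert ecn_bits(0x02) == (True, False)  # ECN-capable, not marked
assert ecn_bits(0x03) == (True, True)   # marked Congestion Experienced
```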
   alleviated by ECN).  However, while TCP Reno has performance
   problems with multiple packets dropped in a window of data, New Reno
   and SACK have no such problems.

   Thus, for scenarios with very high levels of congestion, the
   advantages of ECN for TCP Reno flows could be more dramatic than the
   advantages of ECN for New Reno or SACK flows.  An important
   observation to make from our results is that we do not notice
   multiple drops within a single window of data.  Thus, we would
   expect that our results are not heavily influenced by Reno's
   performance problems with multiple packets dropped from a window of
   data.

   We repeated these tests with ECN-patched newer Linux kernels.  As
   mentioned earlier, these kernels use a SACK/FACK combination with a
   fallback to New Reno.  SACK can be selectively turned off
   (defaulting to New Reno).  Our results indicate that ECN still
   improves performance for the bulk transfers.  More results are
   available in the pdf version [27].  As in 1) above, maintaining a
   maximum drop probability of 0.1 and increasing the congestion level,
   it is observed that ECN-SACK improves performance from about 5% at
   low congestion to about 15% at high congestion.  In the scenario
   where high congestion is maintained and the maximum drop probability
   is moved from 0.02 to 0.5, the relative advantage of ECN-SACK
   improves from 10% to 40%.  Although these numbers are lower than the
   ones exhibited by Reno, they do reflect the improvement that ECN
   offers even in the presence of robust recovery mechanisms such as
   SACK.
5.3. Transactional transfers

   We model transactional transfers by sending a small request and
   getting a response from a server before sending the next request.
   To generate transactional transfer traffic we use Netperf [17] with
   the CRR (Connect Request Response) option.  As an example, assume we
   are retrieving a small file of say 5 - 20 KB; in effect we send a
   small request to the server and the server responds by sending us
   the file.  The transaction is complete when we receive the
   the slowest response in the set of the opened concurrent sessions
   (in HTTP).  The transactional data sizes were selected based on [2],
   which indicates that the average web transaction was around 8 - 10
   KB.  The smaller (5KB) size was selected to estimate the size of
   transactional processing that may become prevalent with policy
   management schemes in the diffserv [4] context.  Using Netperf we
   are able to initiate these kinds of transactional transfers for a
   variable length of time.  The main metric of interest in this case
   is the transaction rate, which is recorded by Netperf.

   * Define Transaction rate as: the number of requests and complete
   responses for a particular requested size that we are able to do per
   second.  For example, if our request is 1KB and the response is 5KB,
   then we define the transaction rate as the number of such complete
   transactions that we can accomplish per second.
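   The metric above is simple arithmetic; a small sketch with
   illustrative (not measured) numbers:

```python
def transaction_rate(completed_transactions, elapsed_seconds):
    """Transactions per second: the number of complete request/response
    pairs of a given size divided by the measurement interval, as the
    transaction-rate definition above specifies."""
    return completed_transactions / elapsed_seconds

# Hypothetical example: 540 complete 1KB-request/5KB-response exchanges
# over a three-minute Netperf CRR run.
rate = transaction_rate(540, 180)  # 3 transactions per second
```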
   Experiment Details: Similar to the case of bulk transfers, we start
   the background FTP flows to introduce congestion in the network at
   time 0.  About 20 seconds later we start the transactional transfers
   and run each test for three minutes.  We record the transactions per
   second that are complete.  We repeat the test for about an hour and
   plot the various transactions per second, averaged over the runs.
   The experiment is repeated for various maximum drop probabilities,
   file sizes, and levels of congestion.
   ECE flag.  This is by design in our experimental setup.  [3] shows
   that most TCP loss recovery in fact happens via timeouts for short
   flows.  The effectiveness of the Fast Retransmit/Recovery algorithm
   is limited by the fact that there might not be enough data in the
   pipe to elicit 3 duplicate ACKs.  TCP RENO needs at least 4
   outstanding packets to recover from losses without going into a
   timeout.  For 5KB (4 packets for an MTU of 1500 bytes) a NON ECN
   flow will always have to wait for a retransmit timeout if any of its
   packets are lost.  (This timeout could only have been avoided if the
   flow had used an initial window of four packets, and the first of
   the four packets was the packet dropped.)  We repeated these
   experiments with the kernels implementing the SACK/FACK and New Reno
   algorithms.  Our observation was that there was hardly any
   difference from what we saw with Reno.  For example, in the case of
   SACK-ECN: maintaining the maximum drop probability at 0.1 and
   increasing the congestion level for the 5KB transaction, we noticed
   that the relative gain for the ECN enabled flows increases from 47%
   to 80%.  If we maintain the congestion level for the 5KB
   transactions and increase the maximum drop probabilities instead, we
   notice that SACK's performance increases from 15% to 120%.  It is
   fair to comment that the difference in the testbeds (different
   machines, same topology) might have contributed to the results;
   however, it is worth noting that the relative advantage of the
   SACK-ECN is obvious.
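   The retransmit-timeout argument above can be checked with a little
   arithmetic (assuming a 1460-byte MSS for the 1500-byte MTU, and that
   the whole transfer is in flight at once):

```python
import math

MSS = 1460  # assumed payload per segment for a 1500-byte MTU

def duplicate_acks(transfer_bytes, lost_segment):
    """Duplicate ACKs the receiver generates when segment
    `lost_segment` (1-based) of the transfer is dropped and all
    segments were sent: one per segment arriving after the hole."""
    segments = math.ceil(transfer_bytes / MSS)
    return segments - lost_segment

# A 5KB response is 4 segments.  Only the loss of the very first
# segment still yields the 3 duplicate ACKs Fast Retransmit needs;
# any later loss forces a retransmit timeout, as argued above.
assert math.ceil(5 * 1024 / MSS) == 4
assert duplicate_acks(5 * 1024, 1) == 3  # recoverable via Fast Retransmit
assert duplicate_acks(5 * 1024, 2) == 2  # must wait for a timeout
```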
6. Conclusion

   ECN enhancements improve both bulk and transactional TCP traffic.
   The improvement is more obvious in short transactional type of flows
   (popularly referred to as mice).

   * Because fewer retransmits happen with ECN, there is less traffic
   on the network.  Although the relative amount of data retransmitted
   in our case is small, the effect could be higher when there are more
   contributing end systems.  The absence of retransmits also implies
   an improvement in the goodput.  This becomes very important for
   scenarios where bandwidth is expensive, such as on low bandwidth
   links.  This also implies that ECN lends itself well to applications
   that require reliability but would prefer to avoid unnecessary
   retransmissions.
   * The fact that ECN avoids timeouts by getting faster notification
   (as opposed to traditional packet dropping inference from 3
   duplicate ACKs or, even worse, timeouts) implies less time is spent
   during error recovery - this also improves goodput.

   * ECN could be used to help in service differentiation where the
   end user is able to "probe" for their target rate faster.  Assured
   Forwarding [1] in the diffserv working group at the IETF proposes
   using RED with varying drop probabilities as a service
   notes that RENO is the most widely deployed TCP implementation
   today.

   It is clear that the advent of policy management schemes introduces
   new requirements for transactional type of applications, which
   constitute a very short query and a response in the order of a few
   packets.  ECN provides advantages to transactional traffic, as we
   have shown in the experiments.
7. Acknowledgements

   We would like to thank Alan Chapman, Ioannis Lambadaris, Thomas
   Kunz, Biswajit Nandy, Nabil Seddigh, Sally Floyd, and Rupinder
   Makkar for their helpful feedback and valuable suggestions.
8. References

   [1] Heinanen, J., et al., "Assured Forwarding PHB Group",
       http://search.ietf.org/internet-drafts/draft-ietf-diffserv-af-06.txt,
       Work in Progress.

   [2] Mah, B.A., "An Empirical Model of HTTP Network Traffic",
       in Proceedings of INFOCOM '97.