| draft-paxson-tcpm-rfc2988bis-01.txt | draft-paxson-tcpm-rfc2988bis-02.txt | |||
|---|---|---|---|---|
| Internet Engineering Task Force V. Paxson | Internet Engineering Task Force V. Paxson | |||
| INTERNET DRAFT ICSI/UC Berkeley | INTERNET DRAFT ICSI/UC Berkeley | |||
| File: draft-paxson-tcpm-rfc2988bis-01.txt M. Allman | File: draft-paxson-tcpm-rfc2988bis-02.txt M. Allman | |||
| ICSI | Intended status: Proposed Standard ICSI | |||
| J. Chu | J. Chu | |||
| M. Sargent | M. Sargent | |||
| CWRU | CWRU | |||
| December 6, 2010 | March 14, 2011 | |||
| Computing TCP's Retransmission Timer | Computing TCP's Retransmission Timer | |||
| Status of this Memo | Status of this Memo | |||
| This Internet-Draft is submitted to IETF in full conformance with | This Internet-Draft is submitted to IETF in full conformance with | |||
| the provisions of BCP 78 and BCP 79. | the provisions of BCP 78 and BCP 79. | |||
| Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
| Task Force (IETF), its areas, and its working groups. Note that | Task Force (IETF), its areas, and its working groups. Note that | |||
| skipping to change at page 1, line 36 | skipping to change at page 1, line 36 | |||
| months and may be updated, replaced, or obsoleted by other documents | months and may be updated, replaced, or obsoleted by other documents | |||
| at any time. It is inappropriate to use Internet-Drafts as | at any time. It is inappropriate to use Internet-Drafts as | |||
| reference material or to cite them other than as "work in progress." | reference material or to cite them other than as "work in progress." | |||
| The list of current Internet-Drafts can be accessed at | The list of current Internet-Drafts can be accessed at | |||
| http://www.ietf.org/ietf/1id-abstracts.txt. | http://www.ietf.org/ietf/1id-abstracts.txt. | |||
| The list of Internet-Draft Shadow Directories can be accessed at | The list of Internet-Draft Shadow Directories can be accessed at | |||
| http://www.ietf.org/shadow.html. | http://www.ietf.org/shadow.html. | |||
| This Internet-Draft will expire on June 6, 2011. | This Internet-Draft will expire on September 14, 2011. | |||
| Copyright Notice | Copyright Notice | |||
| Copyright (c) 2010 IETF Trust and the persons identified as the | Copyright (c) 2011 IETF Trust and the persons identified as the | |||
| document authors. All rights reserved. | document authors. All rights reserved. | |||
| This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
| Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
| (http://trustee.ietf.org/license-info) in effect on the date of | (http://trustee.ietf.org/license-info) in effect on the date of | |||
| publication of this document. Please review these documents | publication of this document. Please review these documents | |||
| carefully, as they describe your rights and restrictions with | carefully, as they describe your rights and restrictions with | |||
| respect to this document. Code Components extracted from this | respect to this document. Code Components extracted from this | |||
| document must include Simplified BSD License text as described in | document must include Simplified BSD License text as described in | |||
| Section 4.e of the Trust Legal Provisions and are provided without | Section 4.e of the Trust Legal Provisions and are provided without | |||
| warranty as described in the BSD License. | warranty as described in the Simplified BSD License. | |||
| Abstract | Abstract | |||
| This document defines the standard algorithm that Transmission | This document defines the standard algorithm that Transmission | |||
| Control Protocol (TCP) senders are required to use to compute and | Control Protocol (TCP) senders are required to use to compute and | |||
| manage their retransmission timer. It expands on the discussion in | manage their retransmission timer. It expands on the discussion in | |||
| section 4.2.3.1 of RFC 1122 and upgrades the requirement of | section 4.2.3.1 of RFC 1122 and upgrades the requirement of | |||
| supporting the algorithm from a SHOULD to a MUST. | supporting the algorithm from a SHOULD to a MUST. | |||
| 1 Introduction | 1 Introduction | |||
| The Transmission Control Protocol (TCP) [Pos81] uses a retransmission | The Transmission Control Protocol (TCP) [Pos81] uses a retransmission | |||
| timer to ensure data delivery in the absence of any feedback from the | timer to ensure data delivery in the absence of any feedback from the | |||
| remote data receiver. The duration of this timer is referred to as | remote data receiver. The duration of this timer is referred to as | |||
| RTO (retransmission timeout). RFC 1122 [Bra89] specifies that the | RTO (retransmission timeout). RFC 1122 [Bra89] specifies that the | |||
| RTO should be calculated as outlined in [Jac88]. | RTO should be calculated as outlined in [Jac88]. | |||
| This document codifies the algorithm for setting the RTO. In | This document codifies the algorithm for setting the RTO. In | |||
| addition, this document expands on the discussion in section 4.2.3.1 | addition, this document expands on the discussion in section 4.2.3.1 | |||
| of RFC 1122 and upgrades the requirement of supporting the algorithm | of RFC 1122 and upgrades the requirement of supporting the algorithm | |||
| from a SHOULD to a MUST. RFC 2581 [APS99] outlines the algorithm TCP | from a SHOULD to a MUST. RFC 5681 [APB09] outlines the algorithm TCP | |||
| uses to begin sending after the RTO expires and a retransmission is | uses to begin sending after the RTO expires and a retransmission is | |||
| sent. This document does not alter the behavior outlined in RFC 2581 | sent. This document does not alter the behavior outlined in RFC 5681 | |||
| [APS99]. | [APB09]. | |||
| In some situations it may be beneficial for a TCP sender to be more | In some situations it may be beneficial for a TCP sender to be more | |||
| conservative than the algorithms detailed in this document allow. | conservative than the algorithms detailed in this document allow. | |||
| However, a TCP MUST NOT be more aggressive than the following | However, a TCP MUST NOT be more aggressive than the following | |||
| algorithms allow. | algorithms allow. | |||
| The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | |||
| "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | |||
| document are to be interpreted as described in [Bra97]. | document are to be interpreted as described in [Bra97]. | |||
| skipping to change at page 2, line 49 | skipping to change at page 2, line 49 | |||
| The rules governing the computation of SRTT, RTTVAR, and RTO are as | The rules governing the computation of SRTT, RTTVAR, and RTO are as | |||
| follows: | follows: | |||
| (2.1) Until a round-trip time (RTT) measurement has been made for a | (2.1) Until a round-trip time (RTT) measurement has been made for a | |||
| segment sent between the sender and receiver, the sender SHOULD | segment sent between the sender and receiver, the sender SHOULD | |||
| set RTO <- 1 second, though the "backing off" on repeated | set RTO <- 1 second, though the "backing off" on repeated | |||
| retransmission discussed in (5.5) still applies. | retransmission discussed in (5.5) still applies. | |||
| Note that the previous version of this document used an | Note that the previous version of this document used an | |||
| initial RTO of 3 seconds [RFC2988]. A TCP implementation MAY | initial RTO of 3 seconds [PA00]. A TCP implementation MAY | |||
| still use this value (or any other value > 1 second). This | still use this value (or any other value > 1 second). This | |||
| change in the lower bound on the initial RTO is discussed in | change in the lower bound on the initial RTO is discussed in | |||
| further detail in Appendix A. | further detail in Appendix A. | |||
| (2.2) When the first RTT measurement R is made, the host MUST set | (2.2) When the first RTT measurement R is made, the host MUST set | |||
| SRTT <- R | SRTT <- R | |||
| RTTVAR <- R/2 | RTTVAR <- R/2 | |||
| RTO <- SRTT + max (G, K*RTTVAR) | RTO <- SRTT + max (G, K*RTTVAR) | |||
| skipping to change at page 5, line 16 | skipping to change at page 5, line 16 | |||
| (5.6) Start the retransmission timer, such that it expires after RTO | (5.6) Start the retransmission timer, such that it expires after RTO | |||
| seconds (for the value of RTO after the doubling operation | seconds (for the value of RTO after the doubling operation | |||
| outlined in 5.5). | outlined in 5.5). | |||
| (5.7) If the timer expires awaiting the ACK of a SYN segment and the | (5.7) If the timer expires awaiting the ACK of a SYN segment and the | |||
| TCP implementation is using an RTO less than 3 seconds, the RTO | TCP implementation is using an RTO less than 3 seconds, the RTO | |||
| MUST be re-initialized to 3 seconds when data transmission | MUST be re-initialized to 3 seconds when data transmission | |||
| begins (i.e., after the three-way handshake completes). | begins (i.e., after the three-way handshake completes). | |||
| This represents a change from the previous version of this | This represents a change from the previous version of this | |||
| document [RFC2988] and is discussed in Appendix A. | document [PA00] and is discussed in Appendix A. | |||
| Note that after retransmitting, once a new RTT measurement is | Note that after retransmitting, once a new RTT measurement is | |||
| obtained (which can only happen when new data has been sent and | obtained (which can only happen when new data has been sent and | |||
| acknowledged), the computations outlined in section 2 are performed, | acknowledged), the computations outlined in section 2 are performed, | |||
| including the computation of RTO, which may result in "collapsing" | including the computation of RTO, which may result in "collapsing" | |||
| RTO back down after it has been subject to exponential backoff | RTO back down after it has been subject to exponential backoff | |||
| (rule 5.5). | (rule 5.5). | |||
| Note that a TCP implementation MAY clear SRTT and RTTVAR after | Note that a TCP implementation MAY clear SRTT and RTTVAR after | |||
| backing off the timer multiple times as it is likely that the | backing off the timer multiple times as it is likely that the | |||
| skipping to change at page 5, line 45 | skipping to change at page 5, line 45 | |||
| TCP sender to compute a large value of RTO by adding delay to a | TCP sender to compute a large value of RTO by adding delay to a | |||
| timed packet's latency, or that of its acknowledgment. However, | timed packet's latency, or that of its acknowledgment. However, | |||
| the ability to add delay to a packet's latency often coincides with | the ability to add delay to a packet's latency often coincides with | |||
| the ability to cause the packet to be lost, so it is difficult to | the ability to cause the packet to be lost, so it is difficult to | |||
| see what an attacker might gain from such an attack that could cause | see what an attacker might gain from such an attack that could cause | |||
| more damage than simply discarding some of the TCP connection's | more damage than simply discarding some of the TCP connection's | |||
| packets. | packets. | |||
| The Internet to a considerable degree relies on the correct | The Internet to a considerable degree relies on the correct | |||
| implementation of the RTO algorithm (as well as those described in | implementation of the RTO algorithm (as well as those described in | |||
| RFC 2581) in order to preserve network stability and avoid | RFC 5681) in order to preserve network stability and avoid | |||
| congestion collapse. An attacker could cause TCP endpoints to | congestion collapse. An attacker could cause TCP endpoints to | |||
| respond more aggressively in the face of congestion by forging | respond more aggressively in the face of congestion by forging | |||
| acknowledgments for segments before the receiver has actually | acknowledgments for segments before the receiver has actually | |||
| received the data, thus lowering RTO to an unsafe value. But to do | received the data, thus lowering RTO to an unsafe value. But to do | |||
| so requires spoofing the acknowledgments correctly, which is | so requires spoofing the acknowledgments correctly, which is | |||
| difficult unless the attacker can monitor traffic along the path | difficult unless the attacker can monitor traffic along the path | |||
| between the sender and the receiver. In addition, even if the | between the sender and the receiver. In addition, even if the | |||
| attacker can cause the sender's RTO to reach too small a value, it | attacker can cause the sender's RTO to reach too small a value, it | |||
| appears the attacker cannot leverage this into much of an attack | appears the attacker cannot leverage this into much of an attack | |||
| (compared to the other damage they can do if they can spoof packets | (compared to the other damage they can do if they can spoof packets | |||
| skipping to change at page 6, line 21 | skipping to change at page 6, line 21 | |||
| The RTO algorithm described in this memo was originated by Van | The RTO algorithm described in this memo was originated by Van | |||
| Jacobson in [Jac88]. | Jacobson in [Jac88]. | |||
| Much of the data that motivated changing the initial RTO from 3 | Much of the data that motivated changing the initial RTO from 3 | |||
| seconds to 1 second came from Robert Love, Andre Broido and Mike | seconds to 1 second came from Robert Love, Andre Broido and Mike | |||
| Belshe. | Belshe. | |||
| Normative References | Normative References | |||
| [APS99] Allman, M., Paxson, V. and W. Stevens, "TCP Congestion | [APB09] Allman, M., Paxson, V. and E. Blanton, "TCP Congestion | |||
| Control", RFC 2581, April 1999. | Control", RFC 5681, September 2009. | |||
| [Bra89] Braden, R., "Requirements for Internet Hosts -- | [Bra89] Braden, R., "Requirements for Internet Hosts -- | |||
| Communication Layers", STD 3, RFC 1122, October 1989. | Communication Layers", STD 3, RFC 1122, October 1989. | |||
| [Bra97] Bradner, S., "Key words for use in RFCs to Indicate | [Bra97] Bradner, S., "Key words for use in RFCs to Indicate | |||
| Requirement Levels", BCP 14, RFC 2119, March 1997. | Requirement Levels", BCP 14, RFC 2119, March 1997. | |||
| [JBB92] Jacobson, V., R. Braden, D. Borman, "TCP Extensions for High | ||||
| Performance", RFC 1323, May 1992. | ||||
| [Pos81] Postel, J., "Transmission Control Protocol", STD 7, RFC 793, | [Pos81] Postel, J., "Transmission Control Protocol", STD 7, RFC 793, | |||
| September 1981. | September 1981. | |||
| Non-Normative References | Non-Normative References | |||
| [AP99] Allman, M. and V. Paxson, "On Estimating End-to-End Network | [AP99] Allman, M. and V. Paxson, "On Estimating End-to-End Network | |||
| Path Properties", SIGCOMM 99. | Path Properties", SIGCOMM 99. | |||
| [Chu09] Chu, J., "Tuning TCP Parameters for the 21st Century", | [Chu09] Chu, J., "Tuning TCP Parameters for the 21st Century", | |||
| http://www.ietf.org/proceedings/75/slides/tcpm-1.pdf, July | http://www.ietf.org/proceedings/75/slides/tcpm-1.pdf, July | |||
| skipping to change at page 7, line 8 | skipping to change at page 7, line 11 | |||
| [Jac88] Jacobson, V., "Congestion Avoidance and Control", Computer | [Jac88] Jacobson, V., "Congestion Avoidance and Control", Computer | |||
| Communication Review, vol. 18, no. 4, pp. 314-329, Aug. 1988. | Communication Review, vol. 18, no. 4, pp. 314-329, Aug. 1988. | |||
| [JK88] Jacobson, V. and M. Karels, "Congestion Avoidance and | [JK88] Jacobson, V. and M. Karels, "Congestion Avoidance and | |||
| Control", ftp://ftp.ee.lbl.gov/papers/congavoid.ps.Z. | Control", ftp://ftp.ee.lbl.gov/papers/congavoid.ps.Z. | |||
| [KP87] Karn, P. and C. Partridge, "Improving Round-Trip Time | [KP87] Karn, P. and C. Partridge, "Improving Round-Trip Time | |||
| Estimates in Reliable Transport Protocols", SIGCOMM 87. | Estimates in Reliable Transport Protocols", SIGCOMM 87. | |||
| [PA00] Paxson, V. and M. Allman, "Computing TCP's Retransmission | ||||
| Timer", RFC 2988, November 2000. | ||||
| Authors' Addresses | Authors' Addresses | |||
| Vern Paxson | Vern Paxson | |||
| ICSI | ICSI | |||
| 1947 Center Street | 1947 Center Street | |||
| Suite 600 | Suite 600 | |||
| Berkeley, CA 94704-1198 | Berkeley, CA 94704-1198 | |||
| Phone: 510-666-2882 | Phone: 510-666-2882 | |||
| EMail: vern@icir.org | EMail: vern@icir.org | |||
| skipping to change at page 7, line 48 | skipping to change at page 8, line 4 | |||
| Matt Sargent | Matt Sargent | |||
| Case Western Reserve University Olin Building | Case Western Reserve University Olin Building | |||
| 10900 Euclid Avenue | 10900 Euclid Avenue | |||
| Room 505 | Room 505 | |||
| Cleveland, OH 44106 | Cleveland, OH 44106 | |||
| Phone: 440-223-5932 | Phone: 440-223-5932 | |||
| Email: mts71@case.edu | Email: mts71@case.edu | |||
| Appendix A | Appendix A | |||
| Choosing a reasonable initial RTO requires balancing two | Choosing a reasonable initial RTO requires balancing two | |||
| competing considerations: | competing considerations: | |||
| 1. The initial RTO should be sufficiently large to cover most of the | 1. The initial RTO should be sufficiently large to cover most of the | |||
| end-to-end paths to avoid spurious retransmissions and their | end-to-end paths to avoid spurious retransmissions and their | |||
| associated negative performance impact. | associated negative performance impact. | |||
| 2. The initial RTO should be small enough to ensure a timely | 2. The initial RTO should be small enough to ensure a timely | |||
| recovery from packet loss occurring before an RTT sample is | recovery from packet loss occurring before an RTT sample is | |||
| taken. | taken. | |||
| Traditionally, TCP has used 3 seconds as the initial RTO | Traditionally, TCP has used 3 seconds as the initial RTO | |||
| [RFC1122,RFC2988]. This document calls for lowering this value to 1 | [Bra89,PA00]. This document calls for lowering this value to 1 | |||
| second using the following rationale: | second using the following rationale: | |||
| - Modern networks are simply faster than the state-of-the-art was | - Modern networks are simply faster than the state-of-the-art was | |||
| at the time the initial RTO of 3 seconds was defined. | at the time the initial RTO of 3 seconds was defined. | |||
| - Studies have found that the round-trip times of more than 97.5% of | - Studies have found that the round-trip times of more than 97.5% of | |||
| the connections observed in a large scale analysis were less than | the connections observed in a large scale analysis were less than | |||
| 1 second [Chu09], suggesting that 1 second meets criterion 1 above. | 1 second [Chu09], suggesting that 1 second meets criterion 1 above. | |||
| - In addition, the studies observed retransmission rates within | - In addition, the studies observed retransmission rates within | |||
| End of changes. 15 change blocks. | ||||
| 16 lines changed or deleted | 21 lines changed or added | |||
This HTML diff was produced by rfcdiff 1.48. The latest version is available from http://tools.ietf.org/tools/rfcdiff/
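
The rules visible in this diff, (2.1), (2.2), the doubling referenced from (5.5), and (5.7), can be read together as a small state machine. The following is a minimal sketch of that reading, not a definitive implementation. The class and method names, the default clock granularity G of 0.1 seconds, and the `syn_timed_out` flag are hypothetical and chosen only for illustration. K = 4 is the value given in RFC 2988; the text that defines it is elided from this diff. The elided rules (subsequent RTT measurements, the minimum and maximum bounds on RTO) are deliberately left out.

```python
from dataclasses import dataclass
from typing import Optional

K = 4                   # variance multiplier; value from RFC 2988 (elided in this diff)
INITIAL_RTO = 1.0       # rule (2.1): RTO before any RTT measurement, in seconds
SYN_FALLBACK_RTO = 3.0  # rule (5.7): RTO to fall back to after a SYN timeout


@dataclass
class RtoState:
    """Hypothetical container for the sender's retransmission timer state."""
    granularity: float = 0.1       # G, clock granularity in seconds (assumed value)
    srtt: Optional[float] = None   # smoothed RTT; unset until the first measurement
    rttvar: Optional[float] = None
    rto: float = INITIAL_RTO
    syn_timed_out: bool = False    # remembers that rule (5.7) must be applied later

    def on_first_measurement(self, r: float) -> None:
        # Rule (2.2): initialize SRTT, RTTVAR and RTO from the first sample R.
        self.srtt = r
        self.rttvar = r / 2
        self.rto = self.srtt + max(self.granularity, K * self.rttvar)

    def on_timer_expired(self, awaiting_syn_ack: bool = False) -> None:
        # Rule (5.7), first half: note a SYN timeout that happened while the
        # RTO in use was below 3 seconds.
        if awaiting_syn_ack and self.rto < SYN_FALLBACK_RTO:
            self.syn_timed_out = True
        # Rule (5.5): back off the timer by doubling RTO (upper bound omitted).
        self.rto *= 2

    def on_handshake_complete(self) -> None:
        # Rule (5.7), second half: re-initialize RTO to 3 seconds when data
        # transmission begins, but only if the SYN timer expired earlier.
        if self.syn_timed_out:
            self.rto = SYN_FALLBACK_RTO
            self.syn_timed_out = False


if __name__ == "__main__":
    state = RtoState()
    state.on_timer_expired(awaiting_syn_ack=True)  # SYN lost once
    state.on_handshake_complete()
    print(state.rto)                               # RTO used for the first data segment
    state.on_first_measurement(0.5)
    print(state.srtt, state.rttvar, state.rto)
```

With the first measurement R, rule (2.2) yields RTO = R + max(G, 4 * R/2), which is 3R whenever 2R exceeds G. The sketch keeps the SYN-timeout flag separate from RTO itself because the backoff in (5.5) continues to modify RTO during the handshake, while (5.7) only takes effect once data transmission begins.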