Network Working Group                                      Reiner Ludwig
INTERNET-DRAFT                                         Ericsson Research
Expires: August 2001                                   February 23, 2001


                      The Eifel Algorithm for TCP


Status of this memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that other
   groups may also distribute working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or cite them other than as "work in progress".

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

Abstract

   TCP's intertwined error and congestion control is not robust against
   spurious timeouts nor is it robust against packet re-orderings. A
   packet that is delayed in the network beyond the expiration of TCP's
   retransmission timer is mistaken for a packet loss by a TCP sender.
   Also, a packet that is re-ordered in the network beyond TCP's
   duplicate acknowledgment threshold is eventually mistaken for a
   packet loss by a TCP sender. Both situations lead to a spurious
   retransmit of the oldest outstanding segment, and an unnecessary
   reduction of the congestion window at the sender. Moreover, a
   spurious timeout forces the sender into a go-back-N retransmission
   mode leading to spurious retransmits of all outstanding segments.

   We propose the "Eifel algorithm" as a way to make TCP robust against
   spurious timeouts and packet re-orderings.
   The Eifel algorithm uses extra information in the ACKs to reliably
   detect (a posteriori) a spurious retransmit of the oldest
   outstanding segment at the TCP sender. In response to such a
   detection, the Eifel algorithm restores the congestion window, and
   prevents the spurious go-back-N retransmits following a spurious
   timeout. As extra information in the ACKs, the Eifel algorithm
   allows for two alternatives: the timestamp option [RFC1323] and/or
   two new flags in the Reserved field of the TCP header.

1. Introduction

   In this document, we use the terms 'valid ACK' as defined in
   [RFC793], and the terms 'duplicate ACK' (DUPACK), 'Congestion Window'
   (cwnd), and 'Slow Start Threshold' (ssthresh) as defined in
   [RFC2581]. Further, our use of the term 'retransmit' includes both
   fast retransmits triggered by the third DUPACK and retransmits
   triggered by a timeout.

   The keywords MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD,
   SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL, when they appear in this
   document, are to be interpreted as described in [RFC2119].

   TCP's intertwined error and congestion control is not robust against
   spurious timeouts nor is it robust against packet re-orderings. A
   packet that is delayed in the network beyond the expiration of TCP's
   retransmission timer is mistaken for a packet loss by a TCP sender.
   This results in a so-called spurious timeout, i.e., a timeout that
   would not have occurred had the sender "waited longer". Also, a
   packet that is re-ordered in the network beyond TCP's DUPACK
   threshold of 3 is eventually mistaken for a packet loss by a TCP
   sender. This is because the fast retransmit algorithm uses the
   arrival of 3 DUPACKs as an indication that a segment has been lost
   [RFC2581]. Both situations lead to a spurious retransmit of the
   oldest outstanding segment, and an unnecessary reduction of the
   congestion window at the sender.
   Moreover, a spurious timeout forces the sender into a go-back-N
   retransmission mode leading to spurious retransmits of all
   outstanding segments. A detailed explanation of these effects using
   trace plots is found in [LK00].

   We propose the "Eifel algorithm" as a way to make TCP robust against
   spurious timeouts and packet re-orderings. The Eifel algorithm is
   based on the observation that the spurious go-back-N retransmits
   following a spurious timeout and the unnecessary reduction of the
   congestion window caused by packet re-ordering have the same root:
   the retransmission ambiguity. The retransmission ambiguity problem
   [KP87] is the TCP sender's inability to distinguish an ACK for the
   original transmit of a segment from the ACK for its retransmit. The
   Eifel algorithm uses extra information in the ACKs to reliably detect
   (a posteriori) a spurious retransmit of the oldest outstanding
   segment at the TCP sender. In response to such a detection, the Eifel
   algorithm restores the congestion window, and prevents the spurious
   go-back-N retransmits following a spurious timeout.

   Spurious timeouts have not generally been a concern in the past since
   they are rare [Pax97]. This can be credited to the conservativeness
   of TCP's retransmission timer [LS00]. Yet, there is benefit in
   avoiding the detrimental effects that spurious timeouts have on TCP
   performance, because those effects create a strong incentive for
   keeping a conservative - potentially too conservative -
   retransmission timer. However, a retransmission timer that is too
   conservative may cause long idle times before a lost packet is
   retransmitted. This can degrade performance. This is obvious for
   interactive request/response-style connections. But it also affects
   bulk data transfers whenever the sender has exhausted its send window
   before the retransmission timer has expired.
   The Eifel algorithm opens the door to the development of a more
   optimistic retransmission timer as it ensures that the penalty for
   underestimating the round-trip time is minimal. In the common case,
   the only penalty is a single spurious retransmit.

   Packet re-orderings can occur due to the connection-less nature of IP
   [RFC791] which does not guarantee an in-order delivery of packets.
   However, it is difficult to evaluate how serious this is in the
   Internet today. Early studies [Pax97] conclude that this occurs
   rarely, while recent studies [BPS99] find this problem to be more
   serious. Clearly, this depends on the paths underlying such studies,
   e.g., whenever routers are inter-connected via multiple links/paths
   (for fault tolerance) and load balancing is performed across those
   links/paths on the aggregate traffic, packet re-orderings will occur
   more frequently.

2. Detecting a Spurious Retransmit in TCP

   Resolving the retransmission ambiguity requires extra information in
   the ACKs that the sender can use to unambiguously distinguish an ACK
   for the original transmit of a segment from that of a retransmit.
   This in turn requires that every segment and the corresponding ACK
   carry the extra information to allow the sender to avoid the spurious
   go-back-N retransmits described in Section 1. Waiting for the
   receiver to signal in DUPACKs that it has correctly received
   duplicate segments, as proposed in [RFC2883], would be too late, and
   is thus not an alternative.

   As extra information in the ACKs, the Eifel algorithm allows for two
   alternatives: the timestamp option [RFC1323] and/or two new flags in
   the Reserved field of the TCP header. Both alternatives are specified
   in the following two subsections. In Section 2.3, we specify
   precedence rules among those two alternatives.
   We speak of the timestamp-based Eifel algorithm to emphasize that
   timestamps are being used to detect spurious retransmits. Likewise,
   we speak of the flag-based Eifel algorithm.

2.1. Detection based on the Xmit-Echo Flag

   We define Bit 6 and Bit 7 in the Reserved field of the TCP header as
   the "Xmit flag" and "Xmit-Echo flag", respectively. The sender uses
   the Xmit flag to mark retransmits while the receiver sets the Xmit-
   Echo flag in the ACKs it sends in response to a segment with the Xmit
   flag set. Since TCP can be a sender and receiver at the same time,
   two separate bits need to be used for those flags. It is worth noting
   that the two flags are similar to the sub-sequence field proposed in
   [ISO8073]. The location of the 6-bit Reserved field in the TCP header
   is shown in Figure 3 of [RFC793]. Bits 8 and 9 of the Reserved field
   have been assigned to the Explicit Congestion Notification (ECN)
   [RFC2481].

2.1.1. TCP Initialization

   The text in this subsection has been derived from Section 6.1.1 of
   [RFC2481]. This is because the same initialization semantics also
   apply to the flag-based Eifel algorithm. Thus, TCPs that support
   ECN, and wish to support the flag-based Eifel algorithm, should be
   able to re-use most of the initialization code implemented for ECN.

   When a TCP sends a SYN packet, it MAY set (i.e., equal to 1) the Xmit
   and Xmit-Echo flag. For a SYN packet, the setting of both flags is
   defined as an indication that the sending TCP wishes to use the
   flag-based Eifel algorithm, rather than as an indication that the SYN
   packet is a retransmit. More precisely, such a SYN packet indicates
   that the TCP transmitting the SYN packet will participate in the
   flag-based Eifel algorithm as both a sender and receiver.
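
   As an illustration only, the flag positions and the handshake
   patterns of this subsection (including the SYN-ACK pattern specified
   in the following paragraphs) could be checked as sketched below. The
   mask values are assumptions: they take the 16-bit field holding Data
   Offset, Reserved and control bits, and number its bits from 0 at the
   most significant bit as in Figure 3 of [RFC793]. All function names
   are hypothetical.

```python
# Hypothetical masks over the 16-bit field (header bytes 12-13) that holds
# Data Offset, Reserved and the control bits. Bit 0 is the most significant
# bit, following the numbering of Figure 3 in RFC 793 (an assumption).
XMIT = 1 << (15 - 6)       # Bit 6: Xmit flag
XMIT_ECHO = 1 << (15 - 7)  # Bit 7: Xmit-Echo flag
SYN = 0x0002               # standard SYN control bit
ACK = 0x0010               # standard ACK control bit


def syn_offers_eifel(field: int) -> bool:
    """True if a SYN (without ACK) carries both Xmit and Xmit-Echo set."""
    is_syn = bool(field & SYN) and not (field & ACK)
    return is_syn and bool(field & XMIT) and bool(field & XMIT_ECHO)


def synack_accepts_eifel(field: int) -> bool:
    """True if a SYN-ACK has Xmit-Echo set and Xmit unset."""
    is_synack = bool(field & SYN) and bool(field & ACK)
    return is_synack and bool(field & XMIT_ECHO) and not (field & XMIT)
```

   Under these assumptions, a SYN with only one of the two flags set is
   not treated as an offer, and a SYN-ACK with both flags set is not
   treated as an acceptance.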

   Only if a TCP receives a SYN packet with both the Xmit and Xmit-Echo
   flags set, MAY it respond with a SYN-ACK packet in which it sets the
   Xmit-Echo flag, but unsets (i.e., sets equal to 0) the Xmit flag. For
   a SYN-ACK packet, the pattern of the Xmit-Echo flag set and the Xmit
   flag unset is defined as an indication that the TCP transmitting the
   SYN-ACK packet agrees to participate in the flag-based Eifel
   algorithm as both a sender and receiver.

   This asymmetry is necessary for the robust negotiation of the use of
   the flag-based Eifel algorithm with deployed TCP implementations (see
   section 6.1.1 of [RFC2481] for details).

   For the TCP transmitting the SYN packet with both the Xmit and Xmit-
   Echo flags set, the flag-based Eifel algorithm has been successfully
   negotiated, if it receives a SYN-ACK packet in which the Xmit-Echo
   flag is set, but the Xmit flag is unset. For the TCP transmitting the
   SYN-ACK packet, the flag-based Eifel algorithm has been successfully
   negotiated, if it has received a SYN packet with both the Xmit and
   Xmit-Echo flags set, while it has set the Xmit-Echo flag but unset
   the Xmit flag in its SYN-ACK.

2.1.2. The TCP Receiver

   If the flag-based Eifel algorithm has been successfully negotiated,
   the following rules apply.

   The receiver SHOULD send an immediate ACK with the Xmit-Echo flag set
   in response to an incoming data segment that has the Xmit flag set.
   The immediate ACK is RECOMMENDED because of the range check that the
   sender performs on incoming ACKs after a retransmit (see Section
   2.1.3).

   In all other cases, the receiver MUST unset (i.e., set equal to 0)
   the Xmit-Echo flag in all ACKs it sends.

2.1.3. The TCP Sender

   If the flag-based Eifel algorithm has been successfully negotiated,
   the following rules apply.

   The sender MUST set the Xmit flag in the TCP header of retransmits.
   This is REQUIRED since otherwise the Eifel algorithm might get
   (falsely) triggered in response to a genuine packet loss (see Section
   5). Recall that our use of the term 'retransmit' includes both fast
   retransmits triggered by the third DUPACK and retransmits triggered
   by a timeout.

   The sender MUST store in "cwnd_prev" the value that the sender's cwnd
   had before it is reduced when the retransmit occurs. Likewise, the
   sender MUST store in "ssthresh_prev" the value that the sender's
   ssthresh had before it is reduced when the retransmit occurs. This is
   REQUIRED since the sender will use cwnd_prev and ssthresh_prev to
   restore its cwnd and ssthresh after it has detected that the
   retransmit was spurious (see Section 3).

   When a retransmit is sent, the sender MUST store the next (previously
   unsent) sequence number to be sent in "sent_high", i.e., sent_high is
   the highest outstanding sequence number transmitted so far plus 1, or
   alternatively, the ACK number of a valid ACK that would ack all
   outstanding data. This is REQUIRED since the sender will use
   sent_high to detect a spurious retransmit as described below. Note
   that the definition of "send_high" (spelled with a 'd') in [RFC2582]
   is different.

   When the first valid ACK that is not a DUPACK arrives after a
   retransmit was sent, the sender detects that the retransmit was
   spurious if all of the following conditions are true:

      - the sender was expecting an ACK for a retransmit, and
      - the ACK number of that ACK is less than or equal to sent_high,
        and
      - that ACK does not have the Xmit-Echo bit set.

   The range check implied by the second condition prevents the Eifel
   algorithm from being triggered in situations where a series of ACKs
   is lost and a cumulative ACK beyond sent_high acks the retransmit.

2.2. Detection based on Timestamps

   The timestamp-based Eifel algorithm requires that both the sender and
   receiver have correctly implemented the timestamp option as specified
   in [RFC1323]. In addition, the TCP sender implementation needs to be
   enhanced as specified in Section 2.2.3. No change to the TCP protocol
   is required nor any change to the TCP receiver implementation.

2.2.1. TCP Initialization

   The timestamp-based Eifel algorithm has been successfully negotiated,
   if use of the timestamp option has been successfully negotiated
   during connection setup (see [RFC1323]).

2.2.2. The TCP Receiver

   No change is required beyond the implementation of the timestamp
   option as specified in [RFC1323].

2.2.3. The TCP Sender

   If the timestamp-based Eifel algorithm has been successfully
   negotiated, the following rules apply.

   The sender MUST store in "ts_first_xmit" the timestamp of the first
   retransmit for a data segment. This is REQUIRED since otherwise the
   Eifel algorithm might get (falsely) triggered in response to a
   genuine packet loss (see Section 5). For the same reason, any
   subsequent retransmit for the same oldest outstanding sequence number
   MUST NOT overwrite ts_first_xmit. Recall that our use of the term
   'retransmit' includes both fast retransmits triggered by the third
   DUPACK and retransmits triggered by a timeout.

   As with the flag-based Eifel algorithm, the sender MUST store in
   cwnd_prev the value that the sender's cwnd had before it is reduced
   when the retransmit occurs. Likewise, the sender MUST store in
   ssthresh_prev the value that the sender's ssthresh had before it is
   reduced when the retransmit occurs.
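
   A minimal sketch of this sender-side bookkeeping, together with the
   detection test specified in the remainder of this subsection, might
   look as follows. The class and method names are hypothetical. The
   sketch assumes that cwnd_prev and ssthresh_prev are saved only on the
   first retransmit of the oldest outstanding segment, and it compares
   timestamps as plain integers, ignoring 32-bit wrap-around.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TimestampEifelState:
    """Hypothetical per-connection state for timestamp-based detection."""
    cwnd: int
    ssthresh: int
    cwnd_prev: Optional[int] = None
    ssthresh_prev: Optional[int] = None
    ts_first_xmit: Optional[int] = None
    retransmits: int = 0  # retransmits of the oldest outstanding segment

    def on_retransmit(self, ts_now: int, new_cwnd: int,
                      new_ssthresh: int) -> None:
        """Record state when the oldest outstanding segment is retransmitted."""
        if self.retransmits == 0:
            # Only the first retransmit sets ts_first_xmit; later ones
            # MUST NOT overwrite it. Saving cwnd/ssthresh only here is an
            # assumption of this sketch.
            self.ts_first_xmit = ts_now
            self.cwnd_prev = self.cwnd
            self.ssthresh_prev = self.ssthresh
        self.retransmits += 1
        self.cwnd, self.ssthresh = new_cwnd, new_ssthresh

    def ack_reveals_spurious(self, ts_echo_reply: int) -> bool:
        """Apply the 'less than' test to the first valid non-DUPACK ACK."""
        if self.retransmits == 0 or self.ts_first_xmit is None:
            return False  # the sender was not expecting an ACK for a retransmit
        return ts_echo_reply < self.ts_first_xmit
```

   An echoed timestamp equal to ts_first_xmit is deliberately treated as
   a genuine retransmit, matching the conservative choice of the "less
   than" operator.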

   When the first valid ACK that is not a DUPACK arrives after a
   retransmit was sent, the sender detects that the retransmit was
   spurious if all of the following conditions are true:

      - the sender was expecting an ACK for a retransmit, and
      - the value of the Timestamp Echo Reply field in the timestamp
        option field of that ACK is less than ts_first_xmit.

   Using the comparison operator "less than" in the second condition is
   conservative. In theory, when the timestamp clock is slow or the
   network is fast, ts_first_xmit could (at most) also be equal to the
   value of the Timestamp Echo Reply field in the timestamp option field
   of the first or a subsequent ACK that acks the retransmit. Thus, in
   such a case the sender assumes that the retransmit was a genuine
   retransmit, i.e., that it was not spurious.

2.3. Timestamps or the Xmit-Echo Flag?

   The advantage of using the timestamp-based over the flag-based Eifel
   algorithm is that it does not require changes to the TCP protocol nor
   to the TCP receiver implementation. Also, [RFC1323] is already a
   standards track document, and the timestamp option has already been
   widely deployed in TCP implementations [???].

   The disadvantage of using the timestamp-based over the flag-based
   Eifel algorithm is that including the 12-byte TCP timestamp option
   field in every segment and the corresponding ACKs introduces extra
   protocol overhead. Moreover, current TCP/IP header compression
   schemes [RFC1144], [RFC2507] do not compress timestamp option fields.
   For those reasons, a sender might choose not to negotiate the
   timestamp option. Note that timestamps are only required for the
   PAWS mechanism (Protect Against Wrapped Sequences) [RFC1323], since
   for the RTTM mechanism (Round Trip Time Measurement) [RFC1323] there
   exist implementation alternatives that work without the timestamp
   option field [LS00].
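
   To give a rough feel for the overhead mentioned above, the following
   sketch computes the relative growth in packet size caused by the 12
   bytes of the timestamp option field. The 40-byte base header (IPv4
   plus TCP without other options) and the function name are
   assumptions made for illustration.

```python
def timestamp_overhead(payload: int, base_header: int = 40,
                       option: int = 12) -> float:
    """Relative growth in packet size when the timestamp option is added.

    base_header assumes a 20-byte IPv4 header plus a 20-byte TCP header
    with no other options (an assumption of this sketch).
    """
    return option / (base_header + payload)


# A pure ACK (no payload) grows from 40 to 52 bytes.
ack_growth = timestamp_overhead(0)
# A segment carrying 1460 bytes of payload grows from 1500 to 1512 bytes.
data_growth = timestamp_overhead(1460)
```

   Under these assumptions the relative cost is largest for pure ACKs
   and small for full-sized data segments.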

   The flag-based Eifel algorithm has none of the above-mentioned
   disadvantages, but instead requires changes to the TCP protocol (two
   new flags in the Reserved field of the TCP header) and to the TCP
   receiver implementation. Hence, we define the following precedence
   rules that would allow for an incremental deployment of the Eifel
   algorithm.

      - A TCP sender that implements the timestamp option MAY also
        implement the timestamp-based Eifel algorithm.

      - A TCP sender MAY implement the flag-based Eifel algorithm, and if
        it does, MAY try to negotiate its use during connection setup.
        There are situations where it might be advisable not to operate
        the flag-based Eifel algorithm (see Section 5).

      - If a receiver correctly sets the Xmit-Echo flag (see Section 5),
        the operation of the Eifel algorithm should be based on the Xmit-
        Echo flag independent of whether timestamps are also being used.

        In case timestamps are also being used, a sender may use
        timestamps as an additional check to verify whether a retransmit
        was spurious. This rule implies that the negotiation to use the
        Xmit-Echo flag has succeeded.

      - If timestamps are being used but the Xmit-Echo flag is not being
        used for a particular TCP connection, the Eifel algorithm MAY
        be operated based on timestamps. This rule implies that either
        the negotiation to use the Xmit-Echo flag has failed or its use
        has been turned off due to a broken receiver (see Section 5).

3. Responding to a Spurious Retransmit in TCP

   If the flag- or timestamp-based Eifel algorithm has been successfully
   negotiated, the following rules apply.

   When a sender has detected that it has performed a spurious
   retransmit, the sender SHOULD resume transmission with the next
   unsent segment.
   In addition, it restores cwnd and ssthresh as outlined below based on
   the definition of cwnd_prev and ssthresh_prev provided in Sections
   2.1.3 and 2.2.3.

   If the first (in case of multiple) spurious retransmit was triggered
   by a spurious timeout, and

      if only a single retransmit had been sent, the sender MAY
      restore cwnd with cwnd_prev and ssthresh with ssthresh_prev.

   Else if the first (in case of multiple) spurious retransmit was not
   triggered by a spurious timeout, and

      if only a single retransmit had been sent, the sender
      first sets cwnd to ssthresh, as it would do anyway [RFC2581], and
      then MAY restore ssthresh with cwnd_prev.

   If two retransmits of the same oldest outstanding sequence number had
   been sent, the sender MAY restore both cwnd and ssthresh with one
   half the value of cwnd_prev.

   If more than two retransmits of the same oldest outstanding sequence
   number had been sent, the Eifel algorithm MUST NOT have any effect on
   cwnd and ssthresh.

   Note that the response in case the first spurious retransmit was not
   triggered by a spurious timeout, as described above, is different
   from the original proposal described in [LK00]. However, this new
   response avoids the packet burst that the response described in
   [LK00] would typically cause.

4. Avoiding Competition between Timeout- and DUPACK-based Error Recovery

   In the spirit of the Eifel algorithm, although unrelated to spurious
   retransmits, we propose the following rule, based on the definition
   of cwnd_prev and ssthresh_prev provided in Sections 2.1.3 and 2.2.3.

   The rule applies to the case when the third DUPACK arrives after the
   first timeout for the same oldest outstanding sequence number has
   already occurred. In that case, the sender SHOULD suppress the fast
   retransmit and MAY restore both cwnd and ssthresh with one half the
   value of cwnd_prev.
   I.e., the sender restores cwnd and ssthresh as if the timeout had not
   occurred, but instead goes into congestion avoidance [RFC2581].

5. Security Considerations

   There is a considerable risk when implementing the Eifel algorithm in
   a naive fashion, because a misbehaving receiver can severely upset
   congestion control at the sender. The risk is that the Eifel
   algorithm is (falsely) triggered in response to a genuine packet
   loss. In that case the Eifel algorithm would mistake a genuine
   retransmit for a spurious retransmit. As a consequence the sender
   would effectively not reduce its congestion window in response to the
   lost packet. However, there are reliable sender-side mechanisms to
   protect against this case as outlined below.

   One needs to distinguish between broken receivers that misbehave due
   to an implementation mistake versus malicious receivers that
   deliberately misbehave. We first describe two mechanisms to protect
   against broken receivers, followed by a different mechanism to
   protect against malicious receivers. We only recommend that
   protection against broken receivers SHOULD be implemented together
   with the Eifel algorithm at the sender. This is motivated by the fact
   that the current TCP implicitly assumes a trust relationship between
   sender and receiver, i.e., it can be assumed that receivers are not
   malicious. Note that even without the Eifel algorithm, there are ways
   a misbehaving receiver can upset congestion control at the sender
   [SCWA99].

   We do not discuss problems with respect to misbehaving senders
   assuming that the implementation of the sender-side Eifel algorithm
   complies with the specifications in this text.

   Protection Against Broken Receivers

   There is no risk to falsely trigger the timestamp-based Eifel
   algorithm as long as the receiver correctly implements the timestamp
   option [RFC1323].
   However, the flag-based Eifel algorithm can be falsely triggered when
   the receiver has agreed to set the Xmit-Echo flag in ACKs for
   retransmits but then "forgets" to do so. The ACK for a genuine
   retransmit would then falsely trigger the Eifel algorithm once it
   arrives at the sender. To protect against this case the sender SHOULD
   implement the following mechanism if it uses the flag-based Eifel
   algorithm.

   After the sender has detected a spurious retransmit and in response
   restores cwnd and ssthresh with cwnd_prev and ssthresh_prev,
   respectively (see Section 3), it also saves the former values of cwnd
   and ssthresh in cwnd_prev and ssthresh_prev, respectively (e.g., help
   = cwnd; cwnd = cwnd_prev; cwnd_prev = help;). Until an ACK arrives at
   the sender that acks beyond sent_high, the sender checks for a valid
   ACK that arrives with the Xmit-Echo flag set. If such an ACK arrives,
   the sender assumes that the Eifel algorithm was rightfully triggered
   and does nothing further. Otherwise, when an ACK arrives at the
   sender that acks beyond sent_high, the sender assumes that the Eifel
   algorithm was falsely triggered and reverses the effects of the Eifel
   algorithm. I.e., cwnd is (re-)restored to cwnd_prev and ssthresh is
   (re-)restored to ssthresh_prev. Since ACKs with the Xmit-Echo flag
   set can also get lost, it would be too conservative to completely
   disable the Eifel algorithm for the rest of the connection in this
   situation.

   Another risk stems from receivers that set the Xmit-Echo flag for
   segments that have not been retransmitted. A TCP sender SHOULD
   implement an appropriate detection mechanism, since in this case, it
   cannot reliably use the flag-based Eifel algorithm. If a sender
   detects such misbehavior, it SHOULD disable the flag-based Eifel
   algorithm for the rest of the connection.
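
   The swap-and-verify mechanism for receivers that "forget" the
   Xmit-Echo flag can be sketched as follows. This is one possible
   reading, not a definitive implementation: the class and method names
   are hypothetical, an ACK beyond sent_high is assumed to end the
   verification window regardless of its own flags, and sequence numbers
   are compared as plain integers without wrap-around handling.

```python
from dataclasses import dataclass


@dataclass
class FlagEifelGuard:
    """Hypothetical guard against a falsely triggered flag-based Eifel."""
    cwnd: int
    ssthresh: int
    cwnd_prev: int
    ssthresh_prev: int
    sent_high: int
    pending: bool = False  # a restore is still awaiting confirmation

    def _swap(self) -> None:
        # Equivalent to: help = cwnd; cwnd = cwnd_prev; cwnd_prev = help;
        self.cwnd, self.cwnd_prev = self.cwnd_prev, self.cwnd
        self.ssthresh, self.ssthresh_prev = self.ssthresh_prev, self.ssthresh

    def on_spurious_detected(self) -> None:
        """Restore cwnd/ssthresh and keep the reduced values for a rollback."""
        self._swap()
        self.pending = True

    def on_valid_ack(self, ack_no: int, xmit_echo: bool) -> None:
        """Confirm or reverse the restore based on later ACKs."""
        if not self.pending:
            return
        if ack_no > self.sent_high:
            # No Xmit-Echo ACK was seen in time: assume a false trigger
            # and re-restore the reduced values.
            self._swap()
            self.pending = False
        elif xmit_echo:
            # The ACK for the retransmit arrived: the trigger was rightful.
            self.pending = False
```

   Note that the guard only undoes a single restore; in line with the
   text above, it does not disable the Eifel algorithm for the rest of
   the connection.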

   Protection Against Malicious Receivers

   There are a number of ways in which a malicious receiver could
   falsely trigger the Eifel algorithm at the sender. A sender is
   particularly vulnerable to this if it operates the flag-based Eifel
   algorithm. Hence, the sender MAY choose to only use the
   timestamp-based Eifel algorithm, and in addition implement the
   following mechanism. The mechanism combines the idea of a "singular
   nonce" proposed in [SCWA99] with the timestamp option specified in
   [RFC1323].

   The mechanism is based on the observation that after a spurious
   retransmit the sender will at some point receive the ACK that was
   triggered by the corresponding original transmit assuming that that
   ACK was not lost. That ACK can be expected to echo the timestamp of
   the original transmit even if the receiver implements the delayed ACK
   algorithm. In case of packet re-ordering, this is implied by the
   rules for generating ACKs for data segments that fill in all or part
   of a gap in the sequence space (see section 4.2 of [RFC2581]) and by
   the rules for echoing timestamps in that case (see rule (C) in
   section 3.4 of [RFC1323]). In case of a spurious timeout, it is quite
   likely that the delay that has caused the spurious timeout has also
   caused the receiver's delayed ACK timer [RFC1122] to expire. Hence,
   to protect against a malicious receiver, the sender should only
   trigger the Eifel algorithm in response to the ACK for the original
   transmit and only after it has authenticated that ACK as described
   below.

   To protect against malicious receivers that spoof ACKs, the sender
   MAY implement the following modification to the timestamp option
   specified in [RFC1323]. The sender adds a separate random number to
   each timestamp to be included in an outgoing segment, and writes the
   result into the Timestamp Value field in the timestamp option field.
   A new random number is generated per RTT to prevent the receiver from
   "learning" the random number. That random number should be carefully
   chosen to avoid bad interactions with the PAWS mechanism specified in
   [RFC1323]. Clearly, the random number(s) need to be accounted for in
   the RTTM mechanism specified in [RFC1323], i.e., the random number
   needs to be subtracted from the echoed timestamp before the RTT can
   be calculated. For this to work, the sender needs to store all
   "outstanding timestamps" and the corresponding random number. With
   this modification, the sender only identifies a retransmit as
   spurious if the ACK for the original transmit echoes the "random
   timestamp" that was sent. Thus, assuming a receiver has no easy
   access to the mentioned random numbers, this should provide for a
   fairly secure protection against malicious receivers that spoof the
   "right" ACK that would trigger the Eifel algorithm.

Acknowledgments

   Many thanks to Keith Sklower for helping to develop the tools that
   allowed the study of spurious timeouts and packet re-orderings. Many
   thanks to Randy Katz, Michael Meyer, Stephan Baucke, Sally Floyd,
   Vern Paxson, and Mark Allman for discussions around the Eifel
   algorithm.

References

   [RFC2581]  M. Allman, V. Paxson, W. Stevens, TCP Congestion Control,
              RFC 2581, April 1999.

   [BPS99]    J.C.R. Bennett, C. Partridge, N. Shectman, Packet
              Reordering is Not Pathological Network Behavior, IEEE/ACM
              Transactions on Networking, December 1999.

   [RFC1122]  R. Braden, Requirements for Internet Hosts - Communication
              Layers, RFC 1122, October 1989.

   [RFC2119]  S. Bradner, Key words for use in RFCs to Indicate
              Requirement Levels, RFC 2119, March 1997.

   [RFC2507]  M. Degermark, B. Nordgren, S. Pink, IP Header Compression,
              RFC 2507, February 1999.

   [RFC2582]  S. Floyd, T. Henderson, The NewReno Modification to TCP's
              Fast Recovery Algorithm, RFC 2582, April 1999.

   [RFC2883]  S. Floyd, J. Mahdavi, M. Mathis, M. Podolsky, A. Romanow,
              An Extension to the Selective Acknowledgement (SACK) Option
              for TCP, RFC 2883, July 2000.

   [ISO8073]  ISO/IEC, Information processing systems - Open Systems
              Interconnection - Connection oriented transport protocol
              specification, International Standard ISO/IEC 8073,
              December 1988.

   [RFC1323]  V. Jacobson, R. Braden, D. Borman, TCP Extensions for High
              Performance, RFC 1323, May 1992.

   [RFC1144]  V. Jacobson, Compressing TCP/IP Headers for Low-Speed
              Serial Links, RFC 1144, February 1990.

   [KP87]     P. Karn, C. Partridge, Improving Round-Trip Time Estimates
              in Reliable Transport Protocols, In Proceedings of ACM
              SIGCOMM 87.

   [LK00]     R. Ludwig, R. H. Katz, The Eifel Algorithm: Making TCP
              Robust Against Spurious Retransmissions, ACM Computer
              Communication Review, Vol. 30, No. 1, January 2000,
              available at http://www.acm.org/sigcomm/ccr/archive/2000/
              jan00/ccr-200001-ludwig.html (easier studied when
              viewed/printed in color).

   [LS00]     R. Ludwig, K. Sklower, The Eifel Retransmission Timer, ACM
              Computer Communication Review, Vol. 30, No. 3, July 2000.

   [Pax97]    V. Paxson, End-to-End Routing Behavior in the Internet,
              IEEE/ACM Transactions on Networking, Vol.5, No.5, October
              1997.

   [RFC791]   J. Postel, Internet Protocol, RFC 791, September 1981.

   [RFC793]   J. Postel, Transmission Control Protocol, RFC 793,
              September 1981.

   [RFC2481]  K. Ramakrishnan, S. Floyd, A Proposal to add Explicit
              Congestion Notification (ECN) to IP, RFC 2481, January
              1999.

   [SCWA99]   S. Savage, N. Cardwell, D. Wetherall, T. Anderson, TCP
              Congestion Control with a Misbehaving Receiver, ACM
              Computer Communication Review, Vol. 29, No. 5, October
              1999.

Author's Address

   Reiner Ludwig
   Ericsson Research (EED)
   Ericsson Allee 1
   52134 Herzogenrath, Germany
   Phone: +49 2407 575 719
   Fax: +49 2407 575 400
   Reiner.Ludwig@Ericsson.com

This Internet-Draft expires in August 2001.