Network Working Group                                      Reiner Ludwig
INTERNET-DRAFT                                         Ericsson Research
Expires: May 2001                                      November 17, 2000


                      The Eifel Algorithm for TCP
               <draft-ludwig-tsvwg-tcp-eifel-alg-00.txt>


Status of this memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that other
   groups may also distribute working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or cite them other than as "work in progress".

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html


Abstract

   TCP's intertwined error and congestion control is not robust against
   spurious timeouts nor is it robust against packet re-orderings. A
   packet that is delayed in the network beyond the expiration of TCP's
   retransmission timer is mistaken for a packet loss by a TCP sender.
   Also, a packet that is re-ordered in the network beyond TCP's
   duplicate acknowledgment threshold is eventually mistaken for a
   packet loss by a TCP sender. Both situations lead to a spurious
   retransmit of the oldest outstanding segment, and an unnecessary
   reduction of the congestion window at the sender. Moreover, a
   spurious timeout forces the sender into a go-back-N retransmission
   mode leading to spurious retransmits of all outstanding segments.

   We propose the "Eifel algorithm" as a way to make TCP robust against
   spurious timeouts and packet re-orderings.
   The Eifel algorithm uses
   extra information in the ACKs to reliably detect (a posteriori) a
   spurious retransmit of the oldest outstanding segment at the TCP
   sender. In response to such a detection, the Eifel algorithm restores
   the congestion window, and prevents the spurious go-back-N
   retransmits following a spurious timeout. As extra information in the
   ACKs, the Eifel algorithm allows for two alternatives: the timestamp
   option [RFC1323] and/or two new flags in the Reserved field of the
   TCP header.

1. Introduction

   In this document, we use the term 'valid ACK' as defined in
   [RFC793], and the terms 'duplicate ACK' (DUPACK), 'Congestion Window'
   (cwnd), and 'Slow Start Threshold' (ssthresh) as defined in
   [RFC2581]. Further, our use of the term 'retransmit' includes both
   fast retransmits triggered by the third DUPACK and retransmits
   triggered by a timeout.

   The keywords MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD,
   SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL, when they appear in this
   document, are to be interpreted as described in [RFC2119].

   TCP's intertwined error and congestion control is not robust against
   spurious timeouts nor is it robust against packet re-orderings. A
   packet that is delayed in the network beyond the expiration of TCP's
   retransmission timer is mistaken for a packet loss by a TCP sender.
   This results in a so-called spurious timeout, i.e., a timeout that
   would not have occurred had the sender "waited longer". Also, a
   packet that is re-ordered in the network beyond TCP's DUPACK
   threshold of 3 is eventually mistaken for a packet loss by a TCP
   sender. This is because the fast retransmit algorithm uses the
   arrival of 3 DUPACKs as an indication that a segment has been lost
   [RFC2581]. Both situations lead to a spurious retransmit of the
   oldest outstanding segment, and an unnecessary reduction of the
   congestion window at the sender.
   Moreover, a spurious timeout forces
   the sender into a go-back-N retransmission mode leading to spurious
   retransmits of all outstanding segments. A detailed explanation of
   these effects using trace plots is found in [LK00].

   We propose the "Eifel algorithm" as a way to make TCP robust against
   spurious timeouts and packet re-orderings. The Eifel algorithm is
   based on the observation that the spurious go-back-N retransmits
   following a spurious timeout and the unnecessary reduction of the
   congestion window caused by packet re-ordering have the same root:
   the retransmission ambiguity. The retransmission ambiguity problem
   [KP87] is the TCP sender's inability to distinguish an ACK for the
   original transmit of a segment from the ACK for its retransmit. The
   Eifel algorithm uses extra information in the ACKs to reliably detect
   (a posteriori) a spurious retransmit of the oldest outstanding
   segment at the TCP sender. In response to such a detection, the Eifel
   algorithm restores the congestion window, and prevents the spurious
   go-back-N retransmits following a spurious timeout.

   Spurious timeouts have not generally been a concern in the past since
   they are rare [Pax97]. This can be credited to the conservativeness
   of TCP's retransmission timer [LS00]. Yet, there is benefit in
   avoiding the detrimental effects that spurious timeouts have on TCP
   performance. This is because those effects create a strong incentive
   for keeping a conservative - potentially too conservative -
   retransmission timer. However, a retransmission timer that is too
   conservative may cause long idle times before a lost packet is
   retransmitted. This can degrade performance. This is obvious for
   interactive request/response-style connections. But it also affects
   bulk data transfers whenever the sender has exhausted its send window
   before the retransmission timer has expired.
   The Eifel algorithm
   opens the door to the development of a more optimistic retransmission
   timer as it ensures that the penalty for underestimating the
   round-trip time is minimal. In the common case, the only penalty is a
   single spurious retransmit.

   Packet re-orderings can occur due to the connection-less nature of IP
   [RFC791], which does not guarantee in-order delivery of packets.
   However, it is difficult to evaluate how serious this is in the
   Internet today. Early studies [Pax97] conclude that this occurs
   rarely, while recent studies [BPS99] find this problem to be more
   serious. Clearly, this depends on the paths underlying such studies,
   e.g., whenever routers are inter-connected via multiple links/paths
   (for fault tolerance) and load balancing is performed across those
   links/paths on the aggregate traffic, packet re-orderings will occur
   more frequently.

2. Detecting a Spurious Retransmit in TCP

   Resolving the retransmission ambiguity requires extra information in
   the ACKs that the sender can use to unambiguously distinguish an ACK
   for the original transmit of a segment from that of a retransmit.
   This in turn requires that every segment and the corresponding ACK
   carry the extra information to allow the sender to avoid the spurious
   go-back-N retransmits described in Section 1. Waiting for the
   receiver to signal in DUPACKs that it has correctly received
   duplicate segments, as proposed in [RFC2883], would be too late, and
   is thus not an alternative.

   As extra information in the ACKs, the Eifel algorithm allows for two
   alternatives: the timestamp option [RFC1323] and/or two new flags in
   the Reserved field of the TCP header. Both alternatives are specified
   in the following two subsections. In Section 2.3, we specify
   precedence rules among those two alternatives.
   We speak of the
   timestamp-based Eifel algorithm to emphasize that timestamps are
   being used to detect spurious retransmits. Likewise, we speak of the
   flag-based Eifel algorithm.

2.1. Detection based on the Xmit-Echo Flag

   We define Bit 6 and Bit 7 in the Reserved field of the TCP header as
   the "Xmit flag" and "Xmit-Echo flag", respectively. The sender uses
   the Xmit flag to mark retransmits while the receiver sets the
   Xmit-Echo flag in the ACKs it sends in response to a segment with the
   Xmit flag set. Since TCP can be a sender and receiver at the same
   time, two separate bits need to be used for those flags. It is worth
   noting that the two flags are similar to the sub-sequence field
   proposed in [ISO8073]. The location of the 6-bit Reserved field in
   the TCP header is shown in Figure 3 of [RFC793]. Bits 8 and 9 of the
   Reserved field have been assigned to the Explicit Congestion
   Notification (ECN) [RFC2481].

2.1.1. TCP Initialization

   The text in this subsection has been derived from Section 6.1.1 of
   [RFC2481]. This is because the same initialization semantics also
   apply to the flag-based Eifel algorithm. Thus, TCPs that support
   ECN, and wish to support the flag-based Eifel algorithm, should be
   able to re-use most of the initialization code implemented for ECN.

   When a TCP sends a SYN packet, it MAY set (i.e., set equal to 1) the
   Xmit and Xmit-Echo flags. For a SYN packet, the setting of both flags
   is defined as an indication that the sending TCP wishes to use the
   flag-based Eifel algorithm, rather than as an indication that the SYN
   packet is a retransmit. More precisely, such a SYN packet indicates
   that the TCP transmitting the SYN packet will participate in the
   flag-based Eifel algorithm as both a sender and receiver.
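
   The flag positions and the SYN marking described so far can be
   sketched as follows. This is only an illustration, not part of the
   specification: the function names are hypothetical, and the masks
   assume the bit numbering of Figure 3 of [RFC793], in which Bit 0 is
   the most significant bit of the 16-bit word that holds the Data
   Offset, the Reserved field, and the control bits.

```python
# Illustrative sketch only; names are hypothetical, not from the draft.
# Assumption: bits are numbered as in RFC 793 Figure 3, with Bit 0 the most
# significant bit of the 16-bit Data Offset / Reserved / control-bits word.
XMIT = 1 << (15 - 6)        # Bit 6 of the word (Reserved field)
XMIT_ECHO = 1 << (15 - 7)   # Bit 7 of the word (Reserved field)
SYN = 0x0002                # standard SYN control bit
ACK = 0x0010                # standard ACK control bit


def make_eifel_syn(flags_word: int) -> int:
    """Mark an outgoing SYN as an offer to use the flag-based algorithm."""
    return flags_word | SYN | XMIT | XMIT_ECHO


def syn_offers_eifel(flags_word: int) -> bool:
    """True if a received pure SYN has both Xmit and Xmit-Echo set."""
    is_pure_syn = bool(flags_word & SYN) and not (flags_word & ACK)
    both = XMIT | XMIT_ECHO
    return is_pure_syn and (flags_word & both) == both
```

   On a SYN, the two bits are read together as a negotiation offer, so
   the sketch checks them as a pair rather than treating the Xmit bit
   as a retransmit marker.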

   Only if a TCP receives a SYN packet with both the Xmit and Xmit-Echo
   flags set, MAY it respond with a SYN-ACK packet in which it sets the
   Xmit-Echo flag, but unsets (i.e., sets equal to 0) the Xmit flag. For
   a SYN-ACK packet, the pattern of the Xmit-Echo flag set and the Xmit
   flag unset is defined as an indication that the TCP transmitting the
   SYN-ACK packet agrees to participate in the flag-based Eifel
   algorithm as both a sender and receiver.

   This asymmetry is necessary for the robust negotiation of the use of
   the flag-based Eifel algorithm with deployed TCP implementations (see
   Section 6.1.1 of [RFC2481] for details).

   For the TCP transmitting the SYN packet with both the Xmit and
   Xmit-Echo flags set, the flag-based Eifel algorithm has been
   successfully negotiated if it receives a SYN-ACK packet in which the
   Xmit-Echo flag is set, but the Xmit flag is unset. For the TCP
   transmitting the SYN-ACK packet, the flag-based Eifel algorithm has
   been successfully negotiated if it has received a SYN packet with
   both the Xmit and Xmit-Echo flags set, while it has set the Xmit-Echo
   flag but unset the Xmit flag in its SYN-ACK.

2.1.2. The TCP Receiver

   If the flag-based Eifel algorithm has been successfully negotiated,
   the following rules apply.

   The receiver SHOULD send an immediate ACK with the Xmit-Echo flag set
   in response to an incoming data segment that has the Xmit flag set.
   The immediate ACK is RECOMMENDED because of the range check that the
   sender performs on incoming ACKs after a retransmit (see Section
   2.1.3).

   In all other cases, the receiver SHOULD unset (i.e., set equal to 0)
   the Xmit-Echo flag in all ACKs it sends.

2.1.3. The TCP Sender

   If the flag-based Eifel algorithm has been successfully negotiated,
   the following rules apply.

   The sender MUST set the Xmit flag in the TCP header of retransmits.
   This is REQUIRED since otherwise the Eifel algorithm might get
   (falsely) triggered in response to a genuine packet loss (see Section
   5). Recall that our use of the term 'retransmit' includes both fast
   retransmits triggered by the third DUPACK and retransmits triggered
   by a timeout.

   The sender MUST store in "cwnd_prev" the value that the sender's cwnd
   had before it is reduced when the retransmit occurs. Likewise, the
   sender MUST store in "ssthresh_prev" the value that the sender's
   ssthresh had before it is reduced when the retransmit occurs. This is
   REQUIRED since the sender will use cwnd_prev and ssthresh_prev to
   restore its cwnd and ssthresh after it has detected that the
   retransmit was spurious (see Section 3).

   When a retransmit is sent, the sender MUST store the next (previously
   unsent) sequence number to be sent in "sent_high", i.e., sent_high is
   the highest outstanding sequence number transmitted so far plus 1, or
   alternatively, the ACK number of a valid ACK that would ack all
   outstanding data. This is REQUIRED since the sender will use
   sent_high to detect a spurious retransmit as described below. Note
   that the definition of "send_high" (spelled with a 'd') in [RFC2582]
   is different.

   When the first valid ACK that is not a DUPACK arrives after a
   retransmit was sent, the sender detects that the retransmit was
   spurious if all of the following conditions are true:

   - the sender was expecting an ACK for a retransmit, and
   - the ACK number of that ACK is less than or equal to sent_high,
     and
   - that ACK does not have the Xmit-Echo bit set.

   The range check implied by the second condition prevents the Eifel
   algorithm from being triggered in situations where a series of ACKs
   is lost and a cumulative ACK beyond sent_high acks the retransmit.

2.2. Detection based on Timestamps

   The timestamp-based Eifel algorithm requires that both the sender and
   receiver have correctly implemented the timestamp option as specified
   in [RFC1323]. In addition, the TCP sender implementation needs to be
   enhanced as specified in Section 2.2.3. No change to the TCP protocol
   is required nor any change to the TCP receiver implementation.

2.2.1. TCP Initialization

   The timestamp-based Eifel algorithm has been successfully negotiated
   if use of the timestamp option has been successfully negotiated
   during connection setup (see [RFC1323]).

2.2.2. The TCP Receiver

   No change is required beyond the implementation of the timestamp
   option as specified in [RFC1323].

2.2.3. The TCP Sender

   If the timestamp-based Eifel algorithm has been successfully
   negotiated, the following rules apply.

   The sender MUST store in "ts_first_xmit" the timestamp of the first
   retransmit for a data segment. This is REQUIRED since otherwise the
   Eifel algorithm might get (falsely) triggered in response to a
   genuine packet loss (see Section 5). For the same reason, any
   subsequent retransmit for the same oldest outstanding sequence number
   MUST NOT overwrite ts_first_xmit. Recall that our use of the term
   'retransmit' includes both fast retransmits triggered by the third
   DUPACK and retransmits triggered by a timeout.

   As with the flag-based Eifel algorithm, the sender MUST store in
   cwnd_prev the value that the sender's cwnd had before it is reduced
   when the retransmit occurs. Likewise, the sender MUST store in
   ssthresh_prev the value that the sender's ssthresh had before it is
   reduced when the retransmit occurs.
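
   The sender-side bookkeeping described above might be sketched as
   follows. This is a minimal illustration under stated assumptions,
   not a definitive implementation: the class and method names are
   hypothetical, and the sketch assumes that cwnd_prev and
   ssthresh_prev are captured only before the first retransmit of the
   oldest outstanding segment, which is one reading of the text that is
   consistent with how Section 3 uses these values.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EifelSenderState:
    """Hypothetical container for the variables named in the text."""
    cwnd: int
    ssthresh: int
    cwnd_prev: Optional[int] = None
    ssthresh_prev: Optional[int] = None
    ts_first_xmit: Optional[int] = None
    retransmit_count: int = 0   # retransmits of the oldest outstanding segment

    def before_retransmit(self, ts_of_retransmit: int) -> None:
        """Call before cwnd and ssthresh are reduced for a retransmit."""
        if self.retransmit_count == 0:
            # Only the first retransmit records its timestamp; later
            # retransmits of the same segment must not overwrite it.
            self.ts_first_xmit = ts_of_retransmit
            # Assumption: the pre-reduction values are saved once, too.
            self.cwnd_prev = self.cwnd
            self.ssthresh_prev = self.ssthresh
        self.retransmit_count += 1
```

   Counting the retransmits is not required by the rules above, but a
   sender needs that count later when it decides how to respond to a
   detected spurious retransmit.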

   When the first valid ACK that is not a DUPACK arrives after a
   retransmit was sent, the sender detects that the retransmit was
   spurious if all of the following conditions are true:

   - the sender was expecting an ACK for a retransmit, and
   - the value of the Timestamp Echo Reply field in the timestamp
     option field of that ACK is less than ts_first_xmit.

   Using the comparison operator "less than" in the second condition is
   conservative. In theory, when the timestamp clock is slow or the
   network is fast, ts_first_xmit could (at most) also be equal to the
   value of the Timestamp Echo Reply field in the timestamp option field
   of the first or a subsequent ACK that acks the retransmit. Thus, in
   such a case the sender assumes that the retransmit was a genuine
   retransmit, i.e., that it was not spurious.

2.3. Timestamps or the Xmit-Echo Flag?

   The advantage of using the timestamp-based over the flag-based Eifel
   algorithm is that it does not require changes to the TCP protocol nor
   to the TCP receiver implementation. Also, [RFC1323] is already a
   standards track document, and the timestamp option has already been
   widely deployed in TCP implementations [???].

   The disadvantage of using the timestamp-based over the flag-based
   Eifel algorithm is that including the 12-byte TCP timestamp option
   field in every segment and the corresponding ACKs introduces extra
   protocol overhead. Moreover, current TCP/IP header compression
   schemes [RFC1144], [RFC2507] do not compress timestamp option fields.
   For those reasons, a sender might choose not to negotiate the
   timestamp option. Note that timestamps are only required for the
   PAWS mechanism (Protect Against Wrapped Sequences) [RFC1323], since
   for the RTTM mechanism (Round Trip Time Measurement) [RFC1323] there
   exist implementation alternatives that work without the timestamp
   option field [LS00].
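
   To give a rough feel for the overhead mentioned above, the following
   sketch computes the relative growth of a packet caused by the
   12-byte timestamp option. The header sizes (20-byte IPv4 header and
   20-byte TCP header, both without other options) and the payload
   sizes are assumptions chosen for illustration only; they are not
   taken from this document.

```python
# Illustrative arithmetic only. Assumed sizes, not taken from the draft:
IPV4_HEADER = 20   # bytes, IPv4 header without options
TCP_HEADER = 20    # bytes, TCP header without any other options
TS_OPTION = 12     # bytes, timestamp option field as stated in the text


def timestamp_overhead(payload_bytes: int) -> float:
    """Fraction by which the timestamp option enlarges one packet."""
    base = IPV4_HEADER + TCP_HEADER + payload_bytes
    return TS_OPTION / base


# A pure ACK (no payload) and two hypothetical payload sizes.
for payload in (0, 536, 1460):
    print(f"payload {payload:5d} bytes: +{timestamp_overhead(payload):.1%}")
```

   Under these assumptions the relative cost is largest for pure ACKs
   and small segments, which is also where header compression would
   otherwise help most.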

   The flag-based Eifel algorithm has none of the above mentioned
   disadvantages, but instead requires changes to the TCP protocol (two
   new flags in the Reserved field of the TCP header) and to the TCP
   receiver implementation. Hence, we define the following precedence
   rules that would allow for an incremental deployment of the Eifel
   algorithm.

   - A TCP sender that implements the timestamp option SHOULD also
     implement the timestamp-based Eifel algorithm.

   - A TCP sender SHOULD implement the flag-based Eifel algorithm and
     SHOULD try to negotiate its use during connection setup. There
     are situations where it might be advisable to deviate from this
     rule (see Section 5).

   - If a receiver correctly sets the Xmit-Echo flag (see Section 5),
     the operation of the Eifel algorithm SHOULD be based on the
     Xmit-Echo flag independent of whether timestamps are also being
     used.

     In case timestamps are also being used, a sender MAY use
     timestamps as an additional check to verify whether a retransmit
     was spurious. This rule implies that the negotiation to use the
     Xmit-Echo flag has succeeded.

   - If timestamps are being used but the Xmit-Echo flag is not being
     used for a particular TCP connection, the Eifel algorithm SHOULD
     be operated based on timestamps. This rule implies that either
     the negotiation to use the Xmit-Echo flag has failed or its use
     has been turned off due to a broken receiver (see Section 5).

3. Responding to a Spurious Retransmit in TCP

   If the flag- or timestamp-based Eifel algorithm has been successfully
   negotiated, the following rules apply.

   When a sender has detected that it has performed a spurious
   retransmit, the sender resumes transmission with the next unsent
   segment. In addition, it performs one of the following two actions
   based on the definition of cwnd_prev and ssthresh_prev provided in
   Sections 2.1.3 and 2.2.3.

   - If only a single retransmit had been sent, the sender SHOULD
     restore cwnd with cwnd_prev and ssthresh with ssthresh_prev.

   - If two retransmits of the same oldest outstanding sequence number
     had been sent, the sender SHOULD restore both cwnd and ssthresh
     with one half the value of cwnd_prev.

   If more than two retransmits of the same oldest outstanding sequence
   number had been sent, the Eifel algorithm has no effect on cwnd and
   ssthresh.

   Note that in the case of packet re-ordering, the ACK that was used to
   detect that the retransmit was spurious (see Sections 2.1.3 and
   2.2.3) will usually clock out a burst of segments. The size of that
   burst is equal to the number of DUPACKs that did not clock out a new
   segment during the first phase of fast recovery when cwnd is
   inflated [RFC2581].

   Responding to a spurious retransmit as specified above is visualized
   in [LK00] using trace plots. This might aid a better understanding.

4. Avoiding Competition between Timeout- and DUPACK-based Error Recovery

   In the spirit of the Eifel algorithm, although unrelated to spurious
   retransmits, we propose the following rule, based on the definition
   of cwnd_prev and ssthresh_prev provided in Sections 2.1.3 and 2.2.3.

   The rule applies to the case when the third DUPACK arrives after the
   first timeout for the same oldest outstanding sequence number has
   already occurred. In that case, the sender SHOULD suppress the fast
   retransmit and SHOULD restore both cwnd and ssthresh with one half
   the value of cwnd_prev. I.e., the sender restores cwnd and ssthresh
   as if the timeout had not occurred, but instead goes into congestion
   avoidance [RFC2581].

5. Security Considerations

   There is a considerable risk when implementing the Eifel algorithm in
   a naive fashion. This is because a misbehaving receiver can severely
   upset congestion control at the sender.
   The risk is that the Eifel
   algorithm is (falsely) triggered in response to a genuine packet
   loss. In that case the Eifel algorithm would mistake a genuine
   retransmit for a spurious retransmit. As a consequence the sender
   would effectively not reduce its congestion window in response to the
   lost packet. However, there are reliable sender-side mechanisms to
   protect against this case as outlined below.

   One needs to distinguish between broken receivers that misbehave due
   to an implementation mistake versus malicious receivers that
   deliberately misbehave. We first describe two mechanisms to protect
   against broken receivers, followed by a different mechanism to
   protect against malicious receivers. We only recommend that
   protection against broken receivers SHOULD be implemented at the
   sender. This is motivated by the fact that the current TCP implicitly
   assumes a trust relationship between sender and receiver, i.e., it
   can be assumed that receivers are not malicious. Note that even
   without the Eifel algorithm, there are ways a misbehaving receiver
   can upset congestion control at the sender [SCWA99].

   We do not discuss problems with respect to misbehaving senders,
   assuming that the implementation of the sender-side Eifel algorithm
   complies with the specifications in this text.

   Protection Against Broken Receivers

   There is no risk of falsely triggering the timestamp-based Eifel
   algorithm as long as the receiver correctly implements the timestamp
   option [RFC1323]. However, the flag-based Eifel algorithm can be
   falsely triggered when the receiver has agreed to set the Xmit-Echo
   flag in ACKs for retransmits but then "forgets" to do so. The ACK for
   a genuine retransmit would then falsely trigger the Eifel algorithm
   once it arrives at the sender. To protect against this case the
   sender SHOULD implement the following mechanism if it uses the
   flag-based Eifel algorithm.

   After the sender has detected a spurious retransmit and in response
   restores cwnd and ssthresh with cwnd_prev and ssthresh_prev,
   respectively (see Section 3), it also saves the former values of cwnd
   and ssthresh in cwnd_prev and ssthresh_prev, respectively (e.g., help
   = cwnd; cwnd = cwnd_prev; cwnd_prev = help;). Until an ACK arrives at
   the sender that acks beyond sent_high, the sender checks for a valid
   ACK that arrives with the Xmit-Echo flag set. If such an ACK arrives,
   the sender assumes that the Eifel algorithm was rightfully triggered
   and does nothing further. Otherwise, when an ACK arrives at the
   sender that acks beyond sent_high, the sender assumes that the Eifel
   algorithm was falsely triggered and reverses the effects of the Eifel
   algorithm. I.e., cwnd is (re-)restored to cwnd_prev and ssthresh is
   (re-)restored to ssthresh_prev. Since ACKs with the Xmit-Echo flag
   set can also get lost, it would be too conservative to completely
   disable the Eifel algorithm for the rest of the connection in this
   situation.

   Another risk stems from receivers that set the Xmit-Echo flag for
   segments that have not been retransmitted. A TCP sender SHOULD
   implement an appropriate detection mechanism, since in this case, it
   cannot reliably use the flag-based Eifel algorithm. If a sender
   detects such misbehavior, it SHOULD disable the flag-based Eifel
   algorithm for the rest of the connection.

   Protection Against Malicious Receivers

   There are a number of ways in which a malicious receiver could
   falsely trigger the Eifel algorithm at the sender. A sender is
   particularly vulnerable to this if it operates the flag-based Eifel
   algorithm. Hence, the sender MAY choose to only use the
   timestamp-based Eifel algorithm, and in addition implement the
   following mechanism.
   The
   mechanism combines the idea of a "singular nonce" proposed in
   [SCWA99] with the timestamp option specified in [RFC1323].

   The mechanism is based on the observation that after a spurious
   retransmit the sender will at some point receive the ACK that was
   triggered by the corresponding original transmit, assuming that that
   ACK was not lost. That ACK can be expected to echo the timestamp of
   the original transmit even if the receiver implements the delayed ACK
   algorithm. In case of packet re-ordering, this is implied by the
   rules for generating ACKs for data segments that fill in all or part
   of a gap in the sequence space (see Section 4.2 of [RFC2581]) and by
   the rules for echoing timestamps in that case (see rule (C) in
   Section 3.4 of [RFC1323]). In case of a spurious timeout, it is quite
   likely that the delay that has caused the spurious timeout has also
   caused the receiver's delayed ACK timer [RFC1122] to expire. Hence,
   to protect against a malicious receiver, the sender should only
   trigger the Eifel algorithm in response to the ACK for the original
   transmit and only after it has authenticated that ACK as described
   below.

   To protect against malicious receivers that spoof ACKs, the sender
   MAY implement the following modification to the timestamp option
   specified in [RFC1323]. The sender adds a separate random number to
   each timestamp to be included in an outgoing segment, and writes the
   result into the Timestamp Value field in the timestamp option field.
   A new random number is generated per RTT to prevent the receiver from
   "learning" the random number. That random number should be carefully
   chosen to avoid bad interactions with the PAWS mechanism specified in
   [RFC1323].
   Clearly, the random number(s) need to be accounted for in
   the RTTM mechanism specified in [RFC1323], i.e., they need to be
   subtracted from the echoed timestamp before the RTT can be
   calculated. For this to work, the sender needs to store all
   "outstanding timestamps" and the corresponding random numbers. With
   this modification, the sender only identifies a retransmit as
   spurious if the ACK for the original transmit echoes the "random
   timestamp" that was sent. Thus, assuming a receiver has no easy
   access to the mentioned random numbers, this should provide
   fairly secure protection against malicious receivers that spoof the
   "right" ACK that would trigger the Eifel algorithm.

Acknowledgments

   Many thanks to Keith Sklower for helping to develop the tools that
   allowed the study of spurious timeouts and packet re-orderings. Many
   thanks to Randy Katz, Michael Meyer, Stephan Baucke, Sally Floyd, and
   Vern Paxson for discussions around the Eifel algorithm.

References

   [RFC2581] M. Allman, V. Paxson, W. Stevens, TCP Congestion Control,
             RFC 2581, April 1999.

   [BPS99]   J.C.R. Bennett, C. Partridge, N. Shectman, Packet
             Reordering is Not Pathological Network Behavior, IEEE/ACM
             Transactions on Networking, December 1999.

   [RFC1122] R. Braden, Requirements for Internet Hosts - Communication
             Layers, RFC 1122, October 1989.

   [RFC2119] S. Bradner, Key words for use in RFCs to Indicate
             Requirement Levels, RFC 2119, March 1997.

   [RFC2507] M. Degermark, B. Nordgren, S. Pink, IP Header Compression,
             RFC 2507, February 1999.

   [RFC2582] S. Floyd, T. Henderson, The NewReno Modification to TCP's
             Fast Recovery Algorithm, RFC 2582, April 1999.

   [RFC2883] S. Floyd, J. Mahdavi, M. Mathis, M. Podolsky, A. Romanow,
             An Extension to the Selective Acknowledgement (SACK) Option
             for TCP, RFC 2883, July 2000.

   [ISO8073] ISO/IEC, Information processing systems - Open Systems
             Interconnection - Connection oriented transport protocol
             specification, International Standard ISO/IEC 8073,
             December 1988.

   [RFC1323] V. Jacobson, R. Braden, D. Borman, TCP Extensions for High
             Performance, RFC 1323, May 1992.

   [RFC1144] V. Jacobson, Compressing TCP/IP Headers for Low-Speed
             Serial Links, RFC 1144, February 1990.

   [KP87]    P. Karn, C. Partridge, Improving Round-Trip Time Estimates
             in Reliable Transport Protocols, In Proceedings of ACM
             SIGCOMM 87.

   [LK00]    R. Ludwig, R. H. Katz, The Eifel Algorithm: Making TCP
             Robust Against Spurious Retransmissions, ACM Computer
             Communication Review, Vol. 30, No. 1, January 2000,
             available at http://www.acm.org/sigcomm/ccr/archive/2000/
             jan00/ccr-200001-ludwig.html (easier studied when
             viewed/printed in color).

   [LS00]    R. Ludwig, K. Sklower, The Eifel Retransmission Timer, ACM
             Computer Communication Review, Vol. 30, No. 3, July 2000.

   [Pax97]   V. Paxson, End-to-End Routing Behavior in the Internet,
             IEEE/ACM Transactions on Networking, Vol. 5, No. 5, October
             1997.

   [RFC791]  J. Postel, Internet Protocol, RFC 791, September 1981.

   [RFC793]  J. Postel, Transmission Control Protocol, RFC 793,
             September 1981.

   [RFC2481] K. Ramakrishnan, S. Floyd, A Proposal to add Explicit
             Congestion Notification (ECN) to IP, RFC 2481, January
             1999.

   [SCWA99]  S. Savage, N. Cardwell, D. Wetherall, T. Anderson, TCP
             Congestion Control with a Misbehaving Receiver, ACM
             Computer Communication Review, Vol. 29, No. 5, October
             1999.

Author's Address

   Reiner Ludwig
   Ericsson Research (EED)
   Ericsson Allee 1
   52134 Herzogenrath, Germany
   Phone: +49 2407 575 719
   Fax: +49 2407 575 400
   Reiner.Ludwig@Ericsson.com

This Internet-Draft expires in May 2001.