idnits 2.17.1 draft-kojo-tcpm-frto-eval-00.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** It looks like you're using RFC 3978 boilerplate. You should update this to the boilerplate described in the IETF Trust License Policy document (see https://trustee.ietf.org/license-info), which is required now. -- Found old boilerplate from RFC 3978, Section 5.1 on line 18. -- Found old boilerplate from RFC 3978, Section 5.5, updated by RFC 4748 on line 723. -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 734. -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 741. -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 747. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- == No 'Intended status' indicated for this document; assuming Proposed Standard Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack a Security Considerations section. ** The document seems to lack an IANA Considerations section. (See Section 2.2 of https://www.ietf.org/id-info/checklist for how to handle the case when there are no actions for IANA.) ** The document seems to lack separate sections for Informative/Normative References. All references will be assumed normative when checking for downward references. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the IETF Trust Copyright Line does not match the current year == The document doesn't use any RFC 2119 keywords, yet seems to have RFC 2119 boilerplate text. 
-- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (5 June 2007) is 6170 days in the past. Is this intentional? Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Unused Reference: 'RFC793' is defined on line 645, but no explicit reference was found in the text -- Possible downref: Non-RFC (?) normative reference: ref. 'BBA06' -- Possible downref: Non-RFC (?) normative reference: ref. 'DK06' -- Possible downref: Non-RFC (?) normative reference: ref. 'Jac88' -- Possible downref: Non-RFC (?) normative reference: ref. 'HC05' -- Possible downref: Non-RFC (?) normative reference: ref. 'Hok05' -- Possible downref: Non-RFC (?) normative reference: ref. 'KP87' ** Downref: Normative reference to an Informational RFC: RFC 3753 (ref. 'MK04') ** Obsolete normative reference: RFC 793 (Obsoleted by RFC 9293) ** Downref: Normative reference to an Experimental RFC: RFC 3522 ** Downref: Normative reference to an Experimental RFC: RFC 3708 ** Downref: Normative reference to an Experimental RFC: RFC 4138 -- Possible downref: Non-RFC (?) normative reference: ref. 'SKR03' -- Possible downref: Non-RFC (?) normative reference: ref. 'Sar03' -- Possible downref: Non-RFC (?) normative reference: ref. 'Yam05' -- Possible downref: Non-RFC (?) normative reference: ref. 'Zh86' Summary: 9 errors (**), 0 flaws (~~), 4 warnings (==), 17 comments (--). 
Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 1 Internet Engineering Task Force M. Kojo 2 INTERNET-DRAFT University of Helsinki 3 draft-kojo-tcpm-frto-eval-00.txt K. Yamamoto 4 Expires: December 2007 M. Hata 5 NTT Docomo 6 P. Sarolahti 7 Nokia Research Center 9 5 June 2007 11 Evaluation of RFC 4138 13 Status of this Memo 15 By submitting this Internet-Draft, each author represents that any 16 applicable patent or other IPR claims of which he or she is aware 17 have been or will be disclosed, and any of which he or she becomes 18 aware will be disclosed, in accordance with Section 6 of BCP 79. 20 Internet-Drafts are working documents of the Internet Engineering 21 Task Force (IETF), its areas, and its working groups. Note that 22 other groups may also distribute working documents as Internet- 23 Drafts. 25 Internet-Drafts are draft documents valid for a maximum of six 26 months and may be updated, replaced, or obsoleted by other documents 27 at any time. It is inappropriate to use Internet-Drafts as 28 reference material or to cite them other than as "work in progress." 30 The list of current Internet-Drafts can be accessed at 31 http://www.ietf.org/ietf/1id-abstracts.txt. 33 The list of Internet-Draft Shadow Directories can be accessed at 34 http://www.ietf.org/shadow.html. 36 This Internet-Draft will expire on December 2007. 38 Abstract 40 Forward-RTO recovery (F-RTO) specified in RFC 4138 is an algorithm 41 for detecting a spurious retransmission timeout with TCP and SCTP. 42 This document describes the advantages of F-RTO and summarizes the 43 experience in its implementations and the experiments conducted with 44 it. 
By analyzing the implications of the spurious retransmission 45 timeouts on the regular RTO recovery and Forward-RTO recovery 46 algorithm, including a detailed corner case analysis, it shows that 47 F-RTO does not have negative impact on the network when used with an 48 appropriate response algorithm even in the rare cases where F-RTO 49 falsely declares a retransmission timeout spurious. It concludes 50 with a recommendation that F-RTO is to be advanced to the standards 51 track. 53 Table of Contents 55 1. Introduction. . . . . . . . . . . . . . . . . . . . . . . . . 3 56 1.1. Conventions and Terminology. . . . . . . . . . . . . . . 4 57 2. Problems with the Regular RTO Recovery. . . . . . . . . . . . 4 58 2.1. Unnecessary Retransmissions. . . . . . . . . . . . . . . 4 59 2.2. Dishonoring the Packet Conservation Princi- 60 ple . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 61 2.3. Unnecessary Reduction of the Congestion Win- 62 dow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 63 2.4. Other Problems . . . . . . . . . . . . . . . . . . . . . 6 64 3. Advantages and Motivation . . . . . . . . . . . . . . . . . . 6 65 3.1. Avoiding Unnecessary Retransmissions . . . . . . . . . . 6 66 3.2. Adhering to the Packet Conservation Princi- 67 ple . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 68 3.3. Selecting an Appropriate Congestion Control 69 Response. . . . . . . . . . . . . . . . . . . . . . . . . . . 7 70 3.4. Other Advantages . . . . . . . . . . . . . . . . . . . . 8 71 3.5. Non-spurious RTOs and Undetected Spurious 72 RTOs. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8 73 4. Experimental Results. . . . . . . . . . . . . . . . . . . . . 8 74 4.1. Initial trials in an emulated network. . . . . . . . . . 9 75 4.2. F-RTO Performance over Commercial W-CDMA 76 Networks. . . . . . . . . . . . . . . . . . . . . . . . . . . 9 77 5. Hidden Packet Losses. . . . . . . . . . . . . . . . . . . . . 11 78 5.1. 
Loss of Retransmitted Segments . . . . . . . . . . . . . 11 79 5.2. Reordering . . . . . . . . . . . . . . . . . . . . . . . 11 80 5.3. Malicious Receiver . . . . . . . . . . . . . . . . . . . 12 81 6. Conclusions and Recommendations . . . . . . . . . . . . . . . 13 82 References . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 83 AUTHORS' ADDRESSES . . . . . . . . . . . . . . . . . . . . . . . 16 84 Full Copyright Statement . . . . . . . . . . . . . . . . . . . . 18 85 Intellectual Property. . . . . . . . . . . . . . . . . . . . . . 18 87 1. Introduction 89 A temporary delay spike or a more permanent but sudden delay 90 increase in the TCP data or ACK path may result in a spurious 91 retransmission timeout (RTO) that triggers a premature 92 retransmission of the first unacknowledged data segment followed by 93 an unnecessary loss recovery in slow-start. This creates severe 94 problems with the regular RTO recovery algorithm as the late 95 acknowledgments of the original segments trigger unnecessary 96 retransmissions at a high rate. This introduces useless load into 97 the network in the form of a (large) packet burst. In addition, the 98 TCP sender will reduce its transmission rate quite unnecessarily 99 because the congestion control algorithms are falsely triggered, 100 resulting in decreased TCP performance. 102 When a spurious RTO occurs, a TCP sender employing the Forward RTO- 103 Recovery (F-RTO) algorithm [RFC4138] is able to avoid the problems 104 encountered with the regular RTO recovery by detecting that the TCP 105 retransmission timer expired spuriously and by avoiding additional 106 unnecessary retransmissions. In addition, the F-RTO sender may elude 107 the unnecessary performance degradation by restoring the congestion 108 control state and/or reduce the risk of falsely triggering TCP's 109 loss recovery and congestion control again in the later phases of 110 the connection by adapting the RTT estimators. 
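The detection idea can be illustrated with a minimal sketch, assuming the simplified two-ACK decision that Section 3.5 of this document describes. The names frto_verdict and Verdict are hypothetical and are not taken from RFC 4138. The sketch deliberately ignores further conditions a real sender has to check, such as whether new data and receiver window are available.

```python
from enum import Enum


class Verdict(Enum):
    GENUINE = "genuine"      # revert to conventional RTO recovery
    SPURIOUS = "spurious"    # timeout declared spurious
    UNDECIDED = "undecided"  # not enough ACKs observed yet


def frto_verdict(acks_advance):
    """Classify an RTO from the first two ACKs that arrive after it.

    acks_advance is a list of booleans in arrival order: True if the ACK
    acknowledges new data, False if it is a duplicate ACK.
    This is an illustrative simplification, not the full algorithm.
    """
    # First ACK is a duplicate: the RTO is treated as genuine.
    if len(acks_advance) >= 1 and not acks_advance[0]:
        return Verdict.GENUINE
    # First ACK advanced the window (two new segments would be sent);
    # the second ACK then decides the outcome.
    if len(acks_advance) >= 2:
        return Verdict.SPURIOUS if acks_advance[1] else Verdict.GENUINE
    return Verdict.UNDECIDED
```

For example, frto_verdict([True, True]) corresponds to two late acknowledgements of original segments, the pattern that a delay spike typically produces.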
112 This document discusses the problems with the regular TCP RTO 113 recovery when spurious RTOs are encountered and evaluates the F-RTO 114 algorithm as a standards track alternative for the regular RTO 115 recovery. 117 1.1. Conventions and Terminology 119 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 120 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 121 document are to be interpreted as described in [RFC2119]. 123 2. Problems with the Regular RTO Recovery 125 When a spurious retransmission timeout occurs, the regular RTO 126 recovery is incapable of avoiding unnecessary retransmissions and 127 also fails in adhering to the packet conservation principle [Jac88] 128 by injecting the unnecessary retransmissions into the network at a 129 rate that is higher than the rate at which packets are leaving the 130 network. In addition, after bursting the unnecessary retransmissions 131 into the network the transmission is continued at an unnecessarily 132 low rate. 134 2.1. Unnecessary Retransmissions 136 After the first unacknowledged segment triggered by a spurious 137 retransmission timeout has been transmitted, the late 138 acknowledgments of the original segments arrive at the sender and 139 trigger further unnecessary retransmissions in slow start. The first 140 late ACK triggers unnecessary retransmission of the next segment(s) 141 which, in turn, will get acknowledged by the next late ACK quite 142 immediately, injecting the next segments from the retransmission 143 queue into the network. Assuming that none of the original segments 144 and none of the corresponding late ACKs were lost, this chain 145 reaction results in unnecessary retransmission of all outstanding 146 segments. 148 Depending on the flight size at the time when the spurious RTO 149 occurred, a large number of unnecessary retransmissions may get 150 injected into the network as useless load. 
This often creates a 151 significant problem in network environments where sudden delay 152 spikes tend to appear because such networks often offer a low or 153 moderate transmission capacity and any excess load significantly 154 reduces the available network capacity for delivering useful 155 traffic. In addition, unnecessary transmissions waste the battery 156 power of wireless devices and introduce extra costs when the access 157 link usage is billed per transmission volume. 159 2.2. Dishonoring the Packet Conservation Principle 161 The purpose of the TCP slow-start algorithm is to (re)start the ACK 162 clock and probe the network for available capacity. However, when 163 the retransmission timeout expires spuriously the TCP sender fails 164 to restart the ACK clock at a correct rate and is not able to probe 165 the available network capacity correctly. This is because the 166 acknowledgements acknowledging the retransmissions do not arrive one 167 round-trip time (RTT) after the retransmission as supposed. Instead, 168 the retransmissions are acknowledged by the late acknowledgements 169 that arrive at the line rate of the bottleneck link on the end-to- 170 end path or, in some cases, at much higher rate. Therefore, the late 171 acknowledgements clock out the unnecessary retransmissions within 172 one RTT using slow start and potentially 50 percent more data 173 segments are transmitted to the network in one RTT than what the TCP 174 sender in steady state would have transmitted if the spurious RTO 175 had not occurred. This violates the packet conservation principle 176 [Jac88]. 178 Assuming no packets are lost and the delayed ACKs are in use, each 179 late acknowledgement (except the first one) arriving after a 180 spurious RTO triggers three unnecessary retransmissions into the 181 network until all segments in the retransmission queue have been 182 retransmitted. 
After this, the TCP sender continues by transmitting 183 three new segments on each late acknowledgement. The number of the 184 new segments triggered by the late acknowledgements equals 185 half of the flight size at the time when the spurious RTO occurred. This 186 injects 50 percent more segments into the network within one RTT 187 compared to the number of segments injected in one RTT by a TCP 188 sender in steady state, i.e., three segments per ACK instead of two 189 segments per ACK. 191 Assuming no packets are lost and the delayed ACKs are not in use, 192 each late acknowledgement corresponding to the first half of the 193 original flight triggers two unnecessary retransmissions into the 194 network. The late acknowledgements belonging to the second half of the 195 original flight trigger a transmission of one new segment each. This 196 means that during the first half of the RTT, the sending rate is 197 doubled compared to the rate at which a TCP sender in steady state 198 would have transmitted if the spurious RTO had not occurred. 200 The new data segments transmitted after the unnecessary 201 retransmissions during the same RTT are likely to experience 202 congestion as the preceding unnecessary retransmission of the whole 203 window of segments is likely to occupy the bottleneck link queue. 204 This may result in a serious performance penalty as the TCP sender 205 is often forced to wait for a backed-off retransmission timer to 206 expire in order to recover the lost segments. A figure depicting an 207 example of such a behavior is available at 208 [http://www.iki.fi/pasi.sarolahti/frto/]. 210 2.3. Unnecessary Reduction of the Congestion Window 212 When a spurious RTO occurs, the TCP sender enters loss recovery and 213 reduces the congestion window and slow-start threshold. If the RTO 214 was spurious, the reduction is likely to be unnecessary and results 215 in sacrificed TCP performance. 
The impact of this unnecessary 216 congestion control action is particularly notable in high latency 217 environments where restoring the previous congestion window takes a 218 long time. 220 2.4. Other Problems 222 Updating the RTO estimate on retransmitted segments is not possible 223 due to the retransmission ambiguity problem [Zh86, KP87]. Therefore, 224 the RTO estimate is not updated for segments that experience the 225 unusually long delay and cause the spurious RTOs. This means that 226 the delayed segments are ignored in updating the RTO estimate and, 227 in the worst case, the temporary delay spikes are never reflected in 228 the RTO estimate, allowing a later delay spike to trigger a new 229 spurious RTO as easily as the previous spurious RTO was triggered. 231 3. Advantages and Motivation 233 3.1. Avoiding Unnecessary Retransmissions 235 If the TCP sender employs F-RTO, it is able to detect spurious RTOs. 236 When F-RTO detects a spurious RTO, it retransmits only one segment 237 unnecessarily (the first unacknowledged segment) and continues by 238 transmitting new segments. 240 3.2. Adhering to the Packet Conservation Principle 242 If the TCP sender employs F-RTO, it is able to detect spurious RTOs 243 and avoid the unnecessary retransmission of the whole window of 244 data. The amount of data that the TCP sender employing F-RTO 245 transmits during the next RTT after detecting the spurious RTO 246 depends on the congestion control response that the TCP sender 247 follows. Whichever response algorithm is selected, the segments 248 clocked out by the late acknowledgements must not be transmitted in 249 slow start, unless the TCP sender was in slow start right before the 250 spurious RTO occurred and the RTO recovery was entered. Otherwise, 251 the late acknowledgements would clock out the segments at a higher 252 than acceptable rate as discussed in Section 2.2 and the TCP sender 253 would not adhere to the packet conservation principle. 255 3.3. 
Selecting an Appropriate Congestion Control Response 257 If the F-RTO algorithm detects that the RTO was spurious, the TCP 258 sender may revert the congestion control state back to the same 259 state as it was right before the RTO occurred. One possible option 260 is to restore the congestion window and slow-start threshold [LG04]. 261 This would result in transmitting at the same rate as before the 262 RTO, avoiding the performance penalty of unnecessarily reducing the 263 congestion window and slow-start threshold. However, reverting the 264 congestion control parameters might not be a safe response on all 265 occasions. For example, a spurious RTO may occur due to a make- 266 before-break vertical handover [MK04] from a low latency path to a 267 high latency path [HC05, DK06]. If the handover results in a 268 spurious RTO and the bottleneck link bandwidth-delay product with 269 the new path after the handover is smaller than with the old path, 270 restoring the congestion window is likely to result in congestion on 271 the new bottleneck link. 273 The TCP sender may choose to take a conservative congestion control 274 response after detecting a spurious RTO. The original F-RTO 275 algorithm employed a conservative response algorithm after a 276 spurious RTO was detected [SKR03]. That is, the TCP sender sets the 277 congestion window and slow-start threshold to a value that equals 278 half of the flight size right before the spurious RTO occurred 279 and continues transmitting new data in congestion avoidance. This 280 approach is always a safe response as the TCP sender halves its 281 transmission rate, thereby taking the spurious RTO as a congestion 282 signal. 284 3.4. Other Advantages 286 If the F-RTO algorithm declares that an RTO was spuriously 287 triggered, it may take the RTT for the delayed segments into account 288 when calculating the RTO estimate, except for the segment that was 289 retransmitted upon the retransmission timer expiration. 
This alone 290 may help in avoiding further spurious RTOs. However, with the 291 capability of detecting spurious RTOs the TCP sender may adjust the 292 RTO estimate explicitly in order to avoid entering loss recovery 293 unnecessarily in the later phases of the connection [BBA06]. 295 3.5. Non-spurious RTOs and Undetected Spurious RTOs 297 If the retransmission timeout is not spurious or the F-RTO algorithm 298 is not able to detect the spurious timeout, it reverts to the 299 conventional RTO recovery and continues retransmitting segments in 300 slow-start. Two different cases with slightly different behavior can 301 be observed: (i) if the first ACK arriving after the retransmission 302 timer expired is a duplicate acknowledgement, the F-RTO sender 303 declares the RTO genuine and reverts to the conventional RTO 304 recovery. (ii) if the first ACK arriving after the retransmission 305 timer expired acknowledges new data, the F-RTO sender sends two 306 previously unsent segments. Now, if the next ACK is a duplicate ACK, 307 the F-RTO sender declares the RTO genuine and reverts to the 308 conventional RTO recovery. 310 In the first case, the behavior is identical to the behavior of the 311 conventional RTO recovery. In the second case, the behavior is 312 similar to the conventional RTO recovery with the only difference 313 that the 2nd and 3rd segment sent after the RTO are new segments. 314 When compared to a regular TCP implementation, the use of the F-RTO 315 algorithm does not change the transmission rate of segments in the 316 cases where the RTO is not declared spurious. Therefore, from the 317 congestion control point of view the F-RTO algorithm can be seen to 318 be safe also in these cases. 320 4. Experimental Results 322 Additional material, such as the cited papers with the F-RTO 323 experimentation results, can be found at 324 [http://www.iki.fi/pasi.sarolahti/frto/]. 326 4.1. 
Initial trials in an emulated network 328 The basic F-RTO algorithm was first introduced in [SKR03]. F-RTO 329 performance was evaluated in a simple emulated network 330 environment with a slow bottleneck link that was typical of the 331 wireless environments at the time of conducting the analysis. The 332 original conservative F-RTO congestion control response (see Section 333 3.3) was used. The paper analyzed different scenarios that trigger a 334 retransmission timeout either spuriously or genuinely. The following 335 scenarios were investigated: i) a delay spike that triggers a spurious 336 timeout, ii) a lost retransmission, iii) a loss burst of an entire window 337 of data. The paper also discussed a packet reordering scenario, 338 although experiments with packet reordering were not conducted. In 339 the delay spike scenario F-RTO significantly reduced the number of 340 unnecessary retransmissions and also improved the data throughput. 341 In the other cases use of F-RTO did not affect TCP performance 342 negatively, and did not cause any additional traffic to be sent into 343 the network. 345 SACK-enhanced F-RTO with different congestion control response 346 algorithms was evaluated in [Sar03]. Because the cases where 347 duplicate ACKs interact with spurious timeout detection were 348 relatively rare in practice, in many cases the basic F-RTO performed 349 as well as the SACK-enhanced F-RTO. 351 4.2. F-RTO Performance over Commercial W-CDMA Networks 353 Experimentation on F-RTO performance over commercial W-CDMA networks 354 and in a test environment which emulates HSDPA (High Speed Downlink 355 Packet Access) networks has been reported in [Yam05, Hok05]. 357 In the experimentation over the commercial networks, we downloaded 358 data objects from a test server in the Internet through the W-CDMA 359 mobile communications networks. The F-RTO detection algorithm with 360 the Eifel response algorithm was implemented on the test server, an HP- 361 UX 11i prototype. 
The commercial W-CDMA networks provide a maximum 362 bearer rate of 384 kbps in a downlink and 64 kbps in an uplink. 364 A mobile client downloaded data objects of varying size from the 365 server in five different situations; fixed point (good and bad 366 wireless conditions), low speed (pedestrian), medium speed (driving 367 by car in an urban area), and high speed (a bullet train). The 368 object sizes were set to 6 Kbytes, 18 Kbytes, 300 Kbytes, 2 Mbytes, 369 and 518 Mbytes (the object size of 518 Mbytes was used only for a 370 bullet train). The experimentation took two weeks collecting 371 performance data for 643 connections with F-RTO and 991 connections 372 without F-RTO. 374 In this experimentation, F-RTO reduced the amount of unnecessarily 375 retransmitted data by 82 percent compared to the connections without 376 F-RTO. Because the spurious RTOs did not occur very often, a 377 relatively large amount of data was sent in total compared to the 378 amount of unnecessarily retransmitted data. Therefore, just avoiding 379 the unnecessary retransmissions did not improve TCP performance 380 significantly. Throughput was improved primarily because F-RTO was 381 used with the Eifel response algorithm that fully reverts the 382 congestion control state to the state valid prior to the spurious 383 retransmission timeout. F-RTO with the Eifel response improved 384 throughput by 6 percent for connections that transferred at least 2 385 Mbytes and experienced spurious timeout. The network used for the 386 experimentation has a small bandwidth-delay product around ten 387 segments. Larger improvement in throughput is expected in networks 388 with high bandwidth-delay product such as HSDPA networks. 
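The bandwidth-delay products quoted in this section can be checked with a short calculation. The following is a minimal sketch; the function name bdp_segments is hypothetical. The 384 kbps rate is the W-CDMA downlink bearer rate given above, whereas the 300 ms round-trip time used with it is an assumption for illustration only, since this document does not state the RTT of the commercial network. The 14 Mbps, 300 ms and 1460-byte figures are those of the emulated HSDPA environment described later in this section.

```python
def bdp_segments(rate_bps, rtt_s, mss_bytes=1460):
    """Bandwidth-delay product expressed in full-sized segments."""
    return rate_bps * rtt_s / 8 / mss_bytes


# W-CDMA downlink with an ASSUMED 300 ms RTT (RTT not given in the text).
wcdma = bdp_segments(384_000, 0.3)

# Emulated HSDPA environment: 14 Mbps bearer rate, 300 ms RTT.
hsdpa = bdp_segments(14_000_000, 0.3)

print(f"W-CDMA: {wcdma:.1f} segments, HSDPA: {hsdpa:.1f} segments")
```

Under these assumptions the W-CDMA value comes out close to ten segments and the HSDPA value is of the same order as the figure of about 350 segments reported for the test environment.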
390 There are a few situations in which F-RTO cannot detect a spurious 391 timeout such as severe reordering or duplication occurring on the 392 segment that triggered the spurious timeout, if the sender has no 393 new data to send, or the advertised window does not allow sending 394 the new data that is needed by F-RTO to detect the spurious timeout. In 395 the experimentation, F-RTO was able to detect 71 percent of spurious 396 timeouts successfully. 28 percent of the spurious timeouts could not 397 be detected by F-RTO because the sender had already sent the FIN 398 segment and had no new data to send when the spurious timeout 399 occurred. 0.7 percent of the spurious timeouts could not be detected 400 because the advertised window prohibited transmitting new data and 401 0.3 percent because the sender received duplicate acknowledgements 402 after the spurious timeout. 404 Throughput of F-RTO with Eifel response was also evaluated in the 405 test environment that emulated HSDPA networks. The test network has 406 14 Mbps bearer rate and 300 ms round-trip time, which yields the 407 bandwidth-delay product of about 350 segments with the segment size 408 of 1460 bytes. To trigger a spurious timeout, acknowledgements from 409 the server to the client were delayed intentionally in the early and 410 middle phase of the initial slow start and after the congestion 411 window reached the maximum window size (i.e., in the steady state). 412 F-RTO with Eifel response improved the throughput by 262 percent, 92 413 percent, and 37 percent when a spurious timeout occurred in the 414 early slow-start phase, in the middle of the slow start phase, and 415 after reaching the steady state, respectively. The results show that F-RTO 416 with Eifel response improves the throughput significantly in the 417 networks with large bandwidth-delay product. 419 5. 
Hidden Packet Losses 421 There are a few known scenarios where a packet loss could escape F- 422 RTO's notice and cause a false positive detection. These scenarios 423 could be split into two cases: scenarios with a legitimate receiver 424 where TCP communication is unaffected, and scenarios with a 425 misbehaving receiver. In the first case the hidden packet loss is 426 harmless, if the congestion control response to spurious timeout is 427 conservative enough. The second case requires receiver misbehavior 428 by acknowledging segments that have not been received or by delaying 429 the acknowledgements, and it is not beneficial to the receiver 430 because, as a result, the TCP connection may become unreliable or 431 useless, or the malicious receiver may compromise the performance of 432 the TCP loss recovery in order to mislead the F-RTO sender. We also 433 note that optimistically acknowledging segments that have not yet 434 been received is possible with any regular TCP implementation, and 435 if the receiver's motivation was to damage the TCP connection (for 436 example, as a part of some kind of denial-of-service effort), the 437 standard TCP offers easier ways of doing that. Next we will discuss 438 these cases in more detail. 440 5.1. Loss of Retransmitted Segments 442 RFC 4138 notes that when the timeout is declared spurious, the TCP 443 sender cannot detect whether the unnecessary RTO retransmission was 444 lost. In principle, the loss of the RTO retransmission should be 445 taken as a congestion signal. Thus, there is a small possibility 446 that the F-RTO sender will violate the congestion control rules, if 447 it chooses to fully revert congestion control parameters after 448 detecting a spurious timeout. The Eifel detection algorithm 449 [RFC3522] has a similar property, while the DSACK option can be used 450 to detect whether the retransmitted segment was successfully 451 delivered to the receiver [RFC3708]. 
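The difference between fully reverting the congestion control parameters and the conservative response of Section 3.3 can be made concrete with a minimal sketch. The function name response_after_spurious_rto and its parameters are hypothetical, all values are counted in segments, and the floor of two segments is an assumption borrowed from standard TCP practice rather than something stated in this document.

```python
def response_after_spurious_rto(flight_size, prev_cwnd, prev_ssthresh,
                                conservative=True):
    """Return (cwnd, ssthresh) in segments once an RTO is declared spurious."""
    if conservative:
        # Section 3.3: set both values to half of the flight size that was
        # outstanding right before the spurious RTO.
        # ASSUMPTION: never go below two segments.
        half = max(flight_size // 2, 2)
        return half, half
    # Full revert: restore the values saved before the RTO occurred.
    return prev_cwnd, prev_ssthresh
```

With a flight size of 20 segments, the conservative option yields a window of 10 segments, so a hidden loss of the RTO retransmission is still answered with the halving that a congestion signal would call for.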
453 This behavior belongs to the first class of the above-mentioned 454 cases; the loss of the RTO retransmission does not harm the TCP 455 connection in any way, because the original segment has reached the 456 receiver. With a conservative enough congestion control response 457 this behavior is not harmful to the network, either. 459 5.2. Reordering 461 F-RTO can declare a timeout spurious unintentionally when there is 462 reordering between the retransmitted segment and the original 463 segments transmitted before the timeout so that the RTO 464 retransmission is acknowledged before the full window of original 465 transmissions. This could happen, for example, in a case when an 466 original segment is lost on a high-latency connection path, and the 467 RTO retransmission of that segment traverses through a different 468 path that has substantially lower round-trip delay. This might sound 469 a pathological scenario, but could occur on a multi-radio device 470 that is performing a vertical handover between a high-latency WWAN 471 link and a low-latency WLAN link. 473 Also this scenario belongs to the first class of the above-mentioned 474 cases as it is not harmful to the network, if the congestion control 475 response to spurious retransmission timeout is conservative enough. 476 We believe that in practice this kind of combination of loss, delay 477 and reordering is very rare. In addition, this kind of a reordering 478 is less likely to occur with large TCP windows with which the effect 479 of a non-conservative response could be detectable. 481 5.3. Malicious Receiver 483 RFC 4138 notes in its security considerations that with F-RTO the 484 receiver could mislead the sender into falsely declaring the RTO 485 spurious. There are two possible ways a malicious receiver could 486 trigger a wrong output from the F-RTO algorithm. 
First, the receiver 487 can acknowledge data that it has not received so that the 488 acknowledgement arrives at the sender after the retransmission 489 timeout. Second, it can delay acknowledgments for segments it has 490 received earlier and acknowledge the outstanding segments after the 491 retransmission timer has expired and the retransmitted segment has 492 arrived, deluding the sender into declaring the timeout spurious. 494 If the TCP receiver acknowledges a segment it has not really 495 received, the sender can be led to declare the timeout spurious in 496 the F-RTO algorithm, step 3. However, by doing so the receiver risks 497 the correct behavior of the connection. If both the original 498 transmission and the retransmission of the segment are dropped, the 499 sender incorrectly thinks that the lost segment has been delivered 500 to the receiver and is not able to retransmit the segment again. As a 501 result, the TCP connection is unable to proceed unless the receiver 502 delivers the data out-of-order to the application, making the data 503 delivery of the connection unreliable. In addition, this requires 504 that the receiver times the false ACK such that the ACK 505 does not arrive at the sender until the retransmission timer has 506 expired and that the receiver suppresses any duplicate ACKs in order 507 to prevent the sender from entering the fast retransmit and fast 508 recovery. Therefore, we believe that this kind of attack is very 509 hard to implement successfully and a malicious receiver is unlikely 510 to get any benefit from this attack, and with an appropriate 511 response this attack is not harmful to the network, either. 

If the TCP receiver delays the acknowledgements of the out-of-order
segments after detecting a hole in the sequence space and waits for
the retransmission timer to expire and the retransmitted segment to
arrive before it acknowledges the segments with cumulative
acknowledgements, it may make the F-RTO sender walk through the
algorithm steps so that the timeout seems spurious when it should
have been genuine. We believe this kind of attack is difficult to
implement in practice, and it is likely to be of no benefit to the
receiver, as it needs to force the sender to wait for an RTO to
recover each of the lost segments, while loss recovery with fast
retransmit and fast recovery is likely to be much more efficient. In
addition, this approach does not work if consecutive segments are
lost unless the receiver acknowledges data that it has not received.
Also, with a conservative response to the spurious timeout, this
attack is of no benefit to the receiver and it is not harmful to the
network, either.

6. Conclusions and Recommendations

This document analyzed the possible benefits and disadvantages of
using F-RTO-enhanced TCP if it were deployed globally. When a
spurious retransmission timeout occurs, the regular RTO recovery
wastes network resources by retransmitting the whole window of
data unnecessarily. By doing so, a regular TCP also violates the
packet conservation principle and is thus harmful for congestion
control, despite following the letter of the specifications. As a
result of a spurious timeout, the regular RTO recovery transmits 1.5
times more data than is allowed by the congestion window at the
time the spurious timeout occurred and imposes excessive load on the
network.

F-RTO is able to avoid these unnecessary retransmissions by
detecting a spurious timeout and not retransmitting segments
unnecessarily.
When the spurious retransmission timeout has been detected by F-RTO,
the F-RTO sender with an appropriate response algorithm adheres to
the packet conservation principle, because it does not transmit more
segments than have left the network. Therefore, a successful
detection of a spurious retransmission timeout with F-RTO can result
both in reduced load on the network and improved TCP throughput.
These factors are especially important in wireless communication.

There is one well-known scenario where a spurious timeout hides a
packet loss with F-RTO: if the RTO retransmission is lost, an F-RTO
sender cannot detect the segment loss. This is common to all
currently known mechanisms for detecting a spurious retransmission
timeout immediately after it has occurred. Missing one packet loss
event is not a problem if the response algorithm is conservative
enough. DSACK, which is able to detect a spurious timeout after a
full window of data has been unnecessarily retransmitted, does not
have this problem, but on the other hand, DSACK is not able to avoid
the unnecessary retransmissions and the consequent violation of the
packet conservation principle.

There are two known ways a misbehaving TCP receiver could cheat the
F-RTO algorithm: (i) after detecting a packet loss, the receiver
could delay acknowledging the following segments until a
retransmission timer expires and the retransmitted segment arrives,
and then acknowledge the outstanding data, making the timeout seem
spurious to F-RTO; (ii) after the F-RTO algorithm has been
triggered, the receiver could optimistically acknowledge segments
that have been lost and make the RTO seem spurious to F-RTO. In the
latter case the penalty to the connection is significant, because
the control data at the TCP sender may go into an invalid state,
causing the TCP connection to be unusable.
Furthermore, even with the regular TCP algorithms the receiver can
acknowledge unreceived segments before they arrive in the hope of
gaining more performance, with the risk of invalidating the TCP
state at the sender and making the connection unusable. If the
response to a spurious retransmission timeout is conservative
enough, a misbehaving receiver cannot cause extensive congestion to
the network in either of the cases.

Given the benefits and disadvantages presented above, we believe
F-RTO [RFC4138] is a safe algorithm to be moved on to Proposed
Standard, and to be deployed globally in the Internet, with the
following notes:

* Because F-RTO performance with SCTP has not been studied to a
  significant extent, we propose that the revised version of RFC
  4138 not include discussion on SCTP. However, the authors would
  like to encourage future experimentation with F-RTO and SCTP,
  applying RFC 4138.

* The research so far indicates that SACK-enhanced F-RTO provides
  only a limited benefit over the basic F-RTO in a small subset of
  spurious timeouts. Also, many of the deployed implementations of
  the F-RTO algorithm implement only the basic F-RTO. Therefore, we
  propose that the revision of RFC 4138 contain only the basic
  F-RTO algorithm.

* While it is useful to keep the spurious timeout detection and
  response specifications separate, the authors would like to enable
  a usage of the F-RTO algorithm that allows detecting a spurious
  timeout without applying any specific response algorithm, i.e.,
  allowing the TCP sender to continue transmitting new data with a
  conservative congestion control response. In other words, after
  detecting a spurious retransmission timeout, the TCP sender would
  take the spurious timeout as a congestion signal and reduce the
  congestion window and slow-start threshold.

References

[BBA06] J. Blanton, E. Blanton, and M. Allman.
Using Spurious Retransmissions to Adapt the Retransmission Timeout.
Internet-Draft "draft-allman-rto-backoff-04.txt", December 2006.
Work in progress.

[DK06] L. Daniel and M. Kojo. "Adapting TCP for Vertical Handoffs in
Wireless Networks". In Proc. 31st IEEE Conference on Local Computer
Networks (LCN), Tampa, FL, USA, November 15-16, 2006.

[Jac88] Jacobson, V., "Congestion Avoidance and Control", In
Proceedings of ACM SIGCOMM 88.

[HC05] H. Huang and J. Cai. Improving TCP Performance during Soft
Vertical Handoff. In Proc. 19th International Conference on Advanced
Information Networking and Applications (AINA'05), volume 2, pages
329-332, Mar. 2005.

[Hok05] A. Hokamura, et al. "Performance Evaluation of F-RTO and
Eifel Response Algorithms over W-CDMA packet network". Wireless
Personal Multimedia Communications (WPMC'05), Sept. 2005.

[KP87] Karn, P. and C. Partridge, "Improving Round-Trip Time
Estimates in Reliable Transport Protocols", In Proceedings of ACM
SIGCOMM 87.

[LG04] Ludwig, R. and A. Gurtov, "The Eifel Response Algorithm for
TCP", RFC 4015, February 2005.

[MK04] J. Manner and M. Kojo, Mobility Related Terminology. RFC 3753,
June 2004.

[RFC793] J. Postel. Transmission Control Protocol. RFC 793,
September 1981.

[RFC2119] S. Bradner. Key words for use in RFCs to Indicate
Requirement Levels. BCP 14, RFC 2119, March 1997.

[RFC3522] R. Ludwig and M. Meyer, The Eifel Detection Algorithm for
TCP. RFC 3522, April 2003.

[RFC3708] E. Blanton and M. Allman, Using TCP Duplicate Selective
Acknowledgement (DSACKs) and Stream Control Transmission Protocol
(SCTP) Duplicate Transmission Sequence Numbers (TSNs) to Detect
Spurious Retransmissions, RFC 3708, February 2004.

[RFC4138] P. Sarolahti and M. Kojo.
Forward RTO-Recovery (F-RTO): An Algorithm for Detecting Spurious
Retransmission Timeouts with TCP and the Stream Control Transmission
Protocol (SCTP), RFC 4138, August 2005.

[SKR03] P. Sarolahti, M. Kojo, and K. Raatikainen. F-RTO: An
Enhanced Recovery Algorithm for TCP Retransmission Timeouts. In ACM
SIGCOMM Computer Communication Review, 33(2), April 2003.

[Sar03] P. Sarolahti. Congestion Control on Spurious TCP
Retransmission Timeouts. In Proceedings of IEEE Globecom 2003, San
Francisco, CA, USA, December 2003.

[Yam05] K. Yamamoto, et al. "Effects of F-RTO and Eifel Response
Algorithms for W-CDMA and HSDPA networks". Wireless Personal
Multimedia Communications (WPMC'05), Sept. 2005.

[Zh86] Zhang, L., "Why TCP Timers Don't Work Well", In Proceedings
of ACM SIGCOMM 86.

AUTHORS' ADDRESSES

Markku Kojo
University of Helsinki
P.O. Box 68
FI-00014 UNIVERSITY OF HELSINKI
Finland
Email: kojo@cs.helsinki.fi

Kazunori Yamamoto
NTT Docomo, Inc.
3-5 Hikarinooka, Yokosuka, Kanagawa, 239-8536, Japan
Phone: +81-46-840-3812
Email: yamamotokaz@nttdocomo.co.jp

Max Hata
NTT Docomo, Inc.
3-5 Hikarinooka, Yokosuka, Kanagawa, 239-8536, Japan
Phone: +81-46-840-3812
Email: hatama@s1.nttdocomo.co.jp

Pasi Sarolahti
Nokia Research Center
P.O. Box 407
FI-00045 NOKIA GROUP
Finland
Phone: +358 50 4876607
Email: pasi.sarolahti@nokia.com

Full Copyright Statement

Copyright (C) The IETF Trust (2007).

This document is subject to the rights, licenses and restrictions
contained in BCP 78, and except as set forth therein, the authors
retain all their rights.

This document and the information contained herein are provided on
an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY, THE
IETF TRUST AND THE INTERNET ENGINEERING TASK FORCE DISCLAIM ALL
WARRANTIES, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO ANY
WARRANTY THAT THE USE OF THE INFORMATION HEREIN WILL NOT INFRINGE
ANY RIGHTS OR ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS
FOR A PARTICULAR PURPOSE.

Intellectual Property

The IETF takes no position regarding the validity or scope of any
Intellectual Property Rights or other rights that might be claimed
to pertain to the implementation or use of the technology described
in this document or the extent to which any license under such
rights might or might not be available; nor does it represent that
it has made any independent effort to identify any such rights.
Information on the procedures with respect to rights in RFC
documents can be found in BCP 78 and BCP 79.

Copies of IPR disclosures made to the IETF Secretariat and any
assurances of licenses to be made available, or the result of an
attempt made to obtain a general license or permission for the use
of such proprietary rights by implementers or users of this
specification can be obtained from the IETF on-line IPR repository
at http://www.ietf.org/ipr.

The IETF invites any interested party to bring to its attention any
copyrights, patents or patent applications, or other proprietary
rights that may cover technology that may be required to implement
this standard. Please address the information to the IETF at
ietf-ipr@ietf.org.