Internet Engineering Task Force                            Ethan Blanton
INTERNET DRAFT                                         Purdue University
File: draft-ietf-tsvwg-dsack-use-02.txt                      Mark Allman
                                                                    ICIR
                                                           October, 2003
                                                    Expires: April, 2004

              Using TCP DSACKs and SCTP Duplicate TSNs
                 to Detect Spurious Retransmissions

Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of [RFC2026].

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

Abstract

   TCP and SCTP provide notification of duplicate segment receipt
   through DSACK and Duplicate TSN notification, respectively. This
   document presents conservative methods of using this information to
   identify unnecessary retransmissions for various applications.

Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in RFC 2119 [RFC2119].

1 Introduction

   TCP [RFC793] and SCTP [RFC2960] provide notification of duplicate
   segment receipt through duplicate selective acknowledgment (DSACK)
   [RFC2883] and Duplicate TSN notifications, respectively. Using this
   information, a TCP or SCTP sender can generally determine when a
   retransmission was sent in error. This document presents two
   methods for using duplicate notifications. The first method is
   simple and can be used for accounting applications. The second
   method is a conservative algorithm to disambiguate unnecessary
   retransmissions from loss events for the purpose of undoing
   unnecessary congestion control changes.

   This document is intended to outline reasonable and safe algorithms
   for detecting spurious retransmissions and discuss some of the
   considerations involved. It is not intended to describe the only
   possible method for achieving the goal, although the guidelines in
   this document should be taken into consideration when designing
   alternate algorithms. Additionally, this document does not outline
   what a TCP or SCTP sender may do after a spurious retransmission is
   detected. A number of proposals have been developed (e.g.,
   [RFC3522], [SK03], [BDA03]), but it is not yet clear which of these
   proposals are appropriate. In addition, they all rely on detecting
   spurious retransmits and so can share the algorithm specified in
   this document.

   Finally, we note that, to simplify the text, much of the following
   discussion is in terms of TCP DSACKs, although it applies to both
   TCP and SCTP.

2 Counting Duplicate Notifications

   For certain applications a straight count of duplicate notifications
   will suffice. For instance, if a stack simply wants to know (for
   some reason) the number of spuriously retransmitted segments,
   counting all duplicate notifications for retransmitted segments
   should work well.
   Another application of this strategy is to monitor and adapt
   transport algorithms so that the transport is not sending large
   amounts of spurious data into the network. For instance, monitoring
   duplicate notifications could be used by the Early Retransmit
   [AAAB03] algorithm to determine whether fast retransmitting
   [RFC2581] segments with a lower than normal duplicate ACK threshold
   is working, or if segment reordering is causing spurious
   retransmits.

   More speculatively, duplicate notification has been proposed as an
   integral part of estimating TCP's total loss rate [AEO03] for the
   purposes of mitigating the impact of corruption-based losses on
   transport protocol performance. [EOA03] proposes altering the
   transport's congestion response to the fraction of losses that are
   actually due to congestion by requiring the network to provide the
   corruption-based loss rate and making the transport sender estimate
   the total loss rate. Duplicate notifications are a key part of
   estimating the total loss rate accurately [AEO03].

3 Congestion/Duplicate Disambiguation Algorithm

   When the purpose of detecting spurious retransmissions is to
   "undo" unnecessary changes made to the congestion control state,
   as suggested in [RFC2883], the data sender ideally needs to
   determine:

   (a) That spurious retransmissions in a particular window of data do
       not mask real segment loss (congestion).

       For example, assume segments N and N+1 are retransmitted even
       though only segment N was dropped by the network (thus, segment
       N+1 was needlessly retransmitted). When the sender receives the
       notification that segment N+1 arrived more than once it can
       conclude that segment N+1 was needlessly resent. However, it
       cannot conclude that it is appropriate to revert the congestion
       control state because the window of data contained at least one
       valid congestion indication (i.e., segment N was lost).

   (b) That network duplication is not the cause of the duplicate
       notification.

       Determining whether a duplicate notification is caused by
       network duplication of a packet or a spurious retransmit is a
       nearly impossible task in theory. Since [Pax97] shows that
       packet duplication by the network is rare, the algorithm in this
       section simply ceases to function when network duplication is
       detected (by receiving a duplicate notification for a segment
       that was not retransmitted by the sender).

   The algorithm specified below gives reasonable, but not complete,
   protection against both of these cases.

   We assume the TCP sender has a data structure to hold selective
   acknowledgment information (e.g., as outlined in [RFC3517]). The
   following steps require an extension of such a 'scoreboard' to
   incorporate a slightly longer history of retransmissions than called
   for in [RFC3517]. The following steps MUST be taken upon the
   receipt of each DSACK or duplicate TSN notification:

   (A) Check the corresponding sequence range or TSN to determine
       whether the segment has been retransmitted.

       (A.1) If the SACK scoreboard is empty (i.e., the TCP sender has
             received no SACK information from the receiver),
             processing of this DSACK MUST be terminated and the
             congestion control state MUST NOT be reverted during the
             current window of data. This clause intends to cover the
             case when an entire window of acknowledgments has been
             dropped by the network. In such a case, the reverse path
             seems to be in a congested state and so reducing TCP's
             sending rate is the conservative approach.

       (A.2) If the segment was retransmitted exactly one time, mark it
             as a duplicate.

       (A.3) If the segment was retransmitted more than once,
             processing of this DSACK MUST be terminated and the
             congestion control state MUST NOT be reverted to its
             previous state during the current window of data.

       (A.4) If the segment was not retransmitted, the incoming DSACK
             indicates that the network duplicated the segment in
             question. Processing of this DSACK MUST be terminated. In
             addition, the algorithm specified in this document MUST NOT
             be used for the remainder of the connection, as future DSACK
             reports may be indicating network duplication rather than
             unnecessary retransmission. Note that some techniques to
             further disambiguate network duplication from unnecessary
             retransmission (e.g., the TCP timestamp option [RFC1323])
             may be used to refine the algorithm in this document
             further. Using such a technique in conjunction with an
             algorithm similar to the one presented herein may allow for
             the continued use of the algorithm in the face of duplicated
             segments. We do not delve into such an algorithm in this
             document due to the current rarity of network duplication.
             However, future work should include tackling this problem.

   (B) Assuming processing is allowed to continue (per the (A) rules),
       check all retransmitted segments in the previous window of data.

       (B.1) If all segments or chunks marked as retransmitted have
             also been marked as acknowledged and duplicated, we conclude
             that all retransmissions in the previous window of data were
             spurious and no loss occurred.

       (B.2) If any segment or chunk is still marked as retransmitted
             but not marked as duplicate, there are outstanding
             retransmissions that could indicate loss within this window
             of data. We can make no conclusions based on this
             particular DSACK/duplicate TSN notification.

   In addition to keeping the state mentioned in [RFC3517] (for TCP)
   and [RFC2960] (for SCTP), an implementation of this algorithm must
   track all sequence numbers or TSNs that have been acknowledged as
   duplicates.
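
   The steps (A) and (B) above can be sketched in code as follows. This
   is only an illustrative sketch, not a definitive implementation. The
   names Scoreboard, Segment, Verdict, and on_duplicate_notification
   are hypothetical and do not come from [RFC3517] or any real stack.
   The sketch assumes one scoreboard entry per segment or chunk, that
   the caller maintains the "acked" flag and clears per-window state,
   and that a notification for an unknown sequence number or TSN is
   treated as "not retransmitted".

```python
from dataclasses import dataclass, field
from enum import Enum


class Verdict(Enum):
    SPURIOUS = "all retransmissions in the window were spurious"   # (B.1)
    UNDECIDED = "no conclusion from this notification"             # (B.2)
    NO_REVERT = "do not revert state during the current window"    # (A.1), (A.3)
    DISABLED = "network duplication seen; algorithm is disabled"   # (A.4)


@dataclass
class Segment:
    retransmits: int = 0      # number of times the segment was retransmitted
    acked: bool = False       # acknowledged (maintained by the caller)
    duplicate: bool = False   # reported as a duplicate by the receiver


@dataclass
class Scoreboard:
    # Hypothetical extended scoreboard: sequence number or TSN -> Segment.
    segments: dict = field(default_factory=dict)
    sack_seen: bool = False      # has any SACK information been received?
    revert_allowed: bool = True  # cleared by (A.1)/(A.3) for this window
    disabled: bool = False       # set by (A.4) for the rest of the connection

    def on_duplicate_notification(self, seq):
        """Process one DSACK or duplicate TSN report for `seq`."""
        if self.disabled:
            return Verdict.DISABLED
        # (A.1) Empty SACK scoreboard: be conservative for this window.
        if not self.sack_seen:
            self.revert_allowed = False
            return Verdict.NO_REVERT
        seg = self.segments.get(seq)
        # (A.4) Never retransmitted (or unknown, by assumption): the
        # network duplicated the segment, so stop using the algorithm.
        if seg is None or seg.retransmits == 0:
            self.disabled = True
            return Verdict.DISABLED
        # (A.3) Retransmitted more than once: no reverting in this window.
        if seg.retransmits > 1:
            self.revert_allowed = False
            return Verdict.NO_REVERT
        # (A.2) Retransmitted exactly once: mark it as a duplicate.
        seg.duplicate = True
        if not self.revert_allowed:
            return Verdict.NO_REVERT
        # (B) Inspect every retransmitted segment in the window.
        rexmitted = [s for s in self.segments.values() if s.retransmits > 0]
        if all(s.acked and s.duplicate for s in rexmitted):
            return Verdict.SPURIOUS   # (B.1)
        return Verdict.UNDECIDED      # (B.2)
```

   In this sketch a sender would act on Verdict.SPURIOUS only; every
   other verdict leaves the congestion control state untouched, which
   matches the conservative intent of the rules above.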

4 Related Work

   In addition to the mechanism for detecting spurious retransmits
   outlined in this document, several other proposals for finding
   needless retransmits have been developed.

   [BA02] uses the algorithm outlined in this document as the basis for
   investigating several methods to make TCP more robust to reordered
   packets.

   The Eifel detection algorithm [RFC3522] uses the TCP timestamp
   option [RFC1323] to determine whether the ACK for a given retransmit
   is for the original transmission or a retransmission. More
   generally, [LK00] outlines the benefits of detecting spurious
   retransmits and reverting from needless congestion control changes
   using the timestamp-based scheme or a mechanism that uses a
   "retransmit bit" to flag retransmits (and ACKs of retransmits). The
   Eifel detection algorithm can detect spurious retransmits more
   rapidly than a DSACK-based scheme. However, the tradeoff is that
   the overhead of the 12-byte timestamp option must be incurred in
   every packet transmitted for Eifel to function.

   The F-RTO scheme [SK03] slightly alters TCP's sending pattern
   immediately following a retransmission timeout and then observes the
   pattern of the returning ACKs. This pattern can indicate whether
   the retransmitted segment was needed. The advantage of F-RTO is
   that the algorithm only needs to be implemented on the sender side
   of the TCP connection and that nothing extra needs to cross the
   network (e.g., DSACKs, timestamps, special flags, etc.). The
   downside is that the algorithm is a heuristic that can be confused
   by network pathologies (e.g., duplication or reordering of key
   packets). Finally, note that F-RTO only works for spurious
   retransmits triggered by the transport's retransmission timer.

   Finally, [AP99] briefly investigates using the time between
   retransmitting a segment via the retransmission timeout and the
   arrival of the next ACK as an indicator of whether the retransmit
   was needed. The scheme compares this time delta with a fraction (f)
   of the minimum RTT observed thus far on the connection. If the time
   delta is less than f*minRTT then the retransmit is labeled
   spurious. When f=1/2 the algorithm identifies roughly 59% of the
   needless retransmission timeouts and mislabels needed retransmits
   as spurious only 2.5% of the time. As with F-RTO, this scheme only
   detects spurious retransmits sent by the transport's retransmission
   timer.

5 Security Considerations

   It is possible for the receiver to falsely indicate spurious
   retransmissions in the case of actual loss, potentially causing a
   TCP or SCTP sender to inaccurately conclude that no loss took place
   (and possibly cause inappropriate changes to the sender's congestion
   control state).

   Consider the following scenario: A receiver watches every segment or
   chunk that arrives and acknowledges any segment that arrives out of
   order by more than some threshold amount as a duplicate, assuming
   that it is a retransmission. A sender using the above algorithm
   will assume that the retransmission was spurious.

   The ECN nonce sum proposal [RFC3540] could possibly help mitigate
   the ability of the receiver to hide real losses from the sender with
   modest extension. In the common case of receiving an original
   transmission and a spurious retransmit a receiver will have received
   the nonce from the original transmission and therefore can "prove"
   to the sender that the duplicate notification is valid.
   In the case when the receiver did not receive the original and is
   trying to improperly induce the sender into transmitting at an
   inappropriately high rate, the receiver will not know the ECN nonce
   from the original segment and therefore will probabilistically not
   be able to fool the sender for long. [RFC3540] calls for disabling
   nonce sums on duplicate ACKs, which means that the nonce sum is not
   directly suitable for use as a mitigation to the problem of
   receivers lying about DSACK information. However, future efforts
   may be able to use [RFC3540] as a starting point for building
   protection should it be needed.

Acknowledgments

   Sourabh Ladha and Reiner Ludwig made several useful comments on an
   earlier version of this document. The second author thanks BBN
   Technologies and NASA's Glenn Research Center for supporting this
   work.

Normative References

   [RFC793] Jon Postel. Transmission Control Protocol. Std 7, RFC
       793. September 1981.

   [RFC2960] R. Stewart, Q. Xie, K. Morneault, C. Sharp, H.
       Schwarzbauer, T. Taylor, I. Rytina, M. Kalla, L. Zhang, V.
       Paxson. Stream Control Transmission Protocol. RFC 2960,
       October 2000.

   [RFC2883] S. Floyd, J. Mahdavi, M. Mathis, M. Podolsky. An
       Extension to the Selective Acknowledgement (SACK) Option for
       TCP. RFC 2883, July 2000.

Non-Normative References

   [AAAB03] M. Allman, K. Avrachenkov, U. Ayesta, J. Blanton. Early
       Retransmit for TCP. Internet-Draft
       draft-allman-tcp-early-rexmt-01.txt, June 2003. Work in
       progress.

   [AEO03] Mark Allman, Wesley Eddy, Shawn Ostermann. Estimating Loss
       Rates With TCP. August 2003. Under submission.

   [AP99] Allman, M. and V. Paxson, "On Estimating End-to-End Network
       Path Properties", SIGCOMM 99.

   [BA02] E. Blanton, M. Allman. On Making TCP More Robust to Packet
       Reordering. ACM Computer Communication Review, 32(1), January
       2002.

   [BDA03] Ethan Blanton, Robert Dimond, Mark Allman.
       Practices for TCP Senders in the Face of Segment Reordering,
       February 2003. Internet-Draft
       draft-blanton-tcp-reordering-00.txt (work in progress).

   [EOA03] Wesley Eddy, Shawn Ostermann, Mark Allman. New Techniques
       for Making Transport Protocols Robust to Corruption-Based
       Loss. July 2003. Under submission.

   [LK00] R. Ludwig, R. H. Katz. The Eifel Algorithm: Making TCP
       Robust Against Spurious Retransmissions. ACM Computer
       Communication Review, 30(1), January 2000.

   [Pax97] V. Paxson. End-to-End Internet Packet Dynamics. In ACM
       SIGCOMM, September 1997.

   [RFC1323] Van Jacobson, Robert Braden, David Borman. TCP Extensions
       for High Performance. RFC 1323. May 1992.

   [RFC3517] Ethan Blanton, Mark Allman, Kevin Fall, Lili Wang. A
       Conservative Selective Acknowledgment (SACK)-based Loss Recovery
       Algorithm for TCP, April 2003. RFC 3517.

   [RFC3522] R. Ludwig, M. Meyer. The Eifel Detection Algorithm for
       TCP, April 2003. RFC 3522.

   [RFC3540] N. Spring, D. Wetherall, D. Ely. Robust Explicit
       Congestion Notification (ECN) Signaling with Nonces, June 2003.
       RFC 3540.

   [SK03] P. Sarolahti, M. Kojo. F-RTO: An Algorithm for Detecting
       Spurious Retransmission Timeouts with TCP and SCTP.
       Internet-Draft draft-sarolahti-tsvwg-tcp-frto-04.txt, June 2003.
       Work in progress.

Authors' Addresses:

   Ethan Blanton
   Purdue University Computer Sciences
   1398 Computer Science Building
   West Lafayette, IN 47907
   eblanton@cs.purdue.edu

   Mark Allman
   ICSI Center for Internet Research
   1947 Center Street, Suite 600
   Berkeley, CA 94704-1198
   Phone: 216-243-7361
   mallman@icir.org
   http://www.icir.org/mallman/