idnits 2.17.1

draft-floyd-sack-00.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** Looks like you're using RFC 2026 boilerplate. This must be updated to
     follow RFC 3978/3979, as updated by RFC 4748.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  ** The document seems to lack a 1id_guidelines paragraph about 6 months
     document validity -- however, there's a paragraph with a matching
     beginning. Boilerplate error?

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard

  == The page length should not exceed 58 lines per page, but there was 5
     longer pages, the longest (page 5) being 59 lines

  == It seems as if not all pages are separated by form feeds - found 0 form
     feeds but 14 pages

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

  ** The document seems to lack an IANA Considerations section. (See Section
     2.2 of https://www.ietf.org/id-info/checklist for how to handle the case
     when there are no actions for IANA.)

  ** The document seems to lack separate sections for Informative/Normative
     References. All references will be assumed normative when checking for
     downward references.

  ** The abstract seems to contain references ([RFC2018]), which it
     shouldn't. Please replace those with straight textual mentions of the
     documents in question.

  ** The document seems to lack a both a reference to RFC 2119 and the
     recommended RFC 2119 boilerplate, even if it appears to use RFC 2119
     keywords.

     RFC 2119 keyword, line 51: '... The keywords MUST, MUST NOT, REQUIR...'
     RFC 2119 keyword, line 52: '... SHOULD NOT, RECOMMENDED, MAY, and O...'
     RFC 2119 keyword, line 117: '...ption, ``the first SACK block ... MUST...'
     RFC 2119 keyword, line 126: '...ll, SACK options SHOULD be included in...'
     RFC 2119 keyword, line 367: '...An implementation MUST NOT compare the...'

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008. If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment. If not, you may need to add the pre-RFC5378 disclaimer.
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (August 1999) is 9015 days in the past. Is this
     intentional?

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Missing Reference: 'B97' is mentioned on line 54, but not defined

  -- Possible downref: Non-RFC (?) normative reference: ref. 'AP99'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'BPK97'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'F99'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'ISO8073'

  -- Possible downref: Non-RFC (?) normative reference: ref. 'L99'

  ** Obsolete normative reference: RFC 1323 (Obsoleted by RFC 7323)

  ** Obsolete normative reference: RFC 2581 (Obsoleted by RFC 5681)

  -- Possible downref: Non-RFC (?) normative reference: ref. 'SCWA99'

     Summary: 8 errors (**), 0 flaws (~~), 4 warnings (==), 8 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.
--------------------------------------------------------------------------------

1    Internet Engineering Task Force                 Sally Floyd
2    INTERNET DRAFT                                  Jamshid Mahdavi
3    draft-floyd-sack-00.txt                         Matt Mathis
4                                                    Matthew Podolsky
5                                                    Allyn Romanow
6                                                    August 1999
7    Expires: February 2000

9       An Extension to the Selective Acknowledgement (SACK) Option for TCP

11   Status of this Memo

13      This document is an Internet-Draft and is in full conformance with
14      all provisions of Section 10 of RFC2026.

16      Internet-Drafts are working documents of the Internet Engineering
17      Task Force (IETF), its areas, and its working groups. Note that
18      other groups may also distribute working documents as
19      Internet-Drafts.

21      Internet-Drafts are draft documents valid for a maximum of six months
22      and may be updated, replaced, or obsoleted by other documents at any
23      time. It is inappropriate to use Internet-Drafts as reference
24      material or to cite them other than as "work in progress."

26      The list of current Internet-Drafts can be accessed at
27      http://www.ietf.org/ietf/1id-abstracts.txt

29      The list of Internet-Draft Shadow Directories can be accessed at
30      http://www.ietf.org/shadow.html.

32   Abstract

34      This note defines an extension of the Selective Acknowledgement
35      (SACK) Option [RFC2018] for TCP. RFC 2018 specified the use of the
36      SACK option for acknowledging out-of-sequence data not covered by
37      TCP's cumulative acknowledgement field. This note extends RFC 2018
38      by specifying the use of the SACK option for acknowledging duplicate
39      packets. This note suggests that when duplicate packets are
40      received, the first block of the SACK option field can be used to
41      report the sequence numbers of the packet that triggered the
42      acknowledgement. This extension to the SACK option allows the TCP
43      sender to infer the order of packets received at the receiver,
44      allowing the sender to infer when it has unnecessarily retransmitted
45      a packet.
        A TCP sender could then use this information for more
46      robust operation in an environment of reordered packets, ACK loss,
47      packet replication, and/or early retransmit timeouts.

49   1. Conventions and Acronyms

51      The keywords MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD,
52      SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL, when they appear in this
53      document, are to be interpreted as described in [B97].

55   2. Introduction

57      The Selective Acknowledgement (SACK) option defined in RFC 2018 is
58      used by the TCP data receiver to acknowledge non-contiguous blocks of
59      data not covered by the Cumulative Acknowledgement field. However,
60      RFC 2018 does not specify the use of the SACK option when duplicate
61      segments are received. This note specifies the use of the SACK
62      option when acknowledging the receipt of a duplicate packet [F99].
63      We use the term D-SACK (for duplicate-SACK) to refer to a SACK block
64      that reports a duplicate segment.

66      This document does not make any changes to TCP's use of the
67      cumulative acknowledgement field, or to the TCP receiver's decision
68      of *when* to send an acknowledgement packet. This document only
69      concerns the contents of the SACK option when an acknowledgement is
70      sent.

72      This extension is compatible with current implementations of the SACK
73      option in TCP. That is, if one of the TCP end-nodes does not
74      implement this D-SACK extension and the other TCP end-node does, we
75      believe that this use of the D-SACK extension by one of the end nodes
76      will not introduce problems.

78      The use of D-SACK does not require separate negotiation between a TCP
79      sender and receiver that have already negotiated SACK capability.
80      The absence of separate negotiation for D-SACK means that the TCP
81      receiver could send D-SACK blocks when the TCP sender does not
82      understand this extension to SACK.
        In this case, the TCP sender will
83      simply discard any D-SACK blocks, and process the other SACK blocks
84      in the SACK option field as it normally would.

86   3. The SACK Option Format as defined in RFC 2018

88      The SACK option as defined in RFC 2018 is as follows:

90                        +--------+--------+
91                        | Kind=5 | Length |
92      +--------+--------+--------+--------+
93      |      Left Edge of 1st Block       |
94      +--------+--------+--------+--------+
95      |      Right Edge of 1st Block      |
96      +--------+--------+--------+--------+
97      |                                   |
98      /            . . .                  /
99      |                                   |
100     +--------+--------+--------+--------+
101     |      Left Edge of nth Block       |
102     +--------+--------+--------+--------+
103     |      Right Edge of nth Block      |
104     +--------+--------+--------+--------+

106     The Selective Acknowledgement (SACK) option in the TCP header
107     contains a number of SACK blocks, where each block specifies the left
108     and right edge of a block of data received at the TCP receiver. In
109     particular, a block represents a contiguous sequence space of data
110     received and queued at the receiver, where the ``left edge'' of the
111     block is the first sequence number of the block, and the ``right
112     edge'' is the sequence number immediately following the last sequence
113     number of the block.

115     RFC 2018 implies that the first SACK block specifies the segment that
116     triggered the acknowledgement. From RFC 2018, when the data receiver
117     chooses to send a SACK option, ``the first SACK block ... MUST
118     specify the contiguous block of data containing the segment which
119     triggered this ACK, unless that segment advanced the Acknowledgment
120     Number field in the header.''

122     However, RFC 2018 does not address the use of the SACK option when
123     acknowledging a duplicate segment. For example, RFC 2018 specifies
124     that ``each block represents received bytes of data that are
125     contiguous and isolated''.
        RFC 2018 further specifies that ``if sent
126     at all, SACK options SHOULD be included in all ACKs which do not ACK
127     the highest sequence number in the data receiver's queue.'' RFC 2018
128     does not specify the use of the SACK option when a duplicate segment
129     is received, and the cumulative acknowledgement field in the ACK
130     acknowledges all of the data in the data receiver's queue.

132  4. Use of the SACK option for reporting a duplicate segment

134     This section specifies the use of SACK blocks when the SACK option is
135     used in reporting a duplicate segment. When D-SACK is used, the
136     first block of the SACK option should be a D-SACK block specifying
137     the sequence numbers for the duplicate segment that triggers the
138     acknowledgement. If the duplicate segment is part of a larger block
139     of non-contiguous data in the receiver's data queue, then the
140     following SACK block should be used to specify this larger block.
141     Additional SACK blocks can be used to specify additional
142     non-contiguous blocks of data, as specified in RFC 2018.

144     The guidelines for reporting duplicate segments are summarized below:

146     (1) A D-SACK block is only used to report a duplicate contiguous
147     sequence of data received by the receiver in the most recent packet.

149     (2) Each duplicate contiguous sequence of data received is reported
150     in at most one D-SACK block. (I.e., the receiver sends two identical
151     D-SACK blocks in subsequent packets only if the receiver receives two
152     duplicate segments.)

154     (3) The left edge of the D-SACK block specifies the first sequence
155     number of the duplicate contiguous sequence, and the right edge of
156     the D-SACK block specifies the sequence number immediately following
157     the last sequence number in the duplicate contiguous sequence.

159     (4) If the D-SACK block reports a duplicate contiguous sequence from
160     a (possibly larger) block of data in the receiver's data queue above
161     the cumulative acknowledgement, then the second SACK block in that
162     SACK option should specify that (possibly larger) block of data.

164     (5) Following the SACK blocks described above for reporting duplicate
165     segments, additional SACK blocks can be used for reporting additional
166     blocks of data, as specified in RFC 2018.

168     Note that because each duplicate segment is reported in only one ACK
169     packet, information about that duplicate segment will be lost if that
170     ACK packet is dropped in the network.

172  4.1. Reporting Full Duplicate Segments

174     We illustrate these guidelines with three examples. In each example,
175     we assume that the data receiver has first received eight segments of
176     500 bytes each, and has sent an acknowledgement with the cumulative
177     acknowledgement field set to 4000 (assuming the first sequence number
178     is zero). The D-SACK block is underlined in each example.

180  4.1.1. Example 1: Reporting a duplicate segment.

182     Because several ACK packets are lost, the data sender retransmits
183     packet 3000-3499, and the data receiver subsequently receives a
184     duplicate segment with sequence numbers 3000-3499. The receiver
185     sends an acknowledgement with the cumulative acknowledgement field
186     set to 4000, and the first, D-SACK block specifying sequence numbers
187     3000-3500.

189     Transmitted  Received     ACK Sent
190     Segment      Segment      (Including SACK Blocks)

192     3000-3499    3000-3499    3500 (ACK dropped)
193     3500-3999    3500-3999    4000 (ACK dropped)
194     3000-3499    3000-3499    4000, SACK=3000-3500
195                                          ---------

197  4.1.2. Example 2: Reporting an out-of-order segment and a duplicate
198  segment.

200     Following a lost data packet, the receiver receives an out-of-order
201     data segment, which triggers the SACK option as specified in RFC
202     2018.
        Because of several lost ACK packets, the sender then
203     retransmits a data packet. The receiver receives the duplicate
204     packet, and reports it in the first, D-SACK block:

206     Transmitted  Received     ACK Sent
207     Segment      Segment      (Including SACK Blocks)

209     3000-3499    3000-3499    3500 (ACK dropped)
210     3500-3999    3500-3999    4000 (ACK dropped)
211     4000-4499    (data packet dropped)
212     4500-4999    4500-4999    4000, SACK=4500-5000 (ACK dropped)
213     3000-3499    3000-3499    4000, SACK=3000-3500, 4500-5000
214                                          ---------

216  4.1.3. Example 3: Reporting a duplicate of an out-of-order segment.

218     Because of a lost data packet, the receiver receives two out-of-order
219     segments. The receiver next receives a duplicate segment for one of
220     these out-of-order segments:

222     Transmitted  Received     ACK Sent
223     Segment      Segment      (Including SACK Blocks)

225     3500-3999    3500-3999    4000
226     4000-4499    (data packet dropped)
227     4500-4999    4500-4999    4000, SACK=4500-5000
228     5000-5499    5000-5499    4000, SACK=4500-5500
229        (duplicated packet)
230                  5000-5499    4000, SACK=5000-5500, 4500-5500
231                                          ---------

233  4.2. Reporting Partial Duplicate Segments

235     It may be possible that a sender transmits a packet that includes one
236     or more duplicate sub-segments--that is, only part but not all of the
237     transmitted packet has already arrived at the receiver. This can
238     occur when the size of the sender's transmitted segments increases,
239     which can occur when the PMTU increases in the middle of a TCP
240     session, for example. The guidelines in Section 4 above apply to
241     reporting partial as well as full duplicate segments. This section
242     gives examples of these guidelines when reporting partial duplicate
243     segments.

245     When the SACK option is used for reporting partial duplicate
246     segments, the first D-SACK block reports the first duplicate
247     sub-segment.
        If the data packet being acknowledged contains multiple
248     partial duplicate sub-segments, then only the first such duplicate
249     sub-segment is reported in the SACK option. We illustrate this with
250     the examples below.

252  4.2.1. Example 4: Reporting a single duplicate subsegment.

254     The sender increases the packet size from 500 bytes to 1000 bytes.
255     The receiver subsequently receives a 1000-byte packet containing one
256     500-byte subsegment that has already been received and one which has
257     not. The receiver reports only the already received subsegment using
258     a single D-SACK block.

260     Transmitted  Received     ACK Sent
261     Segment      Segment      (Including SACK Blocks)

263     500-999      500-999      1000
264     1000-1499    (delayed)
265     1500-1999    (data packet dropped)
266     2000-2499    2000-2499    1000, SACK=2000-2500
267     1000-1999    1000-1499    1500, SACK=2000-2500
268                  1000-1999    2500, SACK=1000-1500
269                                          ---------

271  4.2.2. Example 5: Two non-contiguous duplicate subsegments covered by
272  the cumulative acknowledgement.

274     After the sender increases its packet size from 500 bytes to 1500
275     bytes, the receiver receives a packet containing two non-contiguous
276     duplicate 500-byte subsegments which are less than the cumulative
277     acknowledgement field. The receiver reports the first such duplicate
278     segment in a single D-SACK block.

280     Transmitted  Received     ACK Sent
281     Segment      Segment      (Including SACK Blocks)

283     500-999      500-999      1000
284     1000-1499    (delayed)
285     1500-1999    (data packet dropped)
286     2000-2499    (delayed)
287     2500-2999    (data packet dropped)
288     3000-3499    3000-3499    1000, SACK=3000-3500
289     1000-2499    1000-1499    1500, SACK=3000-3500
290                  2000-2499    1500, SACK=2000-2500, 3000-3500
291                  1000-2499    2500, SACK=1000-1500, 3000-3500
292                                          ---------

294  4.2.3. Example 6: Two non-contiguous duplicate subsegments not covered
295  by the cumulative acknowledgement.

297     This example is similar to Example 5, except that after the sender
298     increases the packet size, the receiver receives a packet containing
299     two non-contiguous duplicate subsegments which are above the
300     cumulative acknowledgement field, rather than below. The first,
301     D-SACK block reports the first duplicate subsegment, and the second,
302     SACK block reports the larger block of non-contiguous data that it
303     belongs to.

305     Transmitted  Received     ACK Sent
306     Segment      Segment      (Including SACK Blocks)

308     500-999      500-999      1000
309     1000-1499    (data packet dropped)
310     1500-1999    (delayed)
311     2000-2499    (data packet dropped)
312     2500-2999    (delayed)
313     3000-3499    (data packet dropped)
314     3500-3999    3500-3999    1000, SACK=3500-4000
315     1000-1499    (data packet dropped)
316     1500-2999    1500-1999    1000, SACK=1500-2000, 3500-4000
317                  2500-2999    1000, SACK=2500-3000, 1500-2000,
318                                                     3500-4000
319                  1500-2999    1000, SACK=1500-2000, 1500-3000,
320                                          ---------
321                                                     3500-4000

323  4.3. Interaction Between D-SACK and PAWS

325     RFC 1323 [RFC1323] specifies an algorithm for Protection Against
326     Wrapped Sequence Numbers (PAWS). PAWS gives a method for
327     distinguishing between sequence numbers for new data, and sequence
328     numbers from a previous cycle through the sequence number space.
329     Duplicate segments might be detected by PAWS as belonging to a
330     previous cycle through the sequence number space.

332     RFC 1323 specifies that for such packets, the receiver should do the
333     following:

335        Send an acknowledgement in reply as specified in RFC 793 page 69,
336        and drop the segment.

338     Since PAWS still requires sending an ACK, there is no harmful
339     interaction between PAWS and the use of D-SACK. The D-SACK block can
340     be included in the SACK option of the ACK, as outlined in Section 4,
341     independently of the use of PAWS by the TCP receiver, and
342     independently of the determination by PAWS of the validity or
343     invalidity of the data segment.
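
        As a non-normative illustration, the receiver-side guidelines of
        Section 4 (which apply whether or not PAWS is in use) could be
        sketched in Python roughly as follows. This is only a sketch under
        stated assumptions: the function name receive, the tuple
        representation of blocks, and the max_blocks parameter are
        hypothetical; sequence-number wraparound and PAWS processing are
        ignored; and any SACK blocks after the first two are listed in
        ascending order rather than in the most-recently-reported order of
        RFC 2018.

```python
from typing import List, Optional, Tuple

Block = Tuple[int, int]  # (left edge, right edge); right edge is exclusive


def receive(cum_ack: int, queued: List[Block], seg: Block,
            max_blocks: int = 4) -> Tuple[int, List[Block], List[Block]]:
    """Process one arriving segment (hypothetical helper, no wraparound).

    cum_ack: current cumulative acknowledgement.
    queued:  isolated blocks held above cum_ack.
    Returns (new cum_ack, new queue, SACK blocks for the ACK).
    """
    left, right = seg

    # Guidelines (1)-(3): report only the first duplicate contiguous
    # sequence found in the segment that just arrived.
    dsack: Optional[Block] = None
    if left < cum_ack:
        dsack = (left, min(right, cum_ack))
    else:
        for q_left, q_right in sorted(queued):
            if q_left < right and left < q_right:
                dsack = (max(left, q_left), min(right, q_right))
                break

    # Merge any new data into the queue, joining blocks that touch.
    pending = list(queued)
    if right > cum_ack:
        pending.append((max(left, cum_ack), right))
    merged: List[Block] = []
    for b_left, b_right in sorted(pending):
        if merged and b_left <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], b_right))
        else:
            merged.append((b_left, b_right))

    # Advance the cumulative acknowledgement if the gap was filled.
    if merged and merged[0][0] <= cum_ack:
        cum_ack = max(cum_ack, merged[0][1])
        merged.pop(0)

    # Guideline (4): the D-SACK block comes first, followed by the queued
    # block that contains it (if any); guideline (5): then the others.
    blocks: List[Block] = [dsack] if dsack is not None else []
    probe = dsack[0] if dsack is not None else left
    rest = list(merged)
    for blk in merged:
        if blk[0] <= probe < blk[1]:
            blocks.append(blk)
            rest.remove(blk)
            break
    blocks.extend(rest)
    return cum_ack, merged, blocks[:max_blocks]
```

        For instance, with the receiver state of Example 3 (cumulative
        acknowledgement 4000 and queued block 4500-5500), the call
        receive(4000, [(4500, 5500)], (5000, 5500)) leaves the cumulative
        acknowledgement at 4000 and yields the blocks 5000-5500, 4500-5500.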

345     TCP senders receiving D-SACK blocks should be aware that a segment
346     reported as a duplicate segment could possibly have been from a prior
347     cycle through the sequence number space. This is independent of the
348     use of PAWS by the TCP data receiver. We do not anticipate that this
349     will present significant problems for senders using D-SACK
350     information.

352  5. Detection of Duplicate Packets

354     This extension to the SACK option enables the receiver to accurately
355     report the reception of duplicate data. Because each receipt of a
356     duplicate packet is reported in only one ACK packet, the loss of a
357     single ACK can prevent this information from reaching the sender. In
358     addition, we note that the sender cannot necessarily trust the
359     receiver to send it accurate information [SCWA99].

361     In order for the sender to check that the first (D)SACK block of an
362     acknowledgement in fact acknowledges duplicate data, the sender
363     should compare the sequence space in the first SACK block to the
364     cumulative ACK which is carried IN THE SAME PACKET. If the SACK
365     sequence space is less than this cumulative ACK, it is an indication
366     that the segment identified by the SACK block has been received more
367     than once by the receiver. An implementation MUST NOT compare the
368     sequence space in the SACK block to the TCP state variable snd.una
369     (which carries the total cumulative ACK), as this may result in the
370     wrong conclusion if ACK packets are reordered.

372     If the sequence space in the first SACK block is greater than the
373     cumulative ACK, then the sender next compares the sequence space in
374     the first SACK block with the sequence space in the second SACK
375     block, if there is one. This comparison can determine if the first
376     SACK block is reporting duplicate data that lies above the cumulative
377     ACK.

379     TCP implementations which follow RFC 2581 [RFC2581] could see
380     duplicate packets in each of the following four situations.
        This
381     document does not specify what action a TCP implementation should
382     take in these cases. The extension to the SACK option simply enables
383     the sender to detect each of these cases. Note that these four
384     conditions are not an exhaustive list of possible cases for duplicate
385     packets, but are representative of the most common/likely cases.
386     Subsequent documents will describe experimental proposals for sender
387     responses to the detection of unnecessary retransmits due to
388     reordering, lost ACKs, or early retransmit timeouts.

390  5.1. Replication by the network

392     If a packet is replicated in the network, this extension to the SACK
393     option can identify this. For example:

395     Transmitted  Received     ACK Sent
396     Segment      Segment      (Including SACK Blocks)

398     500-999      500-999      1000
399     1000-1499    1000-1499    1500
400        (replicated)
401                  1000-1499    1500, SACK=1000-1500
402                                          ---------

404     In this case, the second packet was replicated in the network. An
405     ACK containing a D-SACK block which is lower than its ACK field and
406     is not identical to a previously retransmitted segment is indicative
407     of a replication by the network.

409     WITHOUT D-SACK:

411     If D-SACK was not used and the last ACK was piggybacked on a data
412     packet, the sender would not know that a packet had been replicated
413     in the network. If D-SACK was not used and neither of the last two
414     ACKs was piggybacked on a data packet, then the sender could
415     reasonably infer that either some data packet *or* the final ACK
416     packet had been replicated in the network. The receipt of the D-SACK
417     packet gives the sender positive knowledge that this data packet was
418     replicated in the network (assuming that the receiver is not lying).
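
        The sender-side check described at the start of Section 5, which
        produces the D-SACK indication used in this example, could be
        sketched in Python as below. The sketch is illustrative only and
        rests on assumptions: the names seq_lt, seq_leq and is_dsack are
        hypothetical, blocks are (left, right) tuples with an exclusive right
        edge, and "less than the cumulative ACK" is read here as "the right
        edge of the first block does not exceed the ACK field". The ack
        argument is meant to be the ACK field of the same packet that carried
        the SACK option, not snd.una.

```python
from typing import List, Tuple

Block = Tuple[int, int]  # (left edge, right edge); right edge is exclusive
MASK = 0xFFFFFFFF        # TCP sequence numbers are 32 bits wide


def seq_lt(a: int, b: int) -> bool:
    """True if sequence number a precedes b, allowing for wraparound."""
    return ((a - b) & MASK) >= 0x80000000


def seq_leq(a: int, b: int) -> bool:
    """True if sequence number a precedes or equals b."""
    return not seq_lt(b, a)


def is_dsack(ack: int, sack_blocks: List[Block]) -> bool:
    """Decide whether the first SACK block of one ACK reports duplicate data.

    ack must be the cumulative ACK carried in the same packet as
    sack_blocks (hypothetical helper; not taken from any implementation).
    """
    if not sack_blocks:
        return False
    left, right = sack_blocks[0]

    # Case 1: the first block lies at or below the cumulative ACK.
    if seq_leq(right, ack):
        return True

    # Case 2: the first block lies above the cumulative ACK but is
    # covered by the second block.
    if len(sack_blocks) >= 2:
        left2, right2 = sack_blocks[1]
        if seq_leq(left2, left) and seq_leq(right, right2):
            return True
    return False
```

        With the final ACK of the example above, is_dsack(1500, [(1000,
        1500)]) is true, whereas an ordinary out-of-order report such as
        is_dsack(4000, [(4500, 5500)]) is false.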

420     RESEARCH ISSUES:

422     The current SACK option already allows the sender to identify
423     duplicate ACKs that do not acknowledge new data, but the D-SACK
424     option gives the sender a stronger basis for inferring that a
425     duplicate ACK does not acknowledge new data. The knowledge that a
426     duplicate ACK does not acknowledge new data allows the sender to
427     refrain from using that duplicate ACK to infer packet loss (e.g.,
428     Fast Retransmit) or to send more data (e.g., Fast Recovery).

430  5.2. False retransmit due to reordering

432     If packets are reordered in the network such that a segment arrives
433     more than 3 packets out of order, TCP's Fast Retransmit algorithm
434     will retransmit the out-of-order packet. An example of this is shown
435     below:

437     Transmitted  Received     ACK Sent
438     Segment      Segment      (Including SACK Blocks)

440     500-999      500-999      1000
441     1000-1499    (delayed)
442     1500-1999    1500-1999    1000, SACK=1500-2000
443     2000-2499    2000-2499    1000, SACK=1500-2500
444     2500-2999    2500-2999    1000, SACK=1500-3000
445     1000-1499    1000-1499    3000
446                  1000-1499    3000, SACK=1000-1500
447                                          ---------

449     In this case, an ACK containing a SACK block which is lower than its
450     ACK field and identical to a previously retransmitted segment is
451     indicative of a significant reordering followed by a false
452     (unnecessary) retransmission.

454     WITHOUT D-SACK:

456     With the use of D-SACK illustrated above, the sender knows that
457     either the first transmission of segment 1000-1499 was delayed in the
458     network, or the first transmission of segment 1000-1499 was dropped
459     and the second transmission of segment 1000-1499 was duplicated.
460     Given that no other segments have been duplicated in the network,
461     this second option can be considered unlikely.
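
        The two inference rules stated in Sections 5.1 and 5.2 (a D-SACK
        block below the ACK field that does, or does not, match a previously
        retransmitted segment) could be combined in a small sender-side
        record such as the Python sketch below. The class RetransmitLog and
        its methods are hypothetical names introduced only for illustration,
        and the result is the likely explanation only: as noted above, a
        dropped original followed by a duplicated retransmission cannot be
        ruled out, and the receiver might be lying.

```python
from typing import Set, Tuple

Block = Tuple[int, int]  # (left edge, right edge); right edge is exclusive


class RetransmitLog:
    """Hypothetical sender-side record of retransmitted segments."""

    def __init__(self) -> None:
        self._retransmitted: Set[Block] = set()

    def note_retransmit(self, seg: Block) -> None:
        """Remember that the sender retransmitted this sequence range."""
        self._retransmitted.add(seg)

    def classify(self, dsack: Block) -> str:
        """Label a D-SACK block that lies below the ACK field of its packet."""
        if dsack in self._retransmitted:
            # Each duplicate is reported in only one ACK, so the entry
            # is forgotten once it has been matched.
            self._retransmitted.discard(dsack)
            return "unnecessary retransmission"
        return "replication by the network"
```

        In the example of Section 5.2, the sender would call
        note_retransmit((1000, 1500)) when Fast Retransmit fires, and the
        later D-SACK block 1000-1500 would then be classified as an
        unnecessary retransmission. In the example of Section 5.1 nothing was
        retransmitted, so the same block would be classified as a replication
        by the network.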

463     Without the use of D-SACK, the sender would only know that either the
464     first transmission of segment 1000-1499 was delayed in the network,
465     or that either one of the data segments or the final ACK was
466     duplicated in the network. Thus, the use of D-SACK allows the sender
467     to more reliably infer that the first transmission of segment
468     1000-1499 was not dropped.

470     [AP99] and [L99] both note that the sender could unambiguously detect
471     an unnecessary retransmit with the use of the timestamp option. In
472     addition, [AP99] proposes a heuristic for detecting an unnecessary
473     retransmit in an environment with neither timestamps nor SACK. [L99]
474     also proposes a two-bit field as an alternative to the timestamp
475     option for unambiguously marking the first three retransmissions of a
476     packet. A similar idea was proposed in [ISO8073].

478     RESEARCH ISSUES:

480     The use of D-SACK allows the sender to detect some cases (e.g., when
481     no ACK packets have been lost) when a Fast Retransmit was due to
482     packet reordering instead of packet loss. This allows the TCP sender
483     to adjust the duplicate acknowledgement threshold, to prevent such
484     unnecessary Fast Retransmits in the future. Coupled with this, when
485     the sender determines, after the fact, that it has made an
486     unnecessary window reduction, the sender has the option of
487     ``undoing'' that reduction in the congestion window by resetting
488     ssthresh to the value of the old congestion window, and slow-starting
489     until the congestion window has reached that point.

491     Any proposal for ``undoing'' a reduction in the congestion window
492     would have to address the possibility that the TCP receiver could be
493     lying in its reports of received packets [SCWA99].

495  5.3. Retransmit Timeout Due to ACK Loss

497     If an entire window of ACKs is lost, a timeout will result.
        An
498     example of this is given below:

500     Transmitted  Received     ACK Sent
501     Segment      Segment      (Including SACK Blocks)

503     500-999      500-999      1000 (ACK dropped)
504     1000-1499    1000-1499    1500 (ACK dropped)
505     1500-1999    1500-1999    2000 (ACK dropped)
506     2000-2499    2000-2499    2500 (ACK dropped)
507        (timeout)
508     500-999      500-999      2500, SACK=500-1000
509                                          --------

511     In this case, all of the ACKs are dropped, resulting in a timeout.
512     This condition can be identified because the first ACK received
513     following the timeout carries a D-SACK block indicating duplicate
514     data was received.

516     WITHOUT D-SACK:

518     Without the use of D-SACK, the sender in this case would be unable to
519     determine that no data packets have been dropped.

521     RESEARCH ISSUES:

523     For a TCP that implements some form of ACK congestion control
524     [BPK97], this ability to distinguish between dropped data packets and
525     dropped ACK packets would be particularly useful. In this case, the
526     connection could implement congestion control for the return (ACK)
527     path independently from the congestion control on the forward (data)
528     path.

530  5.4. Early Retransmit Timeout

532     If the sender's RTO is too short, an early retransmission timeout can
533     occur when no packets have in fact been dropped in the network. An
534     example of this is given below:

536     Transmitted  Received     ACK Sent
537     Segment      Segment      (Including SACK Blocks)

539     500-999      (delayed)
540     1000-1499    (delayed)
541     1500-1999    (delayed)
542     2000-2499    (delayed)
543        (timeout)
544     500-999      (delayed)
545                  500-999      1000
546     1000-1499    (delayed)
547                  1000-1499    1500
548                     ...
549                  1500-1999    2000
550                  2000-2499    2500
551                  500-999      2500, SACK=500-1000
552                                          --------
553                  1000-1499    2500, SACK=1000-1500
554                                          ---------
555                     ...

557     In this case, the first packet is retransmitted following the
558     timeout. Subsequently, the original window of packets arrives at the
559     receiver, resulting in ACKs for these segments.
        Following this, the
560     retransmissions of these segments arrive, resulting in ACKs carrying
561     SACK blocks which identify the duplicate segments.

563     This can be identified as an early retransmission timeout because the
564     ACK for byte 1000 is received after the timeout with no SACK
565     information, followed by an ACK which carries SACK information
566     (500-999) indicating that the retransmitted segment had already been
567     received.

569     WITHOUT D-SACK:

571     If D-SACK was not used and one of the duplicate ACKs was piggybacked
572     on a data packet, the sender would not know how many duplicate
573     packets had been received. If D-SACK was not used and none of the
574     duplicate ACKs were piggybacked on a data packet, then the sender
575     would have sent N duplicate packets, for some N, and received N
576     duplicate ACKs. In this case, the sender could reasonably infer that
577     some data or ACK packet had been replicated in the network, or that
578     an early retransmission timeout had occurred (or that the receiver is
579     lying).

581     RESEARCH ISSUES:

583     After the sender determines that an unnecessary (i.e., early)
584     retransmit timeout has occurred, the sender could adjust parameters
585     for setting the RTO, to prevent more unnecessary retransmit timeouts.
586     Coupled with this, when the sender determines, after the fact, that
587     it has made an unnecessary window reduction, the sender has the
588     option of ``undoing'' that reduction in the congestion window.

590  6. Security Considerations

592     This document neither strengthens nor weakens TCP's current security
593     properties.

595  7. Acknowledgements

597     We would like to thank Mark Handley, Reiner Ludwig, and Venkat
598     Padmanabhan for conversations on these issues, and to thank Mark
599     Allman for helpful feedback on this document.

601  8. References

603     [AP99] Mark Allman and Vern Paxson, On Estimating End-to-End Network
604     Path Properties, SIGCOMM 99, August 1999.
        URL
605     ``http://www.acm.org/sigcomm/sigcomm99/papers/session7-3.html''.

607     [BPK97] Hari Balakrishnan, Venkata Padmanabhan, and Randy H. Katz,
608     The Effects of Asymmetry on TCP Performance, Third ACM/IEEE Mobicom
609     Conference, Budapest, Hungary, Sep 1997. URL
610     ``http://www.cs.berkeley.edu/~padmanab/index.html#Publications''.

612     [F99] Floyd, S., Re: TCP and out-of-order delivery, Message ID
613     <199902030027.QAA06775@owl.ee.lbl.gov> to the end-to-end-interest
614     mailing list, February 1999. URL
615     ``http://www.aciri.org/floyd/notes/TCP_Feb99.email''.

617     [ISO8073] ISO/IEC, Information-processing systems - Open Systems
618     Interconnection - Connection Oriented Transport Protocol
619     Specification, International Standard ISO/IEC 8073, December 1988.

621     [L99] Reiner Ludwig, A Case for Flow Adaptive Wireless links,
622     Technical Report UCB//CSD-99-1053, May 1999. URL
623     ``http://iceberg.cs.berkeley.edu/papers/Ludwig-FlowAdaptive/''.

625     [RFC1323] V. Jacobson, R. Braden, and D. Borman, TCP Extensions for
626     High Performance, RFC 1323, May 1992. URL
627     ``http://www.ietf.cnri.reston.va.us/rfc/rfc1323.txt''.

629     [RFC2018] Mathis, M., Mahdavi, J., Floyd, S., and Romanow, A., TCP
630     Selective Acknowledgement Options. RFC 2018, Proposed Standard, April
631     1996. URL ``ftp://ftp.isi.edu/in-notes/rfc2018.txt''.

633     [RFC2581] M. Allman, V. Paxson, W. Stevens, TCP Congestion Control,
634     RFC 2581, Proposed Standard, April 1999. URL
635     ``ftp://ftp.isi.edu/in-notes/rfc2581.txt''.

637     [SCWA99] Stefan Savage, Neal Cardwell, David Wetherall, Tom Anderson,
638     TCP Congestion Control with a Misbehaving Receiver, draft paper,
639     April 1999.

641  AUTHORS' ADDRESSES

643     Sally Floyd
644     AT&T Center for Internet Research at ICSI (ACIRI)
645     Phone: +1 510-642-4274 x189
646     Email: floyd@aciri.org
647     URL: http://www.aciri.org/floyd/

649     Jamshid Mahdavi
650     Novell
651     Phone: 1-408-967-3806
652     Email: mahdavi@novell.com

653     Matt Mathis
654     Pittsburgh Supercomputing Center
655     Phone: 412 268-3319
656     Email: mathis@psc.edu
657     URL: http://www.psc.edu/~mathis/

659     Matthew Podolsky
660     UC Berkeley Computer Science Dept.
661     Phone: 510-649-8914
662     Email: podolsky@eecs.berkeley.edu
663     URL: http://www.eecs.berkeley.edu/~podolsky

665     Allyn Romanow
666     Cisco Systems
667     Phone: 408-525-8836
668     Email: allyn@cisco.com

670     This draft was created in August 1999.
671     It expires February 2000.