idnits 2.17.1 draft-ietf-dccp-ccid2-10.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** It looks like you're using RFC 3978 boilerplate. You should update this to the boilerplate described in the IETF Trust License Policy document (see https://trustee.ietf.org/license-info), which is required now. -- Found old boilerplate from RFC 3978, Section 5.1.a on line 17. -- Found old boilerplate from RFC 3978, Section 5.5 on line 948. -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 959. -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 966. -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 972. ** Found boilerplate matching RFC 3978, Section 5.4, paragraph 1 (on line 940), which is fine, but *also* found old RFC 2026, Section 10.4C, paragraph 1 text on line 39. ** The document seems to lack an RFC 3978 Section 5.1 IPR Disclosure Acknowledgement. ** This document has an original RFC 3978 Section 5.4 Copyright Line, instead of the newer IETF Trust Copyright according to RFC 4748. ** This document has an original RFC 3978 Section 5.5 Disclaimer, instead of the newer disclaimer which includes the IETF Trust according to RFC 4748. ** The document uses RFC 3667 boilerplate or RFC 3978-like boilerplate instead of verbatim RFC 3978 boilerplate. After 6 May 2005, submission of drafts without verbatim RFC 3978 boilerplate is not accepted. The following non-3978 patterns matched text found in the document. That text should be removed or replaced: This document is an Internet-Draft and is subject to all provisions of Section 3 of RFC 3667. 
By submitting this Internet-Draft, each author represents that any applicable patent or other IPR claims of which he or she is aware have been or will be disclosed, and any of which he or she becomes aware will be disclosed, in accordance with Section 6 of BCP 79. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- == No 'Intended status' indicated for this document; assuming Proposed Standard Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- No issues found here. Miscellaneous warnings: ---------------------------------------------------------------------------- == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not match the current year == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD', or 'RECOMMENDED' is not an accepted usage according to RFC 2119. Please use uppercase 'NOT' together with RFC 2119 keywords (if that is what you mean). Found 'SHOULD not' in this paragraph: * Added that "The sender SHOULD not attempt Ack Ratio renegotiations more than once per round-trip time." -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (10 March 2005) is 6980 days in the past. Is this intentional? 
Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Unused Reference: 'RFC 793' is defined on line 873, but no explicit reference was found in the text == Unused Reference: 'RFC 2119' is defined on line 879, but no explicit reference was found in the text == Unused Reference: 'RFC 2434' is defined on line 882, but no explicit reference was found in the text == Unused Reference: 'RFC 2581' is defined on line 885, but no explicit reference was found in the text == Unused Reference: 'RFC 3390' is defined on line 894, but no explicit reference was found in the text == Unused Reference: 'RFC 2861' is defined on line 910, but no explicit reference was found in the text == Unused Reference: 'RFC 3540' is defined on line 916, but no explicit reference was found in the text == Outdated reference: A later version (-13) exists of draft-ietf-dccp-spec-11 ** Obsolete normative reference: RFC 793 (Obsoleted by RFC 9293) ** Obsolete normative reference: RFC 2434 (Obsoleted by RFC 5226) ** Obsolete normative reference: RFC 2581 (Obsoleted by RFC 5681) ** Obsolete normative reference: RFC 2988 (Obsoleted by RFC 6298) ** Obsolete normative reference: RFC 3517 (Obsoleted by RFC 6675) -- Obsolete informational reference (is this intentional?): RFC 2861 (Obsoleted by RFC 7661) Summary: 11 errors (**), 0 flaws (~~), 11 warnings (==), 8 comments (--). Run idnits with the --verbose option for more detailed information about the items above. 
-------------------------------------------------------------------------------- 1 Internet Engineering Task Force Sally Floyd 2 INTERNET-DRAFT ICIR 3 draft-ietf-dccp-ccid2-10.txt Eddie Kohler 4 Expires: 10 September 2005 UCLA 5 10 March 2005 7 Profile for DCCP Congestion Control ID 2: 8 TCP-like Congestion Control 10 Status of this Memo 12 This document is an Internet-Draft and is subject to all provisions 13 of section 3 of RFC 3667. By submitting this Internet-Draft, each 14 author represents that any applicable patent or other IPR claims of 15 which he or she is aware have been or will be disclosed, and any of 16 which he or she becomes aware will be disclosed, in accordance with 17 RFC 3668. 19 Internet-Drafts are working documents of the Internet Engineering 20 Task Force (IETF), its areas, and its working groups. Note that 21 other groups may also distribute working documents as Internet- 22 Drafts. 24 Internet-Drafts are draft documents valid for a maximum of six 25 months and may be updated, replaced, or obsoleted by other documents 26 at any time. It is inappropriate to use Internet-Drafts as 27 reference material or to cite them other than as "work in progress." 29 The list of current Internet-Drafts can be accessed at 30 http://www.ietf.org/ietf/1id-abstracts.txt. 32 The list of Internet-Draft Shadow Directories can be accessed at 33 http://www.ietf.org/shadow.html. 35 This Internet-Draft will expire on 10 September 2005. 37 Copyright Notice 39 Copyright (C) The Internet Society (2005). All Rights Reserved. 41 Abstract 43 This document contains the profile for Congestion Control Identifier 44 2, TCP-like Congestion Control, in the Datagram Congestion Control 45 Protocol (DCCP). 
CCID 2 should be used by senders who would like to 46 take advantage of the available bandwidth in an environment with 47 rapidly changing conditions, and who are able to adapt to the abrupt 48 changes in the congestion window typical of TCP's Additive Increase 49 Multiplicative Decrease (AIMD) congestion control. 51 TO BE DELETED BY THE RFC EDITOR UPON PUBLICATION: 53 Changes from draft-ietf-dccp-ccid2-07.txt: 55 * Restrict the use of byte-counting to be at most as aggressive 56 as the current TCP (without byte-counting). 58 Changes from draft-ietf-dccp-ccid2-06.txt: 60 * Moved three citations to Informational. 62 * Added that "The sender SHOULD not attempt Ack Ratio 63 renegotiations more than once per round-trip time." 65 * Specified that ssthresh is never less than two, instead of one. 67 * Added references to RFC 2988 and RFC 2018. 69 * Specify that the congestion window is only increased for packets 70 that aren't ECN-marked. 72 Changes from draft-ietf-dccp-ccid2-05.txt: 74 * Changes to the discussion about how the sender infers that DCCP- 75 Ack packets are lost. The sender does not know for sure whether a 76 missing sequence number is for a dropped ACK packet or a dropped 77 data packet. Our changes include a new appendix on "The Costs of 78 Inferring Lost Ack Packets". 80 * Minor editing for clarity, including some reordering of sections. 82 * Added a section on response to idle and application-limited 83 periods. 85 * Clarifications on changing the Ack Ratio, based on feedback from 86 Nils-Erik Mattsson. 88 Changes from draft-ietf-dccp-ccid2-04.txt: 90 * Minor editing, as follows: 91 - Added a note that CCID2 implementations MAY check for apps that 92 are 93 gaming with regard to the packet size. 94 - Deleted a statement that the maximum packet size is 1500 bytes. 95 - Added that the receiver MAY know the round-trip time from its 96 role as 97 - Added a note that the initial cwnd is up to four packets. 99 * Added Intellectual Property Notice. 
101 Changes from draft-ietf-dccp-ccid2-03.txt: 103 * Disallow direct tracking of TCP standards. 105 Changes from draft-ietf-dccp-ccid2-02.txt: 107 * Added to the section on application requirements. 109 * Changed the default Ack Ratio to be two, as recommended for TCP. 111 * Added a paragraph about packet sizes. 113 Changes from draft-ietf-dccp-ccid2-01.txt: 115 * Added "Security Considerations" and "IANA Considerations" 116 sections. 118 * Refer explicitly to SACK-based TCP, and flesh out Section 3 119 ("Congestion Control on Data Packets"). 121 * When cwnd < ssthresh, increase cwnd by one per newly acknowledged 122 packet up to some limit, in line with TCP Appropriate Byte Counting. 124 * Refined definition of quiescence. 126 Changes from draft-ietf-dccp-ccid2-00.txt: 128 * Said that the Acknowledgement Number reports the largest sequence 129 number, not the most recent packet, for consistency with draft-ietf- 130 dccp-spec. 132 * Added notes about ECN nonces for acknowledgements, and about 133 dealing with piggybacked acknowledgements. 135 Table of Contents 137 1. Introduction. . . . . . . . . . . . . . . . . . . . . . . . . 6 138 2. Conventions and Notation. . . . . . . . . . . . . . . . . . . 6 139 3. Usage . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 140 3.1. Relationship with TCP. . . . . . . . . . . . . . . . . . 7 141 3.2. Example Half-Connection. . . . . . . . . . . . . . . . . 8 142 4. Connection Establishment. . . . . . . . . . . . . . . . . . . 9 143 5. Congestion Control on Data Packets. . . . . . . . . . . . . . 9 144 5.1. Response to Idle and Application-limited 145 Periods . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 146 5.2. Response to Data Dropped and Slow Receiver . . . . . . . 12 147 5.3. Packet Size. . . . . . . . . . . . . . . . . . . . . . . 12 148 6. Acknowledgements. . . . . . . . . . . . . . . . . . . . . . . 13 149 6.1. Congestion Control on Acknowledgements . . . . . . . . . 13 150 6.1.1. 
Detecting Lost and Marked 151 Acknowledgements . . . . . . . . . . . . . . . . . . . . . 13 152 6.1.2. Changing Ack Ratio. . . . . . . . . . . . . . . . . 14 153 6.2. Acknowledgements of Acknowledgements . . . . . . . . . . 15 154 6.2.1. Determining Quiescence. . . . . . . . . . . . . . . 15 155 7. Explicit Congestion Notification. . . . . . . . . . . . . . . 16 156 8. Options and Features. . . . . . . . . . . . . . . . . . . . . 16 157 9. Security Considerations . . . . . . . . . . . . . . . . . . . 16 158 10. IANA Considerations. . . . . . . . . . . . . . . . . . . . . 16 159 10.1. Reset Codes . . . . . . . . . . . . . . . . . . . . . . 17 160 10.2. Option Types. . . . . . . . . . . . . . . . . . . . . . 17 161 10.3. Feature Numbers . . . . . . . . . . . . . . . . . . . . 17 162 11. Thanks . . . . . . . . . . . . . . . . . . . . . . . . . . . 17 163 A. Appendix: Derivation of Ack Ratio Decrease. . . . . . . . . . 18 164 B. Appendix: Cost of Loss Inference Mistakes to Ack 165 Ratio. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18 166 Normative References . . . . . . . . . . . . . . . . . . . . . . 20 167 Informative References . . . . . . . . . . . . . . . . . . . . . 21 168 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 21 169 Full Copyright Statement . . . . . . . . . . . . . . . . . . . . 22 170 Intellectual Property. . . . . . . . . . . . . . . . . . . . . . 22 172 1. Introduction 174 This document contains the profile for Congestion Control Identifier 175 2, TCP-like Congestion Control, in the Datagram Congestion Control 176 Protocol (DCCP) [DCCP]. DCCP uses Congestion Control Identifiers, 177 or CCIDs, to specify the congestion control mechanism in use on a 178 half-connection. 180 The TCP-like Congestion Control CCID sends data using a close 181 variant of TCP's congestion control mechanisms, incorporating 182 selective acknowledgements (SACK) [RFC 2018, RFC 3517]. 
CCID 2 is 183 suitable for senders who can adapt to the abrupt changes in 184 congestion window typical of TCP's Additive Increase Multiplicative 185 Decrease (AIMD) congestion control, and particularly useful for 186 senders who would like to take advantage of the available bandwidth 187 in an environment with rapidly changing conditions. See Section 3 188 for more on application requirements. 190 2. Conventions and Notation 192 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 193 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 194 document are to be interpreted as described in RFC 2119. 196 A DCCP half-connection consists of the application data sent by one 197 endpoint and the corresponding acknowledgements sent by the other 198 endpoint. The terms "HC-Sender" and "HC-Receiver" denote the 199 endpoints sending application data and acknowledgements, 200 respectively. Since CCIDs apply at the level of half-connections, 201 we abbreviate HC-Sender to "sender" and HC-Receiver to "receiver" in 202 this document. See [DCCP] for more discussion. 204 For simplicity, we say that senders send DCCP-Data packets and 205 receivers send DCCP-Ack packets. Both of these categories are meant 206 to include DCCP-DataAck packets. 208 The phrases "ECN-marked" and "marked" refer to packets marked ECN 209 Congestion Experienced unless otherwise noted. 211 3. Usage 213 CCID 2, TCP-like Congestion Control, is appropriate for DCCP flows 214 that would like to receive as much bandwidth as possible over the 215 long term, consistent with the use of end-to-end congestion control, 216 and that can tolerate the large sending rate variations 217 characteristic of AIMD congestion control, including halving of the 218 congestion window in response to a congestion event. 220 Applications that simply need to transfer as much data as possible 221 in as short a time as possible should use CCID 2. 
This contrasts 222 with CCID 3, TCP-Friendly Rate Control (TFRC) Congestion Control 223 [CCID 3 PROFILE], which is appropriate for flows that would prefer 224 to minimize abrupt changes in the sending rate. For example, CCID 2 225 is recommended over CCID 3 for streaming media applications that 226 buffer a considerable amount of data at the application receiver 227 before playback time, insulating the application somewhat from 228 abrupt changes in the sending rate. Such applications could easily 229 choose DCCP's CCID 2 over TCP itself, possibly adding some form of 230 selective reliability at the application layer. CCID 2 is also 231 recommended over CCID 3 for applications where halving the sending 232 rate in response to congestion is not likely to interfere with 233 application-level performance. 235 An additional advantage of CCID 2 is that its TCP-like congestion 236 control mechanisms are reasonably well understood, with traffic 237 dynamics quite similar to those of TCP. While the network research 238 community is still learning about the dynamics of TCP after 15 years 239 of its being the dominant transport protocol in the Internet, some 240 applications might prefer the better-known dynamics of TCP-like 241 congestion control over those of newer congestion control mechanisms, 242 which haven't yet met the test of widespread Internet deployment. 244 3.1. Relationship with TCP 246 The congestion control mechanisms described here closely follow 247 mechanisms standardized by the IETF for use in SACK-based TCP, and 248 we rely partially on existing TCP documentation, such as RFC 793, 249 RFC 2581, RFC 3465, and RFC 3517. TCP congestion control continues 250 to evolve, but CCID 2 implementations SHOULD wait for explicit 251 updates to CCID 2 rather than track TCP's evolution directly. 
252 Differences between CCID 2 and straight TCP congestion control 253 include the following: 255 o CCID 2 applies congestion control to acknowledgements, a 256 mechanism not currently standardized for use in TCP. 258 o DCCP is a datagram protocol, so several parameters whose units 259 are specified in bytes in TCP, such as the congestion window 260 cwnd, have units of packets in DCCP. 262 o As an unreliable protocol, DCCP never retransmits a packet, so 263 congestion control mechanisms that distinguish retransmissions 264 from new packets have been redesigned for the DCCP context. 266 3.2. Example Half-Connection 268 This example shows the typical progress of a half-connection using 269 CCID 2's TCP-like Congestion Control, not including connection 270 initiation and termination. The example is informative, not 271 normative. 273 1. The sender sends DCCP-Data packets, where the number of packets 274 sent is governed by a congestion window, cwnd, as in TCP. Each 275 DCCP-Data packet uses a sequence number. The sender also sends 276 an Ack Ratio feature option specifying the number of data 277 packets to be covered by an Ack packet from the receiver; Ack 278 Ratio defaults to two. The DCCP header's CCVal field is set to 279 zero. 281 Assuming that the half-connection is Explicit Congestion 282 Notification (ECN) capable (the ECN Incapable feature is zero -- 283 the default), each DCCP-Data packet is sent as ECN-Capable with 284 either the ECT(0) or the ECT(1) codepoint set, as described in 285 RFC 3540. 287 2. The receiver sends a DCCP-Ack packet acknowledging the data 288 packets for every Ack Ratio data packets transmitted by the 289 sender. Each DCCP-Ack packet uses a sequence number and 290 contains an Ack Vector. The sequence number acknowledged in a 291 DCCP-Ack packet is that of the received packet with the highest 292 sequence number, rather than a TCP-like cumulative 293 acknowledgement. 
295 The receiver returns the sum of received ECN Nonces via Ack 296 Vector options, allowing the sender to probabilistically verify 297 that the receiver is not misbehaving. DCCP-Ack packets from the 298 receiver are also sent as ECN-Capable, since the sender will 299 control the acknowledgement rate in a roughly TCP-friendly way 300 using the Ack Ratio feature. There is little need for the 301 receiver to verify the nonces of its DCCP-Ack packets, since the 302 sender cannot get significant benefit from misreporting the ack 303 mark rate. 305 3. The sender continues sending DCCP-Data packets as controlled by 306 the congestion window. Upon receiving DCCP-Ack packets, the 307 sender examines their Ack Vectors to learn about marked or 308 dropped data packets, and adjusts its congestion window 309 accordingly. Because this is unreliable transfer, the sender 310 does not retransmit dropped packets. 312 4. Because DCCP-Ack packets use sequence numbers, the sender has 313 some information about lost or marked DCCP-Ack packets. The 314 sender responds to lost or marked DCCP-Ack packets by modifying 315 the Ack Ratio sent to the receiver. 317 5. The sender acknowledges the receiver's acknowledgements at least 318 once per congestion window. If both half-connections are 319 active, the sender's acknowledgement of the receiver's 320 acknowledgements is included in the sender's acknowledgement of 321 the receiver's data packets. If the reverse-path half- 322 connection is quiescent, the sender sends a DCCP-DataAck packet 323 that includes an Acknowledgement Number in the header. 325 6. The sender estimates round-trip times, either through keeping 326 track of acknowledgement round-trip times as TCP does or through 327 explicit Timestamp options, and calculates a TimeOut (TO) value 328 much as the RTO (Retransmit Timeout) is calculated in TCP. 
The 329 TO is used to determine when a new DCCP-Data packet can be 330 transmitted when the sender has been limited by the congestion 331 window and no feedback has been received from the receiver. 333 4. Connection Establishment 335 Use of the Ack Vector is MANDATORY on CCID 2 half-connections, so 336 the sender MUST send a "Change R(Send Ack Vector, 1)" option to the 337 receiver as part of connection establishment. The sender SHOULD NOT 338 send data until it has received the corresponding "Confirm L(Send 339 Ack Vector, 1)" from the receiver, except possibly for data included 340 on the initial DCCP-Request packet. 342 5. Congestion Control on Data Packets 344 CCID 2's congestion control mechanisms are based on those for SACK- 345 based TCP [RFC 3517], since the Ack Vector provides all the 346 information that might be transmitted in SACK options. 348 A CCID 2 data sender maintains three integer parameters measured in 349 packets. 351 1. The congestion window "cwnd", which equals the maximum number of 352 data packets allowed in the network at any time. ("Data packet" 353 means any DCCP packet that contains user data: DCCP-Data, DCCP- 354 DataAck, and occasionally DCCP-Request and DCCP-Response.) 356 2. The slow-start threshold "ssthresh", which controls adjustments 357 to cwnd. 359 3. The pipe value "pipe", which is the sender's estimate of the 360 number of data packets outstanding in the network. 362 These parameters are manipulated, and their initial values 363 determined, according to SACK-based TCP's behavior, except that they 364 are measured in packets, not bytes. The rest of this section 365 provides more specific guidance. 367 The sender MAY send a data packet when pipe < cwnd, but MUST NOT 368 send a data packet when pipe >= cwnd. Every data packet sent 369 increases pipe by 1. 371 The sender reduces pipe as it infers that data packets have left the 372 network, either by being received or by being dropped. In 373 particular: 375 1. 
Acked data packets. The sender reduces pipe by 1 for each data 376 packet newly-acknowledged as received (Ack Vector State 0 or 377 State 1) by some DCCP-Ack. 379 2. Dropped data packets. The sender reduces pipe by 1 for each 380 data packet it can infer as lost due to the DCCP equivalent of 381 TCP's "duplicate acknowledgements". This depends on the 382 NUMDUPACK parameter, the number of duplicate acknowledgements 383 needed to infer a loss. The NUMDUPACK parameter is set to 384 three, as is currently the case in TCP. A packet P is inferred 385 to be lost, rather than delayed, when at least NUMDUPACK packets 386 transmitted after P have been acknowledged as received (Ack 387 Vector State 0 or 1) by the receiver. Note that the 388 acknowledged packets following the hole may be DCCP-Acks or 389 other non-data packets. 391 3. Transmit timeouts. Finally, the sender needs transmit timeouts, 392 handled like TCP's retransmission timeouts, in case an entire 393 window of packets is lost. The sender estimates the round-trip 394 time at most once per window of data, and uses the TCP 395 algorithms for maintaining the average round-trip time, mean 396 deviation, and timeout value [RFC 2988]. (If more than one 397 measurement per round-trip time was used for these calculations, 398 then the weights of the averagers would have to be adjusted, so 399 that the average round-trip time is effectively derived from 400 measurements over multiple round-trip times.) Because DCCP does 401 not retransmit data, DCCP does not require TCP's recommended 402 minimum timeout of one second. The exponential backoff of the 403 timer is exactly as in TCP. When a transmit timeout occurs, the 404 sender sets pipe to zero. The adjustments to cwnd and ssthresh 405 are described below. 407 The sender MUST NOT decrement pipe more than once per data packet. 408 True duplicate acknowledgements, for example, MUST NOT affect pipe. 
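The pipe accounting rules above can be sketched as follows. This sketch is informative, not normative. The class PipeTracker, its method names, and its per-packet bookkeeping are hypothetical and are not defined by this document or by [DCCP]. The sketch assumes that sequence numbers increase in send order and do not wrap, that the caller has already decoded each Ack Vector into the sequence numbers reported as received (State 0 or State 1), and that packets still outstanding at a transmit timeout are treated as having left the network, so that a late acknowledgement cannot drive pipe below zero.

```python
NUMDUPACK = 3  # acknowledged later packets needed to infer a loss


class PipeTracker:
    """Hypothetical bookkeeping for the 'pipe' value of one CCID 2 sender."""

    def __init__(self, cwnd):
        self.cwnd = cwnd
        self.pipe = 0
        self._pkts = {}  # seqno -> per-packet flags

    def can_send(self):
        # A data packet may be sent only while pipe < cwnd.
        return self.pipe < self.cwnd

    def on_send(self, seqno, is_data=True):
        self._pkts[seqno] = {"data": is_data, "acked": False, "left": False}
        if is_data:
            self.pipe += 1  # only data packets occupy the pipe

    def _leave(self, pkt):
        # Decrement pipe at most once per data packet, never for non-data.
        if pkt["data"] and not pkt["left"]:
            self.pipe -= 1
        pkt["left"] = True

    def on_ack_vector(self, received_seqnos):
        # Rule 1: newly acknowledged packets (Ack Vector State 0 or 1).
        for seqno in received_seqnos:
            pkt = self._pkts.get(seqno)
            if pkt is not None and not pkt["acked"]:
                pkt["acked"] = True
                self._leave(pkt)
        # Rule 2: a packet is inferred lost once NUMDUPACK packets sent
        # after it (data or non-data) have been acknowledged as received.
        acked = [s for s, p in self._pkts.items() if p["acked"]]
        for seqno, pkt in self._pkts.items():
            if pkt["acked"] or pkt["left"]:
                continue
            if sum(1 for s in acked if s > seqno) >= NUMDUPACK:
                self._leave(pkt)

    def on_timeout(self):
        # Rule 3: a transmit timeout sets pipe to zero. Marking every
        # packet as gone is an assumption of this sketch.
        for pkt in self._pkts.values():
            pkt["left"] = True
        self.pipe = 0
```

With cwnd = 4 and packets 1 through 4 sent, an acknowledgement of packets 2, 3, and 4 removes those three from the pipe and also lets the sender infer that packet 1 was lost, so pipe returns to zero. A repeated acknowledgement of the same packets changes nothing.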
409 Furthermore, the sender MUST NOT decrement pipe for non-data 410 packets, such as DCCP-Acks, even though the Ack Vector will contain 411 information about them. 413 Congestion events cause CCID 2 to reduce its congestion window. A 414 congestion event contains at least one lost or marked packet. As in 415 TCP, two losses or marks are considered to be part of a single 416 congestion event when the second packet was sent before the loss or 417 mark of the first packet was detected. As an approximation, a 418 sender can consider two losses or marks to be part of a single 419 congestion event when the packets were sent within one RTT estimate 420 of one another, using an RTT estimate current at the time the 421 packets were sent. For each congestion event, either indicated 422 explicitly as an Ack Vector State 1 (ECN-marked) acknowledgement or 423 inferred via "duplicate acknowledgements", cwnd is halved, then 424 ssthresh is set to the new cwnd. Cwnd is never reduced below one 425 packet. After a timeout, the slow-start threshold is set to cwnd/2, 426 then cwnd is set to one packet. When halved, cwnd and ssthresh have 427 their values rounded down, except that cwnd is never less than one 428 and ssthresh is never less than two. 430 When cwnd < ssthresh, meaning that the sender is in slow-start, the 431 congestion window is increased by one packet for every two newly 432 acknowledged data packets with Ack Vector State 0 (not ECN-marked), 433 up to a maximum of Ack Ratio/2 packets per acknowledgement. This is 434 a modified form of Appropriate Byte Counting [RFC 3465] that is 435 consistent with TCP's current standard (which does not include byte- 436 counting), but allows CCID 2 to increase as aggressively as TCP when 437 CCID 2's Ack Ratio is greater than the default value of two. When 438 cwnd >= ssthresh, the congestion window is increased by one packet 439 for every window of data acknowledged without lost or marked 440 packets. 
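The window adjustments described above can be sketched as follows. This sketch is informative, not normative. The class CongestionWindow and its method names are hypothetical. The sketch assumes that the caller groups losses and marks into congestion events and supplies the initial cwnd and ssthresh. Two further choices are one reading of the text rather than requirements: an odd newly acknowledged packet is carried over to the next acknowledgement, and the slow-start limit is applied by crediting at most Ack Ratio packets per acknowledgement, which bounds the increase at about Ack Ratio/2.

```python
class CongestionWindow:
    """Hypothetical cwnd/ssthresh state for one CCID 2 sender, in packets."""

    def __init__(self, cwnd, ssthresh):
        self.cwnd = cwnd
        self.ssthresh = ssthresh
        self._ss_credit = 0  # slow start: acked packets not yet counted
        self._ca_acked = 0   # congestion avoidance: acked in this window

    def on_ack(self, newly_acked_unmarked, ack_ratio):
        """Process one acknowledgement reporting State 0 data packets."""
        if self.cwnd < self.ssthresh:
            # Slow start: one packet of increase per two acked packets.
            # Crediting at most ack_ratio packets is this sketch's cap.
            self._ss_credit += min(newly_acked_unmarked, ack_ratio)
            self.cwnd += self._ss_credit // 2
            self._ss_credit %= 2
        else:
            # Congestion avoidance: one packet per window acknowledged.
            self._ca_acked += newly_acked_unmarked
            if self._ca_acked >= self.cwnd:
                self._ca_acked -= self.cwnd
                self.cwnd += 1

    def _reset_counters(self):
        self._ss_credit = 0
        self._ca_acked = 0

    def on_congestion_event(self):
        # Halve cwnd (rounded down, never below one), then set ssthresh
        # to the new cwnd (never below two).
        self.cwnd = max(1, self.cwnd // 2)
        self.ssthresh = max(2, self.cwnd)
        self._reset_counters()

    def on_timeout(self):
        # ssthresh becomes cwnd/2 (never below two), then cwnd becomes one.
        self.ssthresh = max(2, self.cwnd // 2)
        self.cwnd = 1
        self._reset_counters()
```

For example, starting from cwnd = 4 with Ack Ratio 2, two acknowledgements that each cover two unmarked data packets raise cwnd to 6. A congestion event then leaves cwnd = 3 and ssthresh = 3, after which cwnd grows by one packet per acknowledged window.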
The cwnd parameter is initialized to at most four packets 441 for new connections, following the rules from RFC 3390; the ssthresh 442 parameter is initialized to an arbitrarily high value. 444 Senders MAY use a form of rate-based pacing when sending multiple 445 data packets liberated by a single ack packet, rather than sending 446 all liberated data packets in a single burst. 448 5.1. Response to Idle and Application-limited Periods 450 CCID 2 is designed to follow TCP's congestion control mechanisms to 451 the extent possible, but TCP does not have complete standardization 452 for its congestion control response to idle periods (when no data 453 packets are sent) or to application-limited periods (when the 454 sending rate is less than that allowed by cwnd). This section is a 455 brief guide to the standards for TCP in this area. 457 For idle periods, RFC 2581 recommends that the TCP sender SHOULD 458 slow-start after an idle period, where an idle period is defined as 459 a period exceeding the timeout interval. RFC 2861, currently 460 Experimental, suggests a slightly more moderate mechanism where the 461 congestion window is halved for every round-trip time that the 462 sender has remained idle. 464 There are currently no standards governing TCP's use of the 465 congestion window during an application-limited period. In 466 particular, it is possible for TCP's congestion window to grow quite 467 large during a long uncongested period when the sender is 468 application-limited, sending at a low rate. RFC 2861 essentially 469 suggests that TCP's congestion window not be increased during 470 application-limited periods, when the congestion window is not being 471 fully utilized. 473 5.2. Response to Data Dropped and Slow Receiver 475 As described in [DCCP], the Data Dropped option lets an endpoint 476 declare that a packet was dropped at the end host before delivery to 477 the application -- for instance, because of corruption or receive 478 buffer overflow. 
CCID 2 senders respond to these options as 479 described in [DCCP], with the following further clarifications. 481 o Drop Code 2 ("receive buffer drop"). The congestion window 482 "cwnd" is reduced by one for each packet newly acknowledged as 483 Drop Code 2, except that it is never reduced below one. 485 o Exiting slow-start. The sender MUST exit slow start whenever it 486 receives a relevant Data Dropped or Slow Receiver option. 488 5.3. Packet Size 490 CCID 2 is optimized for applications that generally use a fixed 491 packet size, and that vary their sending rate in packets per second 492 in response to congestion. CCID 2 is not appropriate for 493 applications that require a fixed interval of time between packets, 494 and vary their packet size instead of their packet rate in response 495 to congestion. CCID 2 maintains a congestion window in packets, and 496 does not increase the congestion window in response to a decrease in 497 the packet size. However, some attention might be required for 498 applications using CCID 2 that vary their packet size not in 499 response to congestion, but in response to other application-level 500 requirements. 502 CCID 2 implementations MAY check for applications that appear to be 503 manipulating the packet size inappropriately. For example, an 504 application might send small packets for a while, building up a fast 505 rate, then switch to large packets to take advantage of the fast 506 rate. (Preliminary simulations indicate that applications may not 507 be able to increase their overall transfer rates this way, so it is 508 not clear this manipulation will occur in practice [V03].) 510 6. Acknowledgements 512 CCID 2 acknowledgements are generally paced by the sender's data 513 packets. Each required acknowledgement MUST contain Ack Vector 514 options that declare exactly which packets arrived, and whether 515 those packets were ECN-marked. 
Acknowledgement data in the Ack
Vector options SHOULD generally cover the receiver's entire
Acknowledgement Window; see [DCCP] (Section 11.4.2).

CCID 2 senders use DCCP's Ack Ratio feature to influence the rate at
which DCCP-Ack packets are generated, thus controlling reverse-path
congestion. This differs from TCP, which presently has no
congestion control for pure acknowledgement traffic. CCID 2's
reverse-path congestion control does not try to be TCP-friendly; it
just tries to avoid congestion collapse, and to be somewhat better
than TCP in the presence of a high packet loss or mark rate on the
reverse path. The default Ack Ratio is two, and CCID 2 with this
Ack Ratio behaves like TCP with delayed acks. [DCCP] (Section 11.3)
describes the Ack Ratio in more detail, including its relationship
to acknowledgement pacing and DCCP-DataAck packets. Section 6.1.1
below describes the sender's detection of lost or marked
acknowledgements, and Section 6.1.2 gives the sender's rules for
changing the Ack Ratio.

6.1. Congestion Control on Acknowledgements

When Ack Ratio is R, the receiver sends one DCCP-Ack packet per R
data packets, more or less. Since the sender sends cwnd data
packets per round-trip time, the acknowledgement rate equals cwnd/R
DCCP-Acks per round-trip time. The sender keeps the acknowledgement
rate roughly TCP-friendly by monitoring the acknowledgement stream
for lost and marked DCCP-Ack packets, and modifying R accordingly.
For every RTT containing a DCCP-Ack congestion event (that is, a
lost or marked DCCP-Ack), the sender halves the acknowledgement rate
by doubling Ack Ratio; for every RTT containing no DCCP-Ack
congestion event, it additively increases the acknowledgement rate
through gradual decreases in Ack Ratio.

6.1.1. Detecting Lost and Marked Acknowledgements

All packets from the receiver contain sequence numbers, so the
sender can detect both losses and marks on the receiver's packets.
The sender infers receiver packet loss in the same way as it infers
losses of its data packets: a packet from the receiver is considered
lost after at least NUMDUPACK packets with greater sequence numbers
have been received.

DCCP-Ack packets are generally small, so they might impose less load
on congested network links than DCCP-Data and DCCP-DataAck packets.
For this reason, Ack Ratio depends on losses and marks on the
receiver's non-data packets, not on aggregate losses and marks on
all of the receiver's packets. The non-data packet category
consists of those packet types that cannot carry application data:
DCCP-Ack, DCCP-Close, DCCP-CloseReq, DCCP-Reset, DCCP-Sync, and
DCCP-SyncAck. The sender can easily distinguish non-data marks from
other marks. This is harder for losses, though, since the sender
can't always know whether a lost packet carried data. Unless it has
better information, the sender SHOULD assume, for the purpose of Ack
Ratio calculation, that every lost packet was a non-data packet.
Better information is available via DCCP's NDP Count option, if
necessary. (Appendix B discusses the costs of mistaking data packet
loss for non-data packet loss.)

A receiver that implements its own acknowledgement congestion
control SHOULD NOT reduce its DCCP-Ack acknowledgement rate due to
losses or marks on its data packets.

6.1.2. Changing Ack Ratio

Ack Ratio always meets three constraints: (1) Ack Ratio is an
integer. (2) Ack Ratio does not exceed cwnd/2, rounded up, except
that Ack Ratio 2 is always acceptable. (3) Ack Ratio is two or more
for a congestion window of four or more packets.

The sender changes Ack Ratio within those constraints as follows.
For each congestion window of data with lost or marked DCCP-Ack
packets, Ack Ratio is doubled; and for each cwnd/(R^2 - R)
consecutive congestion windows of data with no lost or marked
DCCP-Ack packets, Ack Ratio is decreased by 1. (See Appendix A for
the derivation.) Changes in Ack Ratio are signalled through feature
negotiation; see [DCCP] (Section 11.3).

For a constant congestion window, this gives an Ack sending rate
that is roughly TCP-friendly. Of course, cwnd usually varies over
time; the dynamics will be rather complex, but roughly TCP-friendly.
We recommend that the sender use the most recent value of cwnd when
determining whether to decrease Ack Ratio by 1.

The sender need not keep Ack Ratio completely up to date. For
instance, it MAY rate-limit Ack Ratio renegotiations to once every
four or five round-trip times, or to once every second or two. The
sender SHOULD NOT attempt to renegotiate the Ack Ratio more than
once per round-trip time. Additionally, it MAY enforce a minimum
Ack Ratio of two, or it MAY set Ack Ratio to one for
half-connections with persistent congestion windows of 1 or 2
packets.

Putting it all together, the receiver always sends at least one
acknowledgement per window of data when cwnd = 1, and at least two
acknowledgements per window of data otherwise. Thus, the receiver
could be sending two ack packets per window of data even in the face
of very heavy congestion on the reverse path. We would note,
however, that if congestion is sufficiently heavy that all of the
ack packets are dropped, then the sender falls back on an
exponentially-backed-off timeout, as in TCP. Thus, if congestion is
sufficiently heavy on the reverse path, then the sender reduces its
sending rate on the forward path, which reduces the rate on the
reverse path as well.

6.2. Acknowledgements of Acknowledgements

An active sender DCCP A MUST occasionally acknowledge its peer DCCP
B's acknowledgements, so that DCCP B can free up Ack Vector state.
When both half-connections are active, A's acknowledgements of B's
acknowledgements are automatically contained in A's acknowledgements
of B's data. If the B-to-A half-connection is quiescent, however,
DCCP A must occasionally send acknowledgements proactively, such as
by sending a DCCP-DataAck packet that includes an Acknowledgement
Number in the header.

An active sender SHOULD acknowledge the receiver's acknowledgements
at least once per congestion window. Of course, the sender's
application might fall silent. This is no problem; when neither
side is sending data, a sender can wait arbitrarily long before
sending an ack.

6.2.1. Determining Quiescence

This section describes how a CCID 2 receiver determines that the
corresponding sender is not sending any data, and therefore has gone
quiescent. See [DCCP] (Section 11.1) for general information on
quiescence.

Let T equal the greater of 0.2 seconds and two round-trip times.
(The receiver may know the round-trip time in its role as the sender
for the other half-connection. If it does not, it should use a
default RTT of 0.2 seconds, as described in [DCCP] (Section 3.4).)
Once the sender acknowledges the receiver's Ack Vectors, and the
sender has not sent additional data for at least T seconds, the
receiver can infer that the sender is quiescent. More precisely,
the receiver infers that the sender has gone quiescent when at least
T seconds have passed without receiving any data from the sender,
and the sender has acknowledged receiver Ack Vectors covering all
data packets received at the receiver.

7. Explicit Congestion Notification

CCID 2 supports Explicit Congestion Notification (ECN) [RFC 3168].
The sender will use the ECN Nonce for data packets, and the receiver
will echo those nonces in its Ack Vectors, as specified in [DCCP]
(Section 12.2). Information about marked packets is also returned
in the Ack Vector. Because the information in the Ack Vector is
reliably transferred, DCCP does not need the TCP flags of ECN-Echo
and Congestion Window Reduced.

For unmarked data packets, the receiver computes the ECN Nonce Echo
as in RFC 3540, and returns it as part of its Ack Vector options.
The sender SHOULD check these ECN Nonce Echoes against the expected
values, thus protecting against the accidental or malicious
concealment of marked packets.

Because CCID 2 acknowledgements are congestion-controlled, ECN may
also be used for its acknowledgements. In this case we do not make
use of the ECN Nonce, because it would not be easy to provide
protection against the concealment of marked ack packets by the
sender, and because the sender does not have much motivation for
lying about the mark rate on acknowledgements.

8. Options and Features

DCCP's Ack Vector option, and its ECN Capable, Ack Ratio, and Send
Ack Vector features, are relevant for CCID 2.

9. Security Considerations

Security considerations for DCCP have been discussed in [DCCP], and
security considerations for TCP have been discussed in RFC 2581.

RFC 2581 discusses ways that an attacker could impair the
performance of a TCP connection by dropping packets, or by forging
extra duplicate acknowledgements or acknowledgements for new data.
We are not aware of any new security considerations created by this
document in its use of TCP-like congestion control.

10. IANA Considerations

This specification defines the value 2 in the DCCP CCID namespace
managed by IANA. This assignment is also mentioned in [DCCP].

CCID 2 also introduces three sets of numbers whose values should be
allocated by IANA, namely CCID 2-specific Reset Codes, option types,
and feature numbers. These ranges will prevent any future
CCID 2-specific allocations from polluting DCCP's corresponding
global namespaces; see [DCCP] (Section 10.3). However, this
document makes no particular allocations from any range, except for
experimental and testing use [RFC 3692]. We refer to the Standards
Action policy outlined in RFC 2434.

10.1. Reset Codes

Each entry in the DCCP CCID 2 Reset Code registry contains a
CCID 2-specific Reset Code, which is a number in the range 128-255;
a short description of the Reset Code; and a reference to the RFC
defining the Reset Code. Reset Codes 184-190 and 248-254 are
permanently reserved for experimental and testing use. The
remaining Reset Codes -- 128-183, 191-247, and 255 -- are currently
reserved, and should be allocated with the Standards Action policy,
which requires IESG review and approval and standards-track IETF RFC
publication.

10.2. Option Types

Each entry in the DCCP CCID 2 option type registry contains a
CCID 2-specific option type, which is a number in the range 128-255;
the name of the option; and a reference to the RFC defining the
option type. Option types 184-190 and 248-254 are permanently
reserved for experimental and testing use. The remaining option
types -- 128-183, 191-247, and 255 -- are currently reserved, and
should be allocated with the Standards Action policy, which requires
IESG review and approval and standards-track IETF RFC publication.

10.3. Feature Numbers

Each entry in the DCCP CCID 2 feature number registry contains a
CCID 2-specific feature number, which is a number in the range
128-255; the name of the feature; and a reference to the RFC
defining the feature number.
Feature numbers 184-190 and 248-254
are permanently reserved for experimental and testing use. The
remaining feature numbers -- 128-183, 191-247, and 255 -- are
currently reserved, and should be allocated with the Standards
Action policy, which requires IESG review and approval and
standards-track IETF RFC publication.

11. Thanks

We thank Mark Handley and Jitendra Padhye for their help in defining
CCID 2. We also thank Mark Allman, Aaron Falk, Nils-Erik Mattsson,
Greg Minshall, Arun Venkataramani, Magnus Westerlund, and members of
the DCCP Working Group for feedback on this document.

A. Appendix: Derivation of Ack Ratio Decrease

This section justifies the algorithm for increasing and decreasing
the Ack Ratio given in Section 6.1.2.

The congestion avoidance phase of TCP halves the cwnd for every
window with congestion. Similarly, CCID 2 doubles Ack Ratio for
every window with congestion on the return path, roughly halving the
DCCP-Ack sending rate.

The congestion avoidance phase of TCP increases cwnd by one MSS for
every congestion-free window. Applying this congestion avoidance
behavior to acknowledgement traffic, this would correspond to
increasing the number of DCCP-Ack packets per window by one after
every congestion-free window of DCCP-Ack packets. We cannot achieve
this exactly using Ack Ratio, since it is an integer. Instead, we
must decrease Ack Ratio by one after K windows have been sent
without a congestion event on the reverse path, where K is chosen so
that the long-term number of DCCP-Ack packets per congestion window
is roughly TCP-friendly, following AIMD congestion control.

In CCID 2, rough TCP-friendliness for the ack traffic can be
accomplished by setting K to cwnd/(R^2 - R), where R is the current
Ack Ratio.
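
As a non-normative illustration, the Ack Ratio rule of Section 6.1.2
with this value of K might be sketched in Python as follows. The
names clamp_ack_ratio, windows_before_decrease, AckRatioState, and
on_window_end are hypothetical and do not appear in this document or
in [DCCP]. The sketch assumes the sender evaluates the rule once per
congestion window of data and is told whether that window saw a lost
or marked DCCP-Ack. Clamping the result to the three constraints
after every update is one reading of Section 6.1.2, not a requirement
stated there.

```python
def clamp_ack_ratio(ratio: int, cwnd: int) -> int:
    """Apply constraints (2) and (3) of Section 6.1.2 to an integer ratio."""
    upper = max(2, (cwnd + 1) // 2)  # cwnd/2 rounded up; 2 is always acceptable
    lower = 2 if cwnd >= 4 else 1    # two or more once cwnd is four or more
    return max(lower, min(ratio, upper))


def windows_before_decrease(cwnd: int, ratio: int) -> float:
    """K = cwnd / (R^2 - R); defined only for R >= 2."""
    if ratio < 2:
        raise ValueError("K is undefined for Ack Ratio below 2")
    return cwnd / (ratio * ratio - ratio)


class AckRatioState:
    """Hypothetical per-half-connection state (an assumption of this sketch)."""

    def __init__(self, ratio: int = 2) -> None:
        self.ratio = ratio       # the default Ack Ratio is two
        self.clean_windows = 0   # consecutive windows with no ack congestion

    def on_window_end(self, cwnd: int, ack_congestion: bool) -> int:
        """Update the desired Ack Ratio after one congestion window of data."""
        if ack_congestion:
            # Lost or marked DCCP-Ack in this window: double Ack Ratio.
            self.ratio *= 2
            self.clean_windows = 0
        else:
            self.clean_windows += 1
            # Use the most recent cwnd when deciding whether to decrease.
            if (self.ratio >= 2 and
                    self.clean_windows >= windows_before_decrease(cwnd, self.ratio)):
                self.ratio -= 1
                self.clean_windows = 0
        self.ratio = clamp_ack_ratio(self.ratio, cwnd)
        return self.ratio
```

The sketch only computes the Ack Ratio the sender would like to use.
A real sender would still signal each change through feature
negotiation and could rate-limit renegotiations as Section 6.1.2
allows; both are omitted here.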

The value K = cwnd/(R^2 - R) was calculated as follows:

   R = Ack Ratio = # data packets / ack packets, and
   W = Congestion Window = # data packets / window, so
   W/R = # ack packets / window.

   Requirement: Increase W/R by 1 per congestion-free window.
   Since we can only reduce R by increments of one, we find K
   so that, after K congestion-free windows,
   W/R + K would equal W/(R-1).

   (W/R) + K = W/(R-1), so
   K = W/(R-1) - W/R = W/(R^2 - R).

B. Appendix: Cost of Loss Inference Mistakes to Ack Ratio

As discussed in Section 6.1.1, the sender often cannot determine
whether lost packets carried data. This hinders its ability to
separate non-data loss events from other loss events. In the
absence of better information, the sender assumes, for the purpose
of Ack Ratio calculation, that all lost packets were non-data
packets. This may overestimate the non-data loss event rate, which
can lead to a too-high Ack Ratio, and thus a too-slow
acknowledgement rate. All acknowledgement information will still
get through -- DCCP acknowledgements are reliable -- but
acknowledgement information will arrive in a burstier fashion.
Absent some form of rate-based pacing, this could lead to increased
burstiness for the sender's data traffic.

There are several cases when the problem of an overly-high Ack
Ratio, and the resulting increased burstiness of the data traffic,
will not arise. In particular, call the receiver DCCP B and the
sender DCCP A. Then:

o  The problem won't arise unless DCCP B is sending a significant
   amount of data itself. When the B-to-A half-connection is
   quiescent or low-rate, most packets sent by DCCP B will, in fact,
   be pure acknowledgements, and DCCP A's estimate of the DCCP-Ack
   loss rate will be reasonably accurate.

o  The problem won't arise if DCCP B habitually piggybacks
   acknowledgement information on its data packets.
   The piggybacked
   acknowledgements are not limited by Ack Ratio, so they can arrive
   frequently enough to prevent burstiness.

o  The problem won't arise if DCCP A's sending rate is low, since
   burstiness isn't a problem at low rates.

o  The problem won't arise if DCCP B's sending rate is high relative
   to DCCP A's sending rate, since the B-to-A loss rate must be low
   to support DCCP B's sending rate. This bounds the Ack Ratio to
   reasonable values even when DCCP A labels every loss as a
   DCCP-Ack loss.

o  The problem won't arise if DCCP B sends NDP Count options when
   appropriate (the Send NDP Count/B feature is true). Then the
   sender can use the receiver's NDP Count options to detect, in
   most cases, whether lost packets were data packets or DCCP-Acks.

o  Finally, the problem won't arise if DCCP A rate-paces its data
   packets.

This leaves the case when DCCP B is sending roughly the same amount
of data packets and non-data packets, without NDP Count options, and
with all acknowledgement information in DCCP-Ack packets. We now
quantify the potential cost, in terms of a too-large Ack Ratio, due
to the sender's misclassifying data packet losses as DCCP-Ack
losses. For simplicity, we assume an environment of large-scale
statistical multiplexing, where the packet drop rate is independent
of the sending rate of any individual connection.

Assume that when DCCP A correctly counts non-data losses, Ack Ratio
is set so that B-to-A data and acknowledgement traffic both have a
sending rate of D packets per second. Then when DCCP A incorrectly
counts data losses as non-data losses, the sending rate for the
B-to-A data traffic is still D pps, but the reduced sending rate for
the B-to-A acknowledgement traffic is f*D pps, with f < 1. Let the
packet loss rate be p. The sender incorrectly estimates the
non-data loss rate as (pD+pfD)/fD, or, equivalently, as p(1 + 1/f).
Because the congestion control mechanism for acknowledgement traffic
is roughly TCP-friendly, and therefore the non-data sending rate and
the data sending rate both grow as 1/sqrt(x) for x the packet drop
rate, we have

   fD/D = sqrt(p)/sqrt(p(1 + 1/f)),

so

   f^2 = 1/(1 + 1/f).

This reduces to f^2 + f = 1, whose positive root is
f = (sqrt(5) - 1)/2, or about 0.62. If the sender incorrectly counts
lost data packets as non-data in this scenario, the acknowledgement
rate falls to about 0.62 times its correct value. This would result
in a moderate increase in burstiness for the A-to-B data traffic,
which could be mitigated by sending NDP Count options or piggybacked
acknowledgements, or by rate-pacing out the data.

Normative References

[DCCP] E. Kohler, M. Handley, and S. Floyd. Datagram Congestion
    Control Protocol. draft-ietf-dccp-spec-11.txt, work in progress,
    March 2005.

[RFC 793] J. Postel, editor. Transmission Control Protocol.
    RFC 793.

[RFC 2018] M. Mathis, J. Mahdavi, S. Floyd, and A. Romanow. TCP
    Selective Acknowledgement Options. RFC 2018, October 1996.

[RFC 2119] S. Bradner. Key Words For Use in RFCs to Indicate
    Requirement Levels. RFC 2119.

[RFC 2434] T. Narten and H. Alvestrand. Guidelines for Writing an
    IANA Considerations Section in RFCs. RFC 2434.

[RFC 2581] M. Allman, V. Paxson, and W. Stevens. TCP Congestion
    Control. RFC 2581.

[RFC 2988] V. Paxson and M. Allman. Computing TCP's Retransmission
    Timer. RFC 2988, November 2000.

[RFC 3168] K.K. Ramakrishnan, S. Floyd, and D. Black. The Addition
    of Explicit Congestion Notification (ECN) to IP. RFC 3168.

[RFC 3390] M. Allman, S. Floyd, and C. Partridge. Increasing TCP's
    Initial Window. RFC 3390.

[RFC 3517] E. Blanton, M. Allman, K. Fall, and L. Wang. A
    Conservative Selective Acknowledgment (SACK)-based Loss Recovery
    Algorithm for TCP. RFC 3517.

[RFC 3692] T. Narten. Assigning Experimental and Testing Numbers
    Considered Useful. RFC 3692.

Informative References

[CCID 3 PROFILE] S. Floyd, E. Kohler, and J. Padhye. Profile for
    DCCP Congestion Control ID 3: TFRC Congestion Control.
    draft-ietf-dccp-ccid3-11.txt, work in progress, March 2005.

[RFC 2861] M. Handley, J. Padhye, and S. Floyd. TCP Congestion
    Window Validation. RFC 2861.

[RFC 3465] M. Allman. TCP Congestion Control with Appropriate Byte
    Counting (ABC). RFC 3465.

[RFC 3540] N. Spring, D. Wetherall, and D. Ely. Robust Explicit
    Congestion Notification (ECN) Signaling with Nonces. RFC 3540.

[V03] Arun Venkataramani, August 2003. Citation for acknowledgement
    purposes only.

Authors' Addresses

Sally Floyd
ICSI Center for Internet Research
1947 Center Street, Suite 600
Berkeley, CA 94704
USA

Eddie Kohler
4531C Boelter Hall
UCLA Computer Science Department
Los Angeles, CA 90095
USA

Full Copyright Statement

Copyright (C) The Internet Society 2005. This document is subject
to the rights, licenses and restrictions contained in BCP 78, and
except as set forth therein, the authors retain all their rights.

This document and the information contained herein are provided on
an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE
INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Intellectual Property

The IETF takes no position regarding the validity or scope of any
Intellectual Property Rights or other rights that might be claimed
to pertain to the implementation or use of the technology described
in this document or the extent to which any license under such
rights might or might not be available; nor does it represent that
it has made any independent effort to identify any such rights.
Information on the procedures with respect to rights in RFC
documents can be found in BCP 78 and BCP 79.

Copies of IPR disclosures made to the IETF Secretariat and any
assurances of licenses to be made available, or the result of an
attempt made to obtain a general license or permission for the use
of such proprietary rights by implementers or users of this
specification can be obtained from the IETF on-line IPR repository
at http://www.ietf.org/ipr.

The IETF invites any interested party to bring to its attention any
copyrights, patents or patent applications, or other proprietary
rights that may cover technology that may be required to implement
this standard. Please address the information to the IETF at
ietf-ipr@ietf.org.