idnits 2.17.1 draft-ietf-dccp-spec-03.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** Cannot find the required boilerplate sections (Copyright, IPR, etc.) in this document. Expected boilerplate is as follows today (2024-04-24) according to https://trustee.ietf.org/license-info : IETF Trust Legal Provisions of 28-dec-2009, Section 6.a: This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 2: Copyright (c) 2024 IETF Trust and the persons identified as the document authors. All rights reserved. IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 3: This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- ** The document seems to lack a 1id_guidelines paragraph about Internet-Drafts being working documents. == No 'Intended status' indicated for this document; assuming Proposed Standard Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** There are 18 instances of too long lines in the document, the longest one being 6 characters in excess of 72. 
Miscellaneous warnings: ---------------------------------------------------------------------------- == Line 1169 has weird spacing: '... option optio...' == Line 1171 has weird spacing: '...feature featu...' == Line 1178 has weird spacing: '...feature featu...' -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (19 May 2003) is 7646 days in the past. Is this intentional? -- Found something which looks like a code comment -- if you have code sections in the document, please surround them with '<CODE BEGINS>' and '<CODE ENDS>' lines. Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Missing Reference: 'RFC3124' is mentioned on line 291, but not defined == Missing Reference: 'Nonce 0' is mentioned on line 2965, but not defined == Missing Reference: 'Nonce 1' is mentioned on line 2936, but not defined == Missing Reference: 'E' is mentioned on line 2649, but not defined -- Looks like a reference, but probably isn't: '1' on line 2808 -- Looks like a reference, but probably isn't: '0' on line 2793 == Missing Reference: 'TFRC' is mentioned on line 3292, but not defined == Unused Reference: 'RFC 1948' is defined on line 3488, but no explicit reference was found in the text ** Obsolete normative reference: RFC 793 (Obsoleted by RFC 9293) ** Obsolete normative reference: RFC 2460 (Obsoleted by RFC 8200) -- Obsolete informational reference (is this intentional?): RFC 1889 
(Obsoleted by RFC 3550) -- Obsolete informational reference (is this intentional?): RFC 1948 (Obsoleted by RFC 6528) -- Obsolete informational reference (is this intentional?): RFC 2960 (Obsoleted by RFC 4960) == Outdated reference: A later version (-02) exists of draft-ietf-tsvwg-udp-lite-01 Summary: 5 errors (**), 0 flaws (~~), 11 warnings (==), 8 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 1 Internet Engineering Task Force 2 INTERNET-DRAFT Eddie Kohler 3 draft-ietf-dccp-spec-03.txt Mark Handley 4 Sally Floyd 5 ICIR 6 Jitendra Padhye 7 Microsoft Research 8 19 May 2003 9 Expires: November 2003 11 Datagram Congestion Control Protocol (DCCP) 13 Status of this Document 15 This document is an Internet-Draft and is in full conformance with 16 all provisions of Section 10 of [RFC 2026]. Internet-Drafts are 17 working documents of the Internet Engineering Task Force (IETF), its 18 areas, and its working groups. Note that other groups may also 19 distribute working documents as Internet-Drafts. 21 Internet-Drafts are draft documents valid for a maximum of six 22 months and may be updated, replaced, or obsoleted by other documents 23 at any time. It is inappropriate to use Internet-Drafts as reference 24 material or to cite them other than as "work in progress." 26 The list of current Internet-Drafts can be accessed at 27 http://www.ietf.org/ietf/1id-abstracts.txt 29 The list of Internet-Draft Shadow Directories can be accessed at 30 http://www.ietf.org/shadow.html 32 Abstract 34 This document specifies the Datagram Congestion Control 35 Protocol (DCCP), which implements a congestion-controlled, 36 unreliable flow of datagrams suitable for use by applications 37 such as streaming media. 
39 TO BE DELETED BY THE RFC EDITOR UPON PUBLICATION: 41 Changes since draft-ietf-dccp-spec-02.txt: 43 * Identification options include the Acknowledgement Number in 44 their hash. 46 * Added an additional condition to accepting a packet with an 47 invalid Sequence Number: the Acknowledgement Number must be 48 valid, as well as the Identification options. 50 * Explicitly allow Connection Nonces to be negotiated in other 51 ways than the Connection Nonce feature. 53 * Bad Moves are ignored, not reset, to avoid leaking 54 information to attackers. 56 Changes since draft-ietf-dccp-spec-01.txt: 58 * Revise definition of when packets are reported as received, 59 due to ECN Nonce verification problems with the previous 60 definition and options. 62 * Replace Receive Buffer Drops with Data Dropped. 64 * Remove Data Discarded in favor of Data Dropped with Drop 65 State 0. 67 * Remove Buffer Closed in favor of Data Dropped with Drop 68 State 4. 70 * Add Initial Sequence Number setting guidelines. 72 * Add sections on retransmission of Requests, and a table to 73 the state diagram. 75 * Made the 4-bit Reserved field in the DCCP generic header 76 available for use by CCIDs. 78 * Refine description of CCID 1. 80 * Add Middlebox Considerations. 82 * Change Identification option to allow middleboxes to change 83 port numbers, DCCP options, and/or packet data without 84 disrupting the connection. 86 * Specify that Ignored should be sent only on packets with 87 Acknowledgement Numbers. 89 * Add Aggression Penalty Reset Reason. 91 * Add Payload Checksum option. 93 * Add Elapsed Time option (formerly specific to CCID 3). 95 * Timestamp Echo option can omit Elapsed Time, or provide a 96 two-byte Elapsed Time value. Elapsed Time is measured in 97 tenths of milliseconds, not microseconds. 99 * Clean up DCCP-Move and feature-negotiation options 100 discussions. 102 * Confirm(Connection Nonce) sends no data. 104 * Ack Vector implementation supports ECN Nonce Echo. 
106 * Add CSlen and Partial Checksumming Design Motivation. 108 * Clarify that Ack Vectors may be sent even if Use Ack Vector 109 is false. 111 Table of Contents 113 1. Introduction. . . . . . . . . . . . . . . . . . . . . . 6 114 2. Design Rationale. . . . . . . . . . . . . . . . . . . . 7 115 3. Concepts and Terminology. . . . . . . . . . . . . . . . 8 116 3.1. Anatomy of a DCCP Connection . . . . . . . . . . . . 8 117 3.2. Congestion Control . . . . . . . . . . . . . . . . . 9 118 3.3. Connection Initiation and Termination. . . . . . . . 9 119 3.4. Features . . . . . . . . . . . . . . . . . . . . . . 10 120 4. DCCP Packets. . . . . . . . . . . . . . . . . . . . . . 10 121 4.1. Examples of DCCP Congestion Control. . . . . . . . . 12 122 4.1.1. DCCP with TCP-like Congestion Control . . . . . . 12 123 4.1.2. DCCP with TFRC Congestion Control . . . . . . . . 14 124 5. Packet Formats. . . . . . . . . . . . . . . . . . . . . 15 125 5.1. Generic Packet Header. . . . . . . . . . . . . . . . 15 126 5.2. Sequence Number Validity . . . . . . . . . . . . . . 18 127 5.3. DCCP State Diagram . . . . . . . . . . . . . . . . . 19 128 5.4. DCCP-Request Packet Format . . . . . . . . . . . . . 20 129 5.5. DCCP-Response Packet Format. . . . . . . . . . . . . 22 130 5.6. DCCP-Data, DCCP-Ack, and DCCP-DataAck Packet 131 Formats . . . . . . . . . . . . . . . . . . . . . . . . . 23 132 5.7. DCCP-CloseReq and DCCP-Close Packet Format . . . . . 25 133 5.8. DCCP-Reset Packet Format . . . . . . . . . . . . . . 26 134 5.9. DCCP-Move Packet Format. . . . . . . . . . . . . . . 27 135 6. Options and Features. . . . . . . . . . . . . . . . . . 29 136 6.1. Padding Option . . . . . . . . . . . . . . . . . . . 30 137 6.2. Ignored Option . . . . . . . . . . . . . . . . . . . 30 138 6.3. Feature Negotiation. . . . . . . . . . . . . . . . . 31 139 6.3.1. Feature Numbers . . . . . . . . . . . . . . . . . 32 140 6.3.2. Change Option . . . . . . . . . . . . . . . . . . 32 141 6.3.3. Prefer Option . . . . 
. . . . . . . . . . . . . . 33 142 6.3.4. Confirm Option. . . . . . . . . . . . . . . . . . 33 143 6.3.5. Example Negotiations. . . . . . . . . . . . . . . 33 144 6.3.6. Unknown Features. . . . . . . . . . . . . . . . . 34 145 6.3.7. State Diagram . . . . . . . . . . . . . . . . . . 34 146 6.4. Identification Options . . . . . . . . . . . . . . . 38 147 6.4.1. Identification Regime Feature . . . . . . . . . . 38 148 6.4.2. Connection Nonce Feature. . . . . . . . . . . . . 39 149 6.4.3. Identification Option . . . . . . . . . . . . . . 39 150 6.4.4. Challenge Option. . . . . . . . . . . . . . . . . 41 151 6.5. Init Cookie Option . . . . . . . . . . . . . . . . . 42 152 6.6. Timestamp Option . . . . . . . . . . . . . . . . . . 42 153 6.7. Elapsed Time Option. . . . . . . . . . . . . . . . . 42 154 6.8. Timestamp Echo Option. . . . . . . . . . . . . . . . 43 155 6.9. Loss Window Feature. . . . . . . . . . . . . . . . . 44 156 7. Congestion Control IDs. . . . . . . . . . . . . . . . . 45 157 7.1. Unspecified Sender-Based Congestion Control. . . . . 46 158 7.2. TCP-like Congestion Control. . . . . . . . . . . . . 47 159 7.3. TFRC Congestion Control. . . . . . . . . . . . . . . 47 160 7.4. CCID-Specific Options and Features . . . . . . . . . 47 161 8. Acknowledgements. . . . . . . . . . . . . . . . . . . . 48 162 8.1. Acks of Acks and Unidirectional Connections. . . . . 48 163 8.2. Ack Piggybacking . . . . . . . . . . . . . . . . . . 50 164 8.3. Ack Ratio Feature. . . . . . . . . . . . . . . . . . 50 165 8.4. Use Ack Vector Feature . . . . . . . . . . . . . . . 51 166 8.5. Ack Vector Options . . . . . . . . . . . . . . . . . 51 167 8.5.1. Ack Vector Consistency. . . . . . . . . . . . . . 53 168 8.5.2. Ack Vector Coverage . . . . . . . . . . . . . . . 55 169 8.6. Slow Receiver Option . . . . . . . . . . . . . . . . 55 170 8.7. Data Dropped Option. . . . . . . . . . . . . . . . . 56 171 8.8. Payload Checksum Option. . . . . . . . . . . . . . . 58 172 8.9. 
Ack Vector Implementation Notes. . . . . . . . . . . 59 173 8.9.1. New Packets . . . . . . . . . . . . . . . . . . . 61 174 8.9.2. Sending Acknowledgements. . . . . . . . . . . . . 62 175 8.9.3. Clearing State. . . . . . . . . . . . . . . . . . 63 176 8.9.4. Processing Acknowledgements . . . . . . . . . . . 64 177 9. Explicit Congestion Notification. . . . . . . . . . . . 65 178 9.1. ECN Capable Feature. . . . . . . . . . . . . . . . . 65 179 9.2. ECN Nonces . . . . . . . . . . . . . . . . . . . . . 66 180 9.3. Other Aggression Penalties . . . . . . . . . . . . . 67 181 10. Multihoming and Mobility . . . . . . . . . . . . . . . 67 182 10.1. Mobility Capable Feature. . . . . . . . . . . . . . 67 183 10.2. Security. . . . . . . . . . . . . . . . . . . . . . 68 184 10.3. Congestion Control State. . . . . . . . . . . . . . 68 185 10.4. Loss During Transition. . . . . . . . . . . . . . . 68 186 11. Path MTU Discovery . . . . . . . . . . . . . . . . . . 69 187 12. Middlebox Considerations . . . . . . . . . . . . . . . 71 188 13. Abstract API . . . . . . . . . . . . . . . . . . . . . 72 189 14. Multiplexing Issues. . . . . . . . . . . . . . . . . . 72 190 15. DCCP and RTP . . . . . . . . . . . . . . . . . . . . . 73 191 16. Security Considerations. . . . . . . . . . . . . . . . 74 192 17. IANA Considerations. . . . . . . . . . . . . . . . . . 74 193 18. Design Motivation. . . . . . . . . . . . . . . . . . . 75 194 18.1. CSlen and Partial Checksumming. . . . . . . . . . . 75 195 19. Thanks . . . . . . . . . . . . . . . . . . . . . . . . 77 196 20. Normative References . . . . . . . . . . . . . . . . . 77 197 21. Informative References . . . . . . . . . . . . . . . . 77 198 22. Authors' Addresses . . . . . . . . . . . . . . . . . . 78 200 1. Introduction 202 This document specifies the Datagram Congestion Control Protocol 203 (DCCP). DCCP provides the following features: 205 o An unreliable flow of datagrams, with acknowledgements. 
207 o A reliable handshake for connection setup and teardown. 209 o Reliable negotiation of options, including negotiation of a 210 suitable congestion control mechanism. 212 o Mechanisms allowing a server to avoid holding any state for 213 unacknowledged connection attempts or already-finished 214 connections. 216 o Optional mechanisms that tell the sender, with high reliability, 217 which packets reached the receiver, and whether those packets were 218 ECN marked, corrupted, or dropped in the receive buffer. 220 o Congestion control incorporating Explicit Congestion Notification 221 (ECN) and the ECN Nonce, as per [RFC 3168] and [ECN NONCE]. 223 o Path MTU discovery, as per [RFC 1191]. 225 DCCP is intended for applications that require the flow-based 226 semantics of TCP, but which do not want TCP's in-order delivery and 227 reliability semantics, or which would like different congestion 228 control dynamics than TCP. Similarly, DCCP is intended for 229 applications that do not require features of SCTP [RFC 2960] such as 230 sequenced delivery within multiple streams. 232 Applications that could make use of DCCP include those with timing 233 constraints on the delivery of data such that reliable in-order 234 delivery, when combined with congestion control, is likely to result 235 in some information arriving at the receiver after it is no longer 236 of use. Such applications might include streaming media and 237 Internet telephony. 239 To date most such applications have used either TCP, with the 240 problems described above, or used UDP and implemented their own 241 congestion control mechanisms (or no congestion control at all). The 242 purpose of DCCP is to provide a standard way to implement congestion 243 control and congestion control negotiation for such applications. 244 One of the motivations for DCCP is to enable the use of ECN, along 245 with conformant end-to-end congestion control, for applications that 246 would otherwise be using UDP. 
In addition, DCCP implements reliable 247 connection setup, teardown, and feature negotiation. 249 A DCCP connection contains acknowledgement traffic as well as data 250 traffic. Acknowledgements inform a sender whether its packets 251 arrived, and whether they were ECN marked. Acks are transmitted as 252 reliably as the congestion control mechanism in use requires, 253 possibly completely reliably. 255 Previous drafts of this specification called the protocol DCP, or 256 Datagram Control Protocol. The name was changed to make the acronym 257 sound less like "TCP". 259 2. Design Rationale 261 DCCP is intended to be used by applications that currently use UDP 262 without end-to-end congestion control. The desire is for many 263 applications to have little reason not to use DCCP instead of UDP, 264 once DCCP is deployed. Thus, DCCP was designed to have as little 265 overhead as possible, in terms both of the size of the packet header 266 and in terms of the state and CPU overhead required at the end 267 hosts. 269 This desire for minimal overhead results in the design decision to 270 include only the minimal necessary functionality in DCCP, leaving 271 other functionality, such as FEC or semi-reliability, to be layered 272 on top of DCCP as desired. The desire for minimal overhead is also 273 one of the reasons to propose DCCP instead of just proposing an 274 unreliable version of SCTP for applications currently using UDP. 276 A second motivation behind the design of DCCP is to allow 277 applications to choose an alternative to the current TCP-style 278 congestion control that halves the congestion window in response to 279 a congestion indication. DCCP lets applications choose between 280 several forms of congestion control. One choice, TCP-like 281 congestion control, halves the congestion window in response to a 282 packet drop or mark, as in TCP. 
A second alternative, TFRC (TCP- 283 Friendly Rate Control, a form of equation-based congestion control), 284 minimizes abrupt changes in the sending rate while maintaining 285 longer-term fairness with TCP. 287 In proposing a new transport protocol, it is necessary to justify 288 the design decision not to require the use of the Congestion 289 Manager, as well as the design decision to add a new transport 290 protocol to the current family of UDP, TCP, and SCTP. The 291 Congestion Manager [RFC3124] allows multiple concurrent streams 292 between the same sender and receiver to share congestion control. 293 However, the current Congestion Manager can only be used by 294 applications that have their own end-to-end feedback about packet 295 losses, and this is not the case for many of the applications 296 currently using UDP. In addition, the current Congestion Manager 297 does not lend itself to the use of forms of TFRC where the state 298 about past packet drops or marks is maintained at the receiver 299 rather than at the sender. While DCCP should be able to make use of 300 CM where desired by the application, we do not see any benefit in 301 making the deployment of DCCP contingent on the deployment of CM 302 itself. 304 3. Concepts and Terminology 306 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 307 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in 308 this document are to be interpreted as described in [RFC 2119]. 310 3.1. Anatomy of a DCCP Connection 312 Each DCCP connection runs between two endpoints, which we often name 313 DCCP A and DCCP B. Data may pass over the connection in either or 314 both directions. The DCCP connection between DCCP A and DCCP B 315 consists of four sets of packets, as follows: 317 (1) Data packets from DCCP A to DCCP B. 319 (2) Acknowledgements from DCCP B to DCCP A. 321 (3) Data packets from DCCP B to DCCP A. 323 (4) Acknowledgements from DCCP A to DCCP B. 
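As an informal illustration (not part of the specification), the four packet sets listed above can be written down as data. The sketch below is a minimal Python model; the names Subflow, sequence, and half_connection are hypothetical, and the two helpers group the sets the way the terminology that follows does: a sequence is everything sent in one direction, and a half-connection is one direction's data plus the acknowledgements flowing back.

```python
from enum import Enum
from typing import List, NamedTuple


class Kind(Enum):
    DATA = "data"
    ACK = "ack"


class Subflow(NamedTuple):
    src: str    # endpoint that sends the packets ("A" or "B")
    dst: str    # endpoint that receives them
    kind: Kind  # data packets or acknowledgements


# The four sets of packets, numbered as in the list above.
SUBFLOWS = {
    1: Subflow("A", "B", Kind.DATA),
    2: Subflow("B", "A", Kind.ACK),
    3: Subflow("B", "A", Kind.DATA),
    4: Subflow("A", "B", Kind.ACK),
}


def sequence(sender: str) -> List[int]:
    """All sets sent by one endpoint, whether data or acknowledgements."""
    return sorted(n for n, s in SUBFLOWS.items() if s.src == sender)


def half_connection(data_sender: str) -> List[int]:
    """Data sent by `data_sender` plus the acknowledgements sent back to it."""
    return sorted(
        n
        for n, s in SUBFLOWS.items()
        if (s.kind is Kind.DATA and s.src == data_sender)
        or (s.kind is Kind.ACK and s.dst == data_sender)
    )
```

Under this model, sequence("A") collects sets 1 and 4, while half_connection("A") collects sets 1 and 2, which is the A-to-B half-connection.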
325 We use the following terms to refer to subsets and endpoints of a 326 DCCP connection. 328 Subflows 329 A subflow consists of either data or acknowledgement packets, 330 sent in one direction. Each of the four sets of packets above is 331 a subflow. (Subflows may overlap to some extent, since 332 acknowledgements may be piggybacked on data packets.) 334 Sequences 335 A sequence consists of all packets sent in one direction, 336 regardless of whether they are data or acknowledgements. The 337 sets 1+4 and 2+3, above, are sequences. Each packet on a 338 sequence has a different sequence number. 340 Half-connections 341 A half-connection consists of the data packets sent in one 342 direction, plus the corresponding acknowledgements. The sets 1+2 343 and 3+4, above, are half-connections. Half-connections are named 344 after the direction of data flow, so the A-to-B half-connection 345 contains the data packets from A to B and the acknowledgements 346 from B to A. 348 HC-Sender and HC-Receiver 349 In the context of a single half-connection, the HC-Sender is the 350 endpoint sending data, while the HC-Receiver is the endpoint 351 sending acknowledgements. For example, in the A-to-B half- 352 connection, DCCP A is the HC-Sender and DCCP B is the HC- 353 Receiver. 355 3.2. Congestion Control 357 Each half-connection is managed by a congestion control mechanism. 358 The endpoints negotiate these mechanisms at connection setup; the 359 mechanisms for the two half-connections need not be the same. 361 Conformant congestion control mechanisms correspond to single-byte 362 congestion control identifiers, or CCIDs. The CCID for a half- 363 connection describes how the HC-Sender limits data packet rates; how 364 it maintains necessary parameters, such as congestion windows; how 365 the HC-Receiver sends congestion feedback via acknowledgements; and 366 how it manages the acknowledgement rate. 
Section 7 introduces the 367 currently allocated CCIDs, which are defined in separate profile 368 documents. 370 3.3. Connection Initiation and Termination 372 Every DCCP connection is actively initiated by one DCCP, which 373 connects to a DCCP socket in the passive listening state. We refer 374 to the active endpoint as "the client" and the passive endpoint as 375 "the server". Most of the DCCP specification is indifferent to 376 whether a DCCP is client or server. However, only the server may 377 generate a DCCP-CloseReq packet. (A DCCP-CloseReq packet forces the 378 receiving DCCP to close the connection and maintain connection state 379 for a reasonable time, allowing old packets to clear the network.) 380 This means that the client cannot force the server to maintain 381 connection state after the connection is closed. 383 DCCP does not support TCP-style simultaneous open. In particular, a 384 host MUST NOT respond to a DCCP-Request packet with a DCCP-Response 385 packet unless the destination port specified in the DCCP-Request 386 corresponds to a local socket opened for listening. 388 DCCP does not support half-open connections either. That is, DCCP 389 shuts down both half-connections as a unit. However, DCCP SHOULD 390 allow applications to declare that they are no longer interested in 391 receiving data. This would allow DCCP implementations to streamline 392 state for certain half-connections. See Section 8.7, on the Data 393 Dropped option---and particularly its Drop State 4---for more 394 information. 396 3.4. Features 398 DCCP uses a generic mechanism to negotiate connection properties, 399 such as the CCIDs active on the two half-connections. These 400 properties are called features. (We reserve the term "option" for a 401 collection of bytes in some DCCP header.) A feature name, such as 402 "CCID", generally corresponds to two features, one per half- 403 connection. For instance, there are two CCIDs per connection. 
The 404 endpoint in charge of a particular feature is called its feature 405 location. 407 The Change, Prefer, and Confirm options negotiate feature values. 408 Change is sent to a feature location, asking it to change its value 409 for the feature. The feature location may respond with Prefer, which 410 asks the other endpoint to Change again with different values, or it 411 may change the feature value and acknowledge the request with 412 Confirm. Retransmissions make feature negotiation reliable. Section 413 6.3 describes these options further. 415 4. DCCP Packets 417 DCCP has nine different packet types: 419 o DCCP-Request 421 o DCCP-Response 423 o DCCP-Data 425 o DCCP-Ack 427 o DCCP-DataAck 429 o DCCP-CloseReq 431 o DCCP-Close 433 o DCCP-Reset 435 o DCCP-Move 437 Only the first eight types commonly occur. The DCCP-Move packet is 438 used to support multihoming and mobility. 440 The progress of a typical DCCP connection is as follows. (This 441 description is informative, not normative.) 443 (1) The client sends the server a DCCP-Request packet specifying the 444 client and server ports, the service being requested, and any 445 features being negotiated, including the CCID that the client 446 would like the server to use. The client may optionally 447 piggyback some data on the DCCP-Request packet---an application- 448 level request, say---which the server may ignore. 450 (2) The server sends the client a DCCP-Response packet indicating 451 that it is willing to communicate with the client. The response 452 indicates any features and options that the server agrees to, 453 begins or continues other feature negotiations if desired, and 454 optionally includes an Init Cookie that wraps up all this 455 information and which must be returned by the client for the 456 connection to complete. 458 (3) The client sends the server a DCCP-Ack packet that acknowledges 459 the DCCP-Response packet. 
This acknowledges the server's initial 460 sequence number and returns the Init Cookie if there was one in 461 the DCCP-Response. It may also continue feature negotiation. 463 (4) Next come zero or more DCCP-Ack exchanges as required to 464 finalize feature negotiation. The client may piggyback an 465 application-level request on its final ack, producing a DCCP- 466 DataAck packet. 468 (5) The server and client then exchange DCCP-Data packets, DCCP-Ack 469 packets acknowledging that data, and, optionally, DCCP-DataAck 470 packets containing piggybacked data and acknowledgements. If the 471 client has no data to send, then the server will send DCCP-Data 472 and DCCP-DataAck packets, while the client will send DCCP-Acks 473 exclusively. 475 (6) The server sends a DCCP-CloseReq packet requesting a close. 477 (7) The client sends a DCCP-Close packet acknowledging the close. 479 (8) The server sends a DCCP-Reset packet whose Reason field is set 480 to "Closed", and clears its connection state. 482 (9) The client receives the DCCP-Reset packet and holds state for a 483 reasonable interval of time to allow any remaining packets to 484 clear the network. 486 An alternative connection closedown sequence is initiated by the 487 client: 489 (6) The client sends a DCCP-Close packet closing the connection. 491 (7) The server sends a DCCP-Reset packet with Reason field set to 492 "Closed" and clears its connection state. 494 (8) The client receives the DCCP-Reset packet and holds state for a 495 reasonable interval of time to allow any remaining packets to 496 clear the network. 498 This arrangement of setup and teardown handshakes permits the server 499 to decline to hold any state until the handshake with the client has 500 completed, and ensures that the client must hold the TimeWait state 501 at connection closedown. 503 4.1. 
Examples of DCCP Congestion Control 505 Before giving the detailed specifications of DCCP, we present two 506 more detailed examples showing DCCP congestion control in operation. 507 Again, these examples are informative, not normative. 509 4.1.1. DCCP with TCP-like Congestion Control 511 The first example is of a connection where both half-connections use 512 TCP-like Congestion Control, specified by CCID 2 [CCID 2 PROFILE]. 513 In this example, the client sends an application-level request to 514 the server, and the server responds with a stream of data packets. 515 This example is of a connection using ECN. 517 (1) The client sends the DCCP-Request, which includes a Change 518 option asking the server to use CCID 2 for the server's data 519 packets, and a Prefer option informing the server that the 520 client would like to use CCID 2 for its data packets. 522 (2) The server sends a DCCP-Response, including a Confirm option 523 indicating that the server agrees to use CCID 2 for its data 524 packets, and a Change option indicating that the server agrees 525 to the client's suggestion of CCID 2 for the client's data 526 packets. 528 (3) The client responds with a DCCP-DataAck acknowledging the 529 server's initial sequence number, and including a Confirm option 530 finalizing the negotiation of the client-to-server CCID, and an 531 application-level request for data. We will not discuss the 532 client-to-server half-connection further in this example. 534 (4) The server sends DCCP-Data packets, where the number of packets 535 sent is governed by a congestion window, as in TCP. The details 536 of the congestion window are defined in the profile for CCID 2, 537 which is a separate document [CCID 2 PROFILE]. The server also 538 sends Ack Ratio feature options specifying the number of server 539 data packets to be covered by an Ack packet from the client. 540 Some of these data packets are DCCP-DataAcks acknowledging 541 packets from the client. 
543 Each DCCP-Data and DCCP-DataAck packet is sent as ECN-Capable, 544 with either the ECT(0) or the ECT(1) codepoint set, as described 545 in [ECN NONCE]. 547 (5) For every Ack Ratio data packets transmitted by the server, 548 the client sends a DCCP-Ack packet acknowledging those data 549 packets. Each DCCP-Ack packet uses a sequence number and 550 contains an Ack Vector, as defined in Section 8 on 551 Acknowledgements. These packets also include Confirm options 552 answering any Ack Ratio requests from the server. 554 The client's DCCP-Acks are also sent as ECN-Capable, with either 555 ECT(0) or ECT(1). The client's Ack Vector echoes the accumulated 556 ECN Nonce for the server's packets. 558 (6) The server continues sending DCCP-Data packets as controlled by 559 the congestion window. Upon receiving DCCP-Ack packets, the 560 server examines the Ack Vector to learn about marked or dropped 561 data packets, and adjusts its congestion window accordingly, as 562 described in [CCID 2 PROFILE]. Because this is unreliable 563 transfer, the server does not retransmit dropped packets. 565 (7) Because DCCP-Ack packets use sequence numbers, the server has 566 direct information about the fraction of lost or marked DCCP-Ack 567 packets. The server responds to lost or marked DCCP-Ack packets 568 by modifying the Ack Ratio sent to the client, as described in 569 [CCID 2 PROFILE]. Under certain conditions, the server must 570 acknowledge some of the client's acknowledgements; see Section 571 8.1 for more information. 573 (8) The server estimates round-trip times and calculates a TimeOut 574 (TO) value much as the RTO (Retransmit Timeout) is calculated in 575 TCP. Again, the specification for this is in [CCID 2 PROFILE]. 576 The TO is used to determine when a new DCCP-Data packet can be 577 transmitted when the server has been limited by the congestion 578 window and no feedback has been received from the client. 
580 (9) The DCCP-CloseReq, DCCP-Close, and DCCP-Reset packets to close 581 the connection are as in the example above. 583 4.1.2. DCCP with TFRC Congestion Control 585 This example is of a connection where both half-connections use TFRC 586 Congestion Control, specified by CCID 3 [CCID 3 PROFILE]. 588 (1) The DCCP-Request and DCCP-Response packets specifying the use of 589 CCID 3 and the initial DCCP-DataAck packet are similar to those 590 in the CCID 2 example above. 592 (2) The server sends DCCP-Data packets, where the number of packets 593 sent is governed by an allowed transmit rate, as in TFRC. The 594 details of the allowed transmit rate are defined in the profile 595 for CCID 3, which is a separate document [CCID 3 PROFILE]. Each 596 DCCP-Data packet has a sequence number and a window counter 597 value. 599 Some of these data packets are DCCP-DataAck packets 600 acknowledging packets from the client, but for simplicity we 601 will not discuss the half-connection of data from the client to 602 the server in this example. 604 The use of ECN follows TCP-like Congestion Control, above, and 605 is described further in [CCID 3 PROFILE]. 607 (3) The receiver sends DCCP-Ack packets at least once per round-trip 608 time acknowledging the data packets, unless the server is 609 sending at a rate of less than one packet per RTT, as specified 610 by [CCID 3 PROFILE]. These acknowledgements may be piggybacked 611 on data packets, producing DCCP-DataAck packets. Each DCCP-Ack 612 packet uses a sequence number and identifies the most recent 613 packet received from the server. Each DCCP-Ack packet includes 614 feedback about the loss event rate calculated by the client, as 615 specified by [CCID 3 PROFILE]. 617 (4) The server continues sending DCCP-Data packets as controlled by 618 the allowed transmit rate. Upon receiving DCCP-Ack packets, the 619 server updates its allowed transmit rate as specified by [CCID 3 620 PROFILE]. 

   (5) The server estimates round-trip times and calculates a TimeOut
       (TO) value much as the RTO (Retransmit Timeout) is calculated in
       TCP. Again, the specification for this is in [CCID 3 PROFILE].

   (6) The DCCP-CloseReq, DCCP-Close, and DCCP-Reset packets to close
       the connection are as in the examples above.

5. Packet Formats

5.1. Generic Packet Header

   All DCCP packets begin with a generic DCCP packet header:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          Source Port          |           Dest Port           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   | Type  | CCval |                Sequence Number                |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Data Offset  | # NDP | Cslen |           Checksum            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Source and Destination Ports: 16 bits each
        These fields identify the connection, similar to the
        corresponding fields in TCP and UDP. The Source Port represents
        the relevant port on the endpoint that sent this packet, the
        Destination Port the relevant port on the other endpoint.

   Type: 4 bits
        The type field specifies the type of the DCCP message. The
        following values are defined:

        0   DCCP-Request packet.

        1   DCCP-Response packet.

        2   DCCP-Data packet.

        3   DCCP-Ack packet.

        4   DCCP-DataAck packet.

        5   DCCP-CloseReq packet.

        6   DCCP-Close packet.

        7   DCCP-Reset packet.

        8   DCCP-Move packet.

   CCval: 4 bits
        This field is reserved for use by the sending CCID. In
        particular, the A-to-B CCID's sender, which is active at DCCP A,
        MAY send information to the receiver at DCCP B by encoding that
        information in CCval. DCCP proper MUST ignore the field. If the
        relevant CCID does not specify its value, it SHOULD be set to
        zero.

   Sequence Number: 24 bits
        The sequence number field is initialized by a DCCP-Request or
        DCCP-Response packet, and increases by one (modulo 16777216)
        with every packet sent. The receiver uses this information to
        determine whether packet losses have occurred. Even packets
        containing no data update the sequence number. Sequence numbers
        also provide some protection against old and malicious packets;
        see Section 5.2 on sequence number validity.

        Very-high-rate DCCPs may need protection against wrapped
        sequence numbers. For example, a 10 Gb/s flow of 1500-byte DCCP
        packets will send 2^24 packets in about 20 seconds. This is a
        long time, in terms of likely round-trip times that could
        possibly achieve such a sustained rate, but it is not without
        risk. Despite this, we leave the design of mechanisms to protect
        against wrapped sequence numbers for future work. In particular,
        if it is decided that very large packet sizes are better than
        very large congestion windows for very-high-bandwidth flows,
        then 24 bits may be enough.

        The two subflows' initial sequence numbers are set by the first
        DCCP-Request and DCCP-Response packets sent, and SHOULD be
        chosen as for TCP. In particular, initial sequence number choice
        MUST include a random or pseudorandom component to make it
        harder for attackers to complete sequence number attacks
        [RFC 1948]. The initial sequence number chosen for a given
        connection identifier (source address and port plus destination
        address and port) SHOULD increase over time, as TCP suggests
        [RFC 793], to prevent inappropriate delivery of old packets.

   Data Offset: 8 bits
        The offset from the start of the DCCP header to the beginning of
        the packet's payload, measured in 32-bit words.

   Number of Non-Data Packets (# NDP): 4 bits
        DCCP sets this field to the number of non-data packets it has
        sent so far on its sequence, modulo 16.
        A non-data packet is simply any packet not containing user
        data; DCCP-Ack, DCCP-Close, DCCP-CloseReq, and DCCP-Reset are
        always non-data packets, while DCCP-Request, DCCP-Response, and
        DCCP-Move might or might not be. When sending a non-data packet,
        DCCP increments the # NDP counter before storing its value in
        the packet header.

        This field can help the receiving DCCP decide whether a lost
        packet contained any user data. (An application may want to know
        when it has lost data. DCCP could report every packet loss as a
        potential data loss, but that would cause false loss reports
        when non-data packets were lost.) For example, say that packet
        10 had # NDP set to 5; packet 11 was lost; and packet 12 had
        # NDP set to 5. Then the receiving DCCP could deduce that packet
        11 contained data, since # NDP did not change. Likewise, if
        # NDP had gone up to 6 (and packet 12 contained user data), then
        packet 11 must not have contained any data.

   Checksum Length (Cslen): 4 bits
        The checksum length field specifies what parts of the packet are
        covered by the checksum field. The checksum always covers at
        least the DCCP header, DCCP options, and a pseudoheader taken
        from the network-layer header (described under Checksum below).
        If the checksum length field is zero, that is all the checksum
        covers. If the field is 15, the checksum covers the packet's
        payload as well, possibly with 8 bits of zero padding on the
        right to pad the payload to an even number of bytes. Values
        between 1 and 14, inclusive, indicate that the checksum
        additionally covers that number of initial 32-bit words of the
        packet's payload, padded on the right with zeros as necessary.

        Values other than 15 specify that corruption is acceptable in
        some or all of the DCCP packet's payload. In fact, DCCP cannot
        even detect corruption there, unless the Payload Checksum option
        is used (Section 8.8).
        The meaning of values other than 0 and 15 should be considered
        experimental.

        Section 18.1 further discusses the motivation of, and issues
        related to, partial checksums. The checksum length field was
        inspired by UDP-Lite [UDP-LITE].

   Checksum: 16 bits
        DCCP uses the TCP/IP checksum algorithm. The checksum field
        equals the 16 bit one's complement of the one's complement sum
        of all 16 bit words in the DCCP header, DCCP options, a
        pseudoheader taken from the network-layer header, and, depending
        on the value of the checksum length field, some or all of the
        payload. When calculating the checksum, the checksum field
        itself is treated as 0. If a packet contains an odd number of
        header and text bytes to be checksummed, 8 zero bits are added
        on the right to form a 16 bit word for checksum purposes. The
        pad byte is not transmitted as part of the packet.

        The pseudoheader is calculated as for TCP. For IPv4, it is 96
        bits long, and consists of the IPv4 source and destination
        addresses, the IP protocol number for DCCP (padded on the left
        with 8 zero bits), and the DCCP length as a 16-bit quantity (the
        length of the DCCP header with options, plus the length of any
        data); see Section 3.1 of [RFC 793]. For IPv6, it is 320 bits
        long, and consists of the IPv6 source and destination addresses,
        the DCCP length as a 32-bit quantity, and the IP protocol number
        for DCCP (padded on the left with 24 zero bits); see Section 8.1
        of [RFC 2460].

        Packets with invalid checksums MUST be ignored. In particular,
        their options MUST NOT be processed.

5.2. Sequence Number Validity

   DCCP endpoints SHOULD ignore packets with invalid sequence numbers,
   which may arise if the network delivers a very old packet or an
   attacker attempts to hijack a connection. TCP solves this problem
   with its window.
   In DCCP, however, sequence numbers change with each packet sent,
   even pure acknowledgements. Thus, a loss event that dropped many
   consecutive packets could cause two DCCPs to get out of sync
   relative to any window.

   DCCP uses Loss Window and Identification mechanisms to determine
   whether a given packet's sequence number is valid. Each HC-Sender
   gives the corresponding HC-Receiver a loss window width W; see
   Section 6.9. This reflects how many packets the sender expects to be
   in flight. Only the sender can anticipate this number. One good
   guideline is to set it to about 3 or 4 times the maximum number of
   packets the sender expects to send in any round-trip time. Too-small
   values increase the risk of the endpoints getting out of sync after
   bursts of loss; too-large values increase the risk of connection
   hijacking. W defaults to 1000. The Identification mechanism is used
   to get back into sync when more than W consecutive packets are lost.

   The HC-Receiver sets up a loss window of W consecutive sequence
   numbers containing GSN, the Greatest Sequence Number it has received
   on any valid packet from the sender. ("Consecutive" and "greatest"
   are measured in circular sequence space. The receiver may center the
   loss window on GSN, or arrange it asymmetrically.) Sequence numbers
   outside this loss window are invalid. Packets with invalid sequence
   numbers are themselves invalid, unless both of the following
   conditions are true:

   (1) No valid packet has been received recently (for instance, within
       at least one round-trip time), AND

   (2) The packet includes a correct Identification or Challenge option
       (see Section 6.4.3), and a valid Acknowledgement Number (meaning
       the Acknowledgement Number is within the corresponding Loss
       Window).

   The receiving DCCP SHOULD ignore invalid packets.
   In particular, it SHOULD NOT pass any enclosed data to the
   application, update its congestion control or feature state, or
   close the connection. However, the receiving DCCP MAY send a
   DCCP-Ack packet to the sender, as allowed by the congestion control
   mechanism in use. This packet SHOULD acknowledge the last received
   valid sequence number and contain a Challenge option (Section
   6.4.4). The other DCCP will send an Identification option to resync.

   A DCCP endpoint MAY implement rate limits to reduce the likelihood
   of denial-of-service attacks. In particular, it MAY ignore all
   packets with bad sequence numbers---even those containing
   Identification or Challenge options---for some amount of time, on
   the order of one round-trip time, after receiving a packet with an
   invalid Identification or Challenge option; and it MAY rate-limit
   the Challenge options it sends.

5.3. DCCP State Diagram

   In this section we present a DCCP state diagram showing how a DCCP
   connection should progress, and the proper responses for packets or
   timeout events in various connection states. The state diagram is
   illustrative; the text should be considered definitive.

   +----------------------------------+
   | Figure omitted from text version |
   +----------------------------------+

   All receive events on the diagram represent receipt of valid
   packets. For example, receiving a Reset with a bad Acknowledgement
   Number SHOULD NOT cause DCCP to transition to the Time-Wait state.
   DCCP implementations MAY send Acks as described above, or "Invalid
   Packet" Resets, in response to invalid packets; any such responses
   SHOULD be rate-limited.

   Otherwise-valid packets without explicit transitions in the state
   diagram SHOULD be treated according to the table below.
   Particular actions are "OK", meaning the packet MUST be processed
   according to this document; "Rst", meaning the receiver SHOULD
   either ignore the packet or respond with a (rate-limited) Reset; and
   "-", meaning the packet SHOULD be ignored. Entries may take the form
   "Old/New", where "Old" applies to old packets and "New" to new
   packets (whose sequence numbers are greater than the largest
   sequence number seen so far). The table respecifies some transitions
   listed in the state diagram---for instance, those for receiving
   packets in the TIME-WAIT state. In these cases, prefer the action
   listed in the diagram. For example, in the TIME-WAIT case, prefer
   sending rate-limited Resets when valid packets are received; the
   table would allow ignoring them. However, either action would be
   acceptable.

                                   Data/Ack/
                                   DataAck/
   State         Request  Response Move     CloseReq Close    Reset
   ------------- -------- -------- -------- -------- -------- --------
   CLOSED        Rst      Rst      Rst      Rst      Rst      OK
   LISTEN        OK       Rst      Rst(1)   Rst      Rst      OK
   REQUEST       Rst      OK       Rst      Rst      Rst      OK
   RESPOND       -/OK     Rst      Rst/OK   Rst      OK       OK
   OPEN (server) -/Rst    Rst      OK       Rst      OK       OK
   OPEN (client) Rst      -/Rst    OK       OK       OK       OK
   SERVER-CLOSE  -/Rst    Rst      OK       Rst      OK       OK
   CLIENT-CLOSE  Rst      -/Rst    OK       OK       OK       OK
   TIME-WAIT     Rst      Rst      Rst      Rst      Rst      OK

   Notes: (1) Data/Ack/DataAck with valid Init Cookie OK.

   The OPEN state does not signify that a DCCP connection is ready for
   data transfer. In particular, incomplete feature negotiations might
   prevent data transfer. Feature negotiation takes place in parallel
   with the state transitions on this diagram.

   Only the server may take the transition from the OPEN state to the
   SERVER-CLOSE state. (The server is the DCCP endpoint that began in
   the LISTEN state.) Similarly, only the client may take the
   transition to CLIENT-CLOSE after receiving a CloseReq packet.

5.4. DCCP-Request Packet Format

   A DCCP connection is initiated by sending a DCCP-Request packet. The
   format of a DCCP request packet is:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /                Generic DCCP Header (12 bytes)                 /
   /                  with Type=0 (DCCP-Request)                   /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         Service Name                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      Options / [padding]                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                             data                              |
   |                              ...                              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Service Name: 32 bits
        The Service Name field describes the service to which the sender
        is trying to connect. Service Names are 32-bit numbers allocated
        by IANA; they are meant to correspond to application services
        and protocols, such as FTP and HTTP, and are not intended to be
        DCCP-specific. With Service Names, stateful middleboxes, such as
        firewalls, can identify the application running on a nonstandard
        port (assuming the DCCP header has not been encrypted). A
        Service Name of zero is a wildcard, matching any service. The
        host operating system MAY force every DCCP socket, both actively
        and passively opened, to specify a nonzero Service Name.
        Connection requests MUST fail if the Destination Port on the
        receiver has a different Service Name from that given in the
        packet, and both Service Names are nonzero. In this case, the
        receiver will respond with a DCCP-Reset packet (with Reason set
        to "Bad Service Name"). A server or stateful middlebox MAY also
        send a "Bad Service Name" DCCP-Reset in response to packets with
        Service Name value 0.

   Options
        DCCP-Request packets will usually include a "Change(Connection
        Nonce)" option, to inform the server of the client's connection
        nonce; see Section 6.4.

   The client MAY send new DCCP-Request packets if no response is
   received after some timeout. Each retransmission MUST increment the
   Sequence Number, and possibly # NDP, by one. The retransmission
   strategy SHOULD be similar to that for retransmitting TCP SYNs.

   A client MAY decide to give up after some number of DCCP-Requests.
   If so, it MAY send a DCCP-Reset packet to the server, to clean up
   state in case one or more of the Requests actually arrived. The
   DCCP-Reset SHOULD have Reason set to "Closed".

5.5. DCCP-Response Packet Format

   In the second phase of the three-way handshake, the server sends a
   DCCP-Response message to the client. In this phase, a server will
   often specify the options it would like to use, either from among
   those the client requested, or in addition to those. Among these
   options is the congestion control mechanism the server expects to
   use.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /                Generic DCCP Header (12 bytes)                 /
   /                  with Type=1 (DCCP-Response)                  /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   Reserved    |            Acknowledgement Number             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      Options / [padding]                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                             data                              |
   |                              ...                              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Acknowledgement Number: 24 bits
        The Acknowledgement Number field, which appears in several
        packet types, acknowledges the greatest valid sequence number
        received so far on this connection. ("Greatest" is, of course,
        measured in circular sequence space.)
        In the case of a DCCP-Response packet, the acknowledgement
        number field will equal the sequence number from the
        DCCP-Request. Acknowledgement numbers make no attempt to provide
        precise information about which packets have arrived; options
        such as the Ack Vector do this.

        Some care is required in defining when a packet is "received"
        for purposes of acknowledgement. All valid packets received by a
        DCCP stack MUST be acknowledged as "received", even if their
        payloads were dropped (due to receive buffer overflow or payload
        corruption, for example). The receiving DCCP MUST have processed
        the options on every packet it reports as "received". The Data
        Dropped option (Section 8.7) helps the sending application
        determine when packet payloads were dropped by the receiving
        DCCP. This issue is discussed in somewhat more detail in
        Section 8.5.

   Reserved: 8 bits
        The version of DCCP specified here SHOULD set this field to all
        zeroes on generated packets, and ignore its value on received
        packets.

   Options
        The Data Dropped and Init Cookie options are particularly useful
        for DCCP-Response packets (Sections 8.7 and 6.5). In addition,
        DCCP-Response, or early DCCP-Data or DCCP-Ack packets, may
        include "Confirm(Connection Nonce)" and "Change(Connection
        Nonce)" options, to negotiate connection nonces (Section 6.4),
        as well as options to negotiate CCIDs and other relevant
        features.

   The receiver MAY respond to a DCCP-Request packet with a DCCP-Reset
   packet to refuse the connection.
   Relevant Reset Reasons for refusing a connection include "Connection
   Refused", when the DCCP-Request's Destination Port did not
   correspond to a DCCP port open for listening; "Bad Service Name",
   when the DCCP-Request's Service Name did not correspond to the
   service name registered with the Destination Port; and "Too Busy",
   when the server is currently too busy to respond to requests. The
   server SHOULD limit the rate at which it generates these resets.

   The receiver SHOULD NOT retransmit DCCP-Response packets; the sender
   will retransmit the DCCP-Request if necessary. The responder will
   detect that the retransmitted DCCP-Request applies to an existing
   connection because of its Source and Destination Ports. Every valid
   DCCP-Request received MUST elicit a new DCCP-Response, unless the
   responder can guarantee that the requestor has received at least one
   Response already. (For instance, if the responder has received a
   valid DCCP-Data or DCCP-Ack packet from the requestor, then it knows
   the newly received Request is old, and SHOULD ignore it.) Each new
   DCCP-Response MUST increment the Sequence Number, and possibly
   # NDP, by one.

5.6. DCCP-Data, DCCP-Ack, and DCCP-DataAck Packet Formats

   The payload of a DCCP connection is sent in DCCP-Data and
   DCCP-DataAck packets, while DCCP-Ack packets are used for
   acknowledgements when there is no payload to be sent. DCCP-Data
   packets look like this:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /                Generic DCCP Header (12 bytes)                 /
   /                    with Type=2 (DCCP-Data)                    /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      Options / [padding]                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                             data                              |
   |                              ...                              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   DCCP-Ack packets dispense with the data, but contain an
   acknowledgement number:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /                Generic DCCP Header (12 bytes)                 /
   /                    with Type=3 (DCCP-Ack)                     /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   Reserved    |            Acknowledgement Number             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      Options / [padding]                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   DCCP-DataAck packets contain both data and an acknowledgement
   number: acknowledgement information is piggybacked on a data packet.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /                Generic DCCP Header (12 bytes)                 /
   /                  with Type=4 (DCCP-DataAck)                   /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   Reserved    |            Acknowledgement Number             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      Options / [padding]                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                             data                              |
   |                              ...                              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   DCCP-Ack and DCCP-DataAck packets often include additional
   acknowledgement options, such as Ack Vector, as required by the
   congestion control mechanism in use.

   DCCP A sends DCCP-Data and DCCP-DataAck packets to DCCP B due to
   application events on host A. These packets are congestion-
   controlled by the CCID for the A-to-B half-connection. In contrast,
   DCCP-Ack packets sent by DCCP A are controlled by the CCID for the
   B-to-A half-connection.
   Generally, DCCP A will piggyback acknowledgement information on data
   packets when acceptable, creating DCCP-DataAck packets. DCCP-Ack
   packets are used when there is no data to send from DCCP A to
   DCCP B, or when the link from A to B is so congested that sending
   data would be inappropriate.

   Section 8, below, describes acknowledgements in DCCP.

   A DCCP-Data or DCCP-DataAck packet may contain no data bytes if the
   application sends a zero-length datagram. Such zero-length datagrams
   MUST be reported to the receiving application.

5.7. DCCP-CloseReq and DCCP-Close Packet Format

   The DCCP-CloseReq and DCCP-Close packets have the same format.
   However, only the server can send a DCCP-CloseReq packet. Either
   client or server may send a DCCP-Close packet. The receiver of a
   valid DCCP-Close packet SHOULD respond with a DCCP-Reset packet,
   with Reason set to "Closed"; the endpoint that originally sent the
   DCCP-Close will hold TimeWait state. The receiver of a valid
   DCCP-CloseReq packet SHOULD respond with a DCCP-Close packet; that
   receiving endpoint will expect to hold TimeWait state after later
   receiving a DCCP-Reset. See the state diagram in Section 5.3 for
   more information.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /                Generic DCCP Header (12 bytes)                 /
   /           with Type=5 or 6 (DCCP-CloseReq or Close)           /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   Reserved    |            Acknowledgement Number             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      Options / [padding]                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

5.8. DCCP-Reset Packet Format

   DCCP-Reset packets unconditionally shut down a connection.
   Every connection shutdown sequence ends with a DCCP-Reset, but
   resets may be sent for other reasons, including bad port numbers,
   bad option behavior, incorrect ECN Nonce Echoes, and so forth. The
   reason for a reset is represented by an eight-bit number, the Reason
   field, and 24 bits of additional data.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /                Generic DCCP Header (12 bytes)                 /
   /                   with Type=7 (DCCP-Reset)                    /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   Reserved    |            Acknowledgement Number             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    Reason     |    Data 1     |    Data 2     |    Data 3     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      Options / [padding]                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Reason: 8 bits
        The Reason field represents the reason that the sender reset the
        DCCP connection.

   Data 1, Data 2, and Data 3: 8 bits each
        The Data fields provide additional information about why the
        sender reset the DCCP connection. The meanings of these fields
        depend on the value of Reason.

   The following Reasons are currently defined. The "Data" columns
   describe what the Data fields should contain for a given Reason. In
   those columns, N/A means the Data field SHOULD be set to 0 by the
   sender of the DCCP-Reset, and ignored by its receiver.

                                                              Section
   Reason  Name                   Data 1   Data 2   Data 3    Reference
   ------  ----                   ------   ------   ------    ---------
      0    Unspecified             N/A      N/A      N/A
      1    Closed                  N/A      N/A      N/A      4
      2    Invalid Packet         packet    N/A      N/A      5.3
                                   type
      3    Option Error           option    option data
                                  number     (if any)
      4    Feature Error          feature   feature data
                                  number     (if any)
      5    Connection Refused      N/A      N/A      N/A      5.5
      6    Bad Service Name        N/A      N/A      N/A      5.4
      7    Too Busy                N/A      N/A      N/A      5.5
      8    Bad Init Cookie         N/A      N/A      N/A      6.5
     10    Unanswered Challenge    N/A      N/A      N/A      6.4.4
     11    Fruitless Negotiation  feature   feature data      6.3.7
                                  number    (optional)
     12    Aggression Penalty      N/A      N/A      N/A      9.2

5.9. DCCP-Move Packet Format

   The DCCP-Move packet type is part of DCCP's support for multihoming
   and mobility, which is described further in Section 10. DCCP A sends
   a DCCP-Move packet to DCCP B after changing its address and/or port
   number. The DCCP-Move packet requests that DCCP B start sending
   packets to the new address and port number. The old address and port
   are stored explicitly in the DCCP-Move header; the new address and
   port come from the packet's network header and generic DCCP header.
   The old address's type is indicated explicitly by an Old Address
   Family field. The Sequence Number and Acknowledgement Number fields
   and a mandatory Identification option provide some protection
   against hijacked connections. See Section 10 for more on security
   and DCCP's mobility support.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /                Generic DCCP Header (12 bytes)                 /
   /                    with Type=8 (DCCP-Move)                    /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   Reserved    |            Acknowledgement Number             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |      Old Address Family       |           Old Port            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /                          Old Address                          /
   /                                                / [padding]    /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |         Options, including Identification / [padding]         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                             data                              |
   |                              ...                              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Old Address Family: 16 bits
        The Old Address Family field indicates the address family
        formerly used for this connection, and takes values from the
        Address Family Numbers registry administered by IANA. Particular
        values include 1 for IPv4 and 2 for IPv6. An endpoint MUST
        discard DCCP-Move packets with unrecognized Old Address Family
        values.

   Old Port: 16 bits
        The former port number used by DCCP A's endpoint.

   Old Address: at least 32 bits
        The former address used by DCCP A's endpoint, padded on the
        right to a multiple of 32 bits. The form and size of the address
        are determined by the Old Address Family field. For instance, if
        Old Address Family is 1, then Old Address contains an IPv4
        address and takes 32 bits; if it is 2, then Old Address contains
        an IPv6 address and takes 128 bits.

   Options
        Every DCCP-Move packet MUST include a valid Identification
        option (see Section 6.4).
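
   As an illustration only, the DCCP-Move fields described above might
   be decoded as in the following Python sketch. It is not part of the
   protocol definition. The function name parse_move_fields and the
   shape of its return value are hypothetical, and the sketch assumes
   that multi-byte fields are in network byte order and that the
   12-byte generic header has already been checked by the caller.

```python
import ipaddress
import struct

GENERIC_HEADER_LEN = 12  # bytes, as shown in the packet diagrams

# Address Family Numbers named in the text: 1 = IPv4, 2 = IPv6.
# Values are the unpadded address sizes in bytes.
ADDRESS_SIZES = {1: 4, 2: 16}


def parse_move_fields(packet: bytes):
    """Decode the DCCP-Move fields that follow the generic header.

    Returns (ack_number, old_family, old_port, old_address,
    options_offset), or None if the packet is too short or the Old
    Address Family is unrecognized (such packets are discarded).
    """
    off = GENERIC_HEADER_LEN
    if len(packet) < off + 8:
        return None
    # One 32-bit word (Reserved + Acknowledgement Number), then two
    # 16-bit fields (Old Address Family, Old Port). Byte order is an
    # assumption of this sketch.
    word, old_family, old_port = struct.unpack_from("!IHH", packet, off)
    ack_number = word & 0xFFFFFF  # low 24 bits; top 8 bits are Reserved
    size = ADDRESS_SIZES.get(old_family)
    if size is None:
        return None
    addr_off = off + 8
    if len(packet) < addr_off + size:
        return None
    old_address = ipaddress.ip_address(packet[addr_off:addr_off + size])
    padded = (size + 3) // 4 * 4  # Old Address is padded to 32 bits
    return ack_number, old_family, old_port, old_address, addr_off + padded
```

   The returned options_offset marks where the options, including the
   mandatory Identification option, would begin; validating that option
   is left to the caller.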

   DCCP B SHOULD ignore the DCCP-Move if any of the following
   conditions holds:

   (1) Neither the Old Address/Old Port combination nor the network
       address/Source Port combination refers to a currently active
       DCCP connection.

   (2) The Identification option is not present or invalid.

   (3) DCCP B does not support mobility, or its Mobility Capable
       feature is off.

   DCCP B SHOULD NOT respond to such invalid Moves with DCCP-Reset
   packets, since any such resets would leak information about the
   connection, such as the current sequence number, to a possibly
   malicious host. After receiving such an invalid DCCP-Move, DCCP B
   MAY ignore subsequent DCCP-Move packets, valid or not, for a short
   period of time, such as one round-trip time. This protects DCCP B
   against denial-of-service attacks from floods of invalid DCCP-Moves.

   DCCP B SHOULD respond to a valid DCCP-Move packet with a DCCP-Ack or
   DCCP-DataAck packet acknowledging the move. If DCCP B accepts the
   move, it MUST send this acknowledgement to the network
   address/Source Port combination; if it rejects the move, which it
   MAY do for any reason, it MUST send the acknowledgement to the Old
   Address/Old Port combination.

   If the acknowledgement is lost, DCCP A might resend the DCCP-Move
   packet (using a new sequence number). DCCP B will detect this case
   because the network address/Source Port combination corresponds to a
   valid connection, for which the Sequence Number and Acknowledgement
   Number fields are valid; the Identification option is valid for that
   connection; and the Old Address/Old Port combination no longer
   refers to a valid DCCP connection. It SHOULD respond by sending
   another acknowledgement, as allowed by the congestion control
   mechanism in use.
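
   The DCCP-Move handling rules above can be summarized, under some
   simplifying assumptions, by the following Python sketch. The
   function name handle_move, the action strings, and the
   representation of endpoints as (address, port) tuples held in a set
   of active connections are all hypothetical. The sketch assumes the
   caller has already validated the Sequence Number, Acknowledgement
   Number, and Identification option, and it treats the case where
   both combinations refer to active connections as an ordinary move,
   which the text does not address.

```python
def handle_move(old_endpoint, new_endpoint, active, ident_valid,
                mobility_on, accept=True):
    """Decide how DCCP B reacts to a DCCP-Move.

    old_endpoint: (Old Address, Old Port) from the DCCP-Move header.
    new_endpoint: (network address, Source Port) of the packet.
    active:       set of endpoints with currently active connections.
    Returns (action, ack_destination).
    """
    # Conditions (2) and (3): bad Identification or no mobility support.
    if not ident_valid or not mobility_on:
        return ("ignore", None)
    old_known = old_endpoint in active
    new_known = new_endpoint in active
    # Condition (1): neither combination refers to an active connection.
    if not old_known and not new_known:
        return ("ignore", None)
    # Retransmitted Move: the connection has already moved, so the old
    # combination is gone. Send another acknowledgement.
    if new_known and not old_known:
        return ("re-ack", new_endpoint)
    # Valid Move: acknowledge to the new endpoint if accepted, or to
    # the old endpoint if rejected.
    if accept:
        return ("ack", new_endpoint)
    return ("ack", old_endpoint)
```

   Note that no branch produces a DCCP-Reset, matching the rule that
   invalid Moves are ignored rather than reset. Any acknowledgement
   would still be subject to the congestion control mechanism in use.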

   We note that DCCP mobility, as provided by DCCP-Move, may not be
   useful in the context of IPv6, with its mandatory support for Mobile
   IP.

6. Options and Features

   All DCCP packets may contain options, which occupy space at the end
   of the DCCP header and are a multiple of 8 bits in length. All
   options are always included in the checksum. An option may begin on
   any byte boundary.

   The first byte of an option is the option type. Options with types 0
   through 31 are single-byte options. Other options are followed by a
   byte indicating the option's length. This length value includes the
   two bytes of option-type and option-length as well as any option-
   data bytes, and MUST therefore be greater than or equal to two.

   The following options are currently defined:

           Option                                Section
   Type    Length     Meaning                    Reference
   ----    ------     -------                    ---------
     0        1       Padding                      6.1
     2        1       Slow Receiver                8.6
    32       3-4      Ignored                      6.2
    33     variable   Change                       6.3
    34     variable   Prefer                       6.3
    35     variable   Confirm                      6.3
    36     variable   Init Cookie                  6.5
    37     variable   Ack Vector [Nonce 0]         8.5
    38     variable   Ack Vector [Nonce 1]         8.5
    39     variable   Data Dropped                 8.7
    40        6       Timestamp                    6.6
    41      6-10      Timestamp Echo               6.8
    42     variable   Identification               6.4.3
    44     variable   Challenge                    6.4.4
    45        4       Payload Checksum             8.8
    46       4-6      Elapsed Time                 6.7
   128-255 variable   CCID-specific options        7.4

6.1. Padding Option

   The padding option, with type 0, is a single byte option used to pad
   between or after options. It either ensures the payload begins on a
   32-bit boundary (as required), or ensures alignment of following
   options (not mandatory).

   +--------+
   |00000000|
   +--------+
    Type=0

6.2. Ignored Option

   The Ignored option, with type 32, signals that a DCCP did not
   understand some option.
   This can happen, for example, when a
   conventional DCCP converses with an extended DCCP. Each Ignored
   option has one or two bytes of data. The first byte contains the
   offending option type; the second, if present, contains the first
   byte of the offending option's data. If the offending option had no
   data, the Ignored option MAY still supply two bytes of data, with
   the second byte set to 0.

   Ignored options SHOULD be sent only on packets that contain
   Acknowledgement Numbers (that is, DCCP-Response, DCCP-Ack,
   DCCP-DataAck, DCCP-Close, DCCP-CloseReq, DCCP-Reset, and DCCP-Move),
   and SHOULD concern options sent on the packet acknowledged by the
   Acknowledgement Number.

      +--------+--------+--------+
      |00100000|00000011|Opt Type|
      +--------+--------+--------+
       Type=32  Length=3

      +--------+--------+--------+--------+
      |00100000|00000100|Opt Type|Opt Data|
      +--------+--------+--------+--------+
       Type=32  Length=4

6.3. Feature Negotiation

   DCCP contains a mechanism for reliably negotiating features, notably
   the congestion control mechanism in use on each half-connection. The
   motivation is to implement reliable feature negotiation once, so
   that different options need not reinvent that wheel.

   Three options, Change, Prefer, and Confirm, implement feature
   negotiation. Change is sent to a feature's location, asking it to
   change the feature's value. The feature location may respond with
   Prefer, which asks the other endpoint to Change again with different
   values, or it may change the feature value and acknowledge the
   request with Confirm.

   Feature values MUST NOT change apart from feature negotiation, and
   enforced retransmissions make feature negotiation reliable. This
   ensures that both endpoints eventually agree on every feature's
   value.
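
   The choice a feature location makes between Confirm and Prefer can
   be sketched in C as follows. This is a minimal illustration under
   stated assumptions: feature values are single bytes, the feature
   location keeps a hypothetical list of locally acceptable values, and
   it confirms the first offered value found in that list. The
   function and type names are invented for the example; a real
   feature location may apply any feature-specific policy, including
   answering Prefer to a value it could have accepted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum feat_reply { REPLY_CONFIRM, REPLY_PREFER };

/*
 * Decide how a feature location answers a Change option.
 *   offered[]     values listed in the Change, most preferred first
 *   acceptable[]  values the feature location is willing to use
 *                 (hypothetical local policy)
 * On REPLY_CONFIRM, *chosen holds the value to set and to echo in
 * the Confirm option. On REPLY_PREFER, the caller would list
 * acceptable[] in a Prefer option and leave the feature unchanged.
 */
enum feat_reply answer_change(const uint8_t *offered, size_t n_offered,
                              const uint8_t *acceptable, size_t n_acceptable,
                              uint8_t *chosen)
{
    assert(offered != NULL && acceptable != NULL && chosen != NULL);
    for (size_t i = 0; i < n_offered; i++) {
        for (size_t j = 0; j < n_acceptable; j++) {
            if (offered[i] == acceptable[j]) {
                *chosen = offered[i];
                return REPLY_CONFIRM;
            }
        }
    }
    return REPLY_PREFER;
}
```

   Walking the offered list in order respects the requester's stated
   preference; scanning the local list first would instead favor the
   feature location's preference.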

   Some features are non-negotiable, meaning that the feature location
   MUST set its value to whatever the other endpoint requests. For
   non-negotiable features, the feature location MUST respond to Change
   options with Confirm; Prefer is not useful. These features use the
   feature framework simply to achieve reliability.

   Negotiations for multiple features may take place simultaneously.
   For instance, a packet may contain multiple Change options that
   refer to different features.

   Feature negotiation generally takes place using packet types that
   carry no user data, such as DCCP-Ack, particularly when the relevant
   feature may affect how data will be treated.

6.3.1. Feature Numbers

   The first data byte of every Change, Prefer, or Confirm option is a
   feature number, defining the type of feature being negotiated. The
   remainder of the data gives one or more values for the feature, and
   is interpreted according to the feature. The current set of feature
   numbers is as follows:

                                                  Section
      Number    Meaning                    Neg.?  Reference
      ------    -------                    -----  ---------
      1         Congestion Control (CC)    Y      7
      2         ECN Capable                Y      9.1
      3         Ack Ratio                  N      8.3
      4         Use Ack Vector             Y      8.4
      5         Mobility Capable           Y      10.1
      6         Loss Window                N      6.9
      7         Connection Nonce           N      6.4.2
      8         Identification Regime      Y      6.4.1
      128-255   CCID-Specific Features     ?      7.4

   The "Neg[otiable]?" column is "Y" for normal features and "N" for
   non-negotiable features.

6.3.2. Change Option

   DCCP A sends a Change option to DCCP B to ask it to change the value
   of some feature located at DCCP B. DCCP B SHOULD respond to a Change
   option for a known feature with either Prefer or Confirm.
   In special circumstances, such as a Change option whose value is
   inappropriate for the listed feature number or a negotiation that
   seems to be going on forever, DCCP B MAY respond instead by ignoring
   the Change (with or without sending an Ignored option), or by
   resetting the connection with Reason set to "Fruitless Negotiation"
   or "Feature Error". DCCP A SHOULD retransmit the Change option
   until it receives some relevant response. DCCP A will always
   generate a Change option in response to a Prefer option; it may also
   generate a Change option due to some application event.

      +--------+--------+--------+--------+--------+--------
      |00100001| Length |Feature#| Value or Values ...
      +--------+--------+--------+--------+--------+--------
       Type=33

6.3.3. Prefer Option

   DCCP A sends a Prefer option to DCCP B to ask it to choose another
   value for some feature located at DCCP B. DCCP B SHOULD respond to a
   valid Prefer option with a Change; other possible responses include
   ignoring the option, sending an Ignored option, or resetting the
   connection, as described above. DCCP A SHOULD retransmit the Prefer
   option until it receives some relevant response. DCCP A may generate
   a Prefer option in response to some Change option, or in response to
   some application event. Prefer options are not useful for
   non-negotiable features.

      +--------+--------+--------+--------+--------+--------
      |00100010| Length |Feature#| Value or Values ...
      +--------+--------+--------+--------+--------+--------
       Type=34

6.3.4. Confirm Option

   DCCP A sends a Confirm option to DCCP B to inform it that a Change
   option for some feature located at DCCP A has been accepted.
   Generally the Confirm option will include the feature's accepted
   value.
   For some special features, such as Connection Nonce, a
   Confirm option contains no data; these features are identified
   explicitly. DCCP A MUST generate Confirm options only in response
   to valid Change options. DCCP A SHOULD NOT retransmit Confirm
   options: DCCP B will retransmit the relevant Changes as necessary.
   The receipt of a valid Confirm option ends the negotiation over a
   feature's value.

      +--------+--------+--------+--------+--------+--------
      |00100011| Length |Feature#| Value ...
      +--------+--------+--------+--------+--------+--------
       Type=35

6.3.5. Example Negotiations

   This section demonstrates several negotiations of the congestion
   control feature for the A-to-B half-connection. (This feature is
   located at DCCP A.) In this sequence of packets, DCCP A is happy
   with DCCP B's suggestion of CC mechanism 2:

      B > A  Change(CC, 2)
      A > B  Confirm(CC, 2)

   Here, A and B jointly settle on CC mechanism 5:

      B > A  Change(CC, 3, 4)
      A > B  Prefer(CC, 1, 2, 5)
      B > A  Change(CC, 5)
      A > B  Confirm(CC, 5)

   In this sequence, A refuses to use CC mechanism 5. If this sequence
   continued, one or the other endpoint would eventually abort the
   connection via a DCCP-Reset packet with Reason set to "Fruitless
   Negotiation":

      B > A  Change(CC, 3, 4, 5)
      A > B  Prefer(CC, 1, 2)
      B > A  Change(CC, 5)
      A > B  Prefer(CC, 1, 2)

   Here, A elicits agreement from B that it is satisfied with
   congestion control mechanism 2:

      A > B  Prefer(CC, 1, 2)
      B > A  Change(CC, 2)
      A > B  Confirm(CC, 2)

6.3.6. Unknown Features

   If a DCCP receives a Change or Prefer option referring to a feature
   number it does not understand, it MUST respond with an Ignored
   option. This informs the remote DCCP that the local DCCP does not
   implement the feature. No other action need be taken.
   (Ignored may also indicate that the DCCP endpoint could not respond
   to a CCID-specific feature request because the CCID was in flux; see
   Section 7.4.)

6.3.7. State Diagram

   These state diagrams present the legal transitions in a DCCP feature
   negotiation. They define DCCP's states and transitions with respect
   to the negotiation of a single feature it understands. There are two
   diagrams, corresponding to the two endpoints: the feature location
   DCCP A, and what we call the "feature requester", DCCP B.

   Transitions between states are triggered by receiving a packet
   ("RECV") or by an application event ("APP"). Received packets are
   further distinguished by any options relevant to the feature being
   negotiated. "RECV -" means the packet contained no relevant option.
   "RECV Chg" denotes a Change option, "RECV Pr" a Prefer option, and
   "RECV Cfm" a Confirm option. The data contained in an option is
   given in parentheses when necessary. The "SEND" action indicates
   which option the DCCP will send next. Finally, the "SET-VALUE"
   action causes the DCCP to change its value for the relevant feature.

   "SEND" does not force DCCP to immediately generate a packet; rather,
   it says which feature option must be sent on the next packet
   generated. A DCCP MAY choose to generate a packet in response to
   some "SEND" action. However, it MUST NOT generate a packet if doing
   so would violate the congestion control mechanism in use.

   The requester, DCCP B, has four states: Known, Unknown, Failed, and
   Changing. Similarly, the feature location, DCCP A, has four states:
   Known, Unknown, Failed, and Confirming. In both cases, Known denotes
   a state where the DCCP knows the feature's current value, and
   believes that the other DCCP agrees. Changing and Confirming denote
   states where the DCCPs are in the process of negotiating a new value
   for the feature.
   The Unknown state can occur only at connection
   setup time. It denotes a state where the DCCP does not know any
   value for the feature, and has not yet entered a negotiation to
   determine its value. Finally, the Failed state represents a state
   where the other DCCP does not implement the feature under
   negotiation.

   A DCCP may start in either the Unknown or Known state, depending on
   the feature in question. In particular, some features have a
   well-known value for new connections, in which case the DCCPs begin
   the connection in the Known states.

                  REQUESTER STATE DIAGRAM (DCCP B)

                       +-----------+
                       |  Unknown  |
                       +-----------+
         +----------+        |                  +---------+
         |          |RECV -  |RECV -/Pr | APP   |         |RECV Pr/Cfm
         V          |SEND -  |SEND Chg          V         |SEND Chg
   +-----------+    |        |           +------------+   |
   |           |----+        +---------->|            |---+
   |   Known   |------------------------>|  Changing  |
   |           |      RECV Pr | APP      |            |---+
   +-----------+      SEND Chg           +------------+   |RECV -
         ^                                  |  |  ^       |SEND -/Chg
         |                                  |  |  |       |
         +----------------------------------+  |  +-------+
                 RECV Cfm(O)                   |            +----------+
                 SEND -                        +----------->|  Failed  |
                 SET-VALUE O                     RECV Ign   +----------+
                                                 SEND -

              FEATURE LOCATION STATE DIAGRAM (DCCP A)
   (O represents any feature value acceptable to DCCP A; X is not
   acceptable.)

       RECV Chg(O)
       SEND Cfm(O)                   RECV - | APP
       SET-VALUE O     +-----------+ SEND Pr(O)
     +-----------------|  Unknown  |-------+
     |                 +-----------+       |
     |      +-------+        |             |    +---------+
     |      |       |RECV -  |RECV Chg(X)  |    |         |RECV Chg(X)
     V      V       |SEND -  |SEND Pr(O)   V    V         |SEND Pr(O)
   +-----------+    |        |           +------------+   | (need not be
   |           |----+        +---------->|            |---+  the same O)
   |   Known   |------------------------>| Confirming |
   |           |----+  RECV Chg | APP    |            |---+
   +-----------+    |  SEND Pr(O)        +------------+   |RECV -
     ^      ^       |                       |  |  ^       |SEND -/Pr(O)
     |      |       |RECV Chg(O)            |  |  |       |
     |      |       |SEND Cfm(O)            |  |  +-------+
     |      |       |SET-VALUE O            |  |
     |      +-------+                       |  |            +----------+
     +--------------------------------------+  +----------->|  Failed  |
                 RECV Chg(O)                     RECV Ign   +----------+
                 SEND Cfm(O)                     SEND -
                 SET-VALUE O

   This specification allows several choices of action in certain
   states. The implementation will generally use feature-specific
   information to decide how to respond. For example, DCCP A in the
   Known state may respond to a Change option with either Confirm or
   Prefer. If DCCP A is willing to set the feature to the value
   specified by Change, it will generally send Confirm; but if it would
   like to negotiate further, it will send Prefer.

   DCCP B retransmits Change options, and DCCP A retransmits Prefer
   options, until receiving a relevant response. However, they need not
   retransmit the option on every packet, as shown by the "RECV - /
   SEND -" transitions in the Changing and Confirming states.

   These state diagrams guarantee safety, but not liveness. Namely, no
   unexpected or erroneous options will be sent, but option negotiation
   might not terminate. For example, the following infinite negotiation
   is legal according to this specification.

      A > B  Prefer(1)
      B > A  Change(2)
      A > B  Prefer(1)
      B > A  Change(2)...
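
   As an illustration, the requester (DCCP B) transitions described in
   this section can be written as a C transition function. This is a
   sketch, not a normative encoding. All identifiers are invented for
   the example. The Changing state's "RECV Pr/Cfm, SEND Chg" loop is
   interpreted here as covering a Confirm that does not settle the
   negotiation (for instance a stale one); that reading is an
   assumption. Where the diagram allows "SEND -/Chg" on a packet with
   no relevant option, the sketch sends nothing and leaves
   retransmission timing to the caller.

```c
#include <assert.h>

enum req_state { REQ_UNKNOWN, REQ_KNOWN, REQ_CHANGING, REQ_FAILED };

enum req_event {
    EV_RECV_NONE,           /* "RECV -": no relevant option on the packet */
    EV_RECV_PREFER,         /* "RECV Pr" */
    EV_RECV_CONFIRM_OK,     /* "RECV Cfm(O)": Confirm that settles the value */
    EV_RECV_CONFIRM_OTHER,  /* Confirm that does not settle it (assumption) */
    EV_RECV_IGNORED,        /* "RECV Ign" */
    EV_APP                  /* "APP": application asks for a new value */
};

enum req_send { SEND_NOTHING, SEND_CHANGE };

struct req_step {
    enum req_state next;  /* state after the transition */
    enum req_send send;   /* option to place on the next packet */
    int set_value;        /* nonzero: adopt the confirmed value */
};

struct req_step requester_step(enum req_state s, enum req_event e)
{
    struct req_step r = { s, SEND_NOTHING, 0 };

    assert(s <= REQ_FAILED && e <= EV_APP);
    switch (s) {
    case REQ_UNKNOWN:
        if (e == EV_RECV_NONE || e == EV_RECV_PREFER || e == EV_APP) {
            r.next = REQ_CHANGING;
            r.send = SEND_CHANGE;
        }
        break;
    case REQ_KNOWN:
        if (e == EV_RECV_PREFER || e == EV_APP) {
            r.next = REQ_CHANGING;
            r.send = SEND_CHANGE;
        }
        break;
    case REQ_CHANGING:
        if (e == EV_RECV_PREFER || e == EV_RECV_CONFIRM_OTHER) {
            r.send = SEND_CHANGE;       /* keep negotiating */
        } else if (e == EV_RECV_CONFIRM_OK) {
            r.next = REQ_KNOWN;
            r.set_value = 1;
        } else if (e == EV_RECV_IGNORED) {
            r.next = REQ_FAILED;        /* peer lacks the feature */
        }
        break;
    case REQ_FAILED:
        break;                          /* terminal in this sketch */
    }
    return r;
}
```

   Because the function is pure, a negotiation time limit such as the
   one discussed next can be layered on top by the caller without
   touching the transition logic.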

   Implementations MAY choose to enforce a maximum length on any
   negotiation---for example, by resetting the connection when any
   negotiation lasts more than some maximum time. The DCCP-Reset Reason
   "Fruitless Negotiation" SHOULD be used to signal that a connection
   was aborted because of a negotiation that took too long.

   In the Changing and Confirming states, the value of the
   corresponding feature is in flux. DCCP MAY change its behavior in
   these states---for example, by refusing to send data until
   reentering a Known state.

6.4. Identification Options

   The Identification options provide a way for DCCP endpoints to
   confirm each other's identities, even after changes of address
   (Section 10) or long bursts of loss that get the endpoints out of
   sync (Section 5.2). Again, DCCP as specified here does not provide
   cryptographic security guarantees, and attackers that can see every
   packet are still capable of manipulating DCCP connections
   inappropriately, but the Identification options make it more
   difficult for some kinds of attacks to succeed.

   The Identification option is used to prove an endpoint's identity,
   while a Challenge option elicits an Identification from the other
   endpoint. An Identification Regime determines how the
   Identifications are calculated. In the default MD5 Regime, the
   calculation involves an MD5 hash over packet data and two Connection
   Nonces, either exchanged at the beginning of the connection or
   implicitly agreed upon.

6.4.1. Identification Regime Feature

   Identification Regime has feature number 8. The ID Regime feature
   located at DCCP B specifies the algorithm that DCCP A will use for
   its Identification options. Each endpoint must keep track of both
   its ID regime and, via the ID Regime feature, the regime used by the
   other endpoint.

   The value of ID Regime is a two-byte number, so a valid Confirm(ID
   Regime) option takes exactly four bytes. Change or Prefer options
   MAY list multiple ID Regimes in descending order of preference. ID
   Regime defaults to 0, the MD5 Regime. Applications preferring
   different security guarantees, particularly around mobility issues,
   may prefer to implement another identification algorithm and assign
   it a different ID Regime value.

   The ID Regime feature is negotiable, so an endpoint can request that
   the other endpoint use a particular ID Regime, or one of a set of
   Regimes, by sending a Prefer option. If the endpoints cannot agree
   on mutually acceptable ID Regimes, the connection SHOULD be reset
   due to "Fruitless Negotiation".

6.4.2. Connection Nonce Feature

   Connection Nonce has feature number 7. The Connection Nonce feature
   located at DCCP B is the value of DCCP A's connection nonce. Each
   endpoint SHOULD keep track of both its nonce and the other
   endpoint's nonce. Connection Nonces are used by Identification
   Regime 0.

   The Connection Nonce feature takes arbitrary values at least 4
   bytes long. A Change(Connection Nonce) option therefore takes at
   least 6 bytes. Confirm(Connection Nonce) options MUST NOT contain
   the relevant value, so a Confirm(Connection Nonce) option takes
   exactly 2 bytes.

   Connection Nonce defaults to a random 8-byte string. To prevent
   spoofing, this string MUST NOT have any trivially predictable value.
   For example, it MUST NOT be set deterministically to zero, and it
   SHOULD change on every connection. DCCP endpoints MAY, however,
   exchange Connection Nonces via some mechanism other than the
   plaintext, snoopable Connection Nonce option.

   This feature is non-negotiable.

6.4.3. Identification Option

   The Identification option serves as confirmation that a packet was
   sent by an endpoint involved in the initiation of the DCCP
   connection. It is permitted in any DCCP packet, but it might not be
   useful until the endpoints have exchanged security information such
   as connection nonces. The option takes the following form:

      +--------+--------+--------+--------+--------+--------
      |00101010| Length |       Identification Data ...
      +--------+--------+--------+--------+--------+--------
       Type=42

   The particular data included in an Identification option sent by
   DCCP A depends on the ID Regime in force for the A-to-B sequence,
   which is the value of the ID Regime feature located at DCCP B. The
   remainder of this section describes ID Regime 0, the default MD5
   Regime.

   The Identification data provided for the MD5 Regime consists of a
   16-byte MD5 digest of: the second and fourth 32-bit words in the
   generic DCCP header, including the Sequence and Acknowledgement
   Numbers; the value of the sender's Connection Nonce; and the value
   of the other endpoint's Connection Nonce, in that order. The total
   length of the option is therefore 18 bytes, and the option may only
   be provided on packets that contain Acknowledgement Numbers, such as
   DCCP-Ack. Inclusion of the two Connection Nonces ensures that
   attackers cannot fake an Identification Option, unless they snooped
   on the beginning of the connection when nonces are exchanged. (No
   mechanism protects against snoopers who know Connection Nonces,
   since DCCP as specified here does not provide strong cryptographic
   security guarantees; see Section 16.) Inclusion of the Sequence and
   Acknowledgement Numbers protects against replay attacks within the
   connection.

   To check an Identification option's value, the receiver simply
   calculates the MD5 digest itself and compares that against the
   option data. The MD5 calculation can be expensive, so an attacker
   could conceivably disable a DCCP endpoint by sending it a flood of
   invalid packets with bad Identification options. Rate limits
   described in Sections 5.2 and 10 mitigate this issue. The receiver
   MAY ignore an Identification option if it occurs on a packet that
   would otherwise be considered valid.

   Example C code for constructing the option's value follows:

      /* MD5_CTX and the MD5_* calls are assumed to come from an MD5
         library with this interface, e.g. OpenSSL. */
      #include <openssl/md5.h>

      /* Inputs, supplied by the caller. */
      unsigned char *packet_data;
      int packet_length;
      int id_option_offset;   /* offset of option in packet_data */

      const unsigned char *my_nonce, *other_nonce;
      int my_nonce_length, other_nonce_length;

      MD5_CTX md5_context;

      MD5_Init(&md5_context);
      /* second and fourth 32-bit words of the generic header */
      MD5_Update(&md5_context, packet_data + 4, 4);
      MD5_Update(&md5_context, packet_data + 12, 4);
      /* sender's nonce first, then the other endpoint's nonce */
      MD5_Update(&md5_context, my_nonce, my_nonce_length);
      MD5_Update(&md5_context, other_nonce, other_nonce_length);
      packet_data[id_option_offset] = 42;    /* option type */
      packet_data[id_option_offset+1] = 18;  /* option length */
      /* write the 16-byte digest as the option data */
      MD5_Final(packet_data + id_option_offset + 2, &md5_context);

6.4.4. Challenge Option

   This option informs the receiving DCCP that one of its packets was
   ignored, and that succeeding packets will be ignored until the
   endpoint sends a correct Identification option. The receiving DCCP
   SHOULD include an Identification option on the next packet it sends.
   The option takes the following form:

      +--------+--------+--------+--------+--------+--------
      |00101100| Length |       Identification Data ...
      +--------+--------+--------+--------+--------+--------
       Type=44

   The Identification Data on a packet sent by DCCP B is the same as
   that for an Identification option sent by DCCP B.
   The receiver
   SHOULD ignore a Challenge option, and the packet containing the
   Challenge option, if the Identification Data is incorrect. The
   purpose of this mechanism is to prevent denial-of-service attacks
   where an attacker could cause the receiver to send many packets with
   expensive-to-compute Identification options, since the receiver MAY
   ignore Challenge options for some time after receiving an invalid
   Challenge.

   If, after several Challenge options, a DCCP is unable to elicit a
   valid Identification from its partner, it MAY reset the connection
   with Reason "Unanswered Challenge".

6.5. Init Cookie Option

   This option is permitted in DCCP-Response, DCCP-Data, and
   DCCP-DataAck messages. The option MAY be returned by the server in a
   DCCP-Response. If so, then the client MUST echo the same Init
   Cookie option in its ensuing DCCP-Data or DCCP-DataAck message. The
   server SHOULD respond to an invalid Init Cookie option by resetting
   the connection with Reason set to "Bad Init Cookie".

   The purpose of this option is to allow a DCCP server to avoid having
   to hold any state until the three-way connection setup handshake has
   completed. The server wraps up the service name, server port, and
   any options it cares about from both the DCCP-Request and
   DCCP-Response in an opaque cookie. Typically the cookie will be
   encrypted using a secret known only to the server and include a
   cryptographic checksum or magic value so that correct decryption can
   be verified. When the server receives the cookie back from the
   client, it can decrypt the cookie and instantiate all the state it
   avoided keeping.

   The precise implementation of the Init Cookie does not need to be
   specified here; since Init Cookies are opaque to the client, there
   are no interoperability concerns.

      +--------+--------+--------+--------+--------+--------
      |00100100| Length |        Init Cookie Value ...
      +--------+--------+--------+--------+--------+--------
       Type=36

6.6. Timestamp Option

   This option is permitted in any DCCP packet. The length of the
   option is 6 bytes.

      +--------+--------+--------+--------+--------+--------+
      |00101000|00000110|          Timestamp Value          |
      +--------+--------+--------+--------+--------+--------+
       Type=40  Length=6

   The four bytes of option data carry the timestamp of this packet in
   some undetermined form. A DCCP receiving a Timestamp option SHOULD
   respond with a Timestamp Echo option on the next packet it sends.

6.7. Elapsed Time Option

   This option is permitted in any DCCP packet that contains an
   Acknowledgement Number. It indicates how much time, in tenths of
   milliseconds, has elapsed since the packet being acknowledged---the
   packet with the given Acknowledgement Number---was received. The
   option may take up 4 or 6 bytes, depending on how large Elapsed Time
   is.

      +--------+--------+--------+--------+
      |00101110|00000100|   Elapsed Time  |
      +--------+--------+--------+--------+
       Type=46    Len=4

      +--------+--------+--------+--------+--------+--------+
      |00101110|00000110|            Elapsed Time           |
      +--------+--------+--------+--------+--------+--------+
       Type=46    Len=6

   The option data, Elapsed Time, represents the amount of time, in
   tenths of milliseconds, elapsed since the packet being acknowledged
   was received. If Elapsed Time is at most 6.5535 seconds, so that it
   fits in two bytes, the first, more parsimonious form of the option
   SHOULD be used. Elapsed Times of more than 6.5535 seconds MUST be
   sent using the second form of the option.
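
   The choice between the two forms can be sketched in C as follows.
   The function name and buffer interface are hypothetical, and the
   use of network byte order (most significant byte first) for the
   data field is an assumption of this example. The sketch picks the
   4-byte form whenever the value fits in two bytes, that is, for at
   most 65535 tenths of a millisecond (6.5535 seconds).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DCCP_OPT_ELAPSED_TIME 46

/*
 * Write an Elapsed Time option into buf.
 *   elapsed  time in tenths of milliseconds
 * Returns the option length (4 or 6), or 0 if buf is too small.
 * Byte order (big-endian) is an assumption of this sketch.
 */
size_t encode_elapsed_time(uint8_t *buf, size_t buflen, uint32_t elapsed)
{
    assert(buf != NULL);
    size_t len = (elapsed <= 0xFFFFu) ? 4 : 6;

    if (buflen < len)
        return 0;
    buf[0] = DCCP_OPT_ELAPSED_TIME;  /* option type */
    buf[1] = (uint8_t)len;           /* length includes type and length bytes */
    if (len == 4) {
        buf[2] = (uint8_t)(elapsed >> 8);
        buf[3] = (uint8_t)elapsed;
    } else {
        buf[2] = (uint8_t)(elapsed >> 24);
        buf[3] = (uint8_t)(elapsed >> 16);
        buf[4] = (uint8_t)(elapsed >> 8);
        buf[5] = (uint8_t)elapsed;
    }
    return len;
}
```

   For example, an elapsed time of 2.5 milliseconds is passed as 25
   and produces the 4-byte form, while 10 seconds is passed as 100000
   and requires the 6-byte form.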

   Elapsed Time is measured in tenths of milliseconds as a compromise
   between two conflicting goals: first, to provide enough granularity
   to reduce aliasing noise when measuring elapsed time over fast LANs;
   and second, to allow most reasonable elapsed times to fit into two
   bytes of data.

6.8. Timestamp Echo Option

   This option is permitted in any DCCP packet, as long as at least one
   packet carrying the Timestamp option has been received. The length
   of the option is between 6 and 10 bytes, depending on whether
   Elapsed Time is included and how large it is.

      +--------+--------+--------+--------+--------+--------+
      |00101001|00000110|           Timestamp Echo          |
      +--------+--------+--------+--------+--------+--------+
       Type=41    Len=6

      +--------+--------+------- ... -------+--------+--------+
      |00101001|00001000|  Timestamp Echo   |   Elapsed Time  |
      +--------+--------+------- ... -------+--------+--------+
       Type=41    Len=8       (4 bytes)

      +--------+--------+------- ... -------+------- ... -------+
      |00101001|00001010|  Timestamp Echo   |    Elapsed Time   |
      +--------+--------+------- ... -------+------- ... -------+
       Type=41   Len=10       (4 bytes)           (4 bytes)

   The first four bytes of option data, Timestamp Echo, carry a
   Timestamp Value taken from a preceding received Timestamp option.
   Usually, this will be the last packet that was received---the packet
   indicated by the Acknowledgement Number, if any---but it might be a
   preceding packet.

   The Elapsed Time field is similar to the value stored in the Elapsed
   Time option. If present, it indicates the amount of time elapsed
   since receiving the packet whose timestamp is being echoed. This
   time MUST be in tenths of milliseconds. Elapsed Time is meant to
   help the Timestamp sender separate the network round-trip time from
   the Timestamp receiver's processing time.
   This may be particularly
   important for CCIDs where acknowledgements are sent infrequently, so
   that there might be considerable delay between receiving a Timestamp
   option and sending the corresponding Timestamp Echo. A missing
   Elapsed Time field is equivalent to an Elapsed Time of zero. The
   smallest version of the option SHOULD be used that can hold the
   relevant Elapsed Time value.

6.9. Loss Window Feature

   Loss Window has feature number 6. The Loss Window feature located at
   DCCP B is the width of the window DCCP B uses to determine whether
   packets from DCCP A are valid. Packets outside this window will be
   dropped by DCCP B as old duplicates or spoofing attempts; see
   Section 5.2 for more information. DCCP A sends a "Change(Loss
   Window, W)" option to DCCP B to set DCCP B's Loss Window to W.

   The Loss Window feature takes 3-byte integer values, like DCCP
   sequence numbers. Change and Confirm options for Loss Window are
   therefore 6 bytes long.

   Loss Window defaults to 1000 for new connections. The Loss Window
   value is the total width of the loss window. The receiver may
   position the loss window asymmetrically around the greatest sequence
   number seen---for example, by allocating 1/4 of the loss window
   width for older sequence numbers and 3/4 of it for newer sequence
   numbers.

   This feature is non-negotiable.

7. Congestion Control IDs

   Each congestion control mechanism supported by DCCP is assigned a
   congestion control identifier, or CCID: a number from 0 to 255.
   During connection setup, and optionally thereafter, the endpoints
   negotiate their congestion control mechanisms by negotiating the
   values for their Congestion Control features. Congestion Control has
   feature number 1. The feature located at DCCP A is the CCID in use
   for the A-to-B half-connection.
   DCCP B sends a "Change(CC, K)"
   option to DCCP A to ask A to use CCID K for its data packets.

   The data bytes of Congestion Control feature negotiation options
   form a list of acceptable CCIDs, sorted in descending order of
   priority. For example, the option "Change(CC, 1, 2, 3)" asks the
   sender to use CCID 1, although CCIDs 2 and 3 are also acceptable.
   (This corresponds to the bytes "33, 6, 1, 1, 2, 3": Change option
   (33), option length (6), feature ID (1), CCIDs (1, 2, 3).)
   Similarly, "Confirm(CC, 1, 2, 3)" tells the receiver that the sender
   is using CCID 1, but that CCIDs 2 or 3 might also be acceptable.

   The CCIDs defined by this document are:

      CCID   Meaning
      ----   -------
      0      Reserved
      1      Unspecified Sender-Based Congestion Control
      2      TCP-like Congestion Control
      3      TFRC Congestion Control

   A new connection starts with CCID 2 for both DCCPs. If this is
   unacceptable for either DCCP, that DCCP will start in the Unknown
   state. A DCCP SHOULD NOT send data when its Congestion Control
   feature is in the Unknown state.

   All CCIDs standardized for use with DCCP will correspond to
   congestion control mechanisms previously standardized by the IETF.
   We expect that for quite some time, all such mechanisms will be
   TCP-friendly, but TCP-friendliness is not an explicit DCCP
   requirement.

   A DCCP implementation intended for general use---in a
   general-purpose operating system kernel, for example---SHOULD
   implement at least CCIDs 1 and 2. The intent is to make these CCIDs
   broadly available for interoperability, although any given
   application might disallow their use via the feature negotiation
   process.

7.1. Unspecified Sender-Based Congestion Control

   CCID 1 denotes an unspecified sender-based congestion control
   mechanism. Separate features negotiate the corresponding congestion
   acknowledgement options---for example, Ack Vector.
   This provides a
   limited, controlled form of interoperability for new IETF-approved
   CCIDs.

   Implementors MUST NOT use CCID 1 in production environments as a
   proxy for congestion control mechanisms that have not entered the
   IETF standards process. We intend that any production use of CCID 1
   would have to be explicitly approved first by the IETF. Middleboxes
   MAY choose to treat the use of CCID 1 as experimental or
   unacceptable.

   For example, say that CCID 98, a new sender-based congestion control
   mechanism using Ack Vector for acknowledgements, has entered the
   IETF standards process, and the IETF has approved the use of CCID 1
   as a backup for CCID 98. Now, DCCP A, which understands and would
   like to use CCID 98, is trying to communicate with DCCP B, which
   doesn't yet know about CCID 98. DCCP A can simply negotiate use of
   CCID 1 and, separately, negotiate Use Ack Vector. DCCP B will
   provide the feedback DCCP A requires for CCID 98, namely Ack Vector,
   without needing to understand the congestion control mechanism in
   use.

   CCID 1 has no sender implementation; it is exclusively meaningful at
   the receiver to support forward compatibility. The sender always
   uses a specific congestion control mechanism whose CCID is not 1.
   However, the code implementing a CCID that requires only generic
   feedback, such as Ack Vector, MAY add CCID 1 to the list of
   acceptable CCIDs sent to the receiver (following the actual CCID),
   facilitating communication with receivers that do not understand the
   actual CCID.

   Any CCID feature negotiation in which the sender proposes the use of
   CCID 1 without any other CCID is considered erroneous, and SHOULD
   result in connection reset, with Reason set to "Fruitless
   Negotiation".

   DCCP implementations MAY provide APIs that allow applications to
   suggest preferred CCIDs for sending and receiving data.
   Any such API
   MUST NOT allow sending applications to suggest CCID 1; again, CCID 1
   will be suggested when appropriate by the code implementing the
   preferred CCID. In contrast, APIs SHOULD let applications allow or
   prevent the use of CCID 1 for receiving.

7.2. TCP-like Congestion Control

   CCID 2 denotes Additive Increase, Multiplicative Decrease (AIMD)
   congestion control with behavior modelled directly on TCP, including
   congestion window, slow start, timeouts, and so forth. CCID 2 is
   further described in [CCID 2 PROFILE].

7.3. TFRC Congestion Control

   CCID 3 denotes TCP-Friendly Rate Control, an equation-based
   rate-controlled congestion control mechanism. CCID 3 is further
   described in [CCID 3 PROFILE].

7.4. CCID-Specific Options and Features

   Option and feature numbers 128 through 255 are available for
   CCID-specific use. CCIDs may often need new option types---for
   communicating acknowledgement or rate information, for example.
   CCID-specific option types let them create options at will without
   polluting the global option space. Option 128 might have different
   meanings on a half-connection using CCID 4 and a half-connection
   using CCID 8. CCID-specific options and features will never conflict
   with global options introduced by later versions of this
   specification.

   Any packet may contain information meant for either half-connection,
   so CCID-specific option and feature numbers explicitly signal the
   half-connection to which they apply. Option numbers 128 through 191
   are for options sent from the HC-Sender to the HC-Receiver; option
   numbers 192 through 255 are for options sent from the HC-Receiver to
   the HC-Sender. Similarly, feature numbers 128 through 191 are for
   features located at the HC-Sender; feature numbers 192 through 255
   are for features located at the HC-Receiver.
(Change options for a feature are sent to the feature location;
Prefer and Confirm options are sent from the feature location. Thus,
Change(128) options are sent by the HC-Receiver by definition, while
Change(192) options are sent by the HC-Sender.)

For example, consider a DCCP connection where the A-to-B
half-connection uses CCID 4 and the B-to-A half-connection uses
CCID 5. Here is how a sampling of CCID-specific options and features
are assigned to half-connections:

                                Relevant    Relevant
   Packet  Option               Half-conn.  CCID
   ------  ------               ----------  ----
   A > B   128                  A-to-B      4
   A > B   192                  B-to-A      5
   A > B   Change(128, ...)     B-to-A      5
   A > B   Prefer(128, ...)     A-to-B      4
   A > B   Confirm(128, ...)    A-to-B      4
   A > B   Change(192, ...)     A-to-B      4
   A > B   Prefer(192, ...)     B-to-A      5
   A > B   Confirm(192, ...)    B-to-A      5

CCID-specific options and features have no clear meaning when the
relevant CCID is in flux. A DCCP SHOULD respond to CCID-specific
options and features with Ignored options during those times.

8. Acknowledgements

Congestion control requires receivers to transmit information about
packet losses and ECN marks to senders. DCCP receivers MUST report
all congestion they see, as defined by the relevant CCID profile.
Each CCID says when acknowledgements should be sent, what options
they must use, how they should be congestion controlled, and so on.

Most acknowledgements use DCCP options. For example, on a
half-connection with CCID 2 (TCP-like), the receiver reports
acknowledgement information using the Ack Vector option. This
section describes common acknowledgement options and shows how acks
using those options will commonly work. Full descriptions of the
acknowledgement mechanisms used for each CCID are laid out in the
CCID profile specifications.

Acknowledgement options, such as Ack Vector, generally depend on the
DCCP Acknowledgement Number, and are thus only allowed on packet
types that carry that number (all packets except DCCP-Request and
DCCP-Data). However, detailed acknowledgement options are not
generally necessary on DCCP-Resets.

8.1. Acks of Acks and Unidirectional Connections

DCCP was designed to work well for both bidirectional and
unidirectional flows of data, and for connections that transition
between these states. However, acknowledgements required for a
unidirectional connection are very different from those required for
a bidirectional connection. In particular, unidirectional
connections need to worry about acks of acks.

The ack-of-acks problem arises because some acknowledgement
mechanisms are reliable. For example, an HC-Receiver using CCID 2,
TCP-like Congestion Control, sends Ack Vectors containing completely
reliable acknowledgement information. The HC-Sender should
occasionally inform the HC-Receiver that it has received an ack. If
it did not, the HC-Receiver might resend complete Ack Vector
information, going back to the start of the connection, with every
DCCP-Ack packet! However, note that acks-of-acks need not be
reliable themselves: when an ack-of-acks is lost, the HC-Receiver
will simply maintain old acknowledgement-related state for a little
longer. Therefore, there is no need for acks-of-acks-of-acks.

When communication is bidirectional, any required acks-of-acks are
automatically contained in normal acknowledgements for data packets.
On a unidirectional connection, however, the receiver DCCP sends no
data, so the sender would not normally send acknowledgements.
Therefore, the CCID in force on that half-connection must explicitly
say whether, when, and how the HC-Sender should generate
acks-of-acks.

For example, consider a bidirectional connection where both
half-connections use the same CCID (either 2 or 3), and where DCCP B
goes "quiescent". This means that the connection becomes
unidirectional: DCCP B stops sending data, and sends only DCCP-Ack
packets to DCCP A. For CCID 2, TCP-like Congestion Control, DCCP B
uses Ack Vector to reliably communicate which packets it has
received. As described above, DCCP A must occasionally acknowledge a
pure acknowledgement from DCCP B, so that DCCP B can free old Ack
Vector state. For instance, DCCP A might send a DCCP-DataAck packet
every now and then, instead of DCCP-Data. In contrast, for CCID 3,
TFRC Congestion Control, DCCP B's acknowledgements generally need
not be reliable, since they contain cumulative loss rates; TFRC
works even if every DCCP-Ack is lost. Therefore, DCCP A need never
acknowledge an acknowledgement.

When communication is unidirectional, a single CCID---in the
example, the A-to-B CCID---controls both DCCPs' acknowledgements, in
terms of their content, their frequency, and so forth. For
bidirectional connections, the A-to-B CCID governs DCCP B's
acknowledgements (including its acks of DCCP A's acks), while the
B-to-A CCID governs DCCP A's acknowledgements.

DCCP A switches its ack pattern from bidirectional to unidirectional
when it notices that DCCP B has gone quiescent. It switches from
unidirectional to bidirectional when it must acknowledge even a
single DCCP-Data or DCCP-DataAck packet from DCCP B. (This includes
the case where a single DCCP-Data or DCCP-DataAck packet was lost in
transit, which is detectable using the # NDP field in the DCCP
packet header.)

Each CCID defines how to detect quiescence on that CCID, and how
that CCID handles acks-of-acks on unidirectional connections. The
B-to-A CCID defines when DCCP B has gone quiescent.
Usually, this happens when a period has passed without B sending any
data packets. For CCID 2, this period is roughly two round-trip
times. The A-to-B CCID defines how DCCP A handles acks-of-acks once
DCCP B has gone quiescent.

8.2. Ack Piggybacking

Acknowledgements of A-to-B data MAY be piggybacked on data sent by
DCCP B, as long as that does not delay the acknowledgement longer
than the A-to-B CCID would find acceptable. However, data
acknowledgements often require more than 4 bytes to express. A large
set of acknowledgements prepended to a large data packet might
exceed the path's MTU. In this case, DCCP B SHOULD send separate
DCCP-Data and DCCP-Ack packets, or wait, but not too long, for a
smaller datagram.

Piggybacking is particularly common at DCCP A when the B-to-A
half-connection is quiescent---that is, when DCCP A is just
acknowledging DCCP B's acknowledgements, as described above. There
are three reasons to acknowledge DCCP B's acknowledgements: to allow
DCCP B to free up information about previously acknowledged data
packets from A; to shrink the size of future acknowledgements; and
to manipulate the rate at which future acknowledgements are sent.
Since these are secondary concerns, DCCP A can generally afford to
wait indefinitely for a data packet to piggyback its acknowledgement
onto.

Any restrictions on ack piggybacking are described in the relevant
CCID's profile.

8.3. Ack Ratio Feature

Ack Ratio provides a common mechanism by which CCIDs that clock
acknowledgements off of data packets can perform rudimentary
congestion control on the acknowledgement stream. CCID 2, TCP-like
Congestion Control, uses Ack Ratio to limit the rate of its
acknowledgement stream, for example. Some CCIDs ignore Ack Ratio,
performing congestion control on acknowledgements in some other way.

Ack Ratio has feature number 3.
The Ack Ratio feature located at DCCP B equals the ratio of data
packets sent by DCCP A to acknowledgement packets sent back by
DCCP B. For example, if it is set to four, then DCCP B will send at
least one acknowledgement packet for every four data packets DCCP A
sends. DCCP A sends a "Change(Ack Ratio)" option to DCCP B to change
DCCP B's ack ratio.

An Ack Ratio option contains two bytes of data: a sixteen-bit
integer representing the ratio. A new connection starts with Ack
Ratio 2 for both DCCPs.

This feature is non-negotiable.

8.4. Use Ack Vector Feature

The Use Ack Vector feature lets DCCPs negotiate whether they should
use Ack Vector options to report congestion. Ack Vector provides
detailed loss information, and lets senders report back to their
applications whether particular packets were dropped. Use Ack Vector
is mandatory for some CCIDs, and optional for others.

Use Ack Vector has feature number 4. The Use Ack Vector feature
located at DCCP B specifies whether DCCP B MUST use Ack Vector
options on its acknowledgements to DCCP A, although DCCP B MAY send
Ack Vector options even when Use Ack Vector is false. DCCP A sends a
"Change(Use Ack Vector, 1)" option to DCCP B to ask B to send Ack
Vector options as part of its acknowledgement traffic.

Use Ack Vector feature values are a single byte long. The receiver
MUST send Ack Vector options if this byte is nonzero. A new
connection starts with Use Ack Vector 0 for both DCCPs.

8.5. Ack Vector Options

The Ack Vector gives a run-length encoded history of data packets
received at the client. Each byte of the vector gives the state of
that data packet in the loss history, and the number of preceding
packets with the same state.
The option's data looks like this:

   +--------+--------+--------+--------+--------+--------
   |001001??| Length |SSLLLLLL|SSLLLLLL|SSLLLLLL|  ...
   +--------+--------+--------+--------+--------+--------
   Type=37/38         \___________ Vector ___________...

The two Ack Vector options (option types 37 and 38) differ only in
the values they imply for ECN Nonce Echo. Section 9.2 describes this
further.

The vector itself consists of a series of bytes, each of whose
encoding is:

    0 1 2 3 4 5 6 7
   +-+-+-+-+-+-+-+-+
   |St | Run Length|
   +-+-+-+-+-+-+-+-+

   St[ate]: 2 bits

   Run Length: 6 bits

State occupies the most significant two bits of each byte, and can
have one of four values:

   0  Packet received (and not ECN marked).

   1  Packet received ECN marked.

   2  Reserved.

   3  Packet not yet received.

The first byte in the first Ack Vector option refers to the packet
indicated in the Acknowledgement Number; subsequent bytes refer to
older packets. (Ack Vector MUST NOT be sent on DCCP-Data and
DCCP-Request packets, which lack an Acknowledgement Number.) If an
Ack Vector contains the decimal values 0,192,3,64,5 and the
Acknowledgement Number is decimal 100, then:

   Packet 100 was received (Acknowledgement Number 100, State 0,
   Run Length 0).

   Packet 99 was lost (State 3, Run Length 0).

   Packets 98, 97, 96 and 95 were received (State 0, Run Length 3).

   Packet 94 was ECN marked (State 1, Run Length 0).

   Packets 93, 92, 91, 90, 89, and 88 were received (State 0, Run
   Length 5).

Run lengths of more than 64 must be encoded in multiple bytes. A
single Ack Vector option can acknowledge up to 16192 data packets.
Should more packets need to be acknowledged than can fit in 253
bytes of Ack Vector, then multiple Ack Vector options can be sent.
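
As a non-normative illustration, the byte encoding and the worked
example above might be decoded as in the following Python sketch.
The function name decode_ack_vector is hypothetical; the sketch
assumes the vector bytes have already been extracted from the option
and ignores sequence number wraparound.

```python
def decode_ack_vector(ack_number, vector):
    """Expand Ack Vector bytes into (sequence number, state) pairs.

    The first byte describes the packet named by the Acknowledgement
    Number; later bytes describe progressively older packets.
    """
    seqno = ack_number
    result = []
    for byte in vector:
        state = byte >> 6          # most significant two bits
        run_length = byte & 0x3F   # least significant six bits
        # A run length of N covers N + 1 consecutive packets.
        for _ in range(run_length + 1):
            result.append((seqno, state))
            seqno -= 1
    return result


# The example from the text: Acknowledgement Number 100.
decoded = decode_ack_vector(100, bytes([0, 192, 3, 64, 5]))
lost = [seq for seq, state in decoded if state == 3]
marked = [seq for seq, state in decoded if state == 1]
```

Here "lost" and "marked" collect the sequence numbers reported in
State 3 and State 1, respectively.
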
The second Ack Vector option will begin where the first Ack Vector
option left off, and so forth.

Ack Vector states are subject to two general constraints. (These
principles SHOULD also be followed for other acknowledgement
mechanisms; referring to Ack Vector states simplifies their
explanation.)

(1) Packets reported as State 0 or State 1 MUST have been processed
    by the receiving DCCP stack. In particular, their options must
    have been processed. Any data on the packet need not have been
    delivered to the receiving application; in fact, the data may
    have been dropped.

(2) Packets reported as State 3 MUST NOT have been received by DCCP.
    Feature negotiations and options on such packets MUST NOT have
    been processed, and the Acknowledgement Number MUST NOT
    correspond to such a packet.

Packets dropped in the application's receive buffer SHOULD be
reported as Received or Received ECN Marked (States 0 and 1),
depending on their ECN state; such packets' ECN Nonces MUST be
included in the Nonce Echo. The Data Dropped option informs the
sender that some packets reported as received actually had their
payloads dropped.

One or more Ack Vector options that, together, report the status of
more packets than have actually been sent SHOULD be considered
invalid. The receiving DCCP SHOULD either ignore the options or
reset the connection with Reason set to "Option Error". Packets
whose status has not been reported by any Ack Vector option SHOULD
be treated as "not yet received" (State 3) by the sender.

8.5.1. Ack Vector Consistency

A DCCP sender will commonly receive multiple acknowledgements for
some of its data packets. For instance, an HC-Sender might receive
two DCCP-Acks with Ack Vectors, both of which contained information
about sequence number 24.
(Because of cumulative acking, information about a sequence number
is repeated in every ack until the HC-Sender acknowledges an ack.
Perhaps the HC-Receiver is sending acks faster than the HC-Sender is
acknowledging them.) In a perfect world, the two Ack Vectors would
always be consistent. However, there are many reasons why they might
not be:

o  The HC-Receiver received packet 24 between sending its acks, so
   the first ack said 24 was not received (State 3) and the second
   said it was received or ECN marked (State 0 or 1).

o  The HC-Receiver received packet 24 between sending its acks, and
   the network reordered the acks. In this case, the packet will
   appear to transition from State 0 or 1 to State 3.

o  The network duplicated packet 24, and one of the duplicates was
   ECN marked. This might show up as a transition between States 0
   and 1.

To cope with these situations, HC-Sender DCCP implementations SHOULD
combine multiple received Ack Vector states according to this table:

                     Received State
                       0   1   3
                     +---+---+---+
                   0 | 0 | 1 | 0 |
             Old     +---+---+---+
                   1 | 1 | 1 | 1 |
            State    +---+---+---+
                   3 | 0 | 1 | 3 |
                     +---+---+---+

To read the table, choose the row corresponding to the packet's old
state and the column corresponding to the packet's state in the
newly received Ack Vector, then read the packet's new state off the
table. The table is symmetric about the main diagonal, so it is
indifferent to ack reordering.

This table defines how the HC-Sender should react to received Ack
Vector states. This is equivalent to how the HC-Receiver should
collect information about received packets, with two symmetric
exceptions: when one State is 0 (received non-marked) and the other
is 1 (received ECN marked).
According to the table, the HC-Sender should react to this
combination of Ack Vector information as if only State 1 had been
reported. But what state should the HC-Receiver report in Ack Vector
if two duplicates are received for a packet, and only one is ECN
marked? We explicitly allow the HC-Receiver to report the
combination as State 0 (received non-marked) or State 1. After all,
one duplicate was non-marked, and depending on how much state the
HC-Receiver keeps about packets it receives, it might be impossible
to change a packet from State 0 to State 1 and preserve correct ECN
Nonce Echo information.

An HC-Sender MAY choose to throw away old information gleaned from
the HC-Receiver's Ack Vectors, in which case it MUST ignore newly
received acknowledgements from the HC-Receiver for those old
packets. It is often kinder to save recent Ack Vector information
for a while, so that the HC-Sender can undo its reaction to presumed
congestion when a "lost" packet unexpectedly shows up (the
transition from State 3 to State 0).

8.5.2. Ack Vector Coverage

We can divide the packets that have been sent from an HC-Sender to
an HC-Receiver into four roughly contiguous groups. From oldest to
youngest, these are:

(1) Packets already acknowledged by the HC-Receiver, where the
    HC-Receiver knows that the HC-Sender has definitely received the
    acknowledgements.

(2) Packets already acknowledged by the HC-Receiver, where the
    HC-Receiver cannot be sure that the HC-Sender has received the
    acknowledgements.

(3) Packets not yet acknowledged by the HC-Receiver.

(4) Packets not yet received by the HC-Receiver.

The union of groups 2 and 3 is called the Unacknowledged Window.
Generally, every Ack Vector generated by the HC-Receiver will cover
the whole Unacknowledged Window: Ack Vector acknowledgements are
cumulative.
(This simplifies Ack Vector maintenance at the HC-Receiver; see
Section 8.9, below.) As packets are received, this window both grows
on the right and shrinks on the left. It grows because there are
more packets, and shrinks because the data packets' Acknowledgement
Numbers will acknowledge previous acknowledgements, moving packets
from group 2 into group 1.

8.6. Slow Receiver Option

An HC-Receiver sends the Slow Receiver option to its sender to
indicate that it is having trouble keeping up with the sender's
data. The HC-Sender SHOULD NOT increase its sending rate for
approximately one round-trip time after seeing a packet with a Slow
Receiver option. However, the Slow Receiver option does not indicate
congestion, and the HC-Sender need not reduce its sending rate. (If
necessary, the receiver can force the sender to slow down by
dropping packets or reporting false ECN marks.) APIs SHOULD let
receiver applications set Slow Receiver, and sending applications
determine whether or not their receivers are Slow.

The Slow Receiver option takes just one byte:

   +--------+
   |00000010|
   +--------+
    Type=2

Slow Receiver does not specify why the receiver is having trouble
keeping up with the sender. Possible reasons include lack of buffer
space, CPU overload, and application quotas. A sending application
might react to Slow Receiver by reducing its sending rate or by
switching to a lossier compression algorithm. However, a smart
sender might actually *increase* its sending rate in response to
Slow Receiver, by switching to a less-compressed sending format. (A
highly-compressed data format might overwhelm a slow CPU more
seriously than the higher memory requirements of a less-compressed
data format.)
This tension between transfer size (less compression means more
congestion) and processing speed (less compression means less
processing) cannot be resolved in general.

Slow Receiver implements a portion of TCP's receive window
functionality. We believe receiver operating systems and
applications will find it much easier to send Slow Receiver when
appropriate than they currently find it to correctly set a TCP
receive window.

8.7. Data Dropped Option

The Data Dropped option indicates that some packets reported as
received actually had their data dropped before it reached the
application. The sender's congestion control mechanism MAY react to
data-dropped packets; such responses MAY be less severe than
responses triggered by a lost or marked packet. (For instance, a
windowed mechanism might subtract a constant value from its
congestion window, rather than cut it in half.) When ECN-marked
packets are included in Data Dropped, the sender's congestion
control mechanism MUST react to the ECN marks as usual.

The option's data looks like this:

   +--------+--------+--------+--------+--------+--------
   |00100111| Length | Block  | Block  | Block  |  ...
   +--------+--------+--------+--------+--------+--------
    Type=39          \___________ Vector ___________ ...

The vector itself consists of a series of bytes, called Blocks,
each of whose encoding corresponds to one of these choices:

    0 1 2 3 4 5 6 7                 0 1 2 3 4 5 6 7
   +-+-+-+-+-+-+-+-+               +-+-+-+-+-+-+-+-+
   |0| Run Length  |       or      |1|Dr St|Run Len|
   +-+-+-+-+-+-+-+-+               +-+-+-+-+-+-+-+-+
     Normal Block                     Drop Block

The first byte in the first Data Dropped option refers to the packet
indicated in the Acknowledgement Number; subsequent bytes refer to
older packets. (Data Dropped MUST NOT be sent on DCCP-Data or
DCCP-Request packets, which lack an Acknowledgement Number.)
Normal Blocks, which have high bit 0, indicate that any received
packets in the Run Length had their data delivered to the
application. Drop Blocks, which have high bit 1, indicate that
received packets in the Run Len[gth] were not delivered as usual.
The 3-bit Dr[op] St[ate] field says what happened; generally, no
data from that packet reached the application. Packets reported as
"not yet received" MUST be included in Normal Blocks; packets not
covered by any Data Dropped option are treated as if they were in a
Normal Block. Defined Drop States for Drop Blocks are:

   0    Packet data dropped due to protocol constraints. For
        example, the data was included on a DCCP-Request packet, and
        the receiving application does not allow that piggybacking;
        or the data was sent during an important feature
        negotiation.

   1    Packet data dropped in the receive buffer.

   2    Packet data dropped due to corruption.

   3    Packet data corrupted, but delivered to the application
        anyway.

   4    Packet data dropped because the application is no longer
        listening.

   5-7  Reserved.

For example, if a Data Dropped option contains the decimal values
0,144,3,146, the Acknowledgement Number is 100, and an Ack Vector
reported all packets as received, then:

   Packet 100 was received (Acknowledgement Number 100, Normal
   Block, Run Length 0).

   Packet 99 was dropped in the receive buffer (Drop Block, Drop
   State 1, Run Length 0).

   Packets 98, 97, 96, and 95 were received (Normal Block, Run
   Length 3).

   Packets 94, 93, and 92 were dropped in the receive buffer (Drop
   Block, Drop State 1, Run Length 2).

Run lengths of more than 128 (for Normal Blocks) or 16 (for Drop
Blocks) must be encoded in multiple Blocks.
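
As a non-normative illustration, the Block encoding above might be
decoded as in the following Python sketch. The function name
decode_data_dropped is hypothetical; the sketch assumes the Block
bytes have already been extracted from the option and ignores
sequence number wraparound. Packets in Normal Blocks are reported
with a drop state of None.

```python
def decode_data_dropped(ack_number, blocks):
    """Expand Data Dropped Blocks into (sequence number, drop state).

    The drop state is None for packets covered by a Normal Block.
    """
    seqno = ack_number
    result = []
    for block in blocks:
        if block & 0x80:
            # Drop Block: 1 | 3-bit Drop State | 4-bit Run Length
            drop_state = (block >> 4) & 0x07
            run_length = block & 0x0F
        else:
            # Normal Block: 0 | 7-bit Run Length
            drop_state = None
            run_length = block & 0x7F
        # A run length of N covers N + 1 consecutive packets.
        for _ in range(run_length + 1):
            result.append((seqno, drop_state))
            seqno -= 1
    return result


# The Block values from the example, Acknowledgement Number 100.
decoded = decode_data_dropped(100, bytes([0, 144, 3, 146]))
dropped = [seq for seq, state in decoded if state is not None]
```
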
A single Data Dropped option can acknowledge up to 32384 Normal
Block data packets, although the receiver SHOULD NOT send a Data
Dropped option when all relevant packets fit into Normal Blocks.
Should more packets need to be acknowledged than can fit in 253
bytes of Data Dropped, then multiple Data Dropped options can be
sent. The second option will begin where the first option left off,
and so forth.

One or more Data Dropped options that, together, report the status
of more packets than have been sent, or that change the status of a
packet, or that disagree with Ack Vector or equivalent options (by
reporting a "not yet received" packet as "dropped in the receive
buffer", for example), SHOULD be considered invalid. The receiving
DCCP SHOULD respond to invalid Data Dropped options by ignoring them
or by resetting the connection with Reason set to "Option Error".

Drop State 4 ("application no longer listening") means the
application running at the endpoint that sent the option is no
longer listening for data. For example, a server might close its
receiving half-connection to new data after receiving a complete
request from the client. This would limit the amount of state the
server would expend on incoming data, and thus reduce the potential
damage from certain denial-of-service attacks. A Data Dropped option
containing Drop State 4 SHOULD be sent whenever received data is
ignored due to a non-listening application. Once a DCCP reports Drop
State 4 for a packet, it SHOULD report Drop State 4 for every
succeeding data packet on that half-connection; once a DCCP receives
a Drop State 4 report, it SHOULD expect that no more data will ever
be delivered to the other endpoint's application. A DCCP receiving
Drop State 4 MAY report this event to the application.
(Previous versions of this specification used a "Buffer Closed"
option instead of Drop State 4.)

8.8. Payload Checksum Option

The Payload Checksum option holds the 16 bit one's complement of the
one's complement sum of all 16 bit words in the DCCP payload (the
data contained in a DCCP-Request, DCCP-Response, DCCP-Data,
DCCP-DataAck, or DCCP-Move packet). When combined with a Checksum
Length of less than 15, this lets DCCP distinguish between
corruption in a packet's payload and corruption in its header.
Corrupted-header packets MUST be treated as dropped by the network,
while corrupted-payload packets MAY be treated differently; for
example, the sender's response to corruption might be less stringent
than its response to congestion. A low Checksum Length lets DCCP
process packets with valid headers, even if the payload is corrupt,
avoiding the congestion response to corruption. The Payload Checksum
option then lets DCCP detect payload corruption, and therefore avoid
delivering bad data to the application.

The option's data looks like this:

   +--------+--------+--------+--------+
   |00101101|00000100|    Checksum     |
   +--------+--------+--------+--------+
    Type=45  Length=4

The receiving DCCP MUST check the Payload Checksum's value against
the actual payload checksum. If the values differ, the packet's data
SHOULD be dropped, and reported as dropped due to corruption (Drop
State 2) using a Data Dropped option (Section 8.7). Optionally, DCCP
MAY provide an API through which the receiving application could
request delivery of known-corrupt data. When that API is active, the
packet's data SHOULD be delivered, but reported as delivered corrupt
(Drop State 3) using a Data Dropped option. In either case, the
packet will be reported as Received or Received ECN Marked by Ack
Vector or equivalent options.
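
As a non-normative illustration, the checksum computation and the
receiver's decision described above might be sketched in Python as
follows. The names payload_checksum and classify_payload, and the
deliver_corrupt flag standing in for the optional API, are
hypothetical. The sketch also assumes that an odd-length payload is
padded with one zero byte before summing, which the text above does
not state.

```python
def payload_checksum(payload):
    """16-bit one's complement of the one's complement sum of the
    16-bit words in the payload."""
    data = bytes(payload)
    if len(data) % 2:
        data += b"\x00"   # assumption: pad odd-length payloads
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:    # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def classify_payload(payload, option_checksum, deliver_corrupt=False):
    """Return (deliver, drop_state) for a received payload.

    drop_state is None when the checksum matches, 2 when corrupt
    data is dropped, and 3 when corrupt data is delivered anyway.
    """
    if payload_checksum(payload) == option_checksum:
        return True, None
    if deliver_corrupt:
        return True, 3
    return False, 2


good = b"\x00\x01\xf2\x03\xf4\xf5\xf6\xf7"
checksum = payload_checksum(good)
bad = b"\x00\x01\xf2\x03\xf4\xf5\xf6\xf6"   # last byte corrupted
verdict = classify_payload(bad, checksum)
```

In either mismatch case the packet itself is still reported as
received by Ack Vector; only the Data Dropped report differs.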

See Section 18.1 for a discussion of the issues related to the use
of this option.

8.9. Ack Vector Implementation Notes

This section discusses particulars of DCCP acknowledgement handling,
in the context of an abstract implementation for Ack Vector. It is
informative rather than normative.

The first part of our implementation runs at the HC-Receiver, and
therefore acknowledges data packets. It generates Ack Vector
options. The implementation has the following characteristics:

o  At most one byte of state per acknowledged packet.

o  O(1) time to update that state when a new packet arrives (normal
   case).

o  Cumulative acknowledgements.

o  Quick removal of old state.

The basic data structure is a circular buffer containing information
about acknowledged packets. Each byte in this buffer contains a
state and run length; the state can be 0 (packet received), 1
(packet ECN marked), or 3 (packet not yet received). The live
portion of the buffer is marked off by head and tail pointers, each
marked with the HC-Sender sequence number to which it corresponds.
The buffer also stores a single-bit ECN Nonce Echo, which equals the
one-bit sum of the ECN Nonces received on state-0 packets. The
buffer grows from right to left. For example:

   +-------------------------------------------------------------------+
   |S,L|S,L|S,L|S,L|   |   |   |   |   |S,L|S,L|S,L|S,L|S,L|S,L|S,L|S,L|
   +-------------------------------------------------------------------+
                 ^                       ^
          Tail, seqno = T         Head, seqno = H    ECN Nonce Echo = E

                 <=== Head and Tail move this way <===

Each `S,L' represents a State/Run length byte.
We will draw these buffers showing only their live portion; for
example, here is another representation for the buffer above:

     +-----------------------------------------------+
   H |S,L|S,L|S,L|S,L|S,L|S,L|S,L|S,L|S,L|S,L|S,L|S,L| T   ENE[E]
     +-----------------------------------------------+

This smaller Example Buffer contains actual data.

      +---------------------------+
   10 |0,0|3,0|3,0|3,0|0,4|1,0|0,0| 0   ENE[1]   [Example Buffer]
      +---------------------------+

In concrete terms, its meaning is as follows:

   Packet 10 was received. (The head of the buffer has sequence
   number 10, state 0, and run length 0.)

   Packets 9, 8, and 7 have not yet been received. (The three bytes
   preceding the head each have state 3 and run length 0.)

   Packets 6, 5, 4, 3, and 2 were received.

   Packet 1 was ECN marked.

   Packet 0 was received.

   The one-bit sum of the ECN Nonces on packets 10, 6, 5, 4, 3, 2,
   and 0 equals 1.

8.9.1. New Packets

When a packet arrives whose sequence number is larger than any in
the buffer, the HC-Receiver simply moves the Head pointer to the
left, increases the head sequence number, and stores a byte
representing the packet into the buffer. For example, if HC-Sender
packet 11 arrived ECN marked, the Example Buffer above would enter
this new state (the change is marked with stars):

      +***----------------------------+
   11 |1,0|0,0|3,0|3,0|3,0|0,4|1,0|0,0| 0   ENE[1]
      +***----------------------------+

If the packet's state equals the state at the head of the buffer,
the HC-Receiver may choose to increment its run length (up to the
maximum).
For example, if HC-Sender packet 11 arrived without ECN marking and
with ECN Nonce 0, the Example Buffer might enter this state instead:

      +--*------------------------+
   11 |0,1|3,0|3,0|3,0|0,4|1,0|0,0| 0   ENE[1]
      +--*------------------------+

Of course, the new packet's sequence number might not equal the
expected sequence number. In this case, the HC-Receiver should enter
the intervening packets as State 3. If several packets are missing,
the HC-Receiver may prefer to enter multiple bytes with run length
0, rather than a single byte with a larger run length; this
simplifies table updates when one of the missing packets arrives.
For example, if HC-Sender packet 12 arrived with ECN Nonce 1, the
Example Buffer would enter this state:

      +*******----------------------------+         *
   12 |0,0|3,0|0,0|3,0|3,0|3,0|0,4|1,0|0,0| 0   ENE[0]
      +*******----------------------------+         *

When a new packet's sequence number is less than the head sequence
number, the HC-Receiver should scan the table for the byte
corresponding to that sequence number. (Slightly more complex
indexing structures could reduce the complexity of this scan.)
Assume that the sequence number was previously lost (State 3), and
that it was stored in a byte with run length 0. Then the HC-Receiver
can simply change the byte's state. For example, if HC-Sender packet
8 was received with ECN Nonce 0, the Example Buffer would enter this
state:

      +--------*------------------+
   10 |0,0|3,0|0,0|3,0|0,4|1,0|0,0| 0   ENE[1]
      +--------*------------------+

If the packet is not marked as lost, or if its sequence number is
not contained in the table, the packet is probably a duplicate, and
should be ignored. (The new packet's ECN marking state might differ
from the state in the buffer; Section 8.5.1 describes what to do
then.)
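
As a non-normative illustration, the arrival cases described so far
in this section might be sketched in Python as follows. The class
AckBuffer and its method names are hypothetical. For simplicity the
sketch keeps the live portion of the buffer in a list (head first)
rather than a true circular buffer, ignores sequence number
wraparound and overflow, and simply ignores a late packet whose byte
has a non-zero run length.

```python
EXAMPLE_CELLS = [(0, 0), (3, 0), (3, 0), (3, 0), (0, 4), (1, 0), (0, 0)]


class AckBuffer:
    """Hypothetical HC-Receiver state: [state, run length] cells."""

    MAX_RUN = 63  # largest value of the 6-bit Run Length field

    def __init__(self, head_seq, cells, nonce_echo):
        self.head_seq = head_seq
        self.cells = [list(cell) for cell in cells]  # head first
        self.nonce_echo = nonce_echo

    def add(self, seqno, ecn_marked=False, nonce=0):
        """Record an arriving packet; return False if it was ignored."""
        state = 1 if ecn_marked else 0
        if seqno > self.head_seq:
            # Enter each intervening packet as its own State 3 byte.
            for _ in range(seqno - self.head_seq - 1):
                self.cells.insert(0, [3, 0])
            head = self.cells[0] if self.cells else None
            if head and head[0] == state and head[1] < self.MAX_RUN:
                head[1] += 1                      # extend the head run
            else:
                self.cells.insert(0, [state, 0])  # new head byte
            self.head_seq = seqno
        else:
            cell = self._find(seqno)
            if cell is None or cell != [3, 0]:
                return False   # duplicate, unknown, or needs a split
            cell[0] = state
        if state == 0:
            self.nonce_echo ^= nonce   # one-bit sum over State 0 packets
        return True

    def _find(self, seqno):
        """Scan from the head for the byte covering seqno."""
        top = self.head_seq
        for cell in self.cells:
            if top - cell[1] <= seqno <= top:
                return cell
            top -= cell[1] + 1
        return None


buf = AckBuffer(10, EXAMPLE_CELLS, nonce_echo=1)
buf.add(8, nonce=0)   # a late arrival fills in one State 3 byte
```
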
If the packet's corresponding buffer byte has a non-zero run 2730 length, then the buffer might need to be reshuffled to make space for 2731 one or two new bytes. 2733 Of course, the circular buffer may overflow, either when the HC- 2734 Sender is sending data at a very high rate, when the HC-Receiver's 2735 acknowledgements are not reaching the HC-Sender, or when the HC- 2736 Sender is forgetting to acknowledge those acks (so the HC-Receiver 2737 is unable to clean up old state). In this case, the HC-Receiver 2738 should either compress the buffer, transfer its state to a larger 2739 buffer, or drop all received packets, without processing them 2740 whatsoever, until its buffer shrinks again. 2742 8.9.2. Sending Acknowledgements 2744 Whenever the HC-Receiver needs to generate an acknowledgement, the 2745 buffer's contents can simply be copied into one or more Ack Vector 2746 options. Copied Ack Vectors might not be maximally compressed; for 2747 example, the Example Buffer above contains three adjacent 3,0 bytes 2748 that could be combined into a single 3,2 byte. The HC-Receiver 2749 might, therefore, choose to compress the buffer in place before 2750 sending the option, or to compress the buffer while copying it; 2751 either operation is simple. 2753 Every acknowledgement sent by the HC-Receiver SHOULD include the 2754 entire state of the buffer. That is, acknowledgements are 2755 cumulative. 2757 The HC-Receiver should store information about each acknowledgement 2758 it sends in another buffer. Specifically, for every acknowledgement 2759 it sends, the HC-Receiver should store: 2761 o The HC-Receiver sequence number it used for the ack packet. 2763 o The HC-Sender sequence number it acknowledged (that is, the 2764 packet's Acknowledgement Number). Since acknowledgements are 2765 cumulative, this single number completely specifies the set of HC- 2766 Sender packets acknowledged by this ack packet. 2768 8.9.3.
Clearing State 2770 Some of the HC-Sender's packets will include acknowledgement 2771 numbers, which ack the HC-Receiver's acknowledgements. When such an 2772 ack is received, the HC-Receiver simply finds the HC-Sender sequence 2773 number corresponding to that acked HC-Receiver packet, and moves the 2774 buffer's Tail pointer up to that sequence number. (It may choose to 2775 keep some older information, in case a lost packet shows up late.) 2776 For example, say that the HC-Receiver storing the Example Buffer had 2777 sent two acknowledgements already: 2779 (1) HC-Receiver Ack 59 acknowledged HC-Sender Seq 3 with ECN Nonce 2780 Echo 1. 2782 (2) HC-Receiver Ack 60 acknowledged HC-Sender Seq 10 with ECN Nonce 2783 Echo 0. 2785 Say the HC-Receiver then received a DCCP-DataAck packet from the HC- 2786 Sender with Acknowledgement Number 59. This informs the HC-Receiver 2787 that the HC-Sender received, and processed, all the information in 2788 HC-Receiver packet 59. This packet acknowledged HC-Sender packet 3, 2789 so the HC-Sender has now received HC-Receiver's acknowledgements for 2790 packets 0, 1, 2, and 3. The Example Buffer should enter this state: 2792 +------------------*+ * * 2793 10 |0,0|3,0|3,0|3,0|0,2| 4 ENE[0] 2794 +------------------*+ * * 2796 The tail byte's run length was adjusted, since packet 3 was in the 2797 middle of that byte. The new ECN Nonce Echo field equals the 2798 exclusive-or of the old field, and the ECN Nonce Echo reported with 2799 the relevant acknowledgement. The HC-Receiver can also throw away 2800 stored information about HC-Receiver Ack 59. 2802 A careful implementation might try to ensure reasonable robustness 2803 to reordering. Suppose that the Example Buffer is as before, but 2804 that packet 9 now arrives, out of sequence. 
The buffer would enter 2805 this state: 2807 +----*----------------------+ 2808 10 |0,0|0,0|3,0|3,0|0,4|1,0|0,0| 0 ENE[1] 2809 +----*----------------------+ 2811 The danger is that the HC-Sender might acknowledge the HC-Receiver's previous 2812 acknowledgement (with sequence number 60), which says that packet 9 2813 was not received, before the HC-Receiver has a chance to send a new 2814 acknowledgement saying that packet 9 actually was received. 2816 Therefore, when packet 9 arrived, the HC-Receiver might modify its 2817 acknowledgement record to: 2819 (1) HC-Receiver Ack 59 acknowledged HC-Sender Seq 3 with ECN Nonce 2820 Echo 1. 2822 (2) HC-Receiver Ack 60 also acknowledged HC-Sender Seq 3 with ECN 2823 Nonce Echo 1. 2825 That is, Ack 60 is now treated like a duplicate of Ack 59. This 2826 would prevent the Tail pointer from moving past packet 9 until the 2827 HC-Receiver knows that the HC-Sender has seen an Ack Vector 2828 indicating that packet's arrival. 2830 8.9.4. Processing Acknowledgements 2832 When the HC-Sender receives an acknowledgement, it generally cares 2833 about the number of packets that were dropped and/or ECN marked. It 2834 simply reads this off the Ack Vector. Additionally, it may check the 2835 ECN Nonce for correctness. (As described in Section 8.5.1, it may 2836 want to keep more detailed information about acknowledged packets in 2837 case packets change states between acknowledgements, or in case the 2838 application queries whether a packet arrived.) 2840 The HC-Sender must also acknowledge the HC-Receiver's 2841 acknowledgements so that the HC-Receiver can free old Ack Vector 2842 state. (Since Ack Vector acknowledgements are reliable, the HC- 2843 Receiver must maintain and resend Ack Vector information until it is 2844 sure that the HC-Sender has received that information.)
A simple 2845 algorithm suffices: since Ack Vector acknowledgements are 2846 cumulative, a single acknowledgement number tells HC-Receiver how 2847 much ack information has arrived. Assuming that the HC-Receiver 2848 sends no data, the HC-Sender can simply ensure that at least once a 2849 round-trip time, it sends a DCCP-DataAck packet acknowledging the 2850 latest DCCP-Ack packet it has received. Of course, the HC-Sender 2851 only needs to acknowledge the HC-Receiver's acknowledgements if the 2852 HC-Sender is also sending data. If the HC-Sender is not sending 2853 data, then the HC-Receiver's Ack Vector state is stable, and there 2854 is no need to shrink it. The HC-Sender must watch for drops and ECN 2855 marks on received DCCP-Ack packets so that it can adjust the HC- 2856 Receiver's ack-sending rate---for example, with Ack Ratio---in 2857 response to congestion. 2859 If the other half-connection is not quiescent---that is, the HC- 2860 Receiver is sending data to the HC-Sender, possibly using another 2861 CCID---then the acknowledgements on that half-connection are 2862 sufficient for the HC-Receiver to free its state. 2864 9. Explicit Congestion Notification 2866 The DCCP protocol is fully ECN-aware. Each CCID specifies how its 2867 endpoints respond to ECN marks. Furthermore, DCCP, unlike TCP, 2868 allows senders to control the rate at which acknowledgements are 2869 generated (with options like Ack Ratio); this means that 2870 acknowledgements are generally congestion-controlled, and may have 2871 ECN-Capable Transport set. 2873 A CCID profile describes how that CCID interacts with ECN, both for 2874 data traffic and pure-acknowledgement traffic. A sender SHOULD set 2875 ECN-Capable Transport on its packets whenever the receiver has its 2876 ECN Capable feature turned on and the relevant CCID allows it, 2877 unless the sending application indicates that ECN should not be 2878 used. 
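The sender-side rule in the preceding paragraph can be sketched as a small predicate. This is an illustration only: the function name and its parameters are hypothetical and are not defined by this specification, and a real implementation would make the decision separately for data traffic and pure-acknowledgement traffic, as the CCID profile directs.

```python
def should_set_ect(peer_ecn_capable: bool,
                   ccid_allows_ect: bool,
                   app_disabled_ecn: bool = False) -> bool:
    """Decide whether to set ECN-Capable Transport on an outgoing packet.

    peer_ecn_capable: the receiver's ECN Capable feature is turned on.
    ccid_allows_ect:  the relevant CCID permits ECT for this kind of packet.
    app_disabled_ecn: the sending application asked that ECN not be used.
    All three names are assumptions made for this sketch.
    """
    return peer_ecn_capable and ccid_allows_ect and not app_disabled_ecn
```

All three conditions must hold before ECT is set; any one of them failing leaves the packet not ECN-capable.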
2880 The rest of this section describes the ECN Capable feature and the 2881 interaction of the ECN Nonce with acknowledgement options such as 2882 Ack Vector. 2884 9.1. ECN Capable Feature 2886 The ECN Capable feature lets a DCCP inform its partner that it 2887 cannot read ECN bits from received IP headers, so the partner must 2888 not set ECN-Capable Transport on its packets. 2890 ECN Capable has feature number 2. The ECN Capable feature located at 2891 DCCP A indicates whether or not A can successfully read ECN bits 2892 from received frames' IP headers. (This is independent of whether it 2893 can set ECN bits on sent frames.) DCCP A sends a "Prefer(ECN 2894 Capable, 0)" option to DCCP B to inform B that A cannot read ECN 2895 bits. 2897 An ECN Capable feature contains a single byte of data. ECN 2898 capability is on if and only if this byte is nonzero. 2900 A new connection starts with ECN Capable 1 (that is, ECN capable) 2901 for both DCCPs. If a DCCP is not ECN capable, it MUST send 2902 "Prefer(ECN Capable, 0)" options to the other endpoint until 2903 acknowledged (by "Change(ECN Capable, 0)") or the connection closes. 2904 Furthermore, it MUST NOT accept any data until the other endpoint 2905 sends "Change(ECN Capable, 0)". It SHOULD send Data Dropped options 2906 on its acknowledgements, with Drop State 0 ("protocol constraints"), 2907 if the other endpoint does send data inappropriately. 2909 9.2. ECN Nonces 2911 Congestion avoidance will not occur, and the receiver will sometimes 2912 get its data faster, when the sender is not told about any 2913 congestion events. Thus, the receiver has some incentive to falsify 2914 acknowledgement information, reporting that marked or dropped 2915 packets were actually received unmarked. This problem is more 2916 serious with DCCP than with TCP, since TCP provides reliable 2917 transport: it is more difficult with TCP to lie about lost packets 2918 without breaking the application. 
2920 ECN Nonces are a general mechanism to prevent ECN cheating (or loss 2921 cheating). Two values for the two-bit ECN header field indicate ECN- 2922 Capable Transport, 01 and 10. The second code point, 10, is the ECN 2923 Nonce. In general, a protocol sender chooses between these code 2924 points randomly on its output packets, remembering the sequence it 2925 chose. The protocol receiver reports, on every acknowledgement, the 2926 number of ECN Nonces it has received thus far. This is called the 2927 ECN Nonce Echo. Since ECN marking and packet dropping both destroy 2928 the ECN Nonce, a receiver that lies about an ECN mark or packet drop 2929 has a 50% chance of guessing right and avoiding discipline. The 2930 sender may react punitively to an ECN Nonce mismatch, possibly up to 2931 dropping the connection. The ECN Nonce Echo field need not be an 2932 integer; one bit is enough to catch 50% of infractions. 2934 In DCCP, the ECN Nonce Echo field is encoded in acknowledgement 2935 options. For example, the Ack Vector option comes in two forms, Ack 2936 Vector [Nonce 0] (option 37) and Ack Vector [Nonce 1] (option 38), 2937 corresponding to the two values for a one-bit ECN Nonce Echo. The 2938 Nonce Echo for a given Ack Vector equals the one-bit sum (exclusive- 2939 or, or parity) of ECN nonces for packets reported by that Ack Vector 2940 as received and not ECN marked. Thus, only packets marked as State 2941 0 matter for this calculation (that is, valid received packets that 2942 were not ECN marked). Every Ack Vector option is detailed enough 2943 for the sender to determine what the Nonce Echo should have been. It 2944 can check this calculation against the actual Nonce Echo, and 2945 complain if there is a mismatch. 2947 (The Ack Vector could conceivably report every packet's ECN Nonce 2948 state, but this would severely limit Ack Vector's compressibility 2949 without providing much extra protection.) 2951 Consider a half-connection from DCCP A to DCCP B. 
DCCP A SHOULD set 2952 ECN Nonces on its packets, and remember which packets had nonces, 2953 whenever DCCP B reports that it is ECN Capable. An ECN-capable 2954 endpoint MUST calculate and use the correct value for ECN Nonce Echo 2955 when sending acknowledgement options. An ECN-incapable endpoint, 2956 however, SHOULD treat the ECN Nonce Echo as always zero. When a 2957 sender detects an ECN Nonce Echo mismatch, it SHOULD behave as if 2958 the receiver had reported one or more packets as ECN-marked (instead 2959 of unmarked). It MAY take more punitive action, such as resetting 2960 the connection. The Reason for such DCCP-Reset packets SHOULD be set 2961 to "Aggression Penalty". 2963 An ECN-incapable DCCP SHOULD ignore received ECN nonces and generate 2964 ECN nonces of zero. For instance, out of the two Ack Vector options, 2965 an ECN-incapable DCCP SHOULD generate Ack Vector [Nonce 0] (option 2966 37) exclusively. (Again, the ECN Capable feature MUST be set to zero 2967 in this case.) 2969 9.3. Other Aggression Penalties 2971 The ECN Nonce provides one way for a DCCP sender to discover that a 2972 receiver is misbehaving. There may be other mechanisms, and a 2973 receiver or middlebox may also discover that a sender is 2974 misbehaving---sending more data than it should. In any of these 2975 cases, the entity that discovers the misbehavior MAY react by 2976 resetting the connection, with Reason set to "Aggression Penalty". A 2977 receiver that detects marginal (meaning possibly spurious) sender 2978 misbehavior MAY instead react with a Slow Receiver option, or by 2979 reporting some packets as ECN marked that were not, in fact, marked. 2981 10. Multihoming and Mobility 2983 DCCP provides primitive support for multihoming and mobility via a 2984 mechanism for transferring a connection endpoint from one address to 2985 another. The moving endpoint must negotiate mobility support 2986 beforehand, and both endpoints must share their Connection Nonces. 
2987 When the moving endpoint gets a new address, it sends a DCCP-Move 2988 packet from that address to the stationary endpoint. The stationary 2989 endpoint then changes its connection state to use the new address. 2991 DCCP's support for mobility is intended to solve only the simplest 2992 multihoming and mobility problems. For instance, DCCP has no support 2993 for simultaneous moves. Applications requiring more complex mobility 2994 semantics, or more stringent security guarantees, should use an 2995 existing solution like Mobile IP or [SB00]. 2997 10.1. Mobility Capable Feature 2999 A DCCP uses the Mobility Capable feature to inform its partner that 3000 it would like to be able to change its address and/or port during 3001 the course of the connection. 3003 Mobility Capable has feature number 5. The Mobility Capable feature 3004 located at DCCP A indicates whether or not A will accept a DCCP-Move 3005 packet sent by B. DCCP B sends a "Change(Mobility Capable, 1)" 3006 option to DCCP A to inform it that B might like to move later. 3008 A Mobility Capable feature contains a single byte of data. Mobility 3009 is allowed if and only if this byte is nonzero. A DCCP MUST reject a 3010 DCCP-Move packet referring to a connection when Mobility Capable is 3011 0; however, it MAY reject a valid DCCP-Move packet even when 3012 Mobility Capable is 1. 3014 A new connection starts with Mobility Capable 0 (that is, mobility 3015 is not allowed) for both DCCPs. 3017 10.2. Security 3019 The DCCP mobility mechanism, like DCCP in general, does not provide 3020 cryptographic security guarantees. Nevertheless, mobile hosts must 3021 use valid sequence numbers and include valid Identifications in 3022 their DCCP-Move packets, providing protection against some classes 3023 of attackers. Specifically, an attacker cannot move a DCCP 3024 connection to a new address unless they know valid sequence numbers 3025 and how to generate valid Identifications. 
Even with the default MD5 3026 Identification Regime, this means that an attacker must have snooped 3027 on every packet in the connection to get a reasonable probability of 3028 success, assuming that initial sequence numbers and Connection 3029 Nonces are chosen well (that is, randomly). Section 16 further 3030 describes DCCP security considerations. 3032 10.3. Congestion Control State 3034 Once an endpoint has transitioned to a new address, the connection 3035 is effectively a new connection in terms of its congestion control 3036 state: the accumulated information about congestion between the old 3037 endpoints no longer applies. Both DCCPs MUST initialize their 3038 congestion control state (windows, rates, and so forth) to that of a 3039 new connection---that is, they must "slow start"---unless they have 3040 high-quality information about actual network conditions between the 3041 two new endpoints. Normally, the only way to get this information 3042 would be by instrumenting a DCCP connection between the new 3043 addresses. 3045 Similarly, the endpoints' configured MTUs (see Section 11) SHOULD be 3046 reinitialized, and PMTU discovery performed again, following an 3047 address change. 3049 10.4. Loss During Transition 3051 Several loss and delay events may affect the transition of a DCCP 3052 connection from one address to another. The DCCP-Move packet itself 3053 might be lost; the acknowledgement to that packet might be lost, 3054 leaving the mobile endpoint unsure whether the transition has 3055 completed; and data from the old endpoint might continue to arrive 3056 at the receiver even after the transition. 3058 To protect against lost DCCP-Move packets, the mobile host SHOULD 3059 retransmit a DCCP-Move packet if it does not receive an 3060 acknowledgement within a reasonable time period. Section 5.9 3061 describes the mechanism used to protect against duplicate DCCP-Move 3062 packets.
3064 A receiver MAY drop all data received from the old address/port pair 3065 once a DCCP-Move has successfully completed. Alternatively, it MAY 3066 accept one Loss Window's worth of this data. Congestion and loss 3067 events on this data SHOULD NOT affect the new connection's 3068 congestion control state. The receiver MUST NOT accept data with the 3069 old address/port pair past one Loss Window, and SHOULD send DCCP- 3070 Resets in response to those packets. 3072 During some transition period, acknowledgements from the receiver to 3073 the mobile host will contain information about packets sent both 3074 from the old address/port pair, and from the new address/port pair. 3075 The mobile DCCP MUST NOT let loss events on packets from the old 3076 address/port pair affect the new congestion control state. 3078 11. Path MTU Discovery 3080 A DCCP implementation SHOULD be capable of performing Path MTU 3081 (PMTU) discovery, as described in [RFC 1191]. The API to DCCP SHOULD 3082 allow this mechanism to be disabled in cases where IP fragmentation 3083 is preferred. The rest of this section assumes PMTU discovery has 3084 not been disabled. 3086 A DCCP implementation MUST maintain its idea of the current PMTU for 3087 each active DCCP session. The PMTU SHOULD be initialized from the 3088 interface MTU that will be used to send packets. 3090 To perform PMTU discovery, the DCCP sender sets the IP Don't 3091 Fragment (DF) bit. However, it is undesirable for MTU discovery to 3092 occur on the initial connection setup handshake, as the connection 3093 setup process may not be representative of packet sizes used during 3094 the connection, and performing MTU discovery on the initial 3095 handshake might unnecessarily delay connection establishment. Thus, 3096 DF SHOULD NOT be set on DCCP-Request and DCCP-Response packets. In 3097 addition, DF SHOULD NOT be set on DCCP-Reset packets, although 3098 typically these would be small enough to not be a problem.
On all 3099 other DCCP packets, DF SHOULD be set. 3101 Any API to DCCP MUST allow the application to discover DCCP's 3102 current PMTU. DCCP applications SHOULD use the API to discover the 3103 PMTU, and SHOULD NOT send datagrams that are greater than the PMTU; 3104 the only exception to this is if the application disables PMTU 3105 discovery. If the application tries to send a packet bigger than the 3106 PMTU, the DCCP implementation MUST drop the packet and return an 3107 appropriate error. 3109 As specified in [RFC 1191], when a router receives a packet with DF 3110 set that is larger than the PMTU, it sends an ICMP Destination 3111 Unreachable message to the source of the datagram with the Code 3112 indicating "fragmentation needed and DF set" (also known as a 3113 "Datagram Too Big" message). When a DCCP implementation receives a 3114 Datagram Too Big message, it decreases its PMTU to the Next-Hop MTU 3115 value given in the ICMP message. If the MTU given in the message is 3116 zero, the sender chooses a value for PMTU using the algorithm 3117 described in Section 7 of [RFC 1191]. If the MTU given in the 3118 message is greater than the current PMTU, the Datagram Too Big 3119 message is ignored, as described in [RFC 1191]. (We are aware that 3120 this may cause problems for DCCP endpoints behind certain 3121 firewalls.) 3123 If the DCCP implementation has decreased the PMTU, and the sending 3124 application attempts to send a packet larger than the new MTU, the 3125 API MUST cause the send to fail returning an appropriate error to 3126 the application, and the application SHOULD then use the API to 3127 query the new value of the PMTU. When this occurs, it is possible 3128 that the kernel has some packets buffered for transmission that are 3129 smaller than the old PMTU, but larger than the new PMTU. 
The kernel 3130 MAY send these packets with the DF bit cleared, or it MAY discard 3131 these packets; it MUST NOT transmit these datagrams with the DF bit 3132 set. 3134 DCCP currently provides no way to increase the PMTU once it has 3135 decreased. 3137 A DCCP sender MAY treat the reception of an ICMP Datagram 3138 Too Big message as an indication that the packet being reported was 3139 not lost due to congestion, and so for the purposes of congestion 3140 control it MAY ignore the DCCP receiver's indication that this 3141 packet did not arrive. However, if this is done, then the DCCP 3142 sender MUST check the ECN bits of the IP header echoed in the ICMP 3143 message, and only perform this optimization if these ECN bits 3144 indicate that the packet did not experience congestion prior to 3145 reaching the router whose MTU it exceeded. 3147 12. Middlebox Considerations 3149 This section describes properties of DCCP that firewalls, network 3150 address translators, and other middleboxes must consider, including 3151 parts of the packet that middleboxes must not change. 3153 The Service Name field in DCCP-Request packets provides information 3154 that may be useful for stateful middleboxes. With Service Name, a 3155 middlebox can tell what protocol a connection will use, without 3156 relying on port numbers. Middleboxes MAY disallow attempted 3157 connections with zero Service Names by sending a DCCP-Reset. 3158 Middleboxes SHOULD NOT modify the Service Name. 3160 The Source and Destination Port fields are in the same packet 3161 locations as the corresponding fields in TCP and UDP, which may 3162 simplify some middlebox implementations. 3164 Middleboxes MUST NOT modify DCCP packets' Sequence Number, 3165 Acknowledgement Number, and # NDP fields in order to add or remove 3166 packets from a packet stream.
Any such modification would affect the 3167 endpoints' accounting of which packets have been lost, destroy the 3168 Identification mechanism, and confuse the congestion control 3169 mechanisms in use. Note that there is less need to modify DCCP's 3170 per-packet sequence numbers than TCP's per-byte sequence numbers; 3171 for example, a middlebox can change the contents of a packet without 3172 changing its sequence number. (In TCP, sequence number modification 3173 is required to support legacy protocols like FTP that carry 3174 variable-length addresses in the data stream. If such an application 3175 were deployed over DCCP, middleboxes would simply grow or shrink the 3176 relevant packets as necessary, without changing their sequence 3177 numbers.) 3179 The exception to this rule is that middleboxes MAY reset connections 3180 in progress. Clearly this requires inserting a packet into one or 3181 both packet streams, as well as dropping all later packets on the 3182 connection. 3184 This does not explicitly prevent one sequence number modification 3185 occasionally seen with TCP, namely proxies with "connection 3186 splicing" [SHHP00]. Such proxies intercept TCP connection attempts 3187 from a client, but may later "splice" data from an external server 3188 connection into that client connection via sequence number 3189 manipulations. Packets are not added to or removed from the spliced- 3190 in stream, reducing the sequence number issues somewhat. 3191 Nevertheless, DCCP, with its extensive end-to-end feature 3192 negotiation, is inherently unfriendly to the idea of connection 3193 splicing: the proxy would have to ensure that the server chose the 3194 same feature values that the proxy had previously negotiated with 3195 the client. Furthermore, Identification options would require 3196 special handling; and there may be other issues. We suggest that 3197 DCCP splicing, if implemented, should take place at the application 3198 level. 
3200 A middlebox that wants to trivially support the MD5 Identification 3201 Regime (Section 6.4.3) MUST NOT alter packets' Sequence Number, 3202 Type, CCval, Acknowledgement Number, and Reserved fields, or the 3203 Connection Nonce feature values, which are included in the MD5 hash 3204 sent with Identification and Challenge options. 3206 The contents of this section SHOULD NOT be interpreted as a 3207 wholesale endorsement of stateful middleboxes. 3209 13. Abstract API 3211 API issues for DCCP are discussed in another Internet-Draft, in 3212 progress. 3214 14. Multiplexing Issues 3216 In contrast to TCP, DCCP does not offer reliable ordered delivery. 3217 As a consequence, with DCCP there are no inherent performance 3218 penalties in layering functionality above DCCP to multiplex several 3219 sub-flows into a single DCCP connection. 3221 However, this approach of multiplexing sub-flows above DCCP will not 3222 work in circumstances such as RTP where the RTP subflows require 3223 separate port numbers. In this case, if it is desired to share 3224 congestion control state among multiple DCCP flows that share the 3225 same source and destination addresses, the possibilities are to add 3226 DCCP-specific mechanisms to enable this, or to use a generic 3227 multiplexing facility like the Congestion Manager [RFC 3124] 3228 residing below the transport layer. For some DCCP flows, the 3229 ability to specify the congestion control mechanism might be 3230 critical, and for these flows the Congestion Manager will only be a 3231 viable tool if it allows DCCP to specify the congestion control 3232 mechanism used by the Congestion Manager for that flow. Thus, to 3233 allow the sharing of congestion control state among multiple DCCP 3234 flows, the alternatives seem to be to add DCCP-specific 3235 functionality to the Congestion Manager, or to add a similar layer 3236 below DCCP that is specific to DCCP. 
We defer issues of DCCP 3237 operating over a revised version of the Congestion Manager, or over 3238 a DCCP-specific module for the sharing of congestion control state, 3239 to later work. 3241 15. DCCP and RTP 3243 The real-time transport protocol, RTP [RFC 1889], is currently used 3244 (over UDP) by many of DCCP's target applications (for instance, 3245 streaming media). This section therefore discusses the relationship 3246 between DCCP and RTP, and in particular, the question of whether any 3247 changes in RTP are necessary or desirable when it is layered over 3248 DCCP instead of UDP. The main issue here is header size: a DCCP 3249 header is at least 4 bytes larger than a UDP header. 3251 There are two potential sources of overhead in the RTP-over-DCCP 3252 combination, duplicated acknowledgement information and duplicated 3253 sequence numbers. We argue that together, these sources of overhead 3254 add just 4 bytes per packet relative to RTP-over-UDP, and that 3255 eliminating the redundancy would not reduce the overhead. However, 3256 particular CCIDs might make productive use of the space occupied by 3257 RTP's sequence number. 3259 First, consider acknowledgements. The information on packet loss 3260 that RTP communicates via RTCP SR/RR packets is communicated by DCCP 3261 via acknowledgement options. Much of the information in an RTCP 3262 receiver report could be divined from DCCP acknowledgements, 3263 depending on the CCID in use. Acknowledgement options, such as Ack 3264 Vector, can be frequent and verbose, whereas RTCP reports are sent 3265 only rarely, with a minimum interval of 5 seconds between reports 3266 [RFC 1889]. 3268 However, not all CCIDs require such verbose acknowledgements. CCID 3 3269 (TFRC) reports acknowledgements at a low rate---between 16 and 32 3270 bytes of options (depending on ECN usage), sent once per round trip 3271 time. This is not an undue burden. 
Furthermore, the options are 3272 necessary to implement responsive congestion control, and we cannot 3273 report less frequently, although we might design alternative 3274 acknowledgement options that take fewer bytes. DCCP gives the 3275 application the trade off between small packet overhead and the 3276 precise feedback provided by Ack Vector. 3278 While RTP receiver reports might be considered "redundant" in the 3279 presence of DCCP's more precise acknowledgements, they are sent so 3280 infrequently that it is not worth optimizing them away. Also, note 3281 that in the common case of a one-way data stream, acknowledgement 3282 packets contain no data, so acknowledgement header size (as distinct 3283 from congestion on the acknowledgement path) is not an issue. 3285 We now consider sequence number redundancy on data packets. The 3286 embedded RTP header contains a 16-bit RTP sequence number. Most data 3287 packets will use the DCCP-Data type; DCCP-DataAck and DCCP-Ack 3288 packets need not usually be sent. The DCCP-Data header is 12 bytes 3289 long without options, including a 24-bit sequence number. This is 4 3290 bytes more than a UDP header. Any options required on data packets 3291 would add further overhead, although many CCIDs (for instance, CCID 3292 3 [TFRC]) don't require options on most data packets. 3294 The DCCP sequence number cannot be inferred from the RTP sequence 3295 number since it increments on non-data packets as well as data 3296 packets. The RTP sequence number could be inferred from the DCCP 3297 sequence number, though; it might equal the DCCP sequence number 3298 minus the total number of non-data packets seen so far in the 3299 connection (as tracked by DCCP's # NDP header field). 3301 Removing RTP's sequence number would not save any header space 3302 because of alignment issues. However, particular DCCP CCIDs might 3303 make use of the 16 bits occupied by the RTP sequence number. 
3304 Therefore, particular DCCP CCIDs MAY provide optional CCID-specific 3305 features that store DCCP quantities in place of the embedded RTP 3306 sequence number. A conforming DCCP would write in the calculated RTP 3307 sequence number before passing the packet to RTP. (The DCCP checksum 3308 would use the DCCP quantity, not the RTP sequence number.) 3310 Given RTP-over-DCCP's small overhead, however, implementors 3311 demanding tiny headers will probably prefer more comprehensive 3312 header compression to this ad-hoc compression technique. 3314 16. Security Considerations 3316 DCCP does not provide cryptographic security guarantees. 3317 Applications desiring hard security should use IPsec or end-to-end 3318 security of some kind. 3320 Nevertheless, DCCP is intended to protect against some classes of 3321 attackers. Attackers cannot hijack a DCCP connection (close the 3322 connection unexpectedly, or cause attacker data to be accepted by an 3323 endpoint as if it came from the sender) unless they can guess valid 3324 sequence numbers. Thus, as long as endpoints choose initial sequence 3325 numbers well, a DCCP attacker must snoop on data packets to get any 3326 reasonable probability of success. The sequence number validity 3327 (Section 5.2), Identification (Section 6.4.3), and mobility (Section 3328 10) mechanisms provide this guarantee. We also avoid leaking 3329 sequence numbers to possibly malicious endpoints. For instance, this 3330 is why invalid DCCP-Moves are ignored, rather than reset. 3332 17. IANA Considerations 3334 DCCP introduces six sets of numbers whose values should be allocated 3335 by IANA. 3337 o 32-bit Service Names (Section 5.4; not exclusive to DCCP). 3339 o 8-bit DCCP-Reset Reasons (Section 5.8). 3341 o 8-bit DCCP Option Types (Section 6). The CCID-specific options 128 3342 through 255 need not be allocated by IANA. 3344 o 8-bit DCCP Feature Numbers (Section 6.3). 
  The CCID-specific features 128 through 255 need not be allocated by
  IANA.

o 8-bit DCCP Congestion Control Identifiers (CCIDs) (Section 7).

o 16-bit Identification Regimes, for use with DCCP Identification
  and Challenge options (Section 6.4).

In addition, DCCP requires a Protocol Number to be added to the
registry of Assigned Internet Protocol Numbers. Experimental
implementors should use Protocol Number 33 for DCCP, but this number
may change in the future.

18. Design Motivation

In this section we attempt to capture some of the rationale behind
specific details of DCCP design.

18.1. CSlen and Partial Checksumming

A great deal of discussion has taken place regarding the utility of
allowing a DCCP sender to restrict the checksum so that it does not
cover the complete packet.

Many of the applications that we envisage using DCCP are resilient
to some degree of data loss, or they would typically have chosen a
reliable transport. Some of these applications may also be
resilient to data corruption---some audio payloads, for example.
These resilient applications might prefer to receive corrupted data
rather than have DCCP drop a corrupted packet. This is particularly
so because of congestion control: DCCP cannot tell the difference
between packets dropped due to corruption and packets dropped due to
congestion, and so it must reduce the transmission rate accordingly.
This response may cause the connection to receive less bandwidth
than it is due; corruption in some networking technologies is
independent of, or at least not always correlated to, congestion.
Therefore, corrupted packets do not need to cause as strong a
reduction in transmission rate as the congestion response would
dictate (so long as the DCCP header and options are not corrupt).
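
As a rough illustration of what restricting checksum coverage means,
the following Python sketch computes a TCP/UDP-style 16-bit one's
complement checksum over only the first `coverage` bytes of a packet.
This is a sketch under stated assumptions, not the DCCP algorithm:
the function names, the `coverage` parameter, and the sample packet
are hypothetical, and the sketch ignores any pseudo-header and the
actual CSlen encoding, neither of which is described in this section.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum (TCP/UDP style)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # big-endian 16-bit words
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


def partial_checksum(packet: bytes, coverage: int) -> int:
    """Checksum only the first `coverage` bytes (hypothetical helper)."""
    if not 0 <= coverage <= len(packet):
        raise ValueError("coverage must lie within the packet")
    return internet_checksum(packet[:coverage])


# Hypothetical 12-byte header (checksum field assumed zero) plus payload.
header = bytes(range(12))
payload = b"audio frame"
corrupted = payload[:-1] + b"?"  # corrupt the last payload byte

# Header-only coverage: the payload corruption does not affect the checksum.
same = (partial_checksum(header + payload, len(header))
        == partial_checksum(header + corrupted, len(header)))

# Full coverage: the same corruption changes the checksum.
differs = (partial_checksum(header + payload, len(header) + len(payload))
           != partial_checksum(header + corrupted, len(header) + len(payload)))
```

With header-only coverage the receiver would accept the packet and
hand the damaged payload to the application; with full coverage it
would detect the damage and drop the packet.
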

Thus DCCP allows the checksum to cover all of the packet, just the
DCCP header, or both the DCCP header and some number of bytes from
the payload. If the application cannot tolerate any payload
corruption, then the checksum SHOULD cover the whole packet. If the
application would prefer to tolerate some corruption rather than
have the packet dropped, then it can set the checksum to cover only
part of the packet (but always the DCCP header). In addition, if
the application wishes to decouple checksumming of the DCCP header
from checksumming of the payload, it may do so by including the
Payload Checksum option. This would allow payload corruption to
cause DCCP to discard a corrupted payload, but still not mistake the
corruption for network congestion.

Thus, from the application point of view, partial checksums seem to
be a desirable feature. However, the usefulness of partial
checksums depends on partially corrupted packets being delivered to
the receiver. If the link-layer CRC always discards corrupted
packets, then this will not happen, and so the usefulness of partial
checksums would be restricted to corruption that occurred in routers
and other places not covered by link CRCs. There does not appear to
be consensus on how likely it is that future network links that
suffer significant corruption will not cover the entire packet with
a single strong CRC. DCCP makes it possible to tailor such links to
the application, but it is difficult to predict whether this will be
compelling for future link technologies.

In addition, partial checksums do not co-exist well with IP-level
authentication mechanisms such as IPsec AH, which cover the entire
packet with a cryptographic hash. Thus, if cryptographic
authentication mechanisms are required to co-exist with partial
checksums, the authentication must be carried in the DCCP payload.
A possible mode of usage would appear to be similar to that of
Secure RTP. However, such "application-level" authentication does
not protect the DCCP option negotiation and state machine from
forged packets. An alternative would be to use IPsec ESP, and use
encryption to protect the DCCP headers against attack, while using
the DCCP header validity checks to authenticate that the header is
from someone who possessed the correct key. However, while this is
resistant to replay (due to the DCCP sequence number), it is not by
itself resistant to some forms of man-in-the-middle attacks because
the payload is not tightly coupled to the packet header. Thus an
application-level authentication probably needs to be coupled with
IPsec ESP or a similar mechanism to provide a reasonably complete
security solution. The overhead of such a solution might be
unacceptable for some applications that would otherwise wish to use
partial checksums.

On balance, the authors believe that DCCP partial checksums have the
potential to enable some future uses that would otherwise be
difficult. As the cost and complexity of supporting them are small,
it seems worth including them at this time. It remains to be seen
whether they are useful in practice.

19. Thanks

There is a wealth of work in this area, including the Congestion
Manager. We thank the staff and interns of ICIR and, formerly,
ACIRI, the members of the End-to-End Research Group, and the members
of the Transport Area Working Group for their feedback on DCCP.
We also thank those who provided comments and suggestions via the
DCCP BOF, Working Group, and mailing lists, including Damon Lanphear,
Patrick McManus, Sara Karlberg, Kevin Lai, Youngsoo Choi, Dan
Duchamp, Derek Fawcus, David Timothy Fleeman, John Loughney,
Ghyslain Pelletier, Tom Phelan, Stanislav Shalunov, Yufei Wang, and
Michael Welzl.

20. Normative References

[RFC 793] J. Postel, editor. Transmission Control Protocol. RFC 793.

[RFC 1191] J. C. Mogul and S. E. Deering. Path MTU Discovery. RFC
    1191.

[RFC 2026] S. Bradner. The Internet Standards Process---Revision 3.
    RFC 2026.

[RFC 2119] S. Bradner. Key Words For Use in RFCs to Indicate
    Requirement Levels. RFC 2119.

[RFC 2460] S. Deering and R. Hinden. Internet Protocol, Version 6
    (IPv6) Specification. RFC 2460.

[RFC 3168] K.K. Ramakrishnan, S. Floyd, and D. Black. The Addition
    of Explicit Congestion Notification (ECN) to IP. RFC 3168.
    September 2001.

21. Informative References

[CCID 2 PROFILE] S. Floyd and E. Kohler. Profile for DCCP Congestion
    Control ID 2: TCP-like Congestion Control.
    draft-ietf-dccp-ccid2-01.txt, work in progress, March 2003.

[CCID 3 PROFILE] S. Floyd, E. Kohler, and J. Padhye. Profile for
    DCCP Congestion Control ID 3: TFRC Congestion Control.
    draft-ietf-dccp-ccid3-01.txt, work in progress, March 2003.

[ECN NONCE] David Wetherall, David Ely, and Neil Spring. Robust ECN
    Signaling with Nonces. draft-ietf-tsvwg-tcp-nonce-04.txt, work
    in progress, October 2002.

[RFC 1889] Audio-Video Transport Working Group, H. Schulzrinne, S.
    Casner, R. Frederick, and V. Jacobson. RTP: A Transport
    Protocol for Real-Time Applications. RFC 1889.

[RFC 1948] S. Bellovin. Defending Against Sequence Number Attacks.
    RFC 1948.

[RFC 2960] R. Stewart, Q. Xie, K. Morneault, C. Sharp, H.
    Schwarzbauer, T. Taylor, I. Rytina, M. Kalla, L.
    Zhang, and V. Paxson. Stream Control Transmission Protocol.
    RFC 2960.

[RFC 3124] H. Balakrishnan and S. Seshan. The Congestion Manager.
    RFC 3124.

[SB00] Alex C. Snoeren and Hari Balakrishnan. An End-to-End Approach
    to Host Mobility. Proc. 6th Annual ACM/IEEE International
    Conference on Mobile Computing and Networking (MOBICOM '00),
    August 2000.

[SHHP00] Oliver Spatscheck, Jorgen S. Hansen, John H. Hartman, and
    Larry L. Peterson. Optimizing TCP Forwarder Performance.
    IEEE/ACM Transactions on Networking 8(2):146-157, April 2000.

[UDP-LITE] L-A. Larzon, M. Degermark, S. Pink, L-E. Jonsson
    (editor), and G. Fairhurst (editor). The UDP-Lite Protocol.
    draft-ietf-tsvwg-udp-lite-01.txt, work in progress, December
    2002.

22. Authors' Addresses

Eddie Kohler
Mark Handley
Sally Floyd

ICSI Center for Internet Research
1947 Center Street, Suite 600
Berkeley, CA 94704 USA

Jitendra Padhye

Microsoft Research
One Microsoft Way
Redmond, WA 98052 USA