idnits 2.17.1 draft-ietf-dccp-spec-01.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** Cannot find the required boilerplate sections (Copyright, IPR, etc.) in this document. Expected boilerplate is as follows today (2024-04-26) according to https://trustee.ietf.org/license-info : IETF Trust Legal Provisions of 28-dec-2009, Section 6.a: This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79. IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 2: Copyright (c) 2024 IETF Trust and the persons identified as the document authors. All rights reserved. IETF Trust Legal Provisions of 28-dec-2009, Section 6.b(i), paragraph 3: This document is subject to BCP 78 and the IETF Trust's Legal Provisions Relating to IETF Documents (https://trustee.ietf.org/license-info) in effect on the date of publication of this document. Please review these documents carefully, as they describe your rights and restrictions with respect to this document. Code Components extracted from this document must include Simplified BSD License text as described in Section 4.e of the Trust Legal Provisions and are provided without warranty as described in the Simplified BSD License. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- ** The document seems to lack a 1id_guidelines paragraph about Internet-Drafts being working documents. == No 'Intended status' indicated for this document; assuming Proposed Standard Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack separate sections for Informative/Normative References. 
All references will be assumed normative when checking for downward references. ** There are 18 instances of too long lines in the document, the longest one being 6 characters in excess of 72. ** The document seems to lack a both a reference to RFC 2119 and the recommended RFC 2119 boilerplate, even if it appears to use RFC 2119 keywords. RFC 2119 keyword, line 328: '... host MUST NOT respond to a DCCP-Re...' RFC 2119 keyword, line 333: '...ctions as a unit. However, DCCP SHOULD...' RFC 2119 keyword, line 389: '...like the server to use. The client MAY...' RFC 2119 keyword, line 391: '...st, say---which the server MAY ignore....' RFC 2119 keyword, line 398: '...mation and which MUST be returned by t...' (117 more instances...) Miscellaneous warnings: ---------------------------------------------------------------------------- == Line 1040 has weird spacing: '... option optio...' == Line 1042 has weird spacing: '...feature featu...' == Line 1631 has weird spacing: '...scarded is no...' -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- The document date (2 March 2003) is 7726 days in the past. Is this intentional? -- Found something which looks like a code comment -- if you have code sections in the document, please surround them with '' and '' lines. 
Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Missing Reference: 'RFC3124' is mentioned on line 238, but not defined == Missing Reference: 'Nonce 0' is mentioned on line 2601, but not defined == Missing Reference: 'Nonce 1' is mentioned on line 2572, but not defined ** Downref: Normative reference to an Historic draft: draft-ietf-tsvwg-tcp-nonce (ref. 'ECN NONCE') ** Obsolete normative reference: RFC 793 (Obsoleted by RFC 9293) ** Obsolete normative reference: RFC 1889 (Obsoleted by RFC 3550) ** Obsolete normative reference: RFC 2460 (Obsoleted by RFC 8200) ** Obsolete normative reference: RFC 2960 (Obsoleted by RFC 4960) -- Possible downref: Non-RFC (?) normative reference: ref. 'SB00' == Outdated reference: A later version (-02) exists of draft-ietf-tsvwg-udp-lite-01 Summary: 10 errors (**), 0 flaws (~~), 8 warnings (==), 4 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 1 Internet Engineering Task Force 2 INTERNET-DRAFT Eddie Kohler 3 draft-ietf-dccp-spec-01.txt Mark Handley 4 Sally Floyd 5 ICIR 6 Jitendra Padhye 7 Microsoft Research 8 2 March 2003 9 Expires: September 2003 11 Datagram Congestion Control Protocol (DCCP) 13 Status of this Document 15 This document is an Internet-Draft and is in full conformance with 16 all provisions of Section 10 of [RFC 2026]. Internet-Drafts are 17 working documents of the Internet Engineering Task Force (IETF), its 18 areas, and its working groups. Note that other groups may also 19 distribute working documents as Internet-Drafts. 21 Internet-Drafts are draft documents valid for a maximum of six 22 months and may be updated, replaced, or obsoleted by other documents 23 at any time. 
It is inappropriate to use Internet-Drafts as reference 24 material or to cite them other than as "work in progress." 26 The list of current Internet-Drafts can be accessed at 27 http://www.ietf.org/ietf/1id-abstracts.txt 29 The list of Internet-Draft Shadow Directories can be accessed at 30 http://www.ietf.org/shadow.html. 32 Abstract 34 This document specifies the Datagram Congestion Control 35 Protocol (DCCP), which implements a congestion-controlled, 36 unreliable flow of datagrams suitable for use by applications 37 such as streaming media. 39 TO BE DELETED BY THE RFC EDITOR UPON PUBLICATION: 41 Changes from draft-ietf-dccp-spec-00.txt: 43 * Add Identification mechanism, replacing old Challenge 44 mechanism. Identification is much less vulnerable to attack 45 than Challenges were. 47 * Add discussion of DCCP and RTP. 49 * Add clarification of when packets are "received" for 50 purposes of acknowledgement (Section 5.5). 52 * Clarify discussion of partial checksums. 54 * Mention the problem of wrapped sequence numbers for future 55 work. 57 * Allow Data Discarded options on packets other than DCCP- 58 Response. 60 * Clarify what ECN-incapable receivers should do with ECN 61 nonces. 63 Table of Contents 65 1. Introduction. . . . . . . . . . . . . . . . . . . . . . 5 66 2. Design Rationale. . . . . . . . . . . . . . . . . . . . 6 67 3. Concepts and Terminology. . . . . . . . . . . . . . . . 7 68 3.1. Anatomy of a DCCP Connection . . . . . . . . . . . . 7 69 3.2. Congestion Control . . . . . . . . . . . . . . . . . 8 70 3.3. Connection Initiation and Termination. . . . . . . . 8 71 3.4. Features . . . . . . . . . . . . . . . . . . . . . . 9 72 4. DCCP Packets. . . . . . . . . . . . . . . . . . . . . . 9 73 4.1. Examples of DCCP Congestion Control. . . . . . . . . 11 74 4.1.1. DCCP with TCP-like Congestion Control . . . . . . 11 75 4.1.2. DCCP with TFRC Congestion Control . . . . . . . . 12 76 5. Packet Formats. . . . . . . . . . . . . . . . . . . . . 13 77 5.1. 
Generic Packet Header. . . . . . . . . . . . . . . . 13 78 5.2. Sequence Number Validity . . . . . . . . . . . . . . 16 79 5.3. DCCP State Diagram . . . . . . . . . . . . . . . . . 18 80 5.4. DCCP-Request Packet Format . . . . . . . . . . . . . 18 81 5.5. DCCP-Response Packet Format. . . . . . . . . . . . . 19 82 5.6. DCCP-Data, DCCP-Ack, and DCCP-DataAck Packet 83 Formats . . . . . . . . . . . . . . . . . . . . . . . . . 21 84 5.7. DCCP-CloseReq and DCCP-Close Packet Format . . . . . 23 85 5.8. DCCP-Reset Packet Format . . . . . . . . . . . . . . 23 86 5.9. DCCP-Move Packet Format. . . . . . . . . . . . . . . 24 87 6. Options and Features. . . . . . . . . . . . . . . . . . 26 88 6.1. Padding Option . . . . . . . . . . . . . . . . . . . 27 89 6.2. Ignored Option . . . . . . . . . . . . . . . . . . . 27 90 6.3. Feature Negotiation. . . . . . . . . . . . . . . . . 28 91 6.3.1. Feature Numbers . . . . . . . . . . . . . . . . . 28 92 6.3.2. Change Option . . . . . . . . . . . . . . . . . . 29 93 6.3.3. Prefer Option . . . . . . . . . . . . . . . . . . 29 94 6.3.4. Confirm Option. . . . . . . . . . . . . . . . . . 30 95 6.3.5. Example Negotiations. . . . . . . . . . . . . . . 30 96 6.3.6. Unknown Features. . . . . . . . . . . . . . . . . 31 97 6.3.7. State Diagram . . . . . . . . . . . . . . . . . . 31 98 6.4. Identification Options . . . . . . . . . . . . . . . 34 99 6.4.1. Identification Regime Feature . . . . . . . . . . 34 100 6.4.2. Connection Nonce Feature. . . . . . . . . . . . . 35 101 6.4.3. Identification Option . . . . . . . . . . . . . . 35 102 6.4.4. Challenge Option. . . . . . . . . . . . . . . . . 36 103 6.5. Data Discarded Option. . . . . . . . . . . . . . . . 37 104 6.6. Init Cookie Option . . . . . . . . . . . . . . . . . 38 105 6.7. Timestamp Option . . . . . . . . . . . . . . . . . . 38 106 6.8. Timestamp Echo Option. . . . . . . . . . . . . . . . 39 107 6.9. Loss Window Feature. . . . . . . . . . . . . . . . . 39 108 7. Congestion Control IDs. . . . 
. . . . . . . . . . . . . 39 109 7.1. Unspecified Sender-Based Congestion Control. . . . . 40 110 7.2. TCP-like Congestion Control. . . . . . . . . . . . . 41 111 7.3. TFRC Congestion Control. . . . . . . . . . . . . . . 41 112 7.4. CCID-Specific Options and Features . . . . . . . . . 41 113 8. Acknowledgements. . . . . . . . . . . . . . . . . . . . 42 114 8.1. Acks of Acks and Unidirectional Connections. . . . . 42 115 8.2. Ack Piggybacking . . . . . . . . . . . . . . . . . . 44 116 8.3. Ack Ratio Feature. . . . . . . . . . . . . . . . . . 44 117 8.4. Use Ack Vector Feature . . . . . . . . . . . . . . . 45 118 8.5. Ack Vector Options . . . . . . . . . . . . . . . . . 45 119 8.5.1. Ack Vector Consistency. . . . . . . . . . . . . . 47 120 8.5.2. Ack Vector Coverage . . . . . . . . . . . . . . . 48 121 8.6. Slow Receiver Option . . . . . . . . . . . . . . . . 49 122 8.7. Receive Buffer Drops Option. . . . . . . . . . . . . 50 123 8.8. Buffer Closed Option . . . . . . . . . . . . . . . . 50 124 8.9. Ack Vector Implementation Notes. . . . . . . . . . . 51 125 8.9.1. New Packets . . . . . . . . . . . . . . . . . . . 52 126 8.9.2. Sending Acknowledgements. . . . . . . . . . . . . 54 127 8.9.3. Clearing State. . . . . . . . . . . . . . . . . . 54 128 8.9.4. Processing Acknowledgements . . . . . . . . . . . 55 129 9. Explicit Congestion Notification. . . . . . . . . . . . 56 130 9.1. ECN Capable Feature. . . . . . . . . . . . . . . . . 56 131 9.2. ECN Nonces . . . . . . . . . . . . . . . . . . . . . 57 132 10. Multihoming and Mobility . . . . . . . . . . . . . . . 58 133 10.1. Mobility Capable Feature. . . . . . . . . . . . . . 59 134 10.2. Security. . . . . . . . . . . . . . . . . . . . . . 59 135 10.3. Congestion Control State. . . . . . . . . . . . . . 59 136 10.4. Loss During Transition. . . . . . . . . . . . . . . 60 137 11. Path MTU Discovery . . . . . . . . . . . . . . . . . . 60 138 12. Abstract API . . . . . . . . . . . . . . . . . . . . . 62 139 13. 
Multiplexing Issues. . . . . . . . . . . . . . . . . . 62 140 14. DCCP and RTP . . . . . . . . . . . . . . . . . . . . . 62 141 15. Security Considerations. . . . . . . . . . . . . . . . 64 142 16. IANA Considerations. . . . . . . . . . . . . . . . . . 64 143 17. Thanks . . . . . . . . . . . . . . . . . . . . . . . . 65 144 18. References . . . . . . . . . . . . . . . . . . . . . . 65 145 19. Authors' Addresses . . . . . . . . . . . . . . . . . . 66 147 1. Introduction 149 This document specifies the Datagram Congestion Control Protocol 150 (DCCP). DCCP provides the following features: 152 o An unreliable flow of datagrams, with acknowledgements. 154 o A reliable handshake for connection setup and teardown. 156 o Reliable negotiation of options, including negotiation of a 157 suitable congestion control mechanism. 159 o Mechanisms allowing a server to avoid holding any state for 160 unacknowledged connection attempts or already-finished 161 connections. 163 o An optional mechanism that allows the sender to know, with high 164 reliability, which packets reached the receiver. 166 o Congestion control incorporating Explicit Congestion Notification 167 (ECN) and the ECN Nonce, as per [RFC 3168] and [ECN NONCE]. 169 o Path MTU discovery, as per [RFC 1191]. 171 DCCP is intended for applications that require the flow-based 172 semantics of TCP, but which do not want TCP's in-order delivery and 173 reliability semantics, or which would like different congestion 174 control dynamics than TCP. Similarly, DCCP is intended for 175 applications that do not require the features of SCTP [RFC 2960] 176 such as sequenced delivery within multiple streams. 178 Applications that could make use of DCCP include those with timing 179 constraints on the delivery of data such that reliable in-order 180 delivery, when combined with congestion control, is likely to result 181 in some information arriving at the receiver after it is no longer 182 of use. 
Such applications might include streaming media and 183 Internet telephony. 185 To date most such applications have used either TCP, with the 186 problems described above, or used UDP and implemented their own 187 congestion control mechanisms (or no congestion control at all). The 188 purpose of DCCP is to provide a standard way to implement congestion 189 control and congestion control negotiation for such applications. 190 One of the motivations for DCCP is to enable the use of ECN, along 191 with conformant end-to-end congestion control, for applications that 192 would otherwise be using UDP. In addition, DCCP implements reliable 193 connection setup, teardown, and feature negotiation. 195 A DCCP connection contains acknowledgement traffic as well as data 196 traffic. Acknowledgements inform a sender whether its packets 197 arrived, and whether they were ECN marked. Acks are transmitted as 198 reliably as the congestion control mechanism in use requires, 199 possibly up to completely reliably. 201 Previous drafts of this specification called the protocol DCP, or 202 Datagram Control Protocol. The name was changed to make the acronym 203 sound less like "TCP". 205 2. Design Rationale 207 One of the motivations behind the design of DCCP is to make DCCP as 208 low-overhead as possible, in terms both of the size of the packet 209 header and in terms of the state and CPU overhead required at the 210 end hosts. In particular, DCCP is designed to minimize the state 211 maintained by the data sender. DCCP is intended to be used by 212 applications that currently use UDP without end-to-end congestion 213 control. The desire is for many applications to have little reason 214 not to use DCCP instead of UDP, once DCCP is deployed. 216 This desire for minimal overhead results in the design decision to 217 include only the minimal necessary functionality in DCCP, leaving 218 other functionality, such as FEC or semi-reliability, to be layered 219 on top of DCCP as desired. 
The desire for minimal overhead is also 220 one of the reasons to propose DCCP instead of just proposing an 221 unreliable version of SCTP for applications currently using UDP. 223 A second motivation behind the design of DCCP is to allow 224 applications to choose an alternative to the current TCP-style 225 congestion control that halves the congestion window in response to 226 a congestion indication. DCCP lets applications choose between 227 several forms of congestion control. One choice, TCP-like 228 congestion control, halves the congestion window in response to a 229 packet drop or mark, as in TCP. A second alternative, TFRC (TCP- 230 Friendly Rate Control, a form of equation-based congestion control), 231 minimizes abrupt changes in the sending rate while maintaining 232 longer-term fairness with TCP. 234 In proposing a new transport protocol, it is necessary to justify 235 the design decision not to require the use of the Congestion 236 Manager, as well as the design decision to add a new transport 237 protocol to the current family of UDP, TCP, and SCTP. The 238 Congestion Manager [RFC3124] allows multiple concurrent streams 239 between the same sender and receiver to share congestion control. 240 However, the current Congestion Manager can only be used by 241 applications that have their own end-to-end feedback about packet 242 losses, and this is not the case for many of the applications 243 currently using UDP. In addition, the current Congestion Manager 244 does not lend itself to the use of forms of TFRC where the state 245 about past packet drops or marks is maintained at the receiver 246 rather than at the sender. While DCCP should be able to make use of 247 CM where desired by the application, we do not see any benefit in 248 making the deployment of DCCP contingent on the deployment of CM 249 itself. 251 3. Concepts and Terminology 253 3.1. 
Anatomy of a DCCP Connection 255 Each DCCP connection runs between two endpoints, which we often name 256 DCCP A and DCCP B. Data may pass over the connection in either or 257 both directions. The DCCP connection between DCCP A and DCCP B 258 consists of four sets of packets, as follows: 260 (1) Data packets from DCCP A to DCCP B. 262 (2) Acknowledgements from DCCP B to DCCP A. 264 (3) Data packets from DCCP B to DCCP A. 266 (4) Acknowledgements from DCCP A to DCCP B. 268 We use the following terms to refer to subsets and endpoints of a 269 DCCP connection. 271 Subflows 272 A subflow consists of either data or acknowledgement packets, 273 sent in one direction (from DCCP A to DCCP B, say). Each of the 274 four sets of packets above is a subflow. (Subflows may overlap 275 to some extent, since acknowledgements may be piggybacked on 276 data packets.) 278 Sequences 279 A sequence consists of all packets sent in one direction, 280 regardless of whether they are data or acknowledgements. The 281 sets 1+4 and 2+3, from above, are each sequences. Each packet on 282 a sequence has a different sequence number. 284 Half-connections 285 A half-connection consists of the data packets sent in one 286 direction, plus the corresponding acknowledgements. The sets 1+2 287 and 3+4, from above, are each half-connections. Half-connections 288 are named after the direction of data flow, so the A-to-B half- 289 connection contains the data packets from A to B and the 290 acknowledgements from B to A. 292 HC-Sender and HC-Receiver 293 In the context of a single half-connection, the HC-Sender is the 294 endpoint sending data, while the HC-Receiver is the endpoint 295 sending acknowledgements. For example, in the A-to-B half- 296 connection, DCCP A is the HC-Sender and DCCP B is the HC- 297 Receiver. 299 3.2. Congestion Control 301 Each half-connection is managed by a congestion control mechanism. 
302 The endpoints negotiate these mechanisms at connection setup; the 303 mechanisms for the two half-connections need not be the same. 305 Conformant congestion control mechanisms correspond to single-byte 306 congestion control identifiers, or CCIDs. The CCID for a half- 307 connection describes how the HC-Sender limits data packet rates; how 308 it maintains necessary parameters, such as congestion windows; how 309 the HC-Receiver sends congestion feedback via acknowledgements; and 310 how it manages the acknowledgement rate. Section 7 introduces the 311 currently allocated CCIDs, which are defined in separate profile 312 documents. 314 3.3. Connection Initiation and Termination 316 Every DCCP connection is actively initiated by one DCCP, which 317 connects to a DCCP socket in the passive listening state. We refer 318 to the active endpoint as "the client" and the passive endpoint as 319 "the server". Most of the DCCP specification is indifferent to 320 whether a DCCP is client or server. However, only the server may 321 generate a DCCP-CloseReq packet. (A DCCP-CloseReq packet forces the 322 receiving DCCP to close the connection and maintain connection state 323 for a reasonable time, allowing old packets to clear the network.) 324 This means that the client cannot force the server to maintain 325 connection state after the connection is closed. 327 DCCP does not support TCP-style simultaneous open. In particular, a 328 host MUST NOT respond to a DCCP-Request packet with a DCCP-Response 329 packet unless the destination port specified in the DCCP-Request 330 corresponds to a local socket opened for listening. 332 DCCP does not support half-closed connections either. That is, DCCP 333 shuts down both half-connections as a unit. However, DCCP SHOULD 334 allow applications to declare that they are no longer interested in 335 receiving data. This would allow DCCP implementations to streamline 336 state for certain half-connections. 
See Section 8.8, on the Buffer 337 Closed option, for more information. 339 3.4. Features 341 DCCP uses a generic mechanism to negotiate connection properties, 342 such as the CCIDs active on the two half-connections. These 343 properties are called features. (We reserve the term "option" for a 344 collection of bytes in some DCCP header.) A feature name, such as 345 "CCID", generally corresponds to two features, one per half- 346 connection. For instance, there are two CCIDs per connection. The 347 endpoint in charge of a particular feature is called its feature 348 location. 350 The Change, Prefer, and Confirm options negotiate feature values. 351 (These options were formerly called Ask, Choose, and Answer, 352 respectively.) Change is sent to a feature location, asking it to 353 change its value for the feature. The feature location may respond 354 with Prefer, which asks the other endpoint to Change again with 355 different values, or it may change the feature value and acknowledge 356 the request with Confirm. Retransmissions make feature negotiation 357 reliable. Section 6.3 describes these options further. 359 4. DCCP Packets 361 DCCP has nine different packet types: 363 o DCCP-Request 365 o DCCP-Response 367 o DCCP-Data 369 o DCCP-Ack 371 o DCCP-DataAck 373 o DCCP-CloseReq 375 o DCCP-Close 377 o DCCP-Reset 379 o DCCP-Move 381 Only the first eight types commonly occur. The DCCP-Move packet is 382 used to support multihoming and mobility. 384 The progress of a typical DCCP connection is as follows. 386 (1) The client sends the server a DCCP-Request packet specifying the 387 client and server ports, the service that is being requested, 388 and any features that are being negotiated, including the CCID 389 that the client would like the server to use. The client MAY 390 optionally piggyback some data on the DCCP-Request packet---an 391 application-level request, say---which the server MAY ignore. 
393 (2) The server sends the client a DCCP-Response packet indicating 394 that it is willing to communicate with the client. The response 395 indicates any features and options that the server agrees to, 396 whether an application request in the DCCP-Request was actually 397 passed to the application, and optionally an Init Cookie that 398 wraps up all this information and which MUST be returned by the 399 client for the connection to complete. 401 (3) The client sends the server a DCCP-Ack packet that acknowledges 402 the DCCP-Response packet. This acknowledges the server's initial 403 sequence number and returns the Init Cookie if there was one in 404 the DCCP-Response. It may also continue feature negotiation. 406 (4) Next come zero or more DCCP-Ack exchanges as required to 407 finalize feature negotiation. The client may piggyback an 408 application-level request on its final ack, producing a DCCP- 409 DataAck packet. 411 (5) The server and client then exchange DCCP-Data packets, DCCP-Ack 412 packets acknowledging that data, and, optionally, DCCP-DataAck 413 packets containing piggybacked data and acknowledgements. If the 414 client has no data to send, then the server will send DCCP-Data 415 and DCCP-DataAck packets, while the client will send DCCP-Acks 416 exclusively. 418 (6) The server sends a DCCP-CloseReq packet requesting a close. 420 (7) The client sends a DCCP-Close packet acknowledging the close. 422 (8) The server sends a DCCP-Reset packet with Reason field set to 423 "Closed" and clears its connection state. 425 (9) The client receives the DCCP-Reset packet and holds state for a 426 reasonable interval of time to allow any remaining packets to 427 clear the network. 429 An alternative connection closedown sequence is initiated by the 430 client: 432 (6) The client sends a DCCP-Close packet closing the connection. 434 (7) The server sends a DCCP-Reset packet with Reason field set to 435 "Closed" and clears its connection state. 
437 (8) The client receives the DCCP-Reset packet and holds state for a 438 reasonable interval of time to allow any remaining packets to 439 clear the network. 441 This arrangement of setup and teardown handshakes permits the server 442 to decline to hold any state until the handshake with the client has 443 completed, and ensures that the client must hold the TimeWait state 444 at connection closedown. 446 4.1. Examples of DCCP Congestion Control 448 Before giving the detailed specifications of DCCP, we present two 449 more detailed examples showing DCCP congestion control in operation. 451 4.1.1. DCCP with TCP-like Congestion Control 453 The first example is of a connection where both half-connections use 454 TCP-like Congestion Control, specified by CCID 2 [CCID 2 PROFILE]. 455 In this example, the client sends an application-level request to 456 the server, and the server responds with a stream of data packets. 457 This example is of a connection using ECN. 459 (1) The client sends the DCCP-Request, which includes a Change 460 option asking the server to use CCID 2 for the server's data 461 packets, and a Prefer option informing the server that the 462 client would like to use CCID 2 for its data packets. 464 (2) The server sends a DCCP-Response, including a Confirm option 465 indicating that the server agrees to use CCID 2 for its data 466 packets, and a Change option indicating that the server agrees 467 to the client's suggestion of CCID 2 for the client's data 468 packets. 470 (3) The client responds with a DCCP-DataAck acknowledging the 471 server's initial sequence number, and including a Confirm option 472 finalizing the negotiation of the client-to-server CCID, and an 473 application-level request for data. We will not discuss the 474 client-to-server half-connection further in this example. 476 (4) The server sends DCCP-Data packets, where the number of packets 477 sent is governed by a congestion window cwnd, as in TCP. 
The 478 details of the congestion window are defined in the profile for 479 CCID 2, which is a separate document [CCID 2 PROFILE]. The 480 server also sends Ack Ratio feature options specifying the 481 number of server data packets to be covered by an Ack packet 482 from the client. 484 Some of these data packets are DCCP-DataAcks acknowledging 485 packets from the client. 487 (5) The client sends a DCCP-Ack packet acknowledging the data 488 packets for every Ack Ratio data packets transmitted by the 489 server. Each DCCP-Ack packet uses a sequence number and 490 contains an Ack Vector, as defined in Section 8 on 491 Acknowledgements. These packets also include Confirm options 492 answering any Ack Ratio requests from the server. 494 (6) The server continues sending DCCP-Data packets as controlled by 495 the congestion window. Upon receiving DCCP-Ack packets, the 496 server examines the Ack Vector to learn about marked or dropped 497 data packets, and adjusts its congestion window accordingly, as 498 described in [CCID 2 PROFILE]. Because this is unreliable 499 transfer, the server does not retransmit dropped packets. 501 (7) Because DCCP-Ack packets use sequence numbers, the server has 502 direct information about the fraction of lost or marked DCCP-Ack 503 packets. The server responds to lost or marked DCCP-Ack packets 504 by modifying the Ack Ratio sent to the client, as described in 505 [CCID 2 PROFILE]. Under certain conditions, the server must 506 acknowledge some of the client's acknowledgements; see Section 507 8.1 for more information. 509 (8) The server estimates round-trip times and calculates a TimeOut 510 (TO) value much as the RTO (Retransmit Timeout) is calculated in 511 TCP. Again, the specification for this is in [CCID 2 PROFILE]. 512 The TO is used to determine when a new DCCP-Data packet can be 513 transmitted when the server has been limited by the congestion 514 window and no feedback has been received from the client. 
516 (9) Each DCCP-Data, DCCP-DataAck, and DCCP-Ack packet is sent as 517 ECN-Capable, with either the ECT(0) or the ECT(1) codepoint set, 518 as described in [ECN NONCE]. The client echoes the accumulated 519 ECN Nonce for the server's packets along with its Ack Vector 520 options. 522 (10) 523 The DCCP-CloseReq, DCCP-Close, and DCCP-Reset packets to close 524 the connection are as in the example above. 526 4.1.2. DCCP with TFRC Congestion Control 528 This example is of a connection where both half-connections use TFRC 529 Congestion Control, specified by CCID 3 [CCID 3 PROFILE]. 531 (1) The DCCP-Request and DCCP-Response packets specifying the use of 532 CCID 3 and the initial DCCP-DataAck packet are similar to those 533 in the CCID 2 example above. 535 (2) The server sends DCCP-Data packets, where the number of packets 536 sent is governed by an allowed transmit rate, as in TFRC. The 537 details of the allowed transmit rate are defined in the profile 538 for CCID 3, which is a separate document [CCID 3 PROFILE]. Each 539 DCCP-Data packet has a sequence number and a window counter 540 option. 542 Some of these data packets are DCCP-DataAck packets 543 acknowledging packets from the client, but for simplicity we 544 will not discuss the half-connection of data from the client to 545 the server in this example. 547 (3) The client sends DCCP-Ack packets at least once per round-trip 548 time acknowledging the data packets, unless the server is 549 sending at a rate of less than one packet per RTT, as specified 550 by [CCID 3 PROFILE]. These acknowledgements may be piggybacked 551 on data packets, producing DCCP-DataAck packets. Each DCCP-Ack 552 packet uses a sequence number and identifies the most recent 553 packet received from the server. Each DCCP-Ack packet includes 554 feedback about the loss event rate calculated by the client, as 555 specified by [CCID 3 PROFILE]. 
557 (4) The server continues sending DCCP-Data packets as controlled by 558 the allowed transmit rate. Upon receiving DCCP-Ack packets, the 559 server updates its allowed transmit rate as specified by [CCID 3 560 PROFILE]. 562 (5) The server estimates round-trip times and calculates a TimeOut 563 (TO) value much as the RTO (Retransmit Timeout) is calculated in 564 TCP. Again, the specification for this is in [CCID 3 PROFILE]. 566 (6) The use of ECN follows TCP-like Congestion Control, above, and 567 is described further in [CCID 3 PROFILE]. 569 (7) The DCCP-CloseReq, DCCP-Close, and DCCP-Reset packets to close 570 the connection are as in the examples above. 572 5. Packet Formats 574 5.1. Generic Packet Header 576 All DCCP packets begin with a generic DCCP packet header: 578 0 1 2 3 579 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 580 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 581 | Source Port | Dest Port | 582 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 583 | Type | Res | Sequence Number | 584 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 585 | Data Offset | # NDP | Cslen | Checksum | 586 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 588 Source and Destination Ports: 16 bits each 589 These fields identify the connection. Packets sent on the other 590 sequence switch the source and destination port values. 592 Type: 4 bits 593 The type field specifies the type of the DCCP message. The 594 following values are defined: 596 0 DCCP-Request packet. 598 1 DCCP-Response packet. 600 2 DCCP-Data packet. 602 3 DCCP-Ack packet. 604 4 DCCP-DataAck packet. 606 5 DCCP-CloseReq packet. 608 6 DCCP-Close packet. 610 7 DCCP-Reset packet. 612 8 DCCP-Move packet. 614 Reserved (Res): 4 bits 615 This field is reserved for future expansion. The version of DCCP 616 specified here MUST set the field to all zeroes on generated 617 packets, and ignore its value on received packets. 

   Sequence Number: 24 bits
      The sequence number field is initialized by a DCCP-Request or
      DCCP-Response packet, and increases by one (modulo 16777216)
      with every packet sent. The receiver uses this information to
      determine whether packet losses have occurred. Even packets
      containing no data update the sequence number. Sequence numbers
      also provide some protection against old and malicious packets;
      see Section 5.2 on sequence number validity.

      Very-high-rate DCCPs may need protection against wrapped
      sequence numbers. For example, a 10 Gb/s flow of 1500-byte DCCP
      packets will send 2^24 packets in about 20 seconds. This is a
      long time, in terms of likely round-trip times that could
      possibly achieve such a sustained rate, but it is not without
      risk. However, we leave the design of mechanisms to protect
      against wrapped sequence numbers for future work. In particular,
      if it is decided that very large packet sizes are better than
      very large congestion windows for very-high-bandwidth flows,
      then 24 bits may be enough.

   Data Offset: 8 bits
      The offset from the start of the DCCP header to the beginning of
      the packet's payload, measured in 32-bit words.

   Number of Non-Data Packets (# NDP): 4 bits
      DCCP sets this field to the number of non-data packets it has
      sent so far on its sequence, modulo 16. A non-data packet is
      simply any packet not containing user data; DCCP-Ack packets are
      the canonical example. When sending a non-data packet, DCCP
      increments the # NDP counter before storing its value in the
      packet header.

      This field can help the receiving DCCP decide whether a lost
      packet contained any user data. (An application may want to know
      when it has lost data. DCCP could report every packet loss as a
      potential data loss, but that would cause false loss reports
      when non-data packets were lost.) For example, say that packet
      10 had # NDP set to 5; packet 11 was lost; and packet 12 had
      # NDP set to 5. Then the receiving DCCP could deduce that packet
      11 contained data, since # NDP did not change. Likewise, if
      # NDP had gone up to 6 (and packet 12 contained user data), then
      packet 11 must not have contained any data.

   Checksum Length (Cslen): 4 bits
      The checksum length field specifies what parts of the packet are
      covered by the checksum field. The checksum always covers at
      least the DCCP header, DCCP options, and a pseudoheader taken
      from the network-layer header (see below). If the checksum
      length field is zero, that is all the checksum covers. If the
      field is 15, the checksum covers the packet's payload as well,
      possibly with 8 bits of zero padding on the right to pad the
      payload to an even number of bytes. Values between 1 and 14,
      inclusive, indicate that the checksum additionally covers the
      indicated number of initial 32-bit words of the packet's
      payload, padded on the right with zeros as necessary. Values
      other than 15 specify that corruption is acceptable in some or
      all of the DCCP packet's payload, since DCCP will not even
      detect any corruption there. The meaning of values other than 0
      and 15 should be considered experimental. (The checksum length
      field was inspired by UDP-Lite [UDP-LITE].)

   Checksum: 16 bits
      DCCP uses the TCP/IP checksum algorithm. The checksum field
      equals the 16 bit one's complement of the one's complement sum
      of all 16 bit words in the DCCP header, DCCP options, a
      pseudoheader taken from the network-layer header, and, depending
      on the value of the checksum length field, some or all of the
      payload. When calculating the checksum, the checksum field
      itself is treated as 0.
      If a packet contains an odd number of header and text octets to
      be checksummed, 8 zero bits are added on the right to form a 16
      bit word for checksum purposes. The pad octet is not transmitted
      as part of the packet.

      The pseudoheader is calculated as for TCP. For IPv4, it is 96
      bits long, and consists of the IPv4 source and destination
      addresses, the IP protocol number for DCCP (padded on the left
      with 8 zero bits), and the DCCP length (the length of the DCCP
      header with options, plus the length of any data); see Section
      3.1 of [RFC 793]. For IPv6, it is 320 bits long, and consists of
      the IPv6 source and destination addresses, the DCCP length as a
      32-bit quantity, and the IP protocol number for DCCP (padded on
      the left with 24 zero bits); see Section 8.1 of [RFC 2460].

      Packets with invalid checksums MUST be dropped. In particular,
      their options MUST NOT be processed.

5.2. Sequence Number Validity

   DCCP SHOULD ignore packets with invalid sequence numbers, which may
   arise if the network delivers a very old packet or an attacker
   attempts to hijack a connection. TCP solves this problem with its
   window. In DCCP, however, the definition of "invalid sequence
   number" is complicated because sequence numbers change with each
   packet sent, even pure acknowledgements. Thus, a loss event that
   dropped many consecutive packets could cause two DCCPs to get out of
   sync relative to any window.

   DCCP uses Loss Window and Identification mechanisms to determine
   whether a given packet's sequence number is valid. Each HC-Sender
   gives the corresponding HC-Receiver a loss window width W; see
   Section 6.9. This reflects how many packets the sender expects to be
   in flight. Only the sender can anticipate this number. One good
   guideline is to set it to about 3 or 4 times the maximum number of
   packets the sender expects to send in any round-trip time.
   Too-small values increase the risk of the endpoints getting out of
   sync after bursts of loss; too-large values increase the risk of
   connection hijacking. W defaults to 1000. The Identification
   mechanism is used to get back into sync when more than W consecutive
   packets are lost.

   The HC-Receiver sets up a loss window of W consecutive sequence
   numbers containing GSN, the Greatest Sequence Number it has received
   on any valid packet from the sender. ("Consecutive" and "greatest"
   are measured in circular sequence space. The receiver may center the
   loss window on GSN, or arrange it asymmetrically.) Sequence numbers
   outside this loss window are invalid. Packets with invalid sequence
   numbers are themselves invalid, *unless* each of the following
   conditions is true:

   (1) No valid packet has been received recently (for instance, for
       at least one round-trip time).

   (2) The packet includes a correct Identification or Challenge option
       (see Section 6.4.3).

   The receiving DCCP SHOULD ignore invalid packets---that is, it
   should not pass any enclosed data to the application, update its
   congestion control state, or close the connection. However, the
   receiving DCCP MAY send a DCCP-Ack packet to the sender, as allowed
   by the congestion control mechanism in use. This packet should
   contain the last received valid sequence number and a Challenge
   option (Section 6.4.4). The other DCCP will send an Identification
   option to resync.

   A DCCP endpoint MAY implement rate limits to reduce the likelihood
   of denial-of-service attack. In particular, it MAY ignore all
   packets with bad sequence numbers---even those containing
   Identification or Challenge options---for some amount of time, on
   the order of one round-trip time, after receiving a packet with an
   invalid Identification or Challenge option; and it MAY rate-limit
   the Challenge options it sends.

5.3. DCCP State Diagram

   In this section we present a DCCP state diagram showing how a DCCP
   connection should progress, and the proper responses for packets or
   timeout events in various connection states. The state diagram is
   illustrative; the text should be considered definitive.

   +-----------------------------------+
   | Figures omitted from text version |
   +-----------------------------------+

   All receive events on the diagram represent receipt of valid
   packets. For example, receiving a Reset with a bad Acknowledgement
   Number should not cause DCCP to transition to the Time-Wait state.
   Furthermore, packets without explicit transitions in the state
   diagram should be treated as invalid. DCCP implementations MAY send
   "Invalid Packet" Resets, or Acks, as described above, in response to
   invalid packets. Any such responses MUST be rate-limited.

   The OPEN state does not signify that a DCCP connection is ready for
   data transfer. In particular, incomplete feature negotiations might
   prevent data transfer. Feature negotiation takes place in parallel
   with the state transitions on this diagram.

   Only the server may take the transition from the OPEN state to the
   SERVER-CLOSE state. (The server is the DCCP endpoint that began in
   the LISTEN state.) Similarly, only the client may transition to
   CLIENT-CLOSE after receiving a CloseReq packet.

5.4. DCCP-Request Packet Format

   A DCCP connection is initiated by sending a DCCP-Request packet.
   The format of a DCCP-Request packet is:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /                Generic DCCP Header (12 octets)                /
   /                  with Type=0 (DCCP-Request)                   /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         Service Name                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      Options / [padding]                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                             data                              |
   |                              ...                              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The Service Name field, in combination with the Destination Port,
   identifies the service to which the sender is trying to connect.
   Service Names are 32-bit numbers allocated by the IETF; they are
   meant to correspond to application services and protocols. The host
   operating system MAY force every DCCP socket, both actively and
   passively opened, to specify a Service Name. The connection will
   succeed only if the Destination Port on the receiver has the same
   Service Name as that given in the packet. If they differ, the
   receiver will respond with a DCCP-Reset packet (with Reason set to
   "Bad Service Name").

   The DCCP-Request packet initializes the client-to-server sequence
   number. As in TCP, this sequence number should be chosen randomly
   to help prevent connection hijacking.

   Options
      DCCP-Request packets will usually include a "Change(Connection
      Nonce)" option, to inform the server of the client's connection
      nonce; see Section 6.4.

5.5. DCCP-Response Packet Format

   In the second phase of the three-way handshake, the server sends a
   DCCP-Response message to the client. The response initializes the
   server-to-client sequence number. As in TCP, this sequence number
   should be chosen randomly to help prevent connection hijacking.
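
   As a rough illustration of how the generic header from Section 5.1
   and the Service Name field fit together, the following Python sketch
   packs an option-less DCCP-Request over IPv4 with a randomly chosen
   initial sequence number and a checksum covering the whole packet
   (Cslen of 15). The function names, the raw 4-byte address arguments,
   and the ip_proto parameter are assumptions made for this example;
   the IP protocol number for DCCP is not given in this section, so the
   caller has to supply it.

```python
import secrets
import struct

DCCP_REQUEST = 0  # Type value for DCCP-Request (Section 5.1)


def internet_checksum(data: bytes) -> int:
    """Return the 16-bit one's complement of the one's complement sum."""
    if len(data) % 2:
        data += b"\x00"  # pad octet is used for the sum only, never sent
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def pack_generic_header(src_port, dst_port, ptype, seqno,
                        data_offset, ndp, cslen, checksum=0):
    """Pack the 12-octet generic header; the Res nibble is sent as zero."""
    return (struct.pack("!HHB", src_port, dst_port, (ptype & 0xF) << 4)
            + (seqno & 0xFFFFFF).to_bytes(3, "big")
            + struct.pack("!BBH", data_offset,
                          ((ndp & 0xF) << 4) | (cslen & 0xF), checksum))


def build_request(src_ip, dst_ip, ip_proto, src_port, dst_port,
                  service_name, ndp=0, payload=b""):
    """Build an option-less DCCP-Request for IPv4 (hypothetical helper)."""
    seqno = secrets.randbelow(1 << 24)   # random 24-bit initial sequence number
    body = struct.pack("!I", service_name) + payload
    data_offset = (12 + 4) // 4          # generic header + Service Name, in words
    length = 12 + len(body)              # DCCP length used in the pseudoheader
    # 96-bit IPv4 pseudoheader: addresses, zero octet, protocol, DCCP length.
    pseudo = src_ip + dst_ip + struct.pack("!BBH", 0, ip_proto, length)

    def packet(csum):
        return pack_generic_header(src_port, dst_port, DCCP_REQUEST, seqno,
                                   data_offset, ndp, 15, csum) + body

    # The checksum field is treated as 0 while the checksum is computed.
    return packet(internet_checksum(pseudo + packet(0)))
```

   A receiver can check such a packet by running internet_checksum over
   the pseudoheader followed by the packet as received; a result of 0
   indicates a valid checksum.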

   In this phase, a server will often specify the options it would like
   to use, either from among those the client requested, or in addition
   to those. Among these options is the congestion control mechanism
   the server expects to use.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /                Generic DCCP Header (12 octets)                /
   /                  with Type=1 (DCCP-Response)                  /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   Reserved    |            Acknowledgement Number             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      Options / [padding]                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                             data                              |
   |                              ...                              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Acknowledgement Number: 24 bits
      The acknowledgement number field acknowledges the largest valid
      sequence number received so far on this connection. (The usual
      care must be taken in case of wrapped sequence numbers.) In the
      case of a DCCP-Response packet, the acknowledgement number field
      will equal the sequence number from the DCCP-Request.
      Acknowledgement numbers make no attempt to provide precise
      information about which packets have arrived; options such as
      the Ack Vector do this.

      Some care is required in defining when a packet is "received"
      for purposes of acknowledgement. A packet has not been received
      until the receiving DCCP can guarantee that the packet's
      contents, if any, are under the application's control. For
      packets with data, this means that the data on an acknowledged
      packet MUST NOT be dropped from the receive buffer without
      explicit application intervention.
      If the receiving DCCP cannot guarantee this property on a
      packet---perhaps because of particulars of its receive
      buffer---then it MUST NOT acknowledge that packet as received,
      or process that packet's options.

      Any packet without data is "received" as soon as the receiving
      DCCP determines its validity. This implies that feature
      negotiations should probably use dedicated DCCP-Ack packets,
      rather than be piggybacked on DCCP-Data and DCCP-DataAck
      packets, since data packets might have their option processing
      arbitrarily delayed by receive buffer issues.

      DCCP-Response packets represent an exception to this rule. A
      DCCP that drops the data on a DCCP-Request SHOULD acknowledge
      that request and process its options, but include a Data
      Discarded option on its response (Section 6.5).

      This issue is discussed in more detail in Section 8.5.

   Reserved: 8 bits
      The version of DCCP specified here MUST set this field to all
      zeroes on generated packets, and ignore its value on received
      packets.

   Options
      The Data Discarded and Init Cookie options are particularly
      designed for DCCP-Response packets (Sections 6.5 and 6.6). In
      addition, DCCP-Response, or early DCCP-Data or DCCP-Ack packets,
      will often include "Confirm(Connection Nonce)" and
      "Change(Connection Nonce)" options, to further negotiate
      connection nonces (Section 6.4).

   The receiver MAY respond to a DCCP-Request packet with a DCCP-Reset
   packet to refuse the connection. Valid Reasons for refusing a
   connection include "Connection Refused", for when the DCCP-Request's
   Destination Port did not correspond to a DCCP port open for
   listening; "Bad Service Name", when the DCCP-Request's Service Name
   did not correspond to the service name registered with the
   Destination Port; and "Too Busy", when the server is currently too
   busy to respond to requests.

5.6. DCCP-Data, DCCP-Ack, and DCCP-DataAck Packet Formats

   The payload of a DCCP connection is sent in DCCP-Data and
   DCCP-DataAck packets, while DCCP-Ack packets are used for
   acknowledgements when there is no payload to be sent. DCCP-Data
   packets look like this:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /                Generic DCCP Header (12 octets)                /
   /                    with Type=2 (DCCP-Data)                    /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      Options / [padding]                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                             data                              |
   |                              ...                              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   DCCP-Ack packets dispense with the data, but contain an
   acknowledgement number:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /                Generic DCCP Header (12 octets)                /
   /                    with Type=3 (DCCP-Ack)                     /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   Reserved    |            Acknowledgement Number             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      Options / [padding]                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   DCCP-DataAck packets contain both data and an acknowledgement
   number: acknowledgement information is piggybacked on a data packet.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /                Generic DCCP Header (12 octets)                /
   /                  with Type=4 (DCCP-DataAck)                   /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   Reserved    |            Acknowledgement Number             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      Options / [padding]                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                             data                              |
   |                              ...                              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   DCCP-Ack and DCCP-DataAck packets may include additional
   acknowledgement options, such as Ack Vector, as required by the
   congestion control mechanism in use.

   DCCP A sends DCCP-Data and DCCP-DataAck packets to DCCP B due to
   application events on host A. These packets are
   congestion-controlled by the CCID for the A-to-B half-connection. In
   contrast, DCCP-Ack packets sent by DCCP A are controlled by the CCID
   for the B-to-A half-connection. Generally, DCCP A will piggyback
   acknowledgement information on data packets when acceptable,
   creating DCCP-DataAck packets. DCCP-Ack packets are used when there
   is no data to send from DCCP A to DCCP B, or when the link from A to
   B is completely congested (so sending data would be inappropriate).

   Section 8, below, describes acknowledgements in DCCP.

   A DCCP-Data or DCCP-DataAck packet may contain no data if the
   application sends a zero-length datagram. Such zero-length datagrams
   MUST be reported to the receiving application.

5.7. DCCP-CloseReq and DCCP-Close Packet Format

   The DCCP-CloseReq and DCCP-Close packets have the same format.
   However, only the server can send a DCCP-CloseReq packet. Either
   client or server may send a DCCP-Close packet.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /                Generic DCCP Header (12 octets)                /
   /        with Type=5 or 6 (DCCP-CloseReq or DCCP-Close)         /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   Reserved    |            Acknowledgement Number             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      Options / [padding]                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

5.8. DCCP-Reset Packet Format

   DCCP-Reset packets unconditionally shut down a connection. Every
   connection shutdown sequence ends with a DCCP-Reset, but resets may
   be sent for other reasons, including bad port numbers, bad option
   behavior, incorrect ECN Nonce Echoes, and so forth. The reason for a
   reset is represented in the reset itself by a one-byte number, the
   Reason field.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /                Generic DCCP Header (12 octets)                /
   /                   with Type=7 (DCCP-Reset)                    /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   Reserved    |            Acknowledgement Number             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    Reason     |    Data 1     |    Data 2     |    Data 3     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                      Options / [padding]                      |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Reason: 8 bits
      The Reason field represents the reason that the sender reset the
      DCCP connection.

   Data 1, Data 2, and Data 3: 8 bits each
      The Data fields provide additional information about why the
      sender reset the DCCP connection. The meanings of these fields
      depend on the value of Reason.

   The following Reasons are currently defined. The "Data" columns
   describe what the Data fields should contain for a given Reason. In
   those columns, N/A means the Data field SHOULD be set to 0 by the
   sender of the DCCP-Reset, and ignored by its receiver.

                                                            Section
   Reason  Name                   Data 1   Data 2   Data 3  Reference
   ------  ----                   ------   ------   ------  ---------
      0    Unspecified            N/A      N/A      N/A
      1    Closed                 N/A      N/A      N/A     4
      2    Invalid Packet         packet   N/A      N/A     5.3
                                  type
      3    Option Error           option   option data
                                  number
      4    Feature Error          feature  feature data
                                  number
      5    Connection Refused     N/A      N/A      N/A     5.5
      6    Bad Service Name       N/A      N/A      N/A     5.4
      7    Too Busy               N/A      N/A      N/A     5.5
      8    Bad Init Cookie        N/A      N/A      N/A     6.6
      9    Invalid Move           N/A      N/A      N/A     5.9
     10    Unanswered Challenge   N/A      N/A      N/A     6.4.4
     11    Fruitless Negotiation  feature  N/A      N/A     6.3.7
                                  number

5.9. DCCP-Move Packet Format

   The DCCP-Move packet type is part of DCCP's support for multihoming
   and mobility, which is described further in Section 10. DCCP A sends
   a DCCP-Move packet to DCCP B after changing its address and/or port
   number. The DCCP-Move packet requests that DCCP B start sending its
   data to the new address and port number. The old address and port
   are stored explicitly in the DCCP-Move packet header; the new
   address and port come from the network header and generic DCCP
   header. The type of address contained in the packet is indicated
   explicitly by an Old Address Family field. The Sequence Number and
   Acknowledgement Number fields and a mandatory Identification option
   provide some protection against hijacked connections. See Section 10
   for more on security and DCCP's mobility support.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /                Generic DCCP Header (12 octets)                /
   /                    with Type=8 (DCCP-Move)                    /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   Reserved    |            Acknowledgement Number             |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |      Old Address Family       |           Old Port            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /                          Old Address                          /
   /                               /           [padding]           /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |         Options, including Identification / [padding]         |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                             data                              |
   |                              ...                              |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Old Address Family: 16 bits
      The Old Address Family field indicates the address family
      formerly used for this connection, and takes values from the
      Address Family Numbers registry administered by IANA. Particular
      values include 1 for IPv4 and 2 for IPv6. The endpoint MUST
      discard DCCP-Move packets with unrecognized Old Address Family
      values.

   Old Port: 16 bits
      The former port number used by DCCP A's endpoint.

   Old Address: at least 32 bits
      The former address used by DCCP A's endpoint, padded on the
      right to a multiple of 32 bits. The form and size of the address
      are determined by the Old Address Family field. For instance, if
      Old Address Family is 1, then Old Address contains an IPv4
      address and takes 32 bits; if it is 2, then Old Address contains
      an IPv6 address and takes 128 bits.

   Options
      Every DCCP-Move packet MUST include a valid Identification
      option (see Section 6.4).
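
   The fixed part of a DCCP-Move packet described above can be decoded
   along the following lines. This is only a sketch: parse_move_fields
   and AF_SIZES are hypothetical names, only the two address families
   listed above are handled, and option processing (including the
   mandatory Identification option) is left out.

```python
import ipaddress
import struct

# Address Family Number -> address length in octets (values from the text).
AF_SIZES = {1: 4, 2: 16}


def parse_move_fields(packet: bytes):
    """Decode the fixed DCCP-Move fields that follow the generic header.

    Returns (ack_number, old_address, old_port, options_start), where
    options_start is the octet offset at which the options area begins.
    """
    if len(packet) < 20:
        raise ValueError("packet too short for a DCCP-Move header")
    # Octet 12 is Reserved; octets 13-15 hold the 24-bit Acknowledgement Number.
    ack_number = int.from_bytes(packet[13:16], "big")
    family, old_port = struct.unpack_from("!HH", packet, 16)
    size = AF_SIZES.get(family)
    if size is None:
        # Unrecognized Old Address Family: the packet is to be discarded.
        raise ValueError(f"unrecognized Old Address Family {family}")
    options_start = 20 + size
    if len(packet) < options_start:
        raise ValueError("packet truncated inside Old Address")
    old_address = ipaddress.ip_address(packet[20:options_start])
    return ack_number, old_address, old_port, options_start
```

   Raising ValueError is simply this sketch's way of signalling that
   the packet should be dropped; a real implementation would also
   validate the generic header and the Identification option before
   acting on the move.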

   DCCP B SHOULD respond to the DCCP-Move with a DCCP-Reset (with
   Reason set to "Invalid Move") if neither the Old Address/Old Port
   combination nor the network address/Source Port combination refers
   to a currently active DCCP connection, or if the Identification
   option is not present or invalid. In either case, DCCP B MAY ignore
   subsequent DCCP-Move packets for a short period of time, such as one
   round-trip time. This protects DCCP B against denial-of-service
   attacks from floods of invalid DCCP-Moves.

   DCCP B MUST respond to the DCCP-Move packet with a DCCP-Ack or
   DCCP-DataAck packet acknowledging the move. If this acknowledgement
   is lost, DCCP A might resend the DCCP-Move packet (using a new
   sequence number). DCCP B MUST NOT send a DCCP-Reset in response to
   such packets, even though the Old Address/Old Port combination no
   longer refers to a valid DCCP connection. It SHOULD instead send
   another acknowledgement, as allowed by the congestion control
   mechanism in use.

   We note that DCCP mobility, as provided by DCCP-Move, may not be
   useful in the context of IPv6, with its mandatory support for Mobile
   IP.

6. Options and Features

   All DCCP packets may contain options which can be used to extend
   DCCP's functionality. Options occupy space at the end of the DCCP
   header and are a multiple of 8 bits in length. All options are
   included in the checksum. An option may begin on any byte boundary.

   The first octet of an option is the option type. Options with types
   0 through 31 are single-byte options. Other options are followed by
   an octet indicating the option's length. This length includes the
   two octets of option-type and option-length as well as the
   option-data octets.
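
   The encoding rules above (single-byte options for types 0 through
   31, a length octet for all others) suggest a simple loop for walking
   the options area. The sketch below is one possible reading, not a
   normative parser; parse_options is a hypothetical helper, and
   treating a length below 2 or one that runs past the end of the
   options area as an error is an assumption, since the text above does
   not say how malformed options are handled.

```python
def parse_options(region: bytes):
    """Split a DCCP options area into a list of (type, data) pairs."""
    options = []
    i = 0
    while i < len(region):
        opt_type = region[i]
        if opt_type < 32:
            # Types 0 through 31 are single-byte options with no data.
            options.append((opt_type, b""))
            i += 1
            continue
        if i + 1 >= len(region):
            raise ValueError("option truncated before its length octet")
        length = region[i + 1]
        # The length counts the type and length octets themselves.
        if length < 2 or i + length > len(region):
            raise ValueError("option length out of range")
        options.append((opt_type, region[i + 2:i + length]))
        i += length
    return options
```

   Padding (type 0) shows up in the result like any other single-byte
   option, so a caller that only cares about meaningful options can
   filter it out afterwards.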

   The following options are currently defined:

   Option                                   Section
   Type     Length    Meaning               Reference
   ----     ------    -------               ---------
     0         1      Padding                 6.1
     1         1      Data Discarded          6.5
     2         1      Slow Receiver           8.6
     3         1      Buffer Closed           8.8
    32         4      Ignored                 6.2
    33      variable  Change                  6.3
    34      variable  Prefer                  6.3
    35      variable  Confirm                 6.3
    36      variable  Init Cookie             6.6
    37      variable  Ack Vector [Nonce 0]    8.5
    38      variable  Ack Vector [Nonce 1]    8.5
    39         3      Receive Buffer Drops    8.7
    40         6      Timestamp               6.7
    41        10      Timestamp Echo          6.8
    42      variable  Identification          6.4.3
    44      variable  Challenge               6.4.4
   128-255  variable  CCID-Specific Options   7.4

6.1. Padding Option

   The padding option, with type 0, is a single byte option used to pad
   between or after options. It either ensures the payload begins on a
   32-bit boundary (as required), or ensures alignment of following
   options (not mandatory).

   +--------+
   |00000000|
   +--------+
    Type=0

6.2. Ignored Option

   The Ignored option, with type 32, signals that a DCCP did not
   understand some option. This can happen, for example, when a
   conventional DCCP converses with an extended DCCP. Each Ignored
   option has two octets of payload, the first containing the offending
   option type and the second containing the first octet of the
   offending option's payload. (If the offending option had no payload,
   this octet is 0.)

   +--------+--------+--------+--------+
   |00100000|00000100|Opt Type|Opt Data|
   +--------+--------+--------+--------+
    Type=32  Length=4

6.3. Feature Negotiation

   DCCP contains a mechanism for reliably negotiating features, notably
   the congestion control mechanism in use on each half-connection. The
   motivation is to implement reliable feature negotiation once, so
   that different options need not reinvent that particular wheel.

   Three options, Change, Prefer, and Confirm, implement feature
   negotiation. Change is sent to a feature's location, asking it to
   change the feature's value. The feature location may respond with
   Prefer, which asks the other endpoint to Change again with different
   values, or it may change the feature value and acknowledge the
   request with Confirm. (The options were formerly called Ask, Choose,
   and Answer.)

   Features MUST NOT change values apart from feature negotiation, and
   enforced retransmissions make feature negotiation reliable. This
   ensures that both endpoints eventually agree on every feature's
   value.

   Some features are non-negotiable, meaning that the feature location
   MUST set its value to whatever the other endpoint requests. For
   non-negotiable features, the feature location MUST respond to Change
   options with Confirm; Prefer is not useful. These features use the
   feature framework simply to achieve reliability.

   Negotiations for multiple features may take place simultaneously.
   For instance, a packet may contain multiple Change options that
   refer to different features.

   Feature negotiation should generally take place using packet types
   such as DCCP-Ack that carry no user data, since features may affect
   how data will be treated.

6.3.1. Feature Numbers

   The first data octet of every Change, Prefer, or Confirm option is a
   feature number, defining the type of feature being negotiated. The
   remainder of the data gives one or more values for the feature, and
   is interpreted according to the feature. The current set of feature
   numbers is as follows:

                                               Section
   Number   Meaning                   Neg.?    Reference
   ------   -------                   -----    ---------
      1     Congestion Control (CC)     Y        7
      2     ECN Capable                 Y        9.1
      3     Ack Ratio                   N        8.3
      4     Use Ack Vector              Y        8.4
      5     Mobility Capable            Y        10.1
      6     Loss Window                 N        6.9
      7     Connection Nonce            N        6.4.2
      8     Identification Regime       Y        6.4.1
   128-255  CCID-Specific Features      ?        7.4

   The "Neg.?" column is "Y" for normal features and "N" for
   non-negotiable features.

6.3.2. Change Option

   DCCP B sends a Change option to DCCP A to ask it to change the value
   of some feature. (DCCP A is the feature location.) DCCP A MUST
   respond to the Change option with either Prefer or Confirm. DCCP B
   MUST retransmit the Change option until it receives some relevant
   response. DCCP B will always generate a Change option in response to
   a Prefer option; it may also generate a Change option due to some
   application event.

   +--------+--------+--------+--------+--------+--------
   |00100001| Length |Feature#| Value or Values ...
   +--------+--------+--------+--------+--------+--------
    Type=33

6.3.3. Prefer Option

   DCCP A sends a Prefer option to DCCP B to ask it to choose another
   value for some feature. (Again, DCCP A is the feature location.)
   DCCP B MUST respond to the Prefer option with a Change. DCCP A MUST
   retransmit the Prefer option until it receives a relevant Change
   response. DCCP A may generate a Prefer option in response to some
   Change option, or in response to some application event. Prefer
   options are not useful for non-negotiable features.

   +--------+--------+--------+--------+--------+--------
   |00100010| Length |Feature#| Value or Values ...
   +--------+--------+--------+--------+--------+--------
    Type=34

6.3.4. Confirm Option

   DCCP A sends a Confirm option to DCCP B to inform it of the current
   value of some feature. (Again, DCCP A is the feature location.)
   DCCP A MUST generate Confirm options only in response to Change
   options. DCCP A need not ever retransmit a Confirm option: DCCP B
   will retransmit the relevant Change as necessary.

   +--------+--------+--------+--------+--------+--------
   |00100011| Length |Feature#| Value ...
   +--------+--------+--------+--------+--------+--------
    Type=35

6.3.5. Example Negotiations

   This section demonstrates several negotiations of the congestion
   control feature for the A-to-B half-connection. (This feature is
   located at DCCP A.) In this sequence of packets, DCCP A is happy
   with DCCP B's suggestion of CC mechanism 2:

      B > A  Change(CC, 2)
      A > B  Confirm(CC, 2)

   Here, A and B jointly settle on CC mechanism 5:

      B > A  Change(CC, 3, 4)
      A > B  Prefer(CC, 1, 2, 5)
      B > A  Change(CC, 5)
      A > B  Confirm(CC, 5)

   In this sequence, A refuses to use CC mechanism 5. If B requires CC
   mechanism 5, its only recourse is to abort the connection, via a
   DCCP-Reset packet with Reason set to "Fruitless Negotiation":

      B > A  Change(CC, 3, 4, 5)
      A > B  Prefer(CC, 1, 2)
      B > A  Change(CC, 5)
      A > B  Prefer(CC, 1, 2)

   Here, A elicits agreement from B that it is satisfied with
   congestion control mechanism 2:

      A > B  Prefer(CC, 1, 2)
      B > A  Change(CC, 2)
      A > B  Confirm(CC, 2)

6.3.6. Unknown Features

   If a DCCP receives a Change or Prefer option referring to a feature
   number it does not understand, it MUST respond with an Ignored
   option. This informs the remote DCCP that the local DCCP does not
   implement the feature. No other action need be taken. (Ignored may
   also indicate that the DCCP endpoint could not respond to a
   CCID-specific feature request because the CCID was in flux; see
   Section 7.4.)

6.3.7. State Diagram

   These state diagrams present the legal transitions in a DCCP feature
   negotiation.
They define DCCP's states and transitions with respect 1342 to the negotiation of a single feature it understands. There are two 1343 diagrams, corresponding to the two endpoints: the feature location 1344 DCCP A, and what we call the "feature requester", DCCP B. 1346 Transitions between states are triggered by receiving a packet 1347 ("RECV") or by an application event ("APP"). Received packets are 1348 further distinguished by any options relevant to the feature being 1349 negotiated. "RECV -" means the packet contained no relevant option. 1350 "RECV Chg" denotes a Change option, "RECV Pr" a Prefer option, and 1351 "RECV Con" a Confirm option. The data contained in an option is 1352 given in parentheses when necessary. The "SEND" action indicates 1353 which option the DCCP will send next. Finally, the "SET-VALUE" 1354 action causes the DCCP to change its value for the relevant feature. 1356 "SEND" does not force DCCP to immediately generate a packet; rather, 1357 it says which feature option must be sent on the next packet 1358 generated. A DCCP MAY choose to generate a packet in response to 1359 some "SEND" action. However, it MUST NOT generate a packet if doing 1360 so would violate the congestion control mechanism in use. 1362 The requester, DCCP B, has four states: Known, Unknown, Failed, and 1363 Changing. Similarly, the feature location, DCCP A, has four states: 1364 Known, Unknown, Failed, and Confirming. In both cases, Known denotes 1365 a state where the DCCP knows the feature's current value, and 1366 believes that the other DCCP agrees. Changing and Confirming denote 1367 states where the DCCPs are in the process of negotiating a new value 1368 for the feature. The Unknown state can occur only at connection 1369 setup time. It denotes a state where the DCCP does not know any 1370 value for the feature, and has not yet entered a negotiation to 1371 determine its value. 
   Finally, the Failed state represents a state where the other DCCP
   does not implement the feature under negotiation.

   A DCCP may start in either the Unknown or Known state, depending on
   the feature in question. In particular, some features have a well-
   known value for new connections, in which case the DCCPs begin the
   connection in the Known states.

                    REQUESTER STATE DIAGRAM (DCCP B)

                         +-----------+
                         |  Unknown  |
                         +-----------+
       +----------+            |                    +-----------+
       |          |RECV -      |RECV -/Pr | APP     |           |RECV Pr/Con
       V          |SEND -      |SEND Chg            V           |SEND Chg
 +-----------+    |            |             +------------+     |
 |           |----+            +------------>|            |-----+
 |   Known   |------------------------------>|  Changing  |
 |           |       RECV Pr | APP           |            |-----+
 +-----------+       SEND Chg                +------------+     |RECV -
       ^                                          | | ^         |SEND -/Chg
       |                                          | | |         |
       +------------------------------------------+ | +---------+
                     RECV Con(O)                    |          +----------+
                     SEND -                         +--------->|  Failed  |
                     SET-VALUE O                    RECV Ign   +----------+
                                                    SEND -

                FEATURE LOCATION STATE DIAGRAM (DCCP A)
 (O represents any feature value acceptable to DCCP A; X is not acceptable.)

        RECV Chg(O)
        SEND Con(O)                     RECV - | APP
        SET-VALUE O      +-----------+  SEND Pr(O)
    +--------------------|  Unknown  |------------+
    |                    +-----------+            |
    |     +-------+            |                  | +-----------+
    |     |       |RECV -      |RECV Chg(X)       | |           |RECV Chg(X)
    V     V       |SEND -      |SEND Pr(O)        V V           |SEND Pr(O)
 +-----------+    |            |             +------------+     | (need not be
 |           |----+            +------------>|            |-----+  the same O)
 |   Known   |------------------------------>| Confirming |
 |           |----+   RECV Chg | APP         |            |-----+
 +-----------+    |   SEND Pr(O)             +------------+     |RECV -
    ^     ^       |                               | | ^         |SEND -/Pr(O)
    |     |       |RECV Chg(O)                    | | |         |
    |     |       |SEND Con(O)                    | | +---------+
    |     |       |SET-VALUE O                    | |
    |     +-------+                               | |         +----------+
    +---------------------------------------------+ +-------->|  Failed  |
                      RECV Chg(O)                   RECV Ign  +----------+
                      SEND Con(O)                   SEND -
                      SET-VALUE O

   This specification allows several choices of action in certain
   states. The implementation will generally use feature-specific
   information to decide how to respond. For example, DCCP A in the
   Known state may respond to a Change option with either Confirm or
   Prefer. If DCCP A is willing to set the feature to the value
   specified by Change, it will generally send Confirm; but if it would
   like to negotiate further, it will send Prefer.

   DCCP B must retransmit Change options, and DCCP A must retransmit
   Prefer options, until receiving a relevant response. However, they
   need not retransmit the option on every packet, as shown by the
   "RECV - / SEND -" transitions in the Changing and Confirming states.

   These state diagrams guarantee safety, but not liveness. Namely, no
   unexpected or erroneous options will be sent, but option negotiation
   might not terminate. For example, the following infinite negotiation
   is legal according to this specification.

       A > B  Prefer(1)
       B > A  Change(2)
       A > B  Prefer(1)
       B > A  Change(2)...

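   The exchanges in Section 6.3.5 and the non-terminating exchange
   above can be reproduced with a small simulation. The following is a
   minimal sketch, not part of the protocol: the function names, the
   use of -1 as a "no agreement" result, and the round limit are all
   illustrative assumptions, and feature values are assumed to be
   non-negative integers. DCCP A confirms the first offered value it
   accepts and otherwise sends Prefer with its own list; DCCP B answers
   a Prefer with the first preferred value it accepts, or repeats its
   own list.

```c
#include <stddef.h>

/* Returns the first entry of wanted[] that also appears in ok[],
   or -1 if the two lists share no value. */
static int first_common(const int *wanted, size_t n_wanted,
                        const int *ok, size_t n_ok)
{
    for (size_t i = 0; i < n_wanted; i++)
        for (size_t j = 0; j < n_ok; j++)
            if (wanted[i] == ok[j])
                return wanted[i];
    return -1;
}

/*
 * Simulates the negotiation of one feature located at DCCP A.
 *   b_first: values in B's initial Change option
 *   b_ok:    every value B is willing to accept
 *   a_ok:    every value A is willing to accept (sent in Prefer)
 * Returns the value A confirms, or -1 if max_rounds Change options
 * were sent without a Confirm (hypothetical cutoff).
 */
int negotiate(const int *b_first, size_t n_first,
              const int *b_ok, size_t n_b,
              const int *a_ok, size_t n_a,
              int max_rounds)
{
    const int *change = b_first;   /* values in B's current Change */
    size_t n_change = n_first;
    int single;                    /* storage for a one-value Change */

    for (int round = 0; round < max_rounds; round++) {
        /* B > A Change(change...): A confirms if it accepts a value. */
        int v = first_common(change, n_change, a_ok, n_a);
        if (v >= 0)
            return v;              /* A > B Confirm(v) */

        /* A > B Prefer(a_ok...): B must answer with another Change. */
        v = first_common(a_ok, n_a, b_ok, n_b);
        if (v >= 0) {
            single = v;
            change = &single;
            n_change = 1;
        } else {
            change = b_ok;
            n_change = n_b;
        }
    }
    return -1;                     /* negotiation did not terminate */
}
```

   With b_first = {3, 4}, b_ok = {3, 4, 5} and a_ok = {1, 2, 5}, the
   sketch follows the second example of Section 6.3.5 and returns 5.
   With b_ok = {2} and a_ok = {1} it never reaches a Confirm and
   returns -1 once the round limit is hit.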
   Implementations may choose to enforce a maximum length on any
   negotiation -- for example, by resetting the connection when any
   negotiation lasts more than some maximum time. The DCCP-Reset Reason
   "Fruitless Negotiation" should be used to signal that a connection
   was aborted because of a negotiation that took too long.

   In the Changing and Confirming states, the value of the
   corresponding feature is in flux. DCCP MAY change its behavior in
   these states---for example, by refusing to send data until
   reentering a Known state.

6.4.  Identification Options

   The Identification options provide a way for DCCP endpoints to
   confirm each other's identities, even after changes of address
   (Section 10) or long bursts of loss that get the endpoints out of
   sync (Section 5.2). Again, DCCP as specified here does not provide
   cryptographic security guarantees, and attackers that can see every
   packet are still capable of manipulating DCCP connections
   inappropriately, but the Identification options make it more
   difficult for some kinds of attacks to succeed.

   The Identification option is used to prove an endpoint's identity.
   An Identification Regime determines how the Identifications are
   calculated. In the default MD5 Regime, the calculation involves an
   MD5 hash over packet data and two Connection Nonces exchanged at the
   beginning of the connection. Finally, a Challenge option is used to
   elicit an Identification from the other endpoint.

6.4.1.  Identification Regime Feature

   Identification Regime has feature number 8. The ID Regime feature
   located at DCCP B specifies the algorithm that DCCP A will use for
   its Identification options. Each endpoint must keep track of both
   its ID regime and, via the ID Regime feature, the regime used by the
   other endpoint.

   The value of ID Regime is a two-byte number, so a Change or Confirm
   ID Regime option takes exactly five bytes: option type, length, and
   feature number, followed by the two-byte value. ID Regime defaults
   to 0, the MD5 Regime. Applications preferring different security
   guarantees, particularly around mobility issues, may prefer to
   implement another identification algorithm and assign it a different
   ID Regime value.

   The ID Regime feature is negotiable, so an endpoint can request that
   the other endpoint use a particular ID Regime by sending a Prefer
   option. If the endpoints cannot agree on mutually acceptable ID
   Regimes, the connection should be reset due to Fruitless
   Negotiation.

6.4.2.  Connection Nonce Feature

   Connection Nonce has feature number 7. The Connection Nonce feature
   located at DCCP B is the value of DCCP A's connection nonce. Each
   endpoint SHOULD keep track of both its nonce and, via the Connection
   Nonce feature, the other endpoint's nonce. Connection Nonces are
   used by Identification Regime 0.

   The Connection Nonce feature takes arbitrary values at least 4
   bytes long. A Change or Confirm Connection Nonce option therefore
   takes at least 7 bytes.

   Connection Nonce defaults to a random 8-byte string. To prevent
   spoofing, this string MUST NOT have any predictable value. For
   example, it MUST NOT be set deterministically to zero, and it MUST
   change on every connection.

   This feature is non-negotiable.

6.4.3.  Identification Option

   The Identification option serves as confirmation that a packet was
   sent by the same endpoint that initiated the DCCP connection. It is
   permitted in any DCCP packet, although it might be useful only after
   the endpoints have exchanged security information such as connection
   nonces. The option takes the following form:

   +--------+--------+--------+--------+--------+--------
   |00101010| Length | Identification Data ...
   +--------+--------+--------+--------+--------+--------
    Type=42

   The particular data included in an Identification option depends on
   the ID Regime in force. The remainder of this section describes ID
   Regime 0, the default MD5 Regime.

   The Identification data provided for the MD5 Regime consists of a
   16-byte MD5 digest of the DCCP packet containing the option (except
   for the Identification Option itself), the value of the sender's
   Connection Nonce, and the value of the other endpoint's Connection
   Nonce, in that order. The total length of the option is therefore 18
   bytes. Inclusion of the two Connection Nonces ensures that
   attackers cannot fake an Identification Option, unless they snooped
   on the beginning of the connection when nonces are exchanged. (No
   mechanism in DCCP protects against snoopers who know Connection
   Nonces, since DCCP does not provide strong cryptographic security
   guarantees; see Section 15.) Inclusion of the packet data protects
   against replay attacks.

   To check an Identification option's value, the receiver simply
   calculates the MD5 digest itself and compares that against the
   option data. The MD5 calculation can be expensive, so an attacker
   could conceivably disable a DCCP endpoint by sending it a flood of
   invalid packets with bad Identification options. Rate limits
   described in Sections 5.2 and 10 mitigate this issue. The receiver
   MAY ignore an Identification option if it occurs on a packet that
   would otherwise be considered valid.

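   On the receiving side, the digest covers the same bytes in the same
   order, so the "sender's nonce" is the peer's nonce and the "other
   endpoint's nonce" is the receiver's own. The sketch below only
   assembles the bytes to be hashed and compares two digests; the MD5
   computation itself is left to whatever library is in use. The
   function names and the flat output buffer are assumptions made for
   illustration, and the comparison without an early exit is a design
   choice of this sketch rather than a requirement of the text above.

```c
#include <stddef.h>
#include <string.h>

#define ID_OPTION_LEN 18   /* type, length, and a 16-byte MD5 digest */

/*
 * Copies into out the bytes covered by the Regime 0 digest: the packet
 * without its Identification option, then the sender's nonce, then the
 * other endpoint's nonce. Returns the number of bytes written, or 0 if
 * the option does not fit in the packet or out is too small.
 */
size_t id_digest_input(const unsigned char *pkt, size_t pkt_len,
                       size_t opt_off,
                       const unsigned char *sender_nonce, size_t sender_len,
                       const unsigned char *other_nonce, size_t other_len,
                       unsigned char *out, size_t out_cap)
{
    if (opt_off > pkt_len || pkt_len - opt_off < ID_OPTION_LEN)
        return 0;

    size_t tail_off = opt_off + ID_OPTION_LEN;
    size_t tail_len = pkt_len - tail_off;
    size_t total = opt_off + tail_len + sender_len + other_len;
    if (total > out_cap)
        return 0;

    unsigned char *p = out;
    memcpy(p, pkt, opt_off);              p += opt_off;   /* before option */
    memcpy(p, pkt + tail_off, tail_len);  p += tail_len;  /* after option  */
    memcpy(p, sender_nonce, sender_len);  p += sender_len;
    memcpy(p, other_nonce, other_len);
    return total;
}

/* Compares two 16-byte digests without stopping at the first
   difference. Returns 1 if they match, 0 otherwise. */
int id_digest_equal(const unsigned char *a, const unsigned char *b)
{
    unsigned char diff = 0;
    for (int i = 0; i < 16; i++)
        diff |= (unsigned char)(a[i] ^ b[i]);
    return diff == 0;
}
```

   A receiver would call id_digest_input() with the peer's nonce as
   sender_nonce and its own nonce as other_nonce, hash the result with
   MD5, and pass that digest and the 16 bytes of option data to
   id_digest_equal().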
   Example C code for constructing the option's value follows:

       #include <openssl/md5.h>  /* assumes an OpenSSL-style MD5 API */

       unsigned char *packet_data;
       int packet_length;
       int id_option_offset;     /* offset of option in packet_data */

       const unsigned char *my_nonce, *other_nonce;
       int my_nonce_length, other_nonce_length;

       MD5_CTX md5_context;

       MD5_Init(&md5_context);
       /* Hash the packet, skipping the 18-byte Identification option */
       MD5_Update(&md5_context, packet_data, id_option_offset);
       MD5_Update(&md5_context, packet_data + id_option_offset + 18,
                  packet_length - id_option_offset - 18);
       /* Then the sender's nonce and the other endpoint's nonce */
       MD5_Update(&md5_context, my_nonce, my_nonce_length);
       MD5_Update(&md5_context, other_nonce, other_nonce_length);
       packet_data[id_option_offset] = 42;     /* option type */
       packet_data[id_option_offset+1] = 18;   /* option length */
       MD5_Final(packet_data + id_option_offset + 2, &md5_context);

6.4.4.  Challenge Option

   This option informs the receiving DCCP that one of its packets was
   ignored, and that succeeding packets will be ignored until the
   endpoint sends a correct Identification option. The receiving DCCP
   SHOULD include an Identification option on the next packet it sends.
   The option takes the following form:

   +--------+--------+--------+--------+--------+--------
   |00101100| Length | Identification Data ...
   +--------+--------+--------+--------+--------+--------
    Type=44

   The Identification Data is the same as for an Identification option,
   above. The receiver SHOULD ignore a Challenge option, and the
   packet that contains it, if the Identification Data is incorrect.
   The purpose of this mechanism is to prevent denial-of-service
   attacks where an attacker could cause the receiver to send many
   packets with expensive-to-compute Identification options, since the
   receiver MAY ignore Challenge options for some time after receiving
   an invalid Challenge.

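   One way to picture the "ignore for some time" behavior is a simple
   hold-off timer. The sketch below is an illustration only: the state
   structure, the function names, and the hold-off length are all
   hypothetical, and time is counted in whatever units the caller
   uses. The point of checking the timer first is that the receiver
   can skip the digest computation entirely while it is ignoring
   Challenges.

```c
#include <stdbool.h>

#define CHALLENGE_HOLDOFF 2   /* hypothetical hold-off, in caller's units */

struct challenge_state {
    long ignore_until;        /* Challenges before this time are ignored */
};

/* Returns true if a Challenge arriving at time now should be ignored
   without checking its Identification Data. */
bool challenge_ignoring(const struct challenge_state *st, long now)
{
    return now < st->ignore_until;
}

/* Records the result of checking a Challenge's Identification Data.
   Returns true if the endpoint should answer with an Identification
   option on its next packet; an invalid Challenge starts a hold-off. */
bool challenge_result(struct challenge_state *st, long now, bool data_valid)
{
    if (!data_valid) {
        st->ignore_until = now + CHALLENGE_HOLDOFF;
        return false;         /* ignore the option and its packet */
    }
    return true;
}
```

   A receiver would call challenge_ignoring() on each incoming
   Challenge and verify the Identification Data only when it returns
   false.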
   If, after several Challenge options, a DCCP is unable to elicit a
   valid Identification from its partner, it MAY reset the connection
   with Reason "Unanswered Challenge".

6.5.  Data Discarded Option

   On a DCCP-Response packet, the Data Discarded option indicates that
   the payload of the DCCP-Request packet was discarded by the server,
   and therefore should be resent in a following DCCP-Data or DCCP-
   DataAck packet. This option can be set by the server to avoid
   having to keep state for the connection until the handshake is
   complete. Doing so causes an additional round-trip time before the
   server can begin servicing the request. The tradeoff is under the
   control of local policy at the server.

   The Data Discarded option is allowed on other packets as well, where
   it indicates that packets were ignored. For example, a DCCP MAY
   choose to discard all received data while an important feature
   negotiation is in progress, such as the negotiation to determine the
   connection's CCID. As Sections 5.5 and 8.5 describe, the DCCP MUST
   report such discarded packets as "not yet received" or equivalent;
   the Acknowledgement Number MUST NOT correspond to such a packet, and
   their options MUST NOT be processed. However, the DCCP MAY include
   a Data Discarded option on its next packet or packets, indicating
   that some packets marked as "not yet received" were actually
   received and dropped because the DCCP did not accept their data. A
   DCCP receiving a Data Discarded option SHOULD resend any options on
   packets that might have been discarded, but using DCCP-Ack packets,
   which (when valid) will never be discarded by the receiver.

   We note that Data Discarded is not a replacement for the Receive
   Buffer Drops option (Section 8.7). Data Discarded is intended for
   cases when packets are dropped for protocol-specific reasons, such
   as unstable negotiation state; Receive Buffer Drops indicates a
   problem with the OS's receive buffer.

   +--------+
   |00000010|
   +--------+
    Type=2

6.6.  Init Cookie Option

   This option is permitted in DCCP-Response, DCCP-Data, and DCCP-
   DataAck messages. The option MAY be returned by the server in a
   DCCP-Response message. If so, then the client MUST echo the same
   Init Cookie option in its ensuing DCCP-Data or DCCP-DataAck message.
   The server SHOULD respond to an invalid Init Cookie option by
   resetting the connection with Reason set to "Bad Init Cookie".

   The purpose of this option is to allow a DCCP server to avoid having
   to hold any state until the three-way connection setup handshake has
   completed. The server wraps up the service name, server port, and
   any options it cares about from both the DCCP-Request and DCCP-
   Response in an opaque cookie. Typically the cookie will be encrypted
   using a secret known only to the server and include a cryptographic
   checksum or magic value so that correct decryption can be verified.
   When the server receives the cookie back in the response, it can
   decrypt the cookie and instantiate all the state it avoided keeping.

   The precise implementation of the Init Cookie does not need to be
   specified here as it is only relayed by the client, and does not
   need to be understood by the client.

   +--------+--------+--------+--------+--------+--------
   |00100100| Length | Init Cookie Value ...
   +--------+--------+--------+--------+--------+--------
    Type=36

6.7.  Timestamp Option

   This option is permitted in any DCCP packet. The length of the
   option is 6 bytes.

   +--------+--------+--------+--------+--------+--------+
   |00101000|00000110|          Timestamp Value          |
   +--------+--------+--------+--------+--------+--------+
    Type=40  Length=6

   The four bytes of option data carry the timestamp of this packet, in
   some undetermined form. A DCCP receiving a Timestamp option SHOULD
   respond with a Timestamp Echo option on the next packet it sends.

6.8.  Timestamp Echo Option

   This option is permitted in any DCCP packet, as long as at least one
   packet carrying the Timestamp option has been received. The length
   of the option is 10 bytes.

   +--------+--------+------- ... -------+------- ... -------+
   |00101001|00001010|      TS Echo      |      Elapsed      |
   +--------+--------+------- ... -------+------- ... -------+
    Type=41   Len=10       (4 bytes)           (4 bytes)

   The first four bytes of option data, TS Echo, carry a Timestamp
   Value taken from a preceding received Timestamp option. Usually,
   this will be the last packet that was received. The final four bytes
   indicate the amount of time elapsed since receiving the packet whose
   timestamp is being echoed. This time MUST be in microseconds.

6.9.  Loss Window Feature

   Loss Window has feature number 6. The Loss Window feature located at
   DCCP B is the width of the window DCCP B uses to determine whether
   packets from DCCP A are valid. Packets outside this window will be
   dropped by DCCP B as old duplicates or spoofing attempts; see
   Section 5.2 for more information. DCCP A sends a "Change(Loss
   Window, W)" option to DCCP B to set DCCP B's Loss Window to W.

   The Loss Window feature takes 3-byte integer values, like DCCP
   sequence numbers. Change and Confirm options for Loss Window are
   therefore 6 bytes long.

   Loss Window defaults to 1000 for new connections. The Loss Window
   value is the total width of the loss window. The receiver may
   position the loss window asymmetrically around the last sequence
   number seen -- for example, by allocating 1/4 of the loss window
   width for older sequence numbers and 3/4 of it for newer sequence
   numbers.

   This feature is non-negotiable.

7.  Congestion Control IDs

   Each congestion control mechanism supported by DCCP is assigned a
   congestion control identifier, or CCID: a number from 0 to 255.
   During connection setup, and optionally thereafter, the endpoints
   negotiate their congestion control mechanisms by negotiating the
   values for their Congestion Control features. Congestion Control has
   feature number 1. The feature located at DCCP A is the CCID in use
   for the A-to-B half-connection. DCCP B sends a "Change(CC, K)"
   option to DCCP A to ask A to use CCID K for its data packets.

   The data octets of Congestion Control feature negotiation options
   form a list of acceptable CCIDs, sorted in descending order of
   priority. For example, the option "Change(CC, 1, 2, 3)" asks the
   sender to use CCID 1, although CCIDs 2 and 3 are also acceptable.
   (This corresponds to the octets "33, 6, 1, 1, 2, 3": Change option
   (33), option length (6), feature ID (1), CCIDs (1, 2, 3).)
   Similarly, "Confirm(CC, 1, 2, 3)" tells the receiver that the sender
   is using CCID 1, but that CCIDs 2 or 3 might also be acceptable.

   The CCIDs defined by this document are:

      CCID   Meaning
      ----   -------
        0    Reserved
        1    Unspecified Sender-Based Congestion Control
        2    TCP-like Congestion Control
        3    TFRC Congestion Control

   A new connection starts with CCID 2 for both DCCPs. If this is
   unacceptable for either DCCP, that DCCP will start in the Unknown
   state. A DCCP SHOULD NOT send data when its Congestion Control
   feature is in the Unknown state.

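   The octet layout of the "Change(CC, 1, 2, 3)" example can be
   checked with a few lines of code. This is a minimal sketch: the
   function and constant names are illustrative, not defined by this
   document, and the layout (type, length, feature number, values) is
   taken from the option diagrams in Section 6.3.

```c
#include <stddef.h>

enum { OPT_CHANGE = 33, OPT_PREFER = 34, OPT_CONFIRM = 35 };
enum { FEATURE_CC = 1 };

/*
 * Writes a feature negotiation option into out: option type, option
 * length (which counts the type and length octets), feature number,
 * then the value octets. Returns the option length, or 0 if the
 * option would not fit in out or in a one-octet length field.
 */
size_t encode_feature_option(unsigned char type, unsigned char feature,
                             const unsigned char *values, size_t n_values,
                             unsigned char *out, size_t out_cap)
{
    size_t len = 3 + n_values;
    if (len > 255 || len > out_cap)
        return 0;

    out[0] = type;
    out[1] = (unsigned char)len;
    out[2] = feature;
    for (size_t i = 0; i < n_values; i++)
        out[3 + i] = values[i];
    return len;
}
```

   Calling encode_feature_option(OPT_CHANGE, FEATURE_CC, ...) with the
   values 1, 2, 3 produces the six octets 33, 6, 1, 1, 2, 3 quoted
   above.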
   All CCIDs standardized for use with DCCP will correspond to
   congestion control mechanisms previously standardized by the IETF.
   We expect that for quite some time, all such mechanisms will be TCP-
   friendly, but TCP-friendliness is not an explicit DCCP requirement.

7.1.  Unspecified Sender-Based Congestion Control

   CCID 1 denotes an unspecified sender-based congestion control
   mechanism. Separate features negotiate the corresponding congestion
   acknowledgement options---for example, Ack Vector. This provides a
   limited, controlled form of interoperability for new IETF-approved
   CCIDs.

   Implementors MUST NOT use CCID 1 in production environments as a
   proxy for congestion control mechanisms that have not entered the
   IETF standards process. We intend for the IETF to approve all
   production uses of CCID 1. Nevertheless, middle boxes MAY choose to
   treat the use of CCID 1 as experimental or unacceptable.

   For example, say that CCID 98, a new sender-based congestion control
   mechanism using Ack Vector for acknowledgements, has entered the
   IETF standards process. Now, DCCP A, which understands and would
   like to use CCID 98, is trying to communicate with DCCP B, which
   doesn't yet know about CCID 98. DCCP A can simply negotiate use of
   CCID 1 and, separately, negotiate Use Ack Vector. DCCP B will
   provide the feedback DCCP A requires for CCID 98, namely Ack Vector,
   without needing to understand the congestion control mechanism in
   use.

7.2.  TCP-like Congestion Control

   CCID 2 denotes Additive Increase, Multiplicative Decrease (AIMD)
   congestion control with behavior modelled directly on TCP, including
   congestion window, slow start, timeouts, and so forth. CCID 2 is
   further described in [CCID 2 PROFILE].

7.3.  TFRC Congestion Control

   CCID 3 denotes TCP-Friendly Rate Control, an equation-based rate-
   controlled congestion control mechanism. CCID 3 is further described
   in [CCID 3 PROFILE].

7.4.  CCID-Specific Options and Features

   Option and feature numbers 128 through 255 are available for CCID-
   specific use. CCIDs may often need new option types---for
   communicating acknowledgement or rate information, for example.
   CCID-specific option types let them create options at will without
   polluting the global options space. Option 128 might have different
   meanings on a half-connection using CCID 4 and a half-connection
   using CCID 8. CCID-specific options and features will never conflict
   with global options introduced by later versions of this
   specification.

   Any packet may contain information meant for either half-connection,
   so CCID-specific option and feature numbers explicitly signal the
   half-connection to which they apply. Option numbers 128 through 191
   are for options sent from the HC-Sender to the HC-Receiver; option
   numbers 192 through 255 are for options sent from the HC-Receiver to
   the HC-Sender. Similarly, feature numbers 128 through 191 are for
   features located at the HC-Sender; feature numbers 192 through 255
   are for features located at the HC-Receiver. (Change options for a
   feature are sent *to* the feature location; Prefer and Confirm
   options are sent *from* the feature location. Thus, Change(128)
   options are sent by the HC-Receiver by definition, while Change(192)
   options are sent by the HC-Sender.)

   For example, consider a DCCP connection where the A-to-B half-
   connection uses CCID 4 and the B-to-A half-connection uses CCID 5.
   Here is how a sampling of CCID-specific options and features are
   assigned to half-connections:

                                    Relevant    Relevant
      Packet   Option               Half-conn.  CCID
      ------   ------               ----------  ----
      A > B    128                  A-to-B      4
      A > B    192                  B-to-A      5
      A > B    Change(128, ...)     B-to-A      5
      A > B    Prefer(128, ...)     A-to-B      4
      A > B    Confirm(128, ...)    A-to-B      4
      A > B    Change(192, ...)     A-to-B      4
      A > B    Prefer(192, ...)     B-to-A      5
      A > B    Confirm(192, ...)    B-to-A      5

   CCID-specific options and features have no clear meaning when the
   relevant CCID is in flux. A DCCP SHOULD respond to CCID-specific
   options and features with Ignored options during those times.

8.  Acknowledgements

   Congestion control requires receivers to transmit information about
   packet losses and ECN marks to senders. DCCP receivers MUST report
   all congestion they see, as defined by the relevant CCID profile.
   Each CCID says when acknowledgements should be sent, what options
   they must use, how they should be congestion controlled, and so on.

   Most acknowledgements use DCCP options. For example, on a half-
   connection with CCID 2 (TCP-like), the receiver reports
   acknowledgement information using the Ack Vector option. This
   section describes common acknowledgement options and shows how acks
   using those options will commonly work. Full descriptions of the
   acknowledgement mechanisms used for each CCID are laid out in the
   CCID profile specifications.

   Acknowledgement options, such as Ack Vector, are only allowed on
   DCCP-Ack, DCCP-DataAck, DCCP-Close, and DCCP-CloseReq packets.

8.1.  Acks of Acks and Unidirectional Connections

   DCCP was designed to work well for both bidirectional and
   unidirectional flows of data, and for connections that transition
   between these states. However, acknowledgements required for a
   unidirectional connection are very different from those required for
   a bidirectional connection. In particular, unidirectional
   connections need to worry about acks of acks.

1872 The ack-of-acks problem arises because some acknowledgement 1873 mechanisms are reliable. For example, an HC-Receiver using CCID 2, 1874 TCP-like Congestion Control, sends Ack Vectors containing completely 1875 reliable acknowledgement information. The HC-Sender should 1876 occasionally inform the HC-Receiver that it has received an ack. If 1877 it did not, the HC-Receiver might resend complete Ack Vector 1878 information, going back to the start of the connection, with every 1879 DCCP-Ack packet! However, note that acks-of-acks need not be 1880 reliable themselves: when an ack-of-acks is lost, the HC-Receiver 1881 will simply maintain old acknowledgement-related state for a little 1882 longer. Therefore, there is no need for acks of acks of acks. 1884 When communication is bidirectional, any required acks of acks are 1885 automatically contained in normal acknowledgements for data packets. 1886 On a unidirectional connection, however, the receiver DCCP sends no 1887 data, so the sender would not normally send acknowledgements. 1888 Therefore, the CCID in force on that half-connection must explicitly 1889 say whether, when, and how the HC-Sender should generate acks of 1890 acks. 1892 For example, consider a bidirectional connection where both half- 1893 connections use the same CCID (either 2 or 3), and where DCCP B goes 1894 *quiescent*. This means that the connection becomes unidirectional: 1895 DCCP B stops sending data, and sends only sends DCCP-Ack packets to 1896 DCCP A. For CCID 2, TCP-like Congestion Control, DCCP B uses Ack 1897 Vector to reliably communicate which packets it has received. As 1898 described above, DCCP A must occasionally acknowledge a pure 1899 acknowledgement from DCCP B, so that DCCP B can free old Ack Vector 1900 state. For instance, DCCP A might send a DCCP-DataAck packet every 1901 now and then, instead of DCCP-Data. 
In contrast, for CCID 3, TFRC 1902 Congestion Control, DCCP B's acknowledgements need not be reliable, 1903 since they contain cumulative loss rates; TFRC works even if every 1904 DCCP-Ack is lost. Therefore, DCCP A need never acknowledge an 1905 acknowledgement. 1907 When communication is unidirectional, a single CCID---in the 1908 example, the A-to-B CCID---controls both DCCPs' acknowledgements, in 1909 terms of their content, their frequency, and so forth. For 1910 bidirectional connections, the A-to-B CCID governs DCCP B's 1911 acknowledgements (including its acks of DCCP A's acks), while the B- 1912 to-A CCID governs DCCP A's acknowledgements. 1914 DCCP A switches its ack pattern from bidirectional to unidirectional 1915 when it notices that DCCP B has gone quiescent. It switches from 1916 unidirectional to bidirectional when it must acknowledge even a 1917 single DCCP-Data or DCCP-DataAck packet from DCCP B. (This includes 1918 the case where a single DCCP-Data or DCCP-DataAck packet was lost in 1919 transit, which is detectable using the # NDP field in the DCCP 1920 packet header.) 1922 Each CCID defines how to detect quiescence on that CCID, and how 1923 that CCID handles acks-of-acks on unidirectional connections. The B- 1924 to-A CCID defines when DCCP B has gone quiescent. Usually, this 1925 happens when a period has passed without B sending any data packets. 1926 For CCID 2, this period is roughly two round-trip times. The A-to-B 1927 CCID defines how DCCP A handles acks-of-acks once DCCP B has gone 1928 quiescent. 1930 8.2. Ack Piggybacking 1932 Acknowledgements of A-to-B data MAY be piggybacked on data sent by 1933 DCCP B, as long as that does not delay the acknowledgement longer 1934 than the A-to-B CCID would find acceptable. However, data 1935 acknowledgements often require more than 4 bytes to express. A large 1936 set of acknowledgements prepended to a large data packet might 1937 exceed the path's MTU. 
In this case, DCCP B SHOULD send separate 1938 DCCP-Data and DCCP-Ack packets, or wait, but not too long, for a 1939 smaller datagram. 1941 Piggybacking is particularly common at DCCP A when the B-to-A half- 1942 connection is quiescent---that is, when DCCP A is just acknowledging 1943 DCCP B's acknowledgements, as described above. There are three 1944 reasons to acknowledge DCCP B's acknowledgements: to allow DCCP B to 1945 free up information about previously acknowledged data packets from 1946 A; to shrink the size of future acknowledgements; and to manipulate 1947 the rate future acknowledgements are sent. Since these are secondary 1948 concerns, DCCP A can generally afford to wait indefinitely for a 1949 data packet to piggyback its acknowledgement onto. 1951 Any restrictions on ack piggybacking are described in the relevant 1952 CCID's profile. 1954 8.3. Ack Ratio Feature 1956 With Ack Ratio, DCCP A can perform rudimentary congestion control on 1957 DCCP B's acknowledgement stream by telling DCCP B how to clock its 1958 acks. 1960 Ack Ratio has feature number 3. The Ack Ratio feature located at 1961 DCCP B equals the ratio of data packets sent by DCCP A to 1962 acknowledgement packets sent back by DCCP B. For example, if it is 1963 set to four, then DCCP B will send at least one acknowledgement 1964 packet for every four data packets DCCP A sends. DCCP A sends a 1965 "Change(Ack Ratio)" option to DCCP B to change DCCP B's ack ratio. 1967 An Ack Ratio option contains two bytes of data: a sixteen-bit 1968 integer representing the ratio. A new connection starts with Ack 1969 Ratio 2 for both DCCPs. 1971 This feature is non-negotiable. 1973 8.4. Use Ack Vector Feature 1975 The Use Ack Vector feature lets DCCPs negotiate whether they should 1976 use Ack Vector options to report congestion. Ack Vector provides 1977 detailed loss information, and lets senders report back to their 1978 applications whether particular packets were dropped. 
Use Ack Vector 1979 is mandatory for some CCIDs, and optional for others. 1981 Use Ack Vector has feature number 4. The Use Ack Vector feature 1982 located at DCCP B specifies whether DCCP B should use the Ack Vector 1983 option to report congestion back to DCCP A. DCCP A sends a 1984 "Change(Use Ack Vector, 1)" option to DCCP B to ask B to send Ack 1985 Vector options as part of its acknowledgement traffic. 1987 A Use Ack Vector option contains a single octet of data. The 1988 receiver should send Ack Vector options if and only if this octet is 1989 nonzero. A new connection starts with Use Ack Vector 0 for both 1990 DCCPs. 1992 8.5. Ack Vector Options 1994 The Ack Vector gives a run-length encoded history of data packets 1995 received at the client. Each octet of the vector gives the state of 1996 that data packet in the loss history, and the number of preceding 1997 packets with the same state. The option's data looks like this: 1999 +--------+--------+--------+--------+--------+-------- 2000 |001001??| Length |SSLLLLLL|SSLLLLLL|SSLLLLLL| ... 2001 +--------+--------+--------+--------+--------+-------- 2002 Type=37/38 \___________ Vector ___________... 2004 The two Ack Vector options (option types 37 and 38) differ only in 2005 the values they imply for ECN Nonce Echo. Section 9.2 describes this 2006 further. 2008 The vector itself consists of a series of octets, each of whose 2009 encoding is: 2011 0 1 2 3 4 5 6 7 2012 +-+-+-+-+-+-+-+-+ 2013 |St | Run Length| 2014 +-+-+-+-+-+-+-+-+ 2016 St[ate]: 2 bits 2018 Run Length: 6 bits 2020 State occupies the most significant two bits of each byte, and can 2021 have one of four values: 2023 0 Packet received (and not ECN marked). 2025 1 Packet received ECN marked. 2027 2 Reserved. 2029 3 Packet not yet received. 2031 The first byte in the first Ack Vector option refers to the packet 2032 indicated in the Acknowledgement Number; subsequent bytes refer to 2033 older packets. 
   (Ack Vector may not be sent on DCCP-Data packets, which lack an
   Acknowledgement Number.) If an Ack Vector contains the decimal
   values 0,192,3,64,5 and the Acknowledgement Number is decimal 100,
   then:

      Packet 100 was received (Acknowledgement Number 100, State 0,
      Run Length 0).

      Packet 99 was lost (State 3, Run Length 0).

      Packets 98, 97, 96 and 95 were received (State 0, Run Length 3).

      Packet 94 was ECN marked (State 1, Run Length 0).

      Packets 93, 92, 91, 90, 89, and 88 were received (State 0, Run
      Length 5).

   Run lengths of more than 64 must be encoded in multiple bytes. A
   single Ack Vector option can acknowledge up to 16192 data packets.
   Should more packets need to be acknowledged than can fit in 253
   bytes of Ack Vector, then multiple Ack Vector options can be sent.
   The second Ack Vector option will begin where the first Ack Vector
   option left off, and so forth.

   Ack Vector states are subject to two general constraints. (These
   principles should also be followed for other acknowledgement
   mechanisms; referring to Ack Vector states simplifies their
   explanation.)

   (1) Packets reported as State 0 or State 1 MUST be delivered to the
       application or explicitly dropped via application intervention.
       In particular, it is not appropriate to drop a packet reported
       using State 0 or State 1 in the receive buffer, unless the
       application explicitly requests such a drop. This has
       implications for receive buffer design: a DCCP MUST NOT
       acknowledge a packet using State 0 or State 1 until it can
       guarantee that packet will not be dropped from the receive
       buffer (assuming the application does not intervene).

   (2) Packets reported as State 3 MAY have been received by DCCP, but
       DCCP MUST NOT have processed them in any way, except possibly to
       check their sequence numbers for validity.
       In particular, feature negotiations and options on such packets
       MUST NOT have been processed, and the Acknowledgement Number
       MUST NOT correspond to such a packet.

   Packets dropped in the receive buffer should be reported as not
   received (State 3). The Receive Buffer Drops and Buffer Closed
   options distinguish between congestion losses, losses due to receive
   buffer overflow, and losses due to receive buffer closure. As
   described above, DCCP MUST NOT have previously acknowledged such
   packets using State 0 or 1, or processed such packets' options. For
   example, it is not acceptable to process a packet's options, enqueue
   the packet on a receive buffer, and later drop it without explicit
   application intervention.

8.5.1. Ack Vector Consistency

   A DCCP sender will commonly receive multiple acknowledgements for
   some of its data packets. For instance, an HC-Sender might receive
   two DCCP-Acks with Ack Vectors, both of which contained information
   about sequence number 24. (Because of cumulative acking,
   information about a sequence number is repeated in every ack until
   the HC-Sender acknowledges an ack. Perhaps the HC-Receiver is
   sending acks faster than the HC-Sender is acknowledging them.) In a
   perfect world, the two Ack Vectors would always be consistent.
   However, there are many reasons why they might not be:

   o The HC-Receiver received packet 24 between sending its acks, so
     the first ack said 24 was not received (State 3) and the second
     said it was received or ECN marked (State 0 or 1).

   o The HC-Receiver received packet 24 between sending its acks, and
     the network reordered the acks. In this case, the packet will
     appear to transition from State 0 or 1 to State 3.

   o The network duplicated packet 24, but the second duplicate was ECN
     marked. This will show up as a transition between States 0 and 1.
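
   Each case above appears to the HC-Sender as a change in the state
   reported for one sequence number between two acks. As a
   non-normative illustration, the following Python sketch expands two
   Ack Vectors, using the octet encoding of Section 8.5, and lists the
   sequence numbers whose reported state differs. The function names
   expand_ack_vector and changed_states are hypothetical, and sequence
   number wraparound is ignored for brevity.

```python
def expand_ack_vector(ack_number, vector):
    """Map sequence number -> state for one Ack Vector.

    Each octet holds State in its two most significant bits and Run
    Length in the remaining six. The first octet starts at ack_number;
    later octets describe progressively older packets.
    (Assumption: sequence numbers do not wrap in this sketch.)
    """
    states = {}
    seq = ack_number
    for octet in vector:
        state = octet >> 6
        run_length = octet & 0x3F
        # A run length of N covers N + 1 consecutive packets.
        for _ in range(run_length + 1):
            states[seq] = state
            seq -= 1
    return states


def changed_states(old, new):
    """Return {seq: (old_state, new_state)} for packets reported differently."""
    return {
        seq: (old[seq], new[seq])
        for seq in sorted(old.keys() & new.keys())
        if old[seq] != new[seq]
    }


# Example vector from Section 8.5, then a later ack in which packet 99
# has arrived and the run of received packets has been merged.
old = expand_ack_vector(100, [0, 192, 3, 64, 5])
new = expand_ack_vector(100, [5, 64, 5])
print(changed_states(old, new))  # {99: (3, 0)}
```

   Here the only difference is packet 99 moving from State 3 to State
   0, which corresponds to the first case in the list above.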

   To cope with these situations, HC-Sender DCCP implementations SHOULD
   combine multiple received Ack Vector states according to this table:

                    Received State
                      0   1   3
                    +---+---+---+
                  0 | 0 | 1 | 0 |
          Old       +---+---+---+
                  1 | 1 | 1 | 1 |
         State      +---+---+---+
                  3 | 0 | 1 | 3 |
                    +---+---+---+

   To read the table, choose the row corresponding to the packet's old
   state and the column corresponding to the packet's state in the
   newly received Ack Vector, then read the packet's new state off the
   table. The table is symmetric about the main diagonal, so it is
   indifferent to ack reordering.

   An HC-Sender MAY choose to throw away old information gleaned from
   the HC-Receiver's Ack Vectors, in which case it MUST ignore newly
   received acknowledgements from the HC-Receiver for those old
   packets. It is often kinder to save recent Ack Vector information
   for a while, so that the HC-Sender can undo its reaction to presumed
   congestion when a "lost" packet unexpectedly shows up (the
   transition from State 3 to State 0).

8.5.2. Ack Vector Coverage

   We can divide the packets that have been sent from an HC-Sender to
   an HC-Receiver into four roughly contiguous groups. From oldest to
   youngest, these are:

   (1) Packets already acknowledged by the HC-Receiver, where the
       HC-Receiver knows that the HC-Sender has definitely received
       the acknowledgements.

   (2) Packets already acknowledged by the HC-Receiver, where the
       HC-Receiver cannot be sure that the HC-Sender has received the
       acknowledgements.

   (3) Packets not yet acknowledged by the HC-Receiver.

   (4) Packets not yet received by the HC-Receiver.

   The union of groups 2 and 3 is called the Unacknowledged Window.
   Generally, every Ack Vector generated by the HC-Receiver will cover
   the whole Unacknowledged Window: Ack Vector acknowledgements are
   cumulative.
   (This simplifies Ack Vector maintenance at the HC-Receiver; see
   Section 8.9, below.) As packets are received, this window both grows
   on the right and shrinks on the left. It grows because there are
   more packets, and shrinks because the data packets' Acknowledgement
   Numbers will acknowledge previous acknowledgements, moving packets
   from group 2 into group 1.

8.6. Slow Receiver Option

   An HC-Receiver sends the Slow Receiver option to its sender to
   indicate that it is having trouble keeping up with the sender's
   data. The HC-Sender SHOULD NOT increase its sending rate for
   approximately one round-trip time after seeing a packet with a Slow
   Receiver option. However, the Slow Receiver option does not indicate
   congestion, and the HC-Sender need not reduce its sending rate. (If
   necessary, the receiver can force the sender to slow down by
   dropping packets at its receive buffer or reporting false ECN
   marks.) APIs SHOULD let receiver applications set Slow Receiver, and
   sending applications determine whether or not their receivers are
   Slow.

   The Slow Receiver option takes just one byte:

   +--------+
   |00000010|
   +--------+
    Type=2

   Slow Receiver does not specify why the receiver is having trouble
   keeping up with the sender. Possible reasons include lack of buffer
   space, CPU overload, and application quotas. A sending application
   might react to Slow Receiver by reducing its sending rate or by
   switching to a lossier compression algorithm. However, a smart
   sender might actually *increase* its sending rate in response to
   Slow Receiver, by switching to a less-compressed sending format. (A
   highly-compressed data format might overwhelm a slow CPU more
   seriously than the higher memory requirements of a less-compressed
   data format.)
   This tension between transfer size (less compression means more
   congestion) and processing speed (more compression means more
   processing) cannot be resolved in general.

   Slow Receiver implements a portion of TCP's receive window
   functionality. We believe receiver operating systems and
   applications will find it much easier to send Slow Receiver when
   appropriate than they currently find it to correctly set a TCP
   receive window.

8.7. Receive Buffer Drops Option

   The Receive Buffer Drops option indicates that some packets reported
   as not received were actually dropped at the endpoint, due to
   insufficient kernel space. The sender will probably react
   differently to receive buffer drops than to congestion losses; for
   instance, it might or might not reduce its congestion window. The
   option's data looks like this:

   +--------+--------+--------+
   |00100111|00000011| Count  |
   +--------+--------+--------+
    Type=39  Length=3

   Count: 8 bits
      The Count field says how many acknowledged packets were dropped
      at the receive buffer, limited to packets acknowledged by the
      packet containing the option. Count is simply a number between 0
      and 255.

   Multiple Receive Buffer Drops options are added together, so a
   single option with Count 2 is equivalent to two options, each with
   Count 1. A packet's total Receive Buffer Drops count MUST be less
   than or equal to the number of packets acknowledged by it as "not
   yet received". For example, assuming Ack Vector, the Receive Buffer
   Drops count must be less than or equal to the total number of
   State-3 packets in the Ack Vectors.

   If an ECN-marked packet is dropped at the receive buffer, it MUST
   NOT be included in the Receive Buffer Drops count. Such packets MUST
   be reported as the equivalent of "dropped by the network". (For Ack
   Vector, this is "not yet received".)

8.8. Buffer Closed Option

   The Buffer Closed option indicates that the application at the
   endpoint sending the option is no longer listening for data. For
   example, a server might close its receiving half-connection to new
   data after receiving a complete request from the client. This would
   limit the amount of state the server would expend on incoming data,
   and thus reduce the potential damage from certain denial-of-service
   attacks. A DCCP receiving a Buffer Closed option MAY report this
   event to the application.

   +--------+
   |00000011|
   +--------+
    Type=3

   A Buffer Closed option SHOULD be sent whenever received data packets
   are dropped due to a non-listening application. After receiving a
   Buffer Closed option, a DCCP sender should expect that no more data
   will ever be delivered to the receiving application.

8.9. Ack Vector Implementation Notes

   This section discusses particulars of DCCP acknowledgement handling,
   in the context of an abstract implementation for Ack Vector. It is
   informative rather than normative.

   The first part of our implementation runs at the HC-Receiver, and
   therefore acknowledges data packets. It generates Ack Vector
   options. The implementation has the following characteristics:

   o At most one byte of state per acknowledged packet.

   o O(1) time to update that state when a new packet arrives (normal
     case).

   o Cumulative acknowledgements.

   o Quick removal of old state.

   The basic data structure is a circular buffer containing information
   about acknowledged packets. Each byte in this buffer contains a
   state and run length; the state can be 0 (packet received), 1
   (packet ECN marked), or 3 (packet not yet received). The live
   portion of the buffer is marked off by head and tail pointers; each
   pointer is marked with the HC-Sender sequence number to which it
   corresponds. The buffer grows from right to left.
   For example:

   +-------------------------------------------------------------------+
   |S,L|S,L|S,L|S,L|S,L|   |   |   |   |S,L|S,L|S,L|S,L|S,L|S,L|S,L|S,L|
   +-------------------------------------------------------------------+
                     ^                   ^
              Tail, seqno = T     Head, seqno = H

               <=== Head and Tail move this way <===

   Each `S,L' represents a State/Run length byte. We will draw these
   buffers showing only their live portion; for example, here is
   another representation for the buffer above:

          +---------------------------------------------------+
 (Head) H |S,L|S,L|S,L|S,L|S,L|S,L|S,L|S,L|S,L|S,L|S,L|S,L|S,L| T (Tail)
          +---------------------------------------------------+

   This smaller Example Buffer contains actual data.

      +---------------------------+
   10 |0,0|3,0|3,0|3,0|0,4|1,0|0,0| 0  [Example Buffer]
      +---------------------------+

   In concrete terms, its meaning is as follows:

      Packet 10 was received. (The head of the buffer has sequence
      number 10, state 0, and run length 0.)

      Packets 9, 8, and 7 have not yet been received. (The three bytes
      preceding the head each have state 3 and run length 0.)

      Packets 6, 5, 4, 3, and 2 were received.

      Packet 1 was ECN marked.

      Packet 0 was received.

8.9.1. New Packets

   When a packet arrives whose sequence number is larger than any in
   the buffer, the HC-Receiver simply moves the Head pointer to the
   left, increases the head sequence number, and stores a byte
   representing the packet into the buffer. For example, if HC-Sender
   packet 11 arrived ECN marked, the Example Buffer above would enter
   this new state (the change is marked with stars):

      +***----------------------------+
   11 |1,0|0,0|3,0|3,0|3,0|0,4|1,0|0,0| 0
      +***----------------------------+

   If the packet's state equals the state at the head of the buffer,
   the HC-Receiver may choose to increment its run length (up to the
   maximum).
   For example, if HC-Sender packet 11 arrived without ECN marking, the
   Example Buffer might enter this state instead:

      +--*------------------------+
   11 |0,1|3,0|3,0|3,0|0,4|1,0|0,0| 0
      +--*------------------------+

   Of course, the new packet's sequence number might not equal the
   expected sequence number. In this case, the HC-Receiver should enter
   the intervening packets as State 3. If several packets are missing,
   the HC-Receiver may prefer to enter multiple bytes with run length
   0, rather than a single byte with a larger run length; this
   simplifies table updates when one of the missing packets arrives.
   For example, if HC-Sender packet 12 arrived unmarked while the head
   of the original Example Buffer was still packet 10, so that packet
   11 is missing, the Example Buffer would enter this state:

      +*******----------------------------+
   12 |0,0|3,0|0,0|3,0|3,0|3,0|0,4|1,0|0,0| 0
      +*******----------------------------+

   When a new packet's sequence number is less than the head sequence
   number, the HC-Receiver should scan the table for the byte
   corresponding to that sequence number. (Slightly more complex
   indexing structures could reduce the complexity of this scan.)
   Assume that the sequence number was previously lost (State 3), and
   that it was stored in a byte with run length 0. Then the HC-Receiver
   can simply change the byte's state. For example, if HC-Sender packet
   8 was received, the Example Buffer would enter this state:

      +--------*------------------+
   10 |0,0|3,0|0,0|3,0|0,4|1,0|0,0| 0
      +--------*------------------+

   If the packet is not marked as lost, or if its sequence number is
   not contained in the table, the packet is probably a duplicate, and
   should be ignored. (The new packet's ECN marking state might differ
   from the state in the buffer; Section 8.5.1 describes what to do
   then.)
   If the packet's corresponding buffer byte has a non-zero run length,
   then the buffer might need to be reshuffled to make space for one or
   two new bytes.

   Of course, the circular buffer may overflow: when the HC-Sender is
   sending data at a very high rate, when the HC-Receiver's
   acknowledgements are not reaching the HC-Sender, or when the
   HC-Sender is forgetting to acknowledge those acks (so the
   HC-Receiver is unable to clean up old state). In this case, the
   HC-Receiver should either compress the buffer, transfer its state to
   a larger buffer, or drop all received packets until its buffer
   shrinks again.

8.9.2. Sending Acknowledgements

   Whenever the HC-Receiver needs to generate an acknowledgement, the
   buffer's contents can simply be copied into one or more Ack Vector
   options. Copied Ack Vectors might not be maximally compressed; for
   example, the Example Buffer above contains three adjacent 3,0 bytes
   that could be combined into a single 3,2 byte. The HC-Receiver
   might, therefore, choose to compress the buffer in place before
   sending the option, or to compress the buffer while copying it;
   either operation is simple.

   Every acknowledgement sent by the HC-Receiver should include the
   entire state of the buffer. That is, acknowledgements are
   cumulative.

   The HC-Receiver should store information about each acknowledgement
   it sends in another buffer. Specifically, for every acknowledgement
   it sends, the HC-Receiver should store:

   o The HC-Receiver sequence number it used for the ack packet.

   o The HC-Sender sequence number it acknowledged (that is, the
     packet's Acknowledgement Number). Since acknowledgements are
     cumulative, this single number completely specifies the set of
     HC-Sender packets acknowledged by this ack packet.

8.9.3. Clearing State

   Some of the HC-Sender's packets will include acknowledgement
   numbers, which ack the HC-Receiver's acknowledgements. When such an
   ack is received, the HC-Receiver simply finds the HC-Sender sequence
   number corresponding to that acked HC-Receiver packet, and moves the
   buffer's Tail pointer up to that sequence number. (It may choose to
   keep some older information, in case a lost packet shows up late.)
   For example, say that the HC-Receiver storing the Example Buffer had
   sent two acknowledgements already:

      HC-Receiver Ack 59 acknowledged HC-Sender Seq 3, and
      HC-Receiver Ack 60 acknowledged HC-Sender Seq 10.

   Say the HC-Receiver then received a DCCP-DataAck packet from the
   HC-Sender with Acknowledgement Number 59. This informs the
   HC-Receiver that the HC-Sender received, and processed, all the
   information in HC-Receiver packet 59. This packet acknowledged
   HC-Sender packet 3, so the HC-Sender has now received the
   HC-Receiver's acknowledgements for packets 0, 1, 2, and 3. The
   Example Buffer should enter this state:

      +------------------*+ *
   10 |0,0|3,0|3,0|3,0|0,2| 4
      +------------------*+ *

   Note that the tail byte's run length was adjusted, since packet 3
   was in the middle of that byte. The HC-Receiver can also throw away
   the information about HC-Receiver Ack 59.

   A careful implementation might also modify its own acknowledgement
   record to ensure that it is reasonably robust to reordering.
   Suppose that the Example Buffer is as before, but that packet 9 now
   arrives, out of sequence.
   The buffer would enter this state:

      +----*----------------------+
   10 |0,0|0,0|3,0|3,0|0,4|1,0|0,0| 0
      +----*----------------------+

   The danger is that the HC-Sender might acknowledge the HC-Receiver's
   previous acknowledgement (with sequence number 60), which says that
   Packet 9 was not received, before the HC-Receiver has a chance to
   send a new acknowledgement saying that Packet 9 actually was
   received. Therefore, when packet 9 arrived, the HC-Receiver might
   modify its acknowledgement record to:

      HC-Receiver Ack 59 acknowledged HC-Sender Seq 3, and
      HC-Receiver Ack 60 acknowledged HC-Sender Seq *8*.

   That is, any HC-Sender sequence number in the acknowledgement record
   is reduced to at most 8. This would prevent the Tail pointer from
   moving past packet 9 until the HC-Receiver knows that the HC-Sender
   has seen an Ack Vector indicating that packet's arrival.

8.9.4. Processing Acknowledgements

   When the HC-Sender receives an acknowledgement, it generally cares
   about the number of packets that were dropped and/or ECN marked. It
   simply reads this off the Ack Vector. Additionally, it may check the
   ECN Nonce for correctness. (As described in Section 8.5.1, it may
   want to keep more detailed information about acknowledged packets in
   case packets change states between acknowledgements, or in case the
   application queries whether a packet arrived.)

   The HC-Sender must also acknowledge the HC-Receiver's
   acknowledgements so that the HC-Receiver can free old Ack Vector
   state. (Since Ack Vector acknowledgements are reliable, the
   HC-Receiver must maintain and resend Ack Vector information until it
   is sure that the HC-Sender has received that information.) A simple
   algorithm suffices: since Ack Vector acknowledgements are
   cumulative, a single acknowledgement number tells the HC-Receiver
   how much ack information has arrived.
   Assuming that the HC-Receiver sends no data, the HC-Sender can
   simply ensure that at least once a round-trip time, it sends a
   DCCP-DataAck packet acknowledging the latest DCCP-Ack packet it has
   received. Of course, the HC-Sender only needs to acknowledge the
   HC-Receiver's acknowledgements if the HC-Sender is also sending
   data. If the HC-Sender is not sending data, then the HC-Receiver's
   Ack Vector state is stable, and there is no need to shrink it. The
   HC-Sender must watch for drops and ECN marks on received DCCP-Ack
   packets so that it can adjust the HC-Receiver's ack-sending rate
   with Ack Ratio in response to congestion.

   If the other half-connection is not quiescent---that is, the
   HC-Receiver is sending data to the HC-Sender, possibly using another
   CCID---then the acknowledgements on that half-connection are
   sufficient for the HC-Receiver to free its state.

9. Explicit Congestion Notification

   The DCCP protocol is fully ECN-aware. Each CCID specifies how its
   endpoints respond to ECN marks. Furthermore, DCCP, unlike TCP,
   allows senders to control the rate at which acknowledgements are
   generated (with options like Ack Ratio); this means that
   acknowledgements are generally congestion-controlled, and may have
   ECN-Capable Transport set.

   A CCID profile describes how that CCID interacts with ECN, both for
   data traffic and pure-acknowledgement traffic. A sender SHOULD set
   ECN-Capable Transport on its packets whenever the receiver has its
   ECN Capable feature turned on, and the relevant CCID allows it,
   unless the sending application indicates that ECN should not be
   used.

   The rest of this section describes the ECN Capable feature and the
   interaction of the ECN Nonce with acknowledgement options such as
   Ack Vector.

9.1. ECN Capable Feature

   The ECN Capable feature lets a DCCP inform its partner that it
   cannot read ECN bits from received IP headers, so the partner must
   not set ECN-Capable Transport on its packets.

   ECN Capable has feature number 2. The ECN Capable feature located at
   DCCP A indicates whether or not A can successfully read ECN bits
   from received frames' IP headers. (This is independent of whether it
   can set ECN bits on sent frames.) DCCP A sends a "Prefer(ECN
   Capable, 0)" option to DCCP B to inform B that A cannot read ECN
   bits.

   An ECN Capable feature contains a single octet of data. ECN
   capability is on if and only if this octet is nonzero.

   A new connection starts with ECN Capable 1 (that is, ECN capable)
   for both DCCPs. If a DCCP is not ECN capable, it MUST send
   "Prefer(ECN Capable, 0)" options to the other endpoint until
   acknowledged (by "Change(ECN Capable, 0)") or the connection closes.
   Furthermore, it MUST NOT accept any data until the other endpoint
   sends "Change(ECN Capable, 0)". It SHOULD send Data Discarded
   options on its acknowledgements if the other endpoint does send data
   inappropriately.

9.2. ECN Nonces

   Congestion avoidance will not occur, and the receiver will sometimes
   get its data faster, when the sender is not told about any
   congestion events. Thus, the receiver has some incentive to falsify
   acknowledgement information, reporting that marked or dropped
   packets were actually received unmarked. This problem is more
   serious with DCCP than with TCP, since TCP provides reliable
   transport: it is more difficult with TCP to lie about lost packets
   without breaking the application.

   ECN Nonces are a general mechanism to prevent ECN cheating (or loss
   cheating). Two values for the two-bit ECN header field indicate
   ECN-Capable Transport, 01 and 10. The second code point, 10, is the
   ECN Nonce.
   In general, a protocol sender chooses between these code points
   randomly on its output packets, remembering the sequence it chose.
   The protocol receiver reports, on every acknowledgement, the number
   of ECN Nonces it has received thus far. This is called the ECN Nonce
   Echo. Since ECN marking and packet dropping both destroy the ECN
   Nonce, a receiver that lies about an ECN mark or packet drop has a
   50% chance of guessing right and avoiding discipline. The sender may
   react punitively to an ECN Nonce mismatch, possibly up to dropping
   the connection. The ECN Nonce Echo field need not be an integer; one
   bit is enough to catch 50% of infractions.

   In DCCP, the ECN Nonce Echo field is encoded in acknowledgement
   options. For example, the Ack Vector option comes in two forms, Ack
   Vector [Nonce 0] (option 37) and Ack Vector [Nonce 1] (option 38),
   corresponding to the two values for a one-bit ECN Nonce Echo. The
   Nonce Echo for a given Ack Vector equals the one-bit sum
   (exclusive-or, or parity) of ECN nonces for packets reported by that
   Ack Vector as received and not ECN marked. Thus, only packets marked
   as State 0 matter for this calculation (that is, received packets
   that were not ECN marked or dropped in the receive buffer). Every
   Ack Vector option is detailed enough for the sender to determine
   what the Nonce Echo should have been. It can check this calculation
   against the actual Nonce Echo, and complain if there is a mismatch.

   (The Ack Vector could conceivably report every ECN Nonce packet,
   using a separate code point for received ECN Nonces. However, this
   would limit Ack Vector's compressibility without providing much
   extra protection.)

   Consider a half-connection from DCCP A to DCCP B. DCCP A SHOULD set
   ECN Nonces on its packets, and remember which packets had nonces,
   whenever DCCP B reports that it is ECN Capable.
   An ECN-capable endpoint MUST calculate and use the correct value for
   ECN Nonce Echo when sending acknowledgement options. An
   ECN-incapable endpoint, however, SHOULD treat the ECN Nonce Echo as
   always zero. When a sender detects an ECN Nonce Echo mismatch, it
   SHOULD behave as if the receiver had reported one or more packets as
   ECN-marked (instead of unmarked). It MAY take more punitive action,
   such as resetting the connection.

   An ECN-incapable DCCP SHOULD ignore received ECN nonces and generate
   ECN nonces of zero. For instance, out of the two Ack Vector options,
   an ECN-incapable DCCP SHOULD generate Ack Vector [Nonce 0] (option
   37) exclusively. (Again, the ECN Capable feature must be set to zero
   in this case.)

10. Multihoming and Mobility

   DCCP provides primitive support for multihoming and mobility, via a
   mechanism for transferring a connection endpoint from one address to
   another. The moving endpoint must negotiate mobility support
   beforehand, and both endpoints must share their Connection Nonces.
   When the moving endpoint gets a new address, it sends a DCCP-Move
   packet from that address to the stationary endpoint. The stationary
   endpoint responds with a challenge. After the moving endpoint
   responds correctly to the challenge, the stationary endpoint changes
   its connection state to use the new address.

   DCCP's support for mobility is intended to solve only the simplest
   multihoming and mobility problems. For instance, DCCP has no support
   for simultaneous moves. Applications requiring more complex mobility
   semantics, or more stringent security guarantees, should use an
   existing solution like Mobile IP or Snoeren and Balakrishnan's work
   [SB00].

10.1. Mobility Capable Feature

   A DCCP uses the Mobility Capable feature to inform its partner that
   it would like to be able to change its address and/or port during
   the course of the connection.

   Mobility Capable has feature number 5. The Mobility Capable feature
   located at DCCP A indicates whether or not A will accept a DCCP-Move
   packet sent by B. DCCP B sends a "Change(Mobility Capable, 1)"
   option to DCCP A to inform it that B might like to move later.

   A Mobility Capable feature contains a single octet of data. Mobility
   is allowed if and only if this octet is nonzero. A DCCP MUST reject
   a DCCP-Move packet referring to a connection when Mobility Capable
   is 0; however, it MAY reject a valid DCCP-Move packet even when
   Mobility Capable is 1.

   A new connection starts with Mobility Capable 0 (that is, mobility
   is not allowed) for both DCCPs.

10.2. Security

   The DCCP mobility mechanism, like DCCP in general, does not provide
   cryptographic security guarantees. Nevertheless, mobile hosts must
   use valid sequence numbers and include valid Identifications in
   their DCCP-Move packets, providing protection against some classes
   of attackers. Specifically, an attacker cannot move a DCCP
   connection to a new address unless they know valid sequence numbers
   and how to generate valid Identifications. Even with the default MD5
   Identification Regime, this means that an attacker must have snooped
   on every packet in the connection to get a reasonable probability of
   success, assuming that initial sequence numbers and Connection
   Nonces are chosen well (that is, randomly). Section 15 further
   describes DCCP security considerations.

10.3. Congestion Control State

   Once an endpoint has transitioned to a new address, the connection
   is effectively a new connection in terms of its congestion control
   state: the accumulated information about congestion between the old
   endpoints no longer applies. Both DCCPs MUST initialize their
   congestion control state (windows, rates, and so forth) to that of a
   new connection---that is, they must "slow start"---unless they have
   high-quality information about actual network conditions between the
   two new endpoints. Normally, the only way to get this information
   would be by instrumenting a DCCP connection between the new
   addresses.

   Similarly, the endpoints' configured MTUs (see Section 11) should be
   reinitialized, and PMTU discovery performed again, following an
   address change.

10.4. Loss During Transition

   (This section is preliminary.) Several loss and delay events may
   affect the transition of a DCCP connection from one address to
   another. The DCCP-Move packet itself might be lost; the
   acknowledgement to that packet might be lost, leaving the mobile
   endpoint unsure of whether the transition has completed; and data
   from the old endpoint might continue to arrive at the receiver even
   after the transition.

   To protect against lost DCCP-Move packets, the mobile host SHOULD
   retransmit a DCCP-Move packet if it does not receive an
   acknowledgement within a reasonable time period. Section 5.9
   describes the mechanism used to protect against duplicate DCCP-Move
   packets.

   A receiver MAY drop all data received from the old address/port
   pair, once a DCCP-Move has successfully completed. Alternatively, it
   MAY accept one loss window's worth of this data. Congestion and loss
   events on this data SHOULD NOT affect the new connection's
   congestion control state.
The receiver MUST NOT accept data with the old address/port pair past one loss window, and SHOULD send DCCP-Resets in response to those packets.

During some transition period, acknowledgements from the receiver to the mobile host will contain information about packets sent both from the old address/port pair and from the new address/port pair. The mobile DCCP MUST NOT let loss events on packets from the old address/port pair affect the new congestion control state.

11. Path MTU Discovery

A DCCP implementation should be capable of performing Path MTU (PMTU) discovery, as described in [RFC 1191]. The API to DCCP SHOULD allow this mechanism to be disabled in cases where IP fragmentation is preferred. The rest of this section assumes PMTU discovery has not been disabled.

A DCCP implementation MUST maintain its idea of the current PMTU for each active DCCP session. The PMTU should be initialized from the interface MTU that will be used to send packets.

To perform PMTU discovery, the DCCP sender sets the IP Don't Fragment (DF) bit. However, it is undesirable for MTU discovery to occur on the initial connection setup handshake, as the connection setup process may not be representative of packet sizes used during the connection, and performing MTU discovery on the initial handshake might unnecessarily delay connection establishment. Thus, DF SHOULD NOT be set on DCCP-Request and DCCP-Response packets. In addition, DF SHOULD NOT be set on DCCP-Reset packets, although typically these would be small enough to not be a problem. On all other DCCP packets, DF SHOULD be set.

Any API to DCCP MUST allow the application to discover DCCP's current PMTU.
DCCP applications SHOULD use the API to discover the PMTU, and SHOULD NOT send datagrams that are greater than the PMTU; the only exception to this is if the application disables PMTU discovery. If the application tries to send a packet bigger than the PMTU, the DCCP implementation MUST drop the packet and return an appropriate error.

As specified in [RFC 1191], when a router receives a packet with DF set that is larger than the PMTU, it sends an ICMP Destination Unreachable message to the source of the datagram with the Code indicating "fragmentation needed and DF set" (also known as a "Datagram Too Big" message). When a DCCP implementation receives a Datagram Too Big message, it decreases its PMTU to the Next-Hop MTU value given in the ICMP message. If the MTU given in the message is zero, the sender chooses a value for PMTU using the algorithm described in Section 7 of [RFC 1191]. If the MTU given in the message is greater than the current PMTU, the Datagram Too Big message is ignored, as described in [RFC 1191]. (We are aware that this may cause problems for DCCP endpoints behind certain firewalls.)

If the DCCP implementation has decreased the PMTU, and the sending application attempts to send a packet larger than the new MTU, the API MUST cause the send to fail, returning an appropriate error to the application, and the application SHOULD then use the API to query the new value of the PMTU. When this occurs, it is possible that the kernel has some packets buffered for transmission that are smaller than the old PMTU, but larger than the new PMTU. The kernel MAY send these packets with the DF bit cleared, or it MAY discard these packets; it MUST NOT transmit these datagrams with the DF bit set.

DCCP currently provides no way to increase the PMTU once it has decreased.
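
As an illustration only, the sender-side PMTU rules described so far in this section can be sketched in Python. This is a minimal sketch under stated assumptions, not part of the protocol: the class PmtuState, its method names, the PacketTooBig exception, and the fallback_estimate callback (standing in for the estimate described in Section 7 of [RFC 1191]) are hypothetical names introduced here. Leaving the PMTU unchanged when the fallback estimate is not smaller than the current value is likewise an assumption.

```python
# Illustrative sketch only; every name below is hypothetical.

# Packet types on which the text says DF SHOULD NOT be set.
NO_DF_TYPES = frozenset({"DCCP-Request", "DCCP-Response", "DCCP-Reset"})


class PacketTooBig(Exception):
    """Error returned to the application when a send exceeds the PMTU."""


class PmtuState:
    def __init__(self, interface_mtu, fallback_estimate):
        # The PMTU is initialized from the outgoing interface MTU.
        self.pmtu = interface_mtu
        # Assumed hook: maps the current PMTU to a smaller guess when the
        # ICMP message reports a Next-Hop MTU of zero.
        self.fallback_estimate = fallback_estimate
        # Sizes, in bytes, of data packets queued for transmission.
        self.buffered = []

    def df_bit(self, packet_type):
        """Return True if DF should be set on a packet of this type."""
        return packet_type not in NO_DF_TYPES

    def send(self, size):
        """Queue a packet, or fail if it is larger than the current PMTU."""
        if size > self.pmtu:
            raise PacketTooBig(f"{size} bytes exceeds PMTU {self.pmtu}")
        self.buffered.append(size)

    def on_datagram_too_big(self, next_hop_mtu):
        """Process an ICMP Datagram Too Big message."""
        if next_hop_mtu == 0:
            candidate = self.fallback_estimate(self.pmtu)
        else:
            candidate = next_hop_mtu
        # A reported MTU above the current PMTU is ignored; the PMTU
        # only ever decreases.
        if candidate < self.pmtu:
            self.pmtu = candidate

    def drain(self, discard_oversized=False):
        """Return (size, df_set) pairs for the queued packets.

        Packets now larger than the PMTU are either sent with DF cleared
        or discarded; they are never sent with DF set.
        """
        plan = []
        for size in self.buffered:
            if size <= self.pmtu:
                plan.append((size, True))
            elif not discard_oversized:
                plan.append((size, False))
        self.buffered = []
        return plan
```

In this sketch an application would call send() for each datagram and query the pmtu attribute again after a PacketTooBig error. The discard_oversized flag selects between the two behaviors that the text permits for packets already buffered when the PMTU shrinks.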

A DCCP sender MAY treat the reception of an ICMP Datagram Too Big message as an indication that the packet being reported was not lost due to congestion, and so for the purposes of congestion control it MAY ignore the DCCP receiver's indication that this packet did not arrive. However, if this is done, then the DCCP sender MUST check the ECN bits of the IP header echoed in the ICMP message, and only perform this optimization if these ECN bits indicate that the packet did not experience congestion prior to reaching the router whose MTU it exceeded.

12. Abstract API

API issues for DCCP are discussed in another Internet-Draft, in progress.

13. Multiplexing Issues

In contrast to TCP, DCCP does not offer reliable ordered delivery. As a consequence, with DCCP there are no inherent performance penalties in layering functionality above DCCP to multiplex several sub-flows into a single DCCP connection.

However, this approach of multiplexing sub-flows above DCCP will not work in circumstances such as RTP where the RTP sub-flows require separate port numbers. In this case, if it is desired to share congestion control state among multiple DCCP flows that share the same source and destination addresses, the possibilities are to add DCCP-specific mechanisms to enable this, or to use a generic multiplexing facility like the Congestion Manager [RFC 3124] residing below the transport layer. For some DCCP flows, the ability to specify the congestion control mechanism might be critical, and for these flows the Congestion Manager will only be a viable tool if it allows DCCP to specify the congestion control mechanism used by the Congestion Manager for that flow.
Thus, to allow the sharing of congestion control state among multiple DCCP flows, the alternatives seem to be to add DCCP-specific functionality to the Congestion Manager, or to add a similar layer below DCCP that is specific to DCCP. We defer issues of DCCP operating over a revised version of the Congestion Manager, or over a DCCP-specific module for the sharing of congestion control state, to later work.

14. DCCP and RTP

The real-time transport protocol, RTP [RFC 1889], is currently used (over UDP) by many of DCCP's target applications (for instance, streaming media). This section therefore discusses the relationship between DCCP and RTP, and in particular, the question of whether any changes in RTP are necessary or desirable when it is layered over DCCP instead of UDP. The main issue here is header size: a DCCP header is at least 4 bytes larger than a UDP header.

There are two potential sources of overhead in the RTP-over-DCCP combination: duplicated acknowledgement information, and duplicated sequence numbers. We argue that together, these sources of overhead add just 4 bytes per packet relative to RTP-over-UDP, and that eliminating the redundancy would not reduce the overhead. However, particular CCIDs might make productive use of the space occupied by RTP's sequence number.

First, consider acknowledgements. The information on packet loss that RTP communicates via RTCP SR/RR packets is communicated by DCCP via acknowledgement options. Much of the information in an RTCP receiver report could be divined from DCCP acknowledgements, depending on the CCID in use. Acknowledgement options, such as Ack Vector, can be frequent and verbose, whereas RTCP reports are sent only rarely, with a minimum interval of 5 seconds between reports [RFC 1889].

However, not all CCIDs require such verbose acknowledgements.
CCID 3 (TFRC) reports acknowledgements at a low rate---between 16 and 32 bytes of options (depending on ECN usage), sent once per round trip time. This is not an undue burden. Furthermore, the options are necessary to implement responsive congestion control, and we cannot report less frequently, although we might design alternative acknowledgement options that take fewer bytes. DCCP lets the application choose the trade-off between small packet overhead and the precise feedback provided by Ack Vector.

While RTP receiver reports might be considered "redundant" in the presence of DCCP's more precise acknowledgements, they are sent so infrequently that it is not worth optimizing them away. Also, note that in the common case of a one-way data stream, acknowledgement packets contain no data, so acknowledgement header size (as distinct from congestion on the acknowledgement path) is not an issue.

We now consider sequence number redundancy on data packets. The embedded RTP header contains a 16-bit RTP sequence number. Most data packets will use the DCCP-Data type; DCCP-DataAck and DCCP-Ack packets need not usually be sent. The DCCP-Data header is 12 bytes long without options, including a 24-bit sequence number. This is 4 bytes more than a UDP header; any options required on data packets would add further overhead.

The DCCP sequence number cannot be inferred from the RTP sequence number since it increments on non-data packets as well as data packets. The RTP sequence number could be inferred from the DCCP sequence number, though; it might equal the DCCP sequence number minus the total number of non-data packets seen so far in the connection (as tracked by DCCP's # NDP header field).

Removing RTP's sequence number would not save any header space because of alignment issues.
However, particular DCCP CCIDs might make use of the 16 bits occupied by the RTP sequence number. For example, in CCID 3 (TFRC), every data packet must contain a Window Counter option. In straight DCCP, this option would take up another 4 bytes, but one could store it in place of the RTP sequence number. This would keep the overhead relative to RTP-over-UDP at 4 bytes (rather than 8 bytes).

Therefore, particular DCCP CCIDs MAY provide optional CCID-specific features that store DCCP quantities, such as TFRC's Window Counter, in place of the embedded RTP sequence number. A conforming DCCP would write in the calculated RTP sequence number before passing the packet to RTP. (The DCCP checksum would use the DCCP quantity, not the RTP sequence number.)

Given RTP-over-DCCP's small overhead, however, implementors demanding tiny headers will probably prefer more comprehensive header compression to this ad hoc saving of 4 bytes.

15. Security Considerations

DCCP does not provide cryptographic security guarantees. Applications desiring hard security should use IPsec or end-to-end security of some kind.

Nevertheless, DCCP is intended to protect against some classes of attackers. Attackers cannot hijack a DCCP connection (close the connection unexpectedly, or cause attacker data to be accepted by an endpoint as if it came from the sender) unless they can guess valid sequence numbers. Thus, as long as endpoints choose initial sequence numbers well, a DCCP attacker must snoop on data packets to get any reasonable probability of success. The sequence number validity (Section 5.2) and mobility (Section 10) mechanisms provide this guarantee.

This section is not in its final state. Further research is needed to ensure that we have met our stated security requirement.

16. IANA Considerations

DCCP introduces six sets of numbers whose values should be allocated by IANA.

o 32-bit Service Names (Section 5.4).

o 8-bit DCCP-Reset Reasons (Section 5.8).

o 8-bit DCCP Option Types (Section 6). The CCID-specific options 128 through 255 need not be allocated by IANA.

o 8-bit DCCP Feature Numbers (Section 6.3). The CCID-specific features 128 through 255 need not be allocated by IANA.

o 8-bit DCCP Congestion Control Identifiers (CCIDs) (Section 7).

o 16-bit Identification Regimes, for use with DCCP Identification and Challenge options (Section 6.4).

In addition, DCCP requires a Protocol Number to be added to the registry of Assigned Internet Protocol Numbers. Experimental implementors should use Protocol Number 33 for DCCP, but this number may change in the future.

17. Thanks

There is a wealth of work in this area, including the Congestion Manager. We thank the staff and interns of ICIR and, formerly, ACIRI, the members of the End-to-End Research Group, and the members of the Transport Area Working Group for their feedback on DCCP. We also thank those who provided comments and suggestions via the DCCP BOF, Working Group, and mailing lists, including Damon Lanphear, Patrick McManus, Sara Karlberg, Kevin Lai, Youngsoo Choi, Dan Duchamp, Derek Fawcus, David Timothy Fleeman, John Loughney, Ghyslain Pelletier, Stanislav Shalunov, Yufei Wang, and Michael Welzl.

18. References

[CCID 2 PROFILE] S. Floyd and E. Kohler. Profile for DCCP Congestion Control ID 2: TCP-like Congestion Control. draft-ietf-dccp-ccid2-01.txt, work in progress, March 2003.

[CCID 3 PROFILE] S. Floyd, E. Kohler, and J. Padhye. Profile for DCCP Congestion Control ID 3: TFRC Congestion Control. draft-ietf-dccp-ccid3-01.txt, work in progress, March 2003.

[ECN NONCE] David Wetherall, David Ely, Neil Spring.
Robust ECN Signaling with Nonces. draft-ietf-tsvwg-tcp-nonce-04.txt, work in progress, October 2002.

[RFC 793] J. Postel, editor. Transmission Control Protocol. RFC 793.

[RFC 1191] J. C. Mogul and S. E. Deering. Path MTU discovery. RFC 1191.

[RFC 1889] Audio-Video Transport Working Group, H. Schulzrinne, S. Casner, R. Frederick, and V. Jacobson. RTP: A Transport Protocol for Real-Time Applications. RFC 1889.

[RFC 2026] S. Bradner. The Internet Standards Process---Revision 3. RFC 2026.

[RFC 2460] S. Deering and R. Hinden. Internet Protocol, Version 6 (IPv6) Specification. RFC 2460.

[RFC 2960] R. Stewart, Q. Xie, K. Morneault, C. Sharp, H. Schwarzbauer, T. Taylor, I. Rytina, M. Kalla, L. Zhang, and V. Paxson. Stream Control Transmission Protocol. RFC 2960.

[RFC 3124] H. Balakrishnan and S. Seshan. The Congestion Manager. RFC 3124.

[RFC 3168] K.K. Ramakrishnan, S. Floyd, and D. Black. The Addition of Explicit Congestion Notification (ECN) to IP. RFC 3168. September 2001.

[SB00] Alex C. Snoeren and Hari Balakrishnan. An End-to-End Approach to Host Mobility. Proc. 6th Annual ACM/IEEE International Conference on Mobile Computing and Networking (MOBICOM '00), August 2000.

[UDP-LITE] L-A. Larzon, M. Degermark, S. Pink, L-E. Jonsson (editor), and G. Fairhurst (editor). The UDP-Lite Protocol. draft-ietf-tsvwg-udp-lite-01.txt, work in progress, December 2002.

19. Authors' Addresses

Eddie Kohler
Mark Handley
Sally Floyd

ICSI Center for Internet Research
1947 Center Street, Suite 600
Berkeley, CA 94704 USA

Jitendra Padhye

Microsoft Research
One Microsoft Way
Redmond, WA 98052 USA