Internet Engineering Task Force                             Eddie Kohler
INTERNET-DRAFT                                                      UCLA
draft-ietf-dccp-spec-06.txt                                 Mark Handley
Expires: August 2004                                                 UCL
                                                             Sally Floyd
                                                                    ICIR
                                                        16 February 2004

              Datagram Congestion Control Protocol (DCCP)

Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of [RFC 2026]. Internet-Drafts are
   working documents of the Internet Engineering Task Force (IETF), its
   areas, and its working groups. Note that other groups may also
   distribute working documents as Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt

   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html

Copyright Notice

   Copyright (C) The Internet Society (2004). All Rights Reserved.

Abstract

   This document specifies the Datagram Congestion Control Protocol
   (DCCP), which implements a congestion-controlled, unreliable flow of
   unicast datagrams suitable for use by applications such as streaming
   media, Internet telephony, and on-line games.

TO BE DELETED BY THE RFC EDITOR UPON PUBLICATION:

   Changes since draft-ietf-dccp-spec-05.txt:

   * Organization overhaul.

   * Add pseudocode for event processing.

   * Remove # NDP; replace with Ack Count.

   * Remove Identification, Challenge, ID Regime, and Connection Nonce.

   * Data Checksum (formerly Payload Checksum) uses a 32-bit CRC.

   * Switch location of non-negotiable features to clarify
     presentation; now the feature location controls its value.

   * Rename "value type" to "reconciliation rule".

   * Rename "Reset Reason" to "Reset Code".

   * Mobility ID becomes 128 bits long.

   * Add probabilities to Mobility ID discussion.

   * Add SyncAck.

Table of Contents

   1. Introduction
   2. Design Rationale
   3. Conventions and Terminology
      3.1. Numbers and Fields
      3.2. Parts of a Connection
      3.3. Features
      3.4. Round-Trip Times
      3.5. Robustness Principle
   4. Overview
      4.1. Packet Types
      4.2. Sequence Numbers
      4.3. States
      4.4. Congestion Control
      4.5. Features
      4.6. Other Differences from TCP
      4.7. Example Connection
   5. Header Formats
      5.1. Generic Header
      5.2. DCCP-Request Header
      5.3. DCCP-Response Header
      5.4. DCCP-Data, DCCP-Ack, and DCCP-DataAck Headers
      5.5. DCCP-CloseReq and DCCP-Close Headers
      5.6. DCCP-Reset Header
      5.7. DCCP-Move Header
      5.8. DCCP-Sync and DCCP-SyncAck Headers
      5.9. Options
         5.9.1. Padding Option
         5.9.2. Mandatory Option
   6. Feature Negotiation
      6.1. Change Options
      6.2. Confirm Options
      6.3. Reconciliation Rules
         6.3.1. Server-Priority
         6.3.2. Non-Negotiable
      6.4. Feature Numbers
      6.5. Examples
      6.6. Option Exchange
         6.6.1. Normal Exchange
         6.6.2. Loss and Retransmission
         6.6.3. Reordering
         6.6.4. Preference Changes
         6.6.5. Simultaneous Negotiation
         6.6.6. Unknown Features
         6.6.7. Invalid Options
         6.6.8. Mandatory Feature Negotiation
         6.6.9. Out-of-Band Agreement
         6.6.10. State Diagram
   7. Sequence Numbers
      7.1. Variables
      7.2. Initial Sequence Numbers
      7.3. Quiet Time
      7.4. Acknowledgement Numbers
      7.5. Validity and Synchronization
         7.5.1. Sequence-Validity Rules
         7.5.2. Handling Sequence-Invalid Packets
         7.5.3. Sequence and Acknowledgement Number Windows
         7.5.4. Sequence Window Feature
         7.5.5. Sequence Number Attacks
         7.5.6. Examples
      7.6. Extended Sequence Numbers
         7.6.1. When to Use Extended Sequence Numbers
         7.6.2. Header Processing
         7.6.3. Transitioning to Extended Sequence Numbers
         7.6.4. Sequence Transition Capable Feature
      7.7. NDP Count and Detecting Application Loss
         7.7.1. Usage Notes
         7.7.2. Send NDP Count Feature
   8. Event Processing
      8.1. Connection Establishment
         8.1.1. Client Request
         8.1.2. Service Codes
         8.1.3. Server Response
         8.1.4. Init Cookie Option
         8.1.5. Handshake Completion
      8.2. Data Transfer
      8.3. Termination
         8.3.1. Abnormal Termination
      8.4. DCCP State Diagram
      8.5. Pseudocode
   9. Checksums
      9.1. Header Checksum Field
      9.2. Header Checksum Coverage Field
      9.3. Data Checksum Option
         9.3.1. Check Data Checksum Feature
         9.3.2. Usage Notes
   10. Congestion Control IDs
      10.1. Unspecified Sender-Based Congestion Control
      10.2. TCP-like Congestion Control
      10.3. TFRC Congestion Control
      10.4. CCID-Specific Options, Features, and Reset Codes
   11. Acknowledgements
      11.1. Acks of Acks and Unidirectional Connections
      11.2. Ack Piggybacking
      11.3. Ack Ratio Feature
      11.4. Ack Vector Options
         11.4.1. Ack Vector Consistency
         11.4.2. Ack Vector Coverage
      11.5. Send Ack Vector Feature
      11.6. Slow Receiver Option
      11.7. Data Dropped Option
         11.7.1. Data Dropped and Normal Congestion Response
         11.7.2. Particular Drop Codes
   12. Explicit Congestion Notification
      12.1. ECN Capable Feature
      12.2. ECN Nonces
      12.3. Other Aggression Penalties
   13. Timing Options
      13.1. Timestamp Option
      13.2. Elapsed Time Option
      13.3. Timestamp Echo Option
   14. Multihoming and Mobility
      14.1. Mobility Capable Feature
      14.2. Mobility ID Feature
      14.3. Mobile Host Processing
      14.4. Stationary Host Processing
      14.5. Congestion Control State
      14.6. Security
   15. Maximum Packet Size
   16. Forward Compatibility
   17. Middlebox Considerations
   18. Relations to Other Specifications
      18.1. DCCP and RTP
      18.2. Multiplexing Issues
   19. Security Considerations
      19.1. Security Considerations for Mobility
      19.2. Security Considerations for Partial Checksums
   20. IANA Considerations
   21. Thanks
   A. Appendix: Ack Vector Implementation Notes
      A.1. Packet Arrival
         A.1.1. New Packets
         A.1.2. Old Packets
      A.2. Sending Acknowledgements
      A.3. Clearing State
      A.4. Processing Acknowledgements
   B. Appendix: Design Motivation
      B.1. CsCov and Partial Checksumming
   Normative References
   Informative References
   Authors' Addresses
   Intellectual Property Notice

1. Introduction

   This document describes the Datagram Congestion Control Protocol
   (DCCP), a transport protocol that implements a
   congestion-controlled, bidirectional stream of unreliable datagrams.
   Specifically, DCCP provides:

   o An unreliable flow of datagrams, with acknowledgements.

   o Reliable handshakes for connection setup and teardown.

   o Reliable negotiation of options, including negotiation of a
     suitable congestion control mechanism.

   o Mechanisms allowing a server to avoid holding any state for
     unacknowledged connection attempts or already-finished
     connections.

   o Congestion control incorporating Explicit Congestion Notification
     (ECN) and the ECN Nonce, as per [RFC 3168] and [RFC 3540].

   o Acknowledgement mechanisms communicating packet loss and ECN mark
     information. Acks are transmitted as reliably as the relevant
     congestion control mechanism requires, possibly completely
     reliably.

   o Optional mechanisms that tell the sending application, with high
     reliability, which data packets reached the receiver, and whether
     those packets were ECN marked, corrupted, or dropped in the
     receive buffer.

   o Path Maximum Transmission Unit (PMTU) discovery, as per
     [RFC 1191].

   DCCP is intended for applications, such as streaming media and
   Internet telephony, where reliable in-order delivery, combined with
   congestion control, can result in some information arriving at the
   receiver after it is no longer of use. So far, most such
   applications have either used TCP, with the attendant quality
   problems caused by late data delivery, or used UDP and implemented
   their own congestion control (or no congestion control at all).
   DCCP provides standard congestion control mechanisms for such
   applications. It enables the use of ECN, along with conformant
   end-to-end congestion control, for applications that would
   otherwise be using UDP. In addition, DCCP implements reliable
   connection setup, teardown, and feature negotiation.

   DCCP's target applications require the flow-based semantics of TCP,
   but do not want TCP's in-order delivery and reliability, or would
   like different congestion control dynamics than TCP.

2. Design Rationale

   DCCP was intended to be used by applications that currently use UDP
   without end-to-end congestion control. Most streaming UDP
   applications should have little reason not to switch to DCCP, once
   it is deployed. Thus, DCCP was designed to have as little overhead
   as possible, both in terms of the packet header size and in terms of
   the state and CPU overhead required at end hosts. Only the minimal
   necessary functionality was included in DCCP, leaving other
   functionality, such as forward error correction (FEC),
   semi-reliability, and multiple streams, to be layered on top of DCCP
   as desired. This desire for minimal overhead is also one of the
   reasons to avoid proposing an unreliable variant of the Stream
   Control Transmission Protocol (SCTP, [RFC 2960]).

   Different forms of conformant congestion control are appropriate for
   different applications. For example, applications such as on-line
   games might want to make quick use of any available bandwidth.
   Other applications, such as streaming media, might trade off this
   responsiveness for a steadier, less bursty rate, since sudden rate
   changes cause unacceptable UI glitches (such as audible pauses or
   clicks in the playout stream). Thus, DCCP allows applications to
   choose between several forms of congestion control. One choice,
   TCP-like Congestion Control, halves the congestion window in
   response to a packet drop or mark, as in TCP. Applications using
   this congestion control mechanism will respond quickly to changes in
   available bandwidth, but must be able to tolerate the abrupt changes
   in congestion window typical of TCP. A second alternative,
   TCP-Friendly Rate Control (TFRC, [RFC 3448]), a form of
   equation-based congestion control, minimizes abrupt changes in the
   sending rate while maintaining longer-term fairness with TCP.

   DCCP also lets unreliable traffic safely use ECN. A UDP kernel API
   might not allow applications to set UDP packets as ECN-capable,
   since the API could not guarantee the application would properly
   detect or respond to congestion. DCCP kernel APIs will have no such
   issues, since DCCP itself implements congestion control.

   We chose not to require the use of the Congestion Manager
   [RFC 3124], which allows multiple concurrent streams between the
   same sender and receiver to share congestion control. The current
   Congestion Manager can only be used by applications that have their
   own end-to-end feedback about packet losses, but this is not the
   case for many of the applications currently using UDP. In addition,
   the current Congestion Manager does not easily support multiple
   congestion control mechanisms, or lend itself to the use of forms of
   TFRC where the state about past packet drops or marks is maintained
   at the receiver rather than at the sender. DCCP should be able to
   make use of CM where desired by the application, but we do not see
   any benefit in making the deployment of DCCP contingent on the
   deployment of CM itself.

3. Conventions and Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in
   this document are to be interpreted as described in [RFC 2119].

3.1. Numbers and Fields

   All multi-byte numerical quantities in DCCP, such as port numbers,
   Sequence Numbers, and arguments to options, are transmitted in
   network byte order (most significant byte first).

   We occasionally refer to the "left" and "right" sides of a bit
   field. "Left" means towards the most significant bit, and "right"
   means towards the least significant bit.

   Reserved bitfields in DCCP packet headers MUST be ignored by
   receivers, and MUST be set to zero by senders, unless otherwise
   specified.

   Random numbers in DCCP are used for their security properties, and
   MUST be chosen according to the guidelines in [RFC 1750].

3.2. Parts of a Connection

   Each DCCP connection runs between two endpoints, which we often name
   DCCP A and DCCP B.

   DCCP connections are actively initiated by one endpoint. The active
   endpoint is called the client, and the passive endpoint is called
   the server.

   DCCP connections are bidirectional; data may pass from either
   endpoint to the other. This means that data and acknowledgements
   may be flowing in both directions simultaneously. Logically,
   however, a DCCP connection consists of two separate unidirectional
   connections, called half-connections.
   Each half-connection consists of the data packets sent by one
   endpoint and the corresponding acknowledgements sent by the other
   endpoint. We can illustrate this as follows:

      +--------+  A-to-B half-connection:         +--------+
      |        |    -->  data packets  -->        |        |
      |        |    <--  acknowledgements  <--    |        |
      | DCCP A |                                  | DCCP B |
      |        |  B-to-A half-connection:         |        |
      |        |    <--  data packets  <--        |        |
      +--------+    -->  acknowledgements  -->    +--------+

   Although they are logically distinct, in practice the
   half-connections overlap; a DCCP-DataAck packet, for example,
   contains application data relevant to one half-connection and
   acknowledgement information relevant to the other.

   In the context of a single half-connection, the HC-Sender is the
   endpoint sending data, while the HC-Receiver is the endpoint sending
   acknowledgements. For example, in the A-to-B half-connection,
   DCCP A is the HC-Sender and DCCP B is the HC-Receiver.

3.3. Features

   A feature is a DCCP connection attribute, identified by a feature
   number and an endpoint, on whose value the two endpoints agree.
   Many properties of a DCCP connection are controlled by features,
   including the congestion control mechanisms in use on the two
   half-connections, whether mobility is allowed, and whether ECN is
   supported. The endpoints can achieve agreement by out-of-band
   communication, or through the exchange of feature negotiation
   options in DCCP headers.

   The notation F/A represents the feature with feature number F
   located at DCCP endpoint A; the feature F/B has the same feature
   number, but is located at the other endpoint. Both DCCP A and
   DCCP B know, and agree on, the values of both F/A and F/B, but F/A
   and F/B may have different values.

   DCCP A is called the feature location for all features F/A, and the
   feature remote for all features F/B.

3.4. Round-Trip Times

   We sometimes refer to a round-trip time, for example when setting
   timers. If no useful round-trip time estimate is available, a DCCP
   implementation SHOULD use 0.2 seconds instead.

3.5. Robustness Principle

   DCCP implementations should follow TCP's "general principle of
   robustness": be conservative in what you do, be liberal in what you
   accept from others.

4. Overview

   DCCP's high-level connection dynamics should seem familiar to anyone
   who knows TCP. DCCP connections, like TCP connections, progress
   through three phases: initiation (including a three-way handshake),
   data transfer, and termination. Data can flow both ways over the
   connection. An acknowledgement framework lets senders discover how
   much data has been lost; congestion control uses this information to
   avoid unfairly congesting the network. Of course, DCCP provides
   unreliable datagram semantics, not TCP's reliable bytestream
   semantics. The application must package its data into explicit
   frames, and must retransmit its own data as necessary. It may be
   useful to think of DCCP either as TCP minus bytestream semantics and
   reliability, or as UDP plus congestion control, handshakes, and
   acknowledgements.

4.1. Packet Types

   DCCP uses eleven packet types to implement various protocol
   functions. For example, every new connection attempt begins with a
   DCCP-Request packet sent by the client. A DCCP-Request packet thus
   resembles a TCP SYN; but DCCP-Request is a packet type, not a flag,
   so there's no way to send an unexpected combination such as TCP's
   SYN+FIN+ACK+RST.
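
   The difference between a packet type and a set of flags can be
   sketched in a few lines of Python. This is only an illustration
   under stated assumptions: the class names DccpPacketType and TcpFlag
   are hypothetical, the member names mirror the eleven packet types
   listed later in this section, and the enum values produced by auto()
   are arbitrary placeholders, not the on-the-wire type codes. The TCP
   flag bit values are the standard TCP header control bits.

```python
from enum import Enum, IntFlag, auto


class DccpPacketType(Enum):
    """One member per DCCP packet type named in this section.

    The values are arbitrary placeholders (an assumption of this
    sketch), NOT the wire encoding of the packet type field.
    """
    REQUEST = auto()
    RESPONSE = auto()
    DATA = auto()
    ACK = auto()
    DATAACK = auto()
    CLOSEREQ = auto()
    CLOSE = auto()
    RESET = auto()
    MOVE = auto()
    SYNC = auto()
    SYNCACK = auto()


class TcpFlag(IntFlag):
    """A few TCP control bits; independent bits can be OR'ed together."""
    FIN = 0x01
    SYN = 0x02
    RST = 0x04
    ACK = 0x10


# TCP: the flag representation cannot rule out a nonsensical combination.
odd_tcp = TcpFlag.SYN | TcpFlag.FIN | TcpFlag.ACK | TcpFlag.RST

# DCCP: a packet carries exactly one type, so there is nothing to combine.
pkt_type = DccpPacketType.REQUEST
```

   Because a plain Enum does not support the "|" operator, an attempt
   to combine two DccpPacketType members raises TypeError, which is the
   property the paragraph above describes.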

   Eight packet types occur during the progress of a typical
   connection (two only during the initiation phase, three during the
   data transfer phase, and three only during the termination phase):

      Client                                      Server
      ------                                      ------
                          (1) Initiation
      DCCP-Request -->
                                       <-- DCCP-Response
      DCCP-Ack -->
                          (2) Data transfer
      DCCP-Data, DCCP-Ack, DCCP-DataAck -->
                   <-- DCCP-Data, DCCP-Ack, DCCP-DataAck
                          (3) Termination
                                       <-- DCCP-CloseReq
      DCCP-Close -->
                                          <-- DCCP-Reset

   Note the three-way handshakes during initiation and termination.
   The three remaining packet types are used for special purposes: when
   an endpoint moves, or to resynchronize after bursts of loss.

   Every DCCP packet starts with a common, 12-byte generic header, but
   different packet types may include different amounts of additional
   data. For example, the DCCP-Ack packet type includes an
   Acknowledgement Number. Every packet type may also contain options,
   up to around 1000 bytes' worth.

   All of the packet types are described below.

   DCCP-Request
      Sent by the client to initiate a connection (the first part of
      the three-way handshake).

   DCCP-Response
      Sent by the server in response to a DCCP-Request (the second
      part of the three-way handshake).

   DCCP-Data
      Used to transmit data.

   DCCP-Ack
      Used for pure acknowledgements.

   DCCP-DataAck
      Used for piggybacked data-plus-acknowledgements.

   DCCP-CloseReq
      Sent by the server to request that the client close the
      connection.

   DCCP-Close
      Used to close the connection; elicits a DCCP-Reset in response.

   DCCP-Reset
      Used to terminate the connection, either normally or abnormally.

   DCCP-Move
      Supports multihoming and mobility.

   DCCP-Sync, DCCP-SyncAck
      Used to resynchronize sequence numbers after large bursts of
      loss.

4.2. Sequence Numbers

   Each DCCP packet carries a sequence number, so that losses can be
   detected and reported. But unlike TCP's byte-based sequence
   numbers, DCCP sequence numbers are attached to packets. Each packet
   sent increments the sequence number by one. For example:

      DCCP A                                      DCCP B
      ------                                      ------
      DCCP-Data(seqno 1) -->
      DCCP-Data(seqno 2) -->
                         <-- DCCP-Ack(seqno 10, ackno 2)
      DCCP-DataAck(seqno 3, ackno 10) -->
                                 <-- DCCP-Data(seqno 11)

   Note that even DCCP-Ack pure acknowledgements increment the sequence
   number; after the DCCP-Ack with sequence number 10, the following
   DCCP-Data packet uses the next sequence number, 11. This lets the
   endpoints tell when acknowledgements are lost in the network. It
   also means that endpoints can get out of sync after a long burst of
   loss. The DCCP-Sync and DCCP-SyncAck packet types let DCCP recover
   from large loss bursts; see Section 7.5.

   Also note that, since DCCP is an unreliable protocol, there are no
   retransmissions, and it doesn't make sense to have a cumulative
   acknowledgement field. Acknowledgement Number (ackno) fields equal
   the largest sequence number received, rather than the TCP-style
   smallest sequence number not received. Separate options indicate
   any intermediate sequence numbers that weren't received.

4.3. States

   DCCP endpoints progress through different states during the course
   of a connection, corresponding roughly to the three phases of
   initiation, data transfer, and termination. The figure below shows
   the typical progress through these states for a client and server.
539 Client Server 540 ------ ------ 541 (0) No connection 542 CLOSED LISTEN 544 (1) Initiation 545 REQUEST DCCP-Request --> 546 <-- DCCP-Response RESPOND 547 PARTOPEN DCCP-Ack or DCCP-DataAck --> 549 (2) Data transfer 550 OPEN <-- DCCP-Data, Ack, DataAck --> OPEN 552 (3) Termination 553 <-- DCCP-CloseReq CLOSEREQ 554 CLOSING DCCP-Close --> 555 <-- DCCP-Reset CLOSED 556 TIMEWAIT 557 CLOSED 558 The client and server's typical progress through states. 560 The states are as follows; Section 8 describes them in more detail. 562 CLOSED 563 Represents a nonexistent connection. 565 LISTEN 566 Represents a server socket in the passive listening state. 567 LISTEN and CLOSED are not associated with any particular DCCP 568 connection. 570 REQUEST 571 The client socket enters this state, from CLOSED, after sending 572 a DCCP-Request packet to try to initiate a connection. 574 RESPOND 575 A server socket enters this state, from LISTEN, after receiving 576 a DCCP-Request from a client. 578 PARTOPEN 579 The client socket enters this state, from REQUEST, after 580 receiving a DCCP-Response from the server. This state 581 represents the third phase of the three-way handshake. The 582 client may send data in this state, but it MUST include an 583 Acknowledgement Number on all of its packets. 585 OPEN 586 The central, data transfer portion of a DCCP connection. Client 587 and server enter into this state from PARTOPEN and RESPOND, 588 respectively. Sometimes we speak of SERVER-OPEN and CLIENT-OPEN 589 states, corresponding to the server's OPEN state and the 590 client's OPEN state. 592 CLOSEREQ 593 A server socket enters this state, from SERVER-OPEN, to signal 594 that the connection is over, but the client must hold TIMEWAIT 595 state. 597 CLOSING 598 Either server or client can enter this state to close the 599 connection. 601 TIMEWAIT 602 A socket remains in this state for 2MSL after the connection has 603 been torn down, to prevent mistakes due to the delivery of old 604 packets. 
      One MSL, or Maximum Segment Lifetime, is the maximum length of
      time a packet could survive in the network.

4.4. Congestion Control

   DCCP connections are congestion controlled. Unlike TCP, however,
   DCCP supports multiple congestion control mechanisms for
   applications to choose from. In fact, the two half-connections can
   be governed by different mechanisms. Each mechanism corresponds to
   a one-byte congestion control identifier, or CCID. A CCID describes
   how the HC-Sender limits data packet rates; how it maintains
   necessary parameters, such as congestion windows; how the
   HC-Receiver sends congestion feedback via acknowledgements; and how
   it manages the acknowledgement rate.

   The endpoints negotiate their CCIDs during connection initiation.
   So far, CCIDs 2 and 3 have been defined for use with DCCP; CCID 0 is
   reserved, and CCID 1 is used for special purposes (see Section
   10.1).

   CCID 2 corresponds to TCP-like Congestion Control, which is similar
   to that of TCP. The sender maintains a congestion window and sends
   packets until that window is full. Packets are acknowledged by the
   receiver. Dropped packets and ECN [RFC 3168] indicate congestion;
   the response to congestion is to halve the congestion window.
   Acknowledgements in CCID 2 contain the sequence numbers of all
   received packets within some window, similar to a super
   selective-acknowledgement (SACK, [RFC 3517]).

   CCID 3 provides TFRC Congestion Control, an equation-based form of
   congestion control which is intended to provide a smoother response
   to congestion than CCID 2. The sender maintains a "transmit rate".
   The receiver sends acknowledgement packets containing information
   about the receiver's estimate of packet loss. The sender uses this
   information to update its transmit rate.
Although CCID 3 behaves 639 somewhat differently from TCP in its short term congestion response, 640 it is designed to operate fairly with TCP over the long term. 642 The behaviors of CCIDs 2 and 3 are fully defined in separate profile 643 documents [CCID 2 PROFILE] [CCID 3 PROFILE]. 645 4.5. Features 647 Agreement on DCCP feature values is achieved by explicit 648 negotiation, using options in DCCP packet headers. This generally 649 happens at connection startup, but negotiation can begin at any 650 time. The relevant options are Change L, Confirm L, Change R, and 651 Confirm R, with the "L" options sent by the feature location and the 652 "R" options sent by the feature remote. 654 A Change R message says to the peer, "change this feature value on 655 your side". The peer responds with a Confirm L, meaning "I've 656 changed it". The suggested option setting in Change R can sometimes 657 contain multiple values, which are sorted in preference order. For 658 example: 660 Client Server 661 ------ ------ 662 Change R(CCID, 2) --> 663 <-- Confirm L(CCID, 2) 664 * agreement that CCID/Server = 2 * 666 Change R(CCID, 3 4) --> 667 <-- Confirm L(CCID, 4, 4 2) 668 * agreement that CCID/Server = 4 * 670 In the second exchange, the client requests that the server use 671 either CCID 3 or CCID 4, with 3 preferred. The server chooses 4, 672 giving its preference list of "4 2". 674 A party that wants to change a feature located at itself issues a 675 "Change L" option, which elicits a "Confirm R" in reply. 677 Client Server 678 ------ ------ 679 <-- Change L(CCID, 3 2) 680 Confirm R(CCID, 3, 3 2) --> 681 * agreement that CCID/Server = 3 * 683 In this example, the server requests CCID value 3 or 2 for the 684 server's CCID, with 3 preferred, and the client agrees. 686 Retransmissions make feature negotiation reliable. Section 6 687 describes these options further. 689 4.6. 
Other Differences from TCP

   Interesting differences between DCCP and TCP, apart from those
   discussed so far, include:

   o  Copious space for options (up to 1008 bytes, the 1020-byte
      header limit minus the 12-byte generic header).

   o  Different acknowledgement formats. The CCID for a connection
      determines how much ack information needs to be transmitted. In
      CCID 2 (TCP-like), this is about one ack per 2 packets, and each
      ack must declare exactly which packets were received; in CCID 3
      (TFRC), it's about one ack per RTT, and acks must declare at
      minimum just the lengths of recent loss intervals.

   o  Denial-of-service (DoS) protection. Several DCCP mechanisms
      attempt to let servers limit the amount of state
      possibly-misbehaving clients can force them to maintain. An Init
      Cookie option, analogous to TCP's SYN Cookies [SYNCOOKIES],
      avoids SYN-flood-like attacks. Only one connection endpoint need
      hold TIMEWAIT state; the DCCP-CloseReq packet, which may only be
      sent by the server, passes that state to the client. Various
      rate limits let servers avoid attacks that might force extensive
      computation or packet generation.

   o  Distinguishing different kinds of loss. A Data Dropped option
      (Section 11.7) lets an endpoint declare that a packet was dropped
      because of corruption, because of receive buffer overflow, and so
      on. This facilitates research into more appropriate rate-control
      responses for these non-network-congestion losses (although
      currently all losses will cause a congestion response).

   o  Acknowledgement readiness. In TCP, a packet is acknowledged only
      when the data is queued for delivery to the application. This
      does not make sense in DCCP, where an application might request a
      drop-from-front receive buffer, for example. We acknowledge a
      packet when its options have been processed. The Data Dropped
      option may later say that the packet's payload was discarded.
727 o Integrated support for mobility and multihoming via the DCCP-Move 728 packet type. 730 o No receive window. DCCP is a congestion control protocol, not a 731 flow control protocol. 733 o No simultaneous open. Every connection has one client and one 734 server. 736 o No half-closed states. DCCP has no states corresponding to TCP's 737 FINWAIT and CLOSEWAIT, where one half-connection is explicitly 738 closed while the other is still active. 740 4.7. Example Connection 742 The progress of a typical DCCP connection is as follows. (This 743 description is informative, not normative.) 745 Client Server 746 ------ ------ 747 0. [CLOSED] [LISTEN] 748 1. DCCP-Request --> 749 2. <-- DCCP-Response 750 3. DCCP-Ack --> 751 <-- DCCP-Ack 752 4. DCCP-Data, DCCP-Ack, DCCP-DataAck --> 753 <-- DCCP-Data, DCCP-Ack, DCCP-DataAck 754 5. <-- DCCP-CloseReq 755 6. DCCP-Close --> 756 7. <-- DCCP-Reset 757 8. [TIMEWAIT] 759 1. The client sends the server a DCCP-Request packet specifying the 760 client and server ports, the service being requested, and any 761 features being negotiated, including the CCID that the client 762 would like the server to use. The client may optionally 763 piggyback some data on the DCCP-Request packet---an application- 764 level request, say---which the server may ignore. 766 2. The server sends the client a DCCP-Response packet indicating 767 that it is willing to communicate with the client. The response 768 indicates any features and options that the server agrees to, 769 begins or continues other feature negotiations if desired, and 770 optionally includes an Init Cookie that wraps up all this 771 information and which must be returned by the client for the 772 connection to complete. 774 3. The client sends the server a DCCP-Ack packet that acknowledges 775 the DCCP-Response packet. This acknowledges the server's 776 initial sequence number and returns the Init Cookie if there was 777 one in the DCCP-Response. 
      It may also continue feature negotiation. There might follow
      zero or more DCCP-Ack exchanges as required to finalize feature
      negotiation. The client may piggyback an application-level
      request on its final ack, producing a DCCP-DataAck packet.

   4. The server and client then exchange DCCP-Data packets, DCCP-Ack
      packets acknowledging that data, and, optionally, DCCP-DataAck
      packets containing piggybacked data and acknowledgements. If
      the client has no data to send, then the server will send
      DCCP-Data and DCCP-DataAck packets, while the client will send
      DCCP-Acks exclusively.

   5. The server sends a DCCP-CloseReq packet requesting a close.

   6. The client sends a DCCP-Close packet acknowledging the close.

   7. The server sends a DCCP-Reset packet with Reset Code 1,
      "Closed", and clears its connection state. In DCCP, unlike TCP,
      Resets are part of normal connection termination; see Section
      5.6.

   8. The client receives the DCCP-Reset packet and holds state for a
      reasonable interval of time to allow any remaining packets to
      clear the network.

   An alternative connection closedown sequence is initiated by the
   client:

   5b. The client sends a DCCP-Close packet closing the connection.

   6b. The server sends a DCCP-Reset packet with Reset Code 1,
       "Closed", and clears its connection state.

   7b. The client receives the DCCP-Reset packet and holds state for a
       reasonable interval of time to allow any remaining packets to
       clear the network.

5. Header Formats

   The variable-length DCCP header appears first in every DCCP packet.
   A header can be from 12 to 1020 bytes long. The initial 12 bytes of
   the header are the same regardless of packet type. Following these
   come optional additional fixed-length fields, depending on the
   packet type, and then a variable-length list of options. Finally,
   some packet types include application data.
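
   As an informal illustration, not part of the protocol definition,
   the layout just described can be sketched in Python. The function
   name header_layout and the keys of its result are hypothetical
   names chosen for this sketch; the byte positions and per-type field
   sizes are taken from Sections 5.1 through 5.8 of this document.
   Rejecting reserved packet types with an exception is likewise an
   assumption of the sketch, not a specified behavior.

```python
# Illustrative sketch only; header_layout and its result keys are
# hypothetical names, not defined by this specification.

# Packet types that carry no Acknowledgement Number (Sections 5.2, 5.4).
TYPES_WITHOUT_ACK = {0, 2}            # DCCP-Request, DCCP-Data

# Fixed-length fields other than the Acknowledgement Number, in bytes.
EXTRA_FIXED_BYTES = {
    0: 4,    # DCCP-Request: Service Code
    1: 4,    # DCCP-Response: Service Code
    7: 4,    # DCCP-Reset: Reset Code, Data 1, Data 2, Data 3
    8: 16,   # DCCP-Move: Mobility ID
}


def header_layout(packet: bytes) -> dict:
    """Return the size in bytes of each region of a DCCP packet."""
    if len(packet) < 12:
        raise ValueError("packet shorter than the 12-byte generic header")

    header_len = packet[4] * 4        # Data Offset counts 32-bit words
    ptype = packet[8] >> 4            # Type: high 4 bits of byte 8
    extended = (packet[8] >> 3) & 1   # X: the bit following Type

    if ptype > 10:
        raise ValueError("reserved packet type")  # assumption of this sketch

    generic = 16 if extended else 12
    ack = 0 if ptype in TYPES_WITHOUT_ACK else (8 if extended else 4)
    additional = ack + EXTRA_FIXED_BYTES.get(ptype, 0)

    if header_len < generic + additional or header_len > len(packet):
        raise ValueError("Data Offset inconsistent with packet contents")

    return {
        "type": ptype,
        "generic": generic,
        "additional": additional,
        "options": header_len - generic - additional,
        "data": len(packet) - header_len,
    }
```

   For a DCCP-Ack with X=0, a Data Offset of 5, and no application
   data, this sketch reports a 12-byte generic header, 4 bytes of
   additional fields, and 4 bytes of options.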
824 +---------------------------------------+ -. 825 | Generic Header | | 826 +---------------------------------------+ | 827 | Additional Fields (depending on type) | +- DCCP Header 828 +---------------------------------------+ | 829 | Options (optional) | | 830 +=======================================+ -' 831 | Application Data (optional) | 832 +=======================================+ 834 5.1. Generic Header 836 The DCCP generic header generally takes 12 bytes. 838 0 1 2 3 839 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 840 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 841 | Source Port | Dest Port | 842 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 843 | Data Offset | CCVal | CsCov | Checksum | 844 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 845 | Type |X| Res | Sequence Number | 846 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 848 Actually, there are two types of generic header, depending on the 849 value of X, the Extended Sequence Numbers bit. If X is zero, the 850 Sequence Number field takes 24 bits, as above. If X is one, the 851 Sequence Number field extends for an additional 24 bits, for a total 852 of 48: 854 0 1 2 3 855 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 856 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 857 | Source Port | Dest Port | 858 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 859 | Data Offset | CCVal | CsCov | Checksum | 860 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 861 | Type |1| Res | Sequence Number (high bits) . 862 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 863 . Sequence Number (low bits) | Reserved |T| 864 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 866 Source and Destination Ports: 16 bits each 867 These fields identify the connection, similar to the 868 corresponding fields in TCP and UDP. 
      The Source Port represents the relevant port on the endpoint
      that sent this packet, the Destination Port the relevant port
      on the other endpoint. Source Ports SHOULD be chosen randomly,
      to reduce the likelihood of attack.

   Data Offset: 8 bits
      The offset from the start of the DCCP header to the beginning of
      the packet's application data, in 32-bit words.

   CCVal: 4 bits
      Used by the HC-Sender CCID. For example, the A-to-B CCID's
      sender, which is active at DCCP A, MAY send 4 bits of
      information per packet to its receiver by encoding that
      information in CCVal. CCVal MUST be set to zero unless the
      HC-Sender CCID specifies a different value.

   Checksum Coverage (CsCov): 4 bits
      Checksum Coverage specifies what parts of the packet are covered
      by the Checksum field. This always includes the DCCP header and
      options, but if applications request it, some or all of the
      application data may be excluded. This can improve performance
      on noisy links, assuming the application can tolerate
      corruption. See Section 9.

   Checksum: 16 bits
      The Internet checksum of the packet's DCCP header (including
      options), a network-layer pseudoheader, and, depending on
      Checksum Coverage, some or all of the application data. See
      Section 9.

   Type: 4 bits
      The Type field specifies the type of the packet. The following
      values are defined:

         Type    Meaning
         ----    -------
         0       DCCP-Request
         1       DCCP-Response
         2       DCCP-Data
         3       DCCP-Ack
         4       DCCP-DataAck
         5       DCCP-CloseReq
         6       DCCP-Close
         7       DCCP-Reset
         8       DCCP-Move
         9       DCCP-Sync
         10      DCCP-SyncAck
         11-15   Reserved

   Extended Sequence Numbers (X): 1 bit
      This bit is set to one to indicate the use of an extended
      generic header with 48-bit Sequence and Acknowledgement Numbers.
921 Very-high-rate connections SHOULD set X to one, and use 48-bit 922 sequence numbers, to gain increased protection against wrapped 923 sequence numbers and attacks. See Section 7.6. 925 Reserved (Res): 3 bits 926 The version of DCCP specified here MUST ignore this field on 927 received packets, and MUST set it to all zeroes on generated 928 packets. 930 Sequence Number: 24 or 48 bits 931 Identifies the packet uniquely in the sequence of all packets 932 the source sent on this connection. Sequence Number increases 933 by one with every packet sent, including packets such as DCCP- 934 Ack that carry no application data. See Section 7. 936 Sequence Number Transition (T): 1 bit [X=1 only] 937 Set to one to indicate an ongoing transition from 24-bit to 938 48-bit sequence numbers. See Section 7.6. 940 Many packet types also carry an Acknowledgement Number in the four 941 or eight bytes immediately following the generic header. When X=0, 942 its format is: 944 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 945 | Reserved | Acknowledgement Number | 946 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 948 And when X=1: 950 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 951 | Reserved | Acknowledgement Number (high bits) . 952 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 953 . Acknowledgement Number (low bits) | Reserved | 954 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 956 Acknowledgement Number: 24 or 48 bits 957 The Acknowledgement Number field generally acknowledges the 958 greatest valid sequence number received so far on this 959 connection. ("Greatest" is, of course, measured in circular 960 sequence space.) Acknowledgement numbers make no attempt to 961 provide precise information about which packets have arrived; 962 options such as the Ack Vector do this. 
964 Reserved: 8 bits 965 The version of DCCP specified here MUST ignore these fields on 966 received packets, and MUST set them to all zeroes on generated 967 packets. 969 5.2. DCCP-Request Header 971 A client initiates a DCCP connection by sending a DCCP-Request 972 packet. 974 0 1 2 3 975 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 976 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 977 / Generic DCCP Header (12 or 16 bytes) / 978 / with Type=0 (DCCP-Request) / 979 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 980 | Service Code | 981 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 982 | Options / Padding | 983 +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ 984 | Application Data | 985 | ... | 987 Service Code: 32 bits 988 Describes the service to which the client application wants to 989 connect. Examples might include RTSP and DOOM. Service Codes 990 are intended to make application protocols independent of well- 991 known ports, and help middleboxes identify the protocol used on 992 a given connection. See Section 8.1.2. 994 5.3. DCCP-Response Header 996 The server responds to valid DCCP-Request packets with DCCP-Response 997 packets. This is the second phase of the three-way handshake. 999 0 1 2 3 1000 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1001 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1002 / Generic DCCP Header (12 or 16 bytes) / 1003 / with Type=1 (DCCP-Response) / 1004 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1005 | Reserved | Acknowledgement Number | 1006 (+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+)when 1007 (. 
Acknowledgement Number (low bits) | Reserved |)X=1 1008 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1009 | Service Code | 1010 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1011 | Options / Padding | 1012 +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ 1013 | Application Data | 1014 | ... | 1016 Acknowledgement Number: 24 or 48 bits 1017 The Acknowledgement Number field will generally equal the 1018 Sequence Number from the DCCP-Request. 1020 Service Code: 32 bits 1021 Echoes the Service Code on the DCCP-Request. 1023 5.4. DCCP-Data, DCCP-Ack, and DCCP-DataAck Headers 1025 The central data transfer portion of every DCCP connection uses 1026 DCCP-Data, DCCP-Ack, and DCCP-DataAck packets. DCCP-Data packets 1027 carry application data. 1029 0 1 2 3 1030 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1031 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1032 / Generic DCCP Header (12 or 16 bytes) / 1033 / with Type=2 (DCCP-Data) / 1034 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1035 | Options / Padding | 1036 +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ 1037 | Application Data | 1038 | ... | 1040 DCCP-Ack packets dispense with the data, but contain an 1041 Acknowledgement Number. They are used for pure acknowledgements. 1043 0 1 2 3 1044 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1045 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1046 / Generic DCCP Header (12 or 16 bytes) / 1047 / with Type=3 (DCCP-Ack) / 1048 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1049 | Reserved | Acknowledgement Number | 1050 (+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+)when 1051 (. 
Acknowledgement Number (low bits) | Reserved |)X=1 1052 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1053 | Options / Padding | 1054 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1056 DCCP-DataAck packets carry both application data and an 1057 Acknowledgement Number: acknowledgement information is piggybacked 1058 on a data packet. 1060 0 1 2 3 1061 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1062 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1063 / Generic DCCP Header (12 or 16 bytes) / 1064 / with Type=4 (DCCP-DataAck) / 1065 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1066 | Reserved | Acknowledgement Number | 1067 (+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+)when 1068 (. Acknowledgement Number (low bits) | Reserved |)X=1 1069 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1070 | Options / Padding | 1071 +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ 1072 | Application Data | 1073 | ... | 1075 DCCP-Data and DCCP-DataAck packets may contain zero application data 1076 bytes if the application sends a zero-length datagram. Also, a 1077 DCCP-Ack packet need not have a zero-length application data area. 1078 The receiver MUST ignore any "application data" in a DCCP-Ack 1079 packet. The sender will not generally send such data, but it may 1080 occasionally do so---to perform PMTU discovery without risking loss 1081 of user data, for example. 1083 DCCP-Ack and DCCP-DataAck packets often include additional 1084 acknowledgement options, such as Ack Vector, as required by the 1085 congestion control mechanism in use. 1087 5.5. DCCP-CloseReq and DCCP-Close Headers 1089 DCCP-CloseReq and DCCP-Close packets begin the handshake that 1090 normally terminates a connection. Either client or server may send 1091 a DCCP-Close packet, which will elicit a DCCP-Reset packet (see the 1092 next section). 
Only the server can send a DCCP-CloseReq packet, 1093 which indicates that the server wants to close the connection, but 1094 does not want to hold its TIMEWAIT state. 1096 0 1 2 3 1097 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1098 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1099 / Generic DCCP Header (12 or 16 bytes) / 1100 / with Type=5 (DCCP-CloseReq) or 6 (DCCP-Close) / 1101 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1102 | Reserved | Acknowledgement Number | 1103 (+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+)when 1104 (. Acknowledgement Number (low bits) | Reserved |)X=1 1105 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1106 | Options / Padding | 1107 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1109 The receiver MUST ignore any "application data" in a DCCP-CloseReq 1110 or DCCP-Close packet. 1112 5.6. DCCP-Reset Header 1114 DCCP-Reset packets unconditionally shut down a connection. 1115 Connections normally terminate with a DCCP-Reset, but resets may be 1116 sent for other reasons, including bad port numbers, bad option 1117 behavior, incorrect ECN Nonce Echoes, and so forth. 1119 0 1 2 3 1120 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1121 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1122 / Generic DCCP Header (12 or 16 bytes) / 1123 / with Type=7 (DCCP-Reset) / 1124 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1125 | Reserved | Acknowledgement Number | 1126 (+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+)when 1127 (. 
Acknowledgement Number (low bits) | Reserved |)X=1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Reset Code   |    Data 1     |    Data 2     |    Data 3     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                       Options / Padding                       |
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
   |                          Error Text                           |
   |                              ...                              |

   Reset Code: 8 bits
      Represents the reason that the sender reset the DCCP connection.

   Data 1, Data 2, and Data 3: 8 bits each
      The Data fields provide additional information about why the
      sender reset the DCCP connection. The meanings of these fields
      depend on the value of Reset Code.

   Error Text (application data area)
      If present, Error Text is a human-readable text string,
      preferably in English and encoded in Unicode UTF-8, that
      describes the error in more detail. For example, a DCCP-Reset
      with Reset Code 12, "Aggression Penalty", might contain Error
      Text such as "Aggression Penalty: Received 3 bad ECN Nonce
      Echoes, assuming misbehavior".

   The following Reset Codes are currently defined. The "Data" columns
   describe what the Data fields contain for a given Code. N/A means
   the Data field MUST be set to 0 by the sender of the DCCP-Reset and
   ignored by its receiver.

   Reset                                                    Section
   Code     Name                Data 1   Data 2   Data 3    Reference
   -----    ----                ------   ------   ------    ---------
   0        Unspecified         N/A      N/A      N/A
   1        Closed              N/A      N/A      N/A       8.3
   2        Aborted             N/A      N/A      N/A       8.1.1
   3        No Connection       N/A      N/A      N/A       8.3.1
   4        Packet Error        packet   N/A      N/A       8.3.1
                                type
   5        Option Error        option   option data
                                number   (if any)
   6        Mandatory Error     option   option data        5.9.2
                                number   (if any)
   7        Extended Seqnos     N/A      N/A      N/A       7.6
   8        Connection Refused  N/A      N/A      N/A       8.1.3
   9        Bad Service Code    N/A      N/A      N/A       8.1.3
   10       Too Busy            N/A      N/A      N/A       8.1.3
   11       Bad Init Cookie     N/A      N/A      N/A       8.1.4
   12       Aggression Penalty  N/A      N/A      N/A       12.2
   13       Move Refused        N/A      N/A      N/A       14.4
   14-127   Reserved
   128-255  CCID-specific codes     ... variable ...        10.4

5.7. DCCP-Move Header

   The DCCP-Move packet type is part of DCCP's support for multihoming
   and mobility, which is described further in Section 14. DCCP A sends
   a DCCP-Move packet to DCCP B after changing its address and/or port
   number. The DCCP-Move packet requests that DCCP B start sending
   packets to a new address and port number, which are read off the
   packet's network header and generic DCCP header. The old address
   and port are identified by a Mobility ID, which provides some
   protection against hijacked connections.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /             Generic DCCP Header (12 or 16 bytes)              /
   /                    with Type=8 (DCCP-Move)                    /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   Reserved    |            Acknowledgement Number             |
  (+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+)when
  (.       Acknowledgement Number (low bits)       |   Reserved    |)X=1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Mobility ID (high bits)                    .
1202 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1203 . Mobility ID (bits 64-95) . 1204 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1205 . Mobility ID (bits 32-63) . 1206 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1207 . Mobility ID (low bits) | 1208 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1209 | Options / Padding | 1210 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1212 Mobility ID: 128 bits 1213 The value of the receiver's Mobility ID feature. This value 1214 uniquely identifies the current connection among the set of 1215 connections terminating at the receiver (meaning, the stationary 1216 endpoint); it MUST have been set in an earlier exchange. See 1217 Section 14.2. 1219 The receiver MUST ignore any "application data" in a DCCP-Move 1220 packet. 1222 5.8. DCCP-Sync and DCCP-SyncAck Headers 1224 DCCP-Sync packets help DCCP endpoints recover synchronization after 1225 bursts of loss, or recover from half-open connections. Each valid 1226 DCCP-Sync received immediately elicits a DCCP-SyncAck. 1228 0 1 2 3 1229 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1230 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1231 / Generic DCCP Header (12 or 16 bytes) / 1232 / with Type=9 (DCCP-Sync) or 10 (DCCP-SyncAck) / 1233 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1234 | Reserved | Acknowledgement Number | 1235 (+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+)when 1236 (. Acknowledgement Number (low bits) | Reserved |)X=1 1237 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1238 | Options / Padding | 1239 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1241 The Acknowledgement Number on DCCP-Sync and DCCP-SyncAck packets 1242 need not equal the generating endpoint's greatest valid sequence 1243 number received (GSR). 
This differs from Acknowledgement Numbers on 1244 all other packet types. If a DCCP-Sync was generated in response to 1245 a packet with invalid sequence numbers, then the DCCP-Sync's 1246 Acknowledgement Number will equal the invalid packet's sequence 1247 number. The Acknowledgement Number on any DCCP-SyncAck packet MUST 1248 correspond to a received, valid DCCP-Sync's Sequence Number; in the 1249 presence of reordering, this might not equal GSR. 1251 The receiver MUST ignore any "application data" in a DCCP-Sync or 1252 DCCP-SyncAck packet. 1254 5.9. Options 1256 All DCCP packets may contain options, which occupy space at the end 1257 of the DCCP header. Each option is a multiple of 8 bits in length. 1258 The combination of all options MUST add up to a multiple of 32 bits. 1259 Individual options are not padded to multiples of 32 bits, however; 1260 any option may begin on any byte boundary. All options are always 1261 included in the checksum. 1263 The first byte of an option is the option type. Options with types 1264 0 through 31 are single-byte options. Other options are followed by 1265 a byte indicating the option's length. This length value includes 1266 the two bytes of option-type and option-length as well as any 1267 option-data bytes, and must therefore be greater than or equal to 1268 two. 1270 Options are processed sequentially, starting at the first option in 1271 the packet header. 
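
   The option encoding described above can be illustrated with a
   short Python sketch. This is an informal example, not a normative
   algorithm: the function name parse_options is hypothetical, and
   raising an exception on a malformed options area is an assumption
   of the sketch rather than behavior specified here. The sketch
   walks the options in order, treats types 0 through 31 as
   single-byte options, and reads a length byte for all other types.

```python
# Illustrative sketch only; parse_options is a hypothetical name and
# the error handling is an assumption, not specified behavior.

def parse_options(area: bytes) -> list:
    """Return (type, data) pairs for each option, in packet order."""
    if len(area) % 4 != 0:
        raise ValueError("options must total a multiple of 32 bits")

    options = []
    pos = 0
    while pos < len(area):
        opt_type = area[pos]
        if opt_type <= 31:
            # Types 0 through 31 are single-byte options.
            options.append((opt_type, b""))
            pos += 1
            continue
        if pos + 1 >= len(area):
            raise ValueError("option is missing its length byte")
        # The length counts the type and length bytes themselves.
        length = area[pos + 1]
        if length < 2 or pos + length > len(area):
            raise ValueError("invalid option length")
        options.append((opt_type, area[pos + 2:pos + length]))
        pos += length
    return options
```

   For example, an options area holding a four-byte Change R option
   (type 34) followed by four Padding bytes yields one option with
   two data bytes and four empty Padding options.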

   The following options are currently defined:

            Option                             Section
   Type     Length     Meaning                 Reference
   ----     ------     -------                 ---------
   0        1          Padding                 5.9.1
   1        1          Mandatory               5.9.2
   2        1          Slow Receiver           11.6
   3-31     1          Reserved
   32       variable   Change L                6.1
   33       variable   Confirm L               6.2
   34       variable   Change R                6.1
   35       variable   Confirm R               6.2
   36       variable   Init Cookie             8.1.4
   37       4-5        NDP Count               7.7
   38       variable   Ack Vector [Nonce 0]    11.4
   39       variable   Ack Vector [Nonce 1]    11.4
   40       variable   Data Dropped            11.7
   41       6          Timestamp               13.1
   42       6-10       Timestamp Echo          13.3
   43       4-6        Elapsed Time            13.2
   44       4          Data Checksum           9.3
   45-127   variable   Reserved
   128-255  variable   CCID-specific options   10.4

   This section describes two generic options, Padding and Mandatory.
   Other options are described later.

5.9.1. Padding Option

   The Padding option, with type 0, is a single-byte option used to pad
   between or after options. It either ensures the application data
   begins on a 32-bit boundary (as required), or ensures alignment of
   following options (not mandatory).

   +--------+
   |00000000|
   +--------+
    Type=0

5.9.2. Mandatory Option

   The Mandatory option, with type 1, is a single-byte option that
   indicates that the immediately following option is mandatory. If
   the receiving DCCP does not understand that following option, it
   MUST reset the connection, generally using Reset Code 6, "Mandatory
   Error". For instance, say DCCP A receives a packet with two
   options: a Mandatory option, and immediately following, another
   option O.
Then DCCP A would reset the connection if it did not 1322 understand O's type; if it understood O's type, but not O's data; if 1323 O's data was invalid for O's type; if O was a feature negotiation 1324 option, and DCCP A did not understand the enclosed feature number; 1325 if DCCP A understood O, but chose not to perform the action O 1326 implies; and so forth. Section 6.6.8 describes the behavior of 1327 Mandatory feature negotiation options in more detail. 1329 +--------+ 1330 |00000001| 1331 +--------+ 1332 Type=1 1334 6. Feature Negotiation 1336 Four DCCP options, Change L, Confirm L, Change R, and Confirm R, 1337 implement in-band feature negotiation. Change options initiate a 1338 negotiation; Confirm options complete that negotiation. The "L" 1339 options are sent by the feature location, and the "R" options are 1340 sent by the feature remote. Change options are retransmitted to 1341 ensure reliability. 1343 All these options have the same format. The first byte of option 1344 data is the feature number, and the second and subsequent data bytes 1345 hold one or more feature values. The feature values are generally 1346 arranged in a linear preference list, where the first value is most 1347 preferred. 1349 +--------+--------+--------+--------+-------- 1350 | Type | Length |Feature#| Value(s) ... 1351 +--------+--------+--------+--------+-------- 1353 Together, the feature number and the option type ("L" or "R") 1354 uniquely identify the feature to which an option applies. The exact 1355 format of the Value(s) area depends on the feature number. 1357 6.1. Change Options 1359 Change L and Change R options initiate feature negotiation. Either 1360 endpoint can start a negotiation for any feature; if DCCP A wants to 1361 start a negotiation for feature F/A, it will send a Change L option, 1362 while to start a negotiation for F/B, it will send a Change R 1363 option. Change options are retransmitted until some response is 1364 received. 
Normal Change options contain at least one Value, and 1365 thus have length at least 4. 1367 +--------+--------+--------+--------+-------- 1368 Change L: |00100000| Length |Feature#| Value(s) ... 1369 +--------+--------+--------+--------+-------- 1370 Type=32 1372 +--------+--------+--------+--------+-------- 1373 Change R: |00100010| Length |Feature#| Value(s) ... 1374 +--------+--------+--------+--------+-------- 1375 Type=34 1377 The endpoint may check a feature's current value without attempting 1378 to change it by sending an empty Change option, containing just the 1379 feature number. Such options have length 3. The endpoints must 1380 agree on feature values anyway, so these options are useful in 1381 practice only in special situations, such as when a middlebox 1382 introduced in the middle of a connection wants to check a feature 1383 value. 1385 6.2. Confirm Options 1387 Confirm L and Confirm R options complete feature negotiation, and 1388 are sent in response to Change R and Change L options, respectively. 1389 Confirm options MUST NOT be generated except in response to Change 1390 options. Confirm options need not be retransmitted, since Change 1391 options are retransmitted as necessary. Normal Confirm options 1392 contain the selected Value, possibly followed by the sender's 1393 preference list. 1395 +--------+--------+--------+--------+-------- 1396 Confirm L: |00100001| Length |Feature#| Value(s) ... 1397 +--------+--------+--------+--------+-------- 1398 Type=33 1400 +--------+--------+--------+--------+-------- 1401 Confirm R: |00100011| Length |Feature#| Value(s) ... 1402 +--------+--------+--------+--------+-------- 1403 Type=35 1405 If an endpoint receives an invalid Change option -- with an unknown 1406 feature number, or an invalid value -- it will respond with an empty 1407 Confirm option containing no value. Such options have length 3. 1409 6.3. 
Reconciliation Rules 1411 Reconciliation rules determine how the two sets of preferences for a 1412 given feature are resolved into a unique result. The reconciliation 1413 rule depends only on the feature number. Each reconciliation rule 1414 must have the property that the result is uniquely determined given 1415 the contents of Change options sent by the two endpoints. 1417 All current DCCP features use one of two reconciliation rules, 1418 server-priority ("SP") and non-negotiable ("NN"). 1420 6.3.1. Server-Priority 1422 The feature value is a fixed-length byte string (length determined 1423 by the feature number). Each Change option contains a preference 1424 list of values, with the most preferred value coming first. Each 1425 Confirm option contains the confirmed value, followed by the 1426 confirmer's preference list. Thus, the feature's current value will 1427 generally appear twice in Confirm options' data, once as the current 1428 value and once in the confirmer's preference list. Even responses 1429 to empty Change options contain the whole preference list. 1431 To reconcile the preference lists, select the first entry in the 1432 server's list that also occurs in the client's list. If there is no 1433 shared entry, the feature's value MUST NOT change, and the Confirm 1434 option will confirm the feature's previous value (unless the Change 1435 option was Mandatory; see Section 6.6.8). 1437 DCCP endpoints need not calculate their value preference lists 1438 before feature negotiation begins. Thus, a server might adjust its 1439 preference list based on the client's preference list, assuming the 1440 client opened the negotiation. Once a negotiation for a feature has 1441 begun, however, the preference lists MUST remain stable until the 1442 negotiation has closed. 1444 6.3.2. Non-Negotiable 1446 The feature value is a byte string. Each option contains exactly 1447 one feature value. 
The feature location signals a value change by 1448 sending Change L options. The feature remote MUST accept any valid 1449 value, responding with a Confirm R option containing the new value, 1450 and it MUST send empty Confirm R options in response to invalid 1451 values. Non-negotiable features aren't really negotiated; they use 1452 feature negotiation as a mechanism for achieving reliability. 1453 Change R and Confirm L options MUST NOT be sent for non-negotiable 1454 features. 1456 6.4. Feature Numbers 1458 This document defines the following feature numbers. 1460 Rec'n Initial Section 1461 Number Meaning Rule Value Req'd Reference 1462 ------ ------- ----- ----- ----- --------- 1463 0 Reserved 1464 1 Congestion Control ID (CCID) SP 2 Y 10 1465 2 ECN Capable SP 1 Y 12.1 1466 3 Sequence Window NN 100 Y 7.5.4 1467 4 Sequence Transition Capable SP 0 N 7.6.4 1468 5 Mobility Capable SP 0 N 14.1 1469 6 Mobility ID NN 0 N 14.2 1470 7 Ack Ratio NN 2 N 11.3 1471 8 Send Ack Vector SP 0 N 11.5 1472 9 Send NDP Count SP 0 N 7.7.2 1473 10 Check Data Checksum SP 0 N 9.3.1 1474 11-127 Reserved 1475 128-255 CCID-specific features ? ? ? 10.4 1477 Rec'n Rule The reconciliation rule used for the feature. SP is 1478 server-priority and NN is non-negotiable. 1480 Initial Value The initial value for the feature. Every feature has 1481 a known initial value. 1483 Req'd This column is "Y" iff every DCCP implementation MUST 1484 understand the feature. If it is "N", then the 1485 feature behaves like an extension (see Section 16), 1486 and it is safe to respond to Change options for the 1487 feature with empty Confirm options. Of course, a 1488 CCID might require the feature; a DCCP that 1489 implements CCID 2 MUST support Ack Ratio and Send Ack 1490 Vector, for example. 1492 6.5. 
Examples 1493 Here are three example feature negotiations for features located at 1494 the server, the first two for the Congestion Control ID feature, the 1495 last for the Ack Ratio: 1497 Client Server 1498 1. Change R(CCID, 2 3 1) --> 1499 ("2 3 1" is client's value preference list) 1500 2. <-- Confirm L(CCID, 3, 3 2 1) 1501 (3 is the negotiated value; 1502 "3 2 1" is server's pref list) 1503 * agreement that CCID/Server = 3 * 1505 1. XXX <-- Change L(CCID, 3 2 1) 1506 2. Retransmission: 1507 <-- Change L(CCID, 3 2 1) 1508 3. Confirm R(CCID, 3, 2 3 1) --> 1509 * agreement that CCID/Server = 3 * 1511 1. <-- Change L(Ack Ratio, 3) 1512 2. Confirm R(Ack Ratio, 3) --> 1513 * agreement that Ack Ratio/Server = 3 * 1515 This example shows a simultaneous negotiation. 1517 Client Server 1518 1a. Change R(CCID, 2 3 1) --> 1519 b. <-- Change L(CCID, 3 2 1) 1520 (both endpoints in CHANGING) 1521 2a. <-- Confirm L(CCID, 3, 3 2 1) 1522 b. Confirm R(CCID, 3, 2 3 1) --> 1523 (both endpoints in STABLE) 1524 * agreement that CCID/Server = 3 * 1526 Example Change and Confirm options follow, with their byte 1527 encodings. Each option is sent by DCCP A. 1529 Change L(CCID, 2 3) = 32,5,1,2,3 1530 I want to change CCID/A's value (feature number 1, a server- 1531 priority feature); my preferred values are 2 and 3, in that 1532 preference order. 1534 Change L(Sequence Window, 1024) = 32,6,3,0,4,0 1535 Change Sequence Window/A's value (feature number 3, a non- 1536 negotiable feature) to the 3-byte string 0,4,0 (the value 1024). 1538 Empty Change L(CCID) = 32,3,1 1539 Tell me CCID/A's value using a Confirm R option. 1541 Confirm L(CCID, 2, 2 3) = 33,6,1,2,2,3 1542 I've changed CCID/A's value to 2; my preferred values are 2 and 1543 3, in that preference order. 1545 Empty Confirm L(126) = 33,3,126 1546 I don't implement feature number 126, or your proposed value for 1547 feature 126/A was invalid. 
1549 Change R(CCID, 3 2) = 34,5,1,3,2 1550 Please change CCID/B's value; my preferred values are 3 and 2, 1551 in that preference order. 1553 Empty Change R(CCID) = 34,3,1 1554 Tell me CCID/B's value using a Confirm L option. 1556 Confirm R(CCID, 2, 3 2) = 35,6,1,2,3,2 1557 I've changed CCID/B's value to 2; my preferred values were 3 and 1558 2, in that preference order. 1560 Confirm R(Sequence Window, 1024) = 35,6,3,0,4,0 1561 I've changed Sequence Window/B's value to the 3-byte string 1562 0,4,0 (the value 1024). 1564 Empty Confirm R(126) = 35,3,126 1565 I don't implement feature number 126, or your proposed value for 1566 feature 126/B was invalid. 1568 6.6. Option Exchange 1570 A few basic rules govern feature negotiation option exchange. 1572 1. Every non-reordered Change option gets a Confirm option in 1573 response. 1575 2. Change options are retransmitted until some response is 1576 received. 1578 3. Preference lists don't change during a negotiation. 1580 4. Feature negotiation options are processed in strictly increasing 1581 order by Sequence Number. 1583 The rest of this section describes the consequences of these rules 1584 in more detail. 1586 6.6.1. Normal Exchange 1588 Change options are generated when a DCCP endpoint wants to change 1589 the value of some feature. Generally, this will happen at the 1590 beginning of a connection, although it may happen at any time. We 1591 say the endpoint "generates" or "sends" a Change L or Change R 1592 option; but, of course, the option must be attached to a packet. 1593 The endpoint may attach the option to a packet it would have 1594 generated anyway (such as a DCCP-Request), or it may create a new 1595 packet just to carry the options (often a DCCP-Sync). If it does 1596 create a new packet, it MUST NOT create more than one such packet 1597 per round-trip time (or 0.2 seconds, if no RTT is available). 
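As an aside, the byte encodings listed in Section 6.5 and the server-priority rule of Section 6.3.1 can be reproduced with a short Python sketch. The function names (encode_option, reconcile_server_priority) and the "width" parameter are hypothetical helpers introduced here for illustration; they are not part of this specification. The sketch assumes values are written as big-endian integers, which matches the "0,4,0 (the value 1024)" example.

```python
# Illustrative sketch only. All names below are hypothetical, not from the draft.
CHANGE_L, CONFIRM_L, CHANGE_R, CONFIRM_R = 32, 33, 34, 35  # option types, Section 5.9


def encode_option(opt_type, feature, values=(), width=1):
    """Encode a feature negotiation option as a list of byte values.

    Each value is written as a big-endian integer of `width` bytes.
    An empty `values` sequence yields the 3-byte "empty" form.
    """
    data = []
    for value in values:
        data.extend(value.to_bytes(width, "big"))
    length = 3 + len(data)  # type byte + length byte + feature number + values
    if length > 255:
        raise ValueError("option does not fit a one-byte Length field")
    return [opt_type, length, feature, *data]


def reconcile_server_priority(server_prefs, client_prefs, current):
    """Pick the first entry in the server's list that the client also lists.

    If no entry is shared, the feature keeps its current value
    (the non-Mandatory case of Section 6.3.1).
    """
    for value in server_prefs:
        if value in client_prefs:
            return value
    return current
```

For instance, encode_option(CHANGE_L, 1, [2, 3]) gives the bytes listed for "Change L(CCID, 2 3)", and reconcile_server_priority([3, 2, 1], [2, 3, 1], 2) selects 3, as in the first CCID example.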
1599 On receiving a Change L or Change R option, a DCCP endpoint examines 1600 the included preference list, reconciles that with its own 1601 preference list, calculates the new value, and sends back a 1602 Confirm R or Confirm L option, respectively, informing its partner 1603 of the new value. The rule for reconciling the two preference lists 1604 is feature-specific; see Section 6.3. Every non-reordered Change 1605 option MUST result in a corresponding Confirm option. Any packet 1606 including a Confirm option MUST carry an Acknowledgement Number; 1607 thus, Confirm options are not allowed on DCCP-Request and DCCP-Data 1608 packets. Again, generated Confirm options may be attached to 1609 packets that would have been sent anyway (such as DCCP-Response or 1610 DCCP-SyncAck), or to new packets (usually DCCP-Ack). 1612 The Change-sending endpoint MUST wait to receive a corresponding 1613 Confirm option before changing its stored feature value. The 1614 Confirm-sending endpoint changes its stored feature value as soon as 1615 it sends the Confirm. 1617 DCCP endpoints effectively exist in one of two states, STABLE and 1618 CHANGING, relative to each feature. STABLE is the normal state, 1619 where the endpoint knows the feature's value and thinks the other 1620 endpoint agrees. An endpoint enters the CHANGING state when it 1621 first sends a Change for the feature, and returns to STABLE once it 1622 receives a corresponding Confirm. 1624 6.6.2. Loss and Retransmission 1626 Packets containing Change and Confirm options might be lost or 1627 delayed by the network. Therefore, Change options are retransmitted 1628 to achieve reliability. 1630 A CHANGING endpoint retransmits a Change option once it realizes 1631 that it has not heard back from the other endpoint. Each 1632 retransmitted Change option MUST contain exactly the same payload as 1633 the original. The endpoint may piggyback its Change options on 1634 packets it would have sent anyway. 
If it generates new packets for 1635 feature negotiation, it MUST use an exponential-backoff timer. The 1636 timer's initial value is set to approximately one or two round-trip 1637 times (or 0.2-0.4 seconds, if no RTT is available), and it is pinned 1638 at roughly 32 RTTs. 1640 A CHANGING endpoint MUST continue retransmitting Change options 1641 until it gets some response. Its only recourse is to reset the 1642 connection, which it SHOULD NOT do until at least 12 transmissions 1643 have failed. 1645 Change options SHOULD NOT be transmitted more frequently than once 1646 per RTT, or the reordering protection below would prevent any 1647 Confirm option from being accepted (since no Confirm would 1648 acknowledge the most recently transmitted Change). 1650 Confirm options are never retransmitted, but the Confirm-sending 1651 endpoint MUST generate a new Confirm option for every non-reordered 1652 Change it receives. 1654 6.6.3. Reordering 1656 Reordering might cause packets containing Change and Confirm options 1657 to arrive in an unexpected order. Endpoints MUST be robust to 1658 reordering, by ignoring feature negotiation options that do not 1659 arrive in strictly-increasing order by Sequence Number. 1661 The most straightforward way to implement this requirement is for an 1662 endpoint to associate two sequence number variables with every 1663 feature F/X, as follows. 1665 F/X.GSR The Greatest Sequence Number Received from the other 1666 endpoint on a packet containing a Change or Confirm option 1667 for feature F/X. 1669 F/X.GSS The Greatest Sequence Number Sent by this endpoint on a 1670 packet containing a Change option for feature F/X. 1672 Then DCCP A will check options relating to feature F/A as follows: 1674 1. Ignore any received Change R(F) option whose packet's Sequence 1675 Number is not greater than F/A.GSR. 1677 2. 
Ignore any received Confirm R(F) option whose packet's Sequence 1678 Number is not greater than F/A.GSR, or whose packet could not 1679 have acknowledged F/A.GSS. Specifically, if the Acknowledgement 1680 Number is less than F/A.GSS, the endpoint MUST ignore the 1681 Confirm; and if the packet has an Ack Vector indicating that 1682 F/A.GSS was not received, the endpoint MAY ignore the Confirm. 1684 A similar procedure applies to options relating to feature F/B, namely 1685 Change L(F) and Confirm L(F), except that F/B.GSR and F/B.GSS are 1686 checked. 1688 A less state-intensive way to implement this requirement would be to 1689 share the F.GSR and F.GSS variables among all features, rather than 1690 keeping one pair per feature. Then the feature negotiation options 1691 on any received packet would be treated as a unit (either all 1692 accepted or all rejected). 1694 Checking Confirm options is easier if the endpoint only sends Change 1695 options on packet types that will be acknowledged immediately, 1696 namely DCCP-Request, DCCP-Response, and DCCP-Sync. Then there is 1697 never any need to check Ack Vectors, although checking Ack Vectors 1698 is NOT MANDATORY anyway. 1700 6.6.4. Preference Changes 1702 Endpoints MUST NOT change their preference lists in the middle of a 1703 negotiation. This is because, if a preference list changed in the 1704 middle of a negotiation and the right packets were lost, the 1705 negotiation could terminate with the endpoints thinking the feature 1706 had different values. In particular, an endpoint MUST NOT change 1707 its preference list while in the CHANGING state; this ensures that 1708 every Change option sent during that negotiation will contain the 1709 same data. 1711 6.6.5. Simultaneous Negotiation 1713 The two endpoints might simultaneously open negotiation for the same 1714 feature, after which an endpoint in the CHANGING state will receive 1715 a Change option for the same feature.
Such received Change options 1716 can act as responses to the original Change options. The CHANGING 1717 endpoint MUST examine the received Change's preference list, 1718 reconcile that with its own preference list (as expressed in its 1719 generated Change options), and generate the corresponding Confirm 1720 option. It can then transition to the STABLE state. 1722 6.6.6. Unknown Features 1724 An endpoint may receive a Change option referring to some feature 1725 number it does not understand. This is particularly likely to 1726 happen when an extended DCCP converses with a non-extended DCCP. 1727 The receiving endpoint MUST respond to such Change options with 1728 corresponding empty Confirm options (that is, Confirm options 1729 containing no data), which inform the CHANGING endpoint that the 1730 feature was not understood. However, if the Change option was 1731 preceded by a Mandatory option, the connection MUST be reset; see 1732 Section 6.6.8. 1734 On receiving an empty Confirm option for some feature, the CHANGING 1735 endpoint MUST transition back to the STABLE state, leaving the 1736 feature's value unchanged. Section 16 suggests that the default 1737 value for any extension feature should correspond to "extension not 1738 available". 1740 An endpoint will also send an empty Confirm option when it 1741 understood the Change's feature number, but considered the Change's 1742 value invalid or inappropriate for the feature. The next section 1743 describes this further. 1745 Some features are required to be understood by all DCCPs (see 1746 Section 6.4); the CHANGING endpoint SHOULD reset the connection 1747 (with Reset Code 5, "Option Error") if it receives an empty Confirm 1748 option for such a feature. 1750 Since Confirm options are generated only in response to Change 1751 options, an endpoint should never receive a Confirm option referring 1752 to a feature number it does not understand. 
Endpoints MUST either 1753 reset the connection on receiving such options, or just ignore the 1754 options. 1756 6.6.7. Invalid Options 1758 A DCCP endpoint might receive a Change or Confirm option that lists 1759 one or more values that it does not understand. Some, but not all, 1760 such options are invalid, depending on the relevant reconciliation 1761 rule (Section 6.3). For instance: 1763 o All features have length limitations, and options with invalid 1764 lengths are invalid. For example, the Mobility ID feature takes 1765 128-bit values, so valid "Confirm R(Mobility ID)" options have 1766 option length 19. 1768 o Some non-negotiable features have value limitations. The Ack 1769 Ratio feature takes two-byte, non-zero integer values, so a 1770 "Change L(Ack Ratio, 0)" option is never valid. Note that server- 1771 priority features do not have value limitations, since unknown 1772 values are handled as a matter of course. 1774 o Any Confirm option that selects the wrong value, based on the two 1775 preference lists and the relevant reconciliation rule, is invalid. 1777 An endpoint receiving an invalid Change option MUST respond with the 1778 corresponding empty Confirm option. An endpoint receiving an 1779 invalid Confirm option MUST reset the connection, with Reset Code 5, 1780 "Option Error". 1782 6.6.8. Mandatory Feature Negotiation 1784 Change options may be preceded by Mandatory options (Section 5.9.2). 1785 Mandatory Change options are processed like normal Change options, 1786 except that various failure cases will cause the receiver to reset 1787 the connection with Reset Code 6, "Mandatory Failure", rather than 1788 send a Confirm option.
Specifically, the connection MUST be reset 1789 if: 1791 o The Change option's feature number was not understood; 1792 o The Change option's value was invalid, and the receiver would 1793 normally have sent an empty Confirm option in response; or 1795 o For server-priority features, there was no shared entry in the two 1796 endpoints' preference lists. 1798 There's no reason to mark Confirm options as Mandatory in this 1799 version of DCCP, since Confirm options are sent only in response to 1800 Change options and therefore can't mention potentially-invalid 1801 values or unexpected feature numbers. 1803 6.6.9. Out-of-Band Agreement 1805 An endpoint MUST NOT unilaterally change the value of any DCCP 1806 feature. However, endpoints MAY cooperatively change DCCP feature 1807 values without using in-band feature negotiation options---by using 1808 a separate signalling channel, for example. 1810 6.6.10. State Diagram 1812 This diagram illustrates feature-related state transitions, ignoring 1813 sequence number and option validity issues, for the endpoint that is 1814 the feature location. For a feature remote state transition 1815 diagram, switch the "L"s and "R"s. 1817 rcv Confirm R app/protocol evt : snd Change L 1818 : ignore +--------------------------------------------+ 1819 +----+ | | 1820 | v | rcv Change R v 1821 +------------+ rcv Confirm R : calc new value, +------------+ 1822 | | : accept value snd Confirm L | | 1823 | STABLE |<------------------------------------| CHANGING | 1824 | | rcv empty Confirm R | | 1825 +------------+ : revert to old value +------------+ 1826 | ^ | ^ 1827 +----+ +----+ 1828 rcv Change R timeout/rcv non-ack 1829 : calc new value, snd Confirm L : snd Change L 1831 This state diagram corresponds to the following procedure for 1832 reacting to received packets with feature negotiation options. 
The 1833 procedure refers to "P.seqno", "P.ackno", "P.optiontype", and 1834 "P.optionlen", which are properties of the packet; "F.GSR" and 1835 "F.GSS", which are the variables mentioned in Section 6.6.3; 1836 "F.state", which is the feature's state (STABLE or CHANGING); and 1837 "F.value", which is the feature's value. 1839 If F.state == STABLE: 1840 If P.optiontype == Change R && P.seqno > F.GSR: 1841 Calculate new value 1842 Send Confirm L on next packet 1843 F.GSR := P.seqno 1844 Otherwise: 1845 Ignore option 1847 If F.state == CHANGING: 1848 If P.optiontype == Confirm R && P.ackno >= F.GSS 1849 && P potentially acknowledges F.GSS: 1850 If P.optionlen == 3: 1851 /* empty Confirm R option */ 1852 Retain old value 1853 Otherwise: 1854 Check new value 1855 F.value := new value 1856 F.state := STABLE 1857 Otherwise, if P.optiontype == Change R && P.seqno > F.GSR: 1858 Calculate new value 1859 Send Confirm L on next packet 1860 F.GSR := P.seqno 1861 Otherwise: 1862 Ignore option 1864 7. Sequence Numbers 1866 DCCP uses 24- or 48-bit sequence numbers to arrange packets into 1867 sequence, detect losses and network duplicates, and protect against 1868 attackers, half-open connections, and the delivery of very old 1869 packets. Every packet carries a Sequence Number; most packet types 1870 carry an Acknowledgement Number as well. 1872 DCCP sequence numbers are per-packet. Thus, each endpoint 1873 increments the DCCP Sequence Number field by one (modulo 2^24 or 1874 2^48) with every packet sent. Even DCCP-Ack and DCCP-Sync packets, 1875 and other packets that don't carry user data, increment the Sequence 1876 Number. Since DCCP is an unreliable protocol, there are no true 1877 retransmissions; but effective retransmissions, such as 1878 retransmissions of DCCP-Request packets, also increment the Sequence 1879 Number. 
This lets DCCP implementations detect network duplication, 1880 retransmissions, and acknowledgement loss, and is a significant 1881 departure from TCP practice. 1883 7.1. Variables 1885 DCCP endpoints maintain a set of sequence number variables for each 1886 connection. 1888 ISS The Initial Sequence Number Sent by this endpoint. This 1889 equals the Sequence Number of the first DCCP-Request or 1890 DCCP-Response sent. 1892 ISR The Initial Sequence Number Received from the other 1893 endpoint. This equals the Sequence Number of the first 1894 DCCP-Request or DCCP-Response received. 1896 GSS The Greatest Sequence Number Sent by this endpoint. 1897 ("Greatest" is of course measured in circular sequence 1898 space.) 1900 GSR The Greatest Sequence Number Received from the other 1901 endpoint on an acknowledgeable packet. (Section 7.4 defines 1902 "acknowledgeable" packets.) 1904 GAR The Greatest Acknowledgement Number Received from the other 1905 endpoint on an acknowledgeable packet. 1907 Some other variables are derived from these primitives. 1909 SWL and SWH 1910 (Sequence Number Window Low and High) The extremes of the 1911 validity window for received packets' Sequence Numbers. 1913 AWL and AWH 1914 (Acknowledgement Number Window Low and High) The extremes 1915 of the validity window for received packets' Acknowledgement 1916 Numbers. 1918 7.2. Initial Sequence Numbers 1920 The endpoints' initial sequence numbers are set by the first DCCP- 1921 Request and DCCP-Response packets sent. Initial sequence numbers 1922 MUST be chosen to avoid two problems: 1924 o Delivery of old packets, where packets lingering in the network 1925 from an old connection are delivered to a new connection with the 1926 same addresses and port numbers. 1928 o Sequence number attacks, where an attacker can guess the sequence 1929 numbers that a future connection would use [M85]. 1931 DCCP implementations may use TCP's strategies for avoiding these 1932 problems [RFC 793] [RFC 1948]. 
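One way to picture an RFC 1948-style strategy is the following Python sketch, which adds a keyed hash of the 4-tuple to a clock-driven counter. It is an illustration under stated assumptions, not a prescribed algorithm: the function name, the use of HMAC-SHA-256 (RFC 1948 itself describes an MD5-based hash), the 48-bit sequence space, and the 4-microsecond tick are all choices made here for the example.

```python
import hashlib
import hmac
import time

SEQ_MOD = 1 << 48  # assumes 48-bit sequence numbers


def initial_sequence_number(local_addr, local_port, remote_addr, remote_port,
                            secret, now=None):
    """Return (clock + keyed_hash(4-tuple)) modulo 2^48.

    `secret` is a bytes object known only to this endpoint; `now` is a
    time in seconds and defaults to the current wall-clock time.
    """
    if now is None:
        now = time.time()
    clock = int(now * 1_000_000) // 4  # 4-microsecond ticks, as in TCP
    four_tuple = f"{local_addr}|{local_port}|{remote_addr}|{remote_port}".encode()
    # HMAC-SHA-256 is a substitution chosen for this sketch (assumption).
    digest = hmac.new(secret, four_tuple, hashlib.sha256).digest()
    offset = int.from_bytes(digest[:6], "big")  # keep 48 bits of the hash
    return (clock + offset) % SEQ_MOD
```

Because the hash offset depends on the 4-tuple and the secret, each 4-tuple sees its own sequence number space, while the clock term keeps successive connections on the same 4-tuple moving forward.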
1934 To address the first problem, an implementation MUST ensure that the 1935 initial sequence number for a given 4-tuple doesn't overlap with 1937 recent sequence numbers on connections with the same 4-tuple 1938 ("recent" meaning sent within 2 maximum segment lifetimes). If the 1939 implementation has state for a recent connection with the same 1940 4-tuple, it can simply pick a good initial sequence number; 1941 otherwise, it could tie initial sequence number selection to some 1942 clock, such as the 4-microsecond clock used by TCP [RFC 793]. 1944 To address the second problem, an implementation MUST provide each 1945 4-tuple with an independent initial sequence number space; then an 1946 attacker can't learn anything about anyone else's initial sequence 1947 numbers. RFC 1948 achieves this by adding a cryptographic hash, of 1948 the 4-tuple and a secret, to any initial sequence number. For the 1949 secret, RFC 1948 recommends a combination of some truly-random data 1950 [RFC 1750], an administratively-installed passphrase, the endpoint's 1951 IP address, and the endpoint's boot time, but truly-random data is 1952 sufficient. Care should be taken when changing the secret; such a 1953 change alters all initial sequence number spaces, which might make 1954 an initial sequence number for some 4-tuple equal a recently sent 1955 sequence number for the same 4-tuple. To avoid this problem around 1956 such a change, the endpoint might remember dead connection state for 1957 each 4-tuple or stay quiet for 2 maximum segment lifetimes. 1959 7.3. Quiet Time 1961 DCCP endpoints, like TCP endpoints, must take care before initiating 1962 connections when they boot. In particular, they MUST NOT send 1963 packets whose sequence numbers are close to the sequence numbers of 1964 packets lingering in the network from before the boot. 
The simplest 1965 way to enforce this rule is for DCCP endpoints to avoid sending any 1966 packets until one maximum segment lifetime (2 minutes) after boot. 1967 Other enforcement mechanisms include remembering recent sequence 1968 numbers across boots, or reserving the upper 8 or so bits of initial 1969 sequence numbers for a persistent boot counter that decrements by 1970 two each boot (this would require the use of extended sequence 1971 numbers). 1973 7.4. Acknowledgement Numbers 1975 DCCP has no cumulative acknowledgement field; cumulative 1976 acknowledgements would be meaningless in an unreliable protocol. 1977 Therefore, the Acknowledgement Number field has a different meaning 1978 in DCCP than in TCP. 1980 A packet is classified as "acknowledgeable" if and only if its 1981 options were processed by the receiving DCCP. This means, for 1982 example, that all acknowledgeable packets have valid header 1983 checksums and sequence numbers. The Acknowledgement Number for most 1984 packet types MUST equal GSR, the Greatest Sequence Number Received 1985 on an acknowledgeable packet. 1987 Note that "acknowledgeable" refers to option processing, not data 1988 processing. Even acknowledgeable packets may have their application 1989 data dropped, due to receive buffer overflow or corruption, for 1990 instance. Data Dropped options report these data losses when 1991 necessary, letting congestion control mechanisms distinguish between 1992 network losses and endpoint losses. This issue is discussed further 1993 in Sections 11.4 and 11.7. 1995 DCCP-Sync and DCCP-SyncAck packets are an exception to this rule. 1996 The Acknowledgement Number on a DCCP-Sync packet corresponds to a 1997 received packet, but not necessarily an acknowledgeable packet; in 1998 particular, it might correspond to an out-of-sync packet whose 1999 options were not processed.
The Acknowledgement Number on a DCCP- 2000 SyncAck packet always corresponds to an acknowledgeable DCCP-Sync 2001 packet; if there was reordering, that Acknowledgement Number might 2002 be less than GSR. 2004 7.5. Validity and Synchronization 2006 Any DCCP endpoint might receive packets that are not actually part 2007 of the current connection. For instance, the network might deliver 2008 an old packet, an attacker might attempt to hijack a connection, or 2009 the other endpoint might crash, causing a half-open connection. 2011 DCCP, like TCP, uses sequence number checks to detect these cases. 2012 Packets whose Sequence and/or Acknowledgement Numbers are out of 2013 range are called sequence-invalid, and are not processed normally. 2015 Unlike TCP, DCCP requires a synchronization mechanism to recover 2016 from large bursts of loss. One endpoint might send so many packets 2017 during a burst of loss that when one of its packets finally got 2018 through, the other endpoint would label its Sequence Number as 2019 invalid. A handshake involving DCCP-Sync and DCCP-SyncAck packets 2020 recovers from this case. 2022 7.5.1. Sequence-Validity Rules 2024 Sequence-validity depends on the received packet's type. This table 2025 shows the sequence and acknowledgement number checks applied to each 2026 packet; a packet is sequence-valid if it passes both tests, and 2027 sequence-invalid if it does not. Many of the checks refer to the 2028 sequence and acknowledgement number windows, [SWL, SWH] and [AWL, 2029 AWH], defined below in Section 7.5.3.
2031 Acknowledgement Number 2032 Packet Type Sequence Number Check Check 2033 ----------- --------------------- ---------------------- 2034 DCCP-Request SWL <= seqno <= SWH (*) N/A 2035 DCCP-Response SWL <= seqno <= SWH (*) AWL <= ackno <= AWH 2036 DCCP-Data SWL <= seqno <= SWH N/A 2037 DCCP-Ack SWL <= seqno <= SWH AWL <= ackno <= AWH 2038 DCCP-DataAck SWL <= seqno <= SWH AWL <= ackno <= AWH 2039 DCCP-CloseReq SWL <= seqno <= SWH AWL <= ackno <= AWH 2040 DCCP-Close SWL <= seqno <= SWH AWL <= ackno <= AWH 2041 DCCP-Reset seqno == 0 or seqno > GSR GAR <= ackno <= AWH 2042 DCCP-Move seqno >= SWL ISS <= ackno <= AWH 2043 DCCP-Sync seqno >= SWL AWL <= ackno <= AWH 2044 DCCP-SyncAck seqno >= SWL AWL <= ackno <= AWH 2046 (*) Check not applied if connection is in LISTEN or REQUEST state. 2048 In general, packets are sequence-valid if their Sequence and 2049 Acknowledgement Numbers lie within the corresponding valid windows, 2050 [SWL, SWH] and [AWL, AWH]. The exceptions to this rule are as 2051 follows: 2053 o DCCP-Reset Sequence Numbers may be zero. This is because during 2054 the cleanup of a half-open connection, an endpoint might generate 2055 a DCCP-Reset in response to a DCCP-Request or DCCP-Data packet 2056 with no Acknowledgement Number; the resetting endpoint would then 2057 use zero for the Reset's Sequence Number, since it has no valid 2058 Sequence Number available. 2060 DCCP-Reset Acknowledgement Numbers, and non-zero Sequence Numbers, 2061 are checked more stringently than those on other packet types, 2062 however. This is because DCCP-Reset always ends a connection: no 2063 endpoint will send a non-Reset packet on a connection after it has 2064 sent a Reset. Thus, a Reset packet whose Sequence Number is less 2065 than GSR, or whose Acknowledgement Number is less than GAR, must 2066 be sequence-invalid. 
2068 o DCCP-Move Sequence and Acknowledgement Numbers are not strongly 2069 checked because moves are likely to happen after long loss periods, 2070 and the mandatory Mobility ID provides good protection against 2071 unexpected packets. 2073 o DCCP-Sync and DCCP-SyncAck Sequence Numbers are not strongly 2074 checked. These packet types exist specifically to get the 2075 endpoints back into sync after bursts of loss; checking their 2076 Sequence Numbers would eliminate their usefulness. 2078 These lenient checks all allow continued operation after unusual 2079 events, such as endpoint crashes and large bursts of loss. There's 2080 no need for leniency when the endpoints are actively sending packets 2081 to one another. Therefore, a DCCP endpoint SHOULD implement the 2082 following, tighter constraints for active connections. An endpoint 2083 considers a connection active if it has received valid packets from 2084 the other endpoint within the last several round-trip times, or 2085 1 second, if the RTT is not known. 2087 Acknowledgement Number 2088 Packet Type Sequence Number Check Check 2089 ----------- --------------------- ---------------------- 2090 DCCP-Reset GSR < seqno <= SWH GAR <= ackno <= AWH 2091 DCCP-Move SWL <= seqno <= SWH AWL <= ackno <= AWH 2092 DCCP-Sync SWL <= seqno <= SWH AWL <= ackno <= AWH 2093 DCCP-SyncAck SWL <= seqno <= SWH AWL <= ackno <= AWH 2095 Note that sequence-validity is only one of the validity checks 2096 applied to received packets. 2098 7.5.2. Handling Sequence-Invalid Packets 2100 Sequence-invalid DCCP-Move, DCCP-Reset, DCCP-Sync, and DCCP-SyncAck 2101 packets MUST be ignored. 2103 When DCCP A receives any other sequence-invalid packet, it MUST 2104 reply with a DCCP-Sync packet. This packet MUST acknowledge the 2105 packet's Sequence Number (not GSR!). The DCCP-Sync MUST use a new 2106 Sequence Number, and thus will increase GSS; GSR will not change, 2107 however, since the received packet was sequence-invalid.
DCCP A 2108 MUST NOT otherwise process sequence-invalid packets. For instance, 2109 it MUST NOT process their options. 2111 When the DCCP B endpoint receives the (sequence-valid) DCCP-Sync, it 2112 MUST update its GSR variable and reply with a DCCP-SyncAck packet 2113 acknowledging the DCCP-Sync (not necessarily GSR!). Upon receiving 2114 this DCCP-SyncAck, which will be sequence-valid since it 2115 acknowledges the DCCP-Sync, DCCP A will update its GSR variable, and 2116 the endpoints will be back in sync. Alternatively, if the 2117 connection was half-open (DCCP B is in CLOSED or REQUEST state), 2118 DCCP B will send a Reset. 2120 A DCCP endpoint MAY temporarily preserve sequence-invalid packets in 2121 case they become valid later. This can reduce the impact of bursts 2122 of loss by delivering more packets to the application. In 2123 particular, an endpoint MAY preserve a sequence-invalid packet for 2124 up to 2 round-trip times (or 1 second, if the RTT is unknown); if, 2125 within that time, the relevant sequence windows change so that the 2126 packet becomes sequence-valid, the endpoint MAY process the packet 2127 again. 2129 To protect itself against denial-of-service attacks (where an 2130 attacker sends many sequence-invalid packets, trying to force the 2131 receiver to send many DCCP-Syncs), a DCCP implementation MAY rate- 2132 limit the DCCP-Syncs sent in response to sequence-invalid packets. 2134 7.5.3. Sequence and Acknowledgement Number Windows 2136 Each DCCP endpoint defines sequence validity windows that are 2137 subsets of the Sequence and Acknowledgement Number spaces. These 2138 windows correspond to packets the endpoint expects to receive in the 2139 next few round-trip times. The Sequence and Acknowledgement Number 2140 windows always contain GSR and GSS, respectively; the window widths 2141 are controlled by Sequence Window features. 2143 The Sequence Number validity window for packets from DCCP B is [SWL, 2144 SWH]. 
This window always contains GSR, the Greatest Sequence Number 2145 Received on a sequence-valid packet from DCCP B. It is W packets 2146 wide, where W is the value of the Sequence Window/B feature. One- 2147 fourth of the sequence window, rounded down, is placed at and before 2148 GSR, with three-fourths after GSR. (This asymmetric placement 2149 assumes that bursts of loss are more common in the network than 2150 significant reordering.) 2152 invalid | valid Sequence Numbers | invalid 2153 <---------*|*===========*=======================*|*---------> 2154 GSR -|GSR + 1 - GSR GSR +|GSR + 1 + 2155 floor(W/4)|floor(W/4) ceil(3W/4)|ceil(3W/4) 2156 = SWL = SWH 2158 The Acknowledgement Number validity window for packets from DCCP B 2159 is [AWL, AWH]. The high end of the window, AWH, always equals GSS, 2160 the Greatest Sequence Number Sent by DCCP A; the window is W' 2161 packets wide, where W' is the value of the Sequence Window/A 2162 feature. 2164 invalid | valid Acknowledgement Numbers | invalid 2165 <---------*|*===================================*|*---------> 2166 GSS - W'|GSS + 1 - W' GSS|GSS + 1 2167 = AWL = AWH 2169 SWL and AWL are initially adjusted so that they don't go below the 2170 initial Sequence Numbers received and sent, respectively: 2171 SWL := max(GSR + 1 - floor(W/4), ISR), 2172 AWL := max(GSS - W' + 1, ISS). 2173 Of course, these adjustments MUST NOT be applied after the relevant 2174 sequence numbers wrap. 2176 7.5.4. Sequence Window Feature 2178 The Sequence Window/A feature determines the width of the Sequence 2179 Number validity window used by DCCP B, and the width of the 2180 Acknowledgement Number validity window used by DCCP A. DCCP A sends 2181 a "Change L(Sequence Window, W)" option to notify DCCP B that the 2182 Sequence Window/A value is W. 2184 Sequence Window has feature number 3, and is non-negotiable. It 2185 takes 3- or 6-byte integer values, like DCCP sequence numbers. 
2186 Change and Confirm options for Sequence Window are therefore either 2187 6 or 9 bytes long. New connections start with Sequence Window 100 2188 for both endpoints. 2190 A proper Sequence Window/A value should reflect how many packets 2191 DCCP A expects to be in flight. Only DCCP A can anticipate this 2192 number. Too-small values increase the risk of the endpoints getting 2193 out of sync after bursts of loss; too-large values increase the risk of 2194 connection hijacking. (The next section quantifies this risk.) One 2195 good guideline is for each endpoint to set Sequence Window to a 2196 small multiple of the maximum number of packets it expects to send 2197 in a round-trip time. This value may not be available at connection 2198 initiation, when the round-trip time is unknown, but the endpoint 2199 can always send updates as the connection progresses. 2201 7.5.5. Sequence Number Attacks 2203 Sequence and Acknowledgement Numbers form DCCP's main line of 2204 defense against attackers. An attacker that cannot guess sequence 2205 numbers cannot easily manipulate or hijack a DCCP connection, and 2206 requirements like careful initial sequence number choice eliminate 2207 the most serious attacks. 2209 An attacker might still send many packets with randomly chosen 2210 Sequence and Acknowledgement Numbers, however. If one of those 2211 probes ends up sequence-valid, it may shut down the connection or 2212 otherwise cause problems. The easiest such attacks to execute are: 2214 o Send DCCP-Sync packets with random Sequence and Acknowledgement 2215 Numbers. If one of these packets hits the valid acknowledgement 2216 number window, the receiver will shift its sequence number window 2217 accordingly, getting out of sync with the correct 2218 endpoint---perhaps permanently. 2220 o Send DCCP-Reset packets with Sequence Number zero and random 2221 Acknowledgement Numbers.
If one of these packets hits the valid 2222 acknowledgement number window, the connection will be shut down. 2224 o Send DCCP-Data packets with random Sequence Numbers. If one of 2225 these packets hits the valid sequence number window, the attack 2226 packet's application data may be inserted into the data stream. 2228 The attacker has to guess both Source and Destination Ports for any 2229 of these attacks to succeed. Additionally, the connection would 2230 have to be inactive for the DCCP-Sync and DCCP-Reset packets to 2231 succeed, assuming the victim implemented the more stringent checks 2232 for active connections recommended in Section 7.5.1. 2234 To quantify the probability of success, let N be the number of 2235 attack packets the attacker is willing to send, W be the relevant 2236 sequence window width, and L be the length of sequence numbers (24 2237 or 48). The attacker's best strategy is to space the attack packets 2238 evenly over sequence space. Then one of these attacks will succeed 2239 with probability P = WN/2^L. For N = 1000, W = 100, and L = 24, 2240 this probability is about 0.006. (For reference, the easiest TCP 2241 attack---sending a SYN with a random sequence number, which will 2242 cause a connection reset if it falls within the window---will 2243 succeed with probability 0.002 for N = 1000, W = 8760 [a common 2244 default], and L = 32.) Connections with sequence windows much 2245 larger than 100 SHOULD use extended sequence numbers to reduce the 2246 probability of attack success. 2248 7.5.6. Examples 2250 In the following example, DCCP A and DCCP B recover from a large 2251 burst of loss that runs DCCP A's sequence numbers out of DCCP B's 2252 appropriate sequence number window. 2254 Recovery from Burst of Loss 2255 DCCP A DCCP B 2256 (GSS=1,GSR=10) (GSS=10,GSR=1) 2257 --> DCCP-Data(seq 2) XXX 2258 ... 2259 --> DCCP-Data(seq 100) XXX 2260 --> DCCP-Data(seq 101) --> ??? 
2261 seqno out of range; 2262 send Sync 2263 OK <-- DCCP-Sync(seq 11, ack 101) <-- 2264 (GSS=11,GSR=1) 2265 --> DCCP-SyncAck(seq 102, ack 11) --> OK 2266 (GSS=102,GSR=11) (GSS=11,GSR=102) 2268 In the next example, a DCCP connection recovers from a simple 2269 attack. The attacker cannot guess sequence numbers. (DCCP is not 2270 robust to attackers who can guess sequence numbers.) 2272 Recovery from Attack 2273 DCCP A DCCP B 2274 (GSS=1,GSR=10) (GSS=10,GSR=1) 2275 *ATTACKER* --> DCCP-Data(seq 10^6) --> ??? 2276 seqno out of range; 2277 send Sync 2278 ??? <-- DCCP-Sync(seq 11, ack 10^6) <-- 2279 ackno out of range; ignore 2280 (GSS=1,GSR=10) (GSS=11,GSR=1) 2282 The final example demonstrates recovery from a half-open connection. 2284 Recovery from a Half-Open Connection 2285 DCCP A DCCP B 2286 (GSS=1,GSR=10) (GSS=10,GSR=1) 2287 (Crash) 2288 CLOSED OPEN 2289 REQUEST --> DCCP-Request(seq 400) --> ??? 2290 !! <-- DCCP-Sync(seq 11, ack 400) <-- OPEN 2291 REQUEST --> DCCP-Reset(seq 401, ack 11) --> (Abort) 2292 REQUEST CLOSED 2293 REQUEST --> DCCP-Request(seq 402) --> ... 2295 7.6. Extended Sequence Numbers 2297 Extended 48-bit sequence numbers increase the rate DCCP connections 2298 can achieve without wrapping sequence numbers, and provide 2299 additional protection against the sequence number attacks described 2300 above. Very-high-rate DCCP connections, and connections with large 2301 sequence windows, SHOULD therefore use extended sequence numbers 2302 rather than the default 24-bit sequence numbers. 2304 7.6.1. When to Use Extended Sequence Numbers 2306 The sequence-validity mechanism protects against the network 2307 delivering old data, but it assumes that the network does not 2308 deliver extremely old data. In particular, it assumes that the 2309 network must have dropped any packet by the time the connection 2310 wraps around and uses its sequence number again. 
We can easily 2311 calculate the maximum connection rate that can be safely achieved 2312 given this constraint. Let MSL equal the maximum segment lifetime, 2313 P equal the average DCCP packet size in bits, and L equal the length 2314 of sequence numbers (24 or 48 bits). Then the maximum safe rate, in 2315 bits per second, is R = P*(2^L)/2MSL. 2317 For the default MSL of 2 minutes, 1500-byte DCCP packets, and 24-bit 2318 sequence numbers, the safe rate is therefore approximately 800 Mb/s. 2319 Of course, 2 minutes is a very large MSL for any networks that could 2320 sustain that rate with such small packets. Nevertheless, 48-bit 2321 sequence numbers allow much higher rates, up to 14 petabits a second 2322 for 1500-byte packets and the default MSL. 2324 The probability of sequence number attack success P = WN/2^L, 2325 discussed in Section 7.5.5, may also be relevant when deciding 2326 whether to use extended sequence numbers. A fast connection will 2327 generally have a relatively high W (sequence window size), 2328 increasing the attack success probability for fixed N (number of 2329 attack packets); if the probability gets uncomfortably high with L = 2330 24, the connection should use 48-bit sequence numbers instead. 2332 7.6.2. Header Processing 2334 Extended sequence numbers are activated when the header's X bit is 2335 set to one (see Section 5.1). This extends the Sequence Number and 2336 Acknowledgement Number fields by an additional 24 bits, for a total 2337 of 48 bits. The 48-bit numbers are stored in network order, with 2338 most significant bit first. All packet types except for DCCP-Data 2339 and DCCP-Request will follow this generic header with an extended 2340 48-bit Acknowledgement Number. 2342 Once an endpoint has transitioned to 48-bit sequence numbers (X=1), 2343 it MUST send all succeeding packets with 48-bit sequence numbers. 
2344 Furthermore, once an endpoint has received a sequence-valid packet 2345 with 48-bit sequence numbers, it MUST either send all succeeding 2346 packets with 48-bit sequence numbers, or reset the connection with 2347 Reset Code 7, "Extended Sequence Numbers". (But note that an 2348 endpoint may send extended DCCP-Sync packets before transitioning to 2349 extended sequence numbers.) 2351 Clients SHOULD decide whether to use extended sequence numbers 2352 before sending their DCCP-Requests. However, the Transition bit (T) 2353 and Sequence Transition Capable feature support transitioning to 2354 extended sequence numbers during an active connection, in case this 2355 proves necessary; see below. A client that sends an extended DCCP- 2356 Request might receive a DCCP-Reset in response with Reset Code 7, 2357 "Extended Sequence Numbers"; the client SHOULD respond by sending 2358 another Request using 24-bit sequence numbers. 2360 Extended sequence numbers are treated simply as longer sequence 2361 numbers. For instance, the sequence-validity mechanisms work the 2362 same way whether or not sequence numbers are extended. Care is 2363 required when comparing a 24-bit sequence number with a 48-bit 2364 sequence number, however; see the next section. 2366 7.6.3. Transitioning to Extended Sequence Numbers 2368 The Transition bit (T) following the extended Sequence Number field 2369 makes it possible to transition to 48-bit sequence numbers in the 2370 middle of a connection. T is set to one only during such a 2371 transition. When DCCP A switches to 48-bit sequence numbers, it 2372 MUST set the T bit to one on all of its packets for some period. 2373 This period SHOULD last on the order of a few round-trip times, or 2374 until DCCP A receives an acknowledgement from DCCP B proving that 2375 one of its 48-bit-sequence-number packets has been received, 2376 whichever comes later.
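The "whichever comes later" rule for the Transition bit can be read as: keep T set until both conditions hold. The following is a minimal sketch of that reading, not part of the specification. The function name, its parameters, and the choice of three round-trip times for "a few" are assumptions made for illustration.

```python
def keep_transition_bit(elapsed_s, rtt_s, peer_acked_extended, rtt_multiple=3):
    """Return True while DCCP A should still send packets with T=1.

    elapsed_s           -- seconds since DCCP A switched to 48-bit numbers
    rtt_s               -- current round-trip time estimate, in seconds
    peer_acked_extended -- True once DCCP B has acknowledged one of
                           A's 48-bit-sequence-number packets
    rtt_multiple        -- hypothetical value for "a few" round-trip times
    """
    waited_long_enough = elapsed_s >= rtt_multiple * rtt_s
    # "Whichever comes later": the period ends only when BOTH the time
    # condition and the acknowledgement condition are satisfied.
    return not (waited_long_enough and peer_acked_extended)
```

Under these assumptions, an endpoint that has already been acknowledged but has waited only one round-trip time would still set T to one.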
2378 Each DCCP MUST choose its first 48-bit sequence number to have its 2379 lower 24 bits equal the 24-bit sequence number it expected to send 2380 (GSS+1). The upper 24 bits may be chosen arbitrarily. This applies 2381 to Acknowledgement Numbers as well as Sequence Numbers; if DCCP A 2382 sends an extended packet containing an Acknowledgement Number before 2383 DCCP B sends it a 48-bit Sequence Number, DCCP A can choose any 2384 value for the upper 24 bits of the Acknowledgement Number, but the 2385 lower 24 bits MUST equal the expected 24-bit Acknowledgement Number 2386 (GSR). Furthermore, DCCP A MUST leave GSR as a 24-bit number until 2387 receiving an extended packet from DCCP B. 2389 Switching to 48-bit sequence numbers in the middle of a connection 2390 complicates sequence number comparison. Endpoints must compare 2391 48-bit sequence numbers with 24-bit sequence numbers, and compare 2392 48-bit sequence numbers that might have different, arbitrary values 2393 in the upper 24 bits, while remaining robust to reordering and to 2394 old or malicious packets. The following procedure describes how 2395 sequence numbers should be compared during and immediately after a 2396 transition. 2398 Let P be the packet sequence number received from DCCP B, and E be 2399 the sequence number DCCP A expects. During sequence-validity 2400 computations, for example, P might be the packet's Acknowledgement 2401 Number and E might be AWL, the left edge of the appropriate 2402 acknowledgement number window. Then DCCP A should perform the 2403 comparison as follows. 2405 o If P and E are both 24 bits, compare them modulo 2^24. 2407 o If P and E are both 48 bits, you generally compare them modulo 2408 2^48, except that during a transition, the two values might have 2409 arbitrary values in the upper 24 bits. 2411 - If the packet's Transition bit is set, and the last packet sent 2412 by DCCP A had its Transition bit set, then compare P and E 2413 modulo 2^24. 
2415 - Otherwise, compare them modulo 2^48. 2417 o If P is 48 bits but E is 24, the remote DCCP may want to 2418 transition to extended sequence numbers. 2420 - If the packet's Transition bit is set, compare P with E modulo 2421 2^24. If the packet proves sequence-valid, then it is OK; 2422 transition to extended sequence numbers, and set E according to 2423 the full 48 bits of P. 2425 - Otherwise, the packet is sequence-invalid. 2427 Either way, if the packet proves to be sequence-invalid, send an 2428 extended DCCP-Sync if required (with T set to one), but do not yet 2429 transition to extended sequence numbers. 2431 o If P is 24 bits but E is 48, there may have been benign packet 2432 reordering. The correct action depends on whether the last 2433 sequence-valid packet received from DCCP B had the Transition bit 2434 set. 2436 - If Transition was set, extend P to a 48-bit value P'. First, 2437 let EH equal the upper 24 bits of E, and EL equal the lower 24 2438 bits of E. Then: 2440 If EL > P, set P' = (EH << 24) | P. 2441 Otherwise, set P' = (((EH - 1) mod 2^24) << 24) | P. 2443 The "EL > P" test uses arithmetic comparison, NOT circular 2444 comparison. Compare P' with E modulo 2^48. 2446 - Otherwise, the packet is sequence-invalid. 2448 Either way, if the packet proves to be sequence-invalid, send an 2449 extended DCCP-Sync if required, with T set to one. 2451 DCCP implementations can, of course, avoid most of this complexity 2452 by disallowing transitions to extended sequence numbers (and by 2453 resetting the connection when the other endpoint attempts such a 2454 transition). Connections that use 48-bit sequence numbers 2455 throughout, starting with the DCCP-Request, MUST have T set to zero 2456 on all their packets. 2458 7.6.4. Sequence Transition Capable Feature 2460 The Sequence Transition Capable feature expresses whether DCCP 2461 endpoints are capable of transitioning to extended sequence numbers 2462 in the course of an active connection. 
DCCP A sends a 2463 "Change R(Sequence Transition Capable, 1)" option to DCCP B to 2464 discover whether B can transition to extended sequence numbers. 2466 Sequence Transition Capable has feature number 4, and is server- 2467 priority. It takes one-byte Boolean values. DCCP B MUST allow 2468 transitions to extended sequence numbers when Sequence Transition 2469 Capable/B is one. It MUST NOT reset the connection with Reset Code 2470 7, "Extended Sequence Numbers", under those circumstances. However, 2471 DCCP B MAY allow such transitions even when Sequence Transition 2472 Capable/B is zero. Values of two or more are reserved. New 2473 connections start with Sequence Transition Capable 0 (that is, not 2474 capable) for both endpoints. 2476 7.7. NDP Count and Detecting Application Loss 2478 DCCP's sequence numbers increment by one on every packet, including 2479 non-data packets (packets that don't carry application data). This 2480 makes DCCP sequence numbers suitable for detecting any network loss, 2481 but not for detecting the loss of application data. The NDP Count 2482 option reports the length of each burst of non-data packets. This 2483 lets the receiving DCCP determine, for every burst of loss, whether 2484 or not application data was lost. 2486 +--------+--------+-------- ... --------+ 2487 |00100101| Length | NDP Count | 2488 +--------+--------+-------- ... --------+ 2489 Type=37 Len=3-5 2491 If a DCCP endpoint's Send NDP Count feature is one (see below), then 2492 that endpoint MUST send an NDP Count option on every packet whose 2493 immediate predecessor was a non-data packet. Non-data packets 2494 consist of DCCP packet types DCCP-Ack, DCCP-Close, DCCP-CloseReq, 2495 DCCP-Reset, DCCP-Move, DCCP-Sync, and DCCP-SyncAck. All other 2496 packet types are considered data packets, although not all DCCP- 2497 Request and DCCP-Response packets will actually carry application 2498 data. 
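As an illustration of the sending rule, the sketch below computes which packets in a stream would carry an NDP Count option and with what value, taking the count to be the length of the run of non-data packets immediately before the packet. The function name and the use of bare type names such as "Ack" and "Data" are assumptions for this example only.

```python
# Packet types the text lists as non-data packets.
NON_DATA_TYPES = {"Ack", "Close", "CloseReq", "Reset", "Move", "Sync", "SyncAck"}


def ndp_counts(packet_types):
    """Return one entry per packet: the NDP Count value to send, or None
    when no NDP Count option is needed (the predecessor was a data packet
    or there was no predecessor)."""
    counts = []
    run = 0  # length of the current run of consecutive non-data packets
    for ptype in packet_types:
        counts.append(run if run > 0 else None)
        run = run + 1 if ptype in NON_DATA_TYPES else 0
    return counts
```

For a stream of two DCCP-Acks followed by a DCCP-Data packet, the sketch yields no option on the first Ack, NDP Count 1 on the second Ack, and NDP Count 2 on the Data packet.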
2500 The value stored in NDP Count equals the number of consecutive non- 2501 data packets in the run immediately previous to the current packet. 2502 Packets with no NDP Count option are considered to have NDP Count 2503 zero. 2505 The NDP Count option can carry one to three bytes of data. The 2506 smallest option format that can hold the NDP Count SHOULD be used. 2508 7.7.1. Usage Notes 2510 Say that K consecutive sequence numbers are missing in some burst of 2511 loss, and the Send NDP Count feature is on. Then some application 2512 data was lost within those sequence numbers unless the packet 2513 following the hole contains an NDP Count option whose value is 2514 greater than or equal to K. 2516 For example, say that the following sequence of non-data packets 2517 (Nx) and data packets (Dx) were sent. 2519 N0 N1 D2 N3 D4 D5 N6 D7 D8 D9 D10 N11 N12 D13 2521 Those packets would have NDP Counts as follows. 2523 N0 N1 D2 N3 D4 D5 N6 D7 D8 D9 D10 N11 N12 D13 2524 - 1 2 - 1 - - 1 - - - - 1 2 2526 NDP Count is not useful for applications that include their own 2527 sequence numbers with their packet headers. 2529 7.7.2. Send NDP Count Feature 2531 The Send NDP Count feature lets DCCPs negotiate whether they should 2532 send NDP Count options on their packets. DCCP A sends a 2533 "Change R(Send NDP Count, 1)" option to ask DCCP B to send NDP Count 2534 options. 2536 Send NDP Count has feature number 9, and is server-priority. It 2537 takes one-byte Boolean values. DCCP B MUST send NDP Count options 2538 on its non-data packets (and some of its data packets) when Send NDP 2539 Count/B is one, although it MAY send NDP Count options even when 2540 Send NDP Count/B is zero. Values of two or more are reserved. New 2541 connections start with Send NDP Count 0 for both endpoints. 2543 8. Event Processing 2545 This section describes how DCCP connections move between states, and 2546 which packets are sent when. 
Note that feature negotiation takes 2547 place in parallel with the connection-wide state transitions 2548 described here. 2550 8.1. Connection Establishment 2552 DCCP connections' initiation phase consists of a three-way 2553 handshake: an initial DCCP-Request packet sent by the client, a 2554 DCCP-Response sent by the server in reply, and finally an 2555 acknowledgement from the client, usually via a DCCP-Ack or DCCP- 2556 DataAck packet. The client moves from the REQUEST state to 2557 PARTOPEN, and finally to OPEN; the server moves from LISTEN to 2558 RESPOND, and finally to OPEN. 2560 Client State Server State 2561 CLOSED LISTEN 2562 1. REQUEST --> Request --> 2563 2. <-- Response <-- RESPOND 2564 3. PARTOPEN --> Ack, DataAck --> 2565 4. <-- Data, Ack, DataAck <-- OPEN 2566 5. OPEN <-> Data, Ack, DataAck <-> OPEN 2568 8.1.1. Client Request 2570 When a client decides to initiate a connection, it enters the 2571 REQUEST state, chooses an initial sequence number (Section 7.2), and 2572 sends a DCCP-Request packet using that sequence number to the 2573 intended server. 2575 DCCP-Request packets will commonly carry feature negotiation options 2576 that open negotiations for various connection parameters, such as 2577 preferred congestion control IDs for each half-connection. They may 2578 also carry application data, but the client should be aware that the 2579 server may not accept such data. 2581 A client in the REQUEST state SHOULD send new DCCP-Request packets 2582 after some timeout if no response is received. The retransmission 2583 strategy SHOULD be similar to that for retransmitting TCP SYNs; for 2584 instance, a first timeout on the order of a second, with an 2585 exponential backoff timer. Each new DCCP-Request MUST increment the 2586 Sequence Number by one, and MUST contain the same Service Code and 2587 application data as the original DCCP-Request. 2589 A client MAY give up after some number of DCCP-Requests. 
If so, it 2590 SHOULD send a DCCP-Reset packet to the server with Reset Code 2, 2591 "Aborted", to clean up state in case one or more of the Requests 2592 actually arrived. 2594 The client leaves the REQUEST state for PARTOPEN when it receives a 2595 DCCP-Response from the server. 2597 8.1.2. Service Codes 2599 Each DCCP-Request contains a 32-bit Service Code, which identifies 2600 the service to which the client application is trying to connect. 2601 Service Codes should correspond to application services and 2602 protocols. For example, there might be a Service Code for HTTP 2603 connections, one for FTP control connections, and one for FTP data 2604 connections. Middleboxes, such as firewalls, can use the Service 2605 Code to identify the application running on a nonstandard port 2606 (assuming the DCCP header has not been encrypted). 2608 Endpoints MUST associate a Service Code with every DCCP socket, both 2609 actively and passively opened. The application will generally 2610 supply this Service Code. Each active socket MUST have exactly one 2611 Service Code, while passive sockets MAY have more than one; this 2612 might let multiple applications listen on the same port, 2613 differentiated by Service Code. If the DCCP-Request's Service Code 2614 doesn't match any of the server's Service Codes for the given port, 2615 the server MUST reject the request by sending a DCCP-Reset packet 2616 with Reset Code 9, "Bad Service Code". A middlebox MAY also send 2617 such a DCCP-Reset in response to packets whose Service Code is 2618 considered unsuitable. 2620 Service Codes should be allocated by IANA. We intend for Service 2621 Codes to be allocated to anyone who asks, first-come, 2622 first-served, subject to the following guidelines. 2624 o Service Codes should be allocated one at a time, or in small 2625 blocks.
A short English description of the intended service is 2626 required to obtain a Service Code assignment, but no 2627 specification, standards-track or otherwise, is necessary. IANA 2628 should maintain an association of Service Codes to the 2629 corresponding phrases. 2631 o Users may request specific Service Code values, which should be 2632 assigned first-come first-serve. We suggest that users request 2633 Service Codes that can be interpreted as meaningful four-byte 2634 ASCII strings. Thus, the "Frobodyne Plotz Protocol" might 2635 correspond to "fdpz", or the number 1717858426. The canonical 2636 interpretation of a Service Code field is numeric. 2638 o Service Codes whose bytes each have values in the set {32, 45-57, 2639 65-90} should be reserved for international standard or standards- 2640 track specifications, IETF or otherwise. (This set consists of 2641 the ASCII digits, uppercase letters, and characters space, '-', 2642 '.', and '/'.) 2644 o Service Codes whose high-order byte equals 63 (ASCII '?') should 2645 never be allocated. These Service Codes are reserved for private 2646 use. 2648 o Service Code 0 should never be allocated. It represents the 2649 absence of a meaningful Service Code. 2651 This design for Service Code allocation is based on the allocation 2652 of 4-byte identifiers for Macintosh resources, PNG chunks, and 2653 TrueType and OpenType tables. 2655 8.1.3. Server Response 2657 In the second phase of the three-way handshake, the server moves 2658 from the LISTEN state to RESPOND, and sends a DCCP-Response message 2659 to the client. In this phase, a server will often specify the 2660 features it would like to use, either from among those the client 2661 requested, or in addition to those. Among these options is the 2662 congestion control mechanism the server expects to use. 2664 The receiver MAY respond to a DCCP-Request packet with a DCCP-Reset 2665 packet to refuse the connection. 
Relevant Reset Codes for refusing 2666 a connection include 8, "Connection Refused", when the DCCP- 2667 Request's Destination Port did not correspond to a DCCP port open 2668 for listening; 9, "Bad Service Code", when the DCCP-Request's 2669 Service Code did not correspond to the service code registered with 2670 the Destination Port; and 10, "Too Busy", when the server is 2671 currently too busy to respond to requests. The server SHOULD limit 2672 the rate at which it generates these resets. 2674 The receiver SHOULD NOT retransmit DCCP-Response packets; the sender 2675 will retransmit the DCCP-Request if necessary. (Note that the 2676 "retransmitted" DCCP-Request will have, at least, a different 2677 sequence number from the "original" DCCP-Request; the receiver can 2678 thus distinguish true retransmissions from network duplicates.) The 2679 responder will detect that the retransmitted DCCP-Request applies to 2680 an existing connection because of its Source and Destination Ports. 2681 Every valid DCCP-Request received while the server is in the RESPOND 2682 state MUST elicit a new DCCP-Response. Each new DCCP-Response MUST 2683 increment the responder's Sequence Number by one, and MUST include 2684 the same application data, if any, as the original DCCP-Response. 2686 The responder MUST accept at most one piece of DCCP-Request data per 2687 connection. In particular, the DCCP-Response sent in reply to a 2688 retransmitted DCCP-Request with data SHOULD contain a Data Dropped 2689 option, in which the retransmitted DCCP-Request is reported as "data 2690 dropped due to protocol constraints" (Drop Code 0). The original 2691 DCCP-Request SHOULD also be reported in the Data Dropped option, 2692 either in a Normal Block (if the responder accepted the data, or 2693 there was no data), or in a Drop Code 0 Drop Block (if the responder 2694 refused the data the first time as well). 
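The RESPOND-state rules above (a new DCCP-Response for every valid DCCP-Request, a Sequence Number that increments by one, unchanged application data, and at most one accepted piece of Request data) can be sketched as follows. This is a simplified illustration, not a definitive implementation: the Responder class, its field names, and the dictionary standing in for a packet are hypothetical, 24-bit sequence numbers are assumed, and the sketch assumes the responder accepts the data on the first Request that carries any.

```python
from dataclasses import dataclass

SEQ_MOD = 2 ** 24  # assumption: short (24-bit) sequence numbers


@dataclass
class Responder:
    """Hypothetical per-connection server state while in RESPOND."""
    next_seq: int                     # Sequence Number for the next Response
    response_data: bytes = b""        # application data sent on every Response
    request_data_taken: bool = False  # True once Request data was accepted

    def on_request(self, request_data: bytes = b"") -> dict:
        """Handle one valid DCCP-Request and build the DCCP-Response."""
        has_data = len(request_data) > 0
        # Data on a retransmitted Request is reported as dropped
        # (Drop Code 0, "data dropped due to protocol constraints").
        dropped = has_data and self.request_data_taken
        if has_data and not self.request_data_taken:
            self.request_data_taken = True
        response = {
            "type": "Response",
            "seq": self.next_seq,
            "data": self.response_data,  # same data as the original Response
            "drop_code_0": dropped,
        }
        # Each new Response uses the next Sequence Number.
        self.next_seq = (self.next_seq + 1) % SEQ_MOD
        return response
```

A real implementation would also carry the Data Dropped blocks for the original Request, which this sketch reduces to a single flag.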
2696 The Data Dropped and Init Cookie options are particularly useful for 2697 DCCP-Response packets (Sections 11.7 and 8.1.4). 2699 The server leaves the RESPOND state for OPEN when it receives a 2700 valid DCCP-Ack from the client, completing the three-way handshake. 2702 8.1.4. Init Cookie Option 2704 +--------+--------+--------+--------+--------+-------- 2705 |00100100| Length | Init Cookie Value ... 2706 +--------+--------+--------+--------+--------+-------- 2707 Type=36 2709 The Init Cookie option lets a DCCP server avoid having to hold any 2710 state until the three-way connection setup handshake has completed. 2711 The server wraps up the service code, server port, and any options 2712 it cares about from both the DCCP-Request and DCCP-Response in an 2713 opaque cookie. Typically the cookie will be encrypted using a 2714 secret known only to the server and include a cryptographic checksum 2715 or magic value so that correct decryption can be verified. When the 2716 server receives the cookie back in the response, it can decrypt the 2717 cookie and instantiate all the state it avoided keeping. In the 2718 meantime, it need not move from the LISTEN state. 2720 This option is permitted in DCCP-Response, DCCP-Data, DCCP-Ack, 2721 DCCP-DataAck, DCCP-Sync, and DCCP-SyncAck packets. The server MAY 2722 include an Init Cookie option in its DCCP-Response. If so, then the 2723 client MUST echo the same Init Cookie option in each succeeding DCCP 2724 packet until one of those packets is acknowledged, meaning the 2725 three-way handshake has completed, or the connection is reset. The 2726 server SHOULD design its Init Cookie format so that Init Cookies can 2727 be checked for tampering; it SHOULD respond to a tampered Init 2728 Cookie option by resetting the connection with Reset Code 11, "Bad 2729 Init Cookie". 
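Since the cookie format is left to the server, the following is only one possible sketch of a tamper-evident Init Cookie. It appends a truncated HMAC-SHA256 tag to the serialized state; it authenticates the cookie but does not encrypt it, so a server that wants the contents hidden from the client would need to add encryption. The function names, the 16-byte tag length, and the state encoding are assumptions for illustration. The 253-byte bound is the Init Cookie length limit given in this section.

```python
import hashlib
import hmac
import os

SECRET = os.urandom(32)   # server-only secret (hypothetical key handling)
TAG_LEN = 16              # assumed truncated tag length, in bytes
MAX_COOKIE_LEN = 253      # Init Cookie length limit


def make_cookie(state: bytes, secret: bytes = SECRET) -> bytes:
    """Wrap connection state in a cookie that can be checked for tampering."""
    tag = hmac.new(secret, state, hashlib.sha256).digest()[:TAG_LEN]
    cookie = state + tag
    if len(cookie) > MAX_COOKIE_LEN:
        raise ValueError("state too large for an Init Cookie")
    return cookie


def open_cookie(cookie: bytes, secret: bytes = SECRET):
    """Return the original state, or None if the cookie was tampered with."""
    if len(cookie) < TAG_LEN:
        return None
    state, tag = cookie[:-TAG_LEN], cookie[-TAG_LEN:]
    expected = hmac.new(secret, state, hashlib.sha256).digest()[:TAG_LEN]
    # Constant-time comparison avoids leaking how many tag bytes matched.
    return state if hmac.compare_digest(tag, expected) else None
```

A server using this approach would respond to a None result from open_cookie by resetting the connection with Reset Code 11, "Bad Init Cookie".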
2731 The precise implementation of the Init Cookie does not need to be 2732 specified here; since Init Cookies are opaque to the client, there 2733 are no interoperability concerns. 2735 Init Cookies are limited to at most 253 bytes in length. 2737 8.1.5. Handshake Completion 2739 When the client receives a DCCP-Response from the server, it moves 2740 from the REQUEST state to PARTOPEN, and completes the three-way 2741 handshake by sending a DCCP-Ack packet to the server. The PARTOPEN 2742 state indicates that the client isn't sure whether the server has 2743 received any of its DCCP-Acks. The client MUST NOT send DCCP-Data 2744 packets while it remains in PARTOPEN. This is because DCCP-Data 2745 packets lack Acknowledgement Numbers, so the server can't tell from 2746 a DCCP-Data packet whether the client saw its DCCP-Response. 2747 Furthermore, if the DCCP-Response included an Init Cookie, that Init 2748 Cookie MUST be included on every packet sent in PARTOPEN. 2750 The single DCCP-Ack sent when entering the PARTOPEN state might, of 2751 course, be dropped by the network. The client SHOULD ensure that 2752 some packet gets through eventually. The preferred mechanism would 2753 be a delayed-ack-like 200-millisecond timer, set every time a packet 2754 is transmitted in PARTOPEN. If this timer goes off and the client 2755 is still in PARTOPEN, the client generates another DCCP-Ack and 2756 backs off the timer. If the client remains in PARTOPEN for more 2757 than 4MSL, it SHOULD reset the connection with Reset Code 2, 2758 "Aborted". 2760 The client leaves the PARTOPEN state for OPEN when it receives a 2761 packet other than DCCP-Response or DCCP-Reset from the server. 2763 8.2. Data Transfer 2765 In the central, data transfer phase of the connection, both server 2766 and client are in the OPEN state. 2768 DCCP A sends DCCP-Data and DCCP-DataAck packets to DCCP B due to 2769 application events on host A.
These packets are congestion- 2770 controlled by the CCID for the A-to-B half-connection. In contrast, 2771 DCCP-Ack packets sent by DCCP A are controlled by the CCID for the 2772 B-to-A half-connection. Generally, DCCP A will piggyback 2773 acknowledgement information on DCCP-Data packets when acceptable, 2774 creating DCCP-DataAck packets. DCCP-Ack packets are used when there 2775 is no data to send from DCCP A to DCCP B, or when the congestion 2776 state of the A-to-B CCID will not allow data to be sent. 2778 The DCCP-Move, DCCP-Sync, and DCCP-SyncAck packets will also occur 2779 in the data transfer phase. DCCP-Move handling is discussed in 2780 Section 14, and some cases causing DCCP-Sync generation are 2781 discussed in Section 7.5. One important distinction between DCCP- 2782 Sync packets and other packet types is that DCCP-Sync elicits an 2783 immediate acknowledgement. On receiving a valid DCCP-Sync packet, a 2784 DCCP endpoint MUST immediately generate and send a DCCP-SyncAck in 2785 response; and the Acknowledgement Number on that DCCP-SyncAck MUST 2786 equal the Sequence Number of the DCCP-Sync. 2788 A particular DCCP implementation might decide to initiate feature 2789 negotiation only once the OPEN state was reached, in which case it 2790 might not allow data transfer until some time later. Data received 2791 during that time SHOULD be rejected and reported using a Data 2792 Dropped Drop Block with Drop Code 0. 2794 8.3. Termination 2796 DCCP connection termination uses a handshake consisting of an 2797 optional DCCP-CloseReq packet, a DCCP-Close packet, and a DCCP-Reset 2798 packet. The server moves from the OPEN state, possibly through the 2799 CLOSEREQ state, to CLOSED; the client moves from OPEN through 2800 CLOSING to TIMEWAIT, and after 2MSL wait time, to CLOSED. 
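The termination handshake can be sketched as a small reaction table, following the rules this section gives for valid DCCP-CloseReq, DCCP-Close, and DCCP-Reset packets. This is a simplified illustration: the function name and the string encoding of states and replies are assumptions made for the example, and sequence number validation, retransmission timers, and the state guards of Section 8.5 are all omitted.

```python
def on_termination_packet(state: str, ptype: str):
    """Return (new_state, reply) for a valid termination packet.

    reply is the packet to send in response, or None if nothing is sent.
    """
    if ptype == "CloseReq":
        # Receiver of CloseReq answers with Close and waits for the Reset.
        return "CLOSING", "Close"
    if ptype == "Close":
        # Receiver of Close answers with Reset Code 1, "Closed".
        return "CLOSED", "Reset(Closed)"
    if ptype == "Reset":
        # Receiver of the Reset holds TIMEWAIT; Resets are never answered.
        return "TIMEWAIT", None
    return state, None
```

Walking the server-initiated sequence through this table gives: the client in OPEN receives CloseReq and moves to CLOSING, the server in CLOSEREQ receives Close and moves to CLOSED, and the client then receives the Reset and moves to TIMEWAIT.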
2802 The sequence DCCP-CloseReq, DCCP-Close, DCCP-Reset is used when the 2803 server decides to close the connection, but doesn't want to hold 2804 TIMEWAIT state: 2806 Client State Server State 2807 OPEN OPEN 2808 1. <-- CloseReq <-- CLOSEREQ 2809 2. CLOSING --> Close --> 2810 3. <-- Reset <-- CLOSED 2811 4. TIMEWAIT 2812 5. CLOSED 2814 A shorter sequence occurs when the client decides to close the 2815 connection. 2817 Client State Server State 2818 OPEN OPEN 2819 1. CLOSING --> Close --> 2820 2. <-- Reset <-- CLOSED 2821 3. TIMEWAIT 2822 4. CLOSED 2824 Finally, the server can decide to hold TIMEWAIT state: 2826 Client State Server State 2827 OPEN OPEN 2828 1. <-- Close <-- CLOSING 2829 2. CLOSED --> Reset --> 2830 3. TIMEWAIT 2831 4. CLOSED 2833 In all cases, the receiver of the DCCP-Reset packet holds TIMEWAIT 2834 state for the connection. As in TCP, TIMEWAIT state, where an 2835 endpoint quietly preserves a socket for 2MSL (4 minutes) after its 2836 connection has closed, ensures that no connection duplicating the 2837 current connection's source and destination addresses and ports can 2838 start up while old packets might remain in the network. 2840 The termination handshake proceeds as follows. The receiver of a 2841 valid DCCP-CloseReq packet MUST respond with a DCCP-Close packet; 2842 that receiving endpoint will expect to hold TIMEWAIT state after 2843 later receiving a DCCP-Reset. The receiver of a valid DCCP-Close 2844 packet MUST respond with a DCCP-Reset packet, with Reset Code 1, 2845 "Closed"; the endpoint that originally sent the DCCP-Close will hold 2846 TIMEWAIT state. The endpoint that receives a valid DCCP-Reset 2847 packet will hold TIMEWAIT state for the connection. 2849 A DCCP-Reset packet completes every DCCP connection, whether the 2850 termination is clean (due to application close; Reset Code 1, 2851 "Closed") or unclean. 
Unlike TCP, which has two distinct 2852 termination mechanisms (FIN and RST), DCCP ends all connections in a 2853 uniform manner. This is justified because some responses to 2854 connection termination are the same whether or not 2855 termination was clean. For instance, the endpoint that receives a 2856 valid DCCP-Reset should hold TIMEWAIT state for the connection. 2857 Processors that must distinguish between clean and unclean 2858 termination can examine the Reset Code. DCCP-Reset packets MUST NOT 2859 be generated in response to received DCCP-Reset packets. DCCP 2860 implementations generally transition to the CLOSED state after 2861 sending a DCCP-Reset packet. 2863 Endpoints in the CLOSEREQ and CLOSING states MUST retransmit DCCP- 2864 CloseReq and DCCP-Close packets, respectively, until leaving those 2865 states. The retransmission timer should initially be set to go off 2866 in two RTTs, or 0.4 seconds if the RTT is not known, and should back 2867 off to not less than once every 64 RTTs if no relevant response is 2868 received. 2870 Only the server can send a DCCP-CloseReq packet or enter the 2871 CLOSEREQ state. 2873 8.3.1. Abnormal Termination 2875 DCCP endpoints generate DCCP-Reset packets to terminate connections 2876 abnormally; a DCCP-Reset packet may be generated from any state. 2877 However, a DCCP endpoint in the CLOSED or LISTEN state may not have 2878 a proper sequence number available to send a Reset. In these cases, 2879 it MUST set the Reset's Sequence Number to zero. Resets sent in the 2880 CLOSED, LISTEN, and TIMEWAIT states often use Reset Code 3, "No 2881 Connection". Resets sent in the REQUEST or RESPOND states often use 2882 Reset Code 4, "Packet Error". 2884 8.4. DCCP State Diagram 2886 The most common state transitions discussed above can be summarized 2887 in the following state diagram. The diagram is illustrative; the 2888 text in Section 8.5 and elsewhere should be considered definitive.
2889 For example, there are arcs (not shown) from every state except 2890 CLOSED to TIMEWAIT, contingent on the receipt of a valid DCCP-Reset. 2892 +---------------------------+ +---------------------------+ 2893 | v v | 2894 | +----------+ | 2895 | +-------------+ CLOSED +------------+ | 2896 | | +----------+ active | | 2897 | | passive open | | 2898 | | open snd Request | | 2899 | v v | 2900 | +----------+ +----------+ | 2901 | | LISTEN | | REQUEST | | 2902 | +----+-----+ +----+-----+ | 2903 | | rcv Request rcv Response | | 2904 | | snd Response snd Ack | | 2905 | v v | 2906 | +----------+ +----------+ | 2907 | | RESPOND | | PARTOPEN | | 2908 | +----+-----+ +----+-----+ | 2909 | | rcv Ack/DataAck rcv packet | | 2910 | | | | 2911 | | +----------+ | | 2912 | +------------>| OPEN |<-----------+ | 2913 | +--+-+--+--+ | 2914 | server active close | | | active close | 2915 | snd CloseReq | | | or rcv CloseReq | 2916 | | | | snd Close | 2917 | | | | | 2918 | +----------+ | | | +----------+ | 2919 | | CLOSEREQ |<---------+ | +--------->| CLOSING | | 2920 | +----+-----+ | +----+-----+ | 2921 | | rcv Close | | | 2922 | | snd Reset | rcv Reset | | 2923 |<---------+ | v | 2924 | rcv Close | +----+-----+ | 2925 | snd Reset | | TIMEWAIT | | 2926 | | +----+-----+ | 2927 +-----------------------------+ | | 2928 +-----------+ 2929 2MSL timer expires 2931 8.5. Pseudocode 2933 This section presents an algorithm describing the processing steps a 2934 DCCP endpoint must go through when it receives a packet. A DCCP 2935 implementation need not implement the algorithm as it is described 2936 here, but any implementation MUST generate observable effects 2937 (meaning packets) exactly as indicated by this pseudocode, except 2938 where allowed otherwise by another part of this document. 2940 The received packet is written as P, the socket as S. 
Socket variables: 2941 S.SWL - sequence number window low 2942 S.SWH - sequence number window high 2943 S.AWL - acknowledgement number window low 2944 S.AWH - acknowledgement number window high 2945 S.ISS - initial sequence number sent 2946 S.ISR - initial sequence number received 2947 S.OSR - first OPEN sequence number received 2948 S.GSS - greatest sequence number sent 2949 S.GSR - greatest valid sequence number received 2950 S.GAR - greatest acknowledgement number received; initialized to S.ISS 2951 "Send packet" actions always use, and increment, S.GSS. 2953 First, check the header basics; 2954 If the header checksum is incorrect, drop packet and return. 2955 If the packet type is not understood, drop packet and return. 2956 If Data Offset is too small for packet type, or too large for packet, 2957 drop packet and return. 2959 Second, process DCCP-Move; 2960 If P.type == Move, 2961 Look up the Mobility ID in table; get socket. 2962 If socket exists && P.seqno >= S.SWL && P.ackno <= S.AWH 2963 && P.ackno >= S.ISS && S.state >= PARTOPEN && S.state < TIMEWAIT, 2964 Process options 2965 Set socket to point at new address/ports 2966 Add reference to new address/ports 2967 Set timer to remove old address/ports after 2MSL 2968 Choose new Mobility ID, add to table 2969 Send DCCP-Sync[Change L[Mobility ID, new ID]] 2970 Update S.GSR, S.SWL, S.SWH 2971 Drop packet and return 2972 Otherwise, 2973 Drop packet and return 2975 Third, check ports and process TIMEWAIT state; 2976 Look up flow ID; get socket. 
2977 If no socket, or S.state == TIMEWAIT, 2978 Generate Reset(No Connection) unless P.type == Reset 2979 Drop packet and return 2981 Fourth, process LISTEN state; 2982 If S.state == LISTEN, 2983 If P.type == Request, 2984 /* Init Cookie processing would go here */ 2985 Set S := new socket for this port pair 2986 S.state = RESPOND 2987 Choose S.ISS (initial seqno) 2988 Set S.ISR, S.GSR, S.SWL, S.SWH from packet 2989 Continue (with S.state == RESPOND) 2990 Otherwise, 2991 Generate Reset(No Connection) unless P.type == Reset 2992 Drop packet and return 2994 Fifth, process Reset; 2995 If P.type == Reset, 2996 If S.GAR <= P.ackno <= S.AWH 2997 && (P.seqno == 0 || P.seqno > S.GSR || S.state == REQUEST), 2998 Tear down connection 2999 S.state := TIMEWAIT 3000 Set TIMEWAIT timer 3001 Drop packet and return 3002 Otherwise (sequence numbers out of whack), 3003 Drop packet and return 3005 Sixth, process REQUEST state; 3006 If S.state == REQUEST, 3007 If P.type == Response && S.AWL <= P.ackno <= S.AWH, 3008 Set S.GSR, S.ISR, S.SWL, S.SWH 3009 Otherwise, 3010 Generate Reset(Packet Error) 3011 Drop packet and return 3013 Seventh, process Sync sequence numbers; 3014 If P.type == Sync || P.type == SyncAck, 3015 If S.AWL <= P.ackno <= S.AWH and P.seqno >= S.SWL, 3016 Update S.GSR, S.SWL, S.SWH 3017 Otherwise, 3018 Drop packet and return 3020 Eighth, check sequence numbers; 3021 If S.SWL <= P.seqno <= S.SWH 3022 && (P.ackno does not exist || S.AWL <= P.ackno <= S.AWH), 3023 Update S.GSR, S.GAR, S.SWL, S.SWH 3024 Otherwise, 3025 Send Sync packet acknowledging P.seqno 3026 Drop packet and return 3028 Ninth, check packet type; 3029 If (S.is_server && P.type == CloseReq) 3030 || (S.is_server && P.type == Response) 3031 || (S.is_client && P.type == Request) 3032 || (S.state >= OPEN && P.type == Request && P.seqno >= S.OSR) 3033 || (S.state >= OPEN && P.type == Response && P.seqno >= S.OSR) 3034 || (S.state == RESPOND && P.type == Data), 3035 Send Sync packet acknowledging P.seqno 3036 
Drop packet and return 3038 Tenth, process options; 3039 /* may involve resetting connection, etc. */ 3040 Mark packet as "received" for acknowledgement purposes 3041 On processing Confirm R(Mobility ID), 3042 Check that the confirmed Mobility ID is correct 3043 If a DCCP-Move was recently processed, 3044 Remove any old Mobility ID from table 3046 Eleventh, process RESPOND state; 3047 If S.state == RESPOND, 3048 If P.type == Request, 3049 Send Response 3050 Otherwise, 3051 S.OSR := P.seqno 3052 S.state := OPEN 3054 Twelfth, process REQUEST state; 3055 If S.state == REQUEST, 3056 S.state := PARTOPEN 3057 /* Do not send Data packets in PARTOPEN; furthermore, include Init 3058 Cookie on every packet */ 3059 Set PARTOPEN timer 3061 Thirteenth, process PARTOPEN state; 3062 If S.state == PARTOPEN, 3063 If P.type == Response, 3064 Send Ack 3065 Otherwise, 3066 S.OSR := P.seqno 3067 S.state := OPEN 3069 Fourteenth, process CloseReq; 3070 If P.type == CloseReq && S.state < CLOSEREQ, 3071 Generate Close 3072 S.state := CLOSING 3073 Set CLOSING timer 3075 Fifteenth, process Close; 3076 If P.type == Close, 3077 Generate Reset(Closed) 3078 Tear down connection 3079 Drop packet and return 3081 Sixteenth, process Sync; 3082 If P.type == Sync, 3083 Generate SyncAck 3085 Seventeenth, process data. 3086 Do not deliver data from more than one Request or Response 3088 9. Checksums 3090 DCCP uses a header checksum to protect its header against 3091 corruption. Generally, this checksum covers any application data as 3092 well. However, DCCP applications can request that the header 3093 checksum cover only part of the application data, or perhaps no 3094 application data at all. Link layers may then reduce their 3095 protection on unprotected parts of DCCP packets. For some noisy 3096 links, and applications that can tolerate corruption, this can 3097 greatly improve delivery rates and perceived performance. 
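The checksum computation that Sections 9.1 and 9.2 go on to define can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are invented for the example, the caller is assumed to supply the pseudoheader already built and the DCCP header with its Checksum field already zeroed, and no header layout is assumed.

```python
def ones_complement_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"  # pad on the right with 8 zero bits; not transmitted
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def covered_app_bytes(cs_cov: int, app_len: int) -> int:
    """Number of application data bytes protected for a given CsCov."""
    if not 0 <= cs_cov <= 15:
        raise ValueError("CsCov is a 4-bit value")
    if cs_cov == 0:
        return app_len  # full coverage
    covered = (cs_cov - 1) * 4
    if covered > app_len:
        raise ValueError("CsCov covers more data than the packet holds")
    return covered


def dccp_checksum(pseudoheader: bytes, header_and_options: bytes,
                  app_data: bytes, cs_cov: int) -> int:
    """Checksum over pseudoheader, header, options, and covered data."""
    n = covered_app_bytes(cs_cov, len(app_data))
    return ones_complement_checksum(
        pseudoheader + header_and_options + app_data[:n])
```

With CsCov set to 1, the result does not depend on the application data at all, which is what lets a link layer relax its protection of that part of the packet.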
3099 If checksum coverage is complete, packets with corrupt application 3100 data must be treated as network losses, thus incurring a loss 3101 response from the sender's congestion control mechanism. Such a 3102 heavy-duty response may unfairly penalize connections on links with 3103 high background corruption. It is to the application's benefit to 3104 report corruption losses differently from network losses. 3105 Therefore, even applications that demand correct data can make use 3106 of reduced checksum coverage, by including a Data Checksum option. 3107 Data Checksum holds a strong checksum of the application data. The 3108 combination of reduced checksum coverage and Data Checksum can 3109 detect application data corruption, but report it as corruption, not 3110 congestion, via Data Dropped options (see Section 11.7). 3112 Reduced checksum coverage introduces some security considerations; 3113 see Section 19.2. See Appendix B.1 for further motivation and 3114 discussion. DCCP's implementation of reduced checksum coverage was 3115 inspired by UDP-Lite [UDP-LITE]. 3117 9.1. Header Checksum Field 3119 DCCP uses the TCP/IP checksum algorithm. The Checksum field in the 3120 DCCP generic header (see Section 5.1) equals the 16 bit one's 3121 complement of the one's complement sum of all 16 bit words in the 3122 DCCP header, DCCP options, a pseudoheader taken from the network- 3123 layer header, and, depending on the value of the Checksum Coverage 3124 field, some or all of the application data. When calculating the 3125 checksum, the Checksum field itself is treated as 0. If a packet 3126 contains an odd number of header and text bytes to be checksummed, 8 3127 zero bits are added on the right to form a 16 bit word for checksum 3128 purposes. The pad byte is not transmitted as part of the packet. 3130 The pseudoheader is calculated as for TCP. 
For IPv4, it is 96 bits 3131 long, and consists of the IPv4 source and destination addresses, the 3132 IP protocol number for DCCP (padded on the left with 8 zero bits), 3133 and the DCCP length as a 16-bit quantity (the length of the DCCP 3134 header with options, plus the length of any data); see Section 3.1 3135 of [RFC 793]. For IPv6, it is 320 bits long, and consists of the 3136 IPv6 source and destination addresses, the DCCP length as a 32-bit 3137 quantity, and the IP protocol number for DCCP (padded on the left 3138 with 24 zero bits); see Section 8.1 of [RFC 2460]. 3140 Packets with invalid header checksums MUST be ignored. In 3141 particular, their options MUST NOT be processed. 3143 9.2. Header Checksum Coverage Field 3145 The Checksum Coverage field in the DCCP generic header (see Section 3146 5.1) specifies what parts of the packet are covered by the Checksum 3147 field, as follows: 3149 CsCov = 0 The Checksum field covers the DCCP header, DCCP 3150 options, network-layer pseudoheader, and all 3151 application data in the packet, possibly padded on 3152 the right with zeros to an even number of bytes. 3154 CsCov = 1-15 The Checksum field covers the DCCP header, DCCP 3155 options, network-layer pseudoheader, and the initial 3156 (CsCov-1)*4 bytes of the packet's application data. 3158 Thus, if CsCov is 1, none of the application data is protected by 3159 the header checksum. The value (CsCov-1)*4 MUST be less than or 3160 equal to the length of the application data. Packets with invalid 3161 CsCov values MUST be ignored; in particular, their options MUST NOT 3162 be processed. The meanings of values other than 0 and 1 should be 3163 considered experimental. 3165 Values other than 0 specify that corruption is acceptable in some or 3166 all of the DCCP packet's application data. In fact, DCCP cannot 3167 even detect corruption in areas not covered by the header checksum, 3168 unless the Data Checksum option is used. 
Applications should not 3169 make any assumptions about the correctness of received data not 3170 covered by the checksum, and should if necessary introduce their own 3171 validity checks. 3173 A DCCP application interface should let sending applications suggest 3174 a value for CsCov for sent packets, defaulting to 0 (full coverage). 3176 It should also let receiving applications refuse delivery of packets 3177 with checksum coverage less than a value provided by the 3178 application; by default, only packets with fully-covered application 3179 data should be accepted. (Note that, for short packets, application 3180 data might be fully covered by a nonzero Checksum Coverage value.) 3181 Lower layers that support partial error detection MAY use the 3182 Checksum Coverage field as a hint of where errors do not need to be 3183 detected. Lower layers MUST use a strong error detection mechanism 3184 to detect at least errors that occur in the sensitive part of the 3185 packet, and discard damaged packets. The sensitive part consists of 3186 the bytes between the first byte of the IP header and the last byte 3187 identified by Checksum Coverage. 3189 For more details on application and lower-layer interface issues 3190 relating to partial checksumming, see [UDP-LITE]. 3192 9.3. Data Checksum Option 3194 The Data Checksum option holds a 32-bit CRC-32c cyclic redundancy- 3195 check code of a DCCP packet's application data. 3197 +--------+--------+--------+--------+--------+--------+ 3198 |00101100|00000110| CRC-32c | 3199 +--------+--------+--------+--------+--------+--------+ 3200 Type=44 Length=6 3202 Data Checksum is intended for packets containing application data, 3203 such as DCCP-Request, DCCP-Response, DCCP-Data, and DCCP-DataAck, 3204 but it may be included on any packet. The sending DCCP computes the 3205 CRC of the bytes comprising the application data and stores it in 3206 the option data. 
The CRC-32c algorithm used for Data Checksum is 3207 the same as that used for SCTP [RFC 3309]; note that the CRC-32c of 3208 zero bytes of data equals zero. The DCCP header checksum will cover 3209 the Data Checksum option, so the data checksum must be computed 3210 before the header checksum. 3212 The receiving DCCP SHOULD compute the received application data's 3213 CRC-32c using the same algorithm as the sender, and compare the 3214 result and the Data Checksum value. If the values differ, the 3215 packet's application data MUST be dropped, and reported using a Data 3216 Dropped option as dropped due to corruption (Drop Code 3). However, 3217 DCCP MAY provide an API through which the receiving application 3218 could request delivery of known-corrupt data. When that API is 3219 active, the packet's data SHOULD be delivered, but reported as 3220 delivered corrupt (Drop Code 7) using a Data Dropped option. In 3221 either case, the packet will be reported as Received or Received ECN 3222 Marked by Ack Vector or similar options. 3224 9.3.1. Check Data Checksum Feature 3226 The Check Data Checksum feature lets a sending DCCP determine 3227 whether or not its partner can check Data Checksum options. DCCP A 3228 sends a Mandatory "Change R(Check Data Checksum, 1)" option to 3229 DCCP B to require B to check Data Checksum options (the connection 3230 will be reset if DCCP B cannot). 3232 Check Data Checksum has feature number 10, and is server-priority. 3233 It takes one-byte Boolean values. DCCP B MUST check any received 3234 Data Checksum options when Check Data Checksum/B is one, although it 3235 MAY check them even when Check Data Checksum/B is zero. Values of 3236 two or more are reserved. New connections start with Check Data 3237 Checksum 0 for both endpoints. 3239 9.3.2. Usage Notes 3241 Internet links must normally apply strong integrity checks to the 3242 packets they transmit [UDP-LITE] [LINK BCP]. 
Data Checksum is 3243 redundant for DCCP packets whose integrity is checked by every link 3244 they traverse. This is the default case when the DCCP header's 3245 Checksum Coverage value equals zero (full coverage). However, the 3246 DCCP Checksum Coverage value might not be zero. By setting partial 3247 Checksum Coverage, the application indicates that it can tolerate 3248 corruption in the unprotected part of the application data. 3249 Recognizing this, link layers may reduce the strength of their error 3250 detection and/or correction when transmitting this unprotected part, 3251 which can significantly increase the probability of the endpoint 3252 receiving corrupt data. Data Checksum lets the receiver detect any 3253 ensuing corruption. 3255 10. Congestion Control IDs 3257 Each congestion control mechanism supported by DCCP is assigned a 3258 congestion control identifier, or CCID: a number from 0 to 255. 3259 During connection setup, and optionally thereafter, the endpoints 3260 negotiate their congestion control mechanisms by negotiating the 3261 values for their Congestion Control ID features. Congestion Control 3262 ID has feature number 1. The CCID/A value equals the CCID in use 3263 for the A-to-B half-connection. DCCP B sends a "Change R(CCID, K)" 3264 option to ask DCCP A to use CCID K for its data packets. 3266 CCID is a server-priority feature, so CCID negotiation options can 3267 list multiple acceptable CCIDs, sorted in descending order of 3268 priority. For example, the option "Change R(CCID, 1 2 3)" asks the 3269 receiver to use CCID 1 for its packets, although CCIDs 2 and 3 are 3270 also acceptable. (This corresponds to the bytes "35, 6, 1, 1, 2, 3271 3": Change R option (35), option length (6), feature ID (1), CCIDs 3272 (1, 2, 3).) Similarly, "Confirm L(CCID, 1, 1 2 3)" tells the 3273 receiver that the sender is using CCID 1 for its packets, but that 3274 CCIDs 2 or 3 might also be acceptable. 
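The byte layout quoted above for "Change R(CCID, 1 2 3)" can be reproduced with a short sketch. Only the Change R option type (35), the CCID feature number (1), and the layout of type, length, feature number, and values are taken from this section; the function names are assumptions made for the example, and other option types are deliberately not encoded.

```python
CHANGE_R = 35      # Change R option type, as given in the example above
CCID_FEATURE = 1   # Congestion Control ID feature number


def encode_change_r(feature: int, values) -> bytes:
    """Encode a Change R option: type, length, feature number, values."""
    body = bytes([feature, *values])
    length = 2 + len(body)  # length counts the type and length bytes too
    return bytes([CHANGE_R, length]) + body


def decode_feature_option(option: bytes):
    """Split a feature option into (type, feature number, value list)."""
    if len(option) < 3 or option[1] != len(option):
        raise ValueError("malformed feature option")
    return option[0], option[2], list(option[3:])
```

For instance, encode_change_r(CCID_FEATURE, [1, 2, 3]) yields the six bytes 35, 6, 1, 1, 2, 3, with the preferred CCID listed first.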
3276 The CCIDs defined by this document are: 3278 CCID Meaning 3279 ---- ------- 3280 0 Reserved 3281 1 Unspecified Sender-Based Congestion Control 3282 2 TCP-like Congestion Control 3283 3 TFRC Congestion Control 3285 New connections start with CCID 2 for both endpoints. If this is 3286 unacceptable for a DCCP endpoint, that endpoint MUST send Mandatory 3287 Change(CCID) options on its first packets. 3289 All CCIDs standardized for use with DCCP will correspond to 3290 congestion control mechanisms previously standardized by the IETF. 3291 We expect that for quite some time, all such mechanisms will be TCP- 3292 friendly, but TCP-friendliness is not an explicit DCCP requirement. 3294 A DCCP implementation intended for general use, such as an 3295 implementation in a general-purpose operating system kernel, SHOULD 3296 implement at least CCIDs 1 and 2. The intent is to make these CCIDs 3297 broadly available for interoperability, although particular 3298 applications might disallow their use. 3300 10.1. Unspecified Sender-Based Congestion Control 3302 CCID 1 denotes an unspecified sender-based congestion control 3303 mechanism. This provides a limited, controlled form of 3304 interoperability for new IETF-approved CCIDs: with CCID 1, an HC- 3305 Sender can use a new sender-based congestion control mechanism whose 3306 details the HC-Receiver does not understand. 3308 Some congestion control mechanisms require only generic behavior 3309 from the receiver. For example, CCID 2, TCP-like Congestion 3310 Control, requires that the receiver (1) send Ack Vectors and (2) 3311 respond to Ack Ratio. Both of these requirements use generic 3312 mechanisms described in this document. Thus, a CCID 2 HC-Receiver 3313 doesn't really need to understand the details of CCID 2. 3315 CCID 1 uses this insight to support forward compatibility for 3316 sender-based congestion control mechanisms. 
An HC-Sender proposes 3317 CCID 1 as a proxy for a sender-based mechanism whose details the HC- 3318 Receiver doesn't need to understand. The HC-Receiver can then agree 3319 to CCID 1, and provide generic acknowledgement feedback as requested 3320 by other features (such as Send Ack Vector). Individual CCID 3321 profile documents say whether or not they can masquerade as CCID 1. 3323 For example, say that CCID 98, a new sender-based congestion control 3324 mechanism using Ack Vector for acknowledgements, has entered the 3325 IETF standards process, and the IETF has approved the use of CCID 1 3326 as a proxy for CCID 98. Now, say DCCP A would like to use CCID 98 3327 for its data packets. It should therefore send a "Change L(CCID, 98 3328 1)" option to open a CCID negotiation. 98 comes first, since that 3329 is the preferred CCID; 1 comes next, as a potential proxy for 98. 3330 If DCCP B understands CCID 98, it will respond with "Confirm R(CCID, 3331 98, ...)" and all is well. But if it does not understand CCID 98, 3332 it may respond with "Confirm R(CCID, 1, ...)", still allowing DCCP A 3333 to use CCID 98. DCCP A will separately negotiate Send Ack Vector, 3334 and thus DCCP B will provide the feedback DCCP A requires, namely 3335 Ack Vector, without needing to understand the operation of CCID 98. 3337 Implementors MUST NOT use CCID 1 in production environments as a 3338 proxy for congestion control mechanisms that have not entered the 3339 IETF standards process. We intend that any production use of CCID 1 3340 would have to be explicitly approved first by the IETF. Middleboxes 3341 MAY choose to treat the use of CCID 1 as experimental or 3342 unacceptable. 3344 Since CCID 1 should be used only as a proxy for other, defined 3345 CCIDs, an HC-Sender MUST NOT report a preference list consisting 3346 only of CCID 1, and the option "Change L(CCID, 1)" is illegal. 3347 Receiving such an option SHOULD result in connection reset with 3348 Reset Code 5, "Option Error". 
An HC-Receiver MAY suggest CCID 1 3349 exclusively: the option "Change R(CCID, 1)" is not illegal. 3351 If CCID 1 is the result of a CCID feature negotiation, the HC-Sender 3352 determines which CCID to actually use by picking the earliest CCID 3353 in its preference list that can masquerade as CCID 1. The HC-Sender 3354 MUST pick a CCID that appeared explicitly in its preference list. 3356 Many DCCP APIs will allow applications to suggest preferred CCIDs 3357 for sending and receiving data. Such APIs might let applications 3358 allow or prevent the use of CCID 1 for receiving, but they should 3359 not let applications suggest the use of CCID 1 for sending. The 3360 code implementing a particular CCID should add CCID 1 to the HC- 3361 Sender's CCID preference list when appropriate, unless the 3362 application disagrees. The default for both sender and receiver 3363 should be to allow CCID 1 when possible. 3365 CCID 1 places no restrictions on how often the HC-Receiver may send 3366 DCCP-Ack packets. A careful implementation SHOULD implement a 3367 liberal rate limit on DCCP-Acks to prevent ack storms. 3369 10.2. TCP-like Congestion Control 3371 CCID 2, TCP-like Congestion Control, denotes Additive Increase, 3372 Multiplicative Decrease (AIMD) congestion control with behavior 3373 modelled directly on TCP, including congestion window, slow start, 3374 timeouts, and so forth. CCID 2 achieves maximum bandwidth over the 3375 long term, consistent with the use of end-to-end congestion control, 3376 but halves its congestion window in response to each congestion 3377 event. This leads to the abrupt rate changes typical of TCP. 3378 Applications should use CCID 2 if they prefer maximum bandwidth 3379 utilization to steadiness of rate. This is often the case for 3380 applications that are not playing their data directly to the user. 
3381 For example, a hypothetical application that transferred files over 3382 DCCP, using application-level retransmissions for lost packets, 3383 would prefer CCID 2 to CCID 3. On-line games may also prefer CCID 3384 2. 3386 CCID 2 is further described in [CCID 2 PROFILE]. 3388 10.3. TFRC Congestion Control 3390 CCID 3 denotes TCP-Friendly Rate Control (TFRC), an equation-based 3391 rate-controlled congestion control mechanism. TFRC is designed to 3392 be reasonably fair when competing for bandwidth with TCP-like flows, 3393 where a flow is "reasonably fair" if its sending rate is generally 3394 within a factor of two of the sending rate of a TCP flow under the 3395 same conditions. However, TFRC has a much lower variation of 3396 throughput over time compared with TCP, which makes CCID 3 more 3397 suitable than CCID 2 for applications such as telephony or streaming 3398 media where a relatively smooth sending rate is of importance. 3400 CCID 3 is further described in [CCID 3 PROFILE]. The TFRC congestion 3401 control algorithms were initially described in [RFC 3448]. 3403 10.4. CCID-Specific Options, Features, and Reset Codes 3405 Half of the option types, feature numbers, and Reset Codes are 3406 reserved for CCID-specific use. CCIDs may often need new options, 3407 for communicating acknowledgement or rate information, for example; 3408 reserved option spaces let CCIDs create options at will without 3409 polluting the global option space. Option 128 might have different 3410 meanings on a half-connection using CCID 4 and a half-connection 3411 using CCID 8. CCID-specific options and features will never 3412 conflict with global options and features introduced by later 3413 versions of this specification. 3415 Any packet may contain information meant for either half-connection, 3416 so CCID-specific option types, feature numbers, and Reset Codes 3417 explicitly signal the half-connection to which they apply. 
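The numbering rules spelled out in the list and table that follow can be sketched as a lookup from a CCID-specific number to the half-connection it concerns. The function name, the "A"/"B" endpoint labels, and the encoding of the kind argument ("option" for a plain option, "L" for Change L or Confirm L, "R" for Change R or Confirm R) are assumptions made for this illustration.

```python
def relevant_half_connection(sender: str, kind: str, number: int) -> str:
    """Return "A-to-B" or "B-to-A" for a CCID-specific option or feature.

    sender is the endpoint ("A" or "B") that sent the packet.
    """
    if not 128 <= number <= 255:
        raise ValueError("not a CCID-specific number")
    peer = "B" if sender == "A" else "A"
    low_range = number <= 191  # 128-191: HC-Sender side; 192-255: HC-Receiver

    if kind == "option":
        # Low-range options travel from HC-Sender to HC-Receiver.
        hc_sender = sender if low_range else peer
    elif kind in ("L", "R"):
        # L options are sent by the feature location, R options by its peer.
        location = sender if kind == "L" else peer
        other = peer if location == sender else sender
        # Low-range features are located at the HC-Sender.
        hc_sender = location if low_range else other
    else:
        raise ValueError("kind must be 'option', 'L', or 'R'")

    hc_receiver = "B" if hc_sender == "A" else "A"
    return f"{hc_sender}-to-{hc_receiver}"
```

The relevant CCID is then whichever CCID is in use on the returned half-connection.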
3419 o Option numbers 128 through 191 are for options sent from the HC- 3420 Sender to the HC-Receiver; option numbers 192 through 255 are for 3421 options sent from the HC-Receiver to the HC-Sender. 3423 o Reset Codes 128 through 191 indicate that the HC-Sender reset the 3424 connection (most likely because of some problem with 3425 acknowledgements sent by the HC-Receiver); Reset Codes 192 through 3426 255 indicate that the HC-Receiver reset the connection (most 3427 likely because of some problem with data packets sent by the HC- 3428 Sender). 3430 o Finally, feature numbers 128 through 191 are used for features 3431 located at the HC-Sender; feature numbers 192 through 255 are for 3432 features located at the HC-Receiver. Since Change L and Confirm L 3433 options for a feature are sent by the feature location, we know 3434 that any Change L(128) option was sent by the HC-Sender, while any 3435 Change L(192) option was sent by the HC-Receiver. Similarly, 3436 Change R(128) options are sent by the HC-Receiver, while 3437 Change R(192) options are sent by the HC-Sender. 3439 For example, consider a DCCP connection where the A-to-B half- 3440 connection uses CCID 4 and the B-to-A half-connection uses CCID 5. 3441 Here is how a sampling of CCID-specific options and features are 3442 assigned to half-connections: 3444 Relevant Relevant 3445 Packet Option Half-conn. CCID 3446 ------ ------ ---------- ---- 3447 A > B 128 A-to-B 4 3448 A > B 192 B-to-A 5 3449 A > B Change L(128, ...) A-to-B 4 3450 A > B Change R(192, ...) A-to-B 4 3451 A > B Confirm L(128, ...) A-to-B 4 3452 A > B Confirm R(192, ...) A-to-B 4 3453 A > B Change R(128, ...) B-to-A 5 3454 A > B Change L(192, ...) B-to-A 5 3455 A > B Confirm R(128, ...) B-to-A 5 3456 A > B Confirm L(192, ...) B-to-A 5 3458 B > A 128 B-to-A 5 3459 B > A 192 A-to-B 4 3460 B > A Change L(128, ...) B-to-A 5 3461 B > A Change R(192, ...) B-to-A 5 3462 B > A Confirm L(128, ...) B-to-A 5 3463 B > A Confirm R(192, ...) 
B-to-A 5 3464 B > A Change R(128, ...) A-to-B 4 3465 B > A Change L(192, ...) A-to-B 4 3466 B > A Confirm R(128, ...) A-to-B 4 3467 B > A Confirm L(192, ...) A-to-B 4 3469 CCID-specific options and features have no clear meaning when a 3470 nontrivial negotiation for the relevant CCID is in progress. This 3471 can happen when a CCID-specific option follows a Change(CCID) 3472 option. Say the Change option lists CCID X first. Then the 3473 negotiation is nontrivial if and only if its result is not X. CCID- 3474 specific options and features MUST be ignored during a nontrivial 3475 CCID negotiation, except that Mandatory CCID-specific options and 3476 features MUST induce a DCCP-Reset with Reset Code 6, "Mandatory 3477 Error". 3479 11. Acknowledgements 3481 Congestion control requires receivers to transmit information about 3482 packet losses and ECN marks to senders. DCCP receivers MUST report 3483 all congestion they see, as defined by the relevant CCID profile. 3484 Each CCID says when acknowledgements should be sent, what options 3485 they must use, how they should be congestion controlled, and so on. 3487 Most acknowledgements use DCCP options. For example, on a half- 3488 connection with CCID 2 (TCP-like), the receiver reports 3489 acknowledgement information using the Ack Vector option. This 3490 section describes common acknowledgement options and shows how acks 3491 using those options will commonly work. Full descriptions of the 3492 ack mechanisms used for each CCID are laid out in the CCID profile 3493 specifications. 3495 Acknowledgement options, such as Ack Vector, generally depend on the 3496 DCCP Acknowledgement Number, and are thus only allowed on packet 3497 types that carry that number (all packets except DCCP-Request and 3498 DCCP-Data). Detailed acknowledgement options are not necessarily 3499 required on every packet that carries an Acknowledgement Number, 3500 however. 3502 11.1. 
Acks of Acks and Unidirectional Connections 3504 DCCP was designed to work well for both bidirectional and 3505 unidirectional flows of data, and for connections that transition 3506 between these states. However, acknowledgements required for a 3507 unidirectional connection are very different from those required for 3508 a bidirectional connection. In particular, unidirectional 3509 connections need to worry about acks of acks. 3511 The ack-of-acks problem arises because some acknowledgement 3512 mechanisms are reliable. For example, an HC-Receiver using CCID 2, 3513 TCP-like Congestion Control, sends Ack Vectors containing completely 3514 reliable acknowledgement information. The HC-Sender should 3515 occasionally inform the HC-Receiver that it has received an ack. If 3516 it did not, the HC-Receiver might resend complete Ack Vector 3517 information, going back to the start of the connection, with every 3518 DCCP-Ack packet! However, note that acks-of-acks need not be 3519 reliable themselves: when an ack-of-acks is lost, the HC-Receiver 3520 will simply maintain, and periodically retransmit, old 3521 acknowledgement-related state for a little longer. Therefore, there 3522 is no need for acks-of-acks-of-acks. 3524 When communication is bidirectional, any required acks-of-acks are 3525 automatically contained in normal acknowledgements for data packets. 3526 On a unidirectional connection, however, the receiver DCCP sends no 3527 data, so the sender would not normally send acknowledgements. 3528 Therefore, the CCID in force on that half-connection must explicitly 3529 say whether, when, and how the HC-Sender should generate acks-of- 3530 acks. 3532 For example, consider a bidirectional connection where both half- 3533 connections use the same CCID (either 2 or 3), and where DCCP B goes 3534 "quiescent". This means that the connection becomes unidirectional: 3535 DCCP B stops sending data, and sends only DCCP-Ack packets to 3536 DCCP A.
For CCID 2, TCP-like Congestion Control, DCCP B uses Ack 3537 Vector to reliably communicate which packets it has received. As 3538 described above, DCCP A must occasionally acknowledge a pure 3539 acknowledgement from DCCP B, so that B can free old Ack Vector 3540 state. For instance, A might send a DCCP-DataAck packet every now 3541 and then, instead of DCCP-Data. In contrast, for CCID 3, TFRC 3542 Congestion Control, DCCP B's acknowledgements generally need not be 3543 reliable, since they contain cumulative loss rates; TFRC works even 3544 if every DCCP-Ack is lost. Therefore, DCCP A need never acknowledge 3545 an acknowledgement. 3547 When communication is unidirectional, a single CCID---in the 3548 example, the A-to-B CCID---controls both DCCPs' acknowledgements, in 3549 terms of their content, their frequency, and so forth. For 3550 bidirectional connections, the A-to-B CCID governs DCCP B's 3551 acknowledgements (including its acks of DCCP A's acks), while the B- 3552 to-A CCID governs DCCP A's acknowledgements. 3554 DCCP A switches its ack pattern from bidirectional to unidirectional 3555 when it notices that DCCP B has gone quiescent. It switches from 3556 unidirectional to bidirectional when it must acknowledge even a 3557 single DCCP-Data or DCCP-DataAck packet from DCCP B. 3559 Each CCID defines how to detect quiescence on that CCID, and how 3560 that CCID handles acks-of-acks on unidirectional connections. The 3561 B-to-A CCID defines when DCCP B has gone quiescent. Usually, this 3562 happens when a period has passed without B sending any data packets; 3563 for CCID 2, this period is the maximum of 0.2 seconds and two round- 3564 trip times. The A-to-B CCID defines how DCCP A handles acks-of-acks 3565 once DCCP B has gone quiescent. 3567 11.2. Ack Piggybacking 3569 Acknowledgements of A-to-B data MAY be piggybacked on data sent by 3570 DCCP B, as long as that does not delay the acknowledgement longer 3571 than the A-to-B CCID would find acceptable. 
However, data 3572 acknowledgements often require more than 4 bytes to express. A 3573 large set of acknowledgements prepended to a large data packet might 3574 exceed the allowed maximum packet size. In this case, DCCP B SHOULD 3575 send separate DCCP-Data and DCCP-Ack packets, or wait, but not too 3576 long, for a smaller datagram. 3578 Piggybacking is particularly common at DCCP A when the B-to-A half- 3579 connection is quiescent---that is, when DCCP A is just acknowledging 3580 DCCP B's acknowledgements, as described above. There are three 3581 reasons to acknowledge DCCP B's acknowledgements: to allow DCCP B to 3582 free up information about previously acknowledged data packets from 3583 A; to shrink the size of future acknowledgements; and to manipulate 3584 the rate at which future acknowledgements are sent. Since these are 3585 secondary concerns, DCCP A can generally afford to wait indefinitely 3586 for a data packet to piggyback its acknowledgement onto. 3588 Any restrictions on ack piggybacking are described in the relevant 3589 CCID's profile. 3591 11.3. Ack Ratio Feature 3593 Ack Ratio provides a common mechanism by which CCIDs that clock 3594 acknowledgements off data packets can perform rudimentary congestion 3595 control on the acknowledgement stream. CCID 2, TCP-like Congestion 3596 Control, uses Ack Ratio to limit the rate of its acknowledgement 3597 stream, for example. Some CCIDs ignore Ack Ratio, performing 3598 congestion control on acknowledgements in some other way. 3600 Ack Ratio has feature number 7, and is non-negotiable. It takes 3601 two-byte integer values. The Ack Ratio/A feature is the rough ratio 3602 of data packets sent by DCCP A to acknowledgement packets sent back 3603 by DCCP B. For example, if Ack Ratio/A is four, then DCCP B will 3604 send at least one acknowledgement packet for every four data packets 3605 sent by DCCP A. DCCP A sends a "Change L(Ack Ratio)" option to 3606 notify DCCP B of its ack ratio. 
New connections start with Ack 3607 Ratio 2 for both endpoints. 3609 Implementations should treat Ack Ratio as a loose guideline. For 3610 instance, a DCCP endpoint might implement a delayed acknowledgement 3611 timer like TCP's, whereby each packet is acknowledged within at most 3612 T seconds of its receipt. (In TCP, T is commonly set to 200 3613 milliseconds.) This is explicitly allowed even though it might lead 3614 to sending more acknowledgement packets than Ack Ratio would 3615 suggest. Particular algorithms for setting and using Ack Ratio are 3616 discussed in the relevant CCID drafts. 3618 11.4. Ack Vector Options 3620 The Ack Vector gives a run-length encoded history of data packets 3621 received at the client. Each byte of the vector gives the state of 3622 that data packet in the loss history, and the number of preceding 3623 packets with the same state. The option's data looks like this: 3625 +--------+--------+--------+--------+--------+-------- 3626 |0010011?| Length |SSLLLLLL|SSLLLLLL|SSLLLLLL| ... 3627 +--------+--------+--------+--------+--------+-------- 3628 Type=38/39 \___________ Vector ___________... 3630 The two Ack Vector options (option types 38 and 39) differ only in 3631 the values they imply for ECN Nonce Echo. Section 12.2 describes 3632 this further. 3634 The vector itself consists of a series of bytes, each of whose 3635 encoding is: 3637 0 1 2 3 4 5 6 7 3638 +-+-+-+-+-+-+-+-+ 3639 |Sta| Run Length| 3640 +-+-+-+-+-+-+-+-+ 3642 Sta[te] occupies the most significant two bits of each byte, and can 3643 have one of four values: 3645 0 Packet received (and not ECN marked). 3647 1 Packet received ECN marked. 3649 2 Reserved. 3651 3 Packet not yet received. 3653 Run Length, the least significant six bits of each byte, specifies 3654 how many consecutive packets have the given State. Run Length zero 3655 says the corresponding State applies to one packet only; Run Length 3656 63 says it applies to 64 consecutive packets. 
Run lengths of 65 or 3657 more must be encoded in multiple bytes. 3659 The first byte in the first Ack Vector option refers to the packet 3660 indicated in the Acknowledgement Number; subsequent bytes refer to 3661 older packets. (Ack Vector MUST NOT be sent on DCCP-Data and DCCP- 3662 Request packets, which lack an Acknowledgement Number.) If an Ack 3663 Vector contains the decimal values 0,192,3,64,5 and the 3664 Acknowledgement Number is decimal 100, then: 3666 Packet 100 was received (Acknowledgement Number 100, State 0, 3667 Run Length 0). 3669 Packet 99 was lost (State 3, Run Length 0). 3671 Packets 98, 97, 96 and 95 were received (State 0, Run Length 3). 3673 Packet 94 was ECN marked (State 1, Run Length 0). 3675 Packets 93, 92, 91, 90, 89, and 88 were received (State 0, Run 3676 Length 5). 3678 A single Ack Vector option can acknowledge up to 16192 data packets. 3679 Should more packets need to be acknowledged than can fit in 253 3680 bytes of Ack Vector, then multiple Ack Vector options can be sent; 3681 the second Ack Vector begins where the first left off, and so forth. 3683 Ack Vector states are subject to two general constraints. (These 3684 principles SHOULD also be followed for other acknowledgement 3685 mechanisms; referring to Ack Vector states simplifies their 3686 explanation.) 3688 1. Packets reported as State 0 or State 1 MUST have been processed 3689 by the receiving DCCP stack. In particular, their options must 3690 have been processed. Any data on the packet need not have been 3691 delivered to the receiving application; in fact, the data may 3692 have been dropped. 3694 2. Packets reported as State 3 MUST NOT have been received by DCCP. 3695 Feature negotiations and options on such packets MUST NOT have 3696 been processed, and the Acknowledgement Number MUST NOT 3697 correspond to such a packet. 
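The run-length encoding described above can be illustrated with a short, non-normative Python sketch. The function and variable names are hypothetical and do not come from this specification, and the sketch assumes that sequence numbers do not wrap within the vector. It expands the bytes of an Ack Vector into per-packet states, starting at the Acknowledgement Number and moving toward older packets.

```python
# Non-normative sketch; all names here are illustrative assumptions.
STATE_NAMES = {
    0: "received",
    1: "received ECN marked",
    2: "reserved",
    3: "not yet received",
}


def decode_ack_vector(ack_number, vector):
    """Expand Ack Vector bytes into (sequence number, state) pairs.

    Assumes sequence numbers do not wrap within the vector.
    """
    packets = []
    seqno = ack_number
    for byte in vector:
        state = byte >> 6          # most significant two bits
        run_length = byte & 0x3F   # least significant six bits
        # Run Length n covers n + 1 consecutive packets.
        for _ in range(run_length + 1):
            packets.append((seqno, state))
            seqno -= 1
    return packets


decoded = decode_ack_vector(100, [0, 192, 3, 64, 5])
for seqno, state in decoded:
    print(seqno, STATE_NAMES[state])
```

Applied to the example vector 0,192,3,64,5 with Acknowledgement Number 100, the sketch reports packet 99 as State 3, packet 94 as State 1, and the remaining packets from 100 down to 88 as State 0.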
3699 Packets dropped in the application's receive buffer SHOULD be 3700 reported as Received or Received ECN Marked (States 0 and 1), 3701 depending on their ECN state; such packets' ECN Nonces MUST be 3702 included in the Nonce Echo. The Data Dropped option informs the 3703 sender that some packets reported as received actually had their 3704 application data dropped. 3706 One or more Ack Vector options that, together, report the status of 3707 more packets than have actually been sent SHOULD be considered 3708 invalid. The receiving DCCP SHOULD either ignore the options or 3709 reset the connection with Reset Code 5, "Option Error". Packets 3710 that haven't been included in any Ack Vector option SHOULD be 3711 treated as "not yet received" (State 3) by the sender. 3713 Appendix A provides a non-normative description of the details of 3714 DCCP acknowledgement handling, in the context of an abstract Ack 3715 Vector implementation. 3717 11.4.1. Ack Vector Consistency 3719 A DCCP sender will commonly receive multiple acknowledgements for 3720 some of its data packets. For instance, an HC-Sender might receive 3721 two DCCP-Acks with Ack Vectors, both of which contained information 3722 about sequence number 24. (Information about a sequence number is 3723 generally repeated in every ack until the HC-Sender acknowledges an 3724 ack. In this case, perhaps the HC-Receiver is sending acks faster 3725 than the HC-Sender is acknowledging them.) In a perfect world, the 3726 two Ack Vectors would always be consistent. However, there are many 3727 reasons why they might not be: 3729 o The HC-Receiver received packet 24 between sending its acks, so 3730 the first ack said 24 was not received (State 3) and the second 3731 said it was received or ECN marked (State 0 or 1). 3733 o The HC-Receiver received packet 24 between sending its acks, and 3734 the network reordered the acks. In this case, the packet will 3735 appear to transition from State 0 or 1 to State 3. 
3737 o The network duplicated packet 24, and one of the duplicates was 3738 ECN marked. This might show up as a transition between States 0 3739 and 1. 3741 To cope with these situations, HC-Sender DCCP implementations SHOULD 3742 combine multiple received Ack Vector states according to this table: 3744 Received State 3745 0 1 3 3746 +---+---+---+ 3747 0 | 0 |0/1| 0 | 3748 Old +---+---+---+ 3749 1 | 1 | 1 | 1 | 3750 State +---+---+---+ 3751 3 | 0 | 1 | 3 | 3752 +---+---+---+ 3754 To read the table, choose the row corresponding to the packet's old 3755 state and the column corresponding to the packet's state in the 3756 newly received Ack Vector, then read the packet's new state off the 3757 table. For an old state of 0 (received non-marked) and received 3758 state of 1 (received ECN marked), the packet's new state may be set 3759 to either 0 or 1. The HC-Sender implementation will be indifferent 3760 to ack reordering if it chooses new state 1 for that cell. 3762 The HC-Receiver should collect information about received packets, 3763 which it will eventually report to the HC-Sender on one or more 3764 acknowledgements, according to the following table: 3766 Received Packet 3767 0 1 3 3768 +---+---+---+ 3769 0 | 0 |0/1| 0 | 3770 Stored +---+---+---+ 3771 1 |0/1| 1 | 1 | 3772 State +---+---+---+ 3773 3 | 0 | 1 | 3 | 3774 +---+---+---+ 3776 This table equals the sender's table, except that when the stored 3777 state is 1 and the received state is 0, the receiver is allowed to 3778 switch its stored state to 0. 3780 A HC-Sender MAY choose to throw away old information gleaned from 3781 the HC-Receiver's Ack Vectors, in which case it MUST ignore newly 3782 received acknowledgements from the HC-Receiver for those old 3783 packets. It is often kinder to save recent Ack Vector information 3784 for a while, so that the HC-Sender can undo its reaction to presumed 3785 congestion when a "lost" packet unexpectedly shows up (the 3786 transition from State 3 to State 0). 
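As a non-normative illustration, the two tables above can be written as lookup tables in Python. The names are hypothetical. The sketch resolves the sender's "0/1" cell to 1, which is the choice the text says makes the sender indifferent to ack reordering, and it resolves both "0/1" cells in the receiver's table to 1 as well; that second choice is only an assumption, since the table permits either value.

```python
# Non-normative sketch; names and the resolution of "0/1" cells are
# illustrative assumptions. Keys are (old state, received state).
SENDER_TABLE = {
    (0, 0): 0, (0, 1): 1, (0, 3): 0,
    (1, 0): 1, (1, 1): 1, (1, 3): 1,
    (3, 0): 0, (3, 1): 1, (3, 3): 3,
}

# The receiver's table differs only in the (1, 0) cell, which may be
# 0 or 1; this sketch keeps 1 there, so the two tables coincide.
RECEIVER_TABLE = dict(SENDER_TABLE)


def combine_state(table, old_state, received_state):
    """Return the new state for a packet, or raise on reserved State 2."""
    try:
        return table[(old_state, received_state)]
    except KeyError:
        raise ValueError("unsupported state pair") from None


def fold_reports(reports, initial=3):
    """Apply a sequence of received states to one packet at the sender."""
    state = initial
    for received in reports:
        state = combine_state(SENDER_TABLE, state, received)
    return state


# Packet 24 reported as State 3 and then State 0, in either order.
in_order = fold_reports([3, 0])
reordered = fold_reports([0, 3])
print(in_order, reordered)
```

With this table, the two acknowledgements for packet 24 yield State 0 whether or not the network reorders them.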
3788 11.4.2. Ack Vector Coverage 3790 We can divide the packets that have been sent from an HC-Sender to 3791 an HC-Receiver into four roughly contiguous groups. From oldest to 3792 youngest, these are: 3794 1. Packets already acknowledged by the HC-Receiver, where the HC- 3795 Receiver knows that the HC-Sender has definitely received the 3796 acknowledgements. 3798 2. Packets already acknowledged by the HC-Receiver, where the HC- 3799 Receiver cannot be sure that the HC-Sender has received the 3800 acknowledgements. 3802 3. Packets not yet acknowledged by the HC-Receiver. 3804 4. Packets not yet received by the HC-Receiver. 3806 The union of groups 2 and 3 is called the Acknowledgement Window. 3807 Generally, every Ack Vector generated by the HC-Receiver will cover 3808 the whole Acknowledgement Window: Ack Vector acknowledgements are 3809 cumulative. (This simplifies Ack Vector maintenance at the HC- 3810 Receiver; see Appendix A.) As packets are received, this 3811 window both grows on the right and shrinks on the left. It grows 3812 because there are more packets, and shrinks because the data 3813 packets' Acknowledgement Numbers will acknowledge previous 3814 acknowledgements, moving packets from group 2 into group 1. 3816 11.5. Send Ack Vector Feature 3818 The Send Ack Vector feature lets DCCPs negotiate whether they should 3819 use Ack Vector options to report congestion. Ack Vector provides 3820 detailed loss information, and lets senders report back to their 3821 applications whether particular packets were dropped. Send Ack 3822 Vector is mandatory for some CCIDs, and optional for others. 3824 Send Ack Vector has feature number 8, and is server-priority. It 3825 takes one-byte Boolean values. DCCP A MUST send Ack Vector options 3826 on its acknowledgements when Send Ack Vector/A has value one, 3827 although it MAY send Ack Vector options even when Send Ack Vector/A 3828 is zero. Values of two or more are reserved.
New connections start 3829 with Send Ack Vector 0 for both endpoints. DCCP B sends a 3830 "Change R(Send Ack Vector, 1)" option to DCCP A to ask A to send Ack 3831 Vector options as part of its acknowledgement traffic. 3833 11.6. Slow Receiver Option 3835 An HC-Receiver sends the Slow Receiver option to its sender to 3836 indicate that it is having trouble keeping up with the sender's 3837 data. The HC-Sender SHOULD NOT increase its sending rate for 3838 approximately one round-trip time after seeing a packet with a Slow 3839 Receiver option. However, the Slow Receiver option does not 3840 indicate congestion, and the HC-Sender need not reduce its sending 3841 rate. (If necessary, the receiver can force the sender to slow down 3842 by dropping packets, with or without Data Dropped, or reporting 3843 false ECN marks.) APIs should let receiver applications set Slow 3844 Receiver, and sending applications determine whether or not their 3845 receivers are Slow. 3847 The Slow Receiver option takes just one byte: 3849 +--------+ 3850 |00000010| 3851 +--------+ 3852 Type=2 3854 Slow Receiver does not specify why the receiver is having trouble 3855 keeping up with the sender. Possible reasons include lack of buffer 3856 space, CPU overload, and application quotas. A sending application 3857 might react to Slow Receiver by reducing its sending rate or by 3858 switching to a lossier compression algorithm. 3860 The sending application should not react to Slow Receiver by sending 3861 more data, however. The optimal response to a CPU-bound receiver 3862 might be to increase the sending rate, by switching to a less- 3863 compressed sending format, since a highly-compressed data format 3864 might overwhelm a slow CPU more seriously than the higher memory 3865 requirements of a less-compressed data format. The Slow Receiver 3866 option is not appropriate for this case; a CPU-bound receiver should 3867 not ask for Slow Receiver options to be sent. 
3869 Slow Receiver implements a portion of TCP's receive window 3870 functionality. 3872 11.7. Data Dropped Option 3874 The Data Dropped option indicates that some packets reported as 3875 received actually had their data dropped before it reached the 3876 application. The sender's congestion control mechanism may respond 3877 to data-dropped packets less severely than to lost or marked 3878 packets. For instance, a windowed mechanism might subtract a 3879 constant value from its congestion window, rather than cut it in 3880 half. 3882 Data Dropped lets a sender differentiate between different kinds of 3883 loss (network and endpoint), but it does not allow total freedom in 3884 how to react. The congestion control response to a Data Dropped 3885 packet must be approved by the IETF. Each congestion control 3886 mechanism MUST react to a Data Dropped packet as if the packet were 3887 ECN marked, unless explicitly specified otherwise. 3889 If a received packet's application data is dropped for one of the 3890 reasons listed below, this SHOULD be reported using a Data Dropped 3891 option. Alternatively, the receiver MAY choose to report as 3892 "received" only those packets whose data were not dropped, subject 3893 to the constraint that packets not reported as received MUST NOT 3894 have had their options processed. 3896 The option's data looks like this: 3898 +--------+--------+--------+--------+--------+-------- 3899 |00101000| Length | Block | Block | Block | ... 3900 +--------+--------+--------+--------+--------+-------- 3901 Type=40 \___________ Vector ___________ ... 
3903 The vector itself consists of a series of bytes, called Blocks, each 3904 of whose encoding corresponds to one of these choices: 3906 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 3907 +-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+ 3908 |0| Run Length | or |1|DrpCd|Run Len| 3909 +-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+ 3910 Normal Block Drop Block 3912 The first byte in the first Data Dropped option refers to the packet 3913 indicated in the Acknowledgement Number; subsequent bytes refer to 3914 older packets. (Data Dropped MUST NOT be sent on DCCP-Data or DCCP- 3915 Request packets, which lack an Acknowledgement Number.) Normal 3916 Blocks, which have high bit 0, indicate that any received packets in 3917 the Run Length had their data delivered to the application. Drop 3918 Blocks, which have high bit 1, indicate that received packets in the 3919 Run Len[gth] were not delivered as usual. The 3-bit Drop Code 3920 [DrpCd] field says what happened; generally, no data from that 3921 packet reached the application. Packets reported as "not yet 3922 received" MUST be included in Normal Blocks; packets not covered by 3923 any Data Dropped option are treated as if they were in a Normal 3924 Block. Defined Drop Codes for Drop Blocks are: 3926 0 Packet data dropped due to protocol constraints. For 3927 example, the data was included on a DCCP-Request packet, and 3928 the receiving application does not allow that piggybacking; 3929 or the data was sent during an important feature 3930 negotiation. 3932 1 Packet data dropped because the application is no longer 3933 listening. 3935 2 Packet data dropped in the receive buffer. 3937 3 Packet data dropped due to corruption. 3939 4-6 Reserved. 3941 7 Packet data corrupted, but delivered to the application 3942 anyway. 
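The Block encoding above can be illustrated with a non-normative Python sketch. The names are hypothetical, and the sketch assumes that sequence numbers do not wrap within the option. Each byte is classified by its high bit as a Normal Block (7-bit Run Length) or a Drop Block (3-bit Drop Code, 4-bit Run Length), and the runs are expanded starting at the Acknowledgement Number.

```python
# Non-normative sketch; all names here are illustrative assumptions.
def decode_data_dropped(ack_number, blocks):
    """Expand Data Dropped Blocks into (sequence number, drop code) pairs.

    The drop code is None for packets covered by a Normal Block.
    Assumes sequence numbers do not wrap within the option.
    """
    packets = []
    seqno = ack_number
    for byte in blocks:
        if byte & 0x80:
            # Drop Block: 1 | DrpCd (3 bits) | Run Len (4 bits)
            drop_code = (byte >> 4) & 0x07
            run_length = byte & 0x0F
        else:
            # Normal Block: 0 | Run Length (7 bits)
            drop_code = None
            run_length = byte & 0x7F
        # Run Length n covers n + 1 consecutive packets.
        for _ in range(run_length + 1):
            packets.append((seqno, drop_code))
            seqno -= 1
    return packets


# Acknowledgement Number 50; a Normal Block with Run Length 1, then a
# Drop Block with Drop Code 3 and Run Length 0 (0xB0 = 1 011 0000).
sample = decode_data_dropped(50, [1, 0xB0])
print(sample)
```

In this sample, packets 50 and 49 fall in a Normal Block and packet 48 is reported with Drop Code 3, "dropped due to corruption".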
3944 For example, if a Data Dropped option contains the decimal values 3945 0,160,3,162, the Acknowledgement Number is 100, and an Ack Vector 3946 reported all packets as received, then: 3948 Packet 100 was received (Acknowledgement Number 100, Normal 3949 Block, Run Length 0). 3951 Packet 99 was dropped in the receive buffer (Drop Block, Drop 3952 Code 2, Run Length 0). 3954 Packets 98, 97, 96, and 95 were received (Normal Block, Run 3955 Length 3). 3957 Packets 94, 93, and 92 were dropped in the receive buffer (Drop 3958 Block, Drop Code 2, Run Length 2). 3960 Run lengths of more than 128 (for Normal Blocks) or 16 (for Drop 3961 Blocks) must be encoded in multiple Blocks. A single Data Dropped 3962 option can acknowledge up to 32384 Normal Block data packets, 3963 although the receiver SHOULD NOT send a Data Dropped option when all 3964 relevant packets fit into Normal Blocks. Should more packets need 3965 to be acknowledged than can fit in 253 bytes of Data Dropped, then 3966 multiple Data Dropped options can be sent. The second option will 3967 begin where the first left off, and so forth. 3969 One or more Data Dropped options that, together, report the status 3970 of more packets than have been sent, or that change the status of a 3971 packet, or that disagree with Ack Vector or equivalent options (by 3972 reporting a "not yet received" packet as "dropped in the receive 3973 buffer", for example), SHOULD be considered invalid. The receiving 3974 DCCP SHOULD respond to invalid Data Dropped options by ignoring 3975 them, or by resetting the connection with Reset Code 5, "Option 3976 Error". 3978 A DCCP application interface should let receiving applications 3979 specify the Drop Codes corresponding to received packets. For 3980 example, this would let applications calculate their own checksums, 3981 but still report "dropped due to corruption" packets via the Data 3982 Dropped option.
The interface should not let applications reduce 3983 the "seriousness" of a packet's Drop Code; for example, the 3984 application should not be able to upgrade a packet from delivered 3985 corrupt (Drop Code 7) to delivered normally (no Drop Code). 3987 11.7.1. Data Dropped and Normal Congestion Response 3989 When deciding on a response to a particular acknowledgement or set 3990 of acknowledgements containing Data Dropped packets, a congestion 3991 control mechanism MUST consider dropped packets and ECN marks 3992 (including ECN-marked packets that are included in Data Dropped), as 3993 well as the Data Dropped packets. For window-based mechanisms, the 3994 valid response space is defined as follows. 3996 Assume an old window of W. Independently calculate a new window 3997 W_new1 that assumes no packets were Data Dropped (so W_new1 contains 3998 only the normal congestion response), and a new window W_new2 that 3999 assumes no packets were lost or marked (so W_new2 contains only the 4000 Data Dropped response). We are assuming that Data Dropped 4001 recommended a reduction in congestion window, so W_new2 < W. 4003 Then the actual new window W_new MUST NOT be larger than the minimum 4004 of W_new1 and W_new2; and the sender MAY combine the two responses, 4005 by setting 4006 W_new = W + min(W_new1 - W, 0) + min(W_new2 - W, 0). 4008 Non-window-based congestion control mechanisms MUST behave 4009 analogously. 4011 11.7.2. Particular Drop Codes 4013 Drop Code 0 ("protocol constraints") does not indicate any kind of 4014 congestion, so the sender's CCID SHOULD react to non-marked packets 4015 with Drop Code 0 as if they were received. However, the sending 4016 DCCP SHOULD NOT send more data until it believes the relevant 4017 protocol constraint has passed. 4019 Drop Code 1 ("application no longer listening") means the 4020 application running at the endpoint that sent the option is no 4021 longer listening for data. 
For example, a server might close its 4022 receiving half-connection to new data after receiving a complete 4023 request from the client. This would limit the amount of state the 4024 server would expend on incoming data, and thus reduce the potential 4025 damage from certain denial-of-service attacks. A Data Dropped 4026 option containing Drop Code 1 SHOULD be sent whenever received data 4027 is ignored due to a non-listening application. Once a DCCP reports 4028 Drop Code 1 for a packet, it SHOULD report Drop Code 1 for every 4029 succeeding data packet on that half-connection; once a DCCP receives 4030 a Drop Code 1 report, it SHOULD expect that no more data will ever 4031 be delivered to the other endpoint's application, so it SHOULD NOT 4032 send more data. A DCCP receiving Drop Code 1 MAY report this event 4033 to the application. (Previous versions of this specification used a 4034 "Buffer Closed" option instead of Drop Code 1.) 4036 Drop Code 2 ("receive buffer drop") indicates congestion inside the 4037 receiving host. Every packet newly acknowledged as Drop Code 2 4038 SHOULD reduce the sender's instantaneous rate by one packet per 4039 round trip time, using whatever mechanism is appropriate for the 4040 relevant CCID. Further details may be available in CCID documents. 4042 12. Explicit Congestion Notification 4044 The DCCP protocol is fully ECN-aware [RFC 3168]. Each CCID specifies 4045 how its endpoints respond to ECN marks. Furthermore, DCCP, unlike 4046 TCP, allows senders to control the rate at which acknowledgements 4047 are generated (with options like Ack Ratio); this means that 4048 acknowledgements are generally congestion-controlled, and may have 4049 ECN-Capable Transport set. 4051 A CCID profile describes how that CCID interacts with ECN, both for 4052 data traffic and pure-acknowledgement traffic.
A sender SHOULD set 4053 ECN-Capable Transport on its packets whenever the receiver has its 4054 ECN Capable feature turned on and the relevant CCID allows it, 4055 unless the sending application indicates that ECN should not be 4056 used. 4058 The rest of this section describes the ECN Capable feature and the 4059 interaction of the ECN Nonce with acknowledgement options such as 4060 Ack Vector. 4062 12.1. ECN Capable Feature 4064 The ECN Capable feature lets a DCCP inform its partner that it 4065 cannot read ECN bits from received IP headers, so the partner must 4066 not set ECN-Capable Transport on its packets. 4068 ECN Capable has feature number 2, and is server-priority. It takes 4069 one-byte Boolean values. DCCP A MUST be able to read ECN bits from 4070 received frames' IP headers when ECN Capable/A is one. (This is 4071 independent of whether it can set ECN bits on sent frames.) DCCP A 4072 thus sends a "Change L(ECN Capable, 0)" option to DCCP B to inform 4073 it that A cannot read ECN bits. New connections start with ECN 4074 Capable 1 (that is, ECN capable) for both endpoints. Values of two 4075 or more are reserved. 4077 If a DCCP is not ECN capable, it MUST send Mandatory "Change L(ECN 4078 Capable, 0)" options to the other endpoint until acknowledged (by 4079 "Confirm R(ECN Capable, 0)") or the connection closes. Furthermore, 4080 it MUST NOT accept any data until the other endpoint sends 4081 "Confirm R(ECN Capable, 0)". It SHOULD send Data Dropped options on 4082 its acknowledgements, with Drop Code 0 ("protocol constraints"), if 4083 the other endpoint does send data inappropriately. 4085 12.2. ECN Nonces 4087 Congestion avoidance will not occur, and the receiver will sometimes 4088 get its data faster, if the sender isn't told about congestion 4089 events. Thus, the receiver has some incentive to falsify 4090 acknowledgement information, reporting that marked or dropped 4091 packets were actually received unmarked. 
This problem is more 4092 serious with DCCP than with TCP, since TCP provides reliable 4093 transport: it is more difficult with TCP to lie about lost packets 4094 without breaking the application. 4096 ECN Nonces are a general mechanism to prevent ECN cheating (or loss 4097 cheating). Two values for the two-bit ECN header field indicate 4098 ECN-Capable Transport, 01 and 10. The second code point, 10, is the 4099 ECN Nonce. In general, a protocol sender chooses between these code 4100 points randomly on its output packets, remembering the sequence it 4101 chose. The protocol receiver reports, on every acknowledgement, the 4102 number of ECN Nonces it has received thus far. This is called the 4103 ECN Nonce Echo. Since ECN marking and packet dropping both destroy 4104 the ECN Nonce, a receiver that lies about an ECN mark or packet drop 4105 has a 50% chance of guessing right and avoiding discipline. The 4106 sender may react punitively to an ECN Nonce mismatch, possibly up to 4107 dropping the connection. The ECN Nonce Echo field need not be an 4108 integer; one bit is enough to catch 50% of infractions. 4110 In DCCP, the ECN Nonce Echo field is encoded in acknowledgement 4111 options. For example, the Ack Vector option comes in two forms, Ack 4112 Vector [Nonce 0] (option 38) and Ack Vector [Nonce 1] (option 39), 4113 corresponding to the two values for a one-bit ECN Nonce Echo. The 4114 Nonce Echo for a given Ack Vector equals the one-bit sum (exclusive- 4115 or, or parity) of ECN nonces for packets reported by that Ack Vector 4116 as received and not ECN marked. Thus, only packets marked as State 4117 0 matter for this calculation (that is, valid received packets that 4118 were not ECN marked). Every Ack Vector option is detailed enough 4119 for the sender to determine what the Nonce Echo should have been. 4120 It can check this calculation against the actual Nonce Echo, and 4121 complain if there is a mismatch. 
(The Ack Vector could conceivably 4122 report every packet's ECN Nonce state, but this would severely limit 4123 Ack Vector's compressibility without providing much extra 4124 protection.) 4126 Given an A-to-B half-connection, DCCP A SHOULD set ECN Nonces on its 4127 packets, and remember which packets had nonces, whenever DCCP B 4128 reports that it is ECN Capable. An ECN-capable endpoint MUST 4129 calculate and use the correct value for ECN Nonce Echo when sending 4130 acknowledgement options. An ECN-incapable endpoint, however, SHOULD 4131 treat the ECN Nonce Echo as always zero. When a sender detects an 4132 ECN Nonce Echo mismatch, it SHOULD behave as if the receiver had 4133 reported one or more packets as ECN-marked (instead of unmarked). 4134 It MAY take more punitive action, such as resetting the connection 4135 with Reset Code 12, "Aggression Penalty". 4137 An ECN-incapable DCCP SHOULD ignore received ECN nonces and generate 4138 ECN nonces of zero. For instance, out of the two Ack Vector 4139 options, an ECN-incapable DCCP SHOULD generate Ack Vector [Nonce 0] 4140 (option 38) exclusively. (Again, the ECN Capable feature MUST be 4141 set to zero in this case.) 4143 12.3. Other Aggression Penalties 4145 The ECN Nonce provides one way for a DCCP sender to discover that a 4146 receiver is misbehaving. There may be other mechanisms, and a 4147 receiver or middlebox may also discover that a sender is 4148 misbehaving---sending more data than it should. In any of these 4149 cases, the entity that discovers the misbehavior MAY react by 4150 resetting the connection with Reset Code 12, "Aggression Penalty". 4151 A receiver that detects marginal (meaning possibly spurious) sender 4152 misbehavior MAY instead react with a Slow Receiver option, or by 4153 reporting some packets as ECN marked that were not, in fact, marked. 4155 13. 
Timing Options 4157 The Timestamp, Timestamp Echo, and Elapsed Time options help DCCP 4158 endpoints explicitly measure round-trip times. 4160 13.1. Timestamp Option 4162 This option is permitted in any DCCP packet. The length of the 4163 option is 6 bytes. 4165 +--------+--------+--------+--------+--------+--------+ 4166 |00101001|00000110| Timestamp Value | 4167 +--------+--------+--------+--------+--------+--------+ 4168 Type=41 Length=6 4170 The four bytes of option data carry the timestamp of this packet in 4171 some undetermined form. A DCCP receiving a Timestamp option SHOULD 4172 respond with a Timestamp Echo option on the next packet it sends. 4174 13.2. Elapsed Time Option 4176 This option is permitted in any DCCP packet that contains an 4177 Acknowledgement Number. It indicates how much time, in tenths of 4178 milliseconds, has elapsed since the packet being acknowledged---the 4179 packet with the given Acknowledgement Number---was received. The 4180 option may take 4 or 6 bytes, depending on the size of the Elapsed 4181 Time value. Elapsed Time helps correct round-trip time estimates 4182 when the gap between receiving a packet and acknowledging that 4183 packet may be long---in CCID 3, for example, where acknowledgements 4184 are sent infrequently. 4186 +--------+--------+--------+--------+ 4187 |00101011|00000100| Elapsed Time | 4188 +--------+--------+--------+--------+ 4189 Type=43 Len=4 4191 +--------+--------+--------+--------+--------+--------+ 4192 |00101011|00000110| Elapsed Time | 4193 +--------+--------+--------+--------+--------+--------+ 4194 Type=43 Len=6 4196 The option data, Elapsed Time, represents an estimated upper bound 4197 on the amount of time elapsed since the packet being acknowledged 4198 was received, with units of tenths of milliseconds. If Elapsed Time 4199 is less than a second, the first, smaller form of the option SHOULD 4200 be used. 
Elapsed Times of more than 6.5535 seconds MUST be sent 4201 using the second form of the option. DCCP endpoints MUST NOT report 4202 Elapsed Times that are significantly larger than the true elapsed 4203 times. A connection MAY be reset with Reset Code 12, "Aggression 4204 Penalty", if one endpoint determines that the other is reporting a 4205 much-too-large Elapsed Time. 4207 Elapsed Time is measured in tenths of milliseconds as a compromise 4208 between two conflicting goals. First, it provides enough 4209 granularity to reduce rounding error when measuring elapsed time 4210 over fast LANs; second, it allows most reasonable elapsed times to 4211 fit into two bytes of data. 4213 13.3. Timestamp Echo Option 4215 This option is permitted in any DCCP packet, as long as at least one 4216 packet carrying the Timestamp option has been received. Generally, 4217 a DCCP endpoint should send one Timestamp Echo option for each 4218 Timestamp option it receives; and it should send that option as soon 4219 as is convenient. The length of the option is between 6 and 10 4220 bytes, depending on whether Elapsed Time is included and how large 4221 it is. 4223 +--------+--------+--------+--------+--------+--------+ 4224 |00101010|00000110| Timestamp Echo | 4225 +--------+--------+--------+--------+--------+--------+ 4226 Type=42 Len=6 4228 +--------+--------+------- ... -------+--------+--------+ 4229 |00101010|00001000| Timestamp Echo | Elapsed Time | 4230 +--------+--------+------- ... -------+--------+--------+ 4231 Type=42 Len=8 (4 bytes) 4233 +--------+--------+------- ... -------+------- ... -------+ 4234 |00101010|00001010| Timestamp Echo | Elapsed Time | 4235 +--------+--------+------- ... -------+------- ... -------+ 4236 Type=42 Len=10 (4 bytes) (4 bytes) 4238 The first four bytes of option data, Timestamp Echo, carry a 4239 Timestamp Value taken from a preceding received Timestamp option. 
4240 Usually, this will be the last packet that was received---the packet 4241 indicated by the Acknowledgement Number, if any---but it might be a 4242 preceding packet. 4244 The Elapsed Time value, similar to that in the Elapsed Time option, 4245 indicates the amount of time elapsed since receiving the packet 4246 whose timestamp is being echoed. This time MUST be in tenths of 4247 milliseconds. Elapsed Time is meant to help the Timestamp sender 4248 separate the network round-trip time from the Timestamp receiver's 4249 processing time. This may be particularly important for CCIDs where 4250 acknowledgements are sent infrequently, so that there might be 4251 considerable delay between receiving a Timestamp option and sending 4252 the corresponding Timestamp Echo. A missing Elapsed Time field is 4253 equivalent to an Elapsed Time of zero. The smallest version of the 4254 option SHOULD be used that can hold the relevant Elapsed Time value. 4256 14. Multihoming and Mobility 4258 DCCP provides primitive support for multihoming and mobility via a 4259 mechanism for transferring a connection endpoint from one address to 4260 another. The moving endpoint must negotiate mobility support 4261 beforehand. When the moving endpoint gets a new address, it sends a 4262 DCCP-Move packet from that address to the stationary endpoint. The 4263 stationary endpoint then changes its connection state to use the new 4264 address. 4266 DCCP's support for mobility is intended to solve only the simplest 4267 multihoming and mobility problems; for instance, there's no support 4268 for simultaneous moves. Applications requiring more complex 4269 mobility semantics, or more stringent security guarantees, should 4270 use an existing solution like Mobile IP or [SB00]. DCCP mobility may 4271 not be useful in the context of IPv6, with its mandatory support for 4272 Mobile IP. 4274 14.1. 
Mobility Capable Feature 4276 A DCCP uses the Mobility Capable feature to inform its partner that 4277 it would like to be able to change its address and/or port during 4278 the course of the connection. DCCP B sends a "Change R(Mobility 4279 Capable, 1)" option to DCCP A to inform it that B might like to move 4280 later. 4282 Mobility Capable has feature number 5, and is server-priority. It 4283 takes one-byte Boolean values. DCCP A agrees in principle to accept 4284 DCCP-Move packets from DCCP B when Mobility Capable/A is one. 4285 DCCP A MUST reject any DCCP-Move packet for a connection whose 4286 Mobility Capable/A feature is zero, although it MAY reject a valid 4287 DCCP-Move packet even when Mobility Capable/A is one. Values of two 4288 or more are reserved. New connections start with Mobility Capable 0 4289 (that is, mobility is not allowed) for both endpoints. 4291 14.2. Mobility ID Feature 4293 A DCCP uses the Mobility ID feature to inform its partner of a 4294 128-bit number that will act as identification, should the partner 4295 change its address and/or port during the course of the connection. 4296 DCCP A sends a "Change L(Mobility ID, N)" option to notify DCCP B of 4297 the ID it has chosen for B's use. 4299 Mobility ID has feature number 6, and is non-negotiable. Its values 4300 are sixteen-byte integers. The Mobility ID/A feature equals the 4301 identifier that DCCP B should use on DCCP-Move packets sent to A. 4302 DCCP A chooses Mobility ID/A to uniquely identify the connection 4303 among all connections that terminate at A. For security, DCCP A 4304 MUST choose Mobility ID/A randomly. Furthermore, it MUST reassign 4305 Mobility ID/A after each successful move by DCCP B, and it MAY 4306 reassign Mobility ID/A more frequently. New connections start with 4307 Mobility ID 0 for both endpoints. 
However, Mobility IDs of zero 4308 MUST NOT be accepted on DCCP-Move packets; an endpoint cannot 4309 successfully move until the relevant Mobility ID has been set to a 4310 nonzero value. 4312 14.3. Mobile Host Processing 4314 When DCCP A changes its address and/or port, it MUST signal this by 4315 sending DCCP B a DCCP-Move packet. The Mobility ID in the DCCP-Move 4316 packet uniquely identifies the connection; DCCP B will read the new 4317 address and port off the DCCP-Move's network and DCCP headers. 4318 Eventually, DCCP A will receive a DCCP-Sync sent to its new address 4319 that negotiates a new Mobility ID/B feature. This confirms the 4320 move. DCCP A SHOULD retransmit the DCCP-Move packet until it 4321 receives a DCCP-Sync confirmation. The retransmission strategy 4322 SHOULD be similar to that for retransmitting DCCP-Requests (Section 4323 8.1.1); for instance, a first timeout on the order of a second, with 4324 an exponential backoff timer. 4326 DCCP A MUST reset its congestion control state after sending a DCCP- 4327 Move, since nothing is known about conditions on the new path. 4328 Essentially, DCCP A must "slow start" up to its new fair rate, as 4329 appropriate for its congestion control mechanism. Section 14.5 4330 discusses this further. 4332 DCCP A SHOULD NOT send non-DCCP-Move packets to DCCP B until the 4333 move is confirmed. If it did so, and the DCCP-Move packet was lost 4334 or reordered, then DCCP B would react by sending DCCP-Resets with 4335 Reset Code 3, "No Connection". DCCP A might implement special 4336 handling for such resets to avoid any post-move quiet period, but 4337 this is NOT RECOMMENDED. 4339 DCCP B MAY refuse to accept a move, perhaps because of address 4340 policy. In this case, DCCP A will receive a DCCP-Reset with Reset 4341 Code 13, "Move Refused", rather than a confirming DCCP-Sync. 
DCCP A 4342 MAY react by tearing down the connection, or by trying another DCCP- 4343 Move---for instance, back to the old address, if possible. 4345 DCCP endpoints SHOULD NOT use an old address-port pair after sending 4346 a DCCP-Move. If it becomes necessary to switch back to the old 4347 address-port pair, the endpoint MUST do so explicitly using another 4348 DCCP-Move. 4350 DCCP-Move packets SHOULD NOT be sent until the connection is 4351 established; it is illegal to send a DCCP-Move in REQUEST or RESPOND 4352 state. If an endpoint moves during connection establishment, it 4353 SHOULD abandon the old connection and initiate a new one. No 4354 connection exists to move until the three-way handshake has 4355 completed. 4357 14.4. Stationary Host Processing 4359 The stationary endpoint, DCCP B, uses DCCP-Move packets' destination 4360 address, destination port, and Mobility ID fields to look up the 4361 relevant connection. This differs from all other packet types, 4362 which use the source address/source port/destination 4363 address/destination port 4-tuple. 4365 DCCP B MUST ignore DCCP-Moves whose Mobility ID is zero, or whose 4366 Mobility ID does not correspond to any active connection. It also 4367 MUST ignore DCCP-Moves sent to sockets in CLOSED, LISTEN, REQUEST, 4368 RESPOND, or TIMEWAIT state, and it MUST ignore DCCP-Moves with 4369 invalid Sequence or Acknowledgement Numbers (see Section 7.5). 4370 DCCP B MUST NOT respond to invalid DCCP-Moves with DCCP-Reset or 4371 DCCP-Sync packets, since any active response would leak information 4372 about the connection to a possibly malicious host. After receiving 4373 an invalid DCCP-Move, DCCP B MAY ignore subsequent DCCP-Move 4374 packets, valid or not, for a short period of time, such as one 4375 second or one round-trip time. This protects DCCP B against denial- 4376 of-service attacks from floods of invalid DCCP-Moves. 
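The validity checks above can be sketched as follows. This is an illustration only: the Connection class, the use of strings for state names, the table keyed by destination address, destination port, and Mobility ID, and the seq_valid flag (standing in for the Section 7.5 checks) are all assumptions of this example. The optional quiet period after an invalid DCCP-Move is not modelled.

```python
from dataclasses import dataclass

# States in which a DCCP-Move must be ignored, per the text above.
IGNORE_STATES = {"CLOSED", "LISTEN", "REQUEST", "RESPOND", "TIMEWAIT"}


@dataclass
class Connection:
    """Hypothetical connection record; only the state matters here."""
    state: str


def lookup_move(table, dst_addr, dst_port, mobility_id, seq_valid):
    """Return the connection a DCCP-Move applies to, or None to ignore it.

    `table` maps (dst_addr, dst_port, mobility_id) to a Connection.
    `seq_valid` stands in for the Section 7.5 Sequence and Acknowledgement
    Number checks, which are not implemented in this sketch.
    Returning None means "drop silently": no DCCP-Reset or DCCP-Sync is sent.
    """
    if mobility_id == 0:
        return None  # zero Mobility IDs are never accepted
    conn = table.get((dst_addr, dst_port, mobility_id))
    if conn is None or conn.state in IGNORE_STATES:
        return None
    if not seq_valid:
        return None
    return conn
```

Note that the lookup key does not include the source address or port, which is what distinguishes DCCP-Move from every other packet type.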
4378 On receiving a valid DCCP-Move, DCCP B decides whether to accept or 4379 refuse the move request. To accept the request, it performs several 4380 actions: 4382 o It changes the connection to use the new address and port. 4384 o It sets a timer to remove the old address and port after 2MSL. 4385 This delay allows the receipt of any delayed packets from the old 4386 address and port, and essentially represents TIMEWAIT state for 4387 the old connection. 4389 o It chooses a new Mobility ID for the connection, which temporarily 4390 coexists with the old Mobility ID. 4392 o It generates and sends a confirmation DCCP-Sync packet, which 4393 includes a "Change L(Mobility ID)" option for the new Mobility ID. 4395 If the DCCP-Sync is lost, then DCCP A will send another DCCP-Move 4396 packet with the old Mobility ID. DCCP B MUST send another DCCP-Sync 4397 packet in this situation, but SHOULD NOT choose yet another new 4398 Mobility ID. 4400 The move's three-way handshake completes once DCCP B receives a 4401 DCCP-SyncAck from DCCP A that confirms the new Mobility ID option. 4402 At that point, DCCP B MUST remove the old Mobility ID. 4404 DCCP B MAY refuse a valid DCCP-Move request for any reason; for 4405 instance, the new address space might be considered unsuitable. To 4406 refuse a valid DCCP-Move, DCCP B sends a DCCP-Reset packet to the 4407 new address and port pair with Reset Code 13, "Move Refused". It 4408 need take no other action; for example, it MAY tear down the 4409 connection, or not. If DCCP B plans to refuse every DCCP-Move 4410 request, it MUST negotiate a zero value for the Mobility Capable/A 4411 feature. 4413 DCCP B MUST ignore any data following the header in a DCCP-Move 4414 packet. 4416 14.5. 
Congestion Control State 4418 Once an endpoint has transitioned to a new address, the connection 4419 is effectively a new connection in terms of its congestion control 4420 state: the accumulated information about congestion between the old 4421 endpoints no longer applies. Both DCCPs MUST initialize their 4422 congestion control state (windows, rates, and so forth) to that of a 4423 new connection. That is, they must "slow start". 4425 Similarly, the endpoints' PMTUs SHOULD be reinitialized, and PMTU 4426 discovery performed again, following an address change. See Section 4427 15. 4429 During the transition period between addresses, the endpoints might 4430 receive congestion feedback from both before the move and after the 4431 move. Congestion and loss events on packets sent before the move 4432 SHOULD NOT affect the new connection's congestion control state. 4434 14.6. Security 4436 The DCCP mobility mechanism, like DCCP in general, does not provide 4437 cryptographic security guarantees. Nevertheless, mobile hosts must 4438 use valid Mobility IDs, providing protection against some classes of 4439 attackers: An attacker cannot move a DCCP connection to a new 4440 address unless it knows a valid Mobility ID. This generally means 4441 that an attacker must have snooped on every packet in the connection 4442 to get a reasonable probability of success, assuming that the 4443 Mobility ID was chosen well (that is, randomly). 4445 An attacker could choose a server running many mobility-capable 4446 connections, and simply guess random Mobility IDs until one hit. 4447 Let N equal the number of mobility-capable connections at the 4448 server, X equal the number of attack attempts, and D equal the 4449 number of possible Mobility IDs, namely 2^128. Then the probability 4450 of at least one attack succeeding is 4451 P = 1 - ((D - N) choose X) / (D choose X) = 1 - ((D-N)! (D-X)!) / (D! (D-N-X)!).
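The expression for P can be evaluated numerically. The Python sketch below is one way to do so, assuming the attacker's X guesses are distinct; the function name and parameter names are hypothetical. It rewrites the ratio of binomial coefficients as a product of N factors and accumulates it in log space, since a direct floating-point evaluation of 1 minus the ratio would round to zero.

```python
import math


def move_guess_success(n_conns, attempts, id_space=2**128):
    """Probability that at least one of `attempts` distinct guesses hits
    one of `n_conns` valid Mobility IDs drawn from `id_space` values.

    Uses the product form of ((D-N) choose X) / (D choose X),
        prod over i in [0, N) of (D - X - i) / (D - i),
    summed as logarithms so that tiny probabilities survive rounding.
    Intended for `attempts` far below `id_space`.
    """
    if n_conns + attempts > id_space:
        return 1.0  # more guesses than invalid IDs: a hit is certain
    log_miss = 0.0
    for i in range(n_conns):
        # log(1 - X / (D - i)), computed accurately for very small ratios
        log_miss += math.log1p(-attempts / (id_space - i))
    return -math.expm1(log_miss)
```

With the default 128-bit ID space, move_guess_success(10**6, 10**9) evaluates to roughly 2.9e-24.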
4455 For N = 10^6 and X = 10^9, the attack success probability is less 4456 than 10^-23. 4458 Section 19 further describes DCCP security considerations. 4460 15. Maximum Packet Size 4462 A DCCP implementation MUST maintain the maximum packet size (MPS) 4463 allowed for each active DCCP session. The MPS is influenced by the 4464 maximum packet size allowed by the current congestion control 4465 mechanism (CCMPS), the maximum packet size supported by the path's 4466 links (PMTU, the Path Maximum Transmission Unit) [RFC 1191], and the 4467 lengths of the IP and DCCP headers. 4469 A DCCP application interface should let the application discover 4470 DCCP's current MPS. DCCP applications should use the API to 4471 discover the MPS. Generally, the DCCP implementation will refuse to 4472 send any packet bigger than the MPS, returning an appropriate error 4473 to the application. 4475 A DCCP interface may allow applications to request that packets 4476 larger than PMTU be fragmented on IPv4 networks. This only matters 4477 when CCMPS > PMTU; packets larger than CCMPS MUST be rejected 4478 regardless. Fragmentation should not be the default. The rest of 4479 this section assumes the application has not requested 4480 fragmentation. 4482 The MPS reported to the application SHOULD be influenced by the size 4483 expected to be required for DCCP headers and options. If the 4484 application provides data that, when combined with the options the 4485 DCCP implementation would like to include, would exceed the MPS, the 4486 implementation should either send the options on a separate packet 4487 (such as a DCCP-Ack) or lower the MPS, drop the data, and return an 4488 appropriate error to the application. 4490 The PMTU SHOULD be initialized from the interface MTU that will be 4491 used to send packets. The MPS will be initialized with the minimum 4492 of the PMTU and the CCMPS, if any. 4494 To perform classical PMTU discovery, the DCCP sender sets the IP 4495 Don't Fragment (DF) bit.
However, it is undesirable for MTU 4496 discovery to occur on the initial connection setup handshake, as the 4497 connection setup process may not be representative of packet sizes 4498 used during the connection, and performing MTU discovery on the 4499 initial handshake might unnecessarily delay connection 4500 establishment. Thus, DF SHOULD NOT be set on DCCP-Request and DCCP- 4501 Response packets. In addition DF SHOULD NOT be set on DCCP-Reset 4502 packets, although typically these would be small enough to not be a 4503 problem. On all other DCCP packets, DF SHOULD be set. 4505 As specified in [RFC 1191], when a router receives a packet with DF 4506 set that is larger than the next link's MTU, it sends an ICMP 4507 Destination Unreachable message to the source of the datagram with 4508 the Code indicating "fragmentation needed and DF set" (also known as 4509 a "Datagram Too Big" message). When a DCCP implementation receives 4510 a Datagram Too Big message, it decreases its PMTU to the Next-Hop 4511 MTU value given in the ICMP message. If the MTU given in the 4512 message is zero, the sender chooses a value for PMTU using the 4513 algorithm described in Section 7 of [RFC 1191]. If the MTU given in 4514 the message is greater than the current PMTU, the Datagram Too Big 4515 message is ignored, as described in [RFC 1191]. (We are aware that 4516 this may cause problems for DCCP endpoints behind certain 4517 firewalls.) 4519 If the DCCP implementation has decreased the PMTU, and the sending 4520 application attempts to send a packet larger than the new MPS, the 4521 API must refuse to send the packet and return an appropriate error 4522 to the application. The application should then use the API to 4523 query the new value of MPS. The kernel might have some packets 4524 buffered for transmission that are smaller than the old MPS, but 4525 larger than the new MPS. 
It MAY send these packets with the DF bit 4526 cleared, or it MAY discard these packets; it MUST NOT transmit these 4527 datagrams with the DF bit set. 4529 A DCCP implementation may allow the application to occasionally 4530 request that PMTU discovery be performed again. This will reset the 4531 PMTU to the outgoing interface's MTU. Such requests SHOULD be rate 4532 limited, to one per two seconds, for example. A successful DCCP- 4533 Move will also reset the PMTU. 4535 A DCCP sender MAY treat the reception of an ICMP Datagram Too Big 4536 message as an indication that the packet being reported was not lost 4537 due to congestion, and so for the purposes of congestion control it MAY 4538 ignore the DCCP receiver's indication that this packet did not 4539 arrive. However, if this is done, then the DCCP sender MUST check 4540 the ECN bits of the IP header echoed in the ICMP message, and only 4541 perform this optimization if these ECN bits indicate that the packet 4542 did not experience congestion prior to reaching the router whose 4543 link MTU it exceeded. 4545 A DCCP implementation SHOULD ensure, as far as possible, that ICMP 4546 Datagram Too Big messages were actually generated by routers, so 4547 that attackers cannot drive the PMTU down to a falsely small value. 4548 The simplest way to do this is to verify that the Sequence Number on 4549 the ICMP error's encapsulated header corresponds to a Sequence 4550 Number that the implementation recently sent. (Routers are not 4551 required to return more than 64 bits of the DCCP header [RFC 792], 4552 but most modern routers will return far more, including the Sequence 4553 Number.) ICMP Datagram Too Big messages with incorrect or missing 4554 Sequence Numbers may be ignored, or the DCCP implementation may 4555 lower the PMTU only temporarily in response.
If more than three odd 4556 Datagram Too Big messages are received and the other DCCP endpoint 4557 reports commensurate loss, however, the DCCP implementation SHOULD 4558 assume the presence of a confused router, and either obey the ICMP 4559 messages' PMTU or (on IPv4 networks) switch to allowing 4560 fragmentation. 4562 DCCP also allows upward probing of the PMTU [PMTUD], where the DCCP 4563 endpoint begins by sending small packets with DF set, then gradually 4564 increases the packet size until a packet is lost. This mechanism 4565 does not require any ICMP error processing. DCCP-Sync packets are 4566 the best choice for upward probing, since DCCP-Sync probes do not 4567 risk application data loss. The DCCP implementation inserts 4568 arbitrary data into the DCCP-Sync application area, padding the 4569 packet to the right length; and since every valid DCCP-Sync 4570 generates an immediate DCCP-SyncAck in response, the endpoint will 4571 have a pretty good idea of when a probe is lost. 4573 16. Forward Compatibility 4575 Future versions of DCCP may add new options and features. A few 4576 simple guidelines will let extended DCCPs interoperate with normal 4577 DCCPs. 4579 o DCCP processors MUST NOT act punitively towards options and 4580 features they do not understand. For example, DCCP processors 4581 MUST NOT reset the connection if some field marked Reserved in 4582 this specification is non-zero; if some unknown option is present; 4583 or if some feature negotiation option mentions an unknown feature. 4584 Instead, DCCP processors MUST ignore these events. The Mandatory 4585 option is the single exception: if Mandatory precedes some unknown 4586 option or feature, the connection MUST be reset. 4588 o DCCP processors MUST anticipate the possibility of unknown feature 4589 values, which might occur as part of a negotiation for a known 4590 feature. 
For server-priority features, unknown values are handled 4591 as a matter of course: since the non-extended DCCP's priority list 4592 will not contain unknown values, the result of the negotiation 4593 cannot be an unknown value. A DCCP SHOULD reset the connection if 4594 it is assigned an unacceptable value for some non-negotiable 4595 feature. 4597 o Each DCCP extension SHOULD be controlled by some feature. The 4598 default value of this feature should correspond to "extension not 4599 available". If an extended DCCP wants to use the extension, it 4600 SHOULD attempt to change the feature's value using a Change L or 4601 Change R option. Any non-extended DCCP will ignore the option, 4602 thus leaving the feature value at its default, "extension not 4603 available". 4605 Section 20 lists DCCP assigned numbers reserved for experimental and 4606 testing purposes. 4608 17. Middlebox Considerations 4610 This section describes properties of DCCP that firewalls, network 4611 address translators, and other middleboxes should consider, 4612 including parts of the packet that middleboxes should not change. 4613 The intent is to draw attention to aspects of DCCP that may be 4614 useful, or dangerous, for middleboxes, or that differ significantly 4615 from TCP. 4617 The Service Code field in DCCP-Request packets provides information 4618 that may be useful for stateful middleboxes. With Service Code, a 4619 middlebox can tell what protocol a connection will use without 4620 relying on port numbers. Middleboxes can disallow attempted 4621 connections accessing unexpected services by sending a DCCP-Reset 4622 with Reset Code 9, "Bad Service Code". Middleboxes probably 4623 shouldn't modify the Service Code, unless they are really changing 4624 the service a connection is accessing. 4626 The Source and Destination Port fields are in the same packet 4627 locations as the corresponding fields in TCP and UDP, which may 4628 simplify some middlebox implementations.
4630 Modifying DCCP Sequence Numbers and Acknowledgement Numbers is more 4631 tedious and dangerous than modifying TCP sequence numbers. A 4632 middlebox that added packets to, or removed packets from, a DCCP 4633 connection would have to modify acknowledgement options, such as Ack 4634 Vector, and CCID-specific options, such as TFRC's Loss Intervals, at 4635 minimum. On ECN-capable connections, the middlebox would have to 4636 keep track of ECN Nonce information for packets it introduced or 4637 removed, so that the relevant acknowledgement options continued to 4638 have correct ECN Nonce Echoes, or risk the connection being reset 4639 for "Aggression Penalty". Furthermore, if a middlebox completely 4640 changed sequence numbers, the DCCP-Move mobility mechanism might 4641 stop working. We therefore recommend that middleboxes not modify 4642 packet streams by adding or removing packets. 4644 Note that there is less need to modify DCCP's per-packet sequence 4645 numbers than TCP's per-byte sequence numbers; for example, a 4646 middlebox can change the contents of a packet without changing its 4647 sequence number. (In TCP, sequence number modification is required 4648 to support protocols like FTP that carry variable-length addresses 4649 in the data stream. If such an application were deployed over DCCP, 4650 middleboxes would simply grow or shrink the relevant packets as 4651 necessary, without changing their sequence numbers. This might 4652 involve fragmenting the packet.) 4654 Middleboxes may, of course, reset connections in progress. Clearly 4655 this requires inserting a packet into one or both packet streams, 4656 but the difficult issues do not arise. 4658 DCCP is somewhat unfriendly to "connection splicing" [SHHP00], in 4659 which clients' connection attempts are intercepted, but possibly 4660 later "spliced in" to external server connections via sequence 4661 number manipulations. 
A connection splicer at minimum would have to 4662 ensure that the spliced connections agreed on all relevant feature 4663 values, which might take some renegotiation. 4665 The contents of this section should not be interpreted as a 4666 wholesale endorsement of stateful middleboxes. 4668 18. Relations to Other Specifications 4670 18.1. DCCP and RTP 4672 The Real-Time Transport Protocol, RTP [RFC 3550], is currently used 4673 over UDP by many of DCCP's target applications (for instance, 4674 streaming media). Therefore, it is important to examine the 4675 relationship between DCCP and RTP, and in particular, the question 4676 of whether any changes in RTP are necessary or desirable when it is 4677 layered over DCCP instead of UDP. 4679 There are two potential sources of overhead in the RTP-over-DCCP 4680 combination, duplicated acknowledgement information and duplicated 4681 sequence numbers. We argue that, together, these sources of overhead add slightly 4682 more than 4 bytes per packet relative to RTP-over-UDP, and that 4683 eliminating the redundancy would not reduce the overhead. 4685 First, consider acknowledgements. Both RTP and DCCP report feedback 4686 about loss rates to data senders, via Real-Time Control Protocol 4687 Sender and Receiver Reports (RTCP SR/RR packets) and via DCCP 4688 acknowledgement options. These feedback mechanisms are potentially 4689 redundant. However, RTCP SR/RR packets contain information not 4690 present in DCCP acknowledgements, such as "interarrival jitter", and 4691 DCCP's acknowledgements contain information not transmitted by RTCP, 4692 such as the ECN Nonce Echo. Neither feedback mechanism makes the 4693 other redundant. 4695 Sending both types of feedback isn't particularly costly either. 4696 RTCP reports are sent relatively infrequently: once every 5 seconds, 4697 for low-bandwidth flows.
In DCCP, some feedback mechanisms are 4698 expensive---Ack Vector, for example, is frequent and verbose---but 4699 others are relatively cheap: CCID 3 (TFRC) acknowledgements take 4700 between 16 and 32 bytes of options sent once per round trip time. 4701 (Reporting less frequently than once per RTT would make congestion 4702 control less responsive to loss.) We therefore conclude that 4703 acknowledgement overhead in RTP-over-DCCP is not significantly 4704 higher than for RTP-over-UDP, at least for CCID 3. 4706 One clear redundancy can be addressed at the application level. The 4707 verbose packet-by-packet loss reports sent in RTCP Extended Reports 4708 (RTCP XR) Loss RLE Blocks can be derived from DCCP's Ack Vector 4709 options. (The converse is not true, since Loss RLE Blocks contain 4710 no ECN information.) Since DCCP implementations should provide an 4711 API for application access to Ack Vector information, RTP-over-DCCP 4712 applications might request either DCCP Ack Vectors or RTCP Extended 4713 Report Loss RLE Blocks, but not both. 4715 Now consider sequence number redundancy on data packets. The 4716 embedded RTP header contains a 16-bit RTP sequence number. Most 4717 data packets will use the DCCP-Data type; DCCP-DataAck and DCCP-Ack 4718 packets need not usually be sent. The DCCP-Data header is 12 bytes 4719 long without options, including a 24-bit sequence number. This is 4 4720 bytes more than a UDP header. Any options required on data packets 4721 would add further overhead, although many CCIDs (for instance, CCID 4722 3, TFRC) don't require options on most data packets. 4724 The DCCP sequence number cannot be inferred from the RTP sequence 4725 number since it increments on non-data packets as well as data 4726 packets. The RTP sequence number cannot be inferred from the DCCP 4727 sequence number either; for instance, RTP sequence numbers might be 4728 sent out of order. 
Furthermore, removing RTP's sequence number
would not save any header space because of alignment issues. We
therefore recommend that RTP transmitted over DCCP use the same
headers currently defined. The 4 byte header cost is a reasonable
tradeoff for DCCP's congestion control features and access to ECN.
Truly bandwidth-starved endpoints should use header compression.

18.2. Multiplexing Issues

Since DCCP doesn't provide reliable, ordered delivery, multiple
application sub-flows may be multiplexed over a single DCCP
connection with no inherent performance penalty. Thus, there is no
need for DCCP to provide built-in, SCTP-style support for multiple
sub-flows.

Some applications might want to share congestion control state among
multiple DCCP flows that share the same source and destination
addresses. This functionality could be provided by the Congestion
Manager [RFC 3124], a generic multiplexing facility. However, the
CM would not fully support DCCP without change; it does not
gracefully handle multiple congestion control mechanisms, for
example.

19. Security Considerations

DCCP does not provide cryptographic security guarantees.
Applications desiring hard security should use IPsec or end-to-end
security of some kind.

Nevertheless, DCCP is intended to protect against some classes of
attackers: Attackers cannot hijack a DCCP connection (close the
connection unexpectedly, or cause attacker data to be accepted by an
endpoint as if it came from the sender) unless they can guess valid
sequence numbers. Thus, as long as endpoints choose initial
sequence numbers well, a DCCP attacker must snoop on data packets to
get any reasonable probability of success. Sequence number validity
checks provide this guarantee. Section 7.5.5 describes sequence
number security further.

This security property only holds assuming that DCCP's random
numbers are chosen according to the guidelines in [RFC 1750].

DCCP provides no protection against attackers that can snoop on data
packets.

19.1. Security Considerations for Mobility

Mobility slightly changes DCCP's security properties by introducing
a new mechanism by which an attacker can hijack a connection. This
mechanism, DCCP-Move, has the unfortunate property that, given a
successful attack, the victim would not realize that the connection
has been stolen---its connection would simply be reset unexpectedly.

Nevertheless, a DCCP attacker still must snoop on data packets to
get any reasonable probability of success, since it must guess a
valid Mobility ID. Section 14.6 quantifies the probability of
successful attack; with DCCP's 128-bit Mobility IDs, that
probability is quite low.

19.2. Security Considerations for Partial Checksums

The partial checksum facility has a separate security impact,
particularly in its interaction with authentication and encryption
mechanisms. The impact is the same in DCCP as in the UDP-Lite
protocol, and what follows was adapted from the corresponding text
in the UDP-Lite specification [UDP-LITE].

When a DCCP packet's Checksum Coverage field is not zero, the
uncovered portion of a packet may change in transit. This is
contrary to the idea behind most authentication mechanisms:
authentication succeeds if the packet has not changed in transit.
Unless authentication mechanisms that operate only on the sensitive
part of packets are developed and used, authentication will always
fail for partially-checksummed DCCP packets whose uncovered part has
been damaged.

The IPsec integrity check (Encapsulating Security Payload, ESP, or
Authentication Header, AH) is applied (at least) to the entire IP
packet payload.
Corruption of any bit within that area will then
result in the IP receiver discarding a DCCP packet, even if the
corruption happened in an uncovered part of the DCCP application
data.

When IPsec is used with ESP payload encryption, a link cannot
determine the specific transport protocol of a packet being
forwarded by inspecting the IP packet payload. In this case, the
link MUST provide a standard integrity check covering the entire IP
packet and payload. DCCP partial checksums provide no benefit in
this case.

Encryption (e.g., at the transport or application levels) may be
used. Note that omitting an integrity check can, under certain
circumstances, compromise confidentiality [BEL98].

If a few bits of an encrypted packet are damaged, the decryption
transform will typically spread errors so that the packet becomes
too damaged to be of use. Many encryption transforms today exhibit
this behavior. There exist encryption transforms, namely stream
ciphers, that do not cause error propagation. Proper use of stream
ciphers can be quite difficult, especially when
authentication-checking is omitted [BB01]. In particular, an
attacker can cause predictable changes to the ultimate plaintext,
even without being able to decrypt the ciphertext.

20. IANA Considerations

DCCP introduces several sets of numbers whose values should be
allocated by IANA. The following sets of numbers should require an
IETF standards-track specification as a prerequisite for new
registrations.

o DCCP Packet Types 9 through 15 (Section 5.1).

o 8-bit DCCP-Reset Codes (Section 5.6).

o 8-bit DCCP Option Types (Section 5.9). The CCID-specific options
  128 through 255 need not be allocated by IANA, although particular
  CCIDs may request that IANA allocate their CCID-specific options.

o 8-bit DCCP Feature Numbers (Section 6).
  The CCID-specific features
  128 through 255 need not be allocated by IANA, although particular
  CCIDs may request that IANA allocate their CCID-specific features.

o 8-bit DCCP Congestion Control Identifiers (CCIDs) (Section 10).

o Ack Vector States (Section 11.4). Only State 2 remains
  unallocated.

o Data Dropped Drop Codes 4 through 6 (Section 11.7).

IANA should also provide a registry for 32-bit Service Codes.
Registering a Service Code should not require a standards-track
specification. Our liberal proposed registration rules for Service
Codes are presented in detail in Section 8.1.2.

Finally, DCCP requires a Protocol Number to be added to the registry
of Assigned Internet Protocol Numbers. Protocol Number 33 has
informally been made available for experimental DCCP use, but this
number may change in the future.

The following DCCP assigned numbers should be reserved specifically
for experimental and testing use [RFC 3692]: packet type 15, option
number 31, option numbers 120 through 126, feature numbers 120
through 126, Reset Codes 248 through 254, and CCID 254.

21. Thanks

Thanks to Jitendra Padhye for his help with early versions of this
specification.

Thanks to Junwen Lai and Arun Venkataramani, who, as interns at
ICIR, built a prototype DCCP implementation. In particular, Junwen
Lai recommended that the old feature negotiation mechanism be
scrapped and helped design the current mechanism, and Arun
Venkataramani's feedback improved Appendix A.

We thank the staff and interns of ICIR and, formerly, ACIRI, the
members of the End-to-End Research Group, and the members of the
Transport Area Working Group for their feedback on DCCP.
We
especially thank the DCCP expert reviewers: Greg Minshall, Eric
Rescorla, and Magnus Westerlund for detailed written comments and
problem spotting, and Rob Austein and Steve Bellovin for verbal
comments and written notes.

We also thank those who provided comments and suggestions via the
DCCP BOF, Working Group, and mailing lists, including Damon
Lanphear, Patrick McManus, Sara Karlberg, Kevin Lai, Youngsoo Choi,
Dan Duchamp, Gorry Fairhurst, Derek Fawcus, David Timothy Fleeman,
John Loughney, Ghyslain Pelletier, Tom Phelan, Stanislav Shalunov,
Yufei Wang, and Michael Welzl. In particular, Michael Welzl
suggested the Data Checksum option.

A. Appendix: Ack Vector Implementation Notes

This appendix discusses particulars of DCCP acknowledgement
handling, in the context of an abstract implementation for Ack
Vector. It is informative rather than normative.

The first part of our implementation runs at the HC-Receiver, and
therefore acknowledges data packets. It generates Ack Vector
options. The implementation has the following characteristics:

o At most one byte of state per acknowledged packet.

o O(1) time to update that state when a new packet arrives (normal
  case).

o Cumulative acknowledgements.

o Quick removal of old state.

The basic data structure is a circular buffer containing information
about acknowledged packets. Each byte in this buffer contains a
state and run length; the state can be 0 (packet received), 1
(packet ECN marked), or 3 (packet not yet received). The buffer
grows from right to left. The implementation maintains four
variables, aside from the buffer contents:

o "buf_head" and "buf_tail", which mark the live portion of the
  buffer.

o "buf_ackno", the Acknowledgement Number of the most recent packet
  acknowledged in the buffer. This corresponds to the "head"
  pointer.

o "buf_nonce", the one-bit sum (exclusive-or, or parity) of the ECN
  Nonces received on all packets acknowledged by the buffer with
  State 0.

We draw acknowledgement buffers like this:

+-------------------------------------------------------------------+
|S,L|S,L|S,L|S,L|   |   |   |   |   |S,L|S,L|S,L|S,L|S,L|S,L|S,L|S,L|
+-------------------------------------------------------------------+
              ^                       ^
          buf_tail         buf_head, buf_ackno = A     buf_nonce = E

          <=== buf_head and buf_tail move this way <===

Each `S,L' represents a State/Run length byte. We will draw these
buffers showing only their live portion, and will add an annotation
showing the Acknowledgement Number for the last live byte in the
buffer. For example:

  +-----------------------------------------------+
A |S,L|S,L|S,L|S,L|S,L|S,L|S,L|S,L|S,L|S,L|S,L|S,L|  T    BN[E]
  +-----------------------------------------------+

Here, buf_nonce equals E and buf_ackno equals A. This smaller
Example Buffer contains actual data.

   +---------------------------+
10 |0,0|3,0|3,0|3,0|0,4|1,0|0,0|  0    BN[1]   [Example Buffer]
   +---------------------------+

In concrete terms, its meaning is as follows:

   Packet 10 was received. (The head of the buffer has sequence
   number 10, state 0, and run length 0.)

   Packets 9, 8, and 7 have not yet been received. (The three
   bytes preceding the head each have state 3 and run length 0.)

   Packets 6, 5, 4, 3, and 2 were received.

   Packet 1 was ECN marked.

   Packet 0 was received.

   The one-bit sum of the ECN Nonces on packets 10, 6, 5, 4, 3, 2,
   and 0 equals 1.

Additionally, the HC-Receiver must keep some information about the
Ack Vectors it has recently sent. For each packet sent carrying an
Ack Vector, it remembers four variables:

o "ack_seqno", the Sequence Number used for the packet. This is an
  HC-Receiver sequence number.

o "ack_ptr", the value of buf_head at the time of acknowledgement.

o "ack_ackno", the Acknowledgement Number used for the packet. This
  is an HC-Sender sequence number. Since acknowledgements are
  cumulative, this single number completely specifies all necessary
  information about the packets acknowledged by this Ack Vector.

o "ack_nonce", the one-bit sum of the ECN Nonces for all State 0
  packets in the buffer from buf_head to ack_ackno, inclusive.
  Initially, this equals the Nonce Echo of the acknowledgement's Ack
  Vector (or, if the ack packet contained more than one Ack Vector,
  the exclusive-or of the Nonce Echoes of all the acknowledgement's
  Ack Vectors). It changes as information about old acknowledgements
  is removed (so ack_ptr and buf_head diverge), and as old packets
  arrive (so they change from State 3 or State 1 to State 0).

A.1. Packet Arrival

This section describes how the HC-Receiver updates its
acknowledgement buffer as packets arrive from the HC-Sender.

A.1.1. New Packets

When a packet with Sequence Number greater than buf_ackno arrives,
the HC-Receiver updates buf_head (by moving it to the left
appropriately), buf_ackno (which is set to the new packet's Sequence
Number), and possibly buf_nonce (if the packet arrived unmarked with
ECN Nonce 1), in addition to the buffer itself. For example, if
HC-Sender packet 11 arrived ECN marked, the Example Buffer above
would enter this new state (changes are marked with stars):

** +***----------------------------+
11 |1,0|0,0|3,0|3,0|3,0|0,4|1,0|0,0|  0    BN[1]
** +***----------------------------+

If the packet's state equals the state at the head of the buffer,
the HC-Receiver may choose to increment its run length (up to the
maximum).
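
The new-packet update can be sketched in a few lines of Python. This is a minimal illustration, not the implementation described in this appendix: it keeps the live bytes in an ordinary list (head first) rather than a circular buffer, so it does not have the O(1) update cost. The names AckBuffer, add_new_packet, and example_buffer are hypothetical, and the maximum run length of 63 assumes a 6-bit run length field.

```python
from dataclasses import dataclass

RECEIVED, ECN_MARKED, NOT_RECEIVED = 0, 1, 3
MAX_RUN_LENGTH = 63  # assumption: run length is stored in 6 bits


@dataclass
class AckBuffer:
    cells: list   # [state, run_length] pairs, head (newest packet) first
    ackno: int    # buf_ackno
    nonce: int    # buf_nonce


def add_new_packet(buf, seqno, state, ecn_nonce=0, merge=True):
    """Record a packet whose Sequence Number is greater than buf_ackno."""
    if seqno <= buf.ackno:
        raise ValueError("old packets are handled separately")
    # Packets that were skipped over are entered as State 3, one byte each.
    for _ in range(seqno - buf.ackno - 1):
        buf.cells.insert(0, [NOT_RECEIVED, 0])
    head = buf.cells[0] if buf.cells else None
    if (merge and head is not None and head[0] == state
            and head[1] < MAX_RUN_LENGTH):
        head[1] += 1                      # extend the run at the head
    else:
        buf.cells.insert(0, [state, 0])   # start a new head byte
    buf.ackno = seqno
    if state == RECEIVED and ecn_nonce == 1:
        buf.nonce ^= 1                    # one-bit sum over State 0 packets


def example_buffer():
    """The Example Buffer: head is packet 10, buf_nonce is 1."""
    cells = [[0, 0], [3, 0], [3, 0], [3, 0], [0, 4], [1, 0], [0, 0]]
    return AckBuffer(cells, ackno=10, nonce=1)


buf = example_buffer()
add_new_packet(buf, 11, ECN_MARKED)       # packet 11 arrives ECN marked
```

Passing merge=False makes the sketch always start a new byte, which corresponds to an HC-Receiver that chooses not to increment run lengths.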
For example, if HC-Sender packet 11 arrived without ECN
marking and with ECN Nonce 0, the Example Buffer might enter this
state instead:

** +--*------------------------+
11 |0,1|3,0|3,0|3,0|0,4|1,0|0,0|  0    BN[1]
** +--*------------------------+

Of course, the new packet's sequence number might not equal the
expected sequence number. In this case, the HC-Receiver will enter
the intervening packets as State 3. If several packets are missing,
the HC-Receiver may prefer to enter multiple bytes with run length
0, rather than a single byte with a larger run length; this
simplifies table updates if one of the missing packets arrives. For
example, if HC-Sender packet 12 arrived with ECN Nonce 1, the
Example Buffer would enter this state:

** +*******----------------------------+          *
12 |0,0|3,0|0,0|3,0|3,0|3,0|0,4|1,0|0,0|  0    BN[0]
** +*******----------------------------+          *

Of course, the circular buffer may overflow, either when the
HC-Sender is sending data at a very high rate, when the
HC-Receiver's acknowledgements are not reaching the HC-Sender, or
when the HC-Sender is forgetting to acknowledge those acks (so the
HC-Receiver is unable to clean up old state). In this case, the
HC-Receiver should either compress the buffer (by increasing run
lengths when possible), transfer its state to a larger buffer, or,
as a last resort, drop all received packets, without processing them
whatsoever, until its buffer shrinks again.

A.1.2. Old Packets

When a packet with Sequence Number S arrives, and S <= buf_ackno,
the HC-Receiver will scan the table for the byte corresponding to S.
(Indexing structures could reduce the complexity of this scan.) If
S was previously lost (State 3), and it was stored in a byte with
run length 0, the HC-Receiver can simply change the byte's state.
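
The scan and the in-place state change might look as follows. This is a hedged sketch under the same simplifying assumption of a head-first list of [state, run length] pairs; find_cell and old_packet_arrived are hypothetical names. Updating buf_nonce here is our reading of its definition as the parity of the nonces on State 0 packets, and the case that requires splitting a byte with a non-zero run length is left out.

```python
RECEIVED, ECN_MARKED, NOT_RECEIVED = 0, 1, 3


def find_cell(cells, ackno, seqno):
    """Return the index of the byte that covers seqno, or None."""
    hi = ackno                      # newest packet covered by this byte
    for i, (state, run) in enumerate(cells):
        lo = hi - run               # oldest packet covered by this byte
        if lo <= seqno <= hi:
            return i
        hi = lo - 1
    return None


def old_packet_arrived(cells, ackno, buf_nonce, seqno, ecn_nonce=0):
    """Handle a packet with seqno <= ackno; return the new buf_nonce."""
    i = find_cell(cells, ackno, seqno)
    if i is None or cells[i][0] != NOT_RECEIVED:
        return buf_nonce            # not recorded as lost: leave unchanged
    if cells[i][1] != 0:
        raise NotImplementedError("splitting a run is not shown here")
    cells[i][0] = RECEIVED          # State 3 -> State 0, in place
    return buf_nonce ^ ecn_nonce    # the packet now counts toward the parity


# The Example Buffer, with late packet 8 arriving with ECN Nonce 0.
cells = [[0, 0], [3, 0], [3, 0], [3, 0], [0, 4], [1, 0], [0, 0]]
nonce = old_packet_arrived(cells, 10, 1, 8, ecn_nonce=0)
```

The linear scan matches the text's unindexed table; an index keyed by sequence number would avoid it at the cost of extra state.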
For example, if HC-Sender packet 8 was received with ECN Nonce 0,
the Example Buffer would enter this state:

   +--------*------------------+
10 |0,0|3,0|0,0|3,0|0,4|1,0|0,0|  0    BN[1]
   +--------*------------------+

If S was not marked as lost, or if it was not contained in the
table, the packet is probably a duplicate, and should be ignored.
(The new packet's ECN marking state might differ from the state in
the buffer; Section 11.4.1 describes what is allowed then.) If S's
buffer byte has a non-zero run length, then the buffer might need to
be reshuffled to make space for one or two new bytes.

The ack_nonce fields may also need manipulation when old packets
arrive. In particular, when S transitions from State 3 or State 1
to State 0, and S had ECN Nonce 1, then the implementation should
flip the value of ack_nonce for every acknowledgement with ack_ackno
>= S.

It is impossible with this data structure to shift packets from
State 0 to State 1, since the buffer doesn't store individual
packets' ECN Nonces.

A.2. Sending Acknowledgements

Whenever the HC-Receiver needs to generate an acknowledgement, the
buffer's contents can simply be copied into one or more Ack Vector
options. Copied Ack Vectors might not be maximally compressed; for
example, the Example Buffer above contains three adjacent 3,0 bytes
that could be combined into a single 3,2 byte. The HC-Receiver
might, therefore, choose to compress the buffer in place before
sending the option, or to compress the buffer while copying it;
either operation is simple.

Every acknowledgement sent by the HC-Receiver SHOULD include the
entire state of the buffer. That is, acknowledgements are
cumulative.

If the acknowledgement fits in one Ack Vector, that Ack Vector's
Nonce Echo simply equals buf_nonce. For multiple Ack Vectors, more
care is required.
The Ack Vectors should be split at points
corresponding to previous acknowledgements, since the stored
ack_nonce fields provide enough information to calculate correct
Nonce Echoes. The implementation should therefore acknowledge data
at least once per 253 bytes of buffer state. (Otherwise, there'd be
no way to calculate a Nonce Echo.)

For each acknowledgement it sends, the HC-Receiver will add an
acknowledgement record. ack_seqno will equal the HC-Receiver
sequence number it used for the ack packet; ack_ptr will equal
buf_head; ack_ackno will equal buf_ackno; and ack_nonce will equal
buf_nonce.

A.3. Clearing State

Some of the HC-Sender's packets will include acknowledgement
numbers, which ack the HC-Receiver's acknowledgements. When such an
ack is received, the HC-Receiver finds the acknowledgement record R
with the appropriate ack_seqno, then:

o Sets buf_tail to R.ack_ptr + 1.

o If R.ack_nonce is 1, it flips buf_nonce, and the value of
  ack_nonce for every later ack record.

o Throws away R and every preceding ack record.

(The HC-Receiver may choose to keep some older information, in case
a lost packet shows up late.) For example, say that the HC-Receiver
storing the Example Buffer had sent two acknowledgements already:

1. ack_seqno = 59, ack_ackno = 3, ack_nonce = 1.

2. ack_seqno = 60, ack_ackno = 10, ack_nonce = 0.

Say the HC-Receiver then received a DCCP-DataAck packet with
Acknowledgement Number 59 from the HC-Sender. This informs the
HC-Receiver that the HC-Sender received, and processed, all the
information in HC-Receiver packet 59. This packet acknowledged
HC-Sender packet 3, so the HC-Sender has now received the
HC-Receiver's acknowledgements for packets 0, 1, 2, and 3.
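
The three clearing steps can be sketched as below, using the two acknowledgement records just listed. This is an illustration under stated assumptions rather than the appendix's own implementation: the buffer is a head-first list of [state, run length] pairs, and old state is trimmed by sequence number (everything at or below R.ack_ackno) instead of by moving buf_tail to R.ack_ptr + 1. AckRecord and clear_state are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class AckRecord:
    ack_seqno: int   # HC-Receiver sequence number of the ack packet
    ack_ackno: int   # HC-Sender sequence number it acknowledged
    ack_nonce: int


def clear_state(cells, buf_ackno, buf_nonce, records, acked_seqno):
    """Process an HC-Sender ack of HC-Receiver packet acked_seqno.

    Returns (cells, buf_nonce, records) after old state is removed.
    """
    idx = next((i for i, r in enumerate(records)
                if r.ack_seqno == acked_seqno), None)
    if idx is None:
        return cells, buf_nonce, records      # no matching record
    rec = records[idx]
    kept, hi = [], buf_ackno
    for state, run in cells:
        lo = hi - run                         # oldest packet in this byte
        if lo > rec.ack_ackno:
            kept.append([state, run])         # entirely newer: keep as is
        elif hi > rec.ack_ackno:
            # The cut falls inside this byte: shorten its run and stop.
            kept.append([state, hi - rec.ack_ackno - 1])
            break
        else:
            break
        hi = lo - 1
    later = records[idx + 1:]                 # R and earlier are dropped
    if rec.ack_nonce == 1:
        buf_nonce ^= 1
        for r in later:
            r.ack_nonce ^= 1
    return kept, buf_nonce, later


cells = [[0, 0], [3, 0], [3, 0], [3, 0], [0, 4], [1, 0], [0, 0]]
records = [AckRecord(59, 3, 1), AckRecord(60, 10, 0)]
cells, buf_nonce, records = clear_state(cells, 10, 1, records, 59)
```

Trimming by sequence number makes the run-length adjustment of the tail byte explicit, which a pointer-only update would have to perform as a separate step.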
The Example Buffer
should enter this state:

   +------------------*+  *       *
10 |0,0|3,0|3,0|3,0|0,2|  4    BN[0]
   +------------------*+  *       *

The tail byte's run length was adjusted, since packet 3 was in the
middle of that byte. Since R.ack_nonce was 1, the buf_nonce field
was flipped, as were the ack_nonce fields for later acknowledgements
(here, the HC-Receiver Ack 60 record, not shown, has its ack_nonce
set to 1). The HC-Receiver can also throw away stored information
about HC-Receiver Ack 59 and any earlier acknowledgements.

A careful implementation might try to ensure reasonable robustness
to reordering. Suppose that the Example Buffer is as before, but
that packet 9 now arrives, out of sequence. The buffer would enter
this state:

   +----*----------------------+
10 |0,0|0,0|3,0|3,0|0,4|1,0|0,0|  0    BN[1]
   +----*----------------------+

The danger is that the HC-Sender might acknowledge the HC-Receiver's
previous acknowledgement (with sequence number 60), which says that
Packet 9 was not received, before the HC-Receiver has a chance to
send a new acknowledgement saying that Packet 9 actually was
received. Therefore, when packet 9 arrived, the HC-Receiver might
modify its acknowledgement records to:

1. ack_seqno = 59, ack_ackno = 3, ack_nonce = 1.

2. ack_seqno = 60, ack_ackno = 3, ack_nonce = 1.

That is, Ack 60 is now treated like a duplicate of Ack 59. This
would prevent the Tail pointer from moving past packet 9 until the
HC-Receiver knows that the HC-Sender has seen an Ack Vector
indicating that packet's arrival.

A.4. Processing Acknowledgements

When the HC-Sender receives an acknowledgement, it generally cares
about the number of packets that were dropped and/or ECN marked. It
simply reads this off the Ack Vector. Additionally, it should check
the ECN Nonce for correctness.
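
Reading an Ack Vector and checking its Nonce Echo could be sketched as follows. The sketch assumes the vector has already been unpacked into (state, run length) pairs, newest packet first, and that the HC-Sender has kept the ECN Nonce it attached to each packet in a table we call sent_nonces. That table, and the names expand, summarize, and nonce_echo_ok, are hypothetical.

```python
RECEIVED, ECN_MARKED, NOT_RECEIVED = 0, 1, 3


def expand(cells, ackno):
    """Yield (seqno, state) for every packet described, newest first."""
    seqno = ackno
    for state, run in cells:
        for _ in range(run + 1):      # run length L covers L + 1 packets
            yield seqno, state
            seqno -= 1


def summarize(cells, ackno):
    """Group the acknowledged sequence numbers by reported state."""
    by_state = {RECEIVED: [], ECN_MARKED: [], NOT_RECEIVED: []}
    for seqno, state in expand(cells, ackno):
        by_state.setdefault(state, []).append(seqno)
    return by_state


def nonce_echo_ok(cells, ackno, nonce_echo, sent_nonces):
    """Check the Nonce Echo against the nonces the sender attached to
    the packets reported as State 0."""
    parity = 0
    for seqno in summarize(cells, ackno)[RECEIVED]:
        parity ^= sent_nonces[seqno]
    return parity == nonce_echo


# The Example Buffer, viewed as an Ack Vector with Acknowledgement
# Number 10.
vector = [(0, 0), (3, 0), (3, 0), (3, 0), (0, 4), (1, 0), (0, 0)]
report = summarize(vector, 10)
dropped, marked = len(report[NOT_RECEIVED]), len(report[ECN_MARKED])
```

For the Example Buffer this yields three packets not yet received (9, 8, and 7) and one ECN-marked packet (1).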
(As described in Section 11.4.1, it
may want to keep more detailed information about acknowledged
packets in case packets change states between acknowledgements, or
in case the application queries whether a packet arrived.)

The HC-Sender must also acknowledge the HC-Receiver's
acknowledgements so that the HC-Receiver can free old Ack Vector
state. (Since Ack Vector acknowledgements are reliable, the
HC-Receiver must maintain and resend Ack Vector information until it
is sure that the HC-Sender has received that information.) A simple
algorithm suffices: since Ack Vector acknowledgements are
cumulative, a single acknowledgement number tells the HC-Receiver
how much ack information has arrived. Assuming that the HC-Receiver
sends no data, the HC-Sender can ensure that at least once a
round-trip time, it sends a DCCP-DataAck packet acknowledging the
latest DCCP-Ack packet it has received. Of course, the HC-Sender
only needs to acknowledge the HC-Receiver's acknowledgements if the
HC-Sender is also sending data. If the HC-Sender is not sending
data, then the HC-Receiver's Ack Vector state is stable, and there
is no need to shrink it. The HC-Sender must watch for drops and ECN
marks on received DCCP-Ack packets so that it can adjust the
HC-Receiver's ack-sending rate---for example, with Ack Ratio---in
response to congestion.

If the other half-connection is not quiescent---that is, the
HC-Receiver is sending data to the HC-Sender, possibly using another
CCID---then the acknowledgements on that half-connection are
sufficient for the HC-Receiver to free its state.

B. Appendix: Design Motivation

This section attempts to capture some of the rationale behind
specific details of DCCP design.

B.1. CsCov and Partial Checksumming

A great deal of discussion has taken place regarding the utility of
allowing a DCCP sender to restrict the checksum so that it does not
cover the complete packet.

Many of the applications that we envisage using DCCP are resilient
to some degree of data loss, or they would typically have chosen a
reliable transport. Some of these applications may also be
resilient to data corruption---some audio payloads, for example.
These resilient applications might prefer to receive corrupted data
rather than to have DCCP drop a corrupted packet. This is
particularly because of congestion control: DCCP cannot tell the
difference between packets dropped due to corruption and packets
dropped due to congestion, and so it must reduce the transmission
rate accordingly. This response may cause the connection to receive
less bandwidth than it is due; corruption in some networking
technologies is independent of, or at least not always correlated
to, congestion. Therefore, corrupted packets do not need to cause
as strong a reduction in transmission rate as the congestion
response would dictate (so long as the DCCP header and options are
not corrupt).

Thus DCCP allows the checksum to cover all of the packet, just the
DCCP header, or both the DCCP header and some number of bytes from
the application data. If the application cannot tolerate any data
corruption, then the checksum must cover the whole packet. If the
application would prefer to tolerate some corruption rather than
have the packet dropped, then it can set the checksum to cover only
part of the packet (but always the DCCP header). In addition, if
the application wishes to decouple checksumming of the DCCP header
from checksumming of the application data, it may do so by including
the Data Checksum option.
This would allow DCCP to discard
corrupted application data, but still not mistake the corruption for
network congestion.

Thus, from the application point of view, partial checksums seem to
be a desirable feature. However, the usefulness of partial
checksums depends on partially corrupted packets being delivered to
the receiver. If the link-layer CRC always discards corrupted
packets, then this will not happen, and so the usefulness of partial
checksums would be restricted to corruption that occurred in routers
and other places not covered by link CRCs. There does not appear to
be consensus on how likely it is that future network links that
suffer significant corruption will not cover the entire packet with
a single strong CRC. DCCP makes it possible to tailor such links to
the application, but it is difficult to predict if this will be
compelling for future link technologies.

In addition, partial checksums do not co-exist well with IP-level
authentication mechanisms such as IPsec AH, which cover the entire
packet with a cryptographic hash. Thus, if cryptographic
authentication mechanisms are required to co-exist with partial
checksums, the authentication must be carried in the application
data. A possible mode of usage would appear to be similar to that
of Secure RTP. However, such "application-level" authentication
does not protect the DCCP option negotiation and state machine from
forged packets. An alternative would be to use IPsec ESP, and use
encryption to protect the DCCP headers against attack, while using
the DCCP header validity checks to authenticate that the header is
from someone who possessed the correct key.
However, while this is
resistant to replay (due to the DCCP sequence number), it is not by
itself resistant to some forms of man-in-the-middle attacks because
the application data is not tightly coupled to the packet header.
Thus an application-level authentication probably needs to be
coupled with IPsec ESP or a similar mechanism to provide a
reasonably complete security solution. The overhead of such a
solution might be unacceptable for some applications that would
otherwise wish to use partial checksums.

On balance, the authors believe that DCCP partial checksums have the
potential to enable some future uses that would otherwise be
difficult. As the cost and complexity of supporting them are small,
it seems worth including them at this time. It remains to be seen
whether they are useful in practice.

Normative References

[RFC 793] J. Postel, editor. Transmission Control Protocol.
    RFC 793.

[RFC 1191] J. C. Mogul and S. E. Deering. Path MTU Discovery.
    RFC 1191.

[RFC 1750] D. Eastlake, S. Crocker, and J. Schiller. Randomness
    Recommendations for Security. RFC 1750.

[RFC 2026] S. Bradner. The Internet Standards Process---Revision 3.
    RFC 2026.

[RFC 2119] S. Bradner. Key Words For Use in RFCs to Indicate
    Requirement Levels. RFC 2119.

[RFC 2460] S. Deering and R. Hinden. Internet Protocol, Version 6
    (IPv6) Specification. RFC 2460.

[RFC 3168] K.K. Ramakrishnan, S. Floyd, and D. Black. The Addition
    of Explicit Congestion Notification (ECN) to IP. RFC 3168.

[RFC 3309] J. Stone, R. Stewart, and D. Otis. Stream Control
    Transmission Protocol (SCTP) Checksum Change. RFC 3309.

[RFC 3692] T. Narten. Assigning Experimental and Testing Numbers
    Considered Useful. RFC 3692.

[UDP-LITE] L-A. Larzon, M. Degermark, S. Pink, L-E. Jonsson
    (editor), and G. Fairhurst (editor). The UDP-Lite Protocol.
    draft-ietf-tsvwg-udp-lite-02.txt, work in progress, August 2003.

Informative References

[BB01] S.M. Bellovin and M. Blaze. Cryptographic Modes of Operation
    for the Internet. 2nd NIST Workshop on Modes of Operation,
    August 2001.

[BEL98] S.M. Bellovin. Cryptography and the Internet. Proc. CRYPTO
    '98 (LNCS 1462), pp. 46-55, August 1998.

[CCID 2 PROFILE] S. Floyd and E. Kohler. Profile for DCCP
    Congestion Control ID 2: TCP-like Congestion Control.
    draft-ietf-dccp-ccid2-05.txt, work in progress, February 2004.

[CCID 3 PROFILE] S. Floyd, E. Kohler, and J. Padhye. Profile for
    DCCP Congestion Control ID 3: TFRC Congestion Control.
    draft-ietf-dccp-ccid3-05.txt, work in progress, February 2004.

[LINK BCP] Phil Karn, editor. Advice for Internet Subnetwork
    Designers. draft-ietf-pilc-link-design-13.txt, work in
    progress, February 2003.

[M85] Robert T. Morris. A Weakness in the 4.2BSD Unix TCP/IP
    Software. Computer Science Technical Report 117, AT&T Bell
    Laboratories, Murray Hill, NJ, February 1985.

[PMTUD] Matt Mathis, John Heffner, and Kevin Lahey. Path MTU
    Discovery. draft-ietf-pmtud-method-00.txt, work in progress,
    October 2003.

[RFC 792] J. Postel, editor. Internet Control Message Protocol.
    RFC 792.

[RFC 1948] S. Bellovin. Defending Against Sequence Number Attacks.
    RFC 1948.

[RFC 2960] R. Stewart, Q. Xie, K. Morneault, C. Sharp, H.
    Schwarzbauer, T. Taylor, I. Rytina, M. Kalla, L. Zhang, and V.
    Paxson. Stream Control Transmission Protocol. RFC 2960.

[RFC 3124] H. Balakrishnan and S. Seshan. The Congestion Manager.
    RFC 3124.

[RFC 3448] M. Handley, S. Floyd, J. Padhye, and J. Widmer. TCP
    Friendly Rate Control (TFRC): Protocol Specification. RFC 3448.

[RFC 3517] E. Blanton, M. Allman, K. Fall, and L. Wang. A
    Conservative Selective Acknowledgment (SACK)-based Loss Recovery
    Algorithm for TCP. RFC 3517.

[RFC 3540] N. Spring, D. Wetherall, and D. Ely. Robust Explicit
    Congestion Notification (ECN) Signaling with Nonces. RFC 3540.

[RFC 3550] H. Schulzrinne, S. Casner, R. Frederick, and V. Jacobson.
    RTP: A Transport Protocol for Real-Time Applications. RFC 3550.

[SB00] Alex C. Snoeren and Hari Balakrishnan. An End-to-End
    Approach to Host Mobility. Proc. 6th Annual ACM/IEEE
    International Conference on Mobile Computing and Networking
    (MOBICOM '00), August 2000.

[SHHP00] Oliver Spatscheck, Jorgen S. Hansen, John H. Hartman, and
    Larry L. Peterson. Optimizing TCP Forwarder Performance.
    IEEE/ACM Transactions on Networking 8(2):146-157, April 2000.

[SYNCOOKIES] Daniel J. Bernstein. SYN Cookies.
    http://cr.yp.to/syncookies.html, as of July 2003.

Authors' Addresses

Eddie Kohler
4531C Boelter Hall
UCLA Computer Science Department
Los Angeles, CA 90095
USA

Mark Handley
Department of Computer Science
University College London
Gower Street
London WC1E 6BT
UK

Sally Floyd
ICSI Center for Internet Research
1947 Center Street, Suite 600
Berkeley, CA 94704
USA

Intellectual Property Notice

The IETF has been notified of intellectual property rights claimed
in regard to some or all of the specification contained in this
document, particularly regarding support for mobility. For more
information consult the online list of claimed rights.

The IETF takes no position regarding the validity or scope of any
intellectual property or other rights that might be claimed to
pertain to the implementation or use of the technology described in
this document or the extent to which any license under such rights
might or might not be available; neither does it represent that it
has made any effort to identify any such rights.
Information on the IETF's procedures with respect to rights in
standards-track and standards-related documentation can be found in
BCP-11. Copies of claims of rights made available for publication
and any assurances of licenses to be made available, or the result
of an attempt made to obtain a general license or permission for the
use of such proprietary rights by implementors or users of this
specification can be obtained from the IETF Secretariat.

Full Copyright Statement

Copyright (C) The Internet Society (2004). All Rights Reserved.

This document and translations of it may be copied and furnished to
others, and derivative works that comment on or otherwise explain it
or assist in its implementation may be prepared, copied, published
and distributed, in whole or in part, without restriction of any
kind, provided that the above copyright notice and this paragraph
are included on all such copies and derivative works. However, this
document itself may not be modified in any way, such as by removing
the copyright notice or references to the Internet Society or other
Internet organizations, except as needed for the purpose of
developing Internet standards in which case the procedures for
copyrights defined in the Internet Standards process must be
followed, or as required to translate it into languages other than
English.

The limited permissions granted above are perpetual and will not be
revoked by the Internet Society or its successors or assigns.

This document and the information contained herein is provided on an
"AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.