idnits 2.17.1

draft-ietf-dccp-spec-10.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** It looks like you're using RFC 3978 boilerplate.  You should update this
     to the boilerplate described in the IETF Trust License Policy document
     (see https://trustee.ietf.org/license-info), which is required now.

  -- Found old boilerplate from RFC 3978, Section 5.1.a on line 18.

  -- Found old boilerplate from RFC 3978, Section 5.5 on line 5823.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 5834.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 5841.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 5847.

  ** Found boilerplate matching RFC 3978, Section 5.4, paragraph 1 (on line
     5815), which is fine, but *also* found old RFC 2026, Section 10.4C,
     paragraph 1 text on line 40.

  ** The document seems to lack an RFC 3978 Section 5.1 IPR Disclosure
     Acknowledgement.

  ** This document has an original RFC 3978 Section 5.4 Copyright Line,
     instead of the newer IETF Trust Copyright according to RFC 4748.

  ** This document has an original RFC 3978 Section 5.5 Disclaimer, instead of
     the newer disclaimer which includes the IETF Trust according to RFC 4748.

  ** The document uses RFC 3667 boilerplate or RFC 3978-like boilerplate
     instead of verbatim RFC 3978 boilerplate.  After 6 May 2005, submission
     of drafts without verbatim RFC 3978 boilerplate is not accepted.

     The following non-3978 patterns matched text found in the document.
     That text should be removed or replaced:

       This document is an Internet-Draft and is subject to all provisions of
       Section 3 of RFC 3667.
       By submitting this Internet-Draft, each author represents that any
       applicable patent or other IPR claims of which he or she is aware have
       been or will be disclosed, and any of which he or she becomes aware
       will be disclosed, in accordance with Section 6 of BCP 79.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not
     match the current year

  == Line 1839 has weird spacing: '...t value snd...'

  == Line 2401 has weird spacing: '...loseReq seq...'

  == The document seems to use 'NOT RECOMMENDED' as an RFC 2119 keyword, but
     does not include the phrase in its RFC 2119 key words list.

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008.  If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment.  If not, you may need to add the pre-RFC5378 disclaimer.
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (7 March 2005) is 6989 days in the past.  Is this
     intentional?

  -- Found something which looks like a code comment -- if you have code
     sections in the document, please surround them with '<CODE BEGINS>' and
     '<CODE ENDS>' lines.
  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Missing Reference: 'CLOSED' is mentioned on line 847, but not defined

  == Missing Reference: 'LISTEN' is mentioned on line 847, but not defined

  == Missing Reference: 'TIMEWAIT' is mentioned on line 856, but not defined

  == Missing Reference: 'Nonce 0' is mentioned on line 4544, but not defined

  == Missing Reference: 'Nonce 1' is mentioned on line 4544, but not defined

  == Missing Reference: 'AWL' is mentioned on line 2364, but not defined

  == Missing Reference: 'AWH' is mentioned on line 2364, but not defined

  == Missing Reference: 'SWL' is mentioned on line 2364, but not defined

  == Missing Reference: 'SWH' is mentioned on line 2364, but not defined

  == Missing Reference: 'RFC TBA' is mentioned on line 3574, but not defined

  == Missing Reference: 'DrpCd' is mentioned on line 4302, but not defined

  == Missing Reference: 'E' is mentioned on line 5329, but not defined

  -- Looks like a reference, but probably isn't: '1' on line 5540

  -- Looks like a reference, but probably isn't: '0' on line 5523

  == Unused Reference: 'RFC 2119' is defined on line 5677, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC 2434' is defined on line 5680, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC 2460' is defined on line 5683, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC 1948' is defined on line 5736, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC 2960' is defined on line 5752, but no explicit
     reference was found in the text

  ** Obsolete normative reference: RFC 793 (Obsoleted by RFC 9293)

  ** Obsolete normative reference: RFC 2434 (Obsoleted by RFC 5226)

  ** Obsolete normative reference: RFC 2460 (Obsoleted by RFC 8200)

  ** Obsolete normative
reference: RFC 3309 (Obsoleted by RFC 4960) ** Obsolete normative reference: RFC 3775 (Obsoleted by RFC 6275) == Outdated reference: A later version (-11) exists of draft-ietf-pmtud-method-01 -- Obsolete informational reference (is this intentional?): RFC 1750 (Obsoleted by RFC 4086) -- Obsolete informational reference (is this intentional?): RFC 1948 (Obsoleted by RFC 6528) -- Obsolete informational reference (is this intentional?): RFC 2401 (Obsoleted by RFC 4301) -- Obsolete informational reference (is this intentional?): RFC 2463 (Obsoleted by RFC 4443) -- Obsolete informational reference (is this intentional?): RFC 2581 (Obsoleted by RFC 5681) -- Obsolete informational reference (is this intentional?): RFC 2960 (Obsoleted by RFC 4960) -- Obsolete informational reference (is this intentional?): RFC 3448 (Obsoleted by RFC 5348) Summary: 11 errors (**), 0 flaws (~~), 23 warnings (==), 17 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 1 Internet Engineering Task Force Eddie Kohler 2 INTERNET-DRAFT UCLA 3 draft-ietf-dccp-spec-10.txt Mark Handley 4 Expires: 7 September 2005 UCL 5 Sally Floyd 6 ICIR 7 7 March 2005 9 Datagram Congestion Control Protocol (DCCP) 11 Status of this Memo 13 This document is an Internet-Draft and is subject to all provisions 14 of section 3 of RFC 3667. By submitting this Internet-Draft, each 15 author represents that any applicable patent or other IPR claims of 16 which he or she is aware have been or will be disclosed, and any of 17 which he or she become aware will be disclosed, in accordance with 18 RFC 3668. 20 Internet-Drafts are working documents of the Internet Engineering 21 Task Force (IETF), its areas, and its working groups. Note that 22 other groups may also distribute working documents as Internet- 23 Drafts. 
25 Internet-Drafts are draft documents valid for a maximum of six 26 months and may be updated, replaced, or obsoleted by other documents 27 at any time. It is inappropriate to use Internet-Drafts as 28 reference material or to cite them other than as "work in progress." 30 The list of current Internet-Drafts can be accessed at 31 http://www.ietf.org/ietf/1id-abstracts.txt. 33 The list of Internet-Draft Shadow Directories can be accessed at 34 http://www.ietf.org/shadow.html. 36 This Internet-Draft will expire on 7 September 2005. 38 Copyright Notice 40 Copyright (C) The Internet Society (2004). All Rights Reserved. 42 Abstract 44 The Datagram Congestion Control Protocol (DCCP) is a transport 45 protocol that provides bidirectional unicast connections of 46 congestion-controlled unreliable datagrams. DCCP is suitable for 47 applications that transfer fairly large amounts of data, but can 48 benefit from control over the tradeoff between timeliness and 49 reliability. 51 TO BE DELETED BY THE RFC EDITOR UPON PUBLICATION: 53 Changes since draft-ietf-dccp-spec-08.txt: 55 * Added minimum Sequence Window. 57 * Init Cookie implementation sketch. 59 * Include reasoning for ignoring options on DCCP-Data. 61 * More Aggression Penalty explanation. 63 * More explanation on Ack Vectors that report information on packets 64 that haven't been sent. 66 Changes since draft-ietf-dccp-spec-07.txt: 68 * Many changes, not listed here, for WGLC. 70 * The more stringent Sequence Number checks on DCCP-Sync and DCCP- 71 SyncAck packets become SHOULD, not MAY. 73 Changes since draft-ietf-dccp-spec-06.txt: 75 * Change extended sequence numbers. Now 48-bit sequence numbers are 76 MANDATORY, and all packet types except Data, Ack, and DataAck always 77 use 48-bit sequence numbers. This change improves DCCP's robustness 78 against blind attacks. 80 * Removed empty Change options. 
82 * Allow preference list changes during feature negotiations (this 83 seems easier to implement than the alternative). This required a 84 new feature negotiation state, UNSTABLE. 86 * Add Minimum Checksum Coverage feature. 88 * Add Reset Congestion State option. 90 * Simplify the implementation of CCID-specific option processing: no 91 need to check whether the CCID feature is being negotiated. 93 * Many more minor changes. 95 Changes since draft-ietf-dccp-spec-05.txt: 97 * Organization overhaul. 99 * Add pseudocode for event processing. 101 * Remove # NDP; replace with Ack Count. 103 * Remove Identification, Challenge, ID Regime, and Connection Nonce. 105 * Data Checksum (formerly Payload Checksum) uses a 32-bit CRC. 107 * Switch location of non-negotiable features to clarify 108 presentation; now the feature location controls its value. 110 * Rename "value type" to "reconciliation rule". 112 * Rename "Reset Reason" to "Reset Code". 114 * Mobility ID becomes 128 bits long. 116 * Add probabilities to Mobility ID discussion. 118 * Add SyncAck. 120 Table of Contents 122 1. Introduction. . . . . . . . . . . . . . . . . . . . . . . . . 10 123 2. Design Rationale. . . . . . . . . . . . . . . . . . . . . . . 11 124 3. Conventions and Terminology . . . . . . . . . . . . . . . . . 12 125 3.1. Numbers and Fields . . . . . . . . . . . . . . . . . . . 12 126 3.2. Parts of a Connection. . . . . . . . . . . . . . . . . . 13 127 3.3. Features . . . . . . . . . . . . . . . . . . . . . . . . 13 128 3.4. Round-Trip Times . . . . . . . . . . . . . . . . . . . . 14 129 3.5. Security Limitation. . . . . . . . . . . . . . . . . . . 14 130 3.6. Robustness Principle . . . . . . . . . . . . . . . . . . 14 131 4. Overview. . . . . . . . . . . . . . . . . . . . . . . . . . . 14 132 4.1. Packet Types . . . . . . . . . . . . . . . . . . . . . . 15 133 4.2. Sequence Numbers . . . . . . . . . . . . . . . . . . . . 16 134 4.3. States . . . . . . . . . . . . . . . . . . . . . . . . . 
17 135 4.4. Congestion Control . . . . . . . . . . . . . . . . . . . 18 136 4.5. Features . . . . . . . . . . . . . . . . . . . . . . . . 19 137 4.6. Differences From TCP . . . . . . . . . . . . . . . . . . 20 138 4.7. Example Connection . . . . . . . . . . . . . . . . . . . 21 139 5. Packet Formats. . . . . . . . . . . . . . . . . . . . . . . . 23 140 5.1. Generic Header . . . . . . . . . . . . . . . . . . . . . 23 141 5.2. DCCP-Request Packets . . . . . . . . . . . . . . . . . . 27 142 5.3. DCCP-Response Packets. . . . . . . . . . . . . . . . . . 28 143 5.4. DCCP-Data, DCCP-Ack, and DCCP-DataAck Packets. . . . . . 28 144 5.5. DCCP-CloseReq and DCCP-Close Packets . . . . . . . . . . 30 145 5.6. DCCP-Reset Packets . . . . . . . . . . . . . . . . . . . 30 146 5.7. DCCP-Sync and DCCP-SyncAck Packets . . . . . . . . . . . 33 147 5.8. Options. . . . . . . . . . . . . . . . . . . . . . . . . 34 148 5.8.1. Padding Option. . . . . . . . . . . . . . . . . . . 36 149 5.8.2. Mandatory Option. . . . . . . . . . . . . . . . . . 36 150 6. Feature Negotiation . . . . . . . . . . . . . . . . . . . . . 37 151 6.1. Change Options . . . . . . . . . . . . . . . . . . . . . 37 152 6.2. Confirm Options. . . . . . . . . . . . . . . . . . . . . 38 153 6.3. Reconciliation Rules . . . . . . . . . . . . . . . . . . 38 154 6.3.1. Server-Priority . . . . . . . . . . . . . . . . . . 39 155 6.3.2. Non-Negotiable. . . . . . . . . . . . . . . . . . . 39 156 6.4. Feature Numbers. . . . . . . . . . . . . . . . . . . . . 39 157 6.5. Examples . . . . . . . . . . . . . . . . . . . . . . . . 40 158 6.6. Option Exchange. . . . . . . . . . . . . . . . . . . . . 42 159 6.6.1. Normal Exchange . . . . . . . . . . . . . . . . . . 42 160 6.6.2. Processing Received Options . . . . . . . . . . . . 43 161 6.6.3. Loss and Retransmission . . . . . . . . . . . . . . 45 162 6.6.4. Reordering. . . . . . . . . . . . . . . . . . . . . 46 163 6.6.5. Preference Changes. . . . . . . . . . . . . . . . . 47 164 6.6.6. 
Simultaneous Negotiation. . . . . . . . . . . . . . 47 165 6.6.7. Unknown Features. . . . . . . . . . . . . . . . . . 47 166 6.6.8. Invalid Options . . . . . . . . . . . . . . . . . . 48 167 6.6.9. Mandatory Feature Negotiation . . . . . . . . . . . 48 169 7. Sequence Numbers. . . . . . . . . . . . . . . . . . . . . . . 49 170 7.1. Variables. . . . . . . . . . . . . . . . . . . . . . . . 49 171 7.2. Initial Sequence Numbers . . . . . . . . . . . . . . . . 50 172 7.3. Quiet Time . . . . . . . . . . . . . . . . . . . . . . . 51 173 7.4. Acknowledgement Numbers. . . . . . . . . . . . . . . . . 51 174 7.5. Validity and Synchronization . . . . . . . . . . . . . . 52 175 7.5.1. Sequence and Acknowledgement Number 176 Windows. . . . . . . . . . . . . . . . . . . . . . . . . . 52 177 7.5.2. Sequence Window Feature . . . . . . . . . . . . . . 53 178 7.5.3. Sequence-Validity Rules . . . . . . . . . . . . . . 54 179 7.5.4. Handling Sequence-Invalid Packets . . . . . . . . . 56 180 7.5.5. Sequence Number Attacks . . . . . . . . . . . . . . 57 181 7.5.6. Examples. . . . . . . . . . . . . . . . . . . . . . 58 182 7.6. Short Sequence Numbers . . . . . . . . . . . . . . . . . 59 183 7.6.1. Allow Short Sequence Numbers Feature. . . . . . . . 60 184 7.6.2. When to Avoid Short Sequence Numbers. . . . . . . . 60 185 7.7. NDP Count and Detecting Application Loss . . . . . . . . 61 186 7.7.1. Usage Notes . . . . . . . . . . . . . . . . . . . . 62 187 7.7.2. Send NDP Count Feature. . . . . . . . . . . . . . . 62 188 8. Event Processing. . . . . . . . . . . . . . . . . . . . . . . 62 189 8.1. Connection Establishment . . . . . . . . . . . . . . . . 63 190 8.1.1. Client Request. . . . . . . . . . . . . . . . . . . 63 191 8.1.2. Service Codes . . . . . . . . . . . . . . . . . . . 64 192 8.1.3. Server Response . . . . . . . . . . . . . . . . . . 65 193 8.1.4. Init Cookie Option. . . . . . . . . . . . . . . . . 66 194 8.1.5. Handshake Completion. . . . . . . . . . . . . . . . 67 195 8.2. 
Data Transfer. . . . . . . . . . . . . . . . . . . . . . 67 196 8.3. Termination. . . . . . . . . . . . . . . . . . . . . . . 68 197 8.3.1. Abnormal Termination. . . . . . . . . . . . . . . . 70 198 8.4. DCCP State Diagram . . . . . . . . . . . . . . . . . . . 70 199 8.5. Pseudocode . . . . . . . . . . . . . . . . . . . . . . . 71 200 9. Checksums . . . . . . . . . . . . . . . . . . . . . . . . . . 75 201 9.1. Header Checksum Field. . . . . . . . . . . . . . . . . . 76 202 9.2. Header Checksum Coverage Field . . . . . . . . . . . . . 77 203 9.2.1. Minimum Checksum Coverage Feature . . . . . . . . . 78 204 9.3. Data Checksum Option . . . . . . . . . . . . . . . . . . 78 205 9.3.1. Check Data Checksum Feature . . . . . . . . . . . . 79 206 9.3.2. Usage Notes . . . . . . . . . . . . . . . . . . . . 79 207 10. Congestion Control . . . . . . . . . . . . . . . . . . . . . 80 208 10.1. TCP-like Congestion Control . . . . . . . . . . . . . . 81 209 10.2. TFRC Congestion Control . . . . . . . . . . . . . . . . 81 210 10.3. CCID-Specific Options, Features, and Reset 211 Codes . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81 212 10.4. CCID Profile Requirements . . . . . . . . . . . . . . . 84 213 10.5. Congestion State. . . . . . . . . . . . . . . . . . . . 84 214 11. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . 85 215 11.1. Acks of Acks and Unidirectional Connections . . . . . . 86 216 11.2. Ack Piggybacking. . . . . . . . . . . . . . . . . . . . 87 217 11.3. Ack Ratio Feature . . . . . . . . . . . . . . . . . . . 87 218 11.4. Ack Vector Options. . . . . . . . . . . . . . . . . . . 89 219 11.4.1. Ack Vector Consistency . . . . . . . . . . . . . . 91 220 11.4.2. Ack Vector Coverage. . . . . . . . . . . . . . . . 93 221 11.5. Send Ack Vector Feature . . . . . . . . . . . . . . . . 94 222 11.6. Slow Receiver Option. . . . . . . . . . . . . . . . . . 94 223 11.7. Data Dropped Option . . . . . . . . . . . . . . . . . . 95 224 11.7.1. 
Data Dropped and Normal Congestion 225 Response . . . . . . . . . . . . . . . . . . . . . . . . . 98 226 11.7.2. Particular Drop Codes. . . . . . . . . . . . . . . 98 227 12. Explicit Congestion Notification . . . . . . . . . . . . . . 99 228 12.1. ECN Incapable Feature . . . . . . . . . . . . . . . . . 100 229 12.2. ECN Nonces. . . . . . . . . . . . . . . . . . . . . . . 100 230 12.3. Aggression Penalties. . . . . . . . . . . . . . . . . . 101 231 13. Timing Options . . . . . . . . . . . . . . . . . . . . . . . 102 232 13.1. Timestamp Option. . . . . . . . . . . . . . . . . . . . 102 233 13.2. Elapsed Time Option . . . . . . . . . . . . . . . . . . 103 234 13.3. Timestamp Echo Option . . . . . . . . . . . . . . . . . 104 235 14. Maximum Packet Size. . . . . . . . . . . . . . . . . . . . . 105 236 14.1. Measuring PMTU. . . . . . . . . . . . . . . . . . . . . 105 237 14.2. Sender Behavior . . . . . . . . . . . . . . . . . . . . 107 238 15. Forward Compatibility. . . . . . . . . . . . . . . . . . . . 108 239 16. Middlebox Considerations . . . . . . . . . . . . . . . . . . 108 240 17. Relations to Other Specifications. . . . . . . . . . . . . . 110 241 17.1. RTP . . . . . . . . . . . . . . . . . . . . . . . . . . 110 242 17.2. Congestion Manager and Multiplexing . . . . . . . . . . 111 243 18. Security Considerations. . . . . . . . . . . . . . . . . . . 111 244 18.1. Security Considerations for Partial 245 Checksums . . . . . . . . . . . . . . . . . . . . . . . . . . 112 246 19. IANA Considerations. . . . . . . . . . . . . . . . . . . . . 113 247 19.1. Packet Types. . . . . . . . . . . . . . . . . . . . . . 113 248 19.2. Reset Codes . . . . . . . . . . . . . . . . . . . . . . 113 249 19.3. Option Types. . . . . . . . . . . . . . . . . . . . . . 114 250 19.4. Feature Numbers . . . . . . . . . . . . . . . . . . . . 114 251 19.5. Congestion Control Identifiers. . . . . . . . . . . . . 114 252 19.6. Ack Vector States . . . . . . . . . . . . . . . . . . . 115 253 19.7. 
Drop Codes. . . . . . . . . . . . . . . . . . . . . . . 115 254 19.8. Service Codes . . . . . . . . . . . . . . . . . . . . . 115 255 20. Thanks . . . . . . . . . . . . . . . . . . . . . . . . . . . 116 256 A. Appendix: Ack Vector Implementation Notes . . . . . . . . . . 116 257 A.1. Packet Arrival . . . . . . . . . . . . . . . . . . . . . 118 258 A.1.1. New Packets . . . . . . . . . . . . . . . . . . . . 118 259 A.1.2. Old Packets . . . . . . . . . . . . . . . . . . . . 119 260 A.2. Sending Acknowledgements . . . . . . . . . . . . . . . . 120 261 A.3. Clearing State . . . . . . . . . . . . . . . . . . . . . 121 262 A.4. Processing Acknowledgements. . . . . . . . . . . . . . . 122 263 B. Appendix: Partial Checksumming Design Motivation. . . . . . . 123 264 Normative References . . . . . . . . . . . . . . . . . . . . . . 124 265 Informative References . . . . . . . . . . . . . . . . . . . . . 125 266 Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 127 267 Full Copyright Statement . . . . . . . . . . . . . . . . . . . . 127 268 Intellectual Property. . . . . . . . . . . . . . . . . . . . . . 128 269 List of Tables 271 Table 1: DCCP Packet Types . . . . . . . . . . . . . . . . . . . 25 272 Table 2: DCCP Reset Codes. . . . . . . . . . . . . . . . . . . . 33 273 Table 3: DCCP Options. . . . . . . . . . . . . . . . . . . . . . 35 274 Table 4: DCCP Feature Numbers. . . . . . . . . . . . . . . . . . 39 275 Table 5: DCCP Congestion Control Identifiers . . . . . . . . . . 80 276 Table 6: DCCP Ack Vector States. . . . . . . . . . . . . . . . . 90 277 Table 7: DCCP Drop Codes . . . . . . . . . . . . . . . . . . . . 96 279 1. Introduction 281 The Datagram Congestion Control Protocol (DCCP) is a transport 282 protocol that implements bidirectional, unicast connections of 283 congestion-controlled, unreliable datagrams. Specifically, DCCP 284 provides: 286 o Unreliable flows of datagrams, with acknowledgements. 
288 o Reliable handshakes for connection setup and teardown. 290 o Reliable negotiation of options, including negotiation of a 291 suitable congestion control mechanism. 293 o Mechanisms allowing servers to avoid holding state for 294 unacknowledged connection attempts and already-finished 295 connections. 297 o Congestion control incorporating Explicit Congestion Notification 298 (ECN) [RFC 3168] and the ECN Nonce [RFC 3540]. 300 o Acknowledgement mechanisms communicating packet loss and ECN 301 information. Acks are transmitted as reliably as the relevant 302 congestion control mechanism requires, possibly completely 303 reliably. 305 o Optional mechanisms that tell the sending application, with high 306 reliability, which data packets reached the receiver, and whether 307 those packets were ECN marked, corrupted, or dropped in the 308 receive buffer. 310 o Path Maximum Transmission Unit (PMTU) discovery [RFC 1191]. 312 o A choice of modular congestion control mechanisms. Two 313 mechanisms are currently specified, TCP-like Congestion Control 314 [CCID 2 PROFILE] and TFRC (TCP-Friendly Rate Control) Congestion 315 Control [CCID 3 PROFILE], but DCCP is easily extensible to 316 further forms of unicast congestion control. 318 DCCP is intended for applications such as streaming media that can 319 benefit from control over the tradeoffs between delay and reliable 320 in-order delivery. TCP is not well-suited for these applications, 321 since reliable in-order delivery and congestion control can cause 322 arbitrarily long delays. UDP avoids long delays, but UDP 323 applications that implement congestion control must do so on their 324 own. DCCP provides built-in congestion control, including ECN 325 support, for unreliable datagram flows, avoiding the arbitrary 326 delays associated with TCP. It also implements reliable connection 327 setup, teardown, and feature negotiation. 329 2. 
Design Rationale 331 One DCCP design goal was to give most streaming UDP applications 332 little reason not to switch to DCCP, once it is deployed. To 333 facilitate this, DCCP was designed to have as little overhead as 334 possible, both in terms of the packet header size and in terms of 335 the state and CPU overhead required at end hosts. Only the minimal 336 necessary functionality was included in DCCP, leaving other 337 functionality, such as forward error correction (FEC), semi- 338 reliability, and multiple streams, to be layered on top of DCCP as 339 desired. 341 Different forms of conformant congestion control are appropriate for 342 different applications. For example, on-line games might want to 343 make quick use of any available bandwidth, while streaming media 344 might trade off this responsiveness for a steadier, less bursty 345 rate. (Sudden rate changes can cause unacceptable UI glitches, such 346 as audible pauses or clicks in the playout stream.) DCCP thus 347 allows applications to choose from a set of congestion control 348 mechanisms. One alternative, TCP-like Congestion Control, halves 349 the congestion window in response to a packet drop or mark, as in 350 TCP. Applications using this congestion control mechanism will 351 respond quickly to changes in available bandwidth, but must tolerate 352 the abrupt changes in congestion window typical of TCP. A second 353 alternative, TCP-Friendly Rate Control (TFRC) [RFC 3448], a form of 354 equation-based congestion control, minimizes abrupt changes in the 355 sending rate while maintaining longer-term fairness with TCP. Other 356 alternatives can be added as future congestion control mechanisms 357 are standardized. 359 DCCP also lets unreliable traffic safely use ECN. A UDP kernel API 360 might not allow applications to set UDP packets as ECN-capable, 361 since the API could not guarantee the application would properly 362 detect or respond to congestion. 
DCCP kernel APIs will have no such 363 issues, since DCCP implements congestion control itself. 365 We chose not to require the use of the Congestion Manager [RFC 366 3124], which allows multiple concurrent streams between the same 367 sender and receiver to share congestion control. The current 368 Congestion Manager can only be used by applications that have their 369 own end-to-end feedback about packet losses, but this is not the 370 case for many of the applications currently using UDP. In addition, 371 the current Congestion Manager does not easily support multiple 372 congestion control mechanisms, or lend itself to the use of forms of 373 TFRC where the state about past packet drops or marks is maintained 374 at the receiver rather than at the sender. DCCP should be able to 375 make use of CM where desired by the application, but we do not see 376 any benefit in making the deployment of DCCP contingent on the 377 deployment of CM itself. 379 We intend for DCCP's protocol mechanisms, which are described in 380 this document, to suit any application desiring unicast congestion- 381 controlled streams of unreliable datagrams. The congestion control 382 mechanisms currently approved for use with DCCP, which are described 383 in separate Congestion Control ID Profiles [CCID 2 PROFILE, CCID 3 384 PROFILE], may, however, cause problems for some applications, 385 including high-bandwidth interactive video. These applications 386 should be able to use DCCP once suitable Congestion Control ID 387 Profiles are standardized. 389 3. Conventions and Terminology 391 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 392 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 393 document are to be interpreted as described in RFC 2119. 395 3.1. 
Numbers and Fields 397 All multi-byte numerical quantities in DCCP, such as port numbers, 398 Sequence Numbers, and arguments to options, are transmitted in 399 network byte order (most significant byte first). 401 We occasionally refer to the "left" and "right" sides of a bit 402 field. "Left" means towards the most significant bit, and "right" 403 means towards the least significant bit. 405 Random numbers in DCCP are used for their security properties, and 406 SHOULD be chosen according to the guidelines in RFC 1750. 408 All operations on DCCP sequence numbers, and comparisons such as 409 "greater" and "greatest", use circular arithmetic modulo 2**48. 410 This form of arithmetic preserves the relationships between sequence 411 numbers as they roll over from 2**48 - 1 to 0. Note that the common 412 technique for implementing circular comparison using two's- 413 complement arithmetic, whereby A < B using circular arithmetic if 414 and only if (A - B) < 0 using conventional two's-complement 415 arithmetic, may be used for DCCP sequence numbers, provided they are 416 stored in the most significant 48 bits of 64-bit integers. 418 Reserved bitfields in DCCP packet headers MUST be set to zero by 419 senders, and MUST be ignored by receivers, unless otherwise 420 specified. This is to allow for future protocol extensions. In 421 particular, DCCP processors MUST NOT reset a DCCP connection simply 422 because a Reserved field has non-zero value [RFC 3360]. 424 3.2. Parts of a Connection 426 Each DCCP connection runs between two hosts, which we often name 427 DCCP A and DCCP B. Each connection is actively initiated by one of 428 the hosts, which we call the client; the other, initially passive 429 host is called the server. The term "DCCP endpoint" is used to 430 refer to either of the two hosts explicitly named by the connection 431 (the client and the server). 
The term "DCCP processor" refers more
432    generally to any host that might need to process a DCCP header; this
433    includes the endpoints and any middleboxes on the path, such as
434    firewalls and network address translators.

436    DCCP connections are bidirectional: data may pass from either
437    endpoint to the other.  This means that data and acknowledgements
438    may be flowing in both directions simultaneously.  Logically,
439    however, a DCCP connection consists of two separate unidirectional
440    connections, called half-connections.  Each half-connection consists
441    of the application data sent by one endpoint and the corresponding
442    acknowledgements sent by the other endpoint.  We can illustrate this
443    as follows:

445       +--------+  A-to-B half-connection:         +--------+
446       |        |    -->  application data  -->    |        |
447       |        |    <--  acknowledgements  <--    |        |
448       | DCCP A |                                  | DCCP B |
449       |        |  B-to-A half-connection:         |        |
450       |        |    <--  application data  <--    |        |
451       +--------+    -->  acknowledgements  -->    +--------+

453    Although they are logically distinct, in practice the half-
454    connections overlap; a DCCP-DataAck packet, for example, contains
455    application data relevant to one half-connection and acknowledgement
456    information relevant to the other.

458    In the context of a single half-connection, the terms "HC-Sender"
459    and "HC-Receiver" denote the endpoints sending application data and
460    acknowledgements, respectively.  For example, DCCP A is the HC-
461    Sender and DCCP B is the HC-Receiver in the A-to-B half-connection.

463 3.3. Features

465    A DCCP feature is a connection attribute on whose value the two
466    endpoints agree.  Many properties of a DCCP connection are
467    controlled by features, including the congestion control mechanisms
468    in use on the two half-connections.  The endpoints achieve agreement
469    through the exchange of feature negotiation options in DCCP headers.

471    DCCP features are identified by a feature number and an endpoint.
472 The notation "F/X" represents the feature with feature number F 473 located at DCCP endpoint X. Each valid feature number thus 474 corresponds to two features, which are negotiated separately and 475 need not have the same value. The two endpoints know, and agree on, 476 the value of every valid feature. DCCP A is the "feature location" 477 for all features F/A, and the "feature remote" for all features F/B. 479 3.4. Round-Trip Times 481 DCCP round-trip time measurements are performed by congestion 482 control mechanisms; different mechanisms may measure round-trip time 483 in different ways, or not measure it at all. However, the main DCCP 484 protocol does use round-trip times occasionally, such as in the 485 initial values for certain timers. Each DCCP implementation thus 486 defines a default round-trip time for use when no estimate is 487 available; this parameter should default to not less than 488 0.2 seconds, a reasonably conservative round-trip time for Internet 489 TCP connections. Protocol behavior specified in terms of "round- 490 trip time" values actually refers to "a current round-trip time 491 estimate taken by some CCID, or, if no estimate is available, the 492 default round-trip time parameter". 494 The maximum segment lifetime, or MSL, is the maximum length of time 495 a packet can survive in the network. The DCCP MSL should equal that 496 of TCP, which is normally two minutes. 498 3.5. Security Limitation 500 DCCP provides no protection against attackers who can snoop on a 501 connection in progress, or who can guess valid sequence numbers in 502 other ways. Applications desiring stronger security should use 503 IPsec [RFC 2401]; depending on the level of security required, 504 application-level cryptography may also suffice. These issues are 505 discussed further in Sections 18 and 7.5.5. 507 3.6. 
Robustness Principle 509 DCCP implementations will follow TCP's "general principle of 510 robustness": "be conservative in what you do, be liberal in what you 511 accept from others" [RFC 793]. 513 4. Overview 515 DCCP's high-level connection dynamics echo those of TCP. 516 Connections progress through three phases: initiation, including a 517 three-way handshake; data transfer; and termination. Data can flow 518 both ways over the connection. An acknowledgement framework lets 519 senders discover how much data has been lost, and thus avoid 520 unfairly congesting the network. Of course, DCCP provides 521 unreliable datagram semantics, not TCP's reliable bytestream 522 semantics. The application must package its data into explicit 523 frames, and must retransmit its own data as necessary. It may be 524 useful to think of DCCP as TCP minus bytestream semantics and 525 reliability, or as UDP plus congestion control, handshakes, and 526 acknowledgements. 528 4.1. Packet Types 530 Ten packet types implement DCCP's protocol functions. For example, 531 every new connection attempt begins with a DCCP-Request packet sent 532 by the client. A DCCP-Request packet thus resembles a TCP SYN; but 533 DCCP-Request is a packet type, not a flag, so there's no way to send 534 an unexpected combination such as TCP's SYN+FIN+ACK+RST. 536 Eight packet types occur during the progress of a typical 537 connection, shown here. Note the three-way handshakes during 538 initiation and termination. 540 Client Server 541 ------ ------ 542 (1) Initiation 543 DCCP-Request --> 544 <-- DCCP-Response 545 DCCP-Ack --> 546 (2) Data transfer 547 DCCP-Data, DCCP-Ack, DCCP-DataAck --> 548 <-- DCCP-Data, DCCP-Ack, DCCP-DataAck 549 (3) Termination 550 <-- DCCP-CloseReq 551 DCCP-Close --> 552 <-- DCCP-Reset 554 The two remaining packet types are used to resynchronize after 555 bursts of loss. 557 Every DCCP packet starts with a 12-byte generic header. 
Particular 558 packet types include additional fixed-size header data; for example, 559 DCCP-Acks include an Acknowledgement Number. DCCP options and any 560 application data follow the fixed-size header. 562 The packet types are as follows: 564 DCCP-Request 565 Sent by the client to initiate a connection (the first part of 566 the three-way initiation handshake). 568 DCCP-Response 569 Sent by the server in response to a DCCP-Request (the second 570 part of the three-way initiation handshake). 572 DCCP-Data 573 Used to transmit application data. 575 DCCP-Ack 576 Used to transmit pure acknowledgements. 578 DCCP-DataAck 579 Used to transmit application data with piggybacked 580 acknowledgements. 582 DCCP-CloseReq 583 Sent by the server to request that the client close the 584 connection. 586 DCCP-Close 587 Used by the client or the server to close the connection; 588 elicits a DCCP-Reset in response. 590 DCCP-Reset 591 Used to terminate the connection, either normally or abnormally. 593 DCCP-Sync, DCCP-SyncAck 594 Used to resynchronize sequence numbers after large bursts of 595 loss. 597 4.2. Sequence Numbers 599 Each DCCP packet carries a sequence number, so that losses can be 600 detected and reported. Unlike TCP sequence numbers, which are byte- 601 based, DCCP sequence numbers increment by one per packet. For 602 example: 604 DCCP A DCCP B 605 ------ ------ 606 DCCP-Data(seqno 1) --> 607 DCCP-Data(seqno 2) --> 608 <-- DCCP-Ack(seqno 10, ackno 2) 609 DCCP-DataAck(seqno 3, ackno 10) --> 610 <-- DCCP-Data(seqno 11) 612 Every DCCP packet increments the sequence number, whether or not it 613 contains application data. DCCP-Ack pure acknowledgements increment 614 the sequence number, for instance: DCCP B's second packet above uses 615 sequence number 11, since sequence number 10 was used for an 616 acknowledgement. This lets endpoints detect all packet loss, 617 including acknowledgement loss. 
It also means that endpoints can 618 get out of sync after long bursts of loss; the DCCP-Sync and DCCP- 619 SyncAck packet types are used to recover (Section 7.5). 621 Since DCCP provides unreliable semantics, there are no 622 retransmissions, and it doesn't make sense to have a TCP-style 623 cumulative acknowledgement field. DCCP's Acknowledgement Number 624 field equals the greatest sequence number received, rather than the 625 smallest sequence number not received. Separate options indicate 626 any intermediate sequence numbers that weren't received. 628 4.3. States 630 DCCP endpoints progress through different states during the course 631 of a connection, corresponding roughly to the three phases of 632 initiation, data transfer, and termination. The figure below shows 633 the typical progress through these states for a client and server. 635 Client Server 636 ------ ------ 637 (0) No connection 638 CLOSED LISTEN 640 (1) Initiation 641 REQUEST DCCP-Request --> 642 <-- DCCP-Response RESPOND 643 PARTOPEN DCCP-Ack or DCCP-DataAck --> 645 (2) Data transfer 646 OPEN <-- DCCP-Data, Ack, DataAck --> OPEN 648 (3) Termination 649 <-- DCCP-CloseReq CLOSEREQ 650 CLOSING DCCP-Close --> 651 <-- DCCP-Reset CLOSED 652 TIMEWAIT 653 CLOSED 655 The nine possible states are as follows. They are listed in 656 increasing order, so that "state >= CLOSEREQ" means the same as 657 "state = CLOSEREQ or state = CLOSING or state = TIMEWAIT". Section 658 8 describes the states in more detail. 660 CLOSED 661 Represents nonexistent connections. 663 LISTEN 664 Represents server sockets in the passive listening state. 665 LISTEN and CLOSED are not associated with any particular DCCP 666 connection. 668 REQUEST 669 A client socket enters this state, from CLOSED, after sending a 670 DCCP-Request packet to try to initiate a connection. 672 RESPOND 673 A server socket enters this state, from LISTEN, after receiving 674 a DCCP-Request from a client. 
676 PARTOPEN 677 A client socket enters this state, from REQUEST, after receiving 678 a DCCP-Response from the server. This state represents the 679 third phase of the three-way handshake. The client may send 680 application data in this state, but it MUST include an 681 Acknowledgement Number on all of its packets. 683 OPEN 684 The central, data transfer portion of a DCCP connection. Client 685 and server sockets enter this state from PARTOPEN and RESPOND, 686 respectively. Sometimes we speak of SERVER-OPEN and CLIENT-OPEN 687 states, corresponding to the server's OPEN state and the 688 client's OPEN state. 690 CLOSEREQ 691 A server socket enters this state, from SERVER-OPEN, to signal 692 that the connection is over, but the client must hold TIMEWAIT 693 state. 695 CLOSING 696 Server and client sockets can both enter this state to close the 697 connection. 699 TIMEWAIT 700 A server or client socket remains in this state for 2MSL (4 701 minutes) after the connection has been torn down, to prevent 702 mistakes due to the delivery of old packets. Only one of the 703 endpoints need enter TIMEWAIT state (the other can enter CLOSED 704 state immediately), and a server can request its client to hold 705 TIMEWAIT state using the DCCP-CloseReq packet type. 707 4.4. Congestion Control 709 DCCP connections are congestion controlled, but unlike in TCP, DCCP 710 applications have a choice of congestion control mechanism. In 711 fact, the two half-connections can be governed by different 712 mechanisms. Mechanisms are denoted by one-byte congestion control 713 identifiers, or CCIDs. The endpoints negotiate their CCIDs during 714 connection initiation. Each CCID describes how the HC-Sender limits 715 data packet rates, how the HC-Receiver sends congestion feedback via 716 acknowledgements, and so forth. CCIDs 2 and 3 are currently 717 defined; CCIDs 0, 1, and 4-255 are reserved. Other CCIDs may be 718 defined in the future. 
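As a rough illustration of the CCID numbering just described, the following Python sketch classifies a one-byte CCID value and records a separate CCID for each half-connection. The names `describe_ccid` and `ConnectionCcids` are hypothetical and are not part of this specification; only the value ranges are taken from the text.

```python
from dataclasses import dataclass

# CCIDs currently defined by the text; every other one-byte value is reserved.
DEFINED_CCIDS = frozenset({2, 3})


def describe_ccid(ccid: int) -> str:
    """Classify a CCID value as 'defined' or 'reserved' (hypothetical helper)."""
    if not 0 <= ccid <= 255:
        raise ValueError("a CCID is a one-byte identifier (0-255)")
    return "defined" if ccid in DEFINED_CCIDS else "reserved"


@dataclass
class ConnectionCcids:
    """CCIDs in use on one connection; the two half-connections may differ."""
    a_to_b: int  # governs the A-to-B half-connection
    b_to_a: int  # governs the B-to-A half-connection

    def all_defined(self) -> bool:
        # True only when both half-connections use a currently defined CCID.
        return all(describe_ccid(c) == "defined" for c in (self.a_to_b, self.b_to_a))
```

For example, `ConnectionCcids(a_to_b=2, b_to_a=3)` models a connection whose two half-connections are governed by different mechanisms.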
   CCID 2 provides TCP-like Congestion Control, which is similar to
   that of TCP. The sender maintains a congestion window and sends
   packets until that window is full. Packets are acknowledged by the
   receiver. Dropped packets and ECN [RFC 3168] indicate congestion;
   the response to congestion is to halve the congestion window.
   Acknowledgements in CCID 2 contain the sequence numbers of all
   received packets within some window, similar to a selective
   acknowledgement (SACK) [RFC 2018].

   CCID 3 provides TFRC Congestion Control, an equation-based form of
   congestion control intended to respond to congestion more smoothly
   than CCID 2. The sender maintains a transmit rate, which it updates
   using the receiver's estimate of the packet loss and mark rate.
   Although CCID 3 behaves somewhat differently from TCP in the short
   term, it is designed to operate fairly with TCP over the long term.

   Section 10 describes DCCP's CCIDs in more detail. The behaviors of
   CCIDs 2 and 3 are fully defined in separate profile documents [CCID
   2 PROFILE, CCID 3 PROFILE].

4.5. Features

   DCCP endpoints use Change and Confirm options to negotiate and agree
   on feature values. Feature negotiation will almost always happen on
   the connection initiation handshake, but it can begin at any time.

   There are four feature negotiation options in all: Change L,
   Confirm L, Change R, and Confirm R. The "L" options are sent by the
   feature location, and the "R" options are sent by the feature
   remote. A Change R option says to the feature location, "change
   this feature value as follows". The feature location responds with
   Confirm L, meaning "I've changed it". Some features allow Change R
   options to contain multiple values, sorted in preference order.
   For example:

      Client                                        Server
      ------                                        ------
      Change R(CCID, 2) -->
                                     <-- Confirm L(CCID, 2)
                * agreement that CCID/Server = 2 *

      Change R(CCID, 3 4) -->
                                <-- Confirm L(CCID, 4, 4 2)
                * agreement that CCID/Server = 4 *

   Both exchanges negotiate the CCID/Server feature's value, which is
   the CCID in use on the server-to-client half-connection. In the
   second exchange, the client requests that the server use either
   CCID 3 or CCID 4, with 3 preferred; the server chooses 4 and
   supplies its preference list, "4 2".

   The Change L and Confirm R options are used for feature negotiations
   initiated by the feature location. In the following example, the
   server requests that CCID/Server be set to 3 or 2, with 3 preferred,
   and the client agrees.

      Client                                        Server
      ------                                        ------
                                    <-- Change L(CCID, 3 2)
      Confirm R(CCID, 3, 3 2) -->
                * agreement that CCID/Server = 3 *

   Section 6 describes the feature negotiation options further,
   including the retransmission strategies that make negotiation
   reliable.

4.6. Differences From TCP

   Differences between DCCP and TCP apart from those discussed so far
   include:

   o  Copious space for options (up to 1008 bytes or the PMTU).

   o  Different acknowledgement formats. The CCID for a connection
      determines how much acknowledgement information needs to be
      transmitted. For example, in CCID 2 (TCP-like), this is about
      one ack per 2 packets, and each ack must declare exactly which
      packets were received; in CCID 3 (TFRC), it's about one ack per
      round-trip time, and acks must declare at minimum just the
      lengths of recent loss intervals.

   o  Denial-of-service (DoS) protection. Several mechanisms help
      limit the amount of state possibly-misbehaving clients can force
      DCCP servers to maintain. An Init Cookie option, analogous to
      TCP's SYN Cookies [SYNCOOKIES], avoids SYN-flood-like attacks.
805 Only one connection endpoint need hold TIMEWAIT state; the DCCP- 806 CloseReq packet, which may only be sent by the server, passes 807 that state to the client. Various rate limits let servers avoid 808 attacks that might force extensive computation or packet 809 generation. 811 o Distinguishing different kinds of loss. A Data Dropped option 812 (Section 11.7) lets an endpoint declare that a packet was dropped 813 because of corruption, because of receive buffer overflow, and so 814 on. This facilitates research into more appropriate rate-control 815 responses for these non-network-congestion losses (although 816 currently such losses will cause a congestion response). 818 o Acknowledgeability. In TCP, a packet may be acknowledged only 819 once the data is reliably queued for application delivery. This 820 does not make sense in DCCP, where an application might, for 821 example, request a drop-from-front receive buffer. A DCCP packet 822 may be acknowledged as soon as its header has been successfully 823 processed. Concretely, a packet becomes acknowledgeable at 824 Step 8 of Section 8.5's packet processing pseudocode. 825 Acknowledgeability does not guarantee data delivery, however: the 826 Data Dropped option may later report that the packet's 827 application data was discarded. 829 o No receive window. DCCP is a congestion control protocol, not a 830 flow control protocol. 832 o No simultaneous open. Every connection has one client and one 833 server. 835 o No half-closed states. DCCP has no states corresponding to TCP's 836 FINWAIT and CLOSEWAIT, where one half-connection is explicitly 837 closed while the other is still active. The Data Dropped 838 option's Drop Code 1, Application Not Listening (Section 11.7), 839 can achieve a similar effect, however. 841 4.7. Example Connection 843 The progress of a typical DCCP connection is as follows. (This 844 description is informative, not normative.) 845 Client Server 846 ------ ------ 847 0. 
[CLOSED] [LISTEN] 848 1. DCCP-Request --> 849 2. <-- DCCP-Response 850 3. DCCP-Ack --> 851 4. DCCP-Data, DCCP-Ack, DCCP-DataAck --> 852 <-- DCCP-Data, DCCP-Ack, DCCP-DataAck 853 5. <-- DCCP-CloseReq 854 6. DCCP-Close --> 855 7. <-- DCCP-Reset 856 8. [TIMEWAIT] 858 1. The client sends the server a DCCP-Request packet specifying the 859 client and server ports, the service being requested, and any 860 features being negotiated, including the CCID that the client 861 would like the server to use. The client may optionally 862 piggyback an application request on the DCCP-Request packet, 863 which the server may ignore. 865 2. The server sends the client a DCCP-Response packet indicating 866 that it is willing to communicate with the client. This 867 response indicates any features and options that the server 868 agrees to, begins other feature negotiations as desired, and 869 optionally includes an Init Cookie that wraps up all this 870 information and which must be returned by the client for the 871 connection to complete. 873 3. The client sends the server a DCCP-Ack packet that acknowledges 874 the DCCP-Response packet. This acknowledges the server's 875 initial sequence number and returns the Init Cookie if there was 876 one in the DCCP-Response. It may also continue feature 877 negotiation. The client may piggyback an application-level 878 request on its final ack, producing a DCCP-DataAck packet. 880 4. The server and client then exchange DCCP-Data packets, DCCP-Ack 881 packets acknowledging that data, and, optionally, DCCP-DataAck 882 packets containing data with piggybacked acknowledgements. If 883 the client has no data to send, then the server will send DCCP- 884 Data and DCCP-DataAck packets, while the client will send DCCP- 885 Acks exclusively. (However, the client may not send DCCP-Data 886 packets before receiving at least one non-DCCP-Response packet 887 from the server.) 889 5. The server sends a DCCP-CloseReq packet requesting a close. 891 6. 
The client sends a DCCP-Close packet acknowledging the close. 893 7. The server sends a DCCP-Reset packet with Reset Code 1, 894 "Closed", and clears its connection state. DCCP-Resets are part 895 of normal connection termination; see Section 5.6. 897 8. The client receives the DCCP-Reset packet and holds state for 898 two maximum segment lifetimes, or 2MSL, to allow any remaining 899 packets to clear the network. 901 An alternative connection closedown sequence is initiated by the 902 client: 904 5b. The client sends a DCCP-Close packet closing the connection. 906 6b. The server sends a DCCP-Reset packet with Reset Code 1, 907 "Closed", and clears its connection state. 909 7b. The client receives the DCCP-Reset packet and holds state for 910 2MSL to allow any remaining packets to clear the network. 912 5. Packet Formats 914 The DCCP header can be from 12 to 1020 bytes long. The initial 12 915 bytes of the header have the same semantics for all currently- 916 defined packet types. Following this comes any additional fixed- 917 length fields required by the packet type, and then a variable- 918 length list of options. The application data area follows the 919 header. In some packet types, this area contains data for the 920 application; in other packet types, its contents are ignored. 922 +---------------------------------------+ -. 923 | Generic Header | | 924 +---------------------------------------+ | 925 | Additional Fields (depending on type) | +- DCCP Header 926 +---------------------------------------+ | 927 | Options (optional) | | 928 +=======================================+ -' 929 | Application Data Area | 930 +---------------------------------------+ 932 5.1. Generic Header 934 The DCCP generic header takes different forms depending on the value 935 of X, the Extended Sequence Numbers bit. If X is one, the Sequence 936 Number field is 48 bits long and the generic header takes 16 bytes, 937 as follows. 
939 0 1 2 3 940 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 941 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 942 | Source Port | Dest Port | 943 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 944 | Data Offset | CCVal | CsCov | Checksum | 945 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 946 | | |X| | . 947 | Res | Type |=| Reserved | Sequence Number (high bits) . 948 | | |1| | . 949 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 950 . Sequence Number (low bits) | 951 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 953 If X is zero, only the low 24 bits of the Sequence Number are 954 transmitted, and the generic header is 12 bytes long. 956 0 1 2 3 957 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 958 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 959 | Source Port | Dest Port | 960 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 961 | Data Offset | CCVal | CsCov | Checksum | 962 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 963 | | |X| | 964 | Res | Type |=| Sequence Number (low bits) | 965 | | |0| | 966 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 968 The generic header fields are defined as follows. 970 Source and Destination Ports: 16 bits each 971 These fields identify the connection, similar to the 972 corresponding fields in TCP and UDP. The Source Port represents 973 the relevant port on the endpoint that sent this packet, the 974 Destination Port the relevant port on the other endpoint. When 975 initiating a connection, the client SHOULD choose its Source 976 Port randomly to reduce the likelihood of attack. 978 DCCP APIs should treat port numbers similarly to TCP and UDP 979 port numbers. For example, machines that distinguish between 980 "privileged" and "unprivileged" ports for TCP and UDP should do 981 the same for DCCP. 
   Data Offset: 8 bits
      The offset from the start of the packet's DCCP header to the
      start of its application data area, in 32-bit words. The
      receiver MUST ignore packets whose Data Offset is smaller than
      the minimum-sized header for the given Type, or larger than the
      DCCP packet itself.

   CCVal: 4 bits
      Used by the HC-Sender CCID. For example, the A-to-B CCID's
      sender, which is active at DCCP A, MAY send 4 bits of
      information per packet to its receiver by encoding that
      information in CCVal. The sender MUST set CCVal to zero unless
      its HC-Sender CCID specifies otherwise, and the receiver MUST
      ignore the CCVal field unless its HC-Receiver CCID specifies
      otherwise.

   Checksum Coverage (CsCov): 4 bits
      Checksum Coverage determines the parts of the packet that are
      covered by the Checksum field. This always includes the DCCP
      header and options, but some or all of the application data may
      be excluded. This can improve performance on noisy links for
      applications that can tolerate corruption. See Section 9.

   Checksum: 16 bits
      The Internet checksum of the packet's DCCP header (including
      options), a network-layer pseudoheader, and, depending on
      Checksum Coverage, all, some, or none of the application data.
      See Section 9.

   Reserved (Res): 3 bits
      Senders MUST set this field to all zeroes on generated packets,
      and receivers MUST ignore its value.

   Type: 4 bits
      The Type field specifies the type of the packet. The following
      values are defined:

            Type    Meaning
            ----    -------
              0     DCCP-Request
              1     DCCP-Response
              2     DCCP-Data
              3     DCCP-Ack
              4     DCCP-DataAck
              5     DCCP-CloseReq
              6     DCCP-Close
              7     DCCP-Reset
              8     DCCP-Sync
              9     DCCP-SyncAck
            10-15   Reserved

                  Table 1: DCCP Packet Types

      Receivers MUST ignore any packets with reserved type.
That is, 1037 packets with reserved type MUST NOT be processed and they MUST 1038 NOT be acknowledged as received. 1040 Extended Sequence Numbers (X): 1 bit 1041 Set to one to indicate the use of an extended generic header 1042 with 48-bit Sequence and Acknowledgement Numbers. DCCP-Data, 1043 DCCP-DataAck, and DCCP-Ack packets MAY set X to zero or one. 1044 All DCCP-Request, DCCP-Response, DCCP-CloseReq, DCCP-Close, 1045 DCCP-Reset, DCCP-Sync, and DCCP-SyncAck packets MUST set X to 1046 one; endpoints MUST ignore any such packets with X set to zero. 1047 High-rate connections SHOULD set X to one on all packets to gain 1048 increased protection against wrapped sequence numbers and 1049 attacks. See Section 7.6. 1051 Sequence Number: 48 or 24 bits 1052 Identifies the packet uniquely in the sequence of all packets 1053 the source sent on this connection. Sequence Number increases 1054 by one with every packet sent, including packets such as DCCP- 1055 Ack that carry no application data. See Section 7. 1057 All currently defined packet types except DCCP-Request and DCCP-Data 1058 carry an Acknowledgement Number Subheader in the four or eight bytes 1059 immediately following the generic header. When X=1, its format is: 1061 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1062 | Reserved | Acknowledgement Number . 1063 | | (high bits) . 1064 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1065 . 
Acknowledgement Number (low bits) | 1066 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1068 When X=0, only the low 24 bits of the Acknowledgement Number are 1069 transmitted, giving the Acknowledgement Number Subheader this 1070 format: 1072 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1073 | Reserved | Acknowledgement Number (low bits) | 1074 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1076 Reserved: 16 or 8 bits 1077 Senders MUST set this field to all zeroes on generated packets, 1078 and receivers MUST ignore its value. 1080 Acknowledgement Number: 48 or 24 bits 1081 Generally contains GSR, the Greatest Sequence Number Received on 1082 any acknowledgeable packet so far. A packet is acknowledgeable 1083 if and only if its header was successfully processed by the 1084 receiver; Section 7.4 describes this further. Options such as 1085 Ack Vector (Section 11.4) combine with the Acknowledgement 1086 Number to provide precise information about which packets have 1087 arrived. 1089 Acknowledgement Numbers on DCCP-Sync and DCCP-SyncAck packets 1090 need not equal GSR. See Section 5.7. 1092 5.2. DCCP-Request Packets 1094 A client initiates a DCCP connection by sending a DCCP-Request 1095 packet. These packets MAY contain application data, and MUST use 1096 48-bit sequence numbers (X=1). 
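The X=1 requirement on DCCP-Request packets is one instance of the per-type rule given for the Extended Sequence Numbers bit in Section 5.1. A minimal Python sketch of that check follows. The function name `x_bit_acceptable` is hypothetical, and the sketch models "ignore" simply as returning False; it does not model feature negotiation, which may further restrict the use of short sequence numbers.

```python
# Packet type numbers from Table 1.
REQUEST, RESPONSE, DATA, ACK, DATAACK, CLOSEREQ, CLOSE, RESET, SYNC, SYNCACK = range(10)

# Types that MAY set X to zero (24-bit sequence numbers).
SHORT_SEQNO_ALLOWED = frozenset({DATA, ACK, DATAACK})


def x_bit_acceptable(packet_type: int, x: int) -> bool:
    """Return False for packets a receiver must ignore on Type/X grounds."""
    if not 0 <= packet_type <= 15 or x not in (0, 1):
        raise ValueError("Type is a 4-bit field and X is a single bit")
    if packet_type > SYNCACK:
        return False  # types 10-15 are reserved and MUST be ignored
    if x == 1:
        return True   # extended sequence numbers are valid for every type
    # X=0 is only valid on DCCP-Data, DCCP-Ack, and DCCP-DataAck.
    return packet_type in SHORT_SEQNO_ALLOWED
```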
    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /            Generic DCCP Header with X=1 (16 bytes)            /
   /                  with Type=0 (DCCP-Request)                   /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         Service Code                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /                      Options and Padding                      /
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
   /                       Application Data                        /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Service Code: 32 bits
      Describes the application-level service to which the client
      application wants to connect. Service Codes are intended to
      provide information about which application protocol a
      connection intends to use, thus aiding middleboxes and reducing
      reliance on globally well-known ports. See Section 8.1.2.

5.3. DCCP-Response Packets

   The server responds to valid DCCP-Request packets with DCCP-Response
   packets. This is the second phase of the three-way handshake.
   DCCP-Response packets MAY contain application data, and MUST use
   48-bit sequence numbers (X=1).
1126 0 1 2 3 1127 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1128 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1129 / Generic DCCP Header with X=1 (16 bytes) / 1130 / with Type=1 (DCCP-Response) / 1131 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1132 / Acknowledgement Number Subheader (8 bytes) / 1133 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1134 | Service Code | 1135 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1136 / Options and Padding / 1137 +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ 1138 / Application Data / 1139 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1141 Acknowledgement Number: 48 bits 1142 Contains GSR. Since DCCP-Responses are only sent during 1143 connection initiation, this will always equal the Sequence 1144 Number on a received DCCP-Request. 1146 Service Code: 32 bits 1147 MUST equal the Service Code on the corresponding DCCP-Request. 1149 5.4. DCCP-Data, DCCP-Ack, and DCCP-DataAck Packets 1151 The central data transfer portion of every DCCP connection uses 1152 DCCP-Data, DCCP-Ack, and DCCP-DataAck packets. These packets MAY 1153 use 24-bit sequence numbers, depending on the value of the Allow 1154 Short Sequence Numbers feature (Section 7.6.1). DCCP-Data packets 1155 carry application data without acknowledgements. 1157 0 1 2 3 1158 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1159 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1160 / Generic DCCP Header (16 or 12 bytes) / 1161 / with Type=2 (DCCP-Data) / 1162 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1163 / Options and Padding / 1164 +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ 1165 / Application Data / 1166 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1168 DCCP-Ack packets dispense with the data, but contain an 1169 Acknowledgement Number. 
They are used for pure acknowledgements. 1171 0 1 2 3 1172 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1173 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1174 / Generic DCCP Header (16 or 12 bytes) / 1175 / with Type=3 (DCCP-Ack) / 1176 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1177 / Acknowledgement Number Subheader (8 or 4 bytes) / 1178 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1179 / Options and Padding / 1180 +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ 1181 / Application Data Area (Ignored) / 1182 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1184 DCCP-DataAck packets carry both application data and an 1185 Acknowledgement Number: acknowledgement information is piggybacked 1186 on a data packet. 1188 0 1 2 3 1189 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1190 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1191 / Generic DCCP Header (16 or 12 bytes) / 1192 / with Type=4 (DCCP-DataAck) / 1193 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1194 / Acknowledgement Number Subheader (8 or 4 bytes) / 1195 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1196 / Options and Padding / 1197 +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ 1198 / Application Data / 1199 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1201 A DCCP-Data or DCCP-DataAck packet may have a zero-length 1202 application data area, which indicates that the application sent a 1203 zero-length datagram. This differs from DCCP-Request and DCCP- 1204 Response packets, where an empty application data area indicates the 1205 absence of application data (not the presence of zero-length 1206 application data). The API SHOULD report any received zero-length 1207 datagrams to the receiving application. 
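The three data-transfer layouts above differ only in header length (16 or 12 bytes) and in whether an Acknowledgement Number Subheader (8 or 4 bytes) is present. The following Python sketch shows one way a receiver might split such a packet into its fields, based on the field layout in Section 5.1. The function name and the returned dictionary keys are hypothetical, options are skipped rather than parsed, and the checksum is not verified.

```python
import struct

DATA, ACK, DATAACK = 2, 3, 4  # Type values from Table 1


def parse_data_transfer_packet(packet: bytes) -> dict:
    """Split a DCCP-Data, DCCP-Ack, or DCCP-DataAck packet into fields."""
    if len(packet) < 12:
        raise ValueError("shorter than the minimum 12-byte generic header")
    # Ports, Data Offset, CCVal/CsCov byte, Checksum, Res/Type/X byte.
    sport, dport, data_offset, cc, checksum, rtx = struct.unpack_from("!HHBBHB", packet)
    ptype = (rtx >> 1) & 0x0F  # 4-bit Type sits between Res and X
    x = rtx & 0x01             # X is the lowest bit of that byte
    if ptype not in (DATA, ACK, DATAACK):
        raise ValueError("not a data-transfer packet type")
    if x:
        seq_start, pos = 10, 16  # 8 reserved bits, then 48-bit Sequence Number
    else:
        seq_start, pos = 9, 12   # 24-bit Sequence Number follows X directly
    if len(packet) < pos:
        raise ValueError("truncated generic header")
    seqno = int.from_bytes(packet[seq_start:pos], "big")
    ackno = None
    if ptype != DATA:
        # Subheader: 16 reserved + 48 ack bits (X=1) or 8 reserved + 24 (X=0).
        reserved_len, ack_len = (2, 8) if x else (1, 4)
        if len(packet) < pos + ack_len:
            raise ValueError("truncated Acknowledgement Number Subheader")
        ackno = int.from_bytes(packet[pos + reserved_len:pos + ack_len], "big")
        pos += ack_len
    data_start = data_offset * 4  # Data Offset counts 32-bit words
    if data_start < pos or data_start > len(packet):
        raise ValueError("Data Offset out of range; such packets are ignored")
    # DCCP-Ack application data is ignored; b"" means a zero-length datagram.
    data = None if ptype == ACK else packet[data_start:]
    return {"type": ptype, "x": x, "ccval": cc >> 4, "cscov": cc & 0x0F,
            "seqno": seqno, "ackno": ackno, "data": data}
```

Returning `b""` rather than `None` for an empty DCCP-Data or DCCP-DataAck payload keeps the zero-length datagram visible to the caller, in line with the reporting recommendation above.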
1209 A DCCP-Ack packet MAY have a non-zero-length application data area, 1210 which essentially pads the DCCP-Ack to a desired length. Receivers 1211 MUST ignore the content of the application data area in DCCP-Ack 1212 packets. 1214 DCCP-Ack and DCCP-DataAck packets often include additional 1215 acknowledgement options, such as Ack Vector, as required by the 1216 congestion control mechanism in use. 1218 5.5. DCCP-CloseReq and DCCP-Close Packets 1220 DCCP-CloseReq and DCCP-Close packets begin the handshake that 1221 normally terminates a connection. Either client or server may send 1222 a DCCP-Close packet, which will elicit a DCCP-Reset packet. Only 1223 the server can send a DCCP-CloseReq packet, which indicates that the 1224 server wants to close the connection, but does not want to hold its 1225 TIMEWAIT state. Both packet types MUST use 48-bit sequence numbers 1226 (X=1). 1228 0 1 2 3 1229 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1230 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1231 / Generic DCCP Header with X=1 (16 bytes) / 1232 / with Type=5 (DCCP-CloseReq) or 6 (DCCP-Close) / 1233 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1234 / Acknowledgement Number Subheader (8 bytes) / 1235 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1236 / Options and Padding / 1237 +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ 1238 / Application Data Area (Ignored) / 1239 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1241 As with DCCP-Ack packets, DCCP-CloseReq and DCCP-Close packets MAY 1242 have non-zero-length application data areas, whose contents 1243 receivers MUST ignore. 1245 5.6. DCCP-Reset Packets 1247 DCCP-Reset packets unconditionally shut down a connection. 
1248 Connections normally terminate with a DCCP-Reset, but resets may be 1249 sent for other reasons, including bad port numbers, bad option 1250 behavior, incorrect ECN Nonce Echoes, and so forth. DCCP-Resets 1251 MUST use 48-bit sequence numbers (X=1). 1253 0 1 2 3 1254 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1255 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1256 / Generic DCCP Header with X=1 (16 bytes) / 1257 / with Type=7 (DCCP-Reset) / 1258 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1259 / Acknowledgement Number Subheader (8 bytes) / 1260 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1261 | Reset Code | Data 1 | Data 2 | Data 3 | 1262 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1263 / Options and Padding / 1264 +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ 1265 / Application Data Area (Error Text) / 1266 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1268 Reset Code: 8 bits 1269 Represents the reason that the sender reset the DCCP connection. 1271 Data 1, Data 2, and Data 3: 8 bits each 1272 The Data fields provide additional information about why the 1273 sender reset the DCCP connection. The meanings of these fields 1274 depend on the value of Reset Code. 1276 Application Data Area: Error Text 1277 If present, Error Text is a human-readable text string encoded 1278 in Unicode UTF-8, and preferably in English, that describes the 1279 error in more detail. For example, a DCCP-Reset with Reset Code 1280 11, "Aggression Penalty", might contain Error Text such as 1281 "Aggression Penalty: Received 3 bad ECN Nonce Echoes, assuming 1282 misbehavior". 1284 The following Reset Codes are currently defined. Unless otherwise 1285 specified, the Data 1, 2, and 3 fields MUST be set to 0 by the 1286 sender of the DCCP-Reset and ignored by its receiver. 
   Section references describe concrete situations that will cause each
   Reset Code to be generated; they are not meant to be exhaustive.

   0, "Unspecified"
      Indicates the absence of a meaningful Reset Code. Use of Reset
      Code 0 is NOT RECOMMENDED: the sender should choose a Reset Code
      that more clearly defines why the connection is being reset.

   1, "Closed"
      Normal connection close. See Section 8.3.

   2, "Aborted"
      The sending endpoint gave up on the connection because of lack
      of progress. See Sections 8.1.1 and 8.1.5.

   3, "No Connection"
      No connection exists. See Section 8.3.1.

   4, "Packet Error"
      A valid packet arrived with unexpected type. For example, a
      DCCP-Data packet with valid header checksum and sequence numbers
      arrived at a connection in the REQUEST state. See Section
      8.3.1. The Data 1 field equals the offending packet type as an
      eight-bit number; thus, an offending packet with Type 2 will
      result in a Data 1 value of 2.

   5, "Option Error"
      An option was erroneous, and the error was serious enough to
      warrant resetting the connection. See Sections 6.6.7, 6.6.8,
      and 11.4. The Data 1 field equals the offending option type;
      Data 2 and Data 3 equal the first two bytes of option data (or
      zero if the option had less than two bytes of data).

   6, "Mandatory Error"
      The sending endpoint could not process an option O that was
      immediately preceded by Mandatory. The Data fields report the
      option type and data of option O, using the format of Reset Code
      5, "Option Error". See Section 5.8.2.

   7, "Connection Refused"
      The Destination Port didn't correspond to a port open for
      listening. Sent only in response to DCCP-Requests. See Section
      8.1.3.

   8, "Bad Service Code"
      The Service Code didn't equal the service code attached to the
      Destination Port. Sent only in response to DCCP-Requests.
      See Section 8.1.3.

   9, "Too Busy"
      The server is too busy to accept new connections. Sent only in
      response to DCCP-Requests. See Section 8.1.3.

   10, "Bad Init Cookie"
      The Init Cookie echoed by the client was incorrect or missing.
      See Section 8.1.4.

   11, "Aggression Penalty"
      This endpoint has detected congestion control-related
      misbehavior on the part of the other endpoint. See Section
      12.3.

   12-127, Reserved
      Receivers should treat these codes like Reset Code 0,
      "Unspecified".

   128-255, CCID-specific codes
      Semantics depend on the connection's CCIDs. See Section 10.3.
      Receivers should treat unknown CCID-specific Reset Codes like
      Reset Code 0, "Unspecified".

   The following table summarizes this information.

      Reset
      Code    Name                    Data 1     Data 2 & 3
      -----   ----                    ------     ----------
        0     Unspecified                0            0
        1     Closed                     0            0
        2     Aborted                    0            0
        3     No Connection              0            0
        4     Packet Error            pkt type        0
        5     Option Error            option #   option data
        6     Mandatory Error         option #   option data
        7     Connection Refused         0            0
        8     Bad Service Code           0            0
        9     Too Busy                   0            0
       10     Bad Init Cookie            0            0
       11     Aggression Penalty         0            0
     12-127   Reserved
    128-255   CCID-specific codes

                     Table 2: DCCP Reset Codes

   Options on DCCP-Reset packets are processed before the connection is
   shut down. This means that certain combinations of options,
   particularly involving Mandatory, may cause an endpoint to respond
   to a valid DCCP-Reset with another DCCP-Reset. This cannot lead to
   a reset storm; since the first endpoint has already reset the
   connection, the second DCCP-Reset will be ignored.

5.7. DCCP-Sync and DCCP-SyncAck Packets

   DCCP-Sync packets help DCCP endpoints recover synchronization after
   bursts of loss, or recover from half-open connections. Each valid
   received DCCP-Sync immediately elicits a DCCP-SyncAck.
   Both packet types MUST use 48-bit sequence numbers (X=1).

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /            Generic DCCP Header with X=1 (16 bytes)            /
   /          with Type=8 (DCCP-Sync) or 9 (DCCP-SyncAck)          /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /          Acknowledgement Number Subheader (8 bytes)           /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /                      Options and Padding                      /
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
   /                Application Data Area (Ignored)                /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The Acknowledgement Number field has special semantics for DCCP-Sync
   and DCCP-SyncAck packets. First, the packet corresponding to a
   DCCP-Sync's Acknowledgement Number need not have been
   acknowledgeable. Thus, receivers MUST NOT assume that a packet was
   processed simply because it appears in the Acknowledgement Number
   field of a DCCP-Sync packet. This differs from all other packet
   types, where the Acknowledgement Number by definition corresponds to
   an acknowledgeable packet. Second, the Acknowledgement Number on
   any DCCP-SyncAck packet MUST correspond to the Sequence Number on an
   acknowledgeable DCCP-Sync packet. In the presence of reordering,
   this might not equal GSR.

   As with DCCP-Ack packets, DCCP-Sync and DCCP-SyncAck packets MAY
   have non-zero-length application data areas, whose contents
   receivers MUST ignore. Padded DCCP-Sync packets may be useful when
   performing Path MTU discovery; see Section 14.

5.8. Options

   Any DCCP packet may contain options, which occupy space at the end
   of the DCCP header. Each option is a multiple of 8 bits in length.
   Individual options are not padded to multiples of 32 bits, and any
   option may begin on any byte boundary.
   However, the combination of all options MUST add up to a multiple
   of 32 bits; Padding options MUST be added as necessary to fill out
   option space to a word boundary. Any options present are included
   in the header checksum.

   The first byte of an option is the option type. Options with types
   0 through 31 are single-byte options. Other options are followed by
   a byte indicating the option's length. This length value includes
   the two bytes of option-type and option-length as well as any
   option-data bytes, and must therefore be greater than or equal to
   two.

   Options are processed sequentially, starting at the first option in
   the packet header. Options with unknown types MUST be ignored.
   Also, options with nonsensical lengths (length byte less than two or
   more than the remaining space in the options portion of the header)
   MUST be ignored, and any option space following an option with
   nonsensical length MUST likewise be ignored.

   The following options are currently defined:

              Option                            DCCP-   Section
      Type    Length     Meaning                Data?   Reference
      ----    ------     -------                -----   ---------
        0        1       Padding                  Y     5.8.1
        1        1       Mandatory                N     5.8.2
        2        1       Slow Receiver            Y     11.6
       3-31      1       Reserved
       32     variable   Change L                 N     6.1
       33     variable   Confirm L                N     6.2
       34     variable   Change R                 N     6.1
       35     variable   Confirm R                N     6.2
       36     variable   Init Cookie              N     8.1.4
       37       3-5      NDP Count                Y     7.7
       38     variable   Ack Vector [Nonce 0]     N     11.4
       39     variable   Ack Vector [Nonce 1]     N     11.4
       40     variable   Data Dropped             N     11.7
       41        6       Timestamp                Y     13.1
       42      6/8/10    Timestamp Echo           Y     13.3
       43       4/6      Elapsed Time             N     13.2
       44        6       Data Checksum            Y     9.3
      45-127  variable   Reserved
     128-255  variable   CCID-specific options    -     10.3

                       Table 3: DCCP Options

   Not all options are suitable for all packet types.
   For example, since the Ack Vector option is interpreted relative to
   the Acknowledgement Number, it isn't suitable on DCCP-Request and
   DCCP-Data packets, which have no Acknowledgement Number. If an
   option occurs on an unexpected packet type, it MUST generally be
   ignored; any such restrictions are mentioned in each option's
   description. The table summarizes the most common restriction: when
   the DCCP-Data? column value is N, the corresponding option MUST be
   ignored when received on a DCCP-Data packet. (Section 7.5.5
   describes why such options are ignored as opposed to, say, causing
   a reset.)

   Options with invalid values MUST be ignored unless otherwise
   specified. For example, any Data Checksum option with option length
   4 MUST be ignored, since all valid Data Checksum options have option
   length 6.

   This section describes two generic options, Padding and Mandatory.
   Other options are described later.

5.8.1. Padding Option

   +--------+
   |00000000|
   +--------+
    Type=0

   Padding is a single-byte "no-operation" option used to pad between
   or after options. If the length of a packet's other options is not
   a multiple of 32 bits, then Padding options are REQUIRED to pad out
   the options area to the length implied by Data Offset. Padding may
   also be used between options -- for example, to align the beginning
   of a subsequent option on a 32-bit boundary. There is no guarantee
   that senders will use this option, so receivers must be prepared to
   process options even if they do not begin on a word boundary.

5.8.2. Mandatory Option

   +--------+
   |00000001|
   +--------+
    Type=1

   Mandatory is a single-byte option that marks the immediately
   following option as mandatory. Say that the immediately following
   option is O. Then the Mandatory option has no effect if the
   receiving DCCP endpoint understands and processes O.
   If the endpoint does not understand or process O, however, then it
   MUST reset the connection using Reset Code 6, "Mandatory Error". For
   instance, the endpoint would reset the connection if it did not
   understand O's type; if it understood O's type, but not O's data; if
   O's data was invalid for O's type; if O was a feature negotiation
   option, and the endpoint did not understand the enclosed feature
   number; if the endpoint understood O, but chose not to perform the
   action O implies; and so forth.

   Mandatory options MUST NOT be sent on DCCP-Data packets, and any
   Mandatory options received on DCCP-Data packets MUST be ignored.

   The connection is in error and should be reset with Reset Code 5,
   "Option Error" if option O is absent (Mandatory was the last byte of
   the option list), or if option O equals Mandatory. However, the
   combination "Mandatory Padding" is valid, and MUST behave like two
   bytes of Padding.

   Section 6.6.9 describes the behavior of Mandatory feature
   negotiation options in more detail.

6. Feature Negotiation

   Four DCCP options, Change L, Confirm L, Change R, and Confirm R, are
   used to negotiate feature values. Change options initiate a
   negotiation; Confirm options complete that negotiation. The "L"
   options are sent by the feature location, and the "R" options are
   sent by the feature remote. Change options are retransmitted to
   ensure reliability.

   All these options have the same format. The first byte of option
   data is the feature number, and the second and subsequent data bytes
   hold one or more feature values. The exact format of the feature
   value area depends on the feature type; see Section 6.3.

      +--------+--------+--------+--------+--------
      |  Type  | Length |Feature#| Value(s) ...
      +--------+--------+--------+--------+--------

   Together, the feature number and the option type ("L" or "R")
   uniquely identify the feature to which an option applies. The exact
   format of the Value(s) area depends on the feature number.

   Feature negotiation options MUST NOT be sent on DCCP-Data packets,
   and any feature negotiation options received on DCCP-Data packets
   MUST be ignored.

6.1. Change Options

   Change L and Change R options initiate feature negotiation. The
   option to use depends on the relevant feature's location: To start a
   negotiation for feature F/A, DCCP A will send a Change L option; to
   start a negotiation for F/B, it will send a Change R option. Change
   options are retransmitted until some response is received. They
   contain at least one Value, and thus have length at least 4.

                 +--------+--------+--------+--------+--------
      Change L:  |00100000| Length |Feature#| Value(s) ...
                 +--------+--------+--------+--------+--------
                  Type=32

                 +--------+--------+--------+--------+--------
      Change R:  |00100010| Length |Feature#| Value(s) ...
                 +--------+--------+--------+--------+--------
                  Type=34

6.2. Confirm Options

   Confirm L and Confirm R options complete feature negotiation, and
   are sent in response to Change R and Change L options, respectively.
   Confirm options MUST NOT be generated except in response to Change
   options. Confirm options need not be retransmitted, since Change
   options are retransmitted as necessary. The first byte of the
   Confirm option contains the feature number from the corresponding
   Change. Following this is the selected Value, and then possibly the
   sender's preference list.

                 +--------+--------+--------+--------+--------
      Confirm L: |00100001| Length |Feature#| Value(s) ...
                 +--------+--------+--------+--------+--------
                  Type=33

                 +--------+--------+--------+--------+--------
      Confirm R: |00100011| Length |Feature#| Value(s) ...
                 +--------+--------+--------+--------+--------
                  Type=35

   If an endpoint receives an invalid Change option -- with an unknown
   feature number, or an invalid value -- it will respond with an empty
   Confirm option containing the problematic feature number, but no
   value. Such options have length 3.

6.3. Reconciliation Rules

   Reconciliation rules determine how the two sets of preferences for a
   given feature are resolved into a unique result. The reconciliation
   rule depends only on the feature number. Each reconciliation rule
   must have the property that the result is uniquely determined given
   the contents of Change options sent by the two endpoints.

   All current DCCP features use one of two reconciliation rules,
   server-priority ("SP") and non-negotiable ("NN").

6.3.1. Server-Priority

   The feature value is a fixed-length byte string (length determined
   by the feature number). Each Change option contains a list of
   values ordered by preference, with the most preferred value coming
   first. Each Confirm option contains the confirmed value, followed
   by the confirmer's preference list. Thus, the feature's current
   value will generally appear twice in Confirm options' data, once as
   the current value and once in the confirmer's preference list.

   To reconcile the preference lists, select the first entry in the
   server's list that also occurs in the client's list. If there is no
   shared entry, the feature's value MUST NOT change, and the Confirm
   option will confirm the feature's previous value (unless the Change
   option was Mandatory; see Section 6.6.9).

6.3.2. Non-Negotiable

   The feature value is a byte string. Each option contains exactly
   one feature value.
   The feature location signals a new value by sending a Change L
   option. The feature remote MUST accept any valid value, responding
   with a Confirm R option containing the new value, and it MUST send
   empty Confirm R options in response to invalid values (unless the
   Change L option was Mandatory; see Section 6.6.9). Change R and
   Confirm L options MUST NOT be sent for non-negotiable features; see
   Section 6.6.8. Non-negotiable features use the feature negotiation
   mechanism to achieve reliability.

6.4. Feature Numbers

   This document defines the following feature numbers.

                                           Rec'n  Initial         Section
   Number   Meaning                        Rule    Value   Req'd  Reference
   ------   -------                        -----  -------  -----  ---------
      0     Reserved
      1     Congestion Control ID (CCID)    SP       2       Y    10
      2     Allow Short Seqnos              SP       1       Y    7.6.1
      3     Sequence Window                 NN      100      Y    7.5.2
      4     ECN Incapable                   SP       0       N    12.1
      5     Ack Ratio                       NN       2       N    11.3
      6     Send Ack Vector                 SP       0       N    11.5
      7     Send NDP Count                  SP       0       N    7.7.2
      8     Minimum Checksum Coverage       SP       0       N    9.2.1
      9     Check Data Checksum             SP       0       N    9.3.1
    10-127  Reserved
   128-255  CCID-specific features                                10.3

                     Table 4: DCCP Feature Numbers

   Rec'n Rule     The reconciliation rule used for the feature. SP is
                  server-priority and NN is non-negotiable.

   Initial Value  The initial value for the feature. Every feature has
                  a known initial value.

   Req'd          This column is "Y" if and only if every DCCP
                  implementation MUST understand the feature. If it is
                  "N", then the feature behaves like an extension (see
                  Section 15), and it is safe to respond to Change
                  options for the feature with empty Confirm options.
                  Of course, a CCID might require the feature; a DCCP
                  that implements CCID 2 MUST support Ack Ratio and
                  Send Ack Vector, for example.

6.5. Examples

   Here are three example feature negotiations for features located at
   the server, the first two for the Congestion Control ID feature, the
   last for the Ack Ratio.

                Client                                     Server
                ------                                     ------
   1. Change R(CCID, 2 3 1) -->
      ("2 3 1" is client's preference list)
   2.                              <-- Confirm L(CCID, 3, 3 2 1)
                                   (3 is the negotiated value;
                                   "3 2 1" is server's pref list)
              * agreement that CCID/Server = 3 *

   1.                   XXX <-- Change L(CCID, 3 2 1)
   2.                           Retransmission:
                                <-- Change L(CCID, 3 2 1)
   3. Confirm R(CCID, 3, 2 3 1) -->
              * agreement that CCID/Server = 3 *

   1.                           <-- Change L(Ack Ratio, 3)
   2. Confirm R(Ack Ratio, 3) -->
              * agreement that Ack Ratio/Server = 3 *

   This example shows a simultaneous negotiation.

                Client                                     Server
                ------                                     ------
   1a. Change R(CCID, 2 3 1) -->
    b.                           <-- Change L(CCID, 3 2 1)
   2a.                           <-- Confirm L(CCID, 3, 3 2 1)
    b. Confirm R(CCID, 3, 2 3 1) -->
              * agreement that CCID/Server = 3 *

   Here are the byte encodings of several Change and Confirm options.
   Each option is sent by DCCP A.

   Change L(CCID, 2 3) = 32,5,1,2,3
      DCCP B should change CCID/A's value (feature number 1, a
      server-priority feature); DCCP A's preferred values are 2 and 3,
      in that preference order.

   Change L(Sequence Window, 1024) = 32,9,3,0,0,0,0,4,0
      DCCP B should change Sequence Window/A's value (feature number
      3, a non-negotiable feature) to the 6-byte string 0,0,0,0,4,0
      (the value 1024).

   Confirm L(CCID, 2, 2 3) = 33,6,1,2,2,3
      DCCP A has changed CCID/A's value to 2; its preferred values are
      2 and 3, in that preference order.

   Empty Confirm L(126) = 33,3,126
      DCCP A doesn't implement feature number 126, or DCCP B's
      proposed value for feature 126/A was invalid.

   Change R(CCID, 3 2) = 34,5,1,3,2
      DCCP B should change CCID/B's value; DCCP A's preferred values
      are 3 and 2, in that preference order.

   Confirm R(CCID, 2, 3 2) = 35,6,1,2,3,2
      DCCP A has changed CCID/B's value to 2; its preferred values
      were 3 and 2, in that preference order.

   Confirm R(Sequence Window, 1024) = 35,9,3,0,0,0,0,4,0
      DCCP A has changed Sequence Window/B's value to the 6-byte
      string 0,0,0,0,4,0 (the value 1024).

   Empty Confirm R(126) = 35,3,126
      DCCP A doesn't implement feature number 126, or DCCP B's
      proposed value for feature 126/B was invalid.

6.6. Option Exchange

   A few basic rules govern feature negotiation option exchange.

   1. Every non-reordered Change option gets a Confirm option in
      response.

   2. Change options are retransmitted until a response for the latest
      Change is received.

   3. Feature negotiation options are processed in strictly increasing
      order by Sequence Number.

   The rest of this section describes the consequences of these rules
   in more detail.

6.6.1. Normal Exchange

   Change options are generated when a DCCP endpoint wants to change
   the value of some feature. Generally, this will happen at the
   beginning of a connection, although it may happen at any time. We
   say the endpoint "generates" or "sends" a Change L or Change R
   option, but of course the option must be attached to a packet. The
   endpoint may attach the option to a packet it would have generated
   anyway (such as a DCCP-Request), or it may create a "feature
   negotiation packet", often a DCCP-Ack or DCCP-Sync, just to carry
   the option. Feature negotiation packets are controlled by the
   relevant congestion control mechanism. For example, DCCP A may send
   a DCCP-Ack or DCCP-Sync for feature negotiation only if the B-to-A
   CCID would allow sending a DCCP-Ack. In addition, an endpoint
   SHOULD generate at most one feature negotiation packet per
   round-trip time.
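
   As an informal illustration (not part of the protocol definition),
   the following Python sketch builds the bytes that a generated Change
   or Confirm option would occupy, using the Type/Length/Feature#/
   Value(s) layout of Section 6 and reproducing some of the byte
   encodings listed in Section 6.5. The function and constant names
   are hypothetical, chosen only for this sketch; the numeric values
   come from Tables 3 and 4.

```python
# Option type numbers, as listed in Table 3.
CHANGE_L, CONFIRM_L, CHANGE_R, CONFIRM_R = 32, 33, 34, 35

# Feature numbers, as listed in Table 4 (only those used below).
FEATURE_CCID = 1
FEATURE_SEQUENCE_WINDOW = 3


def encode_feature_option(option_type, feature_number, value_bytes=b""):
    """Return the wire bytes of one feature negotiation option.

    Hypothetical helper for illustration.  The Length byte counts the
    type, length, and feature-number bytes plus all value bytes.
    """
    if option_type not in (CHANGE_L, CONFIRM_L, CHANGE_R, CONFIRM_R):
        raise ValueError("not a feature negotiation option type")
    if option_type in (CHANGE_L, CHANGE_R) and len(value_bytes) == 0:
        # Change options contain at least one Value (length >= 4).
        raise ValueError("Change options carry at least one value")
    length = 3 + len(value_bytes)
    if length > 255:
        raise ValueError("option does not fit a one-byte length field")
    return bytes([option_type, length, feature_number]) + bytes(value_bytes)


# Change L(CCID, 2 3): a preference list of one-byte CCID values.
change_l_ccid = encode_feature_option(CHANGE_L, FEATURE_CCID, bytes([2, 3]))

# Change L(Sequence Window, 1024): one 6-byte value, most significant
# byte first, matching the 0,0,0,0,4,0 string shown in Section 6.5.
change_l_seqwin = encode_feature_option(
    CHANGE_L, FEATURE_SEQUENCE_WINDOW, (1024).to_bytes(6, "big"))

# Empty Confirm L(126): feature number only, no value, length 3.
empty_confirm_l = encode_feature_option(CONFIRM_L, 126)

print(list(change_l_ccid))
print(list(change_l_seqwin))
print(list(empty_confirm_l))
```

   The sketch only produces option bytes; attaching them to a packet,
   padding the option area to a 32-bit boundary, and obeying the
   relevant congestion control mechanism are left out.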

   On receiving a Change L or Change R option, a DCCP endpoint examines
   the included preference list, reconciles that with its own
   preference list, calculates the new value, and sends back a
   Confirm R or Confirm L option, respectively, informing its peer of
   the new value or that the feature was not understood. Every
   non-reordered Change option MUST result in a corresponding Confirm
   option, and any packet including a Confirm option MUST carry an
   Acknowledgement Number. (Section 6.6.4 describes how Change
   reordering is detected and handled.) Generated Confirm options may
   be attached to packets that would have been sent anyway (such as
   DCCP-Response or DCCP-SyncAck), or to new feature negotiation
   packets, as described above.

   The Change-sending endpoint MUST wait to receive a corresponding
   Confirm option before changing its stored feature value. The
   Confirm-sending endpoint changes its stored feature value as soon as
   it sends the Confirm.

   A packet MAY contain more than one feature negotiation option, as
   long as no two options refer to the same feature. Note, however,
   that a packet is allowed to contain one L option and one R option
   with the same feature number, since the two options actually refer
   to different features (F/A and F/B).

6.6.2. Processing Received Options

   DCCP endpoints exist in one of three states relative to each
   feature. STABLE is the normal state, where the endpoint knows the
   feature's value and thinks the other endpoint agrees. An endpoint
   enters the CHANGING state when it first sends a Change for the
   feature, and returns to STABLE once it receives a corresponding
   Confirm. The final state, UNSTABLE, indicates that an endpoint in
   CHANGING state changed its preference list, but has not yet
   transmitted a Change option with the new preference list.
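
   The three states can be pictured as a small state machine. The
   Python sketch below is illustrative only: the class and method
   names are hypothetical, and it tracks just the state changes named
   in the paragraph above, plus the return from UNSTABLE to CHANGING
   described in Section 6.6.5. Sequence number checks, option
   validity, and received Change options are deliberately left out.

```python
from enum import Enum


class FeatureState(Enum):
    STABLE = "STABLE"
    CHANGING = "CHANGING"
    UNSTABLE = "UNSTABLE"


class FeatureNegotiation:
    """Hypothetical per-feature state tracker (illustration only)."""

    def __init__(self):
        self.state = FeatureState.STABLE

    def send_change(self):
        # Sending a Change from STABLE opens a negotiation; sending one
        # from UNSTABLE publishes the new preference list.  A
        # retransmission while CHANGING leaves the state as it is.
        self.state = FeatureState.CHANGING

    def change_preferences(self):
        # A preference change during a negotiation makes the
        # outstanding Change stale until a new Change is transmitted.
        if self.state is FeatureState.CHANGING:
            self.state = FeatureState.UNSTABLE

    def receive_confirm(self):
        # Only a CHANGING endpoint acts on a Confirm; in the other two
        # states the option is ignored.  Returns True if it was used.
        if self.state is FeatureState.CHANGING:
            self.state = FeatureState.STABLE
            return True
        return False


feature = FeatureNegotiation()
feature.send_change()          # STABLE -> CHANGING
feature.change_preferences()   # CHANGING -> UNSTABLE
feature.receive_confirm()      # ignored while UNSTABLE
feature.send_change()          # UNSTABLE -> CHANGING
feature.receive_confirm()      # CHANGING -> STABLE
print(feature.state.value)
```

   Keeping one such object per feature mirrors the statement that the
   state is held "relative to each feature"; a real implementation
   would also need the reordering checks of Section 6.6.4.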

   Feature state transitions at a feature location are implemented
   according to this diagram. The diagram ignores sequence number and
   option validity issues; these are handled explicitly in the
   pseudocode that follows.

                                     timeout/
 rcv Confirm R      app/protocol evt : snd Change L      rcv non-ack
 : ignore      +---------------------------------------+ : snd Change L
      +----+   |                                       |   +----+
      |    v   |                   rcv Change R        v   |    v
   +------------+  rcv Confirm R   : calc new value, +------------+
   |            |  : accept value  snd Confirm L     |            |
   |   STABLE   |<-----------------------------------|  CHANGING  |
   |            |        rcv empty Confirm R         |            |
   +------------+        : revert to old value       +------------+
       |    ^                                            |    ^
       +----+                                  pref list |    | snd
   rcv Change R                                changes   |    | Change L
   : calc new value, snd Confirm L                       v    |
                                                     +------------+
                                                 +---|            |
                            rcv Confirm/Change R |   |  UNSTABLE  |
                            : ignore             +-->|            |
                                                     +------------+

   Feature locations SHOULD use the following pseudocode, which
   corresponds to the state diagram, to react to each feature
   negotiation option on each valid packet received. The pseudocode
   refers to "P.seqno" and "P.ackno", which are properties of the
   packet; "O.type", and "O.len", which are properties of the option;
   "FGSR" and "FGSS", which are properties of the connection, and
   handle reordering as described in Section 6.6.4; "F.state", which is
   the feature's state (STABLE, CHANGING, or UNSTABLE); and "F.value",
   which is the feature's value.

   First, check for unknown features (Section 6.6.7);
   If F is unknown,
      If the option was Mandatory,   /* Section 6.6.9 */
         Reset connection and return
      Otherwise, if O.type == Change R,
         Send Empty Confirm L on a future packet
      Return

   Second, check for reordering (Section 6.6.4);
   If F.state == UNSTABLE or P.seqno <= FGSR
         or (O.type == Confirm R and P.ackno < FGSS),
      Ignore option and return

   Third, process Change R options;
   If O.type == Change R,
      If the option's value is valid,   /* Section 6.6.8 */
         Calculate new value
         Send Confirm L on a future packet
         Set F.state := STABLE
      Otherwise, if the option was Mandatory,
         Reset connection and return
      Otherwise,
         Send Empty Confirm L on a future packet
         /* Remain in existing state. If that's CHANGING, this
            endpoint will retransmit its Change L option later. */

   Fourth, process Confirm R options (but only in CHANGING state).
   If F.state == CHANGING and O.type == Confirm R,
      If O.len > 3,   /* nonempty */
         If the option's value is valid,
            Set F.value := new value
         Otherwise,
            Reset connection and return
      Set F.state := STABLE

   Versions of this diagram and pseudocode are also used by feature
   remotes; simply switch the "L"s and "R"s, so that the relevant
   options are Change R and Confirm L.

6.6.3. Loss and Retransmission

   Packets containing Change and Confirm options might be lost or
   delayed by the network. Therefore, Change options are repeatedly
   transmitted to achieve reliability. We refer to this as
   "retransmission", although of course there are no packet-level
   retransmissions in DCCP: a Change option that is sent again will be
   sent on a new packet with a new sequence number.

   A CHANGING endpoint transmits another Change option once it realizes
   that it has not heard back from the other endpoint.
   The new Change option need not contain the same payload as the
   original; reordering protection will ensure that agreement is
   reached based on the most recently transmitted option.

   A CHANGING endpoint MUST continue retransmitting Change options
   until it gets some response or the connection terminates.

   Endpoints SHOULD use an exponential-backoff timer to decide when to
   retransmit Change options. (Endpoints that generate packets
   specifically for feature negotiation MUST use such a timer.) The
   timer interval is initially set to not less than one round-trip
   time, and should back off to not less than 64 seconds. The backoff
   protects against delayed agreement due to the reordering protection
   algorithms described in the next section. Again, endpoints may
   piggyback Change options on packets they would have sent anyway, or
   create new packets to carry the options; any such new packets are
   controlled by the relevant congestion-control mechanism.

   Confirm options are never retransmitted, but the Confirm-sending
   endpoint MUST generate a Confirm option after every non-reordered
   Change.

6.6.4. Reordering

   Reordering might cause packets containing Change and Confirm options
   to arrive in an unexpected order. Endpoints MUST ignore feature
   negotiation options that do not arrive in strictly-increasing order
   by Sequence Number. The rest of this section presents two
   algorithms that fulfill this requirement.

   The first algorithm introduces two sequence number variables that
   each endpoint maintains for the connection.

   FGSR      Feature Greatest Sequence Number Received: The greatest
             sequence number received, considering only valid packets
             that contained one or more feature negotiation options
             (Change and/or Confirm). This value is initialized to
             ISR - 1.

   FGSS      Feature Greatest Sequence Number Sent: The greatest
             sequence number sent, considering only packets that
             contained one or more non-retransmitted Change options.
             (Retransmitted Change options MUST have exactly the same
             contents as previously transmitted options, so limited
             reordering can safely be tolerated.) This value is
             initialized to ISS.

   Each endpoint checks two conditions on sequence numbers to decide
   whether to process received feature negotiation options.

   1. If a packet's Sequence Number is less than or equal to FGSR,
      then its Change options MUST be ignored.

   2. If a packet's Sequence Number is less than or equal to FGSR, OR
      it has no Acknowledgement Number, OR its Acknowledgement Number
      is less than FGSS, then its Confirm options MUST be ignored.

   Alternatively, an endpoint MAY maintain separate FGSR and FGSS
   values for every feature. FGSR(F/X) would equal the greatest
   sequence number received, considering only packets that contained
   Change or Confirm options applying to feature F/X; FGSS(F/X) would
   be defined similarly. This algorithm requires more state, but is
   slightly more forgiving to multiple overlapped feature negotiations.
   Either algorithm MAY be used; the first algorithm, with
   connection-wide FGSR and FGSS variables, is RECOMMENDED.

   One consequence of these rules is that a CHANGING endpoint will
   ignore any Confirm option that does not acknowledge the latest
   Change option sent. This ensures that agreement, once achieved,
   used the most recent available information about the endpoints'
   preferences.

6.6.5. Preference Changes

   Endpoints are allowed to change their preference lists at any time.
   However, an endpoint that changes its preference list while in the
   CHANGING state MUST transition to the UNSTABLE state.
   It will transition back to CHANGING once it has transmitted a Change
   option with the new preference list. This ensures that agreement is
   based on active preference lists. Without the UNSTABLE state,
   simultaneous negotiation -- where the endpoints began independent
   negotiations for the same feature at the same time -- might lead to
   the negotiation terminating with the endpoints thinking the feature
   had different values.

6.6.6. Simultaneous Negotiation

   The two endpoints might simultaneously open negotiation for the same
   feature, after which an endpoint in the CHANGING state will receive
   a Change option for the same feature. Such received Change options
   can act as responses to the original Change options. The CHANGING
   endpoint MUST examine the received Change's preference list,
   reconcile that with its own preference list (as expressed in its
   generated Change options), and generate the corresponding Confirm
   option. It can then transition to the STABLE state.

6.6.7. Unknown Features

   Endpoints may receive Change options referring to feature numbers
   they do not understand -- for instance, when an extended DCCP
   converses with a non-extended DCCP. Endpoints MUST respond to
   unknown Change options with Empty Confirm options (that is, Confirm
   options containing no data), which inform the CHANGING endpoint that
   the feature was not understood. However, if the Change option was
   Mandatory, the connection MUST be reset; see Section 6.6.9.

   On receiving an empty Confirm option for some feature, the CHANGING
   endpoint MUST transition back to the STABLE state, leaving the
   feature's value unchanged. Section 15 suggests that the default
   value for any extension feature should correspond to "extension not
   available".

   Some features are required to be understood by all DCCPs (see
   Section 6.4).
The CHANGING endpoint SHOULD reset the connection 2027 (with Reset Code 5, "Option Error") if it receives an empty Confirm 2028 option for such a feature. 2030 Since Confirm options are generated only in response to Change 2031 options, an endpoint should never receive a Confirm option referring 2032 to a feature number it does not understand. Nevertheless, endpoints 2033 MUST ignore any such options they receive. 2035 6.6.8. Invalid Options 2037 A DCCP endpoint might receive a Change or Confirm option that lists 2038 one or more values that it does not understand. Some, but not all, 2039 such options are invalid, depending on the relevant reconciliation 2040 rule (Section 6.3). For instance: 2042 o All features have length limitations, and options with invalid 2043 lengths are invalid. For example, the Ack Ratio feature takes 2044 16-bit values, so valid "Confirm R(Ack Ratio)" options have 2045 option length 5. 2047 o Some non-negotiable features have value limitations. The Ack 2048 Ratio feature takes two-byte, non-zero integer values, so a 2049 "Change L(Ack Ratio, 0)" option is never valid. Note that 2050 server-priority features do not have value limitations, since 2051 unknown values are handled as a matter of course. 2053 o Any Confirm option that selects the wrong value, based on the two 2054 preference lists and the relevant reconciliation rule, is 2055 invalid. 2057 o However, unexpected Confirm options -- that refer to unknown 2058 feature numbers, or that don't appear to be part of a current 2059 negotiation -- are considered valid, although they are ignored by 2060 the receiver. 2062 An endpoint receiving an invalid Change option MUST respond with the 2063 corresponding empty Confirm option. An endpoint receiving an 2064 invalid Confirm option MUST reset the connection, with Reset Code 5, 2065 "Option Error". 2067 6.6.9. Mandatory Feature Negotiation 2069 Change options may be preceded by Mandatory options (Section 5.8.2). 
2070 Mandatory Change options are processed like normal Change options, 2071 except that the following failure cases will cause the receiver to 2072 reset the connection with Reset Code 6, "Mandatory Failure", rather 2073 than send a Confirm option. The connection MUST be reset if: 2075 o The Change option's feature number was not understood; 2077 o The Change option's value was invalid, and the receiver would 2078 normally have sent an empty Confirm option in response; or 2080 o For server-priority features, there was no shared entry in the 2081 two endpoints' preference lists. 2083 There's no reason to mark Confirm options as Mandatory in this 2084 version of DCCP, since Confirm options are sent only in response to 2085 Change options and therefore can't mention potentially-invalid 2086 values or unexpected feature numbers. 2088 7. Sequence Numbers 2090 DCCP uses sequence numbers to arrange packets into sequence, detect 2091 losses and network duplicates, and protect against attackers, half- 2092 open connections, and the delivery of very old packets. Every 2093 packet carries a Sequence Number; most packet types carry an 2094 Acknowledgement Number as well. 2096 DCCP sequence numbers are packet-based. That is, the packets 2097 generated by each endpoint have Sequence Numbers that increase by 2098 one, modulo 2^48, for every packet. Even DCCP-Ack and DCCP-Sync 2099 packets, and other packets that don't carry user data, increment the 2100 Sequence Number. Since DCCP is an unreliable protocol, there are no 2101 true retransmissions; but effective retransmissions, such as 2102 retransmissions of DCCP-Request packets, also increment the Sequence 2103 Number. This lets DCCP implementations detect network duplication, 2104 retransmissions, and acknowledgement loss, and is a significant 2105 departure from TCP practice. 2107 7.1. Variables 2109 DCCP endpoints maintain a set of sequence number variables for each 2110 connection. 
2112 ISS The Initial Sequence Number Sent by this endpoint. This 2113 equals the Sequence Number of the first DCCP-Request or 2114 DCCP-Response sent. 2116 ISR The Initial Sequence Number Received from the other 2117 endpoint. This equals the Sequence Number of the first 2118 DCCP-Request or DCCP-Response received. 2120 GSS The Greatest Sequence Number Sent by this endpoint. Here, 2121 and elsewhere, "greatest" is measured in circular sequence 2122 space. 2124 GSR The Greatest Sequence Number Received from the other 2125 endpoint on an acknowledgeable packet. (Section 7.4 defines 2126 this term.) 2128 GAR The Greatest Acknowledgement Number Received from the other 2129 endpoint on an acknowledgeable packet that was not a DCCP- 2130 Sync. 2132 Some other variables are derived from these primitives. 2134 SWL and SWH 2135 (Sequence Number Window Low and High) The extremes of the 2136 validity window for received packets' Sequence Numbers. 2138 AWL and AWH 2139 (Acknowledgement Number Window Low and High) The extremes 2140 of the validity window for received packets' Acknowledgement 2141 Numbers. 2143 7.2. Initial Sequence Numbers 2145 The endpoints' initial sequence numbers are set by the first DCCP- 2146 Request and DCCP-Response packets sent. Initial sequence numbers 2147 MUST be chosen to avoid two problems: 2149 o Delivery of old packets, where packets lingering in the network 2150 from an old connection are delivered to a new connection with the 2151 same addresses and port numbers. 2153 o Sequence number attacks, where an attacker can guess the sequence 2154 numbers that a future connection would use [M85]. 2156 These problems are the same as problems faced by TCP, and DCCP 2157 implementations SHOULD use TCP's strategies to avoid them [RFC 793, 2158 RFC 1948]. The rest of this section explains these strategies in 2159 more detail. 
2161 To address the first problem, an implementation MUST ensure that the 2162 initial sequence number for a given 4-tuple doesn't overlap with 2164 recent sequence numbers on previous connections with the same 2165 4-tuple. ("Recent" means sent within 2 maximum segment lifetimes, 2166 or 4 minutes.) The implementation MUST additionally ensure that the 2167 lower 24 bits of the initial sequence number don't overlap with the 2168 lower 24 bits of recent sequence numbers (unless the implementation 2169 plans to avoid short sequence numbers; see Section 7.6). An 2170 implementation that has state for a recent connection with the same 2171 4-tuple can pick a good initial sequence number explicitly. 2172 Otherwise, it could tie initial sequence number selection to some 2173 clock, such as the 4-microsecond clock used by TCP [RFC 793]. Two 2174 separate clocks may be required, one for the upper 24 bits and one 2175 for the lower 24 bits. 2177 To address the second problem, an implementation MUST provide each 2178 4-tuple with an independent initial sequence number space. Then 2179 opening a connection doesn't provide any information about initial 2180 sequence numbers on other connections to the same host. RFC 1948 2181 achieves this by adding a cryptographic hash of the 4-tuple and a 2182 secret to each initial sequence number. For the secret, RFC 1948 2183 recommends a combination of some truly-random data [RFC 1750], an 2184 administratively-installed passphrase, the endpoint's IP address, 2185 and the endpoint's boot time, but truly-random data is sufficient. 2186 Care should be taken when changing the secret; such a change alters 2187 all initial sequence number spaces, which might make an initial 2188 sequence number for some 4-tuple equal a recently sent sequence 2189 number for the same 4-tuple. To avoid this problem, the endpoint 2190 might remember dead connection state for each 4-tuple or stay quiet 2191 for 2 maximum segment lifetimes around such a change. 
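The following is a minimal sketch, not part of the specification, of an RFC 1948-style initial sequence number choice: a clock value plus a keyed hash of the 4-tuple, reduced to 48 bits. The function name, the SECRET variable, and the use of HMAC-SHA256 are assumptions made for illustration (RFC 1948 itself describes an MD5-based hash). The sketch does not address the separate requirement on the lower 24 bits, nor the handling of secret changes.

```python
import hashlib
import hmac
import os
import time

SEQ_MOD = 1 << 48  # DCCP sequence numbers are 48 bits long

# Hypothetical per-host secret; the text notes that truly random data suffices.
SECRET = os.urandom(16)


def initial_sequence_number(src_addr, src_port, dst_addr, dst_port,
                            secret=SECRET, now=None):
    """Return a 48-bit ISN: clock ticks plus a keyed hash of the 4-tuple."""
    if now is None:
        now = time.time()
    # Clock component: one tick every 4 microseconds, as for TCP [RFC 793].
    clock = int(now * 250_000) % SEQ_MOD
    # Keyed hash of the 4-tuple gives each 4-tuple its own sequence space.
    four_tuple = f"{src_addr}|{src_port}|{dst_addr}|{dst_port}".encode()
    digest = hmac.new(secret, four_tuple, hashlib.sha256).digest()
    offset = int.from_bytes(digest[:6], "big")  # 6 bytes = 48 bits
    return (clock + offset) % SEQ_MOD
```

With a fixed secret, two calls for the same 4-tuple differ only by the clock advance, while an observer of one 4-tuple's numbers learns nothing directly about the offset used for another.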
2193 7.3. Quiet Time 2195 DCCP endpoints, like TCP endpoints, must take care before initiating 2196 connections when they boot. In particular, they MUST NOT send 2197 packets whose sequence numbers are close to the sequence numbers of 2198 packets lingering in the network from before the boot. The simplest 2199 way to enforce this rule is for DCCP endpoints to avoid sending any 2200 packets until one maximum segment lifetime (2 minutes) after boot. 2201 Other enforcement mechanisms include remembering recent sequence 2202 numbers across boots, and reserving the upper 8 or so bits of 2203 initial sequence numbers for a persistent counter that decrements by 2204 two each boot. (The latter mechanism would require disallowing 2205 packets with short sequence numbers; see Section 7.6.1.) 2207 7.4. Acknowledgement Numbers 2209 Cumulative acknowledgements are meaningless in an unreliable 2210 protocol. Therefore, DCCP's Acknowledgement Number field has a 2211 different meaning than TCP's. 2213 A received packet is classified as acknowledgeable if and only if 2214 its header was successfully processed by the receiving DCCP. In 2215 terms of the pseudocode in Section 8.5, a received packet becomes 2216 acknowledgeable when the receiving endpoint reaches Step 8. This 2217 means, for example, that all acknowledgeable packets have valid 2218 header checksums and sequence numbers. The Acknowledgement Number 2219 MUST equal GSR, the Greatest Sequence Number Received on an 2220 acknowledgeable packet, for all packet types except DCCP-Sync and 2221 DCCP-SyncAck. 2223 "Acknowledgeable" does not refer to data processing. Even 2224 acknowledgeable packets may have their application data dropped, due 2225 to receive buffer overflow or corruption, for instance. Data 2226 Dropped options report these data losses when necessary, letting 2227 congestion control mechanisms distinguish between network losses and 2228 endpoint losses.
This issue is discussed further in Sections 11.4 2229 and 11.7. 2231 DCCP-Sync and DCCP-SyncAck packets' Acknowledgement Numbers differ 2232 as follows: The Acknowledgement Number on a DCCP-Sync packet 2233 corresponds to a received packet, but not necessarily an 2234 acknowledgeable packet; in particular, it might correspond to an 2235 out-of-sync packet whose options were not processed. The 2236 Acknowledgement Number on a DCCP-SyncAck packet always corresponds 2237 to an acknowledgeable DCCP-Sync packet; it might be less than GSR in 2238 the presence of reordering. 2240 7.5. Validity and Synchronization 2242 Any DCCP endpoint might receive packets that are not actually part 2243 of the current connection. For instance, the network might deliver 2244 an old packet, an attacker might attempt to hijack a connection, or 2245 the other endpoint might crash, causing a half-open connection. 2247 DCCP, like TCP, uses sequence number checks to detect these cases. 2248 Packets whose Sequence and/or Acknowledgement Numbers are out of 2249 range are called sequence-invalid, and are not processed normally. 2251 Unlike TCP, DCCP requires a synchronization mechanism to recover 2252 from large bursts of loss. One endpoint might send so many packets 2253 during a burst of loss that when one of its packets finally got 2254 through, the other endpoint would label its Sequence Number as 2255 invalid. A handshake of DCCP-Sync and DCCP-SyncAck packets recovers 2256 from this case. 2258 7.5.1. Sequence and Acknowledgement Number Windows 2260 Each DCCP endpoint defines sequence validity windows that are 2261 subsets of the Sequence and Acknowledgement Number spaces. These 2262 windows correspond to packets the endpoint expects to receive in the 2263 next few round-trip times. The Sequence and Acknowledgement Number 2264 windows always contain GSR and GSS, respectively. The window widths 2265 are controlled by Sequence Window features for the two half- 2266 connections. 
2268 The Sequence Number validity window for packets from DCCP B is [SWL, 2269 SWH]. This window always contains GSR, the Greatest Sequence Number 2270 Received on a sequence-valid packet from DCCP B. It is W packets 2271 wide, where W is the value of the Sequence Window/B feature. One- 2272 fourth of the sequence window, rounded down, is less than or equal 2273 to GSR, and three-fourths is greater than GSR. (This asymmetric 2274 placement assumes that bursts of loss are more common in the network 2275 than significant reordering.) 2277 invalid | valid Sequence Numbers | invalid 2278 <---------*|*===========*=======================*|*---------> 2279 GSR -|GSR + 1 - GSR GSR +|GSR + 1 + 2280 floor(W/4)|floor(W/4) ceil(3W/4)|ceil(3W/4) 2281 = SWL = SWH 2283 The Acknowledgement Number validity window for packets from DCCP B 2284 is [AWL, AWH]. The high end of the window, AWH, equals GSS, the 2285 Greatest Sequence Number Sent by DCCP A; the window is W' packets 2286 wide, where W' is the value of the Sequence Window/A feature. 2288 invalid | valid Acknowledgement Numbers | invalid 2289 <---------*|*===================================*|*---------> 2290 GSS - W'|GSS + 1 - W' GSS|GSS + 1 2291 = AWL = AWH 2293 SWL and AWL are initially adjusted so that they are not less than 2294 the initial Sequence Numbers received and sent, respectively: 2295 SWL := max(GSR + 1 - floor(W/4), ISR), 2296 AWL := max(GSS - W' + 1, ISS). 2297 These adjustments MUST be applied only at the beginning of the 2298 connection. (Long-lived connections may wrap sequence numbers so 2299 that they appear to be less than ISR or ISS; the adjustments MUST 2300 NOT be applied in that case.) 2302 7.5.2. Sequence Window Feature 2304 The Sequence Window/A feature determines the width of the Sequence 2305 Number validity window used by DCCP B, and the width of the 2306 Acknowledgement Number validity window used by DCCP A. 
DCCP A sends 2307 a "Change L(Sequence Window, W)" option to notify DCCP B that the 2308 Sequence Window/A value is W. 2310 Sequence Window has feature number 3, and is non-negotiable. It 2311 takes 48-bit (6-byte) integer values, like DCCP sequence numbers. 2313 Change and Confirm options for Sequence Window are therefore 9 bytes 2314 long. New connections start with Sequence Window 100 for both 2315 endpoints. The minimum valid Sequence Window value is Wmin = 32. 2316 The maximum valid Sequence Window value is Wmax = 2^46 - 1 = 2317 70368744177663; circular sequence number comparisons would stop 2318 working absent this constraint. Change options suggesting Sequence 2319 Window values out of this range are invalid and MUST be handled 2320 accordingly. 2322 A proper Sequence Window/A value must reflect the number of packets 2323 DCCP A expects to be in flight. Only DCCP A can anticipate this 2324 number. Values that are too small increase the risk of the 2325 endpoints getting out of sync after bursts of loss, and values that are 2326 much too small can prevent productive communication whether or not 2327 there is loss. On the other hand, too-large values increase the 2328 risk of connection hijacking; Section 7.5.5 quantifies this risk. 2329 One good guideline is for each endpoint to set Sequence Window to 2330 about five times the maximum number of packets it expects to send in 2331 a round-trip time. Endpoints SHOULD send Change L(Sequence Window) 2332 options as necessary as the connection progresses. Also, an 2333 endpoint MUST NOT persistently send more than its Sequence Window 2334 number of packets per round-trip time; that is, DCCP A MUST NOT 2335 persistently send more than Sequence Window/A packets per RTT. 2337 7.5.3. Sequence-Validity Rules 2339 Sequence-validity depends on the received packet's type.
This table 2340 shows the sequence and acknowledgement number checks applied to each 2341 packet; a packet is sequence-valid if it passes both tests, and 2342 sequence-invalid if it does not. Many of the checks refer to the 2343 sequence and acknowledgement number validity windows [SWL, SWH] and 2344 [AWL, AWH] defined in Section 7.5.1. 2346 Acknowledgement Number 2347 Packet Type Sequence Number Check Check 2348 ----------- --------------------- ---------------------- 2349 DCCP-Request SWL <= seqno <= SWH (*) N/A 2350 DCCP-Response SWL <= seqno <= SWH (*) AWL <= ackno <= AWH 2351 DCCP-Data SWL <= seqno <= SWH N/A 2352 DCCP-Ack SWL <= seqno <= SWH AWL <= ackno <= AWH 2353 DCCP-DataAck SWL <= seqno <= SWH AWL <= ackno <= AWH 2354 DCCP-CloseReq GSR < seqno <= SWH GAR <= ackno <= AWH 2355 DCCP-Close GSR < seqno <= SWH GAR <= ackno <= AWH 2356 DCCP-Reset GSR < seqno <= SWH GAR <= ackno <= AWH 2357 DCCP-Sync SWL <= seqno AWL <= ackno <= AWH 2358 DCCP-SyncAck SWL <= seqno AWL <= ackno <= AWH 2360 (*) Check not applied if connection is in LISTEN or REQUEST state. 2362 In general, packets are sequence-valid if their Sequence and 2363 Acknowledgement Numbers lie within the corresponding valid windows, 2364 [SWL, SWH] and [AWL, AWH]. The exceptions to this rule are as 2365 follows: 2367 o Since DCCP-CloseReq, DCCP-Close, and DCCP-Reset packets end a 2368 connection, they cannot have Sequence Numbers less than or equal 2369 to GSR, or Acknowledgement Numbers less than GAR. 2371 o DCCP-Sync and DCCP-SyncAck Sequence Numbers are not strongly 2372 checked. These packet types exist specifically to get the 2373 endpoints back into sync; checking their Sequence Numbers would 2374 eliminate their usefulness. 
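As an illustration only, the window definitions of Section 7.5.1 and the lenient checks in the table above might be sketched as follows. All function and constant names are hypothetical. The sketch assumes that "SWL <= seqno" for DCCP-Sync and DCCP-SyncAck means "at or ahead of SWL within half of the circular space", and it omits both the (*) exception for the LISTEN and REQUEST states and the start-of-connection adjustment of SWL and AWL.

```python
MOD = 1 << 48  # DCCP sequence numbers are 48 bits long


def between48(lo, x, hi):
    """True if x lies in the circular interval [lo, hi] modulo 2^48."""
    return (x - lo) % MOD <= (hi - lo) % MOD


def windows(gsr, gss, w, w_prime):
    """Return (SWL, SWH, AWL, AWH) as drawn in Section 7.5.1.

    The start-of-connection clamping to ISR and ISS is left out.
    """
    swl = (gsr + 1 - w // 4) % MOD
    swh = (gsr + (3 * w + 3) // 4) % MOD  # GSR + ceil(3W/4)
    awl = (gss + 1 - w_prime) % MOD
    awh = gss % MOD
    return swl, swh, awl, awh


CLOSING = ("CloseReq", "Close", "Reset")
SYNC = ("Sync", "SyncAck")
NO_ACKNO = ("Request", "Data")


def sequence_valid(ptype, seqno, ackno, gsr, gar, swl, swh, awl, awh):
    """Apply the lenient checks from the first table in Section 7.5.3."""
    if ptype in CLOSING:
        seq_ok = between48((gsr + 1) % MOD, seqno, swh)  # GSR < seqno <= SWH
    elif ptype in SYNC:
        # Assumed reading of "SWL <= seqno": no upper bound, circular compare.
        seq_ok = (seqno - swl) % MOD < MOD // 2
    else:
        seq_ok = between48(swl, seqno, swh)

    if ptype in NO_ACKNO:
        ack_ok = True  # "N/A" in the table
    elif ptype in CLOSING:
        ack_ok = between48(gar, ackno, awh)  # GAR <= ackno <= AWH
    else:
        ack_ok = between48(awl, ackno, awh)  # AWL <= ackno <= AWH
    return seq_ok and ack_ok
```

For instance, with GSR = 1 and W = 100 the sketch gives SWH = 76, so a DCCP-Data packet with Sequence Number 101 is sequence-invalid, while a DCCP-SyncAck with the same kind of large Sequence Number can still pass if its Acknowledgement Number is in [AWL, AWH].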
2376 The lenient checks on DCCP-Sync and DCCP-SyncAck packets allow 2377 continued operation after unusual events, such as endpoint crashes 2378 and large bursts of loss, but there's no need for leniency in the 2379 absence of unusual events -- that is, during ongoing successful 2380 communication. Therefore, DCCP implementations SHOULD use the 2381 following, more stringent checks for active connections, where a 2382 connection is considered active if it has received valid packets 2383 from the other endpoint within the last five round-trip times. 2385 Acknowledgement Number 2386 Packet Type Sequence Number Check Check 2387 ----------- --------------------- ---------------------- 2388 DCCP-Sync SWL <= seqno <= SWH AWL <= ackno <= AWH 2389 DCCP-SyncAck SWL <= seqno <= SWH AWL <= ackno <= AWH 2391 Finally, an endpoint MAY apply the following more stringent checks 2392 to DCCP-CloseReq, DCCP-Close, and DCCP-Reset packets, further 2393 lowering the probability of successful blind attacks using those 2394 packet types. Since these checks can cause extra synchronization 2395 overhead and delay connection closing when packets are lost, they 2396 should be considered experimental. 2398 Acknowledgement Number 2399 Packet Type Sequence Number Check Check 2400 ----------- --------------------- ---------------------- 2401 DCCP-CloseReq seqno == GSR + 1 GAR <= ackno <= AWH 2402 DCCP-Close seqno == GSR + 1 GAR <= ackno <= AWH 2403 DCCP-Reset seqno == GSR + 1 GAR <= ackno <= AWH 2405 Note that sequence-validity is only one of the validity checks 2406 applied to received packets. 2408 7.5.4. Handling Sequence-Invalid Packets 2410 Endpoints MUST ignore sequence-invalid DCCP-Sync and DCCP-SyncAck 2411 packets, and MUST respond to other sequence-invalid packets with 2412 (possibly rate-limited) DCCP-Sync packets. Each DCCP-Sync packet 2413 MUST acknowledge the corresponding sequence-invalid packet's 2414 Sequence Number, not GSR. 
The DCCP-Sync MUST use a new Sequence 2415 Number, and thus will increase GSS; GSR will not change, however, 2416 since the received packet was sequence-invalid. 2418 On receiving a sequence-valid DCCP-Sync packet, the peer endpoint 2419 (say, DCCP B) MUST update its GSR variable and reply with a DCCP- 2420 SyncAck packet. The DCCP-SyncAck packet's Acknowledgement Number 2421 will equal the DCCP-Sync's Sequence Number, not necessarily GSR. 2422 Upon receiving this DCCP-SyncAck, which will be sequence-valid since 2423 it acknowledges the DCCP-Sync, DCCP A will update its GSR variable, 2424 and the endpoints will be back in sync. As an exception, if the 2425 peer endpoint is in the REQUEST state, it MUST respond with a DCCP- 2426 Reset instead of a DCCP-SyncAck. This serves to clean up DCCP A's 2427 half-open connection. 2429 To protect against denial-of-service attacks, DCCP implementations 2430 SHOULD impose a rate limit on DCCP-Syncs sent in response to 2431 sequence-invalid packets, such as not more than eight DCCP-Syncs per 2432 second. 2434 DCCP endpoints MUST NOT process sequence-invalid packets except, 2435 perhaps, by generating a DCCP-Sync. For instance, options MUST NOT 2436 be processed. An endpoint MAY temporarily preserve sequence- 2437 invalid packets in case they become valid later, however; this can 2438 reduce the impact of bursts of loss by delivering more packets to 2439 the application. In particular, an endpoint MAY preserve sequence- 2440 invalid packets for up to 2 round-trip times. If, within that time, 2441 the relevant sequence windows change so that the packets become 2442 sequence-valid, the endpoint MAY process them again. 2444 Note that sequence-invalid DCCP-Reset packets cause DCCP-Syncs to be 2445 generated. This is because endpoints in an unsynchronized state 2446 (CLOSED, REQUEST, and LISTEN) might not have enough information to 2447 generate a proper DCCP-Reset on the first try.
For example, if a 2448 peer endpoint is in CLOSED state and receives a DCCP-Data packet, it 2449 cannot guess the right Sequence Number to use on the DCCP-Reset it 2450 generates (since the DCCP-Data packet has no Acknowledgement 2451 Number). The DCCP-Sync generated in response to this bad reset 2452 serves as a challenge, and contains enough information for the peer 2453 to generate a proper DCCP-Reset. However, the new DCCP-Reset may 2454 carry a different Reset Code than the original DCCP-Reset; probably 2455 the new Reset Code will be 3, "No Connection". The endpoint SHOULD 2456 use information from the original DCCP-Reset when possible. 2458 7.5.5. Sequence Number Attacks 2460 Sequence and Acknowledgement Numbers form DCCP's main line of 2461 defense against attackers. An attacker that cannot guess sequence 2462 numbers cannot easily manipulate or hijack a DCCP connection, and 2463 requirements like careful initial sequence number choice eliminate 2464 the most serious attacks. 2466 An attacker might still send many packets with randomly chosen 2467 Sequence and Acknowledgement Numbers, however. If one of those 2468 probes ends up sequence-valid, it may shut down the connection or 2469 otherwise cause problems. The easiest such attacks to execute are: 2471 o Send DCCP-Data packets with random Sequence Numbers. If one of 2472 these packets hits the valid sequence number window, the attack 2473 packet's application data may be inserted into the data stream. 2475 o Send DCCP-Sync packets with random Sequence and Acknowledgement 2476 Numbers. If one of these packets hits the valid acknowledgement 2477 number window, the receiver will shift its sequence number window 2478 accordingly, getting out of sync with the correct endpoint -- 2479 perhaps permanently. 2481 The attacker has to guess both Source and Destination Ports for any 2482 of these attacks to succeed. 
Additionally, the connection would 2483 have to be inactive for the DCCP-Sync attack to succeed, assuming 2484 the victim implemented the more stringent checks for active 2485 connections recommended in Section 7.5.3. 2487 To quantify the probability of success, let N be the number of 2488 attack packets the attacker is willing to send, W be the relevant 2489 sequence window width, and L be the length of sequence numbers (24 2490 or 48). The attacker's best strategy is to space the attack packets 2491 evenly over sequence space. Then the probability of hitting one 2492 sequence number window is P = WN/2^L. 2494 The success probability for a DCCP-Data attack using short sequence 2495 numbers thus equals P = WN/2^24. For W = 100, then, the attacker 2496 must send more than 83,000 packets to achieve a 50% chance of 2497 success. For reference, the easiest TCP attack -- sending a SYN 2498 with a random sequence number, which will cause a connection reset 2499 if it falls within the window -- has W = 8760 (a common default) and 2500 L = 32, and requires more than 245,000 packets to achieve a 50% 2501 chance of success. 2503 A fast connection's W will generally be high, increasing the attack 2504 success probability for fixed N. If this probability gets 2505 uncomfortably high with L = 24, the endpoint SHOULD prevent the use 2506 of short sequence numbers by manipulating the Allow Short Sequence 2507 Numbers feature (see Section 7.6.1). The probability limit depends 2508 on the application, however. Some applications, such as those 2509 already designed to handle corruption, are quite resilient to data 2510 injection attacks. 2512 The DCCP-Sync attack has L = 48, since DCCP-Sync packets use long 2513 sequence numbers exclusively; in addition, the success probability 2514 is halved, since only half the Sequence Number space is valid. 2515 Attacks have a correspondingly smaller probability of success. 
For 2516 a large W of 2000 packets, then, the attacker must send more than 2517 10^11 packets to achieve a 50% chance of success. 2519 Attacks involving DCCP-Ack, DCCP-DataAck, DCCP-CloseReq, DCCP-Close, 2520 and DCCP-Reset packets are more difficult, since Sequence and 2521 Acknowledgement Numbers must both be guessed. The probability of 2522 attack success for these packet types equals P = WXN/2^(2L), where W 2523 is the Sequence Number window, X is the Acknowledgement Number 2524 window, and N and L are as before. 2526 Since DCCP-Data attacks with short sequence numbers are relatively 2527 easy for attackers to execute, DCCP has been engineered to prevent 2528 these attacks from escalating to connection resets or other serious 2529 consequences. In particular, any options whose processing might 2530 cause the connection to be reset are ignored when they appear on 2531 DCCP-Data packets. 2533 7.5.6. Examples 2535 In the following example, DCCP A and DCCP B recover from a large 2536 burst of loss that runs DCCP A's sequence numbers out of DCCP B's 2537 appropriate sequence number window. 2539 DCCP A DCCP B 2540 (GSS=1,GSR=10) (GSS=10,GSR=1) 2541 --> DCCP-Data(seq 2) XXX 2542 ... 2543 --> DCCP-Data(seq 100) XXX 2544 --> DCCP-Data(seq 101) --> ??? 2545 seqno out of range; 2546 send Sync 2547 OK <-- DCCP-Sync(seq 11, ack 101) <-- 2548 (GSS=11,GSR=1) 2549 --> DCCP-SyncAck(seq 102, ack 11) --> OK 2550 (GSS=102,GSR=11) (GSS=11,GSR=102) 2552 In the next example, a DCCP connection recovers from a simple blind 2553 attack. 2555 DCCP A DCCP B 2556 (GSS=1,GSR=10) (GSS=10,GSR=1) 2557 *ATTACKER* --> DCCP-Data(seq 10^6) --> ??? 2558 seqno out of range; 2559 send Sync 2560 ??? <-- DCCP-Sync(seq 11, ack 10^6) <-- 2561 ackno out of range; ignore 2562 (GSS=1,GSR=10) (GSS=11,GSR=1) 2564 The final example demonstrates recovery from a half-open connection. 2566 DCCP A DCCP B 2567 (GSS=1,GSR=10) (GSS=10,GSR=1) 2568 (Crash) 2569 CLOSED OPEN 2570 REQUEST --> DCCP-Request(seq 400) --> ??? 
2571 !! <-- DCCP-Sync(seq 11, ack 400) <-- OPEN 2572 REQUEST --> DCCP-Reset(seq 401, ack 11) --> (Abort) 2573 REQUEST CLOSED 2574 REQUEST --> DCCP-Request(seq 402) --> ... 2576 7.6. Short Sequence Numbers 2578 DCCP sequence numbers are 48 bits long. This large sequence space 2579 protects DCCP connections against some blind attacks, such as the 2580 injection of DCCP-Resets into the connection. However, DCCP-Data, 2581 DCCP-Ack, and DCCP-DataAck packets, which make up the body of any 2582 DCCP connection, may reduce header space by transmitting only the 2583 lower 24 bits of the relevant Sequence and Acknowledgement Numbers. 2584 The receiving endpoint will extend these numbers to 48 bits using 2585 the following pseudocode: 2587 procedure Extend_Sequence_Number(S, REF) 2588 /* S is a 24-bit sequence number from the packet header. 2589 REF is the relevant 48-bit reference sequence number: 2590 GSS if S is an Acknowledgement Number, and GSR if S is a 2591 Sequence Number. */ 2592 Set REF_low := low 24 bits of REF 2593 Set REF_hi := high 24 bits of REF 2594 If REF_low (<) S /* circular comparison mod 2^24 */ 2595 && S |<| REF_low, /* conventional, non-circular 2596 comparison */ 2597 Return (((REF_hi + 1) mod 2^24) << 24) | S 2598 Otherwise, 2599 Return (REF_hi << 24) | S 2601 The two different kinds of comparison in the if statement detect 2602 when the low-order bits of the sequence space have wrapped. (The 2603 circular comparison "REF_low (<) S" returns true if and only if 2604 (S - REF_low), calculated using two's-complement arithmetic and then 2605 represented as an unsigned number, is less than or equal to 2^23 2606 (mod 2^24).) When this happens, the high-order bits are 2607 incremented. 2609 7.6.1. Allow Short Sequence Numbers Feature 2611 Endpoints can require that all packets use long sequence numbers by 2612 setting the Allow Short Sequence Numbers feature to false. 
This can 2613 reduce the risk that data will be inappropriately injected into the 2614 connection. DCCP A sends a "Change R(Allow Short Seqnos, 0)" option 2615 to ask DCCP B to send only long sequence numbers. 2617 Allow Short Sequence Numbers has feature number 2, and is server- 2618 priority. It takes one-byte Boolean values. DCCP B MUST NOT send 2619 packets with short sequence numbers when Allow Short Seqnos/B is 2620 zero. Values of two or more are reserved. New connections start 2621 with Allow Short Sequence Numbers 1 for both endpoints. 2623 7.6.2. When to Avoid Short Sequence Numbers 2625 Short sequence numbers reduce the rate DCCP connections can safely 2626 achieve, and increase the risks of certain kinds of attacks, 2627 including blind data injection. Very-high-rate DCCP connections, 2628 and connections with large sequence windows (Section 7.5.2), SHOULD 2629 NOT use short sequence numbers on their data packets. The attack 2630 risk issues have been discussed in Section 7.5.5; we discuss the 2631 rate limitation issue here. 2633 The sequence-validity mechanism assumes that the network does not 2634 deliver extremely old data. In particular, it assumes that the 2635 network must have dropped any packet by the time the connection 2636 wraps around and uses its sequence number again. This constraint 2637 limits the maximum connection rate that can be safely achieved. Let 2638 MSL equal the maximum segment lifetime in seconds, P equal the average DCCP 2639 packet size in bits, and L equal the length of sequence numbers (24 2640 or 48 bits). Then the maximum safe rate, in bits per second, is R = 2641 P*(2^L)/(2*MSL). 2643 For the default MSL of 2 minutes, 1500-byte DCCP packets, and short 2644 sequence numbers, the safe rate is therefore approximately 800 Mb/s.
2645 Although 2 minutes is a very large MSL for any networks that could 2646 sustain that rate with such small packets, long sequence numbers 2647 allow much higher rates under the same constraints: up to 2648 14 petabits a second for 1500-byte packets and the default MSL. 2650 7.7. NDP Count and Detecting Application Loss 2652 DCCP's sequence numbers increment by one on every packet, including 2653 non-data packets (packets that don't carry application data). This 2654 makes DCCP sequence numbers suitable for detecting any network loss, 2655 but not for detecting the loss of application data. The NDP Count 2656 option reports the length of each burst of non-data packets. This 2657 lets the receiving DCCP reliably determine when a burst of loss 2658 included application data. 2660 +--------+--------+-------- ... --------+ 2661 |00100101| Length | NDP Count | 2662 +--------+--------+-------- ... --------+ 2663 Type=37 Len=3-5 (1-3 bytes) 2665 If a DCCP endpoint's Send NDP Count feature is one (see below), then 2666 that endpoint MUST send an NDP Count option on every packet whose 2667 immediate predecessor was a non-data packet. Non-data packets 2668 consist of DCCP packet types DCCP-Ack, DCCP-Close, DCCP-CloseReq, 2669 DCCP-Reset, DCCP-Sync, and DCCP-SyncAck. The other packet types, 2670 namely DCCP-Request, DCCP-Response, DCCP-Data, and DCCP-DataAck, are 2671 considered data packets, although not all DCCP-Request and DCCP- 2672 Response packets will actually carry application data. 2674 The value stored in NDP Count equals the number of consecutive non- 2675 data packets in the run immediately previous to the current packet. 2676 Packets with no NDP Count option are considered to have NDP Count 2677 zero. 2679 The NDP Count option can carry one to three bytes of data. The 2680 smallest option format that can hold the NDP Count SHOULD be used. 2682 With NDP Count, the receiver can reliably tell only whether a burst 2683 of loss contained at least one data packet. 
   For example, the receiver cannot always tell whether a burst of loss
   contained a non-data packet.

7.7.1. Usage Notes

   Say that K consecutive sequence numbers are missing in some burst of
   loss, and the Send NDP Count feature is on. Then some application
   data was lost within those sequence numbers unless the packet
   following the hole contains an NDP Count option whose value is
   greater than or equal to K.

   For example, say that an endpoint sent the following sequence of
   non-data packets (Nx) and data packets (Dx).

      N0  N1  D2  N3  D4  D5  N6  D7  D8  D9  D10  N11  N12  D13

   Those packets would have NDP Counts as follows.

      N0  N1  D2  N3  D4  D5  N6  D7  D8  D9  D10  N11  N12  D13
      -   1   2   -   1   -   -   1   -   -   -    -    1    2

   NDP Count is not useful for applications that include their own
   sequence numbers with their packet headers.

7.7.2. Send NDP Count Feature

   The Send NDP Count feature lets DCCP endpoints negotiate whether
   they should send NDP Count options on their packets. DCCP A sends a
   "Change R(Send NDP Count, 1)" option to ask DCCP B to send NDP Count
   options.

   Send NDP Count has feature number 7, and is server-priority. It
   takes one-byte Boolean values. DCCP B MUST send NDP Count options
   as described above when Send NDP Count/B is one, although it MAY
   send NDP Count options even when Send NDP Count/B is zero. Values
   of two or more are reserved. New connections start with Send NDP
   Count 0 for both endpoints.

8. Event Processing

   This section describes how DCCP connections move between states, and
   which packets are sent when. Note that feature negotiation takes
   place in parallel with the connection-wide state transitions
   described here.

8.1. Connection Establishment

   DCCP connections' initiation phase consists of a three-way
   handshake: an initial DCCP-Request packet sent by the client, a
   DCCP-Response sent by the server in reply, and finally an
   acknowledgement from the client, usually via a DCCP-Ack or
   DCCP-DataAck packet. The client moves from the REQUEST state to
   PARTOPEN, and finally to OPEN; the server moves from LISTEN to
   RESPOND, and finally to OPEN.

   Client State                             Server State
      CLOSED                                   LISTEN
   1. REQUEST   -->       Request        -->
   2.           <--       Response       <--   RESPOND
   3. PARTOPEN  -->     Ack, DataAck     -->
   4.           <--  Data, Ack, DataAck  <--   OPEN
   5. OPEN      <->  Data, Ack, DataAck  <->   OPEN

8.1.1. Client Request

   When a client decides to initiate a connection, it enters the
   REQUEST state, chooses an initial sequence number (Section 7.2), and
   sends a DCCP-Request packet using that sequence number to the
   intended server.

   DCCP-Request packets will commonly carry feature negotiation options
   that open negotiations for various connection parameters, such as
   preferred congestion control IDs for each half-connection. They may
   also carry application data, but the client should be aware that the
   server may not accept such data.

   A client in the REQUEST state SHOULD use an exponential-backoff
   timer to send new DCCP-Request packets if no response is received.
   The first retransmission should occur after approximately one
   second, backing off to not less than one packet every 64 seconds; or
   the endpoint can use whatever retransmission strategy is followed
   for retransmitting TCP SYNs. Each new DCCP-Request MUST increment
   the Sequence Number by one, and MUST contain the same Service Code
   and application data as the original DCCP-Request.

   A client MAY give up on its DCCP-Requests after some time
   (3 minutes, for example).
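   One possible retransmission schedule consistent with the guidance
   above can be sketched as follows. The doubling factor, the function
   name, and the 180-second give-up time are assumptions made for
   illustration; the text only asks for exponential backoff starting
   near one second, capped at one packet every 64 seconds.

```python
def request_retransmit_schedule(first_interval=1.0, max_interval=64.0,
                                give_up_after=180.0):
    """Return (seconds since first Request, seqno offset) per retransmission.

    Assumes the interval doubles after every retransmission (an
    illustrative choice) and that the client gives up after
    give_up_after seconds (the text's 3-minute example).
    """
    schedule = []
    elapsed = 0.0
    interval = first_interval
    seqno_offset = 0
    while elapsed + interval <= give_up_after:
        elapsed += interval
        # Each new DCCP-Request increments the Sequence Number by one.
        seqno_offset += 1
        schedule.append((elapsed, seqno_offset))
        # Back off, but never slower than one packet every max_interval.
        interval = min(interval * 2, max_interval)
    return schedule


for when, offset in request_retransmit_schedule():
    print(f"t={when:5.0f}s  send Request with seqno ISS+{offset}")
```

   Under these assumptions the client retransmits seven times before
   the three-minute limit is reached.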
   When it does, it SHOULD send a DCCP-Reset packet to the server with
   Reset Code 2, "Aborted", to clean up state in case one or more of
   the Requests actually arrived. A client in REQUEST state has never
   received an initial sequence number from its peer, so the
   DCCP-Reset's Acknowledgement Number MUST be set to zero.

   The client leaves the REQUEST state for PARTOPEN when it receives a
   DCCP-Response from the server.

8.1.2. Service Codes

   Each DCCP-Request contains a 32-bit Service Code, which identifies
   the application-level service to which the client application is
   trying to connect. Service Codes should correspond to application
   services and protocols. For example, there might be a Service Code
   for SIP control connections and one for RTP audio connections.
   Middleboxes, such as firewalls, can use the Service Code to identify
   the application running on a nonstandard port (assuming the DCCP
   header has not been encrypted).

   Endpoints MUST associate a Service Code with every DCCP socket, both
   actively and passively opened. The application will generally
   supply this Service Code. Each active socket MUST have exactly one
   Service Code. Passive sockets MAY, at the implementation's
   discretion, be associated with more than one Service Code; this
   might let multiple applications, or multiple versions of the same
   application, listen on the same port, differentiated by Service
   Code. If the DCCP-Request's Service Code doesn't match any of the
   server's Service Codes for the given port, the server MUST reject
   the request by sending a DCCP-Reset packet with Reset Code 8, "Bad
   Service Code". A middlebox MAY also send such a DCCP-Reset in
   response to packets whose Service Code is considered unsuitable.

   Service Codes are not intended to be DCCP-specific, and are
   allocated by IANA.
   Following the policies outlined in RFC 2434, most Service Codes are
   allocated First Come First Served, subject to the following
   guidelines.

   o  Service Codes are allocated one at a time, or in small blocks. A
      short English description of the intended service is REQUIRED to
      obtain a Service Code assignment, but no specification,
      standards-track or otherwise, is necessary. IANA maintains an
      association of Service Codes to the corresponding phrases.

   o  Users request specific Service Code values. We suggest that
      users request Service Codes that can be interpreted as meaningful
      four-byte ASCII strings. Thus, the "Frobodyne Plotz Protocol"
      might correspond to "fdpz", or the number 1717858426. The
      canonical interpretation of a Service Code field is numeric.

   o  Service Codes whose bytes each have values in the set {32, 45-57,
      65-90} use a Specification Required allocation policy. That is,
      these Service Codes are used for international standard or
      standards-track specifications, IETF or otherwise. (This set
      consists of the ASCII digits, uppercase letters, and characters
      space, '-', '.', and '/'.)

   o  Service Codes whose high-order byte equals 63 (ASCII '?') are
      reserved for Private Use.

   o  Service Code 0 represents the absence of a meaningful Service
      Code, and MUST NOT be allocated.

   This design for Service Code allocation is based on the allocation
   of 4-byte identifiers for Macintosh resources, PNG chunks, and
   TrueType and OpenType tables.

8.1.3. Server Response

   In the second phase of the three-way handshake, the server moves
   from the LISTEN state to RESPOND, and sends a DCCP-Response message
   to the client. In this phase, a server will often specify the
   features it would like to use, either from among those the client
   requested, or in addition to those.
   Among these options is the congestion control mechanism the server
   expects to use.

   The server MAY respond to a DCCP-Request packet with a DCCP-Reset
   packet to refuse the connection. Relevant Reset Codes for refusing
   a connection include 7, "Connection Refused", when the
   DCCP-Request's Destination Port did not correspond to a DCCP port
   open for listening; 8, "Bad Service Code", when the DCCP-Request's
   Service Code did not correspond to the service code registered with
   the Destination Port; and 9, "Too Busy", when the server is
   currently too busy to respond to requests. The server SHOULD limit
   the rate at which it generates these resets, for example to not more
   than 1024 per second.

   The server SHOULD NOT retransmit DCCP-Response packets; the client
   will retransmit the DCCP-Request if necessary. (Note that the
   "retransmitted" DCCP-Request will have, at least, a different
   sequence number from the "original" DCCP-Request. The server can
   thus distinguish true retransmissions from network duplicates.) The
   server will detect that the retransmitted DCCP-Request applies to an
   existing connection because of its Source and Destination Ports.
   Every valid DCCP-Request received while the server is in the RESPOND
   state MUST elicit a new DCCP-Response. Each new DCCP-Response MUST
   increment the server's Sequence Number by one, and MUST include the
   same application data, if any, as the original DCCP-Response.

   The server MUST NOT accept more than one piece of DCCP-Request
   application data per connection. In particular, the DCCP-Response
   sent in reply to a retransmitted DCCP-Request with application data
   SHOULD contain a Data Dropped option, in which the retransmitted
   DCCP-Request data is reported with Drop Code 0, Protocol
   Constraints.
   The original DCCP-Request SHOULD also be reported in the Data
   Dropped option, either in a Normal Block (if the server accepted the
   data, or there was no data), or in a Drop Code 0 Drop Block (if the
   server refused the data the first time as well).

   The Data Dropped and Init Cookie options are particularly useful for
   DCCP-Response packets (Sections 11.7 and 8.1.4).

   The server leaves the RESPOND state for OPEN when it receives a
   valid DCCP-Ack from the client, completing the three-way handshake.
   It MAY also leave the RESPOND state for CLOSED after a timeout of
   not less than 4MSL (8 minutes); when doing so, it SHOULD send a
   DCCP-Reset with Reset Code 2, "Aborted", to clean up state at the
   client.

8.1.4. Init Cookie Option

   +--------+--------+--------+--------+--------+--------
   |00100100| Length |         Init Cookie Value   ...
   +--------+--------+--------+--------+--------+--------
    Type=36

   The Init Cookie option lets a DCCP server avoid having to hold any
   state until the three-way connection setup handshake has completed,
   in a fashion similar to TCP SYN cookies [SYNCOOKIES]. The server
   wraps up the Service Code, server port, and any options it cares
   about from both the DCCP-Request and DCCP-Response in an opaque
   cookie. Typically the cookie will be encrypted using a secret known
   only to the server and include a cryptographic checksum or magic
   value so that correct decryption can be verified. When the server
   receives the cookie back in the client's reply, it can decrypt the
   cookie and instantiate all the state it avoided keeping. In the
   meantime, it need not move from the LISTEN state.

   The Init Cookie option MUST NOT be sent on DCCP-Request or DCCP-Data
   packets, and any such options received on DCCP-Request or DCCP-Data
   packets MUST be ignored. The server MAY include an Init Cookie
   option in its DCCP-Response.
   If so, then the client MUST echo the same Init Cookie option in each
   succeeding DCCP packet until one of those packets is acknowledged,
   meaning the three-way handshake has completed, or the connection is
   reset. (As a result, the client MUST NOT use DCCP-Data packets
   until the three-way handshake completes or the connection is reset.)
   The server SHOULD design its Init Cookie format so that Init Cookies
   can be checked for tampering; it SHOULD respond to a tampered Init
   Cookie option by resetting the connection with Reset Code 10, "Bad
   Init Cookie".

   Init Cookie's precise implementation need not be specified here;
   since Init Cookies are opaque to the client, there are no
   interoperability concerns. An example cookie format might encrypt
   (using a secret key) the connection's initial sequence and
   acknowledgement numbers, ports, Service Code, any options included
   on the DCCP-Request packet and the corresponding DCCP-Response, a
   random salt, and a magic number. On receiving a reflected Init
   Cookie, the server would decrypt the cookie, validate it by checking
   its magic number, sequence numbers, and ports, and, if valid, create
   a corresponding socket using the options.

   Init Cookies are limited to at most 253 bytes in length.

8.1.5. Handshake Completion

   When the client receives a DCCP-Response from the server, it moves
   from the REQUEST state to PARTOPEN and completes the three-way
   handshake by sending a DCCP-Ack packet to the server. The client
   remains in PARTOPEN until it can be sure that the server has
   received some packet the client sent from PARTOPEN (either the
   initial DCCP-Ack or a later packet). Clients in the PARTOPEN state
   that want to send data MUST do so using DCCP-DataAck packets, not
   DCCP-Data packets.
   This is because DCCP-Data packets lack Acknowledgement Numbers, so
   the server can't tell from a DCCP-Data packet whether the client saw
   its DCCP-Response. Furthermore, if the DCCP-Response included an
   Init Cookie, that Init Cookie MUST be included on every packet sent
   in PARTOPEN.

   The single DCCP-Ack sent when entering the PARTOPEN state might, of
   course, be dropped by the network. The client SHOULD ensure that
   some packet gets through eventually. The preferred mechanism would
   be a roughly 200-millisecond timer, set every time a packet is
   transmitted in PARTOPEN. If this timer goes off and the client is
   still in PARTOPEN, the client generates another DCCP-Ack and backs
   off the timer. If the client remains in PARTOPEN for more than 4MSL
   (8 minutes), it SHOULD reset the connection with Reset Code 2,
   "Aborted".

   The client leaves the PARTOPEN state for OPEN when it receives a
   valid packet other than DCCP-Response, DCCP-Reset, or DCCP-Sync from
   the server.

8.2. Data Transfer

   In the central data transfer phase of the connection, both server
   and client are in the OPEN state.

   DCCP A sends DCCP-Data and DCCP-DataAck packets to DCCP B due to
   application events on host A. These packets are
   congestion-controlled by the CCID for the A-to-B half-connection.
   In contrast, DCCP-Ack packets sent by DCCP A are controlled by the
   CCID for the B-to-A half-connection. Generally, DCCP A will
   piggyback acknowledgement information on DCCP-Data packets when
   acceptable, creating DCCP-DataAck packets. DCCP-Ack packets are
   used when there is no data to send from DCCP A to DCCP B, or when
   the congestion state of the A-to-B CCID will not allow data to be
   sent.

   DCCP-Sync and DCCP-SyncAck packets may also occur in the data
   transfer phase. Some cases causing DCCP-Sync generation are
   discussed in Section 7.5.
   One important distinction between DCCP-Sync packets and other packet
   types is that DCCP-Sync elicits an immediate acknowledgement. On
   receiving a valid DCCP-Sync packet, a DCCP endpoint MUST immediately
   generate and send a DCCP-SyncAck response (subject to any
   implementation rate limits); the Acknowledgement Number on that
   DCCP-SyncAck MUST equal the Sequence Number of the DCCP-Sync.

   A particular DCCP implementation might decide to initiate feature
   negotiation only once the OPEN state was reached, in which case it
   might not allow data transfer until some time later. Data received
   during that time SHOULD be rejected and reported using a Data
   Dropped Drop Block with Drop Code 0, Protocol Constraints (see
   Section 11.7).

8.3. Termination

   DCCP connection termination uses a handshake consisting of an
   optional DCCP-CloseReq packet, a DCCP-Close packet, and a DCCP-Reset
   packet. The server moves from the OPEN state, possibly through the
   CLOSEREQ state, to CLOSED; the client moves from OPEN through
   CLOSING to TIMEWAIT, and after 2MSL wait time (4 minutes), to
   CLOSED.

   The sequence DCCP-CloseReq, DCCP-Close, DCCP-Reset is used when the
   server decides to close the connection, but doesn't want to hold
   TIMEWAIT state:

   Client State                             Server State
      OPEN                                     OPEN
   1.           <--       CloseReq       <--   CLOSEREQ
   2. CLOSING   -->        Close         -->
   3.           <--        Reset         <--   CLOSED (LISTEN)
   4. TIMEWAIT
   5. CLOSED

   A shorter sequence occurs when the client decides to close the
   connection.

   Client State                             Server State
      OPEN                                     OPEN
   1. CLOSING   -->        Close         -->
   2.           <--        Reset         <--   CLOSED (LISTEN)
   3. TIMEWAIT
   4. CLOSED

   Finally, the server can decide to hold TIMEWAIT state:

   Client State                             Server State
      OPEN                                     OPEN
   1.           <--        Close         <--   CLOSING
   2. CLOSED    -->        Reset         -->
   3.                                          TIMEWAIT
   4.                                          CLOSED (LISTEN)

   In all cases, the receiver of the DCCP-Reset packet holds TIMEWAIT
   state for the connection. As in TCP, TIMEWAIT state, where an
   endpoint quietly preserves a socket for 2MSL (4 minutes) after its
   connection has closed, ensures that no connection duplicating the
   current connection's source and destination addresses and ports can
   start up while old packets might remain in the network.

   The termination handshake proceeds as follows. The receiver of a
   valid DCCP-CloseReq packet MUST respond with a DCCP-Close packet.
   The receiver of a valid DCCP-Close packet MUST respond with a
   DCCP-Reset packet, with Reset Code 1, "Closed". The receiver of a
   valid DCCP-Reset packet -- which is also the sender of the
   DCCP-Close packet (and possibly the receiver of the DCCP-CloseReq
   packet) -- will hold TIMEWAIT state for the connection.

   A DCCP-Reset packet completes every DCCP connection, whether the
   termination is clean (due to application close; Reset Code 1,
   "Closed") or unclean. Unlike TCP, which has two distinct
   termination mechanisms (FIN and RST), DCCP ends all connections in a
   uniform manner. This is justified because some aspects of
   connection termination are the same independent of whether
   termination was clean. For instance, the endpoint that receives a
   valid DCCP-Reset SHOULD hold TIMEWAIT state for the connection.
   Processors that must distinguish between clean and unclean
   termination can examine the Reset Code. DCCP-Reset packets MUST NOT
   be generated in response to received DCCP-Reset packets. DCCP
   implementations generally transition to the CLOSED state after
   sending a DCCP-Reset packet.

   Endpoints in the CLOSEREQ and CLOSING states MUST retransmit
   DCCP-CloseReq and DCCP-Close packets, respectively, until leaving
   those states.
   The retransmission timer should initially be set to go off in two
   round-trip times, and should back off to not less than once every
   64 seconds if no relevant response is received.

   Only the server can send a DCCP-CloseReq packet or enter the
   CLOSEREQ state. A server receiving a sequence-valid DCCP-CloseReq
   packet MUST respond with a DCCP-Sync packet, and otherwise ignore
   the DCCP-CloseReq.

   DCCP-Data, DCCP-DataAck, and DCCP-Ack packets received in CLOSEREQ
   or CLOSING states MAY be either processed or ignored.

8.3.1. Abnormal Termination

   DCCP endpoints generate DCCP-Reset packets to terminate connections
   abnormally; a DCCP-Reset packet may be generated from any state.
   Resets sent in the CLOSED, LISTEN, and TIMEWAIT states use Reset
   Code 3, "No Connection", unless otherwise specified. Resets sent in
   the REQUEST or RESPOND states use Reset Code 4, "Packet Error",
   unless otherwise specified.

   DCCP endpoints in CLOSED or LISTEN state may need to generate a
   DCCP-Reset packet in response to a packet received from a peer.
   Since these states have no associated sequence number variables, the
   Sequence and Acknowledgement Numbers on the DCCP-Reset packet R are
   taken from the received packet P, as follows.

   1. If P.ackno exists, then set R.seqno := P.ackno + 1. Otherwise,
      set R.seqno := 0.

   2. Set R.ackno := P.seqno.

   3. If the packet used short sequence numbers (P.X == 0), then set
      the upper 24 bits of R.seqno and R.ackno to 0.

8.4. DCCP State Diagram

   The most common state transitions discussed above can be summarized
   in the following state diagram. The diagram is illustrative; the
   text in Section 8.5 and elsewhere should be considered definitive.
   For example, there are arcs (not shown) from every state except
   CLOSED to TIMEWAIT, contingent on the receipt of a valid DCCP-Reset.
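   For readers who prefer a table to a picture, the arcs of the state
   diagram in this section can also be written as a lookup table. The
   sketch below is illustrative only and carries no more authority
   than the diagram. The event strings, the TRANSITIONS table, and the
   next_state function are names chosen for this example; treating an
   unlisted (state, event) pair as "stay in the same state" is an
   assumption, and every Reset is assumed to have already passed
   validity checks.

```python
# (current state, event) -> next state, following the diagram's arcs.
TRANSITIONS = {
    ("CLOSED", "passive open"): "LISTEN",
    ("CLOSED", "active open"): "REQUEST",          # snd Request
    ("LISTEN", "rcv Request"): "RESPOND",          # snd Response
    ("REQUEST", "rcv Response"): "PARTOPEN",       # snd Ack
    ("RESPOND", "rcv Ack/DataAck"): "OPEN",
    ("PARTOPEN", "rcv packet"): "OPEN",
    ("OPEN", "server active close"): "CLOSEREQ",   # snd CloseReq
    ("OPEN", "active close"): "CLOSING",           # snd Close
    ("OPEN", "rcv CloseReq"): "CLOSING",           # snd Close
    ("OPEN", "rcv Close"): "CLOSED",               # snd Reset
    ("CLOSEREQ", "rcv Close"): "CLOSED",           # snd Reset
    ("CLOSING", "rcv Reset"): "TIMEWAIT",
    ("TIMEWAIT", "2MSL timer expires"): "CLOSED",
}


def next_state(state: str, event: str) -> str:
    """Return the state reached from `state` when `event` occurs."""
    # Arcs not drawn in the diagram: a valid Reset moves every state
    # except CLOSED to TIMEWAIT.
    if event == "rcv Reset" and state != "CLOSED":
        return "TIMEWAIT"
    # Assumption for this sketch: unknown pairs leave the state unchanged.
    return TRANSITIONS.get((state, event), state)


# Walk the client side of a connection that the client itself closes.
state = "CLOSED"
for event in ["active open", "rcv Response", "rcv packet",
              "active close", "rcv Reset", "2MSL timer expires"]:
    state = next_state(state, event)
    print(f"{event:>20} -> {state}")
```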

   +---------------------------+    +---------------------------+
   |                           v    v                           |
   |                        +----------+                        |
   |          +-------------+  CLOSED  +------------+           |
   |          | passive     +----------+  active    |           |
   |          |  open                      open     |           |
   |          |                         snd Request |           |
   |          v                                     v           |
   |     +----------+                          +----------+     |
   |     |  LISTEN  |                          | REQUEST  |     |
   |     +----+-----+                          +----+-----+     |
   |          | rcv Request            rcv Response |           |
   |          | snd Response             snd Ack    |           |
   |          v                                     v           |
   |     +----------+                          +----------+     |
   |     | RESPOND  |                          | PARTOPEN |     |
   |     +----+-----+                          +----+-----+     |
   |          | rcv Ack/DataAck         rcv packet  |           |
   |          |                                     |           |
   |          |             +----------+            |           |
   |          +------------>|   OPEN   |<-----------+           |
   |                        +--+-+--+--+                        |
   |       server active close | |  |   active close            |
   |           snd CloseReq    | |  | or rcv CloseReq           |
   |                           | |  |    snd Close              |
   |                           | |  |                           |
   |     +----------+          | |  |          +----------+     |
   |     | CLOSEREQ |<---------+ |  +--------->| CLOSING  |     |
   |     +----+-----+            |             +----+-----+     |
   |          | rcv Close        |        rcv Reset |           |
   |          | snd Reset        |                  |           |
   |<---------+                  |                  v           |
   |                             |             +----+-----+     |
   |                  rcv Close  |             | TIMEWAIT |     |
   |                  snd Reset  |             +----+-----+     |
   +-----------------------------+                  |           |
                                                    +-----------+
                                                 2MSL timer expires

8.5. Pseudocode

   This section presents an algorithm describing the processing steps a
   DCCP endpoint must go through when it receives a packet. A DCCP
   implementation need not implement the algorithm as it is described
   here, but any implementation MUST generate observable effects
   exactly as indicated by this pseudocode, except where allowed
   otherwise by another part of this document.

   The received packet is written as P, the socket as S.
   Packet variables P.seqno and P.ackno are 48-bit sequence numbers.
   Socket variables:
   S.SWL - sequence number window low
   S.SWH - sequence number window high
   S.AWL - acknowledgement number window low
   S.AWH - acknowledgement number window high
   S.ISS - initial sequence number sent
   S.ISR - initial sequence number received
   S.OSR - first OPEN sequence number received
   S.GSS - greatest sequence number sent
   S.GSR - greatest valid sequence number received
   S.GAR - greatest valid acknowledgement number received on a
           non-Sync; initialized to S.ISS
   "Send packet" actions always use, and increment, S.GSS.

   Step 1: Check header basics
      /* This step checks for malformed packets. Packets that fail
         these checks are ignored -- they do not receive Resets in
         response */
      If the packet is shorter than 12 bytes, drop packet and return
      If the packet type is not understood, drop packet and return
      If P.Data Offset is too small for packet type, or too large for
            packet, drop packet and return
      If P.type is not Data, Ack, or DataAck and P.X == 0 (the packet
            has short sequence numbers), drop packet and return
      If the header checksum is incorrect, drop packet and return
      If P.CsCov is too large for the packet size, drop packet and
            return

   Step 2: Check ports and process TIMEWAIT state
      Look up flow ID in table and get corresponding socket
      If no socket, or S.state == TIMEWAIT,
         Generate Reset(No Connection) unless P.type == Reset
         Drop packet and return

   Step 3: Process LISTEN state
      If S.state == LISTEN,
         If P.type == Request or P contains a valid Init Cookie option,
            /* Must scan the packet's options to check for an Init
               Cookie. Only the Init Cookie is processed here,
               however; other options are processed in Step 8.
               This scan need only be performed if the endpoint uses
               Init Cookies */
            /* Generate a new socket and switch to that socket */
            Set S := new socket for this port pair
            S.state := RESPOND
            Choose S.ISS (initial seqno) or set from Init Cookie
            Set S.ISR, S.GSR, S.SWL, S.SWH from packet or Init Cookie
            Continue with S.state == RESPOND
            /* A Response packet will be generated in Step 11 */
         Otherwise,
            Generate Reset(No Connection) unless P.type == Reset
            Drop packet and return

   Step 4: Prepare sequence numbers in REQUEST
      If S.state == REQUEST,
         If (P.type == Response or P.type == Reset)
               and S.AWL <= P.ackno <= S.AWH,
            /* Set sequence number variables corresponding to the
               other endpoint, so P will pass the tests in Step 6 */
            Set S.GSR, S.ISR, S.SWL, S.SWH
            /* Response processing continues in Step 10; Reset
               processing continues in Step 9 */
         Otherwise,
            /* Only Response and Reset are valid in REQUEST state */
            Generate Reset(Packet Error)
            Drop packet and return

   Step 5: Prepare sequence numbers for Sync
      If P.type == Sync or P.type == SyncAck,
         If S.AWL <= P.ackno <= S.AWH and P.seqno >= S.SWL,
            /* P is valid, so update sequence number variables
               accordingly. After this update, P will pass the tests
               in Step 6.
               A SyncAck is generated if necessary in Step 15 */
            Update S.GSR, S.SWL, S.SWH
         Otherwise,
            Drop packet and return

   Step 6: Check sequence numbers
      Let LSWL = S.SWL and LAWL = S.AWL
      If P.type == CloseReq or P.type == Close or P.type == Reset,
         LSWL := S.GSR + 1, LAWL := S.GAR
      If LSWL <= P.seqno <= S.SWH
            and (P.ackno does not exist or LAWL <= P.ackno <= S.AWH),
         Update S.GSR, S.SWL, S.SWH
         If P.type != Sync,
            Update S.GAR
      Otherwise,
         Send Sync packet acknowledging P.seqno
         Drop packet and return

   Step 7: Check for unexpected packet types
      If (S.is_server and P.type == CloseReq)
            or (S.is_server and P.type == Response)
            or (S.is_client and P.type == Request)
            or (S.state >= OPEN and P.type == Request
                and P.seqno >= S.OSR)
            or (S.state >= OPEN and P.type == Response
                and P.seqno >= S.OSR)
            or (S.state == RESPOND and P.type == Data),
         Send Sync packet acknowledging P.seqno
         Drop packet and return

   Step 8: Process options and mark acknowledgeable
      /* Option processing is not specifically described here.
         Certain options, such as Mandatory, may cause the connection
         to be reset, in which case Steps 9 and on are not executed */
      Mark packet as acknowledgeable (in Ack Vector terms, Received
            or Received ECN Marked)

   Step 9: Process Reset
      If P.type == Reset,
         Tear down connection
         S.state := TIMEWAIT
         Set TIMEWAIT timer
         Drop packet and return

   Step 10: Process REQUEST state (second part)
      If S.state == REQUEST,
         /* If we get here, P is a valid Response from the server (see
            Step 4), and we should move to PARTOPEN state.
            PARTOPEN means send an Ack, don't send Data packets,
            retransmit Acks periodically, and always include any Init
            Cookie from the Response */
         S.state := PARTOPEN
         Set PARTOPEN timer
         Continue with S.state == PARTOPEN
         /* Step 12 will send the Ack completing the three-way
            handshake */

   Step 11: Process RESPOND state
      If S.state == RESPOND,
         If P.type == Request,
            Send Response, possibly containing Init Cookie
            If Init Cookie was sent,
               Destroy S and return
               /* Step 3 will create another socket when the client
                  completes the three-way handshake */
         Otherwise,
            S.OSR := P.seqno
            S.state := OPEN

   Step 12: Process PARTOPEN state
      If S.state == PARTOPEN,
         If P.type == Response,
            Send Ack
         Otherwise, if P.type != Sync,
            S.OSR := P.seqno
            S.state := OPEN

   Step 13: Process CloseReq
      If P.type == CloseReq and S.state < CLOSEREQ,
         Generate Close
         S.state := CLOSING
         Set CLOSING timer

   Step 14: Process Close
      If P.type == Close,
         Generate Reset(Closed)
         Tear down connection
         Drop packet and return

   Step 15: Process Sync
      If P.type == Sync,
         Generate SyncAck

   Step 16: Process data
      /* At this point any application data on P can be passed to the
         application, except that the application MUST NOT receive
         data from more than one Request or Response */

9. Checksums

   DCCP uses a header checksum to protect its header against
   corruption. Generally, this checksum also covers any application
   data. DCCP applications can, however, request that the header
   checksum cover only part of the application data, or perhaps no
   application data at all. Link layers may then reduce their
   protection on unprotected parts of DCCP packets. For some noisy
   links, and applications that can tolerate corruption, this can
   greatly improve delivery rates and perceived performance.

   Checksum coverage may eventually impact congestion control
   mechanisms as well. A packet with corrupt application data and
   complete checksum coverage is treated as lost. This incurs a
   heavy-duty loss response from the sender's congestion control
   mechanism, which can unfairly penalize connections on links with
   high background corruption. The combination of reduced checksum
   coverage and Data Checksum options may let endpoints report packets
   as corrupt rather than dropped, using Data Dropped options and Drop
   Code 3 (see Section 11.7). This may eventually benefit
   applications. However, further research is required to determine an
   appropriate response to corruption, which can sometimes correlate
   with congestion. Corrupt packets currently incur a loss response.

   The Data Checksum option, which contains a strong CRC, lets
   endpoints detect application data corruption. An API can then be
   used to avoid delivering corrupt data to the application, even if
   links deliver corrupt data to the endpoint due to reduced checksum
   coverage. However, the use of reduced checksum coverage for
   applications that demand correct data is currently considered
   experimental. This is because the combined loss-plus-corruption
   rate for packets with reduced checksum coverage may be significantly
   higher than that for packets with full checksum coverage, although
   the loss rate will generally be lower. Actual behavior will depend
   on link design; further research and experience is required.

   Reduced checksum coverage introduces some security considerations;
   see Section 18.1. See Appendix B for further motivation and
   discussion. DCCP's implementation of reduced checksum coverage was
   inspired by UDP-Lite [RFC 3828].

9.1. Header Checksum Field

   DCCP uses the TCP/IP checksum algorithm.
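   As a reminder of what that algorithm computes, here is a minimal
   sketch of the 16-bit one's complement checksum used by TCP/IP. The
   function name internet_checksum is chosen for this example. The
   sketch takes an already-assembled byte string; building the
   pseudoheader, zeroing the Checksum field, and truncating the
   application data to the covered length are assumed to be done by
   the caller and are not shown.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum of `data`."""
    if len(data) % 2:
        data += b"\x00"          # pad on the right with 8 zero bits
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]   # big-endian 16-bit word
    while total >> 16:
        # End-around carry: fold overflow back into the low 16 bits.
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


sample = bytes.fromhex("0001f203f4f5f6f7")
checksum = internet_checksum(sample)
print(f"checksum = 0x{checksum:04x}")

# A receiver that sums the data together with the transmitted checksum
# obtains zero when nothing was corrupted.
assert internet_checksum(sample + checksum.to_bytes(2, "big")) == 0
```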
   The Checksum field in the DCCP generic header (see Section 5.1)
   equals the 16 bit one's complement of the one's complement sum of
   all 16 bit words in the DCCP header, DCCP options, a pseudoheader
   taken from the network-layer header, and, depending on the value of
   the Checksum Coverage field, some or all of the application data.
   When calculating the checksum, the Checksum field itself is treated
   as 0. If a packet contains an odd number of header and payload
   bytes to be checksummed, 8 zero bits are added on the right to form
   a 16 bit word for checksum purposes. The pad byte is not
   transmitted as part of the packet.

   The pseudoheader is calculated as for TCP. For IPv4, it is 96 bits
   long, and consists of the IPv4 source and destination addresses, the
   IP protocol number for DCCP (padded on the left with 8 zero bits),
   and the DCCP length as a 16-bit quantity (the length of the DCCP
   header with options, plus the length of any data); see RFC 793
   (Section 3.1). For IPv6, it is 320 bits long, and consists of the
   IPv6 source and destination addresses, the DCCP length as a 32-bit
   quantity, and the IP protocol number for DCCP (padded on the left
   with 24 zero bits); see RFC 2460 (Section 8.1).

   Packets with invalid header checksums MUST be ignored. In
   particular, their options MUST NOT be processed.

9.2. Header Checksum Coverage Field

   The Checksum Coverage field in the DCCP generic header (see Section
   5.1) specifies what parts of the packet are covered by the Checksum
   field, as follows:

      CsCov = 0      The Checksum field covers the DCCP header, DCCP
                     options, network-layer pseudoheader, and all
                     application data in the packet, possibly padded on
                     the right with zeros to an even number of bytes.

   CsCov = 1-15   The Checksum field covers the DCCP header, DCCP
                  options, network-layer pseudoheader, and the initial
                  (CsCov-1)*4 bytes of the packet's application data.

   Thus, if CsCov is 1, none of the application data is protected by
   the header checksum. The value (CsCov-1)*4 MUST be less than or
   equal to the length of the application data. Packets with invalid
   CsCov values MUST be ignored; in particular, their options MUST NOT
   be processed. The meanings of values other than 0 and 1 should be
   considered experimental.

   Values other than 0 specify that corruption is acceptable in some or
   all of the DCCP packet's application data. In fact, DCCP cannot
   even detect corruption in areas not covered by the header checksum,
   unless the Data Checksum option is used. Applications should not
   make any assumptions about the correctness of received data not
   covered by the checksum, and should if necessary introduce their own
   validity checks.

   A DCCP application interface should let sending applications suggest
   a value for CsCov for sent packets, defaulting to 0 (full coverage).
   The Minimum Checksum Coverage feature, described below, lets an
   endpoint refuse delivery of application data on packets with partial
   checksum coverage; by default, only fully-covered application data
   is accepted. Lower layers that support partial error detection MAY
   use the Checksum Coverage field as a hint of where errors do not
   need to be detected. Lower layers MUST use a strong error detection
   mechanism to detect at least errors that occur in the sensitive part
   of the packet, and discard damaged packets. The sensitive part
   consists of the bytes between the first byte of the IP header and
   the last byte identified by Checksum Coverage.

   For more details on application and lower-layer interface issues
   relating to partial checksumming, see [RFC 3828].

9.2.1.  Minimum Checksum Coverage Feature

   The Minimum Checksum Coverage feature lets a DCCP endpoint determine
   whether its peer is willing to accept packets with reduced Checksum
   Coverage. For example, DCCP A sends a "Change R(Minimum Checksum
   Coverage, 1)" option to DCCP B to check whether B is willing to
   accept packets with Checksum Coverage set to 1.

   Minimum Checksum Coverage has feature number 8, and is
   server-priority. It takes one-byte integer values between 0 and 15;
   values of 16 or more are reserved. Minimum Checksum Coverage/B
   reflects values of Checksum Coverage that DCCP B finds unacceptable.
   Say that the value of Minimum Checksum Coverage/B is MinCsCov. Then:

   o  If MinCsCov = 0, then DCCP B only finds packets with CsCov = 0
      acceptable.

   o  If MinCsCov > 0, then DCCP B additionally finds packets with
      CsCov >= MinCsCov acceptable.

   DCCP B MAY refuse to process application data from packets with
   unacceptable Checksum Coverage. Such packets SHOULD be reported
   using Data Dropped options (Section 11.7) with Drop Code 0, Protocol
   Constraints. New connections start with Minimum Checksum Coverage 0
   for both endpoints.

9.3.  Data Checksum Option

   The Data Checksum option holds a 32-bit CRC-32c cyclic
   redundancy-check code of a DCCP packet's application data.

   +--------+--------+--------+--------+--------+--------+
   |00101100|00000110|              CRC-32c              |
   +--------+--------+--------+--------+--------+--------+
    Type=44  Length=6

   The sending DCCP computes the CRC of the bytes comprising the
   application data area and stores it in the option data. The CRC-32c
   algorithm used for Data Checksum is the same as that used for SCTP
   [RFC 3309]; note that the CRC-32c of zero bytes of data equals zero.
   The DCCP header checksum will cover the Data Checksum option, so the
   data checksum must be computed before the header checksum.
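
   The two checksums described in Sections 9.1 through 9.3 can be
   sketched in Python as follows. This is a non-normative illustration,
   not a reference implementation. The function names are hypothetical;
   the sketch covers IPv4 only, assumes the caller passes the DCCP
   header and options as one byte string with the Checksum field
   already zeroed, uses the common reflected CRC-32C parameterization
   (polynomial 0x82F63B78), and takes IP protocol number 33 as an
   assumed default for DCCP. It does not address the byte order in
   which the CRC is stored in the option.

```python
import struct


def crc32c(data: bytes) -> int:
    """Bitwise CRC-32C (Castagnoli), reflected form; yields 0 for empty input."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


def ones_complement_sum(data: bytes) -> int:
    """One's complement sum of all 16-bit words in data."""
    if len(data) % 2:
        data += b"\x00"  # pad on the right; the pad byte is never transmitted
    total = sum(word for (word,) in struct.iter_unpack("!H", data))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return total


def covered_length(cscov: int, data_len: int) -> int:
    """Number of application data bytes covered by the header checksum."""
    if cscov == 0:
        return data_len  # full coverage
    if not 1 <= cscov <= 15:
        raise ValueError("CsCov must be between 0 and 15")
    covered = (cscov - 1) * 4
    if covered > data_len:
        raise ValueError("invalid CsCov: coverage exceeds application data")
    return covered


def header_checksum_ipv4(src: bytes, dst: bytes, header: bytes,
                         data: bytes, cscov: int, protocol: int = 33) -> int:
    """Header checksum over pseudoheader, header+options, and covered data.

    'header' must contain the generic header and all options (including
    any Data Checksum option, which therefore has to be filled in first),
    with the Checksum field set to zero.
    """
    # 96-bit IPv4 pseudoheader: addresses, zero byte, protocol, DCCP length.
    pseudo = src + dst + struct.pack("!BBH", 0, protocol,
                                     len(header) + len(data))
    covered = data[:covered_length(cscov, len(data))]
    return ~ones_complement_sum(pseudo + header + covered) & 0xFFFF
```

   A sender following this sketch would call crc32c() on the
   application data, place the result in the Data Checksum option, and
   only then call header_checksum_ipv4(). Recomputing the header
   checksum over a packet whose Checksum field holds the correct value
   yields zero, which gives a receiver a simple validity test.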

   A DCCP endpoint receiving a packet with a Data Checksum option
   SHOULD compute the received application data's CRC-32c, using the
   same algorithm as the sender, and compare the result with the Data
   Checksum value. (The endpoint can indicate its willingness to check
   Data Checksums using the Check Data Checksum feature, described
   below.) If the CRCs differ, the endpoint reacts in one of two ways.

   o  The receiving application may have requested delivery of
      known-corrupt data via some optional API. In this case, the
      packet's data MUST be delivered to the application, with a note
      that it is known to be corrupt. Furthermore, the receiving
      endpoint MUST report the packet as delivered corrupt using a Data
      Dropped option (Drop Code 7, Delivered Corrupt).

   o  Otherwise, the receiving endpoint MUST drop the application data,
      and report that data as dropped due to corruption using a Data
      Dropped option (Drop Code 3, Corrupt).

   In either case, the packet is considered acknowledgeable (since its
   header was processed), and will therefore be acknowledged using the
   equivalent of Ack Vector's Received or Received ECN Marked states.

   Although Data Checksum is intended for packets containing
   application data, it may be included on other packets, such as
   DCCP-Ack, DCCP-Sync, and DCCP-SyncAck. The receiver SHOULD calculate
   the application data area's CRC-32c on such packets, just as it does
   for DCCP-Data and similar packets; and if the CRCs differ, the
   packets similarly MUST be reported using Data Dropped options (Drop
   Code 3), although their application data areas would not be
   delivered to the application in any case.

9.3.1.  Check Data Checksum Feature

   The Check Data Checksum feature lets a DCCP endpoint determine
   whether its peer will definitely check Data Checksum options.
   DCCP A sends a Mandatory "Change R(Check Data Checksum, 1)" option
   to DCCP B to require it to check Data Checksum options (the
   connection will be reset if it cannot).

   Check Data Checksum has feature number 9, and is server-priority.
   It takes one-byte Boolean values. DCCP B MUST check any received
   Data Checksum options when Check Data Checksum/B is one, although it
   MAY check them even when Check Data Checksum/B is zero. Values of
   two or more are reserved. New connections start with Check Data
   Checksum 0 for both endpoints.

9.3.2.  Usage Notes

   Internet links must normally apply strong integrity checks to the
   packets they transmit [RFC 3828, RFC 3819]. This is the default
   case when the DCCP header's Checksum Coverage value equals zero
   (full coverage). However, the DCCP Checksum Coverage value might
   not be zero. By setting partial Checksum Coverage, the application
   indicates that it can tolerate corruption in the unprotected part of
   the application data. Recognizing this, link layers may reduce
   error detection and/or correction strength when transmitting this
   unprotected part. This, in turn, can significantly increase the
   likelihood of the endpoint receiving corrupt data; Data Checksum
   lets the receiver detect that corruption with very high probability.

10.  Congestion Control

   Each congestion control mechanism supported by DCCP is assigned a
   congestion control identifier, or CCID: a number from 0 to 255.
   During connection setup, and optionally thereafter, the endpoints
   negotiate their congestion control mechanisms by negotiating the
   values for their Congestion Control ID features. Congestion Control
   ID has feature number 1. The CCID/A value equals the CCID in use
   for the A-to-B half-connection. DCCP B sends a "Change R(CCID, K)"
   option to ask DCCP A to use CCID K for its data packets.

   CCID is a server-priority feature, so CCID negotiation options can
   list multiple acceptable CCIDs, sorted in descending order of
   priority. For example, the option "Change R(CCID, 2 3 4)" asks the
   receiver to use CCID 2 for its packets, although CCIDs 3 and 4 are
   also acceptable. (This corresponds to the bytes "35, 6, 1, 2, 3,
   4": Change R option (35), option length (6), feature ID (1), CCIDs
   (2, 3, 4).) Similarly, "Confirm L(CCID, 2, 2 3 4)" tells the
   receiver that the sender is using CCID 2 for its packets, but that
   CCIDs 3 and 4 might also be acceptable.

   Currently allocated CCIDs are as follows.

      CCID   Meaning                      Reference
      ----   -------                      ---------
      0-1    Reserved
        2    TCP-like Congestion Control  [RFC TBA]
        3    TFRC Congestion Control      [RFC TBA]
      4-255  Reserved

          Table 5: DCCP Congestion Control Identifiers

   New connections start with CCID 2 for both endpoints. If this is
   unacceptable for a DCCP endpoint, that endpoint MUST send Mandatory
   Change(CCID) options on its first packets.

   All CCIDs standardized for use with DCCP will correspond to
   congestion control mechanisms previously standardized by the IETF.
   We expect that for quite some time, all such mechanisms will be
   TCP-friendly, but TCP-friendliness is not an explicit DCCP
   requirement.

   A DCCP implementation intended for general use, such as an
   implementation in a general-purpose operating system kernel, SHOULD
   implement at least CCID 2. The intent is to make CCID 2 broadly
   available for interoperability, although particular applications
   might disallow its use.

10.1.  TCP-like Congestion Control

   CCID 2, TCP-like Congestion Control, denotes Additive Increase,
   Multiplicative Decrease (AIMD) congestion control with behavior
   modelled directly on TCP, including congestion window, slow start,
   timeouts, and so forth [RFC 2581].
   CCID 2 achieves maximum
   bandwidth over the long term, consistent with the use of end-to-end
   congestion control, but halves its congestion window in response to
   each congestion event. This leads to the abrupt rate changes
   typical of TCP. Applications should use CCID 2 if they prefer
   maximum bandwidth utilization to steadiness of rate. This is often
   the case for applications that are not playing their data directly
   to the user. For example, a hypothetical application that
   transferred files over DCCP, using application-level retransmissions
   for lost packets, would prefer CCID 2 to CCID 3. On-line games may
   also prefer CCID 2.

   CCID 2 is further described in [CCID 2 PROFILE].

10.2.  TFRC Congestion Control

   CCID 3 denotes TCP-Friendly Rate Control (TFRC), an equation-based
   rate-controlled congestion control mechanism. TFRC is designed to
   be reasonably fair when competing for bandwidth with TCP-like flows,
   where a flow is "reasonably fair" if its sending rate is generally
   within a factor of two of the sending rate of a TCP flow under the
   same conditions. However, TFRC has a much lower variation of
   throughput over time compared with TCP, which makes CCID 3 more
   suitable than CCID 2 for applications such as streaming media where
   a relatively smooth sending rate is of importance.

   CCID 3 is further described in [CCID 3 PROFILE]. The TFRC
   congestion control algorithms were initially described in RFC 3448.

10.3.  CCID-Specific Options, Features, and Reset Codes

   Half of the option types, feature numbers, and Reset Codes are
   reserved for CCID-specific use. CCIDs may often need new options,
   for communicating acknowledgement or rate information, for example;
   reserved option spaces let CCIDs create options at will without
   polluting the global option space.
   Option 128 might have different
   meanings on a half-connection using CCID 4 and a half-connection
   using CCID 8. CCID-specific options and features will never
   conflict with global options and features introduced by later
   versions of this specification.

   Any packet may contain information meant for either half-connection,
   so CCID-specific option types, feature numbers, and Reset Codes
   explicitly signal the half-connection to which they apply.

   o  Option numbers 128 through 191 are for options sent from the
      HC-Sender to the HC-Receiver; option numbers 192 through 255 are
      for options sent from the HC-Receiver to the HC-Sender.

   o  Reset Codes 128 through 191 indicate that the HC-Sender reset the
      connection (most likely because of some problem with
      acknowledgements sent by the HC-Receiver); Reset Codes 192
      through 255 indicate that the HC-Receiver reset the connection
      (most likely because of some problem with data packets sent by
      the HC-Sender).

   o  Finally, feature numbers 128 through 191 are used for features
      located at the HC-Sender; feature numbers 192 through 255 are for
      features located at the HC-Receiver. Since Change L and
      Confirm L options for a feature are sent by the feature location,
      we know that any Change L(128) option was sent by the HC-Sender,
      while any Change L(192) option was sent by the HC-Receiver.
      Similarly, Change R(128) options are sent by the HC-Receiver,
      while Change R(192) options are sent by the HC-Sender.

   For example, consider a DCCP connection where the A-to-B
   half-connection uses CCID 4 and the B-to-A half-connection uses
   CCID 5. Here is how a sampling of CCID-specific options are assigned
   to half-connections.

                                     Relevant    Relevant
      Packet  Option                 Half-conn.  CCID
      ------  ------                 ----------  ----
      A > B   128                      A-to-B     4
      A > B   192                      B-to-A     5
      A > B   Change L(128, ...)       A-to-B     4
      A > B   Change R(192, ...)       A-to-B     4
      A > B   Confirm L(128, ...)      A-to-B     4
      A > B   Confirm R(192, ...)      A-to-B     4
      A > B   Change R(128, ...)       B-to-A     5
      A > B   Change L(192, ...)       B-to-A     5
      A > B   Confirm R(128, ...)      B-to-A     5
      A > B   Confirm L(192, ...)      B-to-A     5

      B > A   128                      B-to-A     5
      B > A   192                      A-to-B     4
      B > A   Change L(128, ...)       B-to-A     5
      B > A   Change R(192, ...)       B-to-A     5
      B > A   Confirm L(128, ...)      B-to-A     5
      B > A   Confirm R(192, ...)      B-to-A     5
      B > A   Change R(128, ...)       A-to-B     4
      B > A   Change L(192, ...)       A-to-B     4
      B > A   Confirm R(128, ...)      A-to-B     4
      B > A   Confirm L(192, ...)      A-to-B     4

   Using CCID-specific options and feature options during a negotiation
   for that CCID feature is NOT RECOMMENDED, since it is difficult to
   predict the CCID that will be in force when the option is processed.
   For example, if a DCCP-Request contains the option sequence
   "Change L(CCID, 3), 128", the CCID-specific option "128" may be
   processed either by CCID 3 (if the server supports CCID 3) or by the
   default CCID 2 (if it does not). However, it is safe to include
   CCID-specific options following certain Mandatory Change(CCID)
   options. For example, if a DCCP-Request contains the option
   sequence "Mandatory, Change L(CCID, 3), 128", then either the "128"
   option will be processed by CCID 3 or the connection will be reset.

   Servers that do not implement the default CCID 2 might nevertheless
   receive CCID 2-specific options on a DCCP-Request packet. (Such a
   server MUST send Mandatory Change(CCID) options on its
   DCCP-Response, so CCID-specific options on any other packet won't
   refer to CCID 2.) The server MUST treat such options as
   non-understood. Thus, it will reset the connection on encountering a
   Mandatory CCID-specific option, send an empty Confirm for a
   non-Mandatory Change option for a CCID-specific feature, and ignore
   other options.
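
   The number-range rules in Section 10.3 can be expressed as a small
   lookup. The following Python sketch is illustrative only; the
   function names and the string labels for roles and option kinds are
   hypothetical, and the sketch handles only CCID-specific numbers (128
   through 255). It derives, for a packet travelling from one endpoint
   to the other, which half-connection a CCID-specific option or
   feature option refers to.

```python
def _is_low_half(number: int) -> bool:
    """True for 128-191, False for 192-255; rejects non-CCID-specific numbers."""
    if not 128 <= number <= 255:
        raise ValueError("not a CCID-specific number")
    return number <= 191


def option_sender_role(option_type: int) -> str:
    """Role of the endpoint that sends a CCID-specific option."""
    return "HC-Sender" if _is_low_half(option_type) else "HC-Receiver"


def feature_option_sender_role(kind: str, feature: int) -> str:
    """Role of the endpoint that sends a feature negotiation option.

    kind is one of "Change L", "Confirm L", "Change R", "Confirm R".
    Features 128-191 are located at the HC-Sender, 192-255 at the
    HC-Receiver; "L" options are sent by the feature location and
    "R" options by the other endpoint.
    """
    if kind not in ("Change L", "Confirm L", "Change R", "Confirm R"):
        raise ValueError("unknown feature option kind")
    location = "HC-Sender" if _is_low_half(feature) else "HC-Receiver"
    if kind.endswith("L"):
        return location
    return "HC-Receiver" if location == "HC-Sender" else "HC-Sender"


def relevant_half_connection(src: str, dst: str, sender_role: str) -> str:
    """Half-connection an option refers to, for a packet sent from src to dst."""
    if sender_role == "HC-Sender":
        return f"{src}-to-{dst}"  # the packet's sender is this half's data sender
    return f"{dst}-to-{src}"      # the packet's sender is this half's receiver
```

   For instance, relevant_half_connection("A", "B",
   feature_option_sender_role("Change R", 128)) evaluates to "B-to-A",
   matching the corresponding row of the table above.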

10.4.  CCID Profile Requirements

   Each CCID Profile document MUST address at least the following
   requirements:

   o  The profile MUST include the name and number of the CCID being
      described.

   o  The profile MUST describe the conditions in which it is likely to
      be useful. Often the best way to do this is by comparison to
      existing CCIDs.

   o  The profile MUST list and describe any CCID-specific options,
      features, and Reset Codes, and SHOULD list those general options
      and features described in this document that are especially
      relevant to the CCID.

   o  Any newly defined acknowledgement mechanism MUST include a way to
      transmit ECN Nonce Echoes back to the sender.

   o  The profile MUST describe the format of data packets, including
      any options that should be included and the setting of the CCval
      header field.

   o  The profile MUST describe the format of acknowledgement packets,
      including any options that should be included.

   o  The profile MUST define how data packets are congestion
      controlled. This includes responses to congestion events, idle
      and application-limited periods, and responses to the DCCP Data
      Dropped and Slow Receiver options. CCIDs that implement
      per-packet congestion control SHOULD discuss how packet size is
      factored in to congestion control decisions.

   o  The profile MUST specify when acknowledgement packets are
      generated, and how they are congestion controlled.

   o  The profile MUST define when a sender using the CCID is
      considered quiescent.

   o  The profile MUST say whether its CCID's acknowledgements ever
      need to be acknowledged, and if so, how often.

10.5.  Congestion State

   Most congestion control algorithms depend on past history to
   determine the current allowed sending rate.
   In CCID 2, this
   congestion state includes a congestion window and a measurement of
   the number of packets outstanding in the network; in CCID 3, it
   includes the lengths of recent loss intervals; and both CCIDs use an
   estimate of the round-trip time. Congestion state depends on the
   network path, and is invalidated by path changes. Therefore, DCCP
   senders and receivers SHOULD reset their congestion state --
   essentially restarting congestion control from "slow start" or
   equivalent -- on significant changes in end-to-end path. For
   example, an endpoint that sends or receives a Mobile IPv6 Binding
   Update message [RFC 3775] SHOULD reset its congestion state for any
   corresponding DCCP connections.

   A DCCP implementation MAY also reset its congestion state when a
   CCID changes (that is, a negotiation for the CCID feature completes
   successfully, and the new feature value differs from the old value).
   Thus, a connection in a heavily congested environment might evade
   end-to-end congestion control by frequently renegotiating a CCID,
   just as it could evade end-to-end congestion control by opening new
   connections for the same session. This behavior is prohibited. To
   prevent it, DCCP implementations MAY limit the rate at which CCID
   can be changed -- for instance, by refusing to change a CCID feature
   value more than once per minute.

11.  Acknowledgements

   Congestion control requires receivers to transmit information about
   packet losses and ECN marks to senders. DCCP receivers MUST report
   all congestion they see, as defined by the relevant CCID profile.
   Each CCID says when acknowledgements should be sent, what options
   they must use, and so on.
   DCCP acknowledgements are congestion
   controlled, although it is not required that the acknowledgement
   stream be more than very roughly TCP-friendly; each CCID defines how
   acknowledgements are congestion controlled.

   Most acknowledgements use DCCP options. For example, on a
   half-connection with CCID 2 (TCP-like), the receiver reports
   acknowledgement information using the Ack Vector option. This
   section describes common acknowledgement options and shows how acks
   using those options will commonly work. Full descriptions of the
   ack mechanisms used for each CCID are laid out in the CCID profile
   specifications.

   Acknowledgement options, such as Ack Vector, generally depend on the
   DCCP Acknowledgement Number, and are thus only allowed on packet
   types that carry that number (all packets except DCCP-Request and
   DCCP-Data). Detailed acknowledgement options are not necessarily
   required on every packet that carries an Acknowledgement Number,
   however.

11.1.  Acks of Acks and Unidirectional Connections

   DCCP was designed to work well for both bidirectional and
   unidirectional flows of data, and for connections that transition
   between these states. However, acknowledgements required for a
   unidirectional connection are very different from those required for
   a bidirectional connection. In particular, unidirectional
   connections need to worry about acks of acks.

   The ack-of-acks problem arises because some acknowledgement
   mechanisms are reliable. For example, an HC-Receiver using CCID 2,
   TCP-like Congestion Control, sends Ack Vectors containing completely
   reliable acknowledgement information. The HC-Sender should
   occasionally inform the HC-Receiver that it has received an ack. If
   it did not, the HC-Receiver might resend complete Ack Vector
   information, going back to the start of the connection, with every
   DCCP-Ack packet!
   However, note that acks-of-acks need not be
   reliable themselves: when an ack-of-acks is lost, the HC-Receiver
   will simply maintain, and periodically retransmit, old
   acknowledgement-related state for a little longer. Therefore, there
   is no need for acks-of-acks-of-acks.

   When communication is bidirectional, any required acks-of-acks are
   automatically contained in normal acknowledgements for data packets.
   On a unidirectional connection, however, the receiver DCCP sends no
   data, so the sender would not normally send acknowledgements.
   Therefore, the CCID in force on that half-connection must explicitly
   say whether, when, and how the HC-Sender should generate
   acks-of-acks.

   For example, consider a bidirectional connection where both
   half-connections use the same CCID (either 2 or 3), and where DCCP B
   goes "quiescent". This means that the connection becomes
   unidirectional: DCCP B stops sending data, and sends only DCCP-Ack
   packets to DCCP A. For example, in CCID 2, TCP-like Congestion
   Control, DCCP B uses Ack Vector to reliably communicate which
   packets it has received. As described above, DCCP A must
   occasionally acknowledge a pure acknowledgement from DCCP B, so that
   B can free old Ack Vector state. For instance, A might send a
   DCCP-DataAck packet every now and then, instead of DCCP-Data. In
   contrast, in CCID 3, TFRC Congestion Control, DCCP B's
   acknowledgements generally need not be reliable, since they contain
   cumulative loss rates; TFRC works even if every DCCP-Ack is lost.
   Therefore, DCCP A need never acknowledge an acknowledgement.

   When communication is unidirectional, a single CCID -- in the
   example, the A-to-B CCID -- controls both DCCPs' acknowledgements,
   in terms of their content, their frequency, and so forth.
   For
   bidirectional connections, the A-to-B CCID governs DCCP B's
   acknowledgements (including its acks of DCCP A's acks), while the
   B-to-A CCID governs DCCP A's acknowledgements.

   DCCP A switches its ack pattern from bidirectional to unidirectional
   when it notices that DCCP B has gone quiescent. It switches from
   unidirectional to bidirectional when it must acknowledge even a
   single DCCP-Data or DCCP-DataAck packet from DCCP B.

   Each CCID defines how to detect quiescence on that CCID, and how
   that CCID handles acks-of-acks on unidirectional connections. The
   B-to-A CCID defines when DCCP B has gone quiescent. Usually, this
   happens when a period has passed without B sending any data packets;
   in CCID 2, for example, this period is the maximum of 0.2 seconds
   and two round-trip times. The A-to-B CCID defines how DCCP A
   handles acks-of-acks once DCCP B has gone quiescent.

11.2.  Ack Piggybacking

   Acknowledgements of A-to-B data MAY be piggybacked on data sent by
   DCCP B, as long as that does not delay the acknowledgement longer
   than the A-to-B CCID would find acceptable. However, data
   acknowledgements often require more than 4 bytes to express. A
   large set of acknowledgements prepended to a large data packet might
   exceed the allowed maximum packet size. In this case, DCCP B SHOULD
   send separate DCCP-Data and DCCP-Ack packets, or wait, but not too
   long, for a smaller datagram.

   Piggybacking is particularly common at DCCP A when the B-to-A
   half-connection is quiescent -- that is, when DCCP A is just
   acknowledging DCCP B's acknowledgements. There are three reasons to
   acknowledge DCCP B's acknowledgements: to allow DCCP B to free up
   information about previously acknowledged data packets from A; to
   shrink the size of future acknowledgements; and to manipulate the
   rate at which future acknowledgements are sent.
   Since these are
   secondary concerns, DCCP A can generally afford to wait indefinitely
   for a data packet to piggyback its acknowledgement onto; if DCCP B
   wants to elicit an acknowledgement, it can send a DCCP-Sync.

   Any restrictions on ack piggybacking are described in the relevant
   CCID's profile.

11.3.  Ack Ratio Feature

   The Ack Ratio feature lets HC-Senders influence the rate at which
   HC-Receivers generate DCCP-Ack packets, thus controlling
   reverse-path congestion. This differs from TCP, which presently has
   no congestion control for pure acknowledgement traffic. Ack Ratio
   reverse-path congestion control does not try to be TCP-friendly. It
   just tries to avoid congestion collapse, and to be somewhat better
   than TCP in the presence of a high packet loss or mark rate on the
   reverse path.

   Ack Ratio applies to CCIDs whose HC-Receivers clock acknowledgements
   off the receipt of data packets. The value of Ack Ratio/A equals
   the rough ratio of data packets sent by DCCP A to DCCP-Ack packets
   sent by DCCP B. Higher Ack Ratios correspond to lower DCCP-Ack
   rates; the sender raises Ack Ratio when the reverse path is
   congested and lowers Ack Ratio when it is not. Each CCID profile
   defines how it controls congestion on the acknowledgement path, and,
   particularly, whether Ack Ratio is used. CCID 2, for example, uses
   Ack Ratio for acknowledgement congestion control, but CCID 3 does
   not. However, each Ack Ratio feature has a value whether or not
   that value is used by the relevant CCID.

   Ack Ratio has feature number 5, and is non-negotiable. It takes
   two-byte integer values. An Ack Ratio/A value of four means that
   DCCP B will send at least one acknowledgement packet for every four
   data packets sent by DCCP A. DCCP A sends a "Change L(Ack Ratio)"
   option to notify DCCP B of its ack ratio.
   An Ack Ratio value of
   zero indicates that the relevant half-connection does not use an Ack
   Ratio to control its acknowledgement rate. New connections start
   with Ack Ratio 2 for both endpoints; this Ack Ratio results in
   acknowledgement behavior analogous to TCP's delayed acks.

   Ack Ratio should be treated as a guideline rather than a strict
   requirement. We intend Ack Ratio-controlled acknowledgement
   behavior to resemble TCP's acknowledgement behavior when there is no
   reverse-path congestion, and to be somewhat more conservative when
   there is reverse-path congestion. Following this intent is more
   important than implementing Ack Ratio precisely. In particular:

   o  Receivers MAY piggyback acknowledgement information on data
      packets, creating DCCP-DataAck packets. The Ack Ratio does not
      apply to piggybacked acknowledgements. However, if the data
      packets are too big to carry acknowledgement information, or the
      data sending rate is lower than Ack Ratio would suggest, then
      DCCP B SHOULD send enough pure DCCP-Ack packets to maintain the
      rate of one acknowledgement per Ack Ratio received data packets.

   o  Receivers MAY rate-pace their acknowledgements, rather than
      sending acknowledgements immediately upon the receipt of data
      packets. Receivers that rate-pace acknowledgements SHOULD pick a
      rate that approximates the effect of Ack Ratio, and SHOULD
      include Elapsed Time options (Section 13.2) to help the sender
      calculate round-trip times.

   o  Receivers SHOULD implement delayed acknowledgement timers like
      TCP's, whereby any packet's acknowledgement is delayed by at most
      T seconds. This delay lets the receiver collect additional
      packets to acknowledge, and thus reduce the per-packet overhead
      of acknowledgements; but if T seconds have passed by and the ack
      is still around, it is sent out right away.
      The default value of
      T should be 0.2 seconds, as is common in TCP implementations.
      This may lead to sending more acknowledgement packets than Ack
      Ratio would suggest.

   o  Receivers SHOULD send acknowledgements immediately on receiving
      packets marked ECN Congestion Experienced, or packets whose
      out-of-order sequence numbers potentially indicate loss. However,
      there is no need to send such immediate acknowledgements for
      marked packets more than once per round-trip time.

   o  Receivers MAY ignore Ack Ratio if they perform their own
      congestion control on acknowledgements. For example, a receiver
      that knows the loss and mark rate for its DCCP-Ack packets might
      maintain a TCP-friendly acknowledgement rate on its own. Such a
      receiver MUST either ensure that it always obtains sufficient
      acknowledgement loss and mark information, or fall back to Ack
      Ratio when sufficient information is not available, as might
      happen during periods when the receiver is quiescent.

11.4.  Ack Vector Options

   The Ack Vector gives a run-length encoded history of data packets
   received at the client. Each byte of the vector gives the state of
   that data packet in the loss history, and the number of preceding
   packets with the same state. The option's data looks like this:

   +--------+--------+--------+--------+--------+--------
   |0010011?| Length |SSLLLLLL|SSLLLLLL|SSLLLLLL|  ...
   +--------+--------+--------+--------+--------+--------
   Type=38/39         \___________ Vector ___________...

   The two Ack Vector options (option types 38 and 39) differ only in
   the values they imply for ECN Nonce Echo. Section 12.2 describes
   this further.

   The vector itself consists of a series of bytes, each of whose
   encoding is:

    0 1 2 3 4 5 6 7
   +-+-+-+-+-+-+-+-+
   |Sta| Run Length|
   +-+-+-+-+-+-+-+-+

   Sta[te] occupies the most significant two bits of each byte, and can
   have one of four values, as follows.

      State  Meaning
      -----  -------
        0    Received
        1    Received ECN Marked
        2    Reserved
        3    Not Yet Received

          Table 6: DCCP Ack Vector States

   The term "ECN marked" refers to packets with ECN code point 11, CE
   (Congestion Experienced); packets received with this ECN code point
   MUST be reported using State 1, Received ECN Marked. Packets
   received with other ECN code points 00, 01, or 10 (Non-ECT, ECT(0),
   or ECT(1), respectively) MUST be reported using State 0, Received.

   Run Length, the least significant six bits of each byte, specifies
   how many consecutive packets have the given State. Run Length zero
   says the corresponding State applies to one packet only; Run Length
   63 says it applies to 64 consecutive packets. Run lengths of 65 or
   more must be encoded in multiple bytes.

   The first byte in the first Ack Vector option refers to the packet
   indicated in the Acknowledgement Number; subsequent bytes refer to
   older packets. (Ack Vector MUST NOT be sent on DCCP-Data and
   DCCP-Request packets, which lack an Acknowledgement Number.) An Ack
   Vector containing the decimal values 0,192,3,64,5, on a packet whose
   Acknowledgement Number is decimal 100, indicates that:

      Packet 100 was received (Acknowledgement Number 100, State 0,
      Run Length 0).

      Packet 99 was lost (State 3, Run Length 0).

      Packets 98, 97, 96 and 95 were received (State 0, Run Length 3).

      Packet 94 was ECN marked (State 1, Run Length 0).

      Packets 93, 92, 91, 90, 89, and 88 were received (State 0, Run
      Length 5).

   A single Ack Vector option can acknowledge up to 16192 data packets.
4049 Should more packets need to be acknowledged than can fit in 253 4050 bytes of Ack Vector, then multiple Ack Vector options can be sent; 4051 the second Ack Vector begins where the first left off, and so forth. 4053 Ack Vector states are subject to two general constraints. (These 4054 principles SHOULD also be followed for other acknowledgement 4055 mechanisms; referring to Ack Vector states simplifies their 4056 explanation.) 4058 1. Packets reported as State 0 or State 1 MUST be acknowledgeable: 4059 their options have been processed by the receiving DCCP stack. 4060 Any data on the packet need not have been delivered to the 4061 receiving application; in fact, the data may have been dropped. 4063 2. Packets reported as State 3 MUST NOT be acknowledgeable. 4064 Feature negotiations and options on such packets MUST NOT have 4065 been processed, and the Acknowledgement Number MUST NOT 4066 correspond to such a packet. 4068 Packets dropped in the application's receive buffer MUST be reported 4069 as Received or Received ECN Marked (States 0 and 1), depending on 4070 their ECN state; such packets' ECN Nonces MUST be included in the 4071 Nonce Echo. The Data Dropped option informs the sender that some 4072 packets reported as received actually had their application data 4073 dropped. 4075 One or more Ack Vector options that, together, report the status of 4076 a packet with sequence number less than ISN, the initial sequence 4077 number, SHOULD be considered invalid. The receiving DCCP SHOULD 4078 either ignore the options or reset the connection with Reset Code 5, 4079 "Option Error". No Ack Vector option can refer to a packet that has 4080 not yet been sent, as the Acknowledgement Number checks in Section 4081 7.5.3 ensure, but because of attack, implementation bug, or 4082 misbehavior, an Ack Vector option can claim that a packet was 4083 received before it is actually delivered; Section 12.2 describes how 4084 this is detected and how senders should react. 
Packets that haven't 4085 been included in any Ack Vector option SHOULD be treated as "not yet 4086 received" (State 3) by the sender. 4088 Appendix A provides a non-normative description of the details of 4089 DCCP acknowledgement handling, in the context of an abstract Ack 4090 Vector implementation. 4092 11.4.1. Ack Vector Consistency 4094 A DCCP sender will commonly receive multiple acknowledgements for 4095 some of its data packets. For instance, an HC-Sender might receive 4096 two DCCP-Acks with Ack Vectors, both of which contained information 4097 about sequence number 24. (Information about a sequence number is 4098 generally repeated in every ack until the HC-Sender acknowledges an 4099 ack. In this case, perhaps the HC-Receiver is sending acks faster 4100 than the HC-Sender is acknowledging them.) In a perfect world, the 4101 two Ack Vectors would always be consistent. However, there are many 4102 reasons why they might not be. For example: 4104 o The HC-Receiver received packet 24 between sending its acks, so 4105 the first ack said 24 was not received (State 3) and the second 4106 said it was received or ECN marked (State 0 or 1). 4108 o The HC-Receiver received packet 24 between sending its acks, and 4109 the network reordered the acks. In this case, the packet will 4110 appear to transition from State 0 or 1 to State 3. 4112 o The network duplicated packet 24, and one of the duplicates was 4113 ECN marked. This might show up as a transition between States 0 4114 and 1. 
4116   To cope with these situations, HC-Sender DCCP implementations SHOULD
4117   combine multiple received Ack Vector states according to this table:

4119                       Received State
4120                         0   1   3
4121                       +---+---+---+
4122                     0 | 0 |0/1| 0 |
4123           Old         +---+---+---+
4124                     1 | 1 | 1 | 1 |
4125          State        +---+---+---+
4126                     3 | 0 | 1 | 3 |
4127                       +---+---+---+

4129   To read the table, choose the row corresponding to the packet's old
4130   state and the column corresponding to the packet's state in the
4131   newly received Ack Vector, then read the packet's new state off the
4132   table. For an old state of 0 (received non-marked) and received
4133   state of 1 (received ECN marked), the packet's new state may be set
4134   to either 0 or 1. The HC-Sender implementation will be indifferent
4135   to ack reordering if it chooses new state 1 for that cell.

4137   The HC-Receiver should collect information about received packets,
4138   which it will eventually report to the HC-Sender on one or more
4139   acknowledgements, according to the following table:

4141                       Received Packet
4142                         0   1   3
4143                       +---+---+---+
4144                     0 | 0 |0/1| 0 |
4145          Stored       +---+---+---+
4146                     1 |0/1| 1 | 1 |
4147          State        +---+---+---+
4148                     3 | 0 | 1 | 3 |
4149                       +---+---+---+

4151   This table equals the sender's table, except that when the stored
4152   state is 1 and the received state is 0, the receiver is allowed to
4153   switch its stored state to 0.

4155   A HC-Sender MAY choose to throw away old information gleaned from
4156   the HC-Receiver's Ack Vectors, in which case it MUST ignore newly
4157   received acknowledgements from the HC-Receiver for those old
4158   packets. It is often kinder to save recent Ack Vector information
4159   for a while, so that the HC-Sender can undo its reaction to presumed
4160   congestion when a "lost" packet unexpectedly shows up (the
4161   transition from State 3 to State 0).

4163 11.4.2.
Ack Vector Coverage 4165 We can divide the packets that have been sent from an HC-Sender to 4166 an HC-Receiver into four roughly contiguous groups. From oldest to 4167 youngest, these are: 4169 1. Packets already acknowledged by the HC-Receiver, where the HC- 4170 Receiver knows that the HC-Sender has definitely received the 4171 acknowledgements. 4173 2. Packets already acknowledged by the HC-Receiver, where the HC- 4174 Receiver cannot be sure that the HC-Sender has received the 4175 acknowledgements. 4177 3. Packets not yet acknowledged by the HC-Receiver. 4179 4. Packets not yet received by the HC-Receiver. 4181 The union of groups 2 and 3 is called the Acknowledgement Window. 4182 Generally, every Ack Vector generated by the HC-Receiver will cover 4183 the whole Acknowledgement Window: Ack Vector acknowledgements are 4184 cumulative. (This simplifies Ack Vector maintenance at the HC- 4185 Receiver; see Appendix A, below.) As packets are received, this 4186 window both grows on the right and shrinks on the left. It grows 4187 because there are more packets, and shrinks because the data 4188 packets' Acknowledgement Numbers will acknowledge previous 4189 acknowledgements, moving packets from group 2 into group 1. 4191 11.5. Send Ack Vector Feature 4193 The Send Ack Vector feature lets DCCPs negotiate whether they should 4194 use Ack Vector options to report congestion. Ack Vector provides 4195 detailed loss information, and lets senders report back to their 4196 applications whether particular packets were dropped. Send Ack 4197 Vector is mandatory for some CCIDs, and optional for others. 4199 Send Ack Vector has feature number 6, and is server-priority. It 4200 takes one-byte Boolean values. DCCP A MUST send Ack Vector options 4201 on its acknowledgements when Send Ack Vector/A has value one, 4202 although it MAY send Ack Vector options even when Send Ack Vector/A 4203 is zero. Values of two or more are reserved. 
New connections start 4204 with Send Ack Vector 0 for both endpoints. DCCP B sends a 4205 "Change R(Send Ack Vector, 1)" option to DCCP A to ask A to send Ack 4206 Vector options as part of its acknowledgement traffic. 4208 11.6. Slow Receiver Option 4210 An HC-Receiver sends the Slow Receiver option to its sender to 4211 indicate that it is having trouble keeping up with the sender's 4212 data. The HC-Sender SHOULD NOT increase its sending rate for 4213 approximately one round-trip time after seeing a packet with a Slow 4214 Receiver option. After one round-trip time, the effect of Slow 4215 Receiver disappears and the HC-Sender may again increase its rate, 4216 so the HC-Receiver SHOULD continue to send Slow Receiver options if 4217 it needs to prevent the HC-Sender from going faster in the long 4218 term. The Slow Receiver option does not indicate congestion, and 4219 the HC-Sender need not reduce its sending rate. (If necessary, the 4220 receiver can force the sender to slow down by dropping packets, with 4221 or without Data Dropped, or reporting false ECN marks.) APIs should 4222 let receiver applications set Slow Receiver, and sending 4223 applications determine whether or not their receivers are Slow. 4225 Slow Receiver is a one-byte option. 4227 +--------+ 4228 |00000010| 4229 +--------+ 4230 Type=2 4232 Slow Receiver does not specify why the receiver is having trouble 4233 keeping up with the sender. Possible reasons include lack of buffer 4234 space, CPU overload, and application quotas. A sending application 4235 might react to Slow Receiver by reducing its sending rate, for 4236 example. 4238 The sending application should not react to Slow Receiver by sending 4239 more data, however. 
The optimal response to a CPU-bound receiver 4240 might be to increase the sending rate, by switching to a less- 4241 compressed sending format, since a highly-compressed data format 4242 might overwhelm a slow CPU more seriously than the higher memory 4243 requirements of a less-compressed data format. This kind of format 4244 change should be requested at the application level, not via the 4245 Slow Receiver option. 4247 Slow Receiver implements a portion of TCP's receive window 4248 functionality. 4250 11.7. Data Dropped Option 4252 The Data Dropped option indicates that the application data on one 4253 or more received packets did not actually reach the application. 4254 Data Dropped additionally reports why the data was dropped: perhaps 4255 the data was corrupt, or perhaps the receiver cannot keep up with 4256 the sender's current rate and the data was dropped in some receive 4257 buffer. Using Data Dropped, DCCP endpoints can discriminate between 4258 different kinds of loss; this differs from TCP, in which all loss is 4259 reported the same way. 4261 Unless explicitly specified otherwise, DCCP congestion control 4262 mechanisms MUST react as if each Data Dropped packet was marked as 4263 ECN Congestion Experienced by the network. We intend for Data 4264 Dropped to enable research into richer congestion responses to 4265 corrupt and other endpoint-dropped packets, but DCCP CCIDs MUST 4266 react conservatively to Data Dropped until this behavior is 4267 standardized. Section 11.7.2, below, describes congestion responses 4268 for all current Drop Codes. 4270 If a received packet's application data is dropped for one of the 4271 reasons listed below, this SHOULD be reported using a Data Dropped 4272 option. Alternatively, the receiver MAY choose to report as 4273 "received" only those packets whose data were not dropped, subject 4274 to the constraint that packets not reported as received MUST NOT 4275 have had their options processed. 
4277   The option's data looks like this:

4279   +--------+--------+--------+--------+--------+--------
4280   |00101000| Length | Block  | Block  | Block  |  ...
4281   +--------+--------+--------+--------+--------+--------
4282    Type=40          \___________ Vector ___________ ...

4284   The Vector consists of a series of bytes, called Blocks, each of
4285   whose encoding corresponds to one of two choices:

4287    0 1 2 3 4 5 6 7           0 1 2 3 4 5 6 7
4288   +-+-+-+-+-+-+-+-+         +-+-+-+-+-+-+-+-+
4289   |0| Run Length  |   or    |1|DrpCd|Run Len|
4290   +-+-+-+-+-+-+-+-+         +-+-+-+-+-+-+-+-+
4291     Normal Block               Drop Block

4293   The first byte in the first Data Dropped option refers to the packet
4294   indicated in the Acknowledgement Number; subsequent bytes refer to
4295   older packets. (Data Dropped MUST NOT be sent on DCCP-Data or DCCP-
4296   Request packets, which lack an Acknowledgement Number, and any Data
4297   Dropped options received on these packet types MUST be ignored.)
4298   Normal Blocks, which have high bit 0, indicate that any received
4299   packets in the Run Length had their data delivered to the
4300   application. Drop Blocks, which have high bit 1, indicate that
4301   received packets in the Run Len[gth] were not delivered as usual.
4302   The 3-bit Drop Code [DrpCd] field says what happened; generally, no
4303   data from that packet reached the application. Packets reported as
4304   "not yet received" MUST be included in Normal Blocks; packets not
4305   covered by any Data Dropped option are treated as if they were in a
4306   Normal Block. Defined Drop Codes for Drop Blocks are as follows.

4308               Drop Code  Meaning
4309               ---------  -------
4310                   0      Protocol Constraints
4311                   1      Application Not Listening
4312                   2      Receive Buffer
4313                   3      Corrupt
4314                  4-6     Reserved
4315                   7      Delivered Corrupt

4317               Table 7: DCCP Drop Codes

4319   In more detail:

4321      0   The packet data was dropped due to protocol constraints.
4322          For example, the data was included on a DCCP-Request packet,
4323          but the receiving application does not allow such
4324          piggybacking; or the data was included on a packet with
4325          inappropriately low Checksum Coverage.

4327      1   The packet data was dropped because the application is no
4328          longer listening. See Section 11.7.2.

4330      2   The packet data was dropped in a receive buffer, probably
4331          because of receive buffer overflow. See Section 11.7.2.

4333      3   The packet data was dropped due to corruption. See Section
4334          9.3.

4336      7   The packet data was corrupted, but delivered to the
4337          application anyway. See Section 9.3.

4339   For example, assume a packet arrives with Acknowledgement Number
4340   100, an Ack Vector reporting all packets as received, and a Data
4341   Dropped option containing the decimal values 0,160,3,162. Then:

4343      Packet 100 was received (Acknowledgement Number 100, Normal
4344      Block, Run Length 0).

4346      Packet 99 was dropped in a receive buffer (Drop Block, Drop Code
4347      2, Run Length 0).

4349      Packets 98, 97, 96, and 95 were received (Normal Block, Run
4350      Length 3).

4352      Packets 94, 93, and 92 were dropped in the receive buffer (Drop
4353      Block, Drop Code 2, Run Length 2).

4355   Run lengths of more than 128 (for Normal Blocks) or 16 (for Drop
4356   Blocks) must be encoded in multiple Blocks. A single Data Dropped
4357   option can acknowledge up to 32384 Normal Block data packets,
4358   although the receiver SHOULD NOT send a Data Dropped option when all
4359   relevant packets fit into Normal Blocks. Should more packets need
4360   to be acknowledged than can fit in 253 bytes of Data Dropped, then
4361   multiple Data Dropped options can be sent. The second option will
4362   begin where the first left off, and so forth.
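The Block encoding can be sketched as a small decoder. This is a non-normative illustration: the name decode_data_dropped and the use of None to mean "Normal Block" are assumptions made for the example, and sequence-number wraparound is ignored.

```python
from typing import Iterator, Optional, Tuple

# Drop Code names taken from Table 7 (codes 4-6 are reserved).
DROP_CODES = {
    0: "Protocol Constraints",
    1: "Application Not Listening",
    2: "Receive Buffer",
    3: "Corrupt",
    7: "Delivered Corrupt",
}


def decode_data_dropped(
    ack_number: int, blocks: bytes
) -> Iterator[Tuple[int, Optional[int]]]:
    """Yield (sequence_number, drop_code) pairs, newest packet first.

    Hypothetical helper: drop_code is None for packets covered by a
    Normal Block and the 3-bit Drop Code for packets in a Drop Block.
    """
    seq = ack_number
    for block in blocks:
        if block & 0x80:
            # Drop Block: 1 | 3-bit Drop Code | 4-bit Run Length
            drop_code: Optional[int] = (block >> 4) & 0x07
            run_length = block & 0x0F
        else:
            # Normal Block: 0 | 7-bit Run Length
            drop_code = None
            run_length = block & 0x7F
        # A run length of N covers N + 1 packets.
        for _ in range(run_length + 1):
            yield seq, drop_code
            seq -= 1


if __name__ == "__main__":
    for seq, code in decode_data_dropped(100, bytes([0, 160, 3, 162])):
        print(seq, "delivered" if code is None else DROP_CODES[code])
```

Decoding the bytes 0,160,3,162 against Acknowledgement Number 100 covers nine packets: 100 and 98 through 95 fall in Normal Blocks, while 99 and 94 through 92 carry Drop Code 2.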
4364 One or more Data Dropped options that, together, report the status 4365 of more packets than have been sent, or that change the status of a 4366 packet, or that disagree with Ack Vector or equivalent options (by 4367 reporting a "not yet received" packet as "dropped in the receive 4368 buffer", for example), SHOULD be considered invalid. The receiving 4369 DCCP SHOULD either ignore such options, or respond by resetting the 4370 connection with Reset Code 5, "Option Error". 4372 A DCCP application interface should let receiving applications 4373 specify the Drop Codes corresponding to received packets. For 4374 example, this would let applications calculate their own checksums, 4375 but still report "dropped due to corruption" packets via the Data 4376 Dropped option. The interface SHOULD NOT let applications reduce 4377 the "seriousness" of a packet's Drop Code; for example, the 4378 application should not be able to upgrade a packet from delivered 4379 corrupt (Drop Code 7) to delivered normally (no Drop Code). 4381 Data Dropped information is transmitted reliably. That is, 4382 endpoints SHOULD continue to transmit Data Dropped options until 4383 receiving an acknowledgement indicating that the relevant options 4384 have been processed. In Ack Vector terms, each acknowledgement 4385 should contain Data Dropped options that cover the whole 4386 Acknowledgement Window (Section 11.4.2), although when every packet 4387 in that window would be placed in a Normal Block no actual option is 4388 required. 4390 11.7.1. Data Dropped and Normal Congestion Response 4392 When deciding on a response to a particular acknowledgement or set 4393 of acknowledgements containing Data Dropped options, a congestion 4394 control mechanism MUST consider dropped packets and ECN Congestion 4395 Experienced marks (including marked packets that are included in 4396 Data Dropped), as well as the packets singled out in Data Dropped. 
4397 For window-based mechanisms, the valid response space is defined as 4398 follows. 4400 Assume an old window of W. Independently calculate a new window 4401 W_new1 that assumes no packets were Data Dropped (so W_new1 contains 4402 only the normal congestion response), and a new window W_new2 that 4403 assumes no packets were lost or marked (so W_new2 contains only the 4404 Data Dropped response). We are assuming that Data Dropped 4405 recommended a reduction in congestion window, so W_new2 < W. 4407 Then the actual new window W_new MUST NOT be larger than the minimum 4408 of W_new1 and W_new2; and the sender MAY combine the two responses, 4409 by setting 4410 W_new = W + min(W_new1 - W, 0) + min(W_new2 - W, 0). 4412 The details of how this is accomplished are specified in CCID 4413 profile documents. Non-window-based congestion control mechanisms 4414 MUST behave analogously; again, CCID profiles define how. 4416 11.7.2. Particular Drop Codes 4418 Drop Code 0, Protocol Constraints, does not indicate any kind of 4419 congestion, so the sender's CCID SHOULD react to packets with Drop 4420 Code 0 as if they were received (with or without ECN Congestion 4421 Experienced marks, as appropriate). However, the sending endpoint 4422 SHOULD NOT send data until it believes the protocol constraint no 4423 longer applies. 4425 Drop Code 1, Application Not Listening, means the application 4426 running at the endpoint that sent the option is no longer listening 4427 for data. For example, a server might close its receiving half- 4428 connection to new data after receiving a complete request from the 4429 client. This would limit the amount of state available at the 4430 server for incoming data, and thus reduce the potential damage from 4431 certain denial-of-service attacks. A Data Dropped option containing 4432 Drop Code 1 SHOULD be sent whenever received data is ignored due to 4433 a non-listening application. 
Once an endpoint reports Drop Code 1 4434 for a packet, it SHOULD report Drop Code 1 for every succeeding data 4435 packet on that half-connection; once an endpoint receives a Drop 4436 Code 1 report, it SHOULD expect that no more data will ever be 4437 delivered to the other endpoint's application, so it SHOULD NOT send 4438 more data. 4440 Drop Code 2, Receive Buffer, indicates congestion inside the 4441 receiving host. For instance, if a drop-from-tail kernel socket 4442 buffer is too full to accept a packet's application data, that 4443 packet should be reported as Drop Code 2. For a drop-from-head or 4444 more complex socket buffer, the dropped packet should be reported as 4445 Drop Code 2. DCCP implementations may also provide an API by which 4446 applications can mark received packets as Drop Code 2, indicating 4447 that the application ran out of space in its user-level receive 4448 buffer. (However, it is not generally useful to report packets as 4449 dropped due to Drop Code 2 after more than a couple of round-trip times 4450 have passed. The HC-Sender may have forgotten its acknowledgement 4451 state for the packet by that time, so the Data Dropped report will 4452 have no effect.) Every packet newly acknowledged as Drop Code 2 4453 SHOULD reduce the sender's instantaneous rate by one packet per 4454 round-trip time, unless the sender is already sending one packet per 4455 RTT or less. Each CCID profile defines the CCID-specific mechanism 4456 by which this is accomplished. 4458 Currently, the other Drop Codes (Drop Code 3, Corrupt; Drop 4459 Code 7, Delivered Corrupt; and reserved Drop Codes 4-6) MUST cause 4460 the relevant CCID to behave as if the relevant packets were ECN 4461 marked (ECN Congestion Experienced). 4463 12. Explicit Congestion Notification 4465 The DCCP protocol is fully ECN-aware [RFC 3168]. Each CCID 4466 specifies how its endpoints respond to ECN marks.
Furthermore, 4467 DCCP, unlike TCP, allows senders to control the rate at which 4468 acknowledgements are generated (with options like Ack Ratio); since 4469 acknowledgements are congestion-controlled, they also qualify as 4470 ECN-Capable Transport. 4472 A CCID profile describes how that CCID interacts with ECN, both for 4473 data traffic and pure-acknowledgement traffic. A sender SHOULD set 4474 ECN-Capable Transport on its packets' IP headers, unless the 4475 receiver's ECN Incapable feature is on or the relevant CCID 4476 disallows it. 4478 The rest of this section describes the ECN Incapable feature and the 4479 interaction of the ECN Nonce with acknowledgement options such as 4480 Ack Vector. 4482 12.1. ECN Incapable Feature 4484 DCCP endpoints are ECN-aware by default, but the ECN Incapable 4485 feature lets an endpoint reject the use of Explicit Congestion 4486 Notification. The use of this feature is NOT RECOMMENDED. ECN 4487 incapability both avoids ECN's possible benefits and prevents 4488 senders from using the ECN Nonce to check for receiver misbehavior. 4489 A DCCP stack MAY therefore leave the ECN Incapable feature 4490 unimplemented, acting as if all connections were ECN capable. It is 4491 worth noting that the inappropriate firewall interactions that 4492 dogged TCP's implementation of ECN [RFC 3360] involve TCP header 4493 bits, not the IP header's ECN bits; we know of no middlebox that 4494 would block ECN-capable DCCP packets, but allow ECN-incapable DCCP 4495 packets. 4497 ECN Incapable has feature number 4, and is server-priority. It 4498 takes one-byte Boolean values. DCCP A MUST be able to read ECN bits 4499 from received frames' IP headers when ECN Incapable/A is zero. 4500 (This is independent of whether it can set ECN bits on sent frames.) 4501 DCCP A thus sends a "Change L(ECN Incapable, 1)" option to DCCP B to 4502 inform it that A cannot read ECN bits.
If the ECN Incapable/A 4503 feature is one, then all of DCCP B's packets MUST be sent as ECN 4504 incapable. New connections start with ECN Incapable 0 (that is, ECN 4505 capable) for both endpoints. Values of two or more are reserved. 4507 If a DCCP is not ECN capable, it MUST send Mandatory "Change L(ECN 4508 Incapable, 1)" options to the other endpoint until acknowledged (by 4509 "Confirm R(ECN Incapable, 1)") or the connection closes. 4510 Furthermore, it MUST NOT accept any data until the other endpoint 4511 sends "Confirm R(ECN Incapable, 1)". It SHOULD send Data Dropped 4512 options on its acknowledgements, with Drop Code 0 ("protocol 4513 constraints"), if the other endpoint does send data inappropriately. 4515 12.2. ECN Nonces 4517 Congestion avoidance will not occur, and the receiver will sometimes 4518 get its data faster, if the sender isn't told about congestion 4519 events. Thus, the receiver has some incentive to falsify 4520 acknowledgement information, reporting that marked or dropped 4521 packets were actually received unmarked. This problem is more 4522 serious with DCCP than with TCP, since TCP provides reliable 4523 transport: it is more difficult with TCP to lie about lost packets 4524 without breaking the application. 4526 ECN Nonces are a general mechanism to prevent ECN cheating (or loss 4527 cheating). Two values for the two-bit ECN header field indicate 4528 ECN-Capable Transport, 01 and 10. The second code point, 10, is the 4529 ECN Nonce. In general, a protocol sender chooses between these code 4530 points randomly on its output packets, remembering the sequence it 4531 chose. The protocol receiver reports, on every acknowledgement, the 4532 number of ECN Nonces it has received thus far. This is called the 4533 ECN Nonce Echo. Since ECN marking and packet dropping both destroy 4534 the ECN Nonce, a receiver that lies about an ECN mark or packet drop 4535 has a 50% chance of guessing right and avoiding discipline. 
The 4536 sender may react punitively to an ECN Nonce mismatch, possibly up to 4537 dropping the connection. The ECN Nonce Echo field need not be an 4538 integer; one bit is enough to catch 50% of infractions, and the 4539 probability of success drops exponentially as more packets are sent 4540 [RFC 3540]. 4542 In DCCP, the ECN Nonce Echo field is encoded in acknowledgement 4543 options. For example, the Ack Vector option comes in two forms, Ack 4544 Vector [Nonce 0] (option 38) and Ack Vector [Nonce 1] (option 39), 4545 corresponding to the two values for a one-bit ECN Nonce Echo. The 4546 Nonce Echo for a given Ack Vector equals the one-bit sum (exclusive- 4547 or, or parity) of ECN nonces for packets reported by that Ack Vector 4548 as received and not ECN marked. Thus, only packets marked as State 4549 0 matter for this calculation (that is, valid received packets that 4550 were not ECN marked). Every Ack Vector option is detailed enough 4551 for the sender to determine what the Nonce Echo should have been. 4552 It can check this calculation against the actual Nonce Echo, and 4553 complain if there is a mismatch. (The Ack Vector could conceivably 4554 report every packet's ECN Nonce state, but this would severely limit 4555 its compressibility without providing much extra protection.) 4557 Each DCCP sender SHOULD set ECN Nonces on its packets, and remember 4558 which packets had nonces. When a sender detects an ECN Nonce Echo 4559 mismatch, it behaves as described in the next section. Each DCCP 4560 receiver MUST calculate and use the correct value for ECN Nonce Echo 4561 when sending acknowledgement options. 4563 ECN incapability, as indicated by the ECN Incapable feature, is 4564 handled as follows: An endpoint sending packets to an ECN-incapable 4565 receiver MUST send its packets as ECN incapable, and an ECN- 4566 incapable receiver MUST use the value zero for all ECN Nonce Echoes. 4568 12.3. 
Aggression Penalties 4570 DCCP endpoints have several mechanisms for detecting congestion- 4571 related misbehavior. For example: 4573 o A sender can detect an ECN Nonce Echo mismatch, indicating 4574 possible receiver misbehavior. 4576 o A receiver can detect whether the sender is responding to 4577 congestion feedback or Slow Receiver. 4579 o An endpoint may be able to detect that its peer is reporting 4580 inappropriately small Elapsed Time values (Section 13.2). 4582 An endpoint that detects possible congestion-related misbehavior 4583 SHOULD try to verify that its peer is truly misbehaving. For 4584 example, a sending endpoint might send a packet whose ECN header 4585 field is set to Congestion Experienced, 11; a receiver that doesn't 4586 report a corresponding mark is most likely misbehaving. 4588 Upon detecting possible misbehavior, a sender SHOULD respond as if 4589 the receiver had reported one or more recent packets as ECN-marked 4590 (instead of unmarked), while a receiver SHOULD report one or more 4591 recent non-marked packets as ECN-marked. Alternatively, a sender 4592 might act as if the receiver had sent a Slow Receiver option, and a 4593 receiver might send Slow Receiver options. Other reactions that 4594 serve to slow the transfer rate are also acceptable. An entity that 4595 detects particularly egregious and ongoing misbehavior MAY also 4596 reset the connection with Reset Code 11, "Aggression Penalty". 4598 However, ECN Nonce mismatches and other warning signs can result 4599 from innocent causes, such as implementation bugs or attack. In 4600 particular, a successful DCCP-Data attack (Section 7.5.5) can cause 4601 the receiver to report an incorrect ECN Nonce Echo. Therefore, 4602 connection reset and other heavyweight mechanisms SHOULD be used 4603 only as last resorts, after multiple round-trip times of verified 4604 aggression. 4606 13.
Timing Options 4608 The Timestamp, Timestamp Echo, and Elapsed Time options help DCCP 4609 endpoints explicitly measure round-trip times. 4611 13.1. Timestamp Option 4613 This option is permitted in any DCCP packet. The length of the 4614 option is 6 bytes. 4616 +--------+--------+--------+--------+--------+--------+ 4617 |00101001|00000110| Timestamp Value | 4618 +--------+--------+--------+--------+--------+--------+ 4619 Type=41 Length=6 4621 The four bytes of option data carry the timestamp of this packet. 4622 The timestamp is a 32-bit integer that increases monotonically with 4623 time, at a rate of 1 unit per 10 microseconds. At this rate, 4624 Timestamp Value will wrap approximately every 11.9 hours. Endpoints 4625 need not measure time at this fine granularity; for example, an 4626 endpoint that preferred to measure time at millisecond granularity 4627 might send Timestamp Values that were all multiples of 100. The 4628 precise time corresponding to Timestamp Value zero is not specified: 4629 Timestamp Values are only meaningful relative to other Timestamp 4630 Values sent on the same connection. A DCCP receiving a Timestamp 4631 option SHOULD respond with a Timestamp Echo option on the next 4632 packet it sends. 4634 13.2. Elapsed Time Option 4636 This option is permitted in any DCCP packet that contains an 4637 Acknowledgement Number (such options received on other packet types 4638 MUST be ignored). It indicates how much time has elapsed, in 4639 hundredths of milliseconds (or, equivalently, multiples of 4640 10 microseconds), since the packet being acknowledged -- the packet 4641 with the given Acknowledgement Number -- was received. The option 4642 may take 4 or 6 bytes, depending on the size of the Elapsed Time 4643 value. Elapsed Time helps correct round-trip time estimates when 4644 the gap between receiving a packet and acknowledging that packet may 4645 be long -- in CCID 3, for example, where acknowledgements are sent 4646 infrequently. 
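The unit arithmetic behind Timestamp Value and Elapsed Time can be checked with a short, non-normative Python sketch. The helper names and the round-trip computation (local clock minus an echoed timestamp, minus the peer's reported Elapsed Time) are assumptions made for illustration, not procedures defined in this document.

```python
TICKS_PER_SECOND = 100_000   # one unit per 10 microseconds
TIMESTAMP_MODULUS = 1 << 32  # Timestamp Value is a 32-bit integer


def ms_to_ticks(milliseconds: int) -> int:
    """Convert a millisecond clock reading to 10-microsecond units."""
    # A millisecond-granularity clock only produces multiples of 100.
    return (milliseconds * 100) % TIMESTAMP_MODULUS


def wrap_period_hours() -> float:
    """Time for a 32-bit counter at 10 microseconds per unit to wrap."""
    return TIMESTAMP_MODULUS / TICKS_PER_SECOND / 3600


def rtt_ticks(now_ticks: int, echoed_ticks: int, elapsed_ticks: int = 0) -> int:
    """Hypothetical round-trip estimate in 10-microsecond units.

    The subtraction is taken modulo 2**32 so that it stays correct when
    the local timestamp counter has wrapped since the echoed value was
    sent; the peer's reported Elapsed Time is then removed.
    """
    in_flight = (now_ticks - echoed_ticks) % TIMESTAMP_MODULUS
    return max(in_flight - elapsed_ticks, 0)


if __name__ == "__main__":
    print(ms_to_ticks(7))
    print(round(wrap_period_hours(), 1))
    print(rtt_ticks(5, TIMESTAMP_MODULUS - 5, elapsed_ticks=3))
```

Taking the difference modulo 2**32 is one simple way to tolerate counter wrap; it relies on the true interval being much shorter than the wrap period.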
4648   +--------+--------+--------+--------+
4649   |00101011|00000100|   Elapsed Time  |
4650   +--------+--------+--------+--------+
4651    Type=43    Len=4

4653   +--------+--------+--------+--------+--------+--------+
4654   |00101011|00000110|            Elapsed Time           |
4655   +--------+--------+--------+--------+--------+--------+
4656    Type=43    Len=6

4658   The option data, Elapsed Time, represents an estimated upper bound
4659   on the amount of time elapsed since the packet being acknowledged
4660   was received, with units of hundredths of milliseconds. If Elapsed
4661   Time is less than a half-second, the first, smaller form of the
4662   option SHOULD be used. Elapsed Times of more than 0.65535 seconds
4663   MUST be sent using the second form of the option. The special
4664   Elapsed Time value 4294967295, which corresponds to approximately
4665   11.9 hours, is used to represent any Elapsed Time greater than
4666   42949.67294 seconds. DCCP endpoints MUST NOT report Elapsed Times
4667   that are significantly larger than the true elapsed times. A
4668   connection MAY be reset with Reset Code 11, "Aggression Penalty", if
4669   one endpoint determines that the other is reporting a much-too-large
4670   Elapsed Time.

4672   Elapsed Time is measured in hundredths of milliseconds as a
4673   compromise between two conflicting goals. First, it provides enough
4674   granularity to reduce rounding error when measuring elapsed time
4675   over fast LANs; second, it allows many reasonable elapsed times to
4676   fit into two bytes of data.

4678 13.3. Timestamp Echo Option

4680   This option is permitted in any DCCP packet, as long as at least one
4681   packet carrying the Timestamp option has been received. Generally,
4682   a DCCP endpoint should send one Timestamp Echo option for each
4683   Timestamp option it receives; and it should send that option as soon
4684   as is convenient. The length of the option is between 6 and 10
4685   bytes, depending on whether Elapsed Time is included and how large
4686   it is.
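How the option length follows from the Elapsed Time value can be sketched as a small encoder for the three layouts in this section. This is a non-normative illustration: the function name is hypothetical, network byte order is assumed for the multi-byte fields, and saturating at 4294967295 is borrowed from the Elapsed Time option's rule rather than stated for Timestamp Echo.

```python
import struct

TIMESTAMP_ECHO_TYPE = 42
MAX_U32 = 0xFFFFFFFF


def encode_timestamp_echo(echo: int, elapsed: int = 0) -> bytes:
    """Build a Timestamp Echo option using the smallest form that fits.

    Hypothetical helper. `echo` is the Timestamp Value being echoed and
    `elapsed` is the Elapsed Time in units of 10 microseconds.
    """
    if not 0 <= echo <= MAX_U32:
        raise ValueError("Timestamp Echo must fit in 32 bits")
    if elapsed < 0:
        raise ValueError("Elapsed Time cannot be negative")
    elapsed = min(elapsed, MAX_U32)  # assumption: saturate as Section 13.2 does
    if elapsed == 0:
        # An omitted Elapsed Time field is equivalent to zero: 6-byte form.
        return struct.pack("!BBI", TIMESTAMP_ECHO_TYPE, 6, echo)
    if elapsed <= 0xFFFF:
        # Two-byte Elapsed Time: 8-byte form.
        return struct.pack("!BBIH", TIMESTAMP_ECHO_TYPE, 8, echo, elapsed)
    # Four-byte Elapsed Time: 10-byte form.
    return struct.pack("!BBII", TIMESTAMP_ECHO_TYPE, 10, echo, elapsed)


if __name__ == "__main__":
    for elapsed in (0, 65535, 65536):
        option = encode_timestamp_echo(1, elapsed)
        print(elapsed, len(option), option.hex())
```

The 8-byte form covers Elapsed Times up to 65535 units (0.65535 seconds); anything larger needs the 10-byte form.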

   +--------+--------+--------+--------+--------+--------+
   |00101010|00000110|           Timestamp Echo          |
   +--------+--------+--------+--------+--------+--------+
    Type=42    Len=6

   +--------+--------+------- ... -------+--------+--------+
   |00101010|00001000|  Timestamp Echo   |   Elapsed Time  |
   +--------+--------+------- ... -------+--------+--------+
    Type=42    Len=8       (4 bytes)

   +--------+--------+------- ... -------+------- ... -------+
   |00101010|00001010|  Timestamp Echo   |    Elapsed Time   |
   +--------+--------+------- ... -------+------- ... -------+
    Type=42   Len=10       (4 bytes)           (4 bytes)

   The first four bytes of option data, Timestamp Echo, carry a
   Timestamp Value taken from a preceding received Timestamp option.
   Usually, this will be the last packet that was received -- the
   packet indicated by the Acknowledgement Number, if any -- but it
   might be a preceding packet. Each Timestamp received will generally
   result in exactly one Timestamp Echo transmitted. If an endpoint
   has received multiple Timestamp options since the last time it sent
   a packet, then it MAY ignore all Timestamp options but the one
   included on the packet with the greatest sequence number;
   alternatively, it MAY include multiple Timestamp Echo options in its
   response, each corresponding to a different Timestamp option.

   The Elapsed Time value, similar to that in the Elapsed Time option,
   indicates the amount of time elapsed since receiving the packet
   whose timestamp is being echoed. This time MUST be in hundredths of
   milliseconds. Elapsed Time is meant to help the Timestamp sender
   separate the network round-trip time from the Timestamp receiver's
   processing time. This may be particularly important for CCIDs where
   acknowledgements are sent infrequently, so that there might be
   considerable delay between receiving a Timestamp option and sending
   the corresponding Timestamp Echo.
   A missing Elapsed Time field is
   equivalent to an Elapsed Time of zero. The smallest version of the
   option SHOULD be used that can hold the relevant Elapsed Time value.

14. Maximum Packet Size

   A DCCP implementation MUST maintain the maximum packet size (MPS)
   allowed for each active DCCP session. The MPS is influenced by the
   maximum packet size allowed by the current congestion control
   mechanism (CCMPS), the maximum packet size supported by the path's
   links (PMTU, the Path Maximum Transmission Unit) [RFC 1191], and the
   lengths of the IP and DCCP headers.

   A DCCP application interface SHOULD let the application discover
   DCCP's current MPS. Generally, the DCCP implementation will refuse
   to send any packet bigger than the MPS, returning an appropriate
   error to the application. A DCCP interface MAY allow applications
   to request fragmentation for packets larger than PMTU, but not
   larger than CCMPS (packets larger than CCMPS MUST be rejected in any
   case). Fragmentation SHOULD NOT be the default, since it decreases
   robustness: an entire packet is discarded if even one of its
   fragments is lost. Applications can usually get better error
   tolerance by producing packets smaller than the PMTU.

   The MPS reported to the application SHOULD be influenced by the size
   expected to be required for DCCP headers and options. If the
   application provides data that, when combined with the options the
   DCCP implementation would like to include, would exceed the MPS, the
   implementation should either send the options on a separate packet
   (such as a DCCP-Ack) or lower the MPS, drop the data, and return an
   appropriate error to the application.

14.1. Measuring PMTU

   Each DCCP endpoint MUST keep track of the current PMTU for each
   connection, except that this is not required for IPv4 connections
   whose applications have requested fragmentation.
   The PMTU SHOULD be
   initialized from the interface MTU that will be used to send
   packets. The MPS will be initialized with the minimum of the PMTU
   and the CCMPS, if any.

   Classical PMTU discovery uses unfragmentable packets. In IPv4,
   these packets have the IP Don't Fragment (DF) bit set; in IPv6, all
   packets are unfragmentable once emitted by an end host. As
   specified in RFC 1191, when a router receives a packet with DF set
   that is larger than the next link's MTU, it sends an ICMP
   Destination Unreachable message back to the source whose Code
   indicates that an unfragmentable packet was too large to forward (a
   "Datagram Too Big" message). When a DCCP implementation receives a
   Datagram Too Big message, it decreases its PMTU to the Next-Hop MTU
   value given in the ICMP message. If the MTU given in the message is
   zero, the sender chooses a value for PMTU using the algorithm
   described in RFC 1191 (Section 7). If the MTU given in the message
   is greater than the current PMTU, the Datagram Too Big message is
   ignored, as described in RFC 1191. (We are aware that this may
   cause problems for DCCP endpoints behind certain firewalls.)

   A DCCP implementation may allow the application to occasionally
   request that PMTU discovery be performed again. This will reset the
   PMTU to the outgoing interface's MTU. Such requests SHOULD be rate
   limited, to one per two seconds, for example.

   A DCCP sender MAY treat the reception of an ICMP Datagram Too Big
   message as an indication that the packet being reported was not lost
   due to congestion, and so for the purposes of congestion control it
   MAY ignore the DCCP receiver's indication that this packet did not
   arrive.
   However, if this is done, then the DCCP sender MUST check
   the ECN bits of the IP header echoed in the ICMP message, and only
   perform this optimization if these ECN bits indicate that the packet
   did not experience congestion prior to reaching the router whose
   link MTU it exceeded.

   A DCCP implementation SHOULD ensure, as far as possible, that ICMP
   Datagram Too Big messages were actually generated by routers, so
   that attackers cannot drive the PMTU down to a falsely small value.
   The simplest way to do this is to verify that the Sequence Number on
   the ICMP error's encapsulated header corresponds to a Sequence
   Number that the implementation recently sent. (According to current
   specifications, routers should return the full DCCP header and
   payload up to a maximum of 576 bytes [RFC 1812] or the minimum IPv6
   MTU [RFC 2463], although they are not required to return more than
   64 bits [RFC 792]. Any amount greater than 128 bits will include
   the Sequence Number.) ICMP Datagram Too Big messages with incorrect
   or missing Sequence Numbers may be ignored, or the DCCP
   implementation may lower the PMTU only temporarily in response. If
   more than three odd Datagram Too Big messages are received and the
   other DCCP endpoint reports more than three lost packets, however,
   the DCCP implementation SHOULD assume the presence of a confused
   router, and either obey the ICMP messages' PMTU or (on IPv4
   networks) switch to allowing fragmentation.

   DCCP also allows upward probing of the PMTU [PMTUD], where the DCCP
   endpoint begins by sending small packets with DF set, then gradually
   increases the packet size until a packet is lost. This mechanism
   does not require any ICMP error processing. DCCP-Sync packets are
   the best choice for upward probing, since DCCP-Sync probes do not
   risk application data loss.
   The DCCP implementation inserts
   arbitrary data into the DCCP-Sync application area, padding the
   packet to the right length; and since every valid DCCP-Sync
   generates an immediate DCCP-SyncAck in response, the endpoint will
   have a pretty good idea of when a probe is lost.

14.2. Sender Behavior

   A DCCP sender SHOULD send every packet as unfragmentable, as
   described above, with the following exceptions.

   o  On IPv4 connections whose applications have requested
      fragmentation, the sender SHOULD send packets with the DF bit not
      set.

   o  On IPv6 connections whose applications have requested
      fragmentation, the sender SHOULD use fragmentation extension
      headers to fragment packets larger than PMTU into suitably-sized
      chunks. (Those chunks are, of course, unfragmentable.)

   o  It is undesirable for PMTU discovery to occur on the initial
      connection setup handshake, as the connection setup process may
      not be representative of packet sizes used during the connection,
      and performing MTU discovery on the initial handshake might
      unnecessarily delay connection establishment. Thus, DCCP-Request
      and DCCP-Response packets SHOULD be sent as fragmentable. In
      addition, DCCP-Reset packets SHOULD be sent as fragmentable,
      although typically these would be small enough to not be a
      problem. For IPv4 connections, these packets SHOULD be sent with
      the DF bit not set; for IPv6 connections, they SHOULD be
      preemptively fragmented to a size not larger than the relevant
      interface MTU.

   If the DCCP implementation has decreased the PMTU, the sending
   application has not requested fragmentation, and the sending
   application attempts to send a packet larger than the new MPS, the
   API MUST refuse to send the packet and return an appropriate error
   to the application. The application should then use the API to
   query the new value of MPS.
   The kernel might have some packets
   buffered for transmission that are smaller than the old MPS, but
   larger than the new MPS. It MAY send these packets as fragmentable,
   or it MAY discard these packets; it MUST NOT send them as
   unfragmentable.

15. Forward Compatibility

   Future versions of DCCP may add new options and features. A few
   simple guidelines will let extended DCCPs interoperate with normal
   DCCPs.

   o  DCCP processors MUST NOT act punitively towards options and
      features they do not understand. For example, DCCP processors
      MUST NOT reset the connection if some field marked Reserved in
      this specification is non-zero; if some unknown option is
      present; or if some feature negotiation option mentions an
      unknown feature. Instead, DCCP processors MUST ignore these
      events. The Mandatory option is the single exception: if
      Mandatory precedes some unknown option or feature, the connection
      MUST be reset.

   o  DCCP processors MUST anticipate the possibility of unknown
      feature values, which might occur as part of a negotiation for a
      known feature. For server-priority features, unknown values are
      handled as a matter of course: since the non-extended DCCP's
      priority list will not contain unknown values, the result of the
      negotiation cannot be an unknown value. A DCCP SHOULD respond
      with an empty Confirm option if it is assigned an unacceptable
      value for some non-negotiable feature.

   o  Each DCCP extension SHOULD be controlled by some feature. The
      default value of this feature should correspond to "extension not
      available". If an extended DCCP wants to use the extension, it
      SHOULD attempt to change the feature's value using a Change L or
      Change R option. Any non-extended DCCP will ignore the option,
      thus leaving the feature value at its default, "extension not
      available".
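
   The first guideline can be sketched as an option walker that skips
   anything it does not recognize and requests a reset only when an
   unknown option immediately follows Mandatory. This Python sketch is
   illustrative only: the function name and return convention are
   hypothetical, and it assumes, from the option table elsewhere in
   this specification, that Mandatory is option type 1 and that option
   types 0-31 are single-byte options while all others carry a length
   byte. Feature negotiation and the handling of malformed options are
   out of scope here.

```python
MANDATORY = 1                                  # assumed option type of Mandatory
KNOWN_TYPES = {0, 1, 2} | set(range(32, 45))   # types allocated by this document


def walk_options(data: bytes):
    """Return ("ok", understood) or ("reset", offending_type).

    `understood` is a list of (type, value) pairs for recognized options.
    """
    understood = []
    mandatory_pending = False
    i = 0
    while i < len(data):
        opt_type = data[i]
        if opt_type < 32:
            # Assumed: types 0-31 occupy a single byte with no value.
            length, value = 1, b""
        else:
            if i + 1 >= len(data):
                break                          # truncated option: stop parsing
            length = data[i + 1]
            if length < 2 or i + length > len(data):
                break                          # malformed length: stop parsing
            value = data[i + 2:i + length]
        i += length

        if opt_type == MANDATORY:
            mandatory_pending = True           # applies to the next option only
            continue
        if opt_type in KNOWN_TYPES:
            understood.append((opt_type, value))
        elif mandatory_pending:
            return ("reset", opt_type)         # Mandatory preceded an unknown option
        # Otherwise the option is unknown and is silently ignored.
        mandatory_pending = False
    return ("ok", understood)
```

   Note that the unknown option is skipped using its own length byte,
   so options that follow it are still processed normally.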

   Section 19 lists DCCP assigned numbers reserved for experimental and
   testing purposes.

16. Middlebox Considerations

   This section describes properties of DCCP that firewalls, network
   address translators, and other middleboxes should consider,
   including parts of the packet that middleboxes should not change.
   The intent is to draw attention to aspects of DCCP that may be
   useful, or dangerous, for middleboxes, or that differ significantly
   from TCP.

   The Service Code field in DCCP-Request packets provides information
   that may be useful for stateful middleboxes. With Service Code, a
   middlebox can tell what protocol a connection will use without
   relying on port numbers. Middleboxes can disallow connections that
   attempt to access unexpected services by sending a DCCP-Reset with
   Reset Code 8, "Bad Service Code". Middleboxes should not modify the
   Service Code unless they are really changing the service a
   connection is accessing.

   The Source and Destination Port fields are in the same packet
   locations as the corresponding fields in TCP and UDP, which may
   simplify some middlebox implementations.

   The forward compatibility considerations in Section 15 apply to
   middleboxes as well. In particular, middleboxes generally shouldn't
   act punitively towards options and features they do not understand.

   Modifying DCCP Sequence Numbers and Acknowledgement Numbers is more
   tedious and dangerous than modifying TCP sequence numbers. A
   middlebox that added packets to, or removed packets from, a DCCP
   connection would have to modify acknowledgement options, such as Ack
   Vector, and CCID-specific options, such as TFRC's Loss Intervals, at
   minimum.
   On ECN-capable connections, the middlebox would have to
   keep track of ECN Nonce information for packets it introduced or
   removed, so that the relevant acknowledgement options continued to
   have correct ECN Nonce Echoes, or risk the connection being reset
   for "Aggression Penalty". We therefore recommend that middleboxes
   not modify packet streams by adding or removing packets.

   Note that there is less need to modify DCCP's per-packet sequence
   numbers than TCP's per-byte sequence numbers; for example, a
   middlebox can change the contents of a packet without changing its
   sequence number. (In TCP, sequence number modification is required
   to support protocols like FTP that carry variable-length addresses
   in the data stream. If such an application were deployed over DCCP,
   middleboxes would simply grow or shrink the relevant packets as
   necessary, without changing their sequence numbers. This might
   involve fragmenting the packet.)

   Middleboxes may, of course, reset connections in progress. Clearly
   this requires inserting a packet into one or both packet streams,
   but the difficult issues do not arise.

   DCCP is somewhat unfriendly to "connection splicing" [SHHP00], in
   which clients' connection attempts are intercepted, but possibly
   later "spliced in" to external server connections via sequence
   number manipulations. A connection splicer at minimum would have to
   ensure that the spliced connections agreed on all relevant feature
   values, which might take some renegotiation.

   The contents of this section should not be interpreted as a
   wholesale endorsement of stateful middleboxes.

17. Relations to Other Specifications

17.1. RTP

   The Real-Time Transport Protocol, RTP [RFC 3550], is currently used
   over UDP by many of DCCP's target applications (for instance,
   streaming media).
   Therefore, it is important to examine the
   relationship between DCCP and RTP, and in particular, the question
   of whether any changes in RTP are necessary or desirable when it is
   layered over DCCP instead of UDP.

   There are two potential sources of overhead in the RTP-over-DCCP
   combination: duplicated acknowledgement information and duplicated
   sequence numbers. Together, these sources of overhead add slightly
   more than 4 bytes per packet relative to RTP-over-UDP, and
   eliminating the redundancy would not reduce the overhead.

   First, consider acknowledgements. Both RTP and DCCP report feedback
   about loss rates to data senders, via RTP Control Protocol Sender
   and Receiver Reports (RTCP SR/RR packets) and via DCCP
   acknowledgement options. These feedback mechanisms are potentially
   redundant. However, RTCP SR/RR packets contain information not
   present in DCCP acknowledgements, such as "interarrival jitter", and
   DCCP's acknowledgements contain information not transmitted by RTCP,
   such as the ECN Nonce Echo. Neither feedback mechanism makes the
   other redundant.

   Sending both types of feedback need not be particularly costly
   either. RTCP reports may be sent relatively infrequently: once
   every 5 seconds on average, for low-bandwidth flows. In DCCP, some
   feedback mechanisms are expensive -- Ack Vector, for example, is
   frequent and verbose -- but others are relatively cheap: CCID 3
   (TFRC) acknowledgements take between 16 and 32 bytes of options sent
   once per round-trip time. (Reporting less frequently than once per
   RTT would make congestion control less responsive to loss.) We
   therefore conclude that acknowledgement overhead in RTP-over-DCCP
   need not be significantly higher than for RTP-over-UDP, at least for
   CCID 3.

   One clear redundancy can be addressed at the application level.
   The
   verbose packet-by-packet loss reports sent in RTCP Extended Reports
   Loss RLE Blocks [RFC 3611] can be derived from DCCP's Ack Vector
   options. (The converse is not true, since Loss RLE Blocks contain
   no ECN information.) Since DCCP implementations should provide an
   API for application access to Ack Vector information, RTP-over-DCCP
   applications might request either DCCP Ack Vectors or RTCP Extended
   Report Loss RLE Blocks, but not both.

   Now consider sequence number redundancy on data packets. The
   embedded RTP header contains a 16-bit RTP sequence number. Most
   data packets will use the DCCP-Data type; DCCP-DataAck and DCCP-Ack
   packets need not usually be sent. The DCCP-Data header is 12 bytes
   long without options, including a 24-bit sequence number. This is 4
   bytes more than a UDP header. Any options required on data packets
   would add further overhead, although many CCIDs (for instance, CCID
   3, TFRC) don't require options on most data packets.

   The DCCP sequence number cannot be inferred from the RTP sequence
   number since it increments on non-data packets as well as data
   packets. The RTP sequence number cannot be inferred from the DCCP
   sequence number either [RFC 3550]. Furthermore, removing RTP's
   sequence number would not save any header space because of alignment
   issues. We therefore recommend that RTP transmitted over DCCP use
   the same headers currently defined. The 4 byte header cost is a
   reasonable tradeoff for DCCP's congestion control features and
   access to ECN. Truly bandwidth-starved endpoints should use some
   future header compression scheme.

17.2. Congestion Manager and Multiplexing

   Since DCCP doesn't provide reliable, ordered delivery, multiple
   application sub-flows may be multiplexed over a single DCCP
   connection with no inherent performance penalty.
   Thus, there is no
   need for DCCP to provide built-in, SCTP-style support for multiple
   sub-flows.

   Some applications might want to share congestion control state among
   multiple DCCP flows that share the same source and destination
   addresses. This functionality could be provided by the Congestion
   Manager [RFC 3124], a generic multiplexing facility. However, the
   CM would not fully support DCCP without change; it does not
   gracefully handle multiple congestion control mechanisms, for
   example.

18. Security Considerations

   DCCP does not provide cryptographic security guarantees.
   Applications desiring cryptographic security services (integrity,
   authentication, confidentiality, access control, and anti-replay
   protection) should use IPsec or end-to-end security of some kind;
   Secure RTP is one candidate protocol [RFC 3711].

   Nevertheless, DCCP is intended to protect against some classes of
   attackers: Attackers cannot hijack a DCCP connection (close the
   connection unexpectedly, or cause attacker data to be accepted by an
   endpoint as if it came from the sender) unless they can guess valid
   sequence numbers. Thus, as long as endpoints choose initial
   sequence numbers well, a DCCP attacker must snoop on data packets to
   get any reasonable probability of success. Sequence number validity
   checks provide this guarantee. Section 7.5.5 describes sequence
   number security further. This security property only holds assuming
   that DCCP's random numbers are chosen according to the guidelines in
   RFC 1750.

   DCCP also provides mechanisms to limit the potential impact of some
   denial-of-service attacks.
   These mechanisms include Init Cookie
   (Section 8.1.4), the DCCP-CloseReq packet (Section 5.5), the
   Application Not Listening Drop Code (Section 11.7.2), limitations on
   the processing of options that might cause connection reset (Section
   7.5.5), limitations on the processing of some ICMP messages (Section
   14.1), and various rate limits, which let servers avoid extensive
   computation or packet generation (Sections 7.5.3, 8.1.3, and
   others).

   DCCP provides no protection against attackers that can snoop on data
   packets.

18.1. Security Considerations for Partial Checksums

   The partial checksum facility has a separate security impact,
   particularly in its interaction with authentication and encryption
   mechanisms. The impact is the same in DCCP as in the UDP-Lite
   protocol, and what follows was adapted from the corresponding text
   in the UDP-Lite specification [RFC 3828].

   When a DCCP packet's Checksum Coverage field is not zero, the
   uncovered portion of a packet may change in transit. This is
   contrary to the idea behind most authentication mechanisms:
   authentication succeeds if the packet has not changed in transit.
   Unless authentication mechanisms that operate only on the sensitive
   part of packets are developed and used, authentication will always
   fail for partially-checksummed DCCP packets whose uncovered part has
   been damaged.

   The IPsec integrity check (Encapsulating Security Payload, ESP, or
   Authentication Header, AH) is applied (at least) to the entire IP
   packet payload. Corruption of any bit within that area will then
   result in the IP receiver discarding a DCCP packet, even if the
   corruption happened in an uncovered part of the DCCP application
   data.

   When IPsec is used with ESP payload encryption, a link cannot
   determine the specific transport protocol of a packet being
   forwarded by inspecting the IP packet payload. In this case, the
   link MUST provide a standard integrity check covering the entire IP
   packet and payload. DCCP partial checksums provide no benefit in
   this case.

   Encryption (e.g., at the transport or application levels) may be
   used. Note that omitting an integrity check can, under certain
   circumstances, compromise confidentiality [BEL98].

   If a few bits of an encrypted packet are damaged, the decryption
   transform will typically spread errors so that the packet becomes
   too damaged to be of use. Many encryption transforms today exhibit
   this behavior. There exist encryption transforms, stream ciphers,
   which do not cause error propagation. Proper use of stream ciphers
   can be quite difficult, especially when authentication checking is
   omitted [BB01]. In particular, an attacker can cause predictable
   changes to the ultimate plaintext, even without being able to
   decrypt the ciphertext.

19. IANA Considerations

   DCCP introduces eight sets of numbers whose values should be
   allocated by IANA. We refer to allocation policies, such as
   Standards Action, outlined in RFC 2434, and most registries reserve
   some values for experimental and testing use [RFC 3692]. In
   addition, DCCP requires a Protocol Number to be added to the
   registry of Assigned Internet Protocol Numbers. IANA is requested
   to assign IP Protocol Number 33 to DCCP; this number has already
   been informally made available for experimental DCCP use.

19.1. Packet Types

   Each entry in the DCCP Packet Type registry contains a packet type,
   which is a number in the range 0-15; a packet type name, such as
   DCCP-Request; and a reference to the RFC defining the packet type.
   The registry is initially populated using the values in Table 1
   (Section 5.1). This document allocates packet types 0-9, and packet
   type 14 is permanently reserved for experimental and testing use.
   Packet types 10-13 and 15 are currently reserved, and should be
   allocated with the Standards Action policy, which requires IESG
   review and approval and standards-track IETF RFC publication.

19.2. Reset Codes

   Each entry in the DCCP Reset Code registry contains a Reset Code,
   which is a number in the range 0-255; a short description of the
   Reset Code, such as "No Connection"; and a reference to the RFC
   defining the Reset Code. The registry is initially populated using
   the values in Table 2 (Section 5.6). This document allocates Reset
   Codes 0-11, and Reset Codes 120-126 are permanently reserved for
   experimental and testing use. Reset Codes 12-119 and 127 are
   currently reserved, and should be allocated with the IETF Consensus
   policy, requiring an IETF RFC publication (standards-track or not)
   with IESG review and approval. Reset Codes 128-255 are permanently
   reserved for CCID-specific registries; each CCID Profile document
   describes how the corresponding registry is managed.

19.3. Option Types

   Each entry in the DCCP option type registry contains an option type,
   which is a number in the range 0-255; the name of the option, such
   as "Slow Receiver"; and a reference to the RFC defining the option
   type. The registry is initially populated using the values in Table
   3 (Section 5.8). This document allocates option types 0-2 and
   32-44, and option types 31 and 120-126 are permanently reserved for
   experimental and testing use. Option types 3-30, 45-119, and 127
   are currently reserved, and should be allocated with the IETF
   Consensus policy, requiring an IETF RFC publication (standards-track
   or not) with IESG review and approval.
   Option types 128-255 are
   permanently reserved for CCID-specific registries; each CCID Profile
   document describes how the corresponding registry is managed.

19.4. Feature Numbers

   Each entry in the DCCP feature number registry contains a feature
   number, which is a number in the range 0-255; the name of the
   feature, such as "ECN Incapable"; and a reference to the RFC
   defining the feature number. The registry is initially populated
   using the values in Table 4 (Section 6). This document allocates
   feature numbers 0-9, and feature numbers 120-126 are permanently
   reserved for experimental and testing use. Feature numbers 10-119
   and 127 are currently reserved, and should be allocated with the
   IETF Consensus policy, requiring an IETF RFC publication
   (standards-track or not) with IESG review and approval. Feature
   numbers 128-255 are permanently reserved for CCID-specific
   registries; each CCID Profile document describes how the
   corresponding registry is managed.

19.5. Congestion Control Identifiers

   Each entry in the DCCP Congestion Control Identifier (CCID) registry
   contains a CCID, which is a number in the range 0-255; the name of
   the CCID, such as "TCP-like Congestion Control"; and a reference to
   the RFC defining the CCID. The registry is initially populated
   using the values in Table 5 (Section 10). CCIDs 2 and 3 are
   allocated by concurrently published profiles, and CCIDs 248-254 are
   permanently reserved for experimental and testing use. CCIDs 0, 1,
   4-247, and 255 are currently reserved, and should be allocated with
   the IETF Consensus policy, requiring an IETF RFC publication
   (standards-track or not) with IESG review and approval.

19.6. Ack Vector States

   Each entry in the DCCP Ack Vector State registry contains an Ack
   Vector State, which is a number in the range 0-3; the name of the
   State, such as "Received ECN Marked"; and a reference to the RFC
   defining the State. The registry is initially populated using the
   values in Table 6 (Section 11.4). This document allocates States 0,
   1, and 3. State 2 is currently reserved, and should be allocated
   with the Standards Action policy, which requires IESG review and
   approval and standards-track IETF RFC publication.

19.7. Drop Codes

   Each entry in the DCCP Drop Code registry contains a Data Dropped
   Drop Code, which is a number in the range 0-7; the name of the Drop
   Code, such as "Application Not Listening"; and a reference to the
   RFC defining the Drop Code. The registry is initially populated
   using the values in Table 7 (Section 11.7). This document allocates
   Drop Codes 0-3 and 7. Drop Codes 4-6 are currently reserved, and
   should be allocated with the Standards Action policy, which requires
   IESG review and approval and standards-track IETF RFC publication.

19.8. Service Codes

   Each entry in the Service Code registry contains a Service Code,
   which is a number in the range 0-4294967295; a short English
   description of the intended service; and an optional reference to an
   RFC or other publicly available specification defining the Service
   Code. The registry should list the Service Code's numeric value as
   a decimal number, but when each byte of the four-byte Service Code
   is in the range 32-127, the registry should also show a
   four-character ASCII interpretation of the Service Code. Thus, the
   number 1717858426 would additionally appear as "fdpz". Service
   Codes are not DCCP-specific.
   This document does not allocate any
   Service Codes, but Service Code 0 is permanently reserved (it
   represents the absence of a meaningful Service Code), and Service
   Codes 1056964608-1073741823 (high byte ASCII "?") are reserved for
   Private Use. Most of the remaining Service Codes are allocated
   First Come First Served, with no RFC publication required.
   Exceptions are listed in Section 8.1.2.

20. Thanks

   Thanks to Jitendra Padhye for his help with early versions of this
   specification.

   Thanks to Junwen Lai and Arun Venkataramani, who, as interns at
   ICIR, built a prototype DCCP implementation. In particular, Junwen
   Lai recommended that the old feature negotiation mechanism be
   scrapped and co-designed the current mechanism. Arun
   Venkataramani's feedback improved Appendix A.

   We thank the staff and interns of ICIR and, formerly, ACIRI, the
   members of the End-to-End Research Group, and the members of the
   Transport Area Working Group for their feedback on DCCP. We
   especially thank the DCCP expert reviewers: Greg Minshall, Eric
   Rescorla, and Magnus Westerlund for detailed written comments and
   problem spotting, and Rob Austein and Steve Bellovin for verbal
   comments and written notes.

   We also thank those who provided comments and suggestions via the
   DCCP BOF, Working Group, and mailing lists, including Damon
   Lanphear, Patrick McManus, Colin Perkins, Sara Karlberg, Kevin Lai,
   Bernard Aboba, Youngsoo Choi, Pengfei Di, Dan Duchamp, Gorry
   Fairhurst, Derek Fawcus, David Timothy Fleeman, John Loughney,
   Ghyslain Pelletier, Tom Phelan, Stanislav Shalunov, David Vos, Yufei
   Wang, and Michael Welzl. In particular, Colin Perkins provided
   extensive, detailed feedback, Michael Welzl suggested the Data
   Checksum option, and Gorry Fairhurst provided extensive feedback on
   various checksum issues.

A. Appendix: Ack Vector Implementation Notes

   This appendix discusses particulars of DCCP acknowledgement
   handling, in the context of an abstract implementation for Ack
   Vector. It is informative rather than normative.

   The first part of our implementation runs at the HC-Receiver, and
   therefore acknowledges data packets. It generates Ack Vector
   options. The implementation has the following characteristics:

   o  At most one byte of state per acknowledged packet.

   o  O(1) time to update that state when a new packet arrives (normal
      case).

   o  Cumulative acknowledgements.

   o  Quick removal of old state.

   The basic data structure is a circular buffer containing information
   about acknowledged packets. Each byte in this buffer contains a
   state and run length; the state can be 0 (packet received), 1
   (packet ECN marked), or 3 (packet not yet received). The buffer
   grows from right to left. The implementation maintains four
   variables, aside from the buffer contents:

   o  "buf_head" and "buf_tail", which mark the live portion of the
      buffer.

   o  "buf_ackno", the Acknowledgement Number of the most recent packet
      acknowledged in the buffer. This corresponds to the "head"
      pointer.

   o  "buf_nonce", the one-bit sum (exclusive-or, or parity) of the ECN
      Nonces received on all packets acknowledged by the buffer with
      State 0.

   We draw acknowledgement buffers like this:

   +---------------------------------------------------------------+
   |S,L|S,L|S,L|S,L|   |   |   |   |S,L|S,L|S,L|S,L|S,L|S,L|S,L|S,L|
   +---------------------------------------------------------------+
                 ^                   ^
              buf_tail     buf_head, buf_ackno = A     buf_nonce = E

             <=== buf_head and buf_tail move this way <===

   Each `S,L' represents a State/Run length byte.
   We will draw these buffers showing only their live portion, and will
   add an annotation showing the Acknowledgement Number for the last
   live byte in the buffer. For example:

         +-----------------------------------------------+
       A |S,L|S,L|S,L|S,L|S,L|S,L|S,L|S,L|S,L|S,L|S,L|S,L| T    BN[E]
         +-----------------------------------------------+

   Here, buf_nonce equals E and buf_ackno equals A.

   We will use this buffer as a running example.

         +---------------------------+
      10 |0,0|3,0|3,0|3,0|0,4|1,0|0,0| 0    BN[1]   [Example Buffer]
         +---------------------------+

   In concrete terms, its meaning is as follows:

      Packet 10 was received. (The head of the buffer has sequence
      number 10, state 0, and run length 0.)

      Packets 9, 8, and 7 have not yet been received. (The three
      bytes preceding the head each have state 3 and run length 0.)

      Packets 6, 5, 4, 3, and 2 were received.

      Packet 1 was ECN marked.

      Packet 0 was received.

      The one-bit sum of the ECN Nonces on packets 10, 6, 5, 4, 3, 2,
      and 0 equals 1.

   Additionally, the HC-Receiver must keep some information about the
   Ack Vectors it has recently sent. For each packet sent carrying an
   Ack Vector, it remembers four variables:

   o  "ack_seqno", the Sequence Number used for the packet. This is an
      HC-Receiver sequence number.

   o  "ack_ptr", the value of buf_head at the time of acknowledgement.

   o  "ack_ackno", the Acknowledgement Number used for the packet.
      This is an HC-Sender sequence number. Since acknowledgements are
      cumulative, this single number completely specifies all necessary
      information about the packets acknowledged by this Ack Vector.

   o  "ack_nonce", the one-bit sum of the ECN Nonces for all State 0
      packets in the buffer from buf_head to ack_ackno, inclusive.
      Initially, this equals the Nonce Echo of the acknowledgement's
      Ack Vector (or, if the ack packet contained more than one Ack
      Vector, the exclusive-or of the Nonce Echoes of all the
      acknowledgement's Ack Vectors). It changes as information about
      old acknowledgements is removed (so ack_ptr and buf_head
      diverge), and as old packets arrive (so they change from State 3
      or State 1 to State 0).

A.1.  Packet Arrival

   This section describes how the HC-Receiver updates its
   acknowledgement buffer as packets arrive from the HC-Sender.

A.1.1.  New Packets

   When a packet with Sequence Number greater than buf_ackno arrives,
   the HC-Receiver updates buf_head (by moving it to the left
   appropriately), buf_ackno (which is set to the new packet's Sequence
   Number), and possibly buf_nonce (if the packet arrived unmarked with
   ECN Nonce 1), in addition to the buffer itself. For example, if
   HC-Sender packet 11 arrived ECN marked, the Example Buffer above
   would enter this new state (changes are marked with stars):

      ** +***----------------------------+
      11 |1,0|0,0|3,0|3,0|3,0|0,4|1,0|0,0| 0    BN[1]
      ** +***----------------------------+

   If the packet's state equals the state at the head of the buffer,
   the HC-Receiver may choose to increment its run length (up to the
   maximum). For example, if HC-Sender packet 11 arrived without ECN
   marking and with ECN Nonce 0, the Example Buffer might enter this
   state instead:

      ** +--*------------------------+
      11 |0,1|3,0|3,0|3,0|0,4|1,0|0,0| 0    BN[1]
      ** +--*------------------------+

   Of course, the new packet's sequence number might not equal the
   expected sequence number. In this case, the HC-Receiver will enter
   the intervening packets as State 3.
   If several packets are missing, the HC-Receiver may prefer to enter
   multiple bytes with run length 0, rather than a single byte with a
   larger run length; this simplifies table updates if one of the
   missing packets arrives. For example, if HC-Sender packet 12
   arrived with ECN Nonce 1, the Example Buffer would enter this state
   (packet 11 is entered as State 3, and the old head byte for packet
   10 is unchanged):

      ** +*******----------------------------+         *
      12 |0,0|3,0|0,0|3,0|3,0|3,0|0,4|1,0|0,0| 0    BN[0]
      ** +*******----------------------------+         *

   Of course, the circular buffer may overflow, either when the
   HC-Sender is sending data at a very high rate, when the
   HC-Receiver's acknowledgements are not reaching the HC-Sender, or
   when the HC-Sender is forgetting to acknowledge those acks (so the
   HC-Receiver is unable to clean up old state). In this case, the
   HC-Receiver should either compress the buffer (by increasing run
   lengths when possible), transfer its state to a larger buffer, or,
   as a last resort, drop all received packets, without processing them
   whatsoever, until its buffer shrinks again.

A.1.2.  Old Packets

   When a packet with Sequence Number S arrives, and S <= buf_ackno,
   the HC-Receiver will scan the table for the byte corresponding to S.
   (Indexing structures could reduce the complexity of this scan.) If
   S was previously lost (State 3), and it was stored in a byte with
   run length 0, the HC-Receiver can simply change the byte's state.
   For example, if HC-Sender packet 8 was received with ECN Nonce 0,
   the Example Buffer would enter this state:

         +--------*------------------+
      10 |0,0|3,0|0,0|3,0|0,4|1,0|0,0| 0    BN[1]
         +--------*------------------+

   If S was not marked as lost, or if it was not contained in the
   table, the packet is probably a duplicate, and should be ignored.
   (The new packet's ECN marking state might differ from the state in
   the buffer; Section 11.4.1 describes what is allowed then.) If S's
   buffer byte has a non-zero run length, then the buffer might need to
   be reshuffled to make space for one or two new bytes.

   The ack_nonce fields may also need manipulation when old packets
   arrive. In particular, when S transitions from State 3 or State 1
   to State 0, and S had ECN Nonce 1, then the implementation should
   flip the value of ack_nonce for every acknowledgement with ack_ackno
   >= S.

   It is impossible with this data structure to shift packets from
   State 0 to State 1, since the buffer doesn't store individual
   packets' ECN Nonces.

A.2.  Sending Acknowledgements

   Whenever the HC-Receiver needs to generate an acknowledgement, the
   buffer's contents can simply be copied into one or more Ack Vector
   options. Copied Ack Vectors might not be maximally compressed; for
   example, the Example Buffer above contains three adjacent 3,0 bytes
   that could be combined into a single 3,2 byte. The HC-Receiver
   might, therefore, choose to compress the buffer in place before
   sending the option, or to compress the buffer while copying it;
   either operation is simple.

   Every acknowledgement sent by the HC-Receiver SHOULD include the
   entire state of the buffer. That is, acknowledgements are
   cumulative.

   If the acknowledgement fits in one Ack Vector, that Ack Vector's
   Nonce Echo simply equals buf_nonce. For multiple Ack Vectors, more
   care is required. The Ack Vectors should be split at points
   corresponding to previous acknowledgements, since the stored
   ack_nonce fields provide enough information to calculate correct
   Nonce Echoes. The implementation should therefore acknowledge data
   at least once per 253 bytes of buffer state. (Otherwise, there'd be
   no way to calculate a Nonce Echo.)
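   The compression step mentioned earlier in this section can be
   sketched in Python as follows. The list-of-pairs representation and
   the names expand, compress, and to_option_bytes are assumptions made
   for illustration only. The byte layout (two-bit State followed by a
   six-bit Run Length, hence a maximum run length of 63) is assumed
   from the Ack Vector option encoding in Section 11.4.

```python
MAX_RUN_LENGTH = 63  # six-bit Run Length field (assumed from Section 11.4)

# The Example Buffer, head first, as (state, run_length) pairs.
EXAMPLE_BUFFER = [(0, 0), (3, 0), (3, 0), (3, 0), (0, 4), (1, 0), (0, 0)]


def expand(buf, buf_ackno):
    """Map each acknowledged sequence number to its state."""
    states = {}
    seqno = buf_ackno
    for state, run_length in buf:
        # A run length of L covers L + 1 consecutive packets.
        for _ in range(run_length + 1):
            states[seqno] = state
            seqno -= 1
    return states


def compress(buf):
    """Greedily merge adjacent bytes that share a state."""
    out = []
    for state, run_length in buf:
        if out and out[-1][0] == state:
            merged = out[-1][1] + run_length + 1
            if merged <= MAX_RUN_LENGTH:
                out[-1] = (state, merged)
                continue
        out.append((state, run_length))
    return out


def to_option_bytes(buf):
    """Pack each pair into one byte, state in the top two bits."""
    return bytes((state << 6) | run_length for state, run_length in buf)


compressed = compress(EXAMPLE_BUFFER)
print(compressed)  # [(0, 0), (3, 2), (0, 4), (1, 0), (0, 0)]
print(to_option_bytes(compressed).hex())  # 00c2044000
```

   In this sketch, expand() returns the same per-packet states for the
   Example Buffer before and after compression; only the encoding
   becomes shorter.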
   For each acknowledgement it sends, the HC-Receiver will add an
   acknowledgement record. ack_seqno will equal the HC-Receiver
   sequence number it used for the ack packet; ack_ptr will equal
   buf_head; ack_ackno will equal buf_ackno; and ack_nonce will equal
   buf_nonce.

A.3.  Clearing State

   Some of the HC-Sender's packets will include acknowledgement
   numbers, which ack the HC-Receiver's acknowledgements. When such an
   ack is received, the HC-Receiver finds the acknowledgement record R
   with the appropriate ack_seqno, then:

   o  Sets buf_tail to R.ack_ptr + 1.

   o  If R.ack_nonce is 1, it flips buf_nonce, and the value of
      ack_nonce for every later ack record.

   o  Throws away R and every preceding ack record.

   (The HC-Receiver may choose to keep some older information, in case
   a lost packet shows up late.) For example, say that the HC-Receiver
   storing the Example Buffer had sent two acknowledgements already:

   1.  ack_seqno = 59, ack_ackno = 3, ack_nonce = 1.

   2.  ack_seqno = 60, ack_ackno = 10, ack_nonce = 0.

   Say the HC-Receiver then received a DCCP-DataAck packet with
   Acknowledgement Number 59 from the HC-Sender. This informs the
   HC-Receiver that the HC-Sender received, and processed, all the
   information in HC-Receiver packet 59. This packet acknowledged
   HC-Sender packet 3, so the HC-Sender has now received HC-Receiver's
   acknowledgements for packets 0, 1, 2, and 3. The Example Buffer
   should enter this state:

         +------------------*+ *       *
      10 |0,0|3,0|3,0|3,0|0,2| 4    BN[0]
         +------------------*+ *       *

   The tail byte's run length was adjusted, since packet 3 was in the
   middle of that byte. Since R.ack_nonce was 1, the buf_nonce field
   was flipped, as were the ack_nonce fields for later acknowledgements
   (here, the HC-Receiver Ack 60 record, not shown, has its ack_nonce
   flipped to 1).
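   The worked example above can be reproduced with a minimal Python
   sketch. It is an illustration under stated assumptions, not the
   appendix's implementation: the buffer is modelled as a head-first
   list of (state, run_length) pairs, and old state is trimmed by
   R.ack_ackno rather than by moving buf_tail to R.ack_ptr + 1. The
   names AckRecord and clear_state are hypothetical, and sequence
   number wraparound is ignored.

```python
from dataclasses import dataclass


@dataclass
class AckRecord:
    ack_seqno: int  # HC-Receiver sequence number of the ack packet
    ack_ackno: int  # HC-Sender sequence number it acknowledged
    ack_nonce: int  # one-bit nonce sum stored for this record


def clear_state(buf, buf_ackno, buf_nonce, records, acked_seqno):
    """Drop state covered by the record whose ack_seqno was acked.

    Returns (new_buf, new_buf_nonce, remaining_records).
    """
    index = None
    for i, record in enumerate(records):
        if record.ack_seqno == acked_seqno:
            index = i
            break
    if index is None:
        return buf, buf_nonce, records  # no matching record: no change
    r = records[index]

    # Keep only packets newer than r.ack_ackno, shortening the run
    # length of the byte that straddles the boundary.
    keep = buf_ackno - r.ack_ackno
    new_buf = []
    for state, run_length in buf:
        if keep <= 0:
            break
        count = min(run_length + 1, keep)
        new_buf.append((state, count - 1))
        keep -= count

    # Throw away R and every preceding record; flip nonces if needed.
    later = records[index + 1:]
    if r.ack_nonce == 1:
        buf_nonce ^= 1
        for record in later:
            record.ack_nonce ^= 1
    return new_buf, buf_nonce, later


example = [(0, 0), (3, 0), (3, 0), (3, 0), (0, 4), (1, 0), (0, 0)]
acks = [AckRecord(59, 3, 1), AckRecord(60, 10, 0)]
new_buf, new_nonce, acks = clear_state(example, 10, 1, acks, 59)
print(new_buf)    # [(0, 0), (3, 0), (3, 0), (3, 0), (0, 2)]
print(new_nonce)  # 0
print(acks)
```

   The result matches the buffer drawn above: the tail byte becomes
   0,2, buf_nonce becomes 0, and the remaining Ack 60 record has its
   ack_nonce flipped to 1.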
   The HC-Receiver can also throw away stored information about
   HC-Receiver Ack 59 and any earlier acknowledgements.

   A careful implementation might try to ensure reasonable robustness
   to reordering. Suppose that the Example Buffer is as before, but
   that packet 9 now arrives, out of sequence. The buffer would enter
   this state:

         +----*----------------------+
      10 |0,0|0,0|3,0|3,0|0,4|1,0|0,0| 0    BN[1]
         +----*----------------------+

   The danger is that the HC-Sender might acknowledge the HC-Receiver's
   previous acknowledgement (with sequence number 60), which says that
   Packet 9 was not received, before the HC-Receiver has a chance to
   send a new acknowledgement saying that Packet 9 actually was
   received. Therefore, when packet 9 arrived, the HC-Receiver might
   modify its acknowledgement record to:

   1.  ack_seqno = 59, ack_ackno = 3, ack_nonce = 1.

   2.  ack_seqno = 60, ack_ackno = 3, ack_nonce = 1.

   That is, Ack 60 is now treated like a duplicate of Ack 59. This
   would prevent the Tail pointer from moving past packet 9 until the
   HC-Receiver knows that the HC-Sender has seen an Ack Vector
   indicating that packet's arrival.

A.4.  Processing Acknowledgements

   When the HC-Sender receives an acknowledgement, it generally cares
   about the number of packets that were dropped and/or ECN marked. It
   simply reads this off the Ack Vector. Additionally, it should check
   the ECN Nonce for correctness. (As described in Section 11.4.1, it
   may want to keep more detailed information about acknowledged
   packets in case packets change states between acknowledgements, or
   in case the application queries whether a packet arrived.)

   The HC-Sender must also acknowledge the HC-Receiver's
   acknowledgements so that the HC-Receiver can free old Ack Vector
   state.
   (Since Ack Vector acknowledgements are reliable, the HC-Receiver
   must maintain and resend Ack Vector information until it is sure
   that the HC-Sender has received that information.) A simple
   algorithm suffices: since Ack Vector acknowledgements are
   cumulative, a single acknowledgement number tells the HC-Receiver
   how much ack information has arrived. Assuming that the HC-Receiver
   sends no data, the HC-Sender can ensure that at least once a
   round-trip time, it sends a DCCP-DataAck packet acknowledging the
   latest DCCP-Ack packet it has received. Of course, the HC-Sender
   only needs to acknowledge the HC-Receiver's acknowledgements if the
   HC-Sender is also sending data. If the HC-Sender is not sending
   data, then the HC-Receiver's Ack Vector state is stable, and there
   is no need to shrink it. The HC-Sender must watch for drops and ECN
   marks on received DCCP-Ack packets so that it can adjust the
   HC-Receiver's ack-sending rate -- for example, with Ack Ratio -- in
   response to congestion.

   If the other half-connection is not quiescent -- that is, the
   HC-Receiver is sending data to the HC-Sender, possibly using another
   CCID -- then the acknowledgements on that half-connection are
   sufficient for the HC-Receiver to free its state.

B.  Appendix: Partial Checksumming Design Motivation

   A great deal of discussion has taken place regarding the utility of
   allowing a DCCP sender to restrict the checksum so that it does not
   cover the complete packet. This section attempts to capture some of
   the rationale behind specific details of DCCP design.

   Many of the applications that we envisage using DCCP are resilient
   to some degree of data loss, or they would typically have chosen a
   reliable transport. Some of these applications may also be
   resilient to data corruption -- some audio payloads, for example.
   These resilient applications might prefer to receive corrupted data
   rather than have DCCP drop a corrupted packet. This is particularly
   because of congestion control: DCCP cannot tell the difference
   between packets dropped due to corruption and packets dropped due to
   congestion, and so it must reduce the transmission rate accordingly.
   This response may cause the connection to receive less bandwidth
   than it is due; corruption in some networking technologies is
   independent of, or at least not always correlated to, congestion.
   Therefore, corrupted packets do not need to cause as strong a
   reduction in transmission rate as the congestion response would
   dictate (so long as the DCCP header and options are not corrupt).

   Thus DCCP allows the checksum to cover all of the packet, just the
   DCCP header, or both the DCCP header and some number of bytes from
   the application data. If the application cannot tolerate any data
   corruption, then the checksum must cover the whole packet. If the
   application would prefer to tolerate some corruption rather than
   have the packet dropped, then it can set the checksum to cover only
   part of the packet (but always the DCCP header). In addition, if
   the application wishes to decouple checksumming of the DCCP header
   from checksumming of the application data, it may do so by including
   the Data Checksum option. This would allow DCCP to discard
   corrupted application data, but still not mistake the corruption for
   network congestion.

   Thus, from the application point of view, partial checksums seem to
   be a desirable feature. However, the usefulness of partial
   checksums depends on partially corrupted packets being delivered to
   the receiver.
   If the link-layer CRC always discards corrupted packets, then this
   will not happen, and so the usefulness of partial checksums would be
   restricted to corruption that occurred in routers and other places
   not covered by link CRCs. There does not appear to be consensus on
   how likely it is that future network links that suffer significant
   corruption will not cover the entire packet with a single strong
   CRC. DCCP makes it possible to tailor such links to the
   application, but it is difficult to predict if this will be
   compelling for future link technologies.

   In addition, partial checksums do not co-exist well with IP-level
   authentication mechanisms such as IPsec AH, which cover the entire
   packet with a cryptographic hash. Thus, if cryptographic
   authentication mechanisms are required to co-exist with partial
   checksums, the authentication must be carried in the application
   data. A possible mode of usage would appear to be similar to that
   of Secure RTP. However, such "application-level" authentication
   does not protect the DCCP option negotiation and state machine from
   forged packets. An alternative would be to use IPsec ESP, and use
   encryption to protect the DCCP headers against attack, while using
   the DCCP header validity checks to authenticate that the header is
   from someone who possessed the correct key. However, while this is
   resistant to replay (due to the DCCP sequence number), it is not by
   itself resistant to some forms of man-in-the-middle attacks because
   the application data is not tightly coupled to the packet header.
   Thus an application-level authentication probably needs to be
   coupled with IPsec ESP or a similar mechanism to provide a
   reasonably complete security solution. The overhead of such a
   solution might be unacceptable for some applications that would
   otherwise wish to use partial checksums.

   On balance, the authors believe that DCCP partial checksums have the
   potential to enable some future uses that would otherwise be
   difficult. As the cost and complexity of supporting them is small,
   it seems worth including them at this time. It remains to be seen
   whether they are useful in practice.

Normative References

   [RFC 793] J. Postel, editor. Transmission Control Protocol.
       RFC 793.

   [RFC 1191] J. C. Mogul and S. E. Deering. Path MTU Discovery.
       RFC 1191.

   [RFC 2119] S. Bradner. Key Words For Use in RFCs to Indicate
       Requirement Levels. RFC 2119.

   [RFC 2434] T. Narten and H. Alvestrand. Guidelines for Writing an
       IANA Considerations Section in RFCs. RFC 2434.

   [RFC 2460] S. Deering and R. Hinden. Internet Protocol, Version 6
       (IPv6) Specification. RFC 2460.

   [RFC 3168] K.K. Ramakrishnan, S. Floyd, and D. Black. The Addition
       of Explicit Congestion Notification (ECN) to IP. RFC 3168.

   [RFC 3309] J. Stone, R. Stewart, and D. Otis. Stream Control
       Transmission Protocol (SCTP) Checksum Change. RFC 3309.

   [RFC 3692] T. Narten. Assigning Experimental and Testing Numbers
       Considered Useful. RFC 3692.

   [RFC 3775] D. Johnson, C. Perkins, and J. Arkko. Mobility Support
       in IPv6. RFC 3775.

   [RFC 3828] L-A. Larzon, M. Degermark, S. Pink, L-E. Jonsson, editor,
       and G. Fairhurst, editor. The Lightweight User Datagram Protocol
       (UDP-Lite). RFC 3828.

Informative References

   [BB01] S.M. Bellovin and M. Blaze. Cryptographic Modes of Operation
       for the Internet. 2nd NIST Workshop on Modes of Operation,
       August 2001.

   [BEL98] S.M. Bellovin. Cryptography and the Internet. Proc. CRYPTO
       '98 (LNCS 1462), pp. 46-55, August 1998.

   [CCID 2 PROFILE] S. Floyd and E. Kohler. Profile for DCCP
       Congestion Control ID 2: TCP-like Congestion Control.
       draft-ietf-dccp-ccid2-08.txt, work in progress, November 2004.

   [CCID 3 PROFILE] S.
       Floyd, E. Kohler, and J. Padhye. Profile for DCCP Congestion
       Control ID 3: TFRC Congestion Control.
       draft-ietf-dccp-ccid3-08.txt, work in progress, November 2004.

   [M85] Robert T. Morris. A Weakness in the 4.2BSD Unix TCP/IP
       Software. Computer Science Technical Report 117, AT&T Bell
       Laboratories, Murray Hill, NJ, February 1985.

   [PMTUD] Matt Mathis, John Heffner, and Kevin Lahey. Path MTU
       Discovery. draft-ietf-pmtud-method-01.txt, work in progress,
       February 2004.

   [RFC 792] J. Postel, editor. Internet Control Message Protocol.
       RFC 792.

   [RFC 1750] D. Eastlake, S. Crocker, and J. Schiller. Randomness
       Recommendations for Security. RFC 1750.

   [RFC 1812] F. Baker, editor. Requirements for IP Version 4 Routers.
       RFC 1812.

   [RFC 1948] S. Bellovin. Defending Against Sequence Number Attacks.
       RFC 1948.

   [RFC 2018] M. Mathis, J. Mahdavi, S. Floyd, and A. Romanow. TCP
       Selective Acknowledgement Options. RFC 2018.

   [RFC 2401] S. Kent and R. Atkinson. Security Architecture for the
       Internet Protocol. RFC 2401.

   [RFC 2463] A. Conta and S. Deering. Internet Control Message
       Protocol (ICMPv6) for the Internet Protocol Version 6 (IPv6)
       Specification. RFC 2463.

   [RFC 2581] M. Allman, V. Paxson, and W. Stevens. TCP Congestion
       Control. RFC 2581.

   [RFC 2960] R. Stewart, Q. Xie, K. Morneault, C. Sharp, H.
       Schwarzbauer, T. Taylor, I. Rytina, M. Kalla, L. Zhang, and V.
       Paxson. Stream Control Transmission Protocol. RFC 2960.

   [RFC 3124] H. Balakrishnan and S. Seshan. The Congestion Manager.
       RFC 3124.

   [RFC 3360] S. Floyd. Inappropriate TCP Resets Considered Harmful.
       RFC 3360.

   [RFC 3448] M. Handley, S. Floyd, J. Padhye, and J. Widmer. TCP
       Friendly Rate Control (TFRC): Protocol Specification. RFC 3448.

   [RFC 3540] N. Spring, D. Wetherall, and D. Ely. Robust Explicit
       Congestion Notification (ECN) Signaling with Nonces. RFC 3540.

   [RFC 3550] H. Schulzrinne, S. Casner, R. Frederick, and V. Jacobson.
       RTP: A Transport Protocol for Real-Time Applications. STD 64.
       RFC 3550.

   [RFC 3611] T. Friedman, R. Caceres, and A. Clark, editors. RTP
       Control Protocol Extended Reports (RTCP XR). RFC 3611.

   [RFC 3711] M. Baugher, D. McGrew, M. Naslund, E. Carrara, and K.
       Norrman. The Secure Real-time Transport Protocol (SRTP).
       RFC 3711.

   [RFC 3819] P. Karn, editor, C. Bormann, G. Fairhurst, D. Grossman,
       R. Ludwig, J. Mahdavi, G. Montenegro, J. Touch, and L. Wood.
       Advice for Internet Subnetwork Designers. RFC 3819.

   [SHHP00] Oliver Spatscheck, Jorgen S. Hansen, John H. Hartman, and
       Larry L. Peterson. Optimizing TCP Forwarder Performance.
       IEEE/ACM Transactions on Networking 8(2):146-157, April 2000.

   [SYNCOOKIES] Daniel J. Bernstein. SYN Cookies.
       http://cr.yp.to/syncookies.html, as of July 2003.

Authors' Addresses

   Eddie Kohler
   4531C Boelter Hall
   UCLA Computer Science Department
   Los Angeles, CA 90095
   USA

   Mark Handley
   Department of Computer Science
   University College London
   Gower Street
   London WC1E 6BT
   UK

   Sally Floyd
   ICSI Center for Internet Research
   1947 Center Street, Suite 600
   Berkeley, CA 94704
   USA

Full Copyright Statement

   Copyright (C) The Internet Society 2004. This document is subject
   to the rights, licenses and restrictions contained in BCP 78, and
   except as set forth therein, the authors retain all their rights.

   This document and the information contained herein are provided on
   an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
   REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE
   INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR
   IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed
   to pertain to the implementation or use of the technology described
   in this document or the extent to which any license under such
   rights might or might not be available; nor does it represent that
   it has made any independent effort to identify any such rights.
   Information on the procedures with respect to rights in RFC
   documents can be found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use
   of such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository
   at http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard. Please address the information to the IETF at
   ietf-ipr@ietf.org.