idnits 2.17.1

draft-ietf-dccp-spec-09.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** It looks like you're using RFC 3978 boilerplate. You should update this
     to the boilerplate described in the IETF Trust License Policy document
     (see https://trustee.ietf.org/license-info), which is required now.

  -- Found old boilerplate from RFC 3978, Section 5.1.a on line 18.

  -- Found old boilerplate from RFC 3978, Section 5.5 on line 5781.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 5792.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 5799.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 5805.

  ** Found boilerplate matching RFC 3978, Section 5.4, paragraph 1 (on line
     5773), which is fine, but *also* found old RFC 2026, Section 10.4C,
     paragraph 1 text on line 40.

  ** The document seems to lack an RFC 3978 Section 5.1 IPR Disclosure
     Acknowledgement.

  ** This document has an original RFC 3978 Section 5.4 Copyright Line,
     instead of the newer IETF Trust Copyright according to RFC 4748.

  ** This document has an original RFC 3978 Section 5.5 Disclaimer, instead of
     the newer disclaimer which includes the IETF Trust according to RFC 4748.

  ** The document uses RFC 3667 boilerplate or RFC 3978-like boilerplate
     instead of verbatim RFC 3978 boilerplate. After 6 May 2005, submission
     of drafts without verbatim RFC 3978 boilerplate is not accepted.

     The following non-3978 patterns matched text found in the document.
     That text should be removed or replaced:

       This document is an Internet-Draft and is subject to all provisions of
       Section 3 of RFC 3667.

       By submitting this Internet-Draft, each author represents that any
       applicable patent or other IPR claims of which he or she is aware have
       been or will be disclosed, and any of which he or she becomes aware
       will be disclosed, in accordance with Section 6 of BCP 79.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not
     match the current year

  == Line 1833 has weird spacing: '...t value snd...'

  == Line 2396 has weird spacing: '...loseReq seq...'

  == The document seems to use 'NOT RECOMMENDED' as an RFC 2119 keyword, but
     does not include the phrase in its RFC 2119 key words list.

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008. If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment. If not, you may need to add the pre-RFC5378 disclaimer.
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (14 November 2004) is 7075 days in the past. Is this
     intentional?

  -- Found something which looks like a code comment -- if you have code
     sections in the document, please surround them with '<CODE BEGINS>' and
     '<CODE ENDS>' lines.

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Missing Reference: 'CLOSED' is mentioned on line 848, but not defined

  == Missing Reference: 'LISTEN' is mentioned on line 848, but not defined

  == Missing Reference: 'TIMEWAIT' is mentioned on line 857, but not defined

  == Missing Reference: 'Nonce 0' is mentioned on line 4528, but not defined

  == Missing Reference: 'Nonce 1' is mentioned on line 4528, but not defined

  == Missing Reference: 'AWL' is mentioned on line 2359, but not defined

  == Missing Reference: 'AWH' is mentioned on line 2359, but not defined

  == Missing Reference: 'SWL' is mentioned on line 2359, but not defined

  == Missing Reference: 'SWH' is mentioned on line 2359, but not defined

  == Missing Reference: 'RFC TBA' is mentioned on line 3569, but not defined

  == Missing Reference: 'DrpCd' is mentioned on line 4287, but not defined

  == Missing Reference: 'E' is mentioned on line 5298, but not defined

  -- Looks like a reference, but probably isn't: '1' on line 5510

  -- Looks like a reference, but probably isn't: '0' on line 5493

  == Unused Reference: 'RFC 2960' is defined on line 5715, but no explicit
     reference was found in the text

  ** Obsolete normative reference: RFC 793 (Obsoleted by RFC 9293)

  ** Obsolete normative reference: RFC 2434 (Obsoleted by RFC 5226)

  ** Obsolete normative reference: RFC 2460 (Obsoleted by RFC 8200)

  ** Obsolete normative reference: RFC 3309 (Obsoleted by RFC 4960)

  ** Obsolete normative reference: RFC 3775 (Obsoleted by RFC 6275)

  == Outdated reference: A later version (-11) exists of
     draft-ietf-pmtud-method-01

  -- Obsolete informational reference (is this intentional?): RFC 1750
     (Obsoleted by RFC 4086)

  -- Obsolete informational reference (is this intentional?): RFC 1948
     (Obsoleted by RFC 6528)

  -- Obsolete informational reference (is this intentional?): RFC 2401
     (Obsoleted by RFC 4301)

  -- Obsolete informational reference (is this intentional?): RFC 2581
     (Obsoleted by RFC 5681)

  -- Obsolete informational reference (is this intentional?): RFC 2960
     (Obsoleted by RFC 4960)

  -- Obsolete informational reference (is this intentional?): RFC 3448
     (Obsoleted by RFC 5348)

     Summary: 11 errors (**), 0 flaws (~~), 19 warnings (==), 16 comments
     (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

1    Internet Engineering Task Force                             Eddie Kohler
2    INTERNET-DRAFT                                                      UCLA
3    draft-ietf-dccp-spec-09.txt                                 Mark Handley
4    Expires: 14 May 2005                                                 UCL
5                                                                 Sally Floyd
6                                                                        ICIR
7                                                            14 November 2004

9                  Datagram Congestion Control Protocol (DCCP)

11   Status of this Memo

13      This document is an Internet-Draft and is subject to all provisions
14      of section 3 of RFC 3667. By submitting this Internet-Draft, each
15      author represents that any applicable patent or other IPR claims of
16      which he or she is aware have been or will be disclosed, and any of
17      which he or she becomes aware will be disclosed, in accordance with
18      RFC 3668.

20      Internet-Drafts are working documents of the Internet Engineering
21      Task Force (IETF), its areas, and its working groups. Note that
22      other groups may also distribute working documents as Internet-
23      Drafts.

25      Internet-Drafts are draft documents valid for a maximum of six
26      months and may be updated, replaced, or obsoleted by other documents
27      at any time. It is inappropriate to use Internet-Drafts as
28      reference material or to cite them other than as "work in progress."

30      The list of current Internet-Drafts can be accessed at
31      http://www.ietf.org/ietf/1id-abstracts.txt.

33      The list of Internet-Draft Shadow Directories can be accessed at
34      http://www.ietf.org/shadow.html.

36      This Internet-Draft will expire on 14 May 2005.

38   Copyright Notice

40      Copyright (C) The Internet Society (2004). All Rights Reserved.

42   Abstract

44      The Datagram Congestion Control Protocol (DCCP) is a transport
45      protocol that provides bidirectional unicast connections of
46      congestion-controlled unreliable datagrams. DCCP is suitable for
47      applications that transfer fairly large amounts of data, but can
48      benefit from control over the tradeoff between timeliness and
49      reliability.

51   TO BE DELETED BY THE RFC EDITOR UPON PUBLICATION:

53      Changes since draft-ietf-dccp-spec-08.txt:

55      * Added minimum Sequence Window.

57      * Init Cookie implementation sketch.

59      * Include reasoning for ignoring options on DCCP-Data.

61      * More Aggression Penalty explanation.

63      * More explanation on Ack Vectors that report information on packets
64      that haven't been sent.

66      Changes since draft-ietf-dccp-spec-07.txt:

68      * Many changes, not listed here, for WGLC.

70      * The more stringent Sequence Number checks on DCCP-Sync and DCCP-
71      SyncAck packets become SHOULD, not MAY.

73      Changes since draft-ietf-dccp-spec-06.txt:

75      * Change extended sequence numbers. Now 48-bit sequence numbers are
76      MANDATORY, and all packet types except Data, Ack, and DataAck always
77      use 48-bit sequence numbers. This change improves DCCP's robustness
78      against blind attacks.

80      * Removed empty Change options.

82      * Allow preference list changes during feature negotiations (this
83      seems easier to implement than the alternative). This required a
84      new feature negotiation state, UNSTABLE.

86      * Add Minimum Checksum Coverage feature.

88      * Add Reset Congestion State option.

90      * Simplify the implementation of CCID-specific option processing: no
91      need to check whether the CCID feature is being negotiated.

93      * Many more minor changes.

95      Changes since draft-ietf-dccp-spec-05.txt:

97      * Organization overhaul.

99      * Add pseudocode for event processing.

101     * Remove # NDP; replace with Ack Count.

103     * Remove Identification, Challenge, ID Regime, and Connection Nonce.

105     * Data Checksum (formerly Payload Checksum) uses a 32-bit CRC.

107     * Switch location of non-negotiable features to clarify
108     presentation; now the feature location controls its value.

110     * Rename "value type" to "reconciliation rule".

112     * Rename "Reset Reason" to "Reset Code".

114     * Mobility ID becomes 128 bits long.

116     * Add probabilities to Mobility ID discussion.

118     * Add SyncAck.

120  Table of Contents

122     1. Introduction
123     2. Design Rationale
124     3. Conventions and Terminology
125         3.1. Numbers and Fields
126         3.2. Parts of a Connection
127         3.3. Features
128         3.4. Round-Trip Times
129         3.5. Security Limitation
130         3.6. Robustness Principle
131     4. Overview
132         4.1. Packet Types
133         4.2. Sequence Numbers
134         4.3. States
135         4.4. Congestion Control
136         4.5. Features
137         4.6. Differences From TCP
138         4.7. Example Connection
139     5. Packet Formats
140         5.1. Generic Header
141         5.2. DCCP-Request Packets
142         5.3. DCCP-Response Packets
143         5.4. DCCP-Data, DCCP-Ack, and DCCP-DataAck Packets
144         5.5. DCCP-CloseReq and DCCP-Close Packets
145         5.6. DCCP-Reset Packets
146         5.7. DCCP-Sync and DCCP-SyncAck Packets
147         5.8. Options
148             5.8.1. Padding Option
149             5.8.2. Mandatory Option
150     6. Feature Negotiation
151         6.1. Change Options
152         6.2. Confirm Options
153         6.3. Reconciliation Rules
154             6.3.1. Server-Priority
155             6.3.2. Non-Negotiable
156         6.4. Feature Numbers
157         6.5. Examples
158         6.6. Option Exchange
159             6.6.1. Normal Exchange
160             6.6.2. Processing Received Options
161             6.6.3. Loss and Retransmission
162             6.6.4. Reordering
163             6.6.5. Preference Changes
164             6.6.6. Simultaneous Negotiation
165             6.6.7. Unknown Features
166             6.6.8. Invalid Options
167             6.6.9. Mandatory Feature Negotiation

169     7. Sequence Numbers
170         7.1. Variables
171         7.2. Initial Sequence Numbers
172         7.3. Quiet Time
173         7.4. Acknowledgement Numbers
174         7.5. Validity and Synchronization
175             7.5.1. Sequence and Acknowledgement Number
176             Windows
177             7.5.2. Sequence Window Feature
178             7.5.3. Sequence-Validity Rules
179             7.5.4. Handling Sequence-Invalid Packets
180             7.5.5. Sequence Number Attacks
181             7.5.6. Examples
182         7.6. Short Sequence Numbers
183             7.6.1. Allow Short Sequence Numbers Feature
184             7.6.2. When to Avoid Short Sequence Numbers
185         7.7. NDP Count and Detecting Application Loss
186             7.7.1. Usage Notes
187             7.7.2. Send NDP Count Feature
188     8. Event Processing
189         8.1. Connection Establishment
190             8.1.1. Client Request
191             8.1.2. Service Codes
192             8.1.3. Server Response
193             8.1.4. Init Cookie Option
194             8.1.5. Handshake Completion
195         8.2. Data Transfer
196         8.3. Termination
197             8.3.1. Abnormal Termination
198         8.4. DCCP State Diagram
199         8.5. Pseudocode
200     9. Checksums
201         9.1. Header Checksum Field
202         9.2. Header Checksum Coverage Field
203             9.2.1. Minimum Checksum Coverage Feature
204         9.3. Data Checksum Option
205             9.3.1. Check Data Checksum Feature
206             9.3.2. Usage Notes
207     10. Congestion Control
208         10.1. TCP-like Congestion Control
209         10.2. TFRC Congestion Control
210         10.3. CCID-Specific Options, Features, and Reset
211         Codes
212         10.4. CCID Profile Requirements
213         10.5. Congestion State
214     11. Acknowledgements
215         11.1. Acks of Acks and Unidirectional Connections
216         11.2. Ack Piggybacking
217         11.3. Ack Ratio Feature
218         11.4. Ack Vector Options
219             11.4.1. Ack Vector Consistency
220             11.4.2. Ack Vector Coverage
221         11.5. Send Ack Vector Feature
222         11.6. Slow Receiver Option
223         11.7. Data Dropped Option
224             11.7.1. Data Dropped and Normal Congestion
225             Response
226             11.7.2. Particular Drop Codes
227     12. Explicit Congestion Notification
228         12.1. ECN Incapable Feature
229         12.2. ECN Nonces
230         12.3. Aggression Penalties
231     13. Timing Options
232         13.1. Timestamp Option
233         13.2. Elapsed Time Option
234         13.3. Timestamp Echo Option
235     14. Maximum Packet Size
236         14.1. Measuring PMTU
237         14.2. Sender Behavior
238     15. Forward Compatibility
239     16. Middlebox Considerations
240     17. Relations to Other Specifications
241         17.1. RTP
242         17.2. Congestion Manager and Multiplexing
243     18. Security Considerations
244         18.1. Security Considerations for Partial
245         Checksums
246     19. IANA Considerations
247         19.1. Packet Types
248         19.2. Reset Codes
249         19.3. Option Types
250         19.4. Feature Numbers
251         19.5. Congestion Control Identifiers
252         19.6. Ack Vector States
253         19.7. Drop Codes
254         19.8. Service Codes
255     20. Thanks
256     A. Appendix: Ack Vector Implementation Notes
257         A.1. Packet Arrival
258             A.1.1. New Packets
259             A.1.2. Old Packets
260         A.2. Sending Acknowledgements
261         A.3. Clearing State
262         A.4. Processing Acknowledgements
263     B. Appendix: Partial Checksumming Design Motivation
264     Normative References
265     Informative References
266     Authors' Addresses
267     Full Copyright Statement
268     Intellectual Property
269  List of Tables

271     Table 1: DCCP Packet Types
272     Table 2: DCCP Reset Codes
273     Table 3: DCCP Options
274     Table 4: DCCP Feature Numbers
275     Table 5: DCCP Congestion Control Identifiers
276     Table 6: DCCP Ack Vector States
277     Table 7: DCCP Drop Codes

279  1. Introduction

281     The Datagram Congestion Control Protocol (DCCP) is a transport
282     protocol that implements bidirectional, unicast connections of
283     congestion-controlled, unreliable datagrams. Specifically, DCCP
284     provides:

286     o Unreliable flows of datagrams, with acknowledgements.

288     o Reliable handshakes for connection setup and teardown.

290     o Reliable negotiation of options, including negotiation of a
291       suitable congestion control mechanism.

293     o Mechanisms allowing servers to avoid holding state for
294       unacknowledged connection attempts and already-finished
295       connections.

297     o Congestion control incorporating Explicit Congestion Notification
298       (ECN) and the ECN Nonce, as per [RFC 3168] and [RFC 3540].

300     o Acknowledgement mechanisms communicating packet loss and ECN
301       information. Acks are transmitted as reliably as the relevant
302       congestion control mechanism requires, possibly completely
303       reliably.

305     o Optional mechanisms that tell the sending application, with high
306       reliability, which data packets reached the receiver, and whether
307       those packets were ECN marked, corrupted, or dropped in the
308       receive buffer.

310     o Path Maximum Transmission Unit (PMTU) discovery, as per [RFC
311       1191].

313     o A choice of modular congestion control mechanisms. Two
314       mechanisms are currently specified, TCP-like Congestion Control
315       [CCID 2 PROFILE] and TFRC (TCP-Friendly Rate Control) Congestion
316       Control [CCID 3 PROFILE], but DCCP is easily extensible to
317       further forms of unicast congestion control.

319     DCCP is intended for applications such as streaming media that can
320     benefit from control over the tradeoffs between delay and reliable
321     in-order delivery. TCP is not well-suited for these applications,
322     since reliable in-order delivery and congestion control can cause
323     arbitrarily long delays. UDP avoids long delays, but UDP
324     applications that implement congestion control must do so on their
325     own. DCCP provides built-in congestion control, including ECN
326     support, for unreliable datagram flows, avoiding the arbitrary
327     delays associated with TCP. It also implements reliable connection
328     setup, teardown, and feature negotiation.

330  2. Design Rationale

332     One DCCP design goal was to give most streaming UDP applications
333     little reason not to switch to DCCP, once it is deployed. To
334     facilitate this, DCCP was designed to have as little overhead as
335     possible, both in terms of the packet header size and in terms of
336     the state and CPU overhead required at end hosts. Only the minimal
337     necessary functionality was included in DCCP, leaving other
338     functionality, such as forward error correction (FEC), semi-
339     reliability, and multiple streams, to be layered on top of DCCP as
340     desired.

342     Different forms of conformant congestion control are appropriate for
343     different applications. For example, on-line games might want to
344     make quick use of any available bandwidth, while streaming media
345     might trade off this responsiveness for a steadier, less bursty
346     rate. (Sudden rate changes can cause unacceptable UI glitches, such
347     as audible pauses or clicks in the playout stream.) DCCP thus
348     allows applications to choose from a set of congestion control
349     mechanisms. One alternative, TCP-like Congestion Control, halves
350     the congestion window in response to a packet drop or mark, as in
351     TCP. Applications using this congestion control mechanism will
352     respond quickly to changes in available bandwidth, but must tolerate
353     the abrupt changes in congestion window typical of TCP. A second
354     alternative, TCP-Friendly Rate Control (TFRC, [RFC 3448]), a form of
355     equation-based congestion control, minimizes abrupt changes in the
356     sending rate while maintaining longer-term fairness with TCP. Other
357     alternatives can be added as future congestion control mechanisms
358     are standardized.

360     DCCP also lets unreliable traffic safely use ECN. A UDP kernel API
361     might not allow applications to set UDP packets as ECN-capable,
362     since the API could not guarantee the application would properly
363     detect or respond to congestion. DCCP kernel APIs will have no such
364     issues, since DCCP implements congestion control itself.

366     We chose not to require the use of the Congestion Manager [RFC
367     3124], which allows multiple concurrent streams between the same
368     sender and receiver to share congestion control. The current
369     Congestion Manager can only be used by applications that have their
370     own end-to-end feedback about packet losses, but this is not the
371     case for many of the applications currently using UDP. In addition,
372     the current Congestion Manager does not easily support multiple
373     congestion control mechanisms, or lend itself to the use of forms of
374     TFRC where the state about past packet drops or marks is maintained
375     at the receiver rather than at the sender. DCCP should be able to
376     make use of CM where desired by the application, but we do not see
377     any benefit in making the deployment of DCCP contingent on the
378     deployment of CM itself.

380     We intend for DCCP's protocol mechanisms, which are described in
381     this document, to suit any application desiring unicast congestion-
382     controlled streams of unreliable datagrams. The congestion control
383     mechanisms currently approved for use with DCCP, which are described
384     in separate Congestion Control ID Profiles [CCID 2 PROFILE] [CCID 3
385     PROFILE], may, however, cause problems for some applications,
386     including high-bandwidth interactive video. These applications
387     should be able to use DCCP once suitable Congestion Control ID
388     Profiles are standardized.

390  3. Conventions and Terminology

392     The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
393     "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in
394     this document are to be interpreted as described in [RFC 2119].

396  3.1. Numbers and Fields

398     All multi-byte numerical quantities in DCCP, such as port numbers,
399     Sequence Numbers, and arguments to options, are transmitted in
400     network byte order (most significant byte first).

402     We occasionally refer to the "left" and "right" sides of a bit
403     field. "Left" means towards the most significant bit, and "right"
404     means towards the least significant bit.

406     Random numbers in DCCP are used for their security properties, and
407     SHOULD be chosen according to the guidelines in [RFC 1750].

409     All operations on DCCP sequence numbers, and comparisons such as
410     "greater" and "greatest", use circular arithmetic modulo 2**48.
411 This form of arithmetic preserves the relationships between sequence 412 numbers as they roll over from 2**48 - 1 to 0. Note that the common 413 technique for implementing circular comparison using two's- 414 complement arithmetic, whereby A < B using circular arithmetic if 415 and only if (A - B) < 0 using conventional two's-complement 416 arithmetic, may be used for DCCP sequence numbers, providing they 417 are stored in the most significant 48 bits of 64-bit integers. 419 Reserved bitfields in DCCP packet headers MUST be set to zero by 420 senders, and MUST be ignored by receivers, unless otherwise 421 specified. This is to allow for future protocol extensions. In 422 particular, DCCP processors MUST NOT reset a DCCP connection simply 423 because a Reserved field has non-zero value [RFC 3360]. 425 3.2. Parts of a Connection 427 Each DCCP connection runs between two hosts, which we often name 428 DCCP A and DCCP B. Each connection is actively initiated by one of 429 the hosts, which we call the client; the other, initially passive 430 host is called the server. The term "DCCP endpoint" is used to 431 refer to either of the two hosts explicitly named by the connection 432 (the client and the server). The term "DCCP processor" refers more 433 generally to any host that might need to process a DCCP header; this 434 includes the endpoints and any middleboxes on the path, such as 435 firewalls and network address translators. 437 DCCP connections are bidirectional: data may pass from either 438 endpoint to the other. This means that data and acknowledgements 439 may be flowing in both directions simultaneously. Logically, 440 however, a DCCP connection consists of two separate unidirectional 441 connections, called half-connections. Each half-connection consists 442 of the application data sent by one endpoint and the corresponding 443 acknowledgements sent by the other endpoint. 
We can illustrate this 444 as follows: 446 +--------+ A-to-B half-connection: +--------+ 447 | | --> application data --> | | 448 | | <-- acknowledgements <-- | | 449 | DCCP A | | DCCP B | 450 | | B-to-A half-connection: | | 451 | | <-- application data <-- | | 452 +--------+ --> acknowledgements --> +--------+ 454 Although they are logically distinct, in practice the half- 455 connections overlap; a DCCP-DataAck packet, for example, contains 456 application data relevant to one half-connection and acknowledgement 457 information relevant to the other. 459 In the context of a single half-connection, the terms "HC-Sender" 460 and "HC-Receiver" denote the endpoints sending application data and 461 acknowledgements, respectively. For example, DCCP A is the HC- 462 Sender and DCCP B is the HC-Receiver in the A-to-B half-connection. 464 3.3. Features 466 A DCCP feature is a connection attribute on whose value the two 467 endpoints agree. Many properties of a DCCP connection are 468 controlled by features, including the congestion control mechanisms 469 in use on the two half-connections. The endpoints achieve agreement 470 through the exchange of feature negotiation options in DCCP headers. 472 DCCP features are identified by a feature number and an endpoint. 473 The notation "F/X" represents the feature with feature number F 474 located at DCCP endpoint X. Each valid feature number thus 475 corresponds to two features, which are negotiated separately and 476 need not have the same value. The two endpoints know, and agree on, 477 the value of every valid feature. DCCP A is the "feature location" 478 for all features F/A, and the "feature remote" for all features F/B. 480 3.4. Round-Trip Times 482 DCCP round-trip time measurements are performed by congestion 483 control mechanisms; different mechanisms may measure round-trip time 484 in different ways, or not measure it at all. 
However, the main DCCP 485 protocol does use round-trip times occasionally, such as in the 486 initial values for certain timers. Each DCCP implementation thus 487 defines a default round-trip time for use when no estimate is 488 available; this parameter should default to not less than 489 0.2 seconds, a reasonably conservative round-trip time for Internet 490 TCP connections. Protocol behavior specified in terms of "round- 491 trip time" values actually refers to "a current round-trip time 492 estimate taken by some CCID, or, if no estimate is available, the 493 default round-trip time parameter". 495 The maximum segment lifetime, or MSL, is the maximum length of time 496 a packet can survive in the network. The DCCP MSL should equal that 497 of TCP, which is normally two minutes. 499 3.5. Security Limitation 501 DCCP provides no protection against attackers who can snoop on a 502 connection in progress, or who can guess valid sequence numbers in 503 other ways. Applications desiring stronger security should use 504 IPsec [RFC 2401]; depending on the level of security required, 505 application-level cryptography may also suffice. These issues are 506 discussed further in Sections 18 and 7.5.5. 508 3.6. Robustness Principle 510 DCCP implementations will follow TCP's "general principle of 511 robustness": "be conservative in what you do, be liberal in what you 512 accept from others" [RFC 793]. 514 4. Overview 516 DCCP's high-level connection dynamics echo those of TCP. 517 Connections progress through three phases: initiation, including a 518 three-way handshake; data transfer; and termination. Data can flow 519 both ways over the connection. An acknowledgement framework lets 520 senders discover how much data has been lost, and thus avoid 521 unfairly congesting the network. Of course, DCCP provides 522 unreliable datagram semantics, not TCP's reliable bytestream 523 semantics. 
   The application must package its data into explicit frames, and
   must retransmit its own data as necessary. It may be useful to
   think of DCCP as TCP minus bytestream semantics and reliability, or
   as UDP plus congestion control, handshakes, and acknowledgements.

4.1.  Packet Types

   Ten packet types implement DCCP's protocol functions. For example,
   every new connection attempt begins with a DCCP-Request packet sent
   by the client. A DCCP-Request packet thus resembles a TCP SYN; but
   DCCP-Request is a packet type, not a flag, so there's no way to send
   an unexpected combination such as TCP's SYN+FIN+ACK+RST.

   Eight packet types occur during the progress of a typical
   connection, shown here. Note the three-way handshakes during
   initiation and termination.

      Client                                      Server
      ------                                      ------
                          (1) Initiation
      DCCP-Request -->
                                       <-- DCCP-Response
      DCCP-Ack -->
                          (2) Data transfer
      DCCP-Data, DCCP-Ack, DCCP-DataAck -->
                   <-- DCCP-Data, DCCP-Ack, DCCP-DataAck
                          (3) Termination
                                       <-- DCCP-CloseReq
      DCCP-Close -->
                                          <-- DCCP-Reset

   The two remaining packet types are used to resynchronize after
   bursts of loss.

   Every DCCP packet starts with a 12-byte generic header. Particular
   packet types include additional fixed-size header data; for example,
   DCCP-Acks include an Acknowledgement Number. DCCP options and any
   application data follow the fixed-size header.

   The packet types are as follows:

   DCCP-Request
      Sent by the client to initiate a connection (the first part of
      the three-way initiation handshake).

   DCCP-Response
      Sent by the server in response to a DCCP-Request (the second
      part of the three-way initiation handshake).

   DCCP-Data
      Used to transmit application data.

   DCCP-Ack
      Used to transmit pure acknowledgements.

   DCCP-DataAck
      Used to transmit application data with piggybacked
      acknowledgements.

   DCCP-CloseReq
      Sent by the server to request that the client close the
      connection.

   DCCP-Close
      Used by the client or the server to close the connection;
      elicits a DCCP-Reset in response.

   DCCP-Reset
      Used to terminate the connection, either normally or abnormally.

   DCCP-Sync, DCCP-SyncAck
      Used to resynchronize sequence numbers after large bursts of
      loss.

4.2.  Sequence Numbers

   Each DCCP packet carries a sequence number, so that losses can be
   detected and reported. Unlike TCP sequence numbers, which are byte-
   based, DCCP sequence numbers increment by one per packet. For
   example:

      DCCP A                                      DCCP B
      ------                                      ------
      DCCP-Data(seqno 1) -->
      DCCP-Data(seqno 2) -->
                         <-- DCCP-Ack(seqno 10, ackno 2)
      DCCP-DataAck(seqno 3, ackno 10) -->
                                 <-- DCCP-Data(seqno 11)

   Every DCCP packet increments the sequence number, whether or not it
   contains application data. DCCP-Ack pure acknowledgements increment
   the sequence number, for instance: DCCP B's second packet above uses
   sequence number 11, since sequence number 10 was used for an
   acknowledgement. This lets endpoints detect all packet loss,
   including acknowledgement loss. It also means that endpoints can
   get out of sync after long bursts of loss; the DCCP-Sync and DCCP-
   SyncAck packet types are used to recover (Section 7.5).

   Since DCCP provides unreliable semantics, there are no
   retransmissions, and it doesn't make sense to have a TCP-style
   cumulative acknowledgement field. DCCP's Acknowledgement Number
   field equals the greatest sequence number received, rather than the
   smallest sequence number not received. Separate options indicate
   any intermediate sequence numbers that weren't received.

4.3.  States

   DCCP endpoints progress through different states during the course
   of a connection, corresponding roughly to the three phases of
   initiation, data transfer, and termination.
   The figure below shows the typical progress through these states for
   a client and server.

      Client                                             Server
      ------                                             ------
                          (0) No connection
      CLOSED                                             LISTEN

                          (1) Initiation
      REQUEST      DCCP-Request -->
                                   <-- DCCP-Response     RESPOND
      PARTOPEN     DCCP-Ack or DCCP-DataAck -->

                          (2) Data transfer
      OPEN          <-- DCCP-Data, Ack, DataAck -->      OPEN

                          (3) Termination
                                   <-- DCCP-CloseReq     CLOSEREQ
      CLOSING      DCCP-Close -->
                                      <-- DCCP-Reset     CLOSED
      TIMEWAIT
      CLOSED

   The nine possible states are as follows. They are listed in
   increasing order, so that "state >= CLOSEREQ" means the same as
   "state = CLOSEREQ or state = CLOSING or state = TIMEWAIT". Section
   8 describes the states in more detail.

   CLOSED
      Represents nonexistent connections.

   LISTEN
      Represents server sockets in the passive listening state.
      LISTEN and CLOSED are not associated with any particular DCCP
      connection.

   REQUEST
      A client socket enters this state, from CLOSED, after sending a
      DCCP-Request packet to try to initiate a connection.

   RESPOND
      A server socket enters this state, from LISTEN, after receiving
      a DCCP-Request from a client.

   PARTOPEN
      A client socket enters this state, from REQUEST, after receiving
      a DCCP-Response from the server. This state represents the
      third phase of the three-way handshake. The client may send
      application data in this state, but it MUST include an
      Acknowledgement Number on all of its packets.

   OPEN
      The central, data transfer portion of a DCCP connection. Client
      and server sockets enter this state from PARTOPEN and RESPOND,
      respectively. Sometimes we speak of SERVER-OPEN and CLIENT-OPEN
      states, corresponding to the server's OPEN state and the
      client's OPEN state.

   CLOSEREQ
      A server socket enters this state, from SERVER-OPEN, to signal
      that the connection is over, but the client must hold TIMEWAIT
      state.

   CLOSING
      Server and client sockets can both enter this state to close the
      connection.

   TIMEWAIT
      A server or client socket remains in this state for 2MSL (4
      minutes) after the connection has been torn down, to prevent
      mistakes due to the delivery of old packets. Only one of the
      endpoints need enter TIMEWAIT state (the other can enter CLOSED
      state immediately), and a server can request its client to hold
      TIMEWAIT state using the DCCP-CloseReq packet type.

4.4.  Congestion Control

   DCCP connections are congestion controlled, but unlike in TCP, DCCP
   applications have a choice of congestion control mechanism. In
   fact, the two half-connections can be governed by different
   mechanisms. Mechanisms are denoted by one-byte congestion control
   identifiers, or CCIDs. The endpoints negotiate their CCIDs during
   connection initiation. Each CCID describes how the HC-Sender limits
   data packet rates, how the HC-Receiver sends congestion feedback via
   acknowledgements, and so forth. CCIDs 2 and 3 are currently
   defined; CCIDs 0, 1, and 4-255 are reserved. Other CCIDs may be
   defined in the future.

   CCID 2 provides TCP-like Congestion Control, which is similar to
   that of TCP. The sender maintains a congestion window and sends
   packets until that window is full. Packets are acknowledged by the
   receiver. Dropped packets and ECN [RFC 3168] indicate congestion;
   the response to congestion is to halve the congestion window.
   Acknowledgements in CCID 2 contain the sequence numbers of all
   received packets within some window, similar to a selective
   acknowledgement (SACK) [RFC 2018].

   CCID 3 provides TFRC Congestion Control, an equation-based form of
   congestion control intended to respond to congestion more smoothly
   than CCID 2. The sender maintains a transmit rate, which it updates
   using the receiver's estimate of the packet loss and mark rate.
   Although CCID 3 behaves somewhat differently from TCP in the short
   term, it is designed to operate fairly with TCP over the long term.

   Section 10 describes DCCP's CCIDs in more detail. The behaviors of
   CCIDs 2 and 3 are fully defined in separate profile documents [CCID
   2 PROFILE] [CCID 3 PROFILE].

4.5.  Features

   DCCP endpoints use Change and Confirm options to negotiate and agree
   on feature values. Feature negotiation will almost always happen on
   the connection initiation handshake, but it can begin at any time.

   There are four feature negotiation options in all: Change L,
   Confirm L, Change R, and Confirm R. The "L" options are sent by the
   feature location, and the "R" options are sent by the feature
   remote. A Change R option says to the feature location, "change
   this feature value as follows". The feature location responds with
   Confirm L, meaning "I've changed it". Some features allow Change R
   options to contain multiple values, sorted in preference order. For
   example:

      Client                                        Server
      ------                                        ------
      Change R(CCID, 2) -->
                                     <-- Confirm L(CCID, 2)
                 * agreement that CCID/Server = 2 *

      Change R(CCID, 3 4) -->
                                <-- Confirm L(CCID, 4, 4 2)
                 * agreement that CCID/Server = 4 *

   Both exchanges negotiate the CCID/Server feature's value, which is
   the CCID in use on the server-to-client half-connection. In the
   second exchange, the client requests that the server use either
   CCID 3 or CCID 4, with 3 preferred; the server chooses 4 and
   supplies its preference list, "4 2".

   The Change L and Confirm R options are used for feature negotiations
   initiated by the feature location. In the following example, the
   server requests that CCID/Server be set to 3 or 2, with 3 preferred,
   and the client agrees.

      Client                                       Server
      ------                                       ------
                                   <-- Change L(CCID, 3 2)
      Confirm R(CCID, 3, 3 2) -->
                 * agreement that CCID/Server = 3 *

   Section 6 describes the feature negotiation options further,
   including the retransmission strategies that make negotiation
   reliable.

4.6.  Differences From TCP

   Differences between DCCP and TCP apart from those discussed so far
   include:

   o  Copious space for options (up to 1008 bytes or the PMTU).

   o  Different acknowledgement formats. The CCID for a connection
      determines how much acknowledgement information needs to be
      transmitted. For example, in CCID 2 (TCP-like), this is about
      one ack per 2 packets, and each ack must declare exactly which
      packets were received; in CCID 3 (TFRC), it's about one ack per
      round-trip time, and acks must declare at minimum just the
      lengths of recent loss intervals.

   o  Denial-of-service (DoS) protection. Several mechanisms help
      limit the amount of state possibly-misbehaving clients can force
      DCCP servers to maintain. An Init Cookie option, analogous to
      TCP's SYN Cookies [SYNCOOKIES], avoids SYN-flood-like attacks.
      Only one connection endpoint need hold TIMEWAIT state; the DCCP-
      CloseReq packet, which may only be sent by the server, passes
      that state to the client. Various rate limits let servers avoid
      attacks that might force extensive computation or packet
      generation.

   o  Distinguishing different kinds of loss. A Data Dropped option
      (Section 11.7) lets an endpoint declare that a packet was dropped
      because of corruption, because of receive buffer overflow, and so
      on. This facilitates research into more appropriate rate-control
      responses for these non-network-congestion losses (although
      currently such losses will cause a congestion response).

   o  Acknowledgeability. In TCP, a packet may be acknowledged only
      once the data is reliably queued for application delivery.
      This does not make sense in DCCP, where an application might, for
      example, request a drop-from-front receive buffer. A DCCP packet
      may be acknowledged as soon as its header has been successfully
      processed. Concretely, a packet becomes acknowledgeable at
      Step 8 of Section 8.5's packet processing pseudocode.
      Acknowledgeability does not guarantee data delivery, however: the
      Data Dropped option may later report that the packet's
      application data was discarded.

   o  No receive window. DCCP is a congestion control protocol, not a
      flow control protocol.

   o  No simultaneous open. Every connection has one client and one
      server.

   o  No half-closed states. DCCP has no states corresponding to TCP's
      FINWAIT and CLOSEWAIT, where one half-connection is explicitly
      closed while the other is still active. The Data Dropped
      option's Drop Code 1, Application Not Listening (Section 11.7),
      can achieve a similar effect, however.

4.7.  Example Connection

   The progress of a typical DCCP connection is as follows. (This
   description is informative, not normative.)

      Client                                  Server
      ------                                  ------
   0. [CLOSED]                              [LISTEN]
   1. DCCP-Request -->
   2.                              <-- DCCP-Response
   3. DCCP-Ack -->
   4. DCCP-Data, DCCP-Ack, DCCP-DataAck -->
            <-- DCCP-Data, DCCP-Ack, DCCP-DataAck
   5.                              <-- DCCP-CloseReq
   6. DCCP-Close -->
   7.                                 <-- DCCP-Reset
   8. [TIMEWAIT]

   1. The client sends the server a DCCP-Request packet specifying the
      client and server ports, the service being requested, and any
      features being negotiated, including the CCID that the client
      would like the server to use. The client may optionally
      piggyback an application request on the DCCP-Request packet,
      which the server may ignore.

   2. The server sends the client a DCCP-Response packet indicating
      that it is willing to communicate with the client.
      This response indicates any features and options that the server
      agrees to, begins other feature negotiations as desired, and
      optionally includes an Init Cookie that wraps up all this
      information and which must be returned by the client for the
      connection to complete.

   3. The client sends the server a DCCP-Ack packet that acknowledges
      the DCCP-Response packet. This acknowledges the server's
      initial sequence number and returns the Init Cookie if there was
      one in the DCCP-Response. It may also continue feature
      negotiation. The client may piggyback an application-level
      request on its final ack, producing a DCCP-DataAck packet.

   4. The server and client then exchange DCCP-Data packets, DCCP-Ack
      packets acknowledging that data, and, optionally, DCCP-DataAck
      packets containing data with piggybacked acknowledgements. If
      the client has no data to send, then the server will send DCCP-
      Data and DCCP-DataAck packets, while the client will send DCCP-
      Acks exclusively. (However, the client may not send DCCP-Data
      packets before receiving at least one non-DCCP-Response packet
      from the server.)

   5. The server sends a DCCP-CloseReq packet requesting a close.

   6. The client sends a DCCP-Close packet acknowledging the close.

   7. The server sends a DCCP-Reset packet with Reset Code 1,
      "Closed", and clears its connection state. DCCP-Resets are part
      of normal connection termination; see Section 5.6.

   8. The client receives the DCCP-Reset packet and holds state for
      two maximum segment lifetimes, or 2MSL, to allow any remaining
      packets to clear the network.

   An alternative connection closedown sequence is initiated by the
   client:

   5b. The client sends a DCCP-Close packet closing the connection.

   6b. The server sends a DCCP-Reset packet with Reset Code 1,
       "Closed", and clears its connection state.

   7b. The client receives the DCCP-Reset packet and holds state for
       2MSL to allow any remaining packets to clear the network.

5.  Packet Formats

   The DCCP header can be from 12 to 1020 bytes long. The initial 12
   bytes of the header have the same semantics for all currently-
   defined packet types. Following this comes any additional fixed-
   length fields required by the packet type, and then a variable-
   length list of options. The application data area follows the
   header. In some packet types, this area contains data for the
   application; in other packet types, its contents are ignored.

   +---------------------------------------+  -.
   |            Generic Header             |   |
   +---------------------------------------+   |
   | Additional Fields (depending on type) |   +- DCCP Header
   +---------------------------------------+   |
   |          Options (optional)           |   |
   +=======================================+  -'
   |         Application Data Area         |
   +---------------------------------------+

5.1.  Generic Header

   The DCCP generic header takes different forms depending on the value
   of X, the Extended Sequence Numbers bit. If X is one, the Sequence
   Number field is 48 bits long and the generic header takes 16 bytes,
   as follows.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          Source Port          |           Dest Port           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Data Offset  | CCVal | CsCov |           Checksum            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     |       |X|               |                               .
   | Res | Type  |=|   Reserved    |  Sequence Number (high bits)  .
   |     |       |1|               |                               .
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   .                  Sequence Number (low bits)                   |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   If X is zero, only the low 24 bits of the Sequence Number are
   transmitted, and the generic header is 12 bytes long.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          Source Port          |           Dest Port           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Data Offset  | CCVal | CsCov |           Checksum            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     |       |X|                                               |
   | Res | Type  |=|          Sequence Number (low bits)           |
   |     |       |0|                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The generic header fields are defined as follows.

   Source and Destination Ports: 16 bits each
      These fields identify the connection, similar to the
      corresponding fields in TCP and UDP. The Source Port represents
      the relevant port on the endpoint that sent this packet, the
      Destination Port the relevant port on the other endpoint. When
      initiating a connection, the client SHOULD choose its Source
      Port randomly to reduce the likelihood of attack.

      DCCP APIs should treat port numbers similarly to TCP and UDP
      port numbers. For example, machines that distinguish between
      "privileged" and "unprivileged" ports for TCP and UDP should do
      the same for DCCP.

   Data Offset: 8 bits
      The offset from the start of the packet's DCCP header to the
      start of its application data area, in 32-bit words. The
      receiver MUST ignore packets whose Data Offset is smaller than
      the minimum-sized header for the given Type, or larger than the
      DCCP packet itself.

   CCVal: 4 bits
      Used by the HC-Sender CCID. For example, the A-to-B CCID's
      sender, which is active at DCCP A, MAY send 4 bits of
      information per packet to its receiver by encoding that
      information in CCVal.
      The sender MUST set CCVal to zero unless its HC-Sender CCID
      specifies otherwise, and the receiver MUST ignore the CCVal field
      unless its HC-Receiver CCID specifies otherwise.

   Checksum Coverage (CsCov): 4 bits
      Checksum Coverage determines the parts of the packet that are
      covered by the Checksum field. This always includes the DCCP
      header and options, but some or all of the application data may
      be excluded. This can improve performance on noisy links for
      applications that can tolerate corruption. See Section 9.

   Checksum: 16 bits
      The Internet checksum of the packet's DCCP header (including
      options), a network-layer pseudoheader, and, depending on
      Checksum Coverage, all, some, or none of the application data.
      See Section 9.

   Reserved (Res): 3 bits
      Senders MUST set this field to all zeroes on generated packets,
      and receivers MUST ignore its value.

   Type: 4 bits
      The Type field specifies the type of the packet. The following
      values are defined:

         Type   Meaning
         ----   -------
           0    DCCP-Request
           1    DCCP-Response
           2    DCCP-Data
           3    DCCP-Ack
           4    DCCP-DataAck
           5    DCCP-CloseReq
           6    DCCP-Close
           7    DCCP-Reset
           8    DCCP-Sync
           9    DCCP-SyncAck
         10-15  Reserved

               Table 1: DCCP Packet Types

      Receivers MUST ignore any packets with reserved type. That is,
      packets with reserved type MUST NOT be processed and they MUST
      NOT be acknowledged as received.

   Extended Sequence Numbers (X): 1 bit
      Set to one to indicate the use of an extended generic header
      with 48-bit Sequence and Acknowledgement Numbers. DCCP-Data,
      DCCP-DataAck, and DCCP-Ack packets MAY set X to zero or one.
      All DCCP-Request, DCCP-Response, DCCP-CloseReq, DCCP-Close,
      DCCP-Reset, DCCP-Sync, and DCCP-SyncAck packets MUST set X to
      one; endpoints MUST ignore any such packets with X set to zero.
      High-rate connections SHOULD set X to one on all packets to gain
      increased protection against wrapped sequence numbers and
      attacks. See Section 7.6.

   Sequence Number: 48 or 24 bits
      Identifies the packet uniquely in the sequence of all packets
      the source sent on this connection. Sequence Number increases
      by one with every packet sent, including packets such as DCCP-
      Ack that carry no application data. See Section 7.

   All currently defined packet types except DCCP-Request and DCCP-Data
   carry an Acknowledgement Number Subheader in the four or eight bytes
   immediately following the generic header. When X=1, its format is:

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |           Reserved            |    Acknowledgement Number     .
   |                               |          (high bits)          .
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   .               Acknowledgement Number (low bits)               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   When X=0, only the low 24 bits of the Acknowledgement Number are
   transmitted, giving the Acknowledgement Number Subheader this
   format:

   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |   Reserved    |       Acknowledgement Number (low bits)       |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Reserved: 16 or 8 bits
      Senders MUST set this field to all zeroes on generated packets,
      and receivers MUST ignore its value.

   Acknowledgement Number: 48 or 24 bits
      Generally contains GSR, the Greatest Sequence Number Received on
      any acknowledgeable packet so far. A packet is acknowledgeable
      if and only if its header was successfully processed by the
      receiver; Section 7.4 describes this further. Options such as
      Ack Vector (Section 11.4) combine with the Acknowledgement
      Number to provide precise information about which packets have
      arrived.

      Acknowledgement Numbers on DCCP-Sync and DCCP-SyncAck packets
      need not equal GSR. See Section 5.7.

5.2.  DCCP-Request Packets

   A client initiates a DCCP connection by sending a DCCP-Request
   packet. These packets MAY contain application data, and MUST use
   48-bit sequence numbers (X=1).

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /            Generic DCCP Header with X=1 (16 bytes)            /
   /                  with Type=0 (DCCP-Request)                   /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         Service Code                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /                      Options and Padding                      /
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
   /                       Application Data                        /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Service Code: 32 bits
      Describes the application-level service to which the client
      application wants to connect. Service Codes are intended to
      provide information about which application protocol a
      connection intends to use, thus aiding middleboxes and
      reducing reliance on globally well-known ports. See Section
      8.1.2.

5.3.  DCCP-Response Packets

   The server responds to valid DCCP-Request packets with DCCP-Response
   packets. This is the second phase of the three-way handshake.
   DCCP-Response packets MAY contain application data, and MUST use
   48-bit sequence numbers (X=1).

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /            Generic DCCP Header with X=1 (16 bytes)            /
   /                  with Type=1 (DCCP-Response)                  /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /          Acknowledgement Number Subheader (8 bytes)           /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                         Service Code                          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /                      Options and Padding                      /
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
   /                       Application Data                        /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Acknowledgement Number: 48 bits
      Contains GSR. Since DCCP-Responses are only sent during
      connection initiation, this will always equal the Sequence
      Number on a received DCCP-Request.

   Service Code: 32 bits
      MUST equal the Service Code on the corresponding DCCP-Request.

5.4.  DCCP-Data, DCCP-Ack, and DCCP-DataAck Packets

   The central data transfer portion of every DCCP connection uses
   DCCP-Data, DCCP-Ack, and DCCP-DataAck packets. These packets MAY
   use 24-bit sequence numbers, depending on the value of the Allow
   Short Sequence Numbers feature (Section 7.6.1). DCCP-Data packets
   carry application data without acknowledgements.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /             Generic DCCP Header (16 or 12 bytes)              /
   /                    with Type=2 (DCCP-Data)                    /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /                      Options and Padding                      /
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
   /                       Application Data                        /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   DCCP-Ack packets dispense with the data, but contain an
   Acknowledgement Number.
   They are used for pure acknowledgements.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /             Generic DCCP Header (16 or 12 bytes)              /
   /                    with Type=3 (DCCP-Ack)                     /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /        Acknowledgement Number Subheader (8 or 4 bytes)        /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /                      Options and Padding                      /
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
   /                Application Data Area (Ignored)                /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   DCCP-DataAck packets carry both application data and an
   Acknowledgement Number: acknowledgement information is piggybacked
   on a data packet.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /             Generic DCCP Header (16 or 12 bytes)              /
   /                  with Type=4 (DCCP-DataAck)                   /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /        Acknowledgement Number Subheader (8 or 4 bytes)        /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /                      Options and Padding                      /
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
   /                       Application Data                        /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   A DCCP-Data or DCCP-DataAck packet may have a zero-length
   application data area, which indicates that the application sent a
   zero-length datagram. This differs from DCCP-Request and DCCP-
   Response packets, where an empty application data area indicates the
   absence of application data (not the presence of zero-length
   application data). The API SHOULD report any received zero-length
   datagrams to the receiving application.
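
   The header layouts of Sections 5.1 through 5.4 can be sketched as a
   small decoder. The following Python is a minimal illustration under
   stated assumptions, not a reference implementation: the names
   Packet and parse are hypothetical, checksum verification and option
   parsing are omitted, and packet types 5 through 9 are deliberately
   left out. It applies the rules stated above: reserved types are
   ignored, DCCP-Request and DCCP-Response require X=1, an out-of-range
   Data Offset causes the packet to be ignored, and an empty data area
   means "zero-length datagram" only on DCCP-Data and DCCP-DataAck.

```python
import struct
from dataclasses import dataclass
from typing import Optional

# Packet type numbers from Table 1 (only types 0-4 are decoded here).
REQUEST, RESPONSE, DATA, ACK, DATAACK = range(5)


@dataclass
class Packet:  # hypothetical container; the specification defines no such type
    ptype: int
    x: int
    seqno: int
    ackno: Optional[int]   # None on DCCP-Request and DCCP-Data
    data: Optional[bytes]  # None means "no application data"


def parse(pkt: bytes) -> Optional[Packet]:
    """Decode one packet; return None where the text says to ignore it."""
    if len(pkt) < 12:
        return None
    _sport, _dport, data_offset, _ccval_cscov, _checksum, b8 = \
        struct.unpack_from("!HHBBHB", pkt)
    ptype = (b8 >> 1) & 0x0F   # Res: top 3 bits, Type: next 4 bits
    x = b8 & 1                 # X: lowest bit of the ninth byte
    if ptype > 9:
        return None            # reserved types MUST be ignored
    if ptype > DATAACK:
        raise NotImplementedError("types 5-9 are outside this sketch")
    if ptype in (REQUEST, RESPONSE) and x == 0:
        return None            # these types MUST set X to one

    generic = 16 if x else 12
    ack_len = 0 if ptype in (REQUEST, DATA) else (8 if x else 4)
    service_len = 4 if ptype in (REQUEST, RESPONSE) else 0
    header_len = data_offset * 4   # Data Offset counts 32-bit words
    if header_len < generic + ack_len + service_len or header_len > len(pkt):
        return None            # Data Offset too small, or past end of packet

    # X=1: byte 9 is reserved and bytes 10-15 hold 48 bits; X=0: bytes 9-11.
    seqno = int.from_bytes(pkt[10:16] if x else pkt[9:12], "big")
    ackno = None
    if ack_len:
        # The subheader starts with 16 (X=1) or 8 (X=0) reserved bits.
        skip = 2 if x else 1
        ackno = int.from_bytes(pkt[generic + skip:generic + ack_len], "big")

    area = pkt[header_len:]
    if ptype in (DATA, DATAACK):
        data = area            # b"" is a real zero-length datagram
    elif ptype in (REQUEST, RESPONSE):
        data = area or None    # an empty area means no data at all
    else:
        data = None            # DCCP-Ack: contents MUST be ignored
    return Packet(ptype, x, seqno, ackno, data)
```

   Returning None for "ignore" and keeping b"" distinct from None is
   one possible design; it lets a caller report zero-length datagrams
   to the application without confusing them with absent data.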

   A DCCP-Ack packet MAY have a non-zero-length application data area,
   which essentially pads the DCCP-Ack to a desired length. Receivers
   MUST ignore the content of the application data area in DCCP-Ack
   packets.

   DCCP-Ack and DCCP-DataAck packets often include additional
   acknowledgement options, such as Ack Vector, as required by the
   congestion control mechanism in use.

5.5.  DCCP-CloseReq and DCCP-Close Packets

   DCCP-CloseReq and DCCP-Close packets begin the handshake that
   normally terminates a connection. Either client or server may send
   a DCCP-Close packet, which will elicit a DCCP-Reset packet. Only
   the server can send a DCCP-CloseReq packet, which indicates that the
   server wants to close the connection, but does not want to hold its
   TIMEWAIT state. Both packet types MUST use 48-bit sequence numbers
   (X=1).

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /            Generic DCCP Header with X=1 (16 bytes)            /
   /         with Type=5 (DCCP-CloseReq) or 6 (DCCP-Close)         /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /          Acknowledgement Number Subheader (8 bytes)           /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /                      Options and Padding                      /
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
   /                Application Data Area (Ignored)                /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   As with DCCP-Ack packets, DCCP-CloseReq and DCCP-Close packets MAY
   have non-zero-length application data areas, whose contents
   receivers MUST ignore.

5.6.  DCCP-Reset Packets

   DCCP-Reset packets unconditionally shut down a connection.
   Connections normally terminate with a DCCP-Reset, but resets may be
   sent for other reasons, including bad port numbers, bad option
   behavior, incorrect ECN Nonce Echoes, and so forth. DCCP-Resets
   MUST use 48-bit sequence numbers (X=1).

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /            Generic DCCP Header with X=1 (16 bytes)            /
   /                   with Type=7 (DCCP-Reset)                    /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /          Acknowledgement Number Subheader (8 bytes)           /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |  Reset Code   |    Data 1     |    Data 2     |    Data 3     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /                      Options and Padding                      /
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
   /              Application Data Area (Error Text)               /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Reset Code: 8 bits
      Represents the reason that the sender reset the DCCP connection.

   Data 1, Data 2, and Data 3: 8 bits each
      The Data fields provide additional information about why the
      sender reset the DCCP connection. The meanings of these fields
      depend on the value of Reset Code.

   Application Data Area: Error Text
      If present, Error Text is a human-readable text string encoded
      in Unicode UTF-8, and preferably in English, that describes the
      error in more detail. For example, a DCCP-Reset with Reset Code
      11, "Aggression Penalty", might contain Error Text such as
      "Aggression Penalty: Received 3 bad ECN Nonce Echoes, assuming
      misbehavior".

   The following Reset Codes are currently defined. Unless otherwise
   specified, the Data 1, 2, and 3 fields MUST be set to 0 by the
   sender of the DCCP-Reset and ignored by its receiver.
   Section references describe concrete situations that will cause each
   Reset Code to be generated; they are not meant to be exhaustive.

   0, "Unspecified"
      Indicates the absence of a meaningful Reset Code. Use of Reset
      Code 0 is NOT RECOMMENDED: the sender should choose a Reset Code
      that more clearly defines why the connection is being reset.

   1, "Closed"
      Normal connection close. See Section 8.3.

   2, "Aborted"
      The sending endpoint gave up on the connection because of lack
      of progress. See Sections 8.1.1 and 8.1.5.

   3, "No Connection"
      No connection exists. See Section 8.3.1.

   4, "Packet Error"
      A valid packet arrived with unexpected type. For example, a
      DCCP-Data packet with valid header checksum and sequence numbers
      arrived at a connection in the REQUEST state. See Section
      8.3.1. The Data 1 field equals the offending packet type as an
      eight-bit number; thus, an offending packet with Type 2 will
      result in a Data 1 value of 2.

   5, "Option Error"
      An option was erroneous, and the error was serious enough to
      warrant resetting the connection. See Sections 6.6.7, 6.6.8,
      and 11.4. The Data 1 field equals the offending option type;
      Data 2 and Data 3 equal the first two bytes of option data (or
      zero if the option had less than two bytes of data).

   6, "Mandatory Error"
      The sending endpoint could not process an option O that was
      immediately preceded by Mandatory. The Data fields report the
      option type and data of option O, using the format of Reset Code
      5, "Option Error". See Section 5.8.2.

   7, "Connection Refused"
      The Destination Port didn't correspond to a port open for
      listening. Sent only in response to DCCP-Requests. See Section
      8.1.3.

   8, "Bad Service Code"
      The Service Code didn't equal the service code attached to the
      Destination Port. Sent only in response to DCCP-Requests.
      See Section 8.1.3.

   9, "Too Busy"
      The server is too busy to accept new connections. Sent only in
      response to DCCP-Requests. See Section 8.1.3.

   10, "Bad Init Cookie"
      The Init Cookie echoed by the client was incorrect or missing.
      See Section 8.1.4.

   11, "Aggression Penalty"
      This endpoint has detected congestion control-related
      misbehavior on the part of the other endpoint. See Section
      12.3.

   12-127, Reserved
      Receivers should treat these codes like Reset Code 0,
      "Unspecified".

   128-255, CCID-specific codes
      Semantics depend on the connection's CCIDs. See Section 10.3.
      Receivers should treat unknown CCID-specific Reset Codes like
      Reset Code 0, "Unspecified".

   The following table summarizes this information.

       Reset
       Code    Name                  Data 1     Data 2 & 3
       -----   ----                  ------     ----------
         0     Unspecified             0            0
         1     Closed                  0            0
         2     Aborted                 0            0
         3     No Connection           0            0
         4     Packet Error         pkt type        0
         5     Option Error         option #    option data
         6     Mandatory Error      option #    option data
         7     Connection Refused      0            0
         8     Bad Service Code        0            0
         9     Too Busy                0            0
        10     Bad Init Cookie         0            0
        11     Aggression Penalty      0            0
       12-127  Reserved
      128-255  CCID-specific codes

                     Table 2: DCCP Reset Codes

   Options on DCCP-Reset packets are processed before the connection is
   shut down. This means that certain combinations of options,
   particularly involving Mandatory, may cause an endpoint to respond
   to a valid DCCP-Reset with another DCCP-Reset. This cannot lead to
   a reset storm; since the first endpoint has already reset the
   connection, the second DCCP-Reset will be ignored.

5.7. DCCP-Sync and DCCP-SyncAck Packets

   DCCP-Sync packets help DCCP endpoints recover synchronization after
   bursts of loss, or recover from half-open connections. Each valid
   received DCCP-Sync immediately elicits a DCCP-SyncAck.
   Both packet types MUST use 48-bit sequence numbers (X=1).

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /            Generic DCCP Header with X=1 (16 bytes)            /
   /          with Type=8 (DCCP-Sync) or 9 (DCCP-SyncAck)          /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /          Acknowledgement Number Subheader (8 bytes)           /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /                      Options and Padding                      /
   +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
   /                Application Data Area (Ignored)                /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The Acknowledgement Number field has special semantics for DCCP-Sync
   and DCCP-SyncAck packets. First, the packet corresponding to a
   DCCP-Sync's Acknowledgement Number need not have been
   acknowledgeable. Thus, receivers MUST NOT assume that a packet was
   processed simply because it appears in the Acknowledgement Number
   field of a DCCP-Sync packet. This differs from all other packet
   types, where the Acknowledgement Number by definition corresponds to
   an acknowledgeable packet. Second, the Acknowledgement Number on
   any DCCP-SyncAck packet MUST correspond to the Sequence Number on an
   acknowledgeable DCCP-Sync packet. In the presence of reordering,
   this might not equal GSR.

   As with DCCP-Ack packets, DCCP-Sync and DCCP-SyncAck packets MAY
   have non-zero-length application data areas, whose contents
   receivers MUST ignore. Padded DCCP-Sync packets may be useful when
   performing Path MTU discovery; see Section 14.

5.8. Options

   Any DCCP packet may contain options, which occupy space at the end
   of the DCCP header. Each option is a multiple of 8 bits in length.
   Individual options are not padded to multiples of 32 bits, and any
   option may begin on any byte boundary.
   However, the combination of all options MUST add up to a multiple
   of 32 bits; Padding options MUST be added as necessary to fill out
   option space to a word boundary. Any options present are included
   in the header checksum.

   The first byte of an option is the option type. Options with types
   0 through 31 are single-byte options. Other options are followed by
   a byte indicating the option's length. This length value includes
   the two bytes of option-type and option-length as well as any
   option-data bytes, and must therefore be greater than or equal to
   two.

   Options are processed sequentially, starting at the first option in
   the packet header. Options with unknown types, and options with
   invalid lengths (length byte less than two or more than the
   remaining space in the options portion of the header), MUST be
   ignored.

   The following options are currently defined:

               Option                            DCCP-   Section
      Type     Length     Meaning                Data?   Reference
      ----     ------     -------                -----   ---------
        0         1       Padding                  Y     5.8.1
        1         1       Mandatory                N     5.8.2
        2         1       Slow Receiver            Y     11.6
       3-31       1       Reserved
       32      variable   Change L                 N     6.1
       33      variable   Confirm L                N     6.2
       34      variable   Change R                 N     6.1
       35      variable   Confirm R                N     6.2
       36      variable   Init Cookie              N     8.1.4
       37        3-5      NDP Count                Y     7.7
       38      variable   Ack Vector [Nonce 0]     N     11.4
       39      variable   Ack Vector [Nonce 1]     N     11.4
       40      variable   Data Dropped             N     11.7
       41         6       Timestamp                Y     13.1
       42       6/8/10    Timestamp Echo           Y     13.3
       43        4/6      Elapsed Time             N     13.2
       44         6       Data Checksum            Y     9.3
      45-127   variable   Reserved
      128-255  variable   CCID-specific options    -     10.3

                       Table 3: DCCP Options

   Not all options are suitable for all packet types. For example,
   since the Ack Vector option is interpreted relative to the
   Acknowledgement Number, it isn't suitable on DCCP-Request and
   DCCP-Data packets, which have no Acknowledgement Number.
   If an option occurs on an unexpected packet type, it MUST generally
   be ignored; any such restrictions are mentioned in each option's
   description. The table summarizes the most common restriction: when
   the DCCP-Data? column value is N, the corresponding option MUST be
   ignored when received on a DCCP-Data packet. (Section 7.5.5
   describes why such options are ignored as opposed to, say, causing a
   reset.)

   This section describes two generic options, Padding and Mandatory.
   Other options are described later.

5.8.1. Padding Option

   +--------+
   |00000000|
   +--------+
    Type=0

   Padding is a single-byte "no-operation" option used to pad between
   or after options. If the length of a packet's other options is not
   a multiple of 4, then Padding options are REQUIRED to pad out the
   options area to the length implied by Data Offset. Padding may also
   be used between options -- for example, to align the beginning of a
   subsequent option on a word boundary. There is no guarantee that
   senders will use this option, so receivers must be prepared to
   process options even if they do not begin on a word boundary.

5.8.2. Mandatory Option

   +--------+
   |00000001|
   +--------+
    Type=1

   Mandatory is a single-byte option that marks the immediately
   following option as mandatory. Say that the immediately following
   option is O. Then the Mandatory option has no effect if the
   receiving DCCP endpoint understands and processes O. If the
   endpoint does not understand or process O, however, then it MUST
   reset the connection using Reset Code 6, "Mandatory Error".
   For instance, the endpoint would reset the connection if it did not
   understand O's type; if it understood O's type, but not O's data; if
   O's data was invalid for O's type; if O was a feature negotiation
   option, and the endpoint did not understand the enclosed feature
   number; if the endpoint understood O, but chose not to perform the
   action O implies; and so forth.

   Mandatory options MUST NOT be sent on DCCP-Data packets, and any
   Mandatory options received on DCCP-Data packets MUST be ignored.

   The connection is in error and should be reset with Reset Code 5,
   "Option Error" if option O is absent (Mandatory was the last byte of
   the option list), or if option O equals Mandatory. However, the
   combination "Mandatory Padding" is valid, and MUST behave like two
   bytes of Padding.

   Section 6.6.9 describes the behavior of Mandatory feature
   negotiation options in more detail.

6. Feature Negotiation

   Four DCCP options, Change L, Confirm L, Change R, and Confirm R, are
   used to negotiate feature values. Change options initiate a
   negotiation; Confirm options complete that negotiation. The "L"
   options are sent by the feature location, and the "R" options are
   sent by the feature remote. Change options are retransmitted to
   ensure reliability.

   All these options have the same format. The first byte of option
   data is the feature number, and the second and subsequent data bytes
   hold one or more feature values. The exact format of the feature
   value area depends on the feature type; see Section 6.3.

   +--------+--------+--------+--------+--------
   |  Type  | Length |Feature#| Value(s) ...
   +--------+--------+--------+--------+--------

   Together, the feature number and the option type ("L" or "R")
   uniquely identify the feature to which an option applies. The exact
   format of the Value(s) area depends on the feature number.
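
   As a minimal sketch of the common option format described above,
   the following Python helpers build and split a single feature
   negotiation option. The function and constant names are
   illustrative assumptions and are not defined by this document; the
   option type numbers are those listed in Table 3, and values are
   treated as opaque bytes because their layout depends on the feature.

```python
from typing import Tuple

# Option types for feature negotiation, as listed in Table 3.
CHANGE_L, CONFIRM_L, CHANGE_R, CONFIRM_R = 32, 33, 34, 35
FEATURE_OPTION_TYPES = (CHANGE_L, CONFIRM_L, CHANGE_R, CONFIRM_R)


def encode_feature_option(opt_type: int, feature: int, values: bytes = b"") -> bytes:
    """Build one option laid out as Type, Length, Feature#, Value(s).

    Hypothetical helper: the name and signature are assumptions made
    for illustration only.
    """
    if opt_type not in FEATURE_OPTION_TYPES:
        raise ValueError("not a feature negotiation option type")
    if not 0 <= feature <= 255:
        raise ValueError("feature number must fit in one byte")
    # Length counts the type byte, the length byte, the feature number,
    # and every value byte.
    length = 3 + len(values)
    if length > 255:
        raise ValueError("option too long for a one-byte length field")
    return bytes([opt_type, length, feature]) + values


def decode_feature_option(option: bytes) -> Tuple[int, int, bytes]:
    """Split one complete option into (type, feature number, value bytes)."""
    if len(option) < 3 or option[1] != len(option):
        raise ValueError("malformed feature negotiation option")
    if option[0] not in FEATURE_OPTION_TYPES:
        raise ValueError("not a feature negotiation option type")
    return option[0], option[2], option[3:]
```

   For instance, encode_feature_option(CHANGE_L, 1, bytes([2, 3]))
   yields the bytes 32,5,1,2,3, which agrees with the encoding given
   for Change L(CCID, 2 3) in Section 6.5. An option built with no
   value bytes has length 3, the form this document calls an empty
   Confirm.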

   Feature negotiation options MUST NOT be sent on DCCP-Data packets,
   and any feature negotiation options received on DCCP-Data packets
   MUST be ignored.

6.1. Change Options

   Change L and Change R options initiate feature negotiation. The
   option to use depends on the relevant feature's location: To start a
   negotiation for feature F/A, DCCP A will send a Change L option; to
   start a negotiation for F/B, it will send a Change R option. Change
   options are retransmitted until some response is received. They
   contain at least one Value, and thus have length at least 4.

              +--------+--------+--------+--------+--------
   Change L:  |00100000| Length |Feature#| Value(s) ...
              +--------+--------+--------+--------+--------
               Type=32

              +--------+--------+--------+--------+--------
   Change R:  |00100010| Length |Feature#| Value(s) ...
              +--------+--------+--------+--------+--------
               Type=34

6.2. Confirm Options

   Confirm L and Confirm R options complete feature negotiation, and
   are sent in response to Change R and Change L options, respectively.
   Confirm options MUST NOT be generated except in response to Change
   options. Confirm options need not be retransmitted, since Change
   options are retransmitted as necessary. The first byte of the
   Confirm option contains the feature number from the corresponding
   Change. Following this is the selected Value, and then possibly the
   sender's preference list.

              +--------+--------+--------+--------+--------
   Confirm L: |00100001| Length |Feature#| Value(s) ...
              +--------+--------+--------+--------+--------
               Type=33

              +--------+--------+--------+--------+--------
   Confirm R: |00100011| Length |Feature#| Value(s) ...
              +--------+--------+--------+--------+--------
               Type=35

   If an endpoint receives an invalid Change option -- with an unknown
   feature number, or an invalid value -- it will respond with an empty
   Confirm option containing the problematic feature number, but no
   value. Such options have length 3.

6.3. Reconciliation Rules

   Reconciliation rules determine how the two sets of preferences for a
   given feature are resolved into a unique result. The reconciliation
   rule depends only on the feature number. Each reconciliation rule
   must have the property that the result is uniquely determined given
   the contents of Change options sent by the two endpoints.

   All current DCCP features use one of two reconciliation rules,
   server-priority ("SP") and non-negotiable ("NN").

6.3.1. Server-Priority

   The feature value is a fixed-length byte string (length determined
   by the feature number). Each Change option contains a list of
   values ordered by preference, with the most preferred value coming
   first. Each Confirm option contains the confirmed value, followed
   by the confirmer's preference list. Thus, the feature's current
   value will generally appear twice in Confirm options' data, once as
   the current value and once in the confirmer's preference list.

   To reconcile the preference lists, select the first entry in the
   server's list that also occurs in the client's list. If there is no
   shared entry, the feature's value MUST NOT change, and the Confirm
   option will confirm the feature's previous value (unless the Change
   option was Mandatory; see Section 6.6.9).

6.3.2. Non-Negotiable

   The feature value is a byte string. Each option contains exactly
   one feature value. The feature location signals a new value by
   sending a Change L option.
   The feature remote MUST accept any valid value, responding with a
   Confirm R option containing the new value, and it MUST send empty
   Confirm R options in response to invalid values (unless the Change L
   option was Mandatory; see Section 6.6.9). Change R and Confirm L
   options MUST NOT be sent for non-negotiable features; see Section
   6.6.8. Non-negotiable features use the feature negotiation mechanism
   to achieve reliability.

6.4. Feature Numbers

   This document defines the following feature numbers.

                                          Rec'n  Initial        Section
   Number   Meaning                       Rule   Value   Req'd  Reference
   ------   -------                       -----  -----   -----  ---------
      0     Reserved
      1     Congestion Control ID (CCID)   SP      2       Y    10
      2     Allow Short Seqnos             SP      1       Y    7.6.1
      3     Sequence Window                NN     100      Y    7.5.2
      4     ECN Incapable                  SP      0       N    12.1
      5     Ack Ratio                      NN      2       N    11.3
      6     Send Ack Vector                SP      0       N    11.5
      7     Send NDP Count                 SP      0       N    7.7.2
      8     Minimum Checksum Coverage      SP      0       N    9.2.1
      9     Check Data Checksum            SP      0       N    9.3.1
    10-127  Reserved
   128-255  CCID-specific features                              10.3

                    Table 4: DCCP Feature Numbers

   Rec'n Rule      The reconciliation rule used for the feature. SP is
                   server-priority and NN is non-negotiable.

   Initial Value   The initial value for the feature. Every feature has
                   a known initial value.

   Req'd           This column is "Y" if and only if every DCCP
                   implementation MUST understand the feature. If it is
                   "N", then the feature behaves like an extension (see
                   Section 15), and it is safe to respond to Change
                   options for the feature with empty Confirm options.
                   Of course, a CCID might require the feature; a DCCP
                   that implements CCID 2 MUST support Ack Ratio and
                   Send Ack Vector, for example.

6.5. Examples

   Here are three example feature negotiations for features located at
   the server, the first two for the Congestion Control ID feature, the
   last for the Ack Ratio.

              Client                            Server
              ------                            ------
   1. Change R(CCID, 2 3 1)  -->
      ("2 3 1" is client's preference list)
   2.                        <--  Confirm L(CCID, 3, 3 2 1)
                                  (3 is the negotiated value;
                                  "3 2 1" is server's pref list)
              * agreement that CCID/Server = 3 *

   1.                   XXX  <--  Change L(CCID, 3 2 1)
   2.                             Retransmission:
                             <--  Change L(CCID, 3 2 1)
   3. Confirm R(CCID, 3, 2 3 1)  -->
              * agreement that CCID/Server = 3 *

   1.                        <--  Change L(Ack Ratio, 3)
   2. Confirm R(Ack Ratio, 3)  -->
              * agreement that Ack Ratio/Server = 3 *

   This example shows a simultaneous negotiation.

              Client                            Server
              ------                            ------
   1a. Change R(CCID, 2 3 1)  -->
    b.                       <--  Change L(CCID, 3 2 1)
   2a.                       <--  Confirm L(CCID, 3, 3 2 1)
    b. Confirm R(CCID, 3, 2 3 1)  -->
              * agreement that CCID/Server = 3 *

   Here are the byte encodings of several Change and Confirm options.
   Each option is sent by DCCP A.

   Change L(CCID, 2 3) = 32,5,1,2,3
      DCCP B should change CCID/A's value (feature number 1, a
      server-priority feature); DCCP A's preferred values are 2 and 3,
      in that preference order.

   Change L(Sequence Window, 1024) = 32,6,3,0,4,0
      DCCP B should change Sequence Window/A's value (feature number
      3, a non-negotiable feature) to the 3-byte string 0,4,0 (the
      value 1024).

   Confirm L(CCID, 2, 2 3) = 33,6,1,2,2,3
      DCCP A has changed CCID/A's value to 2; its preferred values are
      2 and 3, in that preference order.

   Empty Confirm L(126) = 33,3,126
      DCCP A doesn't implement feature number 126, or DCCP B's
      proposed value for feature 126/A was invalid.

   Change R(CCID, 3 2) = 34,5,1,3,2
      DCCP B should change CCID/B's value; DCCP A's preferred values
      are 3 and 2, in that preference order.

   Confirm R(CCID, 2, 3 2) = 35,6,1,2,3,2
      DCCP A has changed CCID/B's value to 2; its preferred values
      were 3 and 2, in that preference order.

   Confirm R(Sequence Window, 1024) = 35,6,3,0,4,0
      DCCP A has changed Sequence Window/B's value to the 3-byte
      string 0,4,0 (the value 1024).

   Empty Confirm R(126) = 35,3,126
      DCCP A doesn't implement feature number 126, or DCCP B's
      proposed value for feature 126/B was invalid.

6.6. Option Exchange

   A few basic rules govern feature negotiation option exchange.

   1. Every non-reordered Change option gets a Confirm option in
      response.

   2. Change options are retransmitted until a response for the latest
      Change is received.

   3. Feature negotiation options are processed in strictly increasing
      order by Sequence Number.

   The rest of this section describes the consequences of these rules
   in more detail.

6.6.1. Normal Exchange

   Change options are generated when a DCCP endpoint wants to change
   the value of some feature. Generally, this will happen at the
   beginning of a connection, although it may happen at any time. We
   say the endpoint "generates" or "sends" a Change L or Change R
   option, but of course the option must be attached to a packet. The
   endpoint may attach the option to a packet it would have generated
   anyway (such as a DCCP-Request), or it may create a "feature
   negotiation packet", often a DCCP-Ack or DCCP-Sync, just to carry
   the option. Feature negotiation packets are controlled by the
   relevant congestion control mechanism. For example, DCCP A may send
   a DCCP-Ack or DCCP-Sync for feature negotiation only if the B-to-A
   CCID would allow sending a DCCP-Ack. In addition, an endpoint
   SHOULD generate at most one feature negotiation packet per
   round-trip time.
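
   As an illustration of the reconciliation step that an endpoint
   performs on a received Change option, the server-priority rule of
   Section 6.3.1 might be sketched in Python as follows. The function
   name is a hypothetical one chosen for this example, and feature
   values are modelled as small integers, which is an assumption that
   only suits single-byte features such as CCID.

```python
from typing import Optional, Sequence


def reconcile_server_priority(server_prefs: Sequence[int],
                              client_prefs: Sequence[int]) -> Optional[int]:
    """Return the first entry of the server's list that the client also lists.

    Returns None when the lists share no entry. In that case Section
    6.3.1 says the feature's value does not change (or the connection
    is reset if the Change option was Mandatory; see Section 6.6.9).
    """
    client_values = set(client_prefs)
    for value in server_prefs:  # server order decides, most preferred first
        if value in client_values:
            return value
    return None
```

   With the preference lists from the first example in Section 6.5
   (server "3 2 1", client "2 3 1"), the function returns 3, the value
   on which both endpoints then agree. The same function applies
   whichever endpoint runs it, since the result depends only on the
   two lists and on which of them belongs to the server.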

   On receiving a Change L or Change R option, a DCCP endpoint examines
   the included preference list, reconciles that with its own
   preference list, calculates the new value, and sends back a
   Confirm R or Confirm L option, respectively, informing its peer of
   the new value or that the feature was not understood. Every
   non-reordered Change option MUST result in a corresponding Confirm
   option, and any packet including a Confirm option MUST carry an
   Acknowledgement Number. Generated Confirm options may be attached
   to packets that would have been sent anyway (such as DCCP-Response
   or DCCP-SyncAck), or to new feature negotiation packets, as
   described above.

   The Change-sending endpoint MUST wait to receive a corresponding
   Confirm option before changing its stored feature value. The
   Confirm-sending endpoint changes its stored feature value as soon as
   it sends the Confirm.

   A packet MAY contain more than one feature negotiation option, as
   long as no two options refer to the same feature. Note, however,
   that a packet is allowed to contain one L option and one R option
   with the same feature number, since the two options actually refer
   to different features (F/A and F/B).

6.6.2. Processing Received Options

   DCCP endpoints exist in one of three states relative to each
   feature. STABLE is the normal state, where the endpoint knows the
   feature's value and thinks the other endpoint agrees. An endpoint
   enters the CHANGING state when it first sends a Change for the
   feature, and returns to STABLE once it receives a corresponding
   Confirm. The final state, UNSTABLE, indicates that an endpoint in
   CHANGING state changed its preference list, but has not yet
   transmitted a Change option with the new preference list.

   Feature state transitions at a feature location are implemented
   according to this diagram.
   The diagram ignores sequence number and option validity issues;
   these are handled explicitly in the pseudocode that follows.

                               timeout/
 rcv Confirm R      app/protocol evt : snd Change L      rcv non-ack
 : ignore      +---------------------------------------+ : snd Change L
      +----+   |                                       |   +----+
      |    v   |                   rcv Change R        v   |    v
   +------------+  rcv Confirm R   : calc new value, +------------+
   |            |  : accept value  snd Confirm L     |            |
   |   STABLE   |<-----------------------------------|  CHANGING  |
   |            |        rcv empty Confirm R         |            |
   +------------+        : revert to old value       +------------+
       |    ^                                            |    ^
       +----+                                  pref list |    | snd
   rcv Change R                                  changes |    | Change L
   : calc new value, snd Confirm L                       v    |
                                                     +------------+
                                                 +---|            |
                            rcv Confirm/Change R |   |  UNSTABLE  |
                            : ignore             +-->|            |
                                                     +------------+

   Feature locations SHOULD use the following pseudocode, which
   corresponds to the state diagram, to react to each feature
   negotiation option on each valid packet received. The pseudocode
   refers to "P.seqno" and "P.ackno", which are properties of the
   packet; "O.type", and "O.len", which are properties of the option;
   "FGSR" and "FGSS", which are properties of the connection, and
   handle reordering as described in Section 6.6.4; "F.state", which is
   the feature's state (STABLE, CHANGING, or UNSTABLE); and "F.value",
   which is the feature's value.

      First, check for unknown features (Section 6.6.7);
      If F is unknown,
         If the option was Mandatory,   /* Section 6.6.9 */
            Reset connection and return
         Otherwise, if O.type == Change R,
            Send Empty Confirm L on a future packet
         Return

      Second, check for reordering (Section 6.6.4);
      If F.state == UNSTABLE or P.seqno <= FGSR
            or (O.type == Confirm R and P.ackno < FGSS),
         Ignore option and return

      Third, process Change R options;
      If O.type == Change R,
         If the option's value is valid,   /* Section 6.6.8 */
            Calculate new value
            Send Confirm L on a future packet
            Set F.state := STABLE
         Otherwise, if the option was Mandatory,
            Reset connection and return
         Otherwise,
            Send Empty Confirm L on a future packet
            /* Remain in existing state. If that's CHANGING, this
               endpoint will retransmit its Change L option later. */

      Fourth, process Confirm R options (but only in CHANGING state).
      If F.state == CHANGING and O.type == Confirm R,
         If O.len > 3,   /* nonempty */
            If the option's value is valid,
               Set F.value := new value
            Otherwise,
               Reset connection and return
         Set F.state := STABLE

   Versions of this diagram and pseudocode are also used by feature
   remotes; simply switch the "L"s and "R"s, so that the relevant
   options are Change R and Confirm L.

6.6.3. Loss and Retransmission

   Packets containing Change and Confirm options might be lost or
   delayed by the network. Therefore, Change options are repeatedly
   transmitted to achieve reliability. We refer to this as
   "retransmission", although of course there are no packet-level
   retransmissions in DCCP: a Change option that is sent again will be
   sent on a new packet with a new sequence number.

   A CHANGING endpoint transmits another Change option once it realizes
   that it has not heard back from the other endpoint.
   The new Change option need not contain the same payload as the
   original; reordering protection will ensure that agreement is
   reached based on the most recently transmitted option.

   A CHANGING endpoint MUST continue retransmitting Change options
   until it gets some response or the connection terminates.

   Endpoints SHOULD use an exponential-backoff timer to decide when to
   retransmit Change options. (Endpoints that generate packets
   specifically for feature negotiation MUST use such a timer.) The
   timer interval is initially set to not less than one round-trip
   time, and should back off to not less than 64 seconds. The backoff
   protects against delayed agreement due to the reordering protection
   algorithms described in the next section. Again, endpoints may
   piggyback Change options on packets they would have sent anyway, or
   create new packets to carry the options; any such new packets are
   controlled by the relevant congestion-control mechanism.

   Confirm options are never retransmitted, but the Confirm-sending
   endpoint MUST generate a Confirm option after every non-reordered
   Change.

6.6.4. Reordering

   Reordering might cause packets containing Change and Confirm options
   to arrive in an unexpected order. Endpoints MUST ignore feature
   negotiation options that do not arrive in strictly-increasing order
   by Sequence Number. The rest of this section presents two
   algorithms that fulfill this requirement.

   The first algorithm introduces two sequence number variables that
   each endpoint maintains for the connection.

   FGSR      Feature Greatest Sequence Number Received: The greatest
             sequence number received, considering only valid packets
             that contained one or more feature negotiation options
             (Change and/or Confirm). This value is initialized to
             ISR - 1.

   FGSS      Feature Greatest Sequence Number Sent: The greatest
             sequence number sent, considering only packets that
             contained one or more non-retransmitted Change options.
             (Retransmitted Change options MUST have exactly the same
             contents as previously transmitted options, so limited
             reordering can safely be tolerated.) This value is
             initialized to ISS.

   Each endpoint checks two conditions on sequence numbers to decide
   whether to process received feature negotiation options.

   1. If a packet's Sequence Number is less than or equal to FGSR,
      then its Change options MUST be ignored.

   2. If a packet's Sequence Number is less than or equal to FGSR, OR
      it has no Acknowledgement Number, OR its Acknowledgement Number
      is less than FGSS, then its Confirm options MUST be ignored.

   Alternatively, an endpoint MAY maintain separate FGSR and FGSS
   values for every feature. FGSR(F/X) would equal the greatest
   sequence number received, considering only packets that contained
   Change or Confirm options applying to feature F/X; FGSS(F/X) would
   be defined similarly. This algorithm requires more state, but is
   slightly more forgiving to multiple overlapped feature negotiations.
   Either algorithm MAY be used; the first algorithm, with
   connection-wide FGSR and FGSS variables, is RECOMMENDED.

   One consequence of these rules is that a CHANGING endpoint will
   ignore any Confirm option that does not acknowledge the latest
   Change option sent. This ensures that agreement, once achieved,
   used the most recent available information about the endpoints'
   preferences.

6.6.5. Preference Changes

   Endpoints are allowed to change their preference lists at any time.
   However, an endpoint that changes its preference list while in the
   CHANGING state MUST transition to the UNSTABLE state.
   It will transition back to CHANGING once it has transmitted a Change
   option with the new preference list. This ensures that agreement is
   based on active preference lists. Without the UNSTABLE state,
   simultaneous negotiation -- where the endpoints began independent
   negotiations for the same feature at the same time -- might lead to
   the negotiation terminating with the endpoints thinking the feature
   had different values.

6.6.6. Simultaneous Negotiation

   The two endpoints might simultaneously open negotiation for the same
   feature, after which an endpoint in the CHANGING state will receive
   a Change option for the same feature. Such received Change options
   can act as responses to the original Change options. The CHANGING
   endpoint MUST examine the received Change's preference list,
   reconcile that with its own preference list (as expressed in its
   generated Change options), and generate the corresponding Confirm
   option. It can then transition to the STABLE state.

6.6.7. Unknown Features

   Endpoints may receive Change options referring to feature numbers
   they do not understand -- for instance, when an extended DCCP
   converses with a non-extended DCCP. Endpoints MUST respond to
   unknown Change options with Empty Confirm options (that is, Confirm
   options containing no data), which inform the CHANGING endpoint that
   the feature was not understood. However, if the Change option was
   Mandatory, the connection MUST be reset; see Section 6.6.9.

   On receiving an empty Confirm option for some feature, the CHANGING
   endpoint MUST transition back to the STABLE state, leaving the
   feature's value unchanged. Section 15 suggests that the default
   value for any extension feature should correspond to "extension not
   available".

   Some features are required to be understood by all DCCPs (see
   Section 6.4).
The CHANGING endpoint SHOULD reset the connection 2021 (with Reset Code 5, "Option Error") if it receives an empty Confirm 2022 option for such a feature. 2024 Since Confirm options are generated only in response to Change 2025 options, an endpoint should never receive a Confirm option referring 2026 to a feature number it does not understand. Nevertheless, endpoints 2027 MUST ignore any such options they receive. 2029 6.6.8. Invalid Options 2031 A DCCP endpoint might receive a Change or Confirm option that lists 2032 one or more values that it does not understand. Some, but not all, 2033 such options are invalid, depending on the relevant reconciliation 2034 rule (Section 6.3). For instance: 2036 o All features have length limitations, and options with invalid 2037 lengths are invalid. For example, the Ack Ratio feature takes 2038 16-bit values, so valid "Confirm R(Ack Ratio)" options have 2039 option length 5. 2041 o Some non-negotiable features have value limitations. The Ack 2042 Ratio feature takes two-byte, non-zero integer values, so a 2043 "Change L(Ack Ratio, 0)" option is never valid. Note that 2044 server-priority features do not have value limitations, since 2045 unknown values are handled as a matter of course. 2047 o Any Confirm option that selects the wrong value, based on the two 2048 preference lists and the relevant reconciliation rule, is 2049 invalid. 2051 o However, unexpected Confirm options -- that refer to unknown 2052 feature numbers, or that don't appear to be part of a current 2053 negotiation -- are considered valid, although they are ignored by 2054 the receiver. 2056 An endpoint receiving an invalid Change option MUST respond with the 2057 corresponding empty Confirm option. An endpoint receiving an 2058 invalid Confirm option MUST reset the connection, with Reset Code 5, 2059 "Option Error". 2061 6.6.9. Mandatory Feature Negotiation 2063 Change options may be preceded by Mandatory options (Section 5.8.2).
2064 Mandatory Change options are processed like normal Change options, 2065 except that the following failure cases will cause the receiver to 2066 reset the connection with Reset Code 6, "Mandatory Failure", rather 2067 than send a Confirm option. The connection MUST be reset if: 2069 o The Change option's feature number was not understood; 2071 o The Change option's value was invalid, and the receiver would 2072 normally have sent an empty Confirm option in response; or 2074 o For server-priority features, there was no shared entry in the 2075 two endpoints' preference lists. 2077 There's no reason to mark Confirm options as Mandatory in this 2078 version of DCCP, since Confirm options are sent only in response to 2079 Change options and therefore can't mention potentially-invalid 2080 values or unexpected feature numbers. 2082 7. Sequence Numbers 2084 DCCP uses sequence numbers to arrange packets into sequence, detect 2085 losses and network duplicates, and protect against attackers, half- 2086 open connections, and the delivery of very old packets. Every 2087 packet carries a Sequence Number; most packet types carry an 2088 Acknowledgement Number as well. 2090 DCCP sequence numbers are packet-based. That is, the packets 2091 generated by each endpoint have Sequence Numbers that increase by 2092 one, modulo 2^48, for every packet. Even DCCP-Ack and DCCP-Sync 2093 packets, and other packets that don't carry user data, increment the 2094 Sequence Number. Since DCCP is an unreliable protocol, there are no 2095 true retransmissions; but effective retransmissions, such as 2096 retransmissions of DCCP-Request packets, also increment the Sequence 2097 Number. This lets DCCP implementations detect network duplication, 2098 retransmissions, and acknowledgement loss, and is a significant 2099 departure from TCP practice. 2101 7.1. Variables 2103 DCCP endpoints maintain a set of sequence number variables for each 2104 connection. 
2106 ISS The Initial Sequence Number Sent by this endpoint. This 2107 equals the Sequence Number of the first DCCP-Request or 2108 DCCP-Response sent. 2110 ISR The Initial Sequence Number Received from the other 2111 endpoint. This equals the Sequence Number of the first 2112 DCCP-Request or DCCP-Response received. 2114 GSS The Greatest Sequence Number Sent by this endpoint. Here, 2115 and elsewhere, "greatest" is measured in circular sequence 2116 space. 2118 GSR The Greatest Sequence Number Received from the other 2119 endpoint on an acknowledgeable packet. (Section 7.4 defines 2120 this term.) 2122 GAR The Greatest Acknowledgement Number Received from the other 2123 endpoint on an acknowledgeable packet that was not a DCCP- 2124 Sync. 2126 Some other variables are derived from these primitives. 2128 SWL and SWH 2129 (Sequence Number Window Low and High) The extremes of the 2130 validity window for received packets' Sequence Numbers. 2132 AWL and AWH 2133 (Acknowledgement Number Window Low and High) The extremes 2134 of the validity window for received packets' Acknowledgement 2135 Numbers. 2137 7.2. Initial Sequence Numbers 2139 The endpoints' initial sequence numbers are set by the first DCCP- 2140 Request and DCCP-Response packets sent. Initial sequence numbers 2141 MUST be chosen to avoid two problems: 2143 o Delivery of old packets, where packets lingering in the network 2144 from an old connection are delivered to a new connection with the 2145 same addresses and port numbers. 2147 o Sequence number attacks, where an attacker can guess the sequence 2148 numbers that a future connection would use [M85]. 2150 These problems are the same as problems faced by TCP, and DCCP 2151 implementations SHOULD use TCP's strategies to avoid them [RFC 793] 2152 [RFC 1948]. The rest of this section explains these strategies in 2153 more detail. 
2155 To address the first problem, an implementation MUST ensure that the 2156 initial sequence number for a given 4-tuple doesn't overlap with 2158 recent sequence numbers on previous connections with the same 2159 4-tuple. ("Recent" means sent within 2 maximum segment lifetimes, 2160 or 4 minutes.) The implementation MUST additionally ensure that the 2161 lower 24 bits of the initial sequence number don't overlap with the 2162 lower 24 bits of recent sequence numbers (unless the implementation 2163 plans to avoid short sequence numbers; see Section 7.6). An 2164 implementation that has state for a recent connection with the same 2165 4-tuple can pick a good initial sequence number explicitly. 2166 Otherwise, it could tie initial sequence number selection to some 2167 clock, such as the 4-microsecond clock used by TCP [RFC 793]. Two 2168 separate clocks may be required, one for the upper 24 bits and one 2169 for the lower 24 bits. 2171 To address the second problem, an implementation MUST provide each 2172 4-tuple with an independent initial sequence number space. Then 2173 opening a connection doesn't provide any information about initial 2174 sequence numbers on other connections to the same host. RFC 1948 2175 achieves this by adding a cryptographic hash of the 4-tuple and a 2176 secret to each initial sequence number. For the secret, RFC 1948 2177 recommends a combination of some truly-random data [RFC 1750], an 2178 administratively-installed passphrase, the endpoint's IP address, 2179 and the endpoint's boot time, but truly-random data is sufficient. 2180 Care should be taken when changing the secret; such a change alters 2181 all initial sequence number spaces, which might make an initial 2182 sequence number for some 4-tuple equal a recently sent sequence 2183 number for the same 4-tuple. To avoid this problem, the endpoint 2184 might remember dead connection state for each 4-tuple or stay quiet 2185 for 2 maximum segment lifetimes around such a change. 
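The two strategies above (a clock-driven base plus a per-4-tuple offset derived from a secret) can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name, the tuple encoding, and the use of HMAC-SHA-256 are choices made for this sketch (RFC 1948 itself describes an MD5-based hash), and the sketch ignores the possible need for separate clocks for the upper and lower 24 bits.

```python
import hashlib
import hmac
import secrets

SEQ_MOD = 1 << 48  # DCCP sequence numbers are 48 bits wide


def initial_sequence_number(four_tuple, secret, clock_ticks):
    """Hypothetical helper: clock value plus a keyed hash of the 4-tuple.

    The keyed hash gives each 4-tuple its own sequence number space, so
    observing one connection's ISN reveals nothing about another's.
    """
    src_addr, src_port, dst_addr, dst_port = four_tuple
    material = f"{src_addr}|{src_port}|{dst_addr}|{dst_port}".encode()
    # Assumption: HMAC-SHA-256 stands in for the RFC 1948 hash.
    digest = hmac.new(secret, material, hashlib.sha256).digest()
    offset = int.from_bytes(digest[:6], "big")  # 48-bit per-4-tuple offset
    return (clock_ticks + offset) % SEQ_MOD


secret = secrets.token_bytes(16)  # truly-random data is sufficient
tuple_a = ("192.0.2.1", 5001, "198.51.100.7", 80)
isn_a = initial_sequence_number(tuple_a, secret, clock_ticks=1000)
isn_a_later = initial_sequence_number(tuple_a, secret, clock_ticks=1250)
```

For a fixed secret and 4-tuple, the result advances with the clock, which is what keeps a new connection's numbers away from those of a recent connection on the same 4-tuple.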
2187 7.3. Quiet Time 2189 DCCP endpoints, like TCP endpoints, must take care before initiating 2190 connections when they boot. In particular, they MUST NOT send 2191 packets whose sequence numbers are close to the sequence numbers of 2192 packets lingering in the network from before the boot. The simplest 2193 way to enforce this rule is for DCCP endpoints to avoid sending any 2194 packets until one maximum segment lifetime (2 minutes) after boot. 2195 Other enforcement mechanisms include remembering recent sequence 2196 numbers across boots, and reserving the upper 8 or so bits of 2197 initial sequence numbers for a persistent counter that decrements by 2198 two each boot. (The latter mechanism would require disallowing 2199 packets with short sequence numbers; see Section 7.6.1.) 2201 7.4. Acknowledgement Numbers 2203 Cumulative acknowledgements are meaningless in an unreliable 2204 protocol. Therefore, DCCP's Acknowledgement Number field has a 2205 different meaning than TCP's. 2207 A received packet is classified as acknowledgeable if and only if 2208 its header was successfully processed by the receiving DCCP. In 2209 terms of the pseudocode in Section 8.5, a received packet becomes 2210 acknowledgeable when the receiving endpoint reaches Step 8. This 2211 means, for example, that all acknowledgeable packets have valid 2212 header checksums and sequence numbers. The Acknowledgement Number 2213 MUST equal GSR, the Greatest Sequence Number Received on an 2214 acknowledgeable packet, for all packet types except DCCP-Sync and 2215 DCCP-SyncAck. 2217 "Acknowledgeable" does not refer to data processing. Even 2218 acknowledgeable packets may have their application data dropped, due 2219 to receive buffer overflow or corruption, for instance. Data 2220 Dropped options report these data losses when necessary, letting 2221 congestion control mechanisms distinguish between network losses and 2222 endpoint losses.
This issue is discussed further in Sections 11.4 2223 and 11.7. 2225 DCCP-Sync and DCCP-SyncAck packets' Acknowledgement Numbers differ 2226 as follows: The Acknowledgement Number on a DCCP-Sync packet 2227 corresponds to a received packet, but not necessarily an 2228 acknowledgeable packet; in particular, it might correspond to an 2229 out-of-sync packet whose options were not processed. The 2230 Acknowledgement Number on a DCCP-SyncAck packet always corresponds 2231 to an acknowledgeable DCCP-Sync packet; it might be less than GSR in 2232 the presence of reordering. 2234 7.5. Validity and Synchronization 2236 Any DCCP endpoint might receive packets that are not actually part 2237 of the current connection. For instance, the network might deliver 2238 an old packet, an attacker might attempt to hijack a connection, or 2239 the other endpoint might crash, causing a half-open connection. 2241 DCCP, like TCP, uses sequence number checks to detect these cases. 2242 Packets whose Sequence and/or Acknowledgement Numbers are out of 2243 range are called sequence-invalid, and are not processed normally. 2245 Unlike TCP, DCCP requires a synchronization mechanism to recover 2246 from large bursts of loss. One endpoint might send so many packets 2247 during a burst of loss that when one of its packets finally got 2248 through, the other endpoint would label its Sequence Number as 2249 invalid. A handshake of DCCP-Sync and DCCP-SyncAck packets recovers 2250 from this case. 2252 7.5.1. Sequence and Acknowledgement Number Windows 2254 Each DCCP endpoint defines sequence validity windows that are 2255 subsets of the Sequence and Acknowledgement Number spaces. These 2256 windows correspond to packets the endpoint expects to receive in the 2257 next few round-trip times. The Sequence and Acknowledgement Number 2258 windows always contain GSR and GSS, respectively. The window widths 2259 are controlled by Sequence Window features for the two half- 2260 connections. 
2262 The Sequence Number validity window for packets from DCCP B is [SWL, 2263 SWH]. This window always contains GSR, the Greatest Sequence Number 2264 Received on a sequence-valid packet from DCCP B. It is W packets 2265 wide, where W is the value of the Sequence Window/B feature. One- 2266 fourth of the sequence window, rounded down, is less than or equal 2267 to GSR, and three-fourths is greater than GSR. (This asymmetric 2268 placement assumes that bursts of loss are more common in the network 2269 than significant reordering.) 2271 invalid | valid Sequence Numbers | invalid 2272 <---------*|*===========*=======================*|*---------> 2273 GSR -|GSR + 1 - GSR GSR +|GSR + 1 + 2274 floor(W/4)|floor(W/4) ceil(3W/4)|ceil(3W/4) 2275 = SWL = SWH 2277 The Acknowledgement Number validity window for packets from DCCP B 2278 is [AWL, AWH]. The high end of the window, AWH, equals GSS, the 2279 Greatest Sequence Number Sent by DCCP A; the window is W' packets 2280 wide, where W' is the value of the Sequence Window/A feature. 2282 invalid | valid Acknowledgement Numbers | invalid 2283 <---------*|*===================================*|*---------> 2284 GSS - W'|GSS + 1 - W' GSS|GSS + 1 2285 = AWL = AWH 2287 SWL and AWL are initially adjusted so that they are not less than 2288 the initial Sequence Numbers received and sent, respectively: 2289 SWL := max(GSR + 1 - floor(W/4), ISR), 2290 AWL := max(GSS - W' + 1, ISS). 2291 These adjustments MUST be applied only at the beginning of the 2292 connection. (Long-lived connections may wrap sequence numbers so 2293 that they appear to be less than ISR or ISS; the adjustments MUST 2294 NOT be applied in that case.) 2296 7.5.2. Sequence Window Feature 2298 The Sequence Window/A feature determines the width of the Sequence 2299 Number validity window used by DCCP B, and the width of the 2300 Acknowledgement Number validity window used by DCCP A. 
DCCP A sends 2301 a "Change L(Sequence Window, W)" option to notify DCCP B that the 2302 Sequence Window/A value is W. 2304 Sequence Window has feature number 3, and is non-negotiable. It 2305 takes 48-bit (6-byte) integer values, like DCCP sequence numbers, 2306 but 1- to 5-byte values are also allowed in options; the receiver 2307 will pad on the left with zero bytes as necessary to total 48 bits. 2308 Change and Confirm options for Sequence Window are therefore between 2309 4 and 9 bytes long. New connections start with Sequence Window 100 2310 for both endpoints. The minimum valid Sequence Window value is 2311 Wmin = 32. The maximum valid Sequence Window value is Wmax = 2312 2^46 - 1 = 70368744177663; circular sequence number comparisons 2313 would stop working absent this constraint. Change options 2314 suggesting Sequence Window values out of this range are invalid and 2315 MUST be handled accordingly. 2317 A proper Sequence Window/A value must reflect the number of packets 2318 DCCP A expects to be in flight. Only DCCP A can anticipate this 2319 number. Values that are too small increase the risk of the 2320 endpoints getting out of sync after bursts of loss, and values that are 2321 much too small can prevent productive communication whether or not 2322 there is loss. On the other hand, too-large values increase the 2323 risk of connection hijacking; Section 7.5.5 quantifies this risk. 2324 One good guideline is for each endpoint to set Sequence Window to 2325 about five times the maximum number of packets it expects to send in 2326 a round-trip time. Endpoints SHOULD send Change L(Sequence Window) 2327 options as necessary as the connection progresses. Also, an 2328 endpoint MUST NOT persistently send more than its Sequence Window 2329 number of packets per round-trip time; that is, DCCP A MUST NOT 2330 persistently send more than Sequence Window/A packets per RTT. 2332 7.5.3.
Sequence-Validity Rules 2334 Sequence-validity depends on the received packet's type. This table 2335 shows the sequence and acknowledgement number checks applied to each 2336 packet; a packet is sequence-valid if it passes both tests, and 2337 sequence-invalid if it does not. Many of the checks refer to the 2338 sequence and acknowledgement number validity windows [SWL, SWH] and 2339 [AWL, AWH] defined in Section 7.5.1. 2341 Acknowledgement Number 2342 Packet Type Sequence Number Check Check 2343 ----------- --------------------- ---------------------- 2344 DCCP-Request SWL <= seqno <= SWH (*) N/A 2345 DCCP-Response SWL <= seqno <= SWH (*) AWL <= ackno <= AWH 2346 DCCP-Data SWL <= seqno <= SWH N/A 2347 DCCP-Ack SWL <= seqno <= SWH AWL <= ackno <= AWH 2348 DCCP-DataAck SWL <= seqno <= SWH AWL <= ackno <= AWH 2349 DCCP-CloseReq GSR < seqno <= SWH GAR <= ackno <= AWH 2350 DCCP-Close GSR < seqno <= SWH GAR <= ackno <= AWH 2351 DCCP-Reset GSR < seqno <= SWH GAR <= ackno <= AWH 2352 DCCP-Sync SWL <= seqno AWL <= ackno <= AWH 2353 DCCP-SyncAck SWL <= seqno AWL <= ackno <= AWH 2355 (*) Check not applied if connection is in LISTEN or REQUEST state. 2357 In general, packets are sequence-valid if their Sequence and 2358 Acknowledgement Numbers lie within the corresponding valid windows, 2359 [SWL, SWH] and [AWL, AWH]. The exceptions to this rule are as 2360 follows: 2362 o Since DCCP-CloseReq, DCCP-Close, and DCCP-Reset packets end a 2363 connection, they cannot have Sequence Numbers less than or equal 2364 to GSR, or Acknowledgement Numbers less than GAR. 2366 o DCCP-Sync and DCCP-SyncAck Sequence Numbers are not strongly 2367 checked. These packet types exist specifically to get the 2368 endpoints back into sync; checking their Sequence Numbers would 2369 eliminate their usefulness. 
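The window definitions of Section 7.5.1 and the per-packet-type checks in the table above might be sketched as follows. All function names here are hypothetical. The sketch omits the (*) exemption for the LISTEN and REQUEST states and the initial ISR/ISS adjustments, and it assumes that the one-sided check "SWL <= seqno" means "seqno is less than 2^47 ahead of SWL" in circular sequence space.

```python
SEQ_MOD = 1 << 48  # DCCP sequence numbers are 48 bits wide


def in_window(lo, x, hi):
    """True if x lies in the circular range [lo, hi], modulo 2^48."""
    return (x - lo) % SEQ_MOD <= (hi - lo) % SEQ_MOD


def circular_le(a, b):
    """Assumed reading of a one-sided check: b is less than 2^47 ahead of a."""
    return (b - a) % SEQ_MOD < (1 << 47)


def windows(gsr, gss, w, w_prime):
    """Return (SWL, SWH, AWL, AWH) as drawn in Section 7.5.1."""
    swl = (gsr + 1 - w // 4) % SEQ_MOD
    swh = (gsr + (3 * w + 3) // 4) % SEQ_MOD  # ceil(3W/4)
    awl = (gss + 1 - w_prime) % SEQ_MOD
    awh = gss % SEQ_MOD
    return swl, swh, awl, awh


def sequence_valid(ptype, seqno, ackno, gsr, gar, win):
    """Apply the basic (lenient) checks from the first table."""
    swl, swh, awl, awh = win
    if ptype in ("CloseReq", "Close", "Reset"):
        seq_ok = in_window((gsr + 1) % SEQ_MOD, seqno, swh)  # GSR < seqno <= SWH
        ack_ok = in_window(gar, ackno, awh)                  # GAR <= ackno <= AWH
    elif ptype in ("Sync", "SyncAck"):
        seq_ok = circular_le(swl, seqno)                     # SWL <= seqno
        ack_ok = in_window(awl, ackno, awh)
    else:
        seq_ok = in_window(swl, seqno, swh)
        # Request and Data packets carry no Acknowledgement Number.
        ack_ok = ptype in ("Request", "Data") or in_window(awl, ackno, awh)
    return seq_ok and ack_ok


# DCCP B from the loss-burst example in Section 7.5.6: GSS=10, GSR=1, W=W'=100.
win_b = windows(gsr=1, gss=10, w=100, w_prime=100)
data_50_ok = sequence_valid("Data", 50, None, gsr=1, gar=1, win=win_b)
data_101_ok = sequence_valid("Data", 101, None, gsr=1, gar=1, win=win_b)
```

With the default window of 100, SWH for DCCP B is 76, so the DCCP-Data packet with sequence number 101 falls outside the window, matching the "seqno out of range" outcome in that example.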
2371 The lenient checks on DCCP-Sync and DCCP-SyncAck packets allow 2372 continued operation after unusual events, such as endpoint crashes 2373 and large bursts of loss, but there's no need for leniency in the 2374 absence of unusual events -- that is, during ongoing successful 2375 communication. Therefore, DCCP implementations SHOULD use the 2376 following, more stringent checks for active connections, where a 2377 connection is considered active if it has received valid packets 2378 from the other endpoint within the last five round-trip times. 2380 Acknowledgement Number 2381 Packet Type Sequence Number Check Check 2382 ----------- --------------------- ---------------------- 2383 DCCP-Sync SWL <= seqno <= SWH AWL <= ackno <= AWH 2384 DCCP-SyncAck SWL <= seqno <= SWH AWL <= ackno <= AWH 2386 Finally, an endpoint MAY apply the following more stringent checks 2387 to DCCP-CloseReq, DCCP-Close, and DCCP-Reset packets, further 2388 lowering the probability of successful blind attacks using those 2389 packet types. Since these checks can cause extra synchronization 2390 overhead and delay connection closing when packets are lost, they 2391 should be considered experimental. 2393 Acknowledgement Number 2394 Packet Type Sequence Number Check Check 2395 ----------- --------------------- ---------------------- 2396 DCCP-CloseReq seqno == GSR + 1 GAR <= ackno <= AWH 2397 DCCP-Close seqno == GSR + 1 GAR <= ackno <= AWH 2398 DCCP-Reset seqno == GSR + 1 GAR <= ackno <= AWH 2400 Note that sequence-validity is only one of the validity checks 2401 applied to received packets. 2403 7.5.4. Handling Sequence-Invalid Packets 2405 Endpoints MUST ignore sequence-invalid DCCP-Sync and DCCP-SyncAck 2406 packets, and MUST respond to other sequence-invalid packets with 2407 (possibly rate-limited) DCCP-Sync packets. Each DCCP-Sync packet 2408 MUST acknowledge the corresponding sequence-invalid packet's 2409 Sequence Number, not GSR. 
The DCCP-Sync MUST use a new Sequence 2410 Number, and thus will increase GSS; GSR will not change, however, 2411 since the received packet was sequence-invalid. 2413 On receiving a sequence-valid DCCP-Sync packet, the peer endpoint 2414 (say, DCCP B) MUST update its GSR variable and reply with a DCCP- 2415 SyncAck packet. The DCCP-SyncAck packet's Acknowledgement Number 2416 will equal the DCCP-Sync's Sequence Number, not necessarily GSR. 2417 Upon receiving this DCCP-SyncAck, which will be sequence-valid since 2418 it acknowledges the DCCP-Sync, DCCP A will update its GSR variable, 2419 and the endpoints will be back in sync. As an exception, if the 2420 peer endpoint is in the REQUEST state, it MUST respond with a DCCP- 2421 Reset instead of a DCCP-SyncAck. This serves to clean up DCCP A's 2422 half-open connection. 2424 To protect against denial-of-service attacks, DCCP implementations 2425 SHOULD impose a rate limit on DCCP-Syncs sent in response to 2426 sequence-invalid packets, such as not more than eight DCCP-Syncs per 2427 second. 2429 DCCP endpoints MUST NOT process sequence-invalid packets except, 2430 perhaps, by generating a DCCP-Sync. For instance, options MUST NOT 2431 be processed. An endpoint MAY temporarily preserve sequence- 2432 invalid packets in case they become valid later, however; this can 2433 reduce the impact of bursts of loss by delivering more packets to 2434 the application. In particular, an endpoint MAY preserve sequence- 2435 invalid packets for up to 2 round-trip times. If, within that time, 2436 the relevant sequence windows change so that the packets become 2437 sequence-valid, the endpoint MAY process them again. 2439 Note that sequence-invalid DCCP-Reset packets cause DCCP-Syncs to be 2440 generated. This is because endpoints in an unsynchronized state 2441 (CLOSED, REQUEST, and LISTEN) might not have enough information to 2442 generate a proper DCCP-Reset on the first try.
For example, if a 2443 peer endpoint is in CLOSED state and receives a DCCP-Data packet, it 2444 cannot guess the right Sequence Number to use on the DCCP-Reset it 2445 generates (since the DCCP-Data packet has no Acknowledgement 2446 Number). The DCCP-Sync generated in response to this bad reset 2447 serves as a challenge, and contains enough information for the peer 2448 to generate a proper DCCP-Reset. However, the new DCCP-Reset may 2449 carry a different Reset Code than the original DCCP-Reset; probably 2450 the new Reset Code will be 3, "No Connection". The endpoint SHOULD 2451 use information from the original DCCP-Reset when possible. 2453 7.5.5. Sequence Number Attacks 2455 Sequence and Acknowledgement Numbers form DCCP's main line of 2456 defense against attackers. An attacker that cannot guess sequence 2457 numbers cannot easily manipulate or hijack a DCCP connection, and 2458 requirements like careful initial sequence number choice eliminate 2459 the most serious attacks. 2461 An attacker might still send many packets with randomly chosen 2462 Sequence and Acknowledgement Numbers, however. If one of those 2463 probes ends up sequence-valid, it may shut down the connection or 2464 otherwise cause problems. The easiest such attacks to execute are: 2466 o Send DCCP-Data packets with random Sequence Numbers. If one of 2467 these packets hits the valid sequence number window, the attack 2468 packet's application data may be inserted into the data stream. 2470 o Send DCCP-Sync packets with random Sequence and Acknowledgement 2471 Numbers. If one of these packets hits the valid acknowledgement 2472 number window, the receiver will shift its sequence number window 2473 accordingly, getting out of sync with the correct endpoint -- 2474 perhaps permanently. 2476 The attacker has to guess both Source and Destination Ports for any 2477 of these attacks to succeed. 
Additionally, the connection would 2478 have to be inactive for the DCCP-Sync attack to succeed, assuming 2479 the victim implemented the more stringent checks for active 2480 connections recommended in Section 7.5.3. 2482 To quantify the probability of success, let N be the number of 2483 attack packets the attacker is willing to send, W be the relevant 2484 sequence window width, and L be the length of sequence numbers (24 2485 or 48). The attacker's best strategy is to space the attack packets 2486 evenly over sequence space. Then the probability of hitting one 2487 sequence number window is P = WN/2^L. 2489 The success probability for a DCCP-Data attack using short sequence 2490 numbers thus equals P = WN/2^24. For W = 100, then, the attacker 2491 must send more than 83,000 packets to achieve a 50% chance of 2492 success. For reference, the easiest TCP attack -- sending a SYN 2493 with a random sequence number, which will cause a connection reset 2494 if it falls within the window -- has W = 8760 (a common default) and 2495 L = 32, and requires more than 245,000 packets to achieve a 50% 2496 chance of success. 2498 A fast connection's W will generally be high, increasing the attack 2499 success probability for fixed N. If this probability gets 2500 uncomfortably high with L = 24, the endpoint SHOULD prevent the use 2501 of short sequence numbers by manipulating the Allow Short Sequence 2502 Numbers feature (see Section 7.6.1). The probability limit depends 2503 on the application, however. Some applications, such as those 2504 already designed to handle corruption, are quite resilient to data 2505 injection attacks. 2507 The DCCP-Sync attack has L = 48, since DCCP-Sync packets use long 2508 sequence numbers exclusively; in addition, the success probability 2509 is halved, since only half the Sequence Number space is valid. 2510 Attacks have a correspondingly smaller probability of success. 
For 2511 a large W of 2000 packets, then, the attacker must send more than 2512 10^11 packets to achieve a 50% chance of success. 2514 Attacks involving DCCP-Ack, DCCP-DataAck, DCCP-CloseReq, DCCP-Close, 2515 and DCCP-Reset packets are more difficult, since Sequence and 2516 Acknowledgement Numbers must both be guessed. The probability of 2517 attack success for these packet types equals P = WXN/2^(2L), where W 2518 is the Sequence Number window, X is the Acknowledgement Number 2519 window, and N and L are as before. 2521 Since DCCP-Data attacks with short sequence numbers are relatively 2522 easy for attackers to execute, DCCP has been engineered to prevent 2523 these attacks from escalating to connection resets or other serious 2524 consequences. In particular, any options whose processing might 2525 cause the connection to be reset are ignored when they appear on 2526 DCCP-Data packets. 2528 7.5.6. Examples 2530 In the following example, DCCP A and DCCP B recover from a large 2531 burst of loss that runs DCCP A's sequence numbers out of DCCP B's 2532 appropriate sequence number window. 2534 DCCP A DCCP B 2535 (GSS=1,GSR=10) (GSS=10,GSR=1) 2536 --> DCCP-Data(seq 2) XXX 2537 ... 2538 --> DCCP-Data(seq 100) XXX 2539 --> DCCP-Data(seq 101) --> ??? 2540 seqno out of range; 2541 send Sync 2542 OK <-- DCCP-Sync(seq 11, ack 101) <-- 2543 (GSS=11,GSR=1) 2544 --> DCCP-SyncAck(seq 102, ack 11) --> OK 2545 (GSS=102,GSR=11) (GSS=11,GSR=102) 2547 In the next example, a DCCP connection recovers from a simple blind 2548 attack. 2550 DCCP A DCCP B 2551 (GSS=1,GSR=10) (GSS=10,GSR=1) 2552 *ATTACKER* --> DCCP-Data(seq 10^6) --> ??? 2553 seqno out of range; 2554 send Sync 2555 ??? <-- DCCP-Sync(seq 11, ack 10^6) <-- 2556 ackno out of range; ignore 2557 (GSS=1,GSR=10) (GSS=11,GSR=1) 2559 The final example demonstrates recovery from a half-open connection. 2561 DCCP A DCCP B 2562 (GSS=1,GSR=10) (GSS=10,GSR=1) 2563 (Crash) 2564 CLOSED OPEN 2565 REQUEST --> DCCP-Request(seq 400) --> ??? 
2566 !! <-- DCCP-Sync(seq 11, ack 400) <-- OPEN 2567 REQUEST --> DCCP-Reset(seq 401, ack 11) --> (Abort) 2568 REQUEST CLOSED 2569 REQUEST --> DCCP-Request(seq 402) --> ... 2571 7.6. Short Sequence Numbers 2573 DCCP sequence numbers are 48 bits long. This large sequence space 2574 protects DCCP connections against some blind attacks, such as the 2575 injection of DCCP-Resets into the connection. However, DCCP-Data, 2576 DCCP-Ack, and DCCP-DataAck packets, which make up the body of any 2577 DCCP connection, may reduce header space by transmitting only the 2578 lower 24 bits of the relevant Sequence and Acknowledgement Numbers. 2579 The receiving endpoint will extend these numbers to 48 bits using 2580 the following pseudocode: 2582 procedure Extend_Sequence_Number(S, REF) 2583 /* S is a 24-bit sequence number from the packet header. 2584 REF is the relevant 48-bit reference sequence number: 2585 GSS if S is an Acknowledgement Number, and GSR if S is a 2586 Sequence Number. */ 2587 Set REF_low := low 24 bits of REF 2588 Set REF_hi := high 24 bits of REF 2589 If REF_low (<) S /* circular comparison mod 2^24 */ 2590 && S |<| REF_low, /* conventional, non-circular 2591 comparison */ 2592 Return (((REF_hi + 1) mod 2^24) << 24) | S 2593 Otherwise, 2594 Return (REF_hi << 24) | S 2596 The two different kinds of comparison in the if statement detect 2597 when the low-order bits of the sequence space have wrapped. (The 2598 circular comparison "REF_low (<) S" returns true if and only if 2599 (S - REF_low), calculated using two's-complement arithmetic and then 2600 represented as an unsigned number, is less than or equal to 2^23 2601 (mod 2^24).) When this happens, the high-order bits are 2602 incremented. 2604 7.6.1. Allow Short Sequence Numbers Feature 2606 Endpoints can require that all packets use long sequence numbers by 2607 setting the Allow Short Sequence Numbers feature to false. 
This can 2608 reduce the risk that data will be inappropriately injected into the 2609 connection. DCCP A sends a "Change R(Allow Short Seqnos, 0)" option 2610 to ask DCCP B to send only long sequence numbers. 2612 Allow Short Sequence Numbers has feature number 2, and is server- 2613 priority. It takes one-byte Boolean values. DCCP B MUST NOT send 2614 packets with short sequence numbers when Allow Short Seqnos/B is 2615 zero. Values of two or more are reserved. New connections start 2616 with Allow Short Sequence Numbers 1 for both endpoints. 2618 7.6.2. When to Avoid Short Sequence Numbers 2620 Short sequence numbers reduce the rate DCCP connections can safely 2621 achieve, and increase the risks of certain kinds of attacks, 2622 including blind data injection. Very-high-rate DCCP connections, 2623 and connections with large sequence windows (Section 7.5.2), SHOULD 2624 NOT use short sequence numbers on their data packets. The attack 2625 risk issues have been discussed in Section 7.5.5; we discuss the 2626 rate limitation issue here. 2628 The sequence-validity mechanism assumes that the network does not 2629 deliver extremely old data. In particular, it assumes that the 2630 network must have dropped any packet by the time the connection 2631 wraps around and uses its sequence number again. This constraint 2632 limits the maximum connection rate that can be safely achieved. Let 2633 MSL equal the maximum segment lifetime, P equal the average DCCP 2634 packet size in bits, and L equal the length of sequence numbers (24 2635 or 48 bits). Then the maximum safe rate, in bits per second, is R = 2636 P*(2^L)/2MSL. 2638 For the default MSL of 2 minutes, 1500-byte DCCP packets, and short 2639 sequence numbers, the safe rate is therefore approximately 800 Mb/s. 
2640 Although 2 minutes is a very large MSL for any networks that could 2641 sustain that rate with such small packets, long sequence numbers 2642 allow much higher rates under the same constraints: up to 2643 14 petabits a second for 1500-byte packets and the default MSL. 2645 7.7. NDP Count and Detecting Application Loss 2647 DCCP's sequence numbers increment by one on every packet, including 2648 non-data packets (packets that don't carry application data). This 2649 makes DCCP sequence numbers suitable for detecting any network loss, 2650 but not for detecting the loss of application data. The NDP Count 2651 option reports the length of each burst of non-data packets. This 2652 lets the receiving DCCP reliably determine when a burst of loss 2653 included application data. 2655 +--------+--------+-------- ... --------+ 2656 |00100101| Length | NDP Count | 2657 +--------+--------+-------- ... --------+ 2658 Type=37 Len=3-5 (1-3 bytes) 2660 If a DCCP endpoint's Send NDP Count feature is one (see below), then 2661 that endpoint MUST send an NDP Count option on every packet whose 2662 immediate predecessor was a non-data packet. Non-data packets 2663 consist of DCCP packet types DCCP-Ack, DCCP-Close, DCCP-CloseReq, 2664 DCCP-Reset, DCCP-Sync, and DCCP-SyncAck. The other packet types, 2665 namely DCCP-Request, DCCP-Response, DCCP-Data, and DCCP-DataAck, are 2666 considered data packets, although not all DCCP-Request and DCCP- 2667 Response packets will actually carry application data. 2669 The value stored in NDP Count equals the number of consecutive non- 2670 data packets in the run immediately previous to the current packet. 2671 Packets with no NDP Count option are considered to have NDP Count 2672 zero. 2674 The NDP Count option can carry one to three bytes of data. The 2675 smallest option format that can hold the NDP Count SHOULD be used. 2677 With NDP Count, the receiver can reliably tell only whether a burst 2678 of loss contained at least one data packet. 
For example, the receiver cannot always tell whether a burst of loss
contained a non-data packet.

7.7.1. Usage Notes

Say that K consecutive sequence numbers are missing in some burst of
loss, and the Send NDP Count feature is on. Then some application
data was lost within those sequence numbers unless the packet
following the hole contains an NDP Count option whose value is greater
than or equal to K.

For example, say that an endpoint sent the following sequence of
non-data packets (Nx) and data packets (Dx).

   N0  N1  D2  N3  D4  D5  N6  D7  D8  D9  D10  N11  N12  D13

Those packets would have NDP Counts as follows.

   N0  N1  D2  N3  D4  D5  N6  D7  D8  D9  D10  N11  N12  D13
   -   1   2   -   1   -   -   1   -   -   -    -    1    2

NDP Count is not useful for applications that include their own
sequence numbers with their packet headers.

7.7.2. Send NDP Count Feature

The Send NDP Count feature lets DCCP endpoints negotiate whether they
should send NDP Count options on their packets. DCCP A sends a
"Change R(Send NDP Count, 1)" option to ask DCCP B to send NDP Count
options.

Send NDP Count has feature number 7, and is server-priority. It takes
one-byte Boolean values. DCCP B MUST send NDP Count options as
described above when Send NDP Count/B is one, although it MAY send NDP
Count options even when Send NDP Count/B is zero. Values of two or
more are reserved. New connections start with Send NDP Count 0 for
both endpoints.

8. Event Processing

This section describes how DCCP connections move between states, and
which packets are sent when. Note that feature negotiation takes
place in parallel with the connection-wide state transitions described
here.

8.1. Connection Establishment

DCCP connections' initiation phase consists of a three-way handshake:
an initial DCCP-Request packet sent by the client, a DCCP-Response
sent by the server in reply, and finally an acknowledgement from the
client, usually via a DCCP-Ack or DCCP-DataAck packet. The client
moves from the REQUEST state to PARTOPEN, and finally to OPEN; the
server moves from LISTEN to RESPOND, and finally to OPEN.

   Client State                             Server State
      CLOSED                                   LISTEN
1.    REQUEST   -->       Request        -->
2.              <--       Response       <--   RESPOND
3.    PARTOPEN  -->     Ack, DataAck     -->
4.              <--  Data, Ack, DataAck  <--   OPEN
5.    OPEN      <->  Data, Ack, DataAck  <->   OPEN

8.1.1. Client Request

When a client decides to initiate a connection, it enters the REQUEST
state, chooses an initial sequence number (Section 7.2), and sends a
DCCP-Request packet using that sequence number to the intended server.

DCCP-Request packets will commonly carry feature negotiation options
that open negotiations for various connection parameters, such as
preferred congestion control IDs for each half-connection. They may
also carry application data, but the client should be aware that the
server may not accept such data.

A client in the REQUEST state SHOULD use an exponential-backoff timer
to send new DCCP-Request packets if no response is received. The
first retransmission should occur after approximately one second,
backing off to not less than one packet every 64 seconds; or the
endpoint can use whatever retransmission strategy is followed for
retransmitting TCP SYNs. Each new DCCP-Request MUST increment the
Sequence Number by one, and MUST contain the same Service Code and
application data as the original DCCP-Request.

A client MAY give up on its DCCP-Requests after some time (3 minutes,
for example).
When it does, it SHOULD send a DCCP-Reset packet to the server with
Reset Code 2, "Aborted", to clean up state in case one or more of the
Requests actually arrived. A client in REQUEST state has never
received an initial sequence number from its peer, so the DCCP-Reset's
Acknowledgement Number MUST be set to zero.

The client leaves the REQUEST state for PARTOPEN when it receives a
DCCP-Response from the server.

8.1.2. Service Codes

Each DCCP-Request contains a 32-bit Service Code, which identifies the
application-level service to which the client application is trying to
connect. Service Codes should correspond to application services and
protocols. For example, there might be a Service Code for SIP control
connections and one for RTP audio connections. Middleboxes, such as
firewalls, can use the Service Code to identify the application
running on a nonstandard port (assuming the DCCP header has not been
encrypted).

Endpoints MUST associate a Service Code with every DCCP socket, both
actively and passively opened. The application will generally supply
this Service Code. Each active socket MUST have exactly one Service
Code. Passive sockets MAY, at the implementation's discretion, be
associated with more than one Service Code; this might let multiple
applications, or multiple versions of the same application, listen on
the same port, differentiated by Service Code. If the DCCP-Request's
Service Code doesn't match any of the server's Service Codes for the
given port, the server MUST reject the request by sending a DCCP-Reset
packet with Reset Code 8, "Bad Service Code". A middlebox MAY also
send such a DCCP-Reset in response to packets whose Service Code is
considered unsuitable.

Service Codes are not intended to be DCCP-specific, and are allocated
by IANA.
Following the policies outlined in [RFC 2434], most Service Codes are
allocated First Come First Served, subject to the following
guidelines.

o  Service Codes are allocated one at a time, or in small blocks. A
   short English description of the intended service is REQUIRED to
   obtain a Service Code assignment, but no specification,
   standards-track or otherwise, is necessary. IANA maintains an
   association of Service Codes to the corresponding phrases.

o  Users request specific Service Code values. We suggest that users
   request Service Codes that can be interpreted as meaningful
   four-byte ASCII strings. Thus, the "Frobodyne Plotz Protocol"
   might correspond to "fdpz", or the number 1717858426. The
   canonical interpretation of a Service Code field is numeric.

o  Service Codes whose bytes each have values in the set {32, 45-57,
   65-90} use a Specification Required allocation policy. That is,
   these Service Codes are used for international standard or
   standards-track specifications, IETF or otherwise. (This set
   consists of the ASCII digits, uppercase letters, and characters
   space, '-', '.', and '/'.)

o  Service Codes whose high-order byte equals 63 (ASCII '?') are
   reserved for Private Use.

o  Service Code 0 represents the absence of a meaningful Service Code,
   and MUST never be allocated.

This design for Service Code allocation is based on the allocation of
4-byte identifiers for Macintosh resources, PNG chunks, and TrueType
and OpenType tables.

8.1.3. Server Response

In the second phase of the three-way handshake, the server moves from
the LISTEN state to RESPOND, and sends a DCCP-Response message to the
client. In this phase, a server will often specify the features it
would like to use, either from among those the client requested, or in
addition to those.
Among these options is the congestion control mechanism the server
expects to use.

The server MAY respond to a DCCP-Request packet with a DCCP-Reset
packet to refuse the connection. Relevant Reset Codes for refusing a
connection include 7, "Connection Refused", when the DCCP-Request's
Destination Port did not correspond to a DCCP port open for listening;
8, "Bad Service Code", when the DCCP-Request's Service Code did not
correspond to the service code registered with the Destination Port;
and 9, "Too Busy", when the server is currently too busy to respond to
requests. The server SHOULD limit the rate at which it generates
these resets, for example to not more than 1024 per second.

The server SHOULD NOT retransmit DCCP-Response packets; the client
will retransmit the DCCP-Request if necessary. (Note that the
"retransmitted" DCCP-Request will have, at least, a different sequence
number from the "original" DCCP-Request. The server can thus
distinguish true retransmissions from network duplicates.) The server
will detect that the retransmitted DCCP-Request applies to an existing
connection because of its Source and Destination Ports. Every valid
DCCP-Request received while the server is in the RESPOND state MUST
elicit a new DCCP-Response. Each new DCCP-Response MUST increment the
server's Sequence Number by one, and MUST include the same application
data, if any, as the original DCCP-Response.

The server MUST NOT accept more than one piece of DCCP-Request
application data per connection. In particular, the DCCP-Response
sent in reply to a retransmitted DCCP-Request with application data
SHOULD contain a Data Dropped option, in which the retransmitted
DCCP-Request data is reported with Drop Code 0, Protocol Constraints.
The original DCCP-Request SHOULD also be reported in the Data Dropped
option, either in a Normal Block (if the server accepted the data, or
there was no data), or in a Drop Code 0 Drop Block (if the server
refused the data the first time as well).

The Data Dropped and Init Cookie options are particularly useful for
DCCP-Response packets (Sections 11.7 and 8.1.4).

The server leaves the RESPOND state for OPEN when it receives a valid
DCCP-Ack from the client, completing the three-way handshake. It MAY
also leave the RESPOND state for CLOSED after a timeout of not less
than 4MSL (8 minutes); when doing so, it SHOULD send a DCCP-Reset with
Reset Code 2, "Aborted", to clean up state at the client.

8.1.4. Init Cookie Option

+--------+--------+--------+--------+--------+--------
|00100100| Length |         Init Cookie Value   ...
+--------+--------+--------+--------+--------+--------
 Type=36

The Init Cookie option lets a DCCP server avoid having to hold any
state until the three-way connection setup handshake has completed, in
a fashion similar to TCP SYN cookies [SYNCOOKIES]. The server wraps
up the Service Code, server port, and any options it cares about from
both the DCCP-Request and DCCP-Response in an opaque cookie.
Typically the cookie will be encrypted using a secret known only to
the server and include a cryptographic checksum or magic value so that
correct decryption can be verified. When the server receives the
cookie back in the client's reply, it can decrypt the cookie and
instantiate all the state it avoided keeping. In the meantime, it
need not move from the LISTEN state.

The Init Cookie option MUST NOT be sent on DCCP-Request or DCCP-Data
packets, and any such options received on DCCP-Request or DCCP-Data
packets MUST be ignored. The server MAY include an Init Cookie option
in its DCCP-Response.
If so, then the client MUST echo the same Init Cookie option in each
succeeding DCCP packet until one of those packets is acknowledged,
meaning the three-way handshake has completed, or the connection is
reset. (As a result, the client MUST NOT use DCCP-Data packets until
the three-way handshake completes or the connection is reset.) The
server SHOULD design its Init Cookie format so that Init Cookies can
be checked for tampering; it SHOULD respond to a tampered Init Cookie
option by resetting the connection with Reset Code 10, "Bad Init
Cookie".

Init Cookie's precise implementation need not be specified here; since
Init Cookies are opaque to the client, there are no interoperability
concerns. An example cookie format might encrypt (using a secret key)
the connection's initial sequence and acknowledgement numbers, ports,
Service Code, any options included on the DCCP-Request packet and the
corresponding DCCP-Response, a random salt, and a magic number. On
receiving a reflected Init Cookie, the server would decrypt the
cookie, validate it by checking its magic number, sequence numbers,
and ports, and, if valid, create a corresponding socket using the
options.

Init Cookies are limited to at most 253 bytes in length.

8.1.5. Handshake Completion

When the client receives a DCCP-Response from the server, it moves
from the REQUEST state to PARTOPEN and completes the three-way
handshake by sending a DCCP-Ack packet to the server. The client
remains in PARTOPEN until it can be sure that the server has received
some packet the client sent from PARTOPEN (either the initial DCCP-Ack
or a later packet). Clients in the PARTOPEN state that want to send
data MUST do so using DCCP-DataAck packets, not DCCP-Data packets.
This is because DCCP-Data packets lack Acknowledgement Numbers, so the
server can't tell from a DCCP-Data packet whether the client saw its
DCCP-Response. Furthermore, if the DCCP-Response included an Init
Cookie, that Init Cookie MUST be included on every packet sent in
PARTOPEN.

The single DCCP-Ack sent when entering the PARTOPEN state might, of
course, be dropped by the network. The client SHOULD ensure that some
packet gets through eventually. The preferred mechanism would be a
roughly 200-millisecond timer, set every time a packet is transmitted
in PARTOPEN. If this timer goes off and the client is still in
PARTOPEN, the client generates another DCCP-Ack and backs off the
timer. If the client remains in PARTOPEN for more than 4MSL
(8 minutes), it SHOULD reset the connection with Reset Code 2,
"Aborted".

The client leaves the PARTOPEN state for OPEN when it receives a valid
packet other than DCCP-Response, DCCP-Reset, or DCCP-Sync from the
server.

8.2. Data Transfer

In the central data transfer phase of the connection, both server and
client are in the OPEN state.

DCCP A sends DCCP-Data and DCCP-DataAck packets to DCCP B due to
application events on host A. These packets are congestion-controlled
by the CCID for the A-to-B half-connection. In contrast, DCCP-Ack
packets sent by DCCP A are controlled by the CCID for the B-to-A
half-connection. Generally, DCCP A will piggyback acknowledgement
information on DCCP-Data packets when acceptable, creating
DCCP-DataAck packets. DCCP-Ack packets are used when there is no data
to send from DCCP A to DCCP B, or when the congestion state of the
A-to-B CCID will not allow data to be sent.

DCCP-Sync and DCCP-SyncAck packets may also occur in the data transfer
phase. Some cases causing DCCP-Sync generation are discussed in
Section 7.5.
One important distinction between DCCP-Sync packets and other packet
types is that DCCP-Sync elicits an immediate acknowledgement. On
receiving a valid DCCP-Sync packet, a DCCP endpoint MUST immediately
generate and send a DCCP-SyncAck response (subject to any
implementation rate limits); the Acknowledgement Number on that
DCCP-SyncAck MUST equal the Sequence Number of the DCCP-Sync.

A particular DCCP implementation might decide to initiate feature
negotiation only once the OPEN state was reached, in which case it
might not allow data transfer until some time later. Data received
during that time SHOULD be rejected and reported using a Data Dropped
Drop Block with Drop Code 0, Protocol Constraints (see Section 11.7).

8.3. Termination

DCCP connection termination uses a handshake consisting of an optional
DCCP-CloseReq packet, a DCCP-Close packet, and a DCCP-Reset packet.
The server moves from the OPEN state, possibly through the CLOSEREQ
state, to CLOSED; the client moves from OPEN through CLOSING to
TIMEWAIT, and after 2MSL wait time (4 minutes), to CLOSED.

The sequence DCCP-CloseReq, DCCP-Close, DCCP-Reset is used when the
server decides to close the connection, but doesn't want to hold
TIMEWAIT state:

   Client State                             Server State
      OPEN                                     OPEN
1.              <--       CloseReq       <--   CLOSEREQ
2.    CLOSING   -->        Close         -->
3.              <--        Reset         <--   CLOSED (LISTEN)
4.    TIMEWAIT
5.    CLOSED

A shorter sequence occurs when the client decides to close the
connection.

   Client State                             Server State
      OPEN                                     OPEN
1.    CLOSING   -->        Close         -->
2.              <--        Reset         <--   CLOSED (LISTEN)
3.    TIMEWAIT
4.    CLOSED

Finally, the server can decide to hold TIMEWAIT state:

   Client State                             Server State
      OPEN                                     OPEN
1.              <--        Close         <--   CLOSING
2.    CLOSED    -->        Reset         -->
3.                                             TIMEWAIT
4.                                             CLOSED (LISTEN)

In all cases, the receiver of the DCCP-Reset packet holds TIMEWAIT
state for the connection. As in TCP, TIMEWAIT state, where an
endpoint quietly preserves a socket for 2MSL (4 minutes) after its
connection has closed, ensures that no connection duplicating the
current connection's source and destination addresses and ports can
start up while old packets might remain in the network.

The termination handshake proceeds as follows. The receiver of a
valid DCCP-CloseReq packet MUST respond with a DCCP-Close packet. The
receiver of a valid DCCP-Close packet MUST respond with a DCCP-Reset
packet, with Reset Code 1, "Closed". The receiver of a valid
DCCP-Reset packet -- which is also the sender of the DCCP-Close packet
(and possibly the receiver of the DCCP-CloseReq packet) -- will hold
TIMEWAIT state for the connection.

A DCCP-Reset packet completes every DCCP connection, whether the
termination is clean (due to application close; Reset Code 1,
"Closed") or unclean. Unlike TCP, which has two distinct termination
mechanisms (FIN and RST), DCCP ends all connections in a uniform
manner. This is justified because some aspects of connection
termination are the same independent of whether termination was clean.
For instance, the endpoint that receives a valid DCCP-Reset SHOULD
hold TIMEWAIT state for the connection. Processors that must
distinguish between clean and unclean termination can examine the
Reset Code. DCCP-Reset packets MUST NOT be generated in response to
received DCCP-Reset packets. DCCP implementations generally
transition to the CLOSED state after sending a DCCP-Reset packet.

Endpoints in the CLOSEREQ and CLOSING states MUST retransmit
DCCP-CloseReq and DCCP-Close packets, respectively, until leaving
those states.
The retransmission timer should initially be set to go off in two
round-trip times, and should back off to not less than once every
64 seconds if no relevant response is received.

Only the server can send a DCCP-CloseReq packet or enter the CLOSEREQ
state. A server receiving a sequence-valid DCCP-CloseReq packet MUST
respond with a DCCP-Sync packet, and otherwise ignore the
DCCP-CloseReq.

DCCP-Data, DCCP-DataAck, and DCCP-Ack packets received in CLOSEREQ or
CLOSING states MAY be either processed or ignored.

8.3.1. Abnormal Termination

DCCP endpoints generate DCCP-Reset packets to terminate connections
abnormally; a DCCP-Reset packet may be generated from any state.
Resets sent in the CLOSED, LISTEN, and TIMEWAIT states use Reset
Code 3, "No Connection", unless otherwise specified. Resets sent in
the REQUEST or RESPOND states use Reset Code 4, "Packet Error", unless
otherwise specified.

DCCP endpoints in CLOSED or LISTEN state may need to generate a
DCCP-Reset packet in response to a packet received from a peer. Since
these states have no associated sequence number variables, the
Sequence and Acknowledgement Numbers on the DCCP-Reset packet R are
taken from the received packet P, as follows.

1. If P.ackno exists, then set R.seqno := P.ackno + 1. Otherwise,
   set R.seqno := 0.

2. Set R.ackno := P.seqno.

3. If the packet used short sequence numbers (P.X == 0), then set the
   upper 24 bits of R.seqno and R.ackno to 0.

8.4. DCCP State Diagram

The most common state transitions discussed above can be summarized in
the following state diagram. The diagram is illustrative; the text in
Section 8.5 and elsewhere should be considered definitive. For
example, there are arcs (not shown) from every state except CLOSED to
TIMEWAIT, contingent on the receipt of a valid DCCP-Reset.
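For readers who prefer a table to a picture, the arcs of the state
diagram in this section can also be written as a lookup table. The
Python sketch below is illustrative and not normative: the event
strings are hypothetical labels copied from the arc captions, the
function names step and run are invented for this example, and no
sequence-number or option processing is modeled.

```python
# (state, event) -> (next state, packet sent on the transition or None).
# Event names are hypothetical labels taken from the diagram's captions.
TRANSITIONS = {
    ("CLOSED", "passive open"): ("LISTEN", None),
    ("CLOSED", "active open"): ("REQUEST", "Request"),
    ("LISTEN", "rcv Request"): ("RESPOND", "Response"),
    ("REQUEST", "rcv Response"): ("PARTOPEN", "Ack"),
    ("RESPOND", "rcv Ack/DataAck"): ("OPEN", None),
    ("PARTOPEN", "rcv packet"): ("OPEN", None),
    ("OPEN", "server active close"): ("CLOSEREQ", "CloseReq"),
    ("OPEN", "active close"): ("CLOSING", "Close"),
    ("OPEN", "rcv CloseReq"): ("CLOSING", "Close"),
    ("OPEN", "rcv Close"): ("CLOSED", "Reset"),
    ("CLOSEREQ", "rcv Close"): ("CLOSED", "Reset"),
    ("CLOSING", "rcv Reset"): ("TIMEWAIT", None),
    ("TIMEWAIT", "2MSL timer expires"): ("CLOSED", None),
}


def step(state, event):
    """Return (next_state, packet_to_send) for one transition."""
    # Arcs not drawn in the diagram: a valid Reset received in any
    # state except CLOSED leads to TIMEWAIT.
    if event == "rcv Reset" and state != "CLOSED":
        return ("TIMEWAIT", None)
    # Raises KeyError for combinations the diagram does not show.
    return TRANSITIONS[(state, event)]


def run(state, events):
    """Apply events in order; return the final state and packets sent."""
    sent = []
    for event in events:
        state, packet = step(state, event)
        if packet is not None:
            sent.append(packet)
    return state, sent


# A client that opens a connection and later closes it.
client_state, client_sent = run("CLOSED", [
    "active open", "rcv Response", "rcv packet",
    "active close", "rcv Reset", "2MSL timer expires",
])
print(client_state, client_sent)
```

A table like this covers only the common paths; the pseudocode in
Section 8.5 remains the definitive description.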
+---------------------------+    +---------------------------+
|                           v    v                           |
|                        +----------+                        |
|          +-------------+  CLOSED  +-------------+          |
|          | passive     +----------+     active  |          |
|          |  open                         open   |          |
|          |                         snd Request  |          |
|          v                                      v          |
|     +----------+                           +----------+    |
|     |  LISTEN  |                           | REQUEST  |    |
|     +----+-----+                           +----+-----+    |
|          | rcv Request             rcv Response |          |
|          | snd Response            snd Ack      |          |
|          v                                      v          |
|     +----------+                           +----------+    |
|     | RESPOND  |                           | PARTOPEN |    |
|     +----+-----+                           +----+-----+    |
|          | rcv Ack/DataAck           rcv packet |          |
|          |                                      |          |
|          |             +----------+             |          |
|          +------------>|   OPEN   |<------------+          |
|                        +--+-+--+--+                        |
|    server active close    | |  |   active close            |
|        snd CloseReq       | |  | or rcv CloseReq           |
|                           | |  |     snd Close             |
|                           | |  |                           |
|     +----------+          | |  |           +----------+    |
|     | CLOSEREQ |<---------+ |  +---------->| CLOSING  |    |
|     +----+-----+            |              +----+-----+    |
|          | rcv Close        |         rcv Reset |          |
|          | snd Reset        |                   |          |
|<---------+                  |                   v          |
|                             |              +----+-----+    |
|                   rcv Close |              | TIMEWAIT |    |
|                   snd Reset |              +----+-----+    |
+-----------------------------+                   |          |
                                                  +----------+
                                                2MSL timer expires

8.5. Pseudocode

This section presents an algorithm describing the processing steps a
DCCP endpoint must go through when it receives a packet. A DCCP
implementation need not implement the algorithm as it is described
here, but any implementation MUST generate observable effects exactly
as indicated by this pseudocode, except where allowed otherwise by
another part of this document.

The received packet is written as P, the socket as S.
Packet variables P.seqno and P.ackno are 48-bit sequence numbers.
Socket variables:
S.SWL - sequence number window low
S.SWH - sequence number window high
S.AWL - acknowledgement number window low
S.AWH - acknowledgement number window high
S.ISS - initial sequence number sent
S.ISR - initial sequence number received
S.OSR - first OPEN sequence number received
S.GSS - greatest sequence number sent
S.GSR - greatest valid sequence number received
S.GAR - greatest valid acknowledgement number received on a
        non-Sync; initialized to S.ISS
"Send packet" actions always use, and increment, S.GSS.

Step 1: Check header basics
   /* This step checks for malformed packets. Packets that fail
      these checks are ignored -- they do not receive Resets in
      response */
   If the packet is shorter than 12 bytes, drop packet and return
   If the packet type is not understood, drop packet and return
   If P.Data Offset is too small for packet type, or too large for
         packet, drop packet and return
   If P.type is not Data, Ack, or DataAck and P.X == 0 (the packet
         has short sequence numbers), drop packet and return
   If the header checksum is incorrect, drop packet and return
   If P.CsCov is too large for the packet size, drop packet and
         return

Step 2: Check ports and process TIMEWAIT state
   Look up flow ID in table and get corresponding socket
   If no socket, or S.state == TIMEWAIT,
      Generate Reset(No Connection) unless P.type == Reset
      Drop packet and return

Step 3: Process LISTEN state
   If S.state == LISTEN,
      If P.type == Request or P contains a valid Init Cookie option,
         /* Must scan the packet's options to check for an Init
            Cookie. Only the Init Cookie is processed here,
            however; other options are processed in Step 8. This
            scan need only be performed if the endpoint uses Init
            Cookies */
         /* Generate a new socket and switch to that socket */
         Set S := new socket for this port pair
         S.state := RESPOND
         Choose S.ISS (initial seqno) or set from Init Cookie
         Set S.ISR, S.GSR, S.SWL, S.SWH from packet or Init Cookie
         Continue with S.state == RESPOND
         /* A Response packet will be generated in Step 11 */
      Otherwise,
         Generate Reset(No Connection) unless P.type == Reset
         Drop packet and return

Step 4: Prepare sequence numbers in REQUEST
   If S.state == REQUEST,
      If (P.type == Response or P.type == Reset)
            and S.AWL <= P.ackno <= S.AWH,
         /* Set sequence number variables corresponding to the
            other endpoint, so P will pass the tests in Step 6 */
         Set S.GSR, S.ISR, S.SWL, S.SWH
         /* Response processing continues in Step 10; Reset
            processing continues in Step 9 */
      Otherwise,
         /* Only Response and Reset are valid in REQUEST state */
         Generate Reset(Packet Error)
         Drop packet and return

Step 5: Prepare sequence numbers for Sync
   If P.type == Sync or P.type == SyncAck,
      If S.AWL <= P.ackno <= S.AWH and P.seqno >= S.SWL,
         /* P is valid, so update sequence number variables
            accordingly. After this update, P will pass the tests
            in Step 6. A SyncAck is generated if necessary in
            Step 15 */
         Update S.GSR, S.SWL, S.SWH
      Otherwise,
         Drop packet and return

Step 6: Check sequence numbers
   Let LSWL = S.SWL and LAWL = S.AWL
   If P.type == CloseReq or P.type == Close or P.type == Reset,
      LSWL := S.GSR + 1, LAWL := S.GAR
   If LSWL <= P.seqno <= S.SWH
         and (P.ackno does not exist or LAWL <= P.ackno <= S.AWH),
      Update S.GSR, S.SWL, S.SWH
      If P.type != Sync,
         Update S.GAR
   Otherwise,
      Send Sync packet acknowledging P.seqno
      Drop packet and return

Step 7: Check for unexpected packet types
   If (S.is_server and P.type == CloseReq)
         or (S.is_server and P.type == Response)
         or (S.is_client and P.type == Request)
         or (S.state >= OPEN and P.type == Request
             and P.seqno >= S.OSR)
         or (S.state >= OPEN and P.type == Response
             and P.seqno >= S.OSR)
         or (S.state == RESPOND and P.type == Data),
      Send Sync packet acknowledging P.seqno
      Drop packet and return

Step 8: Process options and mark acknowledgeable
   /* Option processing is not specifically described here.
      Certain options, such as Mandatory, may cause the connection
      to be reset, in which case Steps 9 and on are not executed */
   Mark packet as acknowledgeable (in Ack Vector terms, Received
         or Received ECN Marked)

Step 9: Process Reset
   If P.type == Reset,
      Tear down connection
      S.state := TIMEWAIT
      Set TIMEWAIT timer
      Drop packet and return

Step 10: Process REQUEST state (second part)
   If S.state == REQUEST,
      /* If we get here, P is a valid Response from the server (see
         Step 4), and we should move to PARTOPEN state. PARTOPEN
         means send an Ack, don't send Data packets, retransmit
         Acks periodically, and always include any Init Cookie from
         the Response */
      S.state := PARTOPEN
      Set PARTOPEN timer
      Continue with S.state == PARTOPEN
      /* Step 12 will send the Ack completing the three-way
         handshake */

Step 11: Process RESPOND state
   If S.state == RESPOND,
      If P.type == Request,
         Send Response, possibly containing Init Cookie
         If Init Cookie was sent,
            Destroy S and return
            /* Step 3 will create another socket when the client
               completes the three-way handshake */
      Otherwise,
         S.OSR := P.seqno
         S.state := OPEN

Step 12: Process PARTOPEN state
   If S.state == PARTOPEN,
      If P.type == Response,
         Send Ack
      Otherwise, if P.type != Sync,
         S.OSR := P.seqno
         S.state := OPEN

Step 13: Process CloseReq
   If P.type == CloseReq and S.state < CLOSEREQ,
      Generate Close
      S.state := CLOSING
      Set CLOSING timer

Step 14: Process Close
   If P.type == Close,
      Generate Reset(Closed)
      Tear down connection
      Drop packet and return

Step 15: Process Sync
   If P.type == Sync,
      Generate SyncAck

Step 16: Process data
   /* At this point any application data on P can be passed to the
      application, except that the application MUST NOT receive
      data from more than one Request or Response */

9. Checksums

DCCP uses a header checksum to protect its header against corruption.
Generally, this checksum also covers any application data. DCCP
applications can, however, request that the header checksum cover only
part of the application data, or perhaps no application data at all.
Link layers may then reduce their protection on unprotected parts of
DCCP packets. For some noisy links, and applications that can
tolerate corruption, this can greatly improve delivery rates and
perceived performance.
Checksum coverage may eventually impact congestion control mechanisms
as well. A packet with corrupt application data and complete checksum
coverage is treated as lost. This incurs a heavy-duty loss response
from the sender's congestion control mechanism, which can unfairly
penalize connections on links with high background corruption. The
combination of reduced checksum coverage and Data Checksum options may
let endpoints report packets as corrupt rather than dropped, using
Data Dropped options and Drop Code 3 (see Section 11.7). This may
eventually benefit applications. However, further research is
required to determine an appropriate response to corruption, which can
sometimes correlate with congestion. Corrupt packets currently incur
a loss response.

The Data Checksum option, which contains a strong CRC, lets endpoints
detect application data corruption. An API can then be used to avoid
delivering corrupt data to the application, even if links deliver
corrupt data to the endpoint due to reduced checksum coverage.
However, the use of reduced checksum coverage for applications that
demand correct data is currently considered experimental. This is
because the combined loss-plus-corruption rate for packets with
reduced checksum coverage may be significantly higher than that for
packets with full checksum coverage, although the loss rate will
generally be lower. Actual behavior will depend on link design;
further research and experience are required.

Reduced checksum coverage introduces some security considerations; see
Section 18.1. See Appendix B for further motivation and discussion.
DCCP's implementation of reduced checksum coverage was inspired by
UDP-Lite [RFC 3828].

9.1. Header Checksum Field

DCCP uses the TCP/IP checksum algorithm.
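As a minimal sketch of that algorithm, the Python function below
computes the 16-bit one's complement of the one's complement sum of a
byte string. It assumes the caller has already assembled the bytes to
be covered (pseudoheader, DCCP header with the Checksum field zeroed,
options, and covered application data, as this section goes on to
describe); the name internet_checksum is hypothetical, and building
the pseudoheader is not shown.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum of data."""
    if len(data) % 2:
        # Odd length: pad on the right with 8 zero bits for the
        # computation only; the pad byte is never transmitted.
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]   # big-endian 16-bit word
    while total >> 16:
        # Fold carries back into the low 16 bits (end-around carry).
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


sample = bytes([0x00, 0x01, 0xF2, 0x03, 0xF4, 0xF5, 0xF6, 0xF7])
checksum = internet_checksum(sample)
print(hex(checksum))

# A receiver that sums the same bytes plus the transmitted checksum
# obtains zero when nothing was corrupted.
verified = internet_checksum(sample + checksum.to_bytes(2, "big")) == 0
print(verified)
```

Summing big-endian words keeps the sketch independent of host byte
order; a production implementation would normally use wider
accumulators for speed.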
The Checksum field in the DCCP generic header (see Section 5.1) equals
the 16-bit one's complement of the one's complement sum of all 16-bit
words in the DCCP header, DCCP options, a pseudoheader taken from the
network-layer header, and, depending on the value of the Checksum
Coverage field, some or all of the application data. When calculating
the checksum, the Checksum field itself is treated as 0. If a packet
contains an odd number of header and payload bytes to be checksummed,
8 zero bits are added on the right to form a 16-bit word for checksum
purposes. The pad byte is not transmitted as part of the packet.

The pseudoheader is calculated as for TCP. For IPv4, it is 96 bits
long, and consists of the IPv4 source and destination addresses, the
IP protocol number for DCCP (padded on the left with 8 zero bits), and
the DCCP length as a 16-bit quantity (the length of the DCCP header
with options, plus the length of any data); see Section 3.1 of
[RFC 793]. For IPv6, it is 320 bits long, and consists of the IPv6
source and destination addresses, the DCCP length as a 32-bit
quantity, and the IP protocol number for DCCP (padded on the left with
24 zero bits); see Section 8.1 of [RFC 2460].

Packets with invalid header checksums MUST be ignored. In particular,
their options MUST NOT be processed.

9.2. Header Checksum Coverage Field

The Checksum Coverage field in the DCCP generic header (see
Section 5.1) specifies what parts of the packet are covered by the
Checksum field, as follows:

CsCov = 0      The Checksum field covers the DCCP header, DCCP
               options, network-layer pseudoheader, and all
               application data in the packet, possibly padded on
               the right with zeros to an even number of bytes.

   CsCov = 1-15   The Checksum field covers the DCCP header, DCCP
                  options, network-layer pseudoheader, and the initial
                  (CsCov-1)*4 bytes of the packet's application data.

   Thus, if CsCov is 1, none of the application data is protected by
   the header checksum. The value (CsCov-1)*4 MUST be less than or
   equal to the length of the application data. Packets with invalid
   CsCov values MUST be ignored; in particular, their options MUST NOT
   be processed. The meanings of values other than 0 and 1 should be
   considered experimental.

   Values other than 0 specify that corruption is acceptable in some or
   all of the DCCP packet's application data. In fact, DCCP cannot
   even detect corruption in areas not covered by the header checksum,
   unless the Data Checksum option is used. Applications should not
   make any assumptions about the correctness of received data not
   covered by the checksum, and should if necessary introduce their own
   validity checks.

   A DCCP application interface should let sending applications suggest
   a value for CsCov for sent packets, defaulting to 0 (full coverage).
   The Minimum Checksum Coverage feature, described below, lets an
   endpoint refuse delivery of application data on packets with partial
   checksum coverage; by default, only fully-covered application data
   is accepted. Lower layers that support partial error detection MAY
   use the Checksum Coverage field as a hint of where errors do not
   need to be detected. Lower layers MUST use a strong error detection
   mechanism to detect at least errors that occur in the sensitive part
   of the packet, and discard damaged packets. The sensitive part
   consists of the bytes between the first byte of the IP header and
   the last byte identified by Checksum Coverage.

   For more details on application and lower-layer interface issues
   relating to partial checksumming, see [RFC 3828].

9.2.1.
Minimum Checksum Coverage Feature

   The Minimum Checksum Coverage feature lets a DCCP endpoint determine
   whether its peer is willing to accept packets with reduced Checksum
   Coverage. For example, DCCP A sends a "Change R(Minimum Checksum
   Coverage, 1)" option to DCCP B to check whether B is willing to
   accept packets with Checksum Coverage set to 1.

   Minimum Checksum Coverage has feature number 8, and is
   server-priority. It takes one-byte integer values between 0 and 15;
   values of 16 or more are reserved. Minimum Checksum Coverage/B
   reflects values of Checksum Coverage that DCCP B finds unacceptable.
   Say that the value of Minimum Checksum Coverage/B is MinCsCov. Then:

   o  If MinCsCov = 0, then DCCP B only finds packets with CsCov = 0
      acceptable.

   o  If MinCsCov > 0, then DCCP B additionally finds packets with
      CsCov >= MinCsCov acceptable.

   DCCP B MAY refuse to process application data from packets with
   unacceptable Checksum Coverage. Such packets SHOULD be reported
   using Data Dropped options (Section 11.7) with Drop Code 0, Protocol
   Constraints. New connections start with Minimum Checksum Coverage 0
   for both endpoints.

9.3. Data Checksum Option

   The Data Checksum option holds a 32-bit CRC-32c cyclic
   redundancy-check code of a DCCP packet's application data.

      +--------+--------+--------+--------+--------+--------+
      |00101100|00000110|              CRC-32c              |
      +--------+--------+--------+--------+--------+--------+
       Type=44  Length=6

   The sending DCCP computes the CRC of the bytes comprising the
   application data area and stores it in the option data. The CRC-32c
   algorithm used for Data Checksum is the same as that used for SCTP
   [RFC 3309]; note that the CRC-32c of zero bytes of data equals zero.
   The DCCP header checksum will cover the Data Checksum option, so the
   data checksum must be computed before the header checksum.
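
   The checksum rules of Sections 9.1 through 9.3 can be illustrated
   with a short, non-normative Python sketch. It is an illustration
   under stated assumptions, not a reference implementation: the
   function names are hypothetical, the caller is assumed to supply the
   pseudoheader, header (with a zeroed Checksum field), options, and
   covered data as one byte string, and the bit-at-a-time CRC loop is
   chosen for brevity rather than speed. The sketch does not place the
   CRC into the option; the on-the-wire representation is left to
   [RFC 3309].

```python
def internet_checksum(covered: bytes) -> int:
    """16-bit one's complement of the one's complement sum (Section 9.1).

    Assumption: `covered` already holds the pseudoheader, the DCCP header
    with its Checksum field set to zero, the options, and the covered data.
    """
    if len(covered) % 2:
        covered += b"\x00"  # pad on the right; the pad byte is never sent
    total = 0
    for i in range(0, len(covered), 2):
        total += (covered[i] << 8) | covered[i + 1]
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def covered_app_data_len(cscov: int, app_len: int) -> int:
    """Application data bytes covered by the header checksum (Section 9.2)."""
    if not 0 <= cscov <= 15:
        raise ValueError("CsCov takes values 0 through 15")
    if cscov == 0:
        return app_len  # full coverage
    covered = (cscov - 1) * 4
    if covered > app_len:
        raise ValueError("invalid CsCov: the packet MUST be ignored")
    return covered


def cscov_acceptable(cscov: int, min_cscov: int) -> bool:
    """Minimum Checksum Coverage rule (Section 9.2.1)."""
    return cscov == 0 or (min_cscov > 0 and cscov >= min_cscov)


def crc32c(data: bytes) -> int:
    """Bitwise CRC-32c (Castagnoli), reflected polynomial 0x82F63B78."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ 0x82F63B78 if crc & 1 else crc >> 1
    return crc ^ 0xFFFFFFFF
```

   A sender following this sketch would call crc32c() on the
   application data first, store the result in the Data Checksum
   option, and only then call internet_checksum() over the covered
   bytes, which include that option.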

   A DCCP endpoint receiving a packet with a Data Checksum option
   SHOULD compute the received application data's CRC-32c, using the
   same algorithm as the sender, and compare the result with the Data
   Checksum value. (The endpoint can indicate its willingness to check
   Data Checksums using the Check Data Checksum feature, described
   below.) If the CRCs differ, the endpoint reacts in one of two ways.

   o  The receiving application may have requested delivery of
      known-corrupt data via some optional API. In this case, the
      packet's data MUST be delivered to the application, with a note
      that it is known to be corrupt. Furthermore, the receiving
      endpoint MUST report the packet as delivered corrupt using a Data
      Dropped option (Drop Code 7, Delivered Corrupt).

   o  Otherwise, the receiving endpoint MUST drop the application data,
      and report that data as dropped due to corruption using a Data
      Dropped option (Drop Code 3, Corrupt).

   In either case, the packet is considered acknowledgeable (since its
   header was processed), and will therefore be acknowledged using the
   equivalent of Ack Vector's Received or Received ECN Marked states.

   Although Data Checksum is intended for packets containing
   application data, it may be included on other packets, such as
   DCCP-Ack, DCCP-Sync, and DCCP-SyncAck. The receiver SHOULD
   calculate the application data area's CRC-32c on such packets, just
   as it does for DCCP-Data and similar packets; and if the CRCs
   differ, the packets similarly MUST be reported using Data Dropped
   options (Drop Code 3), although their application data areas would
   not be delivered to the application in any case.

9.3.1. Check Data Checksum Feature

   The Check Data Checksum feature lets a DCCP endpoint determine
   whether its peer will definitely check Data Checksum options.
   DCCP A sends a Mandatory "Change R(Check Data Checksum, 1)" option
   to DCCP B to require it to check Data Checksum options (the
   connection will be reset if it cannot).

   Check Data Checksum has feature number 9, and is server-priority.
   It takes one-byte Boolean values. DCCP B MUST check any received
   Data Checksum options when Check Data Checksum/B is one, although it
   MAY check them even when Check Data Checksum/B is zero. Values of
   two or more are reserved. New connections start with Check Data
   Checksum 0 for both endpoints.

9.3.2. Usage Notes

   Internet links must normally apply strong integrity checks to the
   packets they transmit [RFC 3828] [RFC 3819]. This is the default
   case when the DCCP header's Checksum Coverage value equals zero
   (full coverage). However, the DCCP Checksum Coverage value might
   not be zero. By setting partial Checksum Coverage, the application
   indicates that it can tolerate corruption in the unprotected part of
   the application data. Recognizing this, link layers may reduce
   error detection and/or correction strength when transmitting this
   unprotected part. This, in turn, can significantly increase the
   likelihood of the endpoint receiving corrupt data; Data Checksum
   lets the receiver detect that corruption with very high probability.

10. Congestion Control

   Each congestion control mechanism supported by DCCP is assigned a
   congestion control identifier, or CCID: a number from 0 to 255.
   During connection setup, and optionally thereafter, the endpoints
   negotiate their congestion control mechanisms by negotiating the
   values for their Congestion Control ID features. Congestion Control
   ID has feature number 1. The CCID/A value equals the CCID in use
   for the A-to-B half-connection. DCCP B sends a "Change R(CCID, K)"
   option to ask DCCP A to use CCID K for its data packets.

   CCID is a server-priority feature, so CCID negotiation options can
   list multiple acceptable CCIDs, sorted in descending order of
   priority. For example, the option "Change R(CCID, 2 3 4)" asks the
   receiver to use CCID 2 for its packets, although CCIDs 3 and 4 are
   also acceptable. (This corresponds to the bytes "35, 6, 1, 2, 3,
   4": Change R option (35), option length (6), feature ID (1), CCIDs
   (2, 3, 4).) Similarly, "Confirm L(CCID, 2, 2 3 4)" tells the
   receiver that the sender is using CCID 2 for its packets, but that
   CCIDs 3 and 4 might also be acceptable.

   Currently allocated CCIDs are as follows.

      CCID   Meaning                        Reference
      ----   -------                        ---------
      0-1    Reserved
      2      TCP-like Congestion Control    [RFC TBA]
      3      TFRC Congestion Control        [RFC TBA]
      4-255  Reserved

            Table 5: DCCP Congestion Control Identifiers

   New connections start with CCID 2 for both endpoints. If this is
   unacceptable for a DCCP endpoint, that endpoint MUST send Mandatory
   Change(CCID) options on its first packets.

   All CCIDs standardized for use with DCCP will correspond to
   congestion control mechanisms previously standardized by the IETF.
   We expect that for quite some time, all such mechanisms will be
   TCP-friendly, but TCP-friendliness is not an explicit DCCP
   requirement.

   A DCCP implementation intended for general use, such as an
   implementation in a general-purpose operating system kernel, SHOULD
   implement at least CCID 2. The intent is to make CCID 2 broadly
   available for interoperability, although particular applications
   might disallow its use.

10.1. TCP-like Congestion Control

   CCID 2, TCP-like Congestion Control, denotes Additive Increase,
   Multiplicative Decrease (AIMD) congestion control with behavior
   modelled directly on TCP, including congestion window, slow start,
   timeouts, and so forth [RFC 2581].
   CCID 2 achieves maximum bandwidth over the long term, consistent
   with the use of end-to-end congestion control, but halves its
   congestion window in response to each congestion event. This leads
   to the abrupt rate changes typical of TCP. Applications should use
   CCID 2 if they prefer maximum bandwidth utilization to steadiness of
   rate. This is often the case for applications that are not playing
   their data directly to the user. For example, a hypothetical
   application that transferred files over DCCP, using
   application-level retransmissions for lost packets, would prefer
   CCID 2 to CCID 3. On-line games may also prefer CCID 2.

   CCID 2 is further described in [CCID 2 PROFILE].

10.2. TFRC Congestion Control

   CCID 3 denotes TCP-Friendly Rate Control (TFRC), an equation-based
   rate-controlled congestion control mechanism. TFRC is designed to
   be reasonably fair when competing for bandwidth with TCP-like flows,
   where a flow is "reasonably fair" if its sending rate is generally
   within a factor of two of the sending rate of a TCP flow under the
   same conditions. However, TFRC has a much lower variation of
   throughput over time compared with TCP, which makes CCID 3 more
   suitable than CCID 2 for applications such as streaming media where
   a relatively smooth sending rate is of importance.

   CCID 3 is further described in [CCID 3 PROFILE]. The TFRC
   congestion control algorithms were initially described in
   [RFC 3448].

10.3. CCID-Specific Options, Features, and Reset Codes

   Half of the option types, feature numbers, and Reset Codes are
   reserved for CCID-specific use. CCIDs may often need new options,
   for communicating acknowledgement or rate information, for example;
   reserved option spaces let CCIDs create options at will without
   polluting the global option space.
   Option 128 might have different meanings on a half-connection using
   CCID 4 and a half-connection using CCID 8. CCID-specific options
   and features will never conflict with global options and features
   introduced by later versions of this specification.

   Any packet may contain information meant for either half-connection,
   so CCID-specific option types, feature numbers, and Reset Codes
   explicitly signal the half-connection to which they apply.

   o  Option numbers 128 through 191 are for options sent from the
      HC-Sender to the HC-Receiver; option numbers 192 through 255 are
      for options sent from the HC-Receiver to the HC-Sender.

   o  Reset Codes 128 through 191 indicate that the HC-Sender reset the
      connection (most likely because of some problem with
      acknowledgements sent by the HC-Receiver); Reset Codes 192
      through 255 indicate that the HC-Receiver reset the connection
      (most likely because of some problem with data packets sent by
      the HC-Sender).

   o  Finally, feature numbers 128 through 191 are used for features
      located at the HC-Sender; feature numbers 192 through 255 are for
      features located at the HC-Receiver. Since Change L and
      Confirm L options for a feature are sent by the feature location,
      we know that any Change L(128) option was sent by the HC-Sender,
      while any Change L(192) option was sent by the HC-Receiver.
      Similarly, Change R(128) options are sent by the HC-Receiver,
      while Change R(192) options are sent by the HC-Sender.

   For example, consider a DCCP connection where the A-to-B
   half-connection uses CCID 4 and the B-to-A half-connection uses
   CCID 5. Here is how a sampling of CCID-specific options are
   assigned to half-connections.

                                    Relevant    Relevant
      Packet  Option                Half-conn.  CCID
      ------  ------                ----------  ----
      A > B   128                   A-to-B      4
      A > B   192                   B-to-A      5
      A > B   Change L(128, ...)
                                    A-to-B      4
      A > B   Change R(192, ...)    A-to-B      4
      A > B   Confirm L(128, ...)   A-to-B      4
      A > B   Confirm R(192, ...)   A-to-B      4
      A > B   Change R(128, ...)    B-to-A      5
      A > B   Change L(192, ...)    B-to-A      5
      A > B   Confirm R(128, ...)   B-to-A      5
      A > B   Confirm L(192, ...)   B-to-A      5

      B > A   128                   B-to-A      5
      B > A   192                   A-to-B      4
      B > A   Change L(128, ...)    B-to-A      5
      B > A   Change R(192, ...)    B-to-A      5
      B > A   Confirm L(128, ...)   B-to-A      5
      B > A   Confirm R(192, ...)   B-to-A      5
      B > A   Change R(128, ...)    A-to-B      4
      B > A   Change L(192, ...)    A-to-B      4
      B > A   Confirm R(128, ...)   A-to-B      4
      B > A   Confirm L(192, ...)   A-to-B      4

   Using CCID-specific options and feature options during a negotiation
   for that CCID feature is NOT RECOMMENDED, since it is difficult to
   predict the CCID that will be in force when the option is processed.
   For example, if a DCCP-Request contains the option sequence
   "Change L(CCID, 3), 128", the CCID-specific option "128" may be
   processed either by CCID 3 (if the server supports CCID 3) or by the
   default CCID 2 (if it does not). However, it is safe to include
   CCID-specific options following certain Mandatory Change(CCID)
   options. For example, if a DCCP-Request contains the option
   sequence "Mandatory, Change L(CCID, 3), 128", then either the "128"
   option will be processed by CCID 3 or the connection will be reset.

   Servers that do not implement the default CCID 2 might nevertheless
   receive CCID 2-specific options on a DCCP-Request packet. (Such a
   server MUST send Mandatory Change(CCID) options on its
   DCCP-Response, so CCID-specific options on any other packet won't
   refer to CCID 2.) The server MUST treat such options as
   non-understood. Thus, it will reset the connection on encountering
   a Mandatory CCID-specific option, send an empty Confirm for a
   non-Mandatory Change option for a CCID-specific feature, and ignore
   other options.
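
   The half-connection assignments in the table of this section follow
   mechanically from the three numbering rules. The following
   non-normative Python sketch shows one way to derive them. The
   function name and the "option", "L", and "R" kind strings are
   hypothetical and are not defined by this specification; "L" stands
   for Change L or Confirm L, and "R" for Change R or Confirm R.

```python
def relevant_half_connection(packet_from: str, number: int,
                             kind: str = "option") -> str:
    """Return "A-to-B" or "B-to-A" for a CCID-specific option or feature.

    packet_from: endpoint that sent the packet, "A" or "B".
    number:      option type or feature number, 128 through 255.
    kind:        "option" for a plain option, "L" for Change L / Confirm L,
                 "R" for Change R / Confirm R (hypothetical labels).
    """
    if packet_from not in ("A", "B"):
        raise ValueError("packet_from must be 'A' or 'B'")
    if not 128 <= number <= 255:
        raise ValueError("not a CCID-specific number")
    peer = "B" if packet_from == "A" else "A"
    low_half = number <= 191  # 128-191: sent by / located at the HC-Sender

    if kind == "option":
        # 128-191 travel HC-Sender -> HC-Receiver, 192-255 the other way.
        hc_sender = packet_from if low_half else peer
    elif kind in ("L", "R"):
        # "L" options are sent by the feature location, "R" by its peer.
        location = packet_from if kind == "L" else peer
        other = peer if kind == "L" else packet_from
        hc_sender = location if low_half else other
    else:
        raise ValueError("kind must be 'option', 'L', or 'R'")

    hc_receiver = "B" if hc_sender == "A" else "A"
    return f"{hc_sender}-to-{hc_receiver}"


# CCIDs from the example in the text: A-to-B uses CCID 4, B-to-A uses CCID 5.
EXAMPLE_CCIDS = {"A-to-B": 4, "B-to-A": 5}
```

   For instance, relevant_half_connection("A", 192, "R") yields
   "A-to-B", so with the example assignment that option would be
   interpreted by CCID 4.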

10.4. CCID Profile Requirements

   Each CCID Profile document MUST address at least the following
   requirements:

   o  The profile MUST include the name and number of the CCID being
      described.

   o  The profile MUST describe the conditions in which it is likely to
      be useful. Often the best way to do this is by comparison to
      existing CCIDs.

   o  The profile MUST list and describe any CCID-specific options,
      features, and Reset Codes, and SHOULD list those general options
      and features described in this document that are especially
      relevant to the CCID.

   o  Any newly defined acknowledgement mechanism MUST include a way to
      transmit ECN Nonce Echoes back to the sender.

   o  The profile MUST describe the format of data packets, including
      any options that should be included and the setting of the CCval
      header field.

   o  The profile MUST describe the format of acknowledgement packets,
      including any options that should be included.

   o  The profile MUST define how data packets are congestion
      controlled. This includes responses to congestion events, idle
      and application-limited periods, and responses to the DCCP Data
      Dropped and Slow Receiver options. CCIDs that implement
      per-packet congestion control SHOULD discuss how packet size is
      factored in to congestion control decisions.

   o  The profile MUST specify when acknowledgement packets are
      generated, and how they are congestion controlled.

   o  The profile MUST define when a sender using the CCID is
      considered quiescent.

   o  The profile MUST say whether its CCID's acknowledgements ever
      need to be acknowledged, and if so, how often.

10.5. Congestion State

   Most congestion control algorithms depend on past history to
   determine the current allowed sending rate.
   In CCID 2, this congestion state includes a congestion window and a
   measurement of the number of packets outstanding in the network; in
   CCID 3, it includes the lengths of recent loss intervals; and both
   CCIDs use an estimate of the round-trip time. Congestion state
   depends on the network path, and is invalidated by path changes.
   Therefore, DCCP senders and receivers SHOULD reset their congestion
   state -- essentially restarting congestion control from "slow start"
   or equivalent -- on significant changes in end-to-end path. For
   example, an endpoint that sends or receives a Mobile IPv6 Binding
   Update message [RFC 3775] SHOULD reset its congestion state for any
   corresponding DCCP connections.

11. Acknowledgements

   Congestion control requires receivers to transmit information about
   packet losses and ECN marks to senders. DCCP receivers MUST report
   all congestion they see, as defined by the relevant CCID profile.
   Each CCID says when acknowledgements should be sent, what options
   they must use, and so on. DCCP acknowledgements are congestion
   controlled, although it is not required that the acknowledgement
   stream be more than very roughly TCP-friendly; each CCID defines how
   acknowledgements are congestion controlled.

   Most acknowledgements use DCCP options. For example, on a
   half-connection with CCID 2 (TCP-like), the receiver reports
   acknowledgement information using the Ack Vector option. This
   section describes common acknowledgement options and shows how acks
   using those options will commonly work. Full descriptions of the
   ack mechanisms used for each CCID are laid out in the CCID profile
   specifications.

   Acknowledgement options, such as Ack Vector, generally depend on the
   DCCP Acknowledgement Number, and are thus only allowed on packet
   types that carry that number (all packets except DCCP-Request and
   DCCP-Data).
   Detailed acknowledgement options are not necessarily required on
   every packet that carries an Acknowledgement Number, however.

11.1. Acks of Acks and Unidirectional Connections

   DCCP was designed to work well for both bidirectional and
   unidirectional flows of data, and for connections that transition
   between these states. However, acknowledgements required for a
   unidirectional connection are very different from those required for
   a bidirectional connection. In particular, unidirectional
   connections need to worry about acks of acks.

   The ack-of-acks problem arises because some acknowledgement
   mechanisms are reliable. For example, an HC-Receiver using CCID 2,
   TCP-like Congestion Control, sends Ack Vectors containing completely
   reliable acknowledgement information. The HC-Sender should
   occasionally inform the HC-Receiver that it has received an ack. If
   it did not, the HC-Receiver might resend complete Ack Vector
   information, going back to the start of the connection, with every
   DCCP-Ack packet! However, note that acks-of-acks need not be
   reliable themselves: when an ack-of-acks is lost, the HC-Receiver
   will simply maintain, and periodically retransmit, old
   acknowledgement-related state for a little longer. Therefore, there
   is no need for acks-of-acks-of-acks.

   When communication is bidirectional, any required acks-of-acks are
   automatically contained in normal acknowledgements for data packets.
   On a unidirectional connection, however, the receiver DCCP sends no
   data, so the sender would not normally send acknowledgements.
   Therefore, the CCID in force on that half-connection must explicitly
   say whether, when, and how the HC-Sender should generate
   acks-of-acks.

   For example, consider a bidirectional connection where both
   half-connections use the same CCID (either 2 or 3), and where DCCP B
   goes "quiescent".
   This means that the connection becomes unidirectional: DCCP B stops
   sending data, and only sends DCCP-Ack packets to DCCP A. For
   example, in CCID 2, TCP-like Congestion Control, DCCP B uses Ack
   Vector to reliably communicate which packets it has received. As
   described above, DCCP A must occasionally acknowledge a pure
   acknowledgement from DCCP B, so that B can free old Ack Vector
   state. For instance, A might send a DCCP-DataAck packet every now
   and then, instead of DCCP-Data. In contrast, in CCID 3, TFRC
   Congestion Control, DCCP B's acknowledgements generally need not be
   reliable, since they contain cumulative loss rates; TFRC works even
   if every DCCP-Ack is lost. Therefore, DCCP A need never acknowledge
   an acknowledgement.

   When communication is unidirectional, a single CCID -- in the
   example, the A-to-B CCID -- controls both DCCPs' acknowledgements,
   in terms of their content, their frequency, and so forth. For
   bidirectional connections, the A-to-B CCID governs DCCP B's
   acknowledgements (including its acks of DCCP A's acks), while the
   B-to-A CCID governs DCCP A's acknowledgements.

   DCCP A switches its ack pattern from bidirectional to unidirectional
   when it notices that DCCP B has gone quiescent. It switches from
   unidirectional to bidirectional when it must acknowledge even a
   single DCCP-Data or DCCP-DataAck packet from DCCP B.

   Each CCID defines how to detect quiescence on that CCID, and how
   that CCID handles acks-of-acks on unidirectional connections. The
   B-to-A CCID defines when DCCP B has gone quiescent. Usually, this
   happens when a period has passed without B sending any data packets;
   in CCID 2, for example, this period is the maximum of 0.2 seconds
   and two round-trip times. The A-to-B CCID defines how DCCP A
   handles acks-of-acks once DCCP B has gone quiescent.

11.2.
Ack Piggybacking

   Acknowledgements of A-to-B data MAY be piggybacked on data sent by
   DCCP B, as long as that does not delay the acknowledgement longer
   than the A-to-B CCID would find acceptable. However, data
   acknowledgements often require more than 4 bytes to express. A
   large set of acknowledgements prepended to a large data packet might
   exceed the allowed maximum packet size. In this case, DCCP B SHOULD
   send separate DCCP-Data and DCCP-Ack packets, or wait, but not too
   long, for a smaller datagram.

   Piggybacking is particularly common at DCCP A when the B-to-A
   half-connection is quiescent -- that is, when DCCP A is just
   acknowledging DCCP B's acknowledgements. There are three reasons to
   acknowledge DCCP B's acknowledgements: to allow DCCP B to free up
   information about previously acknowledged data packets from A; to
   shrink the size of future acknowledgements; and to manipulate the
   rate at which future acknowledgements are sent. Since these are
   secondary concerns, DCCP A can generally afford to wait indefinitely
   for a data packet to piggyback its acknowledgement onto; if DCCP B
   wants to elicit an acknowledgement, it can send a DCCP-Sync.

   Any restrictions on ack piggybacking are described in the relevant
   CCID's profile.

11.3. Ack Ratio Feature

   The Ack Ratio feature lets HC-Senders influence the rate at which
   HC-Receivers generate DCCP-Ack packets, thus controlling
   reverse-path congestion. This differs from TCP, which presently has
   no congestion control for pure acknowledgement traffic. Ack Ratio
   reverse-path congestion control does not try to be TCP-friendly. It
   just tries to avoid congestion collapse, and to be somewhat better
   than TCP in the presence of a high packet loss or mark rate on the
   reverse path.

   Ack Ratio applies to CCIDs whose HC-Receivers clock acknowledgements
   off the receipt of data packets.
   The value of Ack Ratio/A equals the rough ratio of data packets sent
   by DCCP A to DCCP-Ack packets sent by DCCP B. Higher Ack Ratios
   correspond to lower DCCP-Ack rates; the sender raises Ack Ratio when
   the reverse path is congested and lowers Ack Ratio when it is not.
   Each CCID profile defines how it controls congestion on the
   acknowledgement path, and, particularly, whether Ack Ratio is used.
   CCID 2, for example, uses Ack Ratio for acknowledgement congestion
   control, but CCID 3 does not. However, each Ack Ratio feature has a
   value whether or not that value is used by the relevant CCID.

   Ack Ratio has feature number 5, and is non-negotiable. It takes
   two-byte integer values. An Ack Ratio/A value of four means that
   DCCP B will send at least one acknowledgement packet for every four
   data packets sent by DCCP A. DCCP A sends a "Change L(Ack Ratio)"
   option to notify DCCP B of its ack ratio. An Ack Ratio value of
   zero indicates that the relevant half-connection does not use an Ack
   Ratio to control its acknowledgement rate. New connections start
   with Ack Ratio 2 for both endpoints; this Ack Ratio results in
   acknowledgement behavior analogous to TCP's delayed acks.

   Ack Ratio should be treated as a guideline rather than a strict
   requirement. We intend Ack Ratio-controlled acknowledgement
   behavior to resemble TCP's acknowledgement behavior when there is no
   reverse-path congestion, and to be somewhat more conservative when
   there is reverse-path congestion. Following this intent is more
   important than implementing Ack Ratio precisely. In particular:

   o  Receivers MAY piggyback acknowledgement information on data
      packets, creating DCCP-DataAck packets. The Ack Ratio does not
      apply to piggybacked acknowledgements.
      However, if the data packets are too big to carry acknowledgement
      information, or the data sending rate is lower than Ack Ratio
      would suggest, then DCCP B SHOULD send enough pure DCCP-Ack
      packets to maintain the rate of one acknowledgement per Ack Ratio
      received data packets.

   o  Receivers MAY rate-pace their acknowledgements, rather than
      sending acknowledgements immediately upon the receipt of data
      packets. Receivers that rate-pace acknowledgements SHOULD pick a
      rate that approximates the effect of Ack Ratio, and SHOULD
      include Elapsed Time options (Section 13.2) to help the sender
      calculate round-trip times.

   o  Receivers SHOULD implement delayed acknowledgement timers like
      TCP's, whereby any packet's acknowledgement is delayed by at most
      T seconds. This delay lets the receiver collect additional
      packets to acknowledge, and thus reduce the per-packet overhead
      of acknowledgements; but if T seconds have passed by and the ack
      is still around, it is sent out right away. The default value of
      T should be 0.2 seconds, as is common in TCP implementations.
      This may lead to sending more acknowledgement packets than Ack
      Ratio would suggest.

   o  Receivers SHOULD send acknowledgements immediately on receiving
      packets marked ECN Congestion Experienced, or packets whose
      out-of-order sequence numbers potentially indicate loss.
      However, there is no need to send such immediate acknowledgements
      for marked packets more than once per round-trip time.

   o  Receivers MAY ignore Ack Ratio if they perform their own
      congestion control on acknowledgements. For example, a receiver
      that knows the loss and mark rate for its DCCP-Ack packets might
      maintain a TCP-friendly acknowledgement rate on its own.
      Such a receiver MUST either ensure that it always obtains
      sufficient acknowledgement loss and mark information, or fall
      back to Ack Ratio when sufficient information is not available,
      as might happen during periods when the receiver is quiescent.

11.4. Ack Vector Options

   The Ack Vector gives a run-length encoded history of data packets
   received at the client. Each byte of the vector gives the state of
   that data packet in the loss history, and the number of preceding
   packets with the same state. The option's data looks like this:

      +--------+--------+--------+--------+--------+--------
      |0010011?| Length |SSLLLLLL|SSLLLLLL|SSLLLLLL|  ...
      +--------+--------+--------+--------+--------+--------
      Type=38/39         \___________ Vector ___________...

   The two Ack Vector options (option types 38 and 39) differ only in
   the values they imply for ECN Nonce Echo. Section 12.2 describes
   this further.

   The vector itself consists of a series of bytes, each of whose
   encoding is:

       0 1 2 3 4 5 6 7
      +-+-+-+-+-+-+-+-+
      |Sta| Run Length|
      +-+-+-+-+-+-+-+-+

   Sta[te] occupies the most significant two bits of each byte, and can
   have one of four values, as follows.

      State  Meaning
      -----  -------
      0      Received
      1      Received ECN Marked
      2      Reserved
      3      Not Yet Received

           Table 6: DCCP Ack Vector States

   The term "ECN marked" refers to packets with ECN code point 11, CE
   (Congestion Experienced); packets received with this ECN code point
   MUST be reported using State 1, Received ECN Marked. Packets
   received with other ECN code points 00, 01, or 10 (Non-ECT, ECT(0),
   or ECT(1), respectively) MUST be reported using State 0, Received.

   Run Length, the least significant six bits of each byte, specifies
   how many consecutive packets have the given State.
   Run Length zero says the corresponding State applies to one packet
   only; Run Length 63 says it applies to 64 consecutive packets. Run
   lengths of 65 or more must be encoded in multiple bytes.

   The first byte in the first Ack Vector option refers to the packet
   indicated in the Acknowledgement Number; subsequent bytes refer to
   older packets. (Ack Vector MUST NOT be sent on DCCP-Data and
   DCCP-Request packets, which lack an Acknowledgement Number.) An Ack
   Vector containing the decimal values 0,192,3,64,5, and for which the
   Acknowledgement Number is decimal 100, indicates that:

      Packet 100 was received (Acknowledgement Number 100, State 0,
      Run Length 0).

      Packet 99 was lost (State 3, Run Length 0).

      Packets 98, 97, 96 and 95 were received (State 0, Run Length 3).

      Packet 94 was ECN marked (State 1, Run Length 0).

      Packets 93, 92, 91, 90, 89, and 88 were received (State 0, Run
      Length 5).

   A single Ack Vector option can acknowledge up to 16192 data packets.
   Should more packets need to be acknowledged than can fit in 253
   bytes of Ack Vector, then multiple Ack Vector options can be sent;
   the second Ack Vector begins where the first left off, and so forth.

   Ack Vector states are subject to two general constraints. (These
   principles SHOULD also be followed for other acknowledgement
   mechanisms; referring to Ack Vector states simplifies their
   explanation.)

   1. Packets reported as State 0 or State 1 MUST be acknowledgeable:
      their options have been processed by the receiving DCCP stack.
      Any data on the packet need not have been delivered to the
      receiving application; in fact, the data may have been dropped.

   2. Packets reported as State 3 MUST NOT be acknowledgeable.
      Feature negotiations and options on such packets MUST NOT have
      been processed, and the Acknowledgement Number MUST NOT
      correspond to such a packet.
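
   The vector encoding and the worked example earlier in this section
   can be checked with a few lines of Python. This is a non-normative
   sketch: the function name decode_ack_vector is hypothetical, the
   input is assumed to be the concatenated vector bytes of all Ack
   Vector options on one packet, and sequence numbers are assumed to
   wrap modulo 2^48.

```python
from typing import List, Tuple

SEQ_MODULUS = 1 << 48  # assumption: DCCP sequence numbers are 48 bits wide

STATE_NAMES = {
    0: "Received",
    1: "Received ECN Marked",
    2: "Reserved",
    3: "Not Yet Received",
}


def decode_ack_vector(ack_number: int, vector: bytes) -> List[Tuple[int, int]]:
    """Expand run-length encoded vector bytes into (sequence, state) pairs.

    The first byte describes the packet named by the Acknowledgement
    Number; each later byte describes progressively older packets.
    """
    result = []
    seq = ack_number
    for byte in vector:
        state = byte >> 6          # most significant two bits
        run_length = byte & 0x3F   # least significant six bits
        for _ in range(run_length + 1):  # Run Length 0 means one packet
            result.append((seq, state))
            seq = (seq - 1) % SEQ_MODULUS
    return result


# The example from the text: values 0,192,3,64,5 with Acknowledgement
# Number 100.
for seq, state in decode_ack_vector(100, bytes([0, 192, 3, 64, 5])):
    print(seq, STATE_NAMES[state])
```

   Run on the example, the sketch reports packet 100 as Received,
   packet 99 as Not Yet Received, packets 98 through 95 as Received,
   packet 94 as Received ECN Marked, and packets 93 through 88 as
   Received, matching the list in the text.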

   Packets dropped in the application's receive buffer MUST be reported
   as Received or Received ECN Marked (States 0 and 1), depending on
   their ECN state; such packets' ECN Nonces MUST be included in the
   Nonce Echo. The Data Dropped option informs the sender that some
   packets reported as received actually had their application data
   dropped.

   One or more Ack Vector options that, together, report the status of
   a packet with sequence number less than ISN, the initial sequence
   number, SHOULD be considered invalid. The receiving DCCP SHOULD
   either ignore the options or reset the connection with Reset Code 5,
   "Option Error". No Ack Vector option can refer to a packet that has
   not yet been sent, as the Acknowledgement Number checks in Section
   7.5.3 ensure, but because of attack, implementation bug, or
   misbehavior, an Ack Vector option can claim that a packet was
   received before it is actually delivered; Section 12.2 describes how
   this is detected and how senders should react. Packets that haven't
   been included in any Ack Vector option SHOULD be treated as "not yet
   received" (State 3) by the sender.

   Appendix A provides a non-normative description of the details of
   DCCP acknowledgement handling, in the context of an abstract Ack
   Vector implementation.

11.4.1. Ack Vector Consistency

   A DCCP sender will commonly receive multiple acknowledgements for
   some of its data packets. For instance, an HC-Sender might receive
   two DCCP-Acks with Ack Vectors, both of which contained information
   about sequence number 24. (Information about a sequence number is
   generally repeated in every ack until the HC-Sender acknowledges an
   ack. In this case, perhaps the HC-Receiver is sending acks faster
   than the HC-Sender is acknowledging them.) In a perfect world, the
   two Ack Vectors would always be consistent.
   However, there are many reasons why they might not be. For example:

   o  The HC-Receiver received packet 24 between sending its acks, so
      the first ack said 24 was not received (State 3) and the second
      said it was received or ECN marked (State 0 or 1).

   o  The HC-Receiver received packet 24 between sending its acks, and
      the network reordered the acks. In this case, the packet will
      appear to transition from State 0 or 1 to State 3.

   o  The network duplicated packet 24, and one of the duplicates was
      ECN marked. This might show up as a transition between States 0
      and 1.

   To cope with these situations, HC-Sender DCCP implementations SHOULD
   combine multiple received Ack Vector states according to this table:

                             Received State
                               0   1   3
                             +---+---+---+
                           0 | 0 |0/1| 0 |
                     Old     +---+---+---+
                           1 | 1 | 1 | 1 |
                    State    +---+---+---+
                           3 | 0 | 1 | 3 |
                             +---+---+---+

   To read the table, choose the row corresponding to the packet's old
   state and the column corresponding to the packet's state in the
   newly received Ack Vector, then read the packet's new state off the
   table. For an old state of 0 (received non-marked) and received
   state of 1 (received ECN marked), the packet's new state may be set
   to either 0 or 1. The HC-Sender implementation will be indifferent
   to ack reordering if it chooses new state 1 for that cell.

   The HC-Receiver should collect information about received packets,
   which it will eventually report to the HC-Sender on one or more
   acknowledgements, according to the following table:

                            Received Packet
                               0   1   3
                             +---+---+---+
                           0 | 0 |0/1| 0 |
                   Stored    +---+---+---+
                           1 |0/1| 1 | 1 |
                    State    +---+---+---+
                           3 | 0 | 1 | 3 |
                             +---+---+---+

   This table equals the sender's table, except that when the stored
   state is 1 and the received state is 0, the receiver is allowed to
   switch its stored state to 0.
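   As a non-normative illustration, the sender's combination table can
   be written as a lookup in Python. The names SENDER_TABLE and
   combine_sender_state are hypothetical, and the sketch assumes the
   implementation picks new state 1 for the "0/1" cell, which is the
   choice the text describes as indifferent to ack reordering.

```python
# Keys are (old_state, received_state); values are the new state.
# The (0, 1) cell may be 0 or 1; this sketch assumes 1.
SENDER_TABLE = {
    (0, 0): 0, (0, 1): 1, (0, 3): 0,
    (1, 0): 1, (1, 1): 1, (1, 3): 1,
    (3, 0): 0, (3, 1): 1, (3, 3): 3,
}


def combine_sender_state(old_state: int, received_state: int) -> int:
    """Return the HC-Sender's new state for one packet (hypothetical helper)."""
    if (old_state, received_state) not in SENDER_TABLE:
        # State 2 is reserved and has no entry in the table.
        raise ValueError("reserved or unknown Ack Vector state")
    return SENDER_TABLE[(old_state, received_state)]


# A "lost" packet that later shows up moves from State 3 to State 0.
late_arrival = combine_sender_state(3, 0)

# With the (0, 1) cell set to 1, the order in which acks arrive does not matter.
in_order = combine_sender_state(combine_sender_state(3, 0), 1)
reordered = combine_sender_state(combine_sender_state(3, 1), 0)
```

   Under this choice, in_order and reordered are both 1, so a
   reordered pair of acks leaves the sender in the same state.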

   An HC-Sender MAY choose to throw away old information gleaned from
   the HC-Receiver's Ack Vectors, in which case it MUST ignore newly
   received acknowledgements from the HC-Receiver for those old
   packets. It is often kinder to save recent Ack Vector information
   for a while, so that the HC-Sender can undo its reaction to presumed
   congestion when a "lost" packet unexpectedly shows up (the
   transition from State 3 to State 0).

11.4.2. Ack Vector Coverage

   We can divide the packets that have been sent from an HC-Sender to
   an HC-Receiver into four roughly contiguous groups. From oldest to
   youngest, these are:

   1. Packets already acknowledged by the HC-Receiver, where the
      HC-Receiver knows that the HC-Sender has definitely received the
      acknowledgements.

   2. Packets already acknowledged by the HC-Receiver, where the
      HC-Receiver cannot be sure that the HC-Sender has received the
      acknowledgements.

   3. Packets not yet acknowledged by the HC-Receiver.

   4. Packets not yet received by the HC-Receiver.

   The union of groups 2 and 3 is called the Acknowledgement Window.
   Generally, every Ack Vector generated by the HC-Receiver will cover
   the whole Acknowledgement Window: Ack Vector acknowledgements are
   cumulative. (This simplifies Ack Vector maintenance at the
   HC-Receiver; see Appendix A, below.) As packets are received, this
   window both grows on the right and shrinks on the left. It grows
   because there are more packets, and shrinks because the data
   packets' Acknowledgement Numbers will acknowledge previous
   acknowledgements, moving packets from group 2 into group 1.

11.5. Send Ack Vector Feature

   The Send Ack Vector feature lets DCCPs negotiate whether they should
   use Ack Vector options to report congestion.
   Ack Vector provides detailed loss information, and lets senders
   report back to their applications whether particular packets were
   dropped. Send Ack Vector is mandatory for some CCIDs, and optional
   for others.

   Send Ack Vector has feature number 6, and is server-priority. It
   takes one-byte Boolean values. DCCP A MUST send Ack Vector options
   on its acknowledgements when Send Ack Vector/A has value one,
   although it MAY send Ack Vector options even when Send Ack Vector/A
   is zero. Values of two or more are reserved. New connections start
   with Send Ack Vector 0 for both endpoints. DCCP B sends a
   "Change R(Send Ack Vector, 1)" option to DCCP A to ask A to send Ack
   Vector options as part of its acknowledgement traffic.

11.6. Slow Receiver Option

   An HC-Receiver sends the Slow Receiver option to its sender to
   indicate that it is having trouble keeping up with the sender's
   data. The HC-Sender SHOULD NOT increase its sending rate for
   approximately one round-trip time after seeing a packet with a Slow
   Receiver option. After one round-trip time, the effect of Slow
   Receiver disappears and the HC-Sender may again increase its rate,
   so the HC-Receiver SHOULD continue to send Slow Receiver options if
   it needs to prevent the HC-Sender from going faster in the long
   term. The Slow Receiver option does not indicate congestion, and
   the HC-Sender need not reduce its sending rate. (If necessary, the
   receiver can force the sender to slow down by dropping packets, with
   or without Data Dropped, or reporting false ECN marks.) APIs should
   let receiver applications set Slow Receiver, and sending
   applications determine whether or not their receivers are Slow.

   Slow Receiver is a one-byte option.

   +--------+
   |00000010|
   +--------+
    Type=2

   Slow Receiver does not specify why the receiver is having trouble
   keeping up with the sender.
   Possible reasons include lack of buffer space, CPU overload, and
   application quotas. A sending application might react to Slow
   Receiver by reducing its sending rate, for example.

   The sending application should not react to Slow Receiver by sending
   more data, however. The optimal response to a CPU-bound receiver
   might be to increase the sending rate, by switching to a
   less-compressed sending format, since a highly-compressed data
   format might overwhelm a slow CPU more seriously than the higher
   memory requirements of a less-compressed data format. This kind of
   format change should be requested at the application level, not via
   the Slow Receiver option.

   Slow Receiver implements a portion of TCP's receive window
   functionality.

11.7. Data Dropped Option

   The Data Dropped option indicates that the application data on one
   or more received packets did not actually reach the application.
   Data Dropped additionally reports why the data was dropped: perhaps
   the data was corrupt, or perhaps the receiver cannot keep up with
   the sender's current rate and the data was dropped in some receive
   buffer. Using Data Dropped, DCCP endpoints can discriminate between
   different kinds of loss; this differs from TCP, in which all loss is
   reported the same way.

   Unless explicitly specified otherwise, DCCP congestion control
   mechanisms MUST react as if each Data Dropped packet was marked as
   ECN Congestion Experienced by the network. We intend for Data
   Dropped to enable research into richer congestion responses to
   corrupt and other endpoint-dropped packets, but DCCP CCIDs MUST
   react conservatively to Data Dropped until this behavior is
   standardized. Section 11.7.2, below, describes congestion responses
   for all current Drop Codes.

   If a received packet's application data is dropped for one of the
   reasons listed below, this SHOULD be reported using a Data Dropped
   option. Alternatively, the receiver MAY choose to report as
   "received" only those packets whose data were not dropped, subject
   to the constraint that packets not reported as received MUST NOT
   have had their options processed.

   The option's data looks like this:

   +--------+--------+--------+--------+--------+--------
   |00101000| Length | Block  | Block  | Block  |  ...
   +--------+--------+--------+--------+--------+--------
    Type=40          \___________ Vector ___________ ...

   The Vector consists of a series of bytes, called Blocks, each of
   whose encoding corresponds to one of two choices:

    0 1 2 3 4 5 6 7                 0 1 2 3 4 5 6 7
   +-+-+-+-+-+-+-+-+               +-+-+-+-+-+-+-+-+
   |0| Run Length  |       or      |1|DrpCd|Run Len|
   +-+-+-+-+-+-+-+-+               +-+-+-+-+-+-+-+-+
     Normal Block                     Drop Block

   The first byte in the first Data Dropped option refers to the packet
   indicated in the Acknowledgement Number; subsequent bytes refer to
   older packets. (Data Dropped MUST NOT be sent on DCCP-Data or
   DCCP-Request packets, which lack an Acknowledgement Number, and any
   Data Dropped options received on these packet types MUST be
   ignored.) Normal Blocks, which have high bit 0, indicate that any
   received packets in the Run Length had their data delivered to the
   application. Drop Blocks, which have high bit 1, indicate that
   received packets in the Run Len[gth] were not delivered as usual.
   The 3-bit Drop Code [DrpCd] field says what happened; generally, no
   data from that packet reached the application. Packets reported as
   "not yet received" MUST be included in Normal Blocks; packets not
   covered by any Data Dropped option are treated as if they were in a
   Normal Block. Defined Drop Codes for Drop Blocks are as follows.

                  Drop Code  Meaning
                  ---------  -------
                      0      Protocol Constraints
                      1      Application Not Listening
                      2      Receive Buffer
                      3      Corrupt
                     4-6     Reserved
                      7      Delivered Corrupt

                    Table 7: DCCP Drop Codes

   In more detail:

   0   The packet data was dropped due to protocol constraints.
       For example, the data was included on a DCCP-Request packet,
       but the receiving application does not allow such
       piggybacking; or the data was included on a packet with
       inappropriately low Checksum Coverage.

   1   The packet data was dropped because the application is no
       longer listening. See Section 11.7.2.

   2   The packet data was dropped in a receive buffer, probably
       because of receive buffer overflow. See Section 11.7.2.

   3   The packet data was dropped due to corruption. See Section
       9.3.

   7   The packet data was corrupted, but delivered to the
       application anyway. See Section 9.3.

   For example, assume a packet arrives with Acknowledgement Number
   100, an Ack Vector reporting all packets as received, and a Data
   Dropped option containing the decimal values 0,160,3,162. Then:

      Packet 100 was received (Acknowledgement Number 100, Normal
      Block, Run Length 0).

      Packet 99 was dropped in a receive buffer (Drop Block, Drop Code
      2, Run Length 0).

      Packets 98, 97, 96, and 95 were received (Normal Block, Run
      Length 3).

      Packets 94, 93, and 92 were dropped in the receive buffer (Drop
      Block, Drop Code 2, Run Length 2).

   Run lengths of more than 128 (for Normal Blocks) or 16 (for Drop
   Blocks) must be encoded in multiple Blocks. A single Data Dropped
   option can acknowledge up to 32384 Normal Block data packets,
   although the receiver SHOULD NOT send a Data Dropped option when all
   relevant packets fit into Normal Blocks. Should more packets need
   to be acknowledged than can fit in 253 bytes of Data Dropped, then
   multiple Data Dropped options can be sent.
   The second option will begin where the first left off, and so forth.

   One or more Data Dropped options that, together, report the status
   of more packets than have been sent, or that change the status of a
   packet, or that disagree with Ack Vector or equivalent options (by
   reporting a "not yet received" packet as "dropped in the receive
   buffer", for example), SHOULD be considered invalid. The receiving
   DCCP SHOULD either ignore such options, or respond by resetting the
   connection with Reset Code 5, "Option Error".

   A DCCP application interface should let receiving applications
   specify the Drop Codes corresponding to received packets. For
   example, this would let applications calculate their own checksums,
   but still report "dropped due to corruption" packets via the Data
   Dropped option. The interface SHOULD NOT let applications reduce
   the "seriousness" of a packet's Drop Code; for example, the
   application should not be able to upgrade a packet from delivered
   corrupt (Drop Code 7) to delivered normally (no Drop Code).

   Data Dropped information is transmitted reliably. That is,
   endpoints SHOULD continue to transmit Data Dropped options until
   receiving an acknowledgement indicating that the relevant options
   have been processed. In Ack Vector terms, each acknowledgement
   should contain Data Dropped options that cover the whole
   Acknowledgement Window (Section 11.4.2), although when every packet
   in that window would be placed in a Normal Block no actual option is
   required.

11.7.1. Data Dropped and Normal Congestion Response

   When deciding on a response to a particular acknowledgement or set
   of acknowledgements containing Data Dropped options, a congestion
   control mechanism MUST consider dropped packets and ECN Congestion
   Experienced marks (including marked packets that are included in
   Data Dropped), as well as the packets singled out in Data Dropped.
   For window-based mechanisms, the valid response space is defined as
   follows.

   Assume an old window of W. Independently calculate a new window
   W_new1 that assumes no packets were Data Dropped (so W_new1 contains
   only the normal congestion response), and a new window W_new2 that
   assumes no packets were lost or marked (so W_new2 contains only the
   Data Dropped response). We are assuming that Data Dropped
   recommended a reduction in congestion window, so W_new2 < W.

   Then the actual new window W_new MUST NOT be larger than the minimum
   of W_new1 and W_new2; and the sender MAY combine the two responses,
   by setting

      W_new = W + min(W_new1 - W, 0) + min(W_new2 - W, 0).

   The details of how this is accomplished are specified in CCID
   profile documents. Non-window-based congestion control mechanisms
   MUST behave analogously; again, CCID profiles define how.

11.7.2. Particular Drop Codes

   Drop Code 0, Protocol Constraints, does not indicate any kind of
   congestion, so the sender's CCID SHOULD react to packets with Drop
   Code 0 as if they were received (with or without ECN Congestion
   Experienced marks, as appropriate). However, the sending endpoint
   SHOULD NOT send data until it believes the protocol constraint no
   longer applies.

   Drop Code 1, Application Not Listening, means the application
   running at the endpoint that sent the option is no longer listening
   for data.
   For example, a server might close its receiving half-connection to
   new data after receiving a complete request from the client. This
   would limit the amount of state available at the server for incoming
   data, and thus reduce the potential damage from certain
   denial-of-service attacks. A Data Dropped option containing Drop
   Code 1 SHOULD be sent whenever received data is ignored due to a
   non-listening application. Once an endpoint reports Drop Code 1 for
   a packet, it SHOULD report Drop Code 1 for every succeeding data
   packet on that half-connection; once an endpoint receives a Drop
   Code 1 report, it SHOULD expect that no more data will ever be
   delivered to the other endpoint's application, so it SHOULD NOT send
   more data.

   Drop Code 2, Receive Buffer, indicates congestion inside the
   receiving host. For instance, if a drop-from-tail kernel socket
   buffer is too full to accept a packet's application data, that
   packet should be reported as Drop Code 2. For a drop-from-head or
   more complex socket buffer, the dropped packet should be reported as
   Drop Code 2. DCCP implementations may also provide an API by which
   applications can mark received packets as Drop Code 2, indicating
   that the application ran out of space in its user-level receive
   buffer. (However, it is not generally useful to report packets as
   dropped due to Drop Code 2 after more than a couple round-trip times
   have passed. The HC-Sender may have forgotten its acknowledgement
   state for the packet by that time, so the Data Dropped report will
   have no effect.) Every packet newly acknowledged as Drop Code 2
   SHOULD reduce the sender's instantaneous rate by one packet per
   round-trip time. Each CCID profile defines the CCID-specific
   mechanism by which this is accomplished.
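   To illustrate how Drop Code reports such as Drop Code 2 reach the
   sender, the Block encoding of Section 11.7 can be decoded as in the
   following non-normative Python sketch. The function name
   decode_data_dropped and the use of None for Normal Blocks are
   assumptions made for this illustration; sequence number wraparound
   and multiple Data Dropped options are not handled.

```python
from typing import Iterator, Optional, Tuple


def decode_data_dropped(ack_number: int,
                        blocks: bytes) -> Iterator[Tuple[int, Optional[int]]]:
    """Yield (sequence_number, drop_code) pairs, newest packet first.

    Hypothetical helper: drop_code is None for packets in a Normal Block.
    """
    seq = ack_number
    for block in blocks:
        if block & 0x80:
            # Drop Block: high bit 1, 3-bit Drop Code, 4-bit Run Length.
            drop_code: Optional[int] = (block >> 4) & 0x07
            run_length = block & 0x0F
        else:
            # Normal Block: high bit 0, 7-bit Run Length.
            drop_code = None
            run_length = block & 0x7F
        # Run Length n covers n + 1 consecutive packets.
        for _ in range(run_length + 1):
            yield seq, drop_code
            seq -= 1


# Data Dropped values 0,160,3,162 with Acknowledgement Number 100.
report = list(decode_data_dropped(100, bytes([0, 160, 3, 162])))
receive_buffer_drops = [seq for seq, code in report if code == 2]
```

   A sender's CCID would then apply its Drop Code 2 response to each
   sequence number in receive_buffer_drops that it has not already
   seen reported.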

   Currently, the other Drop Codes, namely Drop Code 3, Corrupt, Drop
   Code 7, Delivered Corrupt, and reserved Drop Codes 4-6, MUST cause
   the relevant CCID to behave as if the relevant packets were ECN
   marked (ECN Congestion Experienced).

12. Explicit Congestion Notification

   The DCCP protocol is fully ECN-aware [RFC 3168]. Each CCID
   specifies how its endpoints respond to ECN marks. Furthermore,
   DCCP, unlike TCP, allows senders to control the rate at which
   acknowledgements are generated (with options like Ack Ratio); since
   acknowledgements are congestion-controlled, they also qualify as
   ECN-Capable Transport.

   A CCID profile describes how that CCID interacts with ECN, both for
   data traffic and pure-acknowledgement traffic. A sender SHOULD set
   ECN-Capable Transport on its packets' IP headers, unless the
   receiver's ECN Incapable feature is on or the relevant CCID
   disallows it.

   The rest of this section describes the ECN Incapable feature and the
   interaction of the ECN Nonce with acknowledgement options such as
   Ack Vector.

12.1. ECN Incapable Feature

   DCCP endpoints are ECN-aware by default, but the ECN Incapable
   feature lets an endpoint reject the use of Explicit Congestion
   Notification. The use of this feature is NOT RECOMMENDED. ECN
   incapability both avoids ECN's possible benefits and prevents
   senders from using the ECN Nonce to check for receiver misbehavior.
   A DCCP stack MAY therefore leave the ECN Incapable feature
   unimplemented, acting as if all connections were ECN capable. It is
   worth noting that the inappropriate firewall interactions that
   dogged TCP's implementation of ECN [RFC 3360] involve TCP header
   bits, not the IP header's ECN bits; we know of no middlebox that
   would block ECN-capable DCCP packets, but allow ECN-incapable DCCP
   packets.

   ECN Incapable has feature number 4, and is server-priority.
   It takes one-byte Boolean values. DCCP A MUST be able to read ECN
   bits from received frames' IP headers when ECN Incapable/A is zero.
   (This is independent of whether it can set ECN bits on sent frames.)
   DCCP A thus sends a "Change L(ECN Incapable, 1)" option to DCCP B to
   inform it that A cannot read ECN bits. If the ECN Incapable/A
   feature is one, then all of DCCP B's packets MUST be sent as ECN
   incapable. New connections start with ECN Incapable 0 (that is, ECN
   capable) for both endpoints. Values of two or more are reserved.

   If a DCCP is not ECN capable, it MUST send Mandatory "Change L(ECN
   Incapable, 1)" options to the other endpoint until acknowledged (by
   "Confirm R(ECN Incapable, 1)") or the connection closes.
   Furthermore, it MUST NOT accept any data until the other endpoint
   sends "Confirm R(ECN Incapable, 1)". It SHOULD send Data Dropped
   options on its acknowledgements, with Drop Code 0 ("protocol
   constraints"), if the other endpoint does send data inappropriately.

12.2. ECN Nonces

   Congestion avoidance will not occur, and the receiver will sometimes
   get its data faster, if the sender isn't told about congestion
   events. Thus, the receiver has some incentive to falsify
   acknowledgement information, reporting that marked or dropped
   packets were actually received unmarked. This problem is more
   serious with DCCP than with TCP, since TCP provides reliable
   transport: it is more difficult with TCP to lie about lost packets
   without breaking the application.

   ECN Nonces are a general mechanism to prevent ECN cheating (or loss
   cheating). Two values for the two-bit ECN header field indicate
   ECN-Capable Transport, 01 and 10. The second code point, 10, is the
   ECN Nonce. In general, a protocol sender chooses between these code
   points randomly on its output packets, remembering the sequence it
   chose.
   The protocol receiver reports, on every acknowledgement, the number
   of ECN Nonces it has received thus far. This is called the ECN
   Nonce Echo. Since ECN marking and packet dropping both destroy the
   ECN Nonce, a receiver that lies about an ECN mark or packet drop has
   a 50% chance of guessing right and avoiding discipline. The sender
   may react punitively to an ECN Nonce mismatch, possibly up to
   dropping the connection. The ECN Nonce Echo field need not be an
   integer; one bit is enough to catch 50% of infractions, and the
   probability of success drops exponentially as more bits are sent
   [RFC 3540].

   In DCCP, the ECN Nonce Echo field is encoded in acknowledgement
   options. For example, the Ack Vector option comes in two forms, Ack
   Vector [Nonce 0] (option 38) and Ack Vector [Nonce 1] (option 39),
   corresponding to the two values for a one-bit ECN Nonce Echo. The
   Nonce Echo for a given Ack Vector equals the one-bit sum
   (exclusive-or, or parity) of ECN nonces for packets reported by that
   Ack Vector as received and not ECN marked. Thus, only packets
   marked as State 0 matter for this calculation (that is, valid
   received packets that were not ECN marked). Every Ack Vector option
   is detailed enough for the sender to determine what the Nonce Echo
   should have been. It can check this calculation against the actual
   Nonce Echo, and complain if there is a mismatch. (The Ack Vector
   could conceivably report every packet's ECN Nonce state, but this
   would severely limit its compressibility without providing much
   extra protection.)

   Each DCCP sender SHOULD set ECN Nonces on its packets, and remember
   which packets had nonces. When a sender detects an ECN Nonce Echo
   mismatch, it behaves as described in the next section. Each DCCP
   receiver MUST calculate and use the correct value for ECN Nonce Echo
   when sending acknowledgement options.
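   The sender-side Nonce Echo check described above can be sketched in
   Python as follows. This is a non-normative illustration under stated
   assumptions: the function names are hypothetical, the per-packet
   Ack Vector states and the sender's remembered nonce bits are assumed
   to be available as dictionaries keyed by sequence number, and a
   nonce bit of 1 is assumed to mean the packet was sent with ECN code
   point 10.

```python
from typing import Dict


def expected_nonce_echo(states: Dict[int, int], sent_nonces: Dict[int, int]) -> int:
    """Exclusive-or of the nonces of all packets reported as State 0."""
    echo = 0
    for seq, state in states.items():
        if state == 0:  # received and not ECN marked
            echo ^= sent_nonces[seq]
    return echo


def nonce_echo_matches(option_type: int, states: Dict[int, int],
                       sent_nonces: Dict[int, int]) -> bool:
    """Check an Ack Vector option (type 38 or 39) against the sender's record."""
    reported_echo = option_type - 38  # 38 is [Nonce 0], 39 is [Nonce 1]
    return reported_echo == expected_nonce_echo(states, sent_nonces)


# Hypothetical data: packet 99 was not received, packet 97 was ECN marked.
states = {100: 0, 99: 3, 98: 0, 97: 1}
sent_nonces = {100: 1, 99: 1, 98: 0, 97: 1}
honest = nonce_echo_matches(39, states, sent_nonces)
mismatch = nonce_echo_matches(38, states, sent_nonces)
```

   Only packets 100 and 98 contribute to the parity here, so the
   expected echo is 1: option type 39 matches and option type 38 is a
   mismatch.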

   ECN incapability, as indicated by the ECN Incapable feature, is
   handled as follows: An endpoint sending packets to an ECN-incapable
   receiver MUST send its packets as ECN incapable, and an
   ECN-incapable receiver MUST use the value zero for all ECN Nonce
   Echoes.

12.3. Aggression Penalties

   DCCP endpoints have several mechanisms for detecting
   congestion-related misbehavior. For example:

   o  A sender can detect an ECN Nonce Echo mismatch, indicating
      possible receiver misbehavior.

   o  A receiver can detect whether the sender is responding to
      congestion feedback or Slow Receiver.

   o  An endpoint may be able to detect that its peer is reporting
      inappropriately small Elapsed Time values (Section 13.2).

   An endpoint that detects possible congestion-related misbehavior
   SHOULD try to verify that its peer is truly misbehaving. For
   example, a sending endpoint might send a packet whose ECN header
   field is set to Congestion Experienced, 11; a receiver that doesn't
   report a corresponding mark is most likely misbehaving.

   Upon detecting possible misbehavior, a sender SHOULD respond as if
   the receiver had reported one or more recent packets as ECN-marked
   (instead of unmarked), while a receiver SHOULD report one or more
   recent non-marked packets as ECN-marked. Alternately, a sender
   might act as if the receiver had sent a Slow Receiver option, and a
   receiver might send Slow Receiver options. Other reactions that
   serve to slow the transfer rate are also acceptable. An entity that
   detects particularly egregious and ongoing misbehavior MAY also
   reset the connection with Reset Code 11, "Aggression Penalty".

   However, ECN Nonce mismatches and other warning signs can result
   from innocent causes, such as implementation bugs or attack.
   In particular, a successful DCCP-Data attack (Section 7.5.5) can
   cause the receiver to report an incorrect ECN Nonce Echo.
   Therefore, connection reset and other heavyweight mechanisms SHOULD
   be used only as last resorts, after multiple round-trip times of
   verified aggression.

13. Timing Options

   The Timestamp, Timestamp Echo, and Elapsed Time options help DCCP
   endpoints explicitly measure round-trip times.

13.1. Timestamp Option

   This option is permitted in any DCCP packet. The length of the
   option is 6 bytes.

   +--------+--------+--------+--------+--------+--------+
   |00101001|00000110|          Timestamp Value          |
   +--------+--------+--------+--------+--------+--------+
    Type=41  Length=6

   The four bytes of option data carry the timestamp of this packet.
   The timestamp is a 32-bit integer that increases monotonically with
   time, at a rate of 1 unit per 10 microseconds. At this rate,
   Timestamp Value will wrap approximately every 11.9 hours. Endpoints
   need not measure time at this fine granularity; for example, an
   endpoint that preferred to measure time at millisecond granularity
   might send Timestamp Values that were all multiples of 100. The
   precise time corresponding to Timestamp Value zero is not specified:
   Timestamp Values are only meaningful relative to other Timestamp
   Values sent on the same connection. A DCCP receiving a Timestamp
   option SHOULD respond with a Timestamp Echo option on the next
   packet it sends.

13.2. Elapsed Time Option

   This option is permitted in any DCCP packet that contains an
   Acknowledgement Number (such options received on other packet types
   MUST be ignored). It indicates how much time has elapsed, in
   hundredths of milliseconds (or, equivalently, multiples of
   10 microseconds), since the packet being acknowledged (the packet
   with the given Acknowledgement Number) was received.
   The option may take 4 or 6 bytes, depending on the size of the
   Elapsed Time value. Elapsed Time helps correct round-trip time
   estimates when the gap between receiving a packet and acknowledging
   that packet may be long, as in CCID 3, for example, where
   acknowledgements are sent infrequently.

   +--------+--------+--------+--------+
   |00101011|00000100|   Elapsed Time  |
   +--------+--------+--------+--------+
    Type=43    Len=4

   +--------+--------+--------+--------+--------+--------+
   |00101011|00000110|            Elapsed Time           |
   +--------+--------+--------+--------+--------+--------+
    Type=43    Len=6

   The option data, Elapsed Time, represents an estimated upper bound
   on the amount of time elapsed since the packet being acknowledged
   was received, with units of hundredths of milliseconds. If Elapsed
   Time is less than a half-second, the first, smaller form of the
   option SHOULD be used. Elapsed Times of more than 0.65535 seconds
   MUST be sent using the second form of the option. The special
   Elapsed Time value 4294967295, which corresponds to approximately
   11.9 hours, is used to represent any Elapsed Time greater than
   42949.67294 seconds. DCCP endpoints MUST NOT report Elapsed Times
   that are significantly larger than the true elapsed times. A
   connection MAY be reset with Reset Code 11, "Aggression Penalty", if
   one endpoint determines that the other is reporting a much-too-large
   Elapsed Time.

   Elapsed Time is measured in hundredths of milliseconds as a
   compromise between two conflicting goals. First, it provides enough
   granularity to reduce rounding error when measuring elapsed time
   over fast LANs; second, it allows many reasonable elapsed times to
   fit into two bytes of data.

13.3. Timestamp Echo Option

   This option is permitted in any DCCP packet, as long as at least one
   packet carrying the Timestamp option has been received.
   Generally, a DCCP endpoint should send one Timestamp Echo option for
   each Timestamp option it receives; and it should send that option as
   soon as is convenient. The length of the option is between 6 and 10
   bytes, depending on whether Elapsed Time is included and how large
   it is.

   +--------+--------+--------+--------+--------+--------+
   |00101010|00000110|           Timestamp Echo          |
   +--------+--------+--------+--------+--------+--------+
    Type=42    Len=6

   +--------+--------+------- ... -------+--------+--------+
   |00101010|00001000|  Timestamp Echo   |   Elapsed Time  |
   +--------+--------+------- ... -------+--------+--------+
    Type=42    Len=8       (4 bytes)

   +--------+--------+------- ... -------+------- ... -------+
   |00101010|00001010|  Timestamp Echo   |    Elapsed Time   |
   +--------+--------+------- ... -------+------- ... -------+
    Type=42   Len=10       (4 bytes)           (4 bytes)

   The first four bytes of option data, Timestamp Echo, carry a
   Timestamp Value taken from a preceding received Timestamp option.
   Usually, this will be the last packet that was received (the packet
   indicated by the Acknowledgement Number, if any), but it might be a
   preceding packet. Each Timestamp received will generally result in
   exactly one Timestamp Echo transmitted. If an endpoint has received
   multiple Timestamp options since the last time it sent a packet,
   then it MAY ignore all Timestamp options but the one included on the
   packet with the greatest sequence number; alternatively, it MAY
   include multiple Timestamp Echo options in its response, each
   corresponding to a different Timestamp option.

   The Elapsed Time value, similar to that in the Elapsed Time option,
   indicates the amount of time elapsed since receiving the packet
   whose timestamp is being echoed. This time MUST be in hundredths of
   milliseconds.
Elapsed Time is meant to help the Timestamp sender 4702 separate the network round-trip time from the Timestamp receiver's 4703 processing time. This may be particularly important for CCIDs where 4704 acknowledgements are sent infrequently, so that there might be 4705 considerable delay between receiving a Timestamp option and sending 4706 the corresponding Timestamp Echo. A missing Elapsed Time field is 4707 equivalent to an Elapsed Time of zero. The smallest version of the 4708 option SHOULD be used that can hold the relevant Elapsed Time value. 4710 14. Maximum Packet Size 4712 A DCCP implementation MUST maintain the maximum packet size (MPS) 4713 allowed for each active DCCP session. The MPS is influenced by the 4714 maximum packet size allowed by the current congestion control 4715 mechanism (CCMPS), the maximum packet size supported by the path's 4716 links (PMTU, the Path Maximum Transmission Unit) [RFC 1191], and the 4717 lengths of the IP and DCCP headers. 4719 A DCCP application interface SHOULD let the application discover 4720 DCCP's current MPS. Generally, the DCCP implementation will refuse 4721 to send any packet bigger than the MPS, returning an appropriate 4722 error to the application. A DCCP interface MAY allow applications 4723 to request fragmentation for packets larger than PMTU, but not 4724 larger than CCMPS (packets larger than CCMPS MUST be rejected in any 4725 case). Fragmentation SHOULD NOT be the default, since it decreases 4726 robustness: an entire packet is discarded if even one of its 4727 fragments is lost. Applications can usually get better error 4728 tolerance by producing packets smaller than the PMTU. 4730 The MPS reported to the application SHOULD be influenced by the size 4731 expected to be required for DCCP headers and options. 
If the 4732 application provides data that, when combined with the options the 4733 DCCP implementation would like to include, would exceed the MPS, the 4734 implementation should either send the options on a separate packet 4735 (such as a DCCP-Ack) or lower the MPS, drop the data, and return an 4736 appropriate error to the application. 4738 14.1. Measuring PMTU 4740 Each DCCP endpoint MUST keep track of the current PMTU for each 4741 connection, except that this is not required for IPv4 connections 4742 whose applications have requested fragmentation. The PMTU SHOULD be 4743 initialized from the interface MTU that will be used to send 4744 packets. The MPS will be initialized with the minimum of the PMTU 4745 and the CCMPS, if any. 4747 Classical PMTU discovery uses unfragmentable packets. In IPv4, 4748 these packets have the IP Don't Fragment (DF) bit set; in IPv6, all 4749 packets are unfragmentable. As specified in [RFC 1191], when a 4750 router receives a packet with DF set that is larger than the next 4751 link's MTU, it sends an ICMP Destination Unreachable message back to 4752 the source whose Code indicates that an unfragmentable packet was 4753 too large to forward (a "Datagram Too Big" message). When a DCCP 4754 implementation receives a Datagram Too Big message, it decreases its 4755 PMTU to the Next-Hop MTU value given in the ICMP message. If the 4756 MTU given in the message is zero, the sender chooses a value for 4757 PMTU using the algorithm described in Section 7 of [RFC 1191]. If 4758 the MTU given in the message is greater than the current PMTU, the 4759 Datagram Too Big message is ignored, as described in [RFC 1191]. 4760 (We are aware that this may cause problems for DCCP endpoints behind 4761 certain firewalls.) 4763 A DCCP implementation may allow the application to occasionally 4764 request that PMTU discovery be performed again. This will reset the 4765 PMTU to the outgoing interface's MTU. 
Such requests SHOULD be rate 4766 limited, to one per two seconds, for example. 4768 A DCCP sender MAY treat the reception of an ICMP Datagram Too Big 4769 message as an indication that the packet being reported was not lost 4770 due to congestion, and so for the purposes of congestion control it 4771 MAY ignore the DCCP receiver's indication that this packet did not 4772 arrive. However, if this is done, then the DCCP sender MUST check 4773 the ECN bits of the IP header echoed in the ICMP message, and only 4774 perform this optimization if these ECN bits indicate that the packet 4775 did not experience congestion prior to reaching the router whose 4776 link MTU it exceeded. 4778 A DCCP implementation SHOULD ensure, as far as possible, that ICMP 4779 Datagram Too Big messages were actually generated by routers, so 4780 that attackers cannot drive the PMTU down to a falsely small value. 4781 The simplest way to do this is to verify that the Sequence Number on 4782 the ICMP error's encapsulated header corresponds to a Sequence 4783 Number that the implementation recently sent. (Routers are not 4784 required to return more than 64 bits of the DCCP header [RFC 792], 4785 but most modern routers will return far more, including the Sequence 4786 Number.) ICMP Datagram Too Big messages with incorrect or missing 4787 Sequence Numbers may be ignored, or the DCCP implementation may 4788 lower the PMTU only temporarily in response. If more than three odd 4789 Datagram Too Big messages are received and the other DCCP endpoint 4790 reports more than three lost packets, however, the DCCP 4791 implementation SHOULD assume the presence of a confused router, and 4792 either obey the ICMP messages' PMTU or (on IPv4 networks) switch to 4793 allowing fragmentation. 4795 DCCP also allows upward probing of the PMTU [PMTUD], where the DCCP 4796 endpoint begins by sending small packets with DF set, then gradually 4797 increases the packet size until a packet is lost. 
This mechanism 4798 does not require any ICMP error processing. DCCP-Sync packets are 4799 the best choice for upward probing, since DCCP-Sync probes do not 4800 risk application data loss. The DCCP implementation inserts 4801 arbitrary data into the DCCP-Sync application area, padding the 4802 packet to the right length; and since every valid DCCP-Sync 4803 generates an immediate DCCP-SyncAck in response, the endpoint will 4804 have a pretty good idea of when a probe is lost. 4806 14.2. Sender Behavior 4808 A DCCP sender SHOULD send every packet as unfragmentable, as 4809 described above, with the following exceptions. 4811 o On IPv4 connections whose applications have requested 4812 fragmentation, the sender SHOULD send packets with the DF bit not 4813 set. 4815 o On IPv6 connections whose applications have requested 4816 fragmentation, the sender SHOULD use fragmentation extension 4817 headers to fragment packets larger than PMTU into suitably-sized 4818 chunks. (Those chunks are, of course, unfragmentable.) 4820 o It is undesirable for PMTU discovery to occur on the initial 4821 connection setup handshake, as the connection setup process may 4822 not be representative of packet sizes used during the connection, 4823 and performing MTU discovery on the initial handshake might 4824 unnecessarily delay connection establishment. Thus, DCCP-Request 4825 and DCCP-Response packets SHOULD be sent as fragmentable. In 4826 addition, DCCP-Reset packets SHOULD be sent as fragmentable, 4827 although typically these would be small enough to not be a 4828 problem. For IPv4 connections, these packets SHOULD be sent with 4829 the DF bit not set; for IPv6 connections, they SHOULD be 4830 preemptively fragmented to a size not larger than the relevant 4831 interface MTU. 
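The exceptions listed above can be summarized as a small decision function. The following Python sketch is illustrative only: the function name "fragmentation_policy", its parameters, and the three return strings are hypothetical and are not defined by this specification. It also collapses the two IPv6 cases (application-requested fragmentation and preemptive fragmentation of handshake packets) into a single "fragment at the source" outcome, and it does not model the MPS check described in the next paragraph.

```python
# Hypothetical helper; names and return values are illustrative only.
HANDSHAKE_AND_RESET = {"DCCP-Request", "DCCP-Response", "DCCP-Reset"}


def fragmentation_policy(ip_version, packet_type, app_requested_fragmentation):
    """Return how a packet should be sent under the Section 14.2 rules.

    "unfragmentable"    -- IPv4 with DF set, or an ordinary IPv6 packet
    "ipv4-df-clear"     -- IPv4 with the DF bit not set
    "ipv6-pre-fragment" -- IPv6, fragmented at the source
    """
    if ip_version not in (4, 6):
        raise ValueError("ip_version must be 4 or 6")

    # Packets are unfragmentable by default; the exceptions are an
    # application request and the Request/Response/Reset packet types.
    fragmentable = (app_requested_fragmentation
                    or packet_type in HANDSHAKE_AND_RESET)
    if not fragmentable:
        return "unfragmentable"
    return "ipv4-df-clear" if ip_version == 4 else "ipv6-pre-fragment"
```

For instance, fragmentation_policy(4, "DCCP-Data", False) selects the unfragmentable default, while a DCCP-Request on either IP version is sent as fragmentable.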
4833 If the DCCP implementation has decreased the PMTU, the sending 4834 application has not requested fragmentation, and the sending 4835 application attempts to send a packet larger than the new MPS, the 4836 API MUST refuse to send the packet and return an appropriate error 4837 to the application. The application should then use the API to 4838 query the new value of MPS. The kernel might have some packets 4839 buffered for transmission that are smaller than the old MPS, but 4840 larger than the new MPS. It MAY send these packets as fragmentable, 4841 or it MAY discard these packets; it MUST NOT send them as 4842 unfragmentable. 4844 15. Forward Compatibility 4846 Future versions of DCCP may add new options and features. A few 4847 simple guidelines will let extended DCCPs interoperate with normal 4848 DCCPs. 4850 o DCCP processors MUST NOT act punitively towards options and 4851 features they do not understand. For example, DCCP processors 4852 MUST NOT reset the connection if some field marked Reserved in 4853 this specification is non-zero; if some unknown option is 4854 present; or if some feature negotiation option mentions an 4855 unknown feature. Instead, DCCP processors MUST ignore these 4856 events. The Mandatory option is the single exception: if 4857 Mandatory precedes some unknown option or feature, the connection 4858 MUST be reset. 4860 o DCCP processors MUST anticipate the possibility of unknown 4861 feature values, which might occur as part of a negotiation for a 4862 known feature. For server-priority features, unknown values are 4863 handled as a matter of course: since the non-extended DCCP's 4864 priority list will not contain unknown values, the result of the 4865 negotiation cannot be an unknown value. A DCCP SHOULD respond 4866 with an empty Confirm option if it is assigned an unacceptable 4867 value for some non-negotiable feature. 4869 o Each DCCP extension SHOULD be controlled by some feature. 
The 4870 default value of this feature should correspond to "extension not 4871 available". If an extended DCCP wants to use the extension, it 4872 SHOULD attempt to change the feature's value using a Change L or 4873 Change R option. Any non-extended DCCP will ignore the option, 4874 thus leaving the feature value at its default, "extension not 4875 available". 4877 Section 19 lists DCCP assigned numbers reserved for experimental and 4878 testing purposes. 4880 16. Middlebox Considerations 4882 This section describes properties of DCCP that firewalls, network 4883 address translators, and other middleboxes should consider, 4884 including parts of the packet that middleboxes should not change. 4885 The intent is to draw attention to aspects of DCCP that may be 4886 useful, or dangerous, for middleboxes, or that differ significantly 4887 from TCP. 4889 The Service Code field in DCCP-Request packets provides information 4890 that may be useful for stateful middleboxes. With Service Code, a 4891 middlebox can tell what protocol a connection will use without 4892 relying on port numbers. Middleboxes can disallow connections that 4893 attempt to access unexpected services by sending a DCCP-Reset with 4894 Reset Code 8, "Bad Service Code". Middleboxes should not modify the 4895 Service Code unless they are really changing the service a 4896 connection is accessing. 4898 The Source and Destination Port fields are in the same packet 4899 locations as the corresponding fields in TCP and UDP, which may 4900 simplify some middlebox implementations. 4902 The forward compatibility considerations in Section 15 apply to 4903 middleboxes as well. In particular, middleboxes generally shouldn't 4904 act punitively towards options and features they do not understand. 4906 Modifying DCCP Sequence Numbers and Acknowledgement Numbers is more 4907 tedious and dangerous than modifying TCP sequence numbers. 
A 4908 middlebox that added packets to, or removed packets from, a DCCP 4909 connection would have to modify acknowledgement options, such as Ack 4910 Vector, and CCID-specific options, such as TFRC's Loss Intervals, at 4911 minimum. On ECN-capable connections, the middlebox would have to 4912 keep track of ECN Nonce information for packets it introduced or 4913 removed, so that the relevant acknowledgement options continued to 4914 have correct ECN Nonce Echoes, or risk the connection being reset 4915 for "Aggression Penalty". We therefore recommend that middleboxes 4916 not modify packet streams by adding or removing packets. 4918 Note that there is less need to modify DCCP's per-packet sequence 4919 numbers than TCP's per-byte sequence numbers; for example, a 4920 middlebox can change the contents of a packet without changing its 4921 sequence number. (In TCP, sequence number modification is required 4922 to support protocols like FTP that carry variable-length addresses 4923 in the data stream. If such an application were deployed over DCCP, 4924 middleboxes would simply grow or shrink the relevant packets as 4925 necessary, without changing their sequence numbers. This might 4926 involve fragmenting the packet.) 4928 Middleboxes may, of course, reset connections in progress. Clearly 4929 this requires inserting a packet into one or both packet streams, 4930 but the difficult issues do not arise. 4932 DCCP is somewhat unfriendly to "connection splicing" [SHHP00], in 4933 which clients' connection attempts are intercepted, but possibly 4934 later "spliced in" to external server connections via sequence 4935 number manipulations. A connection splicer at minimum would have to 4936 ensure that the spliced connections agreed on all relevant feature 4937 values, which might take some renegotiation. 4939 The contents of this section should not be interpreted as a 4940 wholesale endorsement of stateful middleboxes. 4942 17. Relations to Other Specifications 4944 17.1. 
RTP 4946 The Real-Time Transport Protocol, RTP [RFC 3550], is currently used 4947 over UDP by many of DCCP's target applications (for instance, 4948 streaming media). Therefore, it is important to examine the 4949 relationship between DCCP and RTP, and in particular, the question 4950 of whether any changes in RTP are necessary or desirable when it is 4951 layered over DCCP instead of UDP. 4953 There are two potential sources of overhead in the RTP-over-DCCP 4954 combination: duplicated acknowledgement information and duplicated 4955 sequence numbers. We argue that together, these sources of overhead add slightly 4956 more than 4 bytes per packet relative to RTP-over-UDP, and that 4957 eliminating the redundancy would not reduce the overhead. 4959 First, consider acknowledgements. Both RTP and DCCP report feedback 4960 about loss rates to data senders, via RTP Control Protocol Sender 4961 and Receiver Reports (RTCP SR/RR packets) and via DCCP 4962 acknowledgement options. These feedback mechanisms are potentially 4963 redundant. However, RTCP SR/RR packets contain information not 4964 present in DCCP acknowledgements, such as "interarrival jitter", and 4965 DCCP's acknowledgements contain information not transmitted by RTCP, 4966 such as the ECN Nonce Echo. Neither feedback mechanism makes the 4967 other redundant. 4969 Sending both types of feedback need not be particularly costly 4970 either. RTCP reports may be sent relatively infrequently: once 4971 every 5 seconds on average, for low-bandwidth flows. In DCCP, some 4972 feedback mechanisms are expensive -- Ack Vector, for example, is 4973 frequent and verbose -- but others are relatively cheap: CCID 3 4974 (TFRC) acknowledgements take between 16 and 32 bytes of options sent 4975 once per round-trip time. (Reporting less frequently than once per 4976 RTT would make congestion control less responsive to loss.)
We 4977 therefore conclude that acknowledgement overhead in RTP-over-DCCP 4978 need not be significantly higher than for RTP-over-UDP, at least for 4979 CCID 3. 4981 One clear redundancy can be addressed at the application level. The 4982 verbose packet-by-packet loss reports sent in RTCP Extended Reports 4983 Loss RLE Blocks [RFC 3611] can be derived from DCCP's Ack Vector 4984 options. (The converse is not true, since Loss RLE Blocks contain 4985 no ECN information.) Since DCCP implementations should provide an 4986 API for application access to Ack Vector information, RTP-over-DCCP 4987 applications might request either DCCP Ack Vectors or RTCP Extended 4988 Report Loss RLE Blocks, but not both. 4990 Now consider sequence number redundancy on data packets. The 4991 embedded RTP header contains a 16-bit RTP sequence number. Most 4992 data packets will use the DCCP-Data type; DCCP-DataAck and DCCP-Ack 4993 packets need not usually be sent. The DCCP-Data header is 12 bytes 4994 long without options, including a 24-bit sequence number. This is 4 4995 bytes more than a UDP header. Any options required on data packets 4996 would add further overhead, although many CCIDs (for instance, CCID 4997 3, TFRC) don't require options on most data packets. 4999 The DCCP sequence number cannot be inferred from the RTP sequence 5000 number since it increments on non-data packets as well as data 5001 packets. The RTP sequence number cannot be inferred from the DCCP 5002 sequence number either [RFC 3550]. Furthermore, removing RTP's 5003 sequence number would not save any header space because of alignment 5004 issues. We therefore recommend that RTP transmitted over DCCP use 5005 the same headers currently defined. The 4 byte header cost is a 5006 reasonable tradeoff for DCCP's congestion control features and 5007 access to ECN. Truly bandwidth-starved endpoints should use some 5008 future header compression scheme. 5010 17.2. 
Congestion Manager and Multiplexing 5012 Since DCCP doesn't provide reliable, ordered delivery, multiple 5013 application sub-flows may be multiplexed over a single DCCP 5014 connection with no inherent performance penalty. Thus, there is no 5015 need for DCCP to provide built-in, SCTP-style support for multiple 5016 sub-flows. 5018 Some applications might want to share congestion control state among 5019 multiple DCCP flows that share the same source and destination 5020 addresses. This functionality could be provided by the Congestion 5021 Manager [RFC 3124], a generic multiplexing facility. However, the 5022 CM would not fully support DCCP without change; it does not 5023 gracefully handle multiple congestion control mechanisms, for 5024 example. 5026 18. Security Considerations 5028 DCCP does not provide cryptographic security guarantees. 5029 Applications desiring hard security should use IPsec or end-to-end 5030 security of some kind. 5032 Nevertheless, DCCP is intended to protect against some classes of 5033 attackers: Attackers cannot hijack a DCCP connection (close the 5034 connection unexpectedly, or cause attacker data to be accepted by an 5035 endpoint as if it came from the sender) unless they can guess valid 5036 sequence numbers. Thus, as long as endpoints choose initial 5037 sequence numbers well, a DCCP attacker must snoop on data packets to 5038 get any reasonable probability of success. Sequence number validity 5039 checks provide this guarantee. Section 7.5.5 describes sequence 5040 number security further. 5042 This security property only holds assuming that DCCP's random 5043 numbers are chosen according to the guidelines in [RFC 1750]. 5045 DCCP provides no protection against attackers that can snoop on data 5046 packets. 5048 18.1. Security Considerations for Partial Checksums 5050 The partial checksum facility has a separate security impact, 5051 particularly in its interaction with authentication and encryption 5052 mechanisms. 
The impact is the same in DCCP as in the UDP-Lite 5053 protocol, and what follows was adapted from the corresponding text 5054 in the UDP-Lite specification [RFC 3828]. 5056 When a DCCP packet's Checksum Coverage field is not zero, the 5057 uncovered portion of a packet may change in transit. This is 5058 contrary to the idea behind most authentication mechanisms: 5059 authentication succeeds if the packet has not changed in transit. 5060 Unless authentication mechanisms that operate only on the sensitive 5061 part of packets are developed and used, authentication will always 5062 fail for partially-checksummed DCCP packets whose uncovered part has 5063 been damaged. 5065 The IPsec integrity check (Encapsulating Security Payload, ESP, or 5066 Authentication Header, AH) is applied (at least) to the entire IP 5067 packet payload. Corruption of any bit within that area will then 5068 result in the IP receiver discarding a DCCP packet, even if the 5069 corruption happened in an uncovered part of the DCCP application 5070 data. 5072 When IPsec is used with ESP payload encryption, a link cannot 5073 determine the specific transport protocol of a packet being 5074 forwarded by inspecting the IP packet payload. In this case, the 5075 link MUST provide a standard integrity check covering the entire IP 5076 packet and payload. DCCP partial checksums provide no benefit in 5077 this case. 5079 Encryption (e.g., at the transport or application levels) may be 5080 used. Note that omitting an integrity check can, under certain 5081 circumstances, compromise confidentiality [BEL98]. 5083 If a few bits of an encrypted packet are damaged, the decryption 5084 transform will typically spread errors so that the packet becomes 5085 too damaged to be of use. Many encryption transforms today exhibit 5086 this behavior. There exist encryption transforms, stream ciphers, 5087 which do not cause error propagation.
Proper use of stream ciphers 5088 can be quite difficult, especially when authentication-checking is 5089 omitted [BB01]. In particular, an attacker can cause predictable 5090 changes to the ultimate plaintext, even without being able to 5091 decrypt the ciphertext. 5093 19. IANA Considerations 5095 DCCP introduces eight sets of numbers whose values should be 5096 allocated by IANA. We refer to allocation policies, such as 5097 Standards Action, outlined in [RFC 2434], and most registries 5098 reserve some values for experimental and testing use [RFC 3692]. In 5099 addition, DCCP requires a Protocol Number to be added to the 5100 registry of Assigned Internet Protocol Numbers. IANA is requested 5101 to assign IP Protocol Number 33 to DCCP; this number has already 5102 been informally made available for experimental DCCP use. 5104 19.1. Packet Types 5106 Each entry in the DCCP Packet Type registry contains a packet type, 5107 which is a number in the range 0-15; a packet type name, such as 5108 DCCP-Request; and a reference to the RFC defining the packet type. 5109 The registry is initially populated using the values in Table 1 5110 (Section 5.1). This document allocates packet types 0-9, and packet 5111 type 14 is permanently reserved for experimental and testing use. 5112 Packet types 10-13 and 15 are currently reserved, and should be 5113 allocated with the Standards Action policy, which requires IETF 5114 working group review and standards-track RFC publication. 5116 19.2. Reset Codes 5118 Each entry in the DCCP Reset Code registry contains a Reset Code, 5119 which is a number in the range 0-255; a short description of the 5120 Reset Code, such as "No Connection"; and a reference to the RFC 5121 defining the Reset Code. The registry is initially populated using 5122 the values in Table 2 (Section 5.6). This document allocates Reset 5123 Codes 0-11, and Reset Codes 120-126 are permanently reserved for 5124 experimental and testing use. 
Reset Codes 12-119 and 127 are 5125 currently reserved, and should be allocated with the IETF Consensus 5126 policy, which requires RFC publication (not necessarily standards- 5127 track). Reset Codes 128-255 are permanently reserved for CCID- 5128 specific registries; each CCID Profile document describes how the 5129 corresponding registry is managed. 5131 19.3. Option Types 5133 Each entry in the DCCP option type registry contains an option type, 5134 which is a number in the range 0-255; the name of the option, such 5135 as "Slow Receiver"; and a reference to the RFC defining the option 5136 type. The registry is initially populated using the values in Table 5137 3 (Section 5.8). This document allocates option types 0-2 and 5138 32-44, and option types 31 and 120-126 are permanently reserved for 5139 experimental and testing use. Option types 3-30, 45-119, and 127 5140 are currently reserved, and should be allocated with the IETF 5141 Consensus policy, which requires RFC publication (not necessarily 5142 standards-track). Option types 128-255 are permanently reserved for 5143 CCID-specific registries; each CCID Profile document describes how 5144 the corresponding registry is managed. 5146 19.4. Feature Numbers 5148 Each entry in the DCCP feature number registry contains a feature 5149 number, which is a number in the range 0-255; the name of the 5150 feature, such as "ECN Incapable"; and a reference to the RFC 5151 defining the feature number. The registry is initially populated 5152 using the values in Table 4 (Section 6). This document allocates 5153 feature numbers 0-9, and feature numbers 120-126 are permanently 5154 reserved for experimental and testing use. Feature numbers 10-119 5155 and 127 are currently reserved, and should be allocated with the 5156 IETF Consensus policy, which requires RFC publication (not 5157 necessarily standards-track). 
Feature numbers 128-255 are 5158 permanently reserved for CCID-specific registries; each CCID Profile 5159 document describes how the corresponding registry is managed. 5161 19.5. Congestion Control Identifiers 5163 Each entry in the DCCP Congestion Control Identifier (CCID) registry 5164 contains a CCID, which is a number in the range 0-255; the name of 5165 the CCID, such as "TCP-like Congestion Control"; and a reference to 5166 the RFC defining the CCID. The registry is initially populated 5167 using the values in Table 5 (Section 10). CCIDs 2 and 3 are 5168 allocated by concurrently published profiles, and CCIDs 248-254 are 5169 permanently reserved for experimental and testing use. CCIDs 0, 1, 5170 4-247, and 255 are currently reserved, and should be allocated with 5171 the IETF Consensus policy, which requires RFC publication (not 5172 necessarily standards-track). 5174 19.6. Ack Vector States 5176 Each entry in the DCCP Ack Vector State registry contains an Ack 5177 Vector State, which is a number in the range 0-3; the name of the 5178 State, such as "Received ECN Marked"; and a reference to the RFC 5179 defining the State. The registry is initially populated using the 5180 values in Table 6 (Section 11.4). This document allocates States 0, 5181 1, and 3. State 2 is currently reserved, and should be allocated 5182 with the Standards Action policy, which requires IETF working group 5183 review and standards-track RFC publication. 5185 19.7. Drop Codes 5187 Each entry in the DCCP Drop Code registry contains a Data Dropped 5188 Drop Code, which is a number in the range 0-7; the name of the Drop 5189 Code, such as "Application Not Listening"; and a reference to the 5190 RFC defining the Drop Code. The registry is initially populated 5191 using the values in Table 7 (Section 11.7). This document allocates 5192 Drop Codes 0-3 and 7. 
Drop Codes 4-6 are currently reserved, and 5193 should be allocated with the Standards Action policy, which requires 5194 IETF working group review and standards-track RFC publication. 5196 19.8. Service Codes 5198 Each entry in the Service Code registry contains a Service Code, 5199 which is a number in the range 0-4294967295; a short English 5200 description of the intended service; and an optional reference to an 5201 RFC or other publicly available specification defining the Service 5202 Code. The registry should list the Service Code's numeric value as 5203 a decimal number, but when each byte of the four-byte Service Code 5204 is in the range 32-127, the registry should also show a four- 5205 character ASCII interpretation of the Service Code. Thus, the 5206 number 1717858426 would additionally appear as "fdpz". Service 5207 Codes are not DCCP-specific. This document does not allocate any 5208 Service Codes, but Service Code 0 is permanently reserved (it 5209 represents the absence of a meaningful Service Code), and Service 5210 Codes 1056964608-1073741823 (high byte ASCII "?") are reserved for 5211 Private Use. Most of the remaining Service Codes are allocated 5212 First Come First Served, with no RFC publication required. 5213 Exceptions are listed in Section 8.1.2. 5215 20. Thanks 5217 Thanks to Jitendra Padhye for his help with early versions of this 5218 specification. 5220 Thanks to Junwen Lai and Arun Venkataramani, who, as interns at 5221 ICIR, built a prototype DCCP implementation. In particular, Junwen 5222 Lai recommended that the old feature negotiation mechanism be 5223 scrapped and codesigned the current mechanism. Arun Venkataramani's 5224 feedback improved Appendix A. 5226 We thank the staff and interns of ICIR and, formerly, ACIRI, the 5227 members of the End-to-End Research Group, and the members of the 5228 Transport Area Working Group for their feedback on DCCP. 
We 5229 especially thank the DCCP expert reviewers: Greg Minshall, Eric 5230 Rescorla, and Magnus Westerlund for detailed written comments and 5231 problem spotting, and Rob Austein and Steve Bellovin for verbal 5232 comments and written notes. 5234 We also thank those who provided comments and suggestions via the 5235 DCCP BOF, Working Group, and mailing lists, including Damon 5236 Lanphear, Patrick McManus, Colin Perkins, Sara Karlberg, Kevin Lai, 5237 Bernard Aboba, Youngsoo Choi, Pengfei Di, Dan Duchamp, Gorry 5238 Fairhurst, Derek Fawcus, David Timothy Fleeman, John Loughney, 5239 Ghyslain Pelletier, Tom Phelan, Stanislav Shalunov, David Vos, Yufei 5240 Wang, and Michael Welzl. In particular, Colin Perkins provided 5241 extensive, detailed feedback, Michael Welzl suggested the Data 5242 Checksum option, and Gorry Fairhurst provided extensive feedback on 5243 various checksum issues. 5245 A. Appendix: Ack Vector Implementation Notes 5247 This appendix discusses particulars of DCCP acknowledgement 5248 handling, in the context of an abstract implementation for Ack 5249 Vector. It is informative rather than normative. 5251 The first part of our implementation runs at the HC-Receiver, and 5252 therefore acknowledges data packets. It generates Ack Vector 5253 options. The implementation has the following characteristics: 5255 o At most one byte of state per acknowledged packet. 5257 o O(1) time to update that state when a new packet arrives (normal 5258 case). 5260 o Cumulative acknowledgements. 5262 o Quick removal of old state. 5264 The basic data structure is a circular buffer containing information 5265 about acknowledged packets. Each byte in this buffer contains a 5266 state and run length; the state can be 0 (packet received), 1 5267 (packet ECN marked), or 3 (packet not yet received). The buffer 5268 grows from right to left. 
The implementation maintains five 5269 variables, aside from the buffer contents: 5271 o "buf_head" and "buf_tail", which mark the live portion of the 5272 buffer. 5274 o "buf_ackno", the Acknowledgement Number of the most recent packet 5275 acknowledged in the buffer. This corresponds to the "head" 5276 pointer. 5278 o "buf_nonce", the one-bit sum (exclusive-or, or parity) of the ECN 5279 Nonces received on all packets acknowledged by the buffer with 5280 State 0. 5282 We draw acknowledgement buffers like this: 5284 +---------------------------------------------------------------+ 5285 |S,L|S,L|S,L|S,L| | | | |S,L|S,L|S,L|S,L|S,L|S,L|S,L|S,L| 5286 +---------------------------------------------------------------+ 5287 ^ ^ 5288 buf_tail buf_head, buf_ackno = A buf_nonce = E 5290 <=== buf_head and buf_tail move this way <=== 5292 Each `S,L' represents a State/Run length byte. We will draw these 5293 buffers showing only their live portion, and will add an annotation 5294 showing the Acknowledgement Number for the last live byte in the 5295 buffer. For example: 5297 +-----------------------------------------------+ 5298 A |S,L|S,L|S,L|S,L|S,L|S,L|S,L|S,L|S,L|S,L|S,L|S,L| T BN[E] 5299 +-----------------------------------------------+ 5301 Here, buf_nonce equals E and buf_ackno equals A. 5303 We will use this buffer as a running example. 5305 +---------------------------+ 5306 10 |0,0|3,0|3,0|3,0|0,4|1,0|0,0| 0 BN[1] [Example Buffer] 5307 +---------------------------+ 5309 In concrete terms, its meaning is as follows: 5311 Packet 10 was received. (The head of the buffer has sequence 5312 number 10, state 0, and run length 0.) 5314 Packets 9, 8, and 7 have not yet been received. (The three 5315 bytes preceding the head each have state 3 and run length 0.) 5317 Packets 6, 5, 4, 3, and 2 were received. 5319 Packet 1 was ECN marked. 5321 Packet 0 was received. 5323 The one-bit sum of the ECN Nonces on packets 10, 6, 5, 4, 3, 2, 5324 and 0 equals 1. 
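The reading of the Example Buffer given above can be reproduced mechanically. The following Python sketch is illustrative only and is not part of the abstract implementation: the function name "decode_ack_buffer" and the representation of each buffer byte as a (state, run length) tuple are assumptions made for this example, and sequence number wraparound is ignored for simplicity.

```python
def decode_ack_buffer(buf_ackno, cells):
    """Expand (state, run_length) cells, listed from head to tail, into a
    mapping from sequence number to state.

    A run length of L covers L + 1 consecutive packets. Sequence number
    wraparound is ignored in this sketch.
    """
    states = {}
    seqno = buf_ackno
    for state, run_length in cells:
        for _ in range(run_length + 1):
            states[seqno] = state
            seqno -= 1
    return states


# The Example Buffer: head at sequence number 10, tail at sequence number 0.
EXAMPLE_BUFFER = [(0, 0), (3, 0), (3, 0), (3, 0), (0, 4), (1, 0), (0, 0)]

states = decode_ack_buffer(10, EXAMPLE_BUFFER)
received = [s for s, st in states.items() if st == 0]
ecn_marked = [s for s, st in states.items() if st == 1]
not_received = [s for s, st in states.items() if st == 3]
```

Here "received" lists packets 10, 6, 5, 4, 3, 2, and 0, which are exactly the State 0 packets whose ECN Nonces contribute to buf_nonce; "not_received" lists packets 9, 8, and 7; and "ecn_marked" lists packet 1.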

   Additionally, the HC-Receiver must keep some information about the
   Ack Vectors it has recently sent.  For each packet sent carrying an
   Ack Vector, it remembers four variables:

   o  "ack_seqno", the Sequence Number used for the packet.  This is an
      HC-Receiver sequence number.

   o  "ack_ptr", the value of buf_head at the time of acknowledgement.

   o  "ack_ackno", the Acknowledgement Number used for the packet.
      This is an HC-Sender sequence number.  Since acknowledgements are
      cumulative, this single number completely specifies all necessary
      information about the packets acknowledged by this Ack Vector.

   o  "ack_nonce", the one-bit sum of the ECN Nonces for all State 0
      packets in the buffer from buf_head to ack_ackno, inclusive.
      Initially, this equals the Nonce Echo of the acknowledgement's
      Ack Vector (or, if the ack packet contained more than one Ack
      Vector, the exclusive-or of all the acknowledgement's Ack
      Vectors).  It changes as information about old acknowledgements
      is removed (so ack_ptr and buf_head diverge), and as old packets
      arrive (so they change from State 3 or State 1 to State 0).

A.1.  Packet Arrival

   This section describes how the HC-Receiver updates its
   acknowledgement buffer as packets arrive from the HC-Sender.

A.1.1.  New Packets

   When a packet with Sequence Number greater than buf_ackno arrives,
   the HC-Receiver updates buf_head (by moving it to the left
   appropriately), buf_ackno (which is set to the new packet's Sequence
   Number), and possibly buf_nonce (if the packet arrived unmarked with
   ECN Nonce 1), in addition to the buffer itself.
   For example, if HC-Sender packet 11 arrived ECN marked, the Example
   Buffer above would enter this new state (changes are marked with
   stars):

   ** +***----------------------------+
   11 |1,0|0,0|3,0|3,0|3,0|0,4|1,0|0,0| 0   BN[1]
   ** +***----------------------------+

   If the packet's state equals the state at the head of the buffer,
   the HC-Receiver may choose to increment its run length (up to the
   maximum).  For example, if HC-Sender packet 11 arrived without ECN
   marking and with ECN Nonce 0, the Example Buffer might enter this
   state instead:

   ** +--*------------------------+
   11 |0,1|3,0|3,0|3,0|0,4|1,0|0,0| 0   BN[1]
   ** +--*------------------------+

   Of course, the new packet's sequence number might not equal the
   expected sequence number.  In this case, the HC-Receiver will enter
   the intervening packets as State 3.  If several packets are missing,
   the HC-Receiver may prefer to enter multiple bytes with run length
   0, rather than a single byte with a larger run length; this
   simplifies table updates if one of the missing packets arrives.  For
   example, if HC-Sender packet 12 arrived with ECN Nonce 1, the
   Example Buffer would enter this state (packet 11 is entered as
   State 3, and the byte for packet 10 is unchanged):

   ** +*******----------------------------+        *
   12 |0,0|3,0|0,0|3,0|3,0|3,0|0,4|1,0|0,0| 0   BN[0]
   ** +*******----------------------------+        *

   Of course, the circular buffer may overflow, either when the
   HC-Sender is sending data at a very high rate, when the
   HC-Receiver's acknowledgements are not reaching the HC-Sender, or
   when the HC-Sender is forgetting to acknowledge those acks (so the
   HC-Receiver is unable to clean up old state).
   In this case, the HC-Receiver should either compress the buffer (by
   increasing run lengths when possible), transfer its state to a
   larger buffer, or, as a last resort, drop all received packets,
   without processing them whatsoever, until its buffer shrinks again.

A.1.2.  Old Packets

   When a packet with Sequence Number S arrives, and S <= buf_ackno,
   the HC-Receiver will scan the table for the byte corresponding to S.
   (Indexing structures could reduce the complexity of this scan.)  If
   S was previously lost (State 3), and it was stored in a byte with
   run length 0, the HC-Receiver can simply change the byte's state.
   For example, if HC-Sender packet 8 was received with ECN Nonce 0,
   the Example Buffer would enter this state:

      +--------*------------------+
   10 |0,0|3,0|0,0|3,0|0,4|1,0|0,0| 0   BN[1]
      +--------*------------------+

   If S was not marked as lost, or if it was not contained in the
   table, the packet is probably a duplicate, and should be ignored.
   (The new packet's ECN marking state might differ from the state in
   the buffer; Section 11.4.1 describes what is allowed then.)  If S's
   buffer byte has a non-zero run length, then the buffer might need to
   be reshuffled to make space for one or two new bytes.

   The ack_nonce fields may also need manipulation when old packets
   arrive.  In particular, when S transitions from State 3 or State 1
   to State 0, and S had ECN Nonce 1, then the implementation should
   flip the value of ack_nonce for every acknowledgement with ack_ackno
   >= S.

   It is impossible with this data structure to shift packets from
   State 0 to State 1, since the buffer doesn't store individual
   packets' ECN Nonces.

A.2.  Sending Acknowledgements

   Whenever the HC-Receiver needs to generate an acknowledgement, the
   buffer's contents can simply be copied into one or more Ack Vector
   options.
   Copied Ack Vectors might not be maximally compressed; for example,
   the Example Buffer above contains three adjacent 3,0 bytes that
   could be combined into a single 3,2 byte.  The HC-Receiver might,
   therefore, choose to compress the buffer in place before sending the
   option, or to compress the buffer while copying it; either operation
   is simple.

   Every acknowledgement sent by the HC-Receiver SHOULD include the
   entire state of the buffer.  That is, acknowledgements are
   cumulative.

   If the acknowledgement fits in one Ack Vector, that Ack Vector's
   Nonce Echo simply equals buf_nonce.  For multiple Ack Vectors, more
   care is required.  The Ack Vectors should be split at points
   corresponding to previous acknowledgements, since the stored
   ack_nonce fields provide enough information to calculate correct
   Nonce Echoes.  The implementation should therefore acknowledge data
   at least once per 253 bytes of buffer state.  (Otherwise, there'd be
   no way to calculate a Nonce Echo.)

   For each acknowledgement it sends, the HC-Receiver will add an
   acknowledgement record.  ack_seqno will equal the HC-Receiver
   sequence number it used for the ack packet; ack_ptr will equal
   buf_head; ack_ackno will equal buf_ackno; and ack_nonce will equal
   buf_nonce.

A.3.  Clearing State

   Some of the HC-Sender's packets will include acknowledgement
   numbers, which ack the HC-Receiver's acknowledgements.  When such an
   ack is received, the HC-Receiver finds the acknowledgement record R
   with the appropriate ack_seqno, then:

   o  Sets buf_tail to R.ack_ptr + 1.

   o  If R.ack_nonce is 1, it flips buf_nonce, and the value of
      ack_nonce for every later ack record.

   o  Throws away R and every preceding ack record.

   (The HC-Receiver may choose to keep some older information, in case
   a lost packet shows up late.)
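
   The three steps above might be sketched as follows.  This is an
   illustration under stated assumptions, not a definitive
   implementation: the function name clear_state and the dictionary
   layout of the ack records are hypothetical, the buffer is a
   head-first Python list of (state, run length) tuples rather than a
   circular byte buffer, and trimming by R.ack_ackno stands in for the
   pointer update "buf_tail = R.ack_ptr + 1".  The sample values are
   those of the worked example in this section.

```python
def clear_state(cells, buf_ackno, buf_nonce, records, acked_seqno):
    """Apply the three clearing steps for an acknowledged ack packet.

    cells is head first; records is ordered oldest to newest. Returns
    the trimmed cells, the new buf_nonce, and the surviving records.
    """
    idx = next((i for i, r in enumerate(records)
                if r["ack_seqno"] == acked_seqno), None)
    if idx is None:  # no matching record: nothing to clear
        return cells, buf_nonce, records
    rec = records[idx]

    # Step 1: keep only packets newer than rec["ack_ackno"]. A byte
    # that straddles the cut keeps its newest packets (assumption: this
    # models moving buf_tail in a real circular buffer).
    keep = buf_ackno - rec["ack_ackno"]
    trimmed = []
    for state, run_length in cells:
        if keep <= 0:
            break
        n = min(run_length + 1, keep)
        trimmed.append((state, n - 1))
        keep -= n

    # Step 2: if R.ack_nonce is 1, flip buf_nonce and the ack_nonce of
    # every later record.
    later = [dict(r) for r in records[idx + 1:]]
    if rec["ack_nonce"] == 1:
        buf_nonce ^= 1
        for r in later:
            r["ack_nonce"] ^= 1

    # Step 3: R and every preceding record are discarded.
    return trimmed, buf_nonce, later


example = [(0, 0), (3, 0), (3, 0), (3, 0), (0, 4), (1, 0), (0, 0)]
records = [
    {"ack_seqno": 59, "ack_ackno": 3, "ack_nonce": 1},
    {"ack_seqno": 60, "ack_ackno": 10, "ack_nonce": 0},
]
cells, nonce, later = clear_state(example, 10, 1, records, 59)
print(cells)  # [(0, 0), (3, 0), (3, 0), (3, 0), (0, 2)]
print(nonce)  # 0
```

   The function returns new values instead of mutating its arguments,
   which keeps the sketch easy to check against the buffer diagrams.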
   For example, say that the HC-Receiver storing the Example Buffer had
   sent two acknowledgements already:

      1. ack_seqno = 59, ack_ackno = 3, ack_nonce = 1.

      2. ack_seqno = 60, ack_ackno = 10, ack_nonce = 0.

   Say the HC-Receiver then received a DCCP-DataAck packet with
   Acknowledgement Number 59 from the HC-Sender.  This informs the
   HC-Receiver that the HC-Sender received, and processed, all the
   information in HC-Receiver packet 59.  This packet acknowledged
   HC-Sender packet 3, so the HC-Sender has now received HC-Receiver's
   acknowledgements for packets 0, 1, 2, and 3.  The Example Buffer
   should enter this state:

      +------------------*+ *      *
   10 |0,0|3,0|3,0|3,0|0,2| 4   BN[0]
      +------------------*+ *      *

   The tail byte's run length was adjusted, since packet 3 was in the
   middle of that byte.  Since R.ack_nonce was 1, the buf_nonce field
   was flipped, as were the ack_nonce fields for later acknowledgements
   (here, the HC-Receiver Ack 60 record, not shown, has its ack_nonce
   flipped to 1).  The HC-Receiver can also throw away stored
   information about HC-Receiver Ack 59 and any earlier
   acknowledgements.

   A careful implementation might try to ensure reasonable robustness
   to reordering.  Suppose that the Example Buffer is as before, but
   that packet 9 now arrives, out of sequence.  The buffer would enter
   this state:

      +----*----------------------+
   10 |0,0|0,0|3,0|3,0|0,4|1,0|0,0| 0   BN[1]
      +----*----------------------+

   The danger is that the HC-Sender might acknowledge the HC-Receiver's
   previous acknowledgement (with sequence number 60), which says that
   Packet 9 was not received, before the HC-Receiver has a chance to
   send a new acknowledgement saying that Packet 9 actually was
   received.  Therefore, when packet 9 arrived, the HC-Receiver might
   modify its acknowledgement record to:

      1. ack_seqno = 59, ack_ackno = 3, ack_nonce = 1.

      2. ack_seqno = 60, ack_ackno = 3, ack_nonce = 1.

   That is, Ack 60 is now treated like a duplicate of Ack 59.  This
   would prevent the Tail pointer from moving past packet 9 until the
   HC-Receiver knows that the HC-Sender has seen an Ack Vector
   indicating that packet's arrival.

A.4.  Processing Acknowledgements

   When the HC-Sender receives an acknowledgement, it generally cares
   about the number of packets that were dropped and/or ECN marked.  It
   simply reads this off the Ack Vector.  Additionally, it should check
   the ECN Nonce for correctness.  (As described in Section 11.4.1, it
   may want to keep more detailed information about acknowledged
   packets in case packets change states between acknowledgements, or
   in case the application queries whether a packet arrived.)

   The HC-Sender must also acknowledge the HC-Receiver's
   acknowledgements so that the HC-Receiver can free old Ack Vector
   state.  (Since Ack Vector acknowledgements are reliable, the
   HC-Receiver must maintain and resend Ack Vector information until it
   is sure that the HC-Sender has received that information.)  A simple
   algorithm suffices: since Ack Vector acknowledgements are
   cumulative, a single acknowledgement number tells HC-Receiver how
   much ack information has arrived.  Assuming that the HC-Receiver
   sends no data, the HC-Sender can ensure that at least once a
   round-trip time, it sends a DCCP-DataAck packet acknowledging the
   latest DCCP-Ack packet it has received.  Of course, the HC-Sender
   only needs to acknowledge the HC-Receiver's acknowledgements if the
   HC-Sender is also sending data.  If the HC-Sender is not sending
   data, then the HC-Receiver's Ack Vector state is stable, and there
   is no need to shrink it.
   The HC-Sender must watch for drops and ECN marks on received
   DCCP-Ack packets so that it can adjust the HC-Receiver's ack-sending
   rate -- for example, with Ack Ratio -- in response to congestion.

   If the other half-connection is not quiescent -- that is, the
   HC-Receiver is sending data to the HC-Sender, possibly using another
   CCID -- then the acknowledgements on that half-connection are
   sufficient for the HC-Receiver to free its state.

B.  Appendix: Partial Checksumming Design Motivation

   A great deal of discussion has taken place regarding the utility of
   allowing a DCCP sender to restrict the checksum so that it does not
   cover the complete packet.  This section attempts to capture some of
   the rationale behind specific details of DCCP design.

   Many of the applications that we envisage using DCCP are resilient
   to some degree of data loss, or they would typically have chosen a
   reliable transport.  Some of these applications may also be
   resilient to data corruption -- some audio payloads, for example.
   These resilient applications might prefer to receive corrupted data
   rather than have DCCP drop a corrupted packet.  This is particularly
   because of congestion control: DCCP cannot tell the difference
   between packets dropped due to corruption and packets dropped due to
   congestion, and so it must reduce the transmission rate accordingly.
   This response may cause the connection to receive less bandwidth
   than it is due; corruption in some networking technologies is
   independent of, or at least not always correlated to, congestion.
   Therefore, corrupted packets do not need to cause as strong a
   reduction in transmission rate as the congestion response would
   dictate (so long as the DCCP header and options are not corrupt).

   Thus DCCP allows the checksum to cover all of the packet, just the
   DCCP header, or both the DCCP header and some number of bytes from
   the application data.  If the application cannot tolerate any data
   corruption, then the checksum must cover the whole packet.  If the
   application would prefer to tolerate some corruption rather than
   have the packet dropped, then it can set the checksum to cover only
   part of the packet (but always the DCCP header).  In addition, if
   the application wishes to decouple checksumming of the DCCP header
   from checksumming of the application data, it may do so by including
   the Data Checksum option.  This would allow DCCP to discard
   corrupted application data, but still not mistake the corruption for
   network congestion.

   Thus, from the application point of view, partial checksums seem to
   be a desirable feature.  However, the usefulness of partial
   checksums depends on partially corrupted packets being delivered to
   the receiver.  If the link-layer CRC always discards corrupted
   packets, then this will not happen, and so the usefulness of partial
   checksums would be restricted to corruption that occurred in routers
   and other places not covered by link CRCs.  There does not appear to
   be consensus on how likely it is that future network links that
   suffer significant corruption will not cover the entire packet with
   a single strong CRC.  DCCP makes it possible to tailor such links to
   the application, but it is difficult to predict if this will be
   compelling for future link technologies.

   In addition, partial checksums do not co-exist well with IP-level
   authentication mechanisms such as IPsec AH, which cover the entire
   packet with a cryptographic hash.  Thus, if cryptographic
   authentication mechanisms are required to co-exist with partial
   checksums, the authentication must be carried in the application
   data.
   A possible mode of usage would appear to be similar to that of
   Secure RTP.  However, such "application-level" authentication does
   not protect the DCCP option negotiation and state machine from
   forged packets.  An alternative would be to use IPsec ESP, and use
   encryption to protect the DCCP headers against attack, while using
   the DCCP header validity checks to authenticate that the header is
   from someone who possessed the correct key.  However, while this is
   resistant to replay (due to the DCCP sequence number), it is not by
   itself resistant to some forms of man-in-the-middle attacks because
   the application data is not tightly coupled to the packet header.
   Thus an application-level authentication probably needs to be
   coupled with IPsec ESP or a similar mechanism to provide a
   reasonably complete security solution.  The overhead of such a
   solution might be unacceptable for some applications that would
   otherwise wish to use partial checksums.

   On balance, the authors believe that DCCP partial checksums have the
   potential to enable some future uses that would otherwise be
   difficult.  As the cost and complexity of supporting them is small,
   it seems worth including them at this time.  It remains to be seen
   whether they are useful in practice.

Normative References

   [RFC 793] J. Postel, editor.  Transmission Control Protocol.
       RFC 793.

   [RFC 1191] J. C. Mogul and S. E. Deering.  Path MTU Discovery.
       RFC 1191.

   [RFC 2119] S. Bradner.  Key Words For Use in RFCs to Indicate
       Requirement Levels.  RFC 2119.

   [RFC 2434] T. Narten and H. Alvestrand.  Guidelines for Writing an
       IANA Considerations Section in RFCs.  RFC 2434.

   [RFC 2460] S. Deering and R. Hinden.  Internet Protocol, Version 6
       (IPv6) Specification.  RFC 2460.

   [RFC 3168] K.K. Ramakrishnan, S. Floyd, and D. Black.  The Addition
       of Explicit Congestion Notification (ECN) to IP.  RFC 3168.

   [RFC 3309] J. Stone, R. Stewart, and D. Otis.  Stream Control
       Transmission Protocol (SCTP) Checksum Change.  RFC 3309.

   [RFC 3692] T. Narten.  Assigning Experimental and Testing Numbers
       Considered Useful.  RFC 3692.

   [RFC 3775] D. Johnson, C. Perkins, and J. Arkko.  Mobility Support
       in IPv6.  RFC 3775.

   [RFC 3828] L-A. Larzon, M. Degermark, S. Pink, L-E. Jonsson, editor,
       and G. Fairhurst, editor.  The Lightweight User Datagram Protocol
       (UDP-Lite).  RFC 3828.

Informative References

   [BB01] S.M. Bellovin and M. Blaze.  Cryptographic Modes of Operation
       for the Internet.  2nd NIST Workshop on Modes of Operation,
       August 2001.

   [BEL98] S.M. Bellovin.  Cryptography and the Internet.  Proc. CRYPTO
       '98 (LNCS 1462), pp. 46-55, August 1998.

   [CCID 2 PROFILE] S. Floyd and E. Kohler.  Profile for DCCP
       Congestion Control ID 2: TCP-like Congestion Control.
       draft-ietf-dccp-ccid2-08.txt, work in progress, November 2004.

   [CCID 3 PROFILE] S. Floyd, E. Kohler, and J. Padhye.  Profile for
       DCCP Congestion Control ID 3: TFRC Congestion Control.
       draft-ietf-dccp-ccid3-08.txt, work in progress, November 2004.

   [M85] Robert T. Morris.  A Weakness in the 4.2BSD Unix TCP/IP
       Software.  Computer Science Technical Report 117, AT&T Bell
       Laboratories, Murray Hill, NJ, February 1985.

   [PMTUD] Matt Mathis, John Heffner, and Kevin Lahey.  Path MTU
       Discovery.  draft-ietf-pmtud-method-01.txt, work in progress,
       February 2004.

   [RFC 792] J. Postel, editor.  Internet Control Message Protocol.
       RFC 792.

   [RFC 1750] D. Eastlake, S. Crocker, and J. Schiller.  Randomness
       Recommendations for Security.  RFC 1750.

   [RFC 1948] S. Bellovin.  Defending Against Sequence Number Attacks.
       RFC 1948.

   [RFC 2018] M. Mathis, J. Mahdavi, S. Floyd, and A. Romanow.  TCP
       Selective Acknowledgement Options.  RFC 2018.

   [RFC 2401] S. Kent and R. Atkinson.
       Security Architecture for the Internet Protocol.  RFC 2401.

   [RFC 2581] M. Allman, V. Paxson, and W. Stevens.  TCP Congestion
       Control.  RFC 2581.

   [RFC 2960] R. Stewart, Q. Xie, K. Morneault, C. Sharp, H.
       Schwarzbauer, T. Taylor, I. Rytina, M. Kalla, L. Zhang, and V.
       Paxson.  Stream Control Transmission Protocol.  RFC 2960.

   [RFC 3124] H. Balakrishnan and S. Seshan.  The Congestion Manager.
       RFC 3124.

   [RFC 3360] S. Floyd.  Inappropriate TCP Resets Considered Harmful.
       RFC 3360.

   [RFC 3448] M. Handley, S. Floyd, J. Padhye, and J. Widmer.  TCP
       Friendly Rate Control (TFRC): Protocol Specification.  RFC 3448.

   [RFC 3540] N. Spring, D. Wetherall, and D. Ely.  Robust Explicit
       Congestion Notification (ECN) Signaling with Nonces.  RFC 3540.

   [RFC 3550] H. Schulzrinne, S. Casner, R. Frederick, and V. Jacobson.
       RTP: A Transport Protocol for Real-Time Applications.  STD 64.
       RFC 3550.

   [RFC 3611] T. Friedman, R. Caceres, and A. Clark, editors.  RTP
       Control Protocol Extended Reports (RTCP XR).  RFC 3611.

   [RFC 3819] P. Karn, editor, C. Bormann, G. Fairhurst, D. Grossman,
       R. Ludwig, J. Mahdavi, G. Montenegro, J. Touch, and L. Wood.
       Advice for Internet Subnetwork Designers.  RFC 3819.

   [SHHP00] Oliver Spatscheck, Jorgen S. Hansen, John H. Hartman, and
       Larry L. Peterson.  Optimizing TCP Forwarder Performance.
       IEEE/ACM Transactions on Networking 8(2):146-157, April 2000.

   [SYNCOOKIES] Daniel J. Bernstein.  SYN Cookies.
       http://cr.yp.to/syncookies.html, as of July 2003.

Authors' Addresses

   Eddie Kohler
   4531C Boelter Hall
   UCLA Computer Science Department
   Los Angeles, CA 90095
   USA

   Mark Handley
   Department of Computer Science
   University College London
   Gower Street
   London WC1E 6BT
   UK

   Sally Floyd
   ICSI Center for Internet Research
   1947 Center Street, Suite 600
   Berkeley, CA 94704
   USA

Full Copyright Statement

   Copyright (C) The Internet Society 2004.  This document is subject
   to the rights, licenses and restrictions contained in BCP 78, and
   except as set forth therein, the authors retain all their rights.

   This document and the information contained herein are provided on
   an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
   REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE
   INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR
   IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed
   to pertain to the implementation or use of the technology described
   in this document or the extent to which any license under such
   rights might or might not be available; nor does it represent that
   it has made any independent effort to identify any such rights.
   Information on the procedures with respect to rights in RFC
   documents can be found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use
   of such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository
   at http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard.  Please address the information to the IETF at
   ietf-ipr@ietf.org.