idnits 2.17.1

draft-ietf-dccp-spec-13.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

  ** It looks like you're using RFC 3978 boilerplate. You should update this
     to the boilerplate described in the IETF Trust License Policy document
     (see https://trustee.ietf.org/license-info), which is required now.

  -- Found old boilerplate from RFC 3978, Section 5.1 on line 16.

  -- Found old boilerplate from RFC 3978, Section 5.5 on line 5958.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 1 on line 5969.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 2 on line 5976.

  -- Found old boilerplate from RFC 3979, Section 5, paragraph 3 on line 5982.

  ** This document has an original RFC 3978 Section 5.4 Copyright Line,
     instead of the newer IETF Trust Copyright according to RFC 4748.

  ** This document has an original RFC 3978 Section 5.5 Disclaimer, instead of
     the newer disclaimer which includes the IETF Trust according to RFC 4748.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

  == No 'Intended status' indicated for this document; assuming Proposed
     Standard

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the RFC 3978 Section 5.4 Copyright Line does not
     match the current year

  == Line 1837 has weird spacing: '...t value snd...'

  == Line 2399 has weird spacing: '...loseReq seq...'

  == The document seems to use 'NOT RECOMMENDED' as an RFC 2119 keyword, but
     does not include the phrase in its RFC 2119 key words list.

  -- The document seems to lack a disclaimer for pre-RFC5378 work, but may
     have content which was first submitted before 10 November 2008. If you
     have contacted all the original authors and they are all willing to grant
     the BCP78 rights to the IETF Trust, then this is fine, and you can ignore
     this comment. If not, you may need to add the pre-RFC5378 disclaimer.
     (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (2 December 2005) is 6712 days in the past. Is this
     intentional?

  -- Found something which looks like a code comment -- if you have code
     sections in the document, please surround them with '<CODE BEGINS>' and
     '<CODE ENDS>' lines.

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Missing Reference: 'CLOSED' is mentioned on line 846, but not defined

  == Missing Reference: 'LISTEN' is mentioned on line 846, but not defined

  == Missing Reference: 'TIMEWAIT' is mentioned on line 855, but not defined

  == Missing Reference: 'Nonce 0' is mentioned on line 4585, but not defined

  == Missing Reference: 'Nonce 1' is mentioned on line 4585, but not defined

  == Missing Reference: 'AWL' is mentioned on line 2362, but not defined

  == Missing Reference: 'AWH' is mentioned on line 2362, but not defined

  == Missing Reference: 'SWL' is mentioned on line 2362, but not defined

  == Missing Reference: 'SWH' is mentioned on line 2362, but not defined

  == Missing Reference: 'RFC TBA' is mentioned on line 3614, but not defined

  == Missing Reference: 'DrpCd' is mentioned on line 4343, but not defined

  == Missing Reference: 'E' is mentioned on line 5462, but not defined

  -- Looks like a reference, but probably isn't: '1' on line 5673

  -- Looks like a reference, but probably isn't: '0' on line 5656

  == Unused Reference: 'RFC 2119' is defined on line 5810, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC 2434' is defined on line 5813, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC 2460' is defined on line 5816, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC 1948' is defined on line 5869, but no explicit
     reference was found in the text

  ** Obsolete normative reference: RFC 793 (Obsoleted by RFC 9293)

  ** Obsolete normative reference: RFC 2434 (Obsoleted by RFC 5226)

  ** Obsolete normative reference: RFC 2460 (Obsoleted by RFC 8200)

  ** Obsolete normative reference: RFC 3309 (Obsoleted by RFC 4960)

  ** Obsolete normative reference: RFC 3775 (Obsoleted by RFC 6275)

  == Outdated reference: A later version (-11) exists of
     draft-ietf-pmtud-method-01

  -- Obsolete informational reference (is this intentional?): RFC 1750
     (Obsoleted by RFC 4086)

  -- Obsolete informational reference (is this intentional?): RFC 1948
     (Obsoleted by RFC 6528)

  -- Obsolete informational reference (is this intentional?): RFC 2401
     (Obsoleted by RFC 4301)

  -- Obsolete informational reference (is this intentional?): RFC 2463
     (Obsoleted by RFC 4443)

  -- Obsolete informational reference (is this intentional?): RFC 2581
     (Obsoleted by RFC 5681)

  -- Obsolete informational reference (is this intentional?): RFC 2960
     (Obsoleted by RFC 4960)

  -- Obsolete informational reference (is this intentional?): RFC 3448
     (Obsoleted by RFC 5348)

     Summary: 8 errors (**), 0 flaws (~~), 22 warnings (==), 17 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.
--------------------------------------------------------------------------------

1    Internet Engineering Task Force                             Eddie Kohler
2    INTERNET-DRAFT                                                      UCLA
3    draft-ietf-dccp-spec-13.txt                                 Mark Handley
4    Expires: 2 June 2006                                                 UCL
5                                                                 Sally Floyd
6                                                                        ICIR
7                                                             2 December 2005

9                  Datagram Congestion Control Protocol (DCCP)

11   Status of this Memo

13      By submitting this Internet-Draft, each author represents that any
14      applicable patent or other IPR claims of which he or she is aware
15      have been or will be disclosed, and any of which he or she becomes
16      aware will be disclosed, in accordance with Section 6 of BCP 79.

18      Internet-Drafts are working documents of the Internet Engineering
19      Task Force (IETF), its areas, and its working groups. Note that
20      other groups may also distribute working documents as Internet-
21      Drafts.

23      Internet-Drafts are draft documents valid for a maximum of six
24      months and may be updated, replaced, or obsoleted by other documents
25      at any time. It is inappropriate to use Internet-Drafts as
26      reference material or to cite them other than as "work in progress."

28      The list of current Internet-Drafts can be accessed at
29      http://www.ietf.org/ietf/1id-abstracts.txt.

31      The list of Internet-Draft Shadow Directories can be accessed at
32      http://www.ietf.org/shadow.html.

34      This Internet-Draft will expire on 2 June 2006.

36   Abstract

38      The Datagram Congestion Control Protocol (DCCP) is a transport
39      protocol that provides bidirectional unicast connections of
40      congestion-controlled unreliable datagrams. DCCP is suitable for
41      applications that transfer fairly large amounts of data, but can
42      benefit from control over the tradeoff between timeliness and
43      reliability.

45      TO BE DELETED BY THE RFC EDITOR UPON PUBLICATION:

47      Changes since draft-ietf-dccp-spec-08.txt:

49      * Added minimum Sequence Window.

51      * Init Cookie implementation sketch.

53      * Include reasoning for ignoring options on DCCP-Data.

55      * More Aggression Penalty explanation.
57      * More explanation on Ack Vectors that report information on packets
58        that haven't been sent.

60      Changes since draft-ietf-dccp-spec-07.txt:

62      * Many changes, not listed here, for WGLC.

64      * The more stringent Sequence Number checks on DCCP-Sync and DCCP-
65        SyncAck packets become SHOULD, not MAY.

67      Changes since draft-ietf-dccp-spec-06.txt:

69      * Change extended sequence numbers. Now 48-bit sequence numbers are
70        MANDATORY, and all packet types except Data, Ack, and DataAck always
71        use 48-bit sequence numbers. This change improves DCCP's robustness
72        against blind attacks.

74      * Removed empty Change options.

76      * Allow preference list changes during feature negotiations (this
77        seems easier to implement than the alternative). This required a
78        new feature negotiation state, UNSTABLE.

80      * Add Minimum Checksum Coverage feature.

82      * Add Reset Congestion State option.

84      * Simplify the implementation of CCID-specific option processing: no
85        need to check whether the CCID feature is being negotiated.

87      * Many more minor changes.

89      Changes since draft-ietf-dccp-spec-05.txt:

91      * Organization overhaul.

93      * Add pseudocode for event processing.

95      * Remove # NDP; replace with Ack Count.

97      * Remove Identification, Challenge, ID Regime, and Connection Nonce.

99      * Data Checksum (formerly Payload Checksum) uses a 32-bit CRC.

101     * Switch location of non-negotiable features to clarify
102       presentation; now the feature location controls its value.

104     * Rename "value type" to "reconciliation rule".

106     * Rename "Reset Reason" to "Reset Code".

108     * Mobility ID becomes 128 bits long.

110     * Add probabilities to Mobility ID discussion.

112     * Add SyncAck.

114  Table of Contents

116     1. Introduction
117     2. Design Rationale
118     3. Conventions and Terminology
119         3.1. Numbers and Fields
120         3.2. Parts of a Connection
121         3.3. Features
122         3.4. Round-Trip Times
123         3.5. Security Limitation
124         3.6. Robustness Principle
125     4. Overview
126         4.1. Packet Types
127         4.2. Packet Sequencing
128         4.3. States
129         4.4. Congestion Control Mechanisms
130         4.5. Connection Features
131         4.6. Differences From TCP
132         4.7. Example Connection
133     5. Packet Formats
134         5.1. Generic Header
135         5.2. DCCP-Request Packets
136         5.3. DCCP-Response Packets
137         5.4. DCCP-Data, DCCP-Ack, and DCCP-DataAck Packets
138         5.5. DCCP-CloseReq and DCCP-Close Packets
139         5.6. DCCP-Reset Packets
140         5.7. DCCP-Sync and DCCP-SyncAck Packets
141         5.8. Options
142             5.8.1. Padding Option
143             5.8.2. Mandatory Option
144     6. Feature Negotiation
145         6.1. Change Options
146         6.2. Confirm Options
147         6.3. Reconciliation Rules
148             6.3.1. Server-Priority
149             6.3.2. Non-Negotiable
150         6.4. Feature Numbers
151         6.5. Feature Negotiation Examples
152         6.6. Option Exchange
153             6.6.1. Normal Exchange
154             6.6.2. Processing Received Options
155             6.6.3. Loss and Retransmission
156             6.6.4. Reordering
157             6.6.5. Preference Changes
158             6.6.6. Simultaneous Negotiation
159             6.6.7. Unknown Features
160             6.6.8. Invalid Options
161             6.6.9. Mandatory Feature Negotiation

163     7. Sequence Numbers
164         7.1. Variables
165         7.2. Initial Sequence Numbers
166         7.3. Quiet Time
167         7.4. Acknowledgement Numbers
168         7.5. Validity and Synchronization
169             7.5.1. Sequence and Acknowledgement Number
170             Windows
171             7.5.2. Sequence Window Feature
172             7.5.3. Sequence-Validity Rules
173             7.5.4. Handling Sequence-Invalid Packets
174             7.5.5. Sequence Number Attacks
175             7.5.6. Sequence Number Handling Examples
176         7.6. Short Sequence Numbers
177             7.6.1. Allow Short Sequence Numbers Feature
178             7.6.2. When to Avoid Short Sequence Numbers
179         7.7. NDP Count and Detecting Application Loss
180             7.7.1. NDP Count Usage Notes
181             7.7.2. Send NDP Count Feature
182     8. Event Processing
183         8.1. Connection Establishment
184             8.1.1. Client Request
185             8.1.2. Service Codes
186             8.1.3. Server Response
187             8.1.4. Init Cookie Option
188             8.1.5. Handshake Completion
189         8.2. Data Transfer
190         8.3. Termination
191             8.3.1. Abnormal Termination
192         8.4. DCCP State Diagram
193         8.5. Pseudocode
194     9. Checksums
195         9.1. Header Checksum Field
196         9.2. Header Checksum Coverage Field
197             9.2.1. Minimum Checksum Coverage Feature
198         9.3. Data Checksum Option
199             9.3.1. Check Data Checksum Feature
200             9.3.2. Checksum Usage Notes
201     10. Congestion Control
202         10.1. TCP-like Congestion Control
203         10.2. TFRC Congestion Control
204         10.3. CCID-Specific Options, Features, and Reset
205         Codes
206         10.4. CCID Profile Requirements
207         10.5. Congestion State
208     11. Acknowledgements
209         11.1. Acks of Acks and Unidirectional Connections
210         11.2. Ack Piggybacking
211         11.3. Ack Ratio Feature
212         11.4. Ack Vector Options
213             11.4.1. Ack Vector Consistency
214             11.4.2. Ack Vector Coverage
215         11.5. Send Ack Vector Feature
216         11.6. Slow Receiver Option
217         11.7. Data Dropped Option
218             11.7.1. Data Dropped and Normal Congestion
219             Response
220             11.7.2. Particular Drop Codes
221     12. Explicit Congestion Notification
222         12.1. ECN Incapable Feature
223         12.2. ECN Nonces
224         12.3. Aggression Penalties
225     13. Timing Options
226         13.1. Timestamp Option
227         13.2. Elapsed Time Option
228         13.3. Timestamp Echo Option
229     14. Maximum Packet Size
230         14.1. Measuring PMTU
231         14.2. Sender Behavior
232     15. Forward Compatibility
233     16. Middlebox Considerations
234     17. Relations to Other Specifications
235         17.1. RTP
236         17.2. Congestion Manager and Multiplexing
237     18. Security Considerations
238         18.1. Security Considerations for Partial
239         Checksums
240     19. IANA Considerations
241         19.1. Packet Types Registry
242         19.2. Reset Codes Registry
243         19.3. Option Types Registry
244         19.4. Feature Numbers Registry
245         19.5. Congestion Control Identifiers Registry
246         19.6. Ack Vector States Registry
247         19.7. Drop Codes Registry
248         19.8. Service Codes Registry
249         19.9. Port Numbers Registry
250     20. Thanks
251     A. Appendix: Ack Vector Implementation Notes
252         A.1. Packet Arrival
253             A.1.1. New Packets
254             A.1.2. Old Packets
255         A.2. Sending Acknowledgements
256         A.3. Clearing State
257         A.4. Processing Acknowledgements
258     B. Appendix: Partial Checksumming Design Motivation
259     Normative References
260     Informative References
261     Authors' Addresses
262     Full Copyright Statement
263     Intellectual Property

264  List of Tables

266     Table 1: DCCP Packet Types
267     Table 2: DCCP Reset Codes
268     Table 3: DCCP Options
269     Table 4: DCCP Feature Numbers
270     Table 5: DCCP Congestion Control Identifiers
271     Table 6: DCCP Ack Vector States
272     Table 7: DCCP Drop Codes

274  1. Introduction

276     The Datagram Congestion Control Protocol (DCCP) is a transport
277     protocol that implements bidirectional, unicast connections of
278     congestion-controlled, unreliable datagrams. Specifically, DCCP
279     provides:

281     o Unreliable flows of datagrams, with acknowledgements.

283     o Reliable handshakes for connection setup and teardown.

285     o Reliable negotiation of options, including negotiation of a
286       suitable congestion control mechanism.

288     o Mechanisms allowing servers to avoid holding state for
289       unacknowledged connection attempts and already-finished
290       connections.

292     o Congestion control incorporating Explicit Congestion Notification
293       (ECN) [RFC 3168] and the ECN Nonce [RFC 3540].

295     o Acknowledgement mechanisms communicating packet loss and ECN
296       information. Acks are transmitted as reliably as the relevant
297       congestion control mechanism requires, possibly completely
298       reliably.

300     o Optional mechanisms that tell the sending application, with high
301       reliability, which data packets reached the receiver, and whether
302       those packets were ECN marked, corrupted, or dropped in the
303       receive buffer.

305     o Path Maximum Transmission Unit (PMTU) discovery [RFC 1191].

307     o A choice of modular congestion control mechanisms. Two
308       mechanisms are currently specified, TCP-like Congestion Control
309       [CCID 2 PROFILE] and TFRC (TCP-Friendly Rate Control) Congestion
310       Control [CCID 3 PROFILE], but DCCP is easily extensible to
311       further forms of unicast congestion control.
313     DCCP is intended for applications such as streaming media that can
314     benefit from control over the tradeoffs between delay and reliable
315     in-order delivery. TCP is not well-suited for these applications,
316     since reliable in-order delivery and congestion control can cause
317     arbitrarily long delays. UDP avoids long delays, but UDP
318     applications that implement congestion control must do so on their
319     own. DCCP provides built-in congestion control, including ECN
320     support, for unreliable datagram flows, avoiding the arbitrary
321     delays associated with TCP. It also implements reliable connection
322     setup, teardown, and feature negotiation.

324  2. Design Rationale

326     One DCCP design goal was to give most streaming UDP applications
327     little reason not to switch to DCCP, once it is deployed. To
328     facilitate this, DCCP was designed to have as little overhead as
329     possible, both in terms of the packet header size and in terms of
330     the state and CPU overhead required at end hosts. Only the minimal
331     necessary functionality was included in DCCP, leaving other
332     functionality, such as forward error correction (FEC), semi-
333     reliability, and multiple streams, to be layered on top of DCCP as
334     desired.

336     Different forms of conformant congestion control are appropriate for
337     different applications. For example, on-line games might want to
338     make quick use of any available bandwidth, while streaming media
339     might trade off this responsiveness for a steadier, less bursty
340     rate. (Sudden rate changes can cause unacceptable UI glitches, such
341     as audible pauses or clicks in the playout stream.) DCCP thus
342     allows applications to choose from a set of congestion control
343     mechanisms. One alternative, TCP-like Congestion Control, halves
344     the congestion window in response to a packet drop or mark, as in
345     TCP. Applications using this congestion control mechanism will
346     respond quickly to changes in available bandwidth, but must tolerate
347     the abrupt changes in congestion window typical of TCP. A second
348     alternative, TCP-Friendly Rate Control (TFRC) [RFC 3448], a form of
349     equation-based congestion control, minimizes abrupt changes in the
350     sending rate while maintaining longer-term fairness with TCP. Other
351     alternatives can be added as future congestion control mechanisms
352     are standardized.

354     DCCP also lets unreliable traffic safely use ECN. A UDP kernel API
355     might not allow applications to set UDP packets as ECN-capable,
356     since the API could not guarantee the application would properly
357     detect or respond to congestion. DCCP kernel APIs will have no such
358     issues, since DCCP implements congestion control itself.

360     We chose not to require the use of the Congestion Manager [RFC
361     3124], which allows multiple concurrent streams between the same
362     sender and receiver to share congestion control. The current
363     Congestion Manager can only be used by applications that have their
364     own end-to-end feedback about packet losses, but this is not the
365     case for many of the applications currently using UDP. In addition,
366     the current Congestion Manager does not easily support multiple
367     congestion control mechanisms, or lend itself to the use of forms of
368     TFRC where the state about past packet drops or marks is maintained
369     at the receiver rather than at the sender. DCCP should be able to
370     make use of CM where desired by the application, but we do not see
371     any benefit in making the deployment of DCCP contingent on the
372     deployment of CM itself.

374     We intend for DCCP's protocol mechanisms, which are described in
375     this document, to suit any application desiring unicast congestion-
376     controlled streams of unreliable datagrams. The congestion control
377     mechanisms currently approved for use with DCCP, which are described
378     in separate Congestion Control ID Profiles [CCID 2 PROFILE, CCID 3
379     PROFILE], may, however, cause problems for some applications,
380     including high-bandwidth interactive video. These applications
381     should be able to use DCCP once suitable Congestion Control ID
382     Profiles are standardized.

384  3. Conventions and Terminology

386     The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
387     "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
388     document are to be interpreted as described in RFC 2119.

390  3.1. Numbers and Fields

392     All multi-byte numerical quantities in DCCP, such as port numbers,
393     Sequence Numbers, and arguments to options, are transmitted in
394     network byte order (most significant byte first).

396     We occasionally refer to the "left" and "right" sides of a bit
397     field. "Left" means towards the most significant bit, and "right"
398     means towards the least significant bit.

400     Random numbers in DCCP are used for their security properties, and
401     SHOULD be chosen according to the guidelines in RFC 1750.

403     All operations on DCCP sequence numbers, and comparisons such as
404     "greater" and "greatest", use circular arithmetic modulo 2**48.
405     This form of arithmetic preserves the relationships between sequence
406     numbers as they roll over from 2**48 - 1 to 0. Implementation
407     strategies for DCCP sequence numbers will resemble those for other
408     circular arithmetic spaces, including TCP's sequence numbers [RFC
409     793] and DNS's serial numbers [RFC 1982]. Note that the common
410     technique for implementing circular comparison using two's-
411     complement arithmetic, whereby A < B using circular arithmetic if
412     and only if (A - B) < 0 using conventional two's-complement
413     arithmetic, may be used for DCCP sequence numbers, provided they are
414     stored in the most significant 48 bits of 64-bit integers.
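
The circular comparison described above can be sketched as follows. This is an illustrative Python sketch and not part of the specification: the names SEQ_MOD, SEQ_HALF, seq_add, seq_delta, and seq_lt are hypothetical. Python integers are unbounded, so instead of relying on 64-bit two's-complement storage, the sketch maps the 48-bit difference into a signed range and tests its sign, which gives the same answer as the (A - B) < 0 test.

```python
SEQ_BITS = 48
SEQ_MOD = 1 << SEQ_BITS          # sequence numbers live in [0, 2**48)
SEQ_HALF = 1 << (SEQ_BITS - 1)   # half of the sequence space


def seq_add(a: int, n: int) -> int:
    """Advance sequence number a by n, wrapping from 2**48 - 1 to 0."""
    return (a + n) % SEQ_MOD


def seq_delta(a: int, b: int) -> int:
    """Signed circular difference a - b, in the range [-2**47, 2**47)."""
    d = (a - b) % SEQ_MOD
    return d - SEQ_MOD if d >= SEQ_HALF else d


def seq_lt(a: int, b: int) -> bool:
    """True if a < b in circular arithmetic, i.e. (a - b) < 0."""
    return seq_delta(a, b) < 0


if __name__ == "__main__":
    last = SEQ_MOD - 1
    print(seq_add(last, 1))   # wraps around to 0
    print(seq_lt(last, 0))    # True: 0 follows 2**48 - 1
    print(seq_lt(0, last))    # False
```

As with the two's-complement technique, two values exactly 2**47 apart each compare as "less than" the other in this sketch; callers that care about that edge case would need to handle it separately.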
416     Reserved bitfields in DCCP packet headers MUST be set to zero by
417     senders, and MUST be ignored by receivers, unless otherwise
418     specified. This is to allow for future protocol extensions. In
419     particular, DCCP processors MUST NOT reset a DCCP connection simply
420     because a Reserved field has non-zero value [RFC 3360].

422  3.2. Parts of a Connection

424     Each DCCP connection runs between two hosts, which we often name
425     DCCP A and DCCP B. Each connection is actively initiated by one of
426     the hosts, which we call the client; the other, initially passive
427     host is called the server. The term "DCCP endpoint" is used to
428     refer to either of the two hosts explicitly named by the connection
429     (the client and the server). The term "DCCP processor" refers more
430     generally to any host that might need to process a DCCP header; this
431     includes the endpoints and any middleboxes on the path, such as
432     firewalls and network address translators.

434     DCCP connections are bidirectional: data may pass from either
435     endpoint to the other. This means that data and acknowledgements
436     may be flowing in both directions simultaneously. Logically,
437     however, a DCCP connection consists of two separate unidirectional
438     connections, called half-connections. Each half-connection consists
439     of the application data sent by one endpoint and the corresponding
440     acknowledgements sent by the other endpoint. We can illustrate this
441     as follows:

443        +--------+  A-to-B half-connection:         +--------+
444        |        |  -->  application data  -->      |        |
445        |        |  <--  acknowledgements  <--      |        |
446        | DCCP A |                                  | DCCP B |
447        |        |  B-to-A half-connection:         |        |
448        |        |  <--  application data  <--      |        |
449        +--------+  -->  acknowledgements  -->      +--------+

451     Although they are logically distinct, in practice the half-
452     connections overlap; a DCCP-DataAck packet, for example, contains
453     application data relevant to one half-connection and acknowledgement
454     information relevant to the other.
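
The way a single DCCP-DataAck packet touches both half-connections can be modelled with a small data structure. The following Python sketch is only an illustration under stated assumptions: the class HalfConnection, its field names, and the helper classify_dataack are hypothetical and do not come from the specification. It records, for each half-connection, which endpoint sends application data and which sends acknowledgements.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HalfConnection:
    data_sender: str   # endpoint that sends application data
    ack_sender: str    # endpoint that sends the corresponding acks

    def carries(self, sender: str, kind: str) -> bool:
        """True if content of this kind from this sender belongs here."""
        if kind == "data":
            return sender == self.data_sender
        if kind == "ack":
            return sender == self.ack_sender
        raise ValueError(f"unknown content kind: {kind!r}")


A_TO_B = HalfConnection(data_sender="A", ack_sender="B")
B_TO_A = HalfConnection(data_sender="B", ack_sender="A")


def classify_dataack(sender: str):
    """Return (half-connection for the data, half-connection for the ack)
    for a DCCP-DataAck packet sent by the given endpoint."""
    halves = (A_TO_B, B_TO_A)
    data_hc = next(h for h in halves if h.carries(sender, "data"))
    ack_hc = next(h for h in halves if h.carries(sender, "ack"))
    return data_hc, ack_hc


if __name__ == "__main__":
    # A DataAck from DCCP A: data belongs to A-to-B, ack belongs to B-to-A.
    print(classify_dataack("A"))
```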
456     In the context of a single half-connection, the terms "HC-Sender"
457     and "HC-Receiver" denote the endpoints sending application data and
458     acknowledgements, respectively. For example, DCCP A is the HC-
459     Sender and DCCP B is the HC-Receiver in the A-to-B half-connection.

461  3.3. Features

463     A DCCP feature is a connection attribute on whose value the two
464     endpoints agree. Many properties of a DCCP connection are
465     controlled by features, including the congestion control mechanisms
466     in use on the two half-connections. The endpoints achieve agreement
467     through the exchange of feature negotiation options in DCCP headers.

469     DCCP features are identified by a feature number and an endpoint.
470     The notation "F/X" represents the feature with feature number F
471     located at DCCP endpoint X. Each valid feature number thus
472     corresponds to two features, which are negotiated separately and
473     need not have the same value. The two endpoints know, and agree on,
474     the value of every valid feature. DCCP A is the "feature location"
475     for all features F/A, and the "feature remote" for all features F/B.

477  3.4. Round-Trip Times

479     DCCP round-trip time measurements are performed by congestion
480     control mechanisms; different mechanisms may measure round-trip time
481     in different ways, or not measure it at all. However, the main DCCP
482     protocol does use round-trip times occasionally, such as in the
483     initial values for certain timers. Each DCCP implementation thus
484     defines a default round-trip time for use when no estimate is
485     available; this parameter should default to not less than
486     0.2 seconds, a reasonably conservative round-trip time for Internet
487     TCP connections. Protocol behavior specified in terms of "round-
488     trip time" values actually refers to "a current round-trip time
489     estimate taken by some CCID, or, if no estimate is available, the
490     default round-trip time parameter".
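
The fallback rule in the paragraph above can be written as a one-line selection. This Python sketch is illustrative only: the function round_trip_time and the constant DEFAULT_RTT_SECONDS are hypothetical names, and the value 0.2 assumes an implementation that picks the smallest default the text suggests.

```python
from typing import Optional

# Assumption: this implementation uses the minimum suggested default.
DEFAULT_RTT_SECONDS = 0.2


def round_trip_time(ccid_estimate: Optional[float],
                    default_rtt: float = DEFAULT_RTT_SECONDS) -> float:
    """Return the CCID's current RTT estimate if one exists,
    otherwise the implementation's default round-trip time."""
    if ccid_estimate is not None:
        return ccid_estimate
    return default_rtt


if __name__ == "__main__":
    print(round_trip_time(None))    # no estimate yet: falls back to default
    print(round_trip_time(0.05))    # the CCID has measured 50 ms
```

Passing None rather than 0.0 for "no estimate" keeps a genuinely measured value distinct from a missing one.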
492     The maximum segment lifetime, or MSL, is the maximum length of time
493     a packet can survive in the network. The DCCP MSL should equal that
494     of TCP, which is normally two minutes.

496  3.5. Security Limitation

498     DCCP provides no protection against attackers who can snoop on a
499     connection in progress, or who can guess valid sequence numbers in
500     other ways. Applications desiring stronger security should use
501     IPsec [RFC 2401]; depending on the level of security required,
502     application-level cryptography may also suffice. These issues are
503     discussed further in Sections 18 and 7.5.5.

505  3.6. Robustness Principle

507     DCCP implementations will follow TCP's "general principle of
508     robustness": "be conservative in what you do, be liberal in what you
509     accept from others" [RFC 793].

511  4. Overview

513     DCCP's high-level connection dynamics echo those of TCP.
514     Connections progress through three phases: initiation, including a
515     three-way handshake; data transfer; and termination. Data can flow
516     both ways over the connection. An acknowledgement framework lets
517     senders discover how much data has been lost, and thus avoid
518     unfairly congesting the network. Of course, DCCP provides
519     unreliable datagram semantics, not TCP's reliable bytestream
520     semantics. The application must package its data into explicit
521     frames, and must retransmit its own data as necessary. It may be
522     useful to think of DCCP as TCP minus bytestream semantics and
523     reliability, or as UDP plus congestion control, handshakes, and
524     acknowledgements.

526  4.1. Packet Types

528     Ten packet types implement DCCP's protocol functions. For example,
529     every new connection attempt begins with a DCCP-Request packet sent
530     by the client. A DCCP-Request packet thus resembles a TCP SYN; but
531     DCCP-Request is a packet type, not a flag, so there's no way to send
532     an unexpected combination such as TCP's SYN+FIN+ACK+RST.
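
The contrast drawn above between a packet type and a set of flags can be shown with two small type definitions. This Python sketch is illustrative and makes several assumptions: PacketType and TcpFlag are hypothetical names, the enum values produced by auto() are placeholders rather than on-the-wire type codes, and only four TCP flags are modelled.

```python
from enum import Enum, IntFlag, auto


class PacketType(Enum):
    """The ten DCCP packet types named in this section.

    Values from auto() are placeholders, NOT the wire encoding.
    """
    REQUEST = auto()
    RESPONSE = auto()
    DATA = auto()
    ACK = auto()
    DATAACK = auto()
    CLOSEREQ = auto()
    CLOSE = auto()
    RESET = auto()
    SYNC = auto()
    SYNCACK = auto()


class TcpFlag(IntFlag):
    """A few TCP header flags, modelled as independent bits."""
    FIN = 0x01
    SYN = 0x02
    RST = 0x04
    ACK = 0x10


if __name__ == "__main__":
    # Flags combine freely, so a nonsensical mix is representable.
    odd = TcpFlag.SYN | TcpFlag.FIN | TcpFlag.ACK | TcpFlag.RST
    print(odd)

    # A packet type is a single value; combining two is a type error.
    try:
        PacketType.REQUEST | PacketType.RESET
    except TypeError:
        print("packet types cannot be combined")
```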
534 Eight packet types occur during the progress of a typical 535 connection, shown here. Note the three-way handshakes during 536 initiation and termination. 538 Client Server 539 ------ ------ 540 (1) Initiation 541 DCCP-Request --> 542 <-- DCCP-Response 543 DCCP-Ack --> 544 (2) Data transfer 545 DCCP-Data, DCCP-Ack, DCCP-DataAck --> 546 <-- DCCP-Data, DCCP-Ack, DCCP-DataAck 547 (3) Termination 548 <-- DCCP-CloseReq 549 DCCP-Close --> 550 <-- DCCP-Reset 552 The two remaining packet types are used to resynchronize after 553 bursts of loss. 555 Every DCCP packet starts with a 12-byte generic header. Particular 556 packet types include additional fixed-size header data; for example, 557 DCCP-Acks include an Acknowledgement Number. DCCP options and any 558 application data follow the fixed-size header. 560 The packet types are as follows: 562 DCCP-Request 563 Sent by the client to initiate a connection (the first part of 564 the three-way initiation handshake). 566 DCCP-Response 567 Sent by the server in response to a DCCP-Request (the second 568 part of the three-way initiation handshake). 570 DCCP-Data 571 Used to transmit application data. 573 DCCP-Ack 574 Used to transmit pure acknowledgements. 576 DCCP-DataAck 577 Used to transmit application data with piggybacked 578 acknowledgements. 580 DCCP-CloseReq 581 Sent by the server to request that the client close the 582 connection. 584 DCCP-Close 585 Used by the client or the server to close the connection; 586 elicits a DCCP-Reset in response. 588 DCCP-Reset 589 Used to terminate the connection, either normally or abnormally. 591 DCCP-Sync, DCCP-SyncAck 592 Used to resynchronize sequence numbers after large bursts of 593 loss. 595 4.2. Packet Sequencing 597 Each DCCP packet carries a sequence number, so that losses can be 598 detected and reported. Unlike TCP sequence numbers, which are byte- 599 based, DCCP sequence numbers increment by one per packet. 
For 600 example: 602 DCCP A DCCP B 603 ------ ------ 604 DCCP-Data(seqno 1) --> 605 DCCP-Data(seqno 2) --> 606 <-- DCCP-Ack(seqno 10, ackno 2) 607 DCCP-DataAck(seqno 3, ackno 10) --> 608 <-- DCCP-Data(seqno 11) 610 Every DCCP packet increments the sequence number, whether or not it 611 contains application data. DCCP-Ack pure acknowledgements increment 612 the sequence number; for instance, DCCP B's second packet above uses 613 sequence number 11, since sequence number 10 was used for an 614 acknowledgement. This lets endpoints detect all packet loss, 615 including acknowledgement loss. It also means that endpoints can 616 get out of sync after long bursts of loss; the DCCP-Sync and DCCP- 617 SyncAck packet types are used to recover (Section 7.5). 619 Since DCCP provides unreliable semantics, there are no 620 retransmissions, and it doesn't make sense to have a TCP-style 621 cumulative acknowledgement field. DCCP's Acknowledgement Number 622 field equals the greatest sequence number received, rather than the 623 smallest sequence number not received. Separate options indicate 624 any intermediate sequence numbers that weren't received. 626 4.3. States 628 DCCP endpoints progress through different states during the course 629 of a connection, corresponding roughly to the three phases of 630 initiation, data transfer, and termination. The figure below shows 631 the typical progress through these states for a client and server. 633 Client Server 634 ------ ------ 635 (0) No connection 636 CLOSED LISTEN 638 (1) Initiation 639 REQUEST DCCP-Request --> 640 <-- DCCP-Response RESPOND 641 PARTOPEN DCCP-Ack or DCCP-DataAck --> 643 (2) Data transfer 644 OPEN <-- DCCP-Data, Ack, DataAck --> OPEN 646 (3) Termination 647 <-- DCCP-CloseReq CLOSEREQ 648 CLOSING DCCP-Close --> 649 <-- DCCP-Reset CLOSED 650 TIMEWAIT 651 CLOSED 653 The nine possible states are as follows.
They are listed in 654 increasing order, so that "state >= CLOSEREQ" means the same as 655 "state = CLOSEREQ or state = CLOSING or state = TIMEWAIT". Section 656 8 describes the states in more detail. 658 CLOSED 659 Represents nonexistent connections. 661 LISTEN 662 Represents server sockets in the passive listening state. 663 LISTEN and CLOSED are not associated with any particular DCCP 664 connection. 666 REQUEST 667 A client socket enters this state, from CLOSED, after sending a 668 DCCP-Request packet to try to initiate a connection. 670 RESPOND 671 A server socket enters this state, from LISTEN, after receiving 672 a DCCP-Request from a client. 674 PARTOPEN 675 A client socket enters this state, from REQUEST, after receiving 676 a DCCP-Response from the server. This state represents the 677 third phase of the three-way handshake. The client may send 678 application data in this state, but it MUST include an 679 Acknowledgement Number on all of its packets. 681 OPEN 682 The central, data transfer portion of a DCCP connection. Client 683 and server sockets enter this state from PARTOPEN and RESPOND, 684 respectively. Sometimes we speak of SERVER-OPEN and CLIENT-OPEN 685 states, corresponding to the server's OPEN state and the 686 client's OPEN state. 688 CLOSEREQ 689 A server socket enters this state, from SERVER-OPEN, to signal 690 that the connection is over, but the client must hold TIMEWAIT 691 state. 693 CLOSING 694 Server and client sockets can both enter this state to close the 695 connection. 697 TIMEWAIT 698 A server or client socket remains in this state for 2MSL (4 699 minutes) after the connection has been torn down, to prevent 700 mistakes due to the delivery of old packets. Only one of the 701 endpoints need enter TIMEWAIT state (the other can enter CLOSED 702 state immediately), and a server can request its client to hold 703 TIMEWAIT state using the DCCP-CloseReq packet type. 705 4.4. 
Congestion Control Mechanisms 707 DCCP connections are congestion controlled, but unlike in TCP, DCCP 708 applications have a choice of congestion control mechanism. In 709 fact, the two half-connections can be governed by different 710 mechanisms. Mechanisms are denoted by one-byte congestion control 711 identifiers, or CCIDs. The endpoints negotiate their CCIDs during 712 connection initiation. Each CCID describes how the HC-Sender limits 713 data packet rates, how the HC-Receiver sends congestion feedback via 714 acknowledgements, and so forth. CCIDs 2 and 3 are currently 715 defined; CCIDs 0, 1, and 4-255 are reserved. Other CCIDs may be 716 defined in the future. 718 CCID 2 provides TCP-like Congestion Control, which is similar to 719 that of TCP. The sender maintains a congestion window and sends 720 packets until that window is full. Packets are acknowledged by the 721 receiver. Dropped packets and ECN [RFC 3168] indicate congestion; 722 the response to congestion is to halve the congestion window. 723 Acknowledgements in CCID 2 contain the sequence numbers of all 724 received packets within some window, similar to a selective 725 acknowledgement (SACK) [RFC 2018]. 727 CCID 3 provides TFRC Congestion Control, an equation-based form of 728 congestion control intended to respond to congestion more smoothly 729 than CCID 2. The sender maintains a transmit rate, which it updates 730 using the receiver's estimate of the packet loss and mark rate. 731 Although CCID 3 behaves somewhat differently from TCP in the short term, it 732 is designed to operate fairly with TCP over the long term. 734 Section 10 describes DCCP's CCIDs in more detail. The behaviors of 735 CCIDs 2 and 3 are fully defined in separate profile documents [CCID 736 2 PROFILE, CCID 3 PROFILE]. 738 4.5. Connection Features 740 DCCP endpoints use Change and Confirm options to negotiate and agree 741 on feature values.
Feature negotiation will almost always happen on 742 the connection initiation handshake, but it can begin at any time. 744 There are four feature negotiation options in all: Change L, 745 Confirm L, Change R, and Confirm R. The "L" options are sent by the 746 feature location, and the "R" options are sent by the feature 747 remote. A Change R option says to the feature location, "change 748 this feature value as follows". The feature location responds with 749 Confirm L, meaning "I've changed it". Some features allow Change R 750 options to contain multiple values, sorted in preference order. For 751 example: 753 Client Server 754 ------ ------ 755 Change R(CCID, 2) --> 756 <-- Confirm L(CCID, 2) 757 * agreement that CCID/Server = 2 * 759 Change R(CCID, 3 4) --> 760 <-- Confirm L(CCID, 4, 4 2) 761 * agreement that CCID/Server = 4 * 763 Both exchanges negotiate the CCID/Server feature's value, which is 764 the CCID in use on the server-to-client half-connection. In the 765 second exchange, the client requests that the server use either 766 CCID 3 or CCID 4, with 3 preferred; the server chooses 4 and 767 supplies its preference list, "4 2". 769 The Change L and Confirm R options are used for feature negotiations 770 initiated by the feature location. In the following example, the 771 server requests that CCID/Server be set to 3 or 2, with 3 preferred, 772 and the client agrees. 774 Client Server 775 ------ ------ 776 <-- Change L(CCID, 3 2) 777 Confirm R(CCID, 3, 3 2) --> 778 * agreement that CCID/Server = 3 * 780 Section 6 describes the feature negotiation options further, 781 including the retransmission strategies that make negotiation 782 reliable. 784 4.6. Differences From TCP 786 Differences between DCCP and TCP apart from those discussed so far 787 include: 789 o Copious space for options (up to 1008 bytes or the PMTU). 791 o Different acknowledgement formats. 
The CCID for a connection 792 determines how much acknowledgement information needs to be 793 transmitted. For example, in CCID 2 (TCP-like), this is about 794 one ack per 2 packets, and each ack must declare exactly which 795 packets were received; in CCID 3 (TFRC), it's about one ack per 796 round-trip time, and acks must declare at minimum just the 797 lengths of recent loss intervals. 799 o Denial-of-service (DoS) protection. Several mechanisms help 800 limit the amount of state possibly-misbehaving clients can force 801 DCCP servers to maintain. An Init Cookie option, analogous to 802 TCP's SYN Cookies [SYNCOOKIES], avoids SYN-flood-like attacks. 803 Only one connection endpoint need hold TIMEWAIT state; the DCCP- 804 CloseReq packet, which may only be sent by the server, passes 805 that state to the client. Various rate limits let servers avoid 806 attacks that might force extensive computation or packet 807 generation. 809 o Distinguishing different kinds of loss. A Data Dropped option 810 (Section 11.7) lets an endpoint declare that a packet was dropped 811 because of corruption, because of receive buffer overflow, and so 812 on. This facilitates research into more appropriate rate-control 813 responses for these non-network-congestion losses (although 814 currently such losses will cause a congestion response). 816 o Acknowledgeability. In TCP, a packet may be acknowledged only 817 once the data is reliably queued for application delivery. This 818 does not make sense in DCCP, where an application might, for 819 example, request a drop-from-front receive buffer. A DCCP packet 820 may be acknowledged as soon as its header has been successfully 821 processed. Concretely, a packet becomes acknowledgeable at 822 Step 8 of Section 8.5's packet processing pseudocode. 823 Acknowledgeability does not guarantee data delivery, however: the 824 Data Dropped option may later report that the packet's 825 application data was discarded. 827 o No receive window. 
DCCP is a congestion control protocol, not a 828 flow control protocol. 830 o No simultaneous open. Every connection has one client and one 831 server. 833 o No half-closed states. DCCP has no states corresponding to TCP's 834 FINWAIT and CLOSEWAIT, where one half-connection is explicitly 835 closed while the other is still active. The Data Dropped 836 option's Drop Code 1, Application Not Listening (Section 11.7), 837 can achieve a similar effect, however. 839 4.7. Example Connection 841 The progress of a typical DCCP connection is as follows. (This 842 description is informative, not normative.) 844 Client Server 845 ------ ------ 846 0. [CLOSED] [LISTEN] 847 1. DCCP-Request --> 848 2. <-- DCCP-Response 849 3. DCCP-Ack --> 850 4. DCCP-Data, DCCP-Ack, DCCP-DataAck --> 851 <-- DCCP-Data, DCCP-Ack, DCCP-DataAck 852 5. <-- DCCP-CloseReq 853 6. DCCP-Close --> 854 7. <-- DCCP-Reset 855 8. [TIMEWAIT] 857 1. The client sends the server a DCCP-Request packet specifying the 858 client and server ports, the service being requested, and any 859 features being negotiated, including the CCID that the client 860 would like the server to use. The client may optionally 861 piggyback an application request on the DCCP-Request packet, 862 which the server may ignore. 864 2. The server sends the client a DCCP-Response packet indicating 865 that it is willing to communicate with the client. This 866 response indicates any features and options that the server 867 agrees to, begins other feature negotiations as desired, and 868 optionally includes an Init Cookie that wraps up all this 869 information and which must be returned by the client for the 870 connection to complete. 872 3. The client sends the server a DCCP-Ack packet that acknowledges 873 the DCCP-Response packet. This acknowledges the server's 874 initial sequence number and returns the Init Cookie if there was 875 one in the DCCP-Response. It may also continue feature 876 negotiation. 
The client may piggyback an application-level 877 request on its final ack, producing a DCCP-DataAck packet. 879 4. The server and client then exchange DCCP-Data packets, DCCP-Ack 880 packets acknowledging that data, and, optionally, DCCP-DataAck 881 packets containing data with piggybacked acknowledgements. If 882 the client has no data to send, then the server will send DCCP- 883 Data and DCCP-DataAck packets, while the client will send DCCP- 884 Acks exclusively. (However, the client may not send DCCP-Data 885 packets before receiving at least one non-DCCP-Response packet 886 from the server.) 888 5. The server sends a DCCP-CloseReq packet requesting a close. 890 6. The client sends a DCCP-Close packet acknowledging the close. 892 7. The server sends a DCCP-Reset packet with Reset Code 1, 893 "Closed", and clears its connection state. DCCP-Resets are part 894 of normal connection termination; see Section 5.6. 896 8. The client receives the DCCP-Reset packet and holds state for 897 two maximum segment lifetimes, or 2MSL, to allow any remaining 898 packets to clear the network. 900 An alternative connection closedown sequence is initiated by the 901 client: 903 5b. The client sends a DCCP-Close packet closing the connection. 905 6b. The server sends a DCCP-Reset packet with Reset Code 1, 906 "Closed", and clears its connection state. 908 7b. The client receives the DCCP-Reset packet and holds state for 909 2MSL to allow any remaining packets to clear the network. 911 5. Packet Formats 913 The DCCP header can be from 12 to 1020 bytes long. The initial 12 914 bytes of the header have the same semantics for all currently- 915 defined packet types. Following this comes any additional fixed- 916 length fields required by the packet type, and then a variable- 917 length list of options. The application data area follows the 918 header. In some packet types, this area contains data for the 919 application; in other packet types, its contents are ignored. 
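The 1020-byte upper bound follows from the 8-bit Data Offset field described in Section 5.1, which counts 32-bit words: 255 * 4 = 1020. The informative Python sketch below uses that field to separate the header from the application data area. The name split_packet is hypothetical, and the sketch is simplified: it checks only the 12-byte generic header minimum, whereas Section 5.1 requires the minimum-sized header for the packet's Type.

```python
from typing import Optional, Tuple

MIN_GENERIC_HEADER = 12   # bytes: generic header with X=0
MAX_HEADER = 255 * 4      # 8-bit Data Offset in 32-bit words = 1020 bytes


def split_packet(packet: bytes) -> Optional[Tuple[bytes, bytes]]:
    """Split a DCCP packet into (header, application data area).

    Data Offset is byte 4 of the generic header.  Returns None for a
    packet that a receiver would ignore because its Data Offset is
    too small or extends past the end of the packet.
    """
    if len(packet) < MIN_GENERIC_HEADER:
        return None
    header_len = packet[4] * 4
    if header_len < MIN_GENERIC_HEADER or header_len > len(packet):
        return None
    return packet[:header_len], packet[header_len:]
```

Everything before the returned boundary is generic header, type-specific fields, and options; everything after it is the application data area, whose interpretation depends on the packet type.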
921 +---------------------------------------+ -. 922 | Generic Header | | 923 +---------------------------------------+ | 924 | Additional Fields (depending on type) | +- DCCP Header 925 +---------------------------------------+ | 926 | Options (optional) | | 927 +=======================================+ -' 928 | Application Data Area | 929 +---------------------------------------+ 931 5.1. Generic Header 933 The DCCP generic header takes different forms depending on the value 934 of X, the Extended Sequence Numbers bit. If X is one, the Sequence 935 Number field is 48 bits long and the generic header takes 16 bytes, 936 as follows. 938 0 1 2 3 939 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 940 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 941 | Source Port | Dest Port | 942 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 943 | Data Offset | CCVal | CsCov | Checksum | 944 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 945 | | |X| | . 946 | Res | Type |=| Reserved | Sequence Number (high bits) . 947 | | |1| | . 948 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 949 . Sequence Number (low bits) | 950 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 952 If X is zero, only the low 24 bits of the Sequence Number are 953 transmitted, and the generic header is 12 bytes long. 955 0 1 2 3 956 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 957 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 958 | Source Port | Dest Port | 959 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 960 | Data Offset | CCVal | CsCov | Checksum | 961 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 962 | | |X| | 963 | Res | Type |=| Sequence Number (low bits) | 964 | | |0| | 965 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 967 The generic header fields are defined as follows. 
969 Source and Destination Ports: 16 bits each 970 These fields identify the connection, similar to the 971 corresponding fields in TCP and UDP. The Source Port represents 972 the relevant port on the endpoint that sent this packet, the 973 Destination Port the relevant port on the other endpoint. When 974 initiating a connection, the client SHOULD choose its Source 975 Port randomly to reduce the likelihood of attack. 977 DCCP APIs should treat port numbers similarly to TCP and UDP 978 port numbers. For example, machines that distinguish between 979 "privileged" and "unprivileged" ports for TCP and UDP should do 980 the same for DCCP. See Section 19.9 for more discussion. 982 Data Offset: 8 bits 983 The offset from the start of the packet's DCCP header to the 984 start of its application data area, in 32-bit words. The 985 receiver MUST ignore packets whose Data Offset is smaller than 986 the minimum-sized header for the given Type, or larger than the 987 DCCP packet itself. 989 CCVal: 4 bits 990 Used by the HC-Sender CCID. For example, the A-to-B CCID's 991 sender, which is active at DCCP A, MAY send 4 bits of 992 information per packet to its receiver by encoding that 993 information in CCVal. The sender MUST set CCVal to zero unless 994 its HC-Sender CCID specifies otherwise, and the receiver MUST 995 ignore the CCVal field unless its HC-Receiver CCID specifies 996 otherwise. 998 Checksum Coverage (CsCov): 4 bits 999 Checksum Coverage determines the parts of the packet that are 1000 covered by the Checksum field. This always includes the DCCP 1001 header and options, but some or all of the application data may 1002 be excluded. This can improve performance on noisy links for 1003 applications that can tolerate corruption. See Section 9. 1005 Checksum: 16 bits 1006 The Internet checksum of the packet's DCCP header (including 1007 options), a network-layer pseudoheader, and, depending on 1008 Checksum Coverage, all, some, or none of the application data. 
1009 See Section 9. 1011 Reserved (Res): 3 bits 1012 Senders MUST set this field to all zeroes on generated packets, 1013 and receivers MUST ignore its value. 1015 Type: 4 bits 1016 The Type field specifies the type of the packet. The following 1017 values are defined: 1019 Type Meaning 1020 ---- ------- 1021 0 DCCP-Request 1022 1 DCCP-Response 1023 2 DCCP-Data 1024 3 DCCP-Ack 1025 4 DCCP-DataAck 1026 5 DCCP-CloseReq 1027 6 DCCP-Close 1028 7 DCCP-Reset 1029 8 DCCP-Sync 1030 9 DCCP-SyncAck 1031 10-15 Reserved 1033 Table 1: DCCP Packet Types 1035 Receivers MUST ignore any packets with reserved type. That is, 1036 packets with reserved type MUST NOT be processed and they MUST 1037 NOT be acknowledged as received. 1039 Extended Sequence Numbers (X): 1 bit 1040 Set to one to indicate the use of an extended generic header 1041 with 48-bit Sequence and Acknowledgement Numbers. DCCP-Data, 1042 DCCP-DataAck, and DCCP-Ack packets MAY set X to zero or one. 1043 All DCCP-Request, DCCP-Response, DCCP-CloseReq, DCCP-Close, 1044 DCCP-Reset, DCCP-Sync, and DCCP-SyncAck packets MUST set X to 1045 one; endpoints MUST ignore any such packets with X set to zero. 1046 High-rate connections SHOULD set X to one on all packets to gain 1047 increased protection against wrapped sequence numbers and 1048 attacks. See Section 7.6. 1050 Sequence Number: 48 or 24 bits 1051 Identifies the packet uniquely in the sequence of all packets 1052 the source sent on this connection. Sequence Number increases 1053 by one with every packet sent, including packets such as DCCP- 1054 Ack that carry no application data. See Section 7. 1056 All currently defined packet types except DCCP-Request and DCCP-Data 1057 carry an Acknowledgement Number Subheader in the four or eight bytes 1058 immediately following the generic header. When X=1, its format is: 1060 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1061 | Reserved | Acknowledgement Number . 1062 | | (high bits) . 
1063 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1064 . Acknowledgement Number (low bits) | 1065 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1067 When X=0, only the low 24 bits of the Acknowledgement Number are 1068 transmitted, giving the Acknowledgement Number Subheader this 1069 format: 1071 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1072 | Reserved | Acknowledgement Number (low bits) | 1073 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1075 Reserved: 16 or 8 bits 1076 Senders MUST set this field to all zeroes on generated packets, 1077 and receivers MUST ignore its value. 1079 Acknowledgement Number: 48 or 24 bits 1080 Generally contains GSR, the Greatest Sequence Number Received on 1081 any acknowledgeable packet so far. A packet is acknowledgeable 1082 if and only if its header was successfully processed by the 1083 receiver; Section 7.4 describes this further. Options such as 1084 Ack Vector (Section 11.4) combine with the Acknowledgement 1085 Number to provide precise information about which packets have 1086 arrived. 1088 Acknowledgement Numbers on DCCP-Sync and DCCP-SyncAck packets 1089 need not equal GSR. See Section 5.7. 1091 5.2. DCCP-Request Packets 1093 A client initiates a DCCP connection by sending a DCCP-Request 1094 packet. These packets MAY contain application data, and MUST use 1095 48-bit sequence numbers (X=1). 
1097 0 1 2 3 1098 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1099 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1100 / Generic DCCP Header with X=1 (16 bytes) / 1101 / with Type=0 (DCCP-Request) / 1102 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1103 | Service Code | 1104 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1105 / Options and Padding / 1106 +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ 1107 / Application Data / 1108 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1110 Service Code: 32 bits 1111 Describes the application-level service to which the client 1112 application wants to connect. Service Codes are intended to 1113 provide information about which application protocol a 1114 connection intends to use, thus aiding middleboxes and 1115 reducing reliance on globally well-known ports. See Section 1116 8.1.2. 1118 5.3. DCCP-Response Packets 1120 The server responds to valid DCCP-Request packets with DCCP-Response 1121 packets. This is the second phase of the three-way handshake. 1122 DCCP-Response packets MAY contain application data, and MUST use 1123 48-bit sequence numbers (X=1).
1125 0 1 2 3 1126 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1127 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1128 / Generic DCCP Header with X=1 (16 bytes) / 1129 / with Type=1 (DCCP-Response) / 1130 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1131 / Acknowledgement Number Subheader (8 bytes) / 1132 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1133 | Service Code | 1134 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1135 / Options and Padding / 1136 +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ 1137 / Application Data / 1138 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1140 Acknowledgement Number: 48 bits 1141 Contains GSR. Since DCCP-Responses are only sent during 1142 connection initiation, this will always equal the Sequence 1143 Number on a received DCCP-Request. 1145 Service Code: 32 bits 1146 MUST equal the Service Code on the corresponding DCCP-Request. 1148 5.4. DCCP-Data, DCCP-Ack, and DCCP-DataAck Packets 1150 The central data transfer portion of every DCCP connection uses 1151 DCCP-Data, DCCP-Ack, and DCCP-DataAck packets. These packets MAY 1152 use 24-bit sequence numbers, depending on the value of the Allow 1153 Short Sequence Numbers feature (Section 7.6.1). DCCP-Data packets 1154 carry application data without acknowledgements. 1156 0 1 2 3 1157 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1158 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1159 / Generic DCCP Header (16 or 12 bytes) / 1160 / with Type=2 (DCCP-Data) / 1161 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1162 / Options and Padding / 1163 +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ 1164 / Application Data / 1165 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1167 DCCP-Ack packets dispense with the data, but contain an 1168 Acknowledgement Number. 
They are used for pure acknowledgements. 1170 0 1 2 3 1171 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1172 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1173 / Generic DCCP Header (16 or 12 bytes) / 1174 / with Type=3 (DCCP-Ack) / 1175 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1176 / Acknowledgement Number Subheader (8 or 4 bytes) / 1177 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1178 / Options and Padding / 1179 +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ 1180 / Application Data Area (Ignored) / 1181 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1183 DCCP-DataAck packets carry both application data and an 1184 Acknowledgement Number: acknowledgement information is piggybacked 1185 on a data packet. 1187 0 1 2 3 1188 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1189 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1190 / Generic DCCP Header (16 or 12 bytes) / 1191 / with Type=4 (DCCP-DataAck) / 1192 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1193 / Acknowledgement Number Subheader (8 or 4 bytes) / 1194 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1195 / Options and Padding / 1196 +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ 1197 / Application Data / 1198 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1200 A DCCP-Data or DCCP-DataAck packet may have a zero-length 1201 application data area, which indicates that the application sent a 1202 zero-length datagram. This differs from DCCP-Request and DCCP- 1203 Response packets, where an empty application data area indicates the 1204 absence of application data (not the presence of zero-length 1205 application data). The API SHOULD report any received zero-length 1206 datagrams to the receiving application. 
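The distinction between a zero-length datagram and the absence of application data can be sketched as follows. This Python fragment is informative only; the function name datagram_for_application and the use of packet type names as strings are assumptions made for illustration.

```python
from typing import Optional

# Packet types whose data area always represents one datagram,
# even when that datagram is empty.
ALWAYS_A_DATAGRAM = {"DCCP-Data", "DCCP-DataAck"}

# Packet types where an empty data area means "no application data".
DATA_IS_OPTIONAL = {"DCCP-Request", "DCCP-Response"}


def datagram_for_application(packet_type: str,
                             data_area: bytes) -> Optional[bytes]:
    """Return the datagram to report to the application, or None.

    b"" means "the peer sent a zero-length datagram"; None means
    "this packet carries nothing for the application".
    """
    if packet_type in ALWAYS_A_DATAGRAM:
        return data_area
    if packet_type in DATA_IS_OPTIONAL:
        return data_area if data_area else None
    # Other packet types carry no application datagram in this sketch.
    return None
```

Returning None rather than b"" for an empty DCCP-Request keeps the two cases distinguishable at the API, which is what allows received zero-length datagrams to be reported.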
1208 A DCCP-Ack packet MAY have a non-zero-length application data area, 1209 which essentially pads the DCCP-Ack to a desired length. Receivers 1210 MUST ignore the content of the application data area in DCCP-Ack 1211 packets. 1213 DCCP-Ack and DCCP-DataAck packets often include additional 1214 acknowledgement options, such as Ack Vector, as required by the 1215 congestion control mechanism in use. 1217 5.5. DCCP-CloseReq and DCCP-Close Packets 1219 DCCP-CloseReq and DCCP-Close packets begin the handshake that 1220 normally terminates a connection. Either client or server may send 1221 a DCCP-Close packet, which will elicit a DCCP-Reset packet. Only 1222 the server can send a DCCP-CloseReq packet, which indicates that the 1223 server wants to close the connection, but does not want to hold its 1224 TIMEWAIT state. Both packet types MUST use 48-bit sequence numbers 1225 (X=1). 1227 0 1 2 3 1228 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1229 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1230 / Generic DCCP Header with X=1 (16 bytes) / 1231 / with Type=5 (DCCP-CloseReq) or 6 (DCCP-Close) / 1232 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1233 / Acknowledgement Number Subheader (8 bytes) / 1234 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1235 / Options and Padding / 1236 +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ 1237 / Application Data Area (Ignored) / 1238 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1240 As with DCCP-Ack packets, DCCP-CloseReq and DCCP-Close packets MAY 1241 have non-zero-length application data areas, whose contents 1242 receivers MUST ignore. 1244 5.6. DCCP-Reset Packets 1246 DCCP-Reset packets unconditionally shut down a connection. 
1247 Connections normally terminate with a DCCP-Reset, but resets may be 1248 sent for other reasons, including bad port numbers, bad option 1249 behavior, incorrect ECN Nonce Echoes, and so forth. DCCP-Resets 1250 MUST use 48-bit sequence numbers (X=1). 1252 0 1 2 3 1253 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1254 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1255 / Generic DCCP Header with X=1 (16 bytes) / 1256 / with Type=7 (DCCP-Reset) / 1257 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1258 / Acknowledgement Number Subheader (8 bytes) / 1259 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1260 | Reset Code | Data 1 | Data 2 | Data 3 | 1261 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1262 / Options and Padding / 1263 +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ 1264 / Application Data Area (Error Text) / 1265 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1267 Reset Code: 8 bits 1268 Represents the reason that the sender reset the DCCP connection. 1270 Data 1, Data 2, and Data 3: 8 bits each 1271 The Data fields provide additional information about why the 1272 sender reset the DCCP connection. The meanings of these fields 1273 depend on the value of Reset Code. 1275 Application Data Area: Error Text 1276 If present, Error Text is a human-readable text string encoded 1277 in Unicode UTF-8, and preferably in English, that describes the 1278 error in more detail. For example, a DCCP-Reset with Reset Code 1279 11, "Aggression Penalty", might contain Error Text such as 1280 "Aggression Penalty: Received 3 bad ECN Nonce Echoes, assuming 1281 misbehavior". 1283 The following Reset Codes are currently defined. Unless otherwise 1284 specified, the Data 1, 2, and 3 fields MUST be set to 0 by the 1285 sender of the DCCP-Reset and ignored by its receiver. 
Section 1286 references describe concrete situations that will cause each Reset 1287 Code to be generated; they are not meant to be exhaustive. 1289 0, "Unspecified" 1290 Indicates the absence of a meaningful Reset Code. Use of Reset 1291 Code 0 is NOT RECOMMENDED: the sender should choose a Reset Code 1292 that more clearly defines why the connection is being reset. 1294 1, "Closed" 1295 Normal connection close. See Section 8.3. 1297 2, "Aborted" 1298 The sending endpoint gave up on the connection because of lack 1299 of progress. See Sections 8.1.1 and 8.1.5. 1301 3, "No Connection" 1302 No connection exists. See Section 8.3.1. 1304 4, "Packet Error" 1305 A valid packet arrived with unexpected type. For example, a 1306 DCCP-Data packet with valid header checksum and sequence numbers 1307 arrived at a connection in the REQUEST state. See Section 1308 8.3.1. The Data 1 field equals the offending packet type as an 1309 eight-bit number; thus, an offending packet with Type 2 will 1310 result in a Data 1 value of 2. 1312 5, "Option Error" 1313 An option was erroneous, and the error was serious enough to 1314 warrant resetting the connection. See Sections 6.6.7, 6.6.8, 1315 and 11.4. The Data 1 field equals the offending option type; 1316 Data 2 and Data 3 equal the first two bytes of option data (or 1317 zero if the option had less than two bytes of data). 1319 6, "Mandatory Error" 1320 The sending endpoint could not process an option O that was 1321 immediately preceded by Mandatory. The Data fields report the 1322 option type and data of option O, using the format of Reset Code 1323 5, "Option Error". See Section 5.8.2. 1325 7, "Connection Refused" 1326 The Destination Port didn't correspond to a port open for 1327 listening. Sent only in response to DCCP-Requests. See Section 1328 8.1.3. 1330 8, "Bad Service Code" 1331 The Service Code didn't equal the service code attached to the 1332 Destination Port. Sent only in response to DCCP-Requests. 
See 1333 Section 8.1.3. 1335 9, "Too Busy" 1336 The server is too busy to accept new connections. Sent only in 1337 response to DCCP-Requests. See Section 8.1.3. 1339 10, "Bad Init Cookie" 1340 The Init Cookie echoed by the client was incorrect or missing. 1341 See Section 8.1.4. 1343 11, "Aggression Penalty" 1344 This endpoint has detected congestion control-related 1345 misbehavior on the part of the other endpoint. See Section 1346 12.3. 1348 12-127, Reserved 1349 Receivers should treat these codes as they do Reset Code 0, 1350 "Unspecified". 1352 128-255, CCID-specific codes 1353 Semantics depend on the connection's CCIDs. See Section 10.3. 1354 Receivers should treat unknown CCID-specific Reset Codes as they 1355 do Reset Code 0, "Unspecified". 1357 The following table summarizes this information. 1359 Reset 1360 Code Name Data 1 Data 2 & 3 1361 ----- ---- ------ ---------- 1362 0 Unspecified 0 0 1363 1 Closed 0 0 1364 2 Aborted 0 0 1365 3 No Connection 0 0 1366 4 Packet Error pkt type 0 1367 5 Option Error option # option data 1368 6 Mandatory Error option # option data 1369 7 Connection Refused 0 0 1370 8 Bad Service Code 0 0 1371 9 Too Busy 0 0 1372 10 Bad Init Cookie 0 0 1373 11 Aggression Penalty 0 0 1374 12-127 Reserved 1375 128-255 CCID-specific codes 1377 Table 2: DCCP Reset Codes 1379 Options on DCCP-Reset packets are processed before the connection is 1380 shut down. This means that certain combinations of options, 1381 particularly involving Mandatory, may cause an endpoint to respond 1382 to a valid DCCP-Reset with another DCCP-Reset. This cannot lead to 1383 a reset storm; since the first endpoint has already reset the 1384 connection, the second DCCP-Reset will be ignored. 1386 5.7. DCCP-Sync and DCCP-SyncAck Packets 1388 DCCP-Sync packets help DCCP endpoints recover synchronization after 1389 bursts of loss, or recover from half-open connections. Each valid 1390 received DCCP-Sync immediately elicits a DCCP-SyncAck. 
Both packet 1391 types MUST use 48-bit sequence numbers (X=1). 1393 0 1 2 3 1394 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1395 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1396 / Generic DCCP Header with X=1 (16 bytes) / 1397 / with Type=8 (DCCP-Sync) or 9 (DCCP-SyncAck) / 1398 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1399 / Acknowledgement Number Subheader (8 bytes) / 1400 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1401 / Options and Padding / 1402 +=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+ 1403 / Application Data Area (Ignored) / 1404 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1406 The Acknowledgement Number field has special semantics for DCCP-Sync 1407 and DCCP-SyncAck packets. First, the packet corresponding to a 1408 DCCP-Sync's Acknowledgement Number need not have been 1409 acknowledgeable. Thus, receivers MUST NOT assume that a packet was 1410 processed simply because it appears in the Acknowledgement Number 1411 field of a DCCP-Sync packet. This differs from all other packet 1412 types, where the Acknowledgement Number by definition corresponds to 1413 an acknowledgeable packet. Second, the Acknowledgement Number on 1414 any DCCP-SyncAck packet MUST correspond to the Sequence Number on an 1415 acknowledgeable DCCP-Sync packet. In the presence of reordering, 1416 this might not equal GSR. 1418 As with DCCP-Ack packets, DCCP-Sync and DCCP-SyncAck packets MAY 1419 have non-zero-length application data areas, whose contents 1420 receivers MUST ignore. Padded DCCP-Sync packets may be useful when 1421 performing Path MTU discovery; see Section 14. 1423 5.8. Options 1425 Any DCCP packet may contain options, which occupy space at the end 1426 of the DCCP header. Each option is a multiple of 8 bits in length. 1427 Individual options are not padded to multiples of 32 bits, and any 1428 option may begin on any byte boundary. 
However, the combination of 1429 all options MUST add up to a multiple of 32 bits; Padding options 1430 MUST be added as necessary to fill out option space to a word 1431 boundary. Any options present are included in the header checksum. 1433 The first byte of an option is the option type. Options with types 1434 0 through 31 are single-byte options. Other options are followed by 1435 a byte indicating the option's length. This length value includes 1436 the two bytes of option-type and option-length as well as any 1437 option-data bytes, and must therefore be greater than or equal to 1438 two. 1440 Options are processed sequentially, starting at the first option in 1441 the packet header. Options with unknown types MUST be ignored. 1442 Also, options with nonsensical lengths (length byte less than two or 1443 more than the remaining space in the options portion of the header) 1444 MUST be ignored, and any option space following an option with 1445 nonsensical length MUST likewise be ignored. 1447 The following options are currently defined: 1449 Option DCCP- Section 1450 Type Length Meaning Data? Reference 1451 ---- ------ ------- ----- --------- 1452 0 1 Padding Y 5.8.1 1453 1 1 Mandatory N 5.8.2 1454 2 1 Slow Receiver Y 11.6 1455 3-31 1 Reserved 1456 32 variable Change L N 6.1 1457 33 variable Confirm L N 6.2 1458 34 variable Change R N 6.1 1459 35 variable Confirm R N 6.2 1460 36 variable Init Cookie N 8.1.4 1461 37 3-5 NDP Count Y 7.7 1462 38 variable Ack Vector [Nonce 0] N 11.4 1463 39 variable Ack Vector [Nonce 1] N 11.4 1464 40 variable Data Dropped N 11.7 1465 41 6 Timestamp Y 13.1 1466 42 6/8/10 Timestamp Echo Y 13.3 1467 43 4/6 Elapsed Time N 13.2 1468 44 6 Data Checksum Y 9.3 1469 45-127 variable Reserved 1470 128-255 variable CCID-specific options - 10.3 1472 Table 3: DCCP Options 1474 Not all options are suitable for all packet types. 
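The option encoding rules of Section 5.8 (single-byte options for types 0 through 31, a length byte otherwise, sequential processing, and abandonment of the option space after a nonsensical length) can be sketched in Python as follows. The function name is hypothetical. Treating a multi-byte option type that appears in the last byte of the option space as a nonsensical length is an assumption of this sketch, not a rule stated in this document. Filtering of unknown option types is left to the caller.

```python
def parse_options(buf):
    """Walk a DCCP options area; return a list of (type, data) pairs.

    Hypothetical helper covering only the generic rules of Section 5.8.
    """
    options = []
    i = 0
    while i < len(buf):
        opt_type = buf[i]
        if opt_type <= 31:
            # Types 0 through 31 are single-byte options with no data.
            options.append((opt_type, b""))
            i += 1
            continue
        if i + 1 >= len(buf):
            # Assumption: no room for a length byte, so ignore the rest.
            break
        length = buf[i + 1]
        # The length counts the type and length bytes themselves, so it
        # must be at least 2 and must fit in the remaining option space.
        if length < 2 or length > len(buf) - i:
            break  # ignore this option and all option space after it
        options.append((opt_type, bytes(buf[i + 2:i + length])))
        i += length
    return options
```

A caller would then skip any returned option whose type it does not recognize.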
For example, 1475 since the Ack Vector option is interpreted relative to the 1476 Acknowledgement Number, it isn't suitable on DCCP-Request and DCCP- 1477 Data packets, which have no Acknowledgement Number. If an option 1478 occurs on an unexpected packet type, it MUST generally be ignored; 1479 any such restrictions are mentioned in each option's description. 1480 The table summarizes the most common restriction: when the DCCP- 1481 Data? column value is N, the corresponding option MUST be ignored 1482 when received on a DCCP-Data packet. (Section 7.5.5 describes why 1483 such options are ignored as opposed to, say, causing a reset.) 1485 Options with invalid values MUST be ignored unless otherwise 1486 specified. For example, any Data Checksum option with option length 1487 4 MUST be ignored, since all valid Data Checksum options have option 1488 length 6. 1490 This section describes two generic options, Padding and Mandatory. 1491 Other options are described later. 1493 5.8.1. Padding Option 1494 +--------+ 1495 |00000000| 1496 +--------+ 1497 Type=0 1499 Padding is a single-byte "no-operation" option used to pad between 1500 or after options. If the length of a packet's other options is not 1501 a multiple of 32 bits, then Padding options are REQUIRED to pad out 1502 the options area to the length implied by Data Offset. Padding may 1503 also be used between options -- for example, to align the beginning 1504 of a subsequent option on a 32-bit boundary. There is no guarantee 1505 that senders will use this option, so receivers must be prepared to 1506 process options even if they do not begin on a word boundary. 1508 5.8.2. Mandatory Option 1510 +--------+ 1511 |00000001| 1512 +--------+ 1513 Type=1 1515 Mandatory is a single-byte option that marks the immediately 1516 following option as mandatory. Say that the immediately following 1517 option is O. Then the Mandatory option has no effect if the 1518 receiving DCCP endpoint understands and processes O. 
If the 1519 endpoint does not understand or process O, however, then it MUST 1520 reset the connection using Reset Code 6, "Mandatory Error". For 1521 instance, the endpoint would reset the connection if it did not 1522 understand O's type; if it understood O's type, but not O's data; if 1523 O's data was invalid for O's type; if O was a feature negotiation 1524 option, and the endpoint did not understand the enclosed feature 1525 number; if the endpoint understood O, but chose not to perform the 1526 action O implies; and so forth. 1528 Mandatory options MUST NOT be sent on DCCP-Data packets, and any 1529 Mandatory options received on DCCP-Data packets MUST be ignored. 1531 The connection is in error and should be reset with Reset Code 5, 1532 "Option Error" if option O is absent (Mandatory was the last byte of 1533 the option list), or if option O equals Mandatory. However, the 1534 combination "Mandatory Padding" is valid, and MUST behave like two 1535 bytes of Padding. 1537 Section 6.6.9 describes the behavior of Mandatory feature 1538 negotiation options in more detail. 1540 6. Feature Negotiation 1542 Four DCCP options, Change L, Confirm L, Change R, and Confirm R, are 1543 used to negotiate feature values. Change options initiate a 1544 negotiation; Confirm options complete that negotiation. The "L" 1545 options are sent by the feature location, and the "R" options are 1546 sent by the feature remote. Change options are retransmitted to 1547 ensure reliability. 1549 All these options have the same format. The first byte of option 1550 data is the feature number, and the second and subsequent data bytes 1551 hold one or more feature values. The exact format of the feature 1552 value area depends on the feature type; see Section 6.3. 1554 +--------+--------+--------+--------+-------- 1555 | Type | Length |Feature#| Value(s) ...
1556 +--------+--------+--------+--------+-------- 1558 Together, the feature number and the option type ("L" or "R") 1559 uniquely identify the feature to which an option applies. The exact 1560 format of the Value(s) area depends on the feature number. 1562 Feature negotiation options MUST NOT be sent on DCCP-Data packets, 1563 and any feature negotiation options received on DCCP-Data packets 1564 MUST be ignored. 1566 6.1. Change Options 1568 Change L and Change R options initiate feature negotiation. The 1569 option to use depends on the relevant feature's location: To start a 1570 negotiation for feature F/A, DCCP A will send a Change L option; to 1571 start a negotiation for F/B, it will send a Change R option. Change 1572 options are retransmitted until some response is received. They 1573 contain at least one Value, and thus have length at least 4. 1575 +--------+--------+--------+--------+-------- 1576 Change L: |00100000| Length |Feature#| Value(s) ... 1577 +--------+--------+--------+--------+-------- 1578 Type=32 1580 +--------+--------+--------+--------+-------- 1581 Change R: |00100010| Length |Feature#| Value(s) ... 1582 +--------+--------+--------+--------+-------- 1583 Type=34 1585 6.2. Confirm Options 1587 Confirm L and Confirm R options complete feature negotiation, and 1588 are sent in response to Change R and Change L options, respectively. 1589 Confirm options MUST NOT be generated except in response to Change 1590 options. Confirm options need not be retransmitted, since Change 1591 options are retransmitted as necessary. The first byte of the 1592 Confirm option contains the feature number from the corresponding 1593 Change. Following this is the selected Value, and then possibly the 1594 sender's preference list. 1596 +--------+--------+--------+--------+-------- 1597 Confirm L: |00100001| Length |Feature#| Value(s) ... 
1598 +--------+--------+--------+--------+-------- 1599 Type=33 1601 +--------+--------+--------+--------+-------- 1602 Confirm R: |00100011| Length |Feature#| Value(s) ... 1603 +--------+--------+--------+--------+-------- 1604 Type=35 1606 If an endpoint receives an invalid Change option -- with an unknown 1607 feature number, or an invalid value -- it will respond with an empty 1608 Confirm option containing the problematic feature number, but no 1609 value. Such options have length 3. 1611 6.3. Reconciliation Rules 1613 Reconciliation rules determine how the two sets of preferences for a 1614 given feature are resolved into a unique result. The reconciliation 1615 rule depends only on the feature number. Each reconciliation rule 1616 must have the property that the result is uniquely determined given 1617 the contents of Change options sent by the two endpoints. 1619 All current DCCP features use one of two reconciliation rules, 1620 server-priority ("SP") and non-negotiable ("NN"). 1622 6.3.1. Server-Priority 1624 The feature value is a fixed-length byte string (length determined 1625 by the feature number). Each Change option contains a list of 1626 values ordered by preference, with the most preferred value coming 1627 first. Each Confirm option contains the confirmed value, followed 1628 by the confirmer's preference list. Thus, the feature's current 1629 value will generally appear twice in Confirm options' data, once as 1630 the current value and once in the confirmer's preference list. 1632 To reconcile the preference lists, select the first entry in the 1633 server's list that also occurs in the client's list. If there is no 1634 shared entry, the feature's value MUST NOT change, and the Confirm 1635 option will confirm the feature's previous value (unless the Change 1636 option was Mandatory; see Section 6.6.9). 1638 6.3.2. Non-Negotiable 1640 The feature value is a byte string. Each option contains exactly 1641 one feature value. 
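The server-priority rule of Section 6.3.1 can be sketched in Python as follows. The function name and argument names are hypothetical. The Mandatory case, in which a failed reconciliation resets the connection instead, is not modeled.

```python
def reconcile_server_priority(server_prefs, client_prefs, current_value):
    """Return the reconciled value for a server-priority feature.

    Picks the first entry in the server's list that also occurs in the
    client's list; with no shared entry the value does not change.
    """
    for value in server_prefs:
        if value in client_prefs:
            return value
    return current_value
```

With a server list of "3 2 1" and a client list of "2 3 1", the result is 3, because 3 is the server's most preferred value that the client also lists.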
The feature location signals a new value by 1642 sending a Change L option. The feature remote MUST accept any valid 1643 value, responding with a Confirm R option containing the new value, 1644 and it MUST send empty Confirm R options in response to invalid 1645 values (unless the Change L option was Mandatory; see Section 1646 6.6.9). Change R and Confirm L options MUST NOT be sent for non- 1647 negotiable features; see Section 6.6.8. Non-negotiable features use 1648 the feature negotiation mechanism to achieve reliability. 1650 6.4. Feature Numbers 1652 This document defines the following feature numbers. 1654 Rec'n Initial Section 1655 Number Meaning Rule Value Req'd Reference 1656 ------ ------- ----- ----- ----- --------- 1657 0 Reserved 1658 1 Congestion Control ID (CCID) SP 2 Y 10 1659 2 Allow Short Seqnos SP 1 Y 7.6.1 1660 3 Sequence Window NN 100 Y 7.5.2 1661 4 ECN Incapable SP 0 N 12.1 1662 5 Ack Ratio NN 2 N 11.3 1663 6 Send Ack Vector SP 0 N 11.5 1664 7 Send NDP Count SP 0 N 7.7.2 1665 8 Minimum Checksum Coverage SP 0 N 9.2.1 1666 9 Check Data Checksum SP 0 N 9.3.1 1667 10-127 Reserved 1668 128-255 CCID-specific features 10.3 1670 Table 4: DCCP Feature Numbers 1672 Rec'n Rule The reconciliation rule used for the feature. SP is 1673 server-priority and NN is non-negotiable. 1675 Initial Value The initial value for the feature. Every feature has 1676 a known initial value. 1678 Req'd This column is "Y" if and only if every DCCP 1679 implementation MUST understand the feature. If it is 1680 "N", then the feature behaves like an extension (see 1681 Section 15), and it is safe to respond to Change 1682 options for the feature with empty Confirm options. 1683 Of course, a CCID might require the feature; a DCCP 1684 that implements CCID 2 MUST support Ack Ratio and 1685 Send Ack Vector, for example. 1687 6.5. 
Feature Negotiation Examples 1688 Here are three example feature negotiations for features located at 1689 the server, the first two for the Congestion Control ID feature, the 1690 last for the Ack Ratio. 1692 Client Server 1693 ------ ------ 1694 1. Change R(CCID, 2 3 1) --> 1695 ("2 3 1" is client's preference list) 1696 2. <-- Confirm L(CCID, 3, 3 2 1) 1697 (3 is the negotiated value; 1698 "3 2 1" is server's pref list) 1699 * agreement that CCID/Server = 3 * 1701 1. XXX <-- Change L(CCID, 3 2 1) 1702 2. Retransmission: 1703 <-- Change L(CCID, 3 2 1) 1704 3. Confirm R(CCID, 3, 2 3 1) --> 1705 * agreement that CCID/Server = 3 * 1707 1. <-- Change L(Ack Ratio, 3) 1708 2. Confirm R(Ack Ratio, 3) --> 1709 * agreement that Ack Ratio/Server = 3 * 1711 This example shows a simultaneous negotiation. 1713 Client Server 1714 ------ ------ 1715 1a. Change R(CCID, 2 3 1) --> 1716 b. <-- Change L(CCID, 3 2 1) 1717 2a. <-- Confirm L(CCID, 3, 3 2 1) 1718 b. Confirm R(CCID, 3, 2 3 1) --> 1719 * agreement that CCID/Server = 3 * 1721 Here are the byte encodings of several Change and Confirm options. 1722 Each option is sent by DCCP A. 1724 Change L(CCID, 2 3) = 32,5,1,2,3 1725 DCCP B should change CCID/A's value (feature number 1, a server- 1726 priority feature); DCCP A's preferred values are 2 and 3, in 1727 that preference order. 1729 Change L(Sequence Window, 1024) = 32,9,3,0,0,0,0,4,0 1730 DCCP B should change Sequence Window/A's value (feature number 1731 3, a non-negotiable feature) to the 6-byte string 0,0,0,0,4,0 1732 (the value 1024). 1734 Confirm L(CCID, 2, 2 3) = 33,6,1,2,2,3 1735 DCCP A has changed CCID/A's value to 2; its preferred values are 1736 2 and 3, in that preference order. 1738 Empty Confirm L(126) = 33,3,126 1739 DCCP A doesn't implement feature number 126, or DCCP B's 1740 proposed value for feature 126/A was invalid. 
1742 Change R(CCID, 3 2) = 34,5,1,3,2 1743 DCCP B should change CCID/B's value; DCCP A's preferred values 1744 are 3 and 2, in that preference order. 1746 Confirm R(CCID, 2, 3 2) = 35,6,1,2,3,2 1747 DCCP A has changed CCID/B's value to 2; its preferred values 1748 were 3 and 2, in that preference order. 1750 Confirm R(Sequence Window, 1024) = 35,9,3,0,0,0,0,4,0 1751 DCCP A has changed Sequence Window/B's value to the 6-byte 1752 string 0,0,0,0,4,0 (the value 1024). 1754 Empty Confirm R(126) = 35,3,126 1755 DCCP A doesn't implement feature number 126, or DCCP B's 1756 proposed value for feature 126/B was invalid. 1758 6.6. Option Exchange 1760 A few basic rules govern feature negotiation option exchange. 1762 1. Every non-reordered Change option gets a Confirm option in 1763 response. 1765 2. Change options are retransmitted until a response for the latest 1766 Change is received. 1768 3. Feature negotiation options are processed in strictly increasing 1769 order by Sequence Number. 1771 The rest of this section describes the consequences of these rules 1772 in more detail. 1774 6.6.1. Normal Exchange 1776 Change options are generated when a DCCP endpoint wants to change 1777 the value of some feature. Generally, this will happen at the 1778 beginning of a connection, although it may happen at any time. We 1779 say the endpoint "generates" or "sends" a Change L or Change R 1780 option, but of course the option must be attached to a packet. The 1781 endpoint may attach the option to a packet it would have generated 1782 anyway (such as a DCCP-Request), or it may create a "feature 1783 negotiation packet", often a DCCP-Ack or DCCP-Sync, just to carry 1784 the option. Feature negotiation packets are controlled by the 1785 relevant congestion control mechanism. For example, DCCP A may send 1786 a DCCP-Ack or DCCP-Sync for feature negotiation only if the B-to-A 1787 CCID would allow sending a DCCP-Ack. 
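The byte encodings listed in Section 6.5 follow directly from the common option format (type, length, feature number, values). A minimal Python sketch, using hypothetical helper names, is:

```python
# Option types from Table 3.
CHANGE_L, CONFIRM_L, CHANGE_R, CONFIRM_R = 32, 33, 34, 35


def encode_feature_option(opt_type, feature, values=b""):
    """Encode one feature negotiation option as bytes.

    The length byte covers the type, length, and feature number bytes
    plus every value byte, so an empty Confirm has length 3.
    """
    length = 3 + len(values)
    if length > 255:
        raise ValueError("option too long for a one-byte length")
    return bytes([opt_type, length, feature]) + bytes(values)


def sequence_window_value(n):
    """Encode a Sequence Window value as the 6-byte string used above."""
    return n.to_bytes(6, "big")
```

For example, encode_feature_option(CHANGE_L, 1, [2, 3]) yields the bytes 32,5,1,2,3 given above for Change L(CCID, 2 3).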
In addition, an endpoint 1788 SHOULD generate at most one feature negotiation packet per round- 1789 trip time. 1791 On receiving a Change L or Change R option, a DCCP endpoint examines 1792 the included preference list, reconciles that with its own 1793 preference list, calculates the new value, and sends back a 1794 Confirm R or Confirm L option, respectively, informing its peer of 1795 the new value or that the feature was not understood. Every non- 1796 reordered Change option MUST result in a corresponding Confirm 1797 option, and any packet including a Confirm option MUST carry an 1798 Acknowledgement Number. (Section 6.6.4 describes how Change 1799 reordering is detected and handled.) Generated Confirm options may 1800 be attached to packets that would have been sent anyway (such as 1801 DCCP-Response or DCCP-SyncAck), or to new feature negotiation 1802 packets, as described above. 1804 The Change-sending endpoint MUST wait to receive a corresponding 1805 Confirm option before changing its stored feature value. The 1806 Confirm-sending endpoint changes its stored feature value as soon as 1807 it sends the Confirm. 1809 A packet MAY contain more than one feature negotiation option, as 1810 long as no two options refer to the same feature. Note, however, 1811 that a packet is allowed to contain one L option and one R option 1812 with the same feature number, since the two options actually refer 1813 to different features (F/A and F/B). 1815 6.6.2. Processing Received Options 1817 DCCP endpoints exist in one of three states relative to each 1818 feature. STABLE is the normal state, where the endpoint knows the 1819 feature's value and thinks the other endpoint agrees. An endpoint 1820 enters the CHANGING state when it first sends a Change for the 1821 feature, and returns to STABLE once it receives a corresponding 1822 Confirm. 
The final state, UNSTABLE, indicates that an endpoint in 1823 CHANGING state changed its preference list, but has not yet 1824 transmitted a Change option with the new preference list. 1826 Feature state transitions at a feature location are implemented 1827 according to this diagram. The diagram ignores sequence number and 1828 option validity issues; these are handled explicitly in the 1829 pseudocode that follows.

1831                           timeout/
1832    rcv Confirm R      app/protocol evt : snd Change L      rcv non-ack
1833    : ignore      +---------------------------------------+ : snd Change L
1834         +----+   |                                       |    +----+
1835         |    v   |                   rcv Change R        v    |    v
1836      +------------+  rcv Confirm R   : calc new value, +------------+
1837      |            |  : accept value  snd Confirm L     |            |
1838      |   STABLE   |<-----------------------------------|  CHANGING  |
1839      |            |        rcv empty Confirm R         |            |
1840      +------------+        : revert to old value       +------------+
1841          |    ^                                            |    ^
1842          +----+                                  pref list |    | snd
1843       rcv Change R                                 changes |    | Change L
1844       : calc new value, snd Confirm L                      v    |
1845                                                        +------------+
1846                                                    +---|            |
1847                               rcv Confirm/Change R |   |  UNSTABLE  |
1848                               : ignore             +-->|            |
1849                                                        +------------+

1851 Feature locations SHOULD use the following pseudocode, which 1852 corresponds to the state diagram, to react to each feature 1853 negotiation option on each valid packet received. The pseudocode 1854 refers to "P.seqno" and "P.ackno", which are properties of the 1855 packet; "O.type", and "O.len", which are properties of the option; 1856 "FGSR" and "FGSS", which are properties of the connection, and 1857 handle reordering as described in Section 6.6.4; "F.state", which is 1858 the feature's state (STABLE, CHANGING, or UNSTABLE); and "F.value", 1859 which is the feature's value.
1861    First, check for unknown features (Section 6.6.7);
1862    If F is unknown,
1863       If the option was Mandatory,   /* Section 6.6.9 */
1864          Reset connection and return
1865       Otherwise, if O.type == Change R,
1866          Send Empty Confirm L on a future packet

1868       Return

1870    Second, check for reordering (Section 6.6.4);
1871    If F.state == UNSTABLE or P.seqno <= FGSR
1872          or (O.type == Confirm R and P.ackno < FGSS),
1873       Ignore option and return

1875    Third, process Change R options;
1876    If O.type == Change R,
1877       If the option's value is valid,   /* Section 6.6.8 */
1878          Calculate new value
1879          Send Confirm L on a future packet
1880          Set F.state := STABLE
1881       Otherwise, if the option was Mandatory,
1882          Reset connection and return
1883       Otherwise,
1884          Send Empty Confirm L on a future packet
1885          /* Remain in existing state. If that's CHANGING, this
1886             endpoint will retransmit its Change L option later. */

1888    Fourth, process Confirm R options (but only in CHANGING state).
1889    If F.state == CHANGING and O.type == Confirm R,
1890       If O.len > 3,   /* nonempty */
1891          If the option's value is valid,
1892             Set F.value := new value
1893          Otherwise,
1894             Reset connection and return
1895       Set F.state := STABLE

1897 Versions of this diagram and pseudocode are also used by feature 1898 remotes; simply switch the "L"s and "R"s, so that the relevant 1899 options are Change R and Confirm L. 1901 6.6.3. Loss and Retransmission 1903 Packets containing Change and Confirm options might be lost or 1904 delayed by the network. Therefore, Change options are repeatedly 1905 transmitted to achieve reliability. We refer to this as 1906 "retransmission", although of course there are no packet-level 1907 retransmissions in DCCP: a Change option that is sent again will be 1908 sent on a new packet with a new sequence number. 1910 A CHANGING endpoint transmits another Change option once it realizes 1911 that it has not heard back from the other endpoint.
The new Change 1912 option need not contain the same payload as the original; reordering 1913 protection will ensure that agreement is reached based on the most 1914 recently transmitted option. 1916 A CHANGING endpoint MUST continue retransmitting Change options 1917 until it gets some response or the connection terminates. 1919 Endpoints SHOULD use an exponential-backoff timer to decide when to 1920 retransmit Change options. (Endpoints that generate packets 1921 specifically for feature negotiation MUST use such a timer.) The 1922 timer interval is initially set to not less than one round-trip 1923 time, and should back off to not less than 64 seconds. The backoff 1924 protects against delayed agreement due to the reordering protection 1925 algorithms described in the next section. Again, endpoints may 1926 piggyback Change options on packets they would have sent anyway, or 1927 create new packets to carry the options; any such new packets are 1928 controlled by the relevant congestion-control mechanism. 1930 Confirm options are never retransmitted, but the Confirm-sending 1931 endpoint MUST generate a Confirm option after every non-reordered 1932 Change. 1934 6.6.4. Reordering 1936 Reordering might cause packets containing Change and Confirm options 1937 to arrive in an unexpected order. Endpoints MUST ignore feature 1938 negotiation options that do not arrive in strictly-increasing order 1939 by Sequence Number. The rest of this section presents two 1940 algorithms that fulfill this requirement. 1942 The first algorithm introduces two sequence number variables that 1943 each endpoint maintains for the connection. 1945 FGSR Feature Greatest Sequence Number Received: The greatest 1946 sequence number received, considering only valid packets 1947 that contained one or more feature negotiation options 1948 (Change and/or Confirm). This value is initialized to 1949 ISR - 1. 
1951 FGSS Feature Greatest Sequence Number Sent: The greatest 1952 sequence number sent, considering only packets that 1953 contained one or more non-retransmitted Change options. 1954 (Retransmitted Change options MUST have exactly the same 1955 contents as previously transmitted options, so limited 1956 reordering can safely be tolerated.) This value is 1957 initialized to ISS. 1959 Each endpoint checks two conditions on sequence numbers to decide 1960 whether to process received feature negotiation options. 1962 1. If a packet's Sequence Number is less than or equal to FGSR, 1963 then its Change options MUST be ignored. 1965 2. If a packet's Sequence Number is less than or equal to FGSR, OR 1966 it has no Acknowledgement Number, OR its Acknowledgement Number 1967 is less than FGSS, then its Confirm options MUST be ignored. 1969 Alternatively, an endpoint MAY maintain separate FGSR and FGSS 1970 values for every feature. FGSR(F/X) would equal the greatest 1971 sequence number received, considering only packets that contained 1972 Change or Confirm options applying to feature F/X; FGSS(F/X) would 1973 be defined similarly. This algorithm requires more state, but is 1974 slightly more forgiving to multiple overlapped feature negotiations. 1975 Either algorithm MAY be used; the first algorithm, with connection- 1976 wide FGSR and FGSS variables, is RECOMMENDED. 1978 One consequence of these rules is that a CHANGING endpoint will 1979 ignore any Confirm option that does not acknowledge the latest 1980 Change option sent. This ensures that agreement, once achieved, 1981 used the most recent available information about the endpoints' 1982 preferences. 1984 6.6.5. Preference Changes 1986 Endpoints are allowed to change their preference lists at any time. 1987 However, an endpoint that changes its preference list while in the 1988 CHANGING state MUST transition to the UNSTABLE state. 
It will 1989 transition back to CHANGING once it has transmitted a Change option 1990 with the new preference list. This ensures that agreement is based 1991 on active preference lists. Without the UNSTABLE state, 1992 simultaneous negotiation -- where the endpoints began independent 1993 negotiations for the same feature at the same time -- might lead to 1994 the negotiation terminating with the endpoints thinking the feature 1995 had different values. 1997 6.6.6. Simultaneous Negotiation 1999 The two endpoints might simultaneously open negotiation for the same 2000 feature, after which an endpoint in the CHANGING state will receive 2001 a Change option for the same feature. Such received Change options 2002 can act as responses to the original Change options. The CHANGING 2003 endpoint MUST examine the received Change's preference list, 2004 reconcile that with its own preference list (as expressed in its 2005 generated Change options), and generate the corresponding Confirm 2006 option. It can then transition to the STABLE state. 2008 6.6.7. Unknown Features 2010 Endpoints may receive Change options referring to feature numbers 2011 they do not understand -- for instance, when an extended DCCP 2012 converses with a non-extended DCCP. Endpoints MUST respond to 2013 unknown Change options with Empty Confirm options (that is, Confirm 2014 options containing no data), which inform the CHANGING endpoint that 2015 the feature was not understood. However, if the Change option was 2016 Mandatory, the connection MUST be reset; see Section 6.6.9. 2018 On receiving an empty Confirm option for some feature, the CHANGING 2019 endpoint MUST transition back to the STABLE state, leaving the 2020 feature's value unchanged. Section 15 suggests that the default 2021 value for any extension feature should correspond to "extension not 2022 available". 2024 Some features are required to be understood by all DCCPs (see 2025 Section 6.4). 
The CHANGING endpoint SHOULD reset the connection 2026 (with Reset Code 5, "Option Error") if it receives an empty Confirm 2027 option for such a feature. 2029 Since Confirm options are generated only in response to Change 2030 options, an endpoint should never receive a Confirm option referring 2031 to a feature number it does not understand. Nevertheless, endpoints 2032 MUST ignore any such options they receive. 2034 6.6.8. Invalid Options 2036 A DCCP endpoint might receive a Change or Confirm option that lists 2037 one or more values that it does not understand. Some, but not all, 2038 such options are invalid, depending on the relevant reconciliation 2039 rule (Section 6.3). For instance: 2041 o All features have length limitations, and options with invalid 2042 lengths are invalid. For example, the Ack Ratio feature takes 2043 16-bit values, so valid "Confirm R(Ack Ratio)" options have 2044 option length 5. 2046 o Some non-negotiable features have value limitations. The Ack 2047 Ratio feature takes two-byte, non-zero integer values, so a 2048 "Change L(Ack Ratio, 0)" option is never valid. Note that 2049 server-priority features do not have value limitations, since 2050 unknown values are handled as a matter of course. 2052 o Any Confirm option that selects the wrong value, based on the two 2053 preference lists and the relevant reconciliation rule, is 2054 invalid. 2056 o However, unexpected Confirm options -- that refer to unknown 2057 feature numbers, or that don't appear to be part of a current 2058 negotiation -- are considered valid, although they are ignored by 2059 the receiver. 2061 An endpoint receiving an invalid Change option MUST respond with the 2062 corresponding empty Confirm option. An endpoint receiving an 2063 invalid Confirm option MUST reset the connection, with Reset Code 5, 2064 "Option Error". 2066 6.6.9. Mandatory Feature Negotiation 2068 Change options may be preceded by Mandatory options (Section 5.8.2). 
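As an illustration only, the feature-location pseudocode of Section 6.6.2, together with the reordering test of Section 6.6.4, might be rendered in Python as follows. The Feature class, the process_option function, its boolean inputs (mandatory, value_valid), and the returned action strings are hypothetical names chosen for this sketch. Sequence numbers are compared as plain integers, which ignores the circular arithmetic a real implementation needs. Updating the stored value at the moment a Confirm L is queued is also a simplifying assumption.

```python
from dataclasses import dataclass

STABLE, CHANGING, UNSTABLE = "STABLE", "CHANGING", "UNSTABLE"
CHANGE_R, CONFIRM_R = 34, 35  # option types from Table 3


@dataclass
class Feature:
    known: bool = True   # False if this endpoint does not implement F
    state: str = STABLE
    value: int = 0


def process_option(f, o_type, o_len, seqno, ackno, fgsr, fgss,
                   mandatory=False, value_valid=True, new_value=None):
    """Return the action a feature location takes for one received option."""
    # First, check for unknown features (Section 6.6.7).
    if not f.known:
        if mandatory:
            return "reset"
        if o_type == CHANGE_R:
            return "send empty Confirm L"
        return "ignore"

    # Second, check for reordering (Section 6.6.4). A Confirm on a packet
    # with no Acknowledgement Number (ackno is None) is also ignored.
    if (f.state == UNSTABLE or seqno <= fgsr
            or (o_type == CONFIRM_R and (ackno is None or ackno < fgss))):
        return "ignore"

    # Third, process Change R options.
    if o_type == CHANGE_R:
        if value_valid:
            f.value = new_value      # assumption: applied immediately
            f.state = STABLE
            return "send Confirm L"
        if mandatory:
            return "reset"
        return "send empty Confirm L"  # remain in existing state

    # Fourth, process Confirm R options, but only in CHANGING state.
    if f.state == CHANGING and o_type == CONFIRM_R:
        result = "keep old value"    # empty Confirm R
        if o_len > 3:                # nonempty
            if not value_valid:
                return "reset"
            f.value = new_value
            result = "accept value"
        f.state = STABLE
        return result
    return "ignore"
```

A feature remote would use the same logic with the "L" and "R" option types exchanged.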
   Mandatory Change options are processed like normal Change options,
   except that the following failure cases will cause the receiver to
   reset the connection with Reset Code 6, "Mandatory Failure", rather
   than send a Confirm option. The connection MUST be reset if:

   o The Change option's feature number was not understood;

   o The Change option's value was invalid, and the receiver would
     normally have sent an empty Confirm option in response; or

   o For server-priority features, there was no shared entry in the
     two endpoints' preference lists.

   There's no reason to mark Confirm options as Mandatory in this
   version of DCCP, since Confirm options are sent only in response to
   Change options and therefore can't mention potentially-invalid
   values or unexpected feature numbers.

7. Sequence Numbers

   DCCP uses sequence numbers to arrange packets into sequence, detect
   losses and network duplicates, and protect against attackers,
   half-open connections, and the delivery of very old packets. Every
   packet carries a Sequence Number; most packet types carry an
   Acknowledgement Number as well.

   DCCP sequence numbers are packet-based. That is, the packets
   generated by each endpoint have Sequence Numbers that increase by
   one, modulo 2^48, for every packet. Even DCCP-Ack and DCCP-Sync
   packets, and other packets that don't carry user data, increment the
   Sequence Number. Since DCCP is an unreliable protocol, there are no
   true retransmissions; but effective retransmissions, such as
   retransmissions of DCCP-Request packets, also increment the Sequence
   Number. This lets DCCP implementations detect network duplication,
   retransmissions, and acknowledgement loss, and is a significant
   departure from TCP practice.

7.1. Variables

   DCCP endpoints maintain a set of sequence number variables for each
   connection.

   ISS     The Initial Sequence Number Sent by this endpoint. This
           equals the Sequence Number of the first DCCP-Request or
           DCCP-Response sent.

   ISR     The Initial Sequence Number Received from the other
           endpoint. This equals the Sequence Number of the first
           DCCP-Request or DCCP-Response received.

   GSS     The Greatest Sequence Number Sent by this endpoint. Here,
           and elsewhere, "greatest" is measured in circular sequence
           space.

   GSR     The Greatest Sequence Number Received from the other
           endpoint on an acknowledgeable packet. (Section 7.4 defines
           this term.)

   GAR     The Greatest Acknowledgement Number Received from the other
           endpoint on an acknowledgeable packet that was not a
           DCCP-Sync.

   Some other variables are derived from these primitives.

   SWL and SWH
           (Sequence Number Window Low and High) The extremes of the
           validity window for received packets' Sequence Numbers.

   AWL and AWH
           (Acknowledgement Number Window Low and High) The extremes
           of the validity window for received packets' Acknowledgement
           Numbers.

7.2. Initial Sequence Numbers

   The endpoints' initial sequence numbers are set by the first
   DCCP-Request and DCCP-Response packets sent. Initial sequence
   numbers MUST be chosen to avoid two problems:

   o Delivery of old packets, where packets lingering in the network
     from an old connection are delivered to a new connection with the
     same addresses and port numbers.

   o Sequence number attacks, where an attacker can guess the sequence
     numbers that a future connection would use [M85].

   These problems are the same as problems faced by TCP, and DCCP
   implementations SHOULD use TCP's strategies to avoid them [RFC 793,
   RFC 1948]. The rest of this section explains these strategies in
   more detail.

   To address the first problem, an implementation MUST ensure that the
   initial sequence number for a given 4-tuple doesn't overlap with
   recent sequence numbers on previous connections with the same
   4-tuple. ("Recent" means sent within 2 maximum segment lifetimes,
   or 4 minutes.) The implementation MUST additionally ensure that the
   lower 24 bits of the initial sequence number don't overlap with the
   lower 24 bits of recent sequence numbers (unless the implementation
   plans to avoid short sequence numbers; see Section 7.6). An
   implementation that has state for a recent connection with the same
   4-tuple can pick a good initial sequence number explicitly.
   Otherwise, it could tie initial sequence number selection to some
   clock, such as the 4-microsecond clock used by TCP [RFC 793]. Two
   separate clocks may be required, one for the upper 24 bits and one
   for the lower 24 bits.

   To address the second problem, an implementation MUST provide each
   4-tuple with an independent initial sequence number space. Then
   opening a connection doesn't provide any information about initial
   sequence numbers on other connections to the same host. RFC 1948
   achieves this by adding a cryptographic hash of the 4-tuple and a
   secret to each initial sequence number. For the secret, RFC 1948
   recommends a combination of some truly-random data [RFC 1750], an
   administratively-installed passphrase, the endpoint's IP address,
   and the endpoint's boot time, but truly-random data is sufficient.
   Care should be taken when changing the secret; such a change alters
   all initial sequence number spaces, which might make an initial
   sequence number for some 4-tuple equal a recently sent sequence
   number for the same 4-tuple. To avoid this problem, the endpoint
   might remember dead connection state for each 4-tuple or stay quiet
   for 2 maximum segment lifetimes around such a change.
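
   The following is a minimal sketch, not a normative algorithm, of the
   clock-plus-hash approach this section describes. The function name
   and parameters are hypothetical, SHA-256 is used purely for
   illustration, and a single 4-microsecond clock drives the whole
   48-bit value. The sketch does not implement the separate clock for
   the lower 24 bits, nor the handling of secret changes, discussed
   above.

```python
import hashlib
import time

SEQ_BITS = 48
SEQ_MOD = 1 << SEQ_BITS  # DCCP sequence numbers are taken modulo 2^48


def initial_sequence_number(src_ip, src_port, dst_ip, dst_port,
                            secret, now_us=None):
    """Hypothetical ISN chooser: clock value plus keyed hash of the 4-tuple.

    secret is a bytes object of truly-random data; now_us is the current
    time in integer microseconds (taken from the system clock if omitted).
    """
    if now_us is None:
        now_us = time.time_ns() // 1000
    # Clock component: one tick every 4 microseconds (assumption: a single
    # clock covers all 48 bits).
    clock = now_us // 4
    # Per-4-tuple offset: hash of the 4-tuple and the secret, so that each
    # 4-tuple gets an independent initial sequence number space.
    material = f"{src_ip}|{src_port}|{dst_ip}|{dst_port}".encode() + secret
    offset = int.from_bytes(hashlib.sha256(material).digest()[:6], "big")
    return (clock + offset) % SEQ_MOD
```

   For a fixed secret, two calls for the same 4-tuple made 4
   microseconds apart return consecutive values, while a different
   4-tuple lands at an unrelated point in sequence space.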

7.3. Quiet Time

   DCCP endpoints, like TCP endpoints, must take care before initiating
   connections when they boot. In particular, they MUST NOT send
   packets whose sequence numbers are close to the sequence numbers of
   packets lingering in the network from before the boot. The simplest
   way to enforce this rule is for DCCP endpoints to avoid sending any
   packets until one maximum segment lifetime (2 minutes) after boot.
   Other enforcement mechanisms include remembering recent sequence
   numbers across boots, and reserving the upper 8 or so bits of
   initial sequence numbers for a persistent counter that decrements by
   two each boot. (The latter mechanism would require disallowing
   packets with short sequence numbers; see Section 7.6.1.)

7.4. Acknowledgement Numbers

   Cumulative acknowledgements are meaningless in an unreliable
   protocol. Therefore, DCCP's Acknowledgement Number field has a
   different meaning than TCP's.

   A received packet is classified as acknowledgeable if and only if
   its header was successfully processed by the receiving DCCP. In
   terms of the pseudocode in Section 8.5, a received packet becomes
   acknowledgeable when the receiving endpoint reaches Step 8. This
   means, for example, that all acknowledgeable packets have valid
   header checksums and sequence numbers. The Acknowledgement Number
   MUST equal GSR, the Greatest Sequence Number Received on an
   acknowledgeable packet, for all packet types except DCCP-Sync and
   DCCP-SyncAck.

   "Acknowledgeable" does not refer to data processing. Even
   acknowledgeable packets may have their application data dropped, due
   to receive buffer overflow or corruption, for instance. Data
   Dropped options report these data losses when necessary, letting
   congestion control mechanisms distinguish between network losses and
   endpoint losses.
   This issue is discussed further in Sections 11.4 and 11.7.

   DCCP-Sync and DCCP-SyncAck packets' Acknowledgement Numbers differ
   as follows: The Acknowledgement Number on a DCCP-Sync packet
   corresponds to a received packet, but not necessarily an
   acknowledgeable packet; in particular, it might correspond to an
   out-of-sync packet whose options were not processed. The
   Acknowledgement Number on a DCCP-SyncAck packet always corresponds
   to an acknowledgeable DCCP-Sync packet; it might be less than GSR in
   the presence of reordering.

7.5. Validity and Synchronization

   Any DCCP endpoint might receive packets that are not actually part
   of the current connection. For instance, the network might deliver
   an old packet, an attacker might attempt to hijack a connection, or
   the other endpoint might crash, causing a half-open connection.

   DCCP, like TCP, uses sequence number checks to detect these cases.
   Packets whose Sequence and/or Acknowledgement Numbers are out of
   range are called sequence-invalid, and are not processed normally.

   Unlike TCP, DCCP requires a synchronization mechanism to recover
   from large bursts of loss. One endpoint might send so many packets
   during a burst of loss that when one of its packets finally got
   through, the other endpoint would label its Sequence Number as
   invalid. A handshake of DCCP-Sync and DCCP-SyncAck packets recovers
   from this case.

7.5.1. Sequence and Acknowledgement Number Windows

   Each DCCP endpoint defines sequence validity windows that are
   subsets of the Sequence and Acknowledgement Number spaces. These
   windows correspond to packets the endpoint expects to receive in the
   next few round-trip times. The Sequence and Acknowledgement Number
   windows always contain GSR and GSS, respectively. The window widths
   are controlled by Sequence Window features for the two
   half-connections.

   The Sequence Number validity window for packets from DCCP B is [SWL,
   SWH]. This window always contains GSR, the Greatest Sequence Number
   Received on a sequence-valid packet from DCCP B. It is W packets
   wide, where W is the value of the Sequence Window/B feature.
   One-fourth of the sequence window, rounded down, is less than or
   equal to GSR, and three-fourths is greater than GSR. (This
   asymmetric placement assumes that bursts of loss are more common in
   the network than significant reordering.)

      invalid  |       valid Sequence Numbers        |  invalid
    <---------*|*===========*=======================*|*--------->
          GSR -|GSR + 1 -  GSR                  GSR +|GSR + 1 +
     floor(W/4)|floor(W/4)                 ceil(3W/4)|ceil(3W/4)
                = SWL                           = SWH

   The Acknowledgement Number validity window for packets from DCCP B
   is [AWL, AWH]. The high end of the window, AWH, equals GSS, the
   Greatest Sequence Number Sent by DCCP A; the window is W' packets
   wide, where W' is the value of the Sequence Window/A feature.

      invalid  |    valid Acknowledgement Numbers    |  invalid
    <---------*|*===================================*|*--------->
       GSS - W'|GSS + 1 - W'                      GSS|GSS + 1
                = AWL                           = AWH

   SWL and AWL are initially adjusted so that they are not less than
   the initial Sequence Numbers received and sent, respectively:
       SWL := max(GSR + 1 - floor(W/4), ISR),
       AWL := max(GSS - W' + 1, ISS).
   These adjustments MUST be applied only at the beginning of the
   connection. (Long-lived connections may wrap sequence numbers so
   that they appear to be less than ISR or ISS; the adjustments MUST
   NOT be applied in that case.)

7.5.2. Sequence Window Feature

   The Sequence Window/A feature determines the width of the Sequence
   Number validity window used by DCCP B, and the width of the
   Acknowledgement Number validity window used by DCCP A.
   DCCP A sends a "Change L(Sequence Window, W)" option to notify
   DCCP B that the Sequence Window/A value is W.

   Sequence Window has feature number 3, and is non-negotiable. It
   takes 48-bit (6-byte) integer values, like DCCP sequence numbers.
   Change and Confirm options for Sequence Window are therefore 9 bytes
   long. New connections start with Sequence Window 100 for both
   endpoints. The minimum valid Sequence Window value is Wmin = 32.
   The maximum valid Sequence Window value is Wmax = 2^46 - 1 =
   70368744177663; circular sequence number comparisons would stop
   working absent this constraint. Change options suggesting Sequence
   Window values out of this range are invalid and MUST be handled
   accordingly.

   A proper Sequence Window/A value must reflect the number of packets
   DCCP A expects to be in flight. Only DCCP A can anticipate this
   number. Values that are too small increase the risk of the
   endpoints getting out of sync after bursts of loss, and values that
   are much too small can prevent productive communication whether or
   not there is loss. On the other hand, too-large values increase the
   risk of connection hijacking; Section 7.5.5 quantifies this risk.
   One good guideline is for each endpoint to set Sequence Window to
   about five times the maximum number of packets it expects to send in
   a round-trip time. Endpoints SHOULD send Change L(Sequence Window)
   options as necessary as the connection progresses. Also, an
   endpoint MUST NOT persistently send more than its Sequence Window
   number of packets per round-trip time; that is, DCCP A MUST NOT
   persistently send more than Sequence Window/A packets per RTT.

7.5.3. Sequence-Validity Rules

   Sequence-validity depends on the received packet's type.
   This table shows the sequence and acknowledgement number checks
   applied to each packet; a packet is sequence-valid if it passes both
   tests, and sequence-invalid if it does not. Many of the checks
   refer to the sequence and acknowledgement number validity windows
   [SWL, SWH] and [AWL, AWH] defined in Section 7.5.1.

                                              Acknowledgement Number
   Packet Type      Sequence Number Check     Check
   -----------      ---------------------     ----------------------
   DCCP-Request     SWL <= seqno <= SWH (*)   N/A
   DCCP-Response    SWL <= seqno <= SWH (*)   AWL <= ackno <= AWH
   DCCP-Data        SWL <= seqno <= SWH       N/A
   DCCP-Ack         SWL <= seqno <= SWH       AWL <= ackno <= AWH
   DCCP-DataAck     SWL <= seqno <= SWH       AWL <= ackno <= AWH
   DCCP-CloseReq    GSR <  seqno <= SWH       GAR <= ackno <= AWH
   DCCP-Close       GSR <  seqno <= SWH       GAR <= ackno <= AWH
   DCCP-Reset       GSR <  seqno <= SWH       GAR <= ackno <= AWH
   DCCP-Sync        SWL <= seqno              AWL <= ackno <= AWH
   DCCP-SyncAck     SWL <= seqno              AWL <= ackno <= AWH

   (*) Check not applied if connection is in LISTEN or REQUEST state.

   In general, packets are sequence-valid if their Sequence and
   Acknowledgement Numbers lie within the corresponding valid windows,
   [SWL, SWH] and [AWL, AWH]. The exceptions to this rule are as
   follows:

   o Since DCCP-CloseReq, DCCP-Close, and DCCP-Reset packets end a
     connection, they cannot have Sequence Numbers less than or equal
     to GSR, or Acknowledgement Numbers less than GAR.

   o DCCP-Sync and DCCP-SyncAck Sequence Numbers are not strongly
     checked. These packet types exist specifically to get the
     endpoints back into sync; checking their Sequence Numbers would
     eliminate their usefulness.
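
   As an illustration only, the window definitions of Section 7.5.1
   and the table above can be sketched as follows. All function and
   parameter names are hypothetical. The one-sided check "SWL <= seqno"
   is read here as "seqno is at most half the sequence space ahead of
   SWL", which is an assumption of this sketch, and the initial
   max(..., ISR) and max(..., ISS) adjustments are omitted.

```python
SEQ_MOD = 1 << 48   # sequence numbers are 48 bits wide
HALF = 1 << 47      # half of the circular sequence space


def circ_le(a, b):
    """Assumed reading of a one-sided circular 'a <= b'."""
    return (b - a) % SEQ_MOD < HALF


def in_window(lo, x, hi):
    """True if x lies in the circular interval [lo, hi]."""
    return (x - lo) % SEQ_MOD <= (hi - lo) % SEQ_MOD


def windows(gsr, gss, w_peer, w_own):
    """Return (SWL, SWH, AWL, AWH); w_peer is W, w_own is W'."""
    swl = (gsr + 1 - w_peer // 4) % SEQ_MOD          # GSR + 1 - floor(W/4)
    swh = (gsr + (3 * w_peer + 3) // 4) % SEQ_MOD    # GSR + ceil(3W/4)
    awl = (gss + 1 - w_own) % SEQ_MOD                # GSS + 1 - W'
    awh = gss % SEQ_MOD
    return swl, swh, awl, awh


def sequence_valid(ptype, seqno, ackno, gsr, gar, win, state="OPEN"):
    """Apply the basic (non-stringent) checks from the table."""
    swl, swh, awl, awh = win
    if ptype in ("Request", "Response"):
        # (*) Check not applied in LISTEN or REQUEST state.
        seq_ok = state in ("LISTEN", "REQUEST") or in_window(swl, seqno, swh)
    elif ptype in ("CloseReq", "Close", "Reset"):
        seq_ok = seqno != gsr and in_window(gsr, seqno, swh)  # GSR < seqno
    elif ptype in ("Sync", "SyncAck"):
        seq_ok = circ_le(swl, seqno)
    else:  # Data, Ack, DataAck
        seq_ok = in_window(swl, seqno, swh)

    if ptype in ("Request", "Data"):
        ack_ok = True  # N/A: these packets carry no Acknowledgement Number
    elif ptype in ("CloseReq", "Close", "Reset"):
        ack_ok = in_window(gar, ackno, awh)
    else:
        ack_ok = in_window(awl, ackno, awh)
    return seq_ok and ack_ok
```

   With the default Sequence Window of 100, an endpoint with GSR = 1
   accepts DCCP-Data with Sequence Number 2 but not 101, which matches
   the first example in Section 7.5.6.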

   The lenient checks on DCCP-Sync and DCCP-SyncAck packets allow
   continued operation after unusual events, such as endpoint crashes
   and large bursts of loss, but there's no need for leniency in the
   absence of unusual events -- that is, during ongoing successful
   communication. Therefore, DCCP implementations SHOULD use the
   following, more stringent checks for active connections, where a
   connection is considered active if it has received valid packets
   from the other endpoint within the last five round-trip times.

                                              Acknowledgement Number
   Packet Type      Sequence Number Check     Check
   -----------      ---------------------     ----------------------
   DCCP-Sync        SWL <= seqno <= SWH       AWL <= ackno <= AWH
   DCCP-SyncAck     SWL <= seqno <= SWH       AWL <= ackno <= AWH

   Finally, an endpoint MAY apply the following more stringent checks
   to DCCP-CloseReq, DCCP-Close, and DCCP-Reset packets, further
   lowering the probability of successful blind attacks using those
   packet types. Since these checks can cause extra synchronization
   overhead and delay connection closing when packets are lost, they
   should be considered experimental.

                                              Acknowledgement Number
   Packet Type      Sequence Number Check     Check
   -----------      ---------------------     ----------------------
   DCCP-CloseReq    seqno == GSR + 1          GAR <= ackno <= AWH
   DCCP-Close       seqno == GSR + 1          GAR <= ackno <= AWH
   DCCP-Reset       seqno == GSR + 1          GAR <= ackno <= AWH

   Note that sequence-validity is only one of the validity checks
   applied to received packets.

7.5.4. Handling Sequence-Invalid Packets

   Endpoints MUST ignore sequence-invalid DCCP-Sync and DCCP-SyncAck
   packets, and MUST respond to other sequence-invalid packets with
   (possibly rate-limited) DCCP-Sync packets.
   Each such DCCP-Sync packet MUST use a new Sequence Number, and thus
   will increase GSS; GSR will not change, however, since the received
   packet was sequence-invalid. Each such DCCP-Sync packet's
   Acknowledgement Number MUST equal GSR when the received
   sequence-invalid packet has type DCCP-Reset, and the received
   packet's Sequence Number otherwise.

   On receiving a sequence-valid DCCP-Sync packet, the peer endpoint
   (say, DCCP B) MUST update its GSR variable and reply with a
   DCCP-SyncAck packet. The DCCP-SyncAck packet's Acknowledgement
   Number will equal the DCCP-Sync's Sequence Number, not necessarily
   GSR. Upon receiving this DCCP-SyncAck, which will be sequence-valid
   since it acknowledges the DCCP-Sync, DCCP A will update its GSR
   variable, and the endpoints will be back in sync. As an exception,
   if the peer endpoint is in the REQUEST state, it MUST respond with a
   DCCP-Reset instead of a DCCP-SyncAck. This serves to clean up
   DCCP A's half-open connection.

   To protect against denial-of-service attacks, DCCP implementations
   SHOULD impose a rate limit on DCCP-Syncs sent in response to
   sequence-invalid packets, such as not more than eight DCCP-Syncs per
   second.

   DCCP endpoints MUST NOT process sequence-invalid packets except,
   perhaps, by generating a DCCP-Sync. For instance, options MUST NOT
   be processed. An endpoint MAY temporarily preserve
   sequence-invalid packets in case they become valid later, however;
   this can reduce the impact of bursts of loss by delivering more
   packets to the application. In particular, an endpoint MAY preserve
   sequence-invalid packets for up to 2 round-trip times. If, within
   that time, the relevant sequence windows change so that the packets
   become sequence-valid, the endpoint MAY process them again.

   Note that sequence-invalid DCCP-Reset packets cause DCCP-Syncs to be
   generated.
   This is because endpoints in an unsynchronized state (CLOSED,
   REQUEST, and LISTEN) might not have enough information to generate a
   proper DCCP-Reset on the first try. For example, if a peer endpoint
   is in CLOSED state and receives a DCCP-Data packet, it cannot guess
   the right Sequence Number to use on the DCCP-Reset it generates
   (since the DCCP-Data packet has no Acknowledgement Number). The
   DCCP-Sync generated in response to this bad reset serves as a
   challenge, and contains enough information for the peer to generate
   a proper DCCP-Reset. However, the new DCCP-Reset may carry a
   different Reset Code than the original DCCP-Reset; probably the new
   Reset Code will be 3, "No Connection". The endpoint SHOULD use
   information from the original DCCP-Reset when possible.

7.5.5. Sequence Number Attacks

   Sequence and Acknowledgement Numbers form DCCP's main line of
   defense against attackers. An attacker that cannot guess sequence
   numbers cannot easily manipulate or hijack a DCCP connection, and
   requirements like careful initial sequence number choice eliminate
   the most serious attacks.

   An attacker might still send many packets with randomly chosen
   Sequence and Acknowledgement Numbers, however. If one of those
   probes ends up sequence-valid, it may shut down the connection or
   otherwise cause problems. The easiest such attacks to execute are:

   o Send DCCP-Data packets with random Sequence Numbers. If one of
     these packets hits the valid sequence number window, the attack
     packet's application data may be inserted into the data stream.

   o Send DCCP-Sync packets with random Sequence and Acknowledgement
     Numbers. If one of these packets hits the valid acknowledgement
     number window, the receiver will shift its sequence number window
     accordingly, getting out of sync with the correct endpoint --
     perhaps permanently.

   The attacker has to guess both Source and Destination Ports for any
   of these attacks to succeed. Additionally, the connection would
   have to be inactive for the DCCP-Sync attack to succeed, assuming
   the victim implemented the more stringent checks for active
   connections recommended in Section 7.5.3.

   To quantify the probability of success, let N be the number of
   attack packets the attacker is willing to send, W be the relevant
   sequence window width, and L be the length of sequence numbers (24
   or 48). The attacker's best strategy is to space the attack packets
   evenly over sequence space. Then the probability of hitting one
   sequence number window is P = WN/2^L.

   The success probability for a DCCP-Data attack using short sequence
   numbers thus equals P = WN/2^24. For W = 100, then, the attacker
   must send more than 83,000 packets to achieve a 50% chance of
   success. For reference, the easiest TCP attack -- sending a SYN
   with a random sequence number, which will cause a connection reset
   if it falls within the window -- has W = 8760 (a common default) and
   L = 32, and requires more than 245,000 packets to achieve a 50%
   chance of success.

   A fast connection's W will generally be high, increasing the attack
   success probability for fixed N. If this probability gets
   uncomfortably high with L = 24, the endpoint SHOULD prevent the use
   of short sequence numbers by manipulating the Allow Short Sequence
   Numbers feature (see Section 7.6.1). The probability limit depends
   on the application, however. Some applications, such as those
   already designed to handle corruption, are quite resilient to data
   injection attacks.

   The DCCP-Sync attack has L = 48, since DCCP-Sync packets use long
   sequence numbers exclusively; in addition, the success probability
   is halved, since only half the Sequence Number space is valid.
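
   The arithmetic behind the packet counts quoted in this section can
   be checked with a short sketch. The function name and the
   valid_fraction parameter are hypothetical; valid_fraction models the
   halving described for the DCCP-Sync attack.

```python
import math


def packets_for_success(p_target, window, seq_bits, valid_fraction=1.0):
    """Smallest N for which P = valid_fraction * W * N / 2^L >= p_target."""
    return math.ceil(p_target * (1 << seq_bits) / (window * valid_fraction))


# DCCP-Data with short sequence numbers: W = 100, L = 24.
n_dccp_data = packets_for_success(0.5, 100, 24)

# TCP SYN reference case: W = 8760, L = 32.
n_tcp_syn = packets_for_success(0.5, 8760, 32)

# DCCP-Sync: W = 2000, L = 48, success probability halved.
n_dccp_sync = packets_for_success(0.5, 2000, 48, valid_fraction=0.5)
```

   These evaluate to roughly 8.4 * 10^4, 2.45 * 10^5, and 1.4 * 10^11
   packets respectively, consistent with the thresholds of 83,000,
   245,000, and 10^11 given in the text.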
   Attacks have a correspondingly smaller probability of success. For
   a large W of 2000 packets, then, the attacker must send more than
   10^11 packets to achieve a 50% chance of success.

   Attacks involving DCCP-Ack, DCCP-DataAck, DCCP-CloseReq, DCCP-Close,
   and DCCP-Reset packets are more difficult, since Sequence and
   Acknowledgement Numbers must both be guessed. The probability of
   attack success for these packet types equals P = WXN/2^(2L), where W
   is the Sequence Number window, X is the Acknowledgement Number
   window, and N and L are as before.

   Since DCCP-Data attacks with short sequence numbers are relatively
   easy for attackers to execute, DCCP has been engineered to prevent
   these attacks from escalating to connection resets or other serious
   consequences. In particular, any options whose processing might
   cause the connection to be reset are ignored when they appear on
   DCCP-Data packets.

7.5.6. Sequence Number Handling Examples

   In the following example, DCCP A and DCCP B recover from a large
   burst of loss that runs DCCP A's sequence numbers out of DCCP B's
   appropriate sequence number window.

   DCCP A                                           DCCP B
   (GSS=1,GSR=10)                                   (GSS=10,GSR=1)
              -->   DCCP-Data(seq 2)     XXX
                    ...
              -->   DCCP-Data(seq 100)   XXX
              -->   DCCP-Data(seq 101)             -->  ???
                                                    seqno out of range;
                                                    send Sync
      OK      <--   DCCP-Sync(seq 11, ack 101)     <--
                                                    (GSS=11,GSR=1)
              -->   DCCP-SyncAck(seq 102, ack 11)  -->  OK
   (GSS=102,GSR=11)                                 (GSS=11,GSR=102)

   In the next example, a DCCP connection recovers from a simple blind
   attack.

   DCCP A                                           DCCP B
   (GSS=1,GSR=10)                                   (GSS=10,GSR=1)
   *ATTACKER* -->   DCCP-Data(seq 10^6)            -->  ???
                                                    seqno out of range;
                                                    send Sync
      ???     <--   DCCP-Sync(seq 11, ack 10^6)    <--
   ackno out of range; ignore
   (GSS=1,GSR=10)                                   (GSS=11,GSR=1)

   The final example demonstrates recovery from a half-open connection.

   DCCP A                                           DCCP B
   (GSS=1,GSR=10)                                   (GSS=10,GSR=1)
   (Crash)
   CLOSED                                           OPEN
   REQUEST    -->   DCCP-Request(seq 400)          -->  ???
     !!       <--   DCCP-Sync(seq 11, ack 400)     <--  OPEN
   REQUEST    -->   DCCP-Reset(seq 401, ack 11)    -->  (Abort)
   REQUEST                                          CLOSED
   REQUEST    -->   DCCP-Request(seq 402)          -->  ...

7.6. Short Sequence Numbers

   DCCP sequence numbers are 48 bits long. This large sequence space
   protects DCCP connections against some blind attacks, such as the
   injection of DCCP-Resets into the connection. However, DCCP-Data,
   DCCP-Ack, and DCCP-DataAck packets, which make up the body of any
   DCCP connection, may reduce header space by transmitting only the
   lower 24 bits of the relevant Sequence and Acknowledgement Numbers.
   The receiving endpoint will extend these numbers to 48 bits using
   the following pseudocode:

   procedure Extend_Sequence_Number(S, REF)
      /* S is a 24-bit sequence number from the packet header.
         REF is the relevant 48-bit reference sequence number:
         GSS if S is an Acknowledgement Number, and GSR if S is a
         Sequence Number. */
      Set REF_low := low 24 bits of REF
      Set REF_hi := high 24 bits of REF
      If REF_low (<) S           /* circular comparison mod 2^24 */
            and S |<| REF_low,   /* conventional, non-circular
                                    comparison */
         Return (((REF_hi + 1) mod 2^24) << 24) | S
      Otherwise, if S (<) REF_low and REF_low |<| S,
         Return (((REF_hi - 1) mod 2^24) << 24) | S
      Otherwise,
         Return (REF_hi << 24) | S

   The two different kinds of comparison in the if statements detect
   when the low-order bits of the sequence space have wrapped. (The
   circular comparison "REF_low (<) S" returns true if and only if
   (S - REF_low), calculated using two's-complement arithmetic and then
   represented as an unsigned number, is less than or equal to 2^23
   (mod 2^24).) When this happens, the high-order bits are incremented
   or decremented, as appropriate.

7.6.1. Allow Short Sequence Numbers Feature

   Endpoints can require that all packets use long sequence numbers by
   setting the Allow Short Sequence Numbers feature to false. This can
   reduce the risk that data will be inappropriately injected into the
   connection. DCCP A sends a "Change R(Allow Short Seqnos, 0)" option
   to ask DCCP B to send only long sequence numbers.

   Allow Short Sequence Numbers has feature number 2, and is
   server-priority. It takes one-byte Boolean values. DCCP B MUST NOT
   send packets with short sequence numbers when Allow Short Seqnos/B
   is zero. Values of two or more are reserved. New connections start
   with Allow Short Sequence Numbers 1 for both endpoints.

7.6.2. When to Avoid Short Sequence Numbers

   Short sequence numbers reduce the rate DCCP connections can safely
   achieve, and increase the risks of certain kinds of attacks,
   including blind data injection. Very-high-rate DCCP connections,
   and connections with large sequence windows (Section 7.5.2), SHOULD
   NOT use short sequence numbers on their data packets. The attack
   risk issues have been discussed in Section 7.5.5; we discuss the
   rate limitation issue here.

   The sequence-validity mechanism assumes that the network does not
   deliver extremely old data. In particular, it assumes that the
   network must have dropped any packet by the time the connection
   wraps around and uses its sequence number again. This constraint
   limits the maximum connection rate that can be safely achieved. Let
   MSL equal the maximum segment lifetime, P equal the average DCCP
   packet size in bits, and L equal the length of sequence numbers (24
   or 48 bits). Then the maximum safe rate, in bits per second, is R =
   P*(2^L)/(2*MSL).

   For the default MSL of 2 minutes, 1500-byte DCCP packets, and short
   sequence numbers, the safe rate is therefore approximately 800 Mb/s.
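
   A short sketch of this calculation follows. The function name is
   hypothetical, and the formula is read as the sequence space divided
   by twice the MSL, with MSL in seconds.

```python
def max_safe_rate_bps(packet_bits, seq_bits, msl_seconds=120):
    """Maximum safe rate in bits per second: R = P * 2^L / (2 * MSL)."""
    return packet_bits * (1 << seq_bits) / (2 * msl_seconds)


PACKET_BITS = 1500 * 8  # 1500-byte packets

short_rate = max_safe_rate_bps(PACKET_BITS, 24)  # short sequence numbers
long_rate = max_safe_rate_bps(PACKET_BITS, 48)   # long sequence numbers
```

   Here short_rate comes to about 8.4 * 10^8 b/s, in line with the
   approximate 800 Mb/s figure, and long_rate to about 1.4 * 10^16 b/s,
   or roughly 14 petabits per second.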
   Although 2 minutes is a very large MSL for any networks that could
   sustain that rate with such small packets, long sequence numbers
   allow much higher rates under the same constraints: up to
   14 petabits a second for 1500-byte packets and the default MSL.

7.7. NDP Count and Detecting Application Loss

   DCCP's sequence numbers increment by one on every packet, including
   non-data packets (packets that don't carry application data). This
   makes DCCP sequence numbers suitable for detecting any network loss,
   but not for detecting the loss of application data. The NDP Count
   option reports the length of each burst of non-data packets. This
   lets the receiving DCCP reliably determine when a burst of loss
   included application data.

   +--------+--------+-------- ... --------+
   |00100101| Length |      NDP Count      |
   +--------+--------+-------- ... --------+
    Type=37  Len=3-5       (1-3 bytes)

   If a DCCP endpoint's Send NDP Count feature is one (see below), then
   that endpoint MUST send an NDP Count option on every packet whose
   immediate predecessor was a non-data packet. Non-data packets
   consist of DCCP packet types DCCP-Ack, DCCP-Close, DCCP-CloseReq,
   DCCP-Reset, DCCP-Sync, and DCCP-SyncAck. The other packet types,
   namely DCCP-Request, DCCP-Response, DCCP-Data, and DCCP-DataAck, are
   considered data packets, although not all DCCP-Request and
   DCCP-Response packets will actually carry application data.

   The value stored in NDP Count equals the number of consecutive
   non-data packets in the run immediately previous to the current
   packet. Packets with no NDP Count option are considered to have NDP
   Count zero.

   The NDP Count option can carry one to three bytes of data. The
   smallest option format that can hold the NDP Count SHOULD be used.

   With NDP Count, the receiver can reliably tell only whether a burst
   of loss contained at least one data packet.
   For example, the receiver cannot always tell whether a burst of loss
   contained a non-data packet.

7.7.1. NDP Count Usage Notes

   Say that K consecutive sequence numbers are missing in some burst of
   loss, and the Send NDP Count feature is on. Then some application
   data was lost within those sequence numbers unless the packet
   following the hole contains an NDP Count option whose value is
   greater than or equal to K.

   For example, say that an endpoint sent the following sequence of
   non-data packets (Nx) and data packets (Dx).

      N0  N1  D2  N3  D4  D5  N6  D7  D8  D9  D10  N11  N12  D13

   Those packets would have NDP Counts as follows.

      N0  N1  D2  N3  D4  D5  N6  D7  D8  D9  D10  N11  N12  D13
      -   1   2   -   1   -   -   1   -   -   -    -    1    2

   NDP Count is not useful for applications that include their own
   sequence numbers with their packet headers.

7.7.2. Send NDP Count Feature

   The Send NDP Count feature lets DCCP endpoints negotiate whether
   they should send NDP Count options on their packets. DCCP A sends a
   "Change R(Send NDP Count, 1)" option to ask DCCP B to send NDP Count
   options.

   Send NDP Count has feature number 7, and is server-priority. It
   takes one-byte Boolean values. DCCP B MUST send NDP Count options
   as described above when Send NDP Count/B is one, although it MAY
   send NDP Count options even when Send NDP Count/B is zero. Values
   of two or more are reserved. New connections start with Send NDP
   Count 0 for both endpoints.

8. Event Processing

   This section describes how DCCP connections move between states, and
   which packets are sent when. Note that feature negotiation takes
   place in parallel with the connection-wide state transitions
   described here.

8.1. Connection Establishment

   DCCP connections' initiation phase consists of a three-way
   handshake: an initial DCCP-Request packet sent by the client, a
   DCCP-Response sent by the server in reply, and finally an
   acknowledgement from the client, usually via a DCCP-Ack or
   DCCP-DataAck packet. The client moves from the REQUEST state to
   PARTOPEN, and finally to OPEN; the server moves from LISTEN to
   RESPOND, and finally to OPEN.

   Client State                             Server State
      CLOSED                                   LISTEN
   1.   REQUEST   -->       Request        -->
   2.             <--       Response       <--   RESPOND
   3.   PARTOPEN  -->     Ack, DataAck     -->
   4.             <--  Data, Ack, DataAck  <--   OPEN
   5.   OPEN      <->  Data, Ack, DataAck  <->   OPEN

8.1.1. Client Request

   When a client decides to initiate a connection, it enters the
   REQUEST state, chooses an initial sequence number (Section 7.2), and
   sends a DCCP-Request packet using that sequence number to the
   intended server.

   DCCP-Request packets will commonly carry feature negotiation options
   that open negotiations for various connection parameters, such as
   preferred congestion control IDs for each half-connection. They may
   also carry application data, but the client should be aware that the
   server may not accept such data.

   A client in the REQUEST state SHOULD use an exponential-backoff
   timer to send new DCCP-Request packets if no response is received.
   The first retransmission should occur after approximately one
   second, backing off to not less than one packet every 64 seconds; or
   the endpoint can use whatever retransmission strategy is followed
   for retransmitting TCP SYNs. Each new DCCP-Request MUST increment
   the Sequence Number by one, and MUST contain the same Service Code
   and application data as the original DCCP-Request.

   A client MAY give up on its DCCP-Requests after some time
   (3 minutes, for example).
When it does, it SHOULD send a DCCP-Reset 2773 packet to the server with Reset Code 2, "Aborted", to clean up state 2774 in case one or more of the Requests actually arrived. A client in 2775 REQUEST state has never received an initial sequence number from its 2776 peer, so the DCCP-Reset's Acknowledgement Number MUST be set to 2777 zero. 2779 The client leaves the REQUEST state for PARTOPEN when it receives a 2780 DCCP-Response from the server. 2782 8.1.2. Service Codes 2784 Each DCCP-Request contains a 32-bit Service Code, which identifies 2785 the application-level service to which the client application is 2786 trying to connect. Service Codes should correspond to application 2787 services and protocols. For example, there might be a Service Code 2788 for SIP control connections and one for RTP audio connections. 2789 Middleboxes, such as firewalls, can use the Service Code to identify 2790 the application running on a nonstandard port (assuming the DCCP 2791 header has not been encrypted). 2793 Endpoints MUST associate a Service Code with every DCCP socket, both 2794 actively and passively opened. The application will generally 2795 supply this Service Code. Each active socket MUST have exactly one 2796 Service Code. Passive sockets MAY, at the implementation's 2797 discretion, be associated with more than one Service Code; this 2798 might let multiple applications, or multiple versions of the same 2799 application, listen on the same port, differentiated by Service 2800 Code. If the DCCP-Request's Service Code doesn't equal any of the 2801 server's Service Codes for the given port, the server MUST reject 2802 the request by sending a DCCP-Reset packet with Reset Code 8, "Bad 2803 Service Code". A middlebox MAY also send such a DCCP-Reset in 2804 response to packets whose Service Code is considered unsuitable. 2806 Service Codes are not intended to be DCCP-specific, and are 2807 allocated by IANA. 
Following the policies outlined in RFC 2434, 2808 most Service Codes are allocated First Come First Served, subject to 2809 the following guidelines. 2811 o Service Codes are allocated one at a time, or in small blocks. A 2812 short English description of the intended service is REQUIRED to 2813 obtain a Service Code assignment, but no specification, 2814 standards-track or otherwise, is necessary. IANA maintains an 2815 association of Service Codes to the corresponding phrases. 2817 o Users request specific Service Code values. We suggest that 2818 users request Service Codes that can be interpreted as meaningful 2819 four-byte ASCII strings. Thus, the "Frobodyne Plotz Protocol" 2820 might correspond to "fdpz", or the number 1717858426. The 2821 canonical interpretation of a Service Code field is numeric. 2823 o Service Codes whose bytes each have values in the set {32, 45-57, 2824 65-90} use a Specification Required allocation policy. That is, 2825 these Service Codes are used for international standard or 2826 standards-track specifications, IETF or otherwise. (This set 2827 consists of the ASCII digits, uppercase letters, and characters 2828 space, '-', '.', and '/'.) 2830 o Service Codes whose high-order byte equals 63 (ASCII '?') are 2831 reserved for Private Use. 2833 o Service Code 0 represents the absence of a meaningful Service 2834 Code, and MUST NOT be allocated. 2836 o The value 4294967295 is an invalid Service Code. Servers MUST 2837 reject any DCCP-Request with this Service Code value by sending a 2838 DCCP-Reset packet with Reset Code 8, "Bad Service Code". 2840 This design for Service Code allocation is based on the allocation 2841 of 4-byte identifiers for Macintosh resources, PNG chunks, and 2842 TrueType and OpenType tables. 2844 In text settings, we recommend that Service Codes be written in one 2845 of three forms, prefixed by the ASCII letters SC and either a colon 2846 ":" or equals sign "=". These forms are interpreted as follows. 
2848 SC: Indicates a Service Code representable using a subset of the 2849 ASCII characters. The colon is followed by between one and 2850 four characters taken from the following set: letters, 2851 digits, and the characters in "-_+.*/?@" (not including 2852 quotes). Numerically, these characters have values in 2853 {42-43, 45-57, 63-90, 95, 97-122}. The Service Code is 2854 calculated by padding the string on the right with spaces 2855 (value 32) and interpreting the four-character result as a 2856 32-bit big-endian number. 2858 SC= Indicates a decimal Service Code. The equals sign is followed 2859 by any number of decimal digits, which specify the Service 2860 Code. Values above 4294967294 are illegal. 2862 SC=x or SC=X 2863 Indicates a hexadecimal Service Code. The "x" or "X" is 2864 followed by any number of hexadecimal digits (upper or lower 2865 case), which specify the Service Code. Values above 2866 4294967294 are illegal. 2868 Thus, the Service Code 1717858426 might be represented in text as 2869 either SC:fdpz, SC=1717858426, or SC=x6664707A. 2871 8.1.3. Server Response 2873 In the second phase of the three-way handshake, the server moves 2874 from the LISTEN state to RESPOND, and sends a DCCP-Response message 2875 to the client. In this phase, a server will often specify the 2876 features it would like to use, either from among those the client 2877 requested, or in addition to those. Among these options is the 2878 congestion control mechanism the server expects to use. 2880 The server MAY respond to a DCCP-Request packet with a DCCP-Reset 2881 packet to refuse the connection.
Relevant Reset Codes for refusing 2882 a connection include 7, "Connection Refused", when the DCCP- 2883 Request's Destination Port did not correspond to a DCCP port open 2884 for listening; 8, "Bad Service Code", when the DCCP-Request's 2885 Service Code did not correspond to the service code registered with 2886 the Destination Port; and 9, "Too Busy", when the server is 2887 currently too busy to respond to requests. The server SHOULD limit 2888 the rate at which it generates these resets, for example to not more 2889 than 1024 per second. 2891 The server SHOULD NOT retransmit DCCP-Response packets; the client 2892 will retransmit the DCCP-Request if necessary. (Note that the 2893 "retransmitted" DCCP-Request will have, at least, a different 2894 sequence number from the "original" DCCP-Request. The server can 2895 thus distinguish true retransmissions from network duplicates.) The 2896 server will detect that the retransmitted DCCP-Request applies to an 2897 existing connection because of its Source and Destination Ports. 2898 Every valid DCCP-Request received while the server is in the RESPOND 2899 state MUST elicit a new DCCP-Response. Each new DCCP-Response MUST 2900 increment the server's Sequence Number by one, and MUST include the 2901 same application data, if any, as the original DCCP-Response. 2903 The server MUST NOT accept more than one piece of DCCP-Request 2904 application data per connection. In particular, the DCCP-Response 2905 sent in reply to a retransmitted DCCP-Request with application data 2906 SHOULD contain a Data Dropped option, in which the retransmitted 2907 DCCP-Request data is reported with Drop Code 0, Protocol 2908 Constraints. The original DCCP-Request SHOULD also be reported in 2909 the Data Dropped option, either in a Normal Block (if the server 2910 accepted the data, or there was no data), or in a Drop Code 0 Drop 2911 Block (if the server refused the data the first time as well). 
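As an illustration of the Service Code text forms described in Section 8.1.2, the following Python sketch converts the "SC:", "SC=", and "SC=x" forms to the canonical numeric value. It is a non-normative sketch: the function name parse_service_code and its error handling are assumptions made for this example and are not defined by this document.

```python
import string

# Hypothetical helper, not part of the specification.
MAX_SERVICE_CODE = 4294967294   # 4294967295 is an invalid Service Code
SC_CHARS = set(string.ascii_letters + string.digits + "-_+.*/?@")


def parse_service_code(text: str) -> int:
    """Convert 'SC:xxxx', 'SC=<decimal>', or 'SC=x<hex>' to a 32-bit number."""
    if text.startswith("SC:"):
        body = text[3:]
        if not 1 <= len(body) <= 4 or not set(body) <= SC_CHARS:
            raise ValueError("malformed SC: form")
        # Pad on the right with spaces (value 32), then read the four
        # characters as a 32-bit big-endian number.
        return int.from_bytes(body.ljust(4).encode("ascii"), "big")
    if text.startswith(("SC=x", "SC=X")):
        digits, base, allowed = text[4:], 16, string.hexdigits
    elif text.startswith("SC="):
        digits, base, allowed = text[3:], 10, string.digits
    else:
        raise ValueError("missing SC prefix")
    if not digits or not set(digits) <= set(allowed):
        raise ValueError("malformed digits")
    value = int(digits, base)
    if value > MAX_SERVICE_CODE:
        raise ValueError("Service Code out of range")
    return value
```

All three forms of the example in Section 8.1.2 (SC:fdpz, SC=1717858426, and SC=x6664707A) map to the same number under this sketch.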
2913 The Data Dropped and Init Cookie options are particularly useful for 2914 DCCP-Response packets (Sections 11.7 and 8.1.4). 2916 The server leaves the RESPOND state for OPEN when it receives a 2917 valid DCCP-Ack from the client, completing the three-way handshake. 2918 It MAY also leave the RESPOND state for CLOSED after a timeout of 2919 not less than 4MSL (8 minutes); when doing so, it SHOULD send a 2920 DCCP-Reset with Reset Code 2, "Aborted", to clean up state at the 2921 client. 2923 8.1.4. Init Cookie Option 2925 +--------+--------+--------+--------+--------+-------- 2926 |00100100| Length | Init Cookie Value ... 2927 +--------+--------+--------+--------+--------+-------- 2928 Type=36 2930 The Init Cookie option lets a DCCP server avoid having to hold any 2931 state until the three-way connection setup handshake has completed, 2932 in a similar fashion as TCP SYN cookies [SYNCOOKIES]. The server 2933 wraps up the Service Code, server port, and any options it cares 2934 about from both the DCCP-Request and DCCP-Response in an opaque 2935 cookie. Typically the cookie will be encrypted using a secret known 2936 only to the server and include a cryptographic checksum or magic 2937 value so that correct decryption can be verified. When the server 2938 receives the cookie back in the response, it can decrypt the cookie 2939 and instantiate all the state it avoided keeping. In the meantime, 2940 it need not move from the LISTEN state. 2942 The Init Cookie option MUST NOT be sent on DCCP-Request or DCCP-Data 2943 packets, and any such options received on DCCP-Request or DCCP-Data 2944 packets MUST be ignored. The server MAY include an Init Cookie 2945 option in its DCCP-Response. If so, then the client MUST echo the 2946 same Init Cookie option in each succeeding DCCP packet until one of 2947 those packets is acknowledged, meaning the three-way handshake has 2948 completed, or the connection is reset. 
(As a result, the client 2949 MUST NOT use DCCP-Data packets until the three-way handshake 2950 completes or the connection is reset.) The server SHOULD design its 2951 Init Cookie format so that Init Cookies can be checked for 2952 tampering; it SHOULD respond to a tampered Init Cookie option by 2953 resetting the connection with Reset Code 10, "Bad Init Cookie". 2955 Init Cookie's precise implementation need not be specified here; 2956 since Init Cookies are opaque to the client, there are no 2957 interoperability concerns. An example cookie format might encrypt 2958 (using a secret key) the connection's initial sequence and 2959 acknowledgement numbers, ports, Service Code, any options included 2960 on the DCCP-Request packet and the corresponding DCCP-Response, a 2961 random salt, and a magic number. On receiving a reflected Init 2962 Cookie, the server would decrypt the cookie, validate it by checking 2963 its magic number, sequence numbers, and ports, and, if valid, create 2964 a corresponding socket using the options. 2966 Init Cookies are limited to at most 253 bytes in length. 2968 8.1.5. Handshake Completion 2970 When the client receives a DCCP-Response from the server, it moves 2971 from the REQUEST state to PARTOPEN and completes the three-way 2972 handshake by sending a DCCP-Ack packet to the server. The client 2973 remains in PARTOPEN until it can be sure that the server has 2974 received some packet the client sent from PARTOPEN (either the 2975 initial DCCP-Ack or a later packet). Clients in the PARTOPEN state 2976 that want to send data MUST do so using DCCP-DataAck packets, not 2977 DCCP-Data packets. This is because DCCP-Data packets lack 2978 Acknowledgement Numbers, so the server can't tell from a DCCP-Data 2979 packet whether the client saw its DCCP-Response. Furthermore, if 2980 the DCCP-Response included an Init Cookie, that Init Cookie MUST be 2981 included on every packet sent in PARTOPEN.
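The Init Cookie format of Section 8.1.4 is left to the server. The following Python sketch shows one possible tamper-evident format. It is an assumption-laden illustration, not a recommended design: it authenticates the cookie with HMAC-SHA256 rather than encrypting it as the example in Section 8.1.4 suggests, so the contents are visible to the client; it omits negotiated options; and the names make_cookie and check_cookie, as well as the field layout, are hypothetical.

```python
import hashlib
import hmac
import os
import struct

MAX_COOKIE_LEN = 253    # Init Cookies are limited to 253 bytes
_FMT = "!QQHHI8s"       # ISS, ISR, source port, dest port, Service Code, salt
_MAC_LEN = 32           # SHA-256 digest length


def make_cookie(key, iss, isr, sport, dport, service_code):
    """Build an opaque cookie: packed fields followed by an HMAC tag."""
    body = struct.pack(_FMT, iss, isr, sport, dport, service_code,
                       os.urandom(8))
    cookie = body + hmac.new(key, body, hashlib.sha256).digest()
    assert len(cookie) <= MAX_COOKIE_LEN
    return cookie


def check_cookie(key, cookie, sport, dport):
    """Return (iss, isr, service_code), or None if the cookie is invalid."""
    if len(cookie) != struct.calcsize(_FMT) + _MAC_LEN:
        return None
    body, tag = cookie[:-_MAC_LEN], cookie[-_MAC_LEN:]
    expected = hmac.new(key, body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        return None     # tampered
    iss, isr, c_sport, c_dport, service_code, _salt = struct.unpack(_FMT, body)
    if (c_sport, c_dport) != (sport, dport):
        return None     # cookie was issued for a different port pair
    return iss, isr, service_code
```

A server using this sketch would treat a None result as a tampered cookie and reset the connection with Reset Code 10, "Bad Init Cookie"; on success it would instantiate the socket state from the returned values.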
2983 The single DCCP-Ack sent when entering the PARTOPEN state might, of 2984 course, be dropped by the network. The client SHOULD ensure that 2985 some packet gets through eventually. The preferred mechanism would 2986 be a roughly 200-millisecond timer, set every time a packet is 2987 transmitted in PARTOPEN. If this timer goes off and the client is 2988 still in PARTOPEN, the client generates another DCCP-Ack and backs 2989 off the timer. If the client remains in PARTOPEN for more than 4MSL 2990 (8 minutes), it SHOULD reset the connection with Reset Code 2, 2991 "Aborted". 2993 The client leaves the PARTOPEN state for OPEN when it receives a 2994 valid packet other than DCCP-Response, DCCP-Reset, or DCCP-Sync from 2995 the server. 2997 8.2. Data Transfer 2999 In the central data transfer phase of the connection, both server 3000 and client are in the OPEN state. 3002 DCCP A sends DCCP-Data and DCCP-DataAck packets to DCCP B due to 3003 application events on host A. These packets are congestion- 3004 controlled by the CCID for the A-to-B half-connection. In contrast, 3005 DCCP-Ack packets sent by DCCP A are controlled by the CCID for the 3006 B-to-A half-connection. Generally, DCCP A will piggyback 3007 acknowledgement information on DCCP-Data packets when acceptable, 3008 creating DCCP-DataAck packets. DCCP-Ack packets are used when there 3009 is no data to send from DCCP A to DCCP B, or when the congestion 3010 state of the A-to-B CCID will not allow data to be sent. 3012 DCCP-Sync and DCCP-SyncAck packets may also occur in the data 3013 transfer phase. Some cases causing DCCP-Sync generation are 3014 discussed in Section 7.5. One important distinction between DCCP- 3015 Sync packets and other packet types is that DCCP-Sync elicits an 3016 immediate acknowledgement. 
On receiving a valid DCCP-Sync packet, a 3017 DCCP endpoint MUST immediately generate and send a DCCP-SyncAck 3018 response (subject to any implementation rate limits); the 3019 Acknowledgement Number on that DCCP-SyncAck MUST equal the Sequence 3020 Number of the DCCP-Sync. 3022 A particular DCCP implementation might decide to initiate feature 3023 negotiation only once the OPEN state was reached, in which case it 3024 might not allow data transfer until some time later. Data received 3025 during that time SHOULD be rejected and reported using a Data 3026 Dropped Drop Block with Drop Code 0, Protocol Constraints (see 3027 Section 11.7). 3029 8.3. Termination 3031 DCCP connection termination uses a handshake consisting of an 3032 optional DCCP-CloseReq packet, a DCCP-Close packet, and a DCCP-Reset 3033 packet. The server moves from the OPEN state, possibly through the 3034 CLOSEREQ state, to CLOSED; the client moves from OPEN through 3035 CLOSING to TIMEWAIT, and after 2MSL wait time (4 minutes), to 3036 CLOSED. 3038 The sequence DCCP-CloseReq, DCCP-Close, DCCP-Reset is used when the 3039 server decides to close the connection, but doesn't want to hold 3040 TIMEWAIT state: 3042 Client State Server State 3043 OPEN OPEN 3044 1. <-- CloseReq <-- CLOSEREQ 3045 2. CLOSING --> Close --> 3046 3. <-- Reset <-- CLOSED (LISTEN) 3047 4. TIMEWAIT 3048 5. CLOSED 3050 A shorter sequence occurs when the client decides to close the 3051 connection. 3053 Client State Server State 3054 OPEN OPEN 3055 1. CLOSING --> Close --> 3056 2. <-- Reset <-- CLOSED (LISTEN) 3057 3. TIMEWAIT 3058 4. CLOSED 3060 Finally, the server can decide to hold TIMEWAIT state: 3062 Client State Server State 3063 OPEN OPEN 3064 1. <-- Close <-- CLOSING 3065 2. CLOSED --> Reset --> 3066 3. TIMEWAIT 3067 4. CLOSED (LISTEN) 3069 In all cases, the receiver of the DCCP-Reset packet holds TIMEWAIT 3070 state for the connection. 
As in TCP, TIMEWAIT state, where an 3071 endpoint quietly preserves a socket for 2MSL (4 minutes) after its 3072 connection has closed, ensures that no connection duplicating the 3073 current connection's source and destination addresses and ports can 3074 start up while old packets might remain in the network. 3076 The termination handshake proceeds as follows. The receiver of a 3077 valid DCCP-CloseReq packet MUST respond with a DCCP-Close packet. 3078 The receiver of a valid DCCP-Close packet MUST respond with a DCCP- 3079 Reset packet, with Reset Code 1, "Closed". The receiver of a valid 3080 DCCP-Reset packet -- which is also the sender of the DCCP-Close 3081 packet (and possibly the receiver of the DCCP-CloseReq packet) -- 3082 will hold TIMEWAIT state for the connection. 3084 A DCCP-Reset packet completes every DCCP connection, whether the 3085 termination is clean (due to application close; Reset Code 1, 3086 "Closed") or unclean. Unlike TCP, which has two distinct 3087 termination mechanisms (FIN and RST), DCCP ends all connections in a 3088 uniform manner. This is justified because some aspects of 3089 connection termination are the same independent of whether 3090 termination was clean. For instance, the endpoint that receives a 3091 valid DCCP-Reset SHOULD hold TIMEWAIT state for the connection. 3092 Processors that must distinguish between clean and unclean 3093 termination can examine the Reset Code. DCCP-Reset packets MUST NOT 3094 be generated in response to received DCCP-Reset packets. DCCP 3095 implementations generally transition to the CLOSED state after 3096 sending a DCCP-Reset packet. 3098 Endpoints in the CLOSEREQ and CLOSING states MUST retransmit DCCP- 3099 CloseReq and DCCP-Close packets, respectively, until leaving those 3100 states. The retransmission timer should initially be set to go off 3101 in two round-trip times, and should back off to not less than once 3102 every 64 seconds if no relevant response is received. 
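The retransmission schedule for DCCP-CloseReq and DCCP-Close described above can be sketched in Python as follows. The text specifies only the initial timeout (two round-trip times) and the 64-second limit; the doubling on each retransmission and the name close_retransmit_intervals are assumptions of this sketch.

```python
from itertools import islice

MAX_INTERVAL = 64.0     # seconds; retransmit at least this often


def close_retransmit_intervals(rtt):
    """Yield successive waits (in seconds) between retransmissions.

    Starts at two round-trip times and doubles (an assumed backoff
    factor) until the interval reaches 64 seconds.
    """
    interval = 2.0 * rtt
    while True:
        yield min(interval, MAX_INTERVAL)
        interval = min(interval * 2.0, MAX_INTERVAL)


# Example: the first eight waits for a 500 ms round-trip time.
first_waits = list(islice(close_retransmit_intervals(0.5), 8))
```

An endpoint would stop drawing from this schedule as soon as it leaves the CLOSEREQ or CLOSING state.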
3104 Only the server can send a DCCP-CloseReq packet or enter the 3105 CLOSEREQ state. A server receiving a sequence-valid DCCP-CloseReq 3106 packet MUST respond with a DCCP-Sync packet, and otherwise ignore 3107 the DCCP-CloseReq. 3109 DCCP-Data, DCCP-DataAck, and DCCP-Ack packets received in CLOSEREQ 3110 or CLOSING states MAY be either processed or ignored. 3112 8.3.1. Abnormal Termination 3114 DCCP endpoints generate DCCP-Reset packets to terminate connections 3115 abnormally; a DCCP-Reset packet may be generated from any state. 3116 Resets sent in the CLOSED, LISTEN, and TIMEWAIT states use Reset 3117 Code 3, "No Connection", unless otherwise specified. Resets sent in 3118 the REQUEST or RESPOND states use Reset Code 4, "Packet Error", 3119 unless otherwise specified. 3121 DCCP endpoints in CLOSED or LISTEN state may need to generate a 3122 DCCP-Reset packet in response to a packet received from a peer. 3123 Since these states have no associated sequence number variables, the 3124 Sequence and Acknowledgement Numbers on the DCCP-Reset packet R are 3125 taken from the received packet P, as follows. 3127 1. If P.ackno exists, then set R.seqno := P.ackno + 1. Otherwise, 3128 set R.seqno := 0. 3130 2. Set R.ackno := P.seqno. 3132 3. If the packet used short sequence numbers (P.X == 0), then set 3133 the upper 24 bits of R.seqno and R.ackno to 0. 3135 8.4. DCCP State Diagram 3137 The most common state transitions discussed above can be summarized 3138 in the following state diagram. The diagram is illustrative; the 3139 text in Section 8.5 and elsewhere should be considered definitive. 3140 For example, there are arcs (not shown) from every state except 3141 CLOSED to TIMEWAIT, contingent on the receipt of a valid DCCP-Reset. 
3143 +---------------------------+ +---------------------------+ 3144 | v v | 3145 | +----------+ | 3146 | +-------------+ CLOSED +------------+ | 3147 | | passive +----------+ active | | 3148 | | open open | | 3149 | | snd Request | | 3150 | v v | 3151 | +----------+ +----------+ | 3152 | | LISTEN | | REQUEST | | 3153 | +----+-----+ +----+-----+ | 3154 | | rcv Request rcv Response | | 3155 | | snd Response snd Ack | | 3156 | v v | 3157 | +----------+ +----------+ | 3158 | | RESPOND | | PARTOPEN | | 3159 | +----+-----+ +----+-----+ | 3160 | | rcv Ack/DataAck rcv packet | | 3161 | | | | 3162 | | +----------+ | | 3163 | +------------>| OPEN |<-----------+ | 3164 | +--+-+--+--+ | 3165 | server active close | | | active close | 3166 | snd CloseReq | | | or rcv CloseReq | 3167 | | | | snd Close | 3168 | | | | | 3169 | +----------+ | | | +----------+ | 3170 | | CLOSEREQ |<---------+ | +--------->| CLOSING | | 3171 | +----+-----+ | +----+-----+ | 3172 | | rcv Close | rcv Reset | | 3173 | | snd Reset | | | 3174 |<---------+ | v | 3175 | | +----+-----+ | 3176 | rcv Close | | TIMEWAIT | | 3177 | snd Reset | +----+-----+ | 3178 +-----------------------------+ | | 3179 +-----------+ 3180 2MSL timer expires 3182 8.5. Pseudocode 3184 This section presents an algorithm describing the processing steps a 3185 DCCP endpoint must go through when it receives a packet. A DCCP 3186 implementation need not implement the algorithm as it is described 3187 here, but any implementation MUST generate observable effects 3188 exactly as indicated by this pseudocode, except where allowed 3189 otherwise by another part of this document. 3191 The received packet is written as P, the socket as S. 3192 Packet variables P.seqno and P.ackno are 48-bit sequence numbers. 
3193 Socket variables: 3194 S.SWL - sequence number window low 3195 S.SWH - sequence number window high 3196 S.AWL - acknowledgement number window low 3197 S.AWH - acknowledgement number window high 3198 S.ISS - initial sequence number sent 3199 S.ISR - initial sequence number received 3200 S.OSR - first OPEN sequence number received 3201 S.GSS - greatest sequence number sent 3202 S.GSR - greatest valid sequence number received 3203 S.GAR - greatest valid acknowledgement number received on a 3204 non-Sync; initialized to S.ISS 3205 "Send packet" actions always use, and increment, S.GSS. 3207 Step 1: Check header basics 3208 /* This step checks for malformed packets. Packets that fail 3209 these checks are ignored -- they do not receive Resets in 3210 response */ 3211 If the packet is shorter than 12 bytes, drop packet and return 3212 If the packet type is not understood, drop packet and return 3213 If P.Data Offset is too small for packet type, or too large for 3214 packet, drop packet and return 3215 If P.type is not Data, Ack, or DataAck and P.X == 0 (the packet 3216 has short sequence numbers), drop packet and return 3217 If the header checksum is incorrect, drop packet and return 3218 If P.CsCov is too large for the packet size, drop packet and 3219 return 3221 Step 2: Check ports and process TIMEWAIT state 3222 /* Flow ID is 4-tuple */ 3223 Look up flow ID in table and get corresponding socket 3224 If no socket, or S.state == TIMEWAIT, 3225 Generate Reset(No Connection) unless P.type == Reset 3226 Drop packet and return 3228 Step 3: Process LISTEN state 3229 If S.state == LISTEN, 3230 If P.type == Request or P contains a valid Init Cookie option, 3231 /* Must scan the packet's options to check for an Init 3232 Cookie. Only the Init Cookie is processed here, 3233 however; other options are processed in Step 8. 
This 3234 scan need only be performed if the endpoint uses Init 3235 Cookies */ 3236 /* Generate a new socket and switch to that socket */ 3237 Set S := new socket for this port pair 3238 S.state := RESPOND 3239 Choose S.ISS (initial seqno) or set from Init Cookie 3240 Initialize S.GAR := S.ISS 3241 Set S.ISR, S.GSR, S.SWL, S.SWH from packet or Init Cookie 3242 Continue with S.state == RESPOND 3243 /* A Response packet will be generated in Step 11 */ 3244 Otherwise, 3245 Generate Reset(No Connection) unless P.type == Reset 3246 Drop packet and return 3248 Step 4: Prepare sequence numbers in REQUEST 3249 If S.state == REQUEST, 3250 If (P.type == Response or P.type == Reset) 3251 and S.AWL <= P.ackno <= S.AWH, 3252 /* Set sequence number variables corresponding to the 3253 other endpoint, so P will pass the tests in Step 6 */ 3254 Set S.GSR, S.ISR, S.SWL, S.SWH 3255 /* Response processing continues in Step 10; Reset 3256 processing continues in Step 9 */ 3257 Otherwise, 3258 /* Only Response and Reset are valid in REQUEST state */ 3259 Generate Reset(Packet Error) 3260 Drop packet and return 3262 Step 5: Prepare sequence numbers for Sync 3263 If P.type == Sync or P.type == SyncAck, 3264 If S.AWL <= P.ackno <= S.AWH and P.seqno >= S.SWL, 3265 /* P is valid, so update sequence number variables 3266 accordingly. After this update, P will pass the tests 3267 in Step 6.
A SyncAck is generated if necessary in 3268 Step 15 */ 3269 Update S.GSR, S.SWL, S.SWH 3270 Otherwise, 3271 Drop packet and return 3273 Step 6: Check sequence numbers 3274 Let LSWL = S.SWL and LAWL = S.AWL 3275 If P.type == CloseReq or P.type == Close or P.type == Reset, 3276 LSWL := S.GSR + 1, LAWL := S.GAR 3277 If LSWL <= P.seqno <= S.SWH 3278 and (P.ackno does not exist or LAWL <= P.ackno <= S.AWH), 3279 Update S.GSR, S.SWL, S.SWH 3280 If P.type != Sync, 3281 Update S.GAR 3282 Otherwise, 3283 If P.type == Reset, 3284 Send Sync packet acknowledging S.GSR 3285 Otherwise, 3286 Send Sync packet acknowledging P.seqno 3288 Drop packet and return 3290 Step 7: Check for unexpected packet types 3291 If (S.is_server and P.type == CloseReq) 3292 or (S.is_server and P.type == Response) 3293 or (S.is_client and P.type == Request) 3294 or (S.state >= OPEN and P.type == Request 3295 and P.seqno >= S.OSR) 3296 or (S.state >= OPEN and P.type == Response 3297 and P.seqno >= S.OSR) 3298 or (S.state == RESPOND and P.type == Data), 3299 Send Sync packet acknowledging P.seqno 3300 Drop packet and return 3302 Step 8: Process options and mark acknowledgeable 3303 /* Option processing is not specifically described here. 3304 Certain options, such as Mandatory, may cause the connection 3305 to be reset, in which case Steps 9 and on are not executed */ 3306 Mark packet as acknowledgeable (in Ack Vector terms, Received 3307 or Received ECN Marked) 3309 Step 9: Process Reset 3310 If P.type == Reset, 3311 Tear down connection 3312 S.state := TIMEWAIT 3313 Set TIMEWAIT timer 3314 Drop packet and return 3316 Step 10: Process REQUEST state (second part) 3317 If S.state == REQUEST, 3318 /* If we get here, P is a valid Response from the server (see 3319 Step 4), and we should move to PARTOPEN state. 
PARTOPEN 3320 means send an Ack, don't send Data packets, retransmit 3321 Acks periodically, and always include any Init Cookie from 3322 the Response */ 3323 S.state := PARTOPEN 3324 Set PARTOPEN timer 3325 Continue with S.state == PARTOPEN 3326 /* Step 12 will send the Ack completing the three-way 3327 handshake */ 3329 Step 11: Process RESPOND state 3330 If S.state == RESPOND, 3331 If P.type == Request, 3332 Send Response, possibly containing Init Cookie 3333 If Init Cookie was sent, 3334 Destroy S and return 3335 /* Step 3 will create another socket when the client 3336 completes the three-way handshake */ 3337 Otherwise, 3338 S.OSR := P.seqno 3339 S.state := OPEN 3341 Step 12: Process PARTOPEN state 3342 If S.state == PARTOPEN, 3343 If P.type == Response, 3344 Send Ack 3345 Otherwise, if P.type != Sync, 3346 S.OSR := P.seqno 3347 S.state := OPEN 3349 Step 13: Process CloseReq 3350 If P.type == CloseReq and S.state < CLOSEREQ, 3351 Generate Close 3352 S.state := CLOSING 3353 Set CLOSING timer 3355 Step 14: Process Close 3356 If P.type == Close, 3357 Generate Reset(Closed) 3358 Tear down connection 3359 Drop packet and return 3361 Step 15: Process Sync 3362 If P.type == Sync, 3363 Generate SyncAck 3365 Step 16: Process data 3366 /* At this point any application data on P can be passed to the 3367 application, except that the application MUST NOT receive 3368 data from more than one Request or Response */ 3370 9. Checksums 3372 DCCP uses a header checksum to protect its header against 3373 corruption. Generally, this checksum also covers any application 3374 data. DCCP applications can, however, request that the header 3375 checksum cover only part of the application data, or perhaps no 3376 application data at all. Link layers may then reduce their 3377 protection on unprotected parts of DCCP packets. For some noisy 3378 links, and applications that can tolerate corruption, this can 3379 greatly improve delivery rates and perceived performance. 
3381 Checksum coverage may eventually impact congestion control 3382 mechanisms as well. A packet with corrupt application data and 3383 complete checksum coverage is treated as lost. This incurs a heavy- 3384 duty loss response from the sender's congestion control mechanism, 3385 which can unfairly penalize connections on links with high 3386 background corruption. The combination of reduced checksum coverage 3387 and Data Checksum options may let endpoints report packets as 3388 corrupt rather than dropped, using Data Dropped options and Drop 3389 Code 3 (see Section 11.7). This may eventually benefit 3390 applications. However, further research is required to determine an 3391 appropriate response to corruption, which can sometimes correlate 3392 with congestion. Corrupt packets currently incur a loss response. 3394 The Data Checksum option, which contains a strong CRC, lets 3395 endpoints detect application data corruption. An API can then be 3396 used to avoid delivering corrupt data to the application, even if 3397 links deliver corrupt data to the endpoint due to reduced checksum 3398 coverage. However, the use of reduced checksum coverage for 3399 applications that demand correct data is currently considered 3400 experimental. This is because the combined loss-plus-corruption 3401 rate for packets with reduced checksum coverage may be significantly 3402 higher than that for packets with full checksum coverage, although 3403 the loss rate will generally be lower. Actual behavior will depend 3404 on link design; further research and experience is required. 3406 Reduced checksum coverage introduces some security considerations; 3407 see Section 18.1. See Appendix B for further motivation and 3408 discussion. DCCP's implementation of reduced checksum coverage was 3409 inspired by UDP-Lite [RFC 3828]. 3411 9.1. Header Checksum Field 3413 DCCP uses the TCP/IP checksum algorithm. 
The Checksum field in the 3414 DCCP generic header (see Section 5.1) equals the 16 bit one's 3415 complement of the one's complement sum of all 16 bit words in the 3416 DCCP header, DCCP options, a pseudoheader taken from the network- 3417 layer header, and, depending on the value of the Checksum Coverage 3418 field, some or all of the application data. When calculating the 3419 checksum, the Checksum field itself is treated as 0. If a packet 3420 contains an odd number of header and payload bytes to be 3421 checksummed, 8 zero bits are added on the right to form a 16 bit 3422 word for checksum purposes. The pad byte is not transmitted as part 3423 of the packet. 3425 The pseudoheader is calculated as for TCP. For IPv4, it is 96 bits 3426 long, and consists of the IPv4 source and destination addresses, the 3427 IP protocol number for DCCP (padded on the left with 8 zero bits), 3428 and the DCCP length as a 16-bit quantity (the length of the DCCP 3429 header with options, plus the length of any data); see RFC 793 3430 (Section 3.1). For IPv6, it is 320 bits long, and consists of the 3431 IPv6 source and destination addresses, the DCCP length as a 32-bit 3432 quantity, and the IP protocol number for DCCP (padded on the left 3433 with 24 zero bits); see RFC 2460 (Section 8.1). 3435 Packets with invalid header checksums MUST be ignored. In 3436 particular, their options MUST NOT be processed. 3438 9.2. Header Checksum Coverage Field 3440 The Checksum Coverage field in the DCCP generic header (see Section 3441 5.1) specifies what parts of the packet are covered by the Checksum 3442 field, as follows: 3444 CsCov = 0 The Checksum field covers the DCCP header, DCCP 3445 options, network-layer pseudoheader, and all 3446 application data in the packet, possibly padded on 3447 the right with zeros to an even number of bytes. 

3449       CsCov = 1-15   The Checksum field covers the DCCP header, DCCP
3450                      options, network-layer pseudoheader, and the initial
3451                      (CsCov-1)*4 bytes of the packet's application data.

3453    Thus, if CsCov is 1, none of the application data is protected by
3454    the header checksum. The value (CsCov-1)*4 MUST be less than or
3455    equal to the length of the application data. Packets with invalid
3456    CsCov values MUST be ignored; in particular, their options MUST NOT
3457    be processed. The meanings of values other than 0 and 1 should be
3458    considered experimental.

3460    Values other than 0 specify that corruption is acceptable in some or
3461    all of the DCCP packet's application data. In fact, DCCP cannot
3462    even detect corruption in areas not covered by the header checksum,
3463    unless the Data Checksum option is used. Applications should not
3464    make any assumptions about the correctness of received data not
3465    covered by the checksum, and should if necessary introduce their own
3466    validity checks.

3468    A DCCP application interface should let sending applications suggest
3469    a value for CsCov for sent packets, defaulting to 0 (full coverage).
3470    The Minimum Checksum Coverage feature, described below, lets an
3471    endpoint refuse delivery of application data on packets with partial
3472    checksum coverage; by default, only fully-covered application data
3473    is accepted. Lower layers that support partial error detection MAY
3474    use the Checksum Coverage field as a hint of where errors do not
3475    need to be detected. Lower layers MUST use a strong error detection
3476    mechanism to detect at least errors that occur in the sensitive part
3477    of the packet, and discard damaged packets. The sensitive part
3478    consists of the bytes between the first byte of the IP header and
3479    the last byte identified by Checksum Coverage.

3481    For more details on application and lower-layer interface issues
3482    relating to partial checksumming, see [RFC 3828].

3484 9.2.1. Minimum Checksum Coverage Feature

3486    The Minimum Checksum Coverage feature lets a DCCP endpoint determine
3487    whether its peer is willing to accept packets with reduced Checksum
3488    Coverage. For example, DCCP A sends a "Change R(Minimum Checksum
3489    Coverage, 1)" option to DCCP B to check whether B is willing to
3490    accept packets with Checksum Coverage set to 1.

3492    Minimum Checksum Coverage has feature number 8, and is server-
3493    priority. It takes one-byte integer values between 0 and 15; values
3494    of 16 or more are reserved. Minimum Checksum Coverage/B reflects
3495    values of Checksum Coverage that DCCP B finds unacceptable. Say
3496    that the value of Minimum Checksum Coverage/B is MinCsCov. Then:

3498    o  If MinCsCov = 0, then DCCP B only finds packets with CsCov = 0
3499       acceptable.

3501    o  If MinCsCov > 0, then DCCP B additionally finds packets with
3502       CsCov >= MinCsCov acceptable.

3504    DCCP B MAY refuse to process application data from packets with
3505    unacceptable Checksum Coverage. Such packets SHOULD be reported
3506    using Data Dropped options (Section 11.7) with Drop Code 0, Protocol
3507    Constraints. New connections start with Minimum Checksum Coverage 0
3508    for both endpoints.

3510 9.3. Data Checksum Option

3512    The Data Checksum option holds a 32-bit CRC-32c cyclic redundancy-
3513    check code of a DCCP packet's application data.

3515    +--------+--------+--------+--------+--------+--------+
3516    |00101100|00000110|              CRC-32c              |
3517    +--------+--------+--------+--------+--------+--------+
3518     Type=44  Length=6

3520    The sending DCCP computes the CRC of the bytes comprising the
3521    application data area and stores it in the option data. The CRC-32c
3522    algorithm used for Data Checksum is the same as that used for SCTP
3523    [RFC 3309]; note that the CRC-32c of zero bytes of data equals zero.
3524    The DCCP header checksum will cover the Data Checksum option, so the
3525    data checksum must be computed before the header checksum.
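
Editorial illustration (not part of the draft): the Data Checksum
computation can be sketched in a few lines of Python. This is a minimal
bit-by-bit sketch, assuming the usual reflected form of the CRC-32c
(Castagnoli) polynomial, constant 0x82F63B78, with an all-ones initial
value and a final inversion. The function name crc32c is hypothetical,
and the byte order in which the result is stored in the option is
governed by [RFC 3309] and is not shown here.

```python
def crc32c(data: bytes) -> int:
    """Return the CRC-32c of `data` as an unsigned 32-bit integer.

    Assumption: reflected Castagnoli polynomial 0x82F63B78, initial
    value 0xFFFFFFFF, final XOR 0xFFFFFFFF (the common CRC-32c form).
    """
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift out one bit; fold in the polynomial if that bit was set.
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


if __name__ == "__main__":
    # Zero bytes of application data yield zero, as the text notes.
    print(hex(crc32c(b"")))
    print(hex(crc32c(b"123456789")))
```

A table-driven version would be faster; the bitwise loop is used only
because it is short enough to check by hand.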

3527    A DCCP endpoint receiving a packet with a Data Checksum option
3528    SHOULD compute the received application data's CRC-32c, using the
3529    same algorithm as the sender, and compare the result with the Data
3530    Checksum value. (The endpoint can indicate its willingness to check
3531    Data Checksums using the Check Data Checksum feature, described
3532    below.) If the CRCs differ, the endpoint reacts in one of two ways.

3534    o  The receiving application may have requested delivery of known-
3535       corrupt data via some optional API. In this case, the packet's
3536       data MUST be delivered to the application, with a note that it is
3537       known to be corrupt. Furthermore, the receiving endpoint MUST
3538       report the packet as delivered corrupt using a Data Dropped
3539       option (Drop Code 7, Delivered Corrupt).

3541    o  Otherwise, the receiving endpoint MUST drop the application data,
3542       and report that data as dropped due to corruption using a Data
3543       Dropped option (Drop Code 3, Corrupt).

3545    In either case, the packet is considered acknowledgeable (since its
3546    header was processed), and will therefore be acknowledged using the
3547    equivalent of Ack Vector's Received or Received ECN Marked states.

3549    Although Data Checksum is intended for packets containing
3550    application data, it may be included on other packets, such as DCCP-
3551    Ack, DCCP-Sync, and DCCP-SyncAck. The receiver SHOULD calculate the
3552    application data area's CRC-32c on such packets, just as it does for
3553    DCCP-Data and similar packets; and if the CRCs differ, the packets
3554    similarly MUST be reported using Data Dropped options (Drop Code 3),
3555    although their application data areas would not be delivered to the
3556    application in any case.

3558 9.3.1. Check Data Checksum Feature

3560    The Check Data Checksum feature lets a DCCP endpoint determine
3561    whether its peer will definitely check Data Checksum options.
3562    DCCP A sends a Mandatory "Change R(Check Data Checksum, 1)" option
3563    to DCCP B to require it to check Data Checksum options (the
3564    connection will be reset if it cannot).

3566    Check Data Checksum has feature number 9, and is server-priority.
3567    It takes one-byte Boolean values. DCCP B MUST check any received
3568    Data Checksum options when Check Data Checksum/B is one, although it
3569    MAY check them even when Check Data Checksum/B is zero. Values of
3570    two or more are reserved. New connections start with Check Data
3571    Checksum 0 for both endpoints.

3573 9.3.2. Checksum Usage Notes

3575    Internet links must normally apply strong integrity checks to the
3576    packets they transmit [RFC 3828, RFC 3819]. This is the default
3577    case when the DCCP header's Checksum Coverage value equals zero
3578    (full coverage). However, the DCCP Checksum Coverage value might
3579    not be zero. By setting partial Checksum Coverage, the application
3580    indicates that it can tolerate corruption in the unprotected part of
3581    the application data. Recognizing this, link layers may reduce
3582    error detection and/or correction strength when transmitting this
3583    unprotected part. This, in turn, can significantly increase the
3584    likelihood of the endpoint receiving corrupt data; Data Checksum
3585    lets the receiver detect that corruption with very high probability.

3587 10. Congestion Control

3589    Each congestion control mechanism supported by DCCP is assigned a
3590    congestion control identifier, or CCID: a number from 0 to 255.
3591    During connection setup, and optionally thereafter, the endpoints
3592    negotiate their congestion control mechanisms by negotiating the
3593    values for their Congestion Control ID features. Congestion Control
3594    ID has feature number 1. The CCID/A value equals the CCID in use
3595    for the A-to-B half-connection. DCCP B sends a "Change R(CCID, K)"
3596    option to ask DCCP A to use CCID K for its data packets.
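
Editorial illustration (not part of the draft): a "Change R(CCID, ...)"
option can be assembled as a short byte string. The sketch below is a
minimal one, assuming the layout given by this section's own byte
example for "Change R(CCID, 2 3 4)": option type 35, a length byte that
counts the whole option, feature number 1, then the CCIDs in descending
order of preference. The function and constant names are hypothetical.

```python
CHANGE_R_TYPE = 35       # Change R option type, from the section's byte example
CCID_FEATURE_NUMBER = 1  # Congestion Control ID feature number


def encode_change_r_ccid(preferences):
    """Encode "Change R(CCID, p1 p2 ...)" as option bytes.

    `preferences` lists acceptable CCIDs, most preferred first.
    """
    if not preferences:
        raise ValueError("at least one CCID is required")
    for ccid in preferences:
        if not 0 <= ccid <= 255:
            raise ValueError(f"CCID out of range: {ccid}")
    # The length covers the type byte, the length byte, the feature
    # number, and one byte per listed CCID.
    length = 3 + len(preferences)
    return bytes([CHANGE_R_TYPE, length, CCID_FEATURE_NUMBER, *preferences])


if __name__ == "__main__":
    print(list(encode_change_r_ccid([2, 3, 4])))
```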

3598    CCID is a server-priority feature, so CCID negotiation options can
3599    list multiple acceptable CCIDs, sorted in descending order of
3600    priority. For example, the option "Change R(CCID, 2 3 4)" asks the
3601    receiver to use CCID 2 for its packets, although CCIDs 3 and 4 are
3602    also acceptable. (This corresponds to the bytes "35, 6, 1, 2, 3,
3603    4": Change R option (35), option length (6), feature ID (1), CCIDs
3604    (2, 3, 4).) Similarly, "Confirm L(CCID, 2, 2 3 4)" tells the
3605    receiver that the sender is using CCID 2 for its packets, but that
3606    CCIDs 3 and 4 might also be acceptable.

3608    Currently allocated CCIDs are as follows.

3610       CCID   Meaning                      Reference
3611       ----   -------                      ---------
3612       0-1    Reserved
3613       2      TCP-like Congestion Control  [RFC TBA]
3614       3      TFRC Congestion Control      [RFC TBA]
3615       4-255  Reserved

3617             Table 5: DCCP Congestion Control Identifiers

3619    New connections start with CCID 2 for both endpoints. If this is
3620    unacceptable for a DCCP endpoint, that endpoint MUST send Mandatory
3621    Change(CCID) options on its first packets.

3623    All CCIDs standardized for use with DCCP will correspond to
3624    congestion control mechanisms previously standardized by the IETF.
3626    We expect that for quite some time, all such mechanisms will be TCP-
3627    friendly, but TCP-friendliness is not an explicit DCCP requirement.

3629    A DCCP implementation intended for general use, such as an
3630    implementation in a general-purpose operating system kernel, SHOULD
3631    implement at least CCID 2. The intent is to make CCID 2 broadly
3632    available for interoperability, although particular applications
3633    might disallow its use.

3635 10.1. TCP-like Congestion Control

3637    CCID 2, TCP-like Congestion Control, denotes Additive Increase,
3638    Multiplicative Decrease (AIMD) congestion control with behavior
3639    modelled directly on TCP, including congestion window, slow start,
3640    timeouts, and so forth [RFC 2581]. CCID 2 achieves maximum
3641    bandwidth over the long term, consistent with the use of end-to-end
3642    congestion control, but halves its congestion window in response to
3643    each congestion event. This leads to the abrupt rate changes
3644    typical of TCP. Applications should use CCID 2 if they prefer
3645    maximum bandwidth utilization to steadiness of rate. This is often
3646    the case for applications that are not playing their data directly
3647    to the user. For example, a hypothetical application that
3648    transferred files over DCCP, using application-level retransmissions
3649    for lost packets, would prefer CCID 2 to CCID 3. On-line games may
3650    also prefer CCID 2.

3652    CCID 2 is further described in [CCID 2 PROFILE].

3654 10.2. TFRC Congestion Control

3656    CCID 3 denotes TCP-Friendly Rate Control (TFRC), an equation-based
3657    rate-controlled congestion control mechanism. TFRC is designed to
3658    be reasonably fair when competing for bandwidth with TCP-like flows,
3659    where a flow is "reasonably fair" if its sending rate is generally
3660    within a factor of two of the sending rate of a TCP flow under the
3661    same conditions. However, TFRC has a much lower variation of
3662    throughput over time compared with TCP, which makes CCID 3 more
3663    suitable than CCID 2 for applications such as streaming media where
3664    a relatively smooth sending rate is of importance.

3666    CCID 3 is further described in [CCID 3 PROFILE]. The TFRC
3667    congestion control algorithms were initially described in RFC 3448.

3669 10.3. CCID-Specific Options, Features, and Reset Codes

3671    Half of the option types, feature numbers, and Reset Codes are
3672    reserved for CCID-specific use. CCIDs may often need new options,
3673    for communicating acknowledgement or rate information, for example;
3674    reserved option spaces let CCIDs create options at will without
3675    polluting the global option space. Option 128 might have different
3676    meanings on a half-connection using CCID 4 and a half-connection
3677    using CCID 8. CCID-specific options and features will never
3678    conflict with global options and features introduced by later
3679    versions of this specification.

3681    Any packet may contain information meant for either half-connection,
3682    so CCID-specific option types, feature numbers, and Reset Codes
3683    explicitly signal the half-connection to which they apply.

3685    o  Option numbers 128 through 191 are for options sent from the HC-
3686       Sender to the HC-Receiver; option numbers 192 through 255 are for
3687       options sent from the HC-Receiver to the HC-Sender.

3689    o  Reset Codes 128 through 191 indicate that the HC-Sender reset the
3690       connection (most likely because of some problem with
3691       acknowledgements sent by the HC-Receiver); Reset Codes 192
3692       through 255 indicate that the HC-Receiver reset the connection
3693       (most likely because of some problem with data packets sent by
3694       the HC-Sender).

3696    o  Finally, feature numbers 128 through 191 are used for features
3697       located at the HC-Sender; feature numbers 192 through 255 are for
3698       features located at the HC-Receiver. Since Change L and
3699       Confirm L options for a feature are sent by the feature location,
3700       we know that any Change L(128) option was sent by the HC-Sender,
3701       while any Change L(192) option was sent by the HC-Receiver.
3702       Similarly, Change R(128) options are sent by the HC-Receiver,
3703       while Change R(192) options are sent by the HC-Sender.

3705    For example, consider a DCCP connection where the A-to-B half-
3706    connection uses CCID 4 and the B-to-A half-connection uses CCID 5.
3707    Here is how a sampling of CCID-specific options are assigned to
3708    half-connections.

3710                                     Relevant    Relevant
3711       Packet  Option                Half-conn.  CCID
3712       ------  ------                ----------  ----
3713       A > B   128                   A-to-B      4
3714       A > B   192                   B-to-A      5
3715       A > B   Change L(128, ...)    A-to-B      4
3716       A > B   Change R(192, ...)    A-to-B      4
3717       A > B   Confirm L(128, ...)   A-to-B      4
3718       A > B   Confirm R(192, ...)   A-to-B      4
3719       A > B   Change R(128, ...)    B-to-A      5
3720       A > B   Change L(192, ...)    B-to-A      5
3721       A > B   Confirm R(128, ...)   B-to-A      5
3722       A > B   Confirm L(192, ...)   B-to-A      5

3724       B > A   128                   B-to-A      5
3725       B > A   192                   A-to-B      4
3726       B > A   Change L(128, ...)    B-to-A      5
3727       B > A   Change R(192, ...)    B-to-A      5
3728       B > A   Confirm L(128, ...)   B-to-A      5
3729       B > A   Confirm R(192, ...)   B-to-A      5
3730       B > A   Change R(128, ...)    A-to-B      4
3731       B > A   Change L(192, ...)    A-to-B      4
3732       B > A   Confirm R(128, ...)   A-to-B      4
3733       B > A   Confirm L(192, ...)   A-to-B      4

3735    Using CCID-specific options and feature options during a negotiation
3736    for that CCID feature is NOT RECOMMENDED, since it is difficult to
3737    predict the CCID that will be in force when the option is processed.
3738    For example, if a DCCP-Request contains the option sequence
3739    "Change L(CCID, 3), 128", the CCID-specific option "128" may be
3740    processed either by CCID 3 (if the server supports CCID 3) or by the
3741    default CCID 2 (if it does not). However, it is safe to include
3742    CCID-specific options following certain Mandatory Change(CCID)
3743    options. For example, if a DCCP-Request contains the option
3744    sequence "Mandatory, Change L(CCID, 3), 128", then either the "128"
3745    option will be processed by CCID 3 or the connection will be reset.

3747    Servers that do not implement the default CCID 2 might nevertheless
3748    receive CCID 2-specific options on a DCCP-Request packet. (Such a
3749    server MUST send Mandatory Change(CCID) options on its DCCP-
3750    Response, so CCID-specific options on any other packet won't refer
3751    to CCID 2.) The server MUST treat such options as non-understood.
3752    Thus, it will reset the connection on encountering a Mandatory CCID-
3753    specific option, send an empty Confirm for a non-Mandatory Change
3754    option for a CCID-specific feature, and ignore other options.
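
Editorial illustration (not part of the draft): the half-connection
rules of Section 10.3 reduce to two small functions. The sketch below
is a minimal one under the rules stated in the bullets above; the
function names and the string form of the result ("A-to-B") are
hypothetical choices made for the example.

```python
def option_half_connection(sender, peer, option):
    """Half-connection that a CCID-specific option (128-255) applies to.

    `sender` is the endpoint that sent the packet, `peer` the other one.
    """
    if not 128 <= option <= 255:
        raise ValueError("not a CCID-specific option number")
    # 128-191 travel from HC-Sender to HC-Receiver, so the packet's
    # sender is the HC-Sender; 192-255 travel the other way.
    if option <= 191:
        return f"{sender}-to-{peer}"
    return f"{peer}-to-{sender}"


def feature_half_connection(sender, peer, side, feature):
    """Half-connection for Change/Confirm L or R of a CCID-specific feature.

    `side` is "L" or "R", as in "Change L(128, ...)".
    """
    if side not in ("L", "R"):
        raise ValueError("side must be 'L' or 'R'")
    if not 128 <= feature <= 255:
        raise ValueError("not a CCID-specific feature number")
    located_at_hc_sender = feature <= 191   # 192-255 live at the HC-Receiver
    sender_is_location = side == "L"        # L options come from the feature location
    sender_is_hc_sender = located_at_hc_sender == sender_is_location
    if sender_is_hc_sender:
        return f"{sender}-to-{peer}"
    return f"{peer}-to-{sender}"


if __name__ == "__main__":
    print(option_half_connection("A", "B", 192))
    print(feature_half_connection("A", "B", "R", 192))
```

Looking up the CCID in force on the returned half-connection (4 for
A-to-B and 5 for B-to-A in the example connection) then gives the last
column of the table.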

3756 10.4. CCID Profile Requirements

3758    Each CCID Profile document MUST address at least the following
3759    requirements:

3761    o  The profile MUST include the name and number of the CCID being
3762       described.

3764    o  The profile MUST describe the conditions in which it is likely to
3765       be useful. Often the best way to do this is by comparison to
3766       existing CCIDs.

3768    o  The profile MUST list and describe any CCID-specific options,
3769       features, and Reset Codes, and SHOULD list those general options
3770       and features described in this document that are especially
3771       relevant to the CCID.

3773    o  Any newly defined acknowledgement mechanism MUST include a way to
3774       transmit ECN Nonce Echoes back to the sender.

3776    o  The profile MUST describe the format of data packets, including
3777       any options that should be included and the setting of the CCval
3778       header field.

3780    o  The profile MUST describe the format of acknowledgement packets,
3781       including any options that should be included.

3783    o  The profile MUST define how data packets are congestion
3784       controlled. This includes responses to congestion events, idle
3785       and application-limited periods, and responses to the DCCP Data
3786       Dropped and Slow Receiver options. CCIDs that implement per-
3787       packet congestion control SHOULD discuss how packet size is
3788       factored in to congestion control decisions.

3790    o  The profile MUST specify when acknowledgement packets are
3791       generated, and how they are congestion controlled.

3793    o  The profile MUST define when a sender using the CCID is
3794       considered quiescent.

3796    o  The profile MUST say whether its CCID's acknowledgements ever
3797       need to be acknowledged, and if so, how often.

3799 10.5. Congestion State

3801    Most congestion control algorithms depend on past history to
3802    determine the current allowed sending rate. In CCID 2, this
3803    congestion state includes a congestion window and a measurement of
3804    the number of packets outstanding in the network; in CCID 3, it
3805    includes the lengths of recent loss intervals; and both CCIDs use an
3806    estimate of the round-trip time. Congestion state depends on the
3807    network path, and is invalidated by path changes. Therefore, DCCP
3808    senders and receivers SHOULD reset their congestion state --
3809    essentially restarting congestion control from "slow start" or
3810    equivalent -- on significant changes in end-to-end path. For
3811    example, an endpoint that sends or receives a Mobile IPv6 Binding
3812    Update message [RFC 3775] SHOULD reset its congestion state for any
3813    corresponding DCCP connections.

3815    A DCCP implementation MAY also reset its congestion state when a
3816    CCID changes (that is, a negotiation for the CCID feature completes
3817    successfully, and the new feature value differs from the old value).
3818    Thus, a connection in a heavily congested environment might evade
3819    end-to-end congestion control by frequently renegotiating a CCID,
3820    just as it could evade end-to-end congestion control by opening new
3821    connections for the same session. This behavior is prohibited. To
3822    prevent it, DCCP implementations MAY limit the rate at which CCID
3823    can be changed -- for instance, by refusing to change a CCID feature
3824    value more than once per minute.

3826 11. Acknowledgements

3828    Congestion control requires receivers to transmit information about
3829    packet losses and ECN marks to senders. DCCP receivers MUST report
3830    all congestion they see, as defined by the relevant CCID profile.
3831    Each CCID says when acknowledgements should be sent, what options
3832    they must use, and so on. DCCP acknowledgements are congestion
3833    controlled, although it is not required that the acknowledgement
3834    stream be more than very roughly TCP-friendly; each CCID defines how
3835    acknowledgements are congestion controlled.

3837    Most acknowledgements use DCCP options. For example, on a half-
3838    connection with CCID 2 (TCP-like), the receiver reports
3839    acknowledgement information using the Ack Vector option. This
3840    section describes common acknowledgement options and shows how acks
3841    using those options will commonly work. Full descriptions of the
3842    ack mechanisms used for each CCID are laid out in the CCID profile
3843    specifications.

3845    Acknowledgement options, such as Ack Vector, generally depend on the
3846    DCCP Acknowledgement Number, and are thus only allowed on packet
3847    types that carry that number (all packets except DCCP-Request and
3848    DCCP-Data). Detailed acknowledgement options are not necessarily
3849    required on every packet that carries an Acknowledgement Number,
3850    however.

3852 11.1. Acks of Acks and Unidirectional Connections

3854    DCCP was designed to work well for both bidirectional and
3855    unidirectional flows of data, and for connections that transition
3856    between these states. However, acknowledgements required for a
3857    unidirectional connection are very different from those required for
3858    a bidirectional connection. In particular, unidirectional
3859    connections need to worry about acks of acks.

3861    The ack-of-acks problem arises because some acknowledgement
3862    mechanisms are reliable. For example, an HC-Receiver using CCID 2,
3863    TCP-like Congestion Control, sends Ack Vectors containing completely
3864    reliable acknowledgement information. The HC-Sender should
3865    occasionally inform the HC-Receiver that it has received an ack. If
3866    it did not, the HC-Receiver might resend complete Ack Vector
3867    information, going back to the start of the connection, with every
3868    DCCP-Ack packet! However, note that acks-of-acks need not be
3869    reliable themselves: when an ack-of-acks is lost, the HC-Receiver
3870    will simply maintain, and periodically retransmit, old
3871    acknowledgement-related state for a little longer. Therefore, there
3872    is no need for acks-of-acks-of-acks.

3874    When communication is bidirectional, any required acks-of-acks are
3875    automatically contained in normal acknowledgements for data packets.
3876    On a unidirectional connection, however, the receiver DCCP sends no
3877    data, so the sender would not normally send acknowledgements.
3878    Therefore, the CCID in force on that half-connection must explicitly
3879    say whether, when, and how the HC-Sender should generate acks-of-
3880    acks.

3882    For example, consider a bidirectional connection where both half-
3883    connections use the same CCID (either 2 or 3), and where DCCP B goes
3884    "quiescent". This means that the connection becomes unidirectional:
3885    DCCP B stops sending data, and sends only DCCP-Ack packets to
3886    DCCP A. For example, in CCID 2, TCP-like Congestion Control, DCCP B
3887    uses Ack Vector to reliably communicate which packets it has
3888    received. As described above, DCCP A must occasionally acknowledge
3889    a pure acknowledgement from DCCP B, so that B can free old Ack
3890    Vector state. For instance, A might send a DCCP-DataAck packet
3891    every now and then, instead of DCCP-Data. In contrast, in CCID 3,
3892    TFRC Congestion Control, DCCP B's acknowledgements generally need
3893    not be reliable, since they contain cumulative loss rates; TFRC
3894    works even if every DCCP-Ack is lost. Therefore, DCCP A need never
3895    acknowledge an acknowledgement.

3897    When communication is unidirectional, a single CCID -- in the
3898    example, the A-to-B CCID -- controls both DCCPs' acknowledgements,
3899    in terms of their content, their frequency, and so forth. For
3900    bidirectional connections, the A-to-B CCID governs DCCP B's
3901    acknowledgements (including its acks of DCCP A's acks), while the B-
3902    to-A CCID governs DCCP A's acknowledgements.

3904    DCCP A switches its ack pattern from bidirectional to unidirectional
3905    when it notices that DCCP B has gone quiescent. It switches from
3906    unidirectional to bidirectional when it must acknowledge even a
3907    single DCCP-Data or DCCP-DataAck packet from DCCP B.

3909    Each CCID defines how to detect quiescence on that CCID, and how
3910    that CCID handles acks-of-acks on unidirectional connections. The
3911    B-to-A CCID defines when DCCP B has gone quiescent. Usually, this
3912    happens when a period has passed without B sending any data packets;
3913    in CCID 2, for example, this period is the maximum of 0.2 seconds
3914    and two round-trip times. The A-to-B CCID defines how DCCP A
3915    handles acks-of-acks once DCCP B has gone quiescent.

3917 11.2. Ack Piggybacking

3919    Acknowledgements of A-to-B data MAY be piggybacked on data sent by
3920    DCCP B, as long as that does not delay the acknowledgement longer
3921    than the A-to-B CCID would find acceptable. However, data
3922    acknowledgements often require more than 4 bytes to express. A
3923    large set of acknowledgements prepended to a large data packet might
3924    exceed the allowed maximum packet size. In this case, DCCP B SHOULD
3925    send separate DCCP-Data and DCCP-Ack packets, or wait, but not too
3926    long, for a smaller datagram.

3928    Piggybacking is particularly common at DCCP A when the B-to-A half-
3929    connection is quiescent -- that is, when DCCP A is just
3930    acknowledging DCCP B's acknowledgements. There are three reasons to
3931    acknowledge DCCP B's acknowledgements: to allow DCCP B to free up
3932    information about previously acknowledged data packets from A; to
3933    shrink the size of future acknowledgements; and to manipulate the
3934    rate at which future acknowledgements are sent. Since these are
3935    secondary concerns, DCCP A can generally afford to wait indefinitely
3936    for a data packet to piggyback its acknowledgement onto; if DCCP B
3937    wants to elicit an acknowledgement, it can send a DCCP-Sync.

3939    Any restrictions on ack piggybacking are described in the relevant
3940    CCID's profile.

3942 11.3. Ack Ratio Feature

3944    The Ack Ratio feature lets HC-Senders influence the rate at which
3945    HC-Receivers generate DCCP-Ack packets, thus controlling reverse-
3946    path congestion. This differs from TCP, which presently has no
3947    congestion control for pure acknowledgement traffic. Ack Ratio
3948    reverse-path congestion control does not try to be TCP-friendly. It
3949    just tries to avoid congestion collapse, and to be somewhat better
3950    than TCP in the presence of a high packet loss or mark rate on the
3951    reverse path.

3953    Ack Ratio applies to CCIDs whose HC-Receivers clock acknowledgements
3954    off the receipt of data packets. The value of Ack Ratio/A equals
3955    the rough ratio of data packets sent by DCCP A to DCCP-Ack packets
3956    sent by DCCP B. Higher Ack Ratios correspond to lower DCCP-Ack
3957    rates; the sender raises Ack Ratio when the reverse path is
3958    congested and lowers Ack Ratio when it is not. Each CCID profile
3959    defines how it controls congestion on the acknowledgement path, and,
3960    particularly, whether Ack Ratio is used. CCID 2, for example, uses
3961    Ack Ratio for acknowledgement congestion control, but CCID 3 does
3962    not. However, each Ack Ratio feature has a value whether or not
3963    that value is used by the relevant CCID.

3965    Ack Ratio has feature number 5, and is non-negotiable. It takes
3966    two-byte integer values. An Ack Ratio/A value of four means that
3967    DCCP B will send at least one acknowledgement packet for every four
3968    data packets sent by DCCP A. DCCP A sends a "Change L(Ack Ratio)"
3969    option to notify DCCP B of its ack ratio. An Ack Ratio value of
3970    zero indicates that the relevant half-connection does not use an Ack
3971    Ratio to control its acknowledgement rate. New connections start
3972    with Ack Ratio 2 for both endpoints; this Ack Ratio results in
3973    acknowledgement behavior analogous to TCP's delayed acks.

3975    Ack Ratio should be treated as a guideline rather than a strict
3976    requirement. We intend Ack Ratio-controlled acknowledgement
3977    behavior to resemble TCP's acknowledgement behavior when there is no
3978    reverse-path congestion, and to be somewhat more conservative when
3979    there is reverse-path congestion. Following this intent is more
3980    important than implementing Ack Ratio precisely. In particular:

3982    o  Receivers MAY piggyback acknowledgement information on data
3983       packets, creating DCCP-DataAck packets. The Ack Ratio does not
3984       apply to piggybacked acknowledgements. However, if the data
3985       packets are too big to carry acknowledgement information, or the
3986       data sending rate is lower than Ack Ratio would suggest, then
3987       DCCP B SHOULD send enough pure DCCP-Ack packets to maintain the
3988       rate of one acknowledgement per Ack Ratio received data packets.

3990    o  Receivers MAY rate-pace their acknowledgements, rather than
3991       sending acknowledgements immediately upon the receipt of data
3992       packets. Receivers that rate-pace acknowledgements SHOULD pick a
3993       rate that approximates the effect of Ack Ratio, and SHOULD
3994       include Elapsed Time options (Section 13.2) to help the sender
3995       calculate round-trip times.

3997    o  Receivers SHOULD implement delayed acknowledgement timers like
3998       TCP's, whereby any packet's acknowledgement is delayed by at most
3999       T seconds. This delay lets the receiver collect additional
4000       packets to acknowledge, and thus reduce the per-packet overhead
4001       of acknowledgements; but if T seconds have passed by and the ack
4002       is still around, it is sent out right away. The default value of
4003       T should be 0.2 seconds, as is common in TCP implementations.
4004       This may lead to sending more acknowledgement packets than Ack
4005       Ratio would suggest.

4007    o  Receivers SHOULD send acknowledgements immediately on receiving
4008       packets marked ECN Congestion Experienced, or packets whose out-
4009       of-order sequence numbers potentially indicate loss. However,
4010       there is no need to send such immediate acknowledgements for
4011       marked packets more than once per round-trip time.

4013    o  Receivers MAY ignore Ack Ratio if they perform their own
4014       congestion control on acknowledgements. For example, a receiver
4015       that knows the loss and mark rate for its DCCP-Ack packets might
4016       maintain a TCP-friendly acknowledgement rate on its own. Such a
4017       receiver MUST either ensure that it always obtains sufficient
4018       acknowledgement loss and mark information, or fall back to Ack
4019       Ratio when sufficient information is not available, as might
4020       happen during periods when the receiver is quiescent.

4022 11.4. Ack Vector Options

4024    The Ack Vector gives a run-length encoded history of data packets
4025    received at the client. Each byte of the vector gives the state of
4026    that data packet in the loss history, and the number of preceding
4027    packets with the same state. The option's data looks like this:

4029    +--------+--------+--------+--------+--------+--------
4030    |0010011?| Length |SSLLLLLL|SSLLLLLL|SSLLLLLL|  ...
4031    +--------+--------+--------+--------+--------+--------
4032    Type=38/39         \___________ Vector ___________...

4034    The two Ack Vector options (option types 38 and 39) differ only in
4035    the values they imply for ECN Nonce Echo. Section 12.2 describes
4036    this further.

4038    The vector itself consists of a series of bytes, each of whose
4039    encoding is:

4041     0 1 2 3 4 5 6 7
4042    +-+-+-+-+-+-+-+-+
4043    |Sta| Run Length|
4044    +-+-+-+-+-+-+-+-+

4046    Sta[te] occupies the most significant two bits of each byte, and can
4047    have one of four values, as follows.

4049       State  Meaning
4050       -----  -------
4051       0      Received
4052       1      Received ECN Marked
4053       2      Reserved
4054       3      Not Yet Received

4056                  Table 6: DCCP Ack Vector States

4058    The term "ECN marked" refers to packets with ECN code point 11, CE
4059    (Congestion Experienced); packets received with this ECN code point
4060    MUST be reported using State 1, Received ECN Marked. Packets
4061    received with other ECN code points 00, 01, or 10 (Non-ECT, ECT(1),
4062    or ECT(0), respectively) MUST be reported using State 0, Received.

4064    Run Length, the least significant six bits of each byte, specifies
4065    how many consecutive packets have the given State. Run Length zero
4066    says the corresponding State applies to one packet only; Run Length
4067    63 says it applies to 64 consecutive packets. Runs of 65 or more
4068    packets must be encoded in multiple bytes.

4070    The first byte in the first Ack Vector option refers to the packet
4071    indicated in the Acknowledgement Number; subsequent bytes refer to
4072    older packets. (Ack Vector MUST NOT be sent on DCCP-Data and DCCP-
4073    Request packets, which lack an Acknowledgement Number.) An Ack
4074    Vector containing the decimal values 0,192,3,64,5, sent with an
4075    Acknowledgement Number of decimal 100, indicates that:

4077       Packet 100 was received (Acknowledgement Number 100, State 0,
4078       Run Length 0).

4080       Packet 99 was lost (State 3, Run Length 0).

4082       Packets 98, 97, 96 and 95 were received (State 0, Run Length 3).

4084       Packet 94 was ECN marked (State 1, Run Length 0).

4086       Packets 93, 92, 91, 90, 89, and 88 were received (State 0, Run
4087       Length 5).

4089    A single Ack Vector option can acknowledge up to 16192 data packets.
4090 Should more packets need to be acknowledged than can fit in 253 4091 bytes of Ack Vector, then multiple Ack Vector options can be sent; 4092 the second Ack Vector begins where the first left off, and so forth. 4094 Ack Vector states are subject to two general constraints. (These 4095 principles SHOULD also be followed for other acknowledgement 4096 mechanisms; referring to Ack Vector states simplifies their 4097 explanation.) 4099 1. Packets reported as State 0 or State 1 MUST be acknowledgeable: 4100 their options have been processed by the receiving DCCP stack. 4101 Any data on the packet need not have been delivered to the 4102 receiving application; in fact, the data may have been dropped. 4104 2. Packets reported as State 3 MUST NOT be acknowledgeable. 4105 Feature negotiations and options on such packets MUST NOT have 4106 been processed, and the Acknowledgement Number MUST NOT 4107 correspond to such a packet. 4109 Packets dropped in the application's receive buffer MUST be reported 4110 as Received or Received ECN Marked (States 0 and 1), depending on 4111 their ECN state; such packets' ECN Nonces MUST be included in the 4112 Nonce Echo. The Data Dropped option informs the sender that some 4113 packets reported as received actually had their application data 4114 dropped. 4116 One or more Ack Vector options that, together, report the status of 4117 a packet with sequence number less than ISN, the initial sequence 4118 number, SHOULD be considered invalid. The receiving DCCP SHOULD 4119 either ignore the options or reset the connection with Reset Code 5, 4120 "Option Error". No Ack Vector option can refer to a packet that has 4121 not yet been sent, as the Acknowledgement Number checks in Section 4122 7.5.3 ensure, but because of attack, implementation bug, or 4123 misbehavior, an Ack Vector option can claim that a packet was 4124 received before it is actually delivered; Section 12.2 describes how 4125 this is detected and how senders should react. 
Packets that haven't 4126 been included in any Ack Vector option SHOULD be treated as "not yet 4127 received" (State 3) by the sender. 4129 Appendix A provides a non-normative description of the details of 4130 DCCP acknowledgement handling, in the context of an abstract Ack 4131 Vector implementation. 4133 11.4.1. Ack Vector Consistency 4135 A DCCP sender will commonly receive multiple acknowledgements for 4136 some of its data packets. For instance, an HC-Sender might receive 4137 two DCCP-Acks with Ack Vectors, both of which contained information 4138 about sequence number 24. (Information about a sequence number is 4139 generally repeated in every ack until the HC-Sender acknowledges an 4140 ack. In this case, perhaps the HC-Receiver is sending acks faster 4141 than the HC-Sender is acknowledging them.) In a perfect world, the 4142 two Ack Vectors would always be consistent. However, there are many 4143 reasons why they might not be. For example: 4145 o The HC-Receiver received packet 24 between sending its acks, so 4146 the first ack said 24 was not received (State 3) and the second 4147 said it was received or ECN marked (State 0 or 1). 4149 o The HC-Receiver received packet 24 between sending its acks, and 4150 the network reordered the acks. In this case, the packet will 4151 appear to transition from State 0 or 1 to State 3. 4153 o The network duplicated packet 24, and one of the duplicates was 4154 ECN marked. This might show up as a transition between States 0 4155 and 1. 
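As a non-normative illustration of the Ack Vector encoding described in Section 11.4, the following Python sketch expands a vector into per-packet states. The names decode_ack_vector and STATE_NAMES, and the use of a dictionary as the result, are assumptions made for this example and are not part of the protocol. Sequence number wraparound and multiple Ack Vector options on one packet are ignored for brevity.

```python
# Non-normative sketch: expand a DCCP Ack Vector into per-packet states.
# All names below are illustrative; they are not defined by the protocol.

STATE_NAMES = {
    0: "Received",
    1: "Received ECN Marked",
    2: "Reserved",
    3: "Not Yet Received",
}


def decode_ack_vector(ack_number, vector):
    """Return {sequence number: state} for a single Ack Vector option.

    ack_number is the Acknowledgement Number of the packet carrying the
    option; vector is the option data (the SSLLLLLL bytes).  Sequence
    number wraparound is ignored in this sketch.
    """
    states = {}
    seqno = ack_number
    for byte in vector:
        state = byte >> 6          # most significant two bits
        run_length = byte & 0x3F   # least significant six bits
        # Run Length 0 covers one packet, Run Length 63 covers 64.
        for _ in range(run_length + 1):
            states[seqno] = state
            seqno -= 1             # subsequent entries refer to older packets
    return states


# The example from Section 11.4: values 0,192,3,64,5 with
# Acknowledgement Number 100.
states = decode_ack_vector(100, bytes([0, 192, 3, 64, 5]))
for seqno in sorted(states, reverse=True):
    print(seqno, STATE_NAMES[states[seqno]])
```

A sender would typically merge such a per-packet map with the states it has already recorded for the same sequence numbers, rather than overwrite them.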
4157 To cope with these situations, HC-Sender DCCP implementations SHOULD 4158 combine multiple received Ack Vector states according to this table: 4160 Received State 4161 0 1 3 4162 +---+---+---+ 4163 0 | 0 |0/1| 0 | 4164 Old +---+---+---+ 4165 1 | 1 | 1 | 1 | 4166 State +---+---+---+ 4167 3 | 0 | 1 | 3 | 4168 +---+---+---+ 4170 To read the table, choose the row corresponding to the packet's old 4171 state and the column corresponding to the packet's state in the 4172 newly received Ack Vector, then read the packet's new state off the 4173 table. For an old state of 0 (received non-marked) and received 4174 state of 1 (received ECN marked), the packet's new state may be set 4175 to either 0 or 1. The HC-Sender implementation will be indifferent 4176 to ack reordering if it chooses new state 1 for that cell. 4178 The HC-Receiver should collect information about received packets, 4179 which it will eventually report to the HC-Sender on one or more 4180 acknowledgements, according to the following table: 4182 Received Packet 4183 0 1 3 4184 +---+---+---+ 4185 0 | 0 |0/1| 0 | 4186 Stored +---+---+---+ 4187 1 |0/1| 1 | 1 | 4188 State +---+---+---+ 4189 3 | 0 | 1 | 3 | 4190 +---+---+---+ 4192 This table equals the sender's table, except that when the stored 4193 state is 1 and the received state is 0, the receiver is allowed to 4194 switch its stored state to 0. 4196 An HC-Sender MAY choose to throw away old information gleaned from 4197 the HC-Receiver's Ack Vectors, in which case it MUST ignore newly 4198 received acknowledgements from the HC-Receiver for those old 4199 packets. It is often kinder to save recent Ack Vector information 4200 for a while, so that the HC-Sender can undo its reaction to presumed 4201 congestion when a "lost" packet unexpectedly shows up (the 4202 transition from State 3 to State 0). 4204 11.4.2.
Ack Vector Coverage 4206 We can divide the packets that have been sent from an HC-Sender to 4207 an HC-Receiver into four roughly contiguous groups. From oldest to 4208 youngest, these are: 4210 1. Packets already acknowledged by the HC-Receiver, where the HC- 4211 Receiver knows that the HC-Sender has definitely received the 4212 acknowledgements. 4214 2. Packets already acknowledged by the HC-Receiver, where the HC- 4215 Receiver cannot be sure that the HC-Sender has received the 4216 acknowledgements. 4218 3. Packets not yet acknowledged by the HC-Receiver. 4220 4. Packets not yet received by the HC-Receiver. 4222 The union of groups 2 and 3 is called the Acknowledgement Window. 4223 Generally, every Ack Vector generated by the HC-Receiver will cover 4224 the whole Acknowledgement Window: Ack Vector acknowledgements are 4225 cumulative. (This simplifies Ack Vector maintenance at the HC- 4226 Receiver; see Appendix A, below.) As packets are received, this 4227 window both grows on the right and shrinks on the left. It grows 4228 because there are more packets, and shrinks because the data 4229 packets' Acknowledgement Numbers will acknowledge previous 4230 acknowledgements, moving packets from group 2 into group 1. 4232 11.5. Send Ack Vector Feature 4234 The Send Ack Vector feature lets DCCPs negotiate whether they should 4235 use Ack Vector options to report congestion. Ack Vector provides 4236 detailed loss information, and lets senders report back to their 4237 applications whether particular packets were dropped. Send Ack 4238 Vector is mandatory for some CCIDs, and optional for others. 4240 Send Ack Vector has feature number 6, and is server-priority. It 4241 takes one-byte Boolean values. DCCP A MUST send Ack Vector options 4242 on its acknowledgements when Send Ack Vector/A has value one, 4243 although it MAY send Ack Vector options even when Send Ack Vector/A 4244 is zero. Values of two or more are reserved. 
New connections start 4245 with Send Ack Vector 0 for both endpoints. DCCP B sends a 4246 "Change R(Send Ack Vector, 1)" option to DCCP A to ask A to send Ack 4247 Vector options as part of its acknowledgement traffic. 4249 11.6. Slow Receiver Option 4251 An HC-Receiver sends the Slow Receiver option to its sender to 4252 indicate that it is having trouble keeping up with the sender's 4253 data. The HC-Sender SHOULD NOT increase its sending rate for 4254 approximately one round-trip time after seeing a packet with a Slow 4255 Receiver option. After one round-trip time, the effect of Slow 4256 Receiver disappears and the HC-Sender may again increase its rate, 4257 so the HC-Receiver SHOULD continue to send Slow Receiver options if 4258 it needs to prevent the HC-Sender from going faster in the long 4259 term. The Slow Receiver option does not indicate congestion, and 4260 the HC-Sender need not reduce its sending rate. (If necessary, the 4261 receiver can force the sender to slow down by dropping packets, with 4262 or without Data Dropped, or reporting false ECN marks.) APIs should 4263 let receiver applications set Slow Receiver, and sending 4264 applications determine whether or not their receivers are Slow. 4266 Slow Receiver is a one-byte option. 4268 +--------+ 4269 |00000010| 4270 +--------+ 4271 Type=2 4273 Slow Receiver does not specify why the receiver is having trouble 4274 keeping up with the sender. Possible reasons include lack of buffer 4275 space, CPU overload, and application quotas. A sending application 4276 might react to Slow Receiver by reducing its sending rate, for 4277 example. 4279 The sending application should not react to Slow Receiver by sending 4280 more data, however. 
The optimal response to a CPU-bound receiver 4281 might be to increase the sending rate, by switching to a less- 4282 compressed sending format, since a highly-compressed data format 4283 might overwhelm a slow CPU more seriously than the higher memory 4284 requirements of a less-compressed data format. This kind of format 4285 change should be requested at the application level, not via the 4286 Slow Receiver option. 4288 Slow Receiver implements a portion of TCP's receive window 4289 functionality. 4291 11.7. Data Dropped Option 4293 The Data Dropped option indicates that the application data on one 4294 or more received packets did not actually reach the application. 4295 Data Dropped additionally reports why the data was dropped: perhaps 4296 the data was corrupt, or perhaps the receiver cannot keep up with 4297 the sender's current rate and the data was dropped in some receive 4298 buffer. Using Data Dropped, DCCP endpoints can discriminate between 4299 different kinds of loss; this differs from TCP, in which all loss is 4300 reported the same way. 4302 Unless explicitly specified otherwise, DCCP congestion control 4303 mechanisms MUST react as if each Data Dropped packet was marked as 4304 ECN Congestion Experienced by the network. We intend for Data 4305 Dropped to enable research into richer congestion responses to 4306 corrupt and other endpoint-dropped packets, but DCCP CCIDs MUST 4307 react conservatively to Data Dropped until this behavior is 4308 standardized. Section 11.7.2, below, describes congestion responses 4309 for all current Drop Codes. 4311 If a received packet's application data is dropped for one of the 4312 reasons listed below, this SHOULD be reported using a Data Dropped 4313 option. Alternatively, the receiver MAY choose to report as 4314 "received" only those packets whose data were not dropped, subject 4315 to the constraint that packets not reported as received MUST NOT 4316 have had their options processed. 
4318 The option's data looks like this: 4320 +--------+--------+--------+--------+--------+-------- 4321 |00101000| Length | Block | Block | Block | ... 4322 +--------+--------+--------+--------+--------+-------- 4323 Type=40 \___________ Vector ___________ ... 4325 The Vector consists of a series of bytes, called Blocks, each of 4326 whose encoding corresponds to one of two choices: 4328 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 4329 +-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+ 4330 |0| Run Length | or |1|DrpCd|Run Len| 4331 +-+-+-+-+-+-+-+-+ +-+-+-+-+-+-+-+-+ 4332 Normal Block Drop Block 4334 The first byte in the first Data Dropped option refers to the packet 4335 indicated in the Acknowledgement Number; subsequent bytes refer to 4336 older packets. (Data Dropped MUST NOT be sent on DCCP-Data or DCCP- 4337 Request packets, which lack an Acknowledgement Number, and any Data 4338 Dropped options received on these packet types MUST be ignored.) 4339 Normal Blocks, which have high bit 0, indicate that any received 4340 packets in the Run Length had their data delivered to the 4341 application. Drop Blocks, which have high bit 1, indicate that 4342 received packets in the Run Len[gth] were not delivered as usual. 4343 The 3-bit Drop Code [DrpCd] field says what happened; generally, no 4344 data from that packet reached the application. Packets reported as 4345 "not yet received" MUST be included in Normal Blocks; packets not 4346 covered by any Data Dropped option are treated as if they were in a 4347 Normal Block. Defined Drop Codes for Drop Blocks are as follows. 4349 Drop Code Meaning 4350 --------- ------- 4351 0 Protocol Constraints 4352 1 Application Not Listening 4353 2 Receive Buffer 4354 3 Corrupt 4355 4-6 Reserved 4356 7 Delivered Corrupt 4358 Table 7: DCCP Drop Codes 4360 In more detail: 4362 0 The packet data was dropped due to protocol constraints. 
4363 For example, the data was included on a DCCP-Request packet, 4364 but the receiving application does not allow such 4365 piggybacking; or the data was included on a packet with 4366 inappropriately low Checksum Coverage. 4368 1 The packet data was dropped because the application is no 4369 longer listening. See Section 11.7.2. 4371 2 The packet data was dropped in a receive buffer, probably 4372 because of receive buffer overflow. See Section 11.7.2. 4374 3 The packet data was dropped due to corruption. See Section 4375 9.3. 4377 7 The packet data was corrupted, but delivered to the 4378 application anyway. See Section 9.3. 4380 For example, assume a packet arrives with Acknowledgement Number 4381 100, an Ack Vector reporting all packets as received, and a Data 4382 Dropped option containing the decimal values 0,160,3,162. Then: 4384 Packet 100 was received (Acknowledgement Number 100, Normal 4385 Block, Run Length 0). 4387 Packet 99 was dropped in a receive buffer (Drop Block, Drop Code 4388 2, Run Length 0). 4390 Packets 98, 97, 96, and 95 were received (Normal Block, Run 4391 Length 3). 4393 Packets 94, 93, and 92 were dropped in the receive buffer (Drop 4394 Block, Drop Code 2, Run Length 2). 4396 Run lengths of more than 128 (for Normal Blocks) or 16 (for Drop 4397 Blocks) must be encoded in multiple Blocks. A single Data Dropped 4398 option can acknowledge up to 32384 Normal Block data packets, 4399 although the receiver SHOULD NOT send a Data Dropped option when all 4400 relevant packets fit into Normal Blocks. Should more packets need 4401 to be acknowledged than can fit in 253 bytes of Data Dropped, then 4402 multiple Data Dropped options can be sent. The second option will 4403 begin where the first left off, and so forth.
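The Block encoding above can be illustrated with a short, non-normative Python sketch. The function name decode_data_dropped and the convention of using None for packets covered by a Normal Block are assumptions made for this example only. Sequence number wraparound is ignored.

```python
# Non-normative sketch: expand the Blocks of one Data Dropped option.
# All names below are illustrative; they are not defined by the protocol.

DROP_CODE_NAMES = {
    0: "Protocol Constraints",
    1: "Application Not Listening",
    2: "Receive Buffer",
    3: "Corrupt",
    7: "Delivered Corrupt",
}


def decode_data_dropped(ack_number, blocks):
    """Return {sequence number: Drop Code, or None for a Normal Block}.

    ack_number is the Acknowledgement Number of the packet carrying the
    option; blocks is the option data.  Wraparound is ignored here.
    """
    report = {}
    seqno = ack_number
    for block in blocks:
        if block & 0x80:
            # Drop Block: high bit 1, 3-bit Drop Code, 4-bit Run Length.
            drop_code = (block >> 4) & 0x07
            run_length = block & 0x0F
        else:
            # Normal Block: high bit 0, 7-bit Run Length.
            drop_code = None
            run_length = block & 0x7F
        # A Run Length of n covers n + 1 packets.
        for _ in range(run_length + 1):
            report[seqno] = drop_code
            seqno -= 1             # subsequent Blocks refer to older packets
    return report


# Option data 0,160,3,162 with Acknowledgement Number 100.
report = decode_data_dropped(100, bytes([0, 160, 3, 162]))
for seqno in sorted(report, reverse=True):
    code = report[seqno]
    print(seqno, "delivered" if code is None else DROP_CODE_NAMES[code])
```

For these values the sketch yields Drop Code 2 for packet 99 and for packets 94 through 92, and normal delivery for packet 100 and packets 98 through 95.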
4405 One or more Data Dropped options that, together, report the status 4406 of more packets than have been sent, or that change the status of a 4407 packet, or that disagree with Ack Vector or equivalent options (by 4408 reporting a "not yet received" packet as "dropped in the receive 4409 buffer", for example), SHOULD be considered invalid. The receiving 4410 DCCP SHOULD either ignore such options, or respond by resetting the 4411 connection with Reset Code 5, "Option Error". 4413 A DCCP application interface should let receiving applications 4414 specify the Drop Codes corresponding to received packets. For 4415 example, this would let applications calculate their own checksums, 4416 but still report "dropped due to corruption" packets via the Data 4417 Dropped option. The interface SHOULD NOT let applications reduce 4418 the "seriousness" of a packet's Drop Code; for example, the 4419 application should not be able to upgrade a packet from delivered 4420 corrupt (Drop Code 7) to delivered normally (no Drop Code). 4422 Data Dropped information is transmitted reliably. That is, 4423 endpoints SHOULD continue to transmit Data Dropped options until 4424 receiving an acknowledgement indicating that the relevant options 4425 have been processed. In Ack Vector terms, each acknowledgement 4426 should contain Data Dropped options that cover the whole 4427 Acknowledgement Window (Section 11.4.2), although when every packet 4428 in that window would be placed in a Normal Block no actual option is 4429 required. 4431 11.7.1. Data Dropped and Normal Congestion Response 4433 When deciding on a response to a particular acknowledgement or set 4434 of acknowledgements containing Data Dropped options, a congestion 4435 control mechanism MUST consider dropped packets and ECN Congestion 4436 Experienced marks (including marked packets that are included in 4437 Data Dropped), as well as the packets singled out in Data Dropped. 
4438 For window-based mechanisms, the valid response space is defined as 4439 follows. 4441 Assume an old window of W. Independently calculate a new window 4442 W_new1 that assumes no packets were Data Dropped (so W_new1 contains 4443 only the normal congestion response), and a new window W_new2 that 4444 assumes no packets were lost or marked (so W_new2 contains only the 4445 Data Dropped response). We are assuming that Data Dropped 4446 recommended a reduction in congestion window, so W_new2 < W. 4448 Then the actual new window W_new MUST NOT be larger than the minimum 4449 of W_new1 and W_new2; and the sender MAY combine the two responses, 4450 by setting 4451 W_new = W + min(W_new1 - W, 0) + min(W_new2 - W, 0). 4453 The details of how this is accomplished are specified in CCID 4454 profile documents. Non-window-based congestion control mechanisms 4455 MUST behave analogously; again, CCID profiles define how. 4457 11.7.2. Particular Drop Codes 4459 Drop Code 0, Protocol Constraints, does not indicate any kind of 4460 congestion, so the sender's CCID SHOULD react to packets with Drop 4461 Code 0 as if they were received (with or without ECN Congestion 4462 Experienced marks, as appropriate). However, the sending endpoint 4463 SHOULD NOT send data until it believes the protocol constraint no 4464 longer applies. 4466 Drop Code 1, Application Not Listening, means the application 4467 running at the endpoint that sent the option is no longer listening 4468 for data. For example, a server might close its receiving half- 4469 connection to new data after receiving a complete request from the 4470 client. This would limit the amount of state available at the 4471 server for incoming data, and thus reduce the potential damage from 4472 certain denial-of-service attacks. A Data Dropped option containing 4473 Drop Code 1 SHOULD be sent whenever received data is ignored due to 4474 a non-listening application. 
Once an endpoint reports Drop Code 1 4475 for a packet, it SHOULD report Drop Code 1 for every succeeding data 4476 packet on that half-connection; once an endpoint receives a Drop 4477 Code 1 report, it SHOULD expect that no more data will ever be 4478 delivered to the other endpoint's application, so it SHOULD NOT send 4479 more data. 4481 Drop Code 2, Receive Buffer, indicates congestion inside the 4482 receiving host. For instance, if a drop-from-tail kernel socket 4483 buffer is too full to accept a packet's application data, that 4484 packet should be reported as Drop Code 2. For a drop-from-head or 4485 more complex socket buffer, the dropped packet should be reported as 4486 Drop Code 2. DCCP implementations may also provide an API by which 4487 applications can mark received packets as Drop Code 2, indicating 4488 that the application ran out of space in its user-level receive 4489 buffer. (However, it is not generally useful to report packets as 4490 dropped due to Drop Code 2 after more than a couple of round-trip times 4491 have passed. The HC-Sender may have forgotten its acknowledgement 4492 state for the packet by that time, so the Data Dropped report will 4493 have no effect.) Every packet newly acknowledged as Drop Code 2 4494 SHOULD reduce the sender's instantaneous rate by one packet per 4495 round-trip time, unless the sender is already sending one packet per 4496 RTT or less. Each CCID profile defines the CCID-specific mechanism 4497 by which this is accomplished. 4499 Currently, the other Drop Codes, namely Drop Code 3, Corrupt, Drop 4500 Code 7, Delivered Corrupt, and reserved Drop Codes 4-6, MUST cause 4501 the relevant CCID to behave as if the relevant packets were ECN 4502 marked (ECN Congestion Experienced). 4504 12. Explicit Congestion Notification 4506 The DCCP protocol is fully ECN-aware [RFC 3168]. Each CCID 4507 specifies how its endpoints respond to ECN marks.
Furthermore, 4508 DCCP, unlike TCP, allows senders to control the rate at which 4509 acknowledgements are generated (with options like Ack Ratio); since 4510 acknowledgements are congestion-controlled, they also qualify as 4511 ECN-Capable Transport. 4513 A CCID profile describes how that CCID interacts with ECN, both for 4514 data traffic and pure-acknowledgement traffic. A sender SHOULD set 4515 ECN-Capable Transport on its packets' IP headers, unless the 4516 receiver's ECN Incapable feature is on or the relevant CCID 4517 disallows it. 4519 The rest of this section describes the ECN Incapable feature and the 4520 interaction of the ECN Nonce with acknowledgement options such as 4521 Ack Vector. 4523 12.1. ECN Incapable Feature 4525 DCCP endpoints are ECN-aware by default, but the ECN Incapable 4526 feature lets an endpoint reject the use of Explicit Congestion 4527 Notification. The use of this feature is NOT RECOMMENDED. ECN 4528 incapability both avoids ECN's possible benefits and prevents 4529 senders from using the ECN Nonce to check for receiver misbehavior. 4530 A DCCP stack MAY therefore leave the ECN Incapable feature 4531 unimplemented, acting as if all connections were ECN capable. It is 4532 worth noting that the inappropriate firewall interactions that 4533 dogged TCP's implementation of ECN [RFC 3360] involve TCP header 4534 bits, not the IP header's ECN bits; we know of no middlebox that 4535 would block ECN-capable DCCP packets, but allow ECN-incapable DCCP 4536 packets. 4538 ECN Incapable has feature number 4, and is server-priority. It 4539 takes one-byte Boolean values. DCCP A MUST be able to read ECN bits 4540 from received frames' IP headers when ECN Incapable/A is zero. 4541 (This is independent of whether it can set ECN bits on sent frames.) 4542 DCCP A thus sends a "Change L(ECN Incapable, 1)" option to DCCP B to 4543 inform it that A cannot read ECN bits.
If the ECN Incapable/A 4544 feature is one, then all of DCCP B's packets MUST be sent as ECN 4545 incapable. New connections start with ECN Incapable 0 (that is, ECN 4546 capable) for both endpoints. Values of two or more are reserved. 4548 If a DCCP is not ECN capable, it MUST send Mandatory "Change L(ECN 4549 Incapable, 1)" options to the other endpoint until acknowledged (by 4550 "Confirm R(ECN Incapable, 1)") or the connection closes. 4551 Furthermore, it MUST NOT accept any data until the other endpoint 4552 sends "Confirm R(ECN Incapable, 1)". It SHOULD send Data Dropped 4553 options on its acknowledgements, with Drop Code 0 ("protocol 4554 constraints"), if the other endpoint does send data inappropriately. 4556 12.2. ECN Nonces 4558 Congestion avoidance will not occur, and the receiver will sometimes 4559 get its data faster, if the sender isn't told about congestion 4560 events. Thus, the receiver has some incentive to falsify 4561 acknowledgement information, reporting that marked or dropped 4562 packets were actually received unmarked. This problem is more 4563 serious with DCCP than with TCP, since TCP provides reliable 4564 transport: it is more difficult with TCP to lie about lost packets 4565 without breaking the application. 4567 ECN Nonces are a general mechanism to prevent ECN cheating (or loss 4568 cheating). Two values for the two-bit ECN header field indicate 4569 ECN-Capable Transport, 01 and 10. The second code point, 10, is the 4570 ECN Nonce. In general, a protocol sender chooses between these code 4571 points randomly on its output packets, remembering the sequence it 4572 chose. The protocol receiver reports, on every acknowledgement, the 4573 number of ECN Nonces it has received thus far. This is called the 4574 ECN Nonce Echo. Since ECN marking and packet dropping both destroy 4575 the ECN Nonce, a receiver that lies about an ECN mark or packet drop 4576 has a 50% chance of guessing right and avoiding discipline. 
The 4577 sender may react punitively to an ECN Nonce mismatch, possibly up to 4578 dropping the connection. The ECN Nonce Echo field need not be an 4579 integer; one bit is enough to catch 50% of infractions, and the 4580 probability of success drops exponentially as more packets are sent 4581 [RFC 3540]. 4583 In DCCP, the ECN Nonce Echo field is encoded in acknowledgement 4584 options. For example, the Ack Vector option comes in two forms, Ack 4585 Vector [Nonce 0] (option 38) and Ack Vector [Nonce 1] (option 39), 4586 corresponding to the two values for a one-bit ECN Nonce Echo. The 4587 Nonce Echo for a given Ack Vector equals the one-bit sum (exclusive- 4588 or, or parity) of ECN nonces for packets reported by that Ack Vector 4589 as received and not ECN marked. Thus, only packets marked as State 4590 0 matter for this calculation (that is, valid received packets that 4591 were not ECN marked). Every Ack Vector option is detailed enough 4592 for the sender to determine what the Nonce Echo should have been. 4593 It can check this calculation against the actual Nonce Echo, and 4594 complain if there is a mismatch. (The Ack Vector could conceivably 4595 report every packet's ECN Nonce state, but this would severely limit 4596 its compressibility without providing much extra protection.) 4598 Each DCCP sender SHOULD set ECN Nonces on its packets, and remember 4599 which packets had nonces. When a sender detects an ECN Nonce Echo 4600 mismatch, it behaves as described in the next section. Each DCCP 4601 receiver MUST calculate and use the correct value for ECN Nonce Echo 4602 when sending acknowledgement options. 4604 ECN incapability, as indicated by the ECN Incapable feature, is 4605 handled as follows: An endpoint sending packets to an ECN-incapable 4606 receiver MUST send its packets as ECN incapable, and an ECN- 4607 incapable receiver MUST use the value zero for all ECN Nonce Echoes. 4609 12.3. 
Aggression Penalties 4611 DCCP endpoints have several mechanisms for detecting congestion- 4612 related misbehavior. For example: 4614 o A sender can detect an ECN Nonce Echo mismatch, indicating 4615 possible receiver misbehavior. 4617 o A receiver can detect whether the sender is responding to 4618 congestion feedback or Slow Receiver. 4620 o An endpoint may be able to detect that its peer is reporting 4621 inappropriately small Elapsed Time values (Section 13.2). 4623 An endpoint that detects possible congestion-related misbehavior 4624 SHOULD try to verify that its peer is truly misbehaving. For 4625 example, a sending endpoint might send a packet whose ECN header 4626 field is set to Congestion Experienced, 11; a receiver that doesn't 4627 report a corresponding mark is most likely misbehaving. 4629 Upon detecting possible misbehavior, a sender SHOULD respond as if 4630 the receiver had reported one or more recent packets as ECN-marked 4631 (instead of unmarked), while a receiver SHOULD report one or more 4632 recent non-marked packets as ECN-marked. Alternatively, a sender 4633 might act as if the receiver had sent a Slow Receiver option, and a 4634 receiver might send Slow Receiver options. Other reactions that 4635 serve to slow the transfer rate are also acceptable. An entity that 4636 detects particularly egregious and ongoing misbehavior MAY also 4637 reset the connection with Reset Code 11, "Aggression Penalty". 4639 However, ECN Nonce mismatches and other warning signs can result 4640 from innocent causes, such as implementation bugs or attack. In 4641 particular, a successful DCCP-Data attack (Section 7.5.5) can cause 4642 the receiver to report an incorrect ECN Nonce Echo. Therefore, 4643 connection reset and other heavyweight mechanisms SHOULD be used 4644 only as last resorts, after multiple round-trip times of verified 4645 aggression. 4647 13.
Timing Options 4649 The Timestamp, Timestamp Echo, and Elapsed Time options help DCCP 4650 endpoints explicitly measure round-trip times. 4652 13.1. Timestamp Option 4654 This option is permitted in any DCCP packet. The length of the 4655 option is 6 bytes. 4657 +--------+--------+--------+--------+--------+--------+ 4658 |00101001|00000110| Timestamp Value | 4659 +--------+--------+--------+--------+--------+--------+ 4660 Type=41 Length=6 4662 The four bytes of option data carry the timestamp of this packet. 4663 The timestamp is a 32-bit integer that increases monotonically with 4664 time, at a rate of 1 unit per 10 microseconds. At this rate, 4665 Timestamp Value will wrap approximately every 11.9 hours. Endpoints 4666 need not measure time at this fine granularity; for example, an 4667 endpoint that preferred to measure time at millisecond granularity 4668 might send Timestamp Values that were all multiples of 100. The 4669 precise time corresponding to Timestamp Value zero is not specified: 4670 Timestamp Values are only meaningful relative to other Timestamp 4671 Values sent on the same connection. A DCCP receiving a Timestamp 4672 option SHOULD respond with a Timestamp Echo option on the next 4673 packet it sends. 4675 13.2. Elapsed Time Option 4677 This option is permitted in any DCCP packet that contains an 4678 Acknowledgement Number (such options received on other packet types 4679 MUST be ignored). It indicates how much time has elapsed, in 4680 hundredths of milliseconds (or, equivalently, multiples of 4681 10 microseconds), since the packet being acknowledged -- the packet 4682 with the given Acknowledgement Number -- was received. The option 4683 may take 4 or 6 bytes, depending on the size of the Elapsed Time 4684 value. Elapsed Time helps correct round-trip time estimates when 4685 the gap between receiving a packet and acknowledging that packet may 4686 be long -- in CCID 3, for example, where acknowledgements are sent 4687 infrequently. 
4689 +--------+--------+--------+--------+ 4690 |00101011|00000100| Elapsed Time | 4691 +--------+--------+--------+--------+ 4692 Type=43 Len=4 4694 +--------+--------+--------+--------+--------+--------+ 4695 |00101011|00000110| Elapsed Time | 4696 +--------+--------+--------+--------+--------+--------+ 4697 Type=43 Len=6 4699 The option data, Elapsed Time, represents an estimated upper bound 4700 on the amount of time elapsed since the packet being acknowledged 4701 was received, with units of hundredths of milliseconds. If Elapsed 4702 Time is less than a half-second, the first, smaller form of the 4703 option SHOULD be used. Elapsed Times of more than 0.65535 seconds 4704 MUST be sent using the second form of the option. The special 4705 Elapsed Time value 4294967295, which corresponds to approximately 4706 11.9 hours, is used to represent any Elapsed Time greater than 4707 42949.67294 seconds. DCCP endpoints MUST NOT report Elapsed Times 4708 that are significantly larger than the true elapsed times. A 4709 connection MAY be reset with Reset Code 11, "Aggression Penalty", if 4710 one endpoint determines that the other is reporting a much-too-large 4711 Elapsed Time. 4713 Elapsed Time is measured in hundredths of milliseconds as a 4714 compromise between two conflicting goals. First, it provides enough 4715 granularity to reduce rounding error when measuring elapsed time 4716 over fast LANs; second, it allows many reasonable elapsed times to 4717 fit into two bytes of data. 4719 13.3. Timestamp Echo Option 4721 This option is permitted in any DCCP packet, as long as at least one 4722 packet carrying the Timestamp option has been received. Generally, 4723 a DCCP endpoint should send one Timestamp Echo option for each 4724 Timestamp option it receives; and it should send that option as soon 4725 as is convenient. The length of the option is between 6 and 10 4726 bytes, depending on whether Elapsed Time is included and how large 4727 it is. 
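As a non-normative illustration of the Elapsed Time option of Section 13.2, the following Python sketch encodes an elapsed interval into the 4-byte or 6-byte form. The function name encode_elapsed_time and the choice of integer microseconds as input are assumptions made for this example. The sketch also assumes that the Elapsed Time field is carried in network byte order. It rounds up to the next unit of 10 microseconds, since the value is meant as an upper bound, and it uses the smaller form whenever the value fits in two bytes, which is one permitted choice for values between a half-second and 0.65535 seconds.

```python
import struct

# Non-normative sketch: encode a DCCP Elapsed Time option (Type 43).
# All names below are illustrative; they are not defined by the protocol.

ELAPSED_TIME_TYPE = 43
MAX_ELAPSED_UNITS = 4294967295   # special value for very long elapsed times


def encode_elapsed_time(elapsed_us):
    """Return the option bytes for an elapsed time given in microseconds.

    Assumes network byte order for the Elapsed Time field.
    """
    if elapsed_us < 0:
        raise ValueError("elapsed time cannot be negative")
    # Convert to units of 10 microseconds, rounding up (ceiling division).
    units = -(-elapsed_us // 10)
    # Saturate at the special value used for any longer elapsed time.
    units = min(units, MAX_ELAPSED_UNITS)
    if units <= 0xFFFF:
        # Smaller form: Type, Length=4, two bytes of Elapsed Time.
        return struct.pack("!BBH", ELAPSED_TIME_TYPE, 4, units)
    # Larger form: Type, Length=6, four bytes of Elapsed Time.
    return struct.pack("!BBI", ELAPSED_TIME_TYPE, 6, units)


short_form = encode_elapsed_time(1230)        # 1.23 ms -> 123 units
long_form = encode_elapsed_time(1_000_000)    # 1 s -> 100000 units
saturated = encode_elapsed_time(10**12)       # far beyond 11.9 hours
print(short_form.hex(), long_form.hex(), saturated.hex())
```

Taking integer microseconds as input avoids floating-point rounding at the boundary between the two forms.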
4729 +--------+--------+--------+--------+--------+--------+ 4730 |00101010|00000110| Timestamp Echo | 4731 +--------+--------+--------+--------+--------+--------+ 4732 Type=42 Len=6 4734 +--------+--------+------- ... -------+--------+--------+ 4735 |00101010|00001000| Timestamp Echo | Elapsed Time | 4736 +--------+--------+------- ... -------+--------+--------+ 4737 Type=42 Len=8 (4 bytes) 4739 +--------+--------+------- ... -------+------- ... -------+ 4740 |00101010|00001010| Timestamp Echo | Elapsed Time | 4741 +--------+--------+------- ... -------+------- ... -------+ 4742 Type=42 Len=10 (4 bytes) (4 bytes) 4744 The first four bytes of option data, Timestamp Echo, carry a 4745 Timestamp Value taken from a preceding received Timestamp option. 4746 Usually, this will be the last packet that was received -- the 4747 packet indicated by the Acknowledgement Number, if any -- but it 4748 might be a preceding packet. Each Timestamp received will generally 4749 result in exactly one Timestamp Echo transmitted. If an endpoint 4750 has received multiple Timestamp options since the last time it sent 4751 a packet, then it MAY ignore all Timestamp options but the one 4752 included on the packet with the greatest sequence number; 4753 alternatively, it MAY include multiple Timestamp Echo options in its 4754 response, each corresponding to a different Timestamp option. 4756 The Elapsed Time value, similar to that in the Elapsed Time option, 4757 indicates the amount of time elapsed since receiving the packet 4758 whose timestamp is being echoed. This time MUST be in hundredths of 4759 milliseconds. Elapsed Time is meant to help the Timestamp sender 4760 separate the network round-trip time from the Timestamp receiver's 4761 processing time. This may be particularly important for CCIDs where 4762 acknowledgements are sent infrequently, so that there might be 4763 considerable delay between receiving a Timestamp option and sending 4764 the corresponding Timestamp Echo. 
A missing Elapsed Time field is 4765 equivalent to an Elapsed Time of zero. The smallest version of the 4766 option SHOULD be used that can hold the relevant Elapsed Time value. 4768 14. Maximum Packet Size 4770 A DCCP implementation MUST maintain the maximum packet size (MPS) 4771 allowed for each active DCCP session. The MPS is influenced by the 4772 maximum packet size allowed by the current congestion control 4773 mechanism (CCMPS), the maximum packet size supported by the path's 4774 links (PMTU, the Path Maximum Transmission Unit) [RFC 1191], and the 4775 lengths of the IP and DCCP headers. 4777 A DCCP application interface SHOULD let the application discover 4778 DCCP's current MPS. Generally, the DCCP implementation will refuse 4779 to send any packet bigger than the MPS, returning an appropriate 4780 error to the application. A DCCP interface MAY allow applications 4781 to request fragmentation for packets larger than PMTU, but not 4782 larger than CCMPS (packets larger than CCMPS MUST be rejected in any 4783 case). Fragmentation SHOULD NOT be the default, since it decreases 4784 robustness: an entire packet is discarded if even one of its 4785 fragments is lost. Applications can usually get better error 4786 tolerance by producing packets smaller than the PMTU. 4788 The MPS reported to the application SHOULD be influenced by the size 4789 expected to be required for DCCP headers and options. If the 4790 application provides data that, when combined with the options the 4791 DCCP implementation would like to include, would exceed the MPS, the 4792 implementation should either send the options on a separate packet 4793 (such as a DCCP-Ack) or lower the MPS, drop the data, and return an 4794 appropriate error to the application. 4796 14.1. Measuring PMTU 4798 Each DCCP endpoint MUST keep track of the current PMTU for each 4799 connection, except that this is not required for IPv4 connections 4800 whose applications have requested fragmentation. 
The PMTU SHOULD be 4801 initialized from the interface MTU that will be used to send 4802 packets. The MPS will be initialized with the minimum of the PMTU 4803 and the CCMPS, if any. 4805 Classical PMTU discovery uses unfragmentable packets. In IPv4, 4806 these packets have the IP Don't Fragment (DF) bit set; in IPv6, all 4807 packets are unfragmentable once emitted by an end host. As 4808 specified in RFC 1191, when a router receives a packet with DF set 4809 that is larger than the next link's MTU, it sends an ICMP 4810 Destination Unreachable message back to the source whose Code 4811 indicates that an unfragmentable packet was too large to forward (a 4812 "Datagram Too Big" message). When a DCCP implementation receives a 4813 Datagram Too Big message, it decreases its PMTU to the Next-Hop MTU 4814 value given in the ICMP message. If the MTU given in the message is 4815 zero, the sender chooses a value for PMTU using the algorithm 4816 described in RFC 1191 (Section 7). If the MTU given in the message 4817 is greater than the current PMTU, the Datagram Too Big message is 4818 ignored, as described in RFC 1191. (We are aware that this may 4819 cause problems for DCCP endpoints behind certain firewalls.) 4821 A DCCP implementation may allow the application to occasionally 4822 request that PMTU discovery be performed again. This will reset the 4823 PMTU to the outgoing interface's MTU. Such requests SHOULD be rate 4824 limited, to one per two seconds, for example. 4826 A DCCP sender MAY treat the reception of an ICMP Datagram Too Big 4827 message as an indication that the packet being reported was not lost 4828 due to congestion, and so for the purposes of congestion control it 4829 MAY ignore the DCCP receiver's indication that this packet did not 4830 arrive. 
However, if this is done, then the DCCP sender MUST check 4831 the ECN bits of the IP header echoed in the ICMP message, and only 4832 perform this optimization if these ECN bits indicate that the packet 4833 did not experience congestion prior to reaching the router whose 4834 link MTU it exceeded. 4836 A DCCP implementation SHOULD ensure, as far as possible, that ICMP 4837 Datagram Too Big messages were actually generated by routers, so 4838 that attackers cannot drive the PMTU down to a falsely small value. 4839 The simplest way to do this is to verify that the Sequence Number on 4840 the ICMP error's encapsulated header corresponds to a Sequence 4841 Number that the implementation recently sent. (According to current 4842 specifications, routers should return the full DCCP header and 4843 payload up to a maximum of 576 bytes [RFC 1812] or the minimum IPv6 4844 MTU [RFC 2463], although they are not required to return more than 4845 64 bits [RFC 792]. Any amount greater than 128 bits will include 4846 the Sequence Number.) ICMP Datagram Too Big messages with incorrect 4847 or missing Sequence Numbers may be ignored, or the DCCP 4848 implementation may lower the PMTU only temporarily in response. If 4849 more than three odd Datagram Too Big messages are received and the 4850 other DCCP endpoint reports more than three lost packets, however, 4851 the DCCP implementation SHOULD assume the presence of a confused 4852 router, and either obey the ICMP messages' PMTU or (on IPv4 4853 networks) switch to allowing fragmentation. 4855 DCCP also allows upward probing of the PMTU [PMTUD], where the DCCP 4856 endpoint begins by sending small packets with DF set, then gradually 4857 increases the packet size until a packet is lost. This mechanism 4858 does not require any ICMP error processing. DCCP-Sync packets are 4859 the best choice for upward probing, since DCCP-Sync probes do not 4860 risk application data loss. 
The DCCP implementation inserts 4861 arbitrary data into the DCCP-Sync application area, padding the 4862 packet to the right length; and since every valid DCCP-Sync 4863 generates an immediate DCCP-SyncAck in response, the endpoint will 4864 have a pretty good idea of when a probe is lost. 4866 14.2. Sender Behavior 4868 A DCCP sender SHOULD send every packet as unfragmentable, as 4869 described above, with the following exceptions. 4871 o On IPv4 connections whose applications have requested 4872 fragmentation, the sender SHOULD send packets with the DF bit not 4873 set. 4875 o On IPv6 connections whose applications have requested 4876 fragmentation, the sender SHOULD use fragmentation extension 4877 headers to fragment packets larger than PMTU into suitably-sized 4878 chunks. (Those chunks are, of course, unfragmentable.) 4880 o It is undesirable for PMTU discovery to occur on the initial 4881 connection setup handshake, as the connection setup process may 4882 not be representative of packet sizes used during the connection, 4883 and performing MTU discovery on the initial handshake might 4884 unnecessarily delay connection establishment. Thus, DCCP-Request 4885 and DCCP-Response packets SHOULD be sent as fragmentable. In 4886 addition, DCCP-Reset packets SHOULD be sent as fragmentable, 4887 although typically these would be small enough to not be a 4888 problem. For IPv4 connections, these packets SHOULD be sent with 4889 the DF bit not set; for IPv6 connections, they SHOULD be 4890 preemptively fragmented to a size not larger than the relevant 4891 interface MTU. 4893 If the DCCP implementation has decreased the PMTU, the sending 4894 application has not requested fragmentation, and the sending 4895 application attempts to send a packet larger than the new MPS, the 4896 API MUST refuse to send the packet and return an appropriate error 4897 to the application. The application should then use the API to 4898 query the new value of MPS. 
The kernel might have some packets 4899 buffered for transmission that are smaller than the old MPS, but 4900 larger than the new MPS. It MAY send these packets as fragmentable, 4901 or it MAY discard these packets; it MUST NOT send them as 4902 unfragmentable. 4904 15. Forward Compatibility 4906 Future versions of DCCP may add new options and features. A few 4907 simple guidelines will let extended DCCPs interoperate with normal 4908 DCCPs. 4910 o DCCP processors MUST NOT act punitively towards options and 4911 features they do not understand. For example, DCCP processors 4912 MUST NOT reset the connection if some field marked Reserved in 4913 this specification is non-zero; if some unknown option is 4914 present; or if some feature negotiation option mentions an 4915 unknown feature. Instead, DCCP processors MUST ignore these 4916 events. The Mandatory option is the single exception: if 4917 Mandatory precedes some unknown option or feature, the connection 4918 MUST be reset. 4920 o DCCP processors MUST anticipate the possibility of unknown 4921 feature values, which might occur as part of a negotiation for a 4922 known feature. For server-priority features, unknown values are 4923 handled as a matter of course: since the non-extended DCCP's 4924 priority list will not contain unknown values, the result of the 4925 negotiation cannot be an unknown value. A DCCP SHOULD respond 4926 with an empty Confirm option if it is assigned an unacceptable 4927 value for some non-negotiable feature. 4929 o Each DCCP extension SHOULD be controlled by some feature. The 4930 default value of this feature should correspond to "extension not 4931 available". If an extended DCCP wants to use the extension, it 4932 SHOULD attempt to change the feature's value using a Change L or 4933 Change R option. Any non-extended DCCP will ignore the option, 4934 thus leaving the feature value at its default, "extension not 4935 available". 
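The first guideline above, ignoring unknown options unless Mandatory precedes them, can be sketched as an option walker. This is a simplified illustration, not a complete option processor. It assumes the option format defined elsewhere in the specification (types 0-31 are single-byte; other types carry a length byte that counts the type and length bytes), takes the known option types from the allocations in Section 19.3, and uses hypothetical names (`walk_options`, `MandatoryFailure`). It stops at the first option with a nonsensical length and does not model feature negotiation or other Mandatory corner cases.

```python
MANDATORY = 1          # Mandatory option type
PADDING = 0            # Padding option type
# Option types allocated by this document (Section 19.3): 0-2 and 32-44.
KNOWN_OPTION_TYPES = frozenset({0, 1, 2}) | frozenset(range(32, 45))


class MandatoryFailure(Exception):
    """Raised when Mandatory precedes an option we do not understand;
    the caller is expected to reset the connection."""


def walk_options(data: bytes, known=KNOWN_OPTION_TYPES):
    """Return [(type, payload)] for understood options, skipping unknown ones."""
    understood = []
    mandatory_pending = False
    i = 0
    while i < len(data):
        opt_type = data[i]
        if opt_type < 32:                      # single-byte option
            payload = b""
            i += 1
        else:                                  # type, length, data
            if i + 1 >= len(data):
                break                          # truncated: ignore the rest
            length = data[i + 1]
            if length < 2 or i + length > len(data):
                break                          # nonsensical length: ignore the rest
            payload = data[i + 2:i + length]
            i += length
        if opt_type == MANDATORY:
            mandatory_pending = True           # applies to the next option
            continue
        if opt_type not in known:
            if mandatory_pending:
                raise MandatoryFailure(opt_type)
            continue                           # not punitive: just ignore it
        mandatory_pending = False
        if opt_type != PADDING:
            understood.append((opt_type, payload))
    return understood
```

The same pattern, skip what is unknown and fail only on an explicit Mandatory marker, is what lets an extended DCCP probe for an extension without breaking a non-extended peer.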
4937 Section 19 lists DCCP assigned numbers reserved for experimental and 4938 testing purposes. 4940 16. Middlebox Considerations 4942 This section describes properties of DCCP that firewalls, network 4943 address translators, and other middleboxes should consider, 4944 including parts of the packet that middleboxes should not change. 4945 The intent is to draw attention to aspects of DCCP that may be 4946 useful, or dangerous, for middleboxes, or that differ significantly 4947 from TCP. 4949 The Service Code field in DCCP-Request packets provides information 4950 that may be useful for stateful middleboxes. With Service Code, a 4951 middlebox can tell what protocol a connection will use without 4952 relying on port numbers. Middleboxes can disallow connections that 4953 attempt to access unexpected services by sending a DCCP-Reset with 4954 Reset Code 8, "Bad Service Code". Middleboxes should not modify the 4955 Service Code unless they are really changing the service a 4956 connection is accessing. 4958 The Source and Destination Port fields are in the same packet 4959 locations as the corresponding fields in TCP and UDP, which may 4960 simplify some middlebox implementations. 4962 The forward compatibility considerations in Section 15 apply to 4963 middleboxes as well. In particular, middleboxes generally shouldn't 4964 act punitively towards options and features they do not understand. 4966 Modifying DCCP Sequence Numbers and Acknowledgement Numbers is more 4967 tedious and dangerous than modifying TCP sequence numbers. A 4968 middlebox that added packets to, or removed packets from, a DCCP 4969 connection would have to modify acknowledgement options, such as Ack 4970 Vector, and CCID-specific options, such as TFRC's Loss Intervals, at 4971 minimum. 
On ECN-capable connections, the middlebox would have to 4972 keep track of ECN Nonce information for packets it introduced or 4973 removed, so that the relevant acknowledgement options continued to 4974 have correct ECN Nonce Echoes, or risk the connection being reset 4975 for "Aggression Penalty". We therefore recommend that middleboxes 4976 not modify packet streams by adding or removing packets. 4978 Note that there is less need to modify DCCP's per-packet sequence 4979 numbers than TCP's per-byte sequence numbers; for example, a 4980 middlebox can change the contents of a packet without changing its 4981 sequence number. (In TCP, sequence number modification is required 4982 to support protocols like FTP that carry variable-length addresses 4983 in the data stream. If such an application were deployed over DCCP, 4984 middleboxes would simply grow or shrink the relevant packets as 4985 necessary, without changing their sequence numbers. This might 4986 involve fragmenting the packet.) 4988 Middleboxes may, of course, reset connections in progress. Clearly 4989 this requires inserting a packet into one or both packet streams, 4990 but the difficult issues do not arise. 4992 DCCP is somewhat unfriendly to "connection splicing" [SHHP00], in 4993 which clients' connection attempts are intercepted, but possibly 4994 later "spliced in" to external server connections via sequence 4995 number manipulations. A connection splicer at minimum would have to 4996 ensure that the spliced connections agreed on all relevant feature 4997 values, which might take some renegotiation. 4999 The contents of this section should not be interpreted as a 5000 wholesale endorsement of stateful middleboxes. 5002 17. Relations to Other Specifications 5004 17.1. RTP 5006 The Real-Time Transport Protocol, RTP [RFC 3550], is currently used 5007 over UDP by many of DCCP's target applications (for instance, 5008 streaming media). 
Therefore, it is important to examine the 5009 relationship between DCCP and RTP, and in particular, the question 5010 of whether any changes in RTP are necessary or desirable when it is 5011 layered over DCCP instead of UDP. 5013 There are two potential sources of overhead in the RTP-over-DCCP 5014 combination: duplicated acknowledgement information and duplicated 5015 sequence numbers. We argue that together, these sources of overhead add slightly 5016 more than 4 bytes per packet relative to RTP-over-UDP, and that 5017 eliminating the redundancy would not reduce the overhead. 5019 First, consider acknowledgements. Both RTP and DCCP report feedback 5020 about loss rates to data senders, via RTP Control Protocol Sender 5021 and Receiver Reports (RTCP SR/RR packets) and via DCCP 5022 acknowledgement options. These feedback mechanisms are potentially 5023 redundant. However, RTCP SR/RR packets contain information not 5024 present in DCCP acknowledgements, such as "interarrival jitter", and 5025 DCCP's acknowledgements contain information not transmitted by RTCP, 5026 such as the ECN Nonce Echo. Neither feedback mechanism makes the 5027 other redundant. 5029 Sending both types of feedback need not be particularly costly 5030 either. RTCP reports may be sent relatively infrequently: once 5031 every 5 seconds on average, for low-bandwidth flows. In DCCP, some 5032 feedback mechanisms are expensive -- Ack Vector, for example, is 5033 frequent and verbose -- but others are relatively cheap: CCID 3 5034 (TFRC) acknowledgements take between 16 and 32 bytes of options sent 5035 once per round-trip time. (Reporting less frequently than once per 5036 RTT would make congestion control less responsive to loss.) We 5037 therefore conclude that acknowledgement overhead in RTP-over-DCCP 5038 need not be significantly higher than for RTP-over-UDP, at least for 5039 CCID 3. 5041 One clear redundancy can be addressed at the application level.
The 5042 verbose packet-by-packet loss reports sent in RTCP Extended Reports 5043 Loss RLE Blocks [RFC 3611] can be derived from DCCP's Ack Vector 5044 options. (The converse is not true, since Loss RLE Blocks contain 5045 no ECN information.) Since DCCP implementations should provide an 5046 API for application access to Ack Vector information, RTP-over-DCCP 5047 applications might request either DCCP Ack Vectors or RTCP Extended 5048 Report Loss RLE Blocks, but not both. 5050 Now consider sequence number redundancy on data packets. The 5051 embedded RTP header contains a 16-bit RTP sequence number. Most 5052 data packets will use the DCCP-Data type; DCCP-DataAck and DCCP-Ack 5053 packets need not usually be sent. The DCCP-Data header is 12 bytes 5054 long without options, including a 24-bit sequence number. This is 4 5055 bytes more than a UDP header. Any options required on data packets 5056 would add further overhead, although many CCIDs (for instance, CCID 5057 3, TFRC) don't require options on most data packets. 5059 The DCCP sequence number cannot be inferred from the RTP sequence 5060 number since it increments on non-data packets as well as data 5061 packets. The RTP sequence number cannot be inferred from the DCCP 5062 sequence number either [RFC 3550]. Furthermore, removing RTP's 5063 sequence number would not save any header space because of alignment 5064 issues. We therefore recommend that RTP transmitted over DCCP use 5065 the same headers currently defined. The 4 byte header cost is a 5066 reasonable tradeoff for DCCP's congestion control features and 5067 access to ECN. Truly bandwidth-starved endpoints should use some 5068 future header compression scheme. 5070 17.2. Congestion Manager and Multiplexing 5072 Since DCCP doesn't provide reliable, ordered delivery, multiple 5073 application sub-flows may be multiplexed over a single DCCP 5074 connection with no inherent performance penalty. 
Thus, there is no 5075 need for DCCP to provide built-in support for multiple sub-flows. 5076 This differs from SCTP [RFC 2960]. 5078 Some applications might want to share congestion control state among 5079 multiple DCCP flows that share the same source and destination 5080 addresses. This functionality could be provided by the Congestion 5081 Manager [RFC 3124], a generic multiplexing facility. However, the 5082 CM would not fully support DCCP without change; it does not 5083 gracefully handle multiple congestion control mechanisms, for 5084 example. 5086 18. Security Considerations 5088 DCCP does not provide cryptographic security guarantees. 5089 Applications desiring cryptographic security services (integrity, 5090 authentication, confidentiality, access control, and anti-replay 5091 protection) should use IPsec or end-to-end security of some kind; 5092 Secure RTP is one candidate protocol [RFC 3711]. 5094 Nevertheless, DCCP is intended to protect against some classes of 5095 attackers: Attackers cannot hijack a DCCP connection (close the 5096 connection unexpectedly, or cause attacker data to be accepted by an 5097 endpoint as if it came from the sender) unless they can guess valid 5098 sequence numbers. Thus, as long as endpoints choose initial 5099 sequence numbers well, a DCCP attacker must snoop on data packets to 5100 get any reasonable probability of success. Sequence number validity 5101 checks provide this guarantee. Section 7.5.5 describes sequence 5102 number security further. This security property only holds assuming 5103 that DCCP's random numbers are chosen according to the guidelines in 5104 RFC 1750. 5106 DCCP also provides mechanisms to limit the potential impact of some 5107 denial-of-service attacks. 
These mechanisms include Init Cookie 5108 (Section 8.1.4), the DCCP-CloseReq packet (Section 5.5), the 5109 Application Not Listening Drop Code (Section 11.7.2), limitations on 5110 the processing of options that might cause connection reset (Section 5111 7.5.5), limitations on the processing of some ICMP messages (Section 5112 14.1), and various rate limits, which let servers avoid extensive 5113 computation or packet generation (Sections 7.5.3, 8.1.3, and 5114 others). 5116 DCCP provides no protection against attackers that can snoop on data 5117 packets. 5119 18.1. Security Considerations for Partial Checksums 5121 The partial checksum facility has a separate security impact, 5122 particularly in its interaction with authentication and encryption 5123 mechanisms. The impact is the same in DCCP as in the UDP-Lite 5124 protocol, and what follows was adapted from the corresponding text 5125 in the UDP-Lite specification [RFC 3828]. 5127 When a DCCP packet's Checksum Coverage field is not zero, the 5128 uncovered portion of a packet may change in transit. This is 5129 contrary to the idea behind most authentication mechanisms: 5130 authentication succeeds if the packet has not changed in transit. 5131 Unless authentication mechanisms that operate only on the sensitive 5132 part of packets are developed and used, authentication will always 5133 fail for partially-checksummed DCCP packets whose uncovered part has 5134 been damaged. 5136 The IPsec integrity check (Encapsulation Security Protocol, ESP, or 5137 Authentication Header, AH) is applied (at least) to the entire IP 5138 packet payload. Corruption of any bit within that area will then 5139 result in the IP receiver discarding a DCCP packet, even if the 5140 corruption happened in an uncovered part of the DCCP application 5141 data. 
5143 When IPsec is used with ESP payload encryption, a link cannot 5144 determine the specific transport protocol of a packet being 5145 forwarded by inspecting the IP packet payload. In this case, the 5146 link MUST provide a standard integrity check covering the entire IP 5147 packet and payload. DCCP partial checksums provide no benefit in 5148 this case. 5150 Encryption (e.g., at the transport or application levels) may be 5151 used. Note that omitting an integrity check can, under certain 5152 circumstances, compromise confidentiality [BEL98]. 5154 If a few bits of an encrypted packet are damaged, the decryption 5155 transform will typically spread errors so that the packet becomes 5156 too damaged to be of use. Many encryption transforms today exhibit 5157 this behavior. Some encryption transforms, namely stream ciphers, 5158 do not cause error propagation. Proper use of stream ciphers 5159 can be quite difficult, especially when authentication checking is 5160 omitted [BB01]. In particular, an attacker can cause predictable 5161 changes to the ultimate plaintext, even without being able to 5162 decrypt the ciphertext. 5164 19. IANA Considerations 5166 IANA has assigned IP Protocol Number 33 to DCCP. 5168 DCCP introduces eight sets of numbers whose values should be 5169 allocated by IANA. We refer to allocation policies, such as 5170 Standards Action, outlined in RFC 2434, and most registries reserve 5171 some values for experimental and testing use [RFC 3692]. In 5172 addition, DCCP requires that the IANA Port Numbers registry be 5173 opened for DCCP port registrations; Section 19.9 describes how. 5175 19.1. Packet Types Registry 5177 Each entry in the DCCP Packet Types registry contains a packet type, 5178 which is a number in the range 0-15; a packet type name, such as 5179 DCCP-Request; and a reference to the RFC defining the packet type. 5180 The registry is initially populated using the values in Table 1 5181 (Section 5.1).
This document allocates packet types 0-9, and packet 5182 type 14 is permanently reserved for experimental and testing use. 5183 Packet types 10-13 and 15 are currently reserved, and should be 5184 allocated with the Standards Action policy, which requires IESG 5185 review and approval and standards-track IETF RFC publication. 5187 19.2. Reset Codes Registry 5189 Each entry in the DCCP Reset Codes registry contains a Reset Code, 5190 which is a number in the range 0-255; a short description of the 5191 Reset Code, such as "No Connection"; and a reference to the RFC 5192 defining the Reset Code. The registry is initially populated using 5193 the values in Table 2 (Section 5.6). This document allocates Reset 5194 Codes 0-11, and Reset Codes 120-126 are permanently reserved for 5195 experimental and testing use. Reset Codes 12-119 and 127 are 5196 currently reserved, and should be allocated with the IETF Consensus 5197 policy, requiring an IETF RFC publication (standards-track or not) 5198 with IESG review and approval. Reset Codes 128-255 are permanently 5199 reserved for CCID-specific registries; each CCID Profile document 5200 describes how the corresponding registry is managed. 5202 19.3. Option Types Registry 5204 Each entry in the DCCP option types registry contains an option 5205 type, which is a number in the range 0-255; the name of the option, 5206 such as "Slow Receiver"; and a reference to the RFC defining the 5207 option type. The registry is initially populated using the values 5208 in Table 3 (Section 5.8). This document allocates option types 0-2 5209 and 32-44, and option types 31 and 120-126 are permanently reserved 5210 for experimental and testing use. Option types 3-30, 45-119, and 5211 127 are currently reserved, and should be allocated with the IETF 5212 Consensus policy, requiring an IETF RFC publication (standards-track 5213 or not) with IESG review and approval. 
Option types 128-255 are 5214 permanently reserved for CCID-specific registries; each CCID Profile 5215 document describes how the corresponding registry is managed. 5217 19.4. Feature Numbers Registry 5219 Each entry in the DCCP feature numbers registry contains a feature 5220 number, which is a number in the range 0-255; the name of the 5221 feature, such as "ECN Incapable"; and a reference to the RFC 5222 defining the feature number. The registry is initially populated 5223 using the values in Table 4 (Section 6). This document allocates 5224 feature numbers 0-9, and feature numbers 120-126 are permanently 5225 reserved for experimental and testing use. Feature numbers 10-119 5226 and 127 are currently reserved, and should be allocated with the 5227 IETF Consensus policy, requiring an IETF RFC publication (standards- 5228 track or not) with IESG review and approval. Feature numbers 5229 128-255 are permanently reserved for CCID-specific registries; each 5230 CCID Profile document describes how the corresponding registry is 5231 managed. 5233 19.5. Congestion Control Identifiers Registry 5235 Each entry in the DCCP Congestion Control Identifiers (CCID) 5236 registry contains a CCID, which is a number in the range 0-255; the 5237 name of the CCID, such as "TCP-like Congestion Control"; and a 5238 reference to the RFC defining the CCID. The registry is initially 5239 populated using the values in Table 5 (Section 10). CCIDs 2 and 3 5240 are allocated by concurrently published profiles, and CCIDs 248-254 5241 are permanently reserved for experimental and testing use. CCIDs 0, 5242 1, 4-247, and 255 are currently reserved, and should be allocated 5243 with the IETF Consensus policy, requiring an IETF RFC publication 5244 (standards-track or not) with IESG review and approval. 5246 19.6. 
Ack Vector States Registry 5248 Each entry in the DCCP Ack Vector States registry contains an Ack 5249 Vector State, which is a number in the range 0-3; the name of the 5250 State, such as "Received ECN Marked"; and a reference to the RFC 5251 defining the State. The registry is initially populated using the 5252 values in Table 6 (Section 11.4). This document allocates States 0, 5253 1, and 3. State 2 is currently reserved, and should be allocated 5254 with the Standards Action policy, which requires IESG review and 5255 approval and standards-track IETF RFC publication. 5257 19.7. Drop Codes Registry 5259 Each entry in the DCCP Drop Codes registry contains a Data Dropped 5260 Drop Code, which is a number in the range 0-7; the name of the Drop 5261 Code, such as "Application Not Listening"; and a reference to the 5262 RFC defining the Drop Code. The registry is initially populated 5263 using the values in Table 7 (Section 11.7). This document allocates 5264 Drop Codes 0-3 and 7. Drop Codes 4-6 are currently reserved, and 5265 should be allocated with the Standards Action policy, which requires 5266 IESG review and approval and standards-track IETF RFC publication. 5268 19.8. Service Codes Registry 5270 Each entry in the Service Codes registry contains a Service Code, 5271 which is a number in the range 0-4294967294; a short English 5272 description of the intended service; and an optional reference to an 5273 RFC or other publicly available specification defining the Service 5274 Code. The registry should list the Service Code's numeric value as 5275 a decimal number, but when each byte of the four-byte Service Code 5276 is in the range 32-127, the registry should also show a four- 5277 character ASCII interpretation of the Service Code. Thus, the 5278 number 1717858426 would additionally appear as "fdpz". Service 5279 Codes are not DCCP-specific. 
Service Code 0 is permanently reserved 5280 (it represents the absence of a meaningful Service Code), and 5281 Service Codes 1056964608-1073741823 (high byte ASCII "?") are 5282 reserved for Private Use. Note that 4294967295 is not a valid 5283 Service Code. Most of the remaining Service Codes are allocated 5284 First Come First Served, with no RFC publication required; 5285 exceptions are listed in Section 8.1.2. This document allocates a 5286 single Service Code, 1145656131 ("DISC"). This corresponds to the 5287 discard service, which discards all data sent to the service and 5288 sends no data in reply. 5290 19.9. Port Numbers Registry 5292 DCCP services may use contact port numbers to provide service to 5293 unknown callers, as in TCP and UDP. IANA is therefore requested to 5294 open the existing Port Numbers registry for DCCP using the following 5295 rules, which we intend to mesh well with existing Port Numbers 5296 registration procedures. 5298 Port numbers are divided into three ranges. The Well Known Ports 5299 are those from 0 through 1023, the Registered Ports are those from 5300 1024 through 49151, and the Dynamic and/or Private Ports are those 5301 from 49152 through 65535. Well Known and Registered Ports are 5302 intended for use by server applications that desire a default 5303 contact point on a system. On most systems, Well Known Ports can 5304 only be used by system (or root) processes or by programs executed 5305 by privileged users, while Registered Ports can be used by ordinary 5306 user processes or programs executed by ordinary users. Dynamic 5307 and/or Private Ports are intended for temporary use, including 5308 client-side ports, out-of-band negotiated ports, and application 5309 testing prior to registration of a dedicated port; they MUST NOT be 5310 registered. 5312 The Port Numbers registry should accept registrations for DCCP ports 5313 in the Well Known Ports and Registered Ports ranges. 
   Well Known and Registered Ports SHOULD NOT be used without
   registration. Although in some cases -- such as porting an
   application from UDP to DCCP -- it may seem natural to use a DCCP
   port before registration completes, we emphasize that IANA will not
   guarantee registration of particular Well Known and Registered
   Ports. Registrations should be requested as early as possible.

   Each port registration SHALL include the following information:

   o A short port name, consisting entirely of letters (A-Z and a-z),
     digits (0-9), and punctuation characters from "-_+./*" (not
     including the quotes).

   o The port number that is requested to be registered.

   o A short English phrase describing the port's purpose. This MUST
     include one or more space-separated textual Service Code
     descriptors naming the port's corresponding Service Codes (see
     Section 8.1.2).

   o Name and contact information for the person or entity performing
     the registration, and possibly a reference to a document defining
     the port's use. Registrations coming from IETF working groups
     need only name the working group, but it is also recommended to
     indicate a contact person.

   Registrants are encouraged to follow these guidelines when
   submitting a registration. The guidelines may be violated at IANA's
   discretion.

   o A port name SHOULD NOT be registered for more than one DCCP port
     number.

   o A port name registered for UDP MAY be registered for DCCP as
     well. Any such registration SHOULD use the same port number as
     the existing UDP registration.

   o Concrete intent to use a port SHOULD precede port registration.
     For example, existing UDP ports SHOULD NOT be registered in
     advance of any intent to use those ports for DCCP.

   o A port name generally associated with TCP and/or SCTP SHOULD NOT
     be registered for DCCP, since that port name implies reliable
     transport.
     For example, we discourage registration of any "http"
     port for DCCP. However, if such a registration makes sense (that
     is, if there is concrete intent to use such a port), the DCCP
     registration SHOULD use the same port number as the existing
     registration.

   o Multiple DCCP registrations for the same port number are allowed
     as long as the registrations' Service Codes do not overlap.

   This document registers the following port. (This should be
   considered a model registration.)

     discard    9/dccp    Discard SC:DISC
     # IETF dccp WG, Eddie Kohler , DCCP RFC

   The discard service, which accepts DCCP connections on port 9,
   discards all incoming application data and sends no data in
   response. Thus, DCCP's discard port is analogous to TCP's discard
   port, and might be used to check the health of a DCCP stack.

20. Thanks

   Thanks to Jitendra Padhye for his help with early versions of this
   specification.

   Thanks to Junwen Lai and Arun Venkataramani, who, as interns at
   ICIR, built a prototype DCCP implementation. In particular, Junwen
   Lai recommended that the old feature negotiation mechanism be
   scrapped and co-designed the current mechanism. Arun
   Venkataramani's feedback improved Appendix A.

   We thank the staff and interns of ICIR and, formerly, ACIRI, the
   members of the End-to-End Research Group, and the members of the
   Transport Area Working Group for their feedback on DCCP. We
   especially thank the DCCP expert reviewers: Greg Minshall, Eric
   Rescorla, and Magnus Westerlund for detailed written comments and
   problem spotting, and Rob Austein and Steve Bellovin for verbal
   comments and written notes.

   We also thank those who provided comments and suggestions via the
   DCCP BOF, Working Group, and mailing lists, including Damon
   Lanphear, Patrick McManus, Colin Perkins, Sara Karlberg, Kevin Lai,
   Bernard Aboba, Youngsoo Choi, Pengfei Di, Dan Duchamp, Gorry
   Fairhurst, Derek Fawcus, David Timothy Fleeman, John Loughney,
   Ghyslain Pelletier, Tom Phelan, Stanislav Shalunov, Somsak
   Vanit-Anunchai, David Vos, Yufei Wang, and Michael Welzl. In
   particular, Colin Perkins provided extensive, detailed feedback,
   Michael Welzl suggested the Data Checksum option, Gorry Fairhurst
   provided extensive feedback on various checksum issues, and Somsak
   Vanit-Anunchai et al.'s Colored Petri Net model discovered a problem
   with message exchange.

A. Appendix: Ack Vector Implementation Notes

   This appendix discusses particulars of DCCP acknowledgement
   handling, in the context of an abstract implementation for Ack
   Vector. It is informative rather than normative.

   The first part of our implementation runs at the HC-Receiver, and
   therefore acknowledges data packets. It generates Ack Vector
   options. The implementation has the following characteristics:

   o At most one byte of state per acknowledged packet.

   o O(1) time to update that state when a new packet arrives (normal
     case).

   o Cumulative acknowledgements.

   o Quick removal of old state.

   The basic data structure is a circular buffer containing information
   about acknowledged packets. Each byte in this buffer contains a
   state and run length; the state can be 0 (packet received), 1
   (packet ECN marked), or 3 (packet not yet received). The buffer
   grows from right to left. The implementation maintains four
   variables, aside from the buffer contents:

   o "buf_head" and "buf_tail", which mark the live portion of the
     buffer.

   o "buf_ackno", the Acknowledgement Number of the most recent packet
     acknowledged in the buffer. This corresponds to the "head"
     pointer.

   o "buf_nonce", the one-bit sum (exclusive-or, or parity) of the ECN
     Nonces received on all packets acknowledged by the buffer with
     State 0.

   We draw acknowledgement buffers like this:

   +---------------------------------------------------------------+
   |S,L|S,L|S,L|S,L|   |   |   |   |S,L|S,L|S,L|S,L|S,L|S,L|S,L|S,L|
   +---------------------------------------------------------------+
                 ^                   ^
              buf_tail     buf_head, buf_ackno = A     buf_nonce = E

             <=== buf_head and buf_tail move this way <===

   Each "S,L" represents a State/Run length byte. We will draw these
   buffers showing only their live portion, and will add an annotation
   showing the Acknowledgement Number for the last live byte in the
   buffer. For example:

        +-----------------------------------------------+
      A |S,L|S,L|S,L|S,L|S,L|S,L|S,L|S,L|S,L|S,L|S,L|S,L| T  BN[E]
        +-----------------------------------------------+

   Here, buf_nonce equals E and buf_ackno equals A.

   We will use this buffer as a running example.

         +---------------------------+
      10 |0,0|3,0|3,0|3,0|0,4|1,0|0,0| 0  BN[1]  [Example Buffer]
         +---------------------------+

   In concrete terms, its meaning is as follows:

      Packet 10 was received. (The head of the buffer has sequence
      number 10, state 0, and run length 0.)

      Packets 9, 8, and 7 have not yet been received. (The three
      bytes preceding the head each have state 3 and run length 0.)

      Packets 6, 5, 4, 3, and 2 were received.

      Packet 1 was ECN marked.

      Packet 0 was received.

      The one-bit sum of the ECN Nonces on packets 10, 6, 5, 4, 3, 2,
      and 0 equals 1.

   Additionally, the HC-Receiver must keep some information about the
   Ack Vectors it has recently sent.
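   The reading of the Example Buffer given above can be reproduced with
   a short sketch. It assumes the live portion of the buffer is held as
   a list of (state, run length) pairs ordered from the head, and it
   ignores sequence number wraparound; the names expand and
   EXAMPLE_BUFFER are hypothetical.

```python
from typing import Iterator, List, Tuple

# Hypothetical representation: (state, run length) pairs, head first.
# A run length of L covers L + 1 consecutive packets.
EXAMPLE_BUFFER: List[Tuple[int, int]] = [
    (0, 0), (3, 0), (3, 0), (3, 0), (0, 4), (1, 0), (0, 0),
]


def expand(buf_ackno: int,
           cells: List[Tuple[int, int]]) -> Iterator[Tuple[int, int]]:
    """Yield (sequence number, state) pairs, newest packet first.

    Simplification: plain integer arithmetic, no sequence wraparound.
    """
    seqno = buf_ackno
    for state, run_length in cells:
        for _ in range(run_length + 1):
            yield seqno, state
            seqno -= 1


# Per-packet view of the Example Buffer, whose head is packet 10.
packets = dict(expand(10, EXAMPLE_BUFFER))

if __name__ == "__main__":
    names = {0: "received", 1: "ECN marked", 3: "not yet received"}
    for seqno in sorted(packets, reverse=True):
        print(seqno, names[packets[seqno]])
```

   Expanding the buffer this way is useful mainly for checking an
   implementation; the run-length form is what keeps per-packet state
   to at most one byte.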
   For each packet sent carrying an
   Ack Vector, it remembers four variables:

   o "ack_seqno", the Sequence Number used for the packet. This is an
     HC-Receiver sequence number.

   o "ack_ptr", the value of buf_head at the time of acknowledgement.

   o "ack_ackno", the Acknowledgement Number used for the packet.
     This is an HC-Sender sequence number. Since acknowledgements are
     cumulative, this single number completely specifies all necessary
     information about the packets acknowledged by this Ack Vector.

   o "ack_nonce", the one-bit sum of the ECN Nonces for all State 0
     packets in the buffer from buf_head to ack_ackno, inclusive.
     Initially, this equals the Nonce Echo of the acknowledgement's
     Ack Vector (or, if the ack packet contained more than one Ack
     Vector, the exclusive-or of the Nonce Echoes of all the
     acknowledgement's Ack Vectors). It changes as information about
     old acknowledgements is removed (so ack_ptr and buf_head diverge),
     and as old packets arrive (so they change from State 3 or State 1
     to State 0).

A.1. Packet Arrival

   This section describes how the HC-Receiver updates its
   acknowledgement buffer as packets arrive from the HC-Sender.

A.1.1. New Packets

   When a packet with Sequence Number greater than buf_ackno arrives,
   the HC-Receiver updates buf_head (by moving it to the left
   appropriately), buf_ackno (which is set to the new packet's Sequence
   Number), and possibly buf_nonce (if the packet arrived unmarked with
   ECN Nonce 1), in addition to the buffer itself.
   For example, if
   HC-Sender packet 11 arrived ECN marked, the Example Buffer above
   would enter this new state (changes are marked with stars):

      ** +***----------------------------+
      11 |1,0|0,0|3,0|3,0|3,0|0,4|1,0|0,0| 0  BN[1]
      ** +***----------------------------+

   If the packet's state equals the state at the head of the buffer,
   the HC-Receiver may choose to increment its run length (up to the
   maximum). For example, if HC-Sender packet 11 arrived without ECN
   marking and with ECN Nonce 0, the Example Buffer might enter this
   state instead:

      ** +--*------------------------+
      11 |0,1|3,0|3,0|3,0|0,4|1,0|0,0| 0  BN[1]
      ** +--*------------------------+

   Of course, the new packet's sequence number might not equal the
   expected sequence number. In this case, the HC-Receiver will enter
   the intervening packets as State 3. If several packets are missing,
   the HC-Receiver may prefer to enter multiple bytes with run length
   0, rather than a single byte with a larger run length; this
   simplifies table updates if one of the missing packets arrives. For
   example, if HC-Sender packet 12 arrived with ECN Nonce 1, the
   Example Buffer would enter this state:

      ** +*******----------------------------+       *
      12 |0,0|3,0|0,0|3,0|3,0|3,0|0,4|1,0|0,0| 0  BN[0]
      ** +*******----------------------------+       *

   Of course, the circular buffer may overflow, either when the
   HC-Sender is sending data at a very high rate, when the
   HC-Receiver's acknowledgements are not reaching the HC-Sender, or
   when the HC-Sender is forgetting to acknowledge those acks (so the
   HC-Receiver is unable to clean up old state).
   In this case, the HC-Receiver
   should either compress the buffer (by increasing run lengths when
   possible), transfer its state to a larger buffer, or, as a last
   resort, drop all received packets, without processing them
   whatsoever, until its buffer shrinks again.

A.1.2. Old Packets

   When a packet with Sequence Number S arrives, and S <= buf_ackno,
   the HC-Receiver will scan the table for the byte corresponding to S.
   (Indexing structures could reduce the complexity of this scan.) If
   S was previously lost (State 3), and it was stored in a byte with
   run length 0, the HC-Receiver can simply change the byte's state.
   For example, if HC-Sender packet 8 was received with ECN Nonce 0,
   the Example Buffer would enter this state:

         +--------*------------------+
      10 |0,0|3,0|0,0|3,0|0,4|1,0|0,0| 0  BN[1]
         +--------*------------------+

   If S was not marked as lost, or if it was not contained in the
   table, the packet is probably a duplicate, and should be ignored.
   (The new packet's ECN marking state might differ from the state in
   the buffer; Section 11.4.1 describes what is allowed then.) If S's
   buffer byte has a non-zero run length, then the buffer might need to
   be reshuffled to make space for one or two new bytes.

   The ack_nonce fields may also need manipulation when old packets
   arrive. In particular, when S transitions from State 3 or State 1
   to State 0, and S had ECN Nonce 1, then the implementation should
   flip the value of ack_nonce for every acknowledgement with ack_ackno
   >= S.

   It is impossible with this data structure to shift packets from
   State 0 to State 1, since the buffer doesn't store individual
   packets' ECN Nonces.

A.2. Sending Acknowledgements

   Whenever the HC-Receiver needs to generate an acknowledgement, the
   buffer's contents can simply be copied into one or more Ack Vector
   options.
   Copied Ack Vectors might not be maximally compressed; for
   example, the Example Buffer above contains three adjacent 3,0 bytes
   that could be combined into a single 3,2 byte. The HC-Receiver
   might, therefore, choose to compress the buffer in place before
   sending the option, or to compress the buffer while copying it;
   either operation is simple.

   Every acknowledgement sent by the HC-Receiver SHOULD include the
   entire state of the buffer. That is, acknowledgements are
   cumulative.

   If the acknowledgement fits in one Ack Vector, that Ack Vector's
   Nonce Echo simply equals buf_nonce. For multiple Ack Vectors, more
   care is required. The Ack Vectors should be split at points
   corresponding to previous acknowledgements, since the stored
   ack_nonce fields provide enough information to calculate correct
   Nonce Echoes. The implementation should therefore acknowledge data
   at least once per 253 bytes of buffer state. (Otherwise, there'd be
   no way to calculate a Nonce Echo.)

   For each acknowledgement it sends, the HC-Receiver will add an
   acknowledgement record. ack_seqno will equal the HC-Receiver
   sequence number it used for the ack packet; ack_ptr will equal
   buf_head; ack_ackno will equal buf_ackno; and ack_nonce will equal
   buf_nonce.

A.3. Clearing State

   Some of the HC-Sender's packets will include acknowledgement
   numbers, which ack the HC-Receiver's acknowledgements. When such an
   ack is received, the HC-Receiver finds the acknowledgement record R
   with the appropriate ack_seqno, then:

   o Sets buf_tail to R.ack_ptr + 1.

   o If R.ack_nonce is 1, it flips buf_nonce, and the value of
     ack_nonce for every later ack record.

   o Throws away R and every preceding ack record.

   (The HC-Receiver may choose to keep some older information, in case
   a lost packet shows up late.)
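   The three steps above can be sketched as follows. This is a
   simplified model rather than the circular-buffer implementation
   described in this appendix: it keeps the live bytes in a list
   ordered from the head, trims by sequence number arithmetic instead
   of using ack_ptr, assumes records are stored oldest first, and
   ignores sequence number wraparound. All class and function names
   are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class AckRecord:
    ack_seqno: int   # HC-Receiver sequence number of the ack packet
    ack_ackno: int   # newest HC-Sender packet that ack covered
    ack_nonce: int   # one-bit nonce sum stored with the record


@dataclass
class AckBuffer:
    buf_ackno: int                # sequence number at the head
    cells: List[Tuple[int, int]]  # (state, run length), head first
    buf_nonce: int


def clear_state(buf: AckBuffer, records: List[AckRecord],
                acked_seqno: int) -> List[AckRecord]:
    """Free state covered by the record whose ack_seqno was acked.

    Returns the records that remain (assumed sorted oldest first).
    """
    idx = next((i for i, r in enumerate(records)
                if r.ack_seqno == acked_seqno), None)
    if idx is None:
        return records            # unknown ack: nothing to free
    rec = records[idx]
    # Stand-in for "buf_tail = R.ack_ptr + 1": keep only the packets
    # newer than rec.ack_ackno, shortening the tail byte if needed.
    keep = buf.buf_ackno - rec.ack_ackno
    trimmed: List[Tuple[int, int]] = []
    for state, run_length in buf.cells:
        if keep <= 0:
            break
        take = min(run_length + 1, keep)
        trimmed.append((state, take - 1))
        keep -= take
    buf.cells = trimmed
    later = records[idx + 1:]
    if rec.ack_nonce == 1:
        buf.buf_nonce ^= 1        # flip buf_nonce ...
        for r in later:
            r.ack_nonce ^= 1      # ... and every later ack_nonce
    return later                  # R and older records are dropped
```

   Trimming by sequence number is a substitution made for brevity; an
   implementation that follows the appendix would move buf_tail using
   the stored ack_ptr instead.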
   For example, say that the HC-Receiver
   storing the Example Buffer had sent two acknowledgements already:

   1. ack_seqno = 59, ack_ackno = 3, ack_nonce = 1.

   2. ack_seqno = 60, ack_ackno = 10, ack_nonce = 0.

   Say the HC-Receiver then received a DCCP-DataAck packet with
   Acknowledgement Number 59 from the HC-Sender. This informs the
   HC-Receiver that the HC-Sender received, and processed, all the
   information in HC-Receiver packet 59. This packet acknowledged
   HC-Sender packet 3, so the HC-Sender has now received HC-Receiver's
   acknowledgements for packets 0, 1, 2, and 3. The Example Buffer
   should enter this state:

         +------------------*+ *       *
      10 |0,0|3,0|3,0|3,0|0,2| 4  BN[0]
         +------------------*+ *       *

   The tail byte's run length was adjusted, since packet 3 was in the
   middle of that byte. Since R.ack_nonce was 1, the buf_nonce field
   was flipped, as were the ack_nonce fields for later acknowledgements
   (here, the HC-Receiver Ack 60 record, not shown, has its ack_nonce
   flipped to 1). The HC-Receiver can also throw away stored
   information about HC-Receiver Ack 59 and any earlier
   acknowledgements.

   A careful implementation might try to ensure reasonable robustness
   to reordering. Suppose that the Example Buffer is as before, but
   that packet 9 now arrives, out of sequence. The buffer would enter
   this state:

         +----*----------------------+
      10 |0,0|0,0|3,0|3,0|0,4|1,0|0,0| 0  BN[1]
         +----*----------------------+

   The danger is that the HC-Sender might acknowledge the HC-Receiver's
   previous acknowledgement (with sequence number 60), which says that
   Packet 9 was not received, before the HC-Receiver has a chance to
   send a new acknowledgement saying that Packet 9 actually was
   received. Therefore, when packet 9 arrived, the HC-Receiver might
   modify its acknowledgement record to:

   1. ack_seqno = 59, ack_ackno = 3, ack_nonce = 1.

   2. ack_seqno = 60, ack_ackno = 3, ack_nonce = 1.

   That is, Ack 60 is now treated like a duplicate of Ack 59. This
   would prevent the Tail pointer from moving past packet 9 until the
   HC-Receiver knows that the HC-Sender has seen an Ack Vector
   indicating that packet's arrival.

A.4. Processing Acknowledgements

   When the HC-Sender receives an acknowledgement, it generally cares
   about the number of packets that were dropped and/or ECN marked. It
   simply reads this off the Ack Vector. Additionally, it should check
   the ECN Nonce for correctness. (As described in Section 11.4.1, it
   may want to keep more detailed information about acknowledged
   packets in case packets change states between acknowledgements, or
   in case the application queries whether a packet arrived.)

   The HC-Sender must also acknowledge the HC-Receiver's
   acknowledgements so that the HC-Receiver can free old Ack Vector
   state. (Since Ack Vector acknowledgements are reliable, the
   HC-Receiver must maintain and resend Ack Vector information until it
   is sure that the HC-Sender has received that information.) A simple
   algorithm suffices: since Ack Vector acknowledgements are
   cumulative, a single acknowledgement number tells HC-Receiver how
   much ack information has arrived. Assuming that the HC-Receiver
   sends no data, the HC-Sender can ensure that at least once a
   round-trip time, it sends a DCCP-DataAck packet acknowledging the
   latest DCCP-Ack packet it has received. Of course, the HC-Sender
   only needs to acknowledge the HC-Receiver's acknowledgements if the
   HC-Sender is also sending data. If the HC-Sender is not sending
   data, then the HC-Receiver's Ack Vector state is stable, and there
   is no need to shrink it.
   The HC-Sender must watch for drops and ECN marks
   on received DCCP-Ack packets so that it can adjust the HC-Receiver's
   ack-sending rate -- for example, with Ack Ratio -- in response to
   congestion.

   If the other half-connection is not quiescent -- that is, the
   HC-Receiver is sending data to the HC-Sender, possibly using another
   CCID -- then the acknowledgements on that half-connection are
   sufficient for the HC-Receiver to free its state.

B. Appendix: Partial Checksumming Design Motivation

   A great deal of discussion has taken place regarding the utility of
   allowing a DCCP sender to restrict the checksum so that it does not
   cover the complete packet. This section attempts to capture some of
   the rationale behind specific details of DCCP design.

   Many of the applications that we envisage using DCCP are resilient
   to some degree of data loss, or they would typically have chosen a
   reliable transport. Some of these applications may also be
   resilient to data corruption -- some audio payloads, for example.
   These resilient applications might prefer to receive corrupted data
   rather than have DCCP drop a corrupted packet. This is particularly
   because of congestion control: DCCP cannot tell the difference
   between packets dropped due to corruption and packets dropped due to
   congestion, and so it must reduce the transmission rate accordingly.
   This response may cause the connection to receive less bandwidth
   than it is due; corruption in some networking technologies is
   independent of, or at least not always correlated to, congestion.
   Therefore, corrupted packets do not need to cause as strong a
   reduction in transmission rate as the congestion response would
   dictate (so long as the DCCP header and options are not corrupt).

   Thus DCCP allows the checksum to cover all of the packet, just the
   DCCP header, or both the DCCP header and some number of bytes from
   the application data. If the application cannot tolerate any data
   corruption, then the checksum must cover the whole packet. If the
   application would prefer to tolerate some corruption rather than
   have the packet dropped, then it can set the checksum to cover only
   part of the packet (but always the DCCP header). In addition, if
   the application wishes to decouple checksumming of the DCCP header
   from checksumming of the application data, it may do so by including
   the Data Checksum option. This would allow DCCP to discard
   corrupted application data, but still not mistake the corruption for
   network congestion.

   Thus, from the application point of view, partial checksums seem to
   be a desirable feature. However, the usefulness of partial
   checksums depends on partially corrupted packets being delivered to
   the receiver. If the link-layer CRC always discards corrupted
   packets, then this will not happen, and so the usefulness of partial
   checksums would be restricted to corruption that occurred in routers
   and other places not covered by link CRCs. There does not appear to
   be consensus on how likely it is that future network links that
   suffer significant corruption will not cover the entire packet with
   a single strong CRC. DCCP makes it possible to tailor such links to
   the application, but it is difficult to predict if this will be
   compelling for future link technologies.

   In addition, partial checksums do not co-exist well with IP-level
   authentication mechanisms such as IPsec AH, which cover the entire
   packet with a cryptographic hash. Thus, if cryptographic
   authentication mechanisms are required to co-exist with partial
   checksums, the authentication must be carried in the application
   data.
   A possible mode of usage would appear to be similar to that
   of Secure RTP. However, such "application-level" authentication
   does not protect the DCCP option negotiation and state machine from
   forged packets. An alternative would be to use IPsec ESP, and use
   encryption to protect the DCCP headers against attack, while using
   the DCCP header validity checks to authenticate that the header is
   from someone who possessed the correct key. However, while this is
   resistant to replay (due to the DCCP sequence number), it is not by
   itself resistant to some forms of man-in-the-middle attacks because
   the application data is not tightly coupled to the packet header.
   Thus an application-level authentication probably needs to be
   coupled with IPsec ESP or a similar mechanism to provide a
   reasonably complete security solution. The overhead of such a
   solution might be unacceptable for some applications that would
   otherwise wish to use partial checksums.

   On balance, the authors believe that DCCP partial checksums have the
   potential to enable some future uses that would otherwise be
   difficult. As the cost and complexity of supporting them is small,
   it seems worth including them at this time. It remains to be seen
   whether they are useful in practice.

Normative References

   [RFC 793] J. Postel, editor. Transmission Control Protocol.
       RFC 793.

   [RFC 1191] J. C. Mogul and S. E. Deering. Path MTU Discovery.
       RFC 1191.

   [RFC 2119] S. Bradner. Key Words For Use in RFCs to Indicate
       Requirement Levels. RFC 2119.

   [RFC 2434] T. Narten and H. Alvestrand. Guidelines for Writing an
       IANA Considerations Section in RFCs. RFC 2434.

   [RFC 2460] S. Deering and R. Hinden. Internet Protocol, Version 6
       (IPv6) Specification. RFC 2460.

   [RFC 3168] K.K. Ramakrishnan, S. Floyd, and D. Black. The Addition
       of Explicit Congestion Notification (ECN) to IP. RFC 3168.

   [RFC 3309] J. Stone, R. Stewart, and D. Otis. Stream Control
       Transmission Protocol (SCTP) Checksum Change. RFC 3309.

   [RFC 3692] T. Narten. Assigning Experimental and Testing Numbers
       Considered Useful. RFC 3692.

   [RFC 3775] D. Johnson, C. Perkins, and J. Arkko. Mobility Support
       in IPv6. RFC 3775.

   [RFC 3828] L-A. Larzon, M. Degermark, S. Pink, L-E. Jonsson, editor,
       and G. Fairhurst, editor. The Lightweight User Datagram Protocol
       (UDP-Lite). RFC 3828.

Informative References

   [BB01] S.M. Bellovin and M. Blaze. Cryptographic Modes of Operation
       for the Internet. 2nd NIST Workshop on Modes of Operation,
       August 2001.

   [BEL98] S.M. Bellovin. Cryptography and the Internet. Proc. CRYPTO
       '98 (LNCS 1462), pp46-55, August 1998.

   [CCID 2 PROFILE] S. Floyd and E. Kohler. Profile for DCCP
       Congestion Control ID 2: TCP-like Congestion Control.
       draft-ietf-dccp-ccid2-10.txt, work in progress, March 2005.

   [CCID 3 PROFILE] S. Floyd, E. Kohler, and J. Padhye. Profile for
       DCCP Congestion Control ID 3: TFRC Congestion Control.
       draft-ietf-dccp-ccid3-11.txt, work in progress, March 2005.

   [M85] Robert T. Morris. A Weakness in the 4.2BSD Unix TCP/IP
       Software. Computer Science Technical Report 117, AT&T Bell
       Laboratories, Murray Hill, NJ, February 1985.

   [PMTUD] Matt Mathis, John Heffner, and Kevin Lahey. Path MTU
       Discovery. draft-ietf-pmtud-method-01.txt, work in progress,
       February 2004.

   [RFC 792] J. Postel, editor. Internet Control Message Protocol.
       RFC 792.

   [RFC 1750] D. Eastlake, S. Crocker, and J. Schiller. Randomness
       Recommendations for Security. RFC 1750.

   [RFC 1812] F. Baker, editor. Requirements for IP Version 4 Routers.
       RFC 1812.

   [RFC 1948] S. Bellovin. Defending Against Sequence Number Attacks.
       RFC 1948.

   [RFC 1982] R. Elz and R. Bush. Serial Number Arithmetic. RFC 1982.

   [RFC 2018] M. Mathis, J. Mahdavi, S.
       Floyd, and A. Romanow. TCP
       Selective Acknowledgement Options. RFC 2018.

   [RFC 2401] S. Kent and R. Atkinson. Security Architecture for the
       Internet Protocol. RFC 2401.

   [RFC 2463] A. Conta and S. Deering. Internet Control Message
       Protocol (ICMPv6) for the Internet Protocol Version 6 (IPv6)
       Specification. RFC 2463.

   [RFC 2581] M. Allman, V. Paxson, and W. Stevens. TCP Congestion
       Control. RFC 2581.

   [RFC 2960] R. Stewart, Q. Xie, K. Morneault, C. Sharp, H.
       Schwarzbauer, T. Taylor, I. Rytina, M. Kalla, L. Zhang, and V.
       Paxson. Stream Control Transmission Protocol. RFC 2960.

   [RFC 3124] H. Balakrishnan and S. Seshan. The Congestion Manager.
       RFC 3124.

   [RFC 3360] S. Floyd. Inappropriate TCP Resets Considered Harmful.
       RFC 3360.

   [RFC 3448] M. Handley, S. Floyd, J. Padhye, and J. Widmer. TCP
       Friendly Rate Control (TFRC): Protocol Specification. RFC 3448.

   [RFC 3540] N. Spring, D. Wetherall, and D. Ely. Robust Explicit
       Congestion Notification (ECN) Signaling with Nonces. RFC 3540.

   [RFC 3550] H. Schulzrinne, S. Casner, R. Frederick, and V. Jacobson.
       RTP: A Transport Protocol for Real-Time Applications. STD 64.
       RFC 3550.

   [RFC 3611] T. Friedman, R. Caceres, and A. Clark, editors. RTP
       Control Protocol Extended Reports (RTCP XR). RFC 3611.

   [RFC 3711] M. Baugher, D. McGrew, M. Naslund, E. Carrara, and K.
       Norrman. The Secure Real-time Transport Protocol (SRTP).
       RFC 3711.

   [RFC 3819] P. Karn, editor, C. Bormann, G. Fairhurst, D. Grossman,
       R. Ludwig, J. Mahdavi, G. Montenegro, J. Touch, and L. Wood.
       Advice for Internet Subnetwork Designers. RFC 3819.

   [SHHP00] Oliver Spatscheck, Jorgen S. Hansen, John H. Hartman, and
       Larry L. Peterson. Optimizing TCP Forwarder Performance.
       IEEE/ACM Transactions on Networking 8(2):146-157, April 2000.

   [SYNCOOKIES] Daniel J. Bernstein. SYN Cookies.
       http://cr.yp.to/syncookies.html, as of July 2003.

Authors' Addresses

   Eddie Kohler
   4531C Boelter Hall
   UCLA Computer Science Department
   Los Angeles, CA 90095
   USA

   Mark Handley
   Department of Computer Science
   University College London
   Gower Street
   London WC1E 6BT
   UK

   Sally Floyd
   ICSI Center for Internet Research
   1947 Center Street, Suite 600
   Berkeley, CA 94704
   USA

Full Copyright Statement

   Copyright (C) The Internet Society (2005). This document is subject
   to the rights, licenses and restrictions contained in BCP 78, and
   except as set forth therein, the authors retain all their rights.

   This document and the information contained herein are provided on
   an "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE
   REPRESENTS OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE
   INTERNET ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR
   IMPLIED, INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF
   THE INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED
   WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Intellectual Property

   The IETF takes no position regarding the validity or scope of any
   Intellectual Property Rights or other rights that might be claimed
   to pertain to the implementation or use of the technology described
   in this document or the extent to which any license under such
   rights might or might not be available; nor does it represent that
   it has made any independent effort to identify any such rights.
   Information on the procedures with respect to rights in RFC
   documents can be found in BCP 78 and BCP 79.

   Copies of IPR disclosures made to the IETF Secretariat and any
   assurances of licenses to be made available, or the result of an
   attempt made to obtain a general license or permission for the use
   of such proprietary rights by implementers or users of this
   specification can be obtained from the IETF on-line IPR repository
   at http://www.ietf.org/ipr.

   The IETF invites any interested party to bring to its attention any
   copyrights, patents or patent applications, or other proprietary
   rights that may cover technology that may be required to implement
   this standard. Please address the information to the IETF at
   ietf-ipr@ietf.org.