idnits 2.17.1

draft-ietf-tsvwg-rfc4960-bis-01.txt:

  Checking boilerplate required by RFC 5378 and the IETF Trust (see
  https://trustee.ietf.org/license-info):
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt:
  ----------------------------------------------------------------------------

     No issues found here.

  Checking nits according to https://www.ietf.org/id-info/checklist :
  ----------------------------------------------------------------------------

     No issues found here.

  Miscellaneous warnings:
  ----------------------------------------------------------------------------

  == The copyright year in the IETF Trust and authors Copyright Line does not
     match the current year

  == Line 6518 has weird spacing: '...ed long crc_c...'

  == Line 6645 has weird spacing: '...ed long crc_c...'

  == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD',
     or 'RECOMMENDED' is not an accepted usage according to RFC 2119. Please
     use uppercase 'NOT' together with RFC 2119 keywords (if that is what you
     mean).

     Found 'MUST not' in this paragraph:

     o The initial cwnd before DATA transmission or after a sufficiently
       long idle period MUST be set to min(4*MTU, max (2*MTU, 4380 bytes)).

     o The initial cwnd after a retransmission timeout MUST be no more than
       1*MTU.

     o The initial value of ssthresh MAY be arbitrarily high (for example,
       implementations MAY use the size of the receiver advertised window).

     o Whenever cwnd is greater than zero, the endpoint is allowed to have
       cwnd bytes of data outstanding on that transport address.

     o When cwnd is less than or equal to ssthresh, an SCTP endpoint MUST
       use the slow-start algorithm to increase cwnd only if the current
       congestion window is being fully utilized, an incoming SACK advances
       the Cumulative TSN Ack Point, and the data sender is not in Fast
       Recovery. Only when these three conditions are met can the cwnd be
       increased; otherwise, the cwnd MUST not be increased. If these
       conditions are met, then cwnd MUST be increased by, at most, the
       lesser of 1) the total size of the previously outstanding DATA
       chunk(s) acknowledged, and 2) the destination's path MTU. This upper
       bound protects against the ACK-Splitting attack outlined in
       [SAVAGE99].

  == The document seems to contain a disclaimer for pre-RFC5378 work, but was
     first submitted on or after 10 November 2008. The disclaimer is usually
     necessary only for documents that revise or obsolete older RFCs, and
     that take significant amounts of text from those RFCs. If you can
     contact all authors of the source material and they are willing to grant
     the BCP78 rights to the IETF Trust, you can and should remove the
     disclaimer. Otherwise, the disclaimer is needed and you can ignore this
     comment. (See the Legal Provisions document at
     https://trustee.ietf.org/license-info for more information.)

  -- The document date (March 11, 2019) is 1866 days in the past. Is this
     intentional?

  -- Found something which looks like a code comment -- if you have code
     sections in the document, please surround them with '<CODE BEGINS>' and
     '<CODE ENDS>' lines.

  Checking references for intended status: Proposed Standard
  ----------------------------------------------------------------------------

     (See RFCs 3967 and 4897 for information about using normative references
     to lower-maturity documents in RFCs)

  == Missing Reference: 'ASSOCIATE' is mentioned on line 2364, but not defined

  == Missing Reference: 'SHUTDOWN' is mentioned on line 2395, but not defined

  == Missing Reference: 'ABORT' is mentioned on line 2358, but not defined

  -- Looks like a reference, but probably isn't: '256' on line 6645

  == Unused Reference: 'RFC2960' is defined on line 6233, but no explicit
     reference was found in the text

  == Unused Reference: 'RFC3309' is defined on line 6244, but no explicit
     reference was found in the text

  -- Possible downref: Non-RFC (?) normative reference: ref. 'ITU.V42.1994'

  ** Obsolete normative reference: RFC 793 (Obsoleted by RFC 9293)

  ** Obsolete normative reference: RFC 1981 (Obsoleted by RFC 8201)

  ** Obsolete normative reference: RFC 2434 (Obsoleted by RFC 5226)

  ** Obsolete normative reference: RFC 2460 (Obsoleted by RFC 8200)

  ** Obsolete normative reference: RFC 2581 (Obsoleted by RFC 5681)

  ** Obsolete normative reference: RFC 4306 (Obsoleted by RFC 5996)

  -- Obsolete informational reference (is this intentional?): RFC 813
     (Obsoleted by RFC 7805)

  -- Obsolete informational reference (is this intentional?): RFC 2960
     (Obsoleted by RFC 4960)

  -- Obsolete informational reference (is this intentional?): RFC 3309
     (Obsoleted by RFC 4960)

  -- Obsolete informational reference (is this intentional?): RFC 4960
     (Obsoleted by RFC 9260)

     Summary: 6 errors (**), 0 flaws (~~), 10 warnings (==), 8 comments (--).

     Run idnits with the --verbose option for more detailed information about
     the items above.

--------------------------------------------------------------------------------

2  Network Working Group                                         R. Stewart
3  Internet-Draft                                             Netflix, Inc.
4  Obsoletes: 4960 (if approved)                                  M. Tuexen
5  Intended status: Standards Track        Muenster Univ. of Appl. Sciences
6  Expires: September 12, 2019                                   K. Nielsen
7                                                              Kamstrup A/S
8                                                            March 11, 2019

10                   Stream Control Transmission Protocol
11                     draft-ietf-tsvwg-rfc4960-bis-01

13  Abstract

15     This document obsoletes RFC 4960, if approved. It describes the
16     Stream Control Transmission Protocol (SCTP). SCTP is designed to
17     transport Public Switched Telephone Network (PSTN) signaling messages
18     over IP networks, but is capable of broader applications.

20     SCTP is a reliable transport protocol operating on top of a
21     connectionless packet network such as IP. It offers the following
22     services to its users:

24     o  acknowledged error-free non-duplicated transfer of user data,

26     o  data fragmentation to conform to discovered path MTU size,

28     o  sequenced delivery of user messages within multiple streams, with
29        an option for order-of-arrival delivery of individual user
30        messages,

32     o  optional bundling of multiple user messages into a single SCTP
33        packet, and

35     o  network-level fault tolerance through supporting of multi-homing
36        at either or both ends of an association.

38     The design of SCTP includes appropriate congestion avoidance behavior
39     and resistance to flooding and masquerade attacks.

41  Status of This Memo

43     This Internet-Draft is submitted in full conformance with the
44     provisions of BCP 78 and BCP 79.

46     Internet-Drafts are working documents of the Internet Engineering
47     Task Force (IETF). Note that other groups may also distribute
48     working documents as Internet-Drafts. The list of current Internet-
49     Drafts is at https://datatracker.ietf.org/drafts/current/.

51     Internet-Drafts are draft documents valid for a maximum of six months
52     and may be updated, replaced, or obsoleted by other documents at any
53     time. It is inappropriate to use Internet-Drafts as reference
54     material or to cite them other than as "work in progress."

56     This Internet-Draft will expire on September 12, 2019.

58  Copyright Notice

60     Copyright (c) 2019 IETF Trust and the persons identified as the
61     document authors. All rights reserved.

63     This document is subject to BCP 78 and the IETF Trust's Legal
64     Provisions Relating to IETF Documents
65     (https://trustee.ietf.org/license-info) in effect on the date of
66     publication of this document. Please review these documents
67     carefully, as they describe your rights and restrictions with respect
68     to this document.
       Code Components extracted from this document must
69     include Simplified BSD License text as described in Section 4.e of
70     the Trust Legal Provisions and are provided without warranty as
71     described in the Simplified BSD License.

73     This document may contain material from IETF Documents or IETF
74     Contributions published or made publicly available before November
75     10, 2008. The person(s) controlling the copyright in some of this
76     material may not have granted the IETF Trust the right to allow
77     modifications of such material outside the IETF Standards Process.
78     Without obtaining an adequate license from the person(s) controlling
79     the copyright in such materials, this document may not be modified
80     outside the IETF Standards Process, and derivative works of it may
81     not be created outside the IETF Standards Process, except to format
82     it for publication as an RFC or to translate it into languages other
83     than English.

85  Table of Contents

87     1.  Introduction
88       1.1.  Motivation
89       1.2.  Architectural View of SCTP
90       1.3.  Key Terms
91       1.4.  Abbreviations
92       1.5.  Functional View of SCTP
93         1.5.1.  Association Startup and Takedown
94         1.5.2.  Sequenced Delivery within Streams
95         1.5.3.  User Data Fragmentation
96         1.5.4.  Acknowledgement and Congestion Avoidance
97         1.5.5.  Chunk Bundling
98         1.5.6.  Packet Validation
99         1.5.7.  Path Management
100      1.6.  Serial Number Arithmetic
101      1.7.  Changes from RFC 4960
102    2.  Conventions
103    3.  SCTP Packet Format
104      3.1.  SCTP Common Header Field Descriptions
105      3.2.  Chunk Field Descriptions
106        3.2.1.  Optional/Variable-Length Parameter Format
107        3.2.2.  Reporting of Unrecognized Parameters
108      3.3.  SCTP Chunk Definitions
109        3.3.1.  Payload Data (DATA) (0)
110        3.3.2.  Initiation (INIT) (1)
111          3.3.2.1.  Optional/Variable-Length Parameters in INIT
112        3.3.3.  Initiation Acknowledgement (INIT ACK) (2)
113          3.3.3.1.  Optional or Variable-Length Parameters
114        3.3.4.  Selective Acknowledgement (SACK) (3)
115        3.3.5.  Heartbeat Request (HEARTBEAT) (4)
116        3.3.6.  Heartbeat Acknowledgement (HEARTBEAT ACK) (5)
117        3.3.7.  Abort Association (ABORT) (6)
118        3.3.8.  Shutdown Association (SHUTDOWN) (7)
119        3.3.9.  Shutdown Acknowledgement (SHUTDOWN ACK) (8)
120        3.3.10. Operation Error (ERROR) (9)
121          3.3.10.1.  Invalid Stream Identifier (1)
122          3.3.10.2.  Missing Mandatory Parameter (2)
123          3.3.10.3.  Stale Cookie Error (3)
124          3.3.10.4.  Out of Resource (4)
125          3.3.10.5.  Unresolvable Address (5)
126          3.3.10.6.  Unrecognized Chunk Type (6)
127          3.3.10.7.  Invalid Mandatory Parameter (7)
128          3.3.10.8.  Unrecognized Parameters (8)
129          3.3.10.9.  No User Data (9)
130          3.3.10.10. Cookie Received While Shutting Down (10)
131          3.3.10.11. Restart of an Association with New Addresses
132                     (11)
133          3.3.10.12. User-Initiated Abort (12)
134          3.3.10.13. Protocol Violation (13)
135        3.3.11. Cookie Echo (COOKIE ECHO) (10)
136        3.3.12. Cookie Acknowledgement (COOKIE ACK) (11)
137        3.3.13. Shutdown Complete (SHUTDOWN COMPLETE) (14)
138    4.  SCTP Association State Diagram
139    5.  Association Initialization
140      5.1.  Normal Establishment of an Association
141        5.1.1.  Handle Stream Parameters
142        5.1.2.  Handle Address Parameters
143        5.1.3.  Generating State Cookie
144        5.1.4.  State Cookie Processing
145        5.1.5.  State Cookie Authentication
146        5.1.6.  An Example of Normal Association Establishment
147      5.2.  Handle Duplicate or Unexpected INIT, INIT ACK, COOKIE
148            ECHO, and COOKIE ACK
149        5.2.1.  INIT Received in COOKIE-WAIT or COOKIE-ECHOED State
150                (Item B)
151        5.2.2.  Unexpected INIT in States Other than CLOSED, COOKIE-
152                ECHOED, COOKIE-WAIT, and SHUTDOWN-ACK-SENT
153        5.2.3.  Unexpected INIT ACK
154        5.2.4.  Handle a COOKIE ECHO when a TCB Exists
155          5.2.4.1.  An Example of a Association Restart
156        5.2.5.  Handle Duplicate COOKIE-ACK.
157        5.2.6.  Handle Stale COOKIE Error
158      5.3.  Other Initialization Issues
159        5.3.1.  Selection of Tag Value
160      5.4.  Path Verification
161    6.  User Data Transfer
162      6.1.  Transmission of DATA Chunks
163      6.2.  Acknowledgement on Reception of DATA Chunks
164        6.2.1.  Processing a Received SACK
165      6.3.  Management of Retransmission Timer
166        6.3.1.  RTO Calculation
167        6.3.2.  Retransmission Timer Rules
168        6.3.3.  Handle T3-rtx Expiration
169      6.4.  Multi-Homed SCTP Endpoints
170        6.4.1.  Failover from an Inactive Destination Address
171      6.5.  Stream Identifier and Stream Sequence Number
172      6.6.  Ordered and Unordered Delivery
173      6.7.  Report Gaps in Received DATA TSNs
174      6.8.  CRC32c Checksum Calculation
175      6.9.  Fragmentation and Reassembly
176      6.10. Bundling
177    7.  Congestion Control
178      7.1.  SCTP Differences from TCP Congestion Control
179      7.2.  SCTP Slow-Start and Congestion Avoidance
180        7.2.1.  Slow-Start
181        7.2.2.  Congestion Avoidance
182        7.2.3.  Congestion Control
183        7.2.4.  Fast Retransmit on Gap Reports
184      7.3.  Path MTU Discovery
185    8.  Fault Management
186      8.1.  Endpoint Failure Detection
187      8.2.  Path Failure Detection
188      8.3.  Path Heartbeat
189      8.4.  Handle "Out of the Blue" Packets
190      8.5.  Verification Tag
191        8.5.1.  Exceptions in Verification Tag Rules
193    9.  Termination of Association
194      9.1.  Abort of an Association
195      9.2.  Shutdown of an Association
196    10. Interface with Upper Layer
197      10.1.  ULP-to-SCTP
198      10.2.  SCTP-to-ULP
199    11. Security Considerations
200      11.1.  Security Objectives
201      11.2.  SCTP Responses to Potential Threats
202        11.2.1.  Countering Insider Attacks
203        11.2.2.  Protecting against Data Corruption in the Network
204        11.2.3.  Protecting Confidentiality
205        11.2.4.  Protecting against Blind Denial-of-Service Attacks
206          11.2.4.1.  Flooding
207          11.2.4.2.  Blind Masquerade
208          11.2.4.3.  Improper Monopolization of Services
209      11.3.  SCTP Interactions with Firewalls
210      11.4.  Protection of Non-SCTP-Capable Hosts
211    12. Network Management Considerations
212    13. Recommended Transmission Control Block (TCB) Parameters
213      13.1.  Parameters Necessary for the SCTP Instance
214      13.2.  Parameters Necessary per Association (i.e., the TCB)
215      13.3.  Per Transport Address Data
216      13.4.  General Parameters Needed
217    14. IANA Considerations
218      14.1.  IETF-Defined Chunk Extension
219      14.2.  IETF-Defined Chunk Parameter Extension
220      14.3.  IETF-Defined Additional Error Causes
221      14.4.  Payload Protocol Identifiers
222      14.5.  Port Numbers Registry
223    15. Suggested SCTP Protocol Parameter Values
224    16. Acknowledgements
225    17. References
226      17.1.  Normative References
227      17.2.  Informative References
228    Appendix A.  Explicit Congestion Notification
229    Appendix B.  CRC32c Checksum Calculation
230    Appendix C.  ICMP Handling
231    Authors' Addresses

233  1.  Introduction

235     This section explains the reasoning behind the development of the
236     Stream Control Transmission Protocol (SCTP), the services it offers,
237     and the basic concepts needed to understand the detailed description
238     of the protocol.

240     This document obsoletes [RFC4960], if approved.

242  1.1.  Motivation

244     TCP [RFC0793] has performed immense service as the primary means of
245     reliable data transfer in IP networks. However, an increasing number
246     of recent applications have found TCP too limiting, and have
247     incorporated their own reliable data transfer protocol on top of UDP
248     [RFC0768]. The limitations that users have wished to bypass include
249     the following:

251     o  TCP provides both reliable data transfer and strict order-of-
252        transmission delivery of data. Some applications need reliable
253        transfer without sequence maintenance, while others would be
254        satisfied with partial ordering of the data. In both of these
255        cases, the head-of-line blocking offered by TCP causes unnecessary
256        delay.

258     o  The stream-oriented nature of TCP is often an inconvenience.
259        Applications must add their own record marking to delineate their
260        messages, and must make explicit use of the push facility to
261        ensure that a complete message is transferred in a reasonable
262        time.

264     o  The limited scope of TCP sockets complicates the task of providing
265        highly-available data transfer capability using multi-homed hosts.

267     o  TCP is relatively vulnerable to denial-of-service attacks, such as
268        SYN attacks.

270     Transport of PSTN signaling across the IP network is an application
271     for which all of these limitations of TCP are relevant. While this
272     application directly motivated the development of SCTP, other
273     applications may find SCTP a good match to their requirements.

275  1.2.  Architectural View of SCTP

277     SCTP is viewed as a layer between the SCTP user application ("SCTP
278     user" for short) and a connectionless packet network service such as
279     IP. The remainder of this document assumes SCTP runs on top of IP.
280     The basic service offered by SCTP is the reliable transfer of user
281     messages between peer SCTP users. It performs this service within
282     the context of an association between two SCTP endpoints. Section 10
283     of this document sketches the API that should exist at the boundary
284     between the SCTP and the SCTP user layers.

286     SCTP is connection-oriented in nature, but the SCTP association is a
287     broader concept than the TCP connection. SCTP provides the means for
288     each SCTP endpoint (Section 1.3) to provide the other endpoint
289     (during association startup) with a list of transport addresses
290     (i.e., multiple IP addresses in combination with an SCTP port)
291     through which that endpoint can be reached and from which it will
292     originate SCTP packets. The association spans transfers over all of
293     the possible source/destination combinations that may be generated
294     from each endpoint's lists.
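
As a rough illustration of the last sentence above, the set of source/destination combinations an association may span can be sketched as the cross product of the two endpoints' transport address lists. This is a minimal sketch, not part of the protocol specification: the function name candidate_paths, the example IP addresses (taken from documentation ranges), and the port numbers are all assumptions made for illustration.

```python
from itertools import product

def candidate_paths(local_addrs, peer_addrs):
    """Return every (source, destination) transport-address pair.

    Each transport address is modelled as an (ip, port) tuple. All
    addresses of one endpoint share a single SCTP port.
    """
    return list(product(local_addrs, peer_addrs))

# Hypothetical address lists exchanged during association startup.
node_a = [("192.0.2.1", 5000), ("198.51.100.1", 5000)]
node_b = [("203.0.113.7", 6000), ("203.0.113.8", 6000)]

paths = candidate_paths(node_a, node_b)
for src, dst in paths:
    print(f"{src[0]}:{src[1]} -> {dst[0]}:{dst[1]}")
```

With two addresses on each side, the sketch yields four combinations. Whether these correspond to physically distinct network routes depends on the underlying network, which the sketch does not model.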
296         _____________                                      _____________
297        |  SCTP User  |                                    |  SCTP User  |
298        | Application |                                    | Application |
299        |-------------|                                    |-------------|
300        |    SCTP     |                                    |    SCTP     |
301        |  Transport  |                                    |  Transport  |
302        |   Service   |                                    |   Service   |
303        |-------------|                                    |-------------|
304        |             |One or more    ----      One or more|             |
305        | IP Network  |IP address      \/        IP address| IP Network  |
306        |   Service   |appearances     /\       appearances|   Service   |
307        |_____________|               ----                 |_____________|

309          SCTP Node A |<-------- Network transport ------->| SCTP Node B

311                        Figure 1: An SCTP Association

313  1.3.  Key Terms

315     Some of the language used to describe SCTP has been introduced in the
316     previous sections. This section provides a consolidated list of the
317     key terms and their definitions.

319     Active destination transport address: A transport address on a peer
320        endpoint that a transmitting endpoint considers available for
321        receiving user messages.

323     Bundling: An optional multiplexing operation, whereby more than one
324        user message may be carried in the same SCTP packet. Each user
325        message occupies its own DATA chunk.

327     Chunk: A unit of information within an SCTP packet, consisting of a
328        chunk header and chunk-specific content.

330     Congestion window (cwnd): An SCTP variable that limits the data, in
331        number of bytes, a sender can send to a particular destination
332        transport address before receiving an acknowledgement.

334     Cumulative TSN Ack Point: The TSN of the last DATA chunk
335        acknowledged via the Cumulative TSN Ack field of a SACK.

337     Idle destination address: An address that has not had user messages
338        sent to it within some length of time, normally the HEARTBEAT
339        interval or greater.

341     Inactive destination transport address: An address that is
342        considered inactive due to errors and unavailable to transport
343        user messages.

345     Message = user message: Data submitted to SCTP by the Upper Layer
346        Protocol (ULP).
348     Message Authentication Code (MAC): An integrity check mechanism
349        based on cryptographic hash functions using a secret key.
350        Typically, message authentication codes are used between two
351        parties that share a secret key in order to validate information
352        transmitted between these parties. In SCTP, it is used by an
353        endpoint to validate the State Cookie information that is returned
354        from the peer in the COOKIE ECHO chunk. The term "MAC" has
355        different meanings in different contexts. SCTP uses this term
356        with the same meaning as in [RFC2104].

358     Network Byte Order: Most significant byte first, a.k.a., big endian.

360     Ordered Message: A user message that is delivered in order with
361        respect to all previous user messages sent within the stream on
362        which the message was sent.

364     Outstanding TSN (at an SCTP endpoint): A TSN (and the associated
365        DATA chunk) that has been sent by the endpoint but for which it
366        has not yet received an acknowledgement.

368     Path: The route taken by the SCTP packets sent by one SCTP endpoint
369        to a specific destination transport address of its peer SCTP
370        endpoint. Sending to different destination transport addresses
371        does not necessarily guarantee getting separate paths.

373     Primary Path: The primary path is the destination and source address
374        that will be put into a packet outbound to the peer endpoint by
375        default. The definition includes the source address since an
376        implementation MAY wish to specify both destination and source
377        address to better control the return path taken by reply chunks
378        and on which interface the packet is transmitted when the data
379        sender is multi-homed.

381     Receiver Window (rwnd): An SCTP variable a data sender uses to store
382        the most recently calculated receiver window of its peer, in
383        number of bytes. This gives the sender an indication of the space
384        available in the receiver's inbound buffer.
386     SCTP association: A protocol relationship between SCTP endpoints,
387        composed of the two SCTP endpoints and protocol state information
388        including Verification Tags and the currently active set of
389        Transmission Sequence Numbers (TSNs), etc. An association can be
390        uniquely identified by the transport addresses used by the
391        endpoints in the association. Two SCTP endpoints MUST NOT have
392        more than one SCTP association between them at any given time.

394     SCTP endpoint: The logical sender/receiver of SCTP packets. On a
395        multi-homed host, an SCTP endpoint is represented to its peers as
396        a combination of a set of eligible destination transport addresses
397        to which SCTP packets can be sent and a set of eligible source
398        transport addresses from which SCTP packets can be received. All
399        transport addresses used by an SCTP endpoint must use the same
400        port number, but can use multiple IP addresses. A transport
401        address used by an SCTP endpoint must not be used by another SCTP
402        endpoint. In other words, a transport address is unique to an
403        SCTP endpoint.

405     SCTP packet (or packet): The unit of data delivery across the
406        interface between SCTP and the connectionless packet network
407        (e.g., IP). An SCTP packet includes the common SCTP header,
408        possible SCTP control chunks, and user data encapsulated within
409        SCTP DATA chunks.

411     SCTP user application (SCTP user): The logical higher-layer
412        application entity which uses the services of SCTP, also called
413        the Upper-Layer Protocol (ULP).

415     Slow-Start Threshold (ssthresh): An SCTP variable. This is the
416        threshold that the endpoint will use to determine whether to
417        perform slow start or congestion avoidance on a particular
418        destination transport address. Ssthresh is in number of bytes.
420     Stream: A unidirectional logical channel established from one to
421        another associated SCTP endpoint, within which all user messages
422        are delivered in sequence except for those submitted to the
423        unordered delivery service.

425     Note: The relationship between stream numbers in opposite directions
426     is strictly a matter of how the applications use them. It is the
427     responsibility of the SCTP user to create and manage these
428     correlations if they are so desired.

430     Stream Sequence Number: A 16-bit sequence number used internally by
431        SCTP to ensure sequenced delivery of the user messages within a
432        given stream. One Stream Sequence Number is attached to each user
433        message.

435     Tie-Tags: Two 32-bit random numbers that together make a 64-bit
436        nonce. These tags are used within a State Cookie and TCB so that
437        a newly restarting association can be linked to the original
438        association within the endpoint that did not restart and yet not
439        reveal the true Verification Tags of an existing association.

441     Transmission Control Block (TCB): An internal data structure created
442        by an SCTP endpoint for each of its existing SCTP associations to
443        other SCTP endpoints. TCB contains all the status and operational
444        information for the endpoint to maintain and manage the
445        corresponding association.

447     Transmission Sequence Number (TSN): A 32-bit sequence number used
448        internally by SCTP. One TSN is attached to each chunk containing
449        user data to permit the receiving SCTP endpoint to acknowledge its
450        receipt and detect duplicate deliveries.

452     Transport address: A transport address is traditionally defined by a
453        network-layer address, a transport-layer protocol, and a
454        transport-layer port number. In the case of SCTP running over IP,
455        a transport address is defined by the combination of an IP address
456        and an SCTP port number (where SCTP is the transport protocol).
458     Unacknowledged TSN (at an SCTP endpoint): A TSN (and the associated
459        DATA chunk) that has been received by the endpoint but for which
460        an acknowledgement has not yet been sent. Or in the opposite
461        case, for a packet that has been sent but no acknowledgement has
462        been received.

464     Unordered Message: Unordered messages are "unordered" with respect
465        to any other message; this includes both other unordered messages
466        as well as other ordered messages. An unordered message might be
467        delivered prior to or later than ordered messages sent on the same
468        stream.

470     User message: The unit of data delivery across the interface between
471        SCTP and its user.

473     Verification Tag: A 32-bit unsigned integer that is randomly
474        generated. The Verification Tag provides a key that allows a
475        receiver to verify that the SCTP packet belongs to the current
476        association and is not an old or stale packet from a previous
477        association.

479  1.4.  Abbreviations

481     MAC     Message Authentication Code [RFC2104]
482     RTO     Retransmission Timeout
483     RTT     Round-Trip Time
484     RTTVAR  Round-Trip Time Variation
485     SCTP    Stream Control Transmission Protocol
486     SRTT    Smoothed RTT
487     TCB     Transmission Control Block
488     TLV     Type-Length-Value coding format
489     TSN     Transmission Sequence Number
490     ULP     Upper-Layer Protocol

492  1.5.  Functional View of SCTP

494     The SCTP transport service can be decomposed into a number of
495     functions. These are depicted in Figure 2 and explained in the
496     remainder of this section.
498                        SCTP User Application

500      -----------------------------------------------------
501       _____________                  ____________________
502      |             |                | Sequenced Delivery |
503      | Association |                |   within Streams   |
504      |             |                |____________________|
505      |   Startup   |
506      |             |         ____________________________
507      |     and     |        |  User Data Fragmentation   |
508      |             |        |____________________________|
509      |  Takedown   |
510      |             |         ____________________________
511      |             |        |      Acknowledgement       |
512      |             |        |            and             |
513      |             |        |    Congestion Avoidance    |
514      |             |        |____________________________|
515      |             |
516      |             |         ____________________________
517      |             |        |       Chunk Bundling       |
518      |             |        |____________________________|
519      |             |
520      |             |     ________________________________
521      |             |    |       Packet Validation        |
522      |             |    |________________________________|
523      |             |
524      |             |     ________________________________
525      |             |    |        Path Management         |
526      |_____________|    |________________________________|

528          Figure 2: Functional View of the SCTP Transport Service

530  1.5.1.  Association Startup and Takedown

532     An association is initiated by a request from the SCTP user (see the
533     description of the ASSOCIATE (or SEND) primitive in Section 10).

535     A cookie mechanism, similar to one described by Karn and Simpson in
536     [RFC2522], is employed during the initialization to provide
537     protection against synchronization attacks. The cookie mechanism
538     uses a four-way handshake, the last two legs of which are allowed to
539     carry user data for fast setup. The startup sequence is described in
540     Section 5 of this document.

542     SCTP provides for graceful close (i.e., shutdown) of an active
543     association on request from the SCTP user. See the description of
544     the SHUTDOWN primitive in Section 10. SCTP also allows ungraceful
545     close (i.e., abort), either on request from the user (ABORT
546     primitive) or as a result of an error condition detected within the
547     SCTP layer. Section 9 describes both the graceful and the ungraceful
548     close procedures.
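
The four-way handshake mentioned earlier in this section can be sketched as a short table of legs plus a walk through the initiator's states. This is a simplified illustration only: the chunk names (INIT, INIT ACK, COOKIE ECHO, COOKIE ACK) and state names (CLOSED, COOKIE-WAIT, COOKIE-ECHOED, ESTABLISHED) follow the protocol's own terminology, while the table layout, the function name initiator_states, and the omission of timers, retransmissions, and error handling are assumptions of this sketch.

```python
# Legs of the four-way handshake, in order:
# (sender, chunk, may_carry_user_data)
HANDSHAKE = [
    ("initiator", "INIT", False),
    ("responder", "INIT ACK", False),
    ("initiator", "COOKIE ECHO", True),   # third leg may bundle DATA
    ("responder", "COOKIE ACK", True),    # fourth leg may bundle DATA
]

# Initiator-side transitions, keyed by (current state, chunk that
# triggers the change). Simplified: timers and failures are ignored.
TRANSITIONS = {
    ("CLOSED", "INIT"): "COOKIE-WAIT",             # INIT sent
    ("COOKIE-WAIT", "INIT ACK"): "COOKIE-ECHOED",  # INIT ACK received, COOKIE ECHO sent
    ("COOKIE-ECHOED", "COOKIE ACK"): "ESTABLISHED",
}

def initiator_states():
    """Walk the handshake and return the states the initiator visits."""
    state = "CLOSED"
    visited = [state]
    for _sender, chunk, _data_ok in HANDSHAKE:
        next_state = TRANSITIONS.get((state, chunk))
        if next_state is not None:
            state = next_state
            visited.append(state)
    return visited

print(" -> ".join(initiator_states()))
```

The sketch makes the "last two legs" remark concrete: only the COOKIE ECHO and COOKIE ACK legs are flagged as able to carry user data.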
550 SCTP does not support a half-open state (like TCP) wherein one side 551 may continue sending data while the other end is closed. When either 552 endpoint performs a shutdown, the association on each peer will stop 553 accepting new data from its user and only deliver data in queue at 554 the time of the graceful close (see Section 9). 556 1.5.2. Sequenced Delivery within Streams 558 The term "stream" is used in SCTP to refer to a sequence of user 559 messages that are to be delivered to the upper-layer protocol in 560 order with respect to other messages within the same stream. This is 561 in contrast to its usage in TCP, where it refers to a sequence of 562 bytes (in this document, a byte is assumed to be 8 bits). 564 The SCTP user can specify at association startup time the number of 565 streams to be supported by the association. This number is 566 negotiated with the remote end (see Section 5.1.1). User messages 567 are associated with stream numbers (SEND, RECEIVE primitives, 568 Section 10). Internally, SCTP assigns a Stream Sequence Number to 569 each message passed to it by the SCTP user. On the receiving side, 570 SCTP ensures that messages are delivered to the SCTP user in sequence 571 within a given stream. However, while one stream may be blocked 572 waiting for the next in-sequence user message, delivery from other 573 streams may proceed. 575 SCTP provides a mechanism for bypassing the sequenced delivery 576 service. User messages sent using this mechanism are delivered to 577 the SCTP user as soon as they are received. 579 1.5.3. User Data Fragmentation 581 When needed, SCTP fragments user messages to ensure that the SCTP 582 packet passed to the lower layer conforms to the path MTU. On 583 receipt, fragments are reassembled into complete messages before 584 being passed to the SCTP user. 586 1.5.4. 
Acknowledgement and Congestion Avoidance 588 SCTP assigns a Transmission Sequence Number (TSN) to each user data 589 fragment or unfragmented message. The TSN is independent of any 590 Stream Sequence Number assigned at the stream level. The receiving 591 end acknowledges all TSNs received, even if there are gaps in the 592 sequence. In this way, reliable delivery is kept functionally 593 separate from sequenced stream delivery. 595 The acknowledgement and congestion avoidance function is responsible 596 for packet retransmission when timely acknowledgement has not been 597 received. Packet retransmission is conditioned by congestion 598 avoidance procedures similar to those used for TCP. See Section 6 599 and Section 7 for a detailed description of the protocol procedures 600 associated with this function. 602 1.5.5. Chunk Bundling 604 As described in Section 3, the SCTP packet as delivered to the lower 605 layer consists of a common header followed by one or more chunks. 606 Each chunk may contain either user data or SCTP control information. 607 The SCTP user has the option to request bundling of more than one 608 user message into a single SCTP packet. The chunk bundling function 609 of SCTP is responsible for assembly of the complete SCTP packet and 610 its disassembly at the receiving end. 612 During times of congestion, an SCTP implementation MAY still perform 613 bundling even if the user has requested that SCTP not bundle. The 614 user's disabling of bundling only affects SCTP implementations that 615 may delay a small period of time before transmission (to attempt to 616 encourage bundling). When the user layer disables bundling, this 617 small delay is prohibited but not bundling that is performed during 618 congestion or retransmission. 620 1.5.6. Packet Validation 622 A mandatory Verification Tag field and a 32-bit checksum field (see 623 Appendix B for a description of the CRC32c checksum) are included in 624 the SCTP common header. 
The Verification Tag value is chosen by each 625 end of the association during association startup. Packets received 626 without the expected Verification Tag value are discarded, as a 627 protection against blind masquerade attacks and against stale SCTP 628 packets from a previous association. The CRC32c checksum should be 629 set by the sender of each SCTP packet to provide additional 630 protection against data corruption in the network. The receiver of 631 an SCTP packet with an invalid CRC32c checksum silently discards the 632 packet. 634 1.5.7. Path Management 636 The sending SCTP user is able to manipulate the set of transport 637 addresses used as destinations for SCTP packets through the 638 primitives described in Section 10. The SCTP path management 639 function chooses the destination transport address for each outgoing 640 SCTP packet based on the SCTP user's instructions and the currently 641 perceived reachability status of the eligible destination set. The 642 path management function monitors reachability through heartbeats 643 when other packet traffic is inadequate to provide this information 644 and advises the SCTP user when reachability of any far-end transport 645 address changes. The path management function is also responsible 646 for reporting the eligible set of local transport addresses to the 647 far end during association startup, and for reporting the transport 648 addresses returned from the far end to the SCTP user. 650 At association startup, a primary path is defined for each SCTP 651 endpoint, and is used for normal sending of SCTP packets. 653 On the receiving end, the path management is responsible for 654 verifying the existence of a valid SCTP association to which the 655 inbound SCTP packet belongs before passing it for further processing. 657 Note: Path Management and Packet Validation are done at the same 658 time, so although described separately above, in reality they cannot 659 be performed as separate items. 661 1.6. 
Serial Number Arithmetic 663 It is essential to remember that the actual Transmission Sequence 664 Number space is finite, though very large. This space ranges from 0 665 to 2**32 - 1. Since the space is finite, all arithmetic dealing with 666 Transmission Sequence Numbers must be performed modulo 2**32. This 667 unsigned arithmetic preserves the relationship of sequence numbers as 668 they cycle from 2**32 - 1 to 0 again. There are some subtleties to 669 computer modulo arithmetic, so great care should be taken in 670 programming the comparison of such values. When referring to TSNs, 671 the symbol "=<" means "less than or equal" (modulo 2**32). 673 Comparisons and arithmetic on TSNs in this document SHOULD use Serial 674 Number Arithmetic as defined in [RFC1982] where SERIAL_BITS = 32. 676 An endpoint SHOULD NOT transmit a DATA chunk with a TSN that is more 677 than 2**31 - 1 above the beginning TSN of its current send window. 678 Doing so will cause problems in comparing TSNs. 680 Transmission Sequence Numbers wrap around when they reach 2**32 - 1. 681 That is, the next TSN a DATA chunk MUST use after transmitting TSN = 682 2**32 - 1 is TSN = 0. 684 Any arithmetic done on Stream Sequence Numbers SHOULD use Serial 685 Number Arithmetic as defined in [RFC1982] where SERIAL_BITS = 16. 686 All other arithmetic and comparisons in this document use normal 687 arithmetic. 689 1.7. Changes from RFC 4960 691 SCTP was originally defined in [RFC4960], which this document 692 obsoletes, if approved. In this current revision no changes other 693 than formatting changes are present. 695 2. Conventions 697 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", 698 "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this 699 document are to be interpreted as described in RFC 2119 [RFC2119]. 701 3. SCTP Packet Format 703 An SCTP packet is composed of a common header and chunks. A chunk 704 contains either control information or user data.
706 The SCTP packet format is shown below: 708 0 1 2 3 709 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 710 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 711 | Common Header | 712 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 713 | Chunk #1 | 714 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 715 | ... | 716 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 717 | Chunk #n | 718 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 720 Multiple chunks can be bundled into one SCTP packet up to the MTU 721 size, except for the INIT, INIT ACK, and SHUTDOWN COMPLETE chunks. 722 These chunks MUST NOT be bundled with any other chunk in a packet. 723 See Section 6.10 for more details on chunk bundling. 725 If a user data message doesn't fit into one SCTP packet it can be 726 fragmented into multiple chunks using the procedure defined in 727 Section 6.9. 729 All integer fields in an SCTP packet MUST be transmitted in network 730 byte order, unless otherwise stated. 732 3.1. SCTP Common Header Field Descriptions 733 SCTP Common Header Format 735 0 1 2 3 736 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 737 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 738 | Source Port Number | Destination Port Number | 739 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 740 | Verification Tag | 741 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 742 | Checksum | 743 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 745 Source Port Number: 16 bits (unsigned integer) 747 This is the SCTP sender's port number. It can be used by the 748 receiver in combination with the source IP address, the SCTP 749 destination port, and possibly the destination IP address to 750 identify the association to which this packet belongs. The port 751 number 0 MUST NOT be used. 
753 Destination Port Number: 16 bits (unsigned integer) 755 This is the SCTP port number to which this packet is destined. 756 The receiving host will use this port number to de-multiplex the 757 SCTP packet to the correct receiving endpoint/application. The 758 port number 0 MUST NOT be used. 760 Verification Tag: 32 bits (unsigned integer) 762 The receiver of this packet uses the Verification Tag to validate 763 the sender of this SCTP packet. On transmit, the value of this 764 Verification Tag MUST be set to the value of the Initiate Tag 765 received from the peer endpoint during the association 766 initialization, with the following exceptions: 768 * A packet containing an INIT chunk MUST have a zero Verification 769 Tag. 771 * A packet containing a SHUTDOWN COMPLETE chunk with the T bit 772 set MUST have the Verification Tag copied from the packet with 773 the SHUTDOWN ACK chunk. 775 * A packet containing an ABORT chunk may have the verification 776 tag copied from the packet that caused the ABORT to be sent. 777 For details see Section 8.4 and Section 8.5. 779 An INIT chunk MUST be the only chunk in the SCTP packet carrying it. 781 Checksum: 32 bits (unsigned integer) 783 This field contains the checksum of this SCTP packet. Its 784 calculation is discussed in Section 6.8. SCTP uses the CRC32c 785 algorithm as described in Appendix B for calculating the checksum. 787 3.2. Chunk Field Descriptions 789 The figure below illustrates the field format for the chunks to be 790 transmitted in the SCTP packet. Each chunk is formatted with a Chunk 791 Type field, a chunk-specific Flag field, a Chunk Length field, and a 792 Value field. 
794 0 1 2 3 795 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 796 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 797 | Chunk Type | Chunk Flags | Chunk Length | 798 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 799 \ \ 800 / Chunk Value / 801 \ \ 802 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 804 Chunk Type: 8 bits (unsigned integer) 806 This field identifies the type of information contained in the 807 Chunk Value field. It takes a value from 0 to 254. The value of 808 255 is reserved for future use as an extension field. 810 The values of Chunk Types are defined as follows: 812 ID Value Chunk Type 813 ----- ---------- 814 0 - Payload Data (DATA) 815 1 - Initiation (INIT) 816 2 - Initiation Acknowledgement (INIT ACK) 817 3 - Selective Acknowledgement (SACK) 818 4 - Heartbeat Request (HEARTBEAT) 819 5 - Heartbeat Acknowledgement (HEARTBEAT ACK) 820 6 - Abort (ABORT) 821 7 - Shutdown (SHUTDOWN) 822 8 - Shutdown Acknowledgement (SHUTDOWN ACK) 823 9 - Operation Error (ERROR) 824 10 - State Cookie (COOKIE ECHO) 825 11 - Cookie Acknowledgement (COOKIE ACK) 826 12 - Reserved for Explicit Congestion Notification Echo 827 (ECNE) 828 13 - Reserved for Congestion Window Reduced (CWR) 829 14 - Shutdown Complete (SHUTDOWN COMPLETE) 830 15 to 62 - available 831 63 - reserved for IETF-defined Chunk Extensions 832 64 to 126 - available 833 127 - reserved for IETF-defined Chunk Extensions 834 128 to 190 - available 835 191 - reserved for IETF-defined Chunk Extensions 836 192 to 254 - available 837 255 - reserved for IETF-defined Chunk Extensions 839 Chunk Types are encoded such that the highest-order 2 bits specify 840 the action that must be taken if the processing endpoint does not 841 recognize the Chunk Type. 843 00 - Stop processing this SCTP packet and discard it, do not 844 process any further chunks within it. 
846 01 - Stop processing this SCTP packet and discard it, do not 847 process any further chunks within it, and report the 848 unrecognized chunk in an ERROR chunk using the 'Unrecognized Chunk Type' cause of error. 850 10 - Skip this chunk and continue processing. 852 11 - Skip this chunk and continue processing, but report in an 853 ERROR chunk using the 'Unrecognized Chunk Type' cause of 854 error. 856 Note: The ECNE and CWR chunk types are reserved for future use of 857 Explicit Congestion Notification (ECN); see Appendix A. 859 Chunk Flags: 8 bits 861 The usage of these bits depends on the Chunk Type as given by the 862 Chunk Type field. Unless otherwise specified, they are set to 0 863 on transmit and are ignored on receipt. 865 Chunk Length: 16 bits (unsigned integer) 867 This value represents the size of the chunk in bytes, including 868 the Chunk Type, Chunk Flags, Chunk Length, and Chunk Value fields. 869 Therefore, if the Chunk Value field is zero-length, the Length 870 field will be set to 4. The Chunk Length field does not count any 871 chunk padding. 873 Chunks (including Type, Length, and Value fields) are padded out 874 by the sender with all zero bytes to be a multiple of 4 bytes 875 long. This padding MUST NOT be more than 3 bytes in total. The 876 Chunk Length value does not include terminating padding of the 877 chunk. However, it does include padding of any variable-length 878 parameter except the last parameter in the chunk. The receiver 879 MUST ignore the padding. 881 Note: A robust implementation should accept the chunk whether or 882 not the final padding has been included in the Chunk Length. 884 Chunk Value: variable length 886 The Chunk Value field contains the actual information to be 887 transferred in the chunk. The usage and format of this field are 888 dependent on the Chunk Type. 890 The total length of a chunk (including Type, Length, and Value 891 fields) MUST be a multiple of 4 bytes.
If the length of the chunk is 892 not a multiple of 4 bytes, the sender MUST pad the chunk with all 893 zero bytes, and this padding is not included in the Chunk Length 894 field. The sender MUST NOT pad with more than 3 bytes. The receiver 895 MUST ignore the padding bytes. 897 SCTP-defined chunks are described in detail in Section 3.3. The 898 guidelines for IETF-defined chunk extensions can be found in 899 Section 14.1 of this document. 901 3.2.1. Optional/Variable-Length Parameter Format 903 Chunk values of SCTP control chunks consist of a chunk-type-specific 904 header of required fields, followed by zero or more parameters. The 905 optional and variable-length parameters contained in a chunk are 906 defined in a Type-Length-Value format as shown below. 908 0 1 2 3 909 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 910 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 911 | Parameter Type | Parameter Length | 912 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 913 \ \ 914 / Parameter Value / 915 \ \ 916 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 918 Chunk Parameter Type: 16 bits (unsigned integer) 920 The Type field is a 16-bit identifier of the type of parameter. 921 It takes a value of 0 to 65534. 923 The value of 65535 is reserved for IETF-defined extensions. 924 Values other than those defined in specific SCTP chunk 925 descriptions are reserved for use by IETF. 927 Chunk Parameter Length: 16 bits (unsigned integer) 929 The Parameter Length field contains the size of the parameter in 930 bytes, including the Parameter Type, Parameter Length, and 931 Parameter Value fields. Thus, a parameter with a zero-length 932 Parameter Value field would have a Length field of 4. The 933 Parameter Length does not include any padding bytes. 935 Chunk Parameter Value: variable length 937 The Parameter Value field contains the actual information to be 938 transferred in the parameter. 
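The Type-Length-Value layout described above can be read with a few lines of C. This is only a sketch under stated assumptions: the names sctp_param_hdr, sctp_param_parse, and sctp_padded_len are hypothetical and not defined by this document, and the caller is assumed to hold the parameter area of a chunk in one contiguous buffer. The sketch reads both 16-bit fields in network byte order, rejects a Parameter Length that is smaller than 4 or larger than the bytes available, and computes the padded size needed to step to the next parameter, since the Parameter Length itself does not count padding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper type: a decoded TLV parameter header. */
struct sctp_param_hdr {
    uint16_t type;    /* Parameter Type */
    uint16_t length;  /* Parameter Length: Type + Length + Value, no padding */
};

/* Round a length up to the next multiple of 4 (adds at most 3 bytes). */
size_t sctp_padded_len(size_t len)
{
    return (len + 3u) & ~(size_t)3u;
}

/*
 * Decode the parameter header found at buf[0..avail).
 * Returns 0 on success, or -1 if the header is truncated or the
 * Parameter Length is inconsistent with the bytes available.
 */
int sctp_param_parse(const uint8_t *buf, size_t avail,
                     struct sctp_param_hdr *out)
{
    assert(buf != NULL && out != NULL);

    if (avail < 4)
        return -1;

    /* Both fields are transmitted in network byte order (big endian). */
    out->type   = (uint16_t)((buf[0] << 8) | buf[1]);
    out->length = (uint16_t)((buf[2] << 8) | buf[3]);

    /* A zero-length Value still yields a Length of 4. */
    if (out->length < 4 || out->length > avail)
        return -1;

    return 0;
}
```

A caller walking several parameters would advance its read position by sctp_padded_len(hdr.length) after each successful parse. For the last parameter in a chunk the trailing padding may lie outside the Chunk Length, so a real implementation would need to clamp that final step to the end of the buffer.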
940 The total length of a parameter (including Type, Parameter Length, 941 and Value fields) MUST be a multiple of 4 bytes. If the length of 942 the parameter is not a multiple of 4 bytes, the sender pads the 943 parameter at the end (i.e., after the Parameter Value field) with 944 all zero bytes. The length of the padding is not included in the 945 Parameter Length field. A sender MUST NOT pad with more than 3 946 bytes. The receiver MUST ignore the padding bytes. 948 The Parameter Types are encoded such that the highest-order 2 bits 949 specify the action that must be taken if the processing endpoint 950 does not recognize the Parameter Type. 952 00 - Stop processing this parameter; do not process any further 953 parameters within this chunk. 955 01 - Stop processing this parameter, do not process any further 956 parameters within this chunk, and report the unrecognized 957 parameter in an 'Unrecognized Parameter', as described in 958 Section 3.2.2. 960 10 - Skip this parameter and continue processing. 962 11 - Skip this parameter and continue processing but report the 963 unrecognized parameter in an 'Unrecognized Parameter', as 964 described in Section 3.2.2. 966 Please note that in all four cases, an INIT ACK or COOKIE ECHO chunk 967 is sent. In the 00 or 01 case, the processing of the parameters 968 after the unknown parameter is canceled, but no processing already 969 done is rolled back. 971 The actual SCTP parameters are defined in the specific SCTP chunk 972 sections. The rules for IETF-defined parameter extensions are 973 defined in Section 14.2. Note that a parameter type MUST be unique 974 across all chunks. For example, the parameter type '5' is used to 975 represent an IPv4 address (see Section 3.3.2.1). The value '5' then 976 is reserved across all chunks to represent an IPv4 address and MUST 977 NOT be reused with a different meaning in any other chunk. 979 3.2.2. 
Reporting of Unrecognized Parameters 981 If the receiver of an INIT chunk detects unrecognized parameters and 982 has to report them according to Section 3.2.1, it MUST put the 983 'Unrecognized Parameter' parameter(s) in the INIT ACK chunk sent in 984 response to the INIT chunk. Note that if the receiver of the INIT 985 chunk is NOT going to establish an association (e.g., due to lack of 986 resources), an 'Unrecognized Parameter' would NOT be included with 987 any ABORT being sent to the sender of the INIT. 989 If the receiver of an INIT ACK chunk detects unrecognized parameters 990 and has to report them according to Section 3.2.1, it SHOULD bundle 991 the ERROR chunk containing the 'Unrecognized Parameters' error cause 992 with the COOKIE ECHO chunk sent in response to the INIT ACK chunk. 993 If the receiver of the INIT ACK cannot bundle the COOKIE ECHO chunk 994 with the ERROR chunk, the ERROR chunk MAY be sent separately but not 995 before the COOKIE ACK has been received. 997 Note: Any time a COOKIE ECHO is sent in a packet, it MUST be the 998 first chunk. 1000 3.3. SCTP Chunk Definitions 1002 This section defines the format of the different SCTP chunk types. 1004 3.3.1. 
Payload Data (DATA) (0) 1006 The following format MUST be used for the DATA chunk: 1008 0 1 2 3 1009 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1010 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1011 | Type = 0 | Reserved|U|B|E| Length | 1012 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1013 | TSN | 1014 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1015 | Stream Identifier S | Stream Sequence Number n | 1016 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1017 | Payload Protocol Identifier | 1018 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1019 \ \ 1020 / User Data (seq n of Stream S) / 1021 \ \ 1022 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1024 Reserved: 5 bits 1026 Should be set to all '0's and ignored by the receiver. 1028 U bit: 1 bit 1030 The (U)nordered bit, if set to '1', indicates that this is an 1031 unordered DATA chunk, and there is no Stream Sequence Number 1032 assigned to this DATA chunk. Therefore, the receiver MUST ignore 1033 the Stream Sequence Number field. 1035 After reassembly (if necessary), unordered DATA chunks MUST be 1036 dispatched to the upper layer by the receiver without any attempt 1037 to reorder. 1039 If an unordered user message is fragmented, each fragment of the 1040 message MUST have its U bit set to '1'. 1042 B bit: 1 bit 1044 The (B)eginning fragment bit, if set, indicates the first fragment 1045 of a user message. 1047 E bit: 1 bit 1049 The (E)nding fragment bit, if set, indicates the last fragment of 1050 a user message. 1052 An unfragmented user message shall have both the B and E bits set to 1053 '1'. 
Setting both B and E bits to '0' indicates a middle fragment of 1054 a multi-fragment user message, as summarized in the following table: 1056 +---+---+-------------------------------------------+ 1057 | B | E | Description | 1058 +---+---+-------------------------------------------+ 1059 | 1 | 0 | First piece of a fragmented user message | 1060 +---+---+-------------------------------------------+ 1061 | 0 | 0 | Middle piece of a fragmented user message | 1062 +---+---+-------------------------------------------+ 1063 | 0 | 1 | Last piece of a fragmented user message | 1064 +---+---+-------------------------------------------+ 1065 | 1 | 1 | Unfragmented message | 1066 +---+---+-------------------------------------------+ 1068 Table 1: Fragment Description Flags 1070 When a user message is fragmented into multiple chunks, the TSNs are 1071 used by the receiver to reassemble the message. This means that the 1072 TSNs for each fragment of a fragmented user message MUST be strictly 1073 sequential. 1075 Length: 16 bits (unsigned integer) 1077 This field indicates the length of the DATA chunk in bytes from 1078 the beginning of the type field to the end of the User Data field 1079 excluding any padding. A DATA chunk with one byte of user data 1080 will have Length set to 17 (indicating 17 bytes). 1082 A DATA chunk with a User Data field of length L will have the 1083 Length field set to (16 + L) (indicating 16+L bytes) where L MUST 1084 be greater than 0. 1086 TSN: 32 bits (unsigned integer) 1088 This value represents the TSN for this DATA chunk. The valid 1089 range of TSN is from 0 to 4294967295 (2**32 - 1). TSN wraps back 1090 to 0 after reaching 4294967295. 1092 Stream Identifier S: 16 bits (unsigned integer) 1094 Identifies the stream to which the following user data belongs. 1096 Stream Sequence Number n: 16 bits (unsigned integer) 1098 This value represents the Stream Sequence Number of the following 1099 user data within the stream S. 
Valid range is 0 to 65535. 1101 When a user message is fragmented by SCTP for transport, the same 1102 Stream Sequence Number MUST be carried in each of the fragments of 1103 the message. 1105 Payload Protocol Identifier: 32 bits (unsigned integer) 1107 This value represents an application (or upper layer) specified 1108 protocol identifier. This value is passed to SCTP by its upper 1109 layer and sent to its peer. This identifier is not used by SCTP 1110 but can be used by certain network entities, as well as by the 1111 peer application, to identify the type of information being 1112 carried in this DATA chunk. This field must be sent even in 1113 fragmented DATA chunks (to make sure it is available for agents in 1114 the middle of the network). Note that this field is NOT touched 1115 by an SCTP implementation; therefore, its byte order is NOT 1116 necessarily big endian. The upper layer is responsible for any 1117 byte order conversions to this field. 1119 The value 0 indicates that no application identifier is specified 1120 by the upper layer for this payload data. 1122 User Data: variable length 1124 This is the payload user data. The implementation MUST pad the 1125 end of the data to a 4-byte boundary with all-zero bytes. Any 1126 padding MUST NOT be included in the Length field. A sender MUST 1127 never add more than 3 bytes of padding. 1129 3.3.2. Initiation (INIT) (1) 1131 This chunk is used to initiate an SCTP association between two 1132 endpoints. 
The format of the INIT chunk is shown below: 1134 0 1 2 3 1135 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1136 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1137 | Type = 1 | Chunk Flags | Chunk Length | 1138 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1139 | Initiate Tag | 1140 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1141 | Advertised Receiver Window Credit (a_rwnd) | 1142 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1143 | Number of Outbound Streams | Number of Inbound Streams | 1144 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1145 | Initial TSN | 1146 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1147 \ \ 1148 / Optional/Variable-Length Parameters / 1149 \ \ 1150 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1152 The INIT chunk contains the following parameters. Unless otherwise 1153 noted, each parameter MUST only be included once in the INIT chunk. 1155 Fixed Parameters Status 1156 ---------------------------------------------- 1157 Initiate Tag Mandatory 1158 Advertised Receiver Window Credit Mandatory 1159 Number of Outbound Streams Mandatory 1160 Number of Inbound Streams Mandatory 1161 Initial TSN Mandatory

Variable Parameters                  Status     Type Value
-------------------------------------------------------------
IPv4 Address (Note 1)                Optional   5
IPv6 Address (Note 1)                Optional   6
Cookie Preservative                  Optional   9
Reserved for ECN Capable (Note 2)    Optional   32768 (0x8000)
Host Name Address (Note 3)           Optional   11
Supported Address Types (Note 4)     Optional   12

1171 Note 1: The INIT chunks can contain multiple addresses that can be 1172 IPv4 and/or IPv6 in any combination. 1174 Note 2: The ECN Capable field is reserved for future use of Explicit 1175 Congestion Notification. 1177 Note 3: An INIT chunk MUST NOT contain more than one Host Name 1178 Address parameter.
Moreover, the sender of the INIT MUST NOT combine 1179 any other address types with the Host Name Address in the INIT. The 1180 receiver of INIT MUST ignore any other address types if the Host Name 1181 Address parameter is present in the received INIT chunk. 1183 Note 4: This parameter, when present, specifies all the address types 1184 the sending endpoint can support. The absence of this parameter 1185 indicates that the sending endpoint can support any address type. 1187 IMPLEMENTATION NOTE: If an INIT chunk is received with known 1188 parameters that are not optional parameters of the INIT chunk, then 1189 the receiver SHOULD process the INIT chunk and send back an INIT ACK. 1190 The receiver of the INIT chunk MAY bundle an ERROR chunk with the 1191 COOKIE ACK chunk later. However, restrictive implementations MAY 1192 send back an ABORT chunk in response to the INIT chunk. 1194 The Chunk Flags field in INIT is reserved, and all bits in it should 1195 be set to 0 by the sender and ignored by the receiver. The sequence 1196 of parameters within an INIT can be processed in any order. 1198 Initiate Tag: 32 bits (unsigned integer) 1199 The receiver of the INIT (the responding end) records the value of 1200 the Initiate Tag parameter. This value MUST be placed into the 1201 Verification Tag field of every SCTP packet that the receiver of 1202 the INIT transmits within this association. 1204 The Initiate Tag is allowed to have any value except 0. See 1205 Section 5.3.1 for more on the selection of the tag value. 1207 If the value of the Initiate Tag in a received INIT chunk is found 1208 to be 0, the receiver MUST treat it as an error and close the 1209 association by transmitting an ABORT. 1211 Advertised Receiver Window Credit (a_rwnd): 32 bits (unsigned 1212 integer) 1214 This value represents the dedicated buffer space, in number of 1215 bytes, the sender of the INIT has reserved in association with 1216 this window. 
During the life of the association, this buffer 1217 space SHOULD NOT be lessened (i.e., dedicated buffers taken away 1218 from this association); however, an endpoint MAY change the value 1219 of a_rwnd it sends in SACK chunks. 1221 Number of Outbound Streams (OS): 16 bits (unsigned integer) 1223 Defines the number of outbound streams the sender of this INIT 1224 chunk wishes to create in this association. The value of 0 MUST 1225 NOT be used. 1227 Note: A receiver of an INIT with the OS value set to 0 SHOULD 1228 abort the association. 1230 Number of Inbound Streams (MIS): 16 bits (unsigned integer) 1232 Defines the maximum number of streams the sender of this INIT 1233 chunk allows the peer end to create in this association. The 1234 value 0 MUST NOT be used. 1236 Note: There is no negotiation of the actual number of streams but 1237 instead the two endpoints will use the min(requested, offered). 1238 See Section 5.1.1 for details. 1240 Note: A receiver of an INIT with the MIS value of 0 SHOULD abort 1241 the association. 1243 Initial TSN (I-TSN): 32 bits (unsigned integer) 1244 Defines the initial TSN that the sender will use. The valid range 1245 is from 0 to 4294967295. This field MAY be set to the value of 1246 the Initiate Tag field. 1248 3.3.2.1. Optional/Variable-Length Parameters in INIT 1250 The following parameters follow the Type-Length-Value format as 1251 defined in Section 3.2.1. Any Type-Length-Value fields MUST come 1252 after the fixed-length fields defined in the previous section. 
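As a rough illustration of the fixed-length fields described above, the following C sketch decodes them from the Chunk Value of an INIT chunk and applies the zero-value checks stated for the Initiate Tag, OS, and MIS fields. All names in it (sctp_init_fixed, sctp_init_parse_fixed, rd16, rd32, and the return codes) are hypothetical and not defined by this document. The function only reports which rule failed; sending an ABORT or otherwise reacting is left to the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Size of the fixed fields that follow the 4-byte chunk header. */
#define SCTP_INIT_FIXED_LEN 16u

/* Hypothetical container for the decoded fixed INIT fields. */
struct sctp_init_fixed {
    uint32_t initiate_tag;
    uint32_t a_rwnd;
    uint16_t num_outbound_streams;  /* OS */
    uint16_t num_inbound_streams;   /* MIS */
    uint32_t initial_tsn;
};

/* Read a 32-bit value in network byte order. */
uint32_t rd32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Read a 16-bit value in network byte order. */
uint16_t rd16(const uint8_t *p)
{
    return (uint16_t)(((unsigned)p[0] << 8) | p[1]);
}

/*
 * Decode the fixed INIT fields from the start of the Chunk Value.
 * Returns 0 on success, -1 if fewer than 16 bytes are available,
 * -2 if the Initiate Tag is 0, -3 if OS is 0, -4 if MIS is 0.
 */
int sctp_init_parse_fixed(const uint8_t *value, size_t len,
                          struct sctp_init_fixed *out)
{
    assert(value != NULL && out != NULL);

    if (len < SCTP_INIT_FIXED_LEN)
        return -1;

    out->initiate_tag         = rd32(value);
    out->a_rwnd               = rd32(value + 4);
    out->num_outbound_streams = rd16(value + 8);
    out->num_inbound_streams  = rd16(value + 10);
    out->initial_tsn          = rd32(value + 12);

    if (out->initiate_tag == 0)
        return -2;  /* Initiate Tag may take any value except 0 */
    if (out->num_outbound_streams == 0)
        return -3;  /* OS value of 0 is not allowed */
    if (out->num_inbound_streams == 0)
        return -4;  /* MIS value of 0 is not allowed */

    return 0;
}
```

Any bytes remaining after the first 16 bytes of the Chunk Value would then be handed to a Type-Length-Value parser for the optional parameters.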
1254 IPv4 Address Parameter (5) 1256 0 1 2 3 1257 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1258 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1259 | Type = 5 | Length = 8 | 1260 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1261 | IPv4 Address | 1262 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1264 IPv4 Address: 32 bits (unsigned integer) 1266 Contains an IPv4 address of the sending endpoint. It is binary 1267 encoded. 1269 IPv6 Address Parameter (6) 1271 0 1 2 3 1272 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1273 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1274 | Type = 6 | Length = 20 | 1275 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1276 | | 1277 | IPv6 Address | 1278 | | 1279 | | 1280 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1282 IPv6 Address: 128 bits (unsigned integer) 1284 Contains an IPv6 [RFC2460] address of the sending endpoint. It is 1285 binary encoded. 1287 Note: A sender MUST NOT use an IPv4-mapped IPv6 address [RFC4291], 1288 but should instead use an IPv4 Address parameter for an IPv4 1289 address. 1291 Combined with the Source Port Number in the SCTP common header, 1292 the value passed in an IPv4 or IPv6 Address parameter indicates a 1293 transport address the sender of the INIT will support for the 1294 association being initiated. That is, during the life time of 1295 this association, this IP address can appear in the source address 1296 field of an IP datagram sent from the sender of the INIT, and can 1297 be used as a destination address of an IP datagram sent from the 1298 receiver of the INIT. 1300 More than one IP Address parameter can be included in an INIT 1301 chunk when the INIT sender is multi-homed. 
Moreover, a multi- 1302 homed endpoint may have access to different types of network; 1303 thus, more than one address type can be present in one INIT chunk, 1304 i.e., IPv4 and IPv6 addresses are allowed in the same INIT chunk. 1306 If the INIT contains at least one IP Address parameter, then the 1307 source address of the IP datagram containing the INIT chunk and 1308 any additional address(es) provided within the INIT can be used as 1309 destinations by the endpoint receiving the INIT. If the INIT does 1310 not contain any IP Address parameters, the endpoint receiving the 1311 INIT MUST use the source address associated with the received IP 1312 datagram as its sole destination address for the association. 1314 Note that omitting all IP Address parameters from the INIT and INIT 1315 ACK is one way to make an association more likely to work 1316 across a NAT box. 1318 Cookie Preservative (9) 1320 The sender of the INIT shall use this parameter to suggest to the 1321 receiver of the INIT a longer life-span for the State Cookie. 1323 0 1 2 3 1324 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1325 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1326 | Type = 9 | Length = 8 | 1327 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1328 | Suggested Cookie Life-Span Increment (msec.) | 1329 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1331 Suggested Cookie Life-Span Increment: 32 bits (unsigned integer) 1333 This parameter indicates to the receiver the increment, in 1334 milliseconds, that the sender wishes the receiver to add to its default 1335 cookie life-span. 1337 This optional parameter should be added to the INIT chunk by the 1338 sender when it reattempts establishing an association with a peer 1339 to which its previous attempt of establishing the association 1340 failed due to a stale cookie operation error.
The receiver MAY 1341 choose to ignore the suggested cookie life-span increase for its 1342 own security reasons. 1344 Host Name Address (11) 1346 The sender of INIT uses this parameter to pass its Host Name (in 1347 place of its IP addresses) to its peer. The peer is responsible for 1348 resolving the name. Using this parameter might make it more likely 1349 for the association to work across a NAT box. 1351 0 1 2 3 1352 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1353 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1354 | Type = 11 | Length | 1355 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1356 / Host Name / 1357 \ \ 1358 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1360 Host Name: variable length 1362 This field contains a host name in "host name syntax" per RFC 1123 1363 Section 2.1 [RFC1123]. The method for resolving the host name is 1364 out of scope of SCTP. 1366 Note: At least one null terminator is included in the Host Name 1367 string and must be included in the length. 1369 Supported Address Types (12) 1371 The sender of INIT uses this parameter to list all the address types 1372 it can support. 1374 0 1 2 3 1375 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1376 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1377 | Type = 12 | Length | 1378 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1379 | Address Type #1 | Address Type #2 | 1380 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1381 | ...... | 1382 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1384 Address Type: 16 bits (unsigned integer) 1385 This is filled with the type value of the corresponding address 1386 TLV (e.g., IPv4 = 5, IPv6 = 6, Host name = 11). 1388 3.3.3. Initiation Acknowledgement (INIT ACK) (2) 1390 The INIT ACK chunk is used to acknowledge the initiation of an SCTP 1391 association.
1393 The parameter part of INIT ACK is formatted similarly to the INIT 1394 chunk. It uses two extra variable parameters: the State Cookie and 1395 the Unrecognized Parameter. 1397 The format of the INIT ACK chunk is shown below: 1399 0 1 2 3 1400 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1401 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1402 | Type = 2 | Chunk Flags | Chunk Length | 1403 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1404 | Initiate Tag | 1405 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1406 | Advertised Receiver Window Credit | 1407 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1408 | Number of Outbound Streams | Number of Inbound Streams | 1409 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1410 | Initial TSN | 1411 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1412 \ \ 1413 / Optional/Variable-Length Parameters / 1414 \ \ 1415 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1417 Initiate Tag: 32 bits (unsigned integer) 1419 The receiver of the INIT ACK records the value of the Initiate Tag 1420 parameter. This value MUST be placed into the Verification Tag 1421 field of every SCTP packet that the INIT ACK receiver transmits 1422 within this association. 1424 The Initiate Tag MUST NOT take the value 0. See Section 5.3.1 for 1425 more on the selection of the Initiate Tag value. 1427 If the value of the Initiate Tag in a received INIT ACK chunk is 1428 found to be 0, the receiver MUST destroy the association, 1429 discarding its TCB. The receiver MAY send an ABORT for debugging 1430 purposes. 1432 Advertised Receiver Window Credit (a_rwnd): 32 bits (unsigned 1433 integer) 1435 This value represents the dedicated buffer space, in number of 1436 bytes, the sender of the INIT ACK has reserved in association with 1437 this window.
During the life of the association, this buffer 1438 space SHOULD NOT be lessened (i.e., dedicated buffers taken away 1439 from this association). 1441 Number of Outbound Streams (OS): 16 bits (unsigned integer) 1443 Defines the number of outbound streams the sender of this INIT ACK 1444 chunk wishes to create in this association. The value of 0 MUST 1445 NOT be used, and the value MUST NOT be greater than the MIS value 1446 sent in the INIT chunk. 1448 Note: A receiver of an INIT ACK with the OS value set to 0 SHOULD 1449 destroy the association discarding its TCB. 1451 Number of Inbound Streams (MIS): 16 bits (unsigned integer) 1453 Defines the maximum number of streams the sender of this INIT ACK 1454 chunk allows the peer end to create in this association. The 1455 value 0 MUST NOT be used. 1457 Note: There is no negotiation of the actual number of streams but 1458 instead the two endpoints will use the min(requested, offered). 1459 See Section 5.1.1 for details. 1461 Note: A receiver of an INIT ACK with the MIS value set to 0 SHOULD 1462 destroy the association discarding its TCB. 1464 Initial TSN (I-TSN): 32 bits (unsigned integer) 1466 Defines the initial TSN that the INIT ACK sender will use. The 1467 valid range is from 0 to 4294967295. This field MAY be set to the 1468 value of the Initiate Tag field. 
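The checks that the INIT ACK fixed-field descriptions above imply for a receiver can be sketched as below. This is a hedged illustration, not a definitive implementation: the struct init_ack_fixed layout and the function init_ack_fixed_ok are hypothetical names, the fields are assumed to be already converted to host byte order, and rejecting an OS larger than the MIS sent in the INIT is an assumption of this sketch (the text above only states that the sender MUST NOT do so).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the fixed INIT ACK fields (host order). */
struct init_ack_fixed {
    uint32_t initiate_tag;
    uint32_t a_rwnd;
    uint16_t os;          /* Number of Outbound Streams */
    uint16_t mis;         /* Number of Inbound Streams  */
    uint32_t initial_tsn; /* any 32-bit value is valid  */
};

/* Returns true if the fixed fields pass the checks described above.
 * init_mis is the MIS value this endpoint sent in its own INIT. */
static bool init_ack_fixed_ok(const struct init_ack_fixed *ack,
                              uint16_t init_mis)
{
    if (ack->initiate_tag == 0)
        return false; /* receiver MUST destroy the association */
    if (ack->os == 0 || ack->mis == 0)
        return false; /* receiver SHOULD destroy the association */
    if (ack->os > init_mis)
        return false; /* assumption: treat the sender violation as fatal */
    return true;
}
```

A caller that gets false back would destroy the association and discard its TCB, optionally sending an ABORT in the zero Initiate Tag case.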
1470 Fixed Parameters Status 1471 ---------------------------------------------- 1472 Initiate Tag Mandatory 1473 Advertised Receiver Window Credit Mandatory 1474 Number of Outbound Streams Mandatory 1475 Number of Inbound Streams Mandatory 1476 Initial TSN Mandatory 1478 Variable Parameters Status Type Value 1479 ------------------------------------------------------------- 1480 State Cookie Mandatory 7 1481 IPv4 Address (Note 1) Optional 5 1482 IPv6 Address (Note 1) Optional 6 1483 Unrecognized Parameter Optional 8 1484 Reserved for ECN Capable (Note 2) Optional 32768 (0x8000) 1485 Host Name Address (Note 3) Optional 11 1487 Note 1: The INIT ACK chunks can contain any number of IP address 1488 parameters that can be IPv4 and/or IPv6 in any combination. 1490 Note 2: The ECN Capable field is reserved for future use of Explicit 1491 Congestion Notification. 1493 Note 3: The INIT ACK chunks MUST NOT contain more than one Host Name 1494 Address parameter. Moreover, the sender of the INIT ACK MUST NOT 1495 combine any other address types with the Host Name Address in the 1496 INIT ACK. The receiver of the INIT ACK MUST ignore any other address 1497 types if the Host Name Address parameter is present. 1499 IMPLEMENTATION NOTE: An implementation MUST be prepared to receive an 1500 INIT ACK that is quite large (more than 1500 bytes) due to the 1501 variable size of the State Cookie AND the variable address list. For 1502 example if a responder to the INIT has 1000 IPv4 addresses it wishes 1503 to send, it would need at least 8,000 bytes to encode this in the 1504 INIT ACK. 1506 IMPLEMENTATION NOTE: If an INIT ACK chunk is received with known 1507 parameters that are not optional parameters of the INIT ACK chunk, 1508 then the receiver SHOULD process the INIT ACK chunk and send back a 1509 COOKIE ECHO. The receiver of the INIT ACK chunk MAY bundle an ERROR 1510 chunk with the COOKIE ECHO chunk. 
However, restrictive 1511 implementations MAY send back an ABORT chunk in response to the INIT 1512 ACK chunk. 1514 In combination with the Source Port carried in the SCTP common 1515 header, each IP Address parameter in the INIT ACK indicates to the 1516 receiver of the INIT ACK a valid transport address supported by the 1517 sender of the INIT ACK for the life time of the association being 1518 initiated. 1520 If the INIT ACK contains at least one IP Address parameter, then the 1521 source address of the IP datagram containing the INIT ACK and any 1522 additional address(es) provided within the INIT ACK may be used as 1523 destinations by the receiver of the INIT ACK. If the INIT ACK does 1524 not contain any IP Address parameters, the receiver of the INIT ACK 1525 MUST use the source address associated with the received IP datagram 1526 as its sole destination address for the association. 1528 The State Cookie and Unrecognized Parameters use the Type-Length- 1529 Value format as defined in Section 3.2.1 and are described below. 1530 The other fields are defined the same as their counterparts in the 1531 INIT chunk. 1533 3.3.3.1. Optional or Variable-Length Parameters 1535 State Cookie 1537 Parameter Type Value: 7 1539 Parameter Length: Variable size, depending on size of Cookie. 1541 Parameter Value: 1543 This parameter value MUST contain all the necessary state and 1544 parameter information required for the sender of this INIT ACK to 1545 create the association, along with a Message Authentication Code 1546 (MAC). See Section 5.1.3 for details on State Cookie definition. 1548 Unrecognized Parameter: 1550 Parameter Type Value: 8 1552 Parameter Length: Variable size. 1554 Parameter Value: 1556 This parameter is returned to the originator of the INIT chunk 1557 when the INIT contains an unrecognized parameter that has a value 1558 that indicates it should be reported to the sender. 
This 1559 parameter value field will contain unrecognized parameters copied 1560 from the INIT chunk complete with Parameter Type, Length, and 1561 Value fields. 1563 3.3.4. Selective Acknowledgement (SACK) (3) 1565 This chunk is sent to the peer endpoint to acknowledge received DATA 1566 chunks and to inform the peer endpoint of gaps in the received 1567 subsequences of DATA chunks as represented by their TSNs. 1569 The SACK MUST contain the Cumulative TSN Ack, Advertised Receiver 1570 Window Credit (a_rwnd), Number of Gap Ack Blocks, and Number of 1571 Duplicate TSNs fields. 1573 By definition, the value of the Cumulative TSN Ack parameter is the 1574 last TSN received before a break in the sequence of received TSNs 1575 occurs; the next TSN value following this one has not yet been 1576 received at the endpoint sending the SACK. This parameter therefore 1577 acknowledges receipt of all TSNs less than or equal to its value. 1579 The handling of a_rwnd by the receiver of the SACK is discussed in 1580 detail in Section 6.2.1. 1582 The SACK also contains zero or more Gap Ack Blocks. Each Gap Ack 1583 Block acknowledges a subsequence of TSNs received following a break 1584 in the sequence of received TSNs. By definition, all TSNs 1585 acknowledged by Gap Ack Blocks are greater than the value of the 1586 Cumulative TSN Ack. 
1588 0 1 2 3 1589 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1590 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1591 | Type = 3 |Chunk Flags | Chunk Length | 1592 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1593 | Cumulative TSN Ack | 1594 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1595 | Advertised Receiver Window Credit (a_rwnd) | 1596 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1597 | Number of Gap Ack Blocks = N | Number of Duplicate TSNs = X | 1598 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1599 | Gap Ack Block #1 Start | Gap Ack Block #1 End | 1600 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1601 / / 1602 \ ... \ 1603 / / 1604 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1605 | Gap Ack Block #N Start | Gap Ack Block #N End | 1606 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1607 | Duplicate TSN 1 | 1608 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1609 / / 1610 \ ... \ 1611 / / 1612 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1613 | Duplicate TSN X | 1614 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1616 Chunk Flags: 8 bits 1618 Set to all '0's on transmit and ignored on receipt. 1620 Cumulative TSN Ack: 32 bits (unsigned integer) 1621 This parameter contains the TSN of the last DATA chunk received in 1622 sequence before a gap. In the case where no DATA chunk has been 1623 received, this value is set to the peer's Initial TSN minus one. 1625 Advertised Receiver Window Credit (a_rwnd): 32 bits (unsigned 1626 integer) 1628 This field indicates the updated receive buffer space in bytes of 1629 the sender of this SACK; see Section 6.2.1 for details. 1631 Number of Gap Ack Blocks: 16 bits (unsigned integer) 1633 Indicates the number of Gap Ack Blocks included in this SACK. 
1635 Number of Duplicate TSNs: 16 bits (unsigned integer) 1637 This field contains the number of duplicate TSNs the endpoint has 1638 received. Each duplicate TSN is listed following the Gap Ack 1639 Block list. 1641 Gap Ack Blocks: 1643 These fields contain the Gap Ack Blocks. They are repeated for 1644 each Gap Ack Block up to the number of Gap Ack Blocks defined in 1645 the Number of Gap Ack Blocks field. All DATA chunks with TSNs 1646 greater than or equal to (Cumulative TSN Ack + Gap Ack Block 1647 Start) and less than or equal to (Cumulative TSN Ack + Gap Ack 1648 Block End) of each Gap Ack Block are assumed to have been received 1649 correctly. 1651 Gap Ack Block Start: 16 bits (unsigned integer) 1653 Indicates the Start offset TSN for this Gap Ack Block. To 1654 calculate the actual TSN number, the Cumulative TSN Ack is added to 1655 this offset number. This calculated TSN identifies the first TSN 1656 in this Gap Ack Block that has been received. 1658 Gap Ack Block End: 16 bits (unsigned integer) 1660 Indicates the End offset TSN for this Gap Ack Block. To calculate 1661 the actual TSN number, the Cumulative TSN Ack is added to this 1662 offset number. This calculated TSN identifies the TSN of the last 1663 DATA chunk received in this Gap Ack Block.
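The offset arithmetic for Gap Ack Blocks described above can be sketched as a small routine that expands the blocks into absolute TSNs. This is a minimal sketch under stated assumptions: struct gap_ack_block and gap_ack_expand are hypothetical names, the offsets are assumed to be in host byte order, a block whose Start exceeds its End is simply skipped, and the addition is done in unsigned 32-bit arithmetic so that it wraps modulo 2^32.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical in-memory form of one Gap Ack Block (host order). */
struct gap_ack_block {
    uint16_t start; /* Start offset from the Cumulative TSN Ack */
    uint16_t end;   /* End offset from the Cumulative TSN Ack   */
};

/* Writes the absolute TSNs acknowledged by the blocks into out (up to
 * out_cap entries) and returns how many TSNs the blocks cover in
 * total, which may exceed out_cap. */
static size_t gap_ack_expand(uint32_t cum_tsn_ack,
                             const struct gap_ack_block *blocks,
                             size_t n_blocks,
                             uint32_t *out, size_t out_cap)
{
    size_t count = 0;
    for (size_t i = 0; i < n_blocks; i++) {
        if (blocks[i].start > blocks[i].end)
            continue; /* assumption: ignore a malformed block */
        for (uint32_t off = blocks[i].start; off <= blocks[i].end; off++) {
            if (count < out_cap)
                out[count] = cum_tsn_ack + off; /* wraps modulo 2^32 */
            count++;
        }
    }
    return count;
}
```

With a Cumulative TSN Ack of 12 and the two blocks (2, 3) and (5, 5), the sketch yields TSNs 14, 15, and 17.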
1665 For example, assume that the receiver has the following DATA 1666 chunks newly arrived at the time when it decides to send a 1667 Selective ACK, 1668 ---------- 1669 | TSN=17 | 1670 ---------- 1671 | | <- still missing 1672 ---------- 1673 | TSN=15 | 1674 ---------- 1675 | TSN=14 | 1676 ---------- 1677 | | <- still missing 1678 ---------- 1679 | TSN=12 | 1680 ---------- 1681 | TSN=11 | 1682 ---------- 1683 | TSN=10 | 1684 ---------- 1686 then the parameter part of the SACK MUST be constructed as follows 1687 (assuming the new a_rwnd is set to 4660 by the sender): 1689 +--------------------------------+ 1690 | Cumulative TSN Ack = 12 | 1691 +--------------------------------+ 1692 | a_rwnd = 4660 | 1693 +----------------+---------------+ 1694 | num of block=2 | num of dup=0 | 1695 +----------------+---------------+ 1696 |block #1 strt=2 |block #1 end=3 | 1697 +----------------+---------------+ 1698 |block #2 strt=5 |block #2 end=5 | 1699 +----------------+---------------+ 1701 Duplicate TSN: 32 bits (unsigned integer) 1703 Indicates the number of times a TSN was received in duplicate 1704 since the last SACK was sent. Every time a receiver gets a 1705 duplicate TSN (before sending the SACK), it adds it to the list of 1706 duplicates. The duplicate count is reinitialized to zero after 1707 sending each SACK. 1709 For example, if a receiver were to get the TSN 19 three times it 1710 would list 19 twice in the outbound SACK. After sending the SACK, if 1711 it received yet one more TSN 19 it would list 19 as a duplicate once 1712 in the next outgoing SACK. 1714 3.3.5. Heartbeat Request (HEARTBEAT) (4) 1716 An endpoint should send this chunk to its peer endpoint to probe the 1717 reachability of a particular destination transport address defined in 1718 the present association. 1720 The parameter field contains the Heartbeat Information, which is a 1721 variable-length opaque data structure understood only by the sender. 
1723 0 1 2 3 1724 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1725 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1726 | Type = 4 | Chunk Flags | Heartbeat Length | 1727 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1728 \ \ 1729 / Heartbeat Information TLV (Variable-Length) / 1730 \ \ 1731 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1733 Chunk Flags: 8 bits 1735 Set to 0 on transmit and ignored on receipt. 1737 Heartbeat Length: 16 bits (unsigned integer) 1739 Set to the size of the chunk in bytes, including the chunk header 1740 and the Heartbeat Information field. 1742 Heartbeat Information: variable length 1744 Defined as a variable-length parameter using the format described 1745 in Section 3.2.1, i.e.: 1747 Variable Parameters Status Type Value 1748 ------------------------------------------------------------- 1749 Heartbeat Info Mandatory 1 1751 0 1 2 3 1752 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1753 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1754 | Heartbeat Info Type=1 | HB Info Length | 1755 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1756 / Sender-Specific Heartbeat Info / 1757 \ \ 1758 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1760 The Sender-Specific Heartbeat Info field should normally include 1761 information about the sender's current time when this HEARTBEAT 1762 chunk is sent and the destination transport address to which this 1763 HEARTBEAT is sent (see Section 8.3). This information is simply 1764 reflected back by the receiver in the HEARTBEAT ACK message (see 1765 Section 3.3.6). Note also that the HEARTBEAT message is both for 1766 reachability checking and for path verification (see Section 5.4). 1767 When a HEARTBEAT chunk is being used for path verification 1768 purposes, it MUST hold a 64-bit random nonce. 1770 3.3.6. 
Heartbeat Acknowledgement (HEARTBEAT ACK) (5) 1772 An endpoint should send this chunk to its peer endpoint as a response 1773 to a HEARTBEAT chunk (see Section 8.3). A HEARTBEAT ACK is always 1774 sent to the source IP address of the IP datagram containing the 1775 HEARTBEAT chunk to which this ack is responding. 1777 The parameter field contains a variable-length opaque data structure. 1779 0 1 2 3 1780 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1781 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1782 | Type = 5 | Chunk Flags | Heartbeat Ack Length | 1783 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1784 \ \ 1785 / Heartbeat Information TLV (Variable-Length) / 1786 \ \ 1787 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1789 Chunk Flags: 8 bits 1791 Set to 0 on transmit and ignored on receipt. 1793 Heartbeat Ack Length: 16 bits (unsigned integer) 1795 Set to the size of the chunk in bytes, including the chunk header 1796 and the Heartbeat Information field. 1798 Heartbeat Information: variable length 1800 This field MUST contain the Heartbeat Information parameter of the 1801 Heartbeat Request to which this Heartbeat Acknowledgement is 1802 responding. 1804 Variable Parameters Status Type Value 1805 ------------------------------------------------------------- 1806 Heartbeat Info Mandatory 1 1808 3.3.7. Abort Association (ABORT) (6) 1810 The ABORT chunk is sent to the peer of an association to close the 1811 association. The ABORT chunk may contain Cause Parameters to inform 1812 the receiver about the reason of the abort. DATA chunks MUST NOT be 1813 bundled with ABORT. Control chunks (except for INIT, INIT ACK, and 1814 SHUTDOWN COMPLETE) MAY be bundled with an ABORT, but they MUST be 1815 placed before the ABORT in the SCTP packet or they will be ignored by 1816 the receiver. 
1818 If an endpoint receives an ABORT with a format error or no TCB is 1819 found, it MUST silently discard it. Moreover, under any 1820 circumstances, an endpoint that receives an ABORT MUST NOT respond to 1821 that ABORT by sending an ABORT of its own. 1823 0 1 2 3 1824 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1825 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1826 | Type = 6 |Reserved |T| Length | 1827 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1828 \ \ 1829 / zero or more Error Causes / 1830 \ \ 1831 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1833 Chunk Flags: 8 bits 1835 Reserved: 7 bits 1837 Set to 0 on transmit and ignored on receipt. 1839 T bit: 1 bit 1841 The T bit is set to 0 if the sender filled in the Verification 1842 Tag expected by the peer. If the Verification Tag is 1843 reflected, the T bit MUST be set to 1. Reflecting means that 1844 the sent Verification Tag is the same as the received one. 1846 Note: Special rules apply to this chunk for verification; 1847 please see Section 8.5.1 for details. 1849 Length: 16 bits (unsigned integer) 1851 Set to the size of the chunk in bytes, including the chunk header 1852 and all the Error Cause fields present. 1854 See Section 3.3.10 for Error Cause definitions. 1856 3.3.8. Shutdown Association (SHUTDOWN) (7) 1858 An endpoint in an association MUST use this chunk to initiate a 1859 graceful close of the association with its peer. This chunk has the 1860 following format. 1862 0 1 2 3 1863 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1864 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1865 | Type = 7 | Chunk Flags | Length = 8 | 1866 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1867 | Cumulative TSN Ack | 1868 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1870 Chunk Flags: 8 bits 1872 Set to 0 on transmit and ignored on receipt. 
1874 Length: 16 bits (unsigned integer) 1876 Indicates the length of the chunk in bytes. Set to 8. 1878 Cumulative TSN Ack: 32 bits (unsigned integer) 1880 This parameter contains the TSN of the last chunk received in 1881 sequence before any gaps. 1883 Note: Since the SHUTDOWN message does not contain Gap Ack Blocks, 1884 it cannot be used to acknowledge TSNs received out of order. In a 1885 SACK, the lack of Gap Ack Blocks that were previously included 1886 indicates that the data receiver reneged on the associated DATA 1887 chunks. Since SHUTDOWN does not contain Gap Ack Blocks, the 1888 receiver of the SHUTDOWN should not interpret the lack of a Gap Ack 1889 Block as a renege. (See Section 6.2 for information on reneging.) 1891 3.3.9. Shutdown Acknowledgement (SHUTDOWN ACK) (8) 1893 This chunk MUST be used to acknowledge the receipt of the SHUTDOWN 1894 chunk at the completion of the shutdown process; see Section 9.2 for 1895 details. 1897 The SHUTDOWN ACK chunk has no parameters. 1899 0 1 2 3 1900 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1901 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1902 | Type = 8 |Chunk Flags | Length = 4 | 1903 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1905 Chunk Flags: 8 bits 1907 Set to 0 on transmit and ignored on receipt. 1909 3.3.10. Operation Error (ERROR) (9) 1911 An endpoint sends this chunk to its peer endpoint to notify it of 1912 certain error conditions. It contains one or more error causes. An 1913 Operation Error is not considered fatal in and of itself, but may be 1914 used with an ABORT chunk to report a fatal condition.
It has the 1915 following parameters: 1917 0 1 2 3 1918 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1919 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1920 | Type = 9 | Chunk Flags | Length | 1921 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1922 \ \ 1923 / one or more Error Causes / 1924 \ \ 1925 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1927 Chunk Flags: 8 bits 1929 Set to 0 on transmit and ignored on receipt. 1931 Length: 16 bits (unsigned integer) 1933 Set to the size of the chunk in bytes, including the chunk header 1934 and all the Error Cause fields present. 1936 Error causes are defined as variable-length parameters using the 1937 format described in Section 3.2.1, that is: 1939 0 1 2 3 1940 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1941 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1942 | Cause Code | Cause Length | 1943 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1944 / Cause-Specific Information / 1945 \ \ 1946 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1948 Cause Code: 16 bits (unsigned integer) 1950 Defines the type of error conditions being reported. 1952 Cause Code 1953 Value Cause Code 1954 --------- ---------------- 1955 1 Invalid Stream Identifier 1956 2 Missing Mandatory Parameter 1957 3 Stale Cookie Error 1958 4 Out of Resource 1959 5 Unresolvable Address 1960 6 Unrecognized Chunk Type 1961 7 Invalid Mandatory Parameter 1962 8 Unrecognized Parameters 1963 9 No User Data 1964 10 Cookie Received While Shutting Down 1965 11 Restart of an Association with New Addresses 1966 12 User Initiated Abort 1967 13 Protocol Violation 1968 Cause Length: 16 bits (unsigned integer) 1970 Set to the size of the parameter in bytes, including the Cause 1971 Code, Cause Length, and Cause-Specific Information fields. 
1972 Cause-Specific Information: variable length 1974 This field carries the details of the error condition. 1976 Sections 3.3.10.1 through 3.3.10.13 define error causes for SCTP. 1977 Guidelines for the IETF to define new error cause values are 1978 discussed in Section 14.3. 1980 3.3.10.1. Invalid Stream Identifier (1) 1982 Cause of error 1983 --------------- 1985 Invalid Stream Identifier: Indicates that the endpoint received a DATA chunk 1986 sent to a nonexistent stream. 1988 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1989 | Cause Code=1 | Cause Length=8 | 1990 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1991 | Stream Identifier | (Reserved) | 1992 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1994 Stream Identifier: 16 bits (unsigned integer) 1996 Contains the Stream Identifier of the DATA chunk received in 1997 error. 1999 Reserved: 16 bits 2000 This field is reserved. It is set to all 0's on transmit and 2001 ignored on receipt. 2003 3.3.10.2. Missing Mandatory Parameter (2) 2005 Cause of error 2006 --------------- 2008 Missing Mandatory Parameter: Indicates that one or more mandatory TLV 2009 parameters are missing from a received INIT or INIT ACK. 2011 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2012 | Cause Code=2 | Cause Length=8+N*2 | 2013 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2014 | Number of missing params=N | 2015 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2016 | Missing Param Type #1 | Missing Param Type #2 | 2017 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2018 | Missing Param Type #N-1 | Missing Param Type #N | 2019 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2021 Number of Missing params: 32 bits (unsigned integer) 2023 This field contains the number of parameters contained in the 2024 Cause-Specific Information field.
2026 Missing Param Type: 16 bits (unsigned integer) 2028 Each field will contain the missing mandatory parameter number. 2030 3.3.10.3. Stale Cookie Error (3) 2032 Cause of error 2033 -------------- 2035 Stale Cookie Error: Indicates the receipt of a valid State Cookie 2036 that has expired. 2038 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2039 | Cause Code=3 | Cause Length=8 | 2040 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2041 | Measure of Staleness (usec.) | 2042 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2044 Measure of Staleness: 32 bits (unsigned integer) 2046 This field contains the difference, in microseconds, between the 2047 current time and the time the State Cookie expired. 2049 The sender of this error cause MAY choose to report how long past 2050 expiration the State Cookie is by including a non-zero value in 2051 the Measure of Staleness field. If the sender does not wish to 2052 provide this information, it should set the Measure of Staleness 2053 field to the value of zero. 2055 3.3.10.4. Out of Resource (4) 2057 Cause of error 2058 --------------- 2060 Out of Resource: Indicates that the sender is out of resource. This 2061 is usually sent in combination with or within an ABORT. 2063 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2064 | Cause Code=4 | Cause Length=4 | 2065 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2067 3.3.10.5. Unresolvable Address (5) 2069 Cause of error 2070 --------------- 2072 Unresolvable Address: Indicates that the sender is not able to 2073 resolve the specified address parameter (e.g., type of address is not 2074 supported by the sender). This is usually sent in combination with 2075 or within an ABORT. 
2077 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2078 | Cause Code=5 | Cause Length | 2079 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2080 / Unresolvable Address / 2081 \ \ 2082 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2084 Unresolvable Address: variable length 2086 The Unresolvable Address field contains the complete Type, Length, 2087 and Value of the address parameter (or Host Name parameter) that 2088 contains the unresolvable address or host name. 2090 3.3.10.6. Unrecognized Chunk Type (6) 2092 Cause of error 2093 --------------- 2094 Unrecognized Chunk Type: This error cause is returned to the 2095 originator of the chunk if the receiver does not understand the chunk 2096 and the upper bits of the 'Chunk Type' are set to 01 or 11. 2098 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2099 | Cause Code=6 | Cause Length | 2100 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2101 / Unrecognized Chunk / 2102 \ \ 2103 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2105 Unrecognized Chunk: variable length 2107 The Unrecognized Chunk field contains the unrecognized chunk from 2108 the SCTP packet complete with Chunk Type, Chunk Flags, and Chunk 2109 Length. 2111 3.3.10.7. Invalid Mandatory Parameter (7) 2113 Cause of error 2114 --------------- 2116 Invalid Mandatory Parameter: This error cause is returned to the 2117 originator of an INIT or INIT ACK chunk when one of the mandatory 2118 parameters is set to an invalid value. 2120 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2121 | Cause Code=7 | Cause Length=4 | 2122 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2124 3.3.10.8. 
Unrecognized Parameters (8) 2126 Cause of error 2127 --------------- 2129 Unrecognized Parameters: This error cause is returned to the 2130 originator of the INIT ACK chunk if the receiver does not recognize 2131 one or more Optional TLV parameters in the INIT ACK chunk. 2133 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2134 | Cause Code=8 | Cause Length | 2135 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2136 / Unrecognized Parameters / 2137 \ \ 2138 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2140 Unrecognized Parameters: variable length 2141 The Unrecognized Parameters field contains the unrecognized 2142 parameters copied from the INIT ACK chunk complete with TLV. This 2143 error cause is normally contained in an ERROR chunk bundled with 2144 the COOKIE ECHO chunk when responding to the INIT ACK, when the 2145 sender of the COOKIE ECHO chunk wishes to report unrecognized 2146 parameters. 2148 3.3.10.9. No User Data (9) 2150 Cause of error 2151 --------------- 2153 No User Data: This error cause is returned to the originator of a 2154 DATA chunk if a received DATA chunk has no user data. 2156 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2157 | Cause Code=9 | Cause Length=8 | 2158 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2159 / TSN value / 2160 \ \ 2161 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2163 TSN value: 32 bits (unsigned integer) 2165 The TSN value field contains the TSN of the DATA chunk received 2166 with no user data field. 2168 This cause code is normally returned in an ABORT chunk (see 2169 Section 6.2). 2171 3.3.10.10. Cookie Received While Shutting Down (10) 2173 Cause of error 2174 --------------- 2176 Cookie Received While Shutting Down: A COOKIE ECHO was received while 2177 the endpoint was in the SHUTDOWN-ACK-SENT state. 
This error is 2178 usually returned in an ERROR chunk bundled with the retransmitted 2179 SHUTDOWN ACK. 2181 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2182 | Cause Code=10 | Cause Length=4 | 2183 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2185 3.3.10.11. Restart of an Association with New Addresses (11) 2187 Cause of error 2188 -------------- 2190 Restart of an association with new addresses: An INIT was received on 2191 an existing association. But the INIT added addresses to the 2192 association that were previously NOT part of the association. The 2193 new addresses are listed in the error code. This ERROR is normally 2194 sent as part of an ABORT refusing the INIT (see Section 5.2). 2196 0 1 2 3 2197 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2198 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2199 | Cause Code=11 | Cause Length=Variable | 2200 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2201 / New Address TLVs / 2202 \ \ 2203 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2205 Note: Each New Address TLV is an exact copy of the TLV that was found 2206 in the INIT chunk that was new, including the Parameter Type and the 2207 Parameter Length. 2209 3.3.10.12. User-Initiated Abort (12) 2211 Cause of error 2212 -------------- 2214 This error cause MAY be included in ABORT chunks that are sent 2215 because of an upper-layer request. The upper layer can specify an 2216 Upper Layer Abort Reason that is transported by SCTP transparently 2217 and MAY be delivered to the upper-layer protocol at the peer. 
2219 0 1 2 3 2220 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2221 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2222 | Cause Code=12 | Cause Length=Variable | 2223 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2224 / Upper Layer Abort Reason / 2225 \ \ 2226 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2228 3.3.10.13. Protocol Violation (13) 2230 Cause of error 2231 -------------- 2232 This error cause MAY be included in ABORT chunks that are sent 2233 because an SCTP endpoint detects a protocol violation of the peer 2234 that is not covered by the error causes described in Section 3.3.10.1 2235 to Section 3.3.10.12. An implementation MAY provide additional 2236 information specifying what kind of protocol violation has been 2237 detected. 2239 0 1 2 3 2240 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2241 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2242 | Cause Code=13 | Cause Length=Variable | 2243 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2244 / Additional Information / 2245 \ \ 2246 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2248 3.3.11. Cookie Echo (COOKIE ECHO) (10) 2250 This chunk is used only during the initialization of an association. 2251 It is sent by the initiator of an association to its peer to complete 2252 the initialization process. This chunk MUST precede any DATA chunk 2253 sent within the association, but MAY be bundled with one or more DATA 2254 chunks in the same packet. 
2256 0 1 2 3 2257 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2258 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2259 | Type = 10 |Chunk Flags | Length | 2260 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2261 / Cookie / 2262 \ \ 2263 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2265 Chunk Flags: 8 bits 2267 Set to 0 on transmit and ignored on receipt. 2269 Length: 16 bits (unsigned integer) 2271 Set to the size of the chunk in bytes, including the 4 bytes of 2272 the chunk header and the size of the cookie. 2274 Cookie: variable size 2276 This field must contain the exact cookie received in the State 2277 Cookie parameter from the previous INIT ACK. 2279 An implementation SHOULD make the cookie as small as possible to 2280 ensure interoperability. 2282 Note: A Cookie Echo does NOT contain a State Cookie parameter; 2283 instead, the data within the State Cookie's Parameter Value 2284 becomes the data within the Cookie Echo's Chunk Value. This 2285 allows an implementation to change only the first 2 bytes of the 2286 State Cookie parameter to become a COOKIE ECHO chunk. 2288 3.3.12. Cookie Acknowledgement (COOKIE ACK) (11) 2290 This chunk is used only during the initialization of an association. 2291 It is used to acknowledge the receipt of a COOKIE ECHO chunk. This 2292 chunk MUST precede any DATA or SACK chunk sent within the 2293 association, but MAY be bundled with one or more DATA chunks or SACK 2294 chunks in the same SCTP packet. 2296 0 1 2 3 2297 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2298 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2299 | Type = 11 |Chunk Flags | Length = 4 | 2300 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2302 Chunk Flags: 8 bits 2304 Set to 0 on transmit and ignored on receipt. 2306 3.3.13.
Shutdown Complete (SHUTDOWN COMPLETE) (14) 2308 This chunk MUST be used to acknowledge the receipt of the SHUTDOWN 2309 ACK chunk at the completion of the shutdown process; see Section 9.2 2310 for details. 2312 The SHUTDOWN COMPLETE chunk has no parameters. 2314 0 1 2 3 2315 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2316 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2317 | Type = 14 |Reserved |T| Length = 4 | 2318 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 2320 Chunk Flags: 8 bits 2322 Reserved: 7 bits 2324 Set to 0 on transmit and ignored on receipt. 2326 T bit: 1 bit 2327 The T bit is set to 0 if the sender filled in the Verification 2328 Tag expected by the peer. If the Verification Tag is 2329 reflected, the T bit MUST be set to 1. Reflecting means that 2330 the sent Verification Tag is the same as the received one. 2332 Note: Special rules apply to this chunk for verification; please see 2333 Section 8.5.1 for details. 2335 4. SCTP Association State Diagram 2337 During the lifetime of an SCTP association, the SCTP endpoint's 2338 association progresses from one state to another in response to 2339 various events. The events that may advance an 2340 association's state include: 2342 o SCTP user primitive calls, e.g., [ASSOCIATE], [SHUTDOWN], [ABORT], 2343 o Reception of INIT, COOKIE ECHO, ABORT, SHUTDOWN, etc., control 2344 chunks, or 2345 o Some timeout events. 2347 The state diagram in the figure below illustrates state changes, 2348 together with the causing events and resulting actions. Note that 2349 some of the error conditions are not shown in the state diagram. 2350 Full descriptions of all special cases are found in the text. 2352 Note: Chunk names are given in all capital letters, while parameter 2353 names have the first letter capitalized, e.g., COOKIE ECHO chunk type 2354 vs. State Cookie parameter.
If more than one event/message can occur 2355 that causes a state transition, it is labeled (A), (B), etc. 2357 ----- -------- (from any state) 2358 / \ / rcv ABORT [ABORT] 2359 rcv INIT | | | ---------- or ---------- 2360 --------------- | v v delete TCB snd ABORT 2361 generate Cookie \ +---------+ delete TCB 2362 snd INIT ACK ---| CLOSED | 2363 +---------+ 2364 / \ [ASSOCIATE] 2365 / \ --------------- 2366 | | create TCB 2367 | | snd INIT 2368 | | strt init timer 2369 rcv valid | | 2370 COOKIE ECHO | v 2371 (1) ---------------- | +------------+ 2372 create TCB | | COOKIE-WAIT| (2) 2373 snd COOKIE ACK | +------------+ 2374 | | 2375 | | rcv INIT ACK 2376 | | ----------------- 2377 | | snd COOKIE ECHO 2378 | | stop init timer 2379 | | strt cookie timer 2380 | v 2381 | +--------------+ 2382 | | COOKIE-ECHOED| (3) 2383 | +--------------+ 2384 | | 2385 | | rcv COOKIE ACK 2386 | | ----------------- 2387 | | stop cookie timer 2388 v v 2389 +---------------+ 2390 | ESTABLISHED | 2391 +---------------+ 2392 | 2393 | 2394 /----+------------\ 2395 [SHUTDOWN] / \ 2396 -------------------| | 2397 check outstanding | | 2398 DATA chunks | | 2399 v | 2400 +---------+ | 2401 |SHUTDOWN-| | rcv SHUTDOWN 2402 |PENDING | |------------------ 2403 +---------+ | check outstanding 2404 | | DATA chunks 2405 No more outstanding | | 2406 ---------------------| | 2407 snd SHUTDOWN | | 2408 strt shutdown timer | | 2409 v v 2410 +---------+ +-----------+ 2411 (4) |SHUTDOWN-| | SHUTDOWN- | (5,6) 2412 |SENT | | RECEIVED | 2413 +---------+ +-----------+ 2414 | \ | 2415 (A) rcv SHUTDOWN ACK | \ | 2416 ----------------------| \ | 2417 stop shutdown timer | \rcv:SHUTDOWN | 2418 send SHUTDOWN COMPLETE| \ (B) | 2419 delete TCB | \ | 2420 | \ | No more outstanding 2421 | \ |----------------- 2422 | \ | send SHUTDOWN ACK 2423 (B)rcv SHUTDOWN | \ | strt shutdown timer 2424 ----------------------| \ | 2425 send SHUTDOWN ACK | \ | 2426 start shutdown timer | \ | 2427 move to SHUTDOWN- | \ | 2428 ACK-SENT | 
| | 2429 | v | 2430 | +-----------+ 2431 | | SHUTDOWN- | (7) 2432 | | ACK-SENT | 2433 | +----------+- 2434 | | (C)rcv SHUTDOWN COMPLETE 2435 | |----------------- 2436 | | stop shutdown timer 2437 | | delete TCB 2438 | | 2439 | | (D)rcv SHUTDOWN ACK 2440 | |-------------- 2441 | | stop shutdown timer 2442 | | send SHUTDOWN COMPLETE 2443 | | delete TCB 2444 | | 2445 \ +---------+ / 2446 \-->| CLOSED |<--/ 2447 +---------+ 2449 Figure 3: State Transition Diagram of SCTP 2451 Notes: 2453 1) If the State Cookie in the received COOKIE ECHO is invalid (i.e., 2454 failed to pass the integrity check), the receiver MUST silently 2455 discard the packet. Or, if the received State Cookie is expired 2456 (see Section 5.1.5), the receiver MUST send back an ERROR chunk. 2457 In either case, the receiver stays in the CLOSED state. 2458 2) If the T1-init timer expires, the endpoint MUST retransmit INIT 2459 and restart the T1-init timer without changing state. This MUST 2460 be repeated up to 'Max.Init.Retransmits' times. After that, the 2461 endpoint MUST abort the initialization process and report the 2462 error to the SCTP user. 2463 3) If the T1-cookie timer expires, the endpoint MUST retransmit 2464 COOKIE ECHO and restart the T1-cookie timer without changing 2465 state. This MUST be repeated up to 'Max.Init.Retransmits' times. 2466 After that, the endpoint MUST abort the initialization process 2467 and report the error to the SCTP user. 2469 4) In the SHUTDOWN-SENT state, the endpoint MUST acknowledge any 2470 received DATA chunks without delay. 2471 5) In the SHUTDOWN-RECEIVED state, the endpoint MUST NOT accept any 2472 new send requests from its SCTP user. 2473 6) In the SHUTDOWN-RECEIVED state, the endpoint MUST transmit or 2474 retransmit data and leave this state when all data in queue is 2475 transmitted. 2476 7) In the SHUTDOWN-ACK-SENT state, the endpoint MUST NOT accept any 2477 new send requests from its SCTP user. 
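The transitions drawn in Figure 3 can be summarized as a lookup from (state, event) to next state. The following is only a minimal sketch under stated assumptions: the type and function names (sctp_state, sctp_event, sctp_next_state) are illustrative and are not defined by this document, and the actions attached to each transition (starting and stopping timers, creating and deleting the TCB, sending chunks) as well as the error cases covered by the notes above are deliberately left out.

```c
/* States shown in Figure 3. */
typedef enum {
    ST_CLOSED,
    ST_COOKIE_WAIT,
    ST_COOKIE_ECHOED,
    ST_ESTABLISHED,
    ST_SHUTDOWN_PENDING,
    ST_SHUTDOWN_SENT,
    ST_SHUTDOWN_RECEIVED,
    ST_SHUTDOWN_ACK_SENT
} sctp_state;

/* Events shown in Figure 3 (hypothetical names). */
typedef enum {
    EV_ASSOCIATE,              /* [ASSOCIATE] primitive             */
    EV_RCV_VALID_COOKIE_ECHO,  /* rcv valid COOKIE ECHO             */
    EV_RCV_INIT_ACK,           /* rcv INIT ACK                      */
    EV_RCV_COOKIE_ACK,         /* rcv COOKIE ACK                    */
    EV_SHUTDOWN,               /* [SHUTDOWN] primitive              */
    EV_RCV_SHUTDOWN,           /* rcv SHUTDOWN                      */
    EV_NO_MORE_OUTSTANDING,    /* no more outstanding DATA chunks   */
    EV_RCV_SHUTDOWN_ACK,       /* rcv SHUTDOWN ACK                  */
    EV_RCV_SHUTDOWN_COMPLETE,  /* rcv SHUTDOWN COMPLETE             */
    EV_ABORT                   /* rcv ABORT or [ABORT] primitive    */
} sctp_event;

/* Return the state that follows 's' when 'e' occurs.  Events that
 * Figure 3 does not show for a state leave the state unchanged. */
sctp_state sctp_next_state(sctp_state s, sctp_event e)
{
    if (e == EV_ABORT)
        return ST_CLOSED;                      /* from any state */

    switch (s) {
    case ST_CLOSED:
        if (e == EV_ASSOCIATE)             return ST_COOKIE_WAIT;
        if (e == EV_RCV_VALID_COOKIE_ECHO) return ST_ESTABLISHED;
        break;
    case ST_COOKIE_WAIT:
        if (e == EV_RCV_INIT_ACK)          return ST_COOKIE_ECHOED;
        break;
    case ST_COOKIE_ECHOED:
        if (e == EV_RCV_COOKIE_ACK)        return ST_ESTABLISHED;
        break;
    case ST_ESTABLISHED:
        if (e == EV_SHUTDOWN)              return ST_SHUTDOWN_PENDING;
        if (e == EV_RCV_SHUTDOWN)          return ST_SHUTDOWN_RECEIVED;
        break;
    case ST_SHUTDOWN_PENDING:
        if (e == EV_NO_MORE_OUTSTANDING)   return ST_SHUTDOWN_SENT;
        break;
    case ST_SHUTDOWN_SENT:
        if (e == EV_RCV_SHUTDOWN_ACK)      return ST_CLOSED;            /* (A) */
        if (e == EV_RCV_SHUTDOWN)          return ST_SHUTDOWN_ACK_SENT; /* (B) */
        break;
    case ST_SHUTDOWN_RECEIVED:
        if (e == EV_NO_MORE_OUTSTANDING)   return ST_SHUTDOWN_ACK_SENT;
        break;
    case ST_SHUTDOWN_ACK_SENT:
        if (e == EV_RCV_SHUTDOWN_COMPLETE) return ST_CLOSED;            /* (C) */
        if (e == EV_RCV_SHUTDOWN_ACK)      return ST_CLOSED;            /* (D) */
        break;
    }
    return s;
}
```

Keeping the transition function free of side effects is one possible design choice; it lets the table be checked against the figure independently of timer and TCB handling.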
2479 The CLOSED state is used to indicate that an association is not 2480 created (i.e., doesn't exist). 2482 5. Association Initialization 2484 Before the first data transmission can take place from one SCTP 2485 endpoint ("A") to another SCTP endpoint ("Z"), the two endpoints must 2486 complete an initialization process in order to set up an SCTP 2487 association between them. 2489 The SCTP user at an endpoint should use the ASSOCIATE primitive to 2490 initialize an SCTP association to another SCTP endpoint. 2492 IMPLEMENTATION NOTE: From an SCTP user's point of view, an 2493 association may be implicitly opened, without an ASSOCIATE primitive 2494 (see Section 10.1 B) being invoked, by the initiating endpoint's 2495 sending of the first user data to the destination endpoint. The 2496 initiating SCTP will assume default values for all mandatory and 2497 optional parameters for the INIT/INIT ACK. 2499 Once the association is established, unidirectional streams are open 2500 for data transfer on both ends (see Section 5.1.1). 2502 5.1. Normal Establishment of an Association 2504 The initialization process consists of the following steps (assuming 2505 that SCTP endpoint "A" tries to set up an association with SCTP 2506 endpoint "Z" and "Z" accepts the new association): 2508 A) "A" first sends an INIT chunk to "Z". In the INIT, "A" must 2509 provide its Verification Tag (Tag_A) in the Initiate Tag field. 2510 Tag_A SHOULD be a random number in the range of 1 to 4294967295 2511 (see Section 5.3.1 for Tag value selection). After sending the 2512 INIT, "A" starts the T1-init timer and enters the COOKIE-WAIT 2513 state. 2515 B) "Z" shall respond immediately with an INIT ACK chunk. The 2516 destination IP address of the INIT ACK MUST be set to the source 2517 IP address of the INIT to which this INIT ACK is responding. 
In 2518 the response, besides filling in other parameters, "Z" must set 2519 the Verification Tag field to Tag_A, and also provide its own 2520 Verification Tag (Tag_Z) in the Initiate Tag field. 2522 Moreover, "Z" MUST generate and send along with the INIT ACK a 2523 State Cookie. See Section 5.1.3 for State Cookie generation. 2525 Note: After sending out INIT ACK with the State Cookie parameter, 2526 "Z" MUST NOT allocate any resources or keep any states for the 2527 new association. Otherwise, "Z" will be vulnerable to resource 2528 attacks. 2530 C) Upon reception of the INIT ACK from "Z", "A" shall stop the 2531 T1-init timer and leave the COOKIE-WAIT state. "A" shall then 2532 send the State Cookie received in the INIT ACK chunk in a COOKIE 2533 ECHO chunk, start the T1-cookie timer, and enter the COOKIE- 2534 ECHOED state. 2536 Note: The COOKIE ECHO chunk can be bundled with any pending 2537 outbound DATA chunks, but it MUST be the first chunk in the 2538 packet and until the COOKIE ACK is returned the sender MUST NOT 2539 send any other packets to the peer. 2541 D) Upon reception of the COOKIE ECHO chunk, endpoint "Z" will reply 2542 with a COOKIE ACK chunk after building a TCB and moving to the 2543 ESTABLISHED state. A COOKIE ACK chunk may be bundled with any 2544 pending DATA chunks (and/or SACK chunks), but the COOKIE ACK 2545 chunk MUST be the first chunk in the packet. 2547 IMPLEMENTATION NOTE: An implementation may choose to send the 2548 Communication Up notification to the SCTP user upon reception of 2549 a valid COOKIE ECHO chunk. 2551 E) Upon reception of the COOKIE ACK, endpoint "A" will move from the 2552 COOKIE-ECHOED state to the ESTABLISHED state, stopping the 2553 T1-cookie timer. It may also notify its ULP about the successful 2554 establishment of the association with a Communication Up 2555 notification (see Section 10). 2557 An INIT or INIT ACK chunk MUST NOT be bundled with any other chunk. 
2558 They MUST be the only chunks present in the SCTP packets that carry 2559 them. 2561 An endpoint MUST send the INIT ACK to the IP address from which it 2562 received the INIT. 2564 Note: T1-init timer and T1-cookie timer shall follow the same rules 2565 given in Section 6.3. 2567 If an endpoint receives an INIT, INIT ACK, or COOKIE ECHO chunk but 2568 decides not to establish the new association due to missing mandatory 2569 parameters in the received INIT or INIT ACK, invalid parameter 2570 values, or lack of local resources, it SHOULD respond with an ABORT 2571 chunk. It SHOULD also specify the cause of abort, such as the type 2572 of the missing mandatory parameters, etc., by including the error 2573 cause parameters with the ABORT chunk. The Verification Tag field in 2574 the common header of the outbound SCTP packet containing the ABORT 2575 chunk MUST be set to the Initiate Tag value of the peer. 2577 Note that a COOKIE ECHO chunk that does NOT pass the integrity check 2578 is NOT considered an 'invalid parameter' and requires special 2579 handling; see Section 5.1.5. 2581 After the reception of the first DATA chunk in an association the 2582 endpoint MUST immediately respond with a SACK to acknowledge the DATA 2583 chunk. Subsequent acknowledgements should be done as described in 2584 Section 6.2. 2586 When the TCB is created, each endpoint MUST set its internal 2587 Cumulative TSN Ack Point to the value of its transmitted Initial TSN 2588 minus one. 2590 IMPLEMENTATION NOTE: The IP addresses and SCTP port are generally 2591 used as the key to find the TCB within an SCTP instance. 2593 5.1.1. Handle Stream Parameters 2595 In the INIT and INIT ACK chunks, the sender of the chunk MUST 2596 indicate the number of outbound streams (OSs) it wishes to have in 2597 the association, as well as the maximum inbound streams (MISs) it 2598 will accept from the other endpoint. 
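As an illustration of how these two counts interact, the following minimal sketch derives the number of outbound streams an endpoint can actually use, applying the min(local OS, remote MIS) rule that this section states. The function names are hypothetical and not part of this specification; reporting a shortage to the upper layer is not shown.

```c
#include <stdint.h>

/* Number of outbound streams the endpoint may use after
 * initialization: min(local OS, peer's MIS). */
uint16_t sctp_usable_outbound_streams(uint16_t local_os, uint16_t peer_mis)
{
    return local_os < peer_mis ? local_os : peer_mis;
}

/* Non-zero if 'sid' lies in the valid outbound stream identifier
 * range 0 .. min(local OS, peer's MIS) - 1. */
int sctp_outbound_stream_id_valid(uint16_t sid, uint16_t local_os,
                                  uint16_t peer_mis)
{
    return sid < sctp_usable_outbound_streams(local_os, peer_mis);
}
```

For example, an endpoint that asked for 10 outbound streams from a peer whose MIS is 4 would use stream identifiers 0 through 3.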
2600 After receiving the stream configuration information from the other 2601 side, each endpoint MUST perform the following check: If the peer's 2602 MIS is less than the endpoint's OS, meaning that the peer is 2603 incapable of supporting all the outbound streams the endpoint wants 2604 to configure, the endpoint MUST use MIS outbound streams and MAY 2605 report any shortage to the upper layer. The upper layer can then 2606 choose to abort the association if the resource shortage is 2607 unacceptable. 2609 After the association is initialized, the valid outbound stream 2610 identifier range for either endpoint shall be 0 to min(local OS, 2611 remote MIS)-1. 2613 5.1.2. Handle Address Parameters 2615 During the association initialization, an endpoint shall use the 2616 following rules to discover and collect the destination transport 2617 address(es) of its peer. 2619 A) If there are no address parameters present in the received INIT 2620 or INIT ACK chunk, the endpoint shall take the source IP address 2621 from which the chunk arrives and record it, in combination with 2622 the SCTP source port number, as the only destination transport 2623 address for this peer. 2625 B) If there is a Host Name parameter present in the received INIT or 2626 INIT ACK chunk, the endpoint shall resolve that host name to a 2627 list of IP address(es) and derive the transport address(es) of 2628 this peer by combining the resolved IP address(es) with the SCTP 2629 source port. 2631 The endpoint MUST ignore any other IP Address parameters if they 2632 are also present in the received INIT or INIT ACK chunk. 2634 The time at which the receiver of an INIT resolves the host name 2635 has potential security implications to SCTP. 
If the receiver of 2636 an INIT resolves the host name upon the reception of the chunk, 2637 and the mechanism the receiver uses to resolve the host name 2638 involves potential long delay (e.g., DNS query), the receiver may 2639 open itself up to resource attacks for the period of time while 2640 it is waiting for the name resolution results before it can build 2641 the State Cookie and release local resources. 2643 Therefore, in cases where the name translation involves potential 2644 long delay, the receiver of the INIT MUST postpone the name 2645 resolution till the reception of the COOKIE ECHO chunk from the 2646 peer. In such a case, the receiver of the INIT SHOULD build the 2647 State Cookie using the received Host Name (instead of destination 2648 transport addresses) and send the INIT ACK to the source IP 2649 address from which the INIT was received. 2651 The receiver of an INIT ACK shall always immediately attempt to 2652 resolve the name upon the reception of the chunk. 2654 The receiver of the INIT or INIT ACK MUST NOT send user data 2655 (piggy-backed or stand-alone) to its peer until the host name is 2656 successfully resolved. 2658 If the name resolution is not successful, the endpoint MUST 2659 immediately send an ABORT with "Unresolvable Address" error cause 2660 to its peer. The ABORT shall be sent to the source IP address 2661 from which the last peer packet was received. 2663 C) If there are only IPv4/IPv6 addresses present in the received 2664 INIT or INIT ACK chunk, the receiver MUST derive and record all 2665 the transport addresses from the received chunk AND the source IP 2666 address that sent the INIT or INIT ACK. The transport addresses 2667 are derived by the combination of SCTP source port (from the 2668 common header) and the IP Address parameter(s) carried in the 2669 INIT or INIT ACK chunk and the source IP address of the IP 2670 datagram. 
The receiver should use only these transport addresses 2671 as destination transport addresses when sending subsequent 2672 packets to its peer. 2674 D) An INIT or INIT ACK chunk MUST be treated as belonging to an 2675 already established association (or one in the process of being 2676 established) if the use of any of the valid address parameters 2677 contained within the chunk would identify an existing TCB. 2679 IMPLEMENTATION NOTE: In some cases (e.g., when the implementation 2680 doesn't control the source IP address that is used for transmitting), 2681 an endpoint might need to include in its INIT or INIT ACK all 2682 possible IP addresses from which packets to the peer could be 2683 transmitted. 2685 After all transport addresses are derived from the INIT or INIT ACK 2686 chunk using the above rules, the endpoint shall select one of the 2687 transport addresses as the initial primary path. 2689 Note: The INIT ACK MUST be sent to the source address of the INIT. 2691 The sender of INIT may include a 'Supported Address Types' parameter 2692 in the INIT to indicate what types of address are acceptable. When 2693 this parameter is present, the receiver of INIT (initiate) MUST 2694 either use one of the address types indicated in the Supported 2695 Address Types parameter when responding to the INIT, or abort the 2696 association with an "Unresolvable Address" error cause if it is 2697 unwilling or incapable of using any of the address types indicated by 2698 its peer. 2700 IMPLEMENTATION NOTE: In the case that the receiver of an INIT ACK 2701 fails to resolve the address parameter due to an unsupported type, it 2702 can abort the initiation process and then attempt a reinitiation by 2703 using a 'Supported Address Types' parameter in the new INIT to 2704 indicate what types of address it prefers. 
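The basic 'Supported Address Types' rule for the receiver of an INIT could be checked along the following lines. This is a sketch under stated assumptions, not a definitive implementation: the function name and its arguments are hypothetical, the listed types are assumed to be already converted to host byte order, and the constants are the IPv4 Address (5) and IPv6 Address (6) parameter types and the Unresolvable Address (5) cause code defined elsewhere in this document. It covers only the basic rule and not the relaxations given in the implementation notes of this section.

```c
#include <stddef.h>
#include <stdint.h>

#define SCTP_PARAM_IPV4_ADDRESS          5
#define SCTP_PARAM_IPV6_ADDRESS          6
#define SCTP_CAUSE_UNRESOLVABLE_ADDRESS  5

/* Returns 0 if the receiver of the INIT can respond using at least one
 * of the address types listed by its peer, otherwise the cause code to
 * place in the ABORT.  'types' holds the entries of the Supported
 * Address Types parameter; pass NULL / 0 if the parameter was absent. */
int sctp_check_supported_address_types(const uint16_t *types, size_t count,
                                       int have_ipv4, int have_ipv6)
{
    size_t i;

    if (types == NULL || count == 0)
        return 0;               /* parameter absent: no restriction */

    for (i = 0; i < count; i++) {
        if (types[i] == SCTP_PARAM_IPV4_ADDRESS && have_ipv4)
            return 0;
        if (types[i] == SCTP_PARAM_IPV6_ADDRESS && have_ipv6)
            return 0;
    }
    return SCTP_CAUSE_UNRESOLVABLE_ADDRESS;
}
```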
2706 IMPLEMENTATION NOTE: If an SCTP endpoint that only supports either 2707 IPv4 or IPv6 receives IPv4 and IPv6 addresses in an INIT or INIT ACK 2708 chunk from its peer, it MUST use all the addresses belonging to the 2709 supported address family. The other addresses MAY be ignored. The 2710 endpoint SHOULD NOT respond with any kind of error indication. 2712 IMPLEMENTATION NOTE: If an SCTP endpoint lists in the 'Supported 2713 Address Types' parameter either IPv4 or IPv6, but uses the other 2714 family for sending the packet containing the INIT chunk, or if it 2715 also lists addresses of the other family in the INIT chunk, then the 2716 address family that is not listed in the 'Supported Address Types' 2717 parameter SHOULD also be considered as supported by the receiver of 2718 the INIT chunk. The receiver of the INIT chunk SHOULD NOT respond 2719 with any kind of error indication. 2721 5.1.3. Generating State Cookie 2723 When sending an INIT ACK as a response to an INIT chunk, the sender 2724 of INIT ACK creates a State Cookie and sends it in the State Cookie 2725 parameter of the INIT ACK. Inside this State Cookie, the sender 2726 should include a MAC (see [RFC2104] for an example), a timestamp on 2727 when the State Cookie is created, and the lifespan of the State 2728 Cookie, along with all the information necessary for it to establish 2729 the association. 
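One possible in-memory shape for the items just listed (MAC, creation timestamp, lifespan, and the information needed to re-create the TCB) is sketched below, together with the lifetime check that yields the Measure of Staleness of Section 3.3.10.3. Everything here is an assumption made for illustration: the State Cookie format is private to the endpoint that generates it, the structure and field names are hypothetical, the 20-byte MAC field merely leaves room for an HMAC-SHA-1 digest, and MAC generation itself is not shown.

```c
#include <stdint.h>

#define COOKIE_MAC_LEN 20   /* assumption: room for an HMAC-SHA-1 digest */

/* Hypothetical State Cookie contents; the real layout is a private
 * matter of the implementation. */
struct state_cookie {
    uint8_t  mac[COOKIE_MAC_LEN];  /* MAC over the fields below         */
    uint64_t created_usec;         /* creation time, microseconds       */
    uint64_t lifespan_usec;        /* Valid.Cookie.Life, microseconds   */
    uint32_t local_tag;            /* minimal TCB data (illustrative)   */
    uint32_t peer_tag;
};

/* Returns 0 while the cookie is still within its lifespan.  Otherwise
 * returns how long ago it expired, in microseconds, clamped to the
 * 32-bit Measure of Staleness field. */
uint32_t cookie_staleness_usec(const struct state_cookie *c,
                               uint64_t now_usec)
{
    uint64_t expiry = c->created_usec + c->lifespan_usec;
    uint64_t late;

    if (now_usec <= expiry)
        return 0;
    late = now_usec - expiry;
    return late > UINT32_MAX ? UINT32_MAX : (uint32_t)late;
}
```

In this sketch a return value of 0 means "not stale"; a cookie that has expired always yields at least 1, so the two cases cannot be confused.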
2731 The following steps SHOULD be taken to generate the State Cookie: 2733 1) Create an association TCB using information from both the 2734 received INIT and the outgoing INIT ACK chunk, 2735 2) In the TCB, set the creation time to the current time of day, and 2736 the lifespan to the protocol parameter 'Valid.Cookie.Life' (see 2737 Section 15), 2738 3) From the TCB, identify and collect the minimal subset of 2739 information needed to re-create the TCB, and generate a MAC using 2740 this subset of information and a secret key (see [RFC2104] for an 2741 example of generating a MAC), and 2742 4) Generate the State Cookie by combining this subset of information 2743 and the resultant MAC. 2745 After sending the INIT ACK with the State Cookie parameter, the 2746 sender SHOULD delete the TCB and any other local resource related to 2747 the new association, so as to prevent resource attacks. 2749 The hashing method used to generate the MAC is strictly a private 2750 matter for the receiver of the INIT chunk. The use of a MAC is 2751 mandatory to prevent denial-of-service attacks. The secret key 2752 SHOULD be random ([RFC4086] provides some information on randomness 2753 guidelines); it SHOULD be changed reasonably frequently, and the 2754 timestamp in the State Cookie MAY be used to determine which key 2755 should be used to verify the MAC. 2757 An implementation SHOULD make the cookie as small as possible to 2758 ensure interoperability. 2760 5.1.4. State Cookie Processing 2762 When an endpoint (in the COOKIE-WAIT state) receives an INIT ACK 2763 chunk with a State Cookie parameter, it MUST immediately send a 2764 COOKIE ECHO chunk to its peer with the received State Cookie. The 2765 sender MAY also add any pending DATA chunks to the packet after the 2766 COOKIE ECHO chunk. 2768 The endpoint shall also start the T1-cookie timer after sending out 2769 the COOKIE ECHO chunk. 
If the timer expires, the endpoint shall 2770 retransmit the COOKIE ECHO chunk and restart the T1-cookie timer. 2771 This is repeated until either a COOKIE ACK is received or 2772 'Max.Init.Retransmits' (see Section 15) is reached causing the peer 2773 endpoint to be marked unreachable (and thus the association enters 2774 the CLOSED state). 2776 5.1.5. State Cookie Authentication 2778 When an endpoint receives a COOKIE ECHO chunk from another endpoint 2779 with which it has no association, it shall take the following 2780 actions: 2782 1) Compute a MAC using the TCB data carried in the State Cookie and 2783 the secret key (note the timestamp in the State Cookie MAY be 2784 used to determine which secret key to use). [RFC2104] can be 2785 used as a guideline for generating the MAC, 2786 2) Authenticate the State Cookie as one that it previously generated 2787 by comparing the computed MAC against the one carried in the 2788 State Cookie. If this comparison fails, the SCTP packet, 2789 including the COOKIE ECHO and any DATA chunks, should be silently 2790 discarded, 2791 3) Compare the port numbers and the Verification Tag contained 2792 within the COOKIE ECHO chunk to the actual port numbers and the 2793 Verification Tag within the SCTP common header of the received 2794 packet. If these values do not match, the packet MUST be 2795 silently discarded. 2796 4) Compare the creation timestamp in the State Cookie to the current 2797 local time. If the elapsed time is longer than the lifespan 2798 carried in the State Cookie, then the packet, including the 2799 COOKIE ECHO and any attached DATA chunks, SHOULD be discarded, 2800 and the endpoint MUST transmit an ERROR chunk with a "Stale 2801 Cookie" error cause to the peer endpoint. 2802 5) If the State Cookie is valid, create an association to the sender 2803 of the COOKIE ECHO chunk with the information in the TCB data 2804 carried in the COOKIE ECHO and enter the ESTABLISHED state. 
2806 6) Send a COOKIE ACK chunk to the peer acknowledging receipt of the 2807 COOKIE ECHO. The COOKIE ACK MAY be bundled with an outbound DATA 2808 chunk or SACK chunk; however, the COOKIE ACK MUST be the first 2809 chunk in the SCTP packet. 2810 7) Immediately acknowledge any DATA chunk bundled with the COOKIE 2811 ECHO with a SACK (subsequent DATA chunk acknowledgement should 2812 follow the rules defined in Section 6.2). As mentioned in step 2813 6, if the SACK is bundled with the COOKIE ACK, the COOKIE ACK 2814 MUST appear first in the SCTP packet. 2816 If a COOKIE ECHO is received from an endpoint with which the receiver 2817 of the COOKIE ECHO has an existing association, the procedures in 2818 Section 5.2 should be followed. 2820 5.1.6. An Example of Normal Association Establishment 2822 In the following example, "A" initiates the association and then 2823 sends a user message to "Z", then "Z" sends two user messages to "A" 2824 later (assuming no bundling or fragmentation occurs): 2826 Endpoint A Endpoint Z 2827 {app sets association with Z} 2828 (build TCB) 2829 INIT [I-Tag=Tag_A 2830 & other info] ------\ 2831 (Start T1-init timer) \ 2832 (Enter COOKIE-WAIT state) \---> (compose temp TCB and Cookie_Z) 2833 /-- INIT ACK [Veri Tag=Tag_A, 2834 / I-Tag=Tag_Z, 2835 (Cancel T1-init timer) <------/ Cookie_Z, & other info] 2836 (destroy temp TCB) 2837 COOKIE ECHO [Cookie_Z] ------\ 2838 (Start T1-cookie timer) \ 2839 (Enter COOKIE-ECHOED state) \---> (build TCB enter ESTABLISHED 2840 state) 2841 /---- COOKIE ACK 2842 / 2843 (Cancel T1-cookie timer, <-----/ 2844 Enter ESTABLISHED state) 2845 {app sends 1st user data; strm 0} 2846 DATA [TSN=initial TSN_A 2847 Strm=0,Seq=0 & user data]--\ 2848 (Start T3-rtx timer) \ 2849 \-> 2850 /----- SACK [TSN Ack=init 2851 / TSN_A,Block=0] 2852 (Cancel T3-rtx timer) <------/ 2853 ...
2854 {app sends 2 messages;strm 0} 2855 /---- DATA 2856 / [TSN=init TSN_Z 2857 <--/ Strm=0,Seq=0 & user data 1] 2858 SACK [TSN Ack=init TSN_Z, /---- DATA 2859 Block=0] --------\ / [TSN=init TSN_Z +1, 2860 \/ Strm=0,Seq=1 & user data 2] 2861 <------/\ 2862 \ 2863 \------> 2865 Figure 4: INITIATION Example 2867 If the T1-init timer expires at "A" after the INIT is sent, or the 2868 T1-cookie timer expires after the COOKIE ECHO is sent, the same INIT or COOKIE ECHO chunk with the same 2869 Initiate Tag (i.e., Tag_A) or State Cookie shall be retransmitted and 2870 the timer restarted. This shall be repeated up to Max.Init.Retransmits 2871 times before "A" considers "Z" unreachable and reports the failure to 2872 its upper layer (and thus the association enters the CLOSED state). 2874 When retransmitting the INIT, the endpoint MUST follow the rules 2875 defined in Section 6.3 to determine the proper timer value. 2877 5.2. Handle Duplicate or Unexpected INIT, INIT ACK, COOKIE ECHO, and 2878 COOKIE ACK 2880 During the lifetime of an association (in one of the possible 2881 states), an endpoint may receive from its peer endpoint one of the 2882 setup chunks (INIT, INIT ACK, COOKIE ECHO, and COOKIE ACK). The 2883 receiver shall treat such a setup chunk as a duplicate and process it 2884 as described in this section. 2886 Note: An endpoint will not receive the chunk unless the chunk was 2887 sent to an SCTP transport address and is from an SCTP transport 2888 address associated with this endpoint. Therefore, the endpoint 2889 processes such a chunk as part of its current association.
2891 The following scenarios can cause duplicated or unexpected chunks: 2893 A) The peer has crashed without being detected, restarted itself, 2894 and sent out a new INIT chunk trying to restore the association, 2896 B) Both sides are trying to initialize the association at about the 2897 same time, 2899 C) The chunk is from a stale packet that was used to establish the 2900 present association or a past association that is no longer in 2901 existence, 2903 D) The chunk is a false packet generated by an attacker, or 2905 E) The peer never received the COOKIE ACK and is retransmitting its 2906 COOKIE ECHO. 2908 The rules in the following sections shall be applied in order to 2909 identify and correctly handle these cases. 2911 5.2.1. INIT Received in COOKIE-WAIT or COOKIE-ECHOED State (Item B) 2913 This usually indicates an initialization collision, i.e., each 2914 endpoint is attempting, at about the same time, to establish an 2915 association with the other endpoint. 2917 Upon receipt of an INIT in the COOKIE-WAIT state, an endpoint MUST 2918 respond with an INIT ACK using the same parameters it sent in its 2919 original INIT chunk (including its Initiate Tag, unchanged). When 2920 responding, the endpoint MUST send the INIT ACK back to the same 2921 address that the original INIT (sent by this endpoint) was sent. 2923 Upon receipt of an INIT in the COOKIE-ECHOED state, an endpoint MUST 2924 respond with an INIT ACK using the same parameters it sent in its 2925 original INIT chunk (including its Initiate Tag, unchanged), provided 2926 that no NEW address has been added to the forming association. If 2927 the INIT message indicates that a new address has been added to the 2928 association, then the entire INIT MUST be discarded, and NO changes 2929 should be made to the existing association. An ABORT SHOULD be sent 2930 in response that MAY include the error 'Restart of an association 2931 with new addresses'. 
The error SHOULD list the addresses that were 2932 added to the restarting association. 2934 When responding in either state (COOKIE-WAIT or COOKIE-ECHOED) with 2935 an INIT ACK, the original parameters are combined with those from the 2936 newly received INIT chunk. The endpoint shall also generate a State 2937 Cookie with the INIT ACK. The endpoint uses the parameters sent in 2938 its INIT to calculate the State Cookie. 2940 After that, the endpoint MUST NOT change its state, the T1-init timer 2941 shall be left running, and the corresponding TCB MUST NOT be 2942 destroyed. The normal procedures for handling State Cookies when a 2943 TCB exists will resolve the duplicate INITs to a single association. 2945 For an endpoint that is in the COOKIE-ECHOED state, it MUST populate 2946 its Tie-Tags within both the association TCB and inside the State 2947 Cookie (see Section 5.2.2 for a description of the Tie-Tags). 2949 5.2.2. Unexpected INIT in States Other than CLOSED, COOKIE-ECHOED, 2950 COOKIE-WAIT, and SHUTDOWN-ACK-SENT 2952 Unless otherwise stated, upon receipt of an unexpected INIT for this 2953 association, the endpoint shall generate an INIT ACK with a State 2954 Cookie. Before responding, the endpoint MUST check to see if the 2955 unexpected INIT adds new addresses to the association. If new 2956 addresses are added to the association, the endpoint MUST respond 2957 with an ABORT, copying the 'Initiate Tag' of the unexpected INIT into 2958 the 'Verification Tag' of the outbound packet carrying the ABORT. In 2959 the ABORT response, the cause of error MAY be set to 'restart of an 2960 association with new addresses'. The error SHOULD list the addresses 2961 that were added to the restarting association. If no new addresses 2962 are added, when responding to the INIT in the outbound INIT ACK, the 2963 endpoint MUST copy its current Tie-Tags to a reserved place within 2964 the State Cookie and the association's TCB. 
We shall refer to these 2965 locations inside the cookie as the Peer's-Tie-Tag and the Local-Tie- 2966 Tag. We will refer to the copy within an association's TCB as the 2967 Local Tag and Peer's Tag. The outbound SCTP packet containing this 2968 INIT ACK MUST carry a Verification Tag value equal to the Initiate 2969 Tag found in the unexpected INIT. And the INIT ACK MUST contain a 2970 new Initiate Tag (randomly generated; see Section 5.3.1). Other 2971 parameters for the endpoint SHOULD be copied from the existing 2972 parameters of the association (e.g., number of outbound streams) into 2973 the INIT ACK and cookie. 2975 After sending out the INIT ACK or ABORT, the endpoint shall take no 2976 further actions; i.e., the existing association, including its 2977 current state, and the corresponding TCB MUST NOT be changed. 2979 Note: Only when a TCB exists and the association is not in a COOKIE- 2980 WAIT or SHUTDOWN-ACK-SENT state are the Tie-Tags populated with a 2981 value other than 0. For a normal association INIT (i.e., the 2982 endpoint is in the CLOSED state), the Tie-Tags MUST be set to 0 2983 (indicating that no previous TCB existed). 2985 5.2.3. Unexpected INIT ACK 2987 If an INIT ACK is received by an endpoint in any state other than the 2988 COOKIE-WAIT state, the endpoint should discard the INIT ACK chunk. 2989 An unexpected INIT ACK usually indicates the processing of an old or 2990 duplicated INIT chunk. 2992 5.2.4. Handle a COOKIE ECHO when a TCB Exists 2994 When a COOKIE ECHO chunk is received by an endpoint in any state for 2995 an existing association (i.e., not in the CLOSED state) the following 2996 rules shall be applied: 2998 1) Compute a MAC as described in step 1 of Section 5.1.5, 3000 2) Authenticate the State Cookie as described in step 2 of 3001 Section 5.1.5 (this is case C or D above). 3003 3) Compare the timestamp in the State Cookie to the current time. 
3004 If the State Cookie is older than the lifespan carried in the 3005 State Cookie and the Verification Tags contained in the State 3006 Cookie do not match the current association's Verification Tags, 3007 the packet, including the COOKIE ECHO and any DATA chunks, should 3008 be discarded. The endpoint also MUST transmit an ERROR chunk 3009 with a "Stale Cookie" error cause to the peer endpoint (this is 3010 case C or D in Section 5.2). 3012 If both Verification Tags in the State Cookie match the 3013 Verification Tags of the current association, consider the State 3014 Cookie valid (this is case E in Section 5.2) even if the lifespan 3015 is exceeded. 3017 4) If the State Cookie proves to be valid, unpack the TCB into a 3018 temporary TCB. 3020 5) Refer to Table 2 to determine the correct action to be taken. 3022 +-----------+------------+---------------+----------------+--------+ 3023 | Local Tag | Peer's Tag | Local-Tie-Tag | Peer's-Tie-Tag | Action | 3024 +-----------+------------+---------------+----------------+--------+ 3025 | X | X | M | M | (A) | 3026 +-----------+------------+---------------+----------------+--------+ 3027 | M | X | A | A | (B) | 3028 +-----------+------------+---------------+----------------+--------+ 3029 | M | 0 | A | A | (B) | 3030 +-----------+------------+---------------+----------------+--------+ 3031 | X | M | 0 | 0 | (C) | 3032 +-----------+------------+---------------+----------------+--------+ 3033 | M | M | A | A | (D) | 3034 +-----------+------------+---------------+----------------+--------+ 3036 Table 2: Handling of a COOKIE ECHO when a TCB Exists 3038 Legend: 3040 X - Tag does not match the existing TCB. 3041 M - Tag matches the existing TCB. 3042 0 - No Tie-Tag in cookie (unknown). 3043 A - All cases, i.e., M, X, or 0. 3045 Note: For any case not shown in Table 2, the cookie should be 3046 silently discarded. 3048 Action 3050 A) In this case, the peer may have restarted. 
When the endpoint 3051 recognizes this potential 'restart', the existing session is 3052 treated the same as if it received an ABORT followed by a new 3053 COOKIE ECHO with the following exceptions: 3055 * Any SCTP DATA chunks MAY be retained (this is an 3056 implementation-specific option). 3058 * A notification of RESTART SHOULD be sent to the ULP instead of 3059 a "COMMUNICATION LOST" notification. 3061 All the congestion control parameters (e.g., cwnd, ssthresh) 3062 related to this peer MUST be reset to their initial values (see 3063 Section 6.2.1). 3065 After this, the endpoint shall enter the ESTABLISHED state. 3067 If the endpoint is in the SHUTDOWN-ACK-SENT state and recognizes 3068 that the peer has restarted (Action A), it MUST NOT set up a new 3069 association but instead resend the SHUTDOWN ACK and send an ERROR 3070 chunk with a "Cookie Received While Shutting Down" error cause to 3071 its peer. 3073 B) In this case, both sides may be attempting to start an 3074 association at about the same time, but the peer endpoint started 3075 its INIT after responding to the local endpoint's INIT. Thus, it 3076 may have picked a new Verification Tag, not being aware of the 3077 previous tag it had sent this endpoint. The endpoint should stay 3078 in or enter the ESTABLISHED state, but it MUST update its peer's 3079 Verification Tag from the State Cookie, stop any init or cookie 3080 timers that may be running, and send a COOKIE ACK. 3082 C) In this case, the local endpoint's cookie has arrived late. 3083 Before it arrived, the local endpoint sent an INIT and received 3084 an INIT ACK and finally sent a COOKIE ECHO with the peer's same 3085 tag but a new tag of its own. The cookie should be silently 3086 discarded. The endpoint SHOULD NOT change states and should 3087 leave any timers running. 3089 D) When both local and remote tags match, the endpoint should enter 3090 the ESTABLISHED state, if it is in the COOKIE-ECHOED state. 
It 3091 should stop any cookie timer that may be running and send a 3092 COOKIE ACK. 3094 Note: The "peer's Verification Tag" is the tag received in the 3095 Initiate Tag field of the INIT or INIT ACK chunk. 3097 5.2.4.1. An Example of an Association Restart 3099 In the following example, "A" initiates the association after a 3100 restart has occurred. Endpoint "Z" had no knowledge of the restart 3101 until the exchange (i.e., Heartbeats had not yet detected the failure 3102 of "A") (assuming no bundling or fragmentation occurs): 3104 Endpoint A Endpoint Z 3105 <-------------- Association is established----------------------> 3106 Tag=Tag_A Tag=Tag_Z 3107 <---------------------------------------------------------------> 3108 {A crashes and restarts} 3109 {app sets up an association with Z} 3110 (build TCB) 3111 INIT [I-Tag=Tag_A' 3112 & other info] --------\ 3113 (Start T1-init timer) \ 3114 (Enter COOKIE-WAIT state) \---> (find an existing TCB 3115 compose temp TCB and Cookie_Z 3116 with Tie-Tags to previous 3117 association) 3118 /--- INIT ACK [Veri Tag=Tag_A', 3119 / I-Tag=Tag_Z', 3120 (Cancel T1-init timer) <------/ Cookie_Z[TieTags= 3121 Tag_A,Tag_Z 3122 & other info] 3123 (destroy temp TCB,leave original 3124 in place) 3125 COOKIE ECHO [Veri=Tag_Z', 3126 Cookie_Z 3127 Tie=Tag_A, 3128 Tag_Z]----------\ 3129 (Start T1-init timer) \ 3130 (Enter COOKIE-ECHOED state) \---> (Find existing association, 3131 Tie-Tags match old tags, 3132 Tags do not match, i.e., 3133 case X X M M above, 3134 Announce Restart to ULP 3135 and reset association). 3136 /---- COOKIE ACK 3137 (Cancel T1-init timer, <------/ 3138 Enter ESTABLISHED state) 3139 {app sends 1st user data; strm 0} 3140 DATA [TSN=initial TSN_A 3141 Strm=0,Seq=0 & user data]--\ 3142 (Start T3-rtx timer) \ 3143 \-> 3144 /--- SACK [TSN Ack=init TSN_A,Block=0] 3145 (Cancel T3-rtx timer) <------/ 3147 Figure 5: A Restart Example 3149 5.2.5. Handle Duplicate COOKIE ACK
3151 At any state other than COOKIE-ECHOED, an endpoint should silently 3152 discard a received COOKIE ACK chunk. 3154 5.2.6. Handle Stale COOKIE Error 3156 Receipt of an ERROR chunk with a "Stale Cookie" error cause indicates 3157 one of a number of possible events: 3159 A) The association failed to set up completely before the State 3160 Cookie issued by the sender was processed. 3162 B) An old State Cookie was processed after setup completed. 3164 C) An old State Cookie is received from someone that the receiver is 3165 not interested in having an association with and the ABORT chunk 3166 was lost. 3168 When processing an ERROR chunk with a "Stale Cookie" error cause, an 3169 endpoint should first check whether an association is in the process of 3170 being set up, i.e., the association is in the COOKIE-ECHOED state. 3171 In all cases, if the association is not in the COOKIE-ECHOED state, 3172 the ERROR chunk should be silently discarded. 3174 If the association is in the COOKIE-ECHOED state, the endpoint may 3175 elect one of the following three alternatives. 3177 1) Send a new INIT chunk to the endpoint to generate a new State 3178 Cookie and reattempt the setup procedure. 3179 2) Discard the TCB and report to the upper layer the inability to 3180 set up the association. 3181 3) Send a new INIT chunk to the endpoint, adding a Cookie 3182 Preservative parameter requesting an extension to the lifetime 3183 of the State Cookie. When calculating the time extension, an 3184 implementation SHOULD use the RTT information measured based on 3185 the previous COOKIE ECHO / ERROR exchange, and should add no more 3186 than 1 second beyond the measured RTT, because long State Cookie 3187 lifetimes make the endpoint more susceptible to replay attacks. 3189 5.3. Other Initialization Issues 3191 5.3.1. Selection of Tag Value 3193 Initiate Tag values should be selected from the range of 1 to 2**32 - 3194 1.
It is very important that the Initiate Tag value be randomized to 3195 help protect against "man in the middle" and "sequence number" 3196 attacks. The methods described in [RFC4086] can be used for the 3197 Initiate Tag randomization. Careful selection of Initiate Tags is 3198 also necessary to prevent old duplicate packets from previous 3199 associations being mistakenly processed as belonging to the current 3200 association. 3202 Moreover, the Verification Tag value used by either endpoint in a 3203 given association MUST NOT change during the life time of an 3204 association. A new Verification Tag value MUST be used each time the 3205 endpoint tears down and then reestablishes an association to the same 3206 peer. 3208 5.4. Path Verification 3210 During association establishment, the two peers exchange a list of 3211 addresses. In the predominant case, these lists accurately represent 3212 the addresses owned by each peer. However, it is possible that a 3213 misbehaving peer may supply addresses that it does not own. To 3214 prevent this, the following rules are applied to all addresses of the 3215 new association: 3217 1) Any address passed to the sender of the INIT by its upper layer 3218 is automatically considered to be CONFIRMED. 3220 2) For the receiver of the COOKIE ECHO, the only CONFIRMED address 3221 is the one to which the INIT-ACK was sent. 3223 3) All other addresses not covered by rules 1 and 2 are considered 3224 UNCONFIRMED and are subject to probing for verification. 3226 To probe an address for verification, an endpoint will send 3227 HEARTBEATs including a 64-bit random nonce and a path indicator (to 3228 identify the address that the HEARTBEAT is sent to) within the 3229 HEARTBEAT parameter. 3231 Upon receipt of the HEARTBEAT ACK, a verification is made that the 3232 nonce included in the HEARTBEAT parameter is the one sent to the 3233 address indicated inside the HEARTBEAT parameter. 
When this match 3234 occurs, the address that the original HEARTBEAT was sent to is now 3235 considered CONFIRMED and available for normal data transfer. 3237 These probing procedures are started when an association moves to the 3238 ESTABLISHED state and are ended when all paths are confirmed. 3240 In each RTO, a probe may be sent on an active UNCONFIRMED path in an 3241 attempt to move it to the CONFIRMED state. If during this probing 3242 the path becomes inactive, this rate is lowered to the normal 3243 HEARTBEAT rate. At the expiration of the RTO timer, the error 3244 counter of any path that was probed but not CONFIRMED is incremented 3245 by one and subjected to path failure detection, as defined in 3246 Section 8.2. When probing UNCONFIRMED addresses, however, the 3247 association overall error count is NOT incremented. 3249 The number of HEARTBEATS sent at each RTO SHOULD be limited by the 3250 HB.Max.Burst parameter. It is an implementation decision as to how 3251 to distribute HEARTBEATS to the peer's addresses for path 3252 verification. 3254 Whenever a path is confirmed, an indication MAY be given to the upper 3255 layer. 3257 An endpoint MUST NOT send any chunks to an UNCONFIRMED address, with 3258 the following exceptions: 3260 o A HEARTBEAT including a nonce MAY be sent to an UNCONFIRMED 3261 address. 3263 o A HEARTBEAT ACK MAY be sent to an UNCONFIRMED address. 3265 o A COOKIE ACK MAY be sent to an UNCONFIRMED address, but it MUST be 3266 bundled with a HEARTBEAT including a nonce. An implementation 3267 that does NOT support bundling MUST NOT send a COOKIE ACK to an 3268 UNCONFIRMED address. 3270 o A COOKIE ECHO MAY be sent to an UNCONFIRMED address, but it MUST 3271 be bundled with a HEARTBEAT including a nonce, and the packet MUST 3272 NOT exceed the path MTU. 
If the implementation does NOT support 3273 bundling or if the bundled COOKIE ECHO plus HEARTBEAT (including 3274 nonce) would exceed the path MTU, then the implementation MUST NOT 3275 send a COOKIE ECHO to an UNCONFIRMED address. 3277 6. User Data Transfer 3279 Data transmission MUST only happen in the ESTABLISHED, SHUTDOWN- 3280 PENDING, and SHUTDOWN-RECEIVED states. The only exception to this is 3281 that DATA chunks are allowed to be bundled with an outbound COOKIE 3282 ECHO chunk when in the COOKIE-WAIT state. 3284 DATA chunks MUST only be received according to the rules below in 3285 ESTABLISHED, SHUTDOWN-PENDING, and SHUTDOWN-SENT. A DATA chunk 3286 received in CLOSED is out of the blue and SHOULD be handled per 3287 Section 8.4. A DATA chunk received in any other state SHOULD be 3288 discarded. 3290 A SACK MUST be processed in ESTABLISHED, SHUTDOWN-PENDING, and 3291 SHUTDOWN-RECEIVED. An incoming SACK MAY be processed in COOKIE- 3292 ECHOED. A SACK in the CLOSED state is out of the blue and SHOULD be 3293 processed according to the rules in Section 8.4. A SACK chunk 3294 received in any other state SHOULD be discarded. 3296 An SCTP receiver MUST be able to receive a minimum of 1500 bytes in 3297 one SCTP packet. This means that an SCTP endpoint MUST NOT indicate 3298 less than 1500 bytes in its initial a_rwnd sent in the INIT or INIT 3299 ACK. 3301 For transmission efficiency, SCTP defines mechanisms for bundling of 3302 small user messages and fragmentation of large user messages. The 3303 following diagram depicts the flow of user messages through SCTP. 3305 In this section, the term "data sender" refers to the endpoint that 3306 transmits a DATA chunk and the term "data receiver" refers to the 3307 endpoint that receives a DATA chunk. A data receiver will transmit 3308 SACK chunks. 
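The state rules for DATA and SACK chunks given at the start of this section can be summarized as a lookup table. The following is a minimal sketch in C, not a normative implementation: the enum names, the `disposition` values, and both function names are hypothetical and were chosen for illustration only. The table contents follow the text above, with every state not listed defaulting to "discard".

```c
#include <assert.h>

/* Association states named in the text; the identifiers are illustrative. */
enum sctp_state {
    SCTP_CLOSED,
    SCTP_COOKIE_WAIT,
    SCTP_COOKIE_ECHOED,
    SCTP_ESTABLISHED,
    SCTP_SHUTDOWN_PENDING,
    SCTP_SHUTDOWN_SENT,
    SCTP_SHUTDOWN_RECEIVED,
    SCTP_SHUTDOWN_ACK_SENT,
    SCTP_STATE_COUNT
};

/* DISCARD is 0 so that states not listed in a table default to it. */
enum disposition {
    DISCARD = 0,   /* SHOULD be discarded */
    PROCESS,       /* MUST be processed */
    MAY_PROCESS,   /* MAY be processed */
    OOTB           /* out of the blue: handle per Section 8.4 */
};

/* Received DATA chunks, per the second paragraph of Section 6. */
static const enum disposition data_table[SCTP_STATE_COUNT] = {
    [SCTP_CLOSED]           = OOTB,
    [SCTP_ESTABLISHED]      = PROCESS,
    [SCTP_SHUTDOWN_PENDING] = PROCESS,
    [SCTP_SHUTDOWN_SENT]    = PROCESS
};

/* Received SACK chunks, per the third paragraph of Section 6. */
static const enum disposition sack_table[SCTP_STATE_COUNT] = {
    [SCTP_CLOSED]            = OOTB,
    [SCTP_COOKIE_ECHOED]     = MAY_PROCESS,
    [SCTP_ESTABLISHED]       = PROCESS,
    [SCTP_SHUTDOWN_PENDING]  = PROCESS,
    [SCTP_SHUTDOWN_RECEIVED] = PROCESS
};

enum disposition data_disposition(enum sctp_state s)
{
    assert((unsigned)s < SCTP_STATE_COUNT);
    return data_table[s];
}

enum disposition sack_disposition(enum sctp_state s)
{
    assert((unsigned)s < SCTP_STATE_COUNT);
    return sack_table[s];
}
```

A receive path would consult `data_disposition()` or `sack_disposition()` before any further chunk processing; keeping the two tables separate makes the asymmetry between SHUTDOWN-SENT (DATA) and SHUTDOWN-RECEIVED (SACK) easy to see.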
3310 +--------------------------+ 3311 | User Messages | 3312 +--------------------------+ 3313 SCTP user ^ | 3314 ==================|==|======================================= 3315 | v (1) 3316 +------------------+ +--------------------+ 3317 | SCTP DATA Chunks | |SCTP Control Chunks | 3318 +------------------+ +--------------------+ 3319 ^ | ^ | 3320 | v (2) | v (2) 3321 +--------------------------+ 3322 | SCTP packets | 3323 +--------------------------+ 3324 SCTP ^ | 3325 ===========================|==|=========================== 3326 | v 3327 Connectionless Packet Transfer Service (e.g., IP) 3329 Figure 6: Illustration of User Data Transfer 3331 Notes: 3333 1) When converting user messages into DATA chunks, an endpoint will 3334 fragment user messages larger than the current association path 3335 MTU into multiple DATA chunks. The data receiver will normally 3336 reassemble the fragmented message from DATA chunks before 3337 delivery to the user (see Section 6.9 for details). 3339 2) Multiple DATA and control chunks may be bundled by the sender 3340 into a single SCTP packet for transmission, as long as the final 3341 size of the packet does not exceed the current path MTU. The 3342 receiver will unbundle the packet back into the original chunks. 3343 Control chunks MUST come before DATA chunks in the packet. 3345 The fragmentation and bundling mechanisms, as detailed in Section 6.9 3346 and Section 6.10, are OPTIONAL to implement by the data sender, but 3347 they MUST be implemented by the data receiver, i.e., an endpoint MUST 3348 properly receive and process bundled or fragmented data. 3350 6.1. Transmission of DATA Chunks 3352 This document is specified as if there is a single retransmission 3353 timer per destination transport address, but implementations MAY have 3354 a retransmission timer for each DATA chunk. 
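The two constraints in Note 2 to Figure 6 (control chunks precede DATA chunks, and the bundled packet fits in the path MTU) can be checked before a packet is emitted. The following C sketch is illustrative only: `struct chunk_desc`, `bundle_is_valid()`, and the `ip_overhead` parameter are hypothetical names, and the caller is assumed to know the size of the IP header in use. The 12-byte common header, DATA chunk type 0, and padding of each chunk to a 4-byte boundary are taken from the SCTP packet format.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SCTP_COMMON_HEADER_LEN 12u  /* source port, dest port, tag, checksum */
#define SCTP_CHUNK_TYPE_DATA   0u   /* chunk type value of a DATA chunk */

/* Hypothetical descriptor: chunk type and the value of its Length field. */
struct chunk_desc {
    uint8_t  type;
    uint16_t length;  /* unpadded chunk length in bytes */
};

/* Each chunk occupies a multiple of 4 bytes on the wire. */
static size_t padded_len(uint16_t len)
{
    return ((size_t)len + 3u) & ~(size_t)3u;
}

/*
 * Returns true if the chunks may be bundled in this order into one packet:
 * no control chunk follows a DATA chunk, and the IP header plus the SCTP
 * common header plus all padded chunks do not exceed the path MTU.
 */
bool bundle_is_valid(const struct chunk_desc *chunks, size_t n,
                     size_t path_mtu, size_t ip_overhead)
{
    assert(chunks != NULL || n == 0);

    size_t total = ip_overhead + SCTP_COMMON_HEADER_LEN;
    bool seen_data = false;

    for (size_t i = 0; i < n; i++) {
        if (chunks[i].type == SCTP_CHUNK_TYPE_DATA)
            seen_data = true;
        else if (seen_data)
            return false;  /* control chunk after a DATA chunk */
        total += padded_len(chunks[i].length);
    }
    return total <= path_mtu;
}
```

For example, a SACK (type 3) of 16 bytes followed by a DATA chunk of 17 bytes occupies 12 + 16 + 20 = 48 bytes of SCTP payload, so with a 20-byte IPv4 header it needs a path MTU of at least 68 bytes.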
3356 The following general rules MUST be applied by the data sender for 3357 transmission and/or retransmission of outbound DATA chunks: 3359 A) At any given time, the data sender MUST NOT transmit new data to 3360 any destination transport address if its peer's rwnd indicates 3361 that the peer has no buffer space (i.e., rwnd is 0; see 3362 Section 6.2.1). However, regardless of the value of rwnd 3363 (including if it is 0), the data sender can always have one DATA 3364 chunk in flight to the receiver if allowed by cwnd (see rule B, 3365 below). This rule allows the sender to probe for a change in 3366 rwnd that the sender missed due to the SACK's having been lost in 3367 transit from the data receiver to the data sender. 3369 When the receiver's advertised window is zero, this probe is 3370 called a zero window probe. Note that a zero window probe SHOULD 3371 only be sent when all outstanding DATA chunks have been 3372 cumulatively acknowledged and no DATA chunks are in flight. Zero 3373 window probing MUST be supported. 3375 If the sender continues to receive new packets from the receiver 3376 while doing zero window probing, the unacknowledged window probes 3377 should not increment the error counter for the association or any 3378 destination transport address. This is because the receiver MAY 3379 keep its window closed for an indefinite time. Refer to 3380 Section 6.2 on the receiver behavior when it advertises a zero 3381 window. The sender SHOULD send the first zero window probe after 3382 1 RTO when it detects that the receiver has closed its window and 3383 SHOULD increase the probe interval exponentially afterwards. 3384 Also note that the cwnd SHOULD be adjusted according to 3385 Section 7.2.1. Zero window probing does not affect the 3386 calculation of cwnd. 3388 The sender MUST also have an algorithm for sending new DATA 3389 chunks to avoid silly window syndrome (SWS) as described in 3390 [RFC0813]. 
The algorithm can be similar to the one described in 3391 Section 4.2.3.4 of [RFC1122]. 3400 B) At any given time, the sender MUST NOT transmit new data to a 3401 given transport address if it has cwnd or more bytes of data 3402 outstanding to that transport address. 3404 C) When the time comes for the sender to transmit, before sending 3405 new DATA chunks, the sender MUST first transmit any outstanding 3406 DATA chunks that are marked for retransmission (limited by the 3407 current cwnd). 3409 D) When the time comes for the sender to transmit new DATA chunks, 3410 the protocol parameter Max.Burst SHOULD be used to limit the 3411 number of packets sent. The limit MAY be applied by adjusting 3412 cwnd as follows: 3414 if((flightsize + Max.Burst*MTU) < cwnd) cwnd = flightsize + 3415 Max.Burst*MTU 3417 Or it MAY be applied by strictly limiting the number of packets 3418 emitted by the output routine. 3420 E) Then, the sender can send out as many new DATA chunks as rule A 3421 and rule B allow. 3423 Multiple DATA chunks committed for transmission MAY be bundled in a 3424 single packet. Furthermore, DATA chunks being retransmitted MAY be 3425 bundled with new DATA chunks, as long as the resulting packet size 3426 does not exceed the path MTU. A ULP may request that no bundling is 3427 performed, but this should only turn off any delays that an SCTP 3428 implementation may be using to increase bundling efficiency. It does 3429 not in itself stop all bundling from occurring (i.e., in case of 3430 congestion or retransmission).
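Rules A, B, and D above can be sketched in C as follows. This is a simplified illustration under stated assumptions, not the draft's algorithm in full: `struct path_tx_state`, `apply_max_burst()`, and `may_send_new_data()` are hypothetical names, SWS avoidance and the zero window probe timing are omitted, and the "one DATA chunk in flight" allowance of rule A is modeled as "nothing outstanding to the receiver on any address".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-destination sender state, all values in bytes. */
struct path_tx_state {
    uint32_t cwnd;        /* congestion window */
    uint32_t flightsize;  /* data outstanding on this transport address */
    uint32_t mtu;         /* path MTU */
};

/*
 * Rule D, cwnd variant:
 *   if ((flightsize + Max.Burst*MTU) < cwnd)
 *       cwnd = flightsize + Max.Burst*MTU
 * The sum is computed in 64 bits so that it cannot wrap.
 */
void apply_max_burst(struct path_tx_state *p, uint32_t max_burst)
{
    assert(p != NULL);

    uint64_t limit = (uint64_t)p->flightsize + (uint64_t)max_burst * p->mtu;
    if (limit < p->cwnd)
        p->cwnd = (uint32_t)limit;  /* safe: limit < cwnd <= UINT32_MAX */
}

/*
 * Rules A and B: may a new DATA chunk be sent to this address?
 * peer_rwnd is the sender's current view of the peer's receive window;
 * total_outstanding is the data in flight to the peer on all addresses
 * (an assumption of this sketch, used for the zero window probe case).
 */
bool may_send_new_data(const struct path_tx_state *p, uint32_t peer_rwnd,
                       uint32_t total_outstanding)
{
    assert(p != NULL);

    if (p->flightsize >= p->cwnd)
        return false;                   /* rule B: cwnd is full */
    if (peer_rwnd == 0)
        return total_outstanding == 0;  /* rule A: at most one probe */
    return true;
}
```

With cwnd = 20000, flightsize = 3000, MTU = 1500, and Max.Burst = 4, `apply_max_burst()` lowers cwnd to 3000 + 4*1500 = 9000, so at most four full-sized packets can follow before the next SACK arrives. Clamping cwnd rather than counting packets keeps the output routine unchanged, which is one of the two options rule D permits.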
3432 Before an endpoint transmits a DATA chunk, if any received DATA 3433 chunks have not been acknowledged (e.g., due to delayed ack), the 3434 sender should create a SACK and bundle it with the outbound DATA 3435 chunk, as long as the size of the final SCTP packet does not exceed 3436 the current MTU. See Section 6.2. 3438 IMPLEMENTATION NOTE: When the window is full (i.e., transmission is 3439 disallowed by rule A and/or rule B), the sender MAY still accept send 3440 requests from its upper layer, but MUST transmit no more DATA chunks 3441 until some or all of the outstanding DATA chunks are acknowledged and 3442 transmission is allowed by rule A and rule B again. 3444 Whenever a transmission or retransmission is made to any address, if 3445 the T3-rtx timer of that address is not currently running, the sender 3446 MUST start that timer. If the timer for that address is already 3447 running, the sender MUST restart the timer if the earliest (i.e., 3448 lowest TSN) outstanding DATA chunk sent to that address is being 3449 retransmitted. Otherwise, the data sender MUST NOT restart the 3450 timer. 3452 When starting or restarting the T3-rtx timer, the timer value must be 3453 adjusted according to the timer rules defined in Section 6.3.2 and 3454 Section 6.3.3. 3456 Note: The data sender SHOULD NOT use a TSN that is more than 2**31 - 3457 1 above the beginning TSN of the current send window. 3459 6.2. Acknowledgement on Reception of DATA Chunks 3461 The SCTP endpoint MUST always acknowledge the reception of each valid 3462 DATA chunk when the DATA chunk received is inside its receive window. 3464 When the receiver's advertised window is 0, the receiver MUST drop 3465 any new incoming DATA chunk with a TSN larger than the largest TSN 3466 received so far. If the new incoming DATA chunk holds a TSN value 3467 less than the largest TSN received so far, then the receiver SHOULD 3468 drop the largest TSN held for reordering and accept the new incoming 3469 DATA chunk. 
In either case, if such a DATA chunk is dropped, the 3470 receiver MUST immediately send back a SACK with the current receive 3471 window showing only DATA chunks received and accepted so far. The 3472 dropped DATA chunk(s) MUST NOT be included in the SACK, as they were 3473 not accepted. The receiver MUST also have an algorithm for 3474 advertising its receive window to avoid receiver silly window 3475 syndrome (SWS), as described in [RFC0813]. The algorithm can be 3476 similar to the one described in Section 4.2.3.3 of [RFC1122]. 3478 The guidelines on delayed acknowledgement algorithm specified in 3479 Section 4.2 of [RFC2581] SHOULD be followed. Specifically, an 3480 acknowledgement SHOULD be generated for at least every second packet 3481 (not every second DATA chunk) received, and SHOULD be generated 3482 within 200 ms of the arrival of any unacknowledged DATA chunk. In 3483 some situations, it may be beneficial for an SCTP transmitter to be 3484 more conservative than the algorithms detailed in this document 3485 allow. However, an SCTP transmitter MUST NOT be more aggressive than 3486 the following algorithms allow. 3488 An SCTP receiver MUST NOT generate more than one SACK for every 3489 incoming packet, other than to update the offered window as the 3490 receiving application consumes new data. 3492 IMPLEMENTATION NOTE: The maximum delay for generating an 3493 acknowledgement may be configured by the SCTP administrator, either 3494 statically or dynamically, in order to meet the specific timing 3495 requirement of the protocol being carried. 3497 An implementation MUST NOT allow the maximum delay to be configured 3498 to be more than 500 ms. In other words, an implementation MAY lower 3499 this value below 500 ms but MUST NOT raise it above 500 ms. 3501 Acknowledgements MUST be sent in SACK chunks unless shutdown was 3502 requested by the ULP, in which case an endpoint MAY send an 3503 acknowledgement in the SHUTDOWN chunk. 
A SACK chunk can acknowledge 3504 the reception of multiple DATA chunks. See Section 3.3.4 for SACK 3505 chunk format. In particular, the SCTP endpoint MUST fill in the 3506 Cumulative TSN Ack field to indicate the latest sequential TSN (of a 3507 valid DATA chunk) it has received. Any received DATA chunks with TSN 3508 greater than the value in the Cumulative TSN Ack field are reported 3509 in the Gap Ack Block fields. The SCTP endpoint MUST report as many 3510 Gap Ack Blocks as can fit in a single SACK chunk limited by the 3511 current path MTU. 3513 Note: The SHUTDOWN chunk does not contain Gap Ack Block fields. 3514 Therefore, the endpoint should use a SACK instead of the SHUTDOWN 3515 chunk to acknowledge DATA chunks received out of order. 3517 When a packet arrives with duplicate DATA chunk(s) and with no new 3518 DATA chunk(s), the endpoint MUST immediately send a SACK with no 3519 delay. If a packet arrives with duplicate DATA chunk(s) bundled with 3520 new DATA chunks, the endpoint MAY immediately send a SACK. Normally, 3521 receipt of duplicate DATA chunks will occur when the original SACK 3522 chunk was lost and the peer's RTO has expired. The duplicate TSN 3523 number(s) SHOULD be reported in the SACK as duplicate. 3525 When an endpoint receives a SACK, it MAY use the duplicate TSN 3526 information to determine if SACK loss is occurring. Further use of 3527 this data is for future study. 3529 The data receiver is responsible for maintaining its receive buffers. 3530 The data receiver SHOULD notify the data sender in a timely manner of 3531 changes in its ability to receive data. How an implementation 3532 manages its receive buffers is dependent on many factors (e.g., 3533 operating system, memory management system, amount of memory, etc.). 
3534 However, the data sender strategy defined in Section 6.2.1 is based 3535 on the assumption of receiver operation similar to the following: 3537 A) At initialization of the association, the endpoint tells the peer 3538 how much receive buffer space it has allocated to the association 3539 in the INIT or INIT ACK. The endpoint sets a_rwnd to this value. 3541 B) As DATA chunks are received and buffered, decrement a_rwnd by the 3542 number of bytes received and buffered. This is, in effect, 3543 closing rwnd at the data sender and restricting the amount of 3544 data it can transmit. 3546 C) As DATA chunks are delivered to the ULP and released from the 3547 receive buffers, increment a_rwnd by the number of bytes 3548 delivered to the upper layer. This is, in effect, opening up 3549 rwnd on the data sender and allowing it to send more data. The 3550 data receiver SHOULD NOT increment a_rwnd unless it has released 3551 bytes from its receive buffer. For example, if the receiver is 3552 holding fragmented DATA chunks in a reassembly queue, it should 3553 not increment a_rwnd. 3555 D) When sending a SACK, the data receiver SHOULD place the current 3556 value of a_rwnd into the a_rwnd field. The data receiver SHOULD 3557 take into account that the data sender will not retransmit DATA 3558 chunks that are acked via the Cumulative TSN Ack (i.e., will drop 3559 from its retransmit queue). 3561 Under certain circumstances, the data receiver may need to drop DATA 3562 chunks that it has received but hasn't released from its receive 3563 buffers (i.e., delivered to the ULP). These DATA chunks may have 3564 been acked in Gap Ack Blocks. For example, the data receiver may be 3565 holding data in its receive buffers while reassembling a fragmented 3566 user message from its peer when it runs out of receive buffer space. 3567 It may drop these DATA chunks even though it has acknowledged them in 3568 Gap Ack Blocks. 
   If a data receiver drops DATA chunks, it MUST NOT
   include them in Gap Ack Blocks in subsequent SACKs until they are
   received again via retransmission. In addition, the endpoint should
   take into account the dropped data when calculating its a_rwnd.

   An endpoint SHOULD NOT revoke a SACK and discard data. Only in
   extreme circumstances should an endpoint use this procedure (such as
   out of buffer space). The data receiver should take into account
   that dropping data that has been acked in Gap Ack Blocks can result
   in suboptimal retransmission strategies in the data sender and thus
   in suboptimal performance.

   The following example illustrates the use of delayed
   acknowledgements:

   Endpoint A                                      Endpoint Z

   {App sends 3 messages; strm 0}
   DATA [TSN=7,Strm=0,Seq=3] ------------> (ack delayed)
   (Start T3-rtx timer)

   DATA [TSN=8,Strm=0,Seq=4] ------------> (send ack)
                                 /------- SACK [TSN Ack=8,block=0]
   (cancel T3-rtx timer)  <-----/

   DATA [TSN=9,Strm=0,Seq=5] ------------> (ack delayed)
   (Start T3-rtx timer)
                                          ...
                                          {App sends 1 message; strm 1}
                                          (bundle SACK with DATA)
                                   /----- SACK [TSN Ack=9,block=0] \
                                  /         DATA [TSN=6,Strm=1,Seq=2]
   (cancel T3-rtx timer)  <------/        (Start T3-rtx timer)

   (ack delayed)
   (send ack)
   SACK [TSN Ack=6,block=0] -------------> (cancel T3-rtx timer)

                Figure 7: Delayed Acknowledgement Example

   If an endpoint receives a DATA chunk with no user data (i.e., the
   Length field is set to 16), it MUST send an ABORT with error cause
   set to "No User Data".

   An endpoint SHOULD NOT send a DATA chunk with no user data part.

6.2.1. Processing a Received SACK

   Each SACK an endpoint receives contains an a_rwnd value. This value
   represents the amount of buffer space the data receiver, at the time
   of transmitting the SACK, has left of its total receive buffer space
   (as specified in the INIT/INIT ACK).
   Using a_rwnd, Cumulative TSN
   Ack, and Gap Ack Blocks, the data sender can develop a representation
   of the peer's receive buffer space.

   One of the problems the data sender must take into account when
   processing a SACK is that a SACK can be received out of order. That
   is, a SACK sent by the data receiver can pass an earlier SACK and be
   received first by the data sender. If a SACK is received out of
   order, the data sender can develop an incorrect view of the peer's
   receive buffer space.

   Since there is no explicit identifier that can be used to detect
   out-of-order SACKs, the data sender must use heuristics to determine
   if a SACK is new.

   An endpoint SHOULD use the following rules to calculate the rwnd,
   using the a_rwnd value, the Cumulative TSN Ack, and Gap Ack Blocks in
   a received SACK.

   A) At the establishment of the association, the endpoint initializes
      the rwnd to the Advertised Receiver Window Credit (a_rwnd) the
      peer specified in the INIT or INIT ACK.

   B) Any time a DATA chunk is transmitted (or retransmitted) to a
      peer, the endpoint subtracts the data size of the chunk from the
      rwnd of that peer.

   C) Any time a DATA chunk is marked for retransmission, either via
      T3-rtx timer expiration (Section 6.3.3) or via Fast Retransmit
      (Section 7.2.4), add the data size of those chunks to the rwnd.

      Note: If the implementation is maintaining a timer on each DATA
      chunk, then only DATA chunks whose timer expired would be marked
      for retransmission.

   D) Any time a SACK arrives, the endpoint performs the following:

      i)   If Cumulative TSN Ack is less than the Cumulative TSN Ack
           Point, then drop the SACK. Since Cumulative TSN Ack is
           monotonically increasing, a SACK whose Cumulative TSN Ack
           is less than the Cumulative TSN Ack Point indicates an
           out-of-order SACK.

      ii)  Set rwnd equal to the newly received a_rwnd minus the
           number of bytes still outstanding after processing the
           Cumulative TSN Ack and the Gap Ack Blocks.

      iii) If the SACK is missing a TSN that was previously
           acknowledged via a Gap Ack Block (e.g., the data receiver
           reneged on the data), then treat the corresponding DATA
           chunk as possibly missing: count one miss indication
           towards Fast Retransmit as described in Section 7.2.4, and
           if no retransmit timer is running for the destination
           address to which the DATA chunk was originally transmitted,
           then T3-rtx is started for that destination address.

      iv)  If the Cumulative TSN Ack matches or exceeds the Fast
           Recovery exitpoint (Section 7.2.4), Fast Recovery is
           exited.

6.3. Management of Retransmission Timer

   An SCTP endpoint uses a retransmission timer T3-rtx to ensure data
   delivery in the absence of any feedback from its peer. The duration
   of this timer is referred to as RTO (retransmission timeout).

   When an endpoint's peer is multi-homed, the endpoint will calculate a
   separate RTO for each different destination transport address of its
   peer endpoint.

   The computation and management of RTO in SCTP follow closely how TCP
   manages its retransmission timer. To compute the current RTO, an
   endpoint maintains two state variables per destination transport
   address: SRTT (smoothed round-trip time) and RTTVAR (round-trip time
   variation).

6.3.1. RTO Calculation

   The rules governing the computation of SRTT, RTTVAR, and RTO are as
   follows:

   C1) Until an RTT measurement has been made for a packet sent to the
       given destination transport address, set RTO to the protocol
       parameter 'RTO.Initial'.

   C2) When the first RTT measurement R is made, set

       SRTT <- R,

       RTTVAR <- R/2, and

       RTO <- SRTT + 4 * RTTVAR.

   C3) When a new RTT measurement R' is made, set

       RTTVAR <- (1 - RTO.Beta) * RTTVAR + RTO.Beta * |SRTT - R'|

       and

       SRTT <- (1 - RTO.Alpha) * SRTT + RTO.Alpha * R'

       Note: The value of SRTT used in the update to RTTVAR is its
       value before updating SRTT itself using the second assignment.

       After the computation, update RTO <- SRTT + 4 * RTTVAR.

   C4) When data is in flight and when allowed by rule C5 below, a new
       RTT measurement MUST be made each round trip. Furthermore, new
       RTT measurements SHOULD be made no more than once per round trip
       for a given destination transport address. There are two
       reasons for this recommendation: First, it appears that
       measuring more frequently often does not in practice yield any
       significant benefit [ALLMAN99]; second, if measurements are made
       more often, then the values of RTO.Alpha and RTO.Beta in rule C3
       above should be adjusted so that SRTT and RTTVAR still adjust to
       changes at roughly the same rate (in terms of how many round
       trips it takes them to reflect new values) as they would if
       making only one measurement per round trip and using RTO.Alpha
       and RTO.Beta as given in rule C3. However, the exact nature of
       these adjustments remains a research issue.

   C5) Karn's algorithm: RTT measurements MUST NOT be made using
       packets that were retransmitted (and thus for which it is
       ambiguous whether the reply was for the first instance of the
       chunk or for a later instance).

       IMPLEMENTATION NOTE: RTT measurements should only be made using
       a chunk with TSN r if no chunk with TSN less than or equal to r
       has been retransmitted since r was first sent.

   C6) Whenever RTO is computed, if it is less than RTO.Min seconds
       then it is rounded up to RTO.Min seconds. The reason for this
       rule is that RTOs that do not have a high minimum value are
       susceptible to unnecessary timeouts [ALLMAN99].

   C7) A maximum value may be placed on RTO provided it is at least
       RTO.Max seconds.

   There is no requirement for the clock granularity G used for
   computing RTT measurements and the different state variables, other
   than:

   G1) Whenever RTTVAR is computed, if RTTVAR = 0, then adjust RTTVAR
       <- G.

   Experience [ALLMAN99] has shown that finer clock granularities (<=
   100 msec) perform somewhat better than coarser granularities.

6.3.2. Retransmission Timer Rules

   The rules for managing the retransmission timer are as follows:

   R1) Every time a DATA chunk is sent to any address (including a
       retransmission), if the T3-rtx timer of that address is not
       running, start it running so that it will expire after the RTO
       of that address. The RTO used here is that obtained after any
       doubling due to previous T3-rtx timer expirations on the
       corresponding destination address as discussed in rule E2 below.

   R2) Whenever all outstanding data sent to an address have been
       acknowledged, turn off the T3-rtx timer of that address.

   R3) Whenever a SACK is received that acknowledges the DATA chunk
       with the earliest outstanding TSN for that address, restart the
       T3-rtx timer for that address with its current RTO (if there is
       still outstanding data on that address).

   R4) Whenever a SACK is received missing a TSN that was previously
       acknowledged via a Gap Ack Block, start the T3-rtx for the
       destination address to which the DATA chunk was originally
       transmitted if it is not already running.

   The following example shows the use of various timer rules (assuming
   that the receiver uses delayed acks).

   Endpoint A                                         Endpoint Z
   {App begins to send}
   DATA [TSN=7,Strm=0,Seq=3] ------------> (ack delayed)
   (Start T3-rtx timer)
                                           {App sends 1 message; strm 1}
                                           (bundle ack with data)
   DATA [TSN=8,Strm=0,Seq=4] ----\     /-- SACK [TSN Ack=7,Block=0]
                                  \   /      DATA [TSN=6,Strm=1,Seq=2]
                                   \ /     (Start T3-rtx timer)
                                    \
                                   / \
   (Restart T3-rtx timer)  <------/   \--> (ack delayed)
   (ack delayed)
   {send ack}
   SACK [TSN Ack=6,Block=0] --------------> (Cancel T3-rtx timer)
                                           ..
                                           (send ack)
   (Cancel T3-rtx timer)  <-------------- SACK [TSN Ack=8,Block=0]

                      Figure 8: Timer Rule Examples

6.3.3. Handle T3-rtx Expiration

   Whenever the retransmission timer T3-rtx expires for a destination
   address, do the following:

   E1) For the destination address for which the timer expires, adjust
       its ssthresh with rules defined in Section 7.2.3 and set the
       cwnd <- MTU.

   E2) For the destination address for which the timer expires, set RTO
       <- RTO * 2 ("back off the timer"). The maximum value discussed
       in rule C7 above (RTO.Max) may be used to provide an upper bound
       to this doubling operation.

   E3) Determine how many of the earliest (i.e., lowest TSN)
       outstanding DATA chunks for the address for which the T3-rtx has
       expired will fit into a single packet, subject to the MTU
       constraint for the path corresponding to the destination
       transport address to which the retransmission is being sent
       (this may be different from the address for which the timer
       expires; see Section 6.4). Call this value K. Bundle and
       retransmit those K DATA chunks in a single packet to the
       destination endpoint.

   E4) Start the retransmission timer T3-rtx on the destination address
       to which the retransmission is sent, if rule R1 above indicates
       to do so.
       The RTO to be used for starting T3-rtx should be the
       one for the destination address to which the retransmission is
       sent, which, when the receiver is multi-homed, may be different
       from the destination address for which the timer expired (see
       Section 6.4 below).

   After retransmitting, once a new RTT measurement is obtained (which
   can happen only when new data has been sent and acknowledged, per
   rule C5, or for a measurement made from a HEARTBEAT; see
   Section 8.3), the computation in rule C3 is performed, including the
   computation of RTO, which may result in "collapsing" RTO back down
   after it has been subject to doubling (rule E2).

   Note: Any DATA chunks that were sent to the address for which the
   T3-rtx timer expired but did not fit in one MTU (rule E3 above)
   should be marked for retransmission and sent as soon as cwnd allows
   (normally, when a SACK arrives).

   The final rule for managing the retransmission timer concerns
   failover (see Section 6.4.1):

   F1) Whenever an endpoint switches from the current destination
       transport address to a different one, the current retransmission
       timers are left running. As soon as the endpoint transmits a
       packet containing DATA chunk(s) to the new transport address,
       start the timer on that transport address, using the RTO value
       of the destination address to which the data is being sent, if
       rule R1 indicates to do so.

6.4. Multi-Homed SCTP Endpoints

   An SCTP endpoint is considered multi-homed if there is more than one
   transport address that can be used as a destination address to reach
   that endpoint.

   Moreover, the ULP of an endpoint shall select one of the multiple
   destination addresses of a multi-homed peer endpoint as the primary
   path (see Section 5.1.2 and Section 10.1 for details).
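
   Because SRTT, RTTVAR, and RTO are kept per destination transport
   address (Section 6.3), an endpoint talking to a multi-homed peer
   holds one timer record for each of the peer's addresses. The
   following is a minimal sketch of such a record and of rules C1-C3,
   C6, C7, G1, and E2, not a definitive implementation. The structure,
   the function names, the clock granularity, and the numeric values
   given to RTO.Initial, RTO.Min, RTO.Max, RTO.Alpha, and RTO.Beta are
   assumptions chosen for illustration only.

```c
/* Illustrative values only (assumptions, not taken from this section). */
#define RTO_INITIAL 3.0    /* seconds, stands in for RTO.Initial */
#define RTO_MIN     1.0    /* seconds, stands in for RTO.Min     */
#define RTO_MAX     60.0   /* seconds, stands in for RTO.Max     */
#define RTO_ALPHA   0.125  /* stands in for RTO.Alpha            */
#define RTO_BETA    0.25   /* stands in for RTO.Beta             */
#define CLOCK_G     0.001  /* seconds, assumed clock granularity */

/* Hypothetical per-destination-address timer state. */
struct rto_state {
    double srtt;     /* smoothed round-trip time        */
    double rttvar;   /* round-trip time variation       */
    double rto;      /* current retransmission timeout  */
    int    has_rtt;  /* 0 until the first measurement   */
};

/* Apply the lower bound of rule C6 and the upper bound of rule C7. */
static double rto_clamp(double rto)
{
    if (rto < RTO_MIN) rto = RTO_MIN;
    if (rto > RTO_MAX) rto = RTO_MAX;
    return rto;
}

/* Rule C1: no measurement yet, so RTO starts at RTO.Initial. */
void rto_init(struct rto_state *s)
{
    s->srtt = 0.0;
    s->rttvar = 0.0;
    s->rto = RTO_INITIAL;
    s->has_rtt = 0;
}

/* Rules C2 and C3: fold one RTT measurement r (seconds) into the state.
 * The caller is assumed to have already honored Karn's algorithm (C5). */
void rto_on_measurement(struct rto_state *s, double r)
{
    if (!s->has_rtt) {
        s->srtt = r;                 /* C2 */
        s->rttvar = r / 2.0;
        s->has_rtt = 1;
    } else {
        double diff = s->srtt - r;   /* uses SRTT before it is updated */
        if (diff < 0.0) diff = -diff;
        s->rttvar = (1.0 - RTO_BETA) * s->rttvar + RTO_BETA * diff;
        s->srtt = (1.0 - RTO_ALPHA) * s->srtt + RTO_ALPHA * r;
    }
    if (s->rttvar == 0.0) s->rttvar = CLOCK_G;   /* G1 */
    s->rto = rto_clamp(s->srtt + 4.0 * s->rttvar);
}

/* Rule E2: back off the timer when T3-rtx expires for this address. */
void rto_on_t3_expiry(struct rto_state *s)
{
    s->rto = rto_clamp(s->rto * 2.0);
}
```

   In this sketch an endpoint would keep one struct rto_state per peer
   address and, per rule E4, arm T3-rtx with the record belonging to
   the address the retransmission is actually sent to.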

   By default, an endpoint SHOULD always transmit to the primary path,
   unless the SCTP user explicitly specifies the destination transport
   address (and possibly source transport address) to use.

   An endpoint SHOULD transmit reply chunks (e.g., SACK, HEARTBEAT ACK,
   etc.) to the same destination transport address from which it
   received the DATA or control chunk to which it is replying. This
   rule should also be followed if the endpoint is bundling DATA chunks
   together with the reply chunk.

   However, when acknowledging multiple DATA chunks received in packets
   from different source addresses in a single SACK, the SACK chunk may
   be transmitted to one of the destination transport addresses from
   which the DATA or control chunks being acknowledged were received.

   When a receiver of a duplicate DATA chunk sends a SACK to a
   multi-homed endpoint, it MAY be beneficial to vary the destination
   address and not use the source address of the DATA chunk. The reason
   is that receiving a duplicate from a multi-homed endpoint might
   indicate that the return path (as specified in the source address of
   the DATA chunk) for the SACK is broken.

   Furthermore, when its peer is multi-homed, an endpoint SHOULD try to
   retransmit a chunk that timed out to an active destination transport
   address that is different from the last destination address to which
   the DATA chunk was sent.

   Retransmissions do not affect the total outstanding data count.
   However, if the DATA chunk is retransmitted onto a different
   destination address, both the outstanding data counts on the new
   destination address and the old destination address to which the DATA
   chunk was last sent shall be adjusted accordingly.

6.4.1. Failover from an Inactive Destination Address

   Some of the transport addresses of a multi-homed SCTP endpoint may
   become inactive due to either the occurrence of certain error
   conditions (see Section 8.2) or adjustments from the SCTP user.

   When there is outbound data to send and the primary path becomes
   inactive (e.g., due to failures), or where the SCTP user explicitly
   requests to send data to an inactive destination transport address,
   before reporting an error to its ULP, the SCTP endpoint should try to
   send the data to an alternate active destination transport address if
   one exists.

   When retransmitting data that timed out, if the endpoint is
   multi-homed, it should consider each source-destination address pair
   in its retransmission selection policy. When retransmitting
   timed-out data, the endpoint should attempt to pick the most
   divergent source-destination pair from the original
   source-destination pair to which the packet was transmitted.

   Note: Rules for picking the most divergent source-destination pair
   are an implementation decision and are not specified within this
   document.

6.5. Stream Identifier and Stream Sequence Number

   Every DATA chunk MUST carry a valid stream identifier. If an
   endpoint receives a DATA chunk with an invalid stream identifier, it
   shall acknowledge the reception of the DATA chunk following the
   normal procedure, immediately send an ERROR chunk with cause set to
   "Invalid Stream Identifier" (see Section 3.3.10), and discard the
   DATA chunk. The endpoint may bundle the ERROR chunk in the same
   packet as the SACK as long as the ERROR follows the SACK.

   The Stream Sequence Number in all the streams MUST start from 0 when
   the association is established. Also, when the Stream Sequence
   Number reaches the value 65535 the next Stream Sequence Number MUST
   be set to 0.

6.6. Ordered and Unordered Delivery

   Within a stream, an endpoint MUST deliver DATA chunks received with
   the U flag set to 0 to the upper layer according to the order of
   their Stream Sequence Number. If DATA chunks arrive out of order of
   their Stream Sequence Number, the endpoint MUST hold the received
   DATA chunks from delivery to the ULP until they are reordered.

   However, an SCTP endpoint can indicate that no ordered delivery is
   required for a particular DATA chunk transmitted within the stream by
   setting the U flag of the DATA chunk to 1.

   When an endpoint receives a DATA chunk with the U flag set to 1, it
   must bypass the ordering mechanism and immediately deliver the data
   to the upper layer (after reassembly if the user data is fragmented
   by the data sender).

   This provides an effective way of transmitting "out-of-band" data in
   a given stream. Also, a stream can be used as an "unordered" stream
   by simply setting the U flag to 1 in all DATA chunks sent through
   that stream.

   IMPLEMENTATION NOTE: When sending an unordered DATA chunk, an
   implementation may choose to place the DATA chunk in an outbound
   packet that is at the head of the outbound transmission queue if
   possible.

   The 'Stream Sequence Number' field in a DATA chunk with U flag set to
   1 has no significance. The sender can fill it with an arbitrary
   value, but the receiver MUST ignore the field.

   Note: When transmitting ordered and unordered data, an endpoint does
   not increment its Stream Sequence Number when transmitting a DATA
   chunk with U flag set to 1.

6.7. Report Gaps in Received DATA TSNs

   Upon the reception of a new DATA chunk, an endpoint shall examine the
   continuity of the TSNs received. If the endpoint detects a gap in
   the received DATA chunk sequence, it SHOULD send a SACK with Gap Ack
   Blocks immediately.
   The data receiver continues sending a SACK after
   receipt of each SCTP packet that doesn't fill the gap.

   Based on the Gap Ack Block from the received SACK, the endpoint can
   calculate the missing DATA chunks and make decisions on whether to
   retransmit them (see Section 6.2.1 for details).

   Multiple gaps can be reported in one single SACK (see Section 3.3.4).

   When its peer is multi-homed, the SCTP endpoint SHOULD always try to
   send the SACK to the same destination address from which the last
   DATA chunk was received.

   Upon the reception of a SACK, the endpoint MUST remove all DATA
   chunks that have been acknowledged by the SACK's Cumulative TSN Ack
   from its transmit queue. The endpoint MUST also treat all the DATA
   chunks with TSNs not included in the Gap Ack Blocks reported by the
   SACK as "missing". The number of "missing" reports for each
   outstanding DATA chunk MUST be recorded by the data sender in order
   to make retransmission decisions. See Section 7.2.4 for details.

   The following example shows the use of SACK to report a gap.

       Endpoint A                                    Endpoint Z
       {App sends 3 messages; strm 0}
       DATA [TSN=6,Strm=0,Seq=2] ---------------> (ack delayed)
       (Start T3-rtx timer)

       DATA [TSN=7,Strm=0,Seq=3] --------> X (lost)

       DATA [TSN=8,Strm=0,Seq=4] ---------------> (gap detected,
                                                   immediately send ack)
                                       /----- SACK [TSN Ack=6,Block=1,
                                      /             Start=2,End=2]
                               <-----/
       (remove 6 from out-queue,
        and mark 7 as "1" missing report)

                   Figure 9: Reporting a Gap using SACK

   The maximum number of Gap Ack Blocks that can be reported within a
   single SACK chunk is limited by the current path MTU.
   When a single
   SACK cannot cover all the Gap Ack Blocks that need to be reported due
   to the MTU limitation, the endpoint MUST send only one SACK,
   reporting the Gap Ack Blocks from the lowest to highest TSNs, within
   the size limit set by the MTU, and leave the remaining highest TSN
   numbers unacknowledged.

6.8. CRC32c Checksum Calculation

   When sending an SCTP packet, the endpoint MUST strengthen the data
   integrity of the transmission by including the CRC32c checksum value
   calculated on the packet, as described below.

   After the packet is constructed (containing the SCTP common header
   and one or more control or DATA chunks), the transmitter MUST

   1) fill in the proper Verification Tag in the SCTP common header and
      initialize the checksum field to '0's;

   2) calculate the CRC32c checksum of the whole packet, including the
      SCTP common header and all the chunks (refer to Appendix B for
      details of the CRC32c algorithm); and

   3) put the resultant value into the checksum field in the common
      header, and leave the rest of the bits unchanged.

   When an SCTP packet is received, the receiver MUST first check the
   CRC32c checksum as follows:

   1) Store the received CRC32c checksum value aside.

   2) Replace the 32 bits of the checksum field in the received SCTP
      packet with all '0's and calculate a CRC32c checksum value of the
      whole received packet.

   3) Verify that the calculated CRC32c checksum is the same as the
      received CRC32c checksum. If it is not, the receiver MUST treat
      the packet as an invalid SCTP packet.

   The default procedure for handling invalid SCTP packets is to
   silently discard them.

   Any hardware implementation SHOULD be done in a way that is
   verifiable by the software.

6.9. Fragmentation and Reassembly

   An endpoint MAY support fragmentation when sending DATA chunks, but
   it MUST support reassembly when receiving DATA chunks. If an
   endpoint supports fragmentation, it MUST fragment a user message if
   the size of the user message to be sent causes the outbound SCTP
   packet size to exceed the current MTU. If an implementation does not
   support fragmentation of outbound user messages, the endpoint MUST
   return an error to its upper layer and not attempt to send the user
   message.

   Note: If an implementation that supports fragmentation makes
   available to its upper layer a mechanism to turn off fragmentation,
   it may do so. However, in so doing, it MUST react just like an
   implementation that does NOT support fragmentation, i.e., it MUST
   reject sends that exceed the current Path MTU (P-MTU).

   IMPLEMENTATION NOTE: In this error case, the Send primitive discussed
   in Section 10.1 would need to return an error to the upper layer.

   If its peer is multi-homed, the endpoint shall choose a size no
   larger than the association Path MTU. The association Path MTU is
   the smallest Path MTU of all destination addresses.

   Note: Once a message is fragmented, it cannot be re-fragmented.
   Instead, if the PMTU has been reduced, then IP fragmentation must be
   used. Please see Section 7.3 for details of PMTU discovery.

   When determining when to fragment, the SCTP implementation MUST take
   into account the SCTP packet header as well as the DATA chunk
   header(s). The implementation MUST also take into account the space
   required for a SACK chunk if bundling a SACK chunk with the DATA
   chunk.

   Fragmentation takes the following steps:

   1) The data sender MUST break the user message into a series of DATA
      chunks such that each chunk plus SCTP overhead fits into an IP
      datagram smaller than or equal to the association Path MTU.

   2) The transmitter MUST then assign, in sequence, a separate TSN to
      each of the DATA chunks in the series. The transmitter assigns
      the same SSN to each of the DATA chunks. If the user indicates
      that the user message is to be delivered using unordered
      delivery, then the U flag of each DATA chunk of the user message
      MUST be set to 1.

   3) The transmitter MUST also set the B/E bits of the first DATA
      chunk in the series to '10', the B/E bits of the last DATA chunk
      in the series to '01', and the B/E bits of all other DATA chunks
      in the series to '00'.

   An endpoint MUST recognize fragmented DATA chunks by examining the
   B/E bits in each of the received DATA chunks, and queue the
   fragmented DATA chunks for reassembly. Once the user message is
   reassembled, SCTP shall pass the reassembled user message to the
   specific stream for possible reordering and final dispatching.

   Note: If the data receiver runs out of buffer space while still
   waiting for more fragments to complete the reassembly of the message,
   it should dispatch part of its inbound message through a partial
   delivery API (see Section 10), freeing some of its receive buffer
   space so that the rest of the message may be received.

6.10. Bundling

   An endpoint bundles chunks by simply including multiple chunks in one
   outbound SCTP packet. The total size of the resultant IP datagram,
   including the SCTP packet and IP headers, MUST be less than or equal
   to the current Path MTU.

   If its peer endpoint is multi-homed, the sending endpoint shall
   choose a size no larger than the latest MTU of the current primary
   path.

   When bundling control chunks with DATA chunks, an endpoint MUST place
   control chunks first in the outbound SCTP packet. The transmitter
   MUST transmit DATA chunks within an SCTP packet in increasing order
   of TSN.
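
   The size constraint on a bundled packet can be sketched as follows.
   This is a minimal illustration under stated assumptions, not a
   definitive implementation: the function and parameter names are
   hypothetical, the IP header length is supplied by the caller (for
   example, 20 bytes for IPv4 without options), the caller is assumed
   to have already ordered the chunks (control chunks first, then DATA
   chunks in increasing TSN order), and every chunk, including the last
   one, is conservatively counted with its padding to a 4-byte
   boundary.

```c
#include <stddef.h>

/* SCTP common header: two 16-bit ports, the 32-bit Verification Tag,
 * and the 32-bit checksum. */
#define SCTP_COMMON_HEADER_LEN 12u

/* Round a chunk length up to the next multiple of 4 bytes. */
static size_t padded_len(size_t chunk_len)
{
    return (chunk_len + 3u) & ~(size_t)3u;
}

/*
 * Return how many of the leading chunks in chunk_len[0..n) can be
 * bundled into one packet without the IP datagram exceeding pmtu.
 * chunk_len[i] is the value of the Chunk Length field of chunk i.
 */
size_t chunks_that_fit(const size_t *chunk_len, size_t n,
                       size_t pmtu, size_t ip_header_len)
{
    size_t used = ip_header_len + SCTP_COMMON_HEADER_LEN;
    size_t k = 0;

    if (used > pmtu)
        return 0;                       /* not even the headers fit */

    while (k < n && padded_len(chunk_len[k]) <= pmtu - used) {
        used += padded_len(chunk_len[k]);
        k++;
    }
    return k;
}
```

   With a 1500-byte Path MTU and a 20-byte IP header, for instance, two
   516-byte chunks fit into one packet but a third does not, because
   32 + 3 * 516 exceeds 1500.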

   Note: Since control chunks must be placed first in a packet and since
   DATA chunks must be transmitted before SHUTDOWN or SHUTDOWN ACK
   chunks, DATA chunks cannot be bundled with SHUTDOWN or SHUTDOWN ACK
   chunks.

   Partial chunks MUST NOT be placed in an SCTP packet. A partial chunk
   is a chunk that is not completely contained in the SCTP packet; i.e.,
   the SCTP packet is too short to contain all the bytes of the chunk as
   indicated by the chunk length.

   An endpoint MUST process received chunks in their order in the
   packet. The receiver uses the Chunk Length field to determine the
   end of a chunk and beginning of the next chunk taking account of the
   fact that all chunks end on a 4-byte boundary. If the receiver
   detects a partial chunk, it MUST drop the chunk.

   An endpoint MUST NOT bundle INIT, INIT ACK, or SHUTDOWN COMPLETE with
   any other chunks.

7. Congestion Control

   Congestion control is one of the basic functions in SCTP. For some
   applications, it may be likely that adequate resources will be
   allocated to SCTP traffic to ensure prompt delivery of time-critical
   data -- thus, it would appear to be unlikely, during normal
   operations, that transmissions encounter severe congestion
   conditions. However, SCTP must operate under adverse operational
   conditions, which can develop upon partial network failures or
   unexpected traffic surges. In such situations, SCTP must follow
   correct congestion control steps to recover from congestion quickly
   in order to get data delivered as soon as possible. In the absence
   of network congestion, these preventive congestion control algorithms
   should show no impact on the protocol performance.

   IMPLEMENTATION NOTE: As long as its specific performance requirements
   are met, an implementation is always allowed to adopt a more
   conservative congestion control algorithm than the one defined below.

   The congestion control algorithms used by SCTP are based on
   [RFC2581]. This section describes how the algorithms defined in
   [RFC2581] are adapted for use in SCTP. We first list differences in
   protocol designs between TCP and SCTP, and then describe SCTP's
   congestion control scheme. The description will use the same
   terminology as in TCP congestion control whenever appropriate.

   SCTP congestion control is always applied to the entire association,
   and not to individual streams.

7.1. SCTP Differences from TCP Congestion Control

   Gap Ack Blocks in the SCTP SACK carry the same semantic meaning as
   the TCP SACK. TCP considers the information carried in the SACK as
   advisory information only. SCTP considers the information carried in
   the Gap Ack Blocks in the SACK chunk as advisory. In SCTP, any DATA
   chunk that has been acknowledged by SACK, including DATA that arrived
   at the receiving end out of order, is not considered fully delivered
   until the Cumulative TSN Ack Point passes the TSN of the DATA chunk
   (i.e., the DATA chunk has been acknowledged by the Cumulative TSN Ack
   field in the SACK). Consequently, the value of cwnd controls the
   amount of outstanding data, rather than (as in the case of non-SACK
   TCP) the upper bound between the highest acknowledged sequence number
   and the latest DATA chunk that can be sent within the congestion
   window. SCTP SACK leads to different implementations of Fast
   Retransmit and Fast Recovery than non-SACK TCP. As an example, see
   [FALL96].

   The biggest difference between SCTP and TCP, however, is
   multi-homing. SCTP is designed to establish robust communication
   associations between two endpoints each of which may be reachable by
   more than one transport address.
   Potentially different addresses may
   lead to different data paths between the two endpoints; thus, ideally
   one may need a separate set of congestion control parameters for each
   of the paths. The treatment here of congestion control for
   multi-homed receivers is new with SCTP and may require refinement in
   the future. The current algorithms make the following assumptions:

   o  The sender usually uses the same destination address until being
      instructed by the upper layer to do otherwise; however, SCTP may
      change to an alternate destination in the event an address is
      marked inactive (see Section 8.2). Also, SCTP may retransmit to a
      different transport address than the original transmission.

   o  The sender keeps a separate congestion control parameter set for
      each of the destination addresses it can send to (not each
      source-destination pair but for each destination). The parameters
      should decay if the address is not used for a long enough time
      period.

   o  For each of the destination addresses, an endpoint does slow start
      upon the first transmission to that address.

   Note: TCP guarantees in-sequence delivery of data to its upper-layer
   protocol within a single TCP session. This means that when TCP
   notices a gap in the received sequence number, it waits until the gap
   is filled before delivering the data that was received with sequence
   numbers higher than that of the missing data. On the other hand,
   SCTP can deliver data to its upper-layer protocol even if there is a
   gap in TSN if the Stream Sequence Numbers are in sequence for a
   particular stream (i.e., the missing DATA chunks are for a different
   stream) or if unordered delivery is indicated. Although this does
   not affect cwnd, it might affect rwnd calculation.

7.2. SCTP Slow-Start and Congestion Avoidance

   The slow-start and congestion avoidance algorithms MUST be used by an
   endpoint to control the amount of data being injected into the
   network. The congestion control in SCTP is employed in regard to the
   association, not to an individual stream. In some situations, it may
   be beneficial for an SCTP sender to be more conservative than the
   algorithms allow; however, an SCTP sender MUST NOT be more aggressive
   than the following algorithms allow.

   Like TCP, an SCTP endpoint uses the following three control variables
   to regulate its transmission rate.

   o  Receiver advertised window size (rwnd, in bytes), which is set by
      the receiver based on its available buffer space for incoming
      packets.
      Note: This variable is kept on the entire association.

   o  Congestion control window (cwnd, in bytes), which is adjusted by
      the sender based on observed network conditions.
      Note: This variable is maintained on a per-destination-address
      basis.

   o  Slow-start threshold (ssthresh, in bytes), which is used by the
      sender to distinguish slow-start and congestion avoidance phases.
      Note: This variable is maintained on a per-destination-address
      basis.

   SCTP also requires one additional control variable,
   partial_bytes_acked, which is used during congestion avoidance phase
   to facilitate cwnd adjustment.

   Unlike TCP, an SCTP sender MUST keep a set of these control variables
   cwnd, ssthresh, and partial_bytes_acked for EACH destination address
   of its peer (when its peer is multi-homed). Only one rwnd is kept
   for the whole association (no matter if the peer is multi-homed or
   has a single address).

7.2.1. Slow-Start

   Beginning data transmission into a network with unknown conditions or
   after a sufficiently long idle period requires SCTP to probe the
   network to determine the available capacity.
The slow-start 4288 algorithm is used for this purpose at the beginning of a transfer, or 4289 after repairing loss detected by the retransmission timer. 4291 o The initial cwnd before DATA transmission or after a sufficiently 4292 long idle period MUST be set to min(4*MTU, max(2*MTU, 4380 4293 bytes)). 4294 o The initial cwnd after a retransmission timeout MUST be no more 4295 than 1*MTU. 4296 o The initial value of ssthresh MAY be arbitrarily high (for 4297 example, implementations MAY use the size of the receiver 4298 advertised window). 4299 o Whenever cwnd is greater than zero, the endpoint is allowed to 4300 have cwnd bytes of data outstanding on that transport address. 4301 o When cwnd is less than or equal to ssthresh, an SCTP endpoint MUST 4302 use the slow-start algorithm to increase cwnd only if the current 4303 congestion window is being fully utilized, an incoming SACK 4304 advances the Cumulative TSN Ack Point, and the data sender is not 4305 in Fast Recovery. Only when these three conditions are met can 4306 the cwnd be increased; otherwise, the cwnd MUST NOT be increased. 4307 If these conditions are met, then cwnd MUST be increased by, at 4308 most, the lesser of 1) the total size of the previously 4309 outstanding DATA chunk(s) acknowledged, and 2) the destination's 4310 path MTU. This upper bound protects against the ACK-Splitting 4311 attack outlined in [SAVAGE99]. 4313 In instances where its peer endpoint is multi-homed, if an endpoint 4314 receives a SACK that advances its Cumulative TSN Ack Point, then it 4315 should update its cwnd (or cwnds) apportioned to the destination 4316 addresses to which it transmitted the acknowledged data. However, if 4317 the received SACK does not advance the Cumulative TSN Ack Point, the 4318 endpoint MUST NOT adjust the cwnd of any of the destination 4319 addresses.
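
The slow-start rules above can be sketched in C as follows. This is a non-normative illustration, not part of the specification: the structure, field, and function names are hypothetical, and "fully utilized" is interpreted here as "flightsize was at least cwnd before the SACK arrived", which is an assumption of this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Per-destination congestion state (names are illustrative only). */
struct sctp_path_cc {
    uint32_t cwnd;        /* congestion window, in bytes */
    uint32_t ssthresh;    /* slow-start threshold, in bytes */
    uint32_t pmtu;        /* path MTU of this destination, in bytes */
    uint32_t flightsize;  /* bytes outstanding before the SACK arrived */
};

static uint32_t min_u32(uint32_t a, uint32_t b) { return a < b ? a : b; }
static uint32_t max_u32(uint32_t a, uint32_t b) { return a > b ? a : b; }

/* Initial cwnd: min(4*MTU, max(2*MTU, 4380 bytes)). */
uint32_t sctp_initial_cwnd(uint32_t mtu)
{
    return min_u32(4 * mtu, max_u32(2 * mtu, 4380));
}

/*
 * Slow-start update on SACK arrival.  bytes_acked is the total size of
 * the previously outstanding DATA chunks acknowledged by this SACK.
 * Returns the number of bytes by which cwnd was increased.
 */
uint32_t sctp_slow_start_on_sack(struct sctp_path_cc *p, uint32_t bytes_acked,
                                 bool cum_tsn_advanced, bool in_fast_recovery)
{
    if (p->cwnd > p->ssthresh)
        return 0;               /* congestion avoidance applies instead */
    if (!cum_tsn_advanced || in_fast_recovery)
        return 0;               /* cwnd MUST NOT be increased */
    if (p->flightsize < p->cwnd)
        return 0;               /* window was not fully utilized (assumed test) */

    /* Increase by at most the lesser of the acknowledged bytes and the PMTU. */
    uint32_t inc = min_u32(bytes_acked, p->pmtu);
    p->cwnd += inc;
    return inc;
}
```

With an MTU of 1500 bytes, the initial cwnd evaluates to 4380 bytes; a SACK that acknowledges 3000 bytes under the three conditions grows cwnd by only 1500 bytes, which is the cap that limits the effect of ACK splitting.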
4321 Because an endpoint's cwnd is not tied to its Cumulative TSN Ack 4322 Point, as duplicate SACKs come in, even though they may not advance 4323 the Cumulative TSN Ack Point an endpoint can still use them to clock 4324 out new data. That is, the data newly acknowledged by the SACK 4325 diminishes the amount of data now in flight to less than cwnd, and so 4326 the current, unchanged value of cwnd now allows new data to be sent. 4327 On the other hand, the increase of cwnd must be tied to the 4328 Cumulative TSN Ack Point advancement as specified above. Otherwise, 4329 the duplicate SACKs will not only clock out new data, but also will 4330 adversely clock out more new data than what has just left the 4331 network, during a time of possible congestion. 4333 o When the endpoint does not transmit data on a given transport 4334 address, the cwnd of the transport address should be adjusted to 4335 max(cwnd/2, 4*MTU) per RTO. 4337 7.2.2. Congestion Avoidance 4339 When cwnd is greater than ssthresh, cwnd should be incremented by 4340 1*MTU per RTT if the sender has cwnd or more bytes of data 4341 outstanding for the corresponding transport address. 4343 In practice, an implementation can achieve this goal in the following 4344 way: 4346 o partial_bytes_acked is initialized to 0. 4347 o Whenever cwnd is greater than ssthresh, upon each SACK arrival 4348 that advances the Cumulative TSN Ack Point, increase 4349 partial_bytes_acked by the total number of bytes of all new chunks 4350 acknowledged in that SACK including chunks acknowledged by the new 4351 Cumulative TSN Ack and by Gap Ack Blocks. 4352 o When partial_bytes_acked is equal to or greater than cwnd and 4353 before the arrival of the SACK the sender had cwnd or more bytes 4354 of data outstanding (i.e., before arrival of the SACK, flightsize 4355 was greater than or equal to cwnd), increase cwnd by MTU, and 4356 reset partial_bytes_acked to (partial_bytes_acked - cwnd). 
4357 o Same as in the slow start, when the sender does not transmit DATA 4358 on a given transport address, the cwnd of the transport address 4359 should be adjusted to max(cwnd / 2, 4*MTU) per RTO. 4360 o When all of the data transmitted by the sender has been 4361 acknowledged by the receiver, partial_bytes_acked is initialized 4362 to 0. 4364 7.2.3. Congestion Control 4366 Upon detection of packet losses from SACK (see Section 7.2.4), an 4367 endpoint should do the following: 4369 ssthresh = max(cwnd/2, 4*MTU) 4370 cwnd = ssthresh 4371 partial_bytes_acked = 0 4373 Basically, a packet loss causes cwnd to be cut in half. 4375 When the T3-rtx timer expires on an address, SCTP should perform slow 4376 start by: 4378 ssthresh = max(cwnd/2, 4*MTU) 4379 cwnd = 1*MTU 4381 and ensure that no more than one SCTP packet will be in flight for 4382 that address until the endpoint receives acknowledgement for 4383 successful delivery of data to that address. 4385 7.2.4. Fast Retransmit on Gap Reports 4387 In the absence of data loss, an endpoint performs delayed 4388 acknowledgement. However, whenever an endpoint notices a hole in the 4389 arriving TSN sequence, it SHOULD start sending a SACK back every time 4390 a packet arrives carrying data until the hole is filled. 4392 Whenever an endpoint receives a SACK that indicates that some TSNs 4393 are missing, it SHOULD wait for two further miss indications (via 4394 subsequent SACKs for a total of three missing reports) on the same 4395 TSNs before taking action with regard to Fast Retransmit. 4397 Miss indications SHOULD follow the HTNA (Highest TSN Newly 4398 Acknowledged) algorithm. For each incoming SACK, miss indications 4399 are incremented only for missing TSNs prior to the highest TSN newly 4400 acknowledged in the SACK. A newly acknowledged DATA chunk is one not 4401 previously acknowledged in a SACK. 
If an endpoint is in Fast 4402 Recovery and a SACK arrives that advances the Cumulative TSN Ack 4403 Point, the miss indications are incremented for all TSNs reported 4404 missing in the SACK. 4406 When the third consecutive miss indication is received for a TSN(s), 4407 the data sender shall do the following: 4409 1) Mark the DATA chunk(s) with three miss indications for 4410 retransmission. 4411 2) If not in Fast Recovery, adjust the ssthresh and cwnd of the 4412 destination address(es) to which the missing DATA chunks were 4413 last sent, according to the formula described in Section 7.2.3. 4414 3) Determine how many of the earliest (i.e., lowest TSN) DATA chunks 4415 marked for retransmission will fit into a single packet, subject 4416 to constraint of the path MTU of the destination transport 4417 address to which the packet is being sent. Call this value K. 4418 Retransmit those K DATA chunks in a single packet. When a Fast 4419 Retransmit is being performed, the sender SHOULD ignore the value 4420 of cwnd and SHOULD NOT delay retransmission for this single 4421 packet. 4422 4) Restart the T3-rtx timer only if the last SACK acknowledged the 4423 lowest outstanding TSN number sent to that address, or the 4424 endpoint is retransmitting the first outstanding DATA chunk sent 4425 to that address. 4426 5) Mark the DATA chunk(s) as being fast retransmitted and thus 4427 ineligible for a subsequent Fast Retransmit. Those TSNs marked 4428 for retransmission due to the Fast-Retransmit algorithm that did 4429 not fit in the sent datagram carrying K other TSNs are also 4430 marked as ineligible for a subsequent Fast Retransmit. However, 4431 as they are marked for retransmission they will be retransmitted 4432 later on as soon as cwnd allows. 4434 6) If not in Fast Recovery, enter Fast Recovery and mark the highest 4435 outstanding TSN as the Fast Recovery exit point. 
When a SACK 4436 acknowledges all TSNs up to and including this exit point, Fast 4437 Recovery is exited. While in Fast Recovery, the ssthresh and 4438 cwnd SHOULD NOT change for any destinations due to a subsequent 4439 Fast Recovery event (i.e., one SHOULD NOT reduce the cwnd further 4440 due to a subsequent Fast Retransmit). 4442 Note: Before the above adjustments, if the received SACK also 4443 acknowledges new DATA chunks and advances the Cumulative TSN Ack 4444 Point, the cwnd adjustment rules defined in Section 7.2.1 and 4445 Section 7.2.2 must be applied first. 4447 A straightforward implementation of the above keeps a counter for 4448 each TSN hole reported by a SACK. The counter increments for each 4449 consecutive SACK reporting the TSN hole. After reaching 3 and 4450 starting the Fast-Retransmit procedure, the counter resets to 0. 4452 Because cwnd in SCTP indirectly bounds the number of outstanding 4453 TSN's, the effect of TCP Fast Recovery is achieved automatically with 4454 no adjustment to the congestion control window size. 4456 7.3. Path MTU Discovery 4458 [RFC4821], [RFC1981], and [RFC1191] specify "Packetization Layer Path 4459 MTU Discovery", whereby an endpoint maintains an estimate of the 4460 maximum transmission unit (MTU) along a given Internet path and 4461 refrains from sending packets along that path that exceed the MTU, 4462 other than occasional attempts to probe for a change in the Path MTU 4463 (PMTU). [RFC4821] is thorough in its discussion of the MTU discovery 4464 mechanism and strategies for determining the current end-to-end MTU 4465 setting as well as detecting changes in this value. 4467 An endpoint SHOULD apply these techniques, and SHOULD do so on a per- 4468 destination-address basis. 4470 There are two important SCTP-specific points regarding Path MTU 4471 discovery: 4473 1) SCTP associations can span multiple addresses. An endpoint MUST 4474 maintain separate MTU estimates for each destination address of 4475 its peer. 
4477 2) The sender should track an association PMTU that will be the 4478 smallest PMTU discovered for all of the peer's destination 4479 addresses. When fragmenting messages into multiple parts this 4480 association PMTU should be used to calculate the size of each 4481 fragment. This will allow retransmissions to be seamlessly sent 4482 to an alternate address without encountering IP fragmentation. 4484 8. Fault Management 4486 8.1. Endpoint Failure Detection 4488 An endpoint shall keep a counter on the total number of consecutive 4489 retransmissions to its peer (this includes retransmissions to all the 4490 destination transport addresses of the peer if it is multi-homed), 4491 including unacknowledged HEARTBEAT chunks. If the value of this 4492 counter exceeds the limit indicated in the protocol parameter 4493 'Association.Max.Retrans', the endpoint shall consider the peer 4494 endpoint unreachable and shall stop transmitting any more data to it 4495 (and thus the association enters the CLOSED state). In addition, the 4496 endpoint MAY report the failure to the upper layer and optionally 4497 report back all outstanding user data remaining in its outbound 4498 queue. The association is automatically closed when the peer 4499 endpoint becomes unreachable. 4501 The counter shall be reset each time a DATA chunk sent to that peer 4502 endpoint is acknowledged (by the reception of a SACK) or a HEARTBEAT 4503 ACK is received from the peer endpoint. 4505 8.2. Path Failure Detection 4507 When its peer endpoint is multi-homed, an endpoint should keep an 4508 error counter for each of the destination transport addresses of the 4509 peer endpoint. 4511 Each time the T3-rtx timer expires on any address, or when a 4512 HEARTBEAT sent to an idle address is not acknowledged within an RTO, 4513 the error counter of that destination address will be incremented. 
4514 When the value in the error counter exceeds the protocol parameter 4515 'Path.Max.Retrans' of that destination address, the endpoint should 4516 mark the destination transport address as inactive, and a 4517 notification SHOULD be sent to the upper layer. 4519 When an outstanding TSN is acknowledged or a HEARTBEAT sent to that 4520 address is acknowledged with a HEARTBEAT ACK, the endpoint shall 4521 clear the error counter of the destination transport address to which 4522 the DATA chunk was last sent (or HEARTBEAT was sent). When the peer 4523 endpoint is multi-homed and the last chunk sent to it was a 4524 retransmission to an alternate address, there exists an ambiguity as 4525 to whether or not the acknowledgement should be credited to the 4526 address of the last chunk sent. However, this ambiguity does not 4527 seem to bear any significant consequence to SCTP behavior. If this 4528 ambiguity is undesirable, the transmitter may choose not to clear the 4529 error counter if the last chunk sent was a retransmission. 4531 Note: When configuring the SCTP endpoint, the user should avoid 4532 having the value of 'Association.Max.Retrans' larger than the 4533 summation of the 'Path.Max.Retrans' of all the destination addresses 4534 for the remote endpoint. Otherwise, all the destination addresses 4535 may become inactive while the endpoint still considers the peer 4536 endpoint reachable. When this condition occurs, how SCTP chooses to 4537 function is implementation specific. 4539 When the primary path is marked inactive (due to excessive 4540 retransmissions, for instance), the sender MAY automatically transmit 4541 new packets to an alternate destination address if one exists and is 4542 active. If more than one alternate address is active when the 4543 primary path is marked inactive, only ONE transport address SHOULD be 4544 chosen and used as the new destination transport address. 4546 8.3. 
Path Heartbeat 4548 By default, an SCTP endpoint SHOULD monitor the reachability of the 4549 idle destination transport address(es) of its peer by sending a 4550 HEARTBEAT chunk periodically to the destination transport 4551 address(es). HEARTBEAT sending MAY begin upon reaching the 4552 ESTABLISHED state and is discontinued after sending either SHUTDOWN 4553 or SHUTDOWN-ACK. A receiver of a HEARTBEAT MUST respond to a 4554 HEARTBEAT with a HEARTBEAT-ACK after entering the COOKIE-ECHOED state 4555 (INIT sender) or the ESTABLISHED state (INIT receiver), up until 4556 reaching the SHUTDOWN-SENT state (SHUTDOWN sender) or the SHUTDOWN- 4557 ACK-SENT state (SHUTDOWN receiver). 4559 A destination transport address is considered "idle" if no new chunk 4560 that can be used for updating path RTT (usually including first 4561 transmission DATA, INIT, COOKIE ECHO, HEARTBEAT, etc.) and no 4562 HEARTBEAT has been sent to it within the current heartbeat period of 4563 that address. This applies to both active and inactive destination 4564 addresses. 4566 The upper layer can optionally initiate the following functions: 4568 A) Disable heartbeat on a specific destination transport address of 4569 a given association, 4571 B) Change the HB.interval, 4573 C) Re-enable heartbeat on a specific destination transport address 4574 of a given association, and 4576 D) Request an on-demand HEARTBEAT on a specific destination 4577 transport address of a given association. 4579 The endpoint should increment the respective error counter of the 4580 destination transport address each time a HEARTBEAT is sent to that 4581 address and not acknowledged within one RTO. 4583 When the value of this counter reaches the protocol parameter 4584 'Path.Max.Retrans', the endpoint should mark the corresponding 4585 destination address as inactive if it is not so marked, and may also 4586 optionally report to the upper layer the change of reachability of 4587 this destination address. 
After this, the endpoint should continue 4588 HEARTBEAT on this destination address but should stop increasing the 4589 counter. 4591 The sender of the HEARTBEAT chunk should include in the Heartbeat 4592 Information field of the chunk the current time when the packet is 4593 sent out and the destination address to which the packet is sent. 4595 IMPLEMENTATION NOTE: An alternative implementation of the heartbeat 4596 mechanism that can be used is to increment the error counter variable 4597 every time a HEARTBEAT is sent to a destination. Whenever a 4598 HEARTBEAT ACK arrives, the sender SHOULD clear the error counter of 4599 the destination that the HEARTBEAT was sent to. This in effect would 4600 clear the previously stroked error (and any other error counts as 4601 well). 4603 The receiver of the HEARTBEAT should immediately respond with a 4604 HEARTBEAT ACK that contains the Heartbeat Information TLV, together 4605 with any other received TLVs, copied unchanged from the received 4606 HEARTBEAT chunk. 4608 Upon the receipt of the HEARTBEAT ACK, the sender of the HEARTBEAT 4609 should clear the error counter of the destination transport address 4610 to which the HEARTBEAT was sent, and mark the destination transport 4611 address as active if it is not so marked. The endpoint may 4612 optionally report to the upper layer when an inactive destination 4613 address is marked as active due to the reception of the latest 4614 HEARTBEAT ACK. The receiver of the HEARTBEAT ACK must also clear the 4615 association overall error count as well (as defined in Section 8.1). 4617 The receiver of the HEARTBEAT ACK should also perform an RTT 4618 measurement for that destination transport address using the time 4619 value carried in the HEARTBEAT ACK chunk. 
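
The per-path bookkeeping described in this section can be sketched as below. The sketch is illustrative only: the type and function names are hypothetical, time is assumed to be a millisecond counter taken from a single local clock, and the inactive threshold follows the "reaches 'Path.Max.Retrans'" wording of this section.

```c
#include <stdbool.h>
#include <stdint.h>

/* Per-destination heartbeat state (names are illustrative only). */
struct hb_path {
    uint32_t error_count;   /* error counter of this destination address */
    bool     active;        /* false once the address is marked inactive */
    uint32_t last_rtt_ms;   /* most recent RTT sample from a HEARTBEAT ACK */
};

struct hb_assoc {
    uint32_t overall_error_count;   /* association counter of Section 8.1 */
};

/*
 * A HEARTBEAT was not acknowledged within one RTO.
 * Returns true if the address was newly marked inactive, so that the
 * caller can optionally notify the upper layer.
 */
bool hb_on_heartbeat_timeout(struct hb_path *p, uint32_t path_max_retrans)
{
    if (!p->active)
        return false;   /* keep sending HEARTBEATs, but stop counting */
    p->error_count++;
    if (p->error_count >= path_max_retrans) {
        p->active = false;
        return true;
    }
    return false;
}

/*
 * A HEARTBEAT ACK arrived.  sent_ms is the send time copied back in the
 * Heartbeat Information; now_ms is the local clock at reception.
 * Returns true if the address changed from inactive to active.
 */
bool hb_on_heartbeat_ack(struct hb_assoc *a, struct hb_path *p,
                         uint32_t sent_ms, uint32_t now_ms)
{
    bool reactivated = !p->active;

    p->error_count = 0;                  /* clear the path error counter */
    p->active = true;                    /* mark the address active */
    a->overall_error_count = 0;          /* clear the association counter */
    p->last_rtt_ms = now_ms - sent_ms;   /* unsigned math tolerates wrap */
    return reactivated;
}
```

Feeding last_rtt_ms into the RTO calculation is left to the caller; the sketch only records the sample.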
4621 On an idle destination address that is allowed to heartbeat, it is 4622 recommended that a HEARTBEAT chunk is sent once per RTO of that 4623 destination address plus the protocol parameter 'HB.interval', with 4624 jittering of +/- 50% of the RTO value, and exponential backoff of the 4625 RTO if the previous HEARTBEAT is unanswered. 4627 A primitive is provided for the SCTP user to change the HB.interval 4628 and turn on or off the heartbeat on a given destination address. The 4629 heartbeat interval set by the SCTP user is added to the RTO of that 4630 destination (including any exponential backoff). Only one heartbeat 4631 should be sent each time the heartbeat timer expires (if multiple 4632 destinations are idle). It is an implementation decision on how to 4633 choose which of the candidate idle destinations to heartbeat to (if 4634 more than one destination is idle). 4636 Note: When tuning the heartbeat interval, there is a side effect that 4637 SHOULD be taken into account. When this value is increased, i.e., 4638 the HEARTBEAT takes longer, the detection of lost ABORT messages 4639 takes longer as well. If a peer endpoint ABORTs the association for 4640 any reason and the ABORT chunk is lost, the local endpoint will only 4641 discover the lost ABORT by sending a DATA chunk or HEARTBEAT chunk 4642 (thus causing the peer to send another ABORT). This must be 4643 considered when tuning the HEARTBEAT timer. If the HEARTBEAT is 4644 disabled, only sending DATA to the association will discover a lost 4645 ABORT from the peer. 4647 8.4. Handle "Out of the Blue" Packets 4649 An SCTP packet is called an "out of the blue" (OOTB) packet if it is 4650 correctly formed (i.e., passed the receiver's CRC32c check; see 4651 Section 6.8), but the receiver is not able to identify the 4652 association to which this packet belongs. 
4654 The receiver of an OOTB packet MUST do the following: 4656 1) If the OOTB packet is to or from a non-unicast address, a 4657 receiver SHOULD silently discard the packet. Otherwise, 4658 2) If the OOTB packet contains an ABORT chunk, the receiver MUST 4659 silently discard the OOTB packet and take no further action. 4660 Otherwise, 4661 3) If the packet contains an INIT chunk with a Verification Tag set 4662 to '0', process it as described in Section 5.1. If, for whatever 4663 reason, the INIT cannot be processed normally and an ABORT has to 4664 be sent in response, the Verification Tag of the packet 4665 containing the ABORT chunk MUST be the Initiate Tag of the 4666 received INIT chunk, and the T bit of the ABORT chunk has to be 4667 set to 0, indicating that the Verification Tag is NOT reflected. 4668 4) If the packet contains a COOKIE ECHO in the first chunk, process 4669 it as described in Section 5.1. Otherwise, 4670 5) If the packet contains a SHUTDOWN ACK chunk, the receiver should 4671 respond to the sender of the OOTB packet with a SHUTDOWN 4672 COMPLETE. When sending the SHUTDOWN COMPLETE, the receiver of 4673 the OOTB packet must fill in the Verification Tag field of the 4674 outbound packet with the Verification Tag received in the 4675 SHUTDOWN ACK and set the T bit in the Chunk Flags to indicate 4676 that the Verification Tag is reflected. Otherwise, 4677 6) If the packet contains a SHUTDOWN COMPLETE chunk, the receiver 4678 should silently discard the packet and take no further action. 4679 Otherwise, 4680 7) If the packet contains a "Stale Cookie" ERROR or a COOKIE ACK, 4681 the SCTP packet should be silently discarded. Otherwise, 4682 8) The receiver should respond to the sender of the OOTB packet with 4683 an ABORT. 
When sending the ABORT, the receiver of the OOTB 4684 packet MUST fill in the Verification Tag field of the outbound 4685 packet with the value found in the Verification Tag field of the 4686 OOTB packet and set the T bit in the Chunk Flags to indicate that 4687 the Verification Tag is reflected. After sending this ABORT, the 4688 receiver of the OOTB packet shall discard the OOTB packet and 4689 take no further action. 4691 8.5. Verification Tag 4693 The Verification Tag rules defined in this section apply when sending 4694 or receiving SCTP packets that do not contain an INIT, SHUTDOWN 4695 COMPLETE, COOKIE ECHO (see Section 5.1), ABORT, or SHUTDOWN ACK 4696 chunk. The rules for sending and receiving SCTP packets containing 4697 one of these chunk types are discussed separately in Section 8.5.1. 4699 When sending an SCTP packet, the endpoint MUST fill in the 4700 Verification Tag field of the outbound packet with the tag value in 4701 the Initiate Tag parameter of the INIT or INIT ACK received from its 4702 peer. 4704 When receiving an SCTP packet, the endpoint MUST ensure that the 4705 value in the Verification Tag field of the received SCTP packet 4706 matches its own tag. If the received Verification Tag value does not 4707 match the receiver's own tag value, the receiver shall silently 4708 discard the packet and shall not process it any further except for 4709 those cases listed in Section 8.5.1 below. 4711 8.5.1. Exceptions in Verification Tag Rules 4713 A) Rules for packet carrying INIT: 4715 * The sender MUST set the Verification Tag of the packet to 0. 4717 * When an endpoint receives an SCTP packet with the Verification 4718 Tag set to 0, it should verify that the packet contains only an 4719 INIT chunk. Otherwise, the receiver MUST silently discard the 4720 packet. 
4722 B) Rules for packet carrying ABORT: 4724 * The endpoint MUST always fill in the Verification Tag field of 4725 the outbound packet with the destination endpoint's tag value, 4726 if it is known. 4728 * If the ABORT is sent in response to an OOTB packet, the 4729 endpoint MUST follow the procedure described in Section 8.4. 4731 * The receiver of an ABORT MUST accept the packet if the 4732 Verification Tag field of the packet matches its own tag and 4733 the T bit is not set OR if it is set to its peer's tag and the 4734 T bit is set in the Chunk Flags. Otherwise, the receiver MUST 4735 silently discard the packet and take no further action. 4737 C) Rules for packet carrying SHUTDOWN COMPLETE: 4739 * When sending a SHUTDOWN COMPLETE, if the receiver of the 4740 SHUTDOWN ACK has a TCB, then the destination endpoint's tag 4741 MUST be used, and the T bit MUST NOT be set. Only where no TCB 4742 exists should the sender use the Verification Tag from the 4743 SHUTDOWN ACK, and MUST set the T bit. 4745 * The receiver of a SHUTDOWN COMPLETE shall accept the packet if 4746 the Verification Tag field of the packet matches its own tag 4747 and the T bit is not set OR if it is set to its peer's tag and 4748 the T bit is set in the Chunk Flags. Otherwise, the receiver 4749 MUST silently discard the packet and take no further action. 4750 An endpoint MUST ignore the SHUTDOWN COMPLETE if it is not in 4751 the SHUTDOWN-ACK-SENT state. 4753 D) Rules for packet carrying a COOKIE ECHO 4755 * When sending a COOKIE ECHO, the endpoint MUST use the value of 4756 the Initiate Tag received in the INIT ACK. 4758 * The receiver of a COOKIE ECHO follows the procedures in 4759 Section 5. 4761 E) Rules for packet carrying a SHUTDOWN ACK 4763 * If the receiver is in COOKIE-ECHOED or COOKIE-WAIT state the 4764 procedures in Section 8.4 SHOULD be followed; in other words, 4765 it should be treated as an Out Of The Blue packet. 4767 9. 
Termination of Association 4769 An endpoint should terminate its association when it exits from 4770 service. An association can be terminated by either abort or 4771 shutdown. An abort of an association is abortive by definition in 4772 that any data pending on either end of the association is discarded 4773 and not delivered to the peer. A shutdown of an association is 4774 considered a graceful close where all data in queue by either 4775 endpoint is delivered to the respective peers. However, in the case 4776 of a shutdown, SCTP does not support a half-open state (like TCP) 4777 wherein one side may continue sending data while the other end is 4778 closed. When either endpoint performs a shutdown, the association on 4779 each peer will stop accepting new data from its user and only deliver 4780 data in queue at the time of sending or receiving the SHUTDOWN chunk. 4782 9.1. Abort of an Association 4784 When an endpoint decides to abort an existing association, it MUST 4785 send an ABORT chunk to its peer endpoint. The sender MUST fill in 4786 the peer's Verification Tag in the outbound packet and MUST NOT 4787 bundle any DATA chunk with the ABORT. If the association is aborted 4788 on request of the upper layer, a User-Initiated Abort error cause 4789 (see Section 3.3.10.12) SHOULD be present in the ABORT chunk. 4791 An endpoint MUST NOT respond to any received packet that contains an 4792 ABORT chunk (also see Section 8.4). 4794 An endpoint receiving an ABORT MUST apply the special Verification 4795 Tag check rules described in Section 8.5.1. 4797 After checking the Verification Tag, the receiving endpoint MUST 4798 remove the association from its record and SHOULD report the 4799 termination to its upper layer. If a User-Initiated Abort error 4800 cause is present in the ABORT chunk, the Upper Layer Abort Reason 4801 SHOULD be made available to the upper layer. 4803 9.2. 
Shutdown of an Association 4805 Using the SHUTDOWN primitive (see Section 10.1), the upper layer of 4806 an endpoint in an association can gracefully close the association. 4807 This will allow all outstanding DATA chunks from the peer of the 4808 shutdown initiator to be delivered before the association terminates. 4810 Upon receipt of the SHUTDOWN primitive from its upper layer, the 4811 endpoint enters the SHUTDOWN-PENDING state and remains there until 4812 all outstanding data has been acknowledged by its peer. The endpoint 4813 accepts no new data from its upper layer, but retransmits data to the 4814 far end if necessary to fill gaps. 4816 Once all its outstanding data has been acknowledged, the endpoint 4817 shall send a SHUTDOWN chunk to its peer including in the Cumulative 4818 TSN Ack field the last sequential TSN it has received from the peer. 4819 It shall then start the T2-shutdown timer and enter the SHUTDOWN-SENT 4820 state. If the timer expires, the endpoint must resend the SHUTDOWN 4821 with the updated last sequential TSN received from its peer. 4823 The rules in Section 6.3 MUST be followed to determine the proper 4824 timer value for T2-shutdown. To indicate any gaps in TSN, the 4825 endpoint may also bundle a SACK with the SHUTDOWN chunk in the same 4826 SCTP packet. 4828 An endpoint should limit the number of retransmissions of the 4829 SHUTDOWN chunk to the protocol parameter 'Association.Max.Retrans'. 4830 If this threshold is exceeded, the endpoint should destroy the TCB 4831 and MUST report the peer endpoint unreachable to the upper layer (and 4832 thus the association enters the CLOSED state). The reception of any 4833 packet from its peer (i.e., as the peer sends all of its queued DATA 4834 chunks) should clear the endpoint's retransmission count and restart 4835 the T2-shutdown timer, giving its peer ample opportunity to transmit 4836 all of its queued DATA chunks that have not yet been sent. 
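
The behavior of the shutdown initiator described so far can be sketched as a small state machine. This is a non-normative illustration under stated assumptions: all names are hypothetical, timers are left to the caller, and "threshold is exceeded" is read as "a further retransmission would push the count past 'Association.Max.Retrans'".

```c
#include <stdint.h>

enum sd_state  { SD_ESTABLISHED, SD_SHUTDOWN_PENDING, SD_SHUTDOWN_SENT, SD_CLOSED };
enum sd_action { SD_NONE, SD_SEND_SHUTDOWN, SD_REPORT_UNREACHABLE };

struct sd_assoc {
    enum sd_state state;
    uint32_t outstanding_bytes;   /* DATA not yet acknowledged by the peer */
    uint32_t shutdown_retrans;    /* retransmissions of SHUTDOWN so far */
    uint32_t assoc_max_retrans;   /* 'Association.Max.Retrans' */
};

/* Call after each SACK: leave SHUTDOWN-PENDING once everything is acked. */
enum sd_action sd_check_pending(struct sd_assoc *a)
{
    if (a->state == SD_SHUTDOWN_PENDING && a->outstanding_bytes == 0) {
        a->state = SD_SHUTDOWN_SENT;   /* caller starts T2-shutdown */
        a->shutdown_retrans = 0;
        return SD_SEND_SHUTDOWN;
    }
    return SD_NONE;
}

/* The upper layer invoked the SHUTDOWN primitive. */
enum sd_action sd_on_ulp_shutdown(struct sd_assoc *a)
{
    if (a->state != SD_ESTABLISHED)
        return SD_NONE;
    a->state = SD_SHUTDOWN_PENDING;    /* no new user data is accepted */
    return sd_check_pending(a);
}

/* The T2-shutdown timer expired. */
enum sd_action sd_on_t2_expiry(struct sd_assoc *a)
{
    if (a->state != SD_SHUTDOWN_SENT)
        return SD_NONE;
    if (a->shutdown_retrans >= a->assoc_max_retrans) {
        a->state = SD_CLOSED;          /* destroy the TCB */
        return SD_REPORT_UNREACHABLE;
    }
    a->shutdown_retrans++;
    return SD_SEND_SHUTDOWN;   /* resend with updated Cumulative TSN Ack */
}

/* Any packet from the peer arrived while waiting in SHUTDOWN-SENT. */
void sd_on_peer_packet(struct sd_assoc *a)
{
    if (a->state == SD_SHUTDOWN_SENT)
        a->shutdown_retrans = 0;       /* caller restarts T2-shutdown */
}
```

Resetting the count on every packet from the peer is what gives the peer time to drain its queued DATA chunks without the initiator giving up.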
4838 Upon reception of the SHUTDOWN, the peer endpoint shall 4840 o enter the SHUTDOWN-RECEIVED state, 4842 o stop accepting new data from its SCTP user, and 4844 o verify, by checking the Cumulative TSN Ack field of the chunk, 4845 that all its outstanding DATA chunks have been received by the 4846 SHUTDOWN sender. 4848 Once an endpoint has reached the SHUTDOWN-RECEIVED state, it MUST NOT 4849 send a SHUTDOWN in response to a ULP request, and should discard 4850 subsequent SHUTDOWN chunks. 4852 If there are still outstanding DATA chunks left, the SHUTDOWN 4853 receiver MUST continue to follow normal data transmission procedures 4854 defined in Section 6, until all outstanding DATA chunks are 4855 acknowledged; however, the SHUTDOWN receiver MUST NOT accept new data 4856 from its SCTP user. 4858 While in the SHUTDOWN-SENT state, the SHUTDOWN sender MUST 4859 immediately respond to each received packet containing one or more 4860 DATA chunks with a SHUTDOWN chunk and restart the T2-shutdown timer. 4861 If a SHUTDOWN chunk by itself cannot acknowledge all of the received 4862 DATA chunks (i.e., there are TSNs that can be acknowledged that are 4863 larger than the cumulative TSN, and thus gaps exist in the TSN 4864 sequence), or if duplicate TSNs have been received, then a SACK chunk 4865 MUST also be sent. 4867 The sender of the SHUTDOWN MAY also start an overall guard timer 'T5- 4868 shutdown-guard' to bound the overall time for the shutdown sequence. 4869 At the expiration of this timer, the sender SHOULD abort the 4870 association by sending an ABORT chunk. If the 'T5-shutdown-guard' 4871 timer is used, it SHOULD be set to the recommended value of 5 times 4872 'RTO.Max'. 4874 If the receiver of the SHUTDOWN has no more outstanding DATA chunks, 4875 the SHUTDOWN receiver MUST send a SHUTDOWN ACK and start a T2- 4876 shutdown timer of its own, entering the SHUTDOWN-ACK-SENT state. If 4877 the timer expires, the endpoint must resend the SHUTDOWN ACK.
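
The SHUTDOWN receiver's transitions and the guard timer value described above can be sketched as follows. The sketch is illustrative and non-normative: the names are hypothetical, the caller is assumed to have applied the Cumulative TSN Ack of the SHUTDOWN to outstanding_chunks before calling in, and states other than ESTABLISHED and SHUTDOWN-RECEIVED are deliberately left out.

```c
#include <stdint.h>

enum sr_state  { SR_ESTABLISHED, SR_SHUTDOWN_RECEIVED, SR_SHUTDOWN_ACK_SENT };
enum sr_action { SR_NONE, SR_SEND_SHUTDOWN_ACK, SR_DISCARD };

struct sr_assoc {
    enum sr_state state;
    uint32_t outstanding_chunks;   /* DATA chunks not yet acknowledged */
};

/* Recommended 'T5-shutdown-guard' value: 5 times 'RTO.Max'. */
uint32_t sr_t5_guard_ms(uint32_t rto_max_ms)
{
    return 5 * rto_max_ms;
}

/*
 * Call whenever outstanding DATA may have been acknowledged.  Once no
 * DATA is outstanding in SHUTDOWN-RECEIVED, the SHUTDOWN ACK is sent.
 */
enum sr_action sr_on_data_acked(struct sr_assoc *a)
{
    if (a->state == SR_SHUTDOWN_RECEIVED && a->outstanding_chunks == 0) {
        a->state = SR_SHUTDOWN_ACK_SENT;   /* caller starts its T2-shutdown */
        return SR_SEND_SHUTDOWN_ACK;
    }
    return SR_NONE;
}

/* A SHUTDOWN chunk arrived from the peer. */
enum sr_action sr_on_shutdown(struct sr_assoc *a)
{
    if (a->state == SR_SHUTDOWN_RECEIVED)
        return SR_DISCARD;     /* subsequent SHUTDOWN chunks are discarded */
    if (a->state != SR_ESTABLISHED)
        return SR_NONE;        /* other states are outside this sketch */

    a->state = SR_SHUTDOWN_RECEIVED;   /* stop accepting new user data */
    return sr_on_data_acked(a);
}
```

For example, with 'RTO.Max' configured as 60000 ms, sr_t5_guard_ms returns 300000 ms.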
4879 The sender of the SHUTDOWN ACK should limit the number of 4880 retransmissions of the SHUTDOWN ACK chunk to the protocol parameter 4881 'Association.Max.Retrans'. If this threshold is exceeded, the 4882 endpoint should destroy the TCB and may report the peer endpoint 4883 unreachable to the upper layer (and thus the association enters the 4884 CLOSED state). 4886 Upon the receipt of the SHUTDOWN ACK, the SHUTDOWN sender shall stop 4887 the T2-shutdown timer, send a SHUTDOWN COMPLETE chunk to its peer, 4888 and remove all record of the association. 4890 Upon reception of the SHUTDOWN COMPLETE chunk, the endpoint will 4891 verify that it is in the SHUTDOWN-ACK-SENT state; if it is not, the 4892 chunk should be discarded. If the endpoint is in the SHUTDOWN-ACK- 4893 SENT state, the endpoint should stop the T2-shutdown timer and remove 4894 all knowledge of the association (and thus the association enters the 4895 CLOSED state). 4897 An endpoint SHOULD ensure that all its outstanding DATA chunks have 4898 been acknowledged before initiating the shutdown procedure. 4900 An endpoint should reject any new data request from its upper layer 4901 if it is in the SHUTDOWN-PENDING, SHUTDOWN-SENT, SHUTDOWN-RECEIVED, 4902 or SHUTDOWN-ACK-SENT state. 4904 If an endpoint is in the SHUTDOWN-ACK-SENT state and receives an INIT 4905 chunk (e.g., if the SHUTDOWN COMPLETE was lost) with source and 4906 destination transport addresses (either in the IP addresses or in the 4907 INIT chunk) that belong to this association, it should discard the 4908 INIT chunk and retransmit the SHUTDOWN ACK chunk. 4910 Note: Receipt of an INIT with the same source and destination IP 4911 addresses as used in transport addresses assigned to an endpoint but 4912 with a different port number indicates the initialization of a 4913 separate association. 
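The state checks described above can be sketched as follows. The function names and the boolean `belongs_to_association` parameter are hypothetical; a real implementation would compare the transport addresses carried in the packet and in the INIT chunk against the association's address set.

```python
SHUTDOWN_STATES = {
    "SHUTDOWN-PENDING",
    "SHUTDOWN-SENT",
    "SHUTDOWN-RECEIVED",
    "SHUTDOWN-ACK-SENT",
}


def accepts_new_data(state):
    """New data requests from the upper layer are rejected during shutdown."""
    return state not in SHUTDOWN_STATES


def on_shutdown_complete(state):
    """Return the state after a SHUTDOWN COMPLETE chunk arrives."""
    if state != "SHUTDOWN-ACK-SENT":
        return state   # the chunk is discarded, nothing changes
    return "CLOSED"    # stop T2-shutdown and remove the association


def on_init_in_shutdown_ack_sent(belongs_to_association):
    """Return the chunk to send when an INIT arrives in SHUTDOWN-ACK-SENT.

    None means the INIT is not for this association and is handled
    elsewhere as the start of a separate association.
    """
    if belongs_to_association:
        return "SHUTDOWN ACK"  # discard the INIT, retransmit SHUTDOWN ACK
    return None
```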
4915 The sender of the INIT or COOKIE ECHO should respond to the receipt 4916 of a SHUTDOWN ACK with a stand-alone SHUTDOWN COMPLETE in an SCTP 4917 packet with the Verification Tag field of its common header set to 4918 the same tag that was received in the SHUTDOWN ACK packet. This is 4919 considered an Out of the Blue packet as defined in Section 8.4. The 4920 sender of the INIT lets T1-init continue running and remains in the 4921 COOKIE-WAIT or COOKIE-ECHOED state. Normal T1-init timer expiration 4922 will cause the INIT or COOKIE ECHO chunk to be retransmitted and thus 4923 start a new association. 4925 If a SHUTDOWN is received in the COOKIE-WAIT or COOKIE-ECHOED state, 4926 the SHUTDOWN chunk SHOULD be silently discarded. 4928 If an endpoint is in the SHUTDOWN-SENT state and receives a SHUTDOWN 4929 chunk from its peer, the endpoint shall respond immediately with a 4930 SHUTDOWN ACK to its peer, and move into the SHUTDOWN-ACK-SENT state 4931 restarting its T2-shutdown timer. 4933 If an endpoint is in the SHUTDOWN-ACK-SENT state and receives a 4934 SHUTDOWN ACK, it shall stop the T2-shutdown timer, send a SHUTDOWN 4935 COMPLETE chunk to its peer, and remove all record of the association. 4937 10. Interface with Upper Layer 4939 The Upper Layer Protocols (ULPs) shall request services by passing 4940 primitives to SCTP and shall receive notifications from SCTP for 4941 various events. 4943 The primitives and notifications described in this section should be 4944 used as a guideline for implementing SCTP. The following functional 4945 description of ULP interface primitives is shown for illustrative 4946 purposes. Different SCTP implementations may have different ULP 4947 interfaces. However, all SCTPs must provide a certain minimum set of 4948 services to guarantee that all SCTP implementations can support the 4949 same protocol hierarchy. 4951 10.1. ULP-to-SCTP 4953 The following sections functionally characterize a ULP/SCTP 4954 interface.
The notation used is similar to most procedure or 4955 function calls in high-level languages. 4957 The ULP primitives described below specify the basic functions that 4958 SCTP must perform to support inter-process communication. Individual 4959 implementations must define their own exact format, and may provide 4960 combinations or subsets of the basic functions in single calls. 4962 A) Initialize 4964 Format: INITIALIZE ([local port],[local eligible address list]) 4965 -> local SCTP instance name 4967 This primitive allows SCTP to initialize its internal data 4968 structures and allocate necessary resources for setting up its 4969 operation environment. Once SCTP is initialized, ULP can 4970 communicate directly with other endpoints without re-invoking 4971 this primitive. 4972 SCTP will return a local SCTP instance name to the ULP. 4973 Mandatory attributes: 4975 * None. 4977 Optional attributes: 4978 The following types of attributes may be passed along with the 4979 primitive: 4981 * local port - SCTP port number, if ULP wants it to be 4982 specified. 4983 * local eligible address list - an address list that the local 4984 SCTP endpoint should bind. By default, if an address list is 4985 not included, all IP addresses assigned to the host should be 4986 used by the local endpoint. 4988 IMPLEMENTATION NOTE: If this optional attribute is supported by 4989 an implementation, it will be the responsibility of the 4990 implementation to enforce that the IP source address field of any 4991 SCTP packets sent out by this endpoint contains one of the IP 4992 addresses indicated in the local eligible address list. 4993 B) Associate 4995 Format: ASSOCIATE(local SCTP instance name, destination transport 4996 addr, outbound stream count) -> association id 4997 [,destination transport addr list] [,outbound stream 4998 count] 5000 This primitive allows the upper layer to initiate an association 5001 to a specific peer endpoint. 
5002 The peer endpoint shall be specified by one of the transport 5003 addresses that defines the endpoint (see Section 1.3). If the 5004 local SCTP instance has not been initialized, the ASSOCIATE is 5005 considered an error. 5007 An association id, which is a local handle to the SCTP 5008 association, will be returned on successful establishment of the 5009 association. If SCTP is not able to open an SCTP association 5010 with the peer endpoint, an error is returned. 5011 Other association parameters may be returned, including the 5012 complete destination transport addresses of the peer as well as 5013 the outbound stream count of the local endpoint. One of the 5014 transport addresses from the returned destination addresses will 5015 be selected by the local endpoint as default primary path for 5016 sending SCTP packets to this peer. The returned "destination 5017 transport addr list" can be used by the ULP to change the default 5018 primary path or to force sending a packet to a specific transport 5019 address. 5020 IMPLEMENTATION NOTE: If ASSOCIATE primitive is implemented as a 5021 blocking function call, the ASSOCIATE primitive can return 5022 association parameters in addition to the association id upon 5023 successful establishment. If ASSOCIATE primitive is implemented 5024 as a non-blocking call, only the association id shall be returned 5025 and association parameters shall be passed using the 5026 COMMUNICATION UP notification. 5027 Mandatory attributes: 5029 * local SCTP instance name - obtained from the INITIALIZE 5030 operation. 5031 * destination transport addr - specified as one of the transport 5032 addresses of the peer endpoint with which the association is 5033 to be established. 5034 * outbound stream count - the number of outbound streams the ULP 5035 would like to open towards this peer endpoint. 5037 Optional attributes: 5039 * None. 5040 C) Shutdown 5042 Format: SHUTDOWN(association id) -> result 5044 Gracefully closes an association. 
Any locally queued user data 5045 will be delivered to the peer. The association will be 5046 terminated only after the peer acknowledges all the SCTP packets 5047 sent. A success code will be returned on successful termination 5048 of the association. If attempting to terminate the association 5049 results in a failure, an error code shall be returned. 5050 Mandatory attributes: 5052 * association id - local handle to the SCTP association. 5054 Optional attributes: 5056 * None. 5057 D) Abort 5059 Format: ABORT(association id [, Upper Layer Abort Reason]) -> 5060 result 5062 Ungracefully closes an association. Any locally queued user data 5063 will be discarded, and an ABORT chunk is sent to the peer. A 5064 success code will be returned on successful abort of the 5065 association. If attempting to abort the association results in a 5066 failure, an error code shall be returned. 5067 Mandatory attributes: 5069 * association id - local handle to the SCTP association. 5071 Optional attributes: 5073 * Upper Layer Abort Reason - reason of the abort to be passed to 5074 the peer. 5076 E) Send 5078 Format: SEND(association id, buffer address, byte count 5079 [,context] [,stream id] [,life time] [,destination 5080 transport address] [,unordered flag] [,no-bundle flag] 5081 [,payload protocol-id] ) -> result 5083 This is the main method to send user data via SCTP. 5084 Mandatory attributes: 5086 * association id - local handle to the SCTP association. 5087 * buffer address - the location where the user message to be 5088 transmitted is stored. 5089 * byte count - the size of the user data in number of bytes. 5091 Optional attributes: 5093 * context - an optional 32-bit integer that will be carried in 5094 the sending failure notification to the ULP if the 5095 transportation of this user message fails. 5096 * stream id - to indicate which stream to send the data on. If 5097 not specified, stream 0 will be used.
5098 * life time - specifies the life time of the user data. The 5099 user data will not be sent by SCTP after the life time 5100 expires. This parameter can be used to avoid efforts to 5101 transmit stale user messages. SCTP notifies the ULP if the 5102 data cannot be initiated to transport (i.e., sent to the 5103 destination via SCTP's send primitive) within the life time 5104 variable. However, the user data will be transmitted if SCTP 5105 has attempted to transmit a chunk before the life time 5106 expired. 5107 IMPLEMENTATION NOTE: In order to better support the data life 5108 time option, the transmitter may hold back the assigning of 5109 the TSN number to an outbound DATA chunk to the last moment. 5110 And, for implementation simplicity, once a TSN number has been 5111 assigned the sender should consider the send of this DATA 5112 chunk as committed, overriding any life time option attached 5113 to the DATA chunk. 5114 * destination transport address - specified as one of the 5115 destination transport addresses of the peer endpoint to which 5116 this packet should be sent. Whenever possible, SCTP should 5117 use this destination transport address for sending the 5118 packets, instead of the current primary path. 5119 * unordered flag - this flag, if present, indicates that the 5120 user would like the data delivered in an unordered fashion to 5121 the peer (i.e., the U flag is set to 1 on all DATA chunks 5122 carrying this message). 5123 * no-bundle flag - instructs SCTP not to bundle this user data 5124 with other outbound DATA chunks. SCTP MAY still bundle even 5125 when this flag is present, when faced with network congestion. 5126 * payload protocol-id - a 32-bit unsigned integer that is to be 5127 passed to the peer indicating the type of payload protocol 5128 data being transmitted. This value is passed as opaque data 5129 by SCTP. 
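The life time rule for the Send primitive, together with the implementation note about assigning the TSN at the last moment, can be sketched as below. This is a minimal illustration under stated assumptions: `QueuedMessage`, `pick_next`, and the use of seconds as the time unit are hypothetical names and choices, not part of the interface described here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class QueuedMessage:
    payload: bytes
    enqueued_at: float                 # seconds, same clock as `now`
    life_time: Optional[float] = None  # seconds; None means no limit
    tsn: Optional[int] = None          # assigned only when transmission starts


def pick_next(queue, now, next_tsn):
    """Pop messages from the front of `queue` until one may be sent.

    Returns (message_or_None, expired_messages). Each expired message
    would be reported to the ULP as a sending failure.
    """
    expired = []
    while queue:
        msg = queue.pop(0)
        stale = (
            msg.tsn is None
            and msg.life_time is not None
            and now - msg.enqueued_at > msg.life_time
        )
        if stale:
            expired.append(msg)
            continue
        if msg.tsn is None:
            msg.tsn = next_tsn  # from here on the send is committed
        return msg, expired
    return None, expired
```

A message that already carries a TSN is never dropped by this function, which matches the rule that a committed send overrides the life time option.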
5130 F) Set Primary 5132 Format: SETPRIMARY(association id, destination transport address, 5133 [source transport address]) -> result 5135 Instructs the local SCTP to use the specified destination 5136 transport address as the primary path for sending packets. 5137 The result of attempting this operation shall be returned. If 5138 the specified destination transport address is not present in the 5139 "destination transport address list" returned earlier in an 5140 associate command or communication up notification, an error 5141 shall be returned. 5142 Mandatory attributes: 5144 * association id - local handle to the SCTP association. 5145 * destination transport address - specified as one of the 5146 transport addresses of the peer endpoint, which should be used 5147 as the primary address for sending packets. This overrides 5148 the current primary address information maintained by the 5149 local SCTP endpoint. 5151 Optional attributes: 5153 * source transport address - optionally, some implementations 5154 may allow you to set the default source address placed in all 5155 outgoing IP datagrams. 5156 G) Receive 5158 Format: RECEIVE(association id, buffer address, buffer size 5159 [,stream id]) -> byte count [,transport address] [,stream 5160 id] [,stream sequence number] [,partial flag] [,delivery 5161 number] [,payload protocol-id] 5163 This primitive shall read the first user message in the SCTP in- 5164 queue into the buffer specified by ULP, if there is one 5165 available. The size of the message read, in bytes, will be 5166 returned. It may, depending on the specific implementation, also 5167 return other information such as the sender's address, the stream 5168 id on which it is received, whether there are more messages 5169 available for retrieval, etc. For ordered messages, their Stream 5170 Sequence Number may also be returned. 
5171 Depending upon the implementation, if this primitive is invoked 5172 when no message is available the implementation should return an 5173 indication of this condition or should block the invoking process 5174 until data does become available. 5175 Mandatory attributes: 5177 * association id - local handle to the SCTP association 5178 * buffer address - the memory location indicated by the ULP to 5179 store the received message. 5180 * buffer size - the maximum size of data to be received, in 5181 bytes. 5183 Optional attributes: 5185 * stream id - to indicate which stream to receive the data on. 5186 * Stream Sequence Number - the Stream Sequence Number assigned 5187 by the sending SCTP peer. 5188 * partial flag - if this returned flag is set to 1, then this 5189 Receive contains a partial delivery of the whole message. 5190 When this flag is set, the stream id and Stream Sequence 5191 Number MUST accompany this receive. When this flag is set to 5192 0, it indicates that no more deliveries will be received for 5193 this Stream Sequence Number. 5194 * payload protocol-id - a 32-bit unsigned integer that is 5195 received from the peer indicating the type of payload protocol 5196 of the received data. This value is passed as opaque data by 5197 SCTP. 5198 H) Status 5200 Format: STATUS(association id) -> status data 5201 This primitive should return a data block containing the 5202 following information: 5204 association connection state, 5205 destination transport address list, 5206 destination transport address reachability states, 5207 current receiver window size, 5208 current congestion window sizes, 5209 number of unacknowledged DATA chunks, 5210 number of DATA chunks pending receipt, 5211 primary path, 5212 most recent SRTT on primary path, 5213 RTO on primary path, 5214 SRTT and RTO on other destination addresses, etc. 5216 Mandatory attributes: 5218 * association id - local handle to the SCTP association. 5220 Optional attributes: 5222 * None. 
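One possible shape for the data block returned by the Status primitive is sketched below. The type and field names are assumptions chosen to mirror the listed items; the text above does not prescribe any particular layout, and units such as milliseconds are illustrative.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PathStatus:
    reachable: bool   # destination transport address reachability state
    cwnd: int         # current congestion window, in bytes
    srtt_ms: float    # most recent SRTT on this path
    rto_ms: float     # RTO on this path


@dataclass
class AssociationStatus:
    state: str                        # association connection state
    primary_path: str                 # key into `paths`
    rwnd: int                         # current receiver window size
    unacked_data_chunks: int          # number of unacknowledged DATA chunks
    data_chunks_pending_receipt: int  # number of DATA chunks pending receipt
    paths: Dict[str, PathStatus] = field(default_factory=dict)

    def destination_addresses(self) -> List[str]:
        """The destination transport address list."""
        return list(self.paths)
```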
5223 I) Change Heartbeat 5225 Format: CHANGE HEARTBEAT(association id, destination transport 5226 address, new state [,interval]) -> result 5228 Instructs the local endpoint to enable or disable heartbeat on 5229 the specified destination transport address. 5230 The result of attempting this operation shall be returned. 5231 Note: Even when enabled, heartbeat will not take place if the 5232 destination transport address is not idle. 5233 Mandatory attributes: 5235 * association id - local handle to the SCTP association. 5236 * destination transport address - specified as one of the 5237 transport addresses of the peer endpoint. 5238 * new state - the new state of heartbeat for this destination 5239 transport address (either enabled or disabled). 5241 Optional attributes: 5243 * interval - if present, indicates the frequency of the 5244 heartbeat if this is to enable heartbeat on a destination 5245 transport address. This value is added to the RTO of the 5246 destination transport address. This value, if present, 5247 affects all destinations. 5248 J) Request HeartBeat 5249 Format: REQUESTHEARTBEAT(association id, destination transport 5250 address) -> result 5252 Instructs the local endpoint to perform a HeartBeat on the 5253 specified destination transport address of the given association. 5254 The returned result should indicate whether the transmission of 5255 the HEARTBEAT chunk to the destination address is successful. 5256 Mandatory attributes: 5258 * association id - local handle to the SCTP association. 5259 * destination transport address - the transport address of the 5260 association on which a heartbeat should be issued. 5261 K) Get SRTT Report 5263 Format: GETSRTTREPORT(association id, destination transport 5264 address) -> srtt result 5266 Instructs the local SCTP to report the current SRTT measurement 5267 on the specified destination transport address of the given 5268 association. 
The returned result can be an integer containing 5269 the most recent SRTT in milliseconds. 5270 Mandatory attributes: 5272 * association id - local handle to the SCTP association. 5273 * destination transport address - the transport address of the 5274 association on which the SRTT measurement is to be reported. 5275 L) Set Failure Threshold 5277 Format: SETFAILURETHRESHOLD(association id, destination transport 5278 address, failure threshold) -> result 5280 This primitive allows the local SCTP to customize the 5281 reachability failure detection threshold 'Path.Max.Retrans' for 5282 the specified destination address. 5283 Mandatory attributes: 5285 * association id - local handle to the SCTP association. 5286 * destination transport address - the transport address of the 5287 association on which the failure detection threshold is to be 5288 set. 5289 * failure threshold - the new value of 'Path.Max.Retrans' for 5290 the destination address. 5291 M) Set Protocol Parameters 5293 Format: SETPROTOCOLPARAMETERS(association id, [destination 5294 transport address,] protocol parameter list) -> result 5296 This primitive allows the local SCTP to customize the protocol 5297 parameters. 5298 Mandatory attributes: 5300 * association id - local handle to the SCTP association. 5301 * protocol parameter list - the specific names and values of the 5302 protocol parameters (e.g., Association.Max.Retrans; see 5303 Section 15) that the SCTP user wishes to customize. 5305 Optional attributes: 5307 * destination transport address - some of the protocol 5308 parameters may be set on a per destination transport address 5309 basis. 5310 N) Receive Unsent Message 5312 Format: RECEIVE_UNSENT(data retrieval id, buffer address, buffer 5313 size [,stream id] [, stream sequence number] [,partial 5314 flag] [,payload protocol-id]) 5316 * data retrieval id - the identification passed to the ULP in 5317 the failure notification.
5318 * buffer address - the memory location indicated by the ULP to 5319 store the received message. 5320 * buffer size - the maximum size of data to be received, in 5321 bytes. 5323 Optional attributes: 5325 * stream id - this is a return value that is set to indicate 5326 which stream the data was sent to. 5327 * Stream Sequence Number - this value is returned indicating the 5328 Stream Sequence Number that was associated with the message. 5329 * partial flag - if this returned flag is set to 1, then this 5330 message is a partial delivery of the whole message. When this 5331 flag is set, the stream id and Stream Sequence Number MUST 5332 accompany this receive. When this flag is set to 0, it 5333 indicates that no more deliveries will be received for this 5334 Stream Sequence Number. 5335 * payload protocol-id - the 32-bit unsigned integer that was 5336 sent to the peer indicating the type of payload 5337 protocol of the received data. 5338 O) Receive Unacknowledged Message 5340 Format: RECEIVE_UNACKED(data retrieval id, buffer address, buffer 5341 size [,stream id] [, stream sequence number] [,partial 5342 flag] [,payload protocol-id]) 5344 * data retrieval id - the identification passed to the ULP in 5345 the failure notification. 5346 * buffer address - the memory location indicated by the ULP to 5347 store the received message. 5348 * buffer size - the maximum size of data to be received, in 5349 bytes. 5351 Optional attributes: 5353 * stream id - this is a return value that is set to indicate 5354 which stream the data was sent to. 5355 * Stream Sequence Number - this value is returned indicating the 5356 Stream Sequence Number that was associated with the message. 5357 * partial flag - if this returned flag is set to 1, then this 5358 message is a partial delivery of the whole message. When this 5359 flag is set, the stream id and Stream Sequence Number MUST 5360 accompany this receive.
When this flag is set to 0, it 5361 indicates that no more deliveries will be received for this 5362 Stream Sequence Number. 5363 * payload protocol-id - the 32-bit unsigned integer that was 5364 sent to the peer indicating the type of payload protocol of 5365 the received data. 5366 P) Destroy SCTP Instance 5368 Format: DESTROY(local SCTP instance name) 5370 * local SCTP instance name - this is the value that was passed 5371 to the application in the initialize primitive and it 5372 indicates which SCTP instance is to be destroyed. 5374 10.2. SCTP-to-ULP 5376 It is assumed that the operating system or application environment 5377 provides a means for the SCTP to asynchronously signal the ULP 5378 process. When SCTP does signal a ULP process, certain information is 5379 passed to the ULP. 5381 IMPLEMENTATION NOTE: In some cases, this may be done through a 5382 separate socket or error channel. 5384 A) DATA ARRIVE notification 5385 SCTP shall invoke this notification on the ULP when a user 5386 message is successfully received and ready for retrieval. 5387 The following may optionally be passed with the notification: 5389 * association id - local handle to the SCTP association. 5390 * stream id - to indicate which stream the data is received on. 5391 B) SEND FAILURE notification 5392 If a message cannot be delivered, SCTP shall invoke this 5393 notification on the ULP. 5394 The following may optionally be passed with the notification: 5396 * association id - local handle to the SCTP association. 5397 * data retrieval id - an identification used to retrieve unsent 5398 and unacknowledged data. 5399 * cause code - indicating the reason of the failure, e.g., size 5400 too large, message life time expiration, etc. 5401 * context - optional information associated with this message 5402 (see E in Section 10.1).
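How a ULP might use the data retrieval id from a SEND FAILURE notification to drain unsent and unacknowledged messages can be sketched as follows. Everything here is hypothetical: `FailedDataStore` stands in for the SCTP side of RECEIVE_UNSENT and RECEIVE_UNACKED, and the handler name and return shape are choices made for the example.

```python
class FailedDataStore:
    """Toy stand-in for the SCTP side of the retrieval primitives."""

    def __init__(self):
        self._unsent = {}
        self._unacked = {}

    def record(self, retrieval_id, unsent, unacked):
        self._unsent[retrieval_id] = list(unsent)
        self._unacked[retrieval_id] = list(unacked)

    def receive_unsent(self, retrieval_id):
        """Return the next unsent message, or None when none remain."""
        pending = self._unsent.get(retrieval_id, [])
        return pending.pop(0) if pending else None

    def receive_unacked(self, retrieval_id):
        """Return the next unacknowledged message, or None when none remain."""
        pending = self._unacked.get(retrieval_id, [])
        return pending.pop(0) if pending else None


def on_send_failure(store, retrieval_id):
    """ULP handler: collect everything tied to this data retrieval id."""
    recovered = {"unsent": [], "unacked": []}
    while True:
        msg = store.receive_unsent(retrieval_id)
        if msg is None:
            break
        recovered["unsent"].append(msg)
    while True:
        msg = store.receive_unacked(retrieval_id)
        if msg is None:
            break
        recovered["unacked"].append(msg)
    return recovered
```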
5403 C) NETWORK STATUS CHANGE notification 5404 When a destination transport address is marked inactive (e.g., 5405 when SCTP detects a failure) or marked active (e.g., when SCTP 5406 detects a recovery), SCTP shall invoke this notification on the 5407 ULP. 5408 The following shall be passed with the notification: 5410 * association id - local handle to the SCTP association. 5411 * destination transport address - this indicates the destination 5412 transport address of the peer endpoint affected by the change. 5413 * new-status - this indicates the new status. 5414 D) COMMUNICATION UP notification 5415 This notification is used when SCTP becomes ready to send or 5416 receive user messages, or when a lost communication to an 5417 endpoint is restored. 5418 IMPLEMENTATION NOTE: If the ASSOCIATE primitive is implemented as 5419 a blocking function call, the association parameters are returned 5420 as a result of the ASSOCIATE primitive itself. In that case, 5421 COMMUNICATION UP notification is optional at the association 5422 initiator's side. 5423 The following shall be passed with the notification: 5425 * association id - local handle to the SCTP association. 5426 * status - This indicates what type of event has occurred. 5427 * destination transport address list - the complete set of 5428 transport addresses of the peer. 5429 * outbound stream count - the maximum number of streams allowed 5430 to be used in this association by the ULP. 5431 * inbound stream count - the number of streams the peer endpoint 5432 has requested with this association (this may not be the same 5433 number as 'outbound stream count'). 5434 E) COMMUNICATION LOST notification 5435 When SCTP loses communication to an endpoint completely (e.g., 5436 via Heartbeats) or detects that the endpoint has performed an 5437 abort operation, it shall invoke this notification on the ULP. 
5438 The following shall be passed with the notification: 5440 * association id - local handle to the SCTP association. 5441 * status - this indicates what type of event has occurred; the 5442 status may indicate that a failure OR a normal termination 5443 event occurred in response to a shutdown or abort request. 5445 The following may be passed with the notification: 5447 * data retrieval id - an identification used to retrieve unsent 5448 and unacknowledged data. 5449 * last-acked - the TSN last acked by that peer endpoint. 5450 * last-sent - the TSN last sent to that peer endpoint. 5451 * Upper Layer Abort Reason - the abort reason specified in case 5452 of a user-initiated abort. 5453 F) COMMUNICATION ERROR notification 5454 When SCTP receives an ERROR chunk from its peer and decides to 5455 notify its ULP, it can invoke this notification on the ULP. 5456 The following can be passed with the notification: 5458 * association id - local handle to the SCTP association. 5459 * error info - this indicates the type of error and optionally 5460 some additional information received through the ERROR chunk. 5461 G) RESTART notification 5462 When SCTP detects that the peer has restarted, it may send this 5463 notification to its ULP. 5464 The following can be passed with the notification: 5466 * association id - local handle to the SCTP association. 5467 H) SHUTDOWN COMPLETE notification 5468 When SCTP completes the shutdown procedures (Section 9.2), this 5469 notification is passed to the upper layer. 5470 The following can be passed with the notification: 5472 * association id - local handle to the SCTP association. 5474 11. Security Considerations 5476 11.1. Security Objectives 5478 As a common transport protocol designed to reliably carry time- 5479 sensitive user messages, such as billing or signaling messages for 5480 telephony services, between two networked endpoints, SCTP has the 5481 following security objectives. 
5483 o availability of reliable and timely data transport services 5484 o integrity of the user-to-user information carried by SCTP 5486 11.2. SCTP Responses to Potential Threats 5488 SCTP may potentially be used in a wide variety of risk situations. 5489 It is important for operators of systems running SCTP to analyze 5490 their particular situations and decide on the appropriate counter- 5491 measures. 5493 Operators of systems running SCTP should consult [RFC2196] for 5494 guidance in securing their site. 5496 11.2.1. Countering Insider Attacks 5498 The principles of [RFC2196] should be applied to minimize the risk of 5499 theft of information or sabotage by insiders. Such procedures 5500 include publication of security policies, control of access at the 5501 physical, software, and network levels, and separation of services. 5503 11.2.2. Protecting against Data Corruption in the Network 5505 Where the risk of undetected errors in datagrams delivered by the 5506 lower-layer transport services is considered to be too great, 5507 additional integrity protection is required. If this additional 5508 protection were provided in the application layer, the SCTP header 5509 would remain vulnerable to deliberate integrity attacks. While the 5510 existing SCTP mechanisms for detection of packet replays are 5511 considered sufficient for normal operation, stronger protections are 5512 needed to protect SCTP when the operating environment contains 5513 significant risk of deliberate attacks from a sophisticated 5514 adversary. 5516 The SCTP Authentication extension SCTP-AUTH [RFC4895] MAY be used 5517 when the threat environment requires stronger integrity protections, 5518 but does not require confidentiality. 5520 11.2.3. Protecting Confidentiality 5522 In most cases, the risk of breach of confidentiality applies to the 5523 signaling data payload, not to the SCTP or lower-layer protocol 5524 overheads. 
If that is true, encryption of the SCTP user data only 5525 might be considered. As with the supplementary checksum service, 5526 user data encryption MAY be performed by the SCTP user application. 5527 Alternately, the user application MAY use an implementation-specific 5528 API to request that the IP Encapsulating Security Payload (ESP) 5529 [RFC4303] be used to provide confidentiality and integrity. 5531 Particularly for mobile users, the requirement for confidentiality 5532 might include the masking of IP addresses and ports. In this case, 5533 ESP SHOULD be used instead of application-level confidentiality. If 5534 ESP is used to protect confidentiality of SCTP traffic, an ESP 5535 cryptographic transform that includes cryptographic integrity 5536 protection MUST be used, because if there is a confidentiality threat 5537 there will also be a strong integrity threat. 5539 Whenever ESP is in use, application-level encryption is not generally 5540 required. 5542 Regardless of where confidentiality is provided, the Internet Key 5543 Exchange Protocol version 2 (IKEv2) [RFC4306] SHOULD be used for key 5544 management. 5546 Operators should consult [RFC4301] for more information on the 5547 security services available at and immediately above the Internet 5548 Protocol layer. 5550 11.2.4. Protecting against Blind Denial-of-Service Attacks 5552 A blind attack is one where the attacker is unable to intercept or 5553 otherwise see the content of data flows passing to and from the 5554 target SCTP node. Blind denial-of-service attacks may take the form 5555 of flooding, masquerade, or improper monopolization of services. 5557 11.2.4.1. Flooding 5559 The objective of flooding is to cause loss of service and incorrect 5560 behavior at target systems through resource exhaustion, interference 5561 with legitimate transactions, and exploitation of buffer-related 5562 software bugs. 
Flooding may be directed either at the SCTP node or 5563 at resources in the intervening IP Access Links or the Internet. 5564 Where the latter entities are the target, flooding will manifest 5565 itself as loss of network services, including potentially the breach 5566 of any firewalls in place. 5568 In general, protection against flooding begins at the equipment 5569 design level, where it includes measures such as: 5571 o avoiding commitment of limited resources before determining that 5572 the request for service is legitimate. 5574 o giving priority to completion of processing in progress over the 5575 acceptance of new work. 5577 o identification and removal of duplicate or stale queued requests 5578 for service. 5580 o not responding to unexpected packets sent to non-unicast 5581 addresses. 5583 Network equipment should be capable of generating an alarm and log if 5584 a suspicious increase in traffic occurs. The log should provide 5585 information such as the identity of the incoming link and source 5586 address(es) used, which will help the network or SCTP system operator 5587 to take protective measures. Procedures should be in place for the 5588 operator to act on such alarms if a clear pattern of abuse emerges. 5590 The design of SCTP is resistant to flooding attacks, particularly in 5591 its use of a four-way startup handshake, its use of a cookie to defer 5592 commitment of resources at the responding SCTP node until the 5593 handshake is completed, and its use of a Verification Tag to prevent 5594 insertion of extraneous packets into the flow of an established 5595 association. 5597 The IP Authentication Header and Encapsulating Security Payload might 5598 be useful in reducing the risk of certain kinds of denial-of-service 5599 attacks. 5601 The use of the host name feature in the INIT chunk could be used to 5602 flood a target DNS server. 
A large backlog of DNS queries, resolving 5603 the host name received in the INIT chunk to IP addresses, could be 5604 accomplished by sending INITs to multiple hosts in a given domain. 5605 In addition, an attacker could use the host name feature in an 5606 indirect attack on a third party by sending large numbers of INITs to 5607 random hosts containing the host name of the target. In addition to 5608 the strain on DNS resources, this could also result in large numbers 5609 of INIT ACKs being sent to the target. One method to protect against 5610 this type of attack is to verify that the IP addresses received from 5611 DNS include the source IP address of the original INIT. If the list 5612 of IP addresses received from DNS does not include the source IP 5613 address of the INIT, the endpoint MAY silently discard the INIT. 5614 This last option will not protect against the attack against the DNS. 5616 11.2.4.2. Blind Masquerade 5618 Masquerade can be used to deny service in several ways: 5620 o by tying up resources at the target SCTP node to which the 5621 impersonated node has limited access. For example, the target 5622 node may by policy permit a maximum of one SCTP association with 5623 the impersonated SCTP node. The masquerading attacker may attempt 5624 to establish an association purporting to come from the 5625 impersonated node so that the latter cannot do so when it requires 5626 it. 5628 o by deliberately allowing the impersonation to be detected, thereby 5629 provoking counter-measures that cause the impersonated node to be 5630 locked out of the target SCTP node. 5632 o by interfering with an established association by inserting 5633 extraneous content such as a SHUTDOWN request. 5635 SCTP reduces the risk of blind masquerade attacks through IP spoofing 5636 by use of the four-way startup handshake. Because the initial 5637 exchange is memory-less, no lockout mechanism is triggered by blind 5638 masquerade attacks. 
In addition, the INIT ACK containing the State 5639 Cookie is transmitted back to the IP address from which it received 5640 the INIT. Thus, the attacker would not receive the INIT ACK 5641 containing the State Cookie. SCTP protects against insertion of 5642 extraneous packets into the flow of an established association by use 5643 of the Verification Tag. 5645 Logging of received INIT requests and abnormalities such as 5646 unexpected INIT ACKs might be considered as a way to detect patterns 5647 of hostile activity. However, the potential usefulness of such 5648 logging must be weighed against the increased SCTP startup processing 5649 it implies, rendering the SCTP node more vulnerable to flooding 5650 attacks. Logging is pointless without the establishment of operating 5651 procedures to review and analyze the logs on a routine basis. 5653 11.2.4.3. Improper Monopolization of Services 5655 Attacks under this heading are performed openly and legitimately by 5656 the attacker. They are directed against fellow users of the target 5657 SCTP node or of the shared resources between the attacker and the 5658 target node. Possible attacks include the opening of a large number 5659 of associations between the attacker's node and the target, or 5660 transfer of large volumes of information within a legitimately 5661 established association. 5663 Policy limits should be placed on the number of associations per 5664 adjoining SCTP node. SCTP user applications should be capable of 5665 detecting large volumes of illegitimate or "no-op" messages within a 5666 given association and either logging or terminating the association 5667 as a result, based on local policy. 5669 11.3. SCTP Interactions with Firewalls 5671 It is helpful for some firewalls if they can inspect just the first 5672 fragment of a fragmented SCTP packet and unambiguously determine 5673 whether it corresponds to an INIT chunk (for further information, 5674 please refer to [RFC1858]). 
Accordingly, we stress the requirements, 5675 stated in Section 3.1, that (1) an INIT chunk MUST NOT be bundled 5676 with any other chunk in a packet, and (2) a packet containing an INIT 5677 chunk MUST have a zero Verification Tag. Furthermore, we require 5678 that the receiver of an INIT chunk MUST enforce these rules by 5679 silently discarding an arriving packet in which the INIT chunk is 5680 bundled with other chunks, or which contains an INIT chunk and has a 5681 non-zero Verification Tag. 5683 11.4. Protection of Non-SCTP-Capable Hosts 5685 To provide a non-SCTP-capable host with the same level of protection 5686 against attacks as for SCTP-capable ones, all SCTP stacks MUST 5687 implement the ICMP handling described in Appendix C. 5689 When an SCTP stack receives a packet containing multiple control or 5690 DATA chunks and the processing of the packet requires the sending of 5691 multiple chunks in response, the sender of the response chunk(s) MUST 5692 NOT send more than one packet. If bundling is supported, multiple 5693 response chunks that fit into a single packet MAY be bundled together 5694 into a single response packet. If bundling is not supported, then 5695 the sender MUST NOT send more than one response chunk and MUST 5696 discard all other responses. Note that this rule does NOT apply to a 5697 SACK chunk, since a SACK chunk is, in itself, a response to DATA and 5698 a SACK does not require a response of more DATA. 5700 An SCTP implementation SHOULD abort the association if it receives a 5701 SACK acknowledging a TSN that has not been sent. 5703 An SCTP implementation that receives an INIT that would require a 5704 large packet in response, due to the inclusion of multiple ERROR 5705 parameters, MAY (at its discretion) elect to omit some or all of the 5706 ERROR parameters to reduce the size of the INIT ACK.
Due to a 5707 combination of the size of the COOKIE parameter and the number of 5708 addresses a receiver of an INIT may be indicating to a peer, it is 5709 always possible that the INIT ACK will be larger than the original 5710 INIT. An SCTP implementation SHOULD attempt to make the INIT ACK as 5711 small as possible to reduce the possibility of byte amplification 5712 attacks. 5714 12. Network Management Considerations 5716 The MIB module for SCTP defined in [RFC3873] applies for the version 5717 of the protocol specified in this document. 5719 13. Recommended Transmission Control Block (TCB) Parameters 5721 This section details a recommended set of parameters that should be 5722 contained within the TCB for an implementation. This section is for 5723 illustrative purposes and should not be deemed as requirements on an 5724 implementation or as an exhaustive list of all parameters inside an 5725 SCTP TCB. Each implementation may need its own additional parameters 5726 for optimization. 5728 13.1. Parameters Necessary for the SCTP Instance 5730 Associations: A list of current associations and mappings to the data 5731 consumers for each association. This may be in the 5732 form of a hash table or other implementation-dependent 5733 structure. The data consumers may be process 5734 identification information such as file descriptors, 5735 named pipe pointer, or table pointers dependent on how 5736 SCTP is implemented. 5737 Secret Key: A secret key used by this endpoint to compute the MAC. 5738 This SHOULD be a cryptographic quality random number 5739 with a sufficient length. Discussion in [RFC4086] can 5740 be helpful in selection of the key. 5741 Address List: The list of IP addresses that this instance has bound. 5742 This information is passed to one's peer(s) in INIT and 5743 INIT ACK chunks. 5744 SCTP Port: The local SCTP port number to which the endpoint is 5745 bound. 5747 13.2. 
Parameters Necessary per Association (i.e., the TCB) 5749 Peer Verification Tag: Tag value to be sent in every packet; it is 5750 received in the INIT or INIT ACK chunk. 5751 My Verification Tag: Tag expected in every inbound packet and sent 5752 in the INIT or INIT ACK chunk. 5753 State: A state variable indicating what state the association 5754 is in, i.e., COOKIE-WAIT, COOKIE-ECHOED, ESTABLISHED, 5755 SHUTDOWN-PENDING, SHUTDOWN-SENT, SHUTDOWN-RECEIVED, 5756 SHUTDOWN-ACK-SENT. 5757 Note: No "CLOSED" state is illustrated since if an 5758 association is "CLOSED" its TCB SHOULD be removed. 5759 Peer Transport Address List: A list of SCTP transport addresses to 5760 which the peer is bound. This information is derived 5761 from the INIT or INIT ACK and is used to associate an 5762 inbound packet with a given association. Normally, 5763 this information is hashed or keyed for quick lookup 5764 and access of the TCB. 5765 Primary Path: This is the current primary destination transport 5766 address of the peer endpoint. It may also specify a 5767 source transport address on this endpoint. 5768 Overall Error Count: The overall association error count. 5769 Overall Error Threshold: The threshold for this association; if 5770 the Overall Error Count reaches it, this 5771 association is torn down. 5772 Peer Rwnd: Current calculated value of the peer's rwnd. 5773 Next TSN: The next TSN to be assigned to a new DATA chunk. 5774 This is sent in the INIT or INIT ACK chunk to the peer 5775 and incremented each time a DATA chunk is assigned a 5776 TSN (normally just prior to transmit or during 5777 fragmentation). 5778 Last Rcvd TSN: This is the last TSN received in sequence. This 5779 value is set initially by taking the peer's initial 5780 TSN, received in the INIT or INIT ACK chunk, and 5781 subtracting one from it. 5782 Mapping Array: An array of bits or bytes indicating which out-of- 5783 order TSNs have been received (relative to the Last 5784 Rcvd TSN).
If no gaps exist, i.e., no out-of-order 5785 packets have been received, this array will be set to 5786 all zeros. This structure may be in the form of a 5787 circular buffer or bit array. 5788 Ack State: This flag indicates if the next received packet is to 5789 be responded to with a SACK. This is initialized to 0. 5790 When a packet is received it is incremented. If this 5791 value reaches 2 or more, a SACK is sent and the value 5792 is reset to 0. Note: This is used only when no DATA 5793 chunks are received out of order. When DATA chunks are 5794 out of order, SACKs are not delayed (see Section 6). 5795 Inbound Streams: An array of structures to track the inbound 5796 streams, normally including the next sequence number 5797 expected and possibly the stream number. 5798 Outbound Streams: An array of structures to track the outbound 5799 streams, normally including the next sequence number to 5800 be sent on the stream. 5801 Reasm Queue: A reassembly queue. 5802 Local Transport Address List: The list of local IP addresses bound 5803 to this association. 5804 Association PMTU: The smallest PMTU discovered for all of the peer's 5805 transport addresses. 5807 13.3. Per Transport Address Data 5809 For each destination transport address in the peer's address list 5810 derived from the INIT or INIT ACK chunk, a number of data elements 5811 need to be maintained including: 5813 Error Count: The current error count for this destination. 5814 Error Threshold: Current error threshold for this destination, i.e., 5815 the value at which the destination is marked down once the error count 5816 reaches it. 5817 cwnd: The current congestion window. 5818 ssthresh: The current ssthresh value. 5819 RTO: The current retransmission timeout value. 5820 SRTT: The current smoothed round-trip time. 5821 RTTVAR: The current RTT variation. 5822 partial bytes acked: The tracking method for increase of cwnd when 5823 in congestion avoidance mode (see Section 7.2.2).
5825 state: The current state of this destination, i.e., DOWN, UP, 5826 ALLOW-HB, NO-HEARTBEAT, etc. 5827 PMTU: The current known path MTU. 5828 Per Destination Timer: A timer used by each destination. 5829 RTO-Pending: A flag used to track if one of the DATA chunks sent to 5830 this address is currently being used to compute an RTT. 5831 If this flag is 0, the next DATA chunk sent to this 5832 destination should be used to compute an RTT and this 5833 flag should be set. Every time the RTT calculation 5834 completes (i.e., the DATA chunk is SACK'd), clear this 5835 flag. 5836 last-time: The time at which a packet was last sent to this destination. This 5837 can be used to determine if a HEARTBEAT is needed. 5839 13.4. General Parameters Needed 5841 Out Queue: A queue of outbound DATA chunks. 5842 In Queue: A queue of inbound DATA chunks. 5844 14. IANA Considerations 5846 SCTP defines three registries that IANA maintains, covering: 5848 o additional chunk types, 5849 o additional parameter types, and 5850 o additional cause codes within ERROR chunks. 5852 SCTP requires that the IANA Port Numbers registry be opened for SCTP 5853 port registrations; Section 14.5 describes how. An IESG-appointed 5854 Expert Reviewer supports IANA in evaluating SCTP port allocation 5855 requests. 5857 14.1. IETF-Defined Chunk Extension 5859 The assignment of new chunk type codes is done through an 5860 IETF Consensus action, as defined in [RFC2434]. Documentation of the 5861 chunk type MUST contain the following information: 5863 a) A long and short name for the new chunk type. 5865 b) A detailed description of the structure of the chunk, which MUST 5866 conform to the basic structure defined in Section 3.2. 5868 c) A detailed definition and description of intended use of each 5869 field within the chunk, including the chunk flags if any.
5871 d) A detailed procedural description of the use of the new chunk 5872 type within the operation of the protocol. 5874 The last chunk type (255) is reserved for future extension if 5875 necessary. 5877 14.2. IETF-Defined Chunk Parameter Extension 5879 The assignment of new chunk parameter type codes is done through an 5880 IETF Consensus action as defined in [RFC2434]. Documentation of the 5881 chunk parameter MUST contain the following information: 5883 a) Name of the parameter type. 5885 b) Detailed description of the structure of the parameter field. 5886 This structure MUST conform to the general Type-Length-Value 5887 format described in Section 3.2.1. 5889 c) Detailed definition of each component of the parameter value. 5891 d) Detailed description of the intended use of this parameter type, 5892 and an indication of whether and under what circumstances 5893 multiple instances of this parameter type may be found within the 5894 same chunk. 5896 e) Each parameter type MUST be unique across all chunks. 5898 14.3. IETF-Defined Additional Error Causes 5900 Additional cause codes may be allocated in the range 11 to 65535 5901 through a Specification Required action as defined in [RFC2434]. 5902 Provided documentation must include the following information: 5904 a) Name of the error condition. 5906 b) Detailed description of the conditions under which an SCTP 5907 endpoint should issue an ERROR (or ABORT) with this cause code. 5909 c) Expected action by the SCTP endpoint that receives an ERROR (or 5910 ABORT) chunk containing this cause code. 5912 d) Detailed description of the structure and content of data fields 5913 that accompany this cause code. 5915 The initial word (32 bits) of a cause code parameter MUST conform to 5916 the format shown in Section 3.3.10, i.e.: 5918 o first 2 bytes contain the cause code value 5919 o last 2 bytes contain the length of the cause parameter. 5921 14.4. 
Payload Protocol Identifiers 5923 Except for value 0, which is reserved by SCTP to indicate an 5924 unspecified payload protocol identifier in a DATA chunk, SCTP will 5925 not be responsible for standardizing or verifying any payload 5926 protocol identifiers; SCTP simply receives the identifier from the 5927 upper layer and carries it with the corresponding payload data. 5929 The upper layer, i.e., the SCTP user, SHOULD standardize any specific 5930 protocol identifier with IANA if it is so desired. The use of any 5931 specific payload protocol identifier is out of the scope of SCTP. 5933 14.5. Port Numbers Registry 5935 SCTP services may use contact port numbers to provide service to 5936 unknown callers, as in TCP and UDP. IANA is therefore requested to 5937 open the existing Port Numbers registry for SCTP using the following 5938 rules, which we intend to mesh well with existing Port Numbers 5939 registration procedures. An IESG-appointed Expert Reviewer supports 5940 IANA in evaluating SCTP port allocation requests, according to the 5941 procedure defined in [RFC2434]. 5943 Port numbers are divided into three ranges. The Well Known Ports are 5944 those from 0 through 1023, the Registered Ports are those from 1024 5945 through 49151, and the Dynamic and/or Private Ports are those from 5946 49152 through 65535. Well Known and Registered Ports are intended 5947 for use by server applications that desire a default contact point on 5948 a system. On most systems, Well Known Ports can only be used by 5949 system (or root) processes or by programs executed by privileged 5950 users, while Registered Ports can be used by ordinary user processes 5951 or programs executed by ordinary users. Dynamic and/or Private Ports 5952 are intended for temporary use, including client-side ports, out-of- 5953 band negotiated ports, and application testing prior to registration 5954 of a dedicated port; they MUST NOT be registered. 
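As an illustration of the three ranges described above, the following minimal C sketch classifies a port number and reports whether it falls in a range for which registration is possible. The names sctp_port_range, sctp_classify_port, and sctp_port_registrable are hypothetical; they are not defined by this document or by any SCTP API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for the three ranges described in this section. */
enum sctp_port_range {
    SCTP_PORT_WELL_KNOWN,   /* 0 through 1023 */
    SCTP_PORT_REGISTERED,   /* 1024 through 49151 */
    SCTP_PORT_DYNAMIC       /* 49152 through 65535 */
};

/* Map a 16-bit port number onto one of the three ranges. */
enum sctp_port_range sctp_classify_port(uint16_t port)
{
    if (port <= 1023)
        return SCTP_PORT_WELL_KNOWN;
    if (port <= 49151)
        return SCTP_PORT_REGISTERED;
    return SCTP_PORT_DYNAMIC;
}

/* Dynamic and/or Private Ports MUST NOT be registered; the other
 * two ranges are eligible for registration. */
bool sctp_port_registrable(uint16_t port)
{
    return sctp_classify_port(port) != SCTP_PORT_DYNAMIC;
}
```

With this sketch, port 9 (discard) is classified as Well Known and is registrable, while port 49152 is classified as Dynamic and is not.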
5956 The Port Numbers registry should accept registrations for SCTP ports 5957 in the Well Known Ports and Registered Ports ranges. Well Known and 5958 Registered Ports SHOULD NOT be used without registration. Although 5959 in some cases -- such as porting an application from TCP to SCTP -- 5960 it may seem natural to use an SCTP port before registration 5961 completes, we emphasize that IANA will not guarantee registration of 5962 particular Well Known and Registered Ports. Registrations should be 5963 requested as early as possible. 5965 Each port registration SHALL include the following information: 5967 o A short port name, consisting entirely of letters (A-Z and a-z), 5968 digits (0-9), and punctuation characters from "-_+./*" (not 5969 including the quotes). 5970 o The port number that is requested for registration. 5971 o A short English phrase describing the port's purpose. 5972 o Name and contact information for the person or entity performing 5973 the registration, and possibly a reference to a document defining 5974 the port's use. Registrations coming from IETF working groups 5975 need only name the working group, but indicating a contact person 5976 is recommended. 5978 Registrants are encouraged to follow these guidelines when submitting 5979 a registration. 5981 o A port name SHOULD NOT be registered for more than one SCTP port 5982 number. 5984 o A port name registered for TCP MAY be registered for SCTP as well. 5985 Any such registration SHOULD use the same port number as the 5986 existing TCP registration. 5988 o Concrete intent to use a port SHOULD precede port registration. 5989 For example, existing TCP ports SHOULD NOT be registered in 5990 advance of any intent to use those ports for SCTP. 5992 This document registers the following ports. (These registrations 5993 should be considered models to follow for future allocation 5994 requests.) 
5995 discard     9/sctp  Discard  # IETF TSVWG
5996                              # Randall Stewart
5997                              # [RFC4960]
5999 The discard service, which accepts SCTP connections on port 6000 9, discards all incoming application data and sends no data 6001 in response. Thus, SCTP's discard port is analogous to 6002 TCP's discard port, and might be used to check the health 6003 of an SCTP stack.
6005 ftp-data   20/sctp  FTP      # IETF TSVWG
6006                              # Randall Stewart
6007                              # [RFC4960]
6009 ftp        21/sctp  FTP      # IETF TSVWG
6010                              # Randall Stewart
6011                              # [RFC4960]
6013 File Transfer Protocol (FTP) data (20) and control ports 6014 (21).
6016 ssh        22/sctp  SSH      # IETF TSVWG
6017                              # Randall Stewart
6018                              # [RFC4960]
6020 The Secure Shell (SSH) remote login service, which allows 6021 secure shell logins to a host.
6023 http       80/sctp  HTTP     # IETF TSVWG
6024                              # Randall Stewart
6025                              # [RFC4960]
6027 World Wide Web HTTP over SCTP.
6029 bgp       179/sctp  BGP      # IETF TSVWG
6030                              # Randall Stewart
6031                              # [RFC4960]
6033 Border Gateway Protocol over SCTP.
6035 https     443/sctp  HTTPS    # IETF TSVWG
6036                              # Randall Stewart
6037                              # [RFC4960]
6039 World Wide Web HTTP over TLS/SSL over SCTP.
6041 15. Suggested SCTP Protocol Parameter Values
6043 The following protocol parameters are RECOMMENDED:
6045 RTO.Initial             - 3 seconds
6046 RTO.Min                 - 1 second
6047 RTO.Max                 - 60 seconds
6048 Max.Burst               - 4
6049 RTO.Alpha               - 1/8
6050 RTO.Beta                - 1/4
6051 Valid.Cookie.Life       - 60 seconds
6052 Association.Max.Retrans - 10 attempts
6053 Path.Max.Retrans        - 5 attempts (per destination address)
6054 Max.Init.Retransmits    - 8 attempts
6055 HB.interval             - 30 seconds
6056 HB.Max.Burst            - 1
6058 IMPLEMENTATION NOTE: The SCTP implementation may allow ULP to 6059 customize some of these protocol parameters (see Section 10).
6061 Note: RTO.Min SHOULD be set as recommended above.
6063 16. Acknowledgements
6065 An undertaking represented by this updated document is not a small 6066 feat and represents the summation of the initial authors of RFC 2960: 6067 Q. Xie, K. Morneault, C. Sharp, H.
Schwarzbauer, T. Taylor, 6068 I. Rytina, M. Kalla, L. Zhang, and V. Paxson. 6070 Add to that, the comments from everyone who contributed to the 6071 original RFC: 6073 Mark Allman, R.J. Atkinson, Richard Band, Scott Bradner, Steve 6074 Bellovin, Peter Butler, Ram Dantu, R. Ezhirpavai, Mike Fisk, Sally 6075 Floyd, Atsushi Fukumoto, Matt Holdrege, Henry Houh, Christian 6076 Huitema, Gary Lehecka, Jonathan Lee, David Lehmann, John Loughney, 6077 Daniel Luan, Barry Nagelberg, Thomas Narten, Erik Nordmark, Lyndon 6078 Ong, Shyamal Prasad, Kelvin Porter, Heinz Prantner, Jarno Rajahalme, 6079 Raymond E. Reeves, Renee Revis, Ivan Arias Rodriguez, A. Sankar, Greg 6080 Sidebottom, Brian Wyld, La Monte Yarroll, and many others for their 6081 invaluable comments. 6083 Then, add the authors of the SCTP implementor's guide, I. Arias- 6084 Rodriguez, K. Poon, A. Caro, and M. Tuexen. 6086 Then add to these the efforts of all the subsequent seven SCTP 6087 interoperability tests and those who commented on RFC 4460 as shown 6088 in its acknowledgements: 6090 Barry Zuckerman, La Monte Yarroll, Qiaobing Xie, Wang Xiaopeng, 6091 Jonathan Wood, Jeff Waskow, Mike Turner, John Townsend, Sabina 6092 Torrente, Cliff Thomas, Yuji Suzuki, Manoj Solanki, Sverre Slotte, 6093 Keyur Shah, Jan Rovins, Ben Robinson, Renee Revis, Ian Periam, RC 6094 Monee, Sanjay Rao, Sujith Radhakrishnan, Heinz Prantner, Biren Patel, 6095 Nathalie Mouellic, Mitch Miers, Bernward Meyknecht, Stan McClellan, 6096 Oliver Mayor, Tomas Orti Martin, Sandeep Mahajan, David Lehmann, 6097 Jonathan Lee, Philippe Langlois, Karl Knutson, Joe Keller, Gareth 6098 Keily, Andreas Jungmaier, Janardhan Iyengar, Mutsuya Irie, John 6099 Hebert, Kausar Hassan, Fred Hasle, Dan Harrison, Jon Grim, Laurent 6100 Glaude, Steven Furniss, Atsushi Fukumoto, Ken Fujita, Steve Dimig, 6101 Thomas Curran, Serkan Cil, Melissa Campbell, Peter Butler, Rob 6102 Brennan, Harsh Bhondwe, Brian Bidulock, Caitlin Bestler, Jon Berger, 6103 Robby Benedyk, 
Stephen Baucke, Sandeep Balani, and Ronnie Sellar. 6105 A special thanks to Mark Allman, who should actually be a co-author 6106 for his work on the max-burst, but managed to wiggle out due to a 6107 technicality. Also, we would like to acknowledge Lyndon Ong and Phil 6108 Conrad for their valuable input and many contributions. 6110 And finally, you have this document, and those who have commented 6111 upon that including Alfred Hoenes and Ronnie Sellars. 6113 My thanks cannot be adequately expressed to all of you who have 6114 participated in the coding, testing, and updating process of this 6115 document. All I can say is, Thank You! 6117 Randall Stewart - Editor 6119 17. References 6121 17.1. Normative References 6123 [ITU.V42.1994] 6124 International Telecommunications Union, "Error-correcting 6125 Procedures for DCEs Using Asynchronous-to-Synchronous 6126 Conversion", ITU-T Recommendation V.42, 1994. 6128 [RFC0768] Postel, J., "User Datagram Protocol", STD 6, RFC 768, 6129 DOI 10.17487/RFC0768, August 1980, 6130 . 6132 [RFC0793] Postel, J., "Transmission Control Protocol", STD 7, 6133 RFC 793, DOI 10.17487/RFC0793, September 1981, 6134 . 6136 [RFC1122] Braden, R., Ed., "Requirements for Internet Hosts - 6137 Communication Layers", STD 3, RFC 1122, 6138 DOI 10.17487/RFC1122, October 1989, 6139 . 6141 [RFC1123] Braden, R., Ed., "Requirements for Internet Hosts - 6142 Application and Support", STD 3, RFC 1123, 6143 DOI 10.17487/RFC1123, October 1989, 6144 . 6146 [RFC1191] Mogul, J. and S. Deering, "Path MTU discovery", RFC 1191, 6147 DOI 10.17487/RFC1191, November 1990, 6148 . 6150 [RFC1981] McCann, J., Deering, S., and J. Mogul, "Path MTU Discovery 6151 for IP version 6", RFC 1981, DOI 10.17487/RFC1981, August 6152 1996, . 6154 [RFC1982] Elz, R. and R. Bush, "Serial Number Arithmetic", RFC 1982, 6155 DOI 10.17487/RFC1982, August 1996, 6156 . 
6158 [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate 6159 Requirement Levels", BCP 14, RFC 2119, 6160 DOI 10.17487/RFC2119, March 1997, 6161 . 6163 [RFC2434] Narten, T. and H. Alvestrand, "Guidelines for Writing an 6164 IANA Considerations Section in RFCs", RFC 2434, 6165 DOI 10.17487/RFC2434, October 1998, 6166 . 6168 [RFC2460] Deering, S. and R. Hinden, "Internet Protocol, Version 6 6169 (IPv6) Specification", RFC 2460, DOI 10.17487/RFC2460, 6170 December 1998, . 6172 [RFC2581] Allman, M., Paxson, V., and W. Stevens, "TCP Congestion 6173 Control", RFC 2581, DOI 10.17487/RFC2581, April 1999, 6174 . 6176 [RFC3873] Pastor, J. and M. Belinchon, "Stream Control Transmission 6177 Protocol (SCTP) Management Information Base (MIB)", 6178 RFC 3873, DOI 10.17487/RFC3873, September 2004, 6179 . 6181 [RFC4291] Hinden, R. and S. Deering, "IP Version 6 Addressing 6182 Architecture", RFC 4291, DOI 10.17487/RFC4291, February 6183 2006, . 6185 [RFC4301] Kent, S. and K. Seo, "Security Architecture for the 6186 Internet Protocol", RFC 4301, DOI 10.17487/RFC4301, 6187 December 2005, . 6189 [RFC4303] Kent, S., "IP Encapsulating Security Payload (ESP)", 6190 RFC 4303, DOI 10.17487/RFC4303, December 2005, 6191 . 6193 [RFC4306] Kaufman, C., Ed., "Internet Key Exchange (IKEv2) 6194 Protocol", RFC 4306, DOI 10.17487/RFC4306, December 2005, 6195 . 6197 [RFC4821] Mathis, M. and J. Heffner, "Packetization Layer Path MTU 6198 Discovery", RFC 4821, DOI 10.17487/RFC4821, March 2007, 6199 . 6201 17.2. Informative References 6203 [ALLMAN99] 6204 Allman, M. and V. Paxson, "On Estimating End-to-End 6205 Network Path Properties", SIGCOMM '99, 1999. 6207 [FALL96] Fall, K. and S. Floyd, "Simulation-based Comparisons of 6208 Tahoe, Reno, and SACK TCP", Computer Communications Review, V. 26, N. 3, 6209 pp 5-21, July 1996. 6211 [RFC0813] Clark, D., "Window and Acknowledgement Strategy in TCP", 6212 RFC 813, DOI 10.17487/RFC0813, July 1982, 6213 . 6215 [RFC1858] Ziemba, G., Reed, D., and P.
Traina, "Security 6216 Considerations for IP Fragment Filtering", RFC 1858, 6217 DOI 10.17487/RFC1858, October 1995, 6218 . 6220 [RFC2104] Krawczyk, H., Bellare, M., and R. Canetti, "HMAC: Keyed- 6221 Hashing for Message Authentication", RFC 2104, 6222 DOI 10.17487/RFC2104, February 1997, 6223 . 6225 [RFC2196] Fraser, B., "Site Security Handbook", FYI 8, RFC 2196, 6226 DOI 10.17487/RFC2196, September 1997, 6227 . 6229 [RFC2522] Karn, P. and W. Simpson, "Photuris: Session-Key Management 6230 Protocol", RFC 2522, DOI 10.17487/RFC2522, March 1999, 6231 . 6233 [RFC2960] Stewart, R., Xie, Q., Morneault, K., Sharp, C., 6234 Schwarzbauer, H., Taylor, T., Rytina, I., Kalla, M., 6235 Zhang, L., and V. Paxson, "Stream Control Transmission 6236 Protocol", RFC 2960, DOI 10.17487/RFC2960, October 2000, 6237 . 6239 [RFC3168] Ramakrishnan, K., Floyd, S., and D. Black, "The Addition 6240 of Explicit Congestion Notification (ECN) to IP", 6241 RFC 3168, DOI 10.17487/RFC3168, September 2001, 6242 . 6244 [RFC3309] Stone, J., Stewart, R., and D. Otis, "Stream Control 6245 Transmission Protocol (SCTP) Checksum Change", RFC 3309, 6246 DOI 10.17487/RFC3309, September 2002, 6247 . 6249 [RFC4086] Eastlake 3rd, D., Schiller, J., and S. Crocker, 6250 "Randomness Requirements for Security", BCP 106, RFC 4086, 6251 DOI 10.17487/RFC4086, June 2005, 6252 . 6254 [RFC4895] Tuexen, M., Stewart, R., Lei, P., and E. Rescorla, 6255 "Authenticated Chunks for the Stream Control Transmission 6256 Protocol (SCTP)", RFC 4895, DOI 10.17487/RFC4895, August 6257 2007, . 6259 [RFC4960] Stewart, R., Ed., "Stream Control Transmission Protocol", 6260 RFC 4960, DOI 10.17487/RFC4960, September 2007, 6261 . 6263 [SAVAGE99] 6264 Savage, S., Cardwell, N., Wetherall, D., and T. Anderson, 6265 "TCP Congestion Control with a Misbehaving Receiver", ACM 6266 Computer Communications Review 29(5), October 1999. 
6268 [WILLIAMS93] 6269 Williams, R., "A PAINLESS GUIDE TO CRC ERROR DETECTION 6270 ALGORITHMS", Internet publication, August 1993, 6271 . 6274 Appendix A. Explicit Congestion Notification 6276 ECN [RFC3168] describes a proposed extension to IP that details a 6277 method to become aware of congestion outside of datagram loss. This 6278 is an optional feature that an implementation MAY choose to add to 6279 SCTP. This appendix details the minor differences implementers will 6280 need to be aware of if they choose to implement this feature. In 6281 general, [RFC3168] should be followed with the following exceptions. 6283 Negotiation: 6285 [RFC3168] details negotiation of ECN during the SYN and SYN-ACK 6286 stages of a TCP connection. The sender of the SYN sets 2 bits in the 6287 TCP flags, and the sender of the SYN-ACK sets only 1 bit. The 6288 reasoning behind this is to ensure that both sides are truly ECN 6289 capable. For SCTP, this is not necessary. To indicate that an 6290 endpoint is ECN capable, an endpoint SHOULD add to the INIT and/or 6291 INIT ACK chunk the TLV reserved for ECN. This TLV contains no 6292 parameters, and thus has the following format:
6294  0                   1                   2                   3
6295  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
6296 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
6297 |   Parameter Type = 32768      |     Parameter Length = 4      |
6298 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
6300 ECN-Echo: 6302 [RFC3168] details a specific bit for a receiver to send back in its 6303 TCP acknowledgements to notify the sender of the Congestion 6304 Experienced (CE) bit having arrived from the network. For SCTP, this 6305 same indication is made by including the ECNE chunk.
This chunk 6306 contains one data element, i.e., the lowest TSN associated with the 6307 IP datagram marked with the CE bit, and looks as follows:
6309  0                   1                   2                   3
6310  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
6311 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
6312 | Chunk Type=12 | Flags=00000000|    Chunk Length = 8           |
6313 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
6314 |                      Lowest TSN Number                        |
6315 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
6317 Note: The ECNE is considered a Control chunk. 6319 CWR: 6321 [RFC3168] details a specific bit for a sender to send in the header 6322 of its next outbound TCP segment to indicate to its peer that it has 6323 reduced its congestion window. This is termed the CWR bit. For 6324 SCTP, the same indication is made by including the CWR chunk. This 6325 chunk contains one data element, i.e., the TSN that was sent 6326 in the ECNE chunk. This element represents the lowest TSN in 6327 the datagram that was originally marked with the CE bit.
6329  0                   1                   2                   3
6330  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
6331 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
6332 | Chunk Type=13 | Flags=00000000|    Chunk Length = 8           |
6333 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
6334 |                      Lowest TSN Number                        |
6335 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
6337 Note: The CWR is considered a Control chunk. 6339 Appendix B. CRC32c Checksum Calculation 6341 We define a 'reflected value' as one that is the opposite of the 6342 normal bit order of the machine. The 32-bit CRC (Cyclic Redundancy 6343 Check) is calculated as described for CRC32c and uses the polynomial 6344 code 0x11EDC6F41 (Castagnoli93) or 6345 x^32+x^28+x^27+x^26+x^25+x^23+x^22+x^20+x^19+x^18+ 6346 x^14+x^13+x^11+x^10+x^9+x^8+x^6+x^0.
   The CRC is computed using a procedure similar to ETHERNET CRC
   [ITU.V42.1994], modified to reflect transport-level usage.

   CRC computation uses polynomial division. A message bit-string M is
   transformed to a polynomial, M(X), and the CRC is calculated from
   M(X) using polynomial arithmetic.

   When CRCs are used at the link layer, the polynomial is derived from
   on-the-wire bit ordering: the first bit 'on the wire' is the
   high-order coefficient. Since SCTP is a transport-level protocol, it
   cannot know the actual serial-media bit ordering. Moreover,
   different links in the path between SCTP endpoints may use different
   link-level bit orders.

   A convention must therefore be established for mapping SCTP transport
   messages to polynomials for purposes of CRC computation. The
   bit-ordering for mapping SCTP messages to polynomials is that bytes
   are taken most-significant first, but within each byte, bits are
   taken least-significant first. The first byte of the message
   provides the eight highest coefficients. Within each byte, the
   least-significant SCTP bit gives the most-significant polynomial
   coefficient within that byte, and the most-significant SCTP bit is
   the least-significant polynomial coefficient in that byte. (This bit
   ordering is sometimes called 'mirrored' or 'reflected' [WILLIAMS93].)
   CRC polynomials are to be transformed back into SCTP transport-level
   byte values, using a consistent mapping.

   The SCTP transport-level CRC value should be calculated as follows:

   o  CRC input data are assigned to a byte stream, numbered from 0 to
      N-1.

   o  The transport-level byte stream is mapped to a polynomial value.
      An N-byte PDU with bytes numbered j = 0 to N-1 is considered as
      coefficients of a polynomial M(x) of order 8N-1, with bit 0 of
      byte j being coefficient x^(8(N-j)-8), and bit 7 of byte j being
      coefficient x^(8(N-j)-1).

   o  The CRC remainder register is initialized with all 1s and the CRC
      is computed with an algorithm that simultaneously multiplies by
      x^32 and divides by the CRC polynomial.

   o  The polynomial is multiplied by x^32 and divided by G(x), the
      generator polynomial, producing a remainder R(x) of degree less
      than or equal to 31.

   o  The coefficients of R(x) are considered a 32-bit sequence.

   o  The bit sequence is complemented. The result is the CRC
      polynomial.

   o  The CRC polynomial is mapped back into SCTP transport-level bytes.
      The coefficient of x^31 gives the value of bit 7 of SCTP byte 0,
      and the coefficient of x^24 gives the value of bit 0 of byte 0.
      The coefficient of x^7 gives bit 7 of byte 3, and the coefficient
      of x^0 gives bit 0 of byte 3. The resulting 4-byte
      transport-level sequence is the 32-bit SCTP checksum value.

   IMPLEMENTATION NOTE: Standards documents, textbooks, and vendor
   literature on CRCs often follow an alternative formulation, in which
   the register used to hold the remainder of the long-division
   algorithm is initialized to zero rather than all-1s, and instead the
   first 32 bits of the message are complemented. The long-division
   algorithm used in our formulation is specified such that the initial
   multiplication by 2^32 and the long-division are combined into one
   simultaneous operation. For such algorithms, and for messages longer
   than 64 bits, the two specifications are precisely equivalent. That
   equivalence is the intent of this document.

   Implementors of SCTP are warned that both specifications are to be
   found in the literature, sometimes with no restriction on the
   long-division algorithm. The choice of formulation in this document
   is to permit non-SCTP usage, where the same CRC algorithm may be used
   to protect messages shorter than 64 bits.
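
   The calculation steps listed above can be sketched as a
   bit-at-a-time routine. This is a non-normative illustration under
   stated assumptions, not the sample code of this document: the names
   crc32c_bitwise, crc32c_to_bytes, and crc32c_self_check are
   hypothetical, the constant 0x82F63B78 is the generator 0x1EDC6F41
   with its 32 bits reversed, and the byte order written by
   crc32c_to_bytes is assumed to match what a table-driven reflected
   implementation places in the checksum field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: generator 0x1EDC6F41 with its 32 bits reversed. */
#define CRC32C_POLY_REFLECTED 0x82F63B78u

/* Bitwise, table-free CRC32c over buf[0..len-1] (hypothetical helper). */
uint32_t
crc32c_bitwise(const unsigned char *buf, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;   /* remainder register starts as all 1s */
    size_t i;
    int bit;

    for (i = 0; i < len; i++) {
        crc ^= buf[i];            /* bits of each byte enter LSB first */
        for (bit = 0; bit < 8; bit++) {
            if (crc & 1u)
                crc = (crc >> 1) ^ CRC32C_POLY_REFLECTED;
            else
                crc >>= 1;
        }
    }
    return ~crc;                  /* final complement */
}

/* Store the CRC as four transport-level bytes, low-order byte first
 * (assumed wire order; hypothetical helper). */
void
crc32c_to_bytes(uint32_t crc, unsigned char out[4])
{
    out[0] = (unsigned char)(crc & 0xFFu);
    out[1] = (unsigned char)((crc >> 8) & 0xFFu);
    out[2] = (unsigned char)((crc >> 16) & 0xFFu);
    out[3] = (unsigned char)((crc >> 24) & 0xFFu);
}

/* Check against the commonly published CRC32c test string. */
void
crc32c_self_check(void)
{
    static const unsigned char msg[] = "123456789";
    unsigned char wire[4];
    uint32_t crc = crc32c_bitwise(msg, sizeof(msg) - 1);

    assert(crc == 0xE3069283u);
    crc32c_to_bytes(crc, wire);
    assert(wire[0] == 0x83 && wire[1] == 0x92 &&
           wire[2] == 0x06 && wire[3] == 0xE3);
}
```

   For the nine ASCII bytes "123456789", this sketch yields 0xE3069283,
   the commonly published CRC32c check value. A bitwise loop is much
   slower than a table lookup; it is shown only because it follows the
   listed steps one for one.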

   There may be a computational advantage in validating the association
   against the Verification Tag, prior to performing a checksum, as
   invalid tags will result in the same action as a bad checksum in most
   cases. The exceptions for this technique would be INIT and some
   SHUTDOWN-COMPLETE exchanges, as well as a stale COOKIE ECHO. These
   special-case exchanges must represent small packets and will minimize
   the effect of the checksum calculation.

Appendix C. ICMP Handling

   Whenever an ICMP message is received by an SCTP endpoint, the
   following procedures MUST be followed to ensure proper utilization of
   the information being provided by layer 3.

   ICMP1)  An implementation MAY ignore all ICMPv4 messages where the
           type field is not set to "Destination Unreachable".

   ICMP2)  An implementation MAY ignore all ICMPv6 messages where the
           type field is not "Destination Unreachable", "Parameter
           Problem", or "Packet Too Big".

   ICMP3)  An implementation MAY ignore any ICMPv4 messages where the
           code does not indicate "Protocol Unreachable" or
           "Fragmentation Needed".

   ICMP4)  An implementation MAY ignore all ICMPv6 messages of type
           "Parameter Problem" if the code is not "Unrecognized Next
           Header Type Encountered".

   ICMP5)  An implementation MUST use the payload of the ICMP message
           (v4 or v6) to locate the association that sent the message to
           which ICMP is responding. If the association cannot be
           found, an implementation SHOULD ignore the ICMP message.

   ICMP6)  An implementation MUST validate that the Verification Tag
           contained in the ICMP message matches the Verification Tag of
           the peer. If the Verification Tag is not 0 and does NOT
           match, discard the ICMP message.
           If it is 0 and the ICMP message contains enough bytes to
           verify that the chunk type is an INIT chunk and that the
           Initiate Tag matches the tag of the peer, continue with
           ICMP7. If the ICMP message is too short or the chunk type or
           the Initiate Tag does not match, silently discard the packet.

   ICMP7)  If the ICMP message is either a v6 "Packet Too Big" or a v4
           "Fragmentation Needed", an implementation MAY process this
           information as defined for Path MTU Discovery.

   ICMP8)  If the ICMP code is an "Unrecognized Next Header Type
           Encountered" or a "Protocol Unreachable", an implementation
           MUST treat this message as an ABORT with the T bit set if it
           does not contain an INIT chunk. If it does contain an INIT
           chunk and the association is in the COOKIE-WAIT state, handle
           the ICMP message like an ABORT.

   ICMP9)  If the ICMPv6 code is "Destination Unreachable", the
           implementation MAY mark the destination into the unreachable
           state or alternatively increment the path error counter.

   Note that these procedures differ from [RFC1122] and from its
   requirements for processing of port-unreachable messages and the
   requirements that an implementation MUST abort associations in
   response to a "protocol unreachable" message. Port-unreachable
   messages are not processed, since an implementation will send an
   ABORT, not a port unreachable. The stricter handling of the
   "protocol unreachable" message is due to security concerns for hosts
   that do NOT support SCTP.

   The following non-normative sample code is taken from an open-source
   CRC generator [WILLIAMS93], using the "mirroring" technique and
   yielding a lookup table for SCTP CRC32c with 256 entries, each 32
   bits wide.
   While neither especially slow nor especially fast, as software
   table-lookup CRCs go, it has the advantage of working on both
   big-endian and little-endian CPUs, using the same (host-order)
   lookup tables, and using only the predefined ntohl() and htonl()
   operations. The code is somewhat modified from [WILLIAMS93], to
   ensure portability between big-endian and little-endian
   architectures. (Note that if the byte endian-ness of the target
   architecture is known to be little-endian, the final bit-reversal and
   byte-reversal steps can be folded into a single operation.)

   /*************************************************************/
   /* Note Definition for Ross Williams table generator would   */
   /* be: TB_WIDTH=4, TB_POLY=0x1EDC6F41, TB_REVER=TRUE         */
   /* For Mr. Williams direct calculation code use the settings */
   /* cm_width=32, cm_poly=0x1EDC6F41, cm_init=0xFFFFFFFF,      */
   /* cm_refin=TRUE, cm_refot=TRUE, cm_xorot=0x00000000         */
   /*************************************************************/

   /* Example of the crc table file */
   #ifndef __crc32cr_table_h__
   #define __crc32cr_table_h__

   #define CRC32C_POLY 0x1EDC6F41
   #define CRC32C(c,d) (c=(c>>8)^crc_c[(c^(d))&0xFF])

   unsigned long  crc_c[256] =
   {
   0x00000000L, 0xF26B8303L, 0xE13B70F7L, 0x1350F3F4L,
   0xC79A971FL, 0x35F1141CL, 0x26A1E7E8L, 0xD4CA64EBL,
   0x8AD958CFL, 0x78B2DBCCL, 0x6BE22838L, 0x9989AB3BL,
   0x4D43CFD0L, 0xBF284CD3L, 0xAC78BF27L, 0x5E133C24L,
   0x105EC76FL, 0xE235446CL, 0xF165B798L, 0x030E349BL,
   0xD7C45070L, 0x25AFD373L, 0x36FF2087L, 0xC494A384L,
   0x9A879FA0L, 0x68EC1CA3L, 0x7BBCEF57L, 0x89D76C54L,
   0x5D1D08BFL, 0xAF768BBCL, 0xBC267848L, 0x4E4DFB4BL,
   0x20BD8EDEL, 0xD2D60DDDL, 0xC186FE29L, 0x33ED7D2AL,
   0xE72719C1L, 0x154C9AC2L, 0x061C6936L, 0xF477EA35L,
   0xAA64D611L, 0x580F5512L, 0x4B5FA6E6L, 0xB93425E5L,
   0x6DFE410EL, 0x9F95C20DL, 0x8CC531F9L, 0x7EAEB2FAL,
   0x30E349B1L, 0xC288CAB2L, 0xD1D83946L, 0x23B3BA45L,
   0xF779DEAEL, 0x05125DADL, 0x1642AE59L, 0xE4292D5AL,
   0xBA3A117EL, 0x4851927DL, 0x5B016189L, 0xA96AE28AL,
   0x7DA08661L, 0x8FCB0562L, 0x9C9BF696L, 0x6EF07595L,
   0x417B1DBCL, 0xB3109EBFL, 0xA0406D4BL, 0x522BEE48L,
   0x86E18AA3L, 0x748A09A0L, 0x67DAFA54L, 0x95B17957L,
   0xCBA24573L, 0x39C9C670L, 0x2A993584L, 0xD8F2B687L,
   0x0C38D26CL, 0xFE53516FL, 0xED03A29BL, 0x1F682198L,
   0x5125DAD3L, 0xA34E59D0L, 0xB01EAA24L, 0x42752927L,
   0x96BF4DCCL, 0x64D4CECFL, 0x77843D3BL, 0x85EFBE38L,
   0xDBFC821CL, 0x2997011FL, 0x3AC7F2EBL, 0xC8AC71E8L,
   0x1C661503L, 0xEE0D9600L, 0xFD5D65F4L, 0x0F36E6F7L,
   0x61C69362L, 0x93AD1061L, 0x80FDE395L, 0x72966096L,
   0xA65C047DL, 0x5437877EL, 0x4767748AL, 0xB50CF789L,
   0xEB1FCBADL, 0x197448AEL, 0x0A24BB5AL, 0xF84F3859L,
   0x2C855CB2L, 0xDEEEDFB1L, 0xCDBE2C45L, 0x3FD5AF46L,
   0x7198540DL, 0x83F3D70EL, 0x90A324FAL, 0x62C8A7F9L,
   0xB602C312L, 0x44694011L, 0x5739B3E5L, 0xA55230E6L,
   0xFB410CC2L, 0x092A8FC1L, 0x1A7A7C35L, 0xE811FF36L,
   0x3CDB9BDDL, 0xCEB018DEL, 0xDDE0EB2AL, 0x2F8B6829L,
   0x82F63B78L, 0x709DB87BL, 0x63CD4B8FL, 0x91A6C88CL,
   0x456CAC67L, 0xB7072F64L, 0xA457DC90L, 0x563C5F93L,
   0x082F63B7L, 0xFA44E0B4L, 0xE9141340L, 0x1B7F9043L,
   0xCFB5F4A8L, 0x3DDE77ABL, 0x2E8E845FL, 0xDCE5075CL,
   0x92A8FC17L, 0x60C37F14L, 0x73938CE0L, 0x81F80FE3L,
   0x55326B08L, 0xA759E80BL, 0xB4091BFFL, 0x466298FCL,
   0x1871A4D8L, 0xEA1A27DBL, 0xF94AD42FL, 0x0B21572CL,
   0xDFEB33C7L, 0x2D80B0C4L, 0x3ED04330L, 0xCCBBC033L,
   0xA24BB5A6L, 0x502036A5L, 0x4370C551L, 0xB11B4652L,
   0x65D122B9L, 0x97BAA1BAL, 0x84EA524EL, 0x7681D14DL,
   0x2892ED69L, 0xDAF96E6AL, 0xC9A99D9EL, 0x3BC21E9DL,
   0xEF087A76L, 0x1D63F975L, 0x0E330A81L, 0xFC588982L,
   0xB21572C9L, 0x407EF1CAL, 0x532E023EL, 0xA145813DL,
   0x758FE5D6L, 0x87E466D5L, 0x94B49521L, 0x66DF1622L,
   0x38CC2A06L, 0xCAA7A905L, 0xD9F75AF1L, 0x2B9CD9F2L,
   0xFF56BD19L, 0x0D3D3E1AL, 0x1E6DCDEEL, 0xEC064EEDL,
   0xC38D26C4L, 0x31E6A5C7L, 0x22B65633L, 0xD0DDD530L,
   0x0417B1DBL, 0xF67C32D8L, 0xE52CC12CL, 0x1747422FL,
   0x49547E0BL, 0xBB3FFD08L, 0xA86F0EFCL, 0x5A048DFFL,
   0x8ECEE914L, 0x7CA56A17L, 0x6FF599E3L, 0x9D9E1AE0L,
   0xD3D3E1ABL, 0x21B862A8L, 0x32E8915CL, 0xC083125FL,
   0x144976B4L, 0xE622F5B7L, 0xF5720643L, 0x07198540L,
   0x590AB964L, 0xAB613A67L, 0xB831C993L, 0x4A5A4A90L,
   0x9E902E7BL, 0x6CFBAD78L, 0x7FAB5E8CL, 0x8DC0DD8FL,
   0xE330A81AL, 0x115B2B19L, 0x020BD8EDL, 0xF0605BEEL,
   0x24AA3F05L, 0xD6C1BC06L, 0xC5914FF2L, 0x37FACCF1L,
   0x69E9F0D5L, 0x9B8273D6L, 0x88D28022L, 0x7AB90321L,
   0xAE7367CAL, 0x5C18E4C9L, 0x4F48173DL, 0xBD23943EL,
   0xF36E6F75L, 0x0105EC76L, 0x12551F82L, 0xE03E9C81L,
   0x34F4F86AL, 0xC69F7B69L, 0xD5CF889DL, 0x27A40B9EL,
   0x79B737BAL, 0x8BDCB4B9L, 0x988C474DL, 0x6AE7C44EL,
   0xBE2DA0A5L, 0x4C4623A6L, 0x5F16D052L, 0xAD7D5351L,
   };

   #endif

   /* Example of table build routine */

   #include <stdio.h>
   #include <stdlib.h>

   #define OUTPUT_FILE   "crc32cr.h"
   #define CRC32C_POLY    0x1EDC6F41L
   FILE *tf;

   /* Reverse the order of the low 32 bits of b. */
   unsigned long
   reflect_32 (unsigned long b)
   {
     int i;
     unsigned long rw = 0L;

     for (i = 0; i < 32; i++){
         if (b & 1)
           /* 1UL avoids signed overflow of (int)1 << 31 when i == 0 */
           rw |= 1UL << (31 - i);
         b >>= 1;
     }
     return (rw);
   }

   /* Compute the reflected table entry for one byte value. */
   unsigned long
   build_crc_table (int index)
   {
     int i;
     unsigned long rb;

     rb = reflect_32 (index);

     for (i = 0; i < 8; i++){
         if (rb & 0x80000000L)
           rb = (rb << 1) ^ CRC32C_POLY;
         else
           rb <<= 1;
     }
     return (reflect_32 (rb));
   }

   int
   main (void)
   {
     int i;

     printf ("\nGenerating CRC-32c table file <%s>\n",
       OUTPUT_FILE);
     if ((tf = fopen (OUTPUT_FILE, "w")) == NULL){
         printf ("Unable to open %s\n", OUTPUT_FILE);
         exit (1);
     }
     fprintf (tf, "#ifndef __crc32cr_table_h__\n");
     fprintf (tf, "#define __crc32cr_table_h__\n\n");
     fprintf
       (tf, "#define CRC32C_POLY 0x%08lX\n",
        CRC32C_POLY);
     fprintf (tf,
       "#define CRC32C(c,d) (c=(c>>8)^crc_c[(c^(d))&0xFF])\n");
     fprintf (tf, "\nunsigned long crc_c[256] =\n{\n");
     for (i = 0; i < 256; i++){
         fprintf (tf, "0x%08lXL, ", build_crc_table (i));
         if ((i & 3) == 3)
           fprintf (tf, "\n");
     }
     fprintf (tf, "};\n\n#endif\n");

     if (fclose (tf) != 0)
       printf ("Unable to close <%s>.", OUTPUT_FILE);
     else
       printf ("\nThe CRC-32c table has been written to <%s>.\n",
         OUTPUT_FILE);
     return 0;
   }

   /* Example of crc insertion */

   #include <arpa/inet.h>   /* htonl(), ntohl() */
   #include "crc32cr.h"

   unsigned long
   generate_crc32c(unsigned char *buffer, unsigned int length)
   {
     unsigned int i;
     /* all 1s in the low 32 bits only, so that the result is the
      * same where unsigned long is wider than 32 bits */
     unsigned long crc32 = 0xFFFFFFFFUL;
     unsigned long result;
     unsigned char byte0,byte1,byte2,byte3;

     for (i = 0; i < length; i++){
         CRC32C(crc32, buffer[i]);
     }

     result = ~crc32;

     /*  result now holds the negated polynomial remainder;
      *  since the table and algorithm are "reflected" [WILLIAMS93].
      *  That is, result has the same value as if we mapped the message
      *  to a polynomial, computed the host-bit-order polynomial
      *  remainder, performed final negation, then did an end-for-end
      *  bit-reversal.
      *  Note that a 32-bit bit-reversal is identical to four in-place
      *  8-bit reversals followed by an end-for-end byteswap.
      *  In other words, the bits of each byte are in the right order,
      *  but the bytes have been byteswapped. So we now do an explicit
      *  byteswap. On a little-endian machine, this byteswap and
      *  the final ntohl cancel out and could be elided.
      */

     byte0 = result & 0xff;
     byte1 = (result>>8) & 0xff;
     byte2 = (result>>16) & 0xff;
     byte3 = (result>>24) & 0xff;
     /* widen before shifting so that byte0 << 24 cannot overflow int */
     crc32 = (((unsigned long)byte0 << 24) |
              ((unsigned long)byte1 << 16) |
              ((unsigned long)byte2 << 8)  |
              (unsigned long)byte3);
     return ( crc32 );
   }

   int
   insert_crc32(unsigned char *buffer, unsigned int length)
   {
     SCTP_message *message;
     unsigned long crc32;
     message = (SCTP_message *) buffer;
     message->common_header.checksum = 0L;
     crc32 = generate_crc32c(buffer,length);
     /* and insert it into the message */
     message->common_header.checksum = htonl(crc32);
     return 1;
   }

   int
   validate_crc32(unsigned char *buffer, unsigned int length)
   {
     SCTP_message *message;
     unsigned long original_crc32;
     unsigned long crc32;

     /* save and zero checksum */
     message = (SCTP_message *) buffer;
     original_crc32 = ntohl(message->common_header.checksum);
     message->common_header.checksum = 0L;
     crc32 = generate_crc32c(buffer,length);
     return ((original_crc32 == crc32)? 1 : -1);
   }

Authors' Addresses

   Randall R. Stewart
   Netflix, Inc.
   2455 Heritage Green Ave
   Davenport, FL 33837
   United States

   Email: randall@lakerest.net

   Michael Tuexen
   Muenster University of Applied Sciences
   Stegerwaldstrasse 39
   Steinfurt 48565
   Germany

   Email: tuexen@fh-muenster.de

   Karen E. E. Nielsen
   Kamstrup A/S
   Industrivej 28
   Skanderborg DK-8660
   Denmark

   Email: kee@kamstrup.com