idnits 2.17.1 draft-ietf-sigtran-sctp-07.txt: Checking boilerplate required by RFC 5378 and the IETF Trust (see https://trustee.ietf.org/license-info): ---------------------------------------------------------------------------- ** Looks like you're using RFC 2026 boilerplate. This must be updated to follow RFC 3978/3979, as updated by RFC 4748. Checking nits according to https://www.ietf.org/id-info/1id-guidelines.txt: ---------------------------------------------------------------------------- ** The document seems to lack a 1id_guidelines paragraph about Internet-Drafts being working documents. ** The document seems to lack a 1id_guidelines paragraph about 6 months document validity. == No 'Intended status' indicated for this document; assuming Proposed Standard == The page length should not exceed 58 lines per page, but there was 1 longer page, the longest (page 1) being 5405 lines Checking nits according to https://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- ** The document seems to lack separate sections for Informative/Normative References. All references will be assumed normative when checking for downward references. ** There are 23 instances of too long lines in the document, the longest one being 8 characters in excess of 72. ** There are 2 instances of lines with control characters in the document. ** The document seems to lack a both a reference to RFC 2119 and the recommended RFC 2119 boilerplate, even if it appears to use RFC 2119 keywords. RFC 2119 keyword, line 568: '...ks. These chunks MUST NOT be multiplex...' RFC 2119 keyword, line 575: '...an SCTP datagram MUST be transmitted i...' RFC 2119 keyword, line 609: '...Verification Tag MUST be set to the va...' RFC 2119 keyword, line 613: '...IT chunk, the transmitter MUST set the...' RFC 2119 keyword, line 617: '...CK, the receiver MUST drop the datagra...' (198 more instances...) 
Miscellaneous warnings: ---------------------------------------------------------------------------- == Line 4393 has weird spacing: '...ication in t...' == Line 4409 has weird spacing: '...ss List boun...' == Line 4415 has weird spacing: '...ination add...' == Line 4505 has weird spacing: '...ion was last ...' == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD', or 'RECOMMENDED' is not an accepted usage according to RFC 2119. Please use uppercase 'NOT' together with RFC 2119 keywords (if that is what you mean). Found 'MUST not' in this paragraph: This is to allow vendors to support their own extended parameters not defined by the IETF. It MUST not affect the operation of SCTP. == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD', or 'RECOMMENDED' is not an accepted usage according to RFC 2119. Please use uppercase 'NOT' together with RFC 2119 keywords (if that is what you mean). Found 'SHOULD not' in this paragraph: This value represents the dedicated buffer space, in number of octets, the sender of the INIT has placed in association with this window. During the life of the association this buffer space SHOULD not be lessened (i.e. dedicated buffers taken away from this association). == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD', or 'RECOMMENDED' is not an accepted usage according to RFC 2119. Please use uppercase 'NOT' together with RFC 2119 keywords (if that is what you mean). Found 'MUST not' in this paragraph: The ABORT chunk is sent to the peer of an association to terminate the association. The ABORT chunk may contain cause parameters to inform the receiver the reason of the abort. DATA chunks MUST not be bundled with ABORT. Control chunks MAY be bundled with an ABORT but they MUST be placed before the ABORT in the SCTP datagram, or they will be ignored by the receiver. 
== Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD', or 'RECOMMENDED' is not an accepted usage according to RFC 2119. Please use uppercase 'NOT' together with RFC 2119 keywords (if that is what you mean). Found 'MUST not' in this paragraph: This Chunk type is available to allow vendors to support their own extended data formats not defined by the IETF. It MUST not affect the operation of SCTP. In particular, when adding a Vendor Specific chunk type, the vendor defined chunks MUST obey the congestion avoidance rules defined in this document if they carry user data. User data is defined as any data transported over the association that is delivered to the upper layer of the receiver. == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD', or 'RECOMMENDED' is not an accepted usage according to RFC 2119. Please use uppercase 'NOT' together with RFC 2119 keywords (if that is what you mean). Found 'MUST not' in this paragraph: Note: after sending out INIT ACK with the cookie, "Z" MUST not allocate any resources, nor keep any states for the new association. Otherwise, "Z" will be vulnerable to resource attacks. == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD', or 'RECOMMENDED' is not an accepted usage according to RFC 2119. Please use uppercase 'NOT' together with RFC 2119 keywords (if that is what you mean). Found 'MUST not' in this paragraph: After sending out the INIT ACK, the endpoint shall take no further actions, i.e., the existing association, including its current state, and the corresponding TCB MUST not be changed. == Using lowercase 'not' together with uppercase 'MUST', 'SHALL', 'SHOULD', or 'RECOMMENDED' is not an accepted usage according to RFC 2119. Please use uppercase 'NOT' together with RFC 2119 keywords (if that is what you mean). 
Found 'MUST not' in this paragraph: NOTE: In instances where the data receiver endpoint is multi-homed, if a SACK arrives at the data sender that advances the sender's cumulative TSN point, then the data sender should update its cwnd (or cwnds) apportioned to the destination addresses where the data was transmitted to. However if the SACK does not advance the cumulative TSN point, the data sender MUST not adjust the cwnd of any of the destination addresses. -- The document seems to lack a disclaimer for pre-RFC5378 work, but may have content which was first submitted before 10 November 2008. If you have contacted all the original authors and they are all willing to grant the BCP78 rights to the IETF Trust, then this is fine, and you can ignore this comment. If not, you may need to add the pre-RFC5378 disclaimer. (See the Legal Provisions document at https://trustee.ietf.org/license-info for more information.) -- Couldn't find a document date in the document -- date freshness check skipped. Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) -- Looks like a reference, but probably isn't: 'ASSOCIATE' on line 1794 -- Looks like a reference, but probably isn't: 'TERMINATE' on line 1826 -- Looks like a reference, but probably isn't: 'ABORT' on line 1788 == Unused Reference: '7' is defined on line 4778, but no explicit reference was found in the text == Unused Reference: '10' is defined on line 4786, but no explicit reference was found in the text == Unused Reference: '14' is defined on line 4799, but no explicit reference was found in the text ** Obsolete normative reference: RFC 1750 (ref. '1') (Obsoleted by RFC 4086) ** Downref: Normative reference to an Informational RFC: RFC 1950 (ref. '2') ** Obsolete normative reference: RFC 2581 (ref. 
'3') (Obsoleted by RFC 5681) ** Downref: Normative reference to an Informational RFC: RFC 2104 (ref. '4') -- Possible downref: Non-RFC (?) normative reference: ref. '5' ** Downref: Normative reference to an Experimental RFC: RFC 2522 (ref. '6') ** Obsolete normative reference: RFC 793 (ref. '8') (Obsoleted by RFC 9293) ** Obsolete normative reference: RFC 1700 (ref. '10') (Obsoleted by RFC 3232) ** Obsolete normative reference: RFC 1981 (ref. '12') (Obsoleted by RFC 8201) ** Downref: Normative reference to an Informational RFC: RFC 2196 (ref. '13') ** Obsolete normative reference: RFC 2401 (ref. '14') (Obsoleted by RFC 4301) -- Possible downref: Non-RFC (?) normative reference: ref. '15' -- Possible downref: Non-RFC (?) normative reference: ref. '16' Summary: 17 errors (**), 0 flaws (~~), 16 warnings (==), 8 comments (--). Run idnits with the --verbose option for more detailed information about the items above. -------------------------------------------------------------------------------- 1 Network Working Group R. R. Stewart 2 INTERNET-DRAFT Q. Xie 3 Motorola 4 K. Morneault 5 C. Sharp 6 Cisco 7 H. J. Schwarzbauer 8 Siemens 9 T. Taylor 10 Nortel Networks 11 I. Rytina 12 Ericsson 13 M. Kalla 14 Telcordia 15 L. Zhang 16 UCLA 17 V. Paxson 18 ACIRI 20 expires in six months March 2,2000 22 Simple Control Transmission Protocol 23 25 Status of This Memo 27 This document is an Internet-Draft and is in full conformance with all 28 provisions of Section 10 of RFC 2026. Internet-Drafts are working 29 documents of the Internet Engineering Task Force (IETF), its areas, 30 and its working groups. Note that other groups may also distribute 31 working documents as Internet-Drafts. 33 The list of current Internet-Drafts can be accessed at 34 http://www.ietf.org/ietf/1id-abstracts.txt 36 The list of Internet-Draft Shadow Directories can be accessed at 37 http://www.ietf.org/shadow.html. 
39 Stewart, et al [Page 1] 40 Abstract 42 This document describes the Simple Control Transmission Protocol 43 (SCTP). SCTP is designed to transport PSTN signaling messages over 44 IP networks, but is capable of broader applications. 46 SCTP is a reliable datagram transfer protocol operating on top of an 47 unreliable routed packet network such as IP. It offers the following 48 services to its users: 50 -- acknowledged error-free non-duplicated transfer of user data, 51 -- data segmentation to conform to discovered path MTU size, 52 -- sequenced delivery of user messages within multiple streams, 53 with an option for order-of-arrival delivery of individual 54 user messages, 55 -- optional multiplexing of user messages into SCTP datagrams, and 56 -- network-level fault tolerance through supporting of multi-homing 57 at either or both ends of an association. 59 The design of SCTP includes appropriate congestion avoidance behavior 60 and resistance to flooding and masquerade attacks. 62 Stewart, et al [Page 2] 63 TABLE OF CONTENTS 65 1. Introduction..................................................5 66 1.1 Motivation..................................................5 67 1.2 Architectural View of SCTP..................................5 68 1.3 Functional View of SCTP.....................................6 69 1.3.1 Association Startup and Takedown........................7 70 1.3.2 Sequenced Delivery within Streams.......................7 71 1.3.3 User Data Segmentation..................................8 72 1.3.4 Acknowledgment and Congestion Avoidance.................8 73 1.3.5 Chunk Multiplex.........................................8 74 1.3.6 Path Management.........................................8 75 1.3.7 Message Validation......................................9 76 1.4 Recapitulation of Key Terms.................................9 77 1.5 Abbreviations...............................................11 78 2. 
SCTP Datagram Format..........................................11 79 2.1 SCTP Common Header Field Descriptions.......................12 80 2.2 Chunk Field Descriptions....................................13 81 2.2.1 Optional/Variable-length Parameter Format...............14 82 2.2.2 Vendor-Specific Extension Parameter Format..............15 83 2.3 SCTP Chunk Definitions......................................17 84 2.3.1 Initiation (INIT).......................................17 85 2.3.1.1 Optional or Variable Length Parameters..............19 86 2.3.2 Initiation Acknowledgment (INIT ACK)....................20 87 2.3.2.1 Optional or Variable Length Parameters..............21 88 2.3.3 Selective Acknowledgment (SACK).........................22 89 2.3.4 Heartbeat Request (HEARTBEAT)...........................25 90 2.3.5 Heartbeat Acknowledgment (HEARTBEAT ACK)................26 91 2.3.6 Abort Association (ABORT)...............................26 92 2.3.7 Shutdown Association (SHUTDOWN).........................27 93 2.3.8 Shutdown Acknowledgment (SHUTDOWN ACK)..................28 94 2.3.9 Operation Error (ERROR).................................28 95 2.3.10 State Cookie (COOKIE)..................................30 96 2.3.11 Cookie Acknowledgment (COOKIE ACK).....................31 97 2.3.12 Payload Data (DATA)....................................31 98 2.4 Vendor-Specific Chunk Extensions............................33 99 3. SCTP Association State Diagram.................................34 100 4. 
Association Initialization.....................................36 101 4.1 Normal Establishment of an Association......................37 102 4.1.1 Handle Stream Parameters................................39 103 4.1.2 Handle Address Parameters...............................39 104 4.1.3 Generating State Cookie.................................39 105 4.1.4 Cookie Processing.......................................40 106 4.1.5 Cookie Authentication...................................40 107 4.1.6 An Example of Normal Association Establishment..........41 108 4.2 Handle Duplicate INIT, INIT ACK, COOKIE, and COOKIE ACK.....42 109 4.2.1 Handle Duplicate INIT in COOKIE-WAIT 110 or COOKIE-SENT States...................................43 111 4.2.2 Handle Duplicate INIT in Other States...................43 112 4.2.3 Handle Duplicate INIT ACK...............................43 113 4.2.4 Handle Duplicate COOKIE.................................43 114 4.2.5 Handle Duplicate COOKIE-ACK.............................45 115 4.2.6 Handle Stale COOKIE Error...............................45 116 4.3 Other Initialization Issues.................................45 117 Stewart, et al [Page 3] 118 4.3.1 Selection of Tag Value..................................45 119 5. 
User Data Transfer.............................................46 120 5.1 Transmission of DATA Chunks.................................47 121 5.2 Acknowledgment of Reception of DATA Chunks..................48 122 5.2.1 Tracking Peer's Receive Buffer Space....................49 123 5.3 Management Retransmission Timer.............................50 124 5.3.1 RTO Calculation.........................................50 125 5.3.2 Retransmission Timer Rules..............................51 126 5.3.3 Handle T3-rxt Expiration................................52 127 5.4 Multi-homed SCTP Endpoints..................................53 128 5.4.1 Failover from Inactive Destination Address..............54 129 5.5 Stream Identifier and Stream Sequence Number................54 130 5.6 Ordered and Un-ordered Delivery.............................54 131 5.7 Report Gaps in Received DATA TSNs...........................55 132 5.8 Adler-32 Checksum Calculation...............................56 133 5.9 Segmentation................................................57 134 5.10 Bundling and Multiplexing..................................58 135 6. Congestion Control ..........................................58 136 6.1 SCTP Differences from TCP Congestion Control................59 137 6.2 SCTP Slow-Start and Congestion Avoidance....................59 138 6.2.1 Slow-Start..............................................60 139 6.2.2 Congestion Avoidance....................................61 140 6.2.3 Congestion Control......................................61 141 6.2.4 Fast Retransmit on Gap Reports..........................62 142 6.3 Path MTU Discovery..........................................63 143 7. 
Fault Management..............................................64 144 7.1 Endpoint Failure Detection..................................64 145 7.2 Path Failure Detection......................................64 146 7.3 Path Heartbeat..............................................65 147 7.4 Handle "Out of the blue" Packets............................66 148 7.5 Verification Tag............................................67 149 7.5.1 Exceptions in Verification Tag Rules....................67 150 8. Termination of Association.....................................68 151 8.1 Close of an Association.....................................68 152 8.2 Shutdown of an Association..................................68 153 9. Interface with Upper Layer.....................................69 154 9.1 ULP-to-SCTP.................................................70 155 9.2 SCTP-to-ULP.................................................78 156 10. Security Considerations.......................................82 157 10.1 Security Objectives........................................82 158 10.2 SCTP Responses To Potential Threats........................82 159 10.2.1 Countering Insider Attacks.............................82 160 10.2.2 Protecting against Data Corruption in the Network......83 161 10.2.3 Protecting Confidentiality.............................83 162 10.2.4 Protecting against Blind Denial of Service Attacks.....83 163 10.2.4.1 Flooding...........................................84 164 10.2.4.2 Masquerade.........................................84 165 10.2.4.3 Improper Monopolization of Services................85 166 10.3 Protection against Fraud and Repudiation...................85 167 11. Recommended Transmission Control Block (TCB) Parameters.......86 168 11.1 Parameters necessary for the SCTP instance.................86 169 11.2 Parameters necessary per association (i.e. 
the TCB)........87 170 11.3 Per Transport Address Data.................................88 171 11.4 General Parameters Needed..................................89 172 12. IANA Consideration............................................89 173 12.1 IETF-defined Chunk Extension...............................89 174 12.2 IETF-defined Chunk Parameter Extension.....................90 175 12.3 IETF-defined Additional Error Causes.......................91 176 12.4 Payload Protocol Identifiers...............................92 177 Stewart, et al [Page 4] 178 13. Suggested SCTP Protocol Parameter Values......................92 179 14. Acknowledgments...............................................92 180 15. Authors' Addresses............................................93 181 16. References....................................................94 182 Appendix A .......................................................95 184 1. Introduction 186 This section explains the reasoning behind the development of the 187 Simple Control Transmission Protocol (SCTP), the services it offers, 188 and the basic concepts needed to understand the detailed description 189 of the protocol. 191 1.1 Motivation 193 TCP [8] has performed immense service as the primary means of reliable 194 data transfer in IP networks. However, an increasing number of recent 195 applications have found TCP too limiting, and have incorporated their 196 own reliable data transfer protocol on top of UDP [9]. The limitations 197 which users have wished to bypass include the following: 199 -- TCP provides both reliable data transfer and strict order- 200 of-transmission delivery of data. Some applications need reliable 201 transfer without sequence maintenance, while others would be 202 satisfied with partial ordering of the data. In both of these 203 cases the head-of-line blocking offered by TCP causes 204 unnecessary delay. 206 -- The stream-oriented nature of TCP is often an inconvenience. 
207 Applications must add their own record marking to delineate 208 their messages, and must make explicit use of the push facility 209 to ensure that a complete message is transferred in a 210 reasonable time. 212 -- The limited scope of TCP sockets complicates the task of 213 providing highly-available data transfer capability using 214 multi-homed hosts. 216 -- TCP is relatively vulnerable to denial of service attacks, 217 such as SYN attacks. 219 Transport of PSTN signaling across the IP network is an application 220 for which all of these limitations of TCP are relevant. While this 221 application directly motivated the development of SCTP, other 222 applications may find SCTP a good match to their requirements. 224 1.2 Architectural View of SCTP 226 SCTP is viewed as a layer between the SCTP user application ("SCTP 227 user" for short) and an unreliable routed packet network service such 228 as IP. The basic service offered by SCTP is the reliable transfer of 229 user messages between peer SCTP users. It performs this service 231 Stewart, et al [Page 5] 232 within the context of an association between two SCTP nodes. Chapter 9 233 of this document sketches the API which should exist at the boundary 234 between the SCTP and the SCTP user layers. 236 SCTP is connection-oriented in nature, but the SCTP association is a 237 broader concept than the TCP connection. SCTP provides the means for 238 each SCTP endpoint (Section 1.4) to provide the other during 239 association startup with a list of transport addresses (e.g. multiple 240 IP addresses in combination with an SCTP port) through which that 241 endpoint can be reached and from which it will originate messages. 242 The association spans transfers over all of the possible 243 source/destination combinations which may be generated from the two 244 endpoint lists. 
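As a rough illustration of the preceding paragraph, the source/destination combinations spanned by an association can be enumerated as the cross product of the two endpoints' transport address lists. This is only a sketch: the addresses, port numbers, and the helper name association_paths are assumptions chosen for the example and are not defined anywhere in this document.

```python
from itertools import product

# Hypothetical transport addresses (IP address, SCTP port).
# Each endpoint uses a single SCTP port but may have several IP addresses.
endpoint_a = [("192.0.2.1", 5000), ("198.51.100.1", 5000)]
endpoint_b = [("203.0.113.7", 6000), ("203.0.113.8", 6000)]


def association_paths(local, remote):
    """List every (source, destination) pair the association may use."""
    return list(product(local, remote))


# Two addresses on each side give four possible combinations from A to B.
paths_a_to_b = association_paths(endpoint_a, endpoint_b)
```

Whether two of these combinations actually traverse separate network routes is outside the scope of the sketch.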
246  _____________                                      _____________
247 |  SCTP User  |                                    |  SCTP User  |
248 | Application |                                    | Application |
249 |-------------|                                    |-------------|
250 |    SCTP     |                                    |    SCTP     |
251 |  Transport  |                                    |  Transport  |
252 |   Service   |                                    |   Service   |
253 |-------------|                                    |-------------|
254 |             |One or more    ----      One or more|             |
255 | IP Network  |IP address      \/        IP address| IP Network  |
256 |   Service   |appearances     /\       appearances|   Service   |
257 |_____________|               ----                 |_____________|
259   SCTP Node A |<-------- Network transport ------->| SCTP Node B
261                   Figure 1: An SCTP Association
263 1.3 Functional View of SCTP 265 The SCTP transport service can be decomposed into a number of 266 functions. These are depicted in Figure 2 and explained in the 267 remainder of this section.
269                    SCTP User Application
271 ..-----------------------------------------------------
272 .. _____________                  ____________________
273   |             |                | Sequenced delivery |
274   | Association |                |   within streams   |
275   |             |                |____________________|
276   |   startup   |
277 ..|             |         ____________________________
278   |     and     |        |   User Data Segmentation   |
279   |             |        |____________________________|
280   |   takedown  |
282 Stewart, et al [Page 6]
283 ..|             |         ____________________________
284   |             |        |       Acknowledgment       |
285   |             |        |            and             |
286   |             |        |    Congestion Avoidance    |
287 ..|             |        |____________________________|
288   |             |
289   |             |         ____________________________
290   |             |        |      Chunk Multiplex       |
291   |             |        |____________________________|
292   |             |
293   |             |     ________________________________
294   |             |    |       Message Validation       |
295   |             |    |________________________________|
296   |             |
297   |             |     ________________________________
298   |             |    |        Path Management         |
299   |_____________|    |________________________________|
301       Figure 2: Functional View of the SCTP Transport Service
303 1.3.1 Association Startup and Takedown 305 An association is initiated by a request from the SCTP user (see the 306 description of the ASSOCIATE primitive in Chapter 9).
308 A cookie mechanism, taken from that devised by Karn and Simpson in RFC 309 2522 [6], is employed during the initialization to provide protection 310 against security attacks. The cookie mechanism uses a four-way 311 handshake, the last two legs of which are allowed to carry user 313 data for fast setup. The startup sequence is described in Chapter 4 of 314 this document. 316 SCTP provides for graceful takedown of an active association on 317 request from the SCTP user. See the description of the TERMINATE 318 primitive in Chapter 9. SCTP also allows ungraceful takedown, either 319 on request from the user (ABORT primitive) or as a result of an error 320 condition detected within the SCTP layer. Chapter 8 describes both the 321 graceful and the ungraceful takedown procedures. 323 1.3.2 Sequenced Delivery within Streams 325 The term "stream" is used in SCTP to refer to a sequence of user 326 messages. This is in contrast to its usage in TCP, where it refers to 327 a sequence of bytes. 329 The SCTP user can specify at association startup time the number of 330 streams to be supported by the association. This number is negotiated 331 with the remote end (see Section 4.1.1). User messages are associated 332 with stream numbers (SEND, RECEIVE primitives, Chapter 9). Internally, 333 SCTP assigns a stream sequence number to each message passed to it by 335 Stewart, et al [Page 7] 336 the SCTP user. On the receiving side, SCTP ensures that messages are 337 delivered to the SCTP user in sequence within a given stream. However, 338 while one stream may be blocked waiting for the next in-sequence user 339 message, delivery from other streams may proceed. 341 SCTP provides a mechanism for bypassing the sequenced delivery 342 service. User messages sent using this mechanism are delivered to the 343 SCTP user as soon as they are received.
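The receive-side behavior described in Section 1.3.2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure specified by this document: the class name StreamReorderer, its receive method, and the choice to start each stream's sequence number at 0 are all hypothetical, and wraparound of the stream sequence number is ignored.

```python
from collections import defaultdict


class StreamReorderer:
    """Hypothetical receive-side helper: in-sequence delivery per stream."""

    def __init__(self):
        # Next stream sequence number expected on each stream
        # (assumed to start at 0 for this sketch).
        self.next_ssn = defaultdict(int)
        # Messages that arrived early, keyed by stream id, then by sequence number.
        self.pending = defaultdict(dict)

    def receive(self, stream_id, ssn, message, unordered=False):
        """Return the messages that can be handed to the SCTP user now."""
        if unordered:
            # Bypass of the sequenced delivery service: deliver on arrival.
            return [message]
        self.pending[stream_id][ssn] = message
        deliverable = []
        # Drain every message that is now in sequence on this stream only;
        # other streams are never consulted, so they cannot be blocked by it.
        while self.next_ssn[stream_id] in self.pending[stream_id]:
            deliverable.append(self.pending[stream_id].pop(self.next_ssn[stream_id]))
            self.next_ssn[stream_id] += 1
        return deliverable


reorderer = StreamReorderer()
early = reorderer.receive(0, 1, "a1")         # stream 0 still waits for number 0
other = reorderer.receive(1, 0, "b0")         # stream 1 proceeds independently
caught_up = reorderer.receive(0, 0, "a0")     # releases both stream 0 messages
urgent = reorderer.receive(0, 7, "u", unordered=True)
```

Keeping one expected-number counter and one holding area per stream is what lets a gap on one stream leave delivery on the others unaffected.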
345 1.3.3 User Data Segmentation 347 SCTP can segment user messages to ensure that the SCTP datagram 348 passed to the lower layer conforms to the path MTU. Segments are 349 reassembled into complete messages before being passed to the SCTP 350 user. 352 1.3.4 Acknowledgment and Congestion Avoidance 354 SCTP assigns a Transmission Sequence Number (TSN) to each user data 355 segment or unsegmented message. The TSN is independent of any 356 stream sequence number assigned at the stream level. The receiving end 357 acknowledges all TSNs received, even if there are gaps in the 358 sequence. In this way, reliable delivery is kept functionally separate 359 from sequenced delivery. 361 The Acknowledgment and Congestion Avoidance function is responsible 362 for message retransmission when timely acknowledgment has not been 363 received. Message retransmission is conditioned by congestion 364 avoidance procedures similar to those used for TCP. See Chapters 5 365 and 6 for a detailed description of the protocol procedures associated 366 with this function. 368 1.3.5 Chunk Multiplex 370 As described in Chapter 2, the SCTP datagram as delivered to the lower 371 layer consists of a common header followed by one or more chunks. Each 372 chunk may contain either user data or SCTP control information. The 373 SCTP user has the option to request "bundling", or multiplexing of 374 more than one user message into a single SCTP datagram. The chunk 375 multiplex function of SCTP is responsible for assembly of the complete 376 SCTP datagram and its disassembly at the receiving end. 378 1.3.6 Path Management 380 The sending SCTP user is able to manipulate the set of transport 381 addresses used as destinations for SCTP datagrams, through the 382 primitives described in Chapter 9.
The SCTP path management function 383 chooses the destination transport address for each outgoing SCTP 384 datagram based on the SCTP user's instructions and the currently 385 perceived reachability status of the eligible destination set. 387 Stewart, et al [Page 8] 388 The path management function monitors reachability through heartbeat 389 messages when other message traffic is inadequate to provide this 390 information, and advises the SCTP user when reachability of any far- 391 end transport address changes. The path management function is also 392 responsible for reporting the eligible set of local transport 393 addresses to the far end during association startup, and for reporting 394 the transport addresses returned from the far end to the SCTP user. 396 At association startup, a primary destination transport address is 397 defined for each SCTP endpoint, and is used for normal sending of SCTP 398 datagrams. 400 On the receiving end, the path management function is responsible for verifying 401 the existence of a valid SCTP association to which the inbound SCTP 402 datagram belongs before passing it for further processing. 404 1.3.7 Message Validation 406 A mandatory verification tag field and an Adler-32 checksum [2] field are 407 included in the SCTP common header. The verification tag value is 408 chosen by each end of the association during association startup. 409 Messages received without the verification tag value expected by the 410 receiver are discarded, as a protection against blind masquerade 411 attacks and against stale datagrams from a previous association. 413 The Adler-32 checksum should be set by the sender of each SCTP datagram, 414 to provide additional protection against data corruption in the 415 network beyond that provided by lower layers (e.g. the IP checksum). 417 1.4 Recapitulation of Key Terms 419 The language used to describe SCTP has been introduced in the previous 420 sections.
This section provides a consolidated list of the key terms 421 and their definitions. 423 o SCTP user application (SCTP user): The logical higher-layer 424 application entity which uses the services of SCTP, also called 425 the Upper-layer Protocol (ULP). 427 o User message: the unit of data delivery across the interface 428 between SCTP and its user. 430 o SCTP datagram: the unit of data delivery across the interface 431 between SCTP and the unreliable packet network (e.g. IP) which 432 it is using. An SCTP datagram includes the common SCTP header, 433 possible SCTP control chunks, and user data encapsulated within 434 SCTP DATA chunks. 436 o Transport address: an address which serves as a source or 437 destination for the unreliable packet transport service used by 438 SCTP. In IP networks, a transport address is defined by the 439 combination of an IP address and an SCTP port number. 441 Stewart, et al [Page 9] 442 Note, only one SCTP port may be defined for each endpoint, 443 but each endpoint may have multiple IP addresses. 445 o SCTP endpoint: the logical sender/receiver of SCTP datagrams. On a 446 multi-homed host, an SCTP endpoint is represented to its peers as a 447 combination of a set of eligible destination transport addresses to 448 which SCTP datagrams can be sent and a set of eligible source 449 transport addresses from which SCTP datagrams can be received. 451 Note, a source or destination transport address can only be 452 included in one unique SCTP endpoint, i.e., it is NOT allowed to 453 have the same SCTP source or destination transport address appear 454 in more than one SCTP endpoint. 456 o SCTP association: a protocol relationship between SCTP endpoints, 457 comprising the two SCTP endpoints and protocol state information 458 including verification tags and the currently active set of 459 Transmission Sequence Numbers (TSNs), etc. 
461 o Chunk: a unit of information within an SCTP datagram, consisting of 462 a chunk header and chunk-specific content. 464 o Transmission Sequence Number (TSN): a 32-bit sequence number used 465 internally by SCTP. One TSN is attached to each chunk containing 466 user data to permit the receiving SCTP endpoint to acknowledge its 467 receipt and detect duplicate deliveries. 469 o Stream: a uni-directional logical channel established from one SCTP 470 endpoint to another associated SCTP endpoint, within which all user messages 471 are delivered in sequence except for those submitted to the 472 un-ordered delivery service. 474 Note: The relationship between stream numbers in opposite 475 directions is strictly a matter of how the applications use 476 them. It is the responsibility of the SCTP user to create and 477 manage these correlations if they are so desired. 479 o Stream Sequence Number: a 16-bit sequence number used internally by 480 SCTP to assure sequenced delivery of the user messages within a 481 given stream. One stream sequence number is attached to each user 482 message. 484 o Path: the route taken by the SCTP datagrams sent by one SCTP 485 endpoint to a specific destination transport address of its peer 486 SCTP endpoint. Note, sending to different destination transport 487 addresses does not necessarily guarantee getting separate paths. 489 o Bundling: an optional multiplexing operation, whereby more than one 490 user message may be carried in the same SCTP datagram. Each user 491 message occupies its own DATA chunk. 493 o Outstanding TSN (at an SCTP endpoint): a TSN (and the associated DATA 494 chunk) which has been sent by the endpoint but for which it has not 495 yet received an acknowledgment. 497 Stewart, et al [Page 10] 498 o Unacknowledged TSN (at an SCTP endpoint): a TSN (and the associated DATA 499 chunk) which has been received by the endpoint but for which an 500 acknowledgment has not yet been sent.
502 o Receiver Window (rwnd): The most recently calculated receiver 503 window, in number of octets. This gives an indication of the space 504 available in the receiver's inbound buffer. 506 o Congestion Window (cwnd): An SCTP variable that limits the data, in 507 number of octets, a sender can send into the network before 508 receiving an acknowledgment on a particular destination Transport 509 address. 511 o Slow Start Threshold (ssthresh): An SCTP variable. This is the 512 threshold which the endpoint will use to determine whether to 513 perform slow start or congestion avoidance on a particular destination 514 transport address. Ssthresh is in number of octets. 516 o Transmission Control Block (TCB): an internal data structure 517 created by an SCTP endpoint for each of its existing SCTP 518 associations to other SCTP endpoints. TCB contains all the status 519 and operational information for the endpoint to maintain and manage 520 the corresponding association. 522 o Network Byte Order: Most significant byte first, a.k.a Big Endian. 524 1.5. Abbreviations 526 ICV - Integrity Check Value [4] 528 RTO - Retransmission Time-out 530 RTT - Round-trip Time 532 RTTVAR - Round-trip Time Variation 534 SCTP - Simple Control Transmission Protocol 536 SRTT - Smoothed RTT 538 TCB - Transmission Control Block 540 TLV - Type-Length-Value Coding Format 542 TSN - Transmission Sequence Number 544 ULP - Upper-layer Protocol 546 2. SCTP Datagram Format 548 An SCTP datagram is composed of a common header and chunks. A chunk 549 contains either control information or user data. 551 Stewart, et al [Page 11] 552 The SCTP datagram format is shown below: 554 0 1 2 3 555 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 556 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 557 | Common Header | 558 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 559 | Chunk #1 | 560 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 561 | ... 
| 562 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 563 | Chunk #n | 564 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 566 Multiple chunks can be multiplexed into one SCTP datagram up to 567 the MTU size, except for the INIT, INIT ACK, and SHUTDOWN ACK 568 chunks. These chunks MUST NOT be multiplexed with any other chunk in a 569 datagram. See Section 5.10 for more details on chunk multiplexing. 571 If an user data message doesn't fit into one SCTP datagram it can be 572 segmented into multiple chunks using the procedure defined in 573 Section 5.9. 575 All integer fields in an SCTP datagram MUST be transmitted in the 576 network byte order, unless otherwise stated. 578 2.1 SCTP Common Header Field Descriptions 580 SCTP Common Header Format 582 0 1 2 3 583 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 584 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 585 | Source Port Number | Destination Port Number | 586 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 587 | Verification Tag | 588 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 589 | Adler-32 Checksum | 590 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 592 Source Port Number: 16 bit u_int 594 This is the SCTP sender's port number. It can be used by the 595 receiver, in combination with the source IP address, to identify the 596 association to which this datagram belongs. 598 Destination Port Number: 16 bit u_int 600 This is the SCTP port number to which this datagram is destined. The 601 receiving host will use this port number to de-multiplex the 602 SCTP datagram to the correct receiving endpoint/application. 604 Stewart, et al [Page 12] 605 Verification Tag: 32 bit u_int 607 The receiver of this datagram uses the Verification Tag to validate 608 the sender of this SCTP datagram. 
On transmit, the value of this 609 Verification Tag MUST be set to the value of the Initiate Tag 610 received from the peer endpoint during the association 611 initialization. 613 For datagrams carrying the INIT chunk, the transmitter MUST set the 614 Verification Tag to all 0's. If the receiver receives a datagram 615 with an all-zeros Verification Tag field, it checks the Chunk ID 616 immediately following the common header. If the Chunk Type is 617 neither INIT nor SHUTDOWN ACK, the receiver MUST drop the datagram. 619 For datagrams carrying the SHUTDOWN ACK chunk, the transmitter 621 SHOULD set the Verification Tag to the Initiate Tag received from 622 the peer endpoint during the association initialization, if known. 623 Otherwise, the Verification Tag MUST be set to all 0's. 625 Adler-32 Checksum: 32 bit u_int 627 This field MUST contain an Adler-32 checksum of this SCTP 628 datagram. Its calculation is discussed in Section 5.8. 630 2.2 Chunk Field Descriptions 632 The figure below illustrates the field format for the chunks to be 633 transmitted in the SCTP datagram. Each chunk is formatted with a Chunk 634 ID field, a chunk-specific Flag field, a Length field, and a Value 635 field. 637 0 1 2 3 638 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 639 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 640 | Chunk ID | Chunk Flags | Chunk Length | 641 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 642 \ \ 643 / Chunk Value / 644 \ \ 645 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 647 Chunk ID: 8 bits, u_int 649 This field identifies the type of information contained in the Chunk 650 Value field. It takes a value from 0x00 to 0xFF. The value of 0xFE 651 is reserved for vendor-specific extensions. The value of 0xFF is 652 reserved for future use as an extension field. Procedures for 653 extending this field by vendors are defined in Section 2.4. 
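As an illustration of the common header of Section 2.1 and the chunk header introduced above, the following sketch unpacks the 12-octet common header and the header of the first chunk, then applies the all-zeros Verification Tag rule. It is a minimal sketch and not part of the specification: the function names are hypothetical, checksum validation (Section 5.8) is left out, and the two chunk ID constants are taken from the Chunk ID table in Section 2.2.

```python
import struct

# Chunk IDs from the Chunk ID table in Section 2.2.
CHUNK_INIT = 0x01
CHUNK_SHUTDOWN_ACK = 0x08

# All integer fields are in network byte order ("!").
COMMON_HEADER = struct.Struct("!HHII")  # src port, dst port, tag, checksum
CHUNK_HEADER = struct.Struct("!BBH")    # chunk ID, chunk flags, chunk length


def parse_common_header(datagram: bytes):
    """Return (src_port, dst_port, verification_tag, checksum)."""
    if len(datagram) < COMMON_HEADER.size:
        raise ValueError("datagram shorter than the common header")
    return COMMON_HEADER.unpack_from(datagram, 0)


def first_chunk_header(datagram: bytes):
    """Return (chunk_id, flags, length) of the chunk after the header."""
    if len(datagram) < COMMON_HEADER.size + CHUNK_HEADER.size:
        raise ValueError("datagram carries no complete chunk header")
    return CHUNK_HEADER.unpack_from(datagram, COMMON_HEADER.size)


def zero_tag_acceptable(datagram: bytes) -> bool:
    """Apply the all-zeros Verification Tag rule of Section 2.1.

    Returns True when the tag is non-zero (matching it against the
    association is out of scope for this sketch) or when the first
    chunk is INIT or SHUTDOWN ACK. False means "drop the datagram".
    """
    _, _, tag, _ = parse_common_header(datagram)
    if tag != 0:
        return True
    chunk_id, _, _ = first_chunk_header(datagram)
    return chunk_id in (CHUNK_INIT, CHUNK_SHUTDOWN_ACK)
```

A receiver would call zero_tag_acceptable() before any per-association lookup, since a datagram with an all-zeros tag cannot be matched against a stored tag.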
655  The values of Chunk ID are defined as follows:

658     ID Value   Chunk Type
659     --------   ----------
660     00000000 - Payload Data (DATA)
661     00000001 - Initiation (INIT)
662     00000010 - Initiation Acknowledgment (INIT ACK)
663     00000011 - Selective Acknowledgment (SACK)
664     00000100 - Heartbeat Request (HEARTBEAT)
665     00000101 - Heartbeat Acknowledgment (HEARTBEAT ACK)
666     00000110 - Abort (ABORT)
667     00000111 - Shutdown (SHUTDOWN)
668     00001000 - Shutdown Acknowledgment (SHUTDOWN ACK)
669     00001001 - Operation Error (ERROR)
670     00001010 - State Cookie (COOKIE)
671     00001011 - Cookie Acknowledgment (COOKIE ACK)
672     00001100 - Reserved for Explicit Congestion Notification Echo (ECNE)
673     00001101 - Reserved for Congestion Window Reduced (CWR)
674     00001110 to 11111101 - reserved by IETF
675     11111110 - Vendor-specific Chunk Extensions
676     11111111 - IETF-defined Chunk Extensions

678  Note: The ECNE and CWR chunk types are reserved for future use of Explicit
679  Congestion Notification (ECN).

681  Chunk Flags: 8 bits

683     The usage of these bits depends on the chunk type as given by the
684     Chunk ID. Unless otherwise specified, they are set to zero on
685     transmit and are ignored on receipt.

687  Chunk Length: 16 bits (u_int)

689     This value represents the size of the chunk in octets including the
690     Chunk ID, Flags, Length, and Value fields. Therefore, if the Value
691     field is zero-length, the Length field will be set to 0x0004. The
692     Length field does not count any padding.

694  Chunk Value: variable length

696     The Chunk Value field contains the actual information to be
697     transferred in the chunk. The usage and format of this field are
698     dependent on the Chunk ID. The Chunk Value field MUST be aligned on
699     32-bit boundaries. If the length of the chunk does not align on
700     32-bit boundaries, it is padded at the end with all zero octets.

702  SCTP defined chunks are described in detail in Section 2.3.
The 703 guideline for vendor-specific chunk extensions is discussed in Section 704 2.4. And the guidelines for IETF-defined chunk extensions can be found 705 in Section 12.1 of this document. 707 2.2.1 Optional/Variable-length Parameter Format 709 The optional and variable-length parameters contained in a chunk 710 are defined in a Type-Length-Value format as shown below. 712 Stewart, et al [Page 14] 713 0 1 2 3 714 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 715 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 716 | Parameter Type | Parameter Length | 717 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 718 \ \ 719 / Parameter Value / 720 \ \ 721 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 723 Parameter Type: 16 bit u_int 725 The Type field is a 16 bit identifier of the type of parameter. It 726 takes a value of 0x0000 to 0xFFFF. 728 The value of 0xFFFE is reserved for vendor-specific extensions if 729 the specific chunk allows such extensions. The value of 0xFFFF is 730 reserved for IETF-defined extensions. Values other than those 731 defined in specific SCTP chunk description are reserved for use by 732 IETF. 734 Parameter Length: 16 bit u_int 736 The Length field contains the size of the parameter in octets, 737 including the Type, Length, and Value fields. Thus, a parameter 738 with a zero-length Value field would have a Length field of 739 0x0004. The Length does not include any padding octets. 741 Parameter Value: variable-length. 743 The Value is dependent on the value of the Type field. The value 744 field MUST be aligned on 32-bit boundaries. If the value field is 745 not aligned on 32-bit boundaries it is padded at the end with all 746 zero octets. The value field must be an integer number of octets. 748 The actual SCTP parameters are defined in the specific SCTP chunk 749 section. The guidelines for vendor-specific parameter extensions are 750 discussed in Section 2.2.2. 
The rules for IETF-defined parameter
751  extensions are defined in Section 12.2.

753  2.2.2 Vendor-Specific Extension Parameter Format

755  This is to allow vendors to support their own extended parameters not
756  defined by the IETF. It MUST NOT affect the operation of SCTP.

758  Endpoints not equipped to interpret the vendor-specific information
759  sent by a remote endpoint MUST ignore it (although it may be
760  reported). Endpoints that do not receive desired vendor-specific
761  information SHOULD make an attempt to operate without it, although
762  they may do so (and report they are doing so) in a degraded mode.

764  A summary of the Vendor-specific extension format is shown below. The
765  fields are transmitted from left to right.

768   0                   1                   2                   3
769   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
770  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
771  |    Parameter Type = 0xFFFE    |       Parameter Length        |
772  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
773  |                           Vendor-Id                           |
774  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
775  \                                                               \
776  /                        Parameter Value                        /
777  \                                                               \
778  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

780  Type: 16 bit u_int

782     0xFFFE for all Vendor-Specific parameters.

784  Length: 16 bit u_int

786     Indicates the size of the parameter in octets, including the
787     Type, Length, Vendor-Id, and Value fields.

789  Vendor-Id: 32 bit u_int

791     The high-order octet is 0 and the low-order 3 octets are the
792     SMI Network Management Private Enterprise Code of the Vendor
793     in network byte order, as defined in the Assigned Numbers (RFC
794     1700).

796  Value: variable length

798     The Value field is one or more octets. The actual format of the
799     information is site or application specific, and a robust
800     implementation SHOULD support the field as undistinguished
801     octets.
803     The codification of the range of allowed usage of this field is
804     outside the scope of this specification.

806     It SHOULD be encoded as a sequence of vendor type / vendor length
807     / value fields, as follows. The parameter field is
808     dependent on the vendor's definition of that attribute. An
809     example encoding of the Vendor-Specific attribute using this
810     method follows:

812   0                   1                   2                   3
813   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
814  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
815  |    Parameter Type = 0xFFFE    |       Parameter Length        |
816  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
817  |                           Vendor-Id                           |
818  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
819  |            VS-Type            |           VS-Length           |
820  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
821  /                           VS-Value                            /
822  \                                                               \
823  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

825  VS-Type: 16 bit u_int

827     This field identifies the parameter included in the VS-Value field.
828     It is assigned by the vendor.

830  VS-Length: 16 bit u_int

832     This field is the length of the vendor-specific parameter and
833     includes the VS-Type, VS-Length and VS-Value (if included) fields.

835  VS-Value: Variable Length

837     This field contains the parameter identified by the VS-Type field.
838     Its meaning is defined by the vendor.

840  2.3 SCTP Chunk Definitions

842  This section defines the format of the different SCTP chunk types.

844  2.3.1 Initiation (INIT) (00000001)

846  This chunk is used to initiate an SCTP association between two
847  endpoints.
The format of the INIT message is shown below: 849 0 1 2 3 850 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 851 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 852 |0 0 0 0 0 0 0 1| Chunk Flags | Chunk Length | 853 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 854 | Initiate Tag | 855 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 856 | Advertised Receiver Window Credit (a_rwnd) | 857 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 858 | Number of Outbound Streams | Number of Inbound Streams | 859 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 860 | Initial TSN | 861 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 862 \ \ 863 / Optional/Variable-Length Parameters / 864 \ \ 865 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 867 The INIT chunk contains the following parameters. Unless otherwise 868 noted, each parameter MUST only be included once in the INIT chunk. 870 Fixed Parameters Status 871 ---------------------------------------------- 872 Initiate Tag Mandatory 873 Advertised Receiver Window Credit Mandatory 874 Number of Outbound Streams Mandatory 875 Number of Inbound Streams Mandatory 876 Initial TSN Mandatory 878 Stewart, et al [Page 17] 879 Variable Parameters Status Type Value 880 ------------------------------------------------------------- 881 IPv4 Address (Note 1) Optional 0x0005 882 IPv6 Address (Note 1) Optional 0x0006 883 Cookie Preservative Optional 0x0009 884 Reserved for ECN Capable Optional 0x000a 886 Note 1: The INIT chunks may contain multiple addresses that may be 887 IPv4 and/or IPv6 in any combination. 889 Note 2: The ECN capable field is reserved for future use of Explicit 890 Congestion Notification. 892 Chunk Flags field in INIT is reserved, and all bits in it should be 893 set to 0 by the sender and ignored by the receiver. 
The sequence of
894  parameters within an INIT may be processed in any order.

896  Initiate Tag: 32 bit u_int

898     The receiver of the INIT (the responding end) records the value of
899     the Initiate Tag parameter. This value MUST be placed into the
900     Verification Tag field of every SCTP datagram that the responding
901     end transmits within this association.

903     The valid range for Initiate Tag is from 0x1 to 0xffffffff. See
904     Section 4.3.1 for more on the selection of the tag value.

906     If the value of the Initiate Tag in a received INIT chunk is found
907     to be 0x0, the receiver MUST treat it as an error and silently
908     discard the datagram.

910  Advertised Receiver Window Credit (a_rwnd): 32 bit u_int

912     This value represents the dedicated buffer space, in number of
913     octets, the sender of the INIT has placed in association with this
914     window. During the life of the association this buffer space SHOULD
915     NOT be lessened (i.e., dedicated buffers SHOULD NOT be taken away
916     from this association).

918  Number of Outbound Streams (OS): 16 bit u_int

920     Defines the number of outbound streams the sender of this INIT chunk
921     wishes to create in this association. The value of 0 MUST NOT be
922     used.

924  Number of Inbound Streams (MIS): 16 bit u_int

926     Defines the maximum number of streams the sender of this INIT chunk
927     allows the peer end to create in this association. The value 0 MUST
928     NOT be used.

930  Initial TSN (I-TSN): 32 bit u_int

932     Defines the initial TSN that the sender will use. The valid range is
933     from 0x0 to 0xffffffff. This field MAY be set to the value of the
934     Initiate Tag field.

937  Vendor-specific parameters are allowed in INIT. However, they MUST be
938  appended to the end of the above INIT chunk. The format of the
939  vendor-specific parameters MUST follow the Type-Length-Value format as
940  defined in Section 2.2.2.
In case an endpoint does not support the 941 vendor-specific chunks received, it MUST ignore them. 943 2.3.1.1 Optional/Variable Length Parameters in INIT 945 The following parameters follow the Type-Length-Value format as 946 defined in Section 2.2.1. The IP address fields MUST come after 947 the fixed-length fields defined in the previous Section. 949 Any extensions MUST come after the IP address fields. 951 IPv4 Address Parameter 953 0 1 2 3 954 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 955 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 956 |0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1|0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0| 957 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 958 | IPv4 Address | 959 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 961 IPv4 Address: 32 bit 963 Contains an IPv4 address of the sending endpoint. It is binary 964 encoded. 966 IPv6 Address: 968 0 1 2 3 969 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 970 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 971 |0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0|0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0| 972 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 973 | | 974 | IPv6 Address | 975 | | 976 | | 977 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 979 IPv6 Address: 128 bit 981 Contains an IPv6 address of the sending endpoint. It is binary 982 encoded. 984 Combining with the Source Port Number in the SCTP common header, the 985 value passed in an IPv4 or IPv6 Address parameter indicates a 986 transport address the sender of the INIT will support for the 987 association being initiated. That is, during the lifetime of this 988 association, this IP address may appear in the source address field 990 Stewart, et al [Page 19] 991 of a datagram sent from the sender of the INIT, and may be used as a 992 destination address of a datagram sent from the receiver of the 993 INIT. 
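The IPv4 and IPv6 Address parameters above are ordinary Type-Length-Value parameters, so their encoding can be sketched with one generic helper. This is an illustrative sketch only: the names encode_tlv and encode_address are hypothetical, and the type values 0x0005 and 0x0006 are those listed in the INIT variable parameter table.

```python
import ipaddress
import struct

PARAM_IPV4 = 0x0005  # type values from the INIT parameter table
PARAM_IPV6 = 0x0006


def encode_tlv(param_type: int, value: bytes) -> bytes:
    """Encode one Type-Length-Value parameter (Section 2.2.1).

    The Length covers Type, Length, and Value but not the padding;
    the encoded parameter is zero-padded to a 32-bit boundary.
    """
    length = 4 + len(value)
    padding = (-length) % 4
    return struct.pack("!HH", param_type, length) + value + bytes(padding)


def encode_address(address: str) -> bytes:
    """Encode an IPv4 or IPv6 Address parameter for an INIT chunk."""
    ip = ipaddress.ip_address(address)
    param_type = PARAM_IPV4 if ip.version == 4 else PARAM_IPV6
    # ip.packed is the binary address in network byte order.
    return encode_tlv(param_type, ip.packed)
```

For a multi-homed sender, the variable part of the INIT would simply be the concatenation of encode_address() results, one per address. The resulting Length values (8 for IPv4, 20 for IPv6) match the bit patterns shown in the two parameter diagrams.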
995   More than one IP Address parameter can be included in an INIT
996   chunk when the INIT sender is multi-homed. Moreover, a multi-homed
997   endpoint may have access to different types of network, thus more
998   than one address type may be present in one INIT chunk, i.e., IPv4
999   and IPv6 transport addresses are allowed in the same INIT message.

1001  If the INIT contains at least one IP Address parameter, then only the
1002  transport address(es) provided within the INIT may be used as
1003  destinations by the responding end. If the INIT does not contain any
1004  IP Address parameters, the responding end MUST use the source
1005  address associated with the received SCTP datagram as its sole
1006  destination address for the association.

1008  Cookie Preservative

1010  The sender of the INIT shall use this parameter to request from the
1011  receiver of the INIT a longer life-span for the State Cookie.

1013   0                   1                   2                   3
1014   0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
1015  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1016  |0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1|0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0|
1017  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
1018  |         Suggested Cookie Life-span Increment (msec.)          |
1019  +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

1021  Suggested Cookie Life-span Increment: 32 bit u_int

1023     This parameter indicates to the receiver the increment, in
1024     milliseconds, that the sender wishes the receiver to add to its
         default cookie life-span.

1026     This optional parameter should be added to the INIT message by the
1027     sender when it re-attempts establishing an association with a peer
1028     to which its previous attempt to establish the association failed
1029     due to a Stale COOKIE error. Note, the receiver MAY choose to ignore
1030     the suggested cookie life-span increase for its own security
1031     reasons.
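With the IP Address and Cookie Preservative parameters defined, the receiving side can be sketched as a walk over the variable-length part of an INIT. The sketch below is an assumption-laden illustration, not a normative procedure: iter_parameters and read_init_options are hypothetical names, the input is assumed to be the octets that follow the fixed Initial TSN field, and parameter types other than 0x0005, 0x0006, and 0x0009 are simply skipped here.

```python
import ipaddress
import struct


def iter_parameters(data: bytes):
    """Yield (type, value) for each TLV parameter in `data`.

    Length counts Type, Length, and Value; padding to the next
    32-bit boundary is not counted and is skipped between parameters.
    """
    offset = 0
    while offset < len(data):
        if len(data) - offset < 4:
            raise ValueError("truncated parameter header")
        param_type, length = struct.unpack_from("!HH", data, offset)
        if length < 4 or offset + length > len(data):
            raise ValueError("bad parameter length")
        yield param_type, data[offset + 4:offset + length]
        offset += length + (-length) % 4


def read_init_options(data: bytes):
    """Return (addresses, cookie_increment_ms) found in an INIT."""
    addresses = []
    cookie_increment_ms = None
    for param_type, value in iter_parameters(data):
        if param_type == 0x0005 and len(value) == 4:
            addresses.append(ipaddress.IPv4Address(value))
        elif param_type == 0x0006 and len(value) == 16:
            addresses.append(ipaddress.IPv6Address(value))
        elif param_type == 0x0009 and len(value) == 4:
            (cookie_increment_ms,) = struct.unpack("!I", value)
        # Any other type is outside the scope of this sketch.
    return addresses, cookie_increment_ms
```

Because the parameters may be processed in any order, the sketch collects results into independent variables rather than relying on position.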
1033 2.3.2 Initiation Acknowledgment (INIT ACK) (00000010): 1035 The INIT ACK chunk is used to acknowledge the initiation of an SCTP 1036 association. 1038 The parameter part of INIT ACK is formatted similarly to the INIT 1039 chunk. It uses two extra variable parameters: The State Cookie 1040 and the Unrecognized Parameter: 1042 The format of the INIT ACK message is shown below: 1044 Stewart, et al [Page 20] 1045 0 1 2 3 1046 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1047 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1048 |0 0 0 0 0 0 1 0| Chunk Flags | Chunk Length | 1049 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1050 | Initiate Tag | 1051 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1052 | Advertised Receiver Window Credit | 1053 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1054 | Number of Outbound Streams | Number of Inbound Streams | 1055 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1056 | Initial TSN | 1057 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1058 \ \ 1059 / Optional/Variable-Length Parameters / 1060 \ \ 1061 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1063 The INIT ACK contains the following parameters. Unless otherwise 1064 noted, each parameter MUST only be included once in the INIT ACK chunk. 
1066 Fixed Parameters Status 1067 ---------------------------------------------- 1068 Initiate Tag Mandatory 1069 Advertised Receiver Window Credit Mandatory 1070 Number of Outbound Streams Mandatory 1071 Number of Inbound Streams Mandatory 1072 Initial TSN Mandatory 1074 Variable Parameters Status Type Value 1075 ------------------------------------------------------------- 1076 State Cookie Mandatory 0x0007 1077 IPv4 Address (Note 1) Optional 0x0005 1078 IPv6 Address (Note 1) Optional 0x0006 1079 Unrecognized Parameters Optional 0x0008 1080 Reserved for ECN Capable Optional 0x000a 1082 Note 1: The INIT ACK chunks may contain any number of IP address 1083 parameters that may be IPv4 and/or IPv6 in any combination. 1085 Note 2: The ECN capable field is reserved for future use of Explicit 1086 Congestion Notification. 1088 Same as with INIT, in combination with the Source Port carried in the 1089 SCTP common header, each IP Address parameter in the INIT ACK indicates 1090 to the receiver of the INIT ACK a valid transport address supported by 1091 the sender of the INIT ACK for the lifetime of the association being 1092 initiated. 1094 If the INIT ACK contains at least one IP Address parameter, then only 1095 the transport address(es) explicitly indicated in the INIT ACK may be 1096 used as the destination(s) by the receiver of the INIT ACK. However, 1097 if the INIT ACK contains no IP Address parameter, the receiver of the 1098 INIT ACK MUST take the source IP address associated with this INIT ACK 1099 as its sole destination address for this association. 1101 Stewart, et al [Page 21] 1102 The State Cookie and Unrecognized Parameters use the Type-Length- 1103 Value format as defined in Section 2.2.1 and are described below. The 1104 other fields are defined the same as their counterparts in the INIT 1105 message. 
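The destination selection rule stated above for INIT ACK (and earlier for INIT) can be sketched as a small function. The name peer_transport_addresses and the tuple representation of a transport address are assumptions made for illustration only.

```python
from typing import List, Sequence, Tuple

# A transport address is an IP address combined with an SCTP port.
TransportAddress = Tuple[str, int]


def peer_transport_addresses(
    listed_ips: Sequence[str],
    datagram_source_ip: str,
    source_port: int,
) -> List[TransportAddress]:
    """Return the destinations usable for the peer in this association.

    listed_ips: IP Address parameters carried in the INIT or INIT ACK.
    datagram_source_ip: source IP address of the datagram carrying it.
    source_port: Source Port Number from the SCTP common header.
    """
    if listed_ips:
        # Only the explicitly listed addresses may be used.
        ips = list(listed_ips)
    else:
        # No address parameters: the datagram source is the sole choice.
        ips = [datagram_source_ip]
    return [(ip, source_port) for ip in ips]
```

Note that when address parameters are present, the datagram's own source address is used only if it also appears in the list.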
1107 2.3.2.1 Optional or Variable Length Parameters 1109 State Cookie: variable size, depending on Size of Cookie 1111 This field MUST contain all the necessary state and parameter 1112 information required for the sender of this INIT ACK to create the 1113 association, along with an Integrity Check Value (ICV). See 1114 Section 4.1.3 for details on Cookie definition. The Cookie MUST be 1115 padded with '0' to the next 32-bit word boundary. The internal 1116 format of the Cookie is implementation-specific. 1118 Unrecognized Parameters: Variable Size. 1120 This parameter is returned to the originator of the INIT message if 1121 the receiver does not recognize one or more Optional TLV parameters 1122 in the INIT chunk. This parameter field will contain the 1123 unrecognized parameters copied from the INIT message complete 1124 with TLV. 1126 Vendor-Specific parameters are allowed in INIT ACK. However, they 1127 MUST be defined using the format described in Section 2.2.2, and be 1128 appended to the end of the above INIT ACK chunk. In case the receiver 1129 of the INIT ACK does not support the vendor-specific parameters 1130 received, it MUST ignore those fields. 1132 2.3.3 Selective Acknowledgment (SACK) (00000011): 1134 This chunk is sent to the remote endpoint to acknowledge received DATA 1135 chunks and to inform the remote endpoint of gaps in the received 1136 subsequences of DATA chunks as represented by their TSNs. 1138 The SACK MUST contain the Cumulative TSN ACK and Advertised Receiver 1139 Window Credit (a_rwnd) parameters. By definition, the value of the 1140 Cumulative TSN ACK parameter is the last TSN received at the time the 1141 Selective ACK is sent, before a break in the sequence of received TSNs 1142 occurs; the next TSN value following this one has not yet been 1143 received at the reporting end. This parameter therefore acknowledges 1144 receipt of all TSNs up to and including the value given. 
1146 The handling of the a_rwnd by the receiver of the SACK is discussed in 1147 detail in Section 5.2.1. 1149 The Selective ACK also contains zero or more fragment reports. Each 1150 fragment report acknowledges a subsequence of TSNs received following 1151 a break in the sequence of received TSNs. By definition, all TSNs 1152 acknowledged by fragment reports are higher than the value of the 1153 Cumulative TSN ACK. 1155 Stewart, et al [Page 22] 1156 0 1 2 3 1157 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1158 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1159 |0 0 0 0 0 0 1 1|Chunk Flags | Chunk Length | 1160 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1161 | Cumulative TSN ACK | 1162 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1163 | Advertised Receiver Window Credit (a_rwnd) | 1164 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1165 | Number of Fragments = N | Number of Duplicate TSNs = X | 1166 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1167 | Fragment #1 Start | Fragment #1 End | 1168 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1169 / / 1170 \ ... \ 1171 / / 1172 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1173 | Fragment #N Start | Fragment #N End | 1174 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1175 | Duplicate TSN 1 | 1176 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1177 / / 1178 \ ... \ 1179 / / 1180 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1181 | Duplicate TSN X | 1182 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1184 Chunk Flags: 1186 Set to all zeros on transmit and ignored on receipt. 1188 Cumulative TSN ACK: 32 bit u_int 1190 This parameter contains the TSN of the last DATA chunk received in 1191 sequence before a gap. 
1193 Advertised Receiver Window Credit (a_rwnd): 32 bit u_int 1195 This field indicates the updated receive buffer space in octets of 1196 the sender of this SACK, see Section 5.2.1 for details. 1198 Number of Fragments: 16 bit u_int 1200 Indicates the number of TSN fragments included in this Selective 1201 ACK. 1203 Number of Duplicate TSNs: 16 bit 1205 This field contains the number of duplicate TSNs the endpoint 1206 has received. Each duplicate TSN is listed following the fragment 1207 list. 1209 Fragments: 1211 These fields contain the ack fragments. They are repeated for each 1212 fragment up to the number of fragments defined in the Number of 1213 Fragments field. All DATA chunks with TSNs between the (Cumulative 1214 TSN ACK + Fragment Start) and (Cumulative TSN ACK + Fragment End) of 1215 each fragment are assumed to have been received correctly. 1217 Stewart, et al [Page 23] 1218 Fragment Start: 16 bit u_int 1220 Indicates the Start offset TSN for this fragment. To calculate the 1221 actual TSN number the Cumulative TSN ACK is added to this 1222 offset number to yield the TSN. This calculated TSN identifies 1223 the first TSN in this fragment that has been received. 1225 Fragment End: 16 bit u_int 1227 Indicates the End offset TSN for this fragment. To calculate the 1228 actual TSN number the Cumulative TSN ACK is added to this 1229 offset number to yield the TSN. This calculated TSN identifies 1230 the TSN of the last DATA chunk received in this fragment. 1232 Duplicate TSN: 32 bit u_int 1234 Indicates a TSN that was received in duplicate. 
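The offset arithmetic for the Cumulative TSN ACK and the fragment reports can be sketched as follows. This is a minimal sketch under stated assumptions: the function names are hypothetical, TSN wrap-around at 2^32 is ignored, and offsets are assumed to fit in the 16-bit Fragment Start and Fragment End fields.

```python
from typing import Iterable, List, Tuple

Fragment = Tuple[int, int]  # (start offset, end offset), both inclusive


def sack_fields(received: Iterable[int],
                last_cumulative: int) -> Tuple[int, List[Fragment]]:
    """Compute (Cumulative TSN ACK, fragment reports) for a SACK.

    received: TSNs that have arrived and are not yet covered by
        last_cumulative.
    last_cumulative: the highest TSN already received in sequence.
    """
    pending = set(received)
    cumulative = last_cumulative
    # Advance over every TSN that continues the unbroken sequence.
    while cumulative + 1 in pending:
        cumulative += 1
    fragments: List[Fragment] = []
    for tsn in sorted(t for t in pending if t > cumulative):
        offset = tsn - cumulative
        if fragments and fragments[-1][1] == offset - 1:
            # Extends the current run of consecutive TSNs.
            fragments[-1] = (fragments[-1][0], offset)
        else:
            fragments.append((offset, offset))
    return cumulative, fragments


def acked_tsns(cumulative: int, fragments: List[Fragment]) -> List[int]:
    """Expand fragment offsets back into the TSNs they acknowledge."""
    return [cumulative + off
            for start, end in fragments
            for off in range(start, end + 1)]
```

Calling sack_fields({10, 11, 12, 14, 15, 17}, 9) uses the same TSNs as the worked example in this section; acked_tsns() performs the reverse calculation that the receiver of the SACK would carry out.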
1236 For example, assume the receiver has the following datagrams newly 1237 arrived at the time when it decides to send a Selective ACK, 1239 ---------- 1240 | TSN=17 | 1241 ---------- 1242 | | <- still missing 1243 ---------- 1244 | TSN=15 | 1245 ---------- 1246 | TSN=14 | 1247 ---------- 1248 | | <- still missing 1249 ---------- 1250 | TSN=12 | 1251 ---------- 1252 | TSN=11 | 1253 ---------- 1254 | TSN=10 | 1255 ---------- 1257 then, the parameter part of the Selective ACK MUST be constructed as 1258 follows (assuming the new a_rwnd is set to 0x1234 by the sender): 1260 +---------------+--------------+ 1261 | Cumulative TSN ACK = 12 | 1262 ----------------+--------------- 1263 | a_rwnd = 0x1234 | 1264 ----------------+--------------- 1265 | num of frag=2 | (rev = 0) | 1266 ----------------+--------------- 1267 |frag #1 strt=2 |frag #1 end=3 | 1268 ----------------+--------------- 1269 |frag #2 strt=5 |frag #2 end=5 | 1270 -------------------------------- 1272 Stewart, et al [Page 24] 1273 2.3.4 Heartbeat Request (HEARTBEAT) (00000100): 1275 An endpoint should send this chunk to its peer endpoint of the current 1276 association to probe the reachability of a particular destination 1277 transport address defined in the present association. 1279 The parameter field contains the Heartbeat Information which is a 1280 variable length opaque data structure understood only by the sender. 1282 0 1 2 3 1283 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1284 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1285 |0 0 0 0 0 1 0 0| Chunk Flags | Heartbeat Length | 1286 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1287 \ \ 1288 / Heartbeat Information (Variable-Length) / 1289 \ \ 1290 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1292 Chunk Flags: 1294 Set to zero on transmit and ignored on receipt. 
1296 Heartbeat Length: 1298 Set to the size of the chunk in octets, including the chunk header 1299 and the Heartbeat Information field. 1301 Heartbeat Information: 1303 defined as a variable-length parameter using the format described in 1304 Section 2.2.1, i.e.: 1306 0 1 2 3 1307 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1308 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1309 | Heartbeat Info Type=1 | HB Info Length | 1310 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1311 / Sender-specific Heartbeat Info / 1312 \ \ 1313 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1315 The Sender-specific Heartbeat Info field should normally include 1316 information about the sender's current time when this HEARTBEAT 1317 message is sent and the destination transport address to which this 1318 HEARTBEAT is sent (see Section 7.3). 1320 Stewart, et al [Page 25] 1321 2.3.5 Heartbeat Acknowledgment (HEARTBEAT ACK) (00000101): 1323 An endpoint should send this chunk to its peer endpoint as a response 1324 to a Heartbeat Request (see Section 7.3). 1326 The parameter field contains a variable length opaque data structure. 1328 0 1 2 3 1329 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1330 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1331 |0 0 0 0 0 1 0 1| Chunk Flags | Heartbeat Ack Length | 1332 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1333 \ \ 1334 / Heartbeat Information (Variable-Length) / 1335 \ \ 1336 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1338 Chunk Flags: 1340 Set to zero on transmit and ignored on receipt. 1342 Heartbeat Ack Length: 1344 Set to the size of the chunk in octets, including the chunk header 1345 and the Heartbeat Information field. 
1347 Heartbeat Information: 1349 The values of this field SHALL be copied from the Heartbeat 1350 Information field found in the Heartbeat Request to which this 1351 Heartbeat Acknowledgment is responding. 1353 2.3.6 Abort Association (ABORT) (00000110): 1355 The ABORT chunk is sent to the peer of an association to terminate the 1356 association. The ABORT chunk may contain cause parameters to inform 1357 the receiver of the reason for the abort. DATA chunks MUST NOT be bundled 1358 with ABORT. Control chunks MAY be bundled with an ABORT but they MUST 1359 be placed before the ABORT in the SCTP datagram, or they will be 1360 ignored by the receiver. 1362 If an endpoint receives an ABORT with a format error or for an 1363 association that doesn't exist, it MUST silently discard it. 1364 Moreover, an endpoint that receives an ABORT MUST NOT, under any 1365 circumstances, respond to that ABORT by sending an ABORT of its own. 1367 0 1 2 3 1368 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1369 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1370 |0 0 0 0 0 1 1 0| Chunk Flags | Length | 1371 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1372 \ \ 1373 / zero or more Error Causes / 1374 \ \ 1375 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1378 Chunk Flags: 1380 Set to zero on transmit and ignored on receipt. 1382 Length: 1384 Set to the size of the chunk in octets, including the chunk header 1385 and all the Error Cause fields present. 1387 See Section 2.3.9 for Error Cause definitions. 1389 Note: Special rules apply to the Verification Tag field of SCTP 1390 datagrams which carry an ABORT; see Section 7.5 for details. 1392 2.3.7 SHUTDOWN (00000111): 1394 An endpoint in an association MUST use this chunk to initiate a 1395 graceful termination of the association with its peer. This chunk has 1396 the following format.
1398 0 1 2 3 1399 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1400 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1401 |0 0 0 0 0 1 1 1|Chunk Flags |0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0| 1402 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1403 | Cumulative TSN ACK | 1404 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1406 Chunk Flags: 1408 Set to zero on transmit and ignored on receipt. 1410 Cumulative TSN ACK: 32 bit u_int 1412 This parameter contains the TSN of the last chunk received in 1413 sequence before any gaps. 1415 Stewart, et al [Page 27] 1416 2.3.8 Shutdown Acknowledgment (SHUTDOWN ACK) (00001000): 1418 This chunk MUST be used to acknowledge the receipt of the SHUTDOWN 1419 chunk at the completion of the shutdown process, see Section 8.2 for 1420 details. 1422 The SHUTDOWN ACK chunk has no parameters. 1424 0 1 2 3 1425 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1426 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1427 |0 0 0 0 1 0 0 0|Chunk Flags |0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0| 1428 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1430 Chunk Flags: 1432 Set to zero on transmit and ignored on receipt. 1434 Note: if the endpoint that receives the SHUTDOWN message does not have 1435 a TCB or tag for the sender of the SHUTDOWN, the receiver MUST still 1436 respond. In such cases, the receiver MUST send back a stand-alone 1437 SHUTDOWN ACK chunk in an SCTP datagram with the Verification Tag field 1438 of the common header filled with all '0's. 1440 2.3.9 Operation Error (ERROR) (00001001): 1442 This chunk is sent to the other endpoint in the association to notify 1443 certain error conditions. It contains one or more error causes. 
It has 1444 the following parameters: 1446 0 1 2 3 1447 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1448 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1449 |0 0 0 0 1 0 0 1| Chunk Flags | Length | 1450 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1451 \ \ 1452 / one or more Error Causes / 1453 \ \ 1454 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1456 Chunk Flags: 1458 Set to zero on transmit and ignored on receipt. 1460 Length: 1462 Set to the size of the chunk in octets, including the chunk header 1463 and all the Error Cause fields present. 1465 Error causes are defined as variable-length parameters using the 1466 format described in 2.2.1, i.e.: 1468 Stewart, et al [Page 28] 1469 0 1 2 3 1470 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1471 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1472 | Cause Code | Cause Length | 1473 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1474 / Cause-specific Information / 1475 \ \ 1476 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1478 Cause Code: 16 bit u_int 1480 Defines the type of error conditions being reported. 1482 Cause Length: 16 bit u_int 1484 Set to the size of the parameter in octets, including the Cause Code, 1485 Cause Length, and Cause-Specific Information fields 1487 Cause-specific Information: variable length 1489 This field carries the details of the error condition. 1491 Currently SCTP defines the following error causes: 1493 Cause of error 1494 --------------- 1495 Invalid Stream Identifier: indicating receiving a DATA sent to a 1496 nonexistent stream. 
1498 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1499 | Cause Code=1 | Cause Length=8 | 1500 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1501 | Stream Identifier | (Reserved) | 1502 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1504 Cause of error 1505 --------------- 1506 Missing Mandatory Parameter: indicating that one or more mandatory 1507 TLV parameters are missing in a received INIT or INIT ACK. 1509 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1510 | Cause Code=2 | Cause Length=8+N*2 | 1511 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1512 | Number of missing params=N | 1513 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1514 | Missing Param Type #1 | Missing Param Type #2 | 1515 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1516 | Missing Param Type #N-1 | Missing Param Type #N | 1517 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1519 Each missing mandatory parameter type should be specified. 1522 Cause of error 1523 -------------- 1524 Stale Cookie Error: indicating the receipt of a valid cookie 1525 that has expired. 1527 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1528 | Cause Code=3 | Cause Length=8 | 1529 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1530 | Measure of Staleness (usec.) | 1531 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1533 The sender of this error cause MAY choose to report how long past 1534 expiration the cookie is, by putting in the Measure of Staleness 1535 field the difference, in microseconds, between the current time and 1536 the time the cookie expired. If the sender does not wish to provide 1537 this information it should set the Measure of Staleness field to 0. 1539 Cause of error 1540 --------------- 1541 Out of Resource: indicating that the sender is out of resources.
This 1542 is usually sent in combination with or within an ABORT. 1544 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1545 | Cause Code=4 | Cause Length=4 | 1546 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1548 Guidelines for IETF-defined Error Cause extensions are discussed in 1549 Section 12.3 of this document. 1551 2.3.10 State Cookie (COOKIE) (00001010): 1553 This chunk is used only during the initialization of an association. 1554 It is sent by the initiator of an association to its peer to complete 1555 the initialization process. This chunk MUST precede any chunk 1556 sent within the association, but MAY be bundled with one or more DATA 1557 chunks in the same datagram. 1559 0 1 2 3 1560 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1561 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1562 |0 0 0 0 1 0 1 0|Chunk Flags | Length | 1563 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1564 | Cookie | 1565 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1567 Chunk Flags: 8 bit 1569 Set to zero on transmit and ignored on receipt. 1571 Length: 16 bit u_int 1573 Set to the size of the chunk in octets, including the 4 octets of 1574 the chunk header and the size of the Cookie. 1576 Stewart, et al [Page 30] 1577 Cookie: variable size 1579 This field must contain the exact cookie received in a previous INIT 1580 ACK. 1582 2.3.11 Cookie Acknowledgment (COOKIE ACK) (00001011): 1584 This chunk is used only during the initialization of an association. 1585 It is used to acknowledge the receipt of a COOKIE chunk. This chunk 1586 MUST precede any chunk sent within the association, but MAY be 1587 bundled with one or more DATA chunks in the same SCTP datagram. 
1589 0 1 2 3 1590 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1591 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1592 |0 0 0 0 1 0 1 1|Chunk Flags |0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0| 1593 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1595 Chunk Flags: 1597 Set to zero on transmit and ignored on receipt. 1599 2.3.12 Payload Data (DATA) (00000000): 1601 The following format MUST be used for the DATA chunk: 1603 0 1 2 3 1604 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1605 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1606 |0 0 0 0 0 0 0 0| Reserved|U|B|E| Length | 1607 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1608 | TSN | 1609 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1610 | Stream Identifier S | Stream Sequence Number n | 1611 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1612 | Payload Protocol Identifier | 1613 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1614 \ \ 1615 / User Data (seq n of Stream S) / 1616 \ \ 1617 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1619 Reserved: 5 bits 1620 should be set to all '0's and ignored by the receiver. 1622 U bit: 1 bit 1623 The (U)nordered bit, if set, indicates that this is an unordered 1624 data chunk, and there is NO Stream Sequence Number assigned to this 1625 DATA chunk. Therefore, the receiver MUST ignore the Stream Sequence 1626 Number field. 1628 Stewart, et al [Page 31] 1629 After re-assembly (if necessary), unordered data chunks MUST be 1630 dispatched to the upper layer by the receiver without any attempt of 1631 re-ordering. 1633 Note, if an unordered user message is segmented, each segment of the 1634 message MUST have its U bit set to 1. 1636 B bit: 1 bit 1638 The (B)eginning segment bit, if set, indicates the first segment of 1639 a user message. 
1641 E bit: 1 bit 1642 The (E)nding segment bit, if set, indicates the last segment of a 1643 user message. 1645 A non-segmented user message shall have both the B and E bits set 1646 to 1. Setting both B and E bits to 0 indicates a middle segment of a 1647 multi-segment user message, as summarized in the following table: 1649 B E Description 1650 ============================================================ 1651 | 1 0 | First piece of a segmented user message | 1652 +----------------------------------------------------------+ 1653 | 0 0 | Middle piece of a segmented user message | 1654 +----------------------------------------------------------+ 1655 | 0 1 | Last piece of a segmented user message | 1656 +----------------------------------------------------------+ 1657 | 1 1 | Un-segmented Message | 1658 ============================================================ 1660 Length: 16 bits (16 bit u_int) 1662 This field indicates the length of the DATA chunk in octets. It 1663 includes the Type field, the Reserved field, the U and B/E bits, the 1664 Length field, TSN, the Stream Identifier, the Stream Sequence 1665 Number, and the User Data fields. It does not include any padding. 1667 TSN: 32 bits (32 bit u_int) 1669 This value represents the TSN for this DATA chunk. The valid range 1670 of TSN is from 0x0 to 0xffffffff. 1672 Stream Identifier S: 16 bit u_int 1674 Identifies the stream to which the following user data belongs. 1676 Stream Sequence Number n: 16 bit u_int 1678 This value represents the stream sequence number of the following user 1679 data within the stream S. Valid range is 0x0 to 0xFFFF. 1681 Note, when a user message is segmented by SCTP for transport, the 1682 same stream sequence number MUST be carried in each of the segments of 1683 the message. 1686 Payload Protocol Identifier: 32 bits (32 bit u_int) 1688 This value represents an application (or upper layer) specified 1689 protocol identifier.
This value is passed to SCTP by its upper layer 1690 and sent to its peer. This identifier is not used by SCTP but may be 1691 used by certain network entities as well as the peer application to 1692 identify the type of information being carried in this DATA chunk. 1694 The value 0x0 indicates no application identifier is specified by 1695 the upper layer for this payload data. 1697 User Data: variable length 1699 This is the payload user data. The implementation MUST pad the end 1700 of the data to a 32 bit boundary with zero-valued octets. Any padding MUST 1701 NOT be included in the Length field. 1703 2.4 Vendor-Specific Chunk Extensions 1705 This Chunk type is available to allow vendors to support their own 1706 extended data formats not defined by the IETF. It MUST NOT affect the 1707 operation of SCTP. In particular, when adding a Vendor-Specific chunk 1708 type, the vendor-defined chunks MUST obey the congestion avoidance 1709 rules defined in this document if they carry user data. User data is 1710 defined as any data transported over the association that is delivered 1711 to the upper layer of the receiver. 1713 Endpoints not equipped to interpret the vendor-specific chunk sent by 1714 a remote endpoint MUST ignore it. Endpoints that do not receive 1715 desired vendor-specific information SHOULD make an attempt to operate 1716 without it, although they may do so (and report they are doing so) in 1717 a degraded mode. 1719 A summary of the Vendor-Specific Chunk format is shown below. The 1720 fields are transmitted from left to right.
1722 0 1 2 3 1723 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 1724 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1725 | Type | Flags | Length | 1726 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1727 | Vendor-Id | 1728 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1729 \ \ 1730 / Value / 1731 \ \ 1732 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ 1734 Type: 8 bit u_int 1736 0xFE for all Vendor-Specific chunks. 1739 Flags: 8 bit u_int 1741 Vendor-specific flags. 1743 Length: 16 bit u_int 1745 Size of this Vendor-Specific chunk in octets, including the Type, 1746 Flags, Length, Vendor-Id, and Value fields. 1748 Vendor-Id: 32 bit u_int 1750 The high-order octet is 0 and the low-order 3 octets are the SMI 1751 Network Management Private Enterprise Code of the Vendor in 1752 network byte order, as defined in Assigned Numbers (RFC 1700). 1754 Value: Variable length 1756 The Value field is one or more octets. The actual format of the 1757 information is site or application specific, and a robust 1758 implementation SHOULD support the field as undistinguished 1759 octets. 1761 The codification of the range of allowed usage of this field is 1762 outside the scope of this specification. 1764 3. SCTP Association State Diagram 1766 During the lifetime of an SCTP association, the SCTP endpoints 1767 progress from one state to another in response to various events. The 1768 events that may potentially advance an endpoint's state include: 1770 o SCTP user primitive calls, e.g., [ASSOCIATE], [TERMINATE], [ABORT], 1772 o reception of control chunks such as INIT, COOKIE, ABORT, and 1773 SHUTDOWN, or 1775 o some timeout events. 1777 The state diagram in the figures below illustrates state changes, 1778 together with the causing events and resulting actions. Note that some 1779 of the error conditions are not shown in the state diagram.
Full 1780 description of all special cases should be found in the text. 1782 Note, chunk names are given in all capital letters, while parameter 1783 names have the first letter capitalized, e.g., COOKIE chunk type vs. 1784 Cookie parameter. 1786 Stewart, et al [Page 34] 1787 ----- -------- (frm any state) 1788 / \ / rcv ABORT [ABORT] 1789 rcv INIT | | | ---------- or ---------- 1790 --------------- | v v delete TCB snd ABORT 1791 generate Cookie \ +---------+ delete TCB 1792 snd INIT.ACK ---| CLOSED | 1793 +---------+ 1794 / \ [ASSOCIATE] 1795 / \ --------------- 1796 | | create TCB 1797 | | snd INIT 1798 | | strt init timer 1799 rcv valid COOKIE | v 1800 (1) ---------------- | +------------+ 1801 create TCB | | COOKIE_WAIT| (2) 1802 snd COOKIE.ACK | +------------+ 1803 | | 1804 | | rcv INIT.ACK 1805 | | ----------------- 1806 | | snd COOKIE 1807 | | stop init timer 1808 | | strt cookie timer 1809 | v 1810 | +------------+ 1811 | | COOKIE_SENT| (3) 1812 | +------------+ 1813 | | 1814 | | rcv COOKIE.ACK 1815 | | ----------------- 1816 | | stop cookie timer 1817 v v 1818 +---------------+ 1819 | ESTABLISHED | 1820 +---------------+ 1822 (from the ESTABLISHED state only) 1823 | 1824 | 1825 /--------+--------\ 1826 [TERMINATE] / \ 1827 ----------------- | | 1828 check outstanding | | 1829 data chunks | | 1830 v | 1831 +---------+ | 1832 |SHUTDOWN | | rcv SHUTDOWN 1833 |PENDING | | ---------------- 1834 +---------+ | x 1835 | | 1836 No more outstanding | | 1837 ------------------- | | 1838 snd SHUTDOWN | | 1839 strt shutdown timer | | 1840 v v 1842 Stewart, et al [Page 35] 1843 +---------+ +-----------+ 1844 (4) |SHUTDOWN | | SHUTDOWN | (5) 1845 |SENT | | RECEIVED | 1846 +---------+ +-----------+ 1847 | | 1848 rcv SHUTDOWN.ACK | | x 1849 ------------------- | |----------------- 1850 stop shutdown timer | | retransmit missing DATA 1851 delete TCB | | send SHUTDOWN.ACK 1852 | | delete TCB 1853 | | 1854 \ +---------+ / 1855 \-->| CLOSED |<--/ 1856 +---------+ 1858 Note: 
1860 (1) If the received COOKIE is invalid (i.e., failed to pass the 1861 authentication check), the receiver MUST silently discard the 1862 datagram. If instead the received COOKIE has expired (see Section 1863 4.1.5), the receiver SHALL send back an ERROR chunk. In either 1864 case, the receiver stays in the CLOSED state. 1866 (2) If the init timer expires, the endpoint SHALL retransmit INIT 1867 and re-start the init timer without changing state. This SHALL be 1868 repeated up to 'Max.Init.Retransmits' times. After that, the 1869 endpoint SHALL abort the initialization process and report the 1870 error to the SCTP user. 1872 (3) If the cookie timer expires, the endpoint SHALL retransmit 1873 COOKIE and re-start the cookie timer without changing 1874 state. This SHALL be repeated up to 'Max.Init.Retransmits' 1875 times. After that, the endpoint SHALL abort the initialization 1876 process and report the error to the SCTP user. 1878 (4) In SHUTDOWN-SENT state the endpoint SHALL acknowledge any received 1879 DATA chunks without delay. 1881 (5) In SHUTDOWN-RECEIVED state, the endpoint MUST NOT accept any new 1882 send request from its SCTP user. 1884 4. Association Initialization 1886 Before the first data transmission can take place from one SCTP 1887 endpoint ("A") to another SCTP endpoint ("Z"), the two endpoints must 1888 complete an initialization process in order to set up an SCTP 1889 association between them. 1891 The SCTP user at an endpoint should use the ASSOCIATE primitive to 1892 initialize an SCTP association to another SCTP endpoint. 1895 IMPLEMENTATION NOTE: From an SCTP-user's point of view, an 1896 association may be implicitly opened, without an ASSOCIATE primitive 1897 (see Section 9.1 B) being invoked, by the initiating endpoint's sending of 1898 the first user data to the destination endpoint. The initiating SCTP 1899 will assume default values for all mandatory and optional parameters 1900 for the INIT/INIT ACK.
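As an illustration of the state diagram in Section 3, the transitions can be written as a lookup table. The following Python sketch is not part of the specification: the names `State`, `TRANSITIONS`, and `next_state` are assumptions introduced here, the event strings simply mirror the diagram labels, and the actions and the error cases of notes (1) through (5) are omitted. The diagram leaves the event that takes SHUTDOWN-RECEIVED to CLOSED unnamed ("x"); the label "shutdown complete" below is a placeholder.

```python
from enum import Enum


class State(Enum):
    CLOSED = "CLOSED"
    COOKIE_WAIT = "COOKIE_WAIT"
    COOKIE_SENT = "COOKIE_SENT"
    ESTABLISHED = "ESTABLISHED"
    SHUTDOWN_PENDING = "SHUTDOWN_PENDING"
    SHUTDOWN_SENT = "SHUTDOWN_SENT"
    SHUTDOWN_RECEIVED = "SHUTDOWN_RECEIVED"


# (current state, event) -> next state, following the diagram labels.
TRANSITIONS = {
    # Responder answers INIT with INIT ACK + Cookie and stays CLOSED.
    (State.CLOSED, "rcv INIT"): State.CLOSED,
    (State.CLOSED, "[ASSOCIATE]"): State.COOKIE_WAIT,
    (State.CLOSED, "rcv valid COOKIE"): State.ESTABLISHED,
    (State.COOKIE_WAIT, "rcv INIT ACK"): State.COOKIE_SENT,
    (State.COOKIE_SENT, "rcv COOKIE ACK"): State.ESTABLISHED,
    (State.ESTABLISHED, "[TERMINATE]"): State.SHUTDOWN_PENDING,
    (State.ESTABLISHED, "rcv SHUTDOWN"): State.SHUTDOWN_RECEIVED,
    (State.SHUTDOWN_PENDING, "no more outstanding"): State.SHUTDOWN_SENT,
    (State.SHUTDOWN_SENT, "rcv SHUTDOWN ACK"): State.CLOSED,
    # Placeholder event name; the diagram marks this event only as "x".
    (State.SHUTDOWN_RECEIVED, "shutdown complete"): State.CLOSED,
}


def next_state(state, event):
    """Return the state reached from `state` on `event`."""
    # An ABORT, received or requested by the user, closes from any state.
    if event in ("rcv ABORT", "[ABORT]"):
        return State.CLOSED
    # Events not listed for a state leave the state unchanged in this sketch.
    return TRANSITIONS.get((state, event), state)
```

A table keeps the sketch close to the diagram: each arrow becomes one entry, so a missing or extra transition is easy to spot when comparing the two.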
1902 Once the association is established, unidirectional streams will be 1903 open for data transfer on both ends (see Section 4.1.1). 1905 4.1 Normal Establishment of an Association 1907 The initialization process consists of the following steps (assuming 1908 that SCTP endpoint "A" tries to set up an association with SCTP 1909 endpoint "Z" and "Z" accepts the new association): 1911 A) "A" shall first send an INIT message to "Z". In the INIT, "A" must 1912 provide its security tag "Tag_A" in the Initiate Tag field. Tag_A 1913 SHOULD be a random number in the range of 0x1 to 0xffffffff (see 1914 4.3.1 for Tag value selection). After sending the INIT, "A" starts 1915 the T1-init timer and enters the COOKIE-WAIT state. 1917 B) "Z" shall respond immediately with an INIT ACK message. In the 1918 message, besides filling in other parameters, "Z" must set the 1919 Verification Tag field to Tag_A, and also provide its own security 1920 tag "Tag_Z" in the Initiate Tag field. 1922 Moreover, "Z" MUST generate and send along with the INIT ACK a 1923 State Cookie. See Section 4.1.3 for State Cookie generation. 1925 Note: after sending out INIT ACK with the cookie, "Z" MUST NOT 1926 allocate any resources, nor keep any state for the new 1927 association. Otherwise, "Z" will be vulnerable to resource attacks. 1929 C) Upon reception of the INIT ACK from "Z", "A" shall stop the T1-init 1930 timer and leave COOKIE-WAIT state. "A" shall then send the cookie 1931 received in the INIT ACK message in a COOKIE chunk, restart the 1932 T1-init timer, and enter the COOKIE-SENT state. 1934 Note, the COOKIE chunk can be bundled with any pending outbound 1935 DATA chunks, but it MUST be the first chunk in the datagram and, 1936 until the COOKIE ACK is returned, the sender MUST NOT send any 1937 other datagrams to the peer. 1939 D) Upon reception of the COOKIE chunk, endpoint "Z" will reply with 1940 a COOKIE ACK chunk after building a TCB and moving to 1941 the ESTABLISHED state.
A COOKIE ACK chunk may be combined with 1942 any pending DATA chunks (and/or SACK chunks), but the COOKIE ACK 1943 chunk MUST be the first chunk in the datagram. 1945 IMPLEMENTATION NOTE: an implementation may choose to send the 1946 Communication Up notification to the SCTP user upon reception 1947 of a valid COOKIE. 1950 E) Upon reception of the COOKIE ACK, endpoint "A" will move from the 1951 COOKIE-SENT state to the ESTABLISHED state, stopping the T1-init 1952 timer, and it may also notify its ULP about the successful 1953 establishment of the association with a Communication Up notification 1954 (see Section 9). 1956 Note: A DATA chunk MUST NOT be carried in the INIT or INIT ACK message. 1958 Note: The T1-init timer shall follow the same rules given in Section 5.3. 1960 Note: if an endpoint receives an INIT, INIT ACK, or COOKIE chunk but 1961 decides not to establish the new association due to missing mandatory 1962 parameters in the received INIT or INIT ACK, invalid parameter values, 1963 or lack of local resources, it SHALL respond with an ABORT chunk. It 1964 SHOULD also specify the cause of abort, such as the types of the 1965 missing mandatory parameters, by either including cause 1966 parameters or bundling with the ABORT one or more Operational ERROR 1967 chunks. The Verification Tag field in the common header of the 1968 outbound abort datagram MUST be set to equal the Initiate Tag value of 1969 the peer. 1971 Note: After the reception of the first DATA chunk in an association 1972 the receiver MUST immediately respond with a SACK to acknowledge 1973 the DATA chunk. Subsequent acknowledgments should be done as 1974 described in Section 5.2. 1976 Note: When an SCTP endpoint sends an INIT or INIT ACK it SHOULD 1977 include all of its transport addresses in the parameter section. This 1978 is because it may not be possible to control the "sending" address 1979 that a receiver of an SCTP datagram sees.
A receiver thus MUST know 1980 every address that may be a source address for a peer SCTP endpoint; 1981 this ensures that the inbound SCTP datagram can be matched to the 1982 proper association. 1984 Note: At the time when the TCB is created, each end MUST set its 1985 internal cumulative TSN acknowledgment point to its peer's Initial TSN 1986 minus one. 1988 IMPLEMENTATION NOTE: The IP address and SCTP port(s) are generally 1989 used as the key to find the TCB within an SCTP instance. 1991 4.1.1 Handle Stream Parameters 1993 In the INIT and INIT ACK messages, the sender of the message shall 1994 indicate the number of outbound streams (OS) it wishes to have in the 1995 association, as well as the maximum number of inbound streams (MIS) it will 1996 accept from the other endpoint. 1998 After receiving this stream configuration information from the other 1999 side, each endpoint shall perform the following check: if the peer's 2000 MIS is less than the endpoint's OS, meaning that the peer is incapable 2001 of supporting all the outbound streams the endpoint wants to 2002 configure, the endpoint MUST either settle for MIS outbound streams, 2003 or abort the association and report to its upper layer the resource 2004 shortage at its peer. 2007 After the association is initialized, the valid outbound stream 2008 identifier range for either endpoint shall be 0 to 2009 min(local OS, remote MIS)-1. 2011 4.1.2 Handle Address Parameters 2013 During the association initialization, an endpoint shall use the 2014 following rules to discover and collect the destination transport 2015 address(es) of its peer. 2017 On reception of an INIT or INIT ACK message, the receiver shall record 2018 any transport addresses. The transport address(es) are derived by 2019 combining the SCTP source port (from the common header) with the IP 2020 address parameter(s) carried in the INIT or INIT ACK message.
The 2021 receiver should use only these transport addresses as destination 2022 transport addresses when sending subsequent datagrams to its peer. If 2023 no IP address parameters are specified in the INIT or INIT ACK 2024 message, then the source IP address from which the message arrives 2025 should be combined with the SCTP source port number and be considered as 2026 the only destination transport address to use. 2028 An initial primary destination transport address shall be selected 2029 for each endpoint, using the following rules: 2031 For the initiator: any valid transport address obtained from the 2032 INIT ACK message. If no transport address is specified in the INIT 2033 ACK message, use the source transport address from which the INIT 2034 ACK message arrived. 2036 For the responder: any valid transport address obtained from the 2037 INIT message. If no transport address is specified in the INIT 2038 message, use the source transport address from which the INIT 2039 message arrived. 2041 4.1.3 Generating State Cookie 2043 When sending an INIT ACK as a response to an INIT message, the sender 2044 of INIT ACK should create a State Cookie and send it as part of the 2045 INIT ACK. Inside this State Cookie, the sender should include an ICV 2046 security signature or MAC (Message Authentication Code) [4], a 2047 timestamp of when the cookie was created, and the lifespan of the cookie, 2048 along with all the information necessary for it to establish the 2049 association.
2051 The following steps SHOULD be taken to generate the cookie: 2053 1) create an association TCB using information from both the received 2054 INIT and the outgoing INIT ACK messages, 2056 2) in the TCB, set the creation time to the current time of day, and 2057 the lifespan to the protocol parameter 'Valid.Cookie.Life', 2059 3) generate a MAC signature using the TCB and a Private Key (see [4] for 2060 details on generating the MAC), and 2063 4) generate the State Cookie by combining the TCB and the 2064 resultant ICV signature. 2066 After sending the INIT ACK with the cookie, the sender SHOULD delete 2067 the TCB and any other local resource related to the new association, 2068 so as to prevent resource attacks. 2070 The ICV and hashing method used to generate the MAC are strictly a 2071 private matter for the receiver of the INIT message. The use of a MAC 2072 is mandatory to prevent denial-of-service attacks. The Private Key 2073 MUST be random per RFC 1750 [1]; it SHOULD be changed reasonably 2074 frequently, and the timestamp in the cookie MAY be used to determine 2075 which key should be used to verify the MAC. 2077 4.1.4 Cookie Processing 2079 When an endpoint receives an INIT ACK chunk in response to its INIT 2080 chunk, and the INIT ACK contains a State Cookie parameter, it 2081 MUST immediately send a COOKIE chunk to its peer with the received 2082 cookie. The sender MAY also add any pending DATA chunks to the 2083 message. 2085 The sender shall also start the cookie timer after sending out 2086 the COOKIE chunk. If the timer expires, the sender shall retransmit 2087 the COOKIE chunk and restart the cookie timer. This is repeated until 2088 either a COOKIE ACK is received or the endpoint is marked 2089 unreachable (and thus the association enters the CLOSED state).
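The cookie generation steps of Section 4.1.3 can be sketched as follows. This Python fragment is illustrative only and rests on several assumptions that the text does not fix: the cookie layout (an 8-octet creation time in milliseconds, a 4-octet lifespan in milliseconds, the opaque TCB data, then the MAC), the use of HMAC-SHA-256 as the MAC (the text leaves the ICV and hashing method to the implementation), the 5000 ms lifespan value, and the names `make_state_cookie`, `tcb_info`, and `VALID_COOKIE_LIFE_MS`.

```python
import hashlib
import hmac
import secrets
import struct
import time

# Assumed value for the 'Valid.Cookie.Life' protocol parameter (milliseconds).
VALID_COOKIE_LIFE_MS = 5000


def make_state_cookie(tcb_info, private_key, now=None,
                      lifespan_ms=VALID_COOKIE_LIFE_MS):
    """Build a State Cookie: timestamp + lifespan + TCB data + MAC.

    tcb_info stands for the serialized TCB built from the received INIT
    and the outgoing INIT ACK (step 1); its encoding is not defined here.
    """
    # Step 2: record the creation time and the lifespan with the TCB data.
    created_ms = int((time.time() if now is None else now) * 1000)
    body = struct.pack("!QI", created_ms, lifespan_ms) + tcb_info
    # Step 3: compute the MAC over the TCB data with the Private Key.
    mac = hmac.new(private_key, body, hashlib.sha256).digest()
    # Step 4: the cookie is the TCB data combined with the MAC.
    return body + mac


# The Private Key should be random; secrets draws from the OS CSPRNG.
private_key = secrets.token_bytes(32)
cookie = make_state_cookie(b"tags-and-addresses", private_key)
```

Because everything needed to rebuild the association travels inside the cookie, the sender of the INIT ACK can discard the temporary TCB immediately after the call returns.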
2091 4.1.5 Cookie Authentication 2093 When an endpoint receives a COOKIE chunk from another endpoint with 2094 which it has no association, it shall take the following actions: 2096 1) compute a MAC signature using the TCB data carried in the cookie 2097 and the Private Key (note the timestamp in the cookie MAY be 2098 used to determine which Private Key to use); reference [4] SHOULD 2099 be used as a guideline for generating the MAC, 2101 2) authenticate the cookie as one that it previously generated by 2102 comparing the computed MAC signature against the one carried in the 2103 cookie. If this comparison fails, the datagram, including the 2104 COOKIE and the attached user data, should be silently discarded, 2106 3) compare the creation time stamp in the cookie to the current local 2107 time. If the elapsed time is longer than the lifespan carried in 2108 the cookie, then the datagram, including the COOKIE and the 2109 attached user data, SHOULD be discarded and the endpoint MUST 2110 transmit a stale cookie operational error to the sending endpoint, 2112 4) if the cookie is valid, create an association to the sender of the 2113 COOKIE message with the information in the TCB data carried in the 2114 COOKIE, and enter the ESTABLISHED state, 2116 5) immediately acknowledge any DATA chunk in the datagram with a SACK 2117 (subsequent datagram acknowledgement should follow the rules 2118 defined in Section 5.2), and 2121 6) send a COOKIE ACK chunk to the sender acknowledging reception of 2122 the cookie. The COOKIE ACK MAY be piggybacked with any outbound 2123 DATA chunk or SACK chunk. 2125 Note that if a COOKIE is received from an endpoint with which the 2126 receiver of the COOKIE has an existing association, the procedures in 2127 Section 4.2 should be followed.
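Steps 1) through 3) of the procedure above can be sketched as follows. This Python fragment is illustrative only and assumes a cookie layout that the text does not define: an 8-octet creation time and a 4-octet lifespan (both in milliseconds), the opaque TCB data, then a 32-octet HMAC-SHA-256 value. The names `seal` and `check_state_cookie` and the result labels "ok", "discard", and "stale" are introduced here for illustration; `seal` exists only so that the check can be exercised.

```python
import hashlib
import hmac
import struct

HEADER = struct.Struct("!QI")   # assumed layout: created (ms), lifespan (ms)
MAC_LEN = 32                    # HMAC-SHA-256 output size (assumed MAC)


def seal(created_ms, lifespan_ms, tcb_info, key):
    """Hypothetical helper: build a cookie in the assumed layout."""
    body = HEADER.pack(created_ms, lifespan_ms) + tcb_info
    return body + hmac.new(key, body, hashlib.sha256).digest()


def check_state_cookie(cookie, key, now_ms):
    """Return ("ok", tcb), ("discard", None), or ("stale", staleness_usec)."""
    if len(cookie) < HEADER.size + MAC_LEN:
        return ("discard", None)
    body, mac = cookie[:-MAC_LEN], cookie[-MAC_LEN:]
    # Step 1: recompute the MAC over the TCB data with the Private Key.
    expected = hmac.new(key, body, hashlib.sha256).digest()
    # Step 2: a mismatch means the datagram is silently discarded.
    if not hmac.compare_digest(mac, expected):
        return ("discard", None)
    # Step 3: compare the elapsed time against the carried lifespan.
    created_ms, lifespan_ms = HEADER.unpack_from(body)
    elapsed_ms = now_ms - created_ms
    if elapsed_ms > lifespan_ms:
        # Measure of Staleness is reported in microseconds.
        return ("stale", (elapsed_ms - lifespan_ms) * 1000)
    return ("ok", body[HEADER.size:])
```

The MAC is checked before the timestamp is read, so a forged cookie never triggers a Stale Cookie Error; only cookies that the endpoint itself generated can reach step 3.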
2129 4.1.6 An Example of Normal Association Establishment 2131 In the following example, "A" initiates the association and then sends 2132 a user message to "Z"; "Z" later sends two user messages to "A" 2133 (assuming no bundling or segmentation occurs): 2135 Endpoint A Endpoint Z 2137 {app sets association with Z} 2138 (build TCB) 2139 INIT [INIT Tag=Tag_A 2140 & other info] --------\ 2141 (Start T1-init timer) \ 2142 (Enter COOKIE-WAIT state) \---> (compose temp TCB and Cookie_Z) 2144 /--- INIT ACK [Veri Tag=Tag_A, 2145 / INIT Tag=Tag_Z, 2146 (Cancel T1-init timer) <------/ Cookie_Z, & other info] 2147 (destroy temp TCB) 2148 COOKIE [Cookie_Z] -----------\ 2149 (Start T1-init timer) \ 2150 (Enter COOKIE-SENT state) \---> (build TCB enter ESTABLISHED state) 2152 /---- COOKIE-ACK 2153 / 2154 (Cancel T1-init timer, <-----/ 2155 Enter established state) 2156 ... 2157 {app sends 1st user data; strm 0} 2158 DATA [TSN=initial TSN_A 2159 Strm=0,Seq=1 & user data]--\ 2160 (Start T3-rxt timer) \ 2161 \-> 2163 /----- SACK [TSN ACK=init TSN_A,Frag=0] 2164 (Cancel T3-rxt timer) <------/ 2165 ... 2168 ... 2169 {app sends 2 datagrams;strm 0} 2170 /---- DATA 2171 / [TSN=init TSN_Z 2172 <--/ Strm=0,Seq=1 & user data 1] 2173 SACK [TSN ACK=init TSN_Z, /---- DATA 2174 Frag=0] --------\ / [TSN=init TSN_Z +1, 2175 \/ Strm=0,Seq=2 & user data 2] 2176 <------/\ 2177 \ 2178 \------> 2180 Note that if the T1-init timer expires at "A" after the INIT or COOKIE 2181 chunks are sent, the same INIT or COOKIE chunk with the same Initiate 2182 Tag (i.e., Tag_A) or cookie shall be retransmitted and the timer 2183 restarted. This shall be repeated Max.Init.Retransmits times before "A" 2184 considers "Z" unreachable and reports the failure to its upper layer 2185 (and thus the association enters the CLOSED state). When 2186 retransmitting the INIT, the endpoint SHALL follow the rules 2187 defined in Section 5.3 to determine the proper timer value.
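The T1-init retransmission rule in the closing note above can be sketched as a bounded loop. This Python fragment is illustrative only: the function name `send_with_t1_init` and the two callbacks are assumptions introduced here. `send` transmits a chunk, and `reply_arrived` returns True if the expected INIT ACK or COOKIE ACK arrived before the timer expired and False on expiry. How the timer value is chosen (Section 5.3) is outside the sketch.

```python
def send_with_t1_init(chunk, send, reply_arrived, max_init_retransmits):
    """Send an INIT or COOKIE chunk, retransmitting on T1-init expiry.

    Returns True once the reply arrives, or False when the peer is
    considered unreachable after max_init_retransmits retransmissions.
    """
    send(chunk)                     # initial transmission; T1-init starts
    retransmits = 0
    while not reply_arrived():      # False models one T1-init expiry
        if retransmits == max_init_retransmits:
            # Report the failure to the upper layer; association is CLOSED.
            return False
        send(chunk)                 # same chunk, same Initiate Tag or cookie
        retransmits += 1            # timer is restarted by the next wait
    return True


# Usage: the reply arrives after two expiries, so three copies are sent.
sent = []
replies = iter([False, False, True])
ok = send_with_t1_init(b"INIT", sent.append, lambda: next(replies), 8)
```

The limit of 8 passed in the usage line is an arbitrary illustration value, not a recommended setting for 'Max.Init.Retransmits'.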
2189 4.2 Handle Duplicate INIT, INIT ACK, COOKIE, and COOKIE ACK 2191 During the lifetime of an association (in one of the possible 2192 states), an endpoint may receive from its peer endpoint one of the 2193 setup chunks (INIT, INIT ACK, COOKIE, and COOKIE ACK). The receiver 2194 shall treat such a setup chunk as a duplicate and process it as 2195 described in this section. 2197 The following scenarios can cause duplicated chunks: 2199 A) The peer has crashed without being detected, and re-started itself 2200 and sent out a new INIT Chunk trying to restore the association, 2202 B) Both sides are trying to initialize the association at about the 2203 same time, 2205 C) The chunk is from a stale datagram that was used to establish 2206 the present association or a past association which is no longer in 2207 existence, 2209 D) The chunk is a false message generated by an attacker, or 2211 E) The peer never received the COOKIE ACK and is retransmitting its 2212 COOKIE. 2214 In case A), the endpoint shall reset the present association and set up a 2215 new association with its peer. Case B) is unique and is discussed in 2216 Section 4.2.1. However, in cases C), D) and E), the endpoint must retain 2217 the present association. 2219 The rules in the following sections shall be applied in order to 2220 identify and correctly handle these cases. 2222 Stewart, et al [Page 42] 2223 4.2.1 Handle Duplicate INIT in COOKIE-WAIT or COOKIE-SENT State 2225 This usually indicates an initialization collision, i.e., both 2226 endpoints are attempting at about the same time to establish an 2227 association with the other endpoint. 2229 In such a case, each of the two sides shall respond to the other side 2230 with an INIT ACK, with the Verification Tag field of the common header 2231 set to the tag value received from the INIT message, and the Initiate 2232 Tag field set to its own tag value (the same tag used in the INIT 2233 message sent out by itself).
Each responder shall also generate a 2234 cookie with the INIT ACK. 2236 After that, no other actions shall be taken by either side, i.e., the 2237 endpoint shall not change its state, and the T1-init timer shall be 2238 left running. The normal procedures for handling cookies will 2239 resolve the duplicate INITs to a single association. 2241 4.2.2 Handle Duplicate INIT in Other States 2243 Upon reception of the duplicated INIT, the receiver shall generate an 2244 INIT ACK with a State Cookie. 2246 In the outbound INIT ACK, the endpoint shall set the Verification Tag 2247 field in the common header to the peer's new tag value (from the 2248 duplicated INIT message), and the Initiate Tag field to its own tag 2249 value (unchanged from the existing association). The included 2250 State Cookie shall be generated using the current time and a 2251 temporary TCB constructed with the information provided in the 2252 duplicated INIT message (see Section 4.1.3). This temporary TCB MUST 2253 be destroyed after the outbound INIT ACK is built. 2255 After sending out the INIT ACK, the endpoint shall take no further 2256 actions, i.e., the existing association, including its current state, 2257 and the corresponding TCB MUST NOT be changed. 2259 4.2.3 Handle Duplicate INIT ACK 2261 If an INIT ACK is received by an endpoint in any state 2262 other than the COOKIE-WAIT state, the endpoint should discard 2263 the INIT ACK message. A duplicate INIT ACK usually indicates the 2264 processing of an old INIT or duplicated INIT message. 2266 4.2.4 Handle Duplicate Cookie 2268 When a duplicated COOKIE chunk is received in any state for an 2269 existing association the following rules shall be applied: 2271 1) compute an MD5 signature using the TCB data carried in the cookie 2272 along with the receiver's private security key, 2274 Stewart, et al [Page 43] 2275 2) authenticate the cookie by comparing the computed MD5 signature 2276 against the one carried in the cookie.
If this comparison fails, 2277 the datagram, including the COOKIE and the attached user data, 2278 should be silently discarded (this is case C or D above). 2280 3) compare the timestamp in the cookie to the current time; if 2281 the cookie is older than the lifespan carried in the cookie, 2282 the datagram, including the COOKIE and the attached user data, 2283 should be discarded and the endpoint MUST transmit a stale cookie 2284 error to the sending endpoint only if the Verification Tags of the 2285 cookie's TCB do NOT match the current tag values in the association 2286 (this is case C or D above). If both Verification Tags do match, 2287 consider the cookie valid (this is case E). 2289 4) If the cookie proves to be valid, unpack the TCB into a 2290 temporary TCB. 2292 5) If the Verification Tags in the temporary TCB match the 2293 Verification Tags in the existing TCB, the cookie is a 2294 duplicate cookie. A COOKIE ACK should be sent to the peer 2295 endpoint but NO update should be made to the existing 2296 TCB. 2298 6) If the local Verification Tag in the temporary TCB 2299 does not match the local Verification Tag in the existing 2300 TCB, then the cookie is an old stale cookie and does 2301 not correspond to the existing association (case C above). 2302 The datagram should be silently discarded. 2304 7) If the peer's Verification Tag in the temporary TCB does not 2305 match the peer's Verification Tag in the existing TCB, 2306 then a restart of the peer has occurred (case A above). 2307 In such a case, the endpoint should report the restart to its ULP 2308 and respond to the peer with a COOKIE ACK message. It shall also 2309 update the Verification Tag, initial TSN, and the destination 2310 address list of the existing TCB with the information from the 2311 temporary TCB. After that the temporary TCB can be discarded.
2313 Furthermore, all the congestion control parameters (e.g., cwnd, 2314 ssthresh) related to this peer shall be reset to their initial 2315 values (see Section 6.2.1). 2317 IMPLEMENTATION NOTE: How to handle any pending datagrams is an 2318 implementation decision. The implementation may elect 2319 to either A) send all messages back to its upper layer with the 2320 restart report, or B) automatically re-queue any datagrams 2321 pending by marking all of them as never-sent and assigning 2322 new TSNs at the time of their initial transmissions based upon 2323 the updated starting TSN (as defined in Section 5). 2325 Note: The "peer's Verification Tag" is the tag received in the INIT 2326 or INIT ACK chunk. 2328 Stewart, et al [Page 44] 2329 4.2.5 Handle Duplicate COOKIE ACK 2331 At any state other than COOKIE-SENT, an endpoint may receive a 2332 duplicated COOKIE ACK chunk. If so, the chunk should be silently 2333 discarded. 2335 4.2.6 Handle Stale COOKIE Error 2337 A stale cookie error indicates one of a number of possible events: 2339 A) the association failed to complete setup before the 2340 cookie issued by the sender was processed. 2342 B) an old cookie was processed after setup completed. 2344 C) an old cookie is received from someone that the receiver is 2345 not interested in having an association with and the ABORT 2346 message was lost. 2348 When processing a stale cookie, an endpoint should first check 2349 whether an association is in the process of being set up, i.e., the 2350 association is in the COOKIE-SENT state. In all cases, if 2351 the association is NOT in the COOKIE-SENT state, the stale 2352 cookie message should be silently discarded. 2354 If the association is in the COOKIE-SENT state, the endpoint may elect 2355 one of the following three alternatives. 2357 1) Send a new INIT message to the endpoint, to generate a new cookie 2358 and re-attempt the setup procedure.
2360 2) Discard the TCB and report to the upper layer the inability to 2361 set up the association. 2363 3) Send a new INIT message to the endpoint, adding a cookie 2364 preservative parameter requesting an extension of the lifetime of 2365 the cookie. When calculating the time extension, an implementation 2366 SHOULD use the RTT information measured based on the previous 2367 COOKIE / Stale COOKIE message exchange, and should add no more 2368 than 1 second beyond the measured RTT, because a long cookie 2369 lifetime makes the endpoint more subject to a replay attack. 2371 4.3 Other Initialization Issues 2373 4.3.1 Selection of Tag Value 2375 Initiate Tag values should be selected from the range of 0x1 to 2376 0xffffffff. It is very important that the Tag value be randomized to 2377 help protect against "man in the middle" and "sequence number" attacks. 2378 It is suggested that RFC 1750 [1] be used for the Tag randomization. 2380 Stewart, et al [Page 45] 2381 Moreover, the tag value used by either endpoint in a given association 2382 MUST NOT be changed during the lifetime of the association. However, 2383 a new tag value MUST be used each time the endpoint tears down and 2384 then re-establishes the association to the same peer. 2386 5. User Data Transfer 2388 For transmission efficiency, SCTP defines mechanisms for bundling of 2389 small user messages and segmentation of large user messages. 2390 The following diagram depicts the flow of user messages through SCTP.
2392 +--------------------------+ 2393 | User Messages | 2394 +--------------------------+ 2395 SCTP user ^ | 2396 ==================|==|======================================= 2397 | v (1) 2398 +------------------+ +--------------------+ 2399 | SCTP DATA Chunks | |SCTP Control Chunks | 2400 +------------------+ +--------------------+ 2401 ^ | ^ | 2402 | v (2) | v (2) 2403 +--------------------------+ 2404 | SCTP datagrams | 2405 +--------------------------+ 2406 SCTP ^ | 2407 ===========================|==|=========================== 2408 | v 2409 Unreliable Packet Transfer Service (e.g., IP) 2411 Note: 2412 (1) When converting user messages into Data chunks, SCTP sender 2413 will segment user messages larger than the current path MTU 2414 into multiple data chunks. The segmented message will normally 2415 be reassembled from data chunks before delivery to the user by 2416 the SCTP receiver (see Section 5.9 for details). 2418 (2) Multiple data and control chunks may be multiplexed by the 2419 sender into a single SCTP datagram for transmission, as long as 2420 the final size of the datagram does not exceed the current path 2421 MTU. The receiver will de-multiplex the datagram back into 2422 the original chunks. 2424 The segmentation and bundling mechanisms, as detailed in Sections 5.9 2425 and 5.10, are optional to implement by the data sender, but they MUST 2426 be implemented by the data receiver, i.e., an SCTP receiver MUST be 2427 prepared to receive and process bundled or segmented data. 2429 Stewart, et al [Page 46] 2430 5.1 Transmission of DATA Chunks 2432 The following general rules SHALL be applied by the sender for 2433 transmission and/or retransmission of outbound DATA chunks: 2435 A) At any given time, the sender MUST NOT transmit new data onto any 2436 destination transport address if its peer's rwnd indicates that the 2437 peer has no buffer space (i.e. rwnd is 0, see Section 5.2.1). 
2439 However, regardless of the value of rwnd (including if it is 0), 2440 the sender can always have ONE data packet in flight to the 2441 receiver if allowed by cwnd (see rule B below). This rule 2442 allows the sender to probe for a change in rwnd that the sender 2443 missed due to the update having been lost in transmission from 2444 the receiver to the sender. 2446 B) At any given time, the sender MUST NOT transmit new data onto a 2447 given transport address if it has cwnd or more octets of data 2448 outstanding on that transport address. 2450 C) When the time comes for the sender to transmit, before sending 2451 new DATA chunks, the sender MUST first transmit any outstanding 2452 DATA chunks which are marked for retransmission (limited by the 2453 current cwnd). 2455 D) Then, the sender can send out as many new DATA chunks as Rule A and 2456 Rule B above allow. 2458 Note: multiple DATA chunks committed for transmission MAY be 2459 bundled in a single packet, unless bundling is explicitly disallowed 2460 by ULP of the data sender. Furthermore, DATA chunks being 2461 retransmitted MAY be bundled with new DATA chunks, as long as the 2462 resulting packet size does not exceed the path MTU. 2464 Note: before a sender transmits a data packet, if any received DATA 2465 chunks have not been acknowledged (e.g., due to delayed ack), the 2466 sender should create a SACK and bundle it with the outbound DATA 2467 chunk, as long as the size of the final SCTP datagram does not exceed 2468 the current MTU. See Section 5.2. 2470 IMPLEMENTATION Note: when the window is full (i.e., transmission is 2471 disallowed by Rule A and/or Rule B), the sender MAY still accept 2472 send requests from its upper layer, but SHALL transmit no more DATA 2473 chunks until some or all of the outstanding DATA chunks are 2474 acknowledged and transmission is allowed by Rule A and Rule B 2475 again. 
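As a rough illustration of Rules A and B above, the following sketch decides whether new data may be sent to a destination. The class PathState and the function may_send_new_data are hypothetical names, and treating "ONE data packet in flight" as "nothing else is currently outstanding to the peer" is an interpretation made here, not wording from this document.

```python
from dataclasses import dataclass


@dataclass
class PathState:
    cwnd: int         # congestion window of this destination address, in octets
    outstanding: int  # octets sent to this destination and not yet acknowledged


def may_send_new_data(peer_rwnd: int, total_outstanding: int, path: PathState) -> bool:
    """Return True if Rules A and B permit sending new data on this path."""
    # Rule B: never exceed cwnd on the given transport address.
    if path.outstanding >= path.cwnd:
        return False
    # Rule A: a zero rwnd blocks new data, except that one packet may be
    # kept in flight to probe for a lost window update (assumed here to
    # mean that nothing else is outstanding to the peer).
    if peer_rwnd == 0:
        return total_outstanding == 0
    return True
```

Per Rule C, a sender would first use any permitted transmission opportunity for chunks marked for retransmission, and only then apply this check to new DATA chunks.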
2477 Whenever a transmission or retransmission is made to any address, if 2478 the T3-rxt timer of that address is not currently running, the sender 2479 MUST start that timer. However, if the timer of that address is 2480 already running, the sender SHALL restart the timer ONLY IF the 2481 earliest (i.e., lowest TSN) outstanding DATA chunk sent to that 2482 address is being retransmitted. 2484 Stewart, et al [Page 47] 2485 When starting or restarting the T3-rxt timer, the timer value must be 2486 adjusted according to the timer rules defined in Sections 5.3.2 2487 and 5.3.3. 2489 5.2 Acknowledgment on Reception of DATA Chunks 2491 The SCTP receiver MUST always acknowledge to the SCTP sender the 2492 reception of each DATA chunk. 2494 The guidelines on the delayed acknowledgment algorithm specified in 2495 Section 4.2 of RFC 2581 [3] SHOULD be followed. Specifically, an 2496 acknowledgment SHOULD be generated for at least every second datagram 2497 received, and SHOULD be generated within 200 ms of the arrival of any 2498 unacknowledged datagram. 2500 IMPLEMENTATION NOTE: the maximal delay for generating an 2501 acknowledgment may be configured by the SCTP user, either 2502 statically or dynamically, in order to meet the specific 2503 timing requirement of the signaling protocol being carried. 2505 Acknowledgments MUST be sent in SACK control chunks. A SACK chunk can 2506 acknowledge the reception of multiple DATA chunks. See Section 2.3.3 2507 for SACK chunk format. In particular, the SCTP receiver MUST fill in 2508 the Cumulative TSN ACK field to indicate the latest cumulative TSN 2509 number it has received, and any received segments beyond the 2510 Cumulative TSN SHALL also be reported. 2512 Upon reception of the SACK, the data sender MUST adjust its total 2513 outstanding data count and the outstanding data count on those 2514 destination addresses for which one or more data chunks are 2515 acknowledged by the SACK.
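The delayed acknowledgment guideline in this section (a SACK for at least every second datagram, and within 200 ms of any unacknowledged datagram) might be tracked as in the sketch below. The class DelayedAck and its method names are hypothetical, and the timer is modelled as a polled deadline purely for illustration.

```python
class DelayedAck:
    """Decide when a SACK is due under the delayed acknowledgment guideline."""

    MAX_DELAY = 0.200  # seconds; the user may configure a different maximum

    def __init__(self):
        self.unacked = 0      # datagrams with new DATA not yet acknowledged
        self.deadline = None  # time by which a SACK has to be sent

    def on_datagram(self, now: float) -> bool:
        """Record a datagram carrying new DATA; return True to send a SACK now."""
        self.unacked += 1
        if self.unacked >= 2:          # every second datagram is acknowledged
            self._reset()
            return True
        self.deadline = now + self.MAX_DELAY
        return False                   # acknowledgment is delayed

    def on_tick(self, now: float) -> bool:
        """Poll the delay timer; return True if the 200 ms limit has passed."""
        if self.deadline is not None and now >= self.deadline:
            self._reset()
            return True
        return False

    def _reset(self):
        self.unacked = 0
        self.deadline = None
```

An implementation would also call the reset logic whenever a SACK is bundled with outbound DATA, since that acknowledges the pending datagrams as well.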
2517 Note: When a datagram arrives with duplicate DATA chunk(s) and no new 2518 DATA chunk(s), the receiver MUST immediately send a SACK with no 2519 delay. Normally this will occur when the original SACK was lost, and 2520 the peer's RTO has expired. The duplicate TSN number(s) SHOULD be reported 2521 in the SACK as duplicate. 2523 When a receiver prepares a SACK, any duplicate DATA chunks received 2524 SHOULD be reported in the SACK. 2526 When a SACK is received, the receiver MAY use the Duplicate TSN 2527 information to determine if SACK loss is occurring. Further use 2528 of this data is for future study. 2530 The following example illustrates the use of delayed acknowledgments: 2532 Endpoint A Endpoint Z 2534 {App sends 3 messages; strm 0} 2535 DATA [TSN=7,Strm=0,Seq=3] ------------> (ack delayed) 2536 (Start T3-rxt timer) 2538 Stewart, et al [Page 48] 2539 DATA [TSN=8,Strm=0,Seq=4] ------------> (send ack) 2540 /------- SACK [TSN ACK=8,Frag=0] 2541 (cancel T3-rxt timer) <-----/ 2542 ... 2543 ... 2545 DATA [TSN=9,Strm=0,Seq=5] ------------> (ack delayed) 2546 (Start T3-rxt timer) 2547 ... 2548 {App sends 1 message; strm 1} 2549 (bundle SACK with DATA) 2550 /----- SACK [TSN ACK=9,Frag=0] \ 2551 / DATA [TSN=6,Strm=1,Seq=2] 2552 (cancel T3-rxt timer) <------/ (Start T3-rxt timer) 2554 (ack delayed) 2555 ... 2556 (send ack) 2557 SACK [TSN ACK=6,Frag=0] -------------> (cancel T3-rxt timer) 2559 5.2.1 Tracking Peer's Receive Buffer Space 2561 Whenever a SACK arrives, an updated a_rwnd arrives with it. This 2562 value represents the amount of buffer space the sender of the SACK, at 2563 the time of transmitting the SACK, has left of its total receive 2564 buffer space (as specified in the INIT/INIT ACK). After processing the 2565 SACK, the receiver of the SACK must use the following rules to 2566 re-calculate the rwnd, using the received a_rwnd value.
2568 A) At the establishment of the association, the endpoint initializes 2569 the rwnd to the Advertised Receiver Window Credit (a_rwnd) 2570 the peer specified in the INIT or INIT ACK. 2572 B) Any time a DATA chunk is transmitted to a peer, the endpoint 2573 subtracts the data size of the chunk from the rwnd of that peer. 2575 C) Any time a SACK arrives, the endpoint performs the following: 2577 If all outstanding TSNs are acknowledged by the SACK, adopt 2578 the a_rwnd value in the SACK as the new rwnd. 2580 Otherwise, take the value of the current rwnd, and add to it the 2581 data size of any newly acknowledged TSNs that have their B/E bits set 2582 to '11', OR that moved the cumulative TSN point forward. Then, set 2583 the rwnd to the lesser of the calculated value and the a_rwnd carried 2584 in the SACK. 2586 D) Any time the T3-rxt timer expires on any address, causing all 2587 outstanding chunks sent to that address to be marked for 2588 retransmission, add all of the data sizes of those chunks to the rwnd. 2590 Stewart, et al [Page 49] 2591 E) Any time a DATA chunk is marked for retransmission via the 2592 fast retransmit algorithm (Section 6.2.4), add the DATA chunk's 2593 size to the rwnd. 2595 5.3 Management of Retransmission Timer 2597 SCTP uses a retransmission timer T3-rxt to ensure data delivery in the 2598 absence of any feedback from the remote data receiver. The duration of 2599 this timer is referred to as RTO (retransmission timeout). 2601 When the receiver endpoint is multi-homed, the data sender endpoint 2602 will calculate a separate RTO for each destination transport 2603 address of the receiver endpoint. 2605 The computation and management of RTO in SCTP closely follows how 2606 TCP manages its retransmission timer. To compute the current RTO, an 2607 SCTP sender maintains two state variables per destination transport 2608 address: SRTT (smoothed round-trip time) and RTTVAR (round-trip time 2609 variation).
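The per-destination state just described, together with rules C1 through C7 and G1 of Section 5.3.1 and the back-off rule E2 of Section 5.3.3, can be sketched as below. The class name RtoEstimator is hypothetical, and the default parameter values (RTO.Initial of 3 s, RTO.Min of 1 s, RTO.Max of 60 s, RTO.Alpha of 1/8, RTO.Beta of 1/4, and a 10 ms clock granularity) are assumptions chosen for illustration; the protocol parameter values defined elsewhere in this document are authoritative.

```python
class RtoEstimator:
    """SRTT, RTTVAR and RTO for one destination transport address."""

    def __init__(self, rto_initial=3.0, rto_min=1.0, rto_max=60.0,
                 alpha=0.125, beta=0.25, granularity=0.01):
        self.rto_min, self.rto_max = rto_min, rto_max
        self.alpha, self.beta = alpha, beta
        self.granularity = granularity
        self.srtt = None
        self.rttvar = None
        self.rto = rto_initial                     # rule C1

    def on_measurement(self, r: float) -> None:
        """Apply a new RTT sample; per rule C5 the caller must not pass
        samples taken from retransmitted packets."""
        if self.srtt is None:                      # rule C2
            self.srtt = r
            self.rttvar = r / 2
        else:                                      # rule C3
            # RTTVAR is updated first, using the old SRTT.
            self.rttvar = (1 - self.beta) * self.rttvar + self.beta * abs(self.srtt - r)
            self.srtt = (1 - self.alpha) * self.srtt + self.alpha * r
        if self.rttvar == 0:                       # rule G1
            self.rttvar = self.granularity
        rto = self.srtt + 4 * self.rttvar
        self.rto = min(max(rto, self.rto_min), self.rto_max)   # rules C6, C7

    def on_t3_expiry(self) -> None:
        """Rule E2: double the RTO, bounded by the optional maximum."""
        self.rto = min(self.rto * 2, self.rto_max)
```

A multi-homed sender would keep one such object per destination transport address and pick the one matching the address a packet is actually sent to.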
2611 5.3.1 RTO Calculation 2613 The rules governing the computation of SRTT, RTTVAR, and RTO are 2614 as follows: 2616 C1) Until an RTT measurement has been made for a packet sent 2617 to the given destination transport address, set RTO to the 2618 protocol parameter 'RTO.Initial'. 2620 C2) When the first RTT measurement R is made, set SRTT <- R, 2621 RTTVAR <- R/2, and RTO <- SRTT + 4 * RTTVAR. 2623 C3) When a new RTT measurement R' is made, set 2625 RTTVAR <- (1 - RTO.Beta) * RTTVAR + RTO.Beta * |SRTT - R'| 2626 SRTT <- (1 - RTO.Alpha) * SRTT + RTO.Alpha * R' 2628 Note, the value of SRTT used in the update to RTTVAR is its value 2629 *before* updating SRTT itself using the second assignment. 2631 After the computation, update RTO <- SRTT + 4 * RTTVAR. 2633 Stewart, et al [Page 50] 2634 C4) When data is in flight and when allowed by rule C5 below, a new 2635 RTT measurement MUST be made each round trip. Furthermore, 2636 it is RECOMMENDED that new RTT measurements should be made no 2637 more than once per round-trip for a given destination transport 2638 address. There are two reasons for this recommendation: first, 2639 it appears that measuring more frequently often does not in 2640 practice yield any significant benefit [5]; second, if 2641 measurements are made more often, then the values of RTO.Alpha and 2642 RTO.Beta in rule C3 above should be adjusted so that SRTT and 2643 RTTVAR still adjust to changes at roughly the same rate (in terms 2644 of how many round trips it takes them to reflect new value) as 2645 they would if making only one measurement per round-trip and 2646 using RTO.Alpha and RTO.Beta as given in rule C3. However, the 2647 exact nature of these adjustments remains a research issue. 2649 C5) Karn's algorithm: RTT measurements MUST NOT be made using 2650 packets that were retransmitted (and thus for which it is 2651 ambiguous whether the reply was for the first instance of the 2652 packet or a later instance). 
2654 C6) Whenever RTO is computed, if it is less than RTO.Min seconds 2655 then it is rounded up to RTO.Min seconds. The reason for this 2656 rule is that RTOs that do not have a high minimum value are 2657 susceptible to unnecessary timeouts [5]. 2659 C7) A maximum value may be placed on RTO provided it is at least 2660 RTO.max seconds. 2662 There is no requirement for the clock granularity G used for computing 2663 RTT measurements and the different state variables, other than 2665 G1) Whenever RTTVAR is computed, if RTTVAR = 0, then adjust 2666 RTTVAR <- G. 2668 Experience [5] has shown that finer clock granularities (<= 100 msec) 2669 perform somewhat better than more coarse granularities. 2671 5.3.2 Retransmission Timer Rules 2673 The rules for managing the retransmission timer are as follows: 2675 R1) Every time a packet containing data is sent to any address (including 2676 a retransmission), if the T3-rxt timer of that address is not 2677 running, start it running so that it will expire after the RTO of 2678 that address. The RTO used here is that obtained after any doubling 2679 due to previous T3-rxt timer expirations on the corresponding 2680 destination address as discussed in rule E2 below. 2682 R2) Whenever all outstanding data on an address has been acknowledged, 2683 turn off the T3-rxt timer of that address. 2685 Stewart, et al [Page 51] 2686 R3) Whenever a SACK is received that acknowledges new data chunks 2687 including the one with the earliest outstanding TSN on that address, 2688 restart T3-rxt timer of that address with its current RTO. 2690 The following example shows the use of various timer rules (assuming 2691 the receiver uses delayed acks). 
2693 Endpoint A Endpoint Z 2694 {App begins to send} 2695 Data [TSN=7,Strm=0,Seq=3] ------------> (ack delayed) 2696 (Start T3-rxt timer) 2697 {App sends 1 message; strm 1} 2698 (bundle ack with data) 2699 DATA [TSN=8,Strm=0,Seq=4] ----\ /-- SACK [TSN ACK=7,Frag=0] \ 2700 \ / DATA [TSN=6,Strm=1,Seq=2] 2701 \ / (Start T3-rxt timer) 2702 \ 2703 / \ 2704 (Re-start T3-rxt timer) <------/ \--> (ack delayed) 2705 (ack delayed) 2706 ... 2707 {send ack} 2708 SACK [TSN ACK=6,Frag=0] --------------> (Cancel T3-rxt timer) 2709 .. 2710 (send ack) 2711 (Cancel T3-rxt timer) <-------------- SACK [TSN ACK=8,Frag=0] 2713 5.3.3 Handle T3-rxt Expiration 2715 Whenever the retransmission timer T3-rxt expires on a destination 2716 address, do the following: 2718 E1) On the destination address where the timer expires, adjust its 2719 ssthresh with rules defined in Section 6.2.3 and set the 2720 cwnd <- MTU. 2722 E2) On the destination address where the timer expires, set 2723 RTO <- RTO * 2 ("back off the timer"). The maximum value discussed 2724 in rule C7 above (RTO.max) may be used to provide an upper bound 2725 to this doubling operation. 2727 E3) Determine how many of the earliest (i.e., lowest TSN) outstanding 2728 Data chunks on the address where the T3-rxt has expired that will 2729 fit into a single packet, subject to the MTU constraint for the 2730 path corresponding to the destination transport address where the 2731 retransmission is being sent to (this may be different from the 2732 address where the timer expires [see Section 5.4]). Call this 2733 value K. Bundle and retransmit those K data chunks in a single 2734 packet to the address. 2736 Stewart, et al [Page 52] 2737 E4) Start the retransmission timer T3-rxt on the destination address 2738 to where the retransmission is sent, if rule R1 above indicates to 2739 do so. 
Note, the RTO to be used for starting T3-rxt should be 2740 that of the destination address to which the retransmission is 2741 sent, which, when the receiver is multi-homed, may be different 2742 from the destination address where the timer expired (see Section 2743 5.4 below). 2745 Note that after retransmitting, once a new RTT measurement is obtained 2746 (which can happen only when new data has been sent and acknowledged, 2747 per rule C5, or for a measurement made from a Heartbeat [see Section 2748 7.3]), the computation in rule C3 is performed, including the 2749 computation of RTO, which may result in "collapsing" RTO back down 2750 after it has been subject to doubling (rule E2). 2752 The final rule for managing the retransmission timer concerns failover 2753 (see Section 5.4.1): 2755 F1) Whenever SCTP switches from the current destination transport 2756 address to a different one, the current retransmission timers are 2757 left running. As soon as SCTP transmits a packet containing data 2758 to the new transport address, start the timer on that transport 2759 address, using the RTO value of the destination address where 2760 the data is being sent, if rule R1 indicates to do so. 2762 5.4 Multi-homed SCTP Endpoints 2764 An SCTP endpoint is considered multi-homed if there is more than one 2765 transport address that can be used as a destination address to reach 2766 that endpoint. 2768 Moreover, at the sender side, one of the multiple destination 2769 addresses of the multi-homed receiver endpoint shall be selected as 2770 the primary destination transport address by the ULP (see Sections 2771 4.1.2 and 9.1 for details). 2773 When the SCTP sender is transmitting to the multi-homed receiver, by 2774 default the transmission SHOULD always take place on the primary 2775 transport address, unless the SCTP user explicitly specifies the 2776 destination transport address to use.
2778 The acknowledgment SHOULD be transmitted to the same destination 2779 transport address from which the DATA or control chunk being 2780 acknowledged was received. 2782 However, when acknowledging multiple DATA chunks in a single SACK, the 2783 SACK message may be transmitted to one of the destination transport 2784 addresses from which the DATA or control chunks being acknowledged 2785 were received. 2787 Stewart, et al [Page 53] 2788 Furthermore, when the receiver is multi-homed, the SCTP data sender 2789 SHOULD try to retransmit a chunk to an active destination transport 2790 address that is different from the last destination address to which the 2791 data chunk was sent. 2793 Note, retransmissions do not affect the total outstanding data 2794 count. However, if the data chunk is retransmitted onto a different 2795 destination address, both the outstanding data counts on the new 2796 destination address and the old destination address to which the data 2797 chunk was last sent shall be adjusted accordingly. 2799 5.4.1 Failover from Inactive Destination Address 2801 Some of the destination transport addresses of a multi-homed SCTP data 2802 receiver may become inactive due to either the occurrence of certain 2803 error conditions (see Section 7.2) or adjustments from the SCTP user. 2805 When there is outbound data to send and the primary destination 2806 transport address becomes inactive (e.g., due to failures), or where 2807 the SCTP user explicitly requests to send data to an inactive 2808 destination transport address, before reporting an error to its ULP, 2809 the SCTP sender should try to send the data to an alternate active 2810 destination transport address if one exists. 2812 5.5 Stream Identifier and Stream Sequence Number 2814 Every DATA chunk MUST carry a valid stream identifier.
If a DATA chunk 2815 with an invalid stream identifier is received, the receiver shall, 2816 after acknowledging the reception of the DATA chunk following the normal 2817 procedure, respond immediately with an ERROR message with cause set to 2818 Invalid Stream Identifier (see Section 2.3.9) and discard the DATA 2819 chunk. 2821 The stream sequence number in all the streams shall start from 0x0 2822 when the association is established. Also, when the stream sequence 2823 number reaches the value 0xffff the next stream sequence number shall 2824 be set to 0x0. 2826 5.6 Ordered and Un-ordered Delivery 2828 By default the SCTP receiver shall ensure the DATA chunks within any 2829 given stream are delivered to the upper layer according to the order of 2830 their stream sequence number. If there are DATA chunks arriving out of 2831 order of their stream sequence number, the receiver MUST hold the 2832 received DATA chunks from delivery until they are re-ordered. 2834 However, an SCTP sender can indicate that no ordered delivery is 2835 required on a particular DATA chunk within the stream by setting the U 2836 flag of the DATA chunk to 1. 2838 Stewart, et al [Page 54] 2839 In this case, the receiver must bypass the ordering mechanism and 2840 immediately deliver the data to the upper layer (after re-assembly if 2841 the user data is segmented by the sender). 2843 This provides an effective way of transmitting "out-of-band" data in a 2844 given stream. Also, a stream can be used as an "unordered" stream by 2845 simply setting the U flag to 1 in all outbound DATA chunks sent 2846 through that stream. 2848 IMPLEMENTATION NOTE: when sending an unordered DATA chunk, an 2849 implementation may choose to place the DATA chunk in an outbound 2850 datagram that is at the head of the outbound transmission queue if 2851 possible.
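The receiver-side ordering described in Sections 5.5 and 5.6 can be sketched for a single stream as follows. The class StreamReorderer is a hypothetical name, and the sketch assumes that messages have already been re-assembled before they reach it.

```python
class StreamReorderer:
    """Hold back out-of-order messages of one stream until they are in order."""

    def __init__(self):
        self.next_ssn = 0   # stream sequence numbers start from 0x0
        self.held = {}      # stream sequence number -> held message

    def receive(self, ssn: int, message, unordered: bool = False) -> list:
        """Return the messages that may now be delivered to the upper layer."""
        if unordered:
            # U flag set: bypass ordering; the sequence number is ignored.
            return [message]
        self.held[ssn] = message
        ready = []
        while self.next_ssn in self.held:
            ready.append(self.held.pop(self.next_ssn))
            # After 0xffff the next stream sequence number is 0x0.
            self.next_ssn = (self.next_ssn + 1) & 0xFFFF
        return ready
```

For example, if sequence number 1 arrives before 0, the first call returns an empty list and the second returns both messages in order.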
2853 Note that the 'Stream Sequence Number' field in an un-ordered data 2854 chunk has no significance; the sender can fill it with an arbitrary 2855 value, but the receiver MUST ignore the field. 2857 5.7 Report Gaps in Received DATA TSNs 2859 Upon the reception of a new DATA chunk, an SCTP receiver shall examine 2860 the continuity of the TSNs received. If the receiver detects that gaps 2861 exist in the received DATA chunk sequence, a SACK with fragment 2862 reports shall be sent back immediately. 2864 Based on the fragment reports from the SACK, the data sender can 2865 calculate the missing DATA chunks and make decisions on whether to 2866 retransmit them (see Section 5.3 for details). 2868 Multiple gaps can be reported in one single SACK (see Section 2.3.3). 2870 Note that when the data sender is multi-homed, the SCTP receiver 2871 SHOULD always try to send the SACK to the same network from where the 2872 last DATA chunk was received. 2874 Upon the reception of the SACK, the data sender SHALL remove all DATA 2875 chunks which have been acknowledged by the SACK. The data sender MUST 2876 also treat all the DATA chunks which fall into the gaps between the 2877 fragments reported by the SACK as "missing". The number of "missing" 2878 reports for each outstanding DATA chunk MUST be recorded by the data 2879 sender in order to make retransmission decisions; see Section 6.2.4 for 2880 details. 2882 The following example shows the use of SACK to report a gap.
2884 Endpoint A Endpoint Z 2885 {App sends 3 messages; strm 0} 2886 DATA [TSN=6,Strm=0,Seq=2] ---------------> (ack delayed) 2887 (Start T3-rxt timer) 2889 Stewart, et al [Page 55] 2890 DATA [TSN=7,Strm=0,Seq=3] --------> X (lost) 2892 DATA [TSN=8,Strm=0,Seq=4] ---------------> (gap detected, 2893 immediately send ack) 2894 /----- SACK [TSN ACK=6,Frag=1, 2895 / Strt=2,End=2] 2896 <-----/ 2897 (remove 6 and 8 from out-queue, 2898 and strike 7 as "1" missing report) 2900 Note: in order to keep the size of the outbound SCTP datagram from 2901 exceeding the current path MTU, the maximal number of fragments that can 2902 be reported within a single SACK chunk is limited. When a single SACK 2903 cannot cover all the fragments needed to be reported due to the MTU 2904 limitation, the endpoint SHALL send only one SACK, reporting the 2905 fragments from the lowest to highest TSNs, within the size limit set 2906 by the MTU, and leave the remaining highest TSN fragment numbers 2907 unacknowledged. 2909 5.8 Adler-32 Checksum Calculation 2911 When sending an SCTP datagram, the sender MUST strengthen the data 2912 integrity of the transmission by including the Adler-32 checksum 2913 value calculated on the datagram, as described below. 2915 After the datagram is constructed (containing the SCTP common header 2916 and one or more control or DATA chunks), the sender shall: 2918 1) fill in the proper Verification Tag in the SCTP common header and 2919 initialize the Adler-32 checksum field to 0's. 2921 2) calculate the Adler-32 checksum of the whole datagram, including the 2922 SCTP common header and all the chunks. Refer to Sections 8.2 and 9 2923 in [2] for details of the Adler-32 algorithm. And, 2925 3) put the resultant value into the Adler-32 checksum field in the 2926 common header, and leave the rest of the bits unchanged.
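The sender steps above can be sketched with the Adler-32 routine from Python's standard zlib module. The position of the checksum field (octets 8 through 11 of the common header, after two 16-bit port numbers and the 32-bit Verification Tag) and its network byte order are assumptions made for this sketch; the common header format defined earlier in this document is authoritative. The function names are hypothetical, and verify_checksum mirrors the same computation for an incoming datagram.

```python
import struct
import zlib

CHECKSUM_OFFSET = 8  # assumed: ports (4 octets) + Verification Tag (4 octets)


def _with_zero_checksum(datagram: bytes) -> bytes:
    """Return a copy of the datagram with the checksum field set to 0's."""
    return (datagram[:CHECKSUM_OFFSET] + b"\x00" * 4
            + datagram[CHECKSUM_OFFSET + 4:])


def add_checksum(datagram: bytes) -> bytes:
    """Steps 1-3: zero the field, compute Adler-32, store the result."""
    zeroed = _with_zero_checksum(datagram)
    checksum = zlib.adler32(zeroed) & 0xFFFFFFFF
    return (zeroed[:CHECKSUM_OFFSET] + struct.pack("!I", checksum)
            + zeroed[CHECKSUM_OFFSET + 4:])


def verify_checksum(datagram: bytes) -> bool:
    """Recompute the checksum over a zeroed field and compare."""
    if len(datagram) < CHECKSUM_OFFSET + 4:
        return False
    (received,) = struct.unpack_from("!I", datagram, CHECKSUM_OFFSET)
    computed = zlib.adler32(_with_zero_checksum(datagram)) & 0xFFFFFFFF
    return computed == received
```

Only the four checksum octets differ between the input and output of add_checksum; all other bits are left unchanged, as step 3 requires.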

When an SCTP datagram is received, the receiver MUST first check the
Adler-32 checksum:

1) store the received Adler-32 checksum value aside,

2) replace the 32 bits of the Adler-32 checksum field in the received
   SCTP datagram with all '0's and calculate an Adler-32 checksum
   value of the whole received datagram, and

3) verify that the calculated Adler-32 checksum is the same as the
   received Adler-32 checksum. If not, the receiver MUST treat the
   datagram as an invalid SCTP datagram.

The default procedure of handling invalid SCTP datagrams is to
silently discard them.

5.9 Segmentation

Segmentation MUST be performed by the data sender if the user message
to be sent is large enough to cause the outbound SCTP datagram to
exceed the current MTU.

Note, if the data receiver is multi-homed, the sender shall choose a
size no larger than the latest MTU of the current primary destination
address.

When determining when to segment, the SCTP implementation MUST take
into account the SCTP datagram header as well as the DATA chunk
header. The implementation MAY also take account of the space required
for a SACK chunk.

IMPLEMENTATION NOTE: if segmentation is not supported by the sender,
an error should be reported to the sender's SCTP user if the data to
be sent has a size exceeding the current MTU. In such cases the Send
primitive discussed in Section 9.1 would need to return an error
to the upper layer.

Segmentation takes the following steps:

1) the data sender SHALL break the large user message into a series of
   DATA chunks, such that each of the chunks fits into an IP
   datagram smaller than or equal to the current MTU,

2) the data sender MUST then assign, in sequence, a separate TSN to
   each of the DATA chunks in the series,

3) the data sender MUST also set the B/E bits of the first DATA chunk
   in the series to '10', the B/E bits of the last DATA chunk in the
   series to '01', and the B/E bits of all other DATA chunks in the
   series to '00'.

The data receiver MUST recognize the segmented DATA chunks, by
examining the B/E bits in each of the received DATA chunks, and queue
the segmented DATA chunks for re-assembly. Then, it shall pass the
re-assembled user message to the specific stream for possible
re-ordering and final dispatching.

Note, if the data receiver runs out of buffer space while still
waiting for more segments to complete the re-assembly of the message,
it should dispatch part of its inbound message through a partial
delivery API (see Section 9), freeing some of its receive buffer space
so that the rest of the message may be received.

5.10 Bundling and Multiplexing

An SCTP sender achieves data bundling by simply including multiple
DATA chunks in one outbound SCTP datagram. Note that the total size of
the resultant IP datagram, including the SCTP datagram and IP headers,
MUST be less than or equal to the current MTU.

Note, if the data receiver is multi-homed, the sender shall choose a
size no larger than the latest MTU of the current primary destination
address.

When multiplexing control chunks with DATA chunks, control chunks have
priority and MUST be placed first in the outbound SCTP datagram
and be transmitted first.
The transmitter MUST transmit DATA chunks within an SCTP datagram in
increasing order of TSN.

Partial chunks MUST NOT be placed in an SCTP datagram.

The receiver MUST process the chunks in order in the datagram. The
receiver uses the chunk length field to determine the end of a chunk
and beginning of the next chunk, taking account of the fact that all
chunks end on a thirty-two-bit word boundary. If the receiver detects
a partial chunk, it MUST drop the chunk.

6. Congestion control

Congestion control is one of the basic functions in the SCTP protocol.
For some applications, it may be likely that adequate resources will
be allocated to SCTP traffic to assure prompt delivery of
time-critical SCTP data; thus it would appear to be unlikely, during
normal operations, that SCTP transmissions encounter severe congestion
conditions. However, SCTP must prepare itself for adverse operational
conditions, which can develop upon partial network failures or
unexpected traffic surges. In such situations SCTP must follow correct
congestion control steps to recover from congestion quickly in order
to get data delivered as soon as possible. In the absence of network
congestion, these preventive congestion control algorithms should show
no impact on the protocol performance.

IMPLEMENTATION NOTE: as long as its specific performance requirements
are met, an implementation is always allowed to adopt a more
conservative congestion control algorithm than the one defined
below.

The congestion control algorithms used by SCTP are based on RFC 2581
[3], "TCP Congestion Control". This section describes how the
algorithms defined in RFC 2581 are adapted for use in SCTP. We first
list differences in protocol designs between TCP and SCTP, and then
describe SCTP's congestion control scheme.
The description will use the same terminology as in TCP congestion
control whenever appropriate.

Note: SCTP congestion control is always applied to the entire
association, and NOT to individual streams.

6.1 SCTP Differences from TCP Congestion control

One difference between SCTP and TCP is that the Selective
Acknowledgment function (SACK) is designed into SCTP, rather than
being an enhancement that is added to the protocol later as is the
case for TCP. SCTP SACK carries the same semantic meaning as TCP
SACK. TCP and SCTP consider the information carried in the SACK as
advisory information only. In SCTP, any DATA chunk that has been
acknowledged by SACK, including DATA that arrived at the receiving end
out of order, is NOT considered fully delivered until the Cumulative
Acknowledgement point passes the acknowledged DATA chunk. Consequently,
the value of cwnd controls the amount of outstanding data, rather than
the upper bound between the highest acknowledged sequence number and
the latest DATA chunk that can be sent within the congestion window,
as is the case in non-SACK TCP. SCTP SACK leads to different
implementations of fast-retransmit and fast-recovery from that of
non-SACK TCP. As an example see [16].

The biggest difference between SCTP and TCP, however, is multi-homing.
SCTP is designed to establish robust communication associations
between two end points each of which may be reachable by more than one
transport address. Potentially different addresses may lead to
distinct data paths between the two points, thus ideally one may
need a separate set of congestion control parameters for each of the
paths. The treatment here of congestion control for multi-homed
receivers is new with SCTP and may require refinement in the
future.
The current algorithms make the following assumptions:

o The sender always uses the same destination address until being
  instructed by the upper layer otherwise.

o The sender keeps a separate congestion control parameter set for
  each of the destination addresses it can send to (NOT for each
  source-destination pair but for each destination). The parameters
  should decay if the address is not used for a long enough time
  period.

o For each of the destination addresses, the sender does slow-start
  upon the first transmission to that address.

6.2 SCTP Slow-Start and Congestion Avoidance

The slow start and congestion avoidance algorithms MUST be used by an
SCTP sender to control the amount of outstanding data being injected
into the network. The congestion control in SCTP is employed in regard
to the association, not to an individual stream. In some situations it
may be beneficial for an SCTP sender to be more conservative than the
algorithms allow; however, an SCTP sender MUST NOT be more aggressive
than the following algorithms allow.

Like TCP, an SCTP sender uses the following three control variables on
a per destination basis to regulate its transmission rate.

o Receiver advertised window size (rwnd, in octets), which is set by
  the receiver based on its available buffer space for incoming
  packets.

o Congestion control window (cwnd, in octets), which is adjusted by
  the sender based on observed network conditions.

o Slow-start threshold (ssthresh, in octets), which is used by the
  sender to distinguish slow start and congestion avoidance phases.

SCTP also requires one additional control variable,
partial_bytes_acked, which is used during the congestion avoidance
phase to facilitate cwnd adjustment.
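
IMPLEMENTATION NOTE: one possible way to hold the variables listed
above is a small record kept per destination address. The sketch below
is illustrative only. The class name DestinationState, the
'outstanding' field, and the sendable() helper are assumptions of the
example; in particular, limiting transmission to min(cwnd, rwnd) is
borrowed from common TCP practice rather than stated in the text above.

```python
from dataclasses import dataclass


@dataclass
class DestinationState:
    """Congestion variables for one destination address, in octets."""
    mtu: int
    cwnd: int
    ssthresh: int
    partial_bytes_acked: int = 0
    outstanding: int = 0  # hypothetical: octets sent but not yet acked

    def in_slow_start(self) -> bool:
        # ssthresh separates slow start from congestion avoidance.
        return self.cwnd <= self.ssthresh

    def sendable(self, rwnd: int) -> int:
        # ASSUMPTION (TCP-style): the usable window is min(cwnd, rwnd)
        # minus what is already in flight, never below zero.
        return max(0, min(self.cwnd, rwnd) - self.outstanding)


# One record per destination address of the peer (addresses are examples).
paths = {
    "192.0.2.1": DestinationState(mtu=1500, cwnd=3000, ssthresh=65535),
    "192.0.2.2": DestinationState(mtu=1500, cwnd=3000, ssthresh=65535),
}
```

Keying the records by destination address only, not by
source-destination pair, mirrors the second assumption listed in
Section 6.1.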

Unlike TCP, an SCTP sender MUST keep a set of these control variables
for EACH destination address of its peer (when its peer is
multi-homed).

6.2.1 Slow-Start

Beginning data transmission into a network with unknown conditions
requires SCTP to probe the network to determine the available
capacity. The slow start algorithm is used for this purpose at the
beginning of a transfer, or after repairing loss detected by the
retransmission timer.

o The initial cwnd before data transmission or after a sufficiently
  long idle period MUST be <= 2*MTU.

o The initial cwnd after a retransmission timeout MUST be no more
  than 1*MTU.

o The initial value of ssthresh MAY be arbitrarily high (for example,
  some implementations use the size of the receiver advertised
  window).

o Whenever cwnd is greater than zero, the sender is allowed to have
  cwnd octets of data outstanding on that transport address.

o When cwnd is less than or equal to ssthresh an SCTP sender MUST use
  the slow start algorithm to increase cwnd (assuming the current
  congestion window is being fully utilized). If the incoming SACK
  advances the cumulative TSN, cwnd MUST be increased by at most the
  lesser of 1) the total size of the previously outstanding DATA
  chunk(s) acknowledged, and 2) the destination's path MTU.
  This protects against the ACK-Splitting attack outlined in [15].

  NOTE: In instances where the data receiver endpoint is multi-homed,
  if a SACK arrives at the data sender that advances the
  sender's cumulative TSN point, then the data sender should update
  its cwnd (or cwnds) apportioned to the destination addresses to
  which the data was transmitted. However if the SACK does not advance
  the cumulative TSN point, the data sender MUST NOT adjust the cwnd
  of any of the destination addresses.

  NOTE: because an SCTP data sender's cwnd is not tied to its
  cumulative TSN point, as duplicate SACKs come in, even though they
  may not advance the cumulative TSN point an SCTP sender can still
  use them to clock out new data. That is, the data newly
  acknowledged by the SACK diminishes the amount of data now in
  flight to less than cwnd; and so the current, unchanged value of
  cwnd now allows new data to be sent. On the other hand, the
  increase of cwnd must be tied to the cumulative TSN advancement as
  specified above. Otherwise the duplicate SACKs will not only clock
  out new data, but also will adversely clock out *more* new data
  than what has just left the network, during a time of possible
  congestion.

o When the sender does not transmit data on a given transport address,
  the cwnd of the transport address should be adjusted to
  max(cwnd / 2, 2*MTU) per RTO.

6.2.2 Congestion Avoidance

When cwnd is greater than ssthresh, cwnd should be incremented
by 1*MTU per RTT if the sender has cwnd or more octets of data
outstanding on the corresponding transport address.

In practice an implementation can achieve this goal in the
following way:

o partial_bytes_acked is initialized to 0.

o Whenever cwnd is greater than ssthresh, upon each SACK arrival,
  increase partial_bytes_acked by the total number of octets of all
  new chunks acknowledged in that SACK.

o When partial_bytes_acked is equal to or greater than cwnd and,
  before the arrival of the SACK, the sender has cwnd or more octets
  of data outstanding, increase cwnd by MTU, and reset
  partial_bytes_acked to (partial_bytes_acked - cwnd).

o Same as in the slow start, when the sender does not transmit data on
  a given transport address, the cwnd of the transport address should
  be adjusted to max(cwnd / 2, 2*MTU) per RTO.

o When all of the data transmitted by the sender has been acknowledged
  by the receiver, partial_bytes_acked is reset to 0.

6.2.3 Congestion Control

Upon detection of packet losses from SACK reports (see Section 6.2.4),
the sender should do the following:

   ssthresh = max(cwnd/2, 2*MTU)
   cwnd = ssthresh

Basically, a packet loss causes cwnd to be cut in half.

When the T3-rxt timer expires on an address, SCTP should perform
slow start by:

   ssthresh = max(cwnd/2, 2*MTU)
   cwnd = 1*MTU

and assure that no more than one data packet will be in flight on that
address until the sender receives acknowledgment for successful
delivery of data to that address.

6.2.4 Fast Retransmit on Gap Reports

In the absence of data losses, an SCTP receiver performs delayed
acknowledgment. However, whenever a receiver notices a hole in the
arriving TSN sequence, it should start sending a SACK back every time
a packet arrives carrying data.

At the sender end, whenever the sender receives a SACK that indicates
some TSN(s) missing, it SHOULD wait for 3 further miss indications
(via subsequent SACKs) on the same TSN(s) before taking action.

When the TSN(s) is reported as missing in consecutive SACKs for the
4th time, the sender shall:

1) Mark the missing DATA chunk(s) for retransmission,

2) Adjust the ssthresh and cwnd of the destination address(es) where
   the missing DATA chunks were last sent, according to the formula
   described in Section 6.2.3.

3) Determine how many of the earliest (i.e., lowest TSN) missing DATA
   chunks will fit into a single packet, subject to the constraint of
   the path MTU of the destination transport address to which the
   packet is being sent. Call this value K. Retransmit those K DATA
   chunks in a single packet.

4) Restart the T3-rxt timer ONLY IF the last SACK acknowledged the
   lowest outstanding TSN number sent to that address, or we are
   retransmitting the first outstanding DATA chunk sent to that
   address.

Note, before the above adjustments, if the received SACK also
acknowledges new data chunks and advances the cumulative TSN point,
the cwnd adjustment rules defined in Sections 6.2.1 and 6.2.2 must
be applied first.

A straightforward implementation of the above requires that the sender
keep a counter for each TSN hole first reported by a SACK; the
counter keeps track of whether 3 subsequent SACKs have reported the
same hole.

Because cwnd in SCTP indirectly bounds the number of outstanding
TSNs, the effect of TCP fast-recovery is achieved automatically with
no adjustment to the congestion control window size.

6.3 Path MTU Discovery

RFC 1191 [11] discusses "Path MTU Discovery", whereby a sender
maintains an estimate of the maximum transmission unit (MTU) along a
given Internet path and refrains from sending datagrams along that
path which exceed the MTU, other than occasional attempts to probe for
a change in the path MTU. RFC 1191 is thorough in its discussion of
the MTU discovery mechanism and strategies for determining the current
end-to-end MTU setting as well as detecting changes in this value.
RFC 1981 [12] discusses applying the same mechanisms for IPv6.

An SCTP sender SHOULD apply these techniques, and SHOULD do so on a
per-destination-address basis.

There are 4 ways in which SCTP differs from the description in RFC 1191
of applying MTU discovery to TCP:

1) SCTP associations can span multiple sets of addresses.
   Per the above comment, an SCTP sender MUST maintain separate
   MTU estimates for each destination address of its peer.

2) Elsewhere in this document, when the term "MTU" is discussed,
   it refers to the MTU associated with the destination address
   corresponding to the context of the discussion.

3) Unlike TCP, SCTP does not have a notion of "Maximum Segment
   Size". Accordingly, the MTU for each destination address
   SHOULD be initialized to a value no larger than the link MTU
   for the local interface to which datagrams for that remote
   destination address will be routed.

4) Since data transmission in SCTP is naturally structured in
   terms of TSNs rather than bytes (as is the case for TCP), the
   discussion in section 6.5 of RFC 1191 applies: when retransmitting
   a datagram to a remote address for which the datagram appears
   too large for the path MTU to that address, the datagram SHOULD
   be retransmitted without the DF bit set, allowing it to possibly
   be fragmented. Transmissions of new datagrams MUST have DF set.

Other than these differences, the discussion of TCP's use of MTU
discovery in RFCs 1191 and 1981 applies to SCTP, too, on a
per-destination-address basis.

7. Fault Management

7.1 Endpoint Failure Detection

The data sender shall keep a counter on the total number of
consecutive retransmissions to its peer (including retransmissions to
ALL the destination transport addresses of the peer if it is
multi-homed). If the value of this counter exceeds the limit indicated
in the protocol parameter 'Association.Max.Retrans', the data sender
shall consider the peer endpoint unreachable and shall stop
transmitting any more data to it (and thus the association enters the
CLOSED state). In addition, the data sender shall report the failure
to the upper layer, and optionally report back all outstanding user
data remaining in its outbound queue.
The association is automatically terminated when the peer endpoint
becomes unreachable.

The counter shall be reset each time a datagram sent to that
destination address is acknowledged by the peer endpoint, or
a HEARTBEAT-ACK is received from the peer endpoint.

7.2 Path Failure Detection

When the remote endpoint is multi-homed, the data sender should keep a
'retrans.count' counter for each of the destination transport
addresses of the remote endpoint.

Each time the T3-rxt timer expires on any address, or when a HEARTBEAT
sent to an idle address is not acknowledged, the 'retrans.count'
counter of that destination address will be incremented. When the
value in 'retrans.count' exceeds the protocol parameter
'Path.Max.Retrans' of that destination address, the data sender should
mark the destination transport address as inactive, and a notification
SHOULD be sent to the upper layer.

When an outstanding TSN is acknowledged or a HEARTBEAT sent to that
address is acknowledged with a HEARTBEAT-ACK, the data sender shall
clear the 'retrans.count' counter of the destination transport address
to which the datagram was last sent (or HEARTBEAT was sent). Note,
when the data receiver is multi-homed and the last datagram sent was a
retransmission to an alternate address of the receiver, there exists
an ambiguity as to whether or not the acknowledgment should be
credited to the address of the last datagram sent. However, this
ambiguity does not seem to bear any significant consequence to SCTP
behavior. If this ambiguity is undesirable, the data sender may choose
not to clear the 'retrans.count' counter if the last datagram sent was
a retransmission.
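
IMPLEMENTATION NOTE: the two levels of counting described in Sections
7.1 and 7.2 can be sketched as follows. This is an illustration under
stated assumptions, not a required design. The class and method names
are hypothetical, and the sketch assumes that each T3-rxt expiry
counts as one retransmission toward the association-wide counter while
an unanswered HEARTBEAT counts only toward the per-address counter.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class PathState:
    retrans_count: int = 0  # the per-address 'retrans.count'
    active: bool = True


@dataclass
class FailureDetector:
    association_max_retrans: int  # 'Association.Max.Retrans'
    path_max_retrans: int         # 'Path.Max.Retrans'
    paths: Dict[str, PathState] = field(default_factory=dict)
    assoc_retrans: int = 0        # consecutive retransmissions to the peer
    peer_reachable: bool = True

    def _path_miss(self, addr: str) -> None:
        path = self.paths.setdefault(addr, PathState())
        path.retrans_count += 1
        if path.retrans_count > self.path_max_retrans:
            # A real stack would also notify the upper layer here.
            path.active = False

    def on_t3_expiry(self, addr: str) -> None:
        """T3-rxt expired on addr; a retransmission follows."""
        self._path_miss(addr)
        self.assoc_retrans += 1
        if self.assoc_retrans > self.association_max_retrans:
            self.peer_reachable = False  # association enters CLOSED

    def on_heartbeat_miss(self, addr: str) -> None:
        """A HEARTBEAT sent to an idle addr was not acknowledged."""
        self._path_miss(addr)

    def on_ack(self, addr: str) -> None:
        """Outstanding TSN acked, or HEARTBEAT-ACK received, for addr."""
        self.paths.setdefault(addr, PathState()).retrans_count = 0
        self.assoc_retrans = 0
```

The sketch clears only the counters on acknowledgment; it does not by
itself mark an inactive address as active again.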

Note, when configuring the SCTP endpoint, the user should avoid
having the value of 'Association.Max.Retrans' larger than the
summation of the 'Path.Max.Retrans' of all the destination addresses
for the remote endpoint. Otherwise, all the destination addresses may
become inactive while the endpoint still considers the peer endpoint
reachable. When this condition occurs, how SCTP chooses to function
is implementation specific.

Note, when the primary destination address is marked inactive (due to
excessive retransmissions, for instance), the sender MAY automatically
transmit new datagrams to an alternate destination address if one
exists and is active. This is, however, an implementation option.

7.3 Path Heartbeat

By default, an SCTP endpoint shall monitor the reachability of the
idle destination transport address(es) of its peer by sending
HEARTBEAT messages periodically to the destination transport
address(es).

A destination transport address is considered "idle" if no new chunk
which can be used for updating path RTT (usually including first
transmission DATA, INIT, COOKIE, etc.) and no heartbeat has been sent
to it within the current heartbeat period of that address. This
applies to both active and inactive destination addresses.

The upper layer can optionally initiate the following functions:

A) disable heartbeat on a specific destination transport address of a
   given association,

B) re-enable heartbeat on a specific destination transport address of
   a given association, and,

C) request an on-demand heartbeat on a specific destination transport
   address of a given association.

The endpoint should increment the respective 'retrans.count' counter
of the destination transport address each time a HEARTBEAT is sent to
that address.

When the value of this counter reaches the protocol parameter
'Path.Max.Retrans', the endpoint should mark the corresponding
destination address as inactive if it is not so marked, and may also
optionally report to the upper layer the change of reachability of
this destination address. After this, the endpoint should continue
heartbeat on this destination address but should stop increasing the
counter.

The sender of the HEARTBEAT message should include in the Heartbeat
Information field of the message the current time when the message is
sent out and the information on the destination address to which the
message is sent.

The receiver of the HEARTBEAT should immediately respond with a
HEARTBEAT ACK that contains the Heartbeat Information field copied out
from the received HEARTBEAT message.

Upon the receipt of the HEARTBEAT ACK, the sender of the HEARTBEAT
should clear the 'retrans.count' counter of the destination transport
address to which the HEARTBEAT was sent, and mark the destination
transport address as active if it is not so marked. The endpoint may
optionally report to the upper layer when an inactive destination
address is marked as active due to the reception of the latest
HEARTBEAT ACK.

The receiver of the HEARTBEAT ACK should also perform an RTT
measurement for that destination transport address using the time
value carried in the HEARTBEAT ACK message.

On an idle destination address that is allowed to heartbeat, it is
RECOMMENDED that a HEARTBEAT message be sent once per RTO of that
destination address, with jittering of +/- 50%, and exponential
back-off if the previous HEARTBEAT is unanswered.

A primitive is provided for the SCTP user to change the heartbeat
interval and turn on or off the heartbeat on a given destination
address.
Note, the heartbeat interval set by the SCTP user on any of
the idle destination addresses SHOULD be no smaller than the RTO of
that destination address. Separate timers may be used to control the
heartbeat transmission for different idle destination addresses.

7.4 Handle "Out of the blue" Packets

An SCTP datagram is called an "out of the blue" (OOTB) datagram if it
is correctly formed, i.e., passed the receiver's Adler-32 check (see
Section 5.8), but the receiver is not able to identify the association
to which this datagram belongs.

The receiver of an OOTB datagram MUST do the following:

1) check if the OOTB datagram contains an ABORT chunk. If so, the
   receiver MUST silently discard the OOTB datagram and take no
   further action. Otherwise,

2) the receiver should respond to the sender of the OOTB datagram with
   an ABORT. When sending the ABORT, the receiver of the OOTB datagram
   MUST fill in the Verification Tag field of the outbound datagram
   with the value found in the Verification Tag field of the OOTB
   datagram. After sending this ABORT, the receiver of the OOTB
   datagram shall discard the OOTB datagram and take no further
   action.

7.5 Verification Tag

The Verification Tag rules defined in this section apply when sending
or receiving SCTP datagrams which do NOT contain an INIT, SHUTDOWN
ACK, or ABORT chunk. The rules for sending and receiving SCTP
datagrams containing one of these chunk types are discussed separately
in Section 7.5.1.

When sending an SCTP datagram, the sender MUST fill in the
Verification Tag field of the outbound datagram with the tag value of
the peer endpoint to which this SCTP datagram is destined.

When receiving an SCTP datagram, the receiver MUST ensure that the
value in the Verification Tag field of the received SCTP datagram
matches its own Tag.
If the received tag value does not match the receiver's own tag value,
the receiver shall silently discard the datagram and shall not process
it any further.

7.5.1 Exceptions in Verification Tag Rules

A) Rules for datagram carrying INIT:

   - The sender MUST set the Verification Tag of the datagram to 0.

   - The receiver, when noticing an incoming SCTP datagram with the
     Verification Tag set to 0, should continue to process the
     datagram only if an INIT chunk is present. Otherwise, the
     receiver MUST silently discard the datagram and take no further
     action.

B) Rules for datagram carrying ABORT:

   - The sender shall always fill in the Verification Tag field of the
     outbound datagram with the destination endpoint's tag value if it
     is known.

   - If the ABORT is sent in response to an OOTB datagram, the sender
     MUST follow the procedure described in Section 7.4.

   - The receiver MUST accept the datagram IF the Verification Tag
     matches either its own tag, OR the tag of its peer. Otherwise,
     the receiver MUST silently discard the datagram and take no
     further action.

C) Rules for datagram carrying SHUTDOWN ACK:

   - When sending a SHUTDOWN ACK, the sender is allowed to either use
     the destination endpoint's tag or set the Verification Tag field
     of the outbound datagram to 0.

   - The receiver of a SHUTDOWN ACK shall accept the datagram IF the
     Verification Tag field of the datagram matches its own tag OR is
     set to 0. Otherwise, the receiver MUST silently discard the
     datagram and take no further action. NOTE: the receiver of the
     SHUTDOWN ACK MUST ignore the chunk if it is not in the SHUTDOWN
     SENT state.

8. Termination of Association

All existing associations should be terminated when an endpoint exits
from service. An association can be terminated by either close or
shutdown.
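
IMPLEMENTATION NOTE: the receive-side Verification Tag rules of
Sections 7.5 and 7.5.1 can be collected into a single check, sketched
below. The function name accept_datagram, the string chunk-type
labels, and the order in which the special cases are tested are
assumptions of the example. The sketch also assumes that an INIT
carried with a non-zero tag is rejected, which the text implies but
does not state, and it does not model the SHUTDOWN SENT state check
for SHUTDOWN ACK.

```python
from typing import Iterable, Optional


def accept_datagram(tag: int, chunk_types: Iterable[str],
                    my_tag: int, peer_tag: Optional[int] = None) -> bool:
    """Return True if the datagram passes the Verification Tag check."""
    chunks = set(chunk_types)
    if "INIT" in chunks:
        # Rule A: INIT is sent with tag 0.
        # ASSUMPTION: an INIT with any other tag is discarded.
        return tag == 0
    if "ABORT" in chunks:
        # Rule B: accept the receiver's own tag or the peer's tag.
        return tag == my_tag or (peer_tag is not None and tag == peer_tag)
    if "SHUTDOWN ACK" in chunks:
        # Rule C: accept the receiver's own tag or 0.
        return tag == my_tag or tag == 0
    # General rule: the tag must match the receiver's own tag, so a
    # tag of 0 without an INIT chunk is discarded here as well.
    return tag == my_tag
```

A return value of False corresponds to silently discarding the
datagram and taking no further action.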

8.1 Close of an Association

When an endpoint decides to close down an existing association, it
shall send an ABORT message to its peer endpoint. The sender MUST fill
in the peer's Verification Tag in the outbound datagram and MUST NOT
bundle any DATA chunk with the ABORT.

No acknowledgment is required for an ABORT message. Under any
circumstances, an endpoint MUST NOT respond to any received datagram
that contains an ABORT with its own ABORT (also see Section 7.4).

The receiver shall apply the special Verification Tag check rules
described in Section 7.5.1 when handling the datagram carrying an
ABORT.

After checking the Verification Tag, the peer shall remove the
association from its record, and shall report the termination to its
upper layer.

8.2 Shutdown of an Association

Using the TERMINATE primitive (see Section 9.1), the upper layer of an
endpoint in an association can gracefully shut down the association.
This will guarantee that all outstanding datagrams from the peer of
the shutdown initiator are delivered before the association
terminates.

Upon receipt of the TERMINATE primitive from its upper layer, the
initiator endpoint enters the SHUTDOWN-PENDING state and remains there
until all outstanding TSNs have been acknowledged by the far end. It
accepts no new data from its upper layer, but retransmits data to the
far end if necessary to fill gaps.

Once all outstanding TSNs have been acknowledged, the initiator
endpoint shall send a SHUTDOWN message to the peer of the association,
and shall include the last cumulative TSN it has received from the
peer in the 'Cumulative TSN ACK' field. It shall then start the
T2-shutdown timer and enter the Shutdown-sent state. If the timer
expires, the initiator must re-send the SHUTDOWN with the updated last
TSN received from its peer.

The same rules in Section 5.3 SHALL be followed to determine the
proper timer value for T2-shutdown. The sender of the SHUTDOWN message
may also optionally include a SACK to indicate any gaps by bundling
both the SACK and SHUTDOWN message together.

Note, the sender of a SHUTDOWN should limit the number of
retransmissions of the SHUTDOWN message to the protocol parameter
'Association.Max.Retrans'. If this threshold is exceeded the endpoint
should destroy the TCB and may report the endpoint unreachable to the
upper layer (and thus the association enters the CLOSED state).

Upon the reception of the SHUTDOWN, the peer shall enter the
Shutdown-received state, and shall verify, by checking the TSN ACK
field of the message, that all its outstanding datagrams have been
received by the initiator.

If there are still outstanding datagrams left, the peer shall mark
them for retransmission and start the retransmit procedure as defined
in Section 5.3.

While in Shutdown-sent state, the initiator shall immediately respond
to each inbound SCTP datagram containing user data from the peer with
a SACK and restart the T2-shutdown timer.

If there are no more outstanding datagrams, the peer shall send a
SHUTDOWN ACK and then remove all record of the association.

Upon the receipt of the SHUTDOWN ACK, the initiator shall stop the
T2-shutdown timer and remove all record of the association.

Note: it should be the responsibility of the initiator to assure
that all the outstanding datagrams on its side have been resolved
before it initiates the shutdown procedure.

Note: an endpoint shall reject any new data request from its upper
layer if it is in Shutdown-sent or Shutdown-received state until
completion of the sequence.
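
IMPLEMENTATION NOTE: the graceful shutdown sequence described above
can be pictured as a small state machine. The sketch below is a
simplified illustration, not a complete implementation. The class,
method, and attribute names are hypothetical; transmission is modeled
by appending message names to a list; and retransmission of
outstanding data, SACK generation, and the retransmission limit are
left out.

```python
from enum import Enum, auto


class State(Enum):
    ESTABLISHED = auto()
    SHUTDOWN_PENDING = auto()
    SHUTDOWN_SENT = auto()
    SHUTDOWN_RECEIVED = auto()
    CLOSED = auto()


class Endpoint:
    def __init__(self) -> None:
        self.state = State.ESTABLISHED
        self.outstanding = 0     # hypothetical count of unacked TSNs
        self.sent = []           # messages "transmitted", for illustration
        self.t2_running = False  # T2-shutdown timer

    def accepts_new_data(self) -> bool:
        # New data from the upper layer is taken only before shutdown.
        return self.state is State.ESTABLISHED

    def terminate(self) -> None:
        """TERMINATE primitive from the upper layer (initiator side)."""
        self.state = State.SHUTDOWN_PENDING
        self._progress()

    def on_all_acked(self) -> None:
        """All outstanding TSNs have been acknowledged by the far end."""
        self.outstanding = 0
        self._progress()

    def _progress(self) -> None:
        if self.outstanding:
            return  # keep waiting (and retransmitting) until acked
        if self.state is State.SHUTDOWN_PENDING:
            self.sent.append("SHUTDOWN")
            self.t2_running = True
            self.state = State.SHUTDOWN_SENT
        elif self.state is State.SHUTDOWN_RECEIVED:
            self.sent.append("SHUTDOWN ACK")
            self.state = State.CLOSED  # remove all record of association

    def on_t2_expiry(self) -> None:
        if self.state is State.SHUTDOWN_SENT:
            self.sent.append("SHUTDOWN")  # re-send with updated TSN

    def on_shutdown(self) -> None:
        """SHUTDOWN received from the initiator (peer side)."""
        if self.state is State.ESTABLISHED:
            self.state = State.SHUTDOWN_RECEIVED
            self._progress()

    def on_shutdown_ack(self) -> None:
        if self.state is State.SHUTDOWN_SENT:
            self.t2_running = False
            self.state = State.CLOSED
```

In this model the initiator walks through SHUTDOWN_PENDING and
SHUTDOWN_SENT to CLOSED, while the peer passes through
SHUTDOWN_RECEIVED and closes as soon as it has nothing outstanding.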
3598 Note: if an endpoint is in Shutdown-sent state and receives an INIT 3599 message from its peer, it should discard the INIT message and 3600 retransmit the shutdown message. The sender of the INIT should respond 3601 with a stand-alone SHUTDOWN ACK in an SCTP datagram with the 3602 Verification Tag field of its common header set to 0, and let the 3603 normal T1-init timer cause the INIT message to be retransmitted and 3604 thus restart the association. 3606 Note: if an endpoint is in Shutdown-sent state and receives a 3607 SHUTDOWN message from its peer, the endpoint shall respond 3608 immediately with a SHUTDOWN ACK and shall stop the T2-shutdown timer 3609 and remove all record of the association. 3611 9. Interface with Upper Layer 3613 The Upper Layer Protocols (ULP) shall request services by passing 3614 primitives to SCTP and shall receive notifications from SCTP for 3615 various events. 3618 The primitives and notifications described in this section should be 3619 used as a guideline for implementing SCTP. The following functional 3620 description of ULP interface primitives is shown for illustrative 3621 purposes. We must warn readers that different SCTP implementations may 3622 have different ULP interfaces. However, all SCTPs must provide a 3623 certain minimum set of services to guarantee that all SCTP 3624 implementations can support the same protocol hierarchy. 3626 9.1 ULP-to-SCTP 3628 The following sections functionally characterize a ULP/SCTP interface. 3629 The notation used is similar to most procedure or function calls in 3630 high level languages. 3632 The ULP primitives described below specify the basic functions the 3633 SCTP must perform to support inter-process communication. Individual 3634 implementations must define their own exact format, and may provide 3635 combinations or subsets of the basic functions in single calls.
3637 A) Initialize 3639 Format: INITIALIZE ([local port], [local eligible address]) 3640 -> local SCTP instance name 3642 This primitive allows SCTP to initialize its internal data structures 3643 and allocate necessary resources for setting up its operation 3644 environment. Note that once SCTP is initialized, the ULP can communicate 3645 directly with other endpoints without re-invoking this primitive. 3647 A local SCTP instance name will be returned to the ULP by the SCTP. 3649 Mandatory attributes: 3651 None. 3653 Optional attributes: 3655 The following types of attributes may be passed along with the 3656 primitive: 3658 o local port - SCTP port number, if the ULP wants it to be specified; 3660 o local eligible address - A single address that the local SCTP 3661 endpoint should bind to. By default all transport interface cards 3662 should be used by the local endpoint. 3664 IMPLEMENTATION NOTE: if this optional attribute is supported by an 3665 implementation, it will be the responsibility of the implementation 3666 to enforce that the IP source address field of any SCTP datagram 3667 sent out by this endpoint MUST contain the IP address 3668 indicated in the local eligible address. 3671 B) Associate 3673 Format: ASSOCIATE(local SCTP instance name, destination transport 3674 addr, outbound stream count) 3675 -> association id [,destination transport addr list] [,outbound stream 3676 count] 3678 This primitive allows the upper layer to initiate an association to a 3679 specific peer endpoint. 3681 The peer endpoint shall be specified by one of the transport addresses 3682 which define the endpoint (see section 1.4). If the local SCTP 3683 instance has not been initialized, the ASSOCIATE is considered an 3684 error. 3686 An association id, which is a local handle to the SCTP association, 3687 will be returned on successful establishment of the association.
If 3688 SCTP is not able to open an SCTP association with the peer endpoint, 3689 an error is returned. 3691 Other association parameters may be returned, including the complete 3692 set of destination transport addresses of the peer as well as the outbound 3693 stream count of the local endpoint. One of the transport addresses from 3694 the returned destination addresses will be selected by the local 3695 endpoint as the default primary destination address for sending SCTP 3696 datagrams to this peer. The returned "destination transport addr 3697 list" can be used by the ULP to change the default primary destination 3698 address or to force sending a datagram to a specific transport address. 3700 IMPLEMENTATION NOTE: If the ASSOCIATE primitive is implemented as a 3701 blocking function call, the ASSOCIATE primitive can return 3702 association parameters in addition to the association id upon 3703 successful establishment. If the ASSOCIATE primitive is implemented as a 3704 non-blocking call, only the association id shall be returned and 3705 association parameters shall be passed using the COMMUNICATION UP 3706 notification. 3708 Mandatory attributes: 3710 o local SCTP instance name - obtained from the INITIALIZE operation. 3712 o destination transport addr - specified as one of the transport 3713 addresses of the peer endpoint with which the association is to be 3714 established. 3716 o outbound stream count - the number of outbound streams the ULP 3717 would like to open towards this peer endpoint. 3719 Optional attributes: 3721 None. 3724 C) Terminate 3726 Format: TERMINATE(association id) 3727 -> result 3729 Gracefully terminates an association. Any locally queued user data 3730 will be delivered to the peer. The association will be terminated only 3731 after the peer acknowledges all the messages sent. A success code 3732 will be returned on successful termination of the association.
If 3733 attempting to terminate the association results in a failure, an error 3734 code shall be returned. 3736 Mandatory attributes: 3738 o association id - local handle to the SCTP association 3740 Optional attributes: 3742 None. 3744 D) Abort 3746 Format: ABORT(association id [, cause code]) 3747 -> result 3749 Ungracefully terminates an association. Any locally queued user data 3750 will be discarded and an ABORT message will be sent to the peer. A success 3751 code will be returned on successful abortion of the association. If 3752 attempting to abort the association results in a failure, an error 3753 code shall be returned. 3755 Note: If possible, the SCTP should attempt to return all un-acknowledged 3756 data to the upper layer; however, this behavior is implementation 3757 dependent. 3759 Mandatory attributes: 3761 o association id - local handle to the SCTP association 3763 Optional attributes: 3765 o cause code - reason for the abort to be passed to the peer. 3770 E) Send 3772 Format: SEND(association id, buffer address, byte count [,context] 3773 [,stream id] [,life time] [,destination transport address] [,un-order 3774 flag] [,no-bundle flag]) 3775 -> result 3777 This is the main method to send user data via SCTP. 3779 Mandatory attributes: 3781 o association id - local handle to the SCTP association 3783 o buffer address - the location where the user message to be 3784 transmitted is stored; 3786 o byte count - The size of the user data in number of octets; 3788 Optional attributes: 3790 o context - optional information that will be carried in the 3791 sending failure notification to the ULP if the transportation of 3792 this datagram fails. 3794 o stream id - to indicate which stream to send the data on. If not 3795 specified, stream 0 will be used. 3797 o life time - specifies the life time of the user data. The user data 3798 will not be sent by SCTP after the life time expires.
This 3799 parameter can be used to avoid efforts to transmit stale 3800 user messages. SCTP notifies the ULP if transport of the data cannot be 3801 initiated (i.e. the data cannot be sent to the destination via SCTP's 3802 send primitive) within the life time. However, the 3803 user data will be transmitted if a TSN has been assigned to the 3804 user data before the life time expired. 3806 o destination transport address - specified as one of the destination 3807 transport addresses of the peer endpoint to which this message 3808 should be sent. Whenever possible, SCTP should use this destination 3809 transport address for sending the datagram, instead of the current 3810 primary destination transport address. 3812 o un-order flag - this flag, if present, indicates that the user 3813 would like the data delivered in an un-ordered fashion to the peer. 3815 o no-bundle flag - instructs SCTP not to bundle the user data with 3816 other outbound DATA chunks. Note: SCTP may still bundle even when 3817 this flag is present, when faced with network congestion. 3820 F) Set Primary 3822 Format: SETPRIMARY(association id, destination transport address) 3823 -> result 3825 Instructs the local SCTP to use the specified destination transport 3826 address as primary destination address for sending datagrams. 3828 The result of attempting this operation shall be returned. If the 3829 specified destination transport address is not present in the 3830 "destination transport address list" returned earlier in an associate 3831 command or communication up notification, an error shall be returned. 3833 Mandatory attributes: 3835 o association id - local handle to the SCTP association 3837 o destination transport address - specified as one of the transport 3838 addresses of the peer endpoint, which should be used as primary 3839 address for sending datagrams. This overrides the current primary 3840 address information maintained by the local SCTP endpoint.
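The interaction between SETPRIMARY and the optional destination transport address of SEND can be illustrated with a short sketch. The Python below is hypothetical and not an interface defined by this document: the class name, the "ok"/"error" result strings, the choice of the first listed address as the default primary, and the fallback to the primary when SEND names an unknown address are all assumptions made for the example.

```python
class Association:
    """Hypothetical per-association record; names are illustrative only."""

    def __init__(self, destination_list):
        if not destination_list:
            raise ValueError("at least one destination transport address is required")
        # The "destination transport address list" learned for the peer.
        self.destination_list = list(destination_list)
        # Assumption: the first listed address serves as the default primary.
        self.primary = self.destination_list[0]

    def set_primary(self, address):
        """SETPRIMARY: an address outside the known list is an error."""
        if address not in self.destination_list:
            return "error"
        self.primary = address
        return "ok"

    def destination_for_send(self, requested=None):
        """Choose the address used for one SEND call.

        A requested address that belongs to the peer is preferred over the
        current primary; otherwise the primary is used.
        """
        if requested in self.destination_list:
            return requested
        return self.primary
```

A per-message destination passed to SEND does not change the primary; only set_primary() does.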
3841 Stewart, et al [Page 74] 3842 G) Receive 3844 Format: RECEIVE(association id, buffer address, buffer size 3845 [,stream id]) 3846 -> byte count [,transport address] [,stream id] [,stream sequence number] 3847 [,partial flag] [, delivery number] 3849 This primitive shall read the first user message in the SCTP in-queue 3850 to ULP, if there is one available, into the specified buffer. The size 3851 of the message read, in octets, will be returned. It may, depending on 3852 the specific implementation, also return other information such as the 3853 sender's address, the stream id on which it is received, whether there 3854 are more messages available for retrieval, etc. For ordered messages, 3855 their stream sequence number may also be returned. 3857 Depending upon the implementation, if this primitive is invoked when 3858 no message is available the implementation should return an indication 3859 of this condition or should block the invoking process until data does 3860 become available. 3862 Mandatory attributes: 3864 o association id - local handle to the SCTP association 3866 o buffer address - the memory location indicated by the ULP to store 3867 the received message. 3869 o buffer size - the maximum size of data to be received, in octets. 3871 Optional attributes: 3873 o stream id - to indicate which stream to receive the data on. 3875 o stream sequence number - the stream sequence number assigned by the 3876 sending SCTP peer. 3878 o partial flag - if this returned flag is set to 1, then this 3879 message is a partial delivery of the whole message. When 3880 this flag is set, the stream id and stream sequence number MUST 3881 accompany this receive. When this flag is set to 0, it indicates 3882 that no more deliveries will be received for this stream sequence 3883 number. 
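A ULP that receives partial deliveries has to collect the pieces that share a stream id and stream sequence number until a delivery arrives with the partial flag set to 0. The Python sketch below shows one way this might be done. It is illustrative only: the function name and the tuple layout (stream id, stream sequence number, partial flag, data) are assumptions, and RECEIVE itself is represented simply by an iterable of such tuples.

```python
def assemble(deliveries):
    """Join partial deliveries into whole messages.

    `deliveries` yields (stream_id, ssn, partial_flag, data) tuples in the
    order RECEIVE returned them. A message is complete when a delivery for
    its (stream_id, ssn) pair arrives with partial_flag == 0.
    """
    pending = {}    # (stream_id, ssn) -> bytes collected so far
    complete = []   # finished messages, in order of completion
    for stream_id, ssn, partial_flag, data in deliveries:
        key = (stream_id, ssn)
        pending[key] = pending.get(key, b"") + data
        if not partial_flag:
            complete.append((stream_id, ssn, pending.pop(key)))
    return complete
```

Keying on the pair rather than on the stream id alone reflects the requirement that both values accompany a receive whose partial flag is set.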
3885 Stewart, et al [Page 75] 3886 H) Status 3888 Format: STATUS(association id) 3889 -> status data 3891 This primitive should return a data block containing the following 3892 information: 3893 association connection state, 3894 destination transport address list, 3895 destination transport address reachability state, 3896 current receiver window size, 3897 current congestion window sizes, 3898 number of DATA chunks awaiting acknowledgment, 3899 number of DATA chunks pending receipt, 3900 primary destination transport address, 3901 SRTT on primary destination address, 3902 RTO on primary destination address, 3903 SRTT and RTO on other destination addresses, etc. 3905 Mandatory attributes: 3907 o association id - local handle to the SCTP association 3909 Optional attributes: 3911 None. 3913 I) Change Heartbeat 3915 Format: CHANGEHEARTBEAT(association id, destination transport address, 3916 new state [,interval]) 3917 -> result 3919 Instructs the local endpoint to enable or disable heart beat on the 3920 specified destination transport address. 3922 The result of attempting this operation shall be returned. 3923 Note, even when enabled, heart beat will not take place if the 3924 destination transport address is not idle. 3926 Mandatory attributes: 3928 o association id - local handle to the SCTP association 3930 o destination transport address - specified as one of the transport 3931 addresses of the peer endpoint. 3933 o new state - the new state of heart beat for this destination 3934 transport address (either enabled or disabled). 3936 Optional attributes: 3938 o interval - if present, indicates the frequency of the heart beat if 3939 this is to enable heart beat on a destination transport 3940 address. Default interval is the RTO of the destination address. 
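The effect of CHANGEHEARTBEAT on a single destination can be sketched as follows. This Python fragment is a simplified illustration under stated assumptions: the class and field names are hypothetical, heart beat is assumed to be enabled initially, and a destination is treated as idle once no traffic has been sent to it for one heart beat interval, which is one possible reading of "idle" and not a definition taken from this document.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Destination:
    """Hypothetical per-destination heart beat settings."""
    rto: float                        # current RTO for this address, in seconds
    heartbeat_enabled: bool = True    # assumption: enabled unless changed
    interval: Optional[float] = None  # None means "use the RTO" (the default)
    last_used: float = 0.0            # time of the last transmission to this address

    def change_heartbeat(self, enabled, interval=None):
        """CHANGEHEARTBEAT: set the new state and, optionally, the interval."""
        self.heartbeat_enabled = enabled
        if enabled and interval is not None:
            self.interval = interval
        return "ok"

    def heartbeat_due(self, now):
        """True if a heart beat should be sent at time `now`."""
        if not self.heartbeat_enabled:
            return False
        period = self.interval if self.interval is not None else self.rto
        # No heart beat while the address is still carrying traffic.
        return now - self.last_used >= period
```

An implementation would update last_used on every transmission to the address, so that heart beats are suppressed for as long as the address is in active use.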
3942 Stewart, et al [Page 76] 3943 J) Request HeartBeat 3945 Format: REQUESTHEARTBEAT(association id, destination transport 3946 address) 3947 -> result 3949 Instructs the local endpoint to perform a HeartBeat on the specified 3950 destination transport address of the given association. The returned 3951 result should indicate whether the transmission of the HEARTBEAT 3952 message to the destination address is successful. 3954 Mandatory attributes: 3956 o association id - local handle to the SCTP association 3958 o destination transport address - the transport address of the 3959 association on which a heartbeat should be issued. 3961 K) Get SRTT Report 3963 Format: GETSRTTREPORT(association id, destination transport address) 3964 -> srtt result 3966 Instructs the local SCTP to report the current SRTT measurement on the 3967 specified destination transport address of the given association. The 3968 returned result can be an integer containing the most recent SRTT in 3969 milliseconds. 3971 Mandatory attributes: 3973 o association id - local handle to the SCTP association 3975 o destination transport address - the transport address of the 3976 association on which the SRTT measurement is to be reported. 3978 Stewart, et al [Page 77] 3979 L) Set Failure Threshold 3981 Format: SETFAILURETHRESHOLD(association id, destination transport 3982 address, failure threshold) 3983 -> result 3985 This primitive allows the local SCTP to customize the reachability 3986 failure detection threshold 'Path.Max.Retrans' for the specified 3987 destination address. 3989 Mandatory attributes: 3991 o association id - local handle to the SCTP association 3993 o destination transport address - the transport address of the 3994 association on which the failure detection threshold is to be set. 3996 o failure threshold - the new value of 'Path.Max.Retrans' for the 3997 destination address. 
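The role of the failure threshold can be illustrated with a small per-destination error counter. The Python below is a sketch, not a normative algorithm: the names are hypothetical, the default of 5 is an arbitrary placeholder, the destination is assumed to be marked down when the count reaches the threshold, and the counter is assumed to reset on any acknowledgment from that address.

```python
class PathMonitor:
    """Hypothetical reachability tracker for one destination transport address."""

    def __init__(self, threshold=5):
        self.threshold = threshold  # stands in for 'Path.Max.Retrans'
        self.errors = 0
        self.reachable = True

    def set_failure_threshold(self, value):
        """SETFAILURETHRESHOLD for this destination."""
        if value < 1:
            return "error"
        self.threshold = value
        return "ok"

    def on_timeout(self):
        """A retransmission or heart beat to this address timed out.

        Returns "down" at the moment the address is marked unreachable,
        otherwise None.
        """
        self.errors += 1
        if self.reachable and self.errors >= self.threshold:
            self.reachable = False
            return "down"
        return None

    def on_ack(self):
        """The address answered; returns "up" if it had been marked down."""
        self.errors = 0
        if not self.reachable:
            self.reachable = True
            return "up"
        return None
```

The "down" and "up" return values mark the points at which an implementation would report the change in reachability to its ULP.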
3999 M) Set Protocol Parameters 4001 Format: SETPROTOCOLPARAMETERS(association id [,destination transport 4002 address], protocol parameter list) 4003 -> result 4005 This primitive allows the local SCTP to customize the protocol 4006 parameters. 4008 Mandatory attributes: 4010 o association id - local handle to the SCTP association 4012 o protocol parameter list - The specific names and values of the 4013 protocol parameters (e.g., Association.Max.Retrans [see Section 4014 13]) that the SCTP user wishes to customize. 4016 Optional attributes: 4018 o destination transport address - some of the protocol parameters may 4019 be set on a per destination transport address basis. 4021 9.2 SCTP-to-ULP 4023 It is assumed that the operating system or application environment 4024 provides a means for the SCTP to asynchronously signal the ULP 4025 process. When SCTP does signal a ULP process, certain information is 4026 passed to the ULP. 4028 IMPLEMENTATION NOTE: in some cases this may be done through a 4029 separate socket or error channel. 4032 A) DATA ARRIVE notification 4034 SCTP shall invoke this notification on the ULP when a user message is 4035 successfully received and ready for retrieval. 4037 The following may optionally be passed with the notification: 4039 o association id - local handle to the SCTP association 4041 o stream id - to indicate which stream the data is received on. 4043 B) SEND FAILURE notification 4045 If a message cannot be delivered, SCTP shall invoke this notification 4046 on the ULP. 4048 The following may optionally be passed with the notification: 4050 o association id - local handle to the SCTP association 4052 o data - the location where the ULP can find the un-delivered message. 4054 o cause code - indicating the reason for the failure, e.g., size too 4055 large, message life-time expiration, etc. 4057 o context - optional information associated with this message (see 4058 E in section 9.1).
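The context value supplied with SEND lets the ULP match a later SEND FAILURE notification to the message that failed. The Python sketch below illustrates this on the ULP side only. The class, the method names, and the use of a dictionary keyed by context are assumptions made for the example and are not part of any defined interface.

```python
class Ulp:
    """Hypothetical upper layer that tracks messages by their context value."""

    def __init__(self):
        self.in_flight = {}  # context -> message handed to SEND
        self.failed = []     # (association id, cause code, message) records

    def send(self, context, message):
        # A real ULP would also invoke the SEND primitive here.
        self.in_flight[context] = message

    def on_send_failure(self, association_id, cause_code, context=None):
        """Handler for the SEND FAILURE notification.

        The context is optional, so the lookup tolerates a missing or
        unknown value and records None for the message in that case.
        """
        message = self.in_flight.pop(context, None)
        self.failed.append((association_id, cause_code, message))
```

Because the attributes of the notification are optional, a handler of this kind cannot rely on the context being present.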
4060 C) NETWORK STATUS CHANGE notification 4062 When a destination transport address is marked down (e.g., when SCTP 4063 detects a failure), or marked up (e.g., when SCTP detects a recovery), 4064 SCTP shall invoke this notification on the ULP. 4066 The following shall be passed with the notification: 4068 o association id - local handle to the SCTP association 4070 o destination transport address - This indicates the destination 4071 transport address of the peer endpoint affected by the change; 4073 o new-status - This indicates the new status. 4076 D) COMMUNICATION UP notification 4078 This notification is used when SCTP becomes ready to send or receive 4079 user messages, or when a lost communication to an endpoint is 4080 restored. 4082 IMPLEMENTATION NOTE: If the ASSOCIATE primitive is implemented as a 4083 blocking function call, the association parameters are returned as a 4084 result of the ASSOCIATE primitive itself. In that case, 4085 COMMUNICATION UP notification is optional at the association 4086 initiator's side. 4088 The following shall be passed with the notification: 4090 o association id - local handle to the SCTP association 4092 o status - This indicates what type of event has occurred 4094 o destination transport address list - the complete set of transport 4095 addresses of the peer 4097 o outbound stream count - the maximum number of streams allowed to be 4098 used in this association by the ULP 4100 o inbound stream count - the number of streams the peer endpoint 4101 has requested with this association (this may not be the same 4102 number as 'outbound stream count'). 4105 E) COMMUNICATION LOST notification 4107 When SCTP loses communication to an endpoint completely or detects 4108 that the endpoint has performed an abort or graceful shutdown 4109 operation, it shall invoke this notification on the ULP.
4111 The following shall be passed with the notification: 4113 o association id - local handle to the SCTP association 4115 o status - This indicates what type of event has occurred; 4117 The following may optionally be passed with the notification: 4119 o unsent-messages - The number and location of un-sent messages 4120 still held by SCTP; 4122 o unacknowledged-messages - The number and location of messages 4123 whose transport to the destination was attempted, but which were 4124 not acknowledged when the loss of communication was detected. 4126 o last-acked - the sequence number last acked by that peer endpoint; 4128 o last-sent - the sequence number last sent to that peer endpoint; 4130 o received-but-not-delivered - messages that were received by SCTP 4131 but not yet delivered to the ULP. 4133 Note: the un-sent data report may not be accurate for those user 4134 messages which are segmented by SCTP during transmission. 4136 F) COMMUNICATION ERROR notification 4138 When SCTP receives an ERROR chunk from its peer and decides to notify 4139 its ULP, it can invoke this notification on the ULP. 4141 The following can be passed with the notification: 4143 o association id - local handle to the SCTP association 4145 o error info - this indicates the type of error and optionally some 4146 additional information received through the ERROR chunk. 4149 10. Security Considerations 4151 10.1 Security Objectives 4153 As a common transport protocol designed to reliably carry 4154 time-sensitive user messages, such as billing or signaling messages for 4155 telephony services, between two networked endpoints, SCTP has the 4156 following security objectives. 4158 - availability of reliable and timely data transport services 4159 - integrity of the user-to-user information carried by SCTP 4161 10.2 SCTP Responses To Potential Threats 4163 It is clear that SCTP may potentially be used in a wide variety of 4164 risk situations.
It is important for operator(s) of the systems 4165 concerned to analyze their particular situations and decide on the 4166 appropriate counter-measures. 4168 Where the SCTP system serves a group of users, it is probably 4169 operating as part of a professionally managed corporate or service 4170 provider network. It is reasonable to expect that this management 4171 includes an appropriate security policy framework. [RFC 2196, "Site 4172 Security Handbook", B. Fraser Ed., September 1997] should be 4173 consulted for guidance. 4175 The case is more difficult where the SCTP system is operated by a 4176 private user. The service provider with whom that user has a 4177 contractual arrangement SHOULD provide help to ensure that the 4178 user's site is secure, ranging from advice on configuration through 4179 downloaded scripts and security software. 4181 10.2.1 Countering Insider Attacks 4183 The principles of the Site Security Handbook [13] should be applied 4184 to minimize the risk of theft of information or sabotage by 4185 insiders. These include publication of security policies, control 4186 of access at the physical, software, and network levels, and 4187 separation of services. 4189 Stewart, et al [Page 82] 4190 10.2.2 Protecting against Data Corruption in the Network 4192 Where the risk of undetected errors in datagrams delivered by the 4193 lower layer transport services is considered to be too great, 4194 additional checksum protection may be required. The question is 4195 whether this is appropriately provided as an SCTP service because it 4196 is needed by most potential users of SCTP, or whether instead it 4197 should be provided by the SCTP user application. (The SCTP protocol 4198 overhead, as opposed to the signaling payload, is protected 4199 adequately by the Adler-32 checksum and measures taken in SCTP to prevent 4200 replay attacks and masquerade.) 
In any event, the checksum must be 4201 specifically designed to ensure that it detects the errors left 4202 behind by the Adler-32 checksum. 4204 10.2.3 Protecting Confidentiality 4206 In most cases, the risk of breach of confidentiality applies to the 4207 signaling data payload, not to the SCTP or lower-layer protocol 4208 overheads. If that is true, encryption of the SCTP user data only 4209 may be considered. As with the supplementary checksum service, user 4210 data encryption may be performed by the SCTP user application. 4212 Particularly for mobile users, the requirement for confidentiality 4213 may include the masking of IP addresses and ports. In this case 4214 IPSEC ESP should be used instead of application-level encryption. 4215 Similarly, where other reasons prompt the use of the IPSEC ESP 4216 service, application-level encryption is unnecessary. It will be up 4217 to the SCTP system operators to configure the application 4218 appropriately. 4220 Regardless of which level performs the encryption, the IPSEC ISAKMP 4221 service should be used for key management. 4223 Operators should consult [RFC 2401, "Security Architecture for the 4224 Internet Protocol", S. Kent, R. Atkinson, November 1998] for 4225 information on the configuration of IPSEC services between hosts 4226 with and without intervening firewalls. 4228 10.2.4 Protecting against Blind Denial of Service Attacks 4230 A blind attack is one where the attacker is unable to intercept or 4231 otherwise see the content of data flows passing to and from the 4232 target SCTP node where it is not a party to the association. Blind 4233 denial of service attacks may take the form of flooding, masquerade, 4234 or improper monopolization of services. 
4236 Stewart, et al [Page 83] 4237 10.2.4.1 Flooding 4239 The objective of flooding is to cause loss of service and incorrect 4240 behavior at target systems through resource exhaustion, interference 4241 with legitimate transactions, and exploitation of buffer-related 4242 software bugs. Flooding may be directed either at the SCTP node or at 4243 resources in the intervening IP Access Links or the Internetwork. 4244 Where the latter entities are the target, flooding will manifest 4245 itself as loss of network services, including potentially the breach 4246 of any firewalls in place. 4248 In general, protection against flooding begins at the equipment 4249 design level, where it includes measures such as: 4251 - avoiding commitment of limited resources before determining that 4252 the request for service is legitimate 4253 - giving priority to completion of processing in progress over the 4254 acceptance of new work 4255 - identification and removal of duplicate or stale queued requests 4256 for service. 4258 Network equipment should be capable of generating an alarm and log 4259 if a suspicious increase in traffic occurs. The log should provide 4260 information such as the identity of the incoming link and source 4261 address(es) used which will help the network or SCTP system operator 4262 to take protective measures. Procedures should be in place for the 4263 operator to act on such alarms if a clear pattern of abuse emerges. 4265 The design of SCTP is resistant to flooding attacks, particularly in 4266 its use of a four-way start-up handshake, its use of a cookie to 4267 defer commitment of resources at the responding SCTP node until the 4268 handshake is completed, and its use of a verification tag to prevent 4269 insertion of extraneous messages into the flow of an established 4270 association. 
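The way a cookie defers commitment of resources can be sketched as follows: the responding node packs what it needs to remember into the cookie, signs it with a local secret, and keeps no state of its own until the cookie is returned and verified. The Python fragment below is an illustration only. The cookie layout, the use of HMAC-SHA256, the 60 second lifetime, and the function names are all assumptions made for the example and are not taken from this document.

```python
import hashlib
import hmac
import os
import time

# Local secret used to sign cookies; assumed to be random and never sent.
SECRET = os.urandom(32)


def make_cookie(peer_addr, peer_tag, now=None, lifetime=60):
    """Build a signed cookie without storing anything locally."""
    now = time.time() if now is None else now
    body = f"{peer_addr}|{peer_tag}|{int(now) + lifetime}".encode()
    mac = hmac.new(SECRET, body, hashlib.sha256).hexdigest().encode()
    return body + b"|" + mac


def check_cookie(cookie, now=None):
    """Return (peer_addr, peer_tag) for a valid cookie, else None.

    Only after this check succeeds would the responder allocate
    per-association resources.
    """
    now = time.time() if now is None else now
    try:
        body, mac = cookie.rsplit(b"|", 1)
        expected = hmac.new(SECRET, body, hashlib.sha256).hexdigest().encode()
        if not hmac.compare_digest(mac, expected):
            return None                      # forged or altered cookie
        peer_addr, peer_tag, expiry = body.decode().split("|")
        if now > int(expiry):
            return None                      # stale cookie
        return peer_addr, int(peer_tag)
    except ValueError:
        return None                          # malformed cookie
```

Because everything needed to resume the handshake travels inside the cookie, a flood of initial requests from spoofed addresses costs the responder computation but no stored state.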
4272 10.2.4.2 Masquerade 4274 Masquerade can be used to deny service in several ways: 4276 - by tying up resources at the target SCTP node to which the 4277 impersonated node has limited access. For example, the target node 4278 may by policy permit a maximum of one SCTP association with the 4279 impersonated SCTP node. The masquerading attacker may attempt to 4280 establish an association purporting to come from the impersonated 4281 node so that the latter cannot do so when it requires it. 4282 - by deliberately allowing the impersonation to be detected, 4283 thereby provoking counter-measures which cause the impersonated node 4284 to be locked out of the target SCTP node. 4285 - by interfering with an established association by inserting 4286 extraneous content such as a SHUTDOWN request. 4288 Stewart, et al [Page 84] 4289 SCTP prevents masquerade through IP spoofing by use of the four-way 4290 startup handshake. Because the initial exchange is memoryless, no 4291 lockout mechanism is triggered by masquerade attacks. SCTP protects 4292 against insertion of extraneous messages into the flow of an 4293 established association by use of the verification tag. 4295 Logging of received INIT requests and abnormalities such as 4296 unexpected INIT ACKs might be considered as a way to detect patterns 4297 of hostile activity. However, the potential usefulness of such 4298 logging must be weighed against the increased SCTP startup 4299 processing it implies, rendering the SCTP node more vulnerable to 4300 flooding attacks. Logging is pointless without the establishment of 4301 operating procedures to review and analyze the logs on a routine 4302 basis. 4304 10.2.4.3 Improper Monopolization of Services 4306 Attacks under this heading are performed openly and legitimately by 4307 the attacker. They are directed against fellow users of the target 4308 SCTP node or of the shared resources between the attacker and the 4309 target node. 
Possible attacks include the opening of a large number 4310 of associations between the attacker's node and the target, or 4311 transfer of large volumes of information within a 4312 legitimately-established association. 4314 Such attacks take advantage of policy deficiencies at the target 4315 SCTP node. Defense begins with a contractual prohibition of 4316 behavior directed to denial of service to others. Policy limits 4317 should be placed on the number of associations per adjoining SCTP 4318 node. SCTP user applications should be capable of detecting large 4319 volumes of illegitimate or "no-op" messages within a given 4320 association and either logging or terminating the association as a 4321 result, based on local policy. 4323 10.3 Protection against Fraud and Repudiation 4325 The objective of fraud is to obtain services without authorization 4326 and specifically without paying for them. In order to achieve this 4327 objective, the attacker must induce the SCTP user application at the 4328 target SCTP node to provide the desired service while accepting 4329 invalid billing data or failing to collect it. Repudiation is a 4330 related problem, since it may occur as a deliberate act of fraud or 4331 simply because the repudiating party kept inadequate records of 4332 service received. 4334 Potential fraudulent attacks include interception and misuse of 4335 authorizing information such as credit card numbers, blind 4336 masquerade and replay, and man-in-the-middle attacks which modify 4337 the messages passing through a target SCTP association in real time. 4340 The interception attack is countered by the confidentiality measures 4341 discussed in section 10.2.3 above. 4343 Section 10.2.4.2 describes how SCTP is resistant to blind masquerade 4344 attacks, as a result of the four-way startup handshake and the 4345 verification tag.
The verification tag and TSN together are protections 4346 against blind replay attacks, where the replay is into an existing 4347 association. 4349 However, SCTP does not protect against man-in-the-middle attacks 4350 where the attacker is able to intercept and alter the messages sent 4351 and received in an association. Where a significant possibility of 4352 such attacks is seen to exist, or where possible repudiation is an 4353 issue, the use of the IPSEC AH service is recommended to ensure both 4354 the integrity and the authenticity of the messages passed. 4356 SCTP also provides no protection against attacks originating at or 4357 beyond the SCTP node and taking place within the context of an 4358 existing association. Prevention of such attacks should be covered 4359 by appropriate security policies at the host site, as discussed in 4360 section 10.2.1. 4362 11. Recommended Transmission Control Block (TCB) Parameters 4364 This section details a recommended set of parameters that should 4365 be contained within the TCB for an implementation. This section is 4366 for illustrative purposes and should not be deemed as requirements 4367 on an implementation NOR as an exhaustive list of all parameters 4368 inside an SCTP TCB. Each implementation may need its own additional 4369 parameters for optimization. 4371 11.1 Parameters necessary for the SCTP instance 4373 Associations: A list of current associations and mappings to the 4374 data consumers for each association. This may be in 4375 the form of a hash table or other implementation dependent 4376 structure. The data consumers may be process identification 4377 information such as file descriptors, named pipe pointer, or 4378 table pointers dependent on how SCTP is implemented. 4380 Secret Key: A secret key used by this endpoint to sign all cookies. This 4381 SHOULD be a cryptographic quality random number of 4382 sufficient length.
                Discussion in RFC 1750 [1] can be helpful in
                selection of the key.

Address List    The list of IP addresses that this instance has bound.
                This information is passed to one's peer(s) in INIT and
                INIT-ACK messages.

SCTP Port       The local SCTP port number the endpoint is bound to.

Peer            Tag value to be sent in every datagram; it is received
Verification    in the INIT or INIT ACK message.
Tag

My              Tag expected in every inbound datagram and sent in the
Verification    INIT or INIT ACK message.
Tag

11.2 Parameters necessary per association (i.e. the TCB)

State           A state variable indicating what state the association
                is in, i.e. COOKIE_WAIT, COOKIE_SENT, ESTABLISHED,
                SHUTDOWN_PENDING, SHUTDOWN_SENT, SHUTDOWN_RECEIVED.
                Note: No "CLOSED" state is illustrated since if an
                association is "CLOSED" its TCB SHOULD be removed.

Peer Transport  A list of SCTP transport addresses that the peer is
Address List    bound to. This information is derived from the INIT or
                INIT-ACK and is used to associate an inbound datagram
                with a given association. Normally this information is
                hashed or keyed for quick lookup and access of the TCB.

Primary         This is the current primary destination transport
Destination     address of the peer endpoint.

Overall         The overall association error count.
Error Count

Overall Error   The threshold for this association; if the Overall
Threshold       Error Count reaches it, the association is torn down.

Peer Rwnd       Current calculated value of the peer's rwnd.

Next TSN        The next TSN I will assign. This is sent in
                the INIT or INIT-ACK message to the peer and
                incremented each time a DATA chunk is assigned a
                TSN (normally just prior to transmit or during
                segmentation).

Last Rcvd TSN   This is the last TSN I received and is the
                current cumulative TSN point.
                This value is set initially by taking the peer's
                initial TSN, received in the INIT or INIT-ACK message,
                and subtracting one from it.

Mapping Array   An array of bits or bytes indicating which out-of-order
                TSNs have been received (relative to the cumulative
                TSN, i.e. Last Rcvd TSN). If no gaps exist, i.e. no
                out-of-order messages have been received, this array
                will be set to all zero. This structure may be in the
                form of a circular buffer or bit array.

Ack State       This flag indicates if the next received datagram
                is to be responded to with a SACK. This is initialized
                to 0; when a datagram is received it is incremented.
                If this value reaches 2, a SACK is sent and the value
                is reset to 0. Note: this is used only when no
                datagrams are received out of order; when DATA chunks
                are out of order, SACKs are not delayed (see
                Section 5).

Inbound         An array of structures to track the inbound streams.
Streams         Normally including the next sequence number expected
                and possibly the stream number.

Outbound        An array of structures to track the outbound streams.
Streams         Normally including the next sequence number to
                be sent on the stream.

Reasm Queue     A re-assembly queue.

11.3 Per Transport Address Data

For each destination transport address in the peer's address list
derived from the INIT or INIT ACK message, a number of data elements
need to be maintained, including:

Error count     The current error count for this destination.

Error           Current error threshold for this destination, i.e.
Threshold       what value marks the destination down if Error count
                reaches this value.

cwnd            The current congestion window.

ssthresh        The current ssthresh value.

RTO             The current retransmission timeout value.

SRTT            The current smoothed round trip time.

RTTVAR          The current RTT variation.

partial bytes   The tracking method for increase of cwnd when in
acked           congestion avoidance mode (see section 6.2.2).

state           The current state of this destination, i.e. DOWN, UP,
                ALLOW-HB, NO-HEARTBEAT, etc.

P-MTU           The current known path MTU.

Per             A timer used by each destination.
Destination
Timer

RTO-Pending     A flag used to track if one of the datagrams sent to
                this address is currently being used to compute an
                RTT. If this flag is 0, the next datagram sent to this
                destination should be used to compute an RTT and this
                flag should be set. Every time the RTT calculation
                completes (i.e. the datagram is SACK'd), clear this
                flag.

last-time       The time this destination was last sent to. This can
                be used to determine if a HEARTBEAT is needed.

11.4 General Parameters Needed

Out Queue       A queue of outbound datagrams.

In Queue        A queue of inbound datagrams.

12. IANA Consideration

This protocol will require port reservation like TCP for the use of
"well known" servers within the Internet. It is suggested that all
current TCP ports should be automatically reserved in the SCTP port
address space. New requests should follow IANA's current mechanisms
for TCP.

This protocol may also be extended through IANA in three ways:
 -- through definition of additional chunk types,
 -- through definition of additional parameter types, or
 -- through definition of additional cause codes within Operation
    Error chunks.

In the case where a particular ULP using SCTP desires to have its own
ports, the ULP should be responsible for registering with IANA for
getting its ports assigned.

12.1 IETF-defined Chunk Extension

The appropriate use of specific chunk types is an integral part of the
SCTP protocol.
In consequence, the intention is that new IETF-defined
chunk types MUST be supported by standards-track RFC documentation.
As a transitional step, a new chunk type MAY be introduced in an
Experimental RFC. Chunk type codes MUST remain permanently associated
with the original documentation on the basis of which they were
allocated. Thus if the RFC supporting a given chunk type is
deprecated in favor of a new document, the corresponding chunk type
code value is also deprecated and a new code value is allocated in
association with the replacement document.

The documentation for a new chunk code type must include the following
information:
(a) a long and short name for the new chunk type;
(b) a detailed description of the structure of the chunk, which MUST
    conform to the basic structure defined in section 2.2;
(c) a detailed definition and description of intended use of each field
    within the chunk, including the chunk flags if any;
(d) a detailed procedural description of the use of the new chunk type
    within the operation of the protocol.

If the primary numbering space reserved for IETF use (0x00 to 0xFD) is
exhausted, new codes shall subsequently be allocated in the extension
range 0x0000 through 0xFFFF. Chunks allocated in this range MUST
conform to the following structure:

First word (32 bits):
   as shown in section 2.2, with chunk type code equal to 0xFF.

Second word:
   first octet MUST be all 1's (0xFF). Next octet MUST be all 0's
   (0x00). Final two octets contain the allocated extension code value.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|1 1 1 1 1 1 1 1|  Chunk Flags  |         Chunk Length          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|1 1 1 1 1 1 1 1|0 0 0 0 0 0 0 0|      Extension Type Code      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
\                                                               \
/                             Value                             /
\                                                               \
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

12.2 IETF-defined Chunk Parameter Extension

The allocation of a new chunk parameter type code from the IETF
numbering space MUST be supported by RFC documentation. As with chunk
type codes, parameter type codes are uniquely associated with their
supporting document and MUST be replaced if new documentation is
provided. This documentation may be Informational, Experimental, or
standards-track at the discretion of the IESG. It MUST contain the
following information:
(a) Name of the parameter type.
(b) Detailed description of the structure of the parameter field. This
    structure MUST conform to the general type-length-value format
    described in section 2.2.1.
(c) Detailed definition of each component of the parameter value.
(d) Detailed description of the intended use of this parameter type,
    and an indication of whether and under what circumstances
    multiple instances of this parameter type may be found within the
    same chunk.

Additional parameter type codes may be allocated initially from the
range 0x0000 through 0xFFFD. If this space is exhausted, extension
codes shall be allocated in the range 0x0000 through 0xFFFF.
Where an
extension code has been allocated, the format of the parameter must
conform to the following structure:

First word (32 bits):
   contains the parameter type code 0xFFFF and parameter length as
   described in section 2.2.1.

Second word:
   first octet MUST be all 1's (0xFF). Next octet MUST be all 0's
   (0x00). Final two octets contain the allocated extension code
   value.

The Value portion of the parameter, if any, follows the second word.

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1|            Length             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|1 1 1 1 1 1 1 1|0 0 0 0 0 0 0 0|      Extension Type Code      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
\                                                               \
/                             Value                             /
\                                                               \
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

12.3 IETF-defined Additional Error Causes

Additional cause codes may be allocated in the range 0x0004 to 0xFFFF
upon receipt of any permanently-available public documentation
containing the following information:
(a) Name of the error condition.
(b) Detailed description of the conditions under which an SCTP
    endpoint should issue an Operation Error with this cause code.
(c) Expected action by the SCTP endpoint which receives an Operation
    Error chunk containing this cause code.
(d) Detailed description of the structure and content of data fields
    which accompany this cause code.

The initial word (32 bits) of a cause code parameter MUST conform to
the format shown in section 2.3.9, i.e.:
 -- first two octets contain the cause code value
 -- last two octets contain length of the cause parameter.
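
As an illustration only, the extension parameter layout of section 12.2
and the initial word of a cause code parameter from section 12.3 can be
sketched in Python. The function names are hypothetical and not part of
this specification. The sketch assumes that Length counts the whole
parameter (both header words plus Value), and it applies no padding;
both points are governed by section 2.2.1, which is not reproduced here.

```python
import struct

EXT_PARAM_TYPE = 0xFFFF  # parameter type code that marks the extension format


def pack_extension_parameter(ext_code: int, value: bytes = b"") -> bytes:
    """Build a parameter in the section 12.2 extension format.

    Assumption: Length covers both header words and the Value; no padding.
    """
    if not 0 <= ext_code <= 0xFFFF:
        raise ValueError("extension code must fit in 16 bits")
    length = 8 + len(value)
    if length > 0xFFFF:
        raise ValueError("parameter too long for a 16-bit length")
    # First word: type 0xFFFF, length.
    # Second word: 0xFF, 0x00, 16-bit extension type code.
    header = struct.pack("!HHBBH", EXT_PARAM_TYPE, length, 0xFF, 0x00, ext_code)
    return header + value


def unpack_extension_parameter(data: bytes):
    """Return (extension code, value) or raise ValueError on a bad header."""
    ptype, length, ones, zeros, ext_code = struct.unpack("!HHBBH", data[:8])
    if ptype != EXT_PARAM_TYPE or ones != 0xFF or zeros != 0x00:
        raise ValueError("not an extension-format parameter")
    return ext_code, data[8:length]


def pack_cause_header(cause_code: int, length: int) -> bytes:
    """Initial word of a cause code parameter: cause code, then length."""
    return struct.pack("!HH", cause_code, length)
```

All fields are packed in network byte order ("!" in the struct format),
which keeps the octet sequence identical to the diagrams above.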

12.4 Payload Protocol Identifiers

Except for value 0x00000000 which is reserved by SCTP to indicate the
absence of a payload protocol identifier in a DATA chunk, SCTP will
not be responsible for standardizing or verifying any payload protocol
identifiers; SCTP simply receives the identifier from the upper layer
and carries it with the corresponding payload data.

The upper layer, i.e., the SCTP user, SHOULD standardize any specific
protocol identifier with IANA if it is so desired. The use of any
specific payload protocol identifier is out of the scope of SCTP.

13. Suggested SCTP Protocol Parameter Values

The following protocol parameters are RECOMMENDED:

RTO.Initial              - 3  seconds
RTO.Min                  - 1  second
RTO.Max                  - 60 seconds
RTO.Alpha                - 1/8
RTO.Beta                 - 1/4
Valid.Cookie.Life        - 5  seconds
Association.Max.Retrans  - 10 attempts
Path.Max.Retrans         - 5  attempts (per destination address)
Max.Init.Retransmits     - 8  attempts

'retrans.count'          - counter (per destination address)
'receiver.buffer'        - variable (per peer endpoint)

IMPLEMENTATION NOTE: The SCTP implementation may allow ULP to
customize some of these protocol parameters (see Section 9).

14. Acknowledgments

The authors wish to thank Mark Allman, Richard Band, Scott Bradner,
Steve Bellovin, Ram Dantu, R. Ezhirpavai, Sally Floyd, Matt Holdrege,
Henry Houh, Christian Huetima, Gary Lehecka, John Loughney, Daniel
Luan, Lyndon Ong, Shyamal Prasad, Kelvin Porter, Heinz Prantner, Jarno
Rajahalme, Ivan Raymond E. Reeves, Renee Revis, Arias Rodriguez,
A. Sankar, Greg Sidebottom, Brian Wyld, and many others for their
invaluable comments.

15. Authors' Addresses

Randall R. Stewart                      Tel: +1-847-632-7438
Motorola, Inc.                          EMail: rstewar1@email.mot.com
1501 W. Shure Drive, #2315
Arlington Heights, IL 60004
USA

Qiaobing Xie                            Tel: +1-847-632-3028
Motorola, Inc.                          EMail: qxie1@email.mot.com
1501 W. Shure Drive, #2309
Arlington Heights, IL 60004
USA

Ken Morneault                           Tel: +1-703-484-3323
Cisco Systems Inc.                      EMail: kmorneau@cisco.com
13615 Dulles Technology Drive
Herndon, VA. 20171
USA

Chip Sharp                              Tel: +1-919-472-3121
Cisco Systems Inc.                      EMail: chsharp@cisco.com
7025 Kit Creek Road
Research Triangle Park, NC 27709
USA

Hanns Juergen Schwarzbauer              Tel: +49-89-722-24236
SIEMENS AG
Hofmannstr. 51
81359 Munich
Germany
EMail: HannsJuergen.Schwarzbauer@icn.siemens.de

Tom Taylor                              Tel: +1-613-736-0961
Nortel Networks
1852 Lorraine Ave.
Ottawa, Ontario
Canada K1H 6Z8
EMail: taylor@nortelnetworks.com

Ian Rytina                              Tel: +61-3-9301-6164
Ericsson Australia                      EMail: ian.rytina@ericsson.com
37/360 Elizabeth Street
Melbourne, Victoria 3000
Australia

Malleswar Kalla                         Tel: +1-973-829-5212
Telcordia Technologies
MCC 1J211R
445 South Street
Morristown, NJ 07960
USA
EMail: kalla@research.telcordia.com

Lixia Zhang                             Tel: +1-310-825-2695
UCLA Computer Science Department        EMail: lixia@cs.ucla.edu
4531G Boelter Hall
Los Angeles, CA 90095-1596
USA

Vern Paxson                             Tel: +1-510-642-4274 x 302
ACIRI                                   EMail: vern@aciri.org
1947 Center St., Suite 600,
Berkeley, CA 94704-1198
USA

16. References

[1] Eastlake, D. (ed.), "Randomness Recommendations for Security",
    RFC 1750, December 1994.

[2] Deutsch, P., and Gailly, J-L., "ZLIB Compressed Data Format
    Specification version 3.3", RFC 1950, May 1996.

[3] Allman, M., Paxson, V., and Stevens, W., "TCP Congestion
    Control", RFC 2581, April 1999.

[4] Krawczyk, H., Bellare, M., Canetti, R., "HMAC: Keyed-Hashing for
    Message Authentication", RFC 2104, March 1997.

[5] Allman, M., and Paxson, V., "On Estimating End-to-End Network
    Path Properties", Proc. SIGCOMM'99, 1999.

[6] Karn, P., and Simpson, W., "Photuris: Session-Key Management
    Protocol", RFC 2522, March 1999.

[7] Bradner, S., "The Internet Standards Process -- Revision 3",
    RFC 2026, October 1996.

[8] Postel, J. (ed.), "Transmission Control Protocol", RFC 793,
    September 1981.

[9] Postel, J. (ed.), "User Datagram Protocol", RFC 768, August 1980.

[10] Reynolds, J., and Postel, J. (ed.), "Assigned Numbers", RFC 1700,
     October 1994.

[11] Mogul, J., and Deering, S., "Path MTU Discovery", RFC 1191,
     November 1990.

[12] McCann, J., Deering, S., and Mogul, J., "Path MTU Discovery for
     IP version 6", RFC 1981, August 1996.

[13] Fraser, B. (ed.), "Site Security Handbook", RFC 2196, September
     1997.

[14] Kent, S., and Atkinson, R., "Security Architecture for the
     Internet Protocol", RFC 2401, November 1998.

[15] Savage, S., Cardwell, N., Wetherall, D., and Anderson, T.,
     "TCP Congestion Control with a Misbehaving Receiver", ACM
     Computer Communication Review, 29(5), October 1999.

[16] Fall, K., and Floyd, S., "Simulation-based Comparisons of Tahoe,
     Reno, and SACK TCP", Computer Communications Review, V. 26 N. 3,
     July 1996, pp. 5-21.

Appendix A: Explicit Congestion Notification

ECN (Ramakrishnan, K., Floyd, S., "Explicit Congestion Notification",
RFC 2481, January 1999) describes a proposed extension to IP that
details a method to become aware of congestion outside of datagram
loss. This is an optional feature that an implementation MAY choose to
add to SCTP.
This appendix details the minor differences an implementor
will need to be aware of if they choose to implement this feature.
In general RFC 2481 should be followed with the following exceptions.

Negotiation:

RFC 2481 details negotiation of ECN during the SYN and SYN-ACK stages
of a TCP connection. The sender of the SYN sets two bits in the
TCP flags, and the sender of the SYN-ACK sets only 1 bit. The reasoning
behind this is to assure both sides are truly ECN capable. For SCTP
this is not necessary. To indicate that an endpoint is ECN capable
an endpoint MAY add to the INIT and/or INIT-ACK message the TLV
reserved for ECN. This TLV contains no parameters and thus has
the following format:

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    Parameter Type = 0x000a    |   Parameter Length = 0x0004   |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

ECN-Echo:

RFC 2481 details a specific bit for a receiver to send back in its
acknowledgments to notify the sender of the Congestion Experienced (CE)
bit having arrived from the network. For SCTP this same indication is
made by including the ECNE chunk. This chunk contains NO parameters
or data and looks as follows:

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  ID=00001100  | Flags=00000000|       Chunk Length=0004       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Note: The ECNE is considered a Control chunk.

CWR:

RFC 2481 details a specific bit for a sender to send in its
next outbound datagram to indicate to its peer that it has
reduced its congestion window. This is termed the CWR bit.
For
SCTP the same indication is made by including the CWR chunk.
This chunk contains NO parameters or data and looks as follows:

 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  ID=00001101  | Flags=00000000|       Chunk Length=0004       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Note: The CWR is considered a Control chunk.

This Internet Draft expires in 6 months from March, 2000
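
For illustration, the three Appendix A encodings (the ECN-capable TLV,
the ECNE chunk, and the CWR chunk) can be written out as octet strings.
This is a minimal Python sketch; the function names are hypothetical,
and the values are taken directly from the diagrams in Appendix A.

```python
import struct


def ecn_capable_tlv() -> bytes:
    """TLV carried in INIT and/or INIT-ACK: type 0x000a, length 0x0004."""
    return struct.pack("!HH", 0x000A, 0x0004)


def ecne_chunk() -> bytes:
    """ECNE control chunk: ID 00001100, flags all zero, length 4."""
    return struct.pack("!BBH", 0b00001100, 0x00, 4)


def cwr_chunk() -> bytes:
    """CWR control chunk: ID 00001101, flags all zero, length 4."""
    return struct.pack("!BBH", 0b00001101, 0x00, 4)
```

Each encoding is exactly one 32-bit word, since none of the three
carries parameters or data.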