Network Working Group                                         R. Stewart
Internet-Draft                                             Netflix, Inc.
Intended status: Informational                                 M. Tuexen
Expires: January 9, 2017                Muenster Univ. of Appl. Sciences
                                                              M. Proshin
                                                                Ericsson
                                                            July 8, 2016

                       RFC 4960 Errata and Issues
                draft-tuexen-tsvwg-rfc4960-errata-04.txt

Abstract

   This document is a compilation of issues found since the publication
   of RFC4960 in September 2007 based on experience with implementing,
   testing, and using SCTP along with the suggested fixes. This
   document provides deltas to RFC4960 and is organized in a time based
   way. The issues are listed in the order they were brought up.
   Because some text is changed several times the last delta in the text
   is the one which should be applied. In addition to the delta a
   description of the problem and the details of the solution are also
   provided.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current Internet-
   Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on January 9, 2017.

Copyright Notice

   Copyright (c) 2016 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1. Introduction
   2. Conventions
   3. Corrections to RFC 4960
     3.1. Path Error Counter Threshold Handling
     3.2. Upper Layer Protocol Shutdown Request Handling
     3.3. Registration of New Chunk Types
     3.4. Variable Parameters for INIT Chunks
     3.5. CRC32c Sample Code on 64-bit Platforms
     3.6. Endpoint Failure Detection
     3.7. Data Transmission Rules
     3.8. T1-Cookie Timer
     3.9. Miscellaneous Typos
     3.10. CRC32c Sample Code
     3.11. partial_bytes_acked after T3-rtx Expiration
     3.12. Order of Adjustments of partial_bytes_acked and cwnd
     3.13. HEARTBEAT ACK and the association error counter
     3.14. Path for Fast Retransmission
     3.15. Transmittal in Fast Recovery
     3.16. Initial Value of ssthresh
     3.17. Automatically Confirmed Addresses
     3.18. Only One Packet after Retransmission Timeout
     3.19. INIT ACK Path for INIT in COOKIE-WAIT State
     3.20. Zero Window Probing and Unreachable Primary Path
     3.21. Normative Language in Section 10
     3.22. Increase of partial_bytes_acked in Congestion Avoidance
     3.23. Inconsistency in Notifications Handling
     3.24. SACK.Delay Not Listed as a Protocol Parameter
     3.25. Processing of Chunks in an Incoming SCTP Packet
     3.26. CWND Increase in Congestion Avoidance Phase
     3.27. Refresh of cwnd and ssthresh after Idle Period
     3.28. Window Updates After Receiver Window Opens Up
     3.29. Path of DATA and Reply Chunks
   4. IANA Considerations
   5. Security Considerations
   6. Acknowledgments
   7. References
     7.1. Normative References
     7.2. Informative References
   Authors' Addresses

1. Introduction

   This document contains a compilation of all defects found up until
   the publishing of this document for [RFC4960] specifying the Stream
   Control Transmission Protocol (SCTP). These defects may be of an
   editorial or technical nature. This document may be thought of as a
   companion document to be used in the implementation of SCTP to
   clarify errors in the original SCTP document.

   This document provides a history of the changes that will be compiled
   into a BIS document for [RFC4960]. It is structured similarly to
   [RFC4460].

   Each error will be detailed within this document in the form of:

   o  The problem description,
   o  The text quoted from [RFC4960],
   o  The replacement text that should be placed into an upcoming BIS
      document,
   o  A description of the solution.

   Note that when reading this document one must use care to ensure that
   a field or item is not updated further on within the document. Each
   section should be applied in sequence to the original [RFC4960] since
   this document is a historical record of the sequential changes that
   have been found necessary at various inter-op events and through
   discussion on the list.

2. Conventions

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC2119].

3. Corrections to RFC 4960

3.1. Path Error Counter Threshold Handling

3.1.1. Description of the Problem

   The handling of the 'Path.Max.Retrans' parameter is described in
   Section 8.2 and Section 8.3 of [RFC4960] in an inconsistent way.
   Whereas Section 8.2 describes that a path is marked inactive when the
   path error counter exceeds the threshold, Section 8.3 says the path
   is marked inactive when the path error counter reaches the threshold.

   This issue was reported as an Errata for [RFC4960] with Errata ID
   1440.

3.1.2. Text Changes to the Document

   ---------
   Old text: (Section 8.3)
   ---------

   When the value of this counter reaches the protocol parameter
   'Path.Max.Retrans', the endpoint should mark the corresponding
   destination address as inactive if it is not so marked, and may also
   optionally report to the upper layer the change of reachability of
   this destination address. After this, the endpoint should continue
   HEARTBEAT on this destination address but should stop increasing the
   counter.

   ---------
   New text: (Section 8.3)
   ---------

   When the value of this counter exceeds the protocol parameter
   'Path.Max.Retrans', the endpoint should mark the corresponding
   destination address as inactive if it is not so marked, and may also
   optionally report to the upper layer the change of reachability of
   this destination address. After this, the endpoint should continue
   HEARTBEAT on this destination address but should stop increasing the
   counter.

3.1.3. Solution Description

   The intended state change should happen when the threshold is
   exceeded.

3.2. Upper Layer Protocol Shutdown Request Handling

3.2.1. Description of the Problem

   Section 9.2 of [RFC4960] describes the handling of received SHUTDOWN
   chunks in the SHUTDOWN-RECEIVED state instead of the handling of
   shutdown requests from its upper layer in this state.

   This issue was reported as an Errata for [RFC4960] with Errata ID
   1574.

3.2.2. Text Changes to the Document

   ---------
   Old text: (Section 9.2)
   ---------

   Once an endpoint has reached the SHUTDOWN-RECEIVED state, it MUST NOT
   send a SHUTDOWN in response to a ULP request, and should discard
   subsequent SHUTDOWN chunks.

   ---------
   New text: (Section 9.2)
   ---------

   Once an endpoint has reached the SHUTDOWN-RECEIVED state, it MUST NOT
   send a SHUTDOWN in response to a ULP request, and should discard
   subsequent ULP shutdown requests.

3.2.3. Solution Description

   The text never intended the SCTP endpoint to ignore SHUTDOWN chunks
   from its peer. If it did, the endpoints could never gracefully
   terminate associations in some cases.

3.3. Registration of New Chunk Types

3.3.1. Description of the Problem

   Section 14.1 of [RFC4960] should deal with new chunk types; however,
   the text refers to parameter types.

   This issue was reported as an Errata for [RFC4960] with Errata ID
   2592.

3.3.2. Text Changes to the Document

   ---------
   Old text: (Section 14.1)
   ---------

   The assignment of new chunk parameter type codes is done through an
   IETF Consensus action, as defined in [RFC2434]. Documentation of the
   chunk parameter MUST contain the following information:

   ---------
   New text: (Section 14.1)
   ---------

   The assignment of new chunk type codes is done through an
   IETF Consensus action, as defined in [RFC2434]. Documentation of the
   chunk type MUST contain the following information:

3.3.3. Solution Description

   Refer to chunk types as intended.

3.4. Variable Parameters for INIT Chunks

3.4.1. Description of the Problem

   Newlines in wrong places break the layout of the table of variable
   parameters for the INIT chunk in Section 3.3.2 of [RFC4960].

   This issue was reported as an Errata for [RFC4960] with Errata ID
   3291 and Errata ID 3804.

3.4.2. Text Changes to the Document

   ---------
   Old text: (Section 3.3.2)
   ---------

   Variable Parameters                  Status     Type Value
   -------------------------------------------------------------
   IPv4 Address (Note 1)               Optional    5 IPv6 Address
   (Note 1)               Optional    6 Cookie Preservative
   Optional    9 Reserved for ECN Capable (Note 2)   Optional
   32768 (0x8000) Host Name Address (Note 3)          Optional
   11 Supported Address Types (Note 4)    Optional    12

   ---------
   New text: (Section 3.3.2)
   ---------

   Variable Parameters                  Status     Type Value
   -------------------------------------------------------------
   IPv4 Address (Note 1)               Optional    5
   IPv6 Address (Note 1)               Optional    6
   Cookie Preservative                 Optional    9
   Reserved for ECN Capable (Note 2)   Optional    32768 (0x8000)
   Host Name Address (Note 3)          Optional    11
   Supported Address Types (Note 4)    Optional    12

3.4.3. Solution Description

   Fix the formatting of the table.

3.5. CRC32c Sample Code on 64-bit Platforms

3.5.1. Description of the Problem

   The sample code for computing the CRC32c provided in [RFC4960]
   assumes that a variable of type unsigned long uses 32 bits. This is
   not true on some 64-bit platforms (for example the ones using LP64).

   This issue was reported as an Errata for [RFC4960] with Errata ID
   3423.

3.5.2. Text Changes to the Document

   ---------
   Old text: (Appendix C)
   ---------

   unsigned long
   generate_crc32c(unsigned char *buffer, unsigned int length)
   {
     unsigned int i;
     unsigned long crc32 = ~0L;

   ---------
   New text: (Appendix C)
   ---------

   unsigned long
   generate_crc32c(unsigned char *buffer, unsigned int length)
   {
     unsigned int i;
     unsigned long crc32 = 0xffffffffL;

3.5.3. Solution Description

   Use 0xffffffffL instead of ~0L. The former gives the same value on
   platforms using 32 bits or 64 bits for variables of type unsigned
   long.

3.6. Endpoint Failure Detection

3.6.1. Description of the Problem

   The handling of the association error counter defined in Section 8.1
   of [RFC4960] can result in an association failure even if the path
   used for data transmission is available, but idle.

   This issue was reported as an Errata for [RFC4960] with Errata ID
   3788.

3.6.2. Text Changes to the Document

   ---------
   Old text: (Section 8.1)
   ---------

   An endpoint shall keep a counter on the total number of consecutive
   retransmissions to its peer (this includes retransmissions to all the
   destination transport addresses of the peer if it is multi-homed),
   including unacknowledged HEARTBEAT chunks.

   ---------
   New text: (Section 8.1)
   ---------

   An endpoint shall keep a counter on the total number of consecutive
   retransmissions to its peer (this includes data retransmissions
   to all the destination transport addresses of the peer if it is
   multi-homed), including the number of unacknowledged HEARTBEAT
   chunks observed on the path which currently is used for data
   transfer. Unacknowledged HEARTBEAT chunks observed on paths
   different from the path currently used for data transfer shall
   not increment the association error counter, as this could lead
   to association closure even if the path which currently is used for
   data transfer is available (but idle).

3.6.3. Solution Description

   A more refined handling for the association error counter is defined.

3.7. Data Transmission Rules

3.7.1. Description of the Problem

   When integrating the changes to Section 6.1 A) of [RFC2960] as
   described in Section 2.15.2 of [RFC4460] some text was duplicated and
   became the final paragraph of Section 6.1 A) of [RFC4960].

   This issue was reported as an Errata for [RFC4960] with Errata ID
   4071.

3.7.2. Text Changes to the Document

   ---------
   Old text: (Section 6.1 A))
   ---------

      The sender MUST also have an algorithm for sending new DATA chunks
      to avoid silly window syndrome (SWS) as described in [RFC0813].
      The algorithm can be similar to the one described in Section
      4.2.3.4 of [RFC1122].

      However, regardless of the value of rwnd (including if it is 0),
      the data sender can always have one DATA chunk in flight to the
      receiver if allowed by cwnd (see rule B below). This rule allows
      the sender to probe for a change in rwnd that the sender missed
      due to the SACK having been lost in transit from the data receiver
      to the data sender.

   ---------
   New text: (Section 6.1 A))
   ---------

      The sender MUST also have an algorithm for sending new DATA chunks
      to avoid silly window syndrome (SWS) as described in [RFC0813].
      The algorithm can be similar to the one described in Section
      4.2.3.4 of [RFC1122].

3.7.3. Solution Description

   The last paragraph of Section 6.1 A) is removed, as intended in
   Section 2.15.2 of [RFC4460].

3.8. T1-Cookie Timer

3.8.1. Description of the Problem

   Figure 4 of [RFC4960] illustrates the SCTP association setup.
   However, it incorrectly shows that the T1-init timer is used in the
   COOKIE-ECHOED state whereas the T1-cookie timer should have been used
   instead.

   This issue was reported as an Errata for [RFC4960] with Errata ID
   4400.

3.8.2. Text Changes to the Document

   ---------
   Old text: (Section 5.1.6, Figure 4)
   ---------

   COOKIE ECHO [Cookie_Z] ------\
   (Start T1-init timer)         \
   (Enter COOKIE-ECHOED state)    \---> (build TCB enter ESTABLISHED
                                         state)
                                  /---- COOKIE-ACK
                                 /
   (Cancel T1-init timer, <-----/
    Enter ESTABLISHED state)

   ---------
   New text: (Section 5.1.6, Figure 4)
   ---------

   COOKIE ECHO [Cookie_Z] ------\
   (Start T1-cookie timer)       \
   (Enter COOKIE-ECHOED state)    \---> (build TCB enter ESTABLISHED
                                         state)
                                  /---- COOKIE-ACK
                                 /
   (Cancel T1-cookie timer, <---/
    Enter ESTABLISHED state)

3.8.3. Solution Description

   Change the figure such that the T1-cookie timer is used instead of
   the T1-init timer.

3.9. Miscellaneous Typos

3.9.1. Description of the Problem

   While processing [RFC4960] some typos were not caught.

3.9.2. Text Changes to the Document

   ---------
   Old text: (Section 1.6)
   ---------

   Transmission Sequence Numbers wrap around when they reach 2**32 - 1.
   That is, the next TSN a DATA chunk MUST use after transmitting TSN =
   2*32 - 1 is TSN = 0.

   ---------
   New text: (Section 1.6)
   ---------

   Transmission Sequence Numbers wrap around when they reach 2**32 - 1.
   That is, the next TSN a DATA chunk MUST use after transmitting TSN =
   2**32 - 1 is TSN = 0.

   ---------
   Old text: (Section 3.3.10.9)
   ---------

   No User Data: This error cause is returned to the originator of a

   DATA chunk if a received DATA chunk has no user data.

   ---------
   New text: (Section 3.3.10.9)
   ---------

   No User Data: This error cause is returned to the originator of a
   DATA chunk if a received DATA chunk has no user data.
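
   The TSN wrap-around in the corrected Section 1.6 text above can be
   illustrated with a minimal C sketch. The function names tsn_next,
   tsn_lt, and tsn_selfcheck are hypothetical and do not appear in
   [RFC4960]. The comparison helper assumes 32-bit serial number
   arithmetic, which is an assumption of this sketch and not something
   stated in the text quoted above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: TSN that follows `tsn`.
 * Unsigned 32-bit arithmetic is modulo 2**32, so 2**32 - 1 wraps to 0. */
uint32_t tsn_next(uint32_t tsn)
{
    return (uint32_t)(tsn + 1u);
}

/* Hypothetical helper: nonzero if TSN `a` precedes TSN `b` under
 * 32-bit serial number arithmetic, i.e. 0 < (b - a) mod 2**32 < 2**31. */
int tsn_lt(uint32_t a, uint32_t b)
{
    return a != b && (uint32_t)(b - a) < 0x80000000u;
}

/* Small self-check of the wrap-around behaviour; aborts on failure. */
void tsn_selfcheck(void)
{
    assert(tsn_next(UINT32_MAX) == 0u);   /* 2**32 - 1 is followed by 0 */
    assert(tsn_lt(UINT32_MAX, 0u));       /* 0 comes after the wrap */
    assert(!tsn_lt(0u, UINT32_MAX));
}
```

   Using a fixed-width unsigned type makes the wrap-around implicit, so
   no explicit modulo operation is needed.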

   ---------
   Old text: (Section 6.7, Figure 9)
   ---------

   Endpoint A                                      Endpoint Z {App
   sends 3 messages; strm 0} DATA [TSN=6,Strm=0,Seq=2] ----------
   -----> (ack delayed) (Start T3-rtx timer)

   DATA [TSN=7,Strm=0,Seq=3] --------> X (lost)

   DATA [TSN=8,Strm=0,Seq=4] ---------------> (gap detected,
                                               immediately send ack)
                                   /----- SACK [TSN Ack=6,Block=1,
                                  /             Start=2,End=2]
                           <-----/ (remove 6 from out-queue,
    and mark 7 as "1" missing report)

   ---------
   New text: (Section 6.7, Figure 9)
   ---------

   Endpoint A                                      Endpoint Z
   {App sends 3 messages; strm 0}
   DATA [TSN=6,Strm=0,Seq=2] ---------------> (ack delayed)
   (Start T3-rtx timer)

   DATA [TSN=7,Strm=0,Seq=3] --------> X (lost)

   DATA [TSN=8,Strm=0,Seq=4] ---------------> (gap detected,
                                               immediately send ack)
                                   /----- SACK [TSN Ack=6,Block=1,
                                  /             Strt=2,End=2]
                           <-----/
   (remove 6 from out-queue,
    and mark 7 as "1" missing report)

   ---------
   Old text: (Section 6.10)
   ---------

   An endpoint bundles chunks by simply including multiple chunks in one
   outbound SCTP packet. The total size of the resultant IP datagram,

   including the SCTP packet and IP headers, MUST be less that or equal
   to the current Path MTU.

   ---------
   New text: (Section 6.10)
   ---------

   An endpoint bundles chunks by simply including multiple chunks in one
   outbound SCTP packet. The total size of the resultant IP datagram,
   including the SCTP packet and IP headers, MUST be less than or equal
   to the current Path MTU.
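
   The size rule in the corrected Section 6.10 text above can be
   sketched as a bundling check in C. The names padded_len and
   can_bundle, and the parameter layout, are hypothetical. The sketch
   assumes a 12-byte SCTP common header and chunks padded to a multiple
   of 4 bytes, and it leaves the IP header length to the caller.

```c
#include <assert.h>
#include <stddef.h>

#define SCTP_COMMON_HEADER_LEN 12u /* ports, verification tag, checksum */

/* Hypothetical helper: chunk length rounded up to a multiple of 4. */
size_t padded_len(size_t chunk_len)
{
    return (chunk_len + 3u) & ~(size_t)3u;
}

/* Hypothetical helper: returns 1 if a chunk of `chunk_len` bytes may be
 * added to a packet that already carries `bundled_len` bytes of padded
 * chunks, so that the resulting IP datagram stays within `path_mtu`. */
int can_bundle(size_t ip_header_len, size_t bundled_len,
               size_t chunk_len, size_t path_mtu)
{
    size_t total;

    /* Chunks already in the packet are expected to be padded. */
    assert(bundled_len % 4u == 0u);

    total = ip_header_len + SCTP_COMMON_HEADER_LEN
          + bundled_len + padded_len(chunk_len);
    return total <= path_mtu; /* "less than or equal", per the new text */
}
```

   For example, with a 20-byte IP header and a Path MTU of 1500 bytes,
   an otherwise empty packet has room for a single chunk of at most
   1468 bytes under these assumptions.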

   ---------
   Old text: (Section 10.1)
   ---------

   o  Receive Unacknowledged Message

      Format: RECEIVE_UNACKED(data retrieval id, buffer address, buffer
              size, [,stream id] [, stream sequence number] [,partial
              flag] [,payload protocol-id])

   ---------
   New text: (Section 10.1)
   ---------

   O) Receive Unacknowledged Message

      Format: RECEIVE_UNACKED(data retrieval id, buffer address, buffer
              size, [,stream id] [, stream sequence number] [,partial
              flag] [,payload protocol-id])

   ---------
   Old text: (Appendix C)
   ---------

   ICMP2) An implementation MAY ignore all ICMPv6 messages where the
          type field is not "Destination Unreachable", "Parameter
          Problem",, or "Packet Too Big".

   ---------
   New text: (Appendix C)
   ---------

   ICMP2) An implementation MAY ignore all ICMPv6 messages where the
          type field is not "Destination Unreachable", "Parameter
          Problem", or "Packet Too Big".

3.9.3. Solution Description

   Typos fixed.

3.10. CRC32c Sample Code

3.10.1. Description of the Problem

   The CRC32c computation is described in Appendix B of [RFC4960].
   However, the corresponding sample code and its explanation appear at
   the end of Appendix C, which deals with ICMP handling.

3.10.2. Text Changes to the Document

   Move the sample code related to CRC32c computation and its
   explanation from the end of Appendix C to the end of Appendix B.

3.10.3. Solution Description

   Text moved to the appropriate location.

3.11. partial_bytes_acked after T3-rtx Expiration

3.11.1. Description of the Problem

   Section 7.2.3 of [RFC4960] explicitly states that partial_bytes_acked
   should be reset to 0 after packet loss is detected from a SACK, but
   the same is missing for T3-rtx timer expiration.

3.11.2. Text Changes to the Document

   ---------
   Old text: (Section 7.2.3)
   ---------

   When the T3-rtx timer expires on an address, SCTP should perform slow
   start by:

      ssthresh = max(cwnd/2, 4*MTU)
      cwnd = 1*MTU

   ---------
   New text: (Section 7.2.3)
   ---------

   When the T3-rtx timer expires on an address, SCTP should perform slow
   start by:

      ssthresh = max(cwnd/2, 4*MTU)
      cwnd = 1*MTU
      partial_bytes_acked = 0

3.11.3. Solution Description

   Specify that partial_bytes_acked should be reset to 0 after T3-rtx
   timer expiration.

3.12. Order of Adjustments of partial_bytes_acked and cwnd

3.12.1. Description of the Problem

   Section 7.2.2 of [RFC4960] is unclear about the order of adjustments
   applied to partial_bytes_acked and cwnd in the congestion avoidance
   phase.

3.12.2. Text Changes to the Document

   ---------
   Old text: (Section 7.2.2)
   ---------

   o  When partial_bytes_acked is equal to or greater than cwnd and
      before the arrival of the SACK the sender had cwnd or more bytes
      of data outstanding (i.e., before arrival of the SACK, flightsize
      was greater than or equal to cwnd), increase cwnd by MTU, and
      reset partial_bytes_acked to (partial_bytes_acked - cwnd).

   ---------
   New text: (Section 7.2.2)
   ---------

   o  When partial_bytes_acked is equal to or greater than cwnd and
      before the arrival of the SACK the sender had cwnd or more bytes
      of data outstanding (i.e., before arrival of the SACK, flightsize
      was greater than or equal to cwnd), partial_bytes_acked is reset
      to (partial_bytes_acked - cwnd). Next, cwnd is increased by MTU.

3.12.3. Solution Description

   The new text defines the exact order of adjustments of
   partial_bytes_acked and cwnd in the congestion avoidance phase.

3.13. HEARTBEAT ACK and the association error counter

3.13.1. Description of the Problem

   Section 8.1 and Section 8.3 of [RFC4960] prescribe that the receiver
   of a HEARTBEAT ACK must reset the association overall error counter.
   In some circumstances, e.g. when a router discards DATA chunks but
   not HEARTBEAT chunks due to the larger size of the DATA chunk, it
   might be better to not clear the association error counter on
   reception of the HEARTBEAT ACK and reset it only on reception of the
   SACK to avoid stalling the association.

3.13.2. Text Changes to the Document

   ---------
   Old text: (Section 8.1)
   ---------

   The counter shall be reset each time a DATA chunk sent to that peer
   endpoint is acknowledged (by the reception of a SACK) or a HEARTBEAT
   ACK is received from the peer endpoint.

   ---------
   New text: (Section 8.1)
   ---------

   The counter shall be reset each time a DATA chunk sent to that peer
   endpoint is acknowledged (by the reception of a SACK). When a
   HEARTBEAT ACK is received from the peer endpoint, the counter should
   also be reset. The receiver of the HEARTBEAT ACK may choose not to
   clear the counter if there is outstanding data on the association.
   This allows for handling the possible difference in reachability
   based on DATA chunks and HEARTBEAT chunks.

   ---------
   Old text: (Section 8.3)
   ---------

   Upon the receipt of the HEARTBEAT ACK, the sender of the HEARTBEAT
   should clear the error counter of the destination transport address
   to which the HEARTBEAT was sent, and mark the destination transport
   address as active if it is not so marked. The endpoint may
   optionally report to the upper layer when an inactive destination
   address is marked as active due to the reception of the latest
   HEARTBEAT ACK. The receiver of the HEARTBEAT ACK must also clear the
   association overall error count as well (as defined in Section 8.1).
712 --------- 713 New text: (Section 8.3) 714 --------- 716 Upon the receipt of the HEARTBEAT ACK, the sender of the HEARTBEAT 717 should clear the error counter of the destination transport address 718 to which the HEARTBEAT was sent, and mark the destination transport 719 address as active if it is not so marked. The endpoint may 720 optionally report to the upper layer when an inactive destination 721 address is marked as active due to the reception of the latest 722 HEARTBEAT ACK. The receiver of the HEARTBEAT ACK should also clear 723 the association overall error counter (as defined in Section 8.1). 725 3.13.3. Solution Description 727 The new text provides a possibility to not reset the association 728 overall error counter when a HEARTBEAT ACK is received if there are 729 valid reasons for it. 731 3.14. Path for Fast Retransmission 733 3.14.1. Description of the Problem 735 [RFC4960] clearly describes where to retransmit data that is timed 736 out when the peer is multi-homed but the same is not stated for fast 737 retransmissions. 739 3.14.2. Text Changes to the Document 741 --------- 742 Old text: (Section 6.4) 743 --------- 745 Furthermore, when its peer is multi-homed, an endpoint SHOULD try to 746 retransmit a chunk that timed out to an active destination transport 747 address that is different from the last destination address to which 748 the DATA chunk was sent. 750 --------- 751 New text: (Section 6.4) 752 --------- 754 Furthermore, when its peer is multi-homed, an endpoint SHOULD try to 755 retransmit a chunk that timed out to an active destination transport 756 address that is different from the last destination address to which 757 the DATA chunk was sent. 759 When its peer is multi-homed, an endpoint SHOULD send fast 760 retransmissions to the same destination transport address where 761 original data was sent to. 
If the primary path has been changed and 762 original data was sent there before the fast retransmit, the 763 implementation MAY send it to the new primary path. 765 3.14.3. Solution Description 767 The new text clarifies where to send fast retransmissions. 769 3.15. Transmittal in Fast Recovery 771 3.15.1. Description of the Problem 773 The Fast Retransmit on Gap Reports algorithm intends that only the 774 very first packet may be sent regardless of cwnd in the Fast Recovery 775 phase but rule 3) of [RFC4960], Section 7.2.4, misses this 776 clarification. 778 3.15.2. Text Changes to the Document 780 --------- 781 Old text: (Section 7.2.4) 782 --------- 784 3) Determine how many of the earliest (i.e., lowest TSN) DATA chunks 785 marked for retransmission will fit into a single packet, subject 786 to constraint of the path MTU of the destination transport 787 address to which the packet is being sent. Call this value K. 788 Retransmit those K DATA chunks in a single packet. When a Fast 789 Retransmit is being performed, the sender SHOULD ignore the value 790 of cwnd and SHOULD NOT delay retransmission for this single 791 packet. 793 --------- 794 New text: (Section 7.2.4) 795 --------- 797 3) If not in Fast Recovery, determine how many of the earliest 798 (i.e., lowest TSN) DATA chunks marked for retransmission will fit 799 into a single packet, subject to constraint of the path MTU of 800 the destination transport address to which the packet is being 801 sent. Call this value K. Retransmit those K DATA chunks in a 802 single packet. When a Fast Retransmit is being performed, the 803 sender SHOULD ignore the value of cwnd and SHOULD NOT delay 804 retransmission for this single packet. 806 3.15.3. Solution Description 808 The new text explicitly specifies to send only the first packet in 809 the Fast Recovery phase disregarding cwnd limitations. 811 3.16. Initial Value of ssthresh 812 3.16.1. 
Description of the Problem 814 The initial value of ssthresh should be set arbitrarily high. Using 815 the advertised receiver window of the peer is inappropriate if the 816 peer increases its window after the handshake. Furthermore, use a 817 higher requirements level, since not following the advice may result 818 in performance problems. 820 3.16.2. Text Changes to the Document 822 --------- 823 Old text: (Section 7.2.1) 824 --------- 826 o The initial value of ssthresh MAY be arbitrarily high (for 827 example, implementations MAY use the size of the receiver 828 advertised window). 830 --------- 831 New text: (Section 7.2.1) 832 --------- 834 o The initial value of ssthresh SHOULD be arbitrarily high (e.g., 835 to the size of the largest possible advertised window). 837 3.16.3. Solution Description 839 Use the same value as suggested in [RFC5681], Section 3.1, as an 840 appropriate initial value. Furthermore use the same requirements 841 level. 843 3.17. Automatically Confirmed Addresses 845 3.17.1. Description of the Problem 847 The Path Verification procedure of [RFC4960] prescribes that any 848 address passed to the sender of the INIT by its upper layer is 849 automatically CONFIRMED. This however is unclear if only addresses 850 in the request to initiate association establishment are considered 851 or any addresses provided by the upper layer in any requests (e.g. in 852 'Set Primary'). 854 3.17.2. Text Changes to the Document 855 --------- 856 Old text: (Section 5.4) 857 --------- 859 1) Any address passed to the sender of the INIT by its upper layer 860 is automatically considered to be CONFIRMED. 862 --------- 863 New text: (Section 5.4) 864 --------- 866 1) Any addresses passed to the sender of the INIT by its upper 867 layer in the request to initialize an association is 868 automatically considered to be CONFIRMED. 870 3.17.3. 
Solution Description 872 The new text clarifies that only addresses provided by the upper 873 layer in the request to initialize an association are automatically 874 confirmed. 876 3.18. Only One Packet after Retransmission Timeout 878 3.18.1. Description of the Problem 880 [RFC4960] is not completely clear when it describes data transmission 881 after T3-rtx timer expiration. Section 7.2.1 does not specify how 882 many packets are allowed to be sent after T3-rtx timer expiration if 883 more than one packet fits into cwnd. At the same time, Section 7.2.3 884 contains text, without normative language, saying that SCTP should 885 ensure that no more than one packet will be in flight after T3-rtx 886 timer expiration until successful acknowledgement. This makes the text 887 inconsistent. 889 3.18.2. Text Changes to the Document 890 --------- 891 Old text: (Section 7.2.1) 892 --------- 894 o The initial cwnd after a retransmission timeout MUST be no more 895 than 1*MTU. 897 --------- 898 New text: (Section 7.2.1) 899 --------- 901 o The initial cwnd after a retransmission timeout MUST be no more 902 than 1*MTU and only one packet is allowed to be in flight 903 until successful acknowledgement. 905 3.18.3. Solution Description 907 The new text clearly specifies that only one packet is allowed to be 908 sent after T3-rtx timer expiration until successful acknowledgement. 910 3.19. INIT ACK Path for INIT in COOKIE-WAIT State 912 3.19.1. Description of the Problem 914 When an INIT is received in the COOKIE-WAIT state, [RFC4960] 915 prescribes sending an INIT ACK to the same destination address to 916 which the original INIT was sent. This text does not address 917 the possibility that the upper layer provides multiple remote IP 918 addresses when requesting the association establishment.
If the 919 upper layer has provided multiple IP addresses and only a subset of 920 these addresses are supported by the peer then the destination 921 address of the original INIT may be absent in the incoming INIT and 922 sending INIT ACK to that address is useless. 924 3.19.2. Text Changes to the Document 925 --------- 926 Old text: (Section 5.2.1) 927 --------- 929 Upon receipt of an INIT in the COOKIE-WAIT state, an endpoint MUST 930 respond with an INIT ACK using the same parameters it sent in its 931 original INIT chunk (including its Initiate Tag, unchanged). When 932 responding, the endpoint MUST send the INIT ACK back to the same 933 address that the original INIT (sent by this endpoint) was sent. 935 --------- 936 New text: (Section 5.2.1) 937 --------- 939 Upon receipt of an INIT in the COOKIE-WAIT state, an endpoint MUST 940 respond with an INIT ACK using the same parameters it sent in its 941 original INIT chunk (including its Initiate Tag, unchanged). When 942 responding, the following rules MUST be applied: 944 1) The INIT ACK MUST only be sent to an address passed by the upper 945 layer in the request to initialize the association. 947 2) The INIT ACK MUST only be sent to an address reported in the 948 incoming INIT. 950 3) The INIT ACK SHOULD be sent to the source address of the 951 received INIT. 953 3.19.3. Solution Description 955 The new text requires sending INIT ACK to the destination address 956 that is passed by the upper layer and reported in the incoming INIT. 957 If the source address of the INIT fulfills it then sending the INIT 958 ACK to the source address of the INIT is the preferred behavior. 960 3.20. Zero Window Probing and Unreachable Primary Path 962 3.20.1. Description of the Problem 964 Section 6.1 of [RFC4960] states that when sending zero window probes, 965 SCTP should neither increment the association counter nor increment 966 the destination address error counter if it continues to receive new 967 packets from the peer. 
But receiving new packets from the peer does 968 not guarantee peer's accessibility and, if the destination address 969 becomes unreachable during zero window probing, SCTP cannot get a 970 changed rwnd until it switches the destination address for probes. 972 3.20.2. Text Changes to the Document 974 --------- 975 Old text: (Section 6.1) 976 --------- 978 If the sender continues to receive new packets from the receiver 979 while doing zero window probing, the unacknowledged window probes 980 should not increment the error counter for the association or any 981 destination transport address. This is because the receiver MAY 982 keep its window closed for an indefinite time. Refer to Section 983 6.2 on the receiver behavior when it advertises a zero window. 985 --------- 986 New text: (Section 6.1) 987 --------- 989 If the sender continues to receive SACKs from the peer 990 while doing zero window probing, the unacknowledged window probes 991 should not increment the error counter for the association or any 992 destination transport address. This is because the receiver MAY 993 keep its window closed for an indefinite time. Refer to Section 994 6.2 on the receiver behavior when it advertises a zero window. 996 3.20.3. Solution Description 998 The new text clarifies that if the receiver continues to send SACKs, 999 the sender of probes should not increment the error counter of the 1000 association and the destination address even if the SACKs do not 1001 acknowledge the probes. 1003 3.21. Normative Language in Section 10 1005 3.21.1. Description of the Problem 1007 Section 10 of [RFC4960] is informative and normative language such as 1008 MUST and MAY cannot be used there. However, there are several places 1009 in Section 10 where MUST and MAY are used. 1011 3.21.2. 
Text Changes to the Document 1013 --------- 1014 Old text: (Section 10.1) 1015 --------- 1017 E) Send 1019 Format: SEND(association id, buffer address, byte count [,context] 1021 [,stream id] [,life time] [,destination transport address] 1022 [,unordered flag] [,no-bundle flag] [,payload protocol-id] ) 1023 -> result 1025 ... 1027 o no-bundle flag - instructs SCTP not to bundle this user data with 1028 other outbound DATA chunks. SCTP MAY still bundle even when this 1029 flag is present, when faced with network congestion. 1031 --------- 1032 New text: (Section 10.1) 1033 --------- 1035 E) Send 1037 Format: SEND(association id, buffer address, byte count [,context] 1038 [,stream id] [,life time] [,destination transport address] 1039 [,unordered flag] [,no-bundle flag] [,payload protocol-id] ) 1040 -> result 1042 ... 1044 o no-bundle flag - instructs SCTP not to bundle this user data with 1045 other outbound DATA chunks. SCTP may still bundle even when this 1046 flag is present, when faced with network congestion. 1048 --------- 1049 Old text: (Section 10.1) 1050 --------- 1052 G) Receive 1054 Format: RECEIVE(association id, buffer address, buffer size 1055 [,stream id]) 1056 -> byte count [,transport address] [,stream id] [,stream sequence 1057 number] [,partial flag] [,delivery number] [,payload protocol-id] 1059 ... 1061 o partial flag - if this returned flag is set to 1, then this 1062 Receive contains a partial delivery of the whole message. When 1063 this flag is set, the stream id and Stream Sequence Number MUST 1064 accompany this receive. When this flag is set to 0, it indicates 1065 that no more deliveries will be received for this Stream Sequence 1066 Number. 
1068 --------- 1069 New text: (Section 10.1) 1070 --------- 1072 G) Receive 1074 Format: RECEIVE(association id, buffer address, buffer size 1075 [,stream id]) 1076 -> byte count [,transport address] [,stream id] [,stream sequence 1077 number] [,partial flag] [,delivery number] [,payload protocol-id] 1079 ... 1081 o partial flag - if this returned flag is set to 1, then this 1082 Receive contains a partial delivery of the whole message. When 1083 this flag is set, the stream id and Stream Sequence Number must 1084 accompany this receive. When this flag is set to 0, it indicates 1085 that no more deliveries will be received for this Stream Sequence 1086 Number. 1088 --------- 1089 Old text: (Section 10.1) 1090 --------- 1092 N) Receive Unsent Message 1094 Format: RECEIVE_UNSENT(data retrieval id, buffer address, buffer 1095 size [,stream id] [, stream sequence number] [,partial 1096 flag] [,payload protocol-id]) 1098 ... 1100 o partial flag - if this returned flag is set to 1, then this 1101 message is a partial delivery of the whole message. When this 1102 flag is set, the stream id and Stream Sequence Number MUST 1103 accompany this receive. When this flag is set to 0, it indicates 1104 that no more deliveries will be received for this Stream Sequence 1105 Number. 1107 --------- 1108 New text: (Section 10.1) 1109 --------- 1111 N) Receive Unsent Message 1113 Format: RECEIVE_UNSENT(data retrieval id, buffer address, buffer 1114 size [,stream id] [, stream sequence number] [,partial 1115 flag] [,payload protocol-id]) 1117 ... 1119 o partial flag - if this returned flag is set to 1, then this 1120 message is a partial delivery of the whole message. When this 1121 flag is set, the stream id and Stream Sequence Number must 1122 accompany this receive. When this flag is set to 0, it indicates 1123 that no more deliveries will be received for this Stream Sequence 1124 Number. 
1126 --------- 1127 Old text: (Section 10.1) 1128 --------- 1130 O) Receive Unacknowledged Message 1132 Format: RECEIVE_UNACKED(data retrieval id, buffer address, buffer 1133 size, [,stream id] [, stream sequence number] [,partial 1134 flag] [,payload protocol-id]) 1136 ... 1138 o partial flag - if this returned flag is set to 1, then this 1139 message is a partial delivery of the whole message. When this 1140 flag is set, the stream id and Stream Sequence Number MUST 1141 accompany this receive. When this flag is set to 0, it indicates 1142 that no more deliveries will be received for this Stream Sequence 1143 Number. 1145 --------- 1146 New text: (Section 10.1) 1147 --------- 1149 O) Receive Unacknowledged Message 1151 Format: RECEIVE_UNACKED(data retrieval id, buffer address, buffer 1152 size, [,stream id] [, stream sequence number] [,partial 1153 flag] [,payload protocol-id]) 1155 ... 1157 o partial flag - if this returned flag is set to 1, then this 1158 message is a partial delivery of the whole message. When this 1159 flag is set, the stream id and Stream Sequence Number must 1160 accompany this receive. When this flag is set to 0, it indicates 1161 that no more deliveries will be received for this Stream Sequence 1162 Number. 1164 3.21.3. Solution Description 1166 The normative language is removed from Section 10. 1168 3.22. Increase of partial_bytes_acked in Congestion Avoidance 1170 3.22.1. Description of the Problem 1172 Two issues have been discovered with the partial_bytes_acked handling 1173 described in Section 7.2.2 of [RFC4960]: 1175 o If the Cumulative TSN Ack Point is not advanced but the SACK chunk 1176 acknowledges new TSNs in the Gap Ack Blocks, these newly 1177 acknowledged TSNs are not considered for partial_bytes_acked 1178 although these TSNs were successfully received by the peer. 1179 o Duplicate TSNs are not considered in partial_bytes_acked although 1180 they confirm that the DATA chunks were successfully received by 1181 the peer. 
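The two cases listed above can be made concrete with a small sketch. It is only an illustration under simplifying assumptions: the `Sack` class, the `sizes` map (TSN to chunk size in bytes), the `acked` set, and both function names are hypothetical, TSN wraparound is ignored, and each entry in Duplicate TSNs is counted once per report. The first function follows the RFC 4960 wording (nothing is counted unless the Cumulative TSN Ack Point advances); the second counts newly gap-acked and duplicate TSNs as well.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Sack:
    """Hypothetical, simplified view of a received SACK chunk."""
    cum_tsn_ack: int
    gap_acked_tsns: List[int] = field(default_factory=list)  # TSNs covered by Gap Ack Blocks
    duplicate_tsns: List[int] = field(default_factory=list)  # TSNs reported as duplicates


def _newly_acked(sack: Sack, sizes: Dict[int, int], acked: Set[int]) -> Set[int]:
    # TSNs acknowledged by this SACK that were not acknowledged before.
    # Assumption: plain integer comparison, no serial-number arithmetic.
    covered = {t for t in sizes if t <= sack.cum_tsn_ack} | set(sack.gap_acked_tsns)
    return {t for t in covered if t in sizes and t not in acked}


def pba_increment_rfc4960(sack: Sack, prev_cum_tsn: int,
                          sizes: Dict[int, int], acked: Set[int]) -> int:
    # RFC 4960 wording: only SACKs that advance the Cumulative TSN Ack Point count.
    if sack.cum_tsn_ack <= prev_cum_tsn:
        return 0
    return sum(sizes[t] for t in _newly_acked(sack, sizes, acked))


def pba_increment_revised(sack: Sack, sizes: Dict[int, int], acked: Set[int]) -> int:
    # Accounting that also covers the two cases from the problem description.
    new_bytes = sum(sizes[t] for t in _newly_acked(sack, sizes, acked))
    dup_bytes = sum(sizes.get(t, 0) for t in sack.duplicate_tsns)
    return new_bytes + dup_bytes
```

With four 100-byte chunks outstanding and TSN 1 already acknowledged, a SACK that leaves the cumulative point at 1 but gap-acks TSN 3 and reports TSN 1 as a duplicate contributes nothing under the first function and 200 bytes under the second.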
1183 3.22.2. Text Changes to the Document 1185 --------- 1186 Old text: (Section 7.2.2) 1187 --------- 1189 o Whenever cwnd is greater than ssthresh, upon each SACK arrival 1190 that advances the Cumulative TSN Ack Point, increase 1191 partial_bytes_acked by the total number of bytes of all new chunks 1192 acknowledged in that SACK including chunks acknowledged by the new 1193 Cumulative TSN Ack and by Gap Ack Blocks. 1195 --------- 1196 New text: (Section 7.2.2) 1197 --------- 1199 o Whenever cwnd is greater than ssthresh, upon each SACK arrival, 1200 increase partial_bytes_acked by the total number of bytes of all 1201 new chunks acknowledged in that SACK including chunks acknowledged 1202 by the new Cumulative TSN Ack, by Gap Ack Blocks and by the number 1203 of bytes of duplicated chunks reported in Duplicate TSNs. 1205 3.22.3. Solution Description 1207 Now partial_bytes_acked is increased by TSNs reported as duplicated 1208 as well as TSNs newly acknowledged in Gap Ack Blocks even if the 1209 Cumulative TSN Ack Point is not advanced. 1211 3.23. Inconsistency in Notifications Handling 1213 3.23.1. Description of the Problem 1215 [RFC4960] uses inconsistent normative and non-normative language when 1216 describing rules for sending notifications to the upper layer. For example, 1217 Section 8.2 of [RFC4960] says that when a destination address becomes 1218 inactive due to an unacknowledged DATA chunk or HEARTBEAT chunk, SCTP 1219 SHOULD send a notification to the upper layer while Section 8.3 of 1220 [RFC4960] says that when a destination address becomes inactive due 1221 to an unacknowledged HEARTBEAT chunk, SCTP may send a notification to 1222 the upper layer. 1224 This makes the text inconsistent. 1226 3.23.2. Text Changes to the Document 1228 The following change is based on the change described in Section 3.6.
1230 --------- 1231 Old text: (Section 8.1) 1232 --------- 1234 An endpoint shall keep a counter on the total number of consecutive 1235 retransmissions to its peer (this includes data retransmissions 1236 to all the destination transport addresses of the peer if it is 1237 multi-homed), including the number of unacknowledged HEARTBEAT 1238 chunks observed on the path which currently is used for data 1239 transfer. Unacknowledged HEARTBEAT chunks observed on paths 1240 different from the path currently used for data transfer shall 1241 not increment the association error counter, as this could lead 1242 to association closure even if the path which currently is used for 1243 data transfer is available (but idle). If the value of this 1244 counter exceeds the limit indicated in the protocol parameter 1245 'Association.Max.Retrans', the endpoint shall consider the peer 1246 endpoint unreachable and shall stop transmitting any more data to it 1247 (and thus the association enters the CLOSED state). In addition, the 1248 endpoint MAY report the failure to the upper layer and optionally 1249 report back all outstanding user data remaining in its outbound 1250 queue. The association is automatically closed when the peer 1251 endpoint becomes unreachable. 1253 --------- 1254 New text: (Section 8.1) 1255 --------- 1257 An endpoint shall keep a counter on the total number of consecutive 1258 retransmissions to its peer (this includes data retransmissions 1259 to all the destination transport addresses of the peer if it is 1260 multi-homed), including the number of unacknowledged HEARTBEAT 1261 chunks observed on the path which currently is used for data 1262 transfer. Unacknowledged HEARTBEAT chunks observed on paths 1263 different from the path currently used for data transfer shall 1264 not increment the association error counter, as this could lead 1265 to association closure even if the path which currently is used for 1266 data transfer is available (but idle). 
If the value of this 1267 counter exceeds the limit indicated in the protocol parameter 1268 'Association.Max.Retrans', the endpoint shall consider the peer 1269 endpoint unreachable and shall stop transmitting any more data to it 1270 (and thus the association enters the CLOSED state). In addition, the 1271 endpoint SHOULD report the failure to the upper layer and optionally 1272 report back all outstanding user data remaining in its outbound 1273 queue. The association is automatically closed when the peer 1274 endpoint becomes unreachable. 1276 The following changes are based on [RFC4960]. 1278 --------- 1279 Old text: (Section 8.2) 1280 --------- 1282 When an outstanding TSN is acknowledged or a HEARTBEAT sent to that 1283 address is acknowledged with a HEARTBEAT ACK, the endpoint shall 1284 clear the error counter of the destination transport address to which 1285 the DATA chunk was last sent (or HEARTBEAT was sent). When the peer 1286 endpoint is multi-homed and the last chunk sent to it was a 1287 retransmission to an alternate address, there exists an ambiguity as 1288 to whether or not the acknowledgement should be credited to the 1289 address of the last chunk sent. However, this ambiguity does not 1290 seem to bear any significant consequence to SCTP behavior. If this 1291 ambiguity is undesirable, the transmitter may choose not to clear the 1292 error counter if the last chunk sent was a retransmission. 1294 --------- 1295 New text: (Section 8.2) 1296 --------- 1298 When an outstanding TSN is acknowledged or a HEARTBEAT sent to that 1299 address is acknowledged with a HEARTBEAT ACK, the endpoint shall 1300 clear the error counter of the destination transport address to which 1301 the DATA chunk was last sent (or HEARTBEAT was sent), and SHOULD 1302 also report to the upper layer when an inactive destination address 1303 is marked as active. 
When the peer endpoint is multi-homed and the 1304 last chunk sent to it was a retransmission to an alternate address, 1305 there exists an ambiguity as to whether or not the acknowledgement 1306 should be credited to the address of the last chunk sent. However, 1307 this ambiguity does not seem to bear any significant consequence to 1308 SCTP behavior. If this ambiguity is undesirable, the transmitter may 1309 choose not to clear the error counter if the last chunk sent was a 1310 retransmission. 1312 --------- 1313 Old text: (Section 8.3) 1314 --------- 1316 When the value of this counter reaches the protocol parameter 1317 'Path.Max.Retrans', the endpoint should mark the corresponding 1318 destination address as inactive if it is not so marked, and may also 1319 optionally report to the upper layer the change of reachability of 1320 this destination address. After this, the endpoint should continue 1321 HEARTBEAT on this destination address but should stop increasing the 1322 counter. 1324 --------- 1325 New text: (Section 8.3) 1326 --------- 1328 When the value of this counter exceeds the protocol parameter 1329 'Path.Max.Retrans', the endpoint should mark the corresponding 1330 destination address as inactive if it is not so marked, and SHOULD 1331 also report to the upper layer the change of reachability of this 1332 destination address. After this, the endpoint should continue 1333 HEARTBEAT on this destination address but should stop increasing the 1334 counter. 1336 --------- 1337 Old text: (Section 8.3) 1338 --------- 1340 Upon the receipt of the HEARTBEAT ACK, the sender of the HEARTBEAT 1341 should clear the error counter of the destination transport address 1342 to which the HEARTBEAT was sent, and mark the destination transport 1343 address as active if it is not so marked. The endpoint may 1344 optionally report to the upper layer when an inactive destination 1345 address is marked as active due to the reception of the latest 1346 HEARTBEAT ACK. 
The receiver of the HEARTBEAT ACK must also clear the 1347 association overall error count as well (as defined in Section 8.1). 1349 --------- 1350 New text: (Section 8.3) 1351 --------- 1353 Upon the receipt of the HEARTBEAT ACK, the sender of the HEARTBEAT 1354 should clear the error counter of the destination transport address 1355 to which the HEARTBEAT was sent, and mark the destination transport 1356 address as active if it is not so marked. The endpoint SHOULD 1357 report to the upper layer when an inactive destination address 1358 is marked as active due to the reception of the latest 1359 HEARTBEAT ACK. The receiver of the HEARTBEAT ACK should also clear 1360 the association overall error counter (as defined in Section 8.1). 1362 --------- 1363 Old text: (Section 9.2) 1364 --------- 1366 An endpoint should limit the number of retransmissions of the 1367 SHUTDOWN chunk to the protocol parameter 'Association.Max.Retrans'. 1368 If this threshold is exceeded, the endpoint should destroy the TCB 1369 and MUST report the peer endpoint unreachable to the upper layer (and 1370 thus the association enters the CLOSED state). 1372 --------- 1373 New text: (Section 9.2) 1374 --------- 1376 An endpoint should limit the number of retransmissions of the 1377 SHUTDOWN chunk to the protocol parameter 'Association.Max.Retrans'. 1378 If this threshold is exceeded, the endpoint should destroy the TCB 1379 and SHOULD report the peer endpoint unreachable to the upper layer 1380 (and thus the association enters the CLOSED state). 1382 --------- 1383 Old text: (Section 9.2) 1384 --------- 1386 The sender of the SHUTDOWN ACK should limit the number of 1387 retransmissions of the SHUTDOWN ACK chunk to the protocol parameter 1388 'Association.Max.Retrans'. If this threshold is exceeded, the 1389 endpoint should destroy the TCB and may report the peer endpoint 1390 unreachable to the upper layer (and thus the association enters the 1391 CLOSED state). 
1393 --------- 1394 New text: (Section 9.2) 1395 --------- 1397 The sender of the SHUTDOWN ACK should limit the number of 1398 retransmissions of the SHUTDOWN ACK chunk to the protocol parameter 1399 'Association.Max.Retrans'. If this threshold is exceeded, the 1400 endpoint should destroy the TCB and SHOULD report the peer endpoint 1401 unreachable to the upper layer (and thus the association enters the 1402 CLOSED state). 1404 3.23.3. Solution Description 1406 The inconsistencies are removed by consistently using SHOULD. 1408 3.24. SACK.Delay Not Listed as a Protocol Parameter 1410 3.24.1. Description of the Problem 1412 SCTP as specified in [RFC4960] supports delaying SACKs. The timer 1413 value for this is a parameter and Section 6.2 of [RFC4960] specifies 1414 a default and maximum value for it. However, [RFC4960] neither defines a name for 1415 this parameter nor lists it in the table of protocol parameters in 1416 Section 15. 1418 This issue was reported as an Erratum for [RFC4960] with Errata ID 1419 4656. 1421 3.24.2. Text Changes to the Document 1423 --------- 1424 Old text: (Section 6.2) 1425 --------- 1427 An implementation MUST NOT allow the maximum delay to be configured 1428 to be more than 500 ms. In other words, an implementation MAY lower 1429 this value below 500 ms but MUST NOT raise it above 500 ms. 1431 --------- 1432 New text: (Section 6.2) 1433 --------- 1435 An implementation MUST NOT allow the maximum delay (protocol 1436 parameter 'SACK.Delay') to be configured to be more than 500 ms. 1437 In other words, an implementation MAY lower the value of 1438 SACK.Delay below 500 ms but MUST NOT raise it above 500 ms.
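As a rough illustration of the Section 6.2 bound on SACK.Delay, a configuration check might look like the sketch below. The function and constant names are hypothetical. The 500 ms cap comes from the text above and the 200 ms default is the value this document lists for SACK.Delay in the Section 15 parameter table. Rejecting an out-of-range request, rather than silently clamping it, is an assumption of this sketch and not something the text mandates.

```python
from typing import Optional

SACK_DELAY_DEFAULT_MS = 200  # value listed for SACK.Delay in the Section 15 table
SACK_DELAY_MAX_MS = 500      # upper bound stated in Section 6.2


def validate_sack_delay(requested_ms: Optional[int] = None) -> int:
    """Return the SACK.Delay (in ms) to use for a hypothetical configuration request."""
    if requested_ms is None:
        return SACK_DELAY_DEFAULT_MS
    if requested_ms < 0:
        raise ValueError("SACK.Delay cannot be negative")
    if requested_ms > SACK_DELAY_MAX_MS:
        # MUST NOT be configurable above 500 ms; this sketch refuses the request.
        raise ValueError("SACK.Delay must not exceed 500 ms")
    return requested_ms
```

Lowering the value (for example to 100 ms) is accepted, which matches the "MAY lower" wording.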
1440 --------- 1441 Old text: (Section 15) 1442 --------- 1444 The following protocol parameters are RECOMMENDED: 1446 RTO.Initial - 3 seconds 1447 RTO.Min - 1 second 1448 RTO.Max - 60 seconds 1449 Max.Burst - 4 1450 RTO.Alpha - 1/8 1451 RTO.Beta - 1/4 1452 Valid.Cookie.Life - 60 seconds 1453 Association.Max.Retrans - 10 attempts 1454 Path.Max.Retrans - 5 attempts (per destination address) 1455 Max.Init.Retransmits - 8 attempts 1456 HB.interval - 30 seconds 1457 HB.Max.Burst - 1 1459 --------- 1460 New text: (Section 15) 1461 --------- 1463 The following protocol parameters are RECOMMENDED: 1465 RTO.Initial - 3 seconds 1466 RTO.Min - 1 second 1467 RTO.Max - 60 seconds 1468 Max.Burst - 4 1469 RTO.Alpha - 1/8 1470 RTO.Beta - 1/4 1471 Valid.Cookie.Life - 60 seconds 1472 Association.Max.Retrans - 10 attempts 1473 Path.Max.Retrans - 5 attempts (per destination address) 1474 Max.Init.Retransmits - 8 attempts 1475 HB.interval - 30 seconds 1476 HB.Max.Burst - 1 1477 SACK.Delay - 200 milliseconds 1479 3.24.3. Solution Description 1481 The parameter was given a name and added to the list of protocol 1482 parameters. 1484 3.25. Processing of Chunks in an Incoming SCTP Packet 1486 3.25.1. Description of the Problem 1488 There are a few places in [RFC4960] where the receiver of a packet 1489 must discard it while processing the chunks of the packet. It is 1490 unclear whether the receiver has to roll back state changes already 1491 performed while processing the packet or not. 1493 The intention of [RFC4960] is to process an incoming packet chunk by 1494 chunk and not to perform any prescreening of chunks in the received 1495 packet, so the receiver must only discard the chunk that causes the discard and 1496 all further chunks. 1498 3.25.2. Text Changes to the Document 1500 --------- 1501 Old text: (Section 3.2) 1502 --------- 1504 00 - Stop processing this SCTP packet and discard it, do not 1505 process any further chunks within it.
1507 01 - Stop processing this SCTP packet and discard it, do not 1508 process any further chunks within it, and report the 1509 unrecognized chunk in an 'Unrecognized Chunk Type'. 1511 --------- 1512 New text: (Section 3.2) 1513 --------- 1515 00 - Stop processing this SCTP packet, discard the unrecognized 1516 chunk and all further chunks. 1518 01 - Stop processing this SCTP packet, discard the unrecognized 1519 chunk and all further chunks, and report the unrecognized 1520 chunk in an 'Unrecognized Chunk Type'. 1522 --------- 1523 Old text: (Section 11.3) 1524 --------- 1526 It is helpful for some firewalls if they can inspect just the first 1527 fragment of a fragmented SCTP packet and unambiguously determine 1528 whether it corresponds to an INIT chunk (for further information, 1529 please refer to [RFC1858]). Accordingly, we stress the requirements, 1530 stated in Section 3.1, that (1) an INIT chunk MUST NOT be bundled 1531 with any other chunk in a packet, and (2) a packet containing an INIT 1532 chunk MUST have a zero Verification Tag. Furthermore, we require 1533 that the receiver of an INIT chunk MUST enforce these rules by 1534 silently discarding an arriving packet with an INIT chunk that is 1535 bundled with other chunks or has a non-zero verification tag and 1536 contains an INIT-chunk. 1538 --------- 1539 New text: (Section 11.3) 1540 --------- 1542 It is helpful for some firewalls if they can inspect just the first 1543 fragment of a fragmented SCTP packet and unambiguously determine 1544 whether it corresponds to an INIT chunk (for further information, 1545 please refer to [RFC1858]). Accordingly, we stress the requirements, 1546 stated in Section 3.1, that (1) an INIT chunk MUST NOT be bundled 1547 with any other chunk in a packet, and (2) a packet containing an INIT 1548 chunk MUST have a zero Verification Tag. 
Furthermore, we require 1549 that the receiver of an INIT chunk MUST enforce these rules by 1550 silently discarding the INIT chunk and all further chunks if the INIT 1551 chunk is bundled with other chunks or the packet has a non-zero 1552 verification tag. 1554 3.25.3. Solution Description 1556 The new text makes it clear that chunks can be processed from the 1557 beginning to the end and no rollback or pre-screening is required. 1559 3.26. CWND Increase in Congestion Avoidance Phase 1561 3.26.1. Description of the Problem 1563 [RFC4960] in Section 7.2.2 prescribes increasing cwnd by 1*MTU per 1564 RTT if the sender has cwnd or more bytes of outstanding data to the 1565 corresponding address in the Congestion Avoidance phase. However, 1566 this is described without normative language. Moreover, 1567 Section 7.2.2 includes an algorithm describing how an implementation can achieve 1568 this, but the algorithm is underspecified and actually allows 1569 increasing cwnd by more than 1*MTU per RTT. 1571 3.26.2. Text Changes to the Document 1573 --------- 1574 Old text: (Section 7.2.2) 1575 --------- 1577 When cwnd is greater than ssthresh, cwnd should be incremented by 1578 1*MTU per RTT if the sender has cwnd or more bytes of data 1579 outstanding for the corresponding transport address. 1581 --------- 1582 New text: (Section 7.2.2) 1583 --------- 1585 When cwnd is greater than ssthresh, cwnd should be incremented by 1586 1*MTU per RTT if the sender has cwnd or more bytes of data 1587 outstanding for the corresponding transport address. The basic 1588 guidelines for incrementing cwnd during congestion avoidance are: 1590 o SCTP MAY increment cwnd by 1*MTU. 1592 o SCTP SHOULD increment cwnd by 1*MTU once per RTT when 1593 the sender has cwnd or more bytes of data outstanding for 1594 the corresponding transport address. 1596 o SCTP MUST NOT increment cwnd by more than 1*MTU per RTT.
1598 --------- 1599 Old text: (Section 7.2.2) 1600 --------- 1602 o Whenever cwnd is greater than ssthresh, upon each SACK arrival 1603 that advances the Cumulative TSN Ack Point, increase 1604 partial_bytes_acked by the total number of bytes of all new chunks 1605 acknowledged in that SACK including chunks acknowledged by the new 1606 Cumulative TSN Ack and by Gap Ack Blocks. 1608 o When partial_bytes_acked is equal to or greater than cwnd and 1609 before the arrival of the SACK the sender had cwnd or more bytes 1610 of data outstanding (i.e., before arrival of the SACK, flightsize 1611 was greater than or equal to cwnd), increase cwnd by MTU, and 1612 reset partial_bytes_acked to (partial_bytes_acked - cwnd). 1614 --------- 1615 New text: (Section 7.2.2) 1616 --------- 1618 o Whenever cwnd is greater than ssthresh, upon each SACK arrival, 1619 increase partial_bytes_acked by the total number of bytes of all 1620 new chunks acknowledged in that SACK including chunks acknowledged 1621 by the new Cumulative TSN Ack, by Gap Ack Blocks and by the number 1622 of bytes of duplicated chunks reported in Duplicate TSNs. 1624 o When partial_bytes_acked is greater than cwnd and before the 1625 arrival of the SACK the sender had less bytes of data outstanding 1626 than cwnd (i.e., before arrival of the SACK, flightsize was less 1627 than cwnd), reset partial_bytes_acked to cwnd. 1629 o When partial_bytes_acked is equal to or greater than cwnd and 1630 before the arrival of the SACK the sender had cwnd or more bytes 1631 of data outstanding (i.e., before arrival of the SACK, flightsize 1632 was greater than or equal to cwnd), partial_bytes_acked is reset 1633 to (partial_bytes_acked - cwnd). Next, cwnd is increased by MTU. 1635 3.26.3. Solution Description 1637 The basic guidelines for incrementing cwnd during congestion 1638 avoidance phase are added into Section 7.2.2. The guidelines include 1639 the normative language and are aligned with [RFC5681]. 
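The revised Section 7.2.2 bullets quoted in Section 3.26.2 can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: the `PathCc` class and `on_sack` function are hypothetical names, `bytes_acked` is assumed to already be the total described in the first bullet (new chunks acknowledged by the Cumulative TSN Ack and Gap Ack Blocks, plus duplicate-reported bytes), and slow start is left out.

```python
from dataclasses import dataclass


@dataclass
class PathCc:
    """Hypothetical per-destination congestion control state, in bytes."""
    cwnd: int
    ssthresh: int
    mtu: int
    partial_bytes_acked: int = 0


def on_sack(path: PathCc, bytes_acked: int, flightsize_before: int) -> None:
    """Apply the congestion avoidance bullets for one arriving SACK.

    flightsize_before is the amount of outstanding data before the SACK arrived.
    """
    if path.cwnd <= path.ssthresh:
        return  # slow start is outside the scope of this sketch

    # First bullet: accumulate on every SACK arrival.
    path.partial_bytes_acked += bytes_acked

    if flightsize_before < path.cwnd:
        # Second bullet: the sender was not using the full window, so cap the
        # counter at cwnd instead of letting credit pile up.
        if path.partial_bytes_acked > path.cwnd:
            path.partial_bytes_acked = path.cwnd
    elif path.partial_bytes_acked >= path.cwnd:
        # Third bullet: subtract the current cwnd first, then grow cwnd by one MTU.
        path.partial_bytes_acked -= path.cwnd
        path.cwnd += path.mtu
```

The cap in the second branch is what keeps a sender that was not filling its window from banking enough credit to grow cwnd by more than one MTU in a later round trip.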
1641 The algorithm from Section 7.2.2 is improved to not allow increasing 1642 cwnd by more than 1*MTU per RTT. 1644 3.27. Refresh of cwnd and ssthresh after Idle Period 1646 3.27.1. Description of the Problem 1648 [RFC4960] prescribes adjusting cwnd per RTO if the endpoint does not 1649 transmit data on a given transport address. In addition to that, it 1650 prescribes setting cwnd to the initial value after a sufficiently long 1651 idle period. The latter is excessive. Moreover, it is unclear what 1652 constitutes a sufficiently long idle period. 1654 [RFC4960] does not specify the handling of ssthresh in the idle case. 1655 If ssthresh is reduced due to a packet loss, ssthresh is never 1656 recovered. So traffic can end up in Congestion Avoidance all the 1657 time, resulting in a low sending rate and bad performance. The 1658 problem is even more serious for SCTP because in a multi-homed SCTP 1659 association a traffic switch back to the previously failed primary path 1660 will also lead to the situation where traffic ends up in Congestion 1661 Avoidance. 1663 3.27.2. Text Changes to the Document 1665 --------- 1666 Old text: (Section 7.2.1) 1667 --------- 1669 o The initial cwnd before DATA transmission or after a sufficiently 1670 long idle period MUST be set to min(4*MTU, max (2*MTU, 4380 1671 bytes)). 1673 --------- 1674 New text: (Section 7.2.1) 1675 --------- 1677 o The initial cwnd before DATA transmission MUST be set to 1678 min(4*MTU, max (2*MTU, 4380 bytes)). 1680 --------- 1681 Old text: (Section 7.2.1) 1682 --------- 1684 o When the endpoint does not transmit data on a given transport 1685 address, the cwnd of the transport address should be adjusted to 1686 max(cwnd/2, 4*MTU) per RTO. 1688 --------- 1689 New text: (Section 7.2.1) 1690 --------- 1691 o When the endpoint does not transmit data on a given transport 1692 address, the cwnd of the transport address should be adjusted to 1693 max(cwnd/2, 4*MTU) per RTO.
      At the first cwnd adjustment, the
      ssthresh of the transport address should be adjusted to the cwnd.

3.27.3.  Solution Description

   The rule about cwnd adjustment after a sufficiently long idle period
   is removed.

   The text is updated to refresh ssthresh after the idle period.  When
   the idle period is detected, the cwnd value is stored in ssthresh.

3.28.  Window Updates After Receiver Window Opens Up

3.28.1.  Description of the Problem

   The sending of SACK chunks for window updates is only indirectly
   referenced in [RFC4960], Section 6.2, where it is stated that an SCTP
   receiver must not generate more than one SACK for every incoming
   packet, other than to update the offered window.

   However, the sending of window updates when the receiver window opens
   up is necessary to avoid performance problems.

3.28.2.  Text Changes to the Document

   ---------
   Old text: (Section 6.2)
   ---------

   An SCTP receiver MUST NOT generate more than one SACK for every
   incoming packet, other than to update the offered window as the
   receiving application consumes new data.

   ---------
   New text: (Section 6.2)
   ---------

   An SCTP receiver MUST NOT generate more than one SACK for every
   incoming packet, other than to update the offered window as the
   receiving application consumes new data.  When the window opens
   up, an SCTP receiver SHOULD send additional SACK chunks to update
   the window even if no new data is received.  The receiver MUST avoid
   sending large bursts of window updates.

3.28.3.  Solution Description

   The new text makes clear that additional SACK chunks for window
   updates may be sent as long as excessive bursts are avoided.

3.29.  Path of DATA and Reply Chunks

3.29.1.  Description of the Problem

   Section 6.4 of [RFC4960] describes the transmission policy for
   multi-homed SCTP endpoints.  However, there are the following issues
   with it:

   o  It states that a SACK should be sent to the source address of an
      incoming DATA.  However, it is known that other SACK policies
      (e.g., sending SACKs always to the primary path) may be more
      beneficial in some situations.

   o  Initially it states that an endpoint should always transmit DATA
      chunks to the primary path.  Then it states that the rule for
      transmittal of reply chunks should also be followed if the
      endpoint is bundling DATA chunks together with the reply chunk,
      which contradicts the first statement to always transmit DATA
      chunks to the primary path.  Some implementations had problems
      with this and sent DATA chunks bundled with reply chunks to a
      destination address other than the primary path, which caused
      many gaps.

3.29.2.  Text Changes to the Document

   ---------
   Old text: (Section 6.4)
   ---------

   An endpoint SHOULD transmit reply chunks (e.g., SACK, HEARTBEAT ACK,
   etc.) to the same destination transport address from which it
   received the DATA or control chunk to which it is replying.  This
   rule should also be followed if the endpoint is bundling DATA chunks
   together with the reply chunk.

   However, when acknowledging multiple DATA chunks received in packets
   from different source addresses in a single SACK, the SACK chunk may
   be transmitted to one of the destination transport addresses from
   which the DATA or control chunks being acknowledged were received.

   ---------
   New text: (Section 6.4)
   ---------

   An endpoint SHOULD transmit reply chunks (e.g., INIT ACK, COOKIE ACK,
   HEARTBEAT ACK, etc.) in response to control chunks to the same
   destination transport address from which it received the control
   chunk to which it is replying.

   The selection of the destination transport address for packets
   containing SACK chunks is implementation dependent.  However, an
   endpoint SHOULD NOT vary the destination transport address of a SACK
   when it receives DATA chunks from the same source address.

   When acknowledging multiple DATA chunks received in packets
   from different source addresses in a single SACK, the SACK chunk MAY
   be transmitted to one of the destination transport addresses from
   which the DATA or control chunks being acknowledged were received.

3.29.3.  Solution Description

   The SACK transmission policy is left implementation dependent, but
   the destination address of a packet containing a SACK chunk is not
   to be varied without reason, because doing so may negatively impact
   RTT measurement.

   The confusing statement prescribing that the rule for transmittal of
   reply chunks be followed when the endpoint is bundling DATA chunks
   together with the reply chunk is removed.

4.  IANA Considerations

   This document does not require any actions from IANA.

5.  Security Considerations

   This document does not add any security considerations to those given
   in [RFC4960].

6.  Acknowledgments

   The authors wish to thank Pontus Andersson, Eric W. Biederman,
   Cedric Bonnet, Lionel Morand, Jeff Morriss, Karen E. E. Nielsen,
   Tom Petch and Julien Pourtet for their invaluable comments.

7.  References

7.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC4960]  Stewart, R., Ed., "Stream Control Transmission Protocol",
              RFC 4960, DOI 10.17487/RFC4960, September 2007.

7.2.  Informative References

   [RFC2960]  Stewart, R., Xie, Q., Morneault, K., Sharp, C.,
              Schwarzbauer, H., Taylor, T., Rytina, I., Kalla, M.,
              Zhang, L., and V.
              Paxson, "Stream Control Transmission
              Protocol", RFC 2960, DOI 10.17487/RFC2960, October 2000.

   [RFC4460]  Stewart, R., Arias-Rodriguez, I., Poon, K., Caro, A., and
              M. Tuexen, "Stream Control Transmission Protocol (SCTP)
              Specification Errata and Issues", RFC 4460,
              DOI 10.17487/RFC4460, April 2006.

   [RFC5681]  Allman, M., Paxson, V., and E. Blanton, "TCP Congestion
              Control", RFC 5681, DOI 10.17487/RFC5681, September 2009.

Authors' Addresses

   Randall R. Stewart
   Netflix, Inc.
   Chapin, SC 29036
   United States

   Email: randall@lakerest.net

   Michael Tuexen
   Muenster University of Applied Sciences
   Stegerwaldstrasse 39
   48565 Steinfurt
   Germany

   Email: tuexen@fh-muenster.de

   Maksim Proshin
   Ericsson
   Kistavaegen 25
   Stockholm 164 80
   Sweden

   Email: mproshin@tieto.mera.ru