Network Working Group M.
Bagnulo
Internet-Draft                                                      UC3M
Intended status: Experimental                                 B. Briscoe
Expires: October 15, 2017                            Simula Research Lab
                                                          April 13, 2017


Adding Explicit Congestion Notification (ECN) to TCP control packets and
                          TCP retransmissions
                 draft-bagnulo-tcpm-generalized-ecn-01

Abstract

   This document describes an experimental modification to ECN when used
   with TCP.  It allows the use of ECN on the following TCP packets:
   SYNs, Pure ACKs, Window probes, FINs, RSTs and retransmissions.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current
   Internet-Drafts is at http://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on October 15, 2017.

Copyright Notice

   Copyright (c) 2017 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Motivation
     1.2.  Experiment goals
     1.3.  Document structure
   2.  Terminology
   3.  Specification
     3.1.  Network behaviour
     3.2.  Endpoint behaviour
       3.2.1.  SYN
       3.2.2.  SYN-ACK
       3.2.3.  Pure ACK
       3.2.4.  Window Probe
       3.2.5.  FIN
       3.2.6.  RST
       3.2.7.  Retransmissions
   4.  Interaction with popular variants or derivatives of TCP
     4.1.  SCTP
     4.2.  TFO
     4.3.  IW10
   5.  Discussion of the arguments in RFC 3168
     5.1.  The reliability argument
     5.2.  SYNs
       5.2.1.  Argument 1a: Loss of congestion notification on the SYN
       5.2.2.  Argument 1b: Unknown Handling of Unexpected ECN
       5.2.3.  Argument 2: DoS attacks.
     5.3.  Pure ACKs.
     5.4.  Window probes
     5.5.  Retransmitted packets.
   6.  Security considerations
   7.  IANA Considerations
   8.  Acknowledgments
   9.  References
     9.1.  Normative References
     9.2.  Informative References
   Authors' Addresses

1.  Introduction

   RFC 3168 [RFC3168] specifies support of Explicit Congestion
   Notification (ECN) in IP (v4 and v6).  By using the ECN capability,
   switches performing Active Queue Management (AQM) can use ECN marks
   instead of packet drops to signal congestion to the endpoints of a
   communication.  This results in lower packet loss and increased
   performance.  RFC 3168 also specifies support for ECN in TCP, but
   solely on data packets.  For various reasons it precludes the use of
   ECN on TCP control packets (TCP SYN, TCP SYN-ACK, pure ACKs, Window
   probes) and on retransmitted packets.  RFC 3168 is silent about the
   use of ECN on RST and FIN packets.  RFC 5562 [RFC5562] is an
   experimental modification to ECN that enables ECN support for TCP
   SYN-ACK packets.

   This document defines an experimental modification to ECN [RFC3168]
   that enables ECN support on all the aforementioned types of TCP
   packet.  [I-D.ietf-tsvwg-ecn-experimentation] is a standards track
   procedural device that updates RFC 3168 to allow the present
   experiment, which RFC 3168 would otherwise prohibit.

1.1.  Motivation

   The absence of ECN support on TCP control packets and retransmissions
   has a potentially harmful effect.  In any ECN deployment,
   non-ECN-capable packets suffer a penalty when they traverse a
   congested bottleneck.  For instance, with a drop probability of 1%,
   1% of connection attempts suffer a timeout of about 1 second before
   the SYN is retransmitted, which is highly detrimental to the
   performance of short flows.
   TCP control packets, such as TCP SYNs and pure ACKs,
   are important for performance, so dropping them is best avoided.

   Non-ECN control packets particularly harm performance in environments
   where the ECN marking level is high.  For example, [judd-nsdi] shows
   that in a data centre (DC) environment where ECN is used (in
   conjunction with DCTCP), the probability of being able to establish a
   new connection using a non-ECN SYN packet drops to close to zero even
   when there are only 16 ongoing TCP flows transmitting at full speed.
   In this data centre context, the issue is that DCTCP's aggressive
   response to packet marking leads to a high marking probability for
   ECN-capable packets, and in turn a high drop probability for non-ECN
   packets.  Therefore non-ECN SYNs are dropped aggressively, rendering
   it nearly impossible to establish a new connection in the presence of
   even mild traffic load.

   Finally, there are ongoing experimental efforts to promote the
   adoption of a slightly modified variant of DCTCP (and similar
   congestion controls) over the Internet to achieve low latency, low
   loss and scalable throughput (L4S) for all communications
   [I-D.briscoe-tsvwg-l4s-arch].  In such an approach, L4S packets
   identify themselves using an ECN codepoint.  Preventing TCP control
   packets from obtaining the benefits of ECN would not only expose them
   to the prevailing level of congestion loss, but it would also stop
   them from being classified into the low latency (L4S) queue, which
   would greatly degrade L4S performance.

1.2.  Experiment goals

   The goal of the experimental modifications defined in this document
   is to allow the use of ECN (both ECT and CE codepoints) on all TCP
   packets.
   Experiments are expected in the public Internet as well as
   in controlled environments to understand the following issues:

   o  How SYNs, Window probes, pure ACKs, FINs, RSTs and retransmissions
      that carry the ECT(0), ECT(1) or CE codepoints are processed by
      the TCP endpoints and the network (including routers, firewalls
      and other middleboxes).  In particular we would like to learn if
      these packets are frequently blocked or if these packets are
      usually forwarded and processed.

   o  The scale of deployment of the different flavours of ECN,
      including [RFC3168], [RFC5562], [RFC3540] and
      [I-D.ietf-tcpm-accurate-ecn].

   o  How much the performance of TCP communications is improved by
      allowing ECN marking of each packet type.

   o  What issues (including security issues) are raised by enabling
      ECN marking of these packets.

   The data gathered through the experiments described in this document,
   particularly under the first two bullets above, will help in the
   design of the final mechanism (if any) for adding ECN support to the
   different packet types considered in this document.  Wherever data
   input is needed to assist in a design choice, the document says so
   explicitly.

   Success criteria: The experiment will be a success if we obtain
   enough data to have a clearer view of the deployability and benefits
   of ECN marking all TCP packets, as well as any issues.  If the
   results of the experiment show that it is feasible to deploy such
   changes; that there are gains to be achieved through the changes
   described in this specification; and that no other major issues may
   interfere with the deployment of the proposed changes; then it would
   be reasonable to adopt the proposed changes in a standards track
   specification that would update RFC 3168.

1.3.  Document structure

   The remainder of this document is structured as follows.
   In Section 2, we present the terminology used in the rest of the
   document.  In Section 3, we specify the modifications to provide ECN
   support to TCP SYNs, pure ACKs, Window probes, FINs, RSTs and
   retransmissions.  We describe both the network behaviour and the
   endpoint behaviour.  Section 4 discusses variations of the
   specification that will be necessary to interwork with a number of
   popular variants or derivatives of TCP.  RFC 3168 provides a number
   of specific reasons why ECN support is not appropriate for each
   packet type.  In Section 5, we revisit each of these arguments and
   explore the possibility of enabling the ECN capability for each
   packet type in turn.

2.  Terminology

   The keywords MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD,
   SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL, when they appear in this
   document, are to be interpreted as described in [RFC2119].

   Pure ACK: A TCP segment with the ACK flag set and no data payload.

   SYN: A TCP segment with the SYN (synchronize) flag set.  It may carry
   data if TCP Fast Open is used.

   Window probe: Defined in [RFC1122], a window probe is a TCP segment
   with only one byte of data sent to learn if the receive window is
   still zero.

   FIN: A TCP segment with the FIN (finish) flag set.

   RST: A TCP segment with the RST (reset) flag set.

   Retransmission: A TCP segment that has been retransmitted by the TCP
   sender because it determined that the original segment was lost,
   which may or may not be the case.

   ECT: ECN-Capable Transport.  One of the two codepoints ECT(0) or
   ECT(1) in the ECN field [RFC3168] of the IP header (v4 or v6).  An
   ECN-capable sender sets one of these to indicate that both transport
   end-points support ECN.  When this specification says the sender sets
   an ECT codepoint, by default it means ECT(0).
   Optionally, it could
   mean ECT(1), which is in the process of being redefined for use by
   L4S experiments [I-D.ietf-tsvwg-ecn-experimentation]
   [I-D.briscoe-tsvwg-ecn-l4s-id].

   Not-ECT: The ECN codepoint that indicates that the transport is not
   ECN-capable.

   CE: Congestion Experienced.  The ECN codepoint that an intermediate
   node sets to indicate congestion [RFC3168].  A node sets an
   increasing proportion of ECT packets to CE as the level of congestion
   increases.

3.  Specification

3.1.  Network behaviour

   Previously the specification of ECN for TCP [RFC3168] required the
   sender to set not-ECT on TCP control packets and retransmissions.
   Some readers might have erroneously interpreted this as a requirement
   for firewalls, intrusion detection systems, etc. to check and enforce
   this behaviour.  Now that the present experimental specification
   allows TCP senders to set ECT on all TCP packets (control and data),
   it needs to be clear that a firewall (or any network node) SHOULD NOT
   treat any ECN-capable packet differently depending on what type of
   TCP packet it is.

   The previous sentence says "SHOULD NOT" rather than "MUST NOT"
   because one potential exception is envisaged.  A security function
   that has detected an ongoing attack MAY drop more ECT-marked SYNs
   than not-ECT-marked SYNs.  Such a policy MUST NOT be applied
   routinely.  It can only be applied if an attack is detected, and
   preferably only if it is determined that the ECT capability is
   intensifying the attack.

3.2.  Endpoint behaviour

   The changes to the specification of TCP over ECN [RFC3168] defined
   here solely alter the behaviour of a sending host.

   The feedback behaviour at the receiver depends on whether classic ECN
   TCP feedback [RFC3168] or Accurate ECN (AccECN) TCP feedback
   [I-D.ietf-tcpm-accurate-ecn] has been negotiated.
   Nonetheless,
   neither receiver feedback behaviour is altered by the present
   specification.

   For each type of control packet or retransmission, the following
   sections detail changes to the sender's behaviour in two respects: i)
   whether it sets ECT; and ii) its response to congestion feedback.
   Table 1 summarises these two behaviours for each type of packet, but
   the relevant subsection below should be referred to for the detailed
   behaviour.  The subsection on the SYN is more complex than the
   others, because it has to include fall-back behaviour if the ECT
   packet appears not to have got through, and caching of the outcome to
   detect persistent failures.

   +----------+------------------+--------------------+----------------+
   | TCP      | ECN field if     | ECN field if RFC   | Congestion     |
   | packet   | AccECN f/b       | 3168 f/b           | Response       |
   | type     | negotiated*      | negotiated*        |                |
   +----------+------------------+--------------------+----------------+
   | SYN      | ECT              | not-ECT            | Reduce IW      |
   |          |                  |                    |                |
   | SYN-ACK  | ECT              | ECT                | Reduce IW as   |
   |          |                  |                    | in [RFC5562]   |
   |          |                  |                    |                |
   | Pure ACK | ECT              | ECT                | None or        |
   |          |                  |                    | optionally     |
   |          |                  |                    | [RFC5690]      |
   |          |                  |                    |                |
   | W Probe  | ECT              | ECT                | Usual response |
   |          |                  |                    |                |
   | FIN      | ECT              | ECT                | None or        |
   |          |                  |                    | optionally     |
   |          |                  |                    | [RFC5690]      |
   |          |                  |                    |                |
   | RST      | ECT              | ECT                | N/A            |
   |          |                  |                    |                |
   | Re-XMT   | ECT              | ECT                | Usual response |
   +----------+------------------+--------------------+----------------+

   Window probe and retransmission are abbreviated to W Probe and Re-XMT.
   * For a SYN, "negotiated" means "requested".

    Table 1: Summary of sender behaviour.  In each case the relevant
      section below should be referred to for the detailed behaviour

   It can be seen that the sender can set ECT in all cases, except if it
   is not requesting AccECN feedback on the SYN.
   Therefore it is
   RECOMMENDED that the experimental AccECN specification
   [I-D.ietf-tcpm-accurate-ecn] is implemented, because it is expected
   that ECT on the SYN will give the most significant performance gain,
   particularly for short flows.  Nonetheless, this specification also
   caters for the case where AccECN feedback is not implemented.

3.2.1.  SYN

3.2.1.1.  Setting ECT on the SYN

   With classic [RFC3168] ECN feedback, the SYN was never expected to be
   ECN-capable, so the flag provided to feed back congestion was put to
   another use (it is used in combination with other flags to indicate
   that the responder supports ECN).  In contrast, Accurate ECN (AccECN)
   feedback [I-D.ietf-tcpm-accurate-ecn] provides a codepoint in the
   SYN-ACK for the responder to feed back that the SYN arrived marked
   CE.

   Therefore, a TCP initiator MUST NOT set ECT on a SYN unless it also
   attempts to negotiate Accurate ECN feedback in the same SYN.

   For the experiments proposed here, if the SYN is requesting AccECN
   feedback, the TCP sender will also set ECT on the SYN.  It can ignore
   the prohibition in section 6.1.1 of RFC 3168 against setting ECT on
   such a SYN.

   The following subsections about the SYN solely apply to this case
   where the initiator sent an ECT SYN.

3.2.1.2.  Caching Failed Connection Attempts

   Until AccECN servers become widely deployed, a TCP initiator that
   implements AccECN and sets ECT on a SYN SHOULD also maintain a cache
   per server to record any failure of the previous attempt.  It SHOULD
   record whether a server does not support AccECN and MAY record
   whether the ECT SYN is persistently lost (see fall-back below).  The
   TCP initiator will not subsequently attempt any behaviour recorded as
   persistently problematic.  However, cache entries should expire, so
   that the initiator occasionally re-checks whether each problem has
   been resolved.
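
   The caching behaviour described above could be sketched roughly as
   follows.  This is only an illustration under stated assumptions, not
   part of the specification: the class and method names, the one-hour
   entry lifetime and the threshold of three losses for "persistently
   lost" are all hypothetical, since this document does not specify
   any of them.

```python
import time
from dataclasses import dataclass

# Hypothetical tuning values; the draft does not specify either of them.
CACHE_LIFETIME_S = 3600.0   # how long a recorded failure is trusted
LOSS_THRESHOLD = 3          # ECT SYN losses treated as "persistent"


@dataclass
class ServerEntry:
    no_accecn: bool = False   # server replied without AccECN (SHOULD record)
    ect_syn_losses: int = 0   # timeouts after an ECT SYN (MAY record)
    expires_at: float = 0.0   # entry is discarded at or after this time


class EcnSynCache:
    """Per-server record of failed ECT SYN attempts, with expiry."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}

    def _live_entry(self, server):
        # Return the entry for this server, dropping it if it has expired.
        entry = self._entries.get(server)
        if entry is not None and self._clock() >= entry.expires_at:
            del self._entries[server]
            entry = None
        return entry

    def _touch(self, server):
        # Fetch or create the entry and restart its lifetime.
        entry = self._live_entry(server) or ServerEntry()
        entry.expires_at = self._clock() + CACHE_LIFETIME_S
        self._entries[server] = entry
        return entry

    def record_no_accecn(self, server):
        self._touch(server).no_accecn = True

    def record_ect_syn_loss(self, server):
        self._touch(server).ect_syn_losses += 1

    def should_send_ect_syn(self, server):
        # With no (unexpired) failure on record, use the default ECT SYN.
        entry = self._live_entry(server)
        if entry is None:
            return True
        return not entry.no_accecn and entry.ect_syn_losses < LOSS_THRESHOLD
```

   Only failures are stored, so a server with no entry gets the default
   ECT SYN, and an expired entry causes the initiator to try again.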

   There is no need to cache successful attempts, because the default
   ECT SYN behaviour performs optimally on success.

   Servers that do not support ECN as a whole can be recorded as not
   supporting AccECN and do not need to be distinguished, because there
   is no performance penalty in always attempting to negotiate classic
   [RFC3168] ECN support.

3.2.1.3.  SYN Congestion Response

   Here, we use IW0 to denote the initial window of the TCP initiator
   [RFC5681].

   If the SYN-ACK returned to the TCP initiator confirms that the server
   supports AccECN, it will also indicate whether or not the SYN was
   CE-marked.  If the SYN was CE-marked, the initiator MUST reduce its
   Initial Window (IW) and SHOULD reduce it to 1 SMSS (sender maximum
   segment size).

   If the SYN-ACK shows that the server does not support AccECN, the TCP
   initiator MUST conservatively reduce its Initial Window and SHOULD
   reduce it to 1 SMSS.  A reduction to greater than 1 SMSS MAY be
   appropriate (see discussion below).  Conservatism is necessary
   because a non-AccECN SYN-ACK cannot show whether the SYN was
   CE-marked.

   If the TCP initiator (host A) receives a SYN from the remote end
   (host B) after it has sent a SYN to B, it indicates the (unusual)
   case of a simultaneous open.  Host A will respond with a SYN-ACK.
   Host A will probably then receive a SYN-ACK in response to its own
   SYN, after which it can follow the appropriate one of the two
   paragraphs above.

   In all the above cases, the initiator does not have to back off its
   retransmission timer as it would in response to a timeout following
   no response to its SYN [RFC6298], because both the SYN and the
   SYN-ACK have been successfully delivered through the network.
   Also, the
   initiator does not need to exit slow start or reduce ssthresh, which
   is not even required when a SYN is lost [RFC5681].

      DISCUSSION: In the case where the server does not support AccECN,
      because we impose a conservative reduction in initial window, we
      are penalizing those that deploy AccECN with ECT SYNs, rather than
      improving performance as intended.  Nonetheless, if such cases are
      cached, performance will only suffer on the first attempt to
      access a non-AccECN server.  Also, the data sent initially by a
      TCP client is often a small request that usually fits within 1
      SMSS anyway {ToDo: reference? (this information was given
      informally by Yuchung Cheng)}.

   See Section 4 for cases where TCP Fast Open (TFO [RFC7413]) or an
   initial window of 10 (IW10 [RFC6928]) are also implemented.

3.2.1.4.  Fall-back Following a Lost ECT SYN (or SYN-ACK)

   An ECT SYN might be lost due to an over-zealous path element (or
   server) blocking ECT packets that do not conform to RFC 3168.
   However, loss is commonplace for numerous other reasons, e.g.
   congestion loss at a non-ECN queue on the forward or reverse path,
   transmission errors, etc.  Alternatively, the cause of the blockage
   might be the attempt to negotiate AccECN, or possibly other unrelated
   options on the SYN.

   To expedite connection set-up if the retransmission timer expires
   after sending an ECT SYN, the TCP initiator SHOULD send a SYN
   with the not-ECT codepoint in the IP header and not attempt to
   negotiate AccECN.  It would make sense to also remove any other
   experimental fields or options on the SYN, but that will depend on
   the specification of the other option(s).  Other fall-back strategies
   that are considered to improve performance MAY be adopted.
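
   As a rough illustration, the fall-back rule above and the SYN
   congestion response of Section 3.2.1.3 can be combined into one
   decision made once the outcome of an ECT SYN is known.  The sketch
   below is not normative and all of its names (SynOutcome, NextStep,
   after_ect_syn) are hypothetical.  It assumes the initiator follows
   the SHOULD and reduces the initial window to exactly 1 SMSS, and
   that on a timeout it applies the default fall-back rather than some
   other strategy.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class SynOutcome(Enum):
    ACCECN_NOT_CE = auto()  # SYN-ACK confirms AccECN, SYN was not CE-marked
    ACCECN_CE = auto()      # SYN-ACK confirms AccECN, SYN was CE-marked
    NO_ACCECN = auto()      # SYN-ACK shows the server lacks AccECN
    TIMEOUT = auto()        # retransmission timer expired, no SYN-ACK seen


@dataclass(frozen=True)
class NextStep:
    initial_window_smss: Optional[int]  # None while still handshaking
    resend_syn: bool
    syn_ect: bool = False               # ECN field of the resent SYN
    syn_requests_accecn: bool = False   # AccECN negotiation on the resent SYN


def after_ect_syn(outcome: SynOutcome, default_iw_smss: int) -> NextStep:
    """Decide what the initiator does after sending an ECT SYN."""
    if outcome is SynOutcome.TIMEOUT:
        # Fall-back: resend as not-ECT and without negotiating AccECN.
        return NextStep(None, resend_syn=True,
                        syn_ect=False, syn_requests_accecn=False)
    if outcome is SynOutcome.ACCECN_NOT_CE:
        # No congestion signalled on the SYN: keep the usual initial window.
        return NextStep(default_iw_smss, resend_syn=False)
    # Either the SYN was CE-marked, or a non-AccECN SYN-ACK cannot say
    # whether it was, so the initial window is reduced (here to 1 SMSS).
    return NextStep(1, resend_syn=False)
```

   For example, after_ect_syn(SynOutcome.NO_ACCECN, 10) yields an
   initial window of 1 SMSS, whereas SynOutcome.ACCECN_NOT_CE leaves the
   caller's default of 10 untouched.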

   If the TCP initiator is caching failed connection attempts, it SHOULD
   NOT give up using ECT on the first SYN of subsequent connection
   attempts until it is clear that the blockage persistently and
   specifically affects ECT on SYNs.  This is because loss is so
   commonplace for other reasons.

      DISCUSSION: If initial experiments show that blocking of ECT on
      SYNs is widespread, it MAY be necessary to cache successful
      attempts as well as failures.  Then, if there is no entry in the
      cache for a particular server, the TCP initiator could send a
      not-ECT SYN soon after the first ECT SYN.  This would reduce the
      performance penalty for those deploying ECT SYN support.

3.2.2.  SYN-ACK

   To comply with the present specification, the responder (server) part
   of a TCP implementation MUST also comply with [RFC5562], which
   defines the use of ECT on a SYN-ACK and the congestion response of
   the TCP listener if a SYN-ACK is CE-marked.

   Feedback by the initiator in response to a CE-marked SYN-ACK from the
   responder depends on whether classic ECN feedback or AccECN feedback
   [I-D.ietf-tcpm-accurate-ecn] has been negotiated.  In either case no
   change is required to RFC 5562 or the AccECN specification
   respectively.

3.2.3.  Pure ACK

   For the experiments proposed here, the TCP implementation will set
   ECT on Pure ACKs.  It can ignore the requirement in section 6.1.4 of
   RFC 3168 to set not-ECT on a Pure ACK.

   TCP does not normally detect or respond to loss of pure ACKs.
   Therefore, no response to CE markings on Pure ACKs is required
   in order to comply with the present specification.  Nonetheless, a
   congestion response is not precluded either.  It could be arranged
   using any one of the following approaches.

   TCP never acknowledges Pure ACKs.
   So classic [RFC3168] ECN provides
   no mechanism to feed back a CE marking on a Pure ACK, unless the
   feedback is added to the ACK of a later data packet (if one arises).

   In contrast, an AccECN receiver [I-D.ietf-tcpm-accurate-ecn]
   continually feeds back a count of the number of CE-marked packets
   that it has received (and, if possible, a count of CE-marked bytes).
   So a TCP sender that has negotiated AccECN and is setting ECT on pure
   ACKs will receive congestion feedback if any Pure ACKs are CE-marked
   in transit.

   In either case (classic or AccECN feedback), if the TCP sender does
   receive feedback about CE-markings on Pure ACKs, it will react in the
   usual way by reducing its congestion window accordingly.  This will
   regulate the rate of any data packets it is sending amongst the Pure
   ACKs.  However, reducing the congestion window will have no effect on
   the rate of Pure ACKs.  So while it is only sending Pure ACKs the
   sender will not be responding to congestion.

   Any pair of TCP end-points can already choose to regulate the rate of
   Pure ACKs by agreeing to regulate the delayed ACK ratio in response
   to loss or CE-marking of Pure ACKs, using the Acknowledgement
   Congestion Control (AckCC) techniques documented in [RFC5690]
   (informational).  However, AckCC is not required.

   RFC 5690 proposed new TCP options to address the problems that TCP
   had no mechanism to allow ECT to be set on Pure ACKs and no mechanism
   to feed back loss or CE-marking of Pure ACKs.  A combination of the
   present specification and AccECN addresses both these problems, at
   least for ECN marking.  So it might now be possible to design an
   ECN-specific ACK congestion control scheme without the extra TCP
   options proposed in RFC 5690.  However, such a mechanism is out of
   scope of the present document.

3.2.4.
Window Probe

   For the experiments proposed here, the TCP sender will set ECT on
   window probes.  It can ignore the prohibition in section 6.1.6 of RFC
   3168 against setting ECT on a window probe.

   A window probe contains a single octet, so it is no different from a
   regular TCP data segment.  Therefore a TCP receiver will feed back
   any CE marking on a window probe as normal (either using classic ECN
   feedback or AccECN feedback).  The sender of the probe will then
   reduce its congestion window as normal.

   A receive window of zero indicates that the application is not
   consuming data fast enough and does not imply anything about network
   congestion.  Once the receive window opens, the congestion window
   might become the limiting factor, so it is correct that CE-marked
   probes reduce the congestion window.  However, CE-marking on window
   probes does not reduce the rate of the probes themselves.  This is
   unlikely to present a problem, given a window probe is sent only
   every 2 minutes [RFC0793] as long as the receiver is advertising a
   zero window.

3.2.5.  FIN

   A TCP implementation can set ECT on a FIN.

   A congestion response to a CE-marking on a FIN is not required.

   After sending a FIN, the endpoint will not send any more data in the
   connection.  Therefore, even if the FIN-ACK indicates that the FIN
   was CE-marked (whether using classic or AccECN feedback), reducing
   the congestion window will not affect anything.

   After sending a FIN, a host might send one or more pure ACKs.  If it
   is using one of the techniques in Section 3.2.3 to regulate the
   delayed ACK ratio for Pure ACKs, it could equally be applied after a
   FIN.  But this is not required.

3.2.6.  RST

   A TCP implementation can set ECT on a RST.

   A congestion response to a CE-marking on a RST is not required (and
   actually not possible).

   The host generating the RST message does not have an open connection
   after sending it (either because there was no such connection when
   the packet that triggered the RST message was received or because the
   packet that triggered the RST message also triggered the closure of
   the connection).

   Moreover, the receiver of a CE-marked RST message can either: i)
   accept the RST message and close the connection; ii) emit a so-called
   challenge ACK in response (with suitable throttling) [RFC5961] and
   otherwise ignore the RST (e.g. because the sequence number is
   in-window but not the precise number expected next); or iii) discard
   the RST message (e.g. because the sequence number is out-of-window).
   In the first two cases there is no point in echoing any CE mark
   received because the sender closed its connection when it sent the
   RST.  In the third case it makes sense to discard the CE signal as
   well as the RST.  So, in all these cases it does not make sense to
   generate feedback about a CE mark on a RST message.

   The following factors have been considered before deciding whether
   ECT ought to be allowed on a RST message:

   o  As explained above, a congestion response by the sender of a
      CE-marked RST message is not possible;

   o  So the only reason for the sender setting ECT on a RST would be to
      improve the reliability of the message's delivery;

   o  RST messages are used to both mount and mitigate attacks:

      *  Spoofed RST messages are used by attackers to terminate ongoing
         connections, although the mitigations in RFC 5961 have
         considerably raised the bar against off-path RST attacks;

      *  Legitimate RST messages allow endpoints to inform their peers
         to eliminate existing state that corresponds to non-existent
         connections, liberating resources e.g.
         in DoS attack scenarios;

   o  AQMs are advised to disable ECN marking during persistent
      overload, so:

      *  it is harder for an attacker to exploit ECN to intensify an
         attack;

      *  it is harder for a legitimate user to exploit ECN to more
         reliably mitigate an attack;

   o  Prohibiting ECT on a RST would deny the benefit of ECN to
      legitimate RST messages, but not to attackers who can disregard
      RFCs;

   o  If ECT were prohibited on RSTs, security middleboxes could discard
      any RSTs that were exploiting ECN to intensify an attack;

   o  However, unlike a SYN flood, a RST flood is easier to distinguish
      from legitimate traffic, so it is easier to ignore or eliminate
      without harming legitimate traffic.

   So, on balance, it has been decided that it is not necessary to
   prohibit ECT on RSTs.  However, there is always the possibility that
   someone might demonstrate a new RST attack that proves this decision
   to be unwise.

3.2.7.  Retransmissions

   For the experiments proposed here, the TCP sender will set ECT on
   retransmitted segments.  It can ignore the prohibition in section
   6.1.5 of RFC 3168 against setting ECT on retransmissions.
   Nonetheless, the requirement in RFC 3168 that "the TCP data receiver
   SHOULD ignore the CE codepoint on out-of-window packets" still holds.

   If the TCP sender receives feedback that a retransmitted packet was
   CE-marked, it will react as it would to any feedback of CE-marking on
   a data packet.

4.  Interaction with popular variants or derivatives of TCP

   The following subsections specify additional behaviour necessary when
   setting ECT on all data and control packets while using the following
   popular variants or derivatives of TCP: SCTP, TFO, IW10.
   The subsection on IW10 discusses changes to specifications but does
   not recommend any, because the specification as it stands is safe,
   and there is only a corner-case where performance could be
   occasionally improved.

   TCP variants that have been assessed and found not to interact
   adversely with ECT on TCP control packets are: SYN cookies (see
   Appendix A of [RFC4987]) and L4S [I-D.briscoe-tsvwg-l4s-arch].

4.1.  SCTP

   Stream Control Transmission Protocol (SCTP [RFC4960]) is a standards
   track protocol derived from TCP.  SCTP currently does not include ECN
   support, but a draft on the addition of ECN to SCTP has been produced
   [I-D.stewart-tsvwg-sctpecn].  That draft avoids setting ECT on
   control packets and retransmissions, closely following the arguments
   in RFC 3168.  When ECN is finally added to SCTP, experience from
   experiments on adding ECN support to all TCP packets ought to be
   directly transferable to SCTP.

4.2.  TFO

   TCP Fast Open (TFO [RFC7413]) is an experiment to remove the round
   trip delay of TCP's 3-way hand-shake (3WHS).  A TFO initiator caches
   a cookie from a previous connection with a TFO-enabled server.  Then,
   for subsequent connections to the same server, any data included on
   the SYN and any other data segments sent directly after the SYN (up
   to the initial window limit) can be passed directly to the server
   application, which can then return response data with the SYN-ACK
   (again, up to the initial window limit).

   If a TFO initiator has cached that the server supported ECN in the
   previous connection, it would be safe to set ECT on any data segments
   it sends before a SYN-ACK returns from the responder (server).  Note
   that there is no space in the SYN-ACK itself (whether classic or
   AccECN feedback has been negotiated) to include feedback about any CE
   on data packets.
   Nonetheless, it is safe to set ECT on data packets within the
   handshake because any CE-marking on these data segments can be fed
   back by the responder on the first data segment it sends after the
   SYN-ACK (or on an additional Pure ACK if it has no more data to
   send).

   Note that the prohibition in Section 3.2.1.1 against setting ECT on
   the SYN if the same SYN is not requesting AccECN feedback still
   applies.

   Strictly, even a non-TFO TCP initiator can send up to an initial
   window of data segments straight after the SYN.  However, this is
   rare because a non-TFO TCP server will not deliver them to the
   application until the 3WHS completes.  Therefore the question of ECT
   on data segments within the handshake only becomes important with
   TFO.  A TFO initiator's first ever connection with a server never
   uses a fast open, so the initiator always has a chance to cache
   whether a server supports ECN before it uses a fast open.

4.3.  IW10

   IW10 is an experiment to determine whether it is safe for TCP to use
   an initial window of 10 SMSS [RFC6928].

   This subsection does not recommend any additions to the present
   specification in order to interwork with IW10.  The specifications as
   they stand are safe, and there is only a corner-case where
   performance could be occasionally improved, as explained below.

   As specified in Section 3.2.1.1, a TCP initiator can only set ECT on
   the SYN if it requests AccECN support.  If, however, the SYN-ACK
   tells the initiator that the responder does not support AccECN,
   Section 3.2.1.1 advises the initiator to conservatively reduce its
   initial window to 1 SMSS because, if the SYN was CE-marked, the
   SYN-ACK has no way to feed that back.

   If the initiator implements IW10, it seems rather over-conservative
   to reduce IW to 1 in this scenario.
   Nonetheless, it will rarely hit performance if we leave the advice at
   1 SMSS, because:

   o  as long as the initiator is caching failures to negotiate AccECN,
      subsequent attempts to access the same server will not use ECT on
      the SYN anyway, so there will no longer be any need to
      conservatively reduce IW;

   o  currently it is not common for a TCP initiator (client) to have
      more than one segment to send {ToDo: evidence/reference?} - IW10
      is primarily exploited by TCP servers.

5.  Discussion of the arguments in RFC 3168

   This section is informative, not normative.  It presents
   counter-arguments against the justifications in the RFC series for
   disabling ECN marking on each type of packet.  First it addresses
   over-arching arguments used for most packet types, then it addresses
   the specific arguments for each packet type in turn.

5.1.  The reliability argument

   Section 5.2 of RFC 3168 states:

      "To ensure the reliable delivery of the congestion indication of
      the CE codepoint, an ECT codepoint MUST NOT be set in a packet
      unless the loss of that packet [at a subsequent node] in the
      network would be detected by the end nodes and interpreted as an
      indication of congestion."

   We believe this argument is overly conservative.  The principle to
   determine whether a packet is ECN-capable ought to be "do no extra
   harm", meaning that the reliability of a congestion signal's delivery
   ought to be no worse with ECN than without.  In particular, setting
   the CE codepoint on the very same packet fulfills this criterion,
   since either the packet is delivered and the CE signal is delivered
   to the endpoint, or the packet is dropped and the original congestion
   signal (packet loss) is delivered to the endpoint.

   TCP does not deliver control packets reliably.
   So it is more important to allow control packets to be ECN-capable,
   which greatly improves reliable delivery of the control packets
   themselves.  This outweighs by far the concern that a CE marking
   applied to a control packet by one node might subsequently be dropped
   by another node, particularly given that, without ECN, the transport
   does not attempt to detect the drop of most control packets anyway.

5.2.  SYNs

   RFC 5562 presents two arguments against ECT marking of SYN packets
   (quoted verbatim):

      "First, when the TCP SYN packet is sent, there are no guarantees
      that the other TCP endpoint (node B in Figure 2) is ECN-Capable,
      or that it would be able to understand and react if the ECN CE
      codepoint was set by a congested router.

      Second, the ECN-Capable codepoint in TCP SYN packets could be
      misused by malicious clients to "improve" the well-known TCP SYN
      attack.  By setting an ECN-Capable codepoint in TCP SYN packets, a
      malicious host might be able to inject a large number of TCP SYN
      packets through a potentially congested ECN-enabled router,
      congesting it even further."

   The first point actually describes two subtly different issues.  So
   below three arguments are countered in turn.

5.2.1.  Argument 1a: Loss of congestion notification on the SYN

   This argument certainly applied at the time RFC 5562 was written,
   when no ECN responder mechanism had any logic to recognize or feed
   back a CE marking on a SYN.  The problem was that, during the 3WHS,
   the flag in the TCP header for ECN feedback (called Echo Congestion
   Experienced) had been overloaded to negotiate the use of ECN itself.
   So there was no space for feedback in a SYN-ACK.

   The accurate ECN (AccECN) protocol [I-D.ietf-tcpm-accurate-ecn] has
   since been designed to solve this problem, using a two-pronged
   approach.
   First, AccECN uses the 3 ECN bits in the TCP header as 8 codepoints,
   so there is space for the responder to feed back whether there was CE
   on the SYN.  Second, a TCP initiator can always request AccECN
   support on every SYN, and any responder reveals its level of ECN
   support: AccECN, classic ECN, or no ECN.  Therefore, if a responder
   does indicate that it supports AccECN, the initiator can be sure
   that, if there is no CE feedback on the SYN-ACK, then there really
   was no CE on the SYN.

   An initiator can combine AccECN with three possible strategies for
   setting ECT on a SYN:

   (S1):  Pessimistic ECT with positive cache: The initiator always
          requests AccECN in the SYN, but without setting ECT.  Then it
          records those servers that confirm that they support AccECN in
          a cache.  On a subsequent connection to any server that
          supports AccECN, the initiator can then set ECT on the SYN.

   (S2):  Optimistic ECT: The initiator always sets ECT optimistically
          on the initial SYN and it always requests AccECN support.
          Then, if the server response shows it has no AccECN logic (so
          it cannot feed back a CE mark), the initiator conservatively
          behaves as if the SYN was CE-marked, by reducing its initial
          window.

          A.  With no cache: The optimistic ECT strategy ought to work
              pretty well without caching any responses.

          B.  With negative cache: The optimistic ECT strategy can be
              improved by recording solely those servers that do not
              support AccECN.  On subsequent connections to these
              non-AccECN servers, the initiator will still request
              AccECN but not set ECT on the SYN.  Then, the initiator
              can use its full initial window (if it has enough request
              data to need it).  Longer term, as servers upgrade to
              AccECN, the initiator will remove them from the cache and
              use ECT on subsequent SYNs to those servers.
   (S3):  ECT by configuration: In a controlled environment, the
          administrator can make sure that servers support ECN-capable
          SYN packets.  Examples of controlled environments are
          single-tenant DCs, and possibly multi-tenant DCs if we assume
          that each tenant mostly communicates with its own VMs.

   For unmanaged environments like the public Internet, the choice is
   between strategies (S1) and (S2B):

   o  The "pessimistic ECT with positive cache" strategy (S1) suffers
      from exposing the initial SYN to the prevailing loss level, even
      if the server supports ECT on SYNs, but only on the first
      connection to each AccECN server.

   o  The "optimistic ECT with negative cache" strategy (S2B) exploits a
      server's support for ECT on SYNs from the very first attempt.  But
      if the server turns out not to support AccECN, the initiator has
      to conservatively limit its initial window - usually
      unnecessarily.  Nonetheless, initiator request data (as opposed to
      server response data) is rarely larger than 1 SMSS anyway (see
      Section 4.3).

   The normative specification for ECT on a SYN in Section 3.2.1 uses
   the "optimistic ECT with negative cache" strategy on the assumption
   that an initial window of 1 SMSS is usually sufficient for client
   requests anyway.  For clients that often initially send more than 1
   SMSS of data, strategy (S1) could be used during initial deployment
   and strategy (S2B) later (when the probability of servers supporting
   AccECN and the likelihood of seeing some CE marking is higher).
   Also, as deployment proceeds a positive cache (S1) starts off small
   then grows, while a negative cache (S2B) becomes large at first, then
   shrinks.

5.2.2.  Argument 1b: Unknown Handling of Unexpected ECN

   Given that ECT-marked SYN packets have previously been prohibited, it
   cannot be assumed they will be accepted.
   According to a study using 2014 data [ecn-pam] from a limited range
   of vantage points, out of the top 1M Alexa web sites, 4791 (0.82%)
   IPv4 sites and 104 (0.61%) IPv6 sites failed to establish a
   connection when they received a TCP SYN with any ECN codepoint set in
   the IP header and the appropriate ECN flags in the TCP header.  Of
   these, about 41% failed to establish a connection due to the ECN
   flags in the TCP header even with a Not-ECT ECN field in the IP
   header (i.e. despite full compliance with RFC 3168).  Therefore
   adding the ECN-capability to SYNs was increasing connection
   establishment failures by about 0.4%.

   We will need to investigate which of numerous possible causes is
   leading to these failures.  RFC 3168 says "a host MUST NOT set ECT on
   SYN [...] packets", but it does not say what the responder should do
   if an ECN-capable SYN arrives.  So perhaps some responder
   implementations are checking that the SYN complies with RFC 3168,
   then silently ignoring non-compliant SYNs (or perhaps returning a
   RST).  Also some middleboxes (e.g. firewalls) might be discarding
   non-compliant SYNs themselves.  For the future,
   [I-D.ietf-tsvwg-ecn-experimentation] clarifies that middleboxes
   "SHOULD NOT" do this, but that does not alter the past.

   Whereas RSTs can be dealt with immediately, silent failures introduce
   a retransmission timeout delay (default 1 second) at the initiator
   before it attempts any fall back strategy.  Ironically, making SYNs
   ECN-capable is intended to avoid the timeout when a SYN is lost due
   to congestion.  Fortunately, where discard of ECN-capable SYNs is due
   to policy it will occur predictably, not randomly like congestion.
   So the initiator can avoid it by caching those sites that do not
   support ECN-capable SYNs.

   This further justifies the use of the "optimistic ECT with negative
   cache" strategy in Section 3.2.1.
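
   As a non-normative illustration, the "optimistic ECT with negative
   cache" strategy might be sketched in Python as follows.  The class
   and method names, the use of a server name string as cache key, the
   cache reason strings and the full initial window of 10 SMSS are all
   assumptions made for this sketch; none of them come from an existing
   implementation or from the normative text.  Caching a server after a
   single SYN timeout is a simplification, and the handling of CE
   feedback returned by an AccECN server is deliberately left out.

```python
from dataclasses import dataclass, field
from enum import Enum


class EcnSupport(Enum):
    """Level of ECN support a responder reveals in its SYN-ACK."""
    ACCECN = "AccECN"
    CLASSIC = "classic ECN"
    NONE = "no ECN"


@dataclass(frozen=True)
class SynPlan:
    """What the initiator puts on the SYN (hypothetical helper type)."""
    request_accecn: bool
    set_ect: bool


@dataclass
class OptimisticInitiator:
    full_iw: int = 10  # assumed full initial window, in SMSS
    # Negative cache: server -> reason it must not be sent an ECT SYN.
    negative_cache: dict = field(default_factory=dict)

    def plan_syn(self, server: str) -> SynPlan:
        # AccECN is always requested; ECT is set unless the server is cached.
        return SynPlan(request_accecn=True,
                       set_ect=server not in self.negative_cache)

    def on_syn_ack(self, server: str, plan: SynPlan,
                   support: EcnSupport) -> int:
        """Return the initial window (in SMSS) to use after the SYN-ACK."""
        if support is EcnSupport.ACCECN:
            # Server has upgraded: forget it so later SYNs can carry ECT.
            if self.negative_cache.get(server) == "no-accecn":
                del self.negative_cache[server]
            return self.full_iw
        self.negative_cache[server] = "no-accecn"
        # A CE mark on an ECT SYN could not have been fed back, so behave
        # as if the SYN was CE-marked.
        return 1 if plan.set_ect else self.full_iw

    def on_syn_timeout(self, server: str, plan: SynPlan) -> None:
        # Possible policy discard of ECT SYNs: send no ECT SYN next time.
        if plan.set_ect:
            self.negative_cache[server] = "ect-syn-lost"
```

   In this sketch only the first connection to a non-AccECN server pays
   the reduced initial window; later connections to it send a Not-ECT
   SYN and keep the full window.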
   It might seem tempting to first send an ECT SYN and then a non-ECT
   SYN (possibly with a small delay between them) and only accept the
   non-ECT connection if it returned first.  However, even a cache of a
   dozen or so sites ought to avoid all ECN-related performance problems
   with roughly the Alexa top thousand.  So it is questionable whether
   the level of failure of ECT on SYNs warrants always sending two SYNs,
   particularly given failures at well-maintained sites could reduce if
   ECT SYNs are standardized.

5.2.3.  Argument 2: DoS attacks.

   [RFC5562] says that ECT SYN packets could be misused by malicious
   clients to augment "the well-known TCP SYN attack".  It goes on to
   say "a malicious host might be able to inject a large number of TCP
   SYN packets through a potentially congested ECN-enabled router,
   congesting it even further."

   We assume this is a reference to the TCP SYN flood attack (see
   https://en.wikipedia.org/wiki/SYN_flood), which is an attack against
   a responder end point.  We assume the idea of this attack is to use
   ECT to get more packets through an ECN-enabled router in preference
   to other non-ECN traffic so that they can go on to use the SYN
   flooding attack to inflict more damage on the responder end point.
   This argument could apply to flooding with any type of packet, but we
   assume SYNs are singled out because their source address is easier to
   spoof, whereas floods of other types of packets are easier to block.

   Mandating Not-ECT in an RFC does not stop attackers using ECT for
   flooding.  Nonetheless, if a standard says SYNs are not meant to be
   ECT it would make it legitimate for firewalls to discard them.
   However this would negate the considerable benefit of ECT SYNs for
   compliant transports and seems unnecessary because RFC 3168 already
   provides the means to address this concern.  In section 7, RFC 3168
   says "During periods where ... the potential packet marking rate
   would be high, our recommendation is that routers drop packets rather
   then set the CE codepoint..." and this advice is repeated in
   [RFC7567] (section 4.2.1).  This makes it harder for flooding packets
   to gain from ECT.

   Further experiments are needed to test how much malicious hosts can
   use ECT to augment flooding attacks without triggering AQMs to turn
   off ECN support (flying "just under the radar").  If it is found that
   ECT can only slightly augment flooding attacks, the risk of such
   attacks will need to be weighed against the performance benefits of
   ECT SYNs.

5.3.  Pure ACKs.

   RFC 3168 gives the following arguments for not allowing the ECT
   marking of pure ACKs (ACKs not piggy-backed on data).  In section 5.2
   it reads:

      "To ensure the reliable delivery of the congestion indication of
      the CE codepoint, an ECT codepoint MUST NOT be set in a packet
      unless the loss of that packet in the network would be detected by
      the end nodes and interpreted as an indication of congestion.

      Transport protocols such as TCP do not necessarily detect all
      packet drops, such as the drop of a "pure" ACK packet; for
      example, TCP does not reduce the arrival rate of subsequent ACK
      packets in response to an earlier dropped ACK packet.  Any
      proposal for extending ECN-Capability to such packets would have
      to address issues such as the case of an ACK packet that was
      marked with the CE codepoint but was later dropped in the network.
      We believe that this aspect is still the subject of research, so
      this document specifies that at this time, "pure" ACK packets MUST
      NOT indicate ECN-Capability."

   Later on, in section 6.1.4 it reads:

      "For the current generation of TCP congestion control algorithms,
      pure acknowledgement packets (e.g., packets that do not contain
      any accompanying data) MUST be sent with the not-ECT codepoint.
      Current TCP receivers have no mechanisms for reducing traffic on
      the ACK-path in response to congestion notification.  Mechanisms
      for responding to congestion on the ACK-path are areas for current
      and future research.  (One simple possibility would be for the
      sender to reduce its congestion window when it receives a pure ACK
      packet with the CE codepoint set).  For current TCP
      implementations, a single dropped ACK generally has only a very
      small effect on the TCP's sending rate."

   We next address each of the arguments presented above.

   The first argument is a specific instance of the reliability argument
   for the case of pure ACKs.  This has already been addressed by
   countering the general reliability argument in Section 5.1.

   The second argument mentions that a sender does not reduce the load
   of a stream of pure ACKs even if they are contributing to congestion.
   Again, given that current TCP does not respond to pure ACK loss,
   setting ECT on pure ACKs to allow them to carry congestion marks
   would be no worse than not doing so (and not doing so would be
   detrimental from a performance perspective).

   The proposed AccECN modification to TCP feedback
   [I-D.ietf-tcpm-accurate-ecn] involves a data receiver repeatedly
   sending a count of received congestion marks.  So AccECN could
   include marks on pure ACKs in this count, even though it does not ACK
   pure ACKs themselves.  Then the sender of the pure ACKs will reduce
   its congestion window, which will (correctly) reduce the rate at
   which it sends any subsequent data.  Nonetheless, even if the
   original sender of the pure ACK does not respond to this feedback, or
   if it is decided that AccECN will not provide this information, it
   will still make sense to set ECT on pure ACKs, because the congestion
   situation will be no worse than it is today with non-ECT pure ACKs.
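
   The idea of counting CE marks on pure ACKs can be illustrated with
   the following minimal Python sketch.  It is a simplification under
   stated assumptions, not the AccECN wire format: the names CeCounter
   and newly_marked are hypothetical, the counter is assumed to wrap
   modulo 8 (a 3-bit field), and the counter's initial value, the
   handshake encoding and the handling of lost ACKs are all ignored.

```python
ACE_MODULUS = 8  # assumption: the fed-back counter is 3 bits wide


class CeCounter:
    """Receiver-side count of CE-marked packets (hypothetical helper)."""

    def __init__(self) -> None:
        self.ce_packets = 0

    def on_packet(self, ce_marked: bool, payload_len: int) -> None:
        # payload_len is deliberately not consulted: a CE-marked pure ACK
        # (payload_len == 0) is counted just like a CE-marked data segment.
        if ce_marked:
            self.ce_packets += 1

    def feedback(self) -> int:
        # Value repeated on every packet sent back to the peer.
        return self.ce_packets % ACE_MODULUS


def newly_marked(prev_feedback: int, new_feedback: int) -> int:
    """CE marks reported since the previous feedback, allowing for wrap."""
    return (new_feedback - prev_feedback) % ACE_MODULUS
```

   A sender of pure ACKs that sees newly_marked() return a non-zero
   value could then reduce its congestion window, which affects only
   the data it sends subsequently.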
   In summary, allowing ECT (and CE) to be set on pure ACKs is no worse
   than not doing so (and dropping the pure ACK).  In contrast, not
   setting ECT on pure ACKs is certainly detrimental to performance
   because when a pure ACK is lost it can prevent the release of new
   data.

5.4.  Window probes

   RFC 3168 presents only the reliability argument against setting the
   ECT codepoint in Window Probe packets.  Specifically, Section 6.1.6
   states:

      "If a window probe packet is dropped in the network, this loss is
      not detected by the receiver.  Therefore, the TCP data sender MUST
      NOT set either an ECT codepoint or the CWR bit on window probe
      packets.

      However, because window probes use exact sequence numbers, they
      cannot be easily spoofed in denial-of-service attacks.  Therefore,
      if a window probe arrives with the CE codepoint set, then the
      receiver SHOULD respond to the ECN indications."

   The reliability argument has already been addressed in Section 5.1.

   Allowing ECT on window probes could considerably improve performance
   because, if a window probe is lost in conditions when the Silly
   Window Syndrome applies, the sender will stall until the next window
   probe reaches the receiver (at least 2 minutes later).

   On the bright side, RFC 3168 at least specifies the receiver
   behaviour if a CE-marked window probe arrives, so changing the
   behaviour ought to be less painful than for other packet types.

5.5.  Retransmitted packets.

   RFC 3168 says the sender "MUST NOT" set ECT on retransmitted packets.
   The rationale for this consumes nearly 2 pages of RFC 3168, so the
   reader is referred to section 6.1.5 of RFC 3168, rather than quoting
   it all here.  There are essentially three arguments, namely
   reliability, DoS attacks and over-reaction to congestion.  We address
   them in order below.
   The reliability argument has already been addressed in Section 5.1.

   Protection against DoS attacks is not afforded by prohibiting ECT on
   retransmitted packets.  An attacker can set CE on spoofed
   retransmissions whether or not it is prohibited by an RFC.
   Protection against the DoS attack described in RFC 3168 is solely
   afforded by the requirement that "the TCP data receiver SHOULD ignore
   the CE codepoint on out-of-window packets".  Therefore we propose to
   allow ECT marking of retransmitted packets, in order to reduce the
   chance of them being dropped.

   Nonetheless, it is important to keep the RFC 3168 advice to ignore
   the CE codepoint in out-of-window packets.  This means that, for
   those retransmitted packets that arrive at the receiver after the
   original packet has been properly received, any CE marking will be
   ignored.  There is no problem with that because the delivery of the
   original packet implies that the sender's original congestion
   response (when it deemed the packet lost and retransmitted it) was
   unnecessary.  The data receiver is also advised to use the more
   stringent input check for incoming segments in section 5.2 of
   [RFC5961].

   Finally, the third argument is about over-reacting to congestion.
   The argument goes that, if a retransmitted packet is dropped, the
   sender will not detect it, so it will not react again to congestion
   (it would have reduced its congestion window already when it
   retransmitted the packet).  Whereas, if retransmitted packets can be
   CE tagged instead of dropped, senders could potentially react more
   than once to congestion.  However, we argue that it is legitimate to
   respond again to congestion if it still persists in subsequent round
   trip(s).

   Therefore, in all three cases, it is not incorrect to set ECT on
   retransmissions.

6.  Security considerations

   Section 3.2.6 considers the question of whether ECT on RSTs will
   allow RST attacks to be intensified.  There are several security
   arguments presented in RFC 3168 for preventing the ECN marking of TCP
   control packets and retransmitted segments.  We believe all of them
   have been properly addressed in Section 5, particularly Section 5.2.3
   and Section 5.5 on DoS attacks using spoofed ECT-marked SYNs and
   spoofed CE-marked retransmissions.

7.  IANA Considerations

   There are no IANA considerations in this memo.

8.  Acknowledgments

   Thanks to Mirja Kuehlewind and David Black for their useful reviews.

9.  References

9.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119,
              DOI 10.17487/RFC2119, March 1997.

   [RFC3168]  Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
              of Explicit Congestion Notification (ECN) to IP",
              RFC 3168, DOI 10.17487/RFC3168, September 2001.

   [RFC5562]  Kuzmanovic, A., Mondal, A., Floyd, S., and K.
              Ramakrishnan, "Adding Explicit Congestion Notification
              (ECN) Capability to TCP's SYN/ACK Packets", RFC 5562,
              DOI 10.17487/RFC5562, June 2009.

   [I-D.ietf-tcpm-accurate-ecn]
              Briscoe, B., Kuehlewind, M., and R. Scheffenegger, "More
              Accurate ECN Feedback in TCP",
              draft-ietf-tcpm-accurate-ecn-02 (work in progress),
              October 2016.

   [I-D.ietf-tsvwg-ecn-experimentation]
              Black, D., "Explicit Congestion Notification (ECN)
              Experimentation", draft-ietf-tsvwg-ecn-experimentation-01
              (work in progress), March 2017.

9.2.  Informative References

   [RFC0793]  Postel, J., "Transmission Control Protocol", STD 7,
              RFC 793, DOI 10.17487/RFC0793, September 1981.

   [RFC1122]  Braden, R., Ed., "Requirements for Internet Hosts -
              Communication Layers", STD 3, RFC 1122,
              DOI 10.17487/RFC1122, October 1989.

   [RFC3540]  Spring, N., Wetherall, D., and D. Ely, "Robust Explicit
              Congestion Notification (ECN) Signaling with Nonces",
              RFC 3540, DOI 10.17487/RFC3540, June 2003.

   [RFC4960]  Stewart, R., Ed., "Stream Control Transmission Protocol",
              RFC 4960, DOI 10.17487/RFC4960, September 2007.

   [RFC4987]  Eddy, W., "TCP SYN Flooding Attacks and Common
              Mitigations", RFC 4987, DOI 10.17487/RFC4987, August 2007.

   [RFC5681]  Allman, M., Paxson, V., and E. Blanton, "TCP Congestion
              Control", RFC 5681, DOI 10.17487/RFC5681, September 2009.

   [RFC5961]  Ramaiah, A., Stewart, R., and M. Dalal, "Improving TCP's
              Robustness to Blind In-Window Attacks", RFC 5961,
              DOI 10.17487/RFC5961, August 2010.

   [RFC5690]  Floyd, S., Arcia, A., Ros, D., and J. Iyengar, "Adding
              Acknowledgement Congestion Control to TCP", RFC 5690,
              DOI 10.17487/RFC5690, February 2010.

   [RFC6298]  Paxson, V., Allman, M., Chu, J., and M. Sargent,
              "Computing TCP's Retransmission Timer", RFC 6298,
              DOI 10.17487/RFC6298, June 2011.

   [RFC6928]  Chu, J., Dukkipati, N., Cheng, Y., and M. Mathis,
              "Increasing TCP's Initial Window", RFC 6928,
              DOI 10.17487/RFC6928, April 2013.

   [RFC7413]  Cheng, Y., Chu, J., Radhakrishnan, S., and A. Jain, "TCP
              Fast Open", RFC 7413, DOI 10.17487/RFC7413, December 2014.

   [RFC7567]  Baker, F., Ed. and G. Fairhurst, Ed., "IETF
              Recommendations Regarding Active Queue Management",
              BCP 197, RFC 7567, DOI 10.17487/RFC7567, July 2015.

   [I-D.briscoe-tsvwg-ecn-l4s-id]
              Schepper, K., Briscoe, B., and I. Tsang, "Identifying
              Modified Explicit Congestion Notification (ECN) Semantics
              for Ultra-Low Queuing Delay",
              draft-briscoe-tsvwg-ecn-l4s-id-02 (work in progress),
              October 2016.

   [I-D.briscoe-tsvwg-l4s-arch]
              Briscoe, B., Schepper, K., and M. Bagnulo, "Low Latency,
              Low Loss, Scalable Throughput (L4S) Internet Service:
              Architecture", draft-briscoe-tsvwg-l4s-arch-02 (work in
              progress), March 2017.

   [I-D.stewart-tsvwg-sctpecn]
              Stewart, R., Tuexen, M., and X. Dong, "ECN for Stream
              Control Transmission Protocol (SCTP)",
              draft-stewart-tsvwg-sctpecn-05 (work in progress),
              January 2014.

   [judd-nsdi]
              Judd, G., "Attaining the promise and avoiding the pitfalls
              of TCP in the Datacenter", NSDI 2015, 2015.

   [ecn-pam]  Trammell, B., Kuehlewind, M., Boppart, D., Learmonth, I.,
              Fairhurst, G., and R. Scheffenegger, "Enabling
              Internet-Wide Deployment of Explicit Congestion
              Notification", Int'l Conf. on Passive and Active Network
              Measurement (PAM'15) pp193-205, 2015.

Authors' Addresses

   Marcelo Bagnulo
   Universidad Carlos III de Madrid
   Av. Universidad 30
   Leganes, Madrid  28911
   SPAIN

   Phone: 34 91 6249500
   Email: marcelo@it.uc3m.es
   URI:   http://www.it.uc3m.es

   Bob Briscoe
   Simula Research Lab

   Email: ietf@bobbriscoe.net
   URI:   http://bobbriscoe.net/