Network Working Group                                         M. Bagnulo
Internet-Draft                                                      UC3M
Obsoletes: 5562 (if approved)                                 B. Briscoe
Intended status: Experimental                                  CableLabs
Expires: April 1, 2018                                September 28, 2017

  ECN++: Adding Explicit Congestion Notification (ECN) to TCP Control
                                Packets
                   draft-ietf-tcpm-generalized-ecn-01

Abstract

   This document describes an experimental modification to ECN when used
   with TCP. It allows the use of ECN on the following TCP packets:
   SYNs, pure ACKs, Window probes, FINs, RSTs and retransmissions.

Status of This Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF). Note that other groups may also distribute
   working documents as Internet-Drafts. The list of current
   Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on April 1, 2018.

Copyright Notice

   Copyright (c) 2017 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (https://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document. Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1. Introduction
     1.1. Motivation
     1.2. Experiment Goals
     1.3. Document Structure
   2. Terminology
   3. Specification
     3.1. Network (e.g. Firewall) Behaviour
     3.2. Endpoint Behaviour
       3.2.1. SYN
       3.2.2. SYN-ACK
       3.2.3. Pure ACK
       3.2.4. Window Probe
       3.2.5. FIN
       3.2.6. RST
       3.2.7. Retransmissions
       3.2.8. General Fall-back for any Control Packet or
              Retransmission
   4. Rationale
     4.1. The Reliability Argument
     4.2. SYNs
       4.2.1. Argument 1a: Unrecognized CE on the SYN
       4.2.2. Argument 1b: Unrecognized ECT on the SYN
       4.2.3. Argument 2: DoS Attacks
     4.3. SYN-ACKs
       4.3.1. Response to Congestion on a SYN-ACK
       4.3.2. Fall-Back if ECT SYN-ACK Fails
     4.4. Pure ACKs
       4.4.1. Cwnd Response to CE-Marked Pure ACKs
       4.4.2. ACK Rate Response to CE-Marked Pure ACKs
       4.4.3. Summary: Enabling ECN on Pure ACKs
     4.5. Window Probes
     4.6. FINs
     4.7. RSTs
     4.8. Retransmitted Packets
     4.9. General Fall-back for any Control Packet
   5. Interaction with popular variants or derivatives of TCP
     5.1. IW10
     5.2. TFO
     5.3. TCP Derivatives
   6. Security Considerations
   7. IANA Considerations
   8. Acknowledgments
   9. References
     9.1. Normative References
     9.2. Informative References
   Authors' Addresses

1. Introduction

   RFC 3168 [RFC3168] specifies support of Explicit Congestion
   Notification (ECN) in IP (v4 and v6). By using the ECN capability,
   network elements (e.g. routers, switches) performing Active Queue
   Management (AQM) can use ECN marks instead of packet drops to signal
   congestion to the endpoints of a communication. This results in
   lower packet loss and increased performance. RFC 3168 also specifies
   support for ECN in TCP, but solely on data packets. For various
   reasons it precludes the use of ECN on TCP control packets (TCP SYN,
   TCP SYN-ACK, pure ACKs, Window probes) and on retransmitted packets.
   RFC 3168 is silent about the use of ECN on RST and FIN packets. RFC
   5562 [RFC5562] is an experimental modification to ECN that enables
   ECN support for TCP SYN-ACK packets.

   This document defines an experimental modification to ECN [RFC3168]
   that shall be called ECN++. It enables ECN support on all the
   aforementioned types of TCP packet.

   ECN++ is a sender-side change. It works whether the two ends of the
   TCP connection use classic ECN feedback [RFC3168] or experimental
   Accurate ECN feedback (AccECN [I-D.ietf-tcpm-accurate-ecn]).
   Nonetheless, if the client does not implement AccECN, it cannot use
   ECN++ on the one packet that offers most benefit from it - the
   initial SYN. Therefore, implementers of ECN++ are RECOMMENDED to
   also implement AccECN.

   ECN++ is designed for compatibility with a number of latency
   improvements to TCP such as TCP Fast Open (TFO [RFC7413]), initial
   window of 10 SMSS (IW10 [RFC6928]) and Low latency Low Loss Scalable
   Transport (L4S [I-D.ietf-tsvwg-l4s-arch]), but they can all be
   implemented and deployed independently.

   [I-D.ietf-tsvwg-ecn-experimentation] is a standards track procedural
   device that relaxes requirements in RFC 3168 and other standards
   track RFCs that would otherwise preclude the experimental
   modifications needed for ECN++ and other ECN experiments.

1.1. Motivation

   The absence of ECN support on TCP control packets and retransmissions
   has a potentially harmful effect. In any ECN deployment,
   non-ECN-capable packets suffer a penalty when they traverse a
   congested bottleneck. For instance, with a drop probability of 1%,
   1% of connection attempts suffer a timeout of about 1 second before
   the SYN is retransmitted, which is highly detrimental to the
   performance of short flows. TCP control packets, particularly TCP
   SYNs and SYN-ACKs, are important for performance, so dropping them
   is best avoided.

   Non-ECN control packets particularly harm performance in environments
   where the ECN marking level is high. For example, [judd-nsdi] shows
   that in a controlled private data centre (DC) environment where ECN
   is used (in conjunction with DCTCP [I-D.ietf-tcpm-dctcp]), the
   probability of being able to establish a new connection using a
   non-ECN SYN packet drops to close to zero even when there are only 16
   ongoing TCP flows transmitting at full speed. The issue is that
   DCTCP exhibits a much more aggressive response to packet marking
   (which is why it is only applicable in controlled environments).
   This leads to a high marking probability for ECN-capable packets, and
   in turn a high drop probability for non-ECN packets. Therefore
   non-ECN SYNs are dropped aggressively, rendering it nearly impossible
   to establish a new connection in the presence of even mild traffic
   load.

   Finally, there are ongoing experimental efforts to promote the
   adoption of a slightly modified variant of DCTCP (and similar
   congestion controls) over the Internet to achieve low latency, low
   loss and scalable throughput (L4S) for all communications
   [I-D.ietf-tsvwg-l4s-arch]. In such an approach, L4S packets identify
   themselves using an ECN codepoint [I-D.ietf-tsvwg-ecn-l4s-id]. With
   L4S and potentially other similar cases, preventing TCP control
   packets from obtaining the benefits of ECN would not only expose them
   to the prevailing level of congestion loss, but it would also
   classify control packets into a different queue with different
   network treatment, which may also lead to reordering, further
   degrading TCP performance.

1.2. Experiment Goals

   The goal of the experimental modifications defined in this document
   is to allow the use of ECN on all TCP packets. Experiments are
   expected in the public Internet as well as in controlled environments
   to understand the following issues:

   o How SYNs, Window probes, pure ACKs, FINs, RSTs and retransmissions
     that carry the ECT(0), ECT(1) or CE codepoints are processed by
     the TCP endpoints and the network (including routers, firewalls
     and other middleboxes). In particular we would like to learn if
     these packets are frequently blocked or if these packets are
     usually forwarded and processed.

   o The scale of deployment of the different flavours of ECN,
     including [RFC3168], [RFC5562], [RFC3540] and
     [I-D.ietf-tcpm-accurate-ecn].

   o How much the performance of TCP communications is improved by
     allowing ECN marking of each packet type.

   o To identify any issues (including security issues) raised by
     enabling ECN marking of these packets.

   The data gathered through the experiments described in this document,
   particularly under the first 2 bullets above, will help in the design
   of the final mechanism (if any) for adding ECN support to the
   different packet types considered in this document. Whenever data
   input is needed to assist in a design choice, it is spelled out
   throughout the document.

   Success criteria: The experiment will be a success if we obtain
   enough data to have a clearer view of the deployability and benefits
   of enabling ECN on all TCP packets, as well as any issues. If the
   results of the experiment show that it is feasible to deploy such
   changes; that there are gains to be achieved through the changes
   described in this specification; and that no other major issues may
   interfere with the deployment of the proposed changes; then it would
   be reasonable to adopt the proposed changes in a standards track
   specification that would update RFC 3168.

1.3. Document Structure

   The remainder of this document is structured as follows. In
   Section 2, we present the terminology used in the rest of the
   document. In Section 3, we specify the modifications to provide ECN
   support to TCP SYNs, pure ACKs, Window probes, FINs, RSTs and
   retransmissions. We describe both the network behaviour and the
   endpoint behaviour. Section 5 discusses variations of the
   specification that will be necessary to interwork with a number of
   popular variants or derivatives of TCP. RFC 3168 provides a number
   of specific reasons why ECN support is not appropriate for each
   packet type. In Section 4, we revisit each of these arguments for
   each packet type to justify why it is reasonable to conduct this
   experiment.

2. Terminology

   The keywords MUST, MUST NOT, REQUIRED, SHALL, SHALL NOT, SHOULD,
   SHOULD NOT, RECOMMENDED, MAY, and OPTIONAL, when they appear in this
   document, are to be interpreted as described in [RFC2119].

   Pure ACK: A TCP segment with the ACK flag set and no data payload.

   SYN: A TCP segment with the SYN (synchronize) flag set.

   Window probe: Defined in [RFC0793], a window probe is a TCP segment
   with only one byte of data sent to learn if the receive window is
   still zero.

   FIN: A TCP segment with the FIN (finish) flag set.

   RST: A TCP segment with the RST (reset) flag set.

   Retransmission: A TCP segment that has been retransmitted by the TCP
   sender.

   ECT: ECN-Capable Transport. One of the two codepoints ECT(0) or
   ECT(1) in the ECN field [RFC3168] of the IP header (v4 or v6). An
   ECN-capable sender sets one of these to indicate that both transport
   end-points support ECN. When this specification says the sender sets
   an ECT codepoint, by default it means ECT(0). Optionally, it could
   mean ECT(1), which is in the process of being redefined for use by
   L4S experiments [I-D.ietf-tsvwg-ecn-experimentation]
   [I-D.ietf-tsvwg-ecn-l4s-id].

   Not-ECT: The ECN codepoint set by senders that indicates that the
   transport is not ECN-capable.

   CE: Congestion Experienced. The ECN codepoint that an intermediate
   node sets to indicate congestion [RFC3168]. A node sets an
   increasing proportion of ECT packets to CE as the level of congestion
   increases.

3. Specification

3.1. Network (e.g. Firewall) Behaviour

   Previously the specification of ECN for TCP [RFC3168] required the
   sender to set not-ECT on TCP control packets and retransmissions.
   Some readers of RFC 3168 might have erroneously interpreted this as a
   requirement for firewalls, intrusion detection systems, etc. to check
   and enforce this behaviour.
   Section 4.3 of [I-D.ietf-tsvwg-ecn-experimentation] updates RFC 3168
   to remove this ambiguity. It requires firewalls or any intermediate
   nodes not to treat certain types of ECN-capable TCP segment
   differently (except potentially in one attack scenario). This is
   likely to only involve a firewall rule change in a fraction of cases
   (at most 0.4% of paths according to the tests reported in
   Section 4.2.2).

   In case a TCP sender encounters a middlebox blocking ECT on certain
   TCP segments, the specification below includes behaviour to fall back
   to non-ECN. However, this loses the benefit of ECN on control
   packets. So operators are RECOMMENDED to alter their firewall rules
   to comply with the requirement referred to above (section 4.3 of
   [I-D.ietf-tsvwg-ecn-experimentation]).

3.2. Endpoint Behaviour

   The changes to the specification of TCP over ECN [RFC3168] defined
   here solely alter the behaviour of the sending host for each
   half-connection. All changes can be deployed at each end-point
   independently of others and independent of any network behaviour.

   The feedback behaviour at the receiver depends on whether classic ECN
   TCP feedback [RFC3168] or Accurate ECN (AccECN) TCP feedback
   [I-D.ietf-tcpm-accurate-ecn] has been negotiated. Nonetheless,
   neither receiver feedback behaviour is altered by the present
   specification.

   For each type of control packet or retransmission, the following
   sections detail changes to the sender's behaviour in two respects: i)
   whether it sets ECT; and ii) its response to congestion feedback.
   Table 1 summarises these two behaviours for each type of packet, but
   the relevant subsection below should be referred to for the detailed
   behaviour.
   The subsection on the SYN is more complex than the others, because it
   has to include fall-back behaviour if the ECT packet appears not to
   have got through, and caching of the outcome to detect persistent
   failures.

   +---------+-----------------+------------------+--------------------+
   | TCP     | ECN field if    | ECN field if     | Congestion         |
   | packet  | AccECN f/b      | RFC3168 f/b      | Response           |
   | type    | negotiated*     | negotiated*      |                    |
   +---------+-----------------+------------------+--------------------+
   | SYN     | ECT             | not-ECT          | Reduce IW          |
   |         |                 |                  |                    |
   | SYN-ACK | ECT             | ECT              | Reduce IW          |
   |         |                 |                  |                    |
   | Pure    | ECT             | ECT              | Usual cwnd         |
   | ACK     |                 |                  | response and       |
   |         |                 |                  | optionally         |
   |         |                 |                  | [RFC5690]          |
   |         |                 |                  |                    |
   | W Probe | ECT             | ECT              | Usual cwnd         |
   |         |                 |                  | response           |
   |         |                 |                  |                    |
   | FIN     | ECT             | ECT              | None or optionally |
   |         |                 |                  | [RFC5690]          |
   |         |                 |                  |                    |
   | RST     | ECT             | ECT              | N/A                |
   |         |                 |                  |                    |
   | Re-XMT  | ECT             | ECT              | Usual cwnd         |
   |         |                 |                  | response           |
   +---------+-----------------+------------------+--------------------+

   Window probe and retransmission are abbreviated to W Probe and Re-XMT.
   * For a SYN, "negotiated" means "requested".

    Table 1: Summary of sender behaviour. In each case the relevant
     section below should be referred to for the detailed behaviour

   It can be seen that the sender can set ECT in all cases, except if it
   is not requesting AccECN feedback on the SYN. Therefore it is
   RECOMMENDED that the experimental AccECN specification
   [I-D.ietf-tcpm-accurate-ecn] is implemented (as well as the present
   specification), because it is expected that ECT on the SYN will give
   the most significant performance gain, particularly for short flows.
   Nonetheless, this specification also caters for the case where AccECN
   feedback is not implemented.

3.2.1. SYN

3.2.1.1. Setting ECT on the SYN

   With classic [RFC3168] ECN feedback, the SYN was never expected to be
   ECN-capable, so the flag provided to feed back congestion was put to
   another use (it is used in combination with other flags to indicate
   that the responder supports ECN). In contrast, Accurate ECN (AccECN)
   feedback [I-D.ietf-tcpm-accurate-ecn] provides two codepoints in the
   SYN-ACK for the responder to feed back whether or not the SYN arrived
   marked CE.

   Therefore, a TCP initiator MUST NOT set ECT on a SYN unless it also
   attempts to negotiate Accurate ECN feedback in the same SYN.

   For the experiments proposed here, if the SYN is requesting AccECN
   feedback, the TCP sender will also set ECT on the SYN. It can ignore
   the prohibition in section 6.1.1 of RFC 3168 against setting ECT on
   such a SYN.

   The following subsections about the SYN solely apply to this case
   where the initiator sent an ECT SYN.

3.2.1.2. Caching Lack of AccECN Support for ECT on SYNs

   Until AccECN servers become widely deployed, a TCP initiator that
   sets ECT on a SYN (which implies the same SYN also requests AccECN,
   as required above) SHOULD also maintain a cache entry per server to
   record that the server does not support AccECN and therefore has no
   logic for congestion markings on the SYN. Mobile hosts MAY maintain
   a cache entry per access network to record lack of AccECN support by
   proxies (see Section 4.2.1).

   The initiator will record any server's SYN-ACK response that does not
   support AccECN. Subsequently the initiator will not set ECT on a SYN
   to such a server, but it can still always request AccECN support
   (because the response will state any earlier stage of ECN evolution
   that the server supports with no performance penalty).
   The initiator will discover a server that has upgraded to support
   AccECN as soon as it next connects, then it can remove the server
   from its cache and subsequently always set ECT for that server.

   If the initiator times out without seeing a SYN-ACK, it will also
   cache this fact (see fall-back in Section 3.2.1.4 for details).

   There is no need to cache successful attempts, because the default
   ECT SYN behaviour performs optimally on success anyway. Servers that
   do not support ECN as a whole probably do not need to be recorded
   separately from non-support of AccECN because the response to a
   request for AccECN immediately states which stage in the evolution of
   ECN the server supports (AccECN [I-D.ietf-tcpm-accurate-ecn], classic
   ECN [RFC3168] or no ECN).

   The above strategy is named "optimistic ECT and cache failures". It
   is believed to be sufficient based on initial measurements and
   assumptions detailed in Section 4.2.1, which also gives alternative
   strategies in case larger scale measurements uncover different
   scenarios.

3.2.1.3. SYN Congestion Response

   If the SYN-ACK returned to the TCP initiator confirms that the server
   supports AccECN, it will also indicate whether or not the SYN was
   CE-marked. If the SYN was CE-marked, the initiator MUST reduce its
   Initial Window (IW) and SHOULD reduce it to 1 SMSS (sender maximum
   segment size).

   If ECT has been set on the SYN and if the SYN-ACK shows that the
   server does not support AccECN, the TCP initiator MUST conservatively
   reduce its Initial Window and SHOULD reduce it to 1 SMSS. A
   reduction to greater than 1 SMSS MAY be appropriate (see
   Section 4.2.1). Conservatism is necessary because a non-AccECN
   SYN-ACK cannot show whether the SYN was CE-marked.
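
   As a non-normative illustration, the two rules above can be sketched
   in Python. The function name, its parameters and the 1460-byte SMSS
   default are assumptions made for this example only; they are not
   defined by this document. The sketch takes the SHOULD option (1 SMSS)
   in both reduction cases, although a larger reduced window MAY be
   appropriate in the non-AccECN case.

```python
def initial_window_after_syn(iw_segments, sent_ect_syn, server_accecn,
                             syn_ce_marked, smss=1460):
    """Return the initial window in bytes once the SYN-ACK has arrived.

    iw_segments   -- configured initial window in segments (assumption)
    sent_ect_syn  -- True if the SYN carried ECT (and requested AccECN)
    server_accecn -- True if the SYN-ACK confirmed AccECN support
    syn_ce_marked -- AccECN feedback: True if the SYN arrived CE-marked
    smss          -- sender maximum segment size in bytes (assumed value)
    """
    default_iw = iw_segments * smss
    if not sent_ect_syn:
        # Not-ECT SYN: the rules in this section do not apply.
        return default_iw
    if not server_accecn:
        # A non-AccECN SYN-ACK cannot show whether the SYN was CE-marked,
        # so reduce conservatively (SHOULD: 1 SMSS).
        return smss
    if syn_ce_marked:
        # CE-marked SYN: MUST reduce IW, SHOULD reduce to 1 SMSS.
        return smss
    return default_iw
```

   Note that the sketch touches neither the retransmission timer nor
   ssthresh; only the initial window is changed.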

   If the TCP initiator (host A) receives a SYN from the remote end
   (host B) after it has sent a SYN to B, it indicates the (unusual)
   case of a simultaneous open. Host A will respond with a SYN-ACK.
   Host A will probably then receive a SYN-ACK in response to its own
   SYN, after which it can follow the appropriate one of the two
   paragraphs above.

   In all the above cases, the initiator does not have to back off its
   retransmission timer as it would in response to a timeout following
   no response to its SYN [RFC6298], because both the SYN and the
   SYN-ACK have been successfully delivered through the network. Also,
   the initiator does not need to exit slow start or reduce ssthresh,
   which is not even required when a SYN is lost [RFC5681].

   If an initial window of 10 (IW10 [RFC6928]) is implemented, Section 5
   gives additional recommendations.

3.2.1.4. Fall-Back Following No Response to an ECT SYN

   An ECT SYN might be lost due to an over-zealous path element (or
   server) blocking ECT packets that do not conform to RFC 3168. Some
   evidence of this was found in a 2014 study [ecn-pam], but in a more
   recent 2017 study {ToDo: Add reference (under submission)} extensive
   measurements found no case where ECT on TCP control packets was
   treated any differently from ECT on TCP data packets. Loss is
   commonplace for numerous other reasons, e.g. congestion loss at a
   non-ECN queue on the forward or reverse path, transmission errors,
   etc. Alternatively, the cause of the loss might be the attempt to
   negotiate AccECN, or possibly other unrelated options on the SYN.

   Therefore, if the timer expires after the TCP initiator has sent the
   first ECT SYN, it SHOULD make one more attempt to retransmit the SYN
   with ECT set (backing off the timer as usual).
   If the retransmission timer expires again, it SHOULD retransmit the
   SYN with the not-ECT codepoint in the IP header, to expedite
   connection set-up. If other experimental fields or options were on
   the SYN, it will also be necessary to follow their specifications for
   fall-back. It would make sense to coordinate all the strategies for
   fall-back in order to isolate the specific cause of the problem.

   If the TCP initiator is caching failed connection attempts, it SHOULD
   NOT give up using ECT on the first SYN of subsequent connection
   attempts until it is clear that a blockage persistently and
   specifically affects ECT on SYNs. This is because loss is so
   commonplace for other reasons. Even if it does eventually decide to
   give up setting ECT on the SYN, it will probably not need to give up
   on AccECN on the SYN. In any case, if a cache is used, it SHOULD be
   arranged to expire so that the initiator will infrequently attempt to
   check whether the problem has been resolved.

   Other fall-back strategies MAY be adopted where applicable (see
   Section 4.2.2 for suggestions, and the conditions under which they
   would apply).

3.2.2. SYN-ACK

3.2.2.1. Setting ECT on the SYN-ACK

   For the experiments proposed here, the TCP implementation will set
   ECT on SYN-ACKs. It can ignore the requirement in section 6.1.1 of
   RFC 3168 to set not-ECT on a SYN-ACK.

   The feedback behaviour by the initiator in response to a CE-marked
   SYN-ACK from the responder depends on whether classic ECN feedback
   [RFC3168] or AccECN feedback [I-D.ietf-tcpm-accurate-ecn] has been
   negotiated. In either case no change is required to RFC 3168 or the
   AccECN specification.

   Some classic ECN implementations might ignore a CE-mark on a SYN-ACK,
   or even ignore a SYN-ACK packet entirely if it is set to ECT or CE.
   This is a possibility because an RFC 3168 implementation would not
   necessarily expect a SYN-ACK to be ECN-capable.

      FOR DISCUSSION: To eliminate this problem, the WG could decide to
      prohibit setting ECT on SYN-ACKs unless AccECN has been
      negotiated. However, this issue already came up when the IETF
      first decided to experiment with ECN on SYN-ACKs [RFC5562] and it
      was decided to go ahead without any extra precautionary measures
      because the risk was low. This was because the probability of
      encountering the problem was believed to be low and the harm if
      the problem arose was also low (see Appendix B of RFC 5562).

      MEASUREMENTS NEEDED: Server-side experiments could determine
      whether this specific problem is indeed rare across the current
      installed base of clients that support ECN.

3.2.2.2. SYN-ACK Congestion Response

   A host that sets ECT on SYN-ACKs MUST reduce its initial window in
   response to any congestion feedback, whether using classic ECN or
   AccECN. It SHOULD reduce it to 1 SMSS. This is different to the
   behaviour specified in an earlier experiment that set ECT on the
   SYN-ACK [RFC5562]. This is justified in Section 4.3.

   The responder does not have to back off its retransmission timer
   because the ECN feedback proves that the network is delivering
   packets successfully and is not severely overloaded. Also the
   responder does not have to leave slow start or reduce ssthresh, which
   is not even required when a SYN-ACK has been lost.

   The congestion response to CE-marking on a SYN-ACK for a server that
   implements either the TCP Fast Open experiment (TFO [RFC7413]) or the
   initial window of 10 experiment (IW10 [RFC6928]) is discussed in
   Section 5.

3.2.2.3. Fall-Back Following No Response to an ECT SYN-ACK

   After the responder sends a SYN-ACK with ECT set, if its
   retransmission timer expires it SHOULD retransmit one more SYN-ACK
   with ECT set (and back off its timer as usual). If the timer expires
   again, it SHOULD retransmit the SYN-ACK with not-ECT in the IP
   header. If other experimental fields or options were on the initial
   SYN-ACK, it will also be necessary to follow their specifications for
   fall-back. It would make sense to co-ordinate all the strategies for
   fall-back in order to isolate the specific cause of the problem.

   This fall-back strategy attempts to use ECT one more time than the
   strategy for ECT SYN-ACKs in [RFC5562] (which is made obsolete, being
   superseded by the present specification). Other fall-back strategies
   MAY be adopted if found to be more effective, e.g. fall-back to
   not-ECT on the first retransmission attempt.

   The server MAY cache failed connection attempts, e.g. per client
   access network. A client-based alternative to caching at the server
   is given in Section 4.3.2. If the TCP server is caching failed
   connection attempts, it SHOULD NOT give up using ECT on the first
   SYN-ACK of subsequent connection attempts until it is clear that the
   blockage persistently and specifically affects ECT on SYN-ACKs. This
   is because loss is so commonplace for other reasons (see
   Section 3.2.1.4). If a cache is used, it SHOULD be arranged to
   expire so that the server will infrequently attempt to check whether
   the problem has been resolved.

3.2.3. Pure ACK

   For the experiments proposed here, the TCP implementation will set
   ECT on pure ACKs. It can ignore the requirement in section 6.1.4 of
   RFC 3168 to set not-ECT on a pure ACK.
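
   As a non-normative illustration, the ECT-setting rules described so
   far (Table 1 together with the fall-back in Sections 3.2.1.4 and
   3.2.2.3) can be sketched in Python. The names Ecn, ecn_field and
   MAX_ECT_ATTEMPTS and the packet-type strings are hypothetical and
   chosen for this example only. The sketch applies the SHOULD-level
   fall-back (two attempts with ECT, then not-ECT) and omits the
   optional failure cache.

```python
from enum import Enum


class Ecn(Enum):
    NOT_ECT = 0b00  # not ECN-capable
    ECT = 0b10      # ECT(0), the default ECT codepoint in this document


# First transmission plus one retransmission are sent with ECT.
MAX_ECT_ATTEMPTS = 2

# Packet types that Table 1 lists as ECT under either feedback mode.
ALWAYS_ECT = {"PURE-ACK", "W-PROBE", "FIN", "RST", "RE-XMT"}


def ecn_field(packet_type, accecn_requested, attempt=0):
    """Choose the ECN field for an outgoing control packet.

    attempt is 0 for the first transmission of a SYN or SYN-ACK, 1 for
    its first retransmission, and so on; it is ignored for other types.
    """
    if packet_type == "SYN":
        # MUST NOT set ECT on a SYN unless the same SYN requests AccECN.
        if not accecn_requested:
            return Ecn.NOT_ECT
        return Ecn.ECT if attempt < MAX_ECT_ATTEMPTS else Ecn.NOT_ECT
    if packet_type == "SYN-ACK":
        # ECT under either feedback mode, with the same fall-back.
        return Ecn.ECT if attempt < MAX_ECT_ATTEMPTS else Ecn.NOT_ECT
    if packet_type in ALWAYS_ECT:
        return Ecn.ECT
    raise ValueError(f"unknown packet type: {packet_type!r}")
```

   For example, ecn_field("SYN", True, attempt=2) selects not-ECT,
   matching the second expiry of the retransmission timer.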

   A host that sets ECT on pure ACKs MUST reduce its congestion window
   in response to any congestion feedback, in order to regulate any data
   segments it might be sending amongst the pure ACKs. {ToDo: Reconsider
   this requirement in the light of WG comments.} It MAY also implement
   AckCC [RFC5690] to regulate the pure ACK rate, but this is not
   required. Note that, in comparison, TCP Congestion Control [RFC5681]
   does not require a TCP to detect or respond to loss of pure ACKs at
   all; it requires no reduction in congestion window or ACK rate.

   The question of whether the receiver of pure ACKs is required to feed
   back any CE marks on them is a matter for the relevant feedback
   specification ([RFC3168] or [I-D.ietf-tcpm-accurate-ecn]). It is
   outside the scope of the present specification. Currently AccECN
   feedback is required to count CE marking of any control packet
   including pure ACKs. In contrast, RFC 3168 is silent on this point,
   so feedback of CE-markings might be implementation specific (see
   Section 4.4.1).

      DISCUSSION: An AccECN deployment or an implementation of RFC 3168
      that feeds back CE on pure ACKs will be at a disadvantage compared
      to an RFC 3168 implementation that does not. To solve this, the
      WG could decide to prohibit setting ECT on pure ACKs unless AccECN
      has been negotiated. If it does, the penultimate sentence of the
      Introduction will need to be modified.

      MEASUREMENTS NEEDED: Measurements are needed to learn how the
      deployed base of network elements and RFC 3168 servers react to
      pure ACKs marked with the ECT(0)/ECT(1)/CE codepoints, i.e.
      whether they are dropped, codepoint cleared or processed and the
      congestion indication fed back on a subsequent packet.

3.2.4. Window Probe

   For the experiments proposed here, the TCP sender will set ECT on
   window probes.
It can ignore the prohibition in section 6.1.6 of RFC 604 3168 against setting ECT on a window probe. 606 A window probe contains a single octet, so it is no different from a 607 regular TCP data segment. Therefore a TCP receiver will feed back 608 any CE marking on a window probe as normal (either using classic ECN 609 feedback or AccECN feedback). The sender of the probe will then 610 reduce its congestion window as normal. 612 A receive window of zero indicates that the application is not 613 consuming data fast enough and does not imply anything about network 614 congestion. Once the receive window opens, the congestion window 615 might become the limiting factor, so it is correct that CE-marked 616 probes reduce the congestion window. This complements cwnd 617 validation [RFC7661], which reduces cwnd as more time elapses without 618 having used available capacity. However, CE-marking on window probes 619 does not reduce the rate of the probes themselves. This is unlikely 620 to present a problem, given the duration between window probes 621 doubles [RFC1122] as long as the receiver is advertising a zero 622 window (currently minimum 1 second, maximum at least 1 minute 623 [RFC6298]). 625 MEASUREMENTS NEEDED: Measurements are needed to learn how the 626 deployed base of network elements and servers react to Window 627 probes marked with the ECT(0)/ECT(1)/CE codepoints, i.e. whether 628 they are dropped, codepoint cleared or processed. 630 3.2.5. FIN 632 A TCP implementation can set ECT on a FIN. 634 The TCP data receiver MUST ignore the CE codepoint on incoming FINs 635 that fail any validity check. The validity check in section 5.2 of 636 [RFC5961] is RECOMMENDED. 638 A congestion response to a CE-marking on a FIN is not required. 640 After sending a FIN, the endpoint will not send any more data in the 641 connection. 
Therefore, even if the FIN-ACK indicates that the FIN 642 was CE-marked (whether using classic or AccECN feedback), reducing 643 the congestion window will not affect anything. 645 After sending a FIN, a host might send one or more pure ACKs. If it 646 is using one of the techniques in Section 3.2.3 to regulate the 647 delayed ACK ratio for pure ACKs, that technique could equally be applied after a 648 FIN. But this is not required. 650 MEASUREMENTS NEEDED: Measurements are needed to learn how the 651 deployed base of network elements and servers react to FIN packets 652 marked with the ECT(0)/ECT(1)/CE codepoints, i.e. whether they 653 are dropped, codepoint cleared or processed. 655 3.2.6. RST 657 A TCP implementation can set ECT on a RST. 659 The "challenge ACK" approach to checking the validity of RSTs 660 (section 3.2 of [RFC5961]) is RECOMMENDED at the data receiver. 662 A congestion response to a CE-marking on a RST is not required (and 663 actually not possible). 665 MEASUREMENTS NEEDED: Measurements are needed to learn how the 666 deployed base of network elements and servers react to RST packets 667 marked with the ECT(0)/ECT(1)/CE codepoints, i.e. whether they 668 are dropped, codepoint cleared or processed. 670 3.2.7. Retransmissions 672 For the experiments proposed here, the TCP sender will set ECT on 673 retransmitted segments. It can ignore the prohibition in section 674 6.1.5 of RFC 3168 against setting ECT on retransmissions. 676 Nonetheless, the TCP data receiver MUST ignore the CE codepoint on 677 incoming segments that fail any validity check. The validity check 678 in section 5.2 of [RFC5961] is RECOMMENDED. This will effectively 679 mitigate an attack that uses spoofed data packets to fool the 680 receiver into feeding back spoofed congestion indications to the 681 sender, which in turn would be fooled into continually halving its 682 congestion window.
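As an informative illustration only, the recommended validity check could be sketched as follows. The sketch assumes the acceptable ACK range of section 5.2 of [RFC5961], (SND.UNA - MAX.SND.WND) <= SEG.ACK <= SND.NXT, evaluated in 32-bit sequence arithmetic. The function names are hypothetical, and any other validity checks an implementation applies (e.g. on the sequence number) are omitted.

```python
MOD = 1 << 32  # TCP sequence numbers wrap modulo 2**32


def seq_leq(a: int, b: int) -> bool:
    """Return True if a <= b in 32-bit wrapping sequence arithmetic."""
    return ((b - a) % MOD) < (1 << 31)


def ack_is_acceptable(seg_ack: int, snd_una: int, snd_nxt: int,
                      max_snd_wnd: int) -> bool:
    """ACK range check assumed from section 5.2 of RFC 5961."""
    lower = (snd_una - max_snd_wnd) % MOD
    return seq_leq(lower, seg_ack) and seq_leq(seg_ack, snd_nxt)


def should_feed_back_ce(ce_marked: bool, seg_ack: int, snd_una: int,
                        snd_nxt: int, max_snd_wnd: int) -> bool:
    """Hypothetical helper: honour a CE mark only on a segment that
    passes the validity check; otherwise the mark is ignored."""
    return ce_marked and ack_is_acceptable(seg_ack, snd_una, snd_nxt,
                                           max_snd_wnd)
```

With such a check, a spoofed CE-marked segment whose ACK number falls outside the acceptable range does not contribute to congestion feedback.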
684 If the TCP sender receives feedback that a retransmitted packet was 685 CE-marked, it will react as it would to any feedback of CE-marking on 686 a data packet. 688 MEASUREMENTS NEEDED: Measurements are needed to learn how the 689 deployed base of network elements and servers react to 690 retransmissions marked with the ECT(0)/ECT(1)/CE codepoints, i.e. 691 whether they are dropped, codepoint cleared or processed. 693 3.2.8. General Fall-back for any Control Packet or Retransmission 695 Extensive measurements in fixed and mobile networks {ToDo: reference 696 (under submission)} have found no evidence of blockages due to ECT 697 being set on any type of TCP control packet. 699 In case traversal problems arise in future, fall-back measures have 700 been specified above, but only for the cases where ECT on the initial 701 packet of a half-connection (SYN or SYN-ACK) is persistently failing 702 to get through. 704 Fall-back measures for blockage of ECT on other TCP control packets 705 MAY be implemented. However they are not specified here given the 706 lack of any evidence they will be needed. Section 4.9 justifies this 707 advice in more detail. 709 4. Rationale 711 This section is informative, not normative. It presents counter- 712 arguments against the justifications in the RFC series for disabling 713 ECN on TCP control segments and retransmissions. It also gives 714 rationale for why ECT is safe on control segments that have not, so 715 far, been mentioned in the RFC series. First it addresses over- 716 arching arguments used for most packet types, then it addresses the 717 specific arguments for each packet type in turn. 719 4.1. 
The Reliability Argument 721 Section 5.2 of RFC 3168 states: 723 "To ensure the reliable delivery of the congestion indication of 724 the CE codepoint, an ECT codepoint MUST NOT be set in a packet 725 unless the loss of that packet [at a subsequent node] in the 726 network would be detected by the end nodes and interpreted as an 727 indication of congestion." 729 We believe this argument is misplaced. TCP does not deliver most 730 control packets reliably. So it is more important to allow control 731 packets to be ECN-capable, which greatly improves reliable delivery 732 of the control packets themselves (see motivation in Section 1.1). 733 ECN also improves the reliability and latency of delivery of any 734 congestion notification on control packets, particularly because TCP 735 does not detect the loss of most types of control packet anyway. 736 Both these points outweigh by far the concern that a CE marking 737 applied to a control packet by one node might subsequently be dropped 738 by another node. 740 The principle to determine whether a packet can be ECN-capable ought 741 to be "do no extra harm", meaning that the reliability of a 742 congestion signal's delivery ought to be no worse with ECN than 743 without. In particular, setting the CE codepoint on the very same 744 packet that would otherwise have been dropped fulfills this 745 criterion, since either the packet is delivered and the CE signal is 746 delivered to the endpoint, or the packet is dropped and the original 747 congestion signal (packet loss) is delivered to the endpoint. 749 The concern about a CE marking being dropped at a subsequent node 750 might be motivated by the idea that ECN-marking a packet at the first 751 node does not remove the packet, so it could go on to worsen 752 congestion at a subsequent node. However, it is not useful to reason 753 about congestion by considering single packets. 
The departure rate 754 from the first node will generally be the same (fully utilized) with 755 or without ECN, so this argument does not apply. 757 4.2. SYNs 759 RFC 5562 presents two arguments against ECT marking of SYN packets 760 (quoted verbatim): 762 "First, when the TCP SYN packet is sent, there are no guarantees 763 that the other TCP endpoint (node B in Figure 2) is ECN-Capable, 764 or that it would be able to understand and react if the ECN CE 765 codepoint was set by a congested router. 767 Second, the ECN-Capable codepoint in TCP SYN packets could be 768 misused by malicious clients to "improve" the well-known TCP SYN 769 attack. By setting an ECN-Capable codepoint in TCP SYN packets, a 770 malicious host might be able to inject a large number of TCP SYN 771 packets through a potentially congested ECN-enabled router, 772 congesting it even further." 774 The first point actually describes two subtly different issues. So 775 below three arguments are countered in turn. 777 4.2.1. Argument 1a: Unrecognized CE on the SYN 779 This argument certainly applied at the time RFC 5562 was written, 780 when no ECN responder mechanism had any logic to recognize or feed 781 back a CE marking on a SYN. The problem was that, during the 3WHS, 782 the flag in the TCP header for ECN feedback (called Echo Congestion 783 Experienced) had been overloaded to negotiate the use of ECN itself. 784 So there was no space for feedback in a SYN-ACK. 786 The accurate ECN (AccECN) protocol [I-D.ietf-tcpm-accurate-ecn] has 787 since been designed to solve this problem, using a two-pronged 788 approach. First AccECN uses the 3 ECN bits in the TCP header as 8 789 codepoints, so there is space for the responder to feed back whether 790 there was CE on the SYN. Second a TCP initiator can always request 791 AccECN support on every SYN, and any responder reveals its level of 792 ECN support: AccECN, classic ECN, or no ECN. 
Therefore, if a 793 responder does indicate that it supports AccECN, the initiator can be 794 sure that, if there is no CE feedback on the SYN-ACK, then there 795 really was no CE on the SYN. 797 An initiator can combine AccECN with three possible strategies for 798 setting ECT on a SYN: 800 (S1): Pessimistic ECT and cache successes: The initiator always 801 requests AccECN in the SYN, but without setting ECT. Then it 802 records those servers that confirm that they support AccECN in 803 a cache. On a subsequent connection to any server that 804 supports AccECN, the initiator can then set ECT on the SYN. 806 (S2): Optimistic ECT: The initiator always sets ECT optimistically 807 on the initial SYN and it always requests AccECN support. 808 Then, if the server response shows it has no AccECN logic (so 809 it cannot feed back a CE mark), the initiator conservatively 810 behaves as if the SYN was CE-marked, by reducing its initial 811 window. 813 A. No cache: The optimistic ECT strategy ought to work fairly 814 well without caching any responses. 816 B. Cache failures: The optimistic ECT strategy can be 817 improved by recording solely those servers that do not 818 support AccECN. On subsequent connections to these non- 819 AccECN servers, the initiator will still request AccECN 820 but not set ECT on the SYN. Then, the initiator can use 821 its full initial window (if it has enough request data to 822 need it). Longer term, as servers upgrade to AccECN, the 823 initiator will remove them from the cache and use ECT on 824 subsequent SYNs to that server. 826 Where an access network operator mediates Internet access 827 via a proxy that does not support AccECN, the optimistic 828 ECT strategy will always fail. This scenario is more 829 likely in mobile networks. Therefore, a mobile host could 830 cache lack of AccECN support per attached access network 831 operator. 
Whenever it attached to a new operator, it 832 could check a well-known AccECN test server and, if it 833 found no AccECN support, it would add a cache entry for 834 the attached operator. It would only use ECT when neither 835 network nor server were cached. It would only populate 836 its per server cache when not attached to a non-AccECN 837 proxy. 839 (S3): ECT by configuration: In a controlled environment, the 840 administrator can make sure that servers support ECN-capable 841 SYN packets. Examples of controlled environments are single- 842 tenant DCs, and possibly multi-tenant DCs if it is assumed 843 that each tenant mostly communicates with its own VMs. 845 For unmanaged environments like the public Internet, pragmatically 846 the choice is between strategies (S1) and (S2B): 848 o The "pessimistic ECT and cache successes" strategy (S1) suffers 849 from exposing the initial SYN to the prevailing loss level, even 850 if the server supports ECT on SYNs, but only on the first 851 connection to each AccECN server. 853 o The "optimistic ECT and cache failures" strategy (S2B) exploits a 854 server's support for ECT on SYNs from the very first attempt. But 855 if the server turns out not to support AccECN, the initiator has 856 to conservatively limit its initial window - usually 857 unnecessarily. Nonetheless, initiator request data (as opposed to 858 server response data) is rarely larger than 1 SMSS anyway {ToDo: 859 reference? (this information was given informally by Yuchung 860 Cheng)}. 862 The normative specification for ECT on a SYN in Section 3.2.1 uses 863 the "optimistic ECT and cache failures" strategy (S2B) on the 864 assumption that an initial window of 1 SMSS is usually sufficient for 865 client requests anyway. 
Clients that often initially send more than 866 1 SMSS of data could use strategy (S1) during initial deployment, and 867 strategy (S2B) later (when the probability of servers supporting 868 AccECN and the likelihood of seeing some CE marking are higher). 869 Also, as deployment proceeds, the cache of successes (S1) starts off small 870 then grows, while the cache of failures (S2B) becomes large at first, then 871 shrinks. 873 MEASUREMENTS NEEDED: Measurements are needed to determine whether 874 one or the other strategy would be sufficient for any particular 875 client, or whether a particular client would need both strategies 876 in different circumstances. 878 4.2.2. Argument 1b: Unrecognized ECT on the SYN 880 Given that ECT-marked SYN packets have been prohibited until now, it 881 cannot be assumed they will be accepted. 883 According to a study using 2014 data [ecn-pam] from a limited range 884 of vantage points, out of the top 1M Alexa web sites, 4791 (0.82%) 885 IPv4 sites and 104 (0.61%) IPv6 sites failed to establish a 886 connection when they received a TCP SYN with any ECN codepoint set in 887 the IP header and the appropriate ECN flags in the TCP header. Of 888 these, about 41% failed to establish a connection due to the ECN 889 flags in the TCP header even with a Not-ECT ECN field in the IP 890 header (i.e. despite full compliance with RFC 3168). Therefore 891 adding the ECN capability to SYNs was increasing connection 892 establishment failures by about 0.4%. 894 In a study using 2017 data from a wider range of fixed and mobile 895 vantage points to the top 500k Alexa servers, no case was found where 896 adding the ECN capability to a SYN increased the likelihood of 897 connection establishment failure {ToDo: reference (under 898 submission)}. 900 MEASUREMENTS NEEDED: More investigation is needed to understand 901 the different outcomes of the 2014 and 2017 studies. 903 RFC 3168 says "a host MUST NOT set ECT on SYN [...]
packets", but it 904 does not say what the responder should do if an ECN-capable SYN 905 arrives. So, in the 2014 study, perhaps some responder 906 implementations were checking that the SYN complied with RFC 3168, 907 then silently ignoring non-compliant SYNs (or perhaps returning a 908 RST). Also some middleboxes (e.g. firewalls) might have been 909 discarding non-compliant SYNs. For the future, 910 [I-D.ietf-tsvwg-ecn-experimentation] updates RFC 3168 to clarify that 911 middleboxes "SHOULD NOT" do this, but that does not alter the past. 913 Whereas RSTs can be dealt with immediately, silent failures introduce 914 a retransmission timeout delay (default 1 second) at the initiator 915 before it attempts any fall back strategy. Ironically, making SYNs 916 ECN-capable is intended to avoid the timeout when a SYN is lost due 917 to congestion. Fortunately, if there is any discard of ECN-capable 918 SYNs due to policy, it will occur predictably, not randomly like 919 congestion. So the initiator can avoid it by caching those sites 920 that do not support ECN-capable SYNs. This further justifies the use 921 of the "optimistic ECT and cache failures" strategy in Section 3.2.1. 923 MEASUREMENTS NEEDED: Experiments are needed to determine whether 924 blocking of ECT on SYNs is widespread, and how many occurrences of 925 problems would be masked by how few cache entries. 927 If blocking is too widespread for the "optimistic ECT and cache 928 failures" strategy (S2B), the "pessimistic ECT and cache successes" 929 strategy (Section 4.2.1) would be better. 931 MEASUREMENTS NEEDED: Then measurements would be needed on whether 932 failures were still widespread on the third connection attempt 933 after the more careful ("pessimistic") first and second attempts. 935 If so, it might be necessary to send a not-ECT SYN a short delay 936 after an ECT SYN and only accept the non-ECT connection if it 937 returned first. 
This would reduce the performance penalty for those 938 deploying ECT SYN support. 940 FOR DISCUSSION: If this becomes necessary, how much delay ought to 941 be required before the second SYN? Certainly less than the 942 standard RTO (1 second). But more or less than the maximum RTT 943 expected over the surface of the earth (roughly 250ms)? Or even 944 back-to-back? 946 However, based on the data above from [ecn-pam], even a cache of a 947 dozen or so sites ought to avoid all ECN-related performance problems 948 with roughly the Alexa top thousand. So it is questionable whether 949 sending two SYNs will be necessary, particularly given failures at 950 well-maintained sites could reduce further once ECT SYNs are 951 standardized. 953 4.2.3. Argument 2: DoS Attacks 955 [RFC5562] says that ECT SYN packets could be misused by malicious 956 clients to augment "the well-known TCP SYN attack". It goes on to 957 say "a malicious host might be able to inject a large number of TCP 958 SYN packets through a potentially congested ECN-enabled router, 959 congesting it even further." 961 We assume this is a reference to the TCP SYN flood attack (see 962 https://en.wikipedia.org/wiki/SYN_flood), which is an attack against 963 a responder end point. We assume the idea of this attack is to use 964 ECT to get more packets through an ECN-enabled router in preference 965 to other non-ECN traffic so that they can go on to use the SYN 966 flooding attack to inflict more damage on the responder end point. 967 This argument could apply to flooding with any type of packet, but we 968 assume SYNs are singled out because their source address is easier to 969 spoof, whereas floods of other types of packets are easier to block. 971 Mandating Not-ECT in an RFC does not stop attackers using ECT for 972 flooding. Nonetheless, if a standard says SYNs are not meant to be 973 ECT it would make it legitimate for firewalls to discard them. 
974 However this would negate the considerable benefit of ECT SYNs for 975 compliant transports and seems unnecessary because RFC 3168 already 976 provides the means to address this concern. In section 7, RFC 3168 977 says "During periods where ... the potential packet marking rate 978 would be high, our recommendation is that routers drop packets rather 979 then set the CE codepoint..." and this advice is repeated in 980 [RFC7567] (section 4.2.1). This makes it harder for flooding packets 981 to gain from ECT. 983 Further experiments are needed to test how much malicious hosts can 984 use ECT to augment flooding attacks without triggering AQMs to turn 985 off ECN support (flying "just under the radar"). If it is found that 986 ECT can only slightly augment flooding attacks, the risk of such 987 attacks will need to be weighed against the performance benefits of 988 ECT SYNs. 990 4.3. SYN-ACKs 992 The proposed approach in Section 3.2.2 for experimenting with ECN- 993 capable SYN-ACKs is effectively identical to the scheme called ECN+ 994 [ECN-PLUS]. In 2005, the ECN+ paper demonstrated that it could 995 reduce the average Web response time by an order of magnitude. It 996 also argued that adding ECT to SYN-ACKs did not raise any new 997 security vulnerabilities. 999 4.3.1. Response to Congestion on a SYN-ACK 1001 The IETF has already specified an experiment with ECN-capable SYN-ACK 1002 packets [RFC5562]. It was inspired by the ECN+ paper, but it 1003 specified a much more conservative congestion response to a CE-marked 1004 SYN-ACK, called ECN+/TryOnce. This required the server to reduce its 1005 initial window to 1 segment (like ECN+), but then the server had to 1006 send a second SYN-ACK and wait for its ACK before it could continue 1007 with its initial window of 1 SMSS. The second SYN-ACK of this 5-way 1008 handshake had to carry no data, and had to disable ECN, but no 1009 justification was given for these last two aspects. 
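As an informative aid, the two congestion responses described above can be contrasted in a short sketch. It encodes only the properties stated in the preceding paragraph; the type and function names are hypothetical and do not come from [RFC5562] or [ECN-PLUS].

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SynAckCeResponse:
    """Server behaviour after learning that its SYN-ACK was CE-marked."""
    initial_window_segments: int         # initial window after the CE mark
    second_synack: bool                  # is a second SYN-ACK sent first?
    second_synack_ect: Optional[bool]    # None when no second SYN-ACK
    second_synack_data: Optional[bool]   # None when no second SYN-ACK


def response_to_ce_on_synack(scheme: str) -> SynAckCeResponse:
    """Return the response of the named scheme (hypothetical helper)."""
    if scheme == "ECN+":
        # Reduce the initial window to 1 segment and carry on.
        return SynAckCeResponse(1, False, None, None)
    if scheme == "ECN+/TryOnce":
        # Same window reduction, but only after a second SYN-ACK
        # (not-ECT, no data) has been sent and acknowledged.
        return SynAckCeResponse(1, True, False, False)
    raise ValueError(f"unknown scheme: {scheme}")
```

Both schemes end up with the same initial window. They differ only in the additional SYN-ACK exchange that ECN+/TryOnce places before any data is sent.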
1011 The present ECN experiment obsoletes RFC 5562 because it uses the 1012 ECN+ congestion response, not ECN+/TryOnce. First we argue against 1013 the rationale for ECN+/TryOnce given in sections 4.4 and 6.2 of 1014 [RFC5562]. It starts with a rather too literal interpretation of the 1015 requirement in RFC 3168 that says TCP's response to a single CE mark 1016 has to be "essentially the same as the congestion control response to 1017 a *single* dropped packet." TCP's response to a dropped initial (SYN 1018 or SYN-ACK) packet is to wait for the retransmission timer to expire 1019 (currently 1s). However, this long delay assumes the worse of 1020 two possible causes of the loss: a) heavy overload; or b) the 1021 normal capacity-seeking behaviour of other TCP flows. When the 1022 network is still delivering CE-marked packets, it implies that there 1023 is an AQM at the bottleneck and that it is not overloaded. This is 1024 because an AQM under overload will disable ECN (as recommended in 1025 section 7 of RFC 3168 and repeated in section 4.2.1 of RFC 7567). So 1026 scenario (a) can be ruled out. Therefore, TCP's response to a CE- 1027 marked SYN-ACK can be similar to its response to the loss of _any_ 1028 packet, rather than backing off as if the special _initial_ packet of 1029 a flow had been lost. 1031 How TCP responds to the loss of any single packet depends on what it has 1032 just been doing. But there is not really a precedent for TCP's 1033 response when it experiences a CE mark having sent only one (small) 1034 packet. If TCP had been adding one segment per RTT, it would have 1035 halved its congestion window, but it hasn't established a congestion 1036 window yet. If it had been exponentially increasing it would have 1037 exited slow start, but it hasn't started exponentially increasing yet 1038 so it hasn't established a slow-start threshold. 1040 Therefore, we have to work out a reasoned argument for what to do.
1041 If an AQM is CE-marking packets, it implies there is already a queue 1042 and it is probably already somewhere around the AQM's operating point 1043 - it is unlikely to be well below and it might be well above. So, it 1044 does not seem sensible to add a number of packets at once. On the 1045 other hand, it is highly unlikely that the SYN-ACK itself pushed the 1046 AQM into congestion, so it will be safe to introduce another single 1047 segment immediately (1 RTT after the SYN-ACK). Therefore, starting 1048 to probe for capacity with a slow start from an initial window of 1 1049 segment seems appropriate to the circumstances. This is the approach 1050 adopted in Section 3.2.2. 1052 4.3.2. Fall-Back if ECT SYN-ACK Fails 1054 An alternative to the server caching failed connection attempts would 1055 be for the server to rely on the client caching failed attempts (on 1056 the basis that the client would cache a failure whether ECT was 1057 blocked on the SYN or the SYN-ACK). This strategy cannot be used if 1058 the SYN does not request AccECN support. It works as follows: if the 1059 server receives a SYN that requests AccECN support but is set to not- 1060 ECT, it replies with a SYN-ACK also set to not-ECT. If a middlebox 1061 only blocks ECT on SYNs, not SYN-ACKs, this strategy might disable 1062 ECN on a SYN-ACK when it did not need to, but at least it saves the 1063 server from maintaining a cache. 1065 4.4. Pure ACKs 1067 Section 5.2 of RFC 3168 gives the following arguments for not 1068 allowing the ECT marking of pure ACKs (ACKs not piggy-backed on 1069 data): 1071 "To ensure the reliable delivery of the congestion indication of 1072 the CE codepoint, an ECT codepoint MUST NOT be set in a packet 1073 unless the loss of that packet in the network would be detected by 1074 the end nodes and interpreted as an indication of congestion. 
1076 Transport protocols such as TCP do not necessarily detect all 1077 packet drops, such as the drop of a "pure" ACK packet; for 1078 example, TCP does not reduce the arrival rate of subsequent ACK 1079 packets in response to an earlier dropped ACK packet. Any 1080 proposal for extending ECN-Capability to such packets would have 1081 to address issues such as the case of an ACK packet that was 1082 marked with the CE codepoint but was later dropped in the network. 1083 We believe that this aspect is still the subject of research, so 1084 this document specifies that at this time, "pure" ACK packets MUST 1085 NOT indicate ECN-Capability." 1087 Later on, in section 6.1.4 it reads: 1089 "For the current generation of TCP congestion control algorithms, 1090 pure acknowledgement packets (e.g., packets that do not contain 1091 any accompanying data) MUST be sent with the not-ECT codepoint. 1092 Current TCP receivers have no mechanisms for reducing traffic on 1093 the ACK-path in response to congestion notification. Mechanisms 1094 for responding to congestion on the ACK-path are areas for current 1095 and future research. (One simple possibility would be for the 1096 sender to reduce its congestion window when it receives a pure ACK 1097 packet with the CE codepoint set). For current TCP 1098 implementations, a single dropped ACK generally has only a very 1099 small effect on the TCP's sending rate." 1101 We next address each of the arguments presented above. 1103 The first argument is a specific instance of the reliability argument 1104 for the case of pure ACKs. This has already been addressed by 1105 countering the general reliability argument in Section 4.1. 1107 The second argument says that ECN ought not to be enabled unless 1108 there is a mechanism to respond to it. However, actually there _is_ 1109 a mechanism to respond to congestion on a pure ACK that RFC 3168 has 1110 overlooked - the congestion window mechanism. 
When data segments and 1111 pure ACKs are interspersed, congestion notifications ought to 1112 regulate the congestion window, whether they are on data segments or 1113 on pure ACKs. Otherwise, if ECN is disabled on pure ACKs, and if 1114 (say) 70% of the segments in one direction are pure ACKs, about 70% 1115 of the congestion notifications will be missed and the data segments 1116 will not be correctly regulated. 1118 So RFC 3168 ought to have considered two congestion response 1119 mechanisms - reducing the congestion window (cwnd) and reducing the 1120 ACK rate - and only the latter was missing. Further, RFC 3168 was 1121 incorrect to assume that, if one ACK was a pure ACK, all segments in 1122 the same direction would be pure ACKs. Admittedly a continual stream 1123 of pure ACKs in one direction is quite a common case (e.g. a file 1124 download). However, it is also common for the pure ACKs to be 1125 interspersed with data segments (e.g. HTTP/2 browser requests 1126 controlling a web application). Indeed, it is more likely that any 1127 congestion experienced by pure ACKs will be due to mixing with data 1128 segments, either within the same flow, or within competing flows. 1130 This insight swings the argument towards enabling ECN on pure ACKs so 1131 that CE marks can drive the cwnd response to congestion (whenever 1132 data segments are interspersed with the pure ACKs). Whether an ACK 1133 rate response is also required (when pure ACKs are ECN-enabled) can 1134 then be decided separately. The two types of response are addressed 1135 separately in the following two subsections, then a final subsection 1136 draws conclusions. 1138 4.4.1.
Cwnd Response to CE-Marked Pure ACKs 1140 If the sender of pure ACKs sets them to ECT, the bullets below assess 1141 whether the three stages of the congestion response mechanism will 1142 all work for each type of congestion feedback (classic ECN [RFC3168] 1143 and AccECN [I-D.ietf-tcpm-accurate-ecn]): 1145 Detection: The receiver of a pure ACK can detect a CE marking on it: 1147 * Classic feedback: the receiver will not expect CE marks on pure 1148 ACKs, so it will be implementation-dependent whether it happens 1149 to check for CE marks on all packets. 1151 * AccECN feedback: the AccECN specification requires the receiver 1152 of any TCP packets to count any CE marks on them (whether or 1153 not control packets are ECN-capable). 1155 Feedback: TCP never ACKs a pure ACK, but the receiver of a CE-mark 1156 on a pure ACK can feed it back when it sends a subsequent data 1157 segment (if it ever does): 1159 * Classic feedback: the receiver (of the pure ACKs) would set the 1160 echo congestion experienced (ECE) flag in the TCP header as 1161 normal. 1163 * AccECN feedback: the receiver continually feeds back a count of 1164 the number of CE-marked packets that it has received (and, if 1165 possible, a count of CE-marked bytes). 1167 Congestion response: In either case (classic or AccECN feedback), if 1168 the TCP sender does receive feedback about CE-markings on pure 1169 ACKs, it will react in the usual way by reducing its congestion 1170 window accordingly. This will regulate the rate of any data 1171 packets it is sending amongst the pure ACKs. Note that, while a 1172 host has no application data to send, any congestion window it has 1173 attained might also be reduced by the congestion window validation 1174 mechanism [RFC7661]. 1176 4.4.2. ACK Rate Response to CE-Marked Pure ACKs 1178 Reducing the congestion window will have no effect on the rate of 1179 pure ACKs. 
The worst case here is if the bottleneck is congested 1180 solely with pure ACKs, but it could also be problematic if a large 1181 fraction of the load was from unresponsive ACKs, leaving little or no 1182 capacity for the load from responsive data. 1184 Since RFC 3168 was published, Acknowledgement Congestion Control 1185 (AckCC) techniques have been documented in [RFC5690] (informational). 1186 So any pair of TCP end-points can choose to agree to regulate the 1187 delayed ACK ratio in response to lost or CE-marked pure ACKs. 1188 However, the protocol has a number of open deployment issues (e.g. it 1189 relies on two new TCP options, one of which is required on the SYN 1190 where option space is at a premium and, if either option is blocked 1191 by a middlebox, no fall-back behaviour is specified). The new TCP 1192 options addressed two problems, namely that TCP had: i) no mechanism 1193 to allow ECT to be set on pure ACKs; and ii) no mechanism to feed 1194 back loss or CE-marking of pure ACKs. A combination of the present 1195 specification and AccECN addresses both these problems, at least for 1196 ECN marking. So it might now be possible to design an ECN-specific 1197 ACK congestion control scheme without the extra TCP options proposed 1198 in RFC 5690. However, such a mechanism is out of scope of the 1199 present document. 1201 Setting aside the practicality of RFC 5690, the need for AckCC has 1202 not been conclusively demonstrated. It has been argued that the 1203 Internet has survived so far with no mechanism to even detect loss of 1204 pure ACKs. However, it has also been argued that ECN is not the same 1205 as loss. Packet discard can naturally thin the ACK load to whatever 1206 the bottleneck can support, whereas ECN marking does not (it queues 1207 the ACKs instead). Nonetheless, RFC 3168 (section 7) recommends that 1208 an AQM switches over from ECN marking to discard when the marking 1209 probability becomes high. 
Therefore discard can still be relied on
to thin out ECN-enabled pure ACKs as a last resort.

4.4.3.  Summary: Enabling ECN on Pure ACKs

In the case when AccECN has been negotiated, the arguments for ECT
(and CE) on pure ACKs heavily outweigh those against.  ECN is always
more and never less reliable for delivery of congestion notification.
The cwnd response has been overlooked as a mechanism for responding
to congestion on pure ACKs, so it is incorrect not to set ECT on pure
ACKs when they are interspersed with data segments.  And when they
are not, packet discard still acts as the "congestion response of
last resort".  In contrast, not setting ECT on pure ACKs is certainly
detrimental to performance, because when a pure ACK is lost it can
prevent the release of new data.  Separately, AckCC (or perhaps an
improved variant exploiting AccECN) could optionally be used to
regulate the spacing between pure ACKs.  However, it is not clear
whether AckCC is justified.

In the case when Classic ECN has been negotiated, there is still an
argument for ECT (and CE) on pure ACKs, but it is less clear-cut.
Some existing RFC 3168 implementations might happen to
(unintentionally) provide the correct feedback to support a cwnd
response.  Even for those that did not, setting ECT on pure ACKs
would still be better for performance than not setting it and do no
extra harm.  If AckCC was required, it is designed to work with RFC
3168 ECN.

4.5.  Window Probes

Section 6.1.6 of RFC 3168 presents only the reliability argument for
prohibiting ECT on Window probes:

   "If a window probe packet is dropped in the network, this loss is
   not detected by the receiver.  Therefore, the TCP data sender MUST
   NOT set either an ECT codepoint or the CWR bit on window probe
   packets.

   However, because window probes use exact sequence numbers, they
   cannot be easily spoofed in denial-of-service attacks.  Therefore,
   if a window probe arrives with the CE codepoint set, then the
   receiver SHOULD respond to the ECN indications."

The reliability argument has already been addressed in Section 4.1.

Allowing ECT on window probes could considerably improve performance
because, once the receive window has reopened, if a window probe is
lost the sender will stall until the next window probe reaches the
receiver, which might be after the maximum retransmission timeout (at
least 1 minute [RFC6298]).

On the bright side, RFC 3168 at least specifies the receiver
behaviour if a CE-marked window probe arrives, so changing the
behaviour ought to be less painful than for other packet types.

4.6.  FINs

RFC 3168 is silent on whether a TCP sender can set ECT on a FIN.  A
FIN is considered as part of the sequence of data, and the rate of
pure ACKs sent after a FIN could be controlled by a CE marking on the
FIN.  Therefore there is no reason not to set ECT on a FIN.

4.7.  RSTs

RFC 3168 is silent on whether a TCP sender can set ECT on a RST.  The
host generating the RST message does not have an open connection
after sending it (either because there was no such connection when
the packet that triggered the RST message was received or because the
packet that triggered the RST message also triggered the closure of
the connection).

Moreover, the receiver of a CE-marked RST message can either: i)
accept the RST message and close the connection; ii) emit a so-called
challenge ACK in response (with suitable throttling) [RFC5961] and
otherwise ignore the RST (e.g. because the sequence number is
in-window but not the precise number expected next); or iii) discard
the RST message (e.g. because the sequence number is out-of-window).
In the first two cases there is no point in echoing any CE mark
received because the sender closed its connection when it sent the
RST.  In the third case it makes sense to discard the CE signal as
well as the RST.

Although a congestion response following a CE-marking on a RST does
not appear to make sense, the following factors have been considered
before deciding whether the sender ought to set ECT on a RST message:

o  As explained above, a congestion response by the sender of a
   CE-marked RST message is not possible;

o  So the only reason for the sender setting ECT on a RST would be to
   improve the reliability of the message's delivery;

o  RST messages are used to both mount and mitigate attacks:

   *  Spoofed RST messages are used by attackers to terminate ongoing
      connections, although the mitigations in RFC 5961 have
      considerably raised the bar against off-path RST attacks;

   *  Legitimate RST messages allow endpoints to inform their peers
      to eliminate existing state that corresponds to non-existent
      connections, liberating resources e.g. in DoS attack
      scenarios;

o  AQMs are advised to disable ECN marking during persistent
   overload, so:

   *  it is harder for an attacker to exploit ECN to intensify an
      attack;

   *  it is harder for a legitimate user to exploit ECN to more
      reliably mitigate an attack;

o  Prohibiting ECT on a RST would deny the benefit of ECN to
   legitimate RST messages, but not to attackers who can disregard
   RFCs;

o  If ECT were prohibited on RSTs:

   *  it would be easy for security middleboxes to discard all
      ECN-capable RSTs;

   *  However, unlike a SYN flood, it is already easy for a security
      middlebox (or host) to distinguish a RST flood from legitimate
      traffic [RFC5961], and even if some legitimate RSTs are
      accidentally removed as well, legitimate connections still
      function.
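
As an illustration only, the three possible outcomes for a received
RST described earlier in this section can be sketched as below.  This
is a minimal sketch, assuming the sequence number checks of [RFC5961];
the names classify_rst, RstAction and feed_back_ce are hypothetical
and are not taken from any specification or implementation.  The
point it tries to show is that, whichever outcome applies, a CE mark
on the RST is never fed back.

```python
from enum import Enum

SEQ_SPACE = 2 ** 32  # TCP sequence numbers wrap modulo 2^32


class RstAction(Enum):
    ACCEPT = "accept RST and close the connection"
    CHALLENGE_ACK = "send throttled challenge ACK, otherwise ignore RST"
    DISCARD = "discard RST"


def classify_rst(seg_seq: int, rcv_nxt: int, rcv_wnd: int) -> RstAction:
    """Classify an incoming RST by its sequence number (hypothetical helper).

    seg_seq: sequence number carried by the RST
    rcv_nxt: next sequence number the receiver expects
    rcv_wnd: current receive window in bytes
    """
    # Distance of the RST's sequence number ahead of rcv_nxt, with wrap-around.
    offset = (seg_seq - rcv_nxt) % SEQ_SPACE
    if offset == 0:
        return RstAction.ACCEPT          # case i): exact match
    if offset < rcv_wnd:
        return RstAction.CHALLENGE_ACK   # case ii): in-window, not exact
    return RstAction.DISCARD             # case iii): out-of-window


def feed_back_ce(action: RstAction, ce_marked: bool) -> bool:
    """Whether to feed back a CE mark seen on the RST: never, in all cases."""
    # Cases i) and ii): the RST sender has already closed its connection.
    # Case iii): the CE signal is discarded along with the RST.
    return False


if __name__ == "__main__":
    for seq in (1000, 1200, 1500):
        action = classify_rst(seq, rcv_nxt=1000, rcv_wnd=500)
        print(seq, action.name, feed_back_ce(action, ce_marked=True))
```

The modular subtraction is one way to handle sequence number
wrap-around; a real stack would also apply the challenge ACK
throttling mentioned above, which is omitted here.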

So, on balance, it has been decided that it is worth experimenting
with ECT on RSTs.  During experiments, if the ECN capability on RSTs
is found to open a vulnerability that is hard to close, this decision
can be reversed, before it is specified for the standards track.

4.8.  Retransmitted Packets

RFC 3168 says the sender "MUST NOT" set ECT on retransmitted packets.
The rationale for this consumes nearly 2 pages of RFC 3168, so the
reader is referred to section 6.1.5 of RFC 3168, rather than quoting
it all here.  There are essentially three arguments, namely:
reliability; DoS attacks; and over-reaction to congestion.  We
address them in order below.

The reliability argument has already been addressed in Section 4.1.

Protection against DoS attacks is not afforded by prohibiting ECT on
retransmitted packets.  An attacker can set CE on spoofed
retransmissions whether or not it is prohibited by an RFC.
Protection against the DoS attack described in section 6.1.5 of RFC
3168 is solely afforded by the requirement that "the TCP data
receiver SHOULD ignore the CE codepoint on out-of-window packets".
Therefore in Section 3.2.7 the sender is allowed to set ECT on
retransmitted packets, in order to reduce the chance of them being
dropped.  We also strengthen the receiver's requirement from "SHOULD
ignore" to "MUST ignore".  And we generalize the receiver's
requirement to include failure of any validity check, not just
out-of-window checks, in order to include the more stringent validity
checks in RFC 5961 that have been developed since RFC 3168.

A consequence is that, for those retransmitted packets that arrive at
the receiver after the original packet has been properly received
(so-called spurious retransmissions), any CE marking will be ignored.
There is no problem with that because the fact that the original
packet has been delivered implies that the sender's original
congestion response (when it deemed the packet lost and retransmitted
it) was unnecessary.

Finally, the third argument is about over-reacting to congestion.
The argument goes that, if a retransmitted packet is dropped, the
sender will not detect it, so it will not react again to congestion
(it would have reduced its congestion window already when it
retransmitted the packet).  Whereas, if retransmitted packets can be
CE-marked instead of dropped, senders could potentially react more
than once to congestion.  However, we argue that it is legitimate to
respond again to congestion if it still persists in subsequent round
trip(s).

Therefore, in all three cases, it is not incorrect to set ECT on
retransmissions.

4.9.  General Fall-back for any Control Packet

Extensive experiments have found no evidence of any traversal
problems with ECT on any TCP control packet {ToDo: reference (under
submission)}.  Nonetheless, Sections 3.2.1.4 and 3.2.2.3 specify
fall-back measures if ECT on the first packet of each half-connection
(SYN or SYN-ACK) appears to be blocking progress.  Here, the question
of fall-back measures for ECT on other control packets is explored.
It supports the advice given in Section 3.2.8; until there's evidence
that something's broken, don't fix it.

If an implementation has had to disable ECT to ensure the first
packet of a flow (SYN or SYN-ACK) gets through, the question arises
whether it ought to disable ECT on all subsequent control packets
within the same TCP connection.  Without evidence of any such
problems, this seems unnecessarily cautious.  Particularly given it
would be hard to detect loss of most other types of TCP control
packets that are not ACK'd.
And particularly given that
unnecessarily removing ECT from other control packets could lead to
performance problems, e.g. by directing them into an inferior queue
[I-D.ietf-tsvwg-ecn-l4s-id] or over a different path, because some
broken multipath equipment (erroneously) routes based on all 8 bits
of the Diffserv field.

In the case where a connection starts without ECT on the SYN (perhaps
because problems with previous connections had been cached), there
will have been no test for ECT traversal in the client-server
direction until the pure ACK that completes the handshake.  It is
possible that some middlebox might block ECT on this pure ACK or on
later retransmissions of lost packets.  Similarly, after a route
change, the new path might include some middlebox that blocks ECT on
some or all TCP control packets.  However, without evidence of such
problems, the complexity of a fix does not seem worthwhile.

MORE MEASUREMENTS NEEDED (?):  If further two-ended measurements do
   find evidence for these traversal problems, measurements would be
   needed to check for correlation of ECT traversal problems between
   different control packets.  It might then be necessary to
   introduce a catch-all fall-back rule that disables ECT on certain
   subsequent TCP control packets based on some criteria developed
   from these measurements.

5.  Interaction with popular variants or derivatives of TCP

The following subsections discuss any interactions between setting
ECT on all packets and using the following popular variants of TCP:
IW10 and TFO.  It also briefly notes the possibility that the
principles applied here should translate to protocols derived from
TCP.  This section is informative not normative, because no
interactions have been identified that require any change to
specifications.
The subsection on IW10 discusses potential changes
to specifications but recommends that no changes are needed.

The designs of the following TCP variants have also been assessed and
found not to interact adversely with ECT on TCP control packets: SYN
cookies (see Appendix A of [RFC4987] and section 3.1 of [RFC5562]),
TCP Fast Open (TFO [RFC7413]) and L4S [I-D.ietf-tsvwg-l4s-arch].

5.1.  IW10

IW10 is an experiment to determine whether it is safe for TCP to use
an initial window of 10 SMSS [RFC6928].

This subsection does not recommend any additions to the present
specification in order to interwork with IW10.  The specifications as
they stand are safe, and there is only a corner-case with ECT on the
SYN where performance could be occasionally improved, as explained
below.

As specified in Section 3.2.1.1, a TCP initiator can only set ECT on
the SYN if it requests AccECN support.  If, however, the SYN-ACK
tells the initiator that the responder does not support AccECN,
Section 3.2.1.1 advises the initiator to conservatively reduce its
initial window to 1 SMSS because, if the SYN was CE-marked, the
SYN-ACK has no way to feed that back.

If the initiator implements IW10, it seems rather over-conservative
to reduce IW from 10 to 1 just in case a congestion marking was
missed.  Nonetheless, the reduction to 1 SMSS will rarely harm
performance, because:

o  as long as the initiator is caching failures to negotiate AccECN,
   subsequent attempts to access the same server will not use ECT on
   the SYN anyway, so there will no longer be any need to
   conservatively reduce IW;

o  currently it is not common for a TCP initiator (client) to have
   more than one data segment to send {ToDo: evidence/reference?} -
   IW10 is primarily exploited by TCP servers.
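
The initiator behaviour discussed in this subsection can be sketched
as below.  This is only an illustrative sketch under stated
assumptions: the class name Initiator, its methods and the
per-server failure cache are hypothetical, and the window is counted
in units of SMSS.  It covers only the case discussed above (an ECT
SYN answered by a responder without AccECN support); it does not
model what the initiator does when AccECN feedback is available.

```python
IW10_SEGMENTS = 10  # initial window of the IW10 experiment [RFC6928], in SMSS
CONSERVATIVE_IW_SEGMENTS = 1  # fall-back initial window, in SMSS


class Initiator:
    """Hypothetical TCP initiator that caches AccECN negotiation failures."""

    def __init__(self) -> None:
        # Servers for which an earlier ECT SYN was not answered with AccECN.
        self.accecn_failures: set[str] = set()

    def ect_on_syn(self, server: str) -> bool:
        """Set ECT on the SYN only if no failure is cached for this server."""
        return server not in self.accecn_failures

    def initial_window(self, server: str, sent_ect_syn: bool,
                       peer_supports_accecn: bool) -> int:
        """Return the initial window in SMSS once the SYN-ACK has arrived."""
        if sent_ect_syn and not peer_supports_accecn:
            # The SYN might have been CE-marked, but the SYN-ACK could not
            # feed that back, so be conservative and remember the failure.
            self.accecn_failures.add(server)
            return CONSERVATIVE_IW_SEGMENTS
        return IW10_SEGMENTS


if __name__ == "__main__":
    client = Initiator()
    server = "server.example"
    # First connection: ECT SYN sent, responder lacks AccECN support.
    first_ect = client.ect_on_syn(server)
    print(first_ect, client.initial_window(server, first_ect, False))
    # Second connection: failure is cached, so no ECT on the SYN and no
    # need to reduce the initial window.
    second_ect = client.ect_on_syn(server)
    print(second_ect, client.initial_window(server, second_ect, False))
```

In this sketch the reduced window applies only to the first
connection to a given server; once the failure is cached, later
connections fall back to a non-ECT SYN and keep the larger window,
which is the effect described in the first bullet above.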

If a responder receives feedback that the SYN-ACK was CE-marked,
Section 3.2.2.2 mandates that it reduces its initial window to 1
SMSS.  When the responder also implements IW10, it is particularly
important to adhere to this requirement in order to avoid overflowing
a queue that is clearly already congested.

5.2.  TFO

TCP Fast Open (TFO [RFC7413]) is an experiment to remove the round
trip delay of TCP's 3-way hand-shake (3WHS).  A TFO initiator caches
a cookie from a previous connection with a TFO-enabled server.  Then,
for subsequent connections to the same server, any data included on
the SYN can be passed directly to the server application, which can
then return up to an initial window of response data on the SYN-ACK
and on data segments straight after it, without waiting for the ACK
that completes the 3WHS.

The TFO experiment and the present experiment to add ECN-support for
TCP control packets can be combined without altering either
specification, which is justified as follows:

o  The handling of ECN marking on a SYN is no different whether or
   not it carries data.

o  In response to any CE-marking on the SYN-ACK, the responder adopts
   the normal response to congestion, as discussed in Section 7.2 of
   [RFC7413].

5.3.  TCP Derivatives

Stream Control Transmission Protocol (SCTP [RFC4960]) is a standards
track transport protocol derived from TCP.  SCTP currently does not
include ECN support, but Appendix A of RFC 4960 broadly describes how
it would be supported and a (long-expired) draft on the addition of
ECN to SCTP has been produced [I-D.stewart-tsvwg-sctpecn].  This
draft avoided setting ECT on control packets and retransmissions,
closely following the arguments in RFC 3168.

QUIC [I-D.ietf-quic-transport] is another standards track transport
protocol offering similar services to TCP but intended to exploit
some of the benefits of running over UDP.  A way to add ECN support
to QUIC has been proposed [I-D.johansson-quic-ecn].

Experience from experiments on adding ECN support to all TCP packets
ought to be directly transferable to derivatives of TCP, like SCTP or
QUIC.

6.  Security Considerations

Section 3.2.6 considers the question of whether ECT on RSTs will
allow RST attacks to be intensified.  There are several security
arguments presented in RFC 3168 for preventing the ECN marking of TCP
control packets and retransmitted segments.  We believe all of them
have been properly addressed in Section 4, particularly Section 4.2.3
and Section 4.8 on DoS attacks using spoofed ECT-marked SYNs and
spoofed CE-marked retransmissions.

7.  IANA Considerations

There are no IANA considerations in this memo.

8.  Acknowledgments

Thanks to Mirja Kuehlewind, David Black, Padma Bhooma and Gorry
Fairhurst for their useful reviews.

The work of Marcelo Bagnulo has been performed in the framework of
the H2020-ICT-2014-2 project 5G NORMA.  His contribution reflects the
consortium's view, but the consortium is not liable for any use that
may be made of any of the information contained therein.

9.  References

9.1.  Normative References

[I-D.ietf-tcpm-accurate-ecn]
           Briscoe, B., Kuehlewind, M., and R. Scheffenegger, "More
           Accurate ECN Feedback in TCP",
           draft-ietf-tcpm-accurate-ecn-03 (work in progress),
           May 2017.

[I-D.ietf-tsvwg-ecn-experimentation]
           Black, D., "Explicit Congestion Notification (ECN)
           Experimentation", draft-ietf-tsvwg-ecn-experimentation-06
           (work in progress), September 2017.

[RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
           Requirement Levels", BCP 14, RFC 2119,
           DOI 10.17487/RFC2119, March 1997.

[RFC3168]  Ramakrishnan, K., Floyd, S., and D. Black, "The Addition
           of Explicit Congestion Notification (ECN) to IP",
           RFC 3168, DOI 10.17487/RFC3168, September 2001.

[RFC5961]  Ramaiah, A., Stewart, R., and M. Dalal, "Improving TCP's
           Robustness to Blind In-Window Attacks", RFC 5961,
           DOI 10.17487/RFC5961, August 2010.

9.2.  Informative References

[ecn-pam]  Trammell, B., Kuehlewind, M., Boppart, D., Learmonth, I.,
           Fairhurst, G., and R. Scheffenegger, "Enabling
           Internet-Wide Deployment of Explicit Congestion
           Notification", Int'l Conf. on Passive and Active Network
           Measurement (PAM'15) pp193-205, 2015.

[ECN-PLUS]
           Kuzmanovic, A., "The Power of Explicit Congestion
           Notification", ACM SIGCOMM 35(4):61--72, 2005.

[I-D.ietf-quic-transport]
           Iyengar, J. and M. Thomson, "QUIC: A UDP-Based Multiplexed
           and Secure Transport", draft-ietf-quic-transport-06 (work
           in progress), September 2017.

[I-D.ietf-tcpm-dctcp]
           Bensley, S., Thaler, D., Balasubramanian, P., Eggert, L.,
           and G. Judd, "Datacenter TCP (DCTCP): TCP Congestion
           Control for Datacenters", draft-ietf-tcpm-dctcp-10 (work
           in progress), August 2017.

[I-D.ietf-tsvwg-ecn-l4s-id]
           Schepper, K. and B. Briscoe, "Identifying Modified
           Explicit Congestion Notification (ECN) Semantics for
           Ultra-Low Queuing Delay", draft-ietf-tsvwg-ecn-l4s-id-00
           (work in progress), May 2017.

[I-D.ietf-tsvwg-l4s-arch]
           Briscoe, B., Schepper, K., and M. Bagnulo, "Low Latency,
           Low Loss, Scalable Throughput (L4S) Internet Service:
           Architecture", draft-ietf-tsvwg-l4s-arch-00 (work in
           progress), May 2017.

[I-D.johansson-quic-ecn]
           Johansson, I., "ECN support in QUIC",
           draft-johansson-quic-ecn-03 (work in progress), May 2017.

[I-D.stewart-tsvwg-sctpecn]
           Stewart, R., Tuexen, M., and X. Dong, "ECN for Stream
           Control Transmission Protocol (SCTP)",
           draft-stewart-tsvwg-sctpecn-05 (work in progress),
           January 2014.

[judd-nsdi]
           Judd, G., "Attaining the promise and avoiding the pitfalls
           of TCP in the Datacenter", USENIX Symposium on Networked
           Systems Design and Implementation (NSDI'15) pp.145-157,
           May 2015.

[RFC0793]  Postel, J., "Transmission Control Protocol", STD 7,
           RFC 793, DOI 10.17487/RFC0793, September 1981.

[RFC1122]  Braden, R., Ed., "Requirements for Internet Hosts -
           Communication Layers", STD 3, RFC 1122,
           DOI 10.17487/RFC1122, October 1989.

[RFC3540]  Spring, N., Wetherall, D., and D. Ely, "Robust Explicit
           Congestion Notification (ECN) Signaling with Nonces",
           RFC 3540, DOI 10.17487/RFC3540, June 2003.

[RFC4960]  Stewart, R., Ed., "Stream Control Transmission Protocol",
           RFC 4960, DOI 10.17487/RFC4960, September 2007.

[RFC4987]  Eddy, W., "TCP SYN Flooding Attacks and Common
           Mitigations", RFC 4987, DOI 10.17487/RFC4987, August 2007.

[RFC5562]  Kuzmanovic, A., Mondal, A., Floyd, S., and K.
           Ramakrishnan, "Adding Explicit Congestion Notification
           (ECN) Capability to TCP's SYN/ACK Packets", RFC 5562,
           DOI 10.17487/RFC5562, June 2009.

[RFC5681]  Allman, M., Paxson, V., and E. Blanton, "TCP Congestion
           Control", RFC 5681, DOI 10.17487/RFC5681, September 2009.

[RFC5690]  Floyd, S., Arcia, A., Ros, D., and J. Iyengar, "Adding
           Acknowledgement Congestion Control to TCP", RFC 5690,
           DOI 10.17487/RFC5690, February 2010.

[RFC6298]  Paxson, V., Allman, M., Chu, J., and M.
           Sargent, "Computing TCP's Retransmission Timer", RFC 6298,
           DOI 10.17487/RFC6298, June 2011.

[RFC6928]  Chu, J., Dukkipati, N., Cheng, Y., and M. Mathis,
           "Increasing TCP's Initial Window", RFC 6928,
           DOI 10.17487/RFC6928, April 2013.

[RFC7413]  Cheng, Y., Chu, J., Radhakrishnan, S., and A. Jain, "TCP
           Fast Open", RFC 7413, DOI 10.17487/RFC7413, December 2014.

[RFC7567]  Baker, F., Ed. and G. Fairhurst, Ed., "IETF
           Recommendations Regarding Active Queue Management",
           BCP 197, RFC 7567, DOI 10.17487/RFC7567, July 2015.

[RFC7661]  Fairhurst, G., Sathiaseelan, A., and R. Secchi, "Updating
           TCP to Support Rate-Limited Traffic", RFC 7661,
           DOI 10.17487/RFC7661, October 2015.

Authors' Addresses

Marcelo Bagnulo
Universidad Carlos III de Madrid
Av. Universidad 30
Leganes, Madrid  28911
SPAIN

Phone: 34 91 6249500
Email: marcelo@it.uc3m.es
URI:   http://www.it.uc3m.es

Bob Briscoe
CableLabs
UK

Email: ietf@bobbriscoe.net
URI:   http://bobbriscoe.net/